
Elementary General Relativity

Version 3.35

Alan Macdonald

Luther College, Decorah, IA USA

http://faculty.luther.edu/~macdonal

To Ellen

“The magic of this theory will hardly fail to impose itself on anybody who has truly understood it.”

Albert Einstein, 1915

“The foundation of general relativity appeared to me then [1915], and it still does, the greatest feat of human thinking about Nature, the most amazing combination of philosophical penetration, physical intuition, and mathematical skill.”

Max Born, 1955

“One of the principal objects of theoretical research in any department of knowledge is to find the point of view from which the subject appears in its greatest simplicity.”

Josiah Willard Gibbs

“There is a widespread indifference to attempts to put accepted theories on better logical foundations and to clarify their experimental basis, an indifference occasionally amounting to hostility. I am concerned with the effects of our neglect of foundations on the education of scientists. It is plain that the clearer the teacher, the more transparent his logic, the fewer and more decisive the number of experiments to be examined in detail, the faster will the pupil learn and the surer and sounder will be his grasp of the subject.”

Sir Hermann Bondi

“Things should be made as simple as possible, but not simpler.”

Albert Einstein

Contents

Preface

1 Flat Spacetimes
1.1 Spacetimes
1.2 The Inertial Frame Postulate
1.3 The Metric Postulate
1.4 The Geodesic Postulate

2 Curved Spacetimes
2.1 History of Theories of Gravity
2.2 The Key to General Relativity
2.3 The Local Inertial Frame Postulate
2.4 The Metric Postulate
2.5 The Geodesic Postulate
2.6 The Field Equation

3 Spherically Symmetric Spacetimes
3.1 Stellar Evolution
3.2 The Schwarzschild Metric
3.3 The Solar System Tests
3.4 Kerr Spacetimes
3.5 The Binary Pulsar
3.6 Black Holes

4 Cosmological Spacetimes
4.1 Our Universe I
4.2 The Robertson-Walker Metric
4.3 The Expansion Redshift
4.4 Our Universe II
4.5 General Relativity Today

Appendices

Index

Preface

The purpose of this book is to provide, with a minimum of mathematical machinery and in the fewest possible pages, a clear and careful explanation of the physical principles and applications of classical general relativity. The prerequisites are single variable calculus, a few basic facts about partial derivatives and line integrals, a little matrix algebra, and a little basic physics.

The book is for those seeking a conceptual understanding of the theory, not entry into the research literature. Despite its brevity and modest prerequisites, it is a serious introduction to the physics and mathematics of general relativity which demands careful study. The book can stand alone as an introduction to general relativity or be used as an adjunct to standard texts.

Chapter 1 is a self-contained introduction to those parts of special relativity we require for general relativity. We take a nonstandard approach to the metric, analogous to the standard approach to the metric in Euclidean geometry. In geometry, distance is first understood geometrically, independently of any coordinate system. If coordinates are introduced, then distances can be expressed in terms of coordinate differences: $\Delta s^2 = \Delta x^2 + \Delta y^2$. The formula is important, but the geometric meaning of the distance is fundamental.

Analogously, we define the spacetime interval of special relativity physically, independently of any coordinate system. If inertial frame coordinates are introduced, then the interval can be expressed in terms of coordinate differences: $\Delta s^2 = \Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2$. The formula is important, but the physical meaning of the interval is fundamental. I believe that this approach to the metric provides easier access to and deeper understanding of special relativity, and facilitates the transition to general relativity.

Chapter 2 introduces the physical principles on which general relativity is based. The basic concepts of Riemannian geometry are developed in order to express these principles mathematically as postulates. The purpose of the postulates is not to achieve complete rigor – which is neither desirable nor possible in a book at this level – but to state clearly the physical principles, and to exhibit clearly the relationship to special relativity and the analogy with surfaces. The postulates are in one-to-one correspondence with the fundamental concepts of Riemannian geometry: manifold, metric, geodesic, and curvature. Concentrating on the physical meaning of the metric greatly simplifies the development of general relativity. In particular, a formal development of tensors is not needed. There is, however, a brief introduction to tensors in an appendix. (Similarly, modern elementary differential geometry texts often develop the intrinsic geometry of curved surfaces by focusing on the geometric meaning of the metric, without using tensors.)

The first two chapters systematically exploit the mathematical analogy which led to general relativity: a curved spacetime is to a flat spacetime as a curved surface is to a flat surface. Before introducing a spacetime concept, its analog for surfaces is presented. This is not a new idea, but it is used here more systematically than elsewhere. For example, when the metric ds of general relativity is introduced, the reader has already seen a metric in three other contexts.

Chapter 3 solves the field equation for a spherically symmetric spacetime to obtain the Schwarzschild metric. The geodesic equations are then solved and applied to the classical solar system tests of general relativity. There is a section on the Kerr metric, including gravitomagnetism and the Gravity Probe B experiment. The chapter closes with sections on the binary pulsar and black holes. In this chapter, as elsewhere, I have tried to provide the cleanest possible calculations.

Chapter 4 applies general relativity to cosmology. We obtain the Robertson-Walker metric in an elementary manner without the field equation. We review the evidence for a spatially flat universe with a cosmological constant. We then apply the field equation with a cosmological constant to a spatially flat Robertson-Walker spacetime. The solution is given in closed form. Recent astronomical data allow us to specify all parameters in the solution, giving the new “standard model” of the universe with dark matter, dark energy, and an accelerating expansion.

There have been many spectacular astronomical discoveries and observations since 1960 which are relevant to general relativity. We describe them at appropriate places in the book.

Some 50 exercises are scattered throughout. They often serve as examples of concepts introduced in the text. If they are not done, they must be read.

Some tedious (but always straightforward) calculations have been omitted. They are best carried out with a computer algebra system. Some material has been placed in about 20 pages of appendices to keep the main line of development visible. The appendices occasionally require more background of the reader than the text. They may be omitted without loss of anything essential. Appendix 1 gives the values of various physical constants. Appendix 2 contains several approximation formulas used in the text.

Chapter 1

Flat Spacetimes

1.1 Spacetimes

The general theory of relativity is our best theory of space, time, and gravity. It is commonly felt to be the most beautiful of all physical theories. Albert Einstein created the theory during the decade following the publication of his special theory of relativity in 1905. The special theory is our best theory of space and time when gravity is insignificant. The general theory generalizes the special theory to include gravity.

In geometry the fundamental entities are points. A point is a specific place. In relativity the fundamental entities are events. An event is a specific time and place. For example, the collision of two particles is an event. A concert is an event (idealizing it to a single time and place). To attend the concert, you must be at the time and the place of the event.

A flat or curved surface is a set of points. (We shall prefer the term “flat surface” to “plane”.) Similarly, a spacetime is a set of events. For example, we might consider the events in a specific room between two specific times. A flat spacetime is one without significant gravity. Special relativity describes flat spacetimes. A curved spacetime is one with significant gravity. General relativity describes curved spacetimes.

There is nothing mysterious about the words “flat” or “curved” attached to a set of events. They are chosen because of a remarkable analogy, already hinted at, concerning the mathematical description of a spacetime: a curved spacetime is to a flat spacetime as a curved surface is to a flat surface. This analogy will be a major theme of this book; we shall use the mathematics of flat and curved surfaces to guide our understanding of the mathematics of flat and curved spacetimes.

We shall explore spacetimes with clocks to measure time and rods (rulers) to measure space, i.e., distance. However, clocks and rods do not in fact live up to all that we usually expect of them. In this section we shall see what we expect of them in relativity.

Clocks. A curve is a continuous succession of points in a surface. Similarly, a worldline is a continuous succession of events in a spacetime. A moving particle or a pulse of light emitted in a single direction is present at a continuous succession of events, its worldline. Even if a particle is at rest, time passes, and the particle has a worldline.

The length of a curve between two given points depends on the curve. Similarly, the time between two given events measured by a clock moving between the events depends on the clock’s worldline! J. C. Hafele and R. Keating provided the most direct verification of this in 1971. They brought two atomic clocks together, placed one of them in an airplane which circled the Earth, and then brought the clocks together again. Thus the clocks took different worldlines between the event of their separation and the event of their reunion. They measured different times between the two events. The difference was small, about $10^{-7}$ sec, but was well within the ability of the clocks to measure. There is no doubt that the effect is real.

Relativity predicts the measured difference. Exercise 1.10 shows that special relativity predicts a difference between the clocks. Exercise 2.1 shows that general relativity predicts a further difference. Exercise 3.3 shows that general relativity predicts the observed difference. Relativity predicts large differences between clocks whose relative velocity is close to the speed of light.

The best answer to the question “How can the clocks in the experiment possibly disagree?” is the question “Why should they agree?” After all, the clocks are not connected. According to everyday ideas they should agree because there is a universal time, common to all observers. It is the duty of clocks to report this time. The concept of a universal time was abstracted from experience with small speeds (compared to that of light) and clocks of ordinary accuracy, where it is nearly valid. The concept permeates our daily lives; there are clocks everywhere telling us the time. However, the Hafele-Keating experiment shows that there is no universal time. Time, like distance, is route dependent.

Since clocks on different worldlines between two events measure different times between the events, we cannot speak of the time between two events. But we can speak of the time along a given worldline between the events. For the relative rates of all processes – the ticking of a clock, the frequency of a tuning fork, the aging of an organism, etc. – are the same along all worldlines. (Unless some adverse physical condition affects a rate.) Twins, each remaining close to one of the clocks during the experiment, would age according to their clock. They would thus be of slightly different ages when reunited.

Rods. Consider astronauts in interstellar space, where gravity is insignificant. If their rocket is not firing and their ship is not spinning, then they will feel no forces acting on them and they can float freely in their cabin. If their spaceship is accelerating, then they will feel a force pushing them back against their seat. If the ship turns to the left, then they will feel a force to the right. If the ship is spinning, they will feel a force outward from the axis of spin. Call these forces inertial forces.

Fig. 1.1: An accelerometer. The weight is held at the center by springs. Acceleration causes the weight to move from the center.

Accelerometers measure inertial forces. Fig. 1.1 shows a simple accelerometer consisting of a weight held at the center of a frame by identical springs. Inertial forces cause the weight to move from the center. An inertial object is one which experiences no inertial forces. If an object is inertial, then any object moving at a constant velocity with respect to it is also inertial and any object accelerating with respect to it is not inertial.

In special relativity we make an assumption which allows us to speak of the distance between two inertial objects at rest with respect to each other: Inertial rigid rods side by side and at rest with respect to two inertial objects measure the same distance between the objects. More precisely, we assume that any difference is due to some adverse physical cause (e.g., thermal expansion) to which an “ideal” rigid rod would not be subject. In particular, the history of a rigid rod does not affect its length. Noninertial rods are difficult to deal with in relativity, and we shall not consider them.

In the next three sections we give three postulates for special relativity. The inertial frame postulate asserts that certain natural coordinate systems, called inertial frames, exist for a flat spacetime. The metric postulate asserts a universal light speed and a slowing of clocks moving in inertial frames. The geodesic postulate asserts that inertial particles and light move in a straight line at constant speed in inertial frames.

We shall use the analogy given at the beginning of this section to help us understand the postulates. Imagine two dimensional beings living in a flat surface. These surface dwellers can no more imagine leaving their two spatial dimensions than we can imagine leaving our three spatial dimensions. Before introducing a postulate for a flat spacetime, we introduce the analogous postulate formulated by surface dwellers for a flat surface. The postulates for a flat spacetime use a time dimension, but those for a flat surface do not.

1.2 The Inertial Frame Postulate


Surface dwellers find it useful to label the points of their flat surface with coordinates. They construct, using identical rigid rods, a square grid and assign rectangular coordinates (x, y) to the nodes of the grid in the usual way. See Fig. 1.2. They specify a point by using the coordinates of the node nearest the point. If more accurate coordinates are required, they build a finer grid. Surface dwellers call the coordinate system a planar frame. They postulate:

The Planar Frame Postulate for a Flat Surface

A planar frame can be constructed with any point P as origin and with any orientation.

Fig. 1.2: A planar frame.

Similarly, it is useful to label the events in a flat spacetime with coordinates (t, x, y, z). The coordinates specify when and where the event occurs, i.e., they completely specify the event. We now describe how to attach coordinates to events. The procedure is idealized, but it gives a clear physical meaning to the coordinates.

To specify where an event occurs, construct, using identical rigid rods, an inertial cubical lattice. See Fig. 1.3. Assign rectangular coordinates (x, y, z) to the nodes of the lattice in the usual way. Specify where an event occurs by using the coordinates of the node nearest the event.

Fig. 1.3: An inertial lattice.

To specify when an event occurs, place a clock at each node of the lattice. Then the times of events at a given node can be specified by reading the clock at that node. But to compare meaningfully the times of events at different nodes, the clocks must be in some sense synchronized. As we shall see soon, this is not a trivial matter. (Remember, there is no universal time.) For now, assume that the clocks have been synchronized. Then specify when an event occurs by using the time, t, on the clock at the node nearest the event. And measure the coordinate time difference ∆t between two events using the synchronized clocks at the nodes where the events occur. Note that this requires two clocks.

The four dimensional coordinate system obtained in this way from an inertial cubical lattice with synchronized clocks is called an inertial frame. The event (t, x, y, z) = (0, 0, 0, 0) is the origin of the inertial frame. We postulate:

The Inertial Frame Postulate for a Flat Spacetime

An inertial frame can be constructed with any event E as origin, with any orientation, and with any inertial object at E at rest in it.

If we suppress one or two of the spatial coordinates of an inertial frame, then we can draw a spacetime diagram and depict worldlines. For example, Fig. 1.4 shows the worldlines of two particles. One is at rest on the x-axis and one moves away from x = 0 and then returns more slowly.

Fig. 1.4: Two worldlines.

Fig. 1.5: Worldline of a particle moving with constant speed.

Exercise 1.1. Show that the worldline of an object moving along the x-axis at constant speed v is a straight line with slope v. See Fig. 1.5.

Exercise 1.2. Describe the worldline of an object moving in a circle in the z = 0 plane at constant speed v.

Synchronization. We return to the matter of synchronizing the clocks in the lattice. We can directly compare side-by-side clocks. But what does it mean to say that separated clocks are synchronized? Einstein realized that the answer to this question is not given to us by Nature; rather, it must be answered with a definition.

Exercise 1.3. Why not simply bring the clocks together, synchronize them, move them to the nodes of the lattice, and call them synchronized?

We might try the following definition. Send a signal from a node P of the lattice at time $t_P$ according to the clock at P. Let it arrive at a node Q of the lattice at time $t_Q$ according to the clock at Q. Let the distance between the nodes be D and the speed of the signal be v. Say that the clocks are synchronized if

$$t_Q = t_P + D/v. \tag{1.1}$$

Intuitively, the term D/v compensates for the time it takes the signal to get to Q. This definition is flawed because it contains a logical circle: v is defined by a rearrangement of Eq. (1.1): $v = D/(t_Q - t_P)$. Synchronized clocks cannot be defined using v because synchronized clocks are needed to define v.

We adopt the following definition, essentially due to Einstein. Emit a pulse of light from a node P at time $t_P$ according to the clock at P. Let it arrive at a node Q at time $t_Q$ according to the clock at Q. Similarly, emit a pulse of light from Q at time $t'_Q$ and let it arrive at P at $t'_P$. The clocks are synchronized if

$$t_Q - t_P = t'_P - t'_Q, \tag{1.2}$$

i.e., if the times in the two directions are equal.

Reformulating the definition makes it more transparent. If the pulse from Q to P is the reflection of the pulse from P to Q, then $t'_Q = t_Q$ in Eq. (1.2). Let 2T be the round trip time: $2T = t'_P - t_P$. Substitute this in Eq. (1.2):

$$t_Q = t_P + T; \tag{1.3}$$

the clocks are synchronized if the pulse arrives at Q in half the time it takes for the round trip.
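The definition can be sketched in a few lines of Python. This is only an illustration: the function names and the clock readings are hypothetical, chosen to show how Eq. (1.2) and Eq. (1.3) would be applied to recorded times.

```python
def synchronized_reading(t_p_emit, t_p_return):
    """Reading the clock at Q must show when the pulse arrives, Eq. (1.3).

    t_p_emit   : time on P's clock when the pulse leaves P
    t_p_return : time on P's clock when the reflected pulse returns to P
    """
    half_round_trip = (t_p_return - t_p_emit) / 2.0  # this is T
    return t_p_emit + half_round_trip


def is_synchronized(t_p, t_q, t_q_prime, t_p_prime, tol=1e-12):
    """Check Eq. (1.2): the one-way times in the two directions are equal."""
    return abs((t_q - t_p) - (t_p_prime - t_q_prime)) <= tol


# Hypothetical readings: the pulse leaves P at 10.0 and its reflection
# returns to P at 16.0 (both read on P's clock).
t_q_required = synchronized_reading(10.0, 16.0)
print(t_q_required)                                              # 13.0
print(is_synchronized(10.0, t_q_required, t_q_required, 16.0))   # True
print(is_synchronized(10.0, 14.0, 14.0, 16.0))                   # False
```

Note that only clock readings enter the check; no signal speed appears anywhere, which is what frees the definition from the circularity of Eq. (1.1).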

Exercise 1.4. Explain why Eq. (1.2) is a satisfactory definition but Eq. (1.1) is not.

There is a tacit assumption in the definition of synchronized clocks that the two sides of Eq. (1.2) do not depend on the times that the pulses are sent:

Emit pulses of light from a node R at times $t_R$ and $t'_R$ according to a clock at R. Let them arrive at a node S at times $t_S$ and $t'_S$ according to a clock at S. Then

$$t'_S - t'_R = t_S - t_R. \tag{1.4}$$

With this assumption we can be sure that synchronized clocks will remain synchronized.

Exercise 1.5. Show that with the assumption Eq. (1.4), T in Eq. (1.3) is independent of the time the pulse is sent.

A rearrangement of Eq. (1.4) gives

$$\Delta s_o = \Delta s_e, \tag{1.5}$$

where $\Delta s_o = t'_S - t_S$ is the time between the observation of the pulses at S and $\Delta s_e = t'_R - t_R$ is the time between the emission of the pulses at R. (We use ∆s rather than ∆t to conform to notation used later in more general situations.) If a clock at R emits pulses of light at regular intervals to S, then Eq. (1.5) states that an observer at S sees (actually sees) the clock at R going at the same rate as his clock. Of course, the observer at S will see all physical processes at R proceed at the same rate they do at S.

Redshifts. We will encounter situations in which $\Delta s_o \neq \Delta s_e$. Define the redshift

$$z = \frac{\Delta s_o}{\Delta s_e} - 1. \tag{1.6}$$

Equations (1.4) and (1.5) correspond to z = 0. If, for example, z = 1 ($\Delta s_o/\Delta s_e = 2$), then the observer at S would see clocks at R, and all other physical processes at R, proceed at half the rate they do at S.

If the two “pulses” of light in Eq. (1.6) are successive wavecrests of light emitted at frequency $f_e = (\Delta s_e)^{-1}$ and observed at frequency $f_o = (\Delta s_o)^{-1}$, then Eq. (1.6) can be written

$$z = \frac{f_e}{f_o} - 1. \tag{1.7}$$
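As a small numerical illustration, the two forms of the redshift, Eq. (1.6) and Eq. (1.7), can be computed side by side. The function names and the sample intervals are assumptions made for the example; the units are arbitrary.

```python
def redshift_from_intervals(ds_observed, ds_emitted):
    """Eq. (1.6): z from the observed and emitted time intervals."""
    return ds_observed / ds_emitted - 1.0


def redshift_from_frequencies(f_emitted, f_observed):
    """Eq. (1.7): z from the emitted and observed frequencies."""
    return f_emitted / f_observed - 1.0


# Hypothetical intervals: wavecrests emitted 1.0 apart and observed 2.0 apart.
ds_e, ds_o = 1.0, 2.0
z_intervals = redshift_from_intervals(ds_o, ds_e)
z_frequencies = redshift_from_frequencies(1.0 / ds_e, 1.0 / ds_o)

print(z_intervals, z_frequencies)  # both forms give z = 1.0
```

With z = 1 the observed interval is twice the emitted one, so processes at the emitter are seen to run at half rate, as described above.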

In Exercise 1.6 we shall see that Eq. (1.5) is violated, i.e., z ≠ 0, if the emitter and observer are in relative motion in a flat spacetime. This is called a Doppler redshift. Later we shall see two other kinds of redshift: gravitational redshifts in Sec. 2.2 and expansion redshifts in Sec. 4.1. The three types of redshifts have different physical origins and so must be carefully distinguished.

The inertial frame postulate asserts in part that clocks in an inertial lattice can be synchronized according to the definition Eq. (1.2), or, in P. W. Bridgman’s descriptive phrase, we can “spread time over space”. Appendix 3 proves this with the aid of an auxiliary assumption.

1.3 The Metric Postulate


Let P and Q be points in a flat surface. Different curves between the points have different lengths. But surface dwellers single out for special study the length ∆s of the straight line between P and Q. They call ∆s the proper distance between the points.

We defined the proper distance ∆s between two points geometrically, without using coordinates. But if a planar frame is introduced, then there is a simple formula for ∆s in terms of the coordinate differences between the points:

The Metric Postulate for a Flat Surface

Let ∆s be the proper distance between points P and Q. Let P and Q have coordinate differences (∆x, ∆y) in a planar frame. Then

$$\Delta s^2 = \Delta x^2 + \Delta y^2. \tag{1.8}$$

Fig. 1.6: $\Delta s^2 = \Delta x^2 + \Delta y^2$, with the same ∆s in two different planar frames.

The coordinate differences ∆x and ∆y between P and Q are different in different planar frames. See Fig. 1.6. However, the particular combination of the differences in Eq. (1.8) will always produce ∆s. Neither ∆x nor ∆y has a geometric significance independent of the particular planar frame chosen. The two of them together do: they determine ∆s, which has a direct geometric significance, independent of any coordinate system.
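A short numerical check makes the point concrete. The sketch below assumes that two planar frames with different orientations are related by an ordinary rotation; the helper names and the sample values are ours, for illustration only.

```python
import math


def rotate_frame(dx, dy, angle):
    """Coordinate differences of the same two points in a planar frame
    rotated counterclockwise by `angle` radians (assumed relation)."""
    dx_new = dx * math.cos(angle) + dy * math.sin(angle)
    dy_new = -dx * math.sin(angle) + dy * math.cos(angle)
    return dx_new, dy_new


def proper_distance(dx, dy):
    """Eq. (1.8): proper distance from coordinate differences."""
    return math.sqrt(dx**2 + dy**2)


dx, dy = 3.0, 4.0  # hypothetical coordinate differences in one frame
for angle in (0.0, 0.3, 1.2):
    dx_r, dy_r = rotate_frame(dx, dy, angle)
    print(f"angle={angle:.1f}  dx={dx_r:+.4f}  dy={dy_r:+.4f}  "
          f"ds={proper_distance(dx_r, dy_r):.4f}")
```

The individual differences change from frame to frame, while the printed ∆s stays at 5 up to rounding.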

Now let E and F be events in a flat spacetime. There is a distance-like quantity ∆s between them. It is called the (spacetime) interval between the events. The definition of ∆s in a flat spacetime is more complicated than in a flat surface, because there are three ways in which events can be, we say, separated:

• If E and F can be on the worldline of a pulse of light, they are lightlike separated. Then define ∆s = 0.

• If E and F can be on the worldline of an inertial clock, they are timelike separated. Then define ∆s to be the time between the events measured by the inertial clock. This is the proper time between the events. Clocks on other worldlines between the events will measure different times. But we single out for special study the proper time ∆s.

• If E and F can be simultaneously at the ends of an inertial rigid rod – simultaneously in the sense that light flashes emitted at E and F reach the center of the rod simultaneously, or equivalently, that E and F are simultaneous in the rest frame of the rod – they are spacelike separated. Then define |∆s| to be the length of the rod. (The reason for the absolute value will become clear later.) This is the proper distance between the events.

We defined the spacetime interval ∆s between two events physically, without using coordinates. But if an inertial frame is introduced, then there is a simple formula for ∆s in terms of the coordinate differences between the events:

The Metric Postulate for a Flat Spacetime

Let ∆s be the interval between events E and F. Let the events have coordinate differences (∆t, ∆x, ∆y, ∆z) in an inertial frame. Then

$$\Delta s^2 = \Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2. \tag{1.9}$$

The coordinate differences between E and F, including the time coordinate difference, are different in different inertial frames. For example, suppose that an inertial clock measures a proper time ∆s between two events. In an inertial frame in which the clock is at rest, ∆t = ∆s and ∆x = ∆y = ∆z = 0. This will not be the case in an inertial frame in which the clock is moving. However, the particular combination of the differences in Eq. (1.9) will always produce ∆s. No one of the coordinate differences has a physical significance independent of the particular inertial frame chosen. The four of them together do: they determine ∆s, which has a direct physical significance, independent of any inertial frame.
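Given coordinate differences in an inertial frame, Eq. (1.9) also tells us which of the three kinds of separation applies. The sketch below assumes the usual sign convention, consistent with the definitions above: $\Delta s^2 > 0$ for timelike, $\Delta s^2 = 0$ for lightlike, and $\Delta s^2 < 0$ for spacelike separated events. The function names and the tolerance are illustrative choices.

```python
def interval_squared(dt, dx, dy=0.0, dz=0.0):
    """Right side of Eq. (1.9), in units with c = 1."""
    return dt**2 - dx**2 - dy**2 - dz**2


def separation(dt, dx, dy=0.0, dz=0.0, tol=1e-12):
    """Classify two events by the sign of the interval squared."""
    s2 = interval_squared(dt, dx, dy, dz)
    if abs(s2) <= tol:
        return "lightlike"
    return "timelike" if s2 > 0 else "spacelike"


# Hypothetical coordinate differences (dt, dx), with dy = dz = 0.
print(separation(5.0, 3.0))  # timelike:  proper time is 4
print(separation(3.0, 3.0))  # lightlike: interval is 0
print(separation(3.0, 5.0))  # spacelike: proper distance is 4
```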

This shows that the joining of space and time into spacetime is not an artificial technical trick. Rather, in the words of Hermann Minkowski, who introduced the spacetime concept in 1908, “Space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.” Minkowskian geometry, with mixed signs in its expression for the interval, Eq. (1.9), is different from Euclidean geometry, the geometry of flat space. But it is a perfectly valid geometry, the geometry of flat spacetime.

Physical Meaning. We now describe the physical meaning of the metric postulate for lightlike, timelike, and spacelike separated events. We do not need the y- and z-coordinates for the discussion, and so we use Eq. (1.9) in the form

$$\Delta s^2 = \Delta t^2 - \Delta x^2. \tag{1.10}$$

Lightlike separated events. By definition, a pulse of light can move between lightlike separated events, and ∆s = 0 for the events. From Eq. (1.10) the speed of the pulse is |∆x|/∆t = 1. The metric postulate asserts that the speed c of light always has the same value c = 1 in all inertial frames. The assertion is that the speed is the same in all inertial frames; the actual value c = 1 is a convention: Choose the distance light travels in one second – about $3 \times 10^{10}$ cm – as the unit of distance. Call this one (light) second of distance. (You are probably familiar with a similar unit of distance – the light year.) Then 1 cm = $3.3 \times 10^{-11}$ sec. With this convention c = 1, and all other speeds are expressed as a dimensionless fraction of the speed of light. Ordinarily the fractions are very small. For example, 3 km/sec = 0.00001.
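The unit convention amounts to two simple conversions, sketched below. The constant uses the rounded value of the light speed quoted above; the function names are hypothetical.

```python
C_CM_PER_SEC = 3.0e10  # rounded speed of light used in the text, in cm/sec


def cm_to_seconds(length_cm):
    """Express a length in (light) seconds of distance."""
    return length_cm / C_CM_PER_SEC


def speed_as_fraction_of_c(speed_km_per_sec):
    """Express a speed as a dimensionless fraction of the speed of light."""
    speed_cm_per_sec = speed_km_per_sec * 1.0e5  # 1 km = 1e5 cm
    return speed_cm_per_sec / C_CM_PER_SEC


print(cm_to_seconds(1.0))           # about 3.3e-11 sec
print(speed_as_fraction_of_c(3.0))  # about 1e-05
```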

Timelike separated events. By definition, an inertial clock can move between timelike separated events, and ∆s is the (proper) time the clock measures between the events. The speed of the clock is v = |∆x|/∆t. Then from Eq. (1.10), the proper time is

$$\Delta s = (\Delta t^2 - \Delta x^2)^{1/2} = [1 - (\Delta x/\Delta t)^2]^{1/2}\,\Delta t = (1 - v^2)^{1/2}\,\Delta t. \tag{1.11}$$

The metric postulate asserts that the proper time between two events is less than the time determined by the synchronized clocks of an inertial frame: ∆s < ∆t. Informally, “moving clocks run slowly”. This is called time dilation. According to Eq. (1.11) the time dilation factor is $(1 - v^2)^{1/2}$. For normal speeds, v is very small, $v^2$ is even smaller, and so from Eq. (1.11), ∆s ≈ ∆t, as expected. But as v → 1, $\Delta t/\Delta s = (1 - v^2)^{-1/2} \to \infty$. Fig. 1.7 shows the graph of ∆t/∆s vs. v.
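To get a feeling for the sizes involved, Eq. (1.11) can be evaluated for a few speeds. This is a minimal sketch with c = 1; the function names and the sample speeds are chosen for illustration.

```python
import math


def proper_time(dt, v):
    """Eq. (1.11): proper time of a clock moving at speed v (c = 1)
    between events separated by coordinate time dt."""
    if abs(v) >= 1.0:
        raise ValueError("an inertial clock must have speed |v| < 1")
    return math.sqrt(1.0 - v**2) * dt


def dilation_ratio(v):
    """The ratio dt/ds = (1 - v^2)^(-1/2)."""
    if abs(v) >= 1.0:
        raise ValueError("an inertial clock must have speed |v| < 1")
    return 1.0 / math.sqrt(1.0 - v**2)


# Sample speeds as fractions of the speed of light (1e-5 is 3 km/sec).
for v in (1e-5, 0.1, 0.6, 0.8, 0.99):
    print(f"v = {v:<7} dt/ds = {dilation_ratio(v):.10f}")
```

At v = 0.00001 the ratio differs from 1 only in the eleventh decimal place, which is why time dilation escapes everyday notice; at v = 0.6 it is 1.25.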

Fig. 1.7: $\Delta t/\Delta s = (1 - v^2)^{-1/2}$.

Exercise 1.6. Investigate the Doppler redshift. Let a source of light pulses move with velocity v directly away from an observer at rest in an inertial frame. Let $\Delta t_e$ be the time between the emission of pulses, and $\Delta t_o$ be the time between the reception of pulses by the observer.

a. Show that $\Delta t_o = \Delta t_e + v\,\Delta t_e/c$.

b. Ignore time dilation in Eq. (1.6) by setting ∆s = ∆t. Show that z = v/c in this approximation.

c. Show that the exact formula is $z = [(1 + v)/(1 - v)]^{1/2} - 1$ (with c = 1). Use the result of part a.
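The two formulas of the exercise can be compared numerically. The sketch below takes the formulas of parts b and c as given and only tabulates them, so it does not replace the derivations; the function names and sample speeds are illustrative.

```python
import math


def doppler_z_approx(v):
    """Part b: z = v, ignoring time dilation (c = 1)."""
    return v


def doppler_z_exact(v):
    """Part c: z = [(1 + v)/(1 - v)]^(1/2) - 1 for a receding source."""
    if not 0.0 <= v < 1.0:
        raise ValueError("expected a recession speed with 0 <= v < 1")
    return math.sqrt((1.0 + v) / (1.0 - v)) - 1.0


for v in (0.001, 0.01, 0.1, 0.6):
    print(f"v = {v:<6} approx z = {doppler_z_approx(v):.6f}  "
          f"exact z = {doppler_z_exact(v):.6f}")
```

The two values agree closely at small v and separate as v grows; at v = 0.6 the exact redshift is 1, against 0.6 from the approximation.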

Spacelike separated events. By definition, the ends of an inertial rigid rod can be simultaneously present at the events, and |∆s| is the length of the rod. Appendix 5 shows that the speed of the rod in I is v = ∆t/|∆x|. (That is not a typo.) A calculation similar to Eq. (1.11) gives

$$|\Delta s| = (1 - v^2)^{1/2}\,|\Delta x|. \tag{1.12}$$

The metric postulate asserts that the proper distance between spacelike separated events is less than an inertial frame distance: |∆s| < |∆x|. (This is not length contraction, which we discuss in Appendix 6.)

Connections. We have just seen that the physical meaning of the metric postulate is different for lightlike, timelike, and spacelike separated events. However, the meanings are connected: the physical meaning for lightlike separated events (a universal light speed) implies the physical meanings for timelike and spacelike separated events. We now prove this for the timelike case. The proof is quite instructive. The spacelike case is less so; it is relegated to Appendix 5.

Fig. 1.8: $\Delta s^2 = \Delta t^2 - \Delta x^2$ for timelike separated events. e and f are the points in I at which events E and F occur.

By definition, an inertial clock C can move between timelike separated events E and F, and ∆s is the time C measures between the events. Let C carry a rod R perpendicular to its direction of motion. Let R have a mirror M on its end. At E a light pulse is sent along R from C. The length of R is chosen so that the pulse is reflected by M back to F. Fig. 1.8 shows the path of the light in I, together with C, R, and M as the light reflects off M.

Refer to the rightmost triangle in Fig. 1.8. In I, the distance between E and F is ∆x. This gives the labeling $\frac{1}{2}\Delta x$ of the base of the triangle. In I, the light takes the time ∆t from E to M to F. Since c = 1 in I, the light travels a distance ∆t in I. This gives the labeling $\frac{1}{2}\Delta t$ of the hypotenuse. C is at rest in some inertial frame I′. In I′, the light travels the length of the rod twice in the proper time ∆s between E and F measured by C. Since c = 1 in I′, the length of the rod is $\frac{1}{2}\Delta s$ in I′. This gives the labeling $\frac{1}{2}\Delta s$ of the altitude of the triangle. (There is a tacit assumption here that the length of R is the same in I and I′. Appendix 6 discusses this.) Applying the Pythagorean theorem to the triangle shows that Eq. (1.10) is satisfied for timelike separated events.

In short, since the light travels farther in I than in I′ (the hypotenuse twice vs. the altitude twice) and the speed c = 1 is the same in I and I′, the time (= distance/speed) between E and F is longer in I than for C. This shows, in a most graphic way, that accepting a universal light speed forces us to abandon a universal time.

Looking at it another way, we see how it is possible for a single pulse of light to have the same speed in inertial frames moving with respect to each other: the speed (= distance/time) can be the same because the distance and the time are different in the two frames.
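The triangle argument can be checked with numbers. The sketch below, in units with c = 1, recovers the proper time from the hypotenuse and base of the triangle and compares it with the time dilation formula ∆s = (1 − v²)^{1/2}∆t; the function name and the sample values ∆t = 5, ∆x = 3 are assumptions chosen for illustration.

```python
import math

def proper_time_light_clock(dt, dx):
    """Twice the altitude of the right triangle with hypotenuse dt/2 and base dx/2."""
    hypotenuse = dt / 2.0   # half the light path in frame I (c = 1)
    base = dx / 2.0         # half the distance between E and F in I
    altitude = math.sqrt(hypotenuse**2 - base**2)  # rod length, equal to ds/2
    return 2.0 * altitude

dt, dx = 5.0, 3.0           # assumed sample values; timelike because dt > dx
ds = proper_time_light_clock(dt, dx)
v = dx / dt                 # speed of the clock C in I
print(ds, dt * math.sqrt(1.0 - v * v))
```

Both printed numbers agree, and both are smaller than ∆t, as the argument requires.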

Exercise 1.7. Criticize the following argument. We have just seen that the time between two events is greater in I than in I′. But exactly the same argument carried out in I′ will show that the time between the events is greater in I′ than in I. This is a contradiction.

Local Forms. The metric postulate for a planar frame, Eq. (1.8), gives only the distance along a straight line between two points. The differential version of Eq. (1.8) gives the distance ds between neighboring points along any curve:

The Metric Postulate for a Flat Surface, Local Form

Let P and Q be neighboring points. Let ds be the distance between them. Let the points have coordinate differences (dx, dy) in a planar frame. Then

$ds^2 = dx^2 + dy^2$.


Thus, if a curve is parameterized (x(p), y(p)), a ≤ p ≤ b, then

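$ds^2 = \left[\left(\frac{dx}{dp}\right)^2 + \left(\frac{dy}{dp}\right)^2\right] dp^2$,

and the length of the curve is

$s = \int_{p=a}^{b} ds = \int_{p=a}^{b} \left[\left(\frac{dx}{dp}\right)^2 + \left(\frac{dy}{dp}\right)^2\right]^{1/2} dp$.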

Of course, different curves between two points can have different lengths.

Exercise 1.8. Calculate the circumference of the circle x = r cos θ, y = r sin θ, 0 ≤ θ ≤ 2π.
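The length integral can also be approximated numerically by adding up the lengths ds of many short chords. The sketch below compares two curves joining (0, 0) to (1, 1), a straight line and a parabola; the helper `curve_length`, the choice of curves, and the number of steps are illustrative assumptions.

```python
import math

def curve_length(x, y, a, b, n=100_000):
    """Approximate the length of (x(p), y(p)), a <= p <= b, by summing n chord lengths."""
    total = 0.0
    px, py = x(a), y(a)
    for k in range(1, n + 1):
        p = a + (b - a) * k / n
        qx, qy = x(p), y(p)
        total += math.hypot(qx - px, qy - py)   # ds for one short step
        px, py = qx, qy
    return total

# Two curves between the same points (0, 0) and (1, 1).
straight = curve_length(lambda p: p, lambda p: p, 0.0, 1.0)
parabola = curve_length(lambda p: p, lambda p: p * p, 0.0, 1.0)
print(straight, parabola)
```

The straight line comes out shorter than the parabola, illustrating that different curves between two points can have different lengths.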

The metric postulate for an inertial frame Eq. (1.9) is concerned only with times measured by inertial clocks. The differential version of Eq. (1.9) gives the time ds measured by any clock between neighboring events on its worldline:

The Metric Postulate for a Flat Spacetime, Local Form

Let E and F be neighboring events. If E and F are lightlike separated, let ds = 0. If the events are timelike separated, let ds be the time between them as measured by any (inertial or noninertial) clock. Let the events have coordinate differences (dt, dx, dy, dz) in an inertial frame. Then

$ds^2 = dt^2 - dx^2 - dy^2 - dz^2$. (1.13)

From Eq. (1.13), if the worldline of a clock is parameterized

(t(p), x(p), y(p), z(p)), a ≤ p ≤ b,

then the time s to traverse the worldline, as measured by the clock, is

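$s = \int_{p=a}^{b} ds = \int_{p=a}^{b} \left[\left(\frac{dt}{dp}\right)^2 - \left(\frac{dx}{dp}\right)^2 - \left(\frac{dy}{dp}\right)^2 - \left(\frac{dz}{dp}\right)^2\right]^{1/2} dp.$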

In general, clocks on different worldlines between two events will measure different times between the events.
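As a numerical illustration, the proper time integral can be approximated by summing ds over many short steps of a worldline. The sketch below uses one space dimension and units with c = 1, and compares a clock at rest at x = 0 with a clock that oscillates away from and back to x = 0 between the same two events; the worldline x = 0.4 sin p and the helper `proper_time` are assumptions made for this example.

```python
import math

def proper_time(t, x, a, b, n=100_000):
    """Approximate the clock time along the worldline (t(p), x(p)), a <= p <= b."""
    total = 0.0
    tp, xp = t(a), x(a)
    for k in range(1, n + 1):
        p = a + (b - a) * k / n
        tq, xq = t(p), x(p)
        ds2 = (tq - tp) ** 2 - (xq - xp) ** 2   # ds^2 = dt^2 - dx^2 for one step
        if ds2 < 0.0:
            raise ValueError("worldline is not timelike on this step")
        total += math.sqrt(ds2)
        tp, xp = tq, xq
    return total

end = 2.0 * math.pi
# Both worldlines run from the event (t, x) = (0, 0) to the event (2*pi, 0).
at_rest = proper_time(lambda p: p, lambda p: 0.0, 0.0, end)
moving = proper_time(lambda p: p, lambda p: 0.4 * math.sin(p), 0.0, end)
print(at_rest, moving)
```

The moving (noninertial) clock records less time than the clock at rest, even though both are present at the same starting and ending events.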

Exercise 1.9. Let a clock move between two events with a time difference ∆t. Let v be the small constant speed of the clock. Show that $\Delta t - \Delta s \approx \tfrac{1}{2} v^2 \Delta t$.

Exercise 1.10. Consider a simplified Hafele-Keating experiment. One clock remains on the ground and the other circles the equator in an airplane to the west – opposite to the Earth’s rotation. Assume that the Earth is spinning on its axis at one revolution per 24 hours in an inertial frame. (Thus the clock on the ground is not at rest.) Notation: ∆t is the duration of the trip in the inertial frame, $v_r$ is the velocity of the clock remaining on the ground, and $\Delta s_r$ is the time it measures for the trip. The quantities $v_a$ and $\Delta s_a$ are defined similarly for the airplane.

Use Exercise 1.9 for each clock to show that the difference between the clocks due to time dilation is $\Delta s_a - \Delta s_r = \tfrac{1}{2}(v_r^2 - v_a^2)\Delta t$. (The airplane flies against the Earth’s rotation, so $v_a < v_r$ in the inertial frame and the difference is positive.) Suppose that ∆t = 40 hours and the speed of the airplane with respect to the ground is 1000 km/hr. Substitute values to obtain $\Delta s_a - \Delta s_r = 1.4 \times 10^{-7}$ s.
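The arithmetic of this exercise can be sketched as follows. Each clock's lag ∆t − ∆s is computed from Exercise 1.9 and the two lags are subtracted. The equatorial circumference of 40,075 km is an assumed value not given in the text, and the names are illustrative.

```python
C_KM_PER_S = 299_792.458     # speed of light in km/sec
EQUATOR_KM = 40_075.0        # assumed equatorial circumference of the Earth
TRIP_HOURS = 40.0            # duration of the trip, from the exercise
PLANE_KM_PER_HR = 1000.0     # airplane speed relative to the ground (westward)

def as_fraction_of_c(km_per_hr):
    """Convert a speed in km/hr to units with c = 1."""
    return km_per_hr / 3600.0 / C_KM_PER_S

def dilation_lag(v, dt):
    """dt - ds for a small constant speed v (Exercise 1.9): about v^2 dt / 2."""
    return 0.5 * v * v * dt

dt = TRIP_HOURS * 3600.0                                   # seconds
v_r = as_fraction_of_c(EQUATOR_KM / 24.0)                  # ground clock, eastward
v_a = as_fraction_of_c(EQUATOR_KM / 24.0 - PLANE_KM_PER_HR)  # airplane clock
# ds_a - ds_r = (dt - lag_a) - (dt - lag_r) = lag_r - lag_a
difference = dilation_lag(v_r, dt) - dilation_lag(v_a, dt)
print(f"{difference:.2e} s")
```

With these inputs the airplane clock ends up ahead of the ground clock by roughly the amount quoted in the exercise.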

Experimental Evidence. Because general relativistic effects play a part in the Hafele-Keating experiment (see Exercise 2.1), and because the uncertainty of the experiment is large (±10%), this experiment is not a precision test of time dilation for clocks. Much better evidence comes from observations of subatomic particles called muons. When at rest the average lifetime of a muon is $3 \times 10^{-6}$ sec. According to the differential version of Eq. (1.11), if the muon is moving in a circle with constant speed v, then its average life, as measured in the laboratory, should be larger by a factor $(1 - v^2)^{-1/2}$. An experiment performed in 1977 showed this within an experimental error of .2%. In the experiment v = .9994, giving a time dilation factor $\Delta t/\Delta s = (1 - v^2)^{-1/2} = 29$! The circular motion had an acceleration of $10^{21}$ cm/sec², and so this is a test of the local form Eq. (1.13) of the metric postulate as well as the original form Eq. (1.9).
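The muon numbers are easy to reproduce. This short sketch uses only the speed and the rest lifetime quoted above; the variable names are illustrative.

```python
import math

v = 0.9994                  # muon speed in the experiment (c = 1)
rest_lifetime = 3e-6        # sec, the rest value quoted in the text

factor = 1.0 / math.sqrt(1.0 - v * v)     # dt/ds = (1 - v^2)^(-1/2)
lab_lifetime = factor * rest_lifetime     # average life measured in the laboratory
print(f"factor = {factor:.2f}, laboratory lifetime = {lab_lifetime:.2e} sec")
```

The factor rounds to the value 29 stated in the text.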

There is excellent evidence for a universal light speed. First of all, realize that if clocks at P and Q are synchronized according to the definition Eq. (1.2), then the speed of light from P to Q is equal to the speed from Q to P. We emphasize that with our definition of synchronized clocks this equality is a matter of definition which can be neither confirmed nor refuted by experiment.

The speed c of light can be measured by sending a pulse of light from a point P to a mirror at a point Q at distance D and measuring the elapsed time 2T for it to return. Then c = 2D/2T; c is a two way speed, measured with a single clock. Equation 1.5 shows that this two way speed is equal to the one way speed from P to Q. Thus the one way speed of light can be measured by measuring the two way speed.

In a famous experiment performed in 1887, A. A. Michelson and E. W. Morley compared the two way speed of light in perpendicular directions from a given point. Their experiment has been repeated many times, most accurately by G. Joos in 1930, who found that any difference in the two way speeds is less than six parts in $10^{12}$. The Michelson-Morley experiment is described in Appendix 7. A modern version of the experiment using lasers was performed in 1979 by A. Brillet and J. L. Hall. They found that any difference in the two way speed of light in perpendicular directions is less than four parts in $10^{15}$. See Appendix 7.

Another experiment, performed by R. J. Kennedy and E. M. Thorndike in 1932, found the two way speed of light to be the same, within six parts in $10^{9}$, on days six months apart, during which time the Earth moved to the opposite side of its orbit. See Appendix 7. Inertial frames in which the Earth is at rest on days six months apart move with a relative speed of 60 km/sec (twice the Earth’s orbital speed). A more recent experiment by D. Hils and Hall improved the result by over two orders of magnitude.

These experiments provide good evidence that the two way speed of light is the same in different directions, places, and inertial frames and at different times. They thus provide strong motivation for our definition of synchronized clocks: If the two way speed of light has always the same value, what could be more natural than to define synchronized clocks by requiring that the one way speed have this value?

In all of the above experiments, the source of the light is at rest in the inertial frame in which the light speed is measured. If light were like baseballs, then the speed of a moving source would be imparted to the speed of light it emits. Strong evidence that this is not so comes from observations of certain neutron stars which are members of a binary star system and which emit X-ray pulses at regular intervals. These systems are described in Sec. 3.1. If the speed of the star were imparted to the speed of the X-rays, then various strange effects would be observed. For example, X-rays emitted when the neutron star is moving toward the Earth could catch up with those emitted earlier when it was moving away from the Earth, and the neutron star could be seen coming and going at the same time! See Fig. 1.9.

Fig. 1.9: The speed of light is independent of the speed of its source.

This does not happen; an analysis of the arrival times of the pulses at Earth made in 1977 by K. Brecher shows that no more than two parts in $10^{9}$ of the speed of the source is added to the speed of the X-rays. (It is not possible to “see” the neutron star in orbit around its companion directly. The speed of the neutron star toward or away from the Earth can be determined from the Doppler redshift of the time between pulses. See Exercise 1.6.)

Finally, recall from above that the universal light speed part of the metric postulate implies the parts about timelike and spacelike separated events. Thus the evidence for a universal light speed is also evidence for the other two parts.

The universal nature of the speed of light makes possible the modern definition of the unit of length: “The meter is the length of the path traveled by light during the time interval of 1/299,792,458 of a second.” Thus, by definition, the speed of light is 299,792,458 m/sec.

1.4 The Geodesic Postulate


We will find it convenient to use superscripts to distinguish coordinates. Thus we use $(x^1, x^2)$ instead of (x, y) for planar frame coordinates.

Fig. 1.10: A geodesic in a planar frame.

The line in Fig. 1.10 can be parameterized by the (proper) distance s from $(b^1, b^2)$ to $(x^1, x^2)$: $x^1(s) = (\cos\theta)s + b^1$, $x^2(s) = (\sin\theta)s + b^2$. Differentiate twice with respect to s to obtain

The Geodesic Postulate for a Flat Surface

Parameterize a straight line with arclength s. Then in every planar frame

$\ddot{x}^i = 0$, i = 1, 2. (1.14)

The straight lines are called geodesics.

Not all parameterizations of a straight line satisfy the geodesic differential equations Eq. (1.14). Example: $x^i(p) = a^i p^3 + b^i$.
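The difference between the two parameterizations can be seen numerically by estimating second derivatives with a central difference. In the sketch below the angle θ = 0.3 and the constants a = 2, b = 1 are assumed sample values, and `second_derivative` is a hypothetical helper.

```python
import math

def second_derivative(f, p, h=1e-3):
    """Central-difference estimate of the second derivative of f at p."""
    return (f(p + h) - 2.0 * f(p) + f(p - h)) / (h * h)

theta, b1 = 0.3, 1.0                                 # assumed sample values
by_arclength = lambda s: math.cos(theta) * s + b1    # x^1(s), arclength parameter
by_cube = lambda p: 2.0 * p**3 + 1.0                 # x^1(p) = a p^3 + b with a = 2, b = 1

print(second_derivative(by_arclength, 2.0))   # close to 0: Eq. (1.14) holds
print(second_derivative(by_cube, 1.0))        # close to 6 a p = 12: Eq. (1.14) fails
```

Both functions trace out a straight line, but only the arclength parameterization has vanishing second derivative.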

It is convenient to use $(x^0, x^1, x^2, x^3)$ instead of (t, x, y, z) for inertial frame coordinates. Our third postulate for special relativity says that inertial particles and light pulses move in a straight line at constant speed in an inertial frame, i.e., their equations of motion are

$x^i = a^i x^0 + b^i$, i = 1, 2, 3. (1.15)

(Differentiate to give $dx^i/dx^0 = a^i$; the velocity is constant.) For inertial particles this is called Newton’s first law.

Set $x^0 = p$, a parameter; $a^0 = 1$; and $b^0 = 0$, and find that worldlines of inertial particles and light can be parameterized

$x^i(p) = a^i p + b^i$, i = 0, 1, 2, 3 (1.16)

in an inertial frame. Eq. (1.16), unlike Eq. (1.15), is symmetric in all four coordinates of the inertial frame. Also, Eq. (1.16) shows that the worldline is a straight line in the spacetime. Thus “straight in spacetime” includes both “straight in space” and “straight in time” (constant speed). See Exercise 1.1. The worldlines are called geodesics.

Exercise 1.11. In Eq. (1.16) the parameter $p = x^0$. Show that the worldline of an inertial particle can also be parameterized with s, the proper time along the worldline.

The Geodesic Postulate for a Flat Spacetime

Worldlines of inertial particles and pulses of light can be parameterized with a parameter p so that in every inertial frame

$\ddot{x}^i = 0$, i = 0, 1, 2, 3. (1.17)

For inertial particles we may take p = s.

The geodesic postulate is a mathematical expression of our physical assertion that in an inertial frame inertial particles and light move in a straight line at constant speed.

Exercise 1.12. Make as long a list as you can of analogous properties of flat surfaces and flat spacetimes.

Chapter 2

Curved Spacetimes

2.1 History of Theories of Gravity

Recall the analogy from Chapter 1: A curved spacetime is to a flat spacetime as a curved surface is to a flat surface. We explored the analogy between a flat surface and a flat spacetime in Chapter 1. In this chapter we generalize from flat surfaces and flat spacetimes (spacetimes without significant gravity) to curved surfaces and curved spacetimes (spacetimes with significant gravity). General relativity interprets gravity as a curvature of spacetime.

Fig. 2.1: The position of a planet ( ) with respect to the stars changes nightly.

Before embarking on a study of gravity in general relativity let us review, very briefly, the history of theories of gravity. These theories played a central role in the rise of science. Theories of gravity have their roots in attempts of the ancients to predict the motion of the planets. The position of a planet with respect to the stars changes from night to night, sometimes exhibiting a “loop” motion, as in Fig. 2.1. The word “planet” is from the Greek “planetai”: wanderers.

In the second century, Claudius Ptolemy devised a scheme to explain these motions. Ptolemy placed the Earth near the center of the universe with a planet moving on a small circle, called an epicycle, while the center of the epicycle moves (not at a uniform speed) on a larger circle, the deferent, around the Earth. See Fig. 2.2. By appropriately choosing the radii of the epicycle and deferent, as well as the speeds involved, Ptolemy was able to reproduce, with fair accuracy, the motions of the planets. This remarkable but cumbersome theory was accepted for over 1000 years.

In 1543 Nicholas Copernicus published a theory which replaced Ptolemy’s geocentric (Earth centered) system with a heliocentric (sun centered) system. Copernicus retained the system of uniform circular motion with deferents and epicycles for the planets. With his system, but not Ptolemy’s, the relative distances between the planets and the sun can be determined.

Fig. 2.2: Ptolemy’s theory of planetary motion.

In 1609 Johannes Kepler published a theory which is more accurate than Copernicus’: the path of a planet is an ellipse with the Sun at one focus. At about the same time Galileo Galilei was investigating the acceleration of objects near the Earth’s surface. He found two things of interest for us: the acceleration is constant in time and independent of the mass and composition of the falling object.

In 1687 Isaac Newton published a theory of gravity which explained Kepler’s astronomical and Galileo’s terrestrial findings as manifestations of the same phenomenon: gravity. To understand how orbital motion is related to falling motion, refer to Fig. 2.3. The curves A, B, C are the paths of objects leaving the top of a tower with greater and greater horizontal velocities. They hit the ground farther and farther from the bottom of the tower until C when the object goes into orbit.

Fig. 2.3: Falling and orbital motion are the same.

Mathematically, Newton’s theory says that a planet in the Sun’s gravity or an apple in the Earth’s gravity is attracted by the central body (do not ask how!), causing an acceleration

$a = -\frac{\kappa M}{r^2}$, (2.1)

where κ is the Newtonian gravitational constant, M is the mass of the central body, and r is the distance to the center of the central body. Eq. (2.1) implies that the planets orbit the Sun in ellipses, in accord with Kepler’s findings. See Appendix 8. By taking the distance r to the Earth’s center to be sensibly constant near the Earth’s surface, we see that Eq. (2.1) is also in accord with Galileo’s findings: a is constant in time and is independent of the mass and composition of the falling object.
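As a numerical illustration of Eq. (2.1), the sketch below evaluates the acceleration at the Earth's surface and 10 km above it. The values of κ, the Earth's mass, and the Earth's mean radius are standard reference values supplied here as assumptions; they do not appear in the text.

```python
KAPPA = 6.674e-11      # Newtonian gravitational constant, m^3 kg^-1 s^-2 (assumed value)
M_EARTH = 5.972e24     # mass of the Earth, kg (assumed value)
R_EARTH = 6.371e6      # mean radius of the Earth, m (assumed value)

def acceleration(r, mass=M_EARTH):
    """Eq. (2.1): a = -kappa M / r^2; the minus sign means 'toward the center'."""
    return -KAPPA * mass / r**2

surface = acceleration(R_EARTH)
at_10_km = acceleration(R_EARTH + 1.0e4)
print(f"surface: {surface:.3f} m/s^2, 10 km up: {at_10_km:.3f} m/s^2")
print(f"ratio: {at_10_km / surface:.4f}")
```

The two values differ by only a fraction of a percent, which is the sense in which r is "sensibly constant" near the Earth's surface. Note also that the mass of the falling object does not enter the formula.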

Newton’s theory has enjoyed enormous success. A spectacular example occurred in 1846. Observations of the position of the planet Uranus disagreed with the predictions of Newton’s theory of gravity, even after taking into account the gravitational effects of the other known planets. The discrepancy was about 4 arcminutes – 1/8th of the angular diameter of the moon. U. Le Verrier, a French astronomer, calculated that a new planet, beyond Uranus, could account for the discrepancy. He wrote J. Galle, an astronomer at the Berlin observatory, telling him where the new planet should be – and Neptune was discovered! It was within 1 arcdegree of Le Verrier’s prediction.

Even today, calculations of spacecraft trajectories are made using Newton’s theory. The incredible accuracy of his theory will be examined further in Sec. 3.3.

Nevertheless, Einstein rejected Newton’s theory because it is based on pre-relativity ideas about time and space which, as we have seen, are not correct. For example, the acceleration in Eq. (2.1) is instantaneous with respect to a universal time.

2.2 The Key to General Relativity


A curved surface is different from a flat surface. However, a simple observation by the nineteenth century mathematician Karl Friedrich Gauss provides the key to the construction of the theory of surfaces: a small region of a curved surface is very much like a small region of a flat surface. This is familiar: a small region of a (perfectly) spherical Earth appears flat. This is why it took so long to discover that it is not flat. On an apple, a smaller region must be chosen before it appears flat.

In the next three sections we shall formalize Gauss’ observation by taking the three postulates for flat surfaces from Chapter 1, restricting them to small regions, and then using them as postulates for curved surfaces.

We shall see that a curved spacetime is different from a flat spacetime. However, a simple observation of Einstein provides the key to the construction of general relativity: a small region of a curved spacetime is very much like a small region of a flat spacetime. To understand this, we must extend the concept of an inertial object to curved spacetimes.

Passengers in an airplane at rest on the ground or flying in a straight line at constant speed feel gravity, which feels very much like an inertial force. Accelerometers in the airplane respond to the gravity. On the other hand, astronauts in orbit or falling radially toward the Earth feel no inertial forces, even though they are not moving in a straight line at constant speed with respect to the Earth. And an accelerometer carried by the astronauts will register zero. We shall include gravity as an inertial force and, as in special relativity, call an object inertial if it experiences no inertial forces. An inertial object in gravity is in free fall. Forces other than gravity do not act on it.

We now rephrase Einstein’s key observation: as viewed by inertial observers, a small region of a curved spacetime is very much like a small region of a flat spacetime. We see this vividly in motion pictures of astronauts in orbit. No gravity is apparent in their cabin: Objects suspended at rest remain at rest. Inertial objects in the cabin move in a straight line at constant speed, just as in a flat spacetime. Newton’s theory predicts this: according to Eq. (2.1) an inertial object and the cabin accelerate the same with respect to the Earth and so they do not accelerate with respect to each other.

In the next three sections we shall formalize Einstein’s observation by taking our three postulates for flat spacetimes, restricting them to small spacetime regions, and then using them as our first three (of four) postulates for curved spacetimes. The local inertial frame postulate asserts the existence of small inertial cubical lattices with synchronized clocks to serve as coordinate systems in small regions of a curved spacetime. The metric postulate asserts a universal light speed and a slowing of moving clocks in local inertial frames. The geodesic postulate asserts that inertial particles and light move in a straight line at constant speed in local inertial frames. We first discuss some experimental evidence for the postulates.


Experiments of R. Dicke and of V. B. Braginsky, performed in the 1960’s, verify to extraordinary accuracy Galileo’s finding incorporated into Newton’s theory: the acceleration of a free falling object in gravity is independent of its mass and composition. (See Sec. 2.1.) We may reformulate this in the language of spacetimes: the worldline of an inertial object in a curved spacetime is independent of its mass and composition. The geodesic postulate will incorporate this by not referring to the mass or composition of the inertial objects whose worldlines it describes.

Fig. 2.4: Masses A and B accelerate the same toward the Sun.

Dicke and Braginsky used the Sun’s gravity. We can understand the principle of their experiments from the simplified diagram in Fig. 2.4. The weights A and B, supported by a quartz fiber, are, with the Earth, in free fall around the Sun. Dicke and Braginsky used various substances with various properties for the weights. Any difference in their acceleration toward the Sun would cause a twisting of the fiber. Due to the Earth’s rotation, the twisting would be in the opposite direction twelve hours later. The apparatus had a resonant period of oscillation of 24 hours so that oscillations could build up. In Braginsky’s experiment the difference in the acceleration of the weights toward the Sun was no more than one part in $10^{12}$ of their mutual acceleration toward the Sun. A planned satellite test (STEP) will test the equality of accelerations to one part in $10^{18}$.

A related experiment shows that the Earth and the Moon, despite the huge difference in their masses, accelerate the same in the Sun’s gravity. If this were not so, then there would be unexpected changes in the Earth-Moon distance. Changes in this distance can be measured within 2 cm (!) by timing the return of a laser pulse sent from Earth to mirrors on the Moon left by astronauts. This is part of the lunar laser experiment. The measurements show that the relative acceleration of the Earth and Moon is no more than a part in $10^{13}$ of their mutual acceleration toward the Sun. The experiment also shows that the Newtonian gravitational constant κ in Eq. (2.1) does not change by more than 1 part in $10^{12}$ per year. The constant also appears in Einstein’s field equation, Eq. (2.28).

There is another difference between the Earth and the Moon which might cause a difference in their acceleration toward the Sun. Imagine disassembling the Earth into small pieces and separating the pieces far apart. The separation requires energy input to counteract the gravitational attraction of the pieces. This energy is called the Earth’s gravitational binding energy. By Einstein’s principle of the equivalence of mass and energy ($E = mc^2$), this energy is equivalent to mass. Thus the separated pieces have more total mass than the Earth. The difference is small – 4.6 parts in $10^{10}$. But it is 23 times smaller for the Moon. One can wonder whether this difference between the Earth and Moon causes a difference in their acceleration toward the Sun. The lunar laser experiment shows that this does not happen. This is something that the Dicke and Braginsky experiments cannot test.

The last experiment we shall consider as evidence for the three postulates is the terrestrial redshift experiment. It was first performed by R. V. Pound and G. A. Rebka in 1960 and then more accurately by Pound and J. L. Snider in 1964. The experimenters put a source of gamma radiation at the bottom of a tower. Radiation received at the top of the tower was redshifted: $z = 2.5 \times 10^{-15}$, within an experimental error of about 1%. This is a gravitational redshift.

According to the discussion following Eq. (1.6), an observer at the top of the tower would see a clock at the bottom run slowly. Clocks at rest at different heights in the Earth’s gravity run at different rates! Part of the result of the Hafele-Keating experiment is due to this. See Exercise 2.1.

We showed in Sec. 1.3 that the assumption Eq. (1.4), necessary for synchronizing clocks at rest in the coordinate lattice of an inertial frame, is equivalent to a zero redshift between the clocks. This assumption fails for clocks at the top and bottom of the tower. Thus clocks at rest in a small coordinate lattice on the ground cannot be (exactly) synchronized.

We now show that the experiment provides evidence that clocks at rest in a small inertial lattice can be synchronized. In the experiment, the tower has (upward) acceleration g, the acceleration of Earth’s gravity, in a small inertial lattice falling radially toward Earth. We will show shortly that the same redshift would be observed with a tower having acceleration g in an inertial frame in a flat spacetime. This is another example of small regions of flat and curved spacetimes being alike. Thus it is reasonable to assume that there would be no redshift with a tower at rest in a small inertial lattice in gravity, just as with a tower at rest in an inertial frame. In this way, the experiment provides evidence that the condition Eq. (1.4), necessary for clock synchronization, is valid for clocks at rest in a small inertial lattice. Loosely speaking, we may say that since light behaves “properly” in a small inertial lattice, light accelerates the same as matter in gravity.

We now calculate the Doppler redshift for a tower with acceleration g in an inertial frame. Suppose the tower is momentarily at rest when gamma radiation is emitted. The radiation travels a distance h, the height of the tower, in the inertial frame. (We ignore the small distance the tower moves during the flight of the radiation. We shall also ignore the time dilation of clocks in the moving tower and the length contraction – see Appendix 6 – of the tower. These effects are far too small to be detected by the experiment.) Thus the radiation takes time t = h/c to reach the top of the tower. (For clarity we do not take c = 1.) In this time the tower acquires a speed v = gt = gh/c in the inertial frame. From Exercise 1.6, this speed causes a Doppler redshift

$z = \frac{v}{c} = \frac{gh}{c^2}$. (2.2)

In the experiment, h = 2250 cm. Substituting numerical values in Eq. (2.2) gives the value of z measured in the terrestrial redshift experiment; the gravitational redshift for towers accelerating in small inertial lattices near Earth is the same as the Doppler redshift for towers accelerating in inertial frames. Exercise 3.4 shows that a rigorous calculation in general relativity also gives Eq. (2.2).
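The substitution into Eq. (2.2) can be sketched as follows. The tower height is the value given in the text; the value g = 980 cm/sec² and the constant names are assumptions supplied for the example.

```python
G_CM_PER_S2 = 980.0          # acceleration of Earth's gravity, cm/sec^2 (assumed value)
H_CM = 2250.0                # height of the tower, from the text
C_CM_PER_S = 2.99792458e10   # speed of light, cm/sec

z = G_CM_PER_S2 * H_CM / C_CM_PER_S**2   # Eq. (2.2): z = gh / c^2
print(f"z = {z:.2e}")
```

The result agrees with the measured redshift quoted above to within rounding.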

Exercise 2.1. Let h be the height at which the airplane flies in the simplified Hafele-Keating experiment of Exercise 1.10. Show that the difference between the clocks due to the gravitational redshift is

$\Delta s_a - \Delta s_r = gh\,\Delta t$.

Suppose that h = 10 km. Substitute values to obtain $\Delta s_a - \Delta s_r = 1.6 \times 10^{-7}$ sec.

Adding this to the time dilation difference of Exercise 1.10 gives a difference of $3.0 \times 10^{-7}$ sec. Exercise 3.3 shows that a rigorous calculation in general relativity gives the same result.
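A short numerical sketch of this exercise follows. The formula gh∆t is written with c = 1, so in SI units it is divided by c². The value g = 9.8 m/sec² is an assumption, and the time dilation contribution is simply the figure quoted in Exercise 1.10.

```python
C_M_PER_S = 299_792_458.0    # speed of light, m/sec
G_M_PER_S2 = 9.8             # acceleration of Earth's gravity, m/sec^2 (assumed value)
H_M = 1.0e4                  # flight height h = 10 km, from the exercise
DT_S = 40.0 * 3600.0         # trip duration of 40 hours, from Exercise 1.10

gravitational = G_M_PER_S2 * H_M * DT_S / C_M_PER_S**2   # g h dt with c restored
time_dilation = 1.4e-7       # sec, the result quoted in Exercise 1.10
total = gravitational + time_dilation
print(f"gravitational: {gravitational:.2e} sec, total: {total:.2e} sec")
```

The two contributions are of comparable size, which is why the Hafele-Keating experiment cannot be read as a test of time dilation alone.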

The satellites of the Global Positioning System (GPS) carry atomic clocks. Time dilation and gravitational redshifts must be taken into account for the system to function properly. In fact, the effects are 10,000 times too large to be ignored.

2.3 The Local Inertial Frame Postulate


Fig. 2.5: A local planar frame.

Suppose that curved surface dwellers attempt to construct a square coordinate grid using rigid rods constrained (of course) to their surface. If the rods are short enough, then at first they will fit together well. But, owing to the curvature of the surface, as the grid gets larger the rods must be forced a bit to connect them. This will cause stresses in the lattice and it will not be quite square. Surface dwellers call a small (nearly) square coordinate grid a local planar frame at P, where P is the point at the origin of the grid. In smaller regions around P, the grid must become more square. See Fig. 2.5.

The Local Planar Frame Postulate for a Curved Surface

A local planar frame can be constructed at any point P of a curved surface with any orientation.

Fig. 2.6: Spherical coordinates (φ, θ) on a sphere.

We shall see that local planar frames at P provide surface dwellers with an intuitive description of properties of a curved surface at P. However, in order to study the surface as a whole, they need global coordinates, defined over the entire surface. There are, in general, no natural global coordinate systems to single out in a curved surface as planar frames are singled out in a flat surface. Thus they attach global coordinates $(y^1, y^2)$ in an arbitrary manner. The only restrictions are that different points must have different coordinates and nearby points must receive nearby coordinates. In general, the coordinates will not have a geometric meaning; they merely serve to label the points of the surface.

One common way for us (but not surface dwellers) to attach global coordinates to a curved surface is to parameterize it in three dimensional space:

$x = x(y^1, y^2)$, $y = y(y^1, y^2)$, $z = z(y^1, y^2)$. (2.3)

As $(y^1, y^2)$ varies, (x, y, z) varies on the surface. Assign coordinates $(y^1, y^2)$ to the point (x, y, z) on the surface given by Eq. (2.3). For example, Fig. 2.6 shows spherical coordinates $(y^1, y^2) = (\phi, \theta)$ on a sphere of radius R. The coordinates are assigned by the usual parameterization

$x = R \sin\phi \cos\theta$, $y = R \sin\phi \sin\theta$, $z = R \cos\phi$. (2.4)
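The parameterization Eq. (2.4) translates directly into code. In this sketch the function name `sphere_point` and the sample coordinate values are illustrative assumptions; the check confirms that the labeled point lies on the sphere of radius R.

```python
import math

def sphere_point(phi, theta, R=1.0):
    """Eq. (2.4): the point (x, y, z) labeled by the coordinates (phi, theta)."""
    x = R * math.sin(phi) * math.cos(theta)
    y = R * math.sin(phi) * math.sin(theta)
    z = R * math.cos(phi)
    return (x, y, z)

x, y, z = sphere_point(0.7, 2.1, R=3.0)   # assumed sample coordinates
print(x * x + y * y + z * z)              # close to R^2 = 9
print(sphere_point(0.0, 1.0, R=2.0))      # phi = 0 is the top of the sphere
```

Note that at φ = 0 every value of θ labels the same point, so these coordinates do not quite meet the requirement that different points receive different coordinates there.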



The Global Coordinate Postulate for a Curved Surface

The points of a curved surface can be labeled with coordinates $(y^1, y^2)$.

(Technically, the postulate should state that a curved surface is a two dimensional manifold. The statement given will suffice for us.)

In the last section we saw that inertial objects in an astronaut’s cabin behave as if no gravity were present. Actually, they will not behave exactly as if no gravity were present. To see this, assume for simplicity that their cabin is falling radially toward Earth. Inertial objects in the cabin do not accelerate exactly the same with respect to the Earth because they are at slightly different distances and directions from the Earth’s center. See Fig. 2.7. Thus, an object initially at rest near the top of the cabin will slowly separate from one initially at rest near the bottom. In addition, two objects initially at rest at the same height will slowly move toward each other as they both fall toward the center of the Earth. These changes in velocity are called tidal accelerations. (Why?) They are caused by small differences in the Earth’s gravity at different places in the cabin. They become smaller in smaller regions of space and time, i.e., in smaller regions of spacetime.
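The size of the radial tidal acceleration can be estimated from Newton's Eq. (2.1). The sketch below takes a cabin at an assumed distance of 6771 km from the Earth's center (about 400 km altitude) and an assumed value of κM for the Earth; neither number comes from the text, and the helper names are hypothetical.

```python
KAPPA_M = 3.986e14     # kappa * M for the Earth, m^3/sec^2 (assumed value)
R_CABIN = 6.771e6      # distance of the cabin floor from the Earth's center, m (assumed)

def acceleration(r):
    """Eq. (2.1) for the Earth: a = -kappa M / r^2."""
    return -KAPPA_M / r**2

def radial_tidal_acceleration(r, d):
    """Acceleration of an object at r + d relative to an object at r."""
    return acceleration(r + d) - acceleration(r)

for d in (2.0, 1.0, 0.5):   # vertical separations in meters
    print(f"d = {d} m: {radial_tidal_acceleration(R_CABIN, d):.2e} m/sec^2")
```

The relative acceleration is positive (the upper object falls slightly less rapidly, so the two separate), it is millions of times smaller than the acceleration of either object, and it shrinks in proportion to the separation.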

Suppose that astronauts in a curved spacetime at-

Fig. 2.7: Tidalaccelerations ina radially freefalling cabin.

tempt to construct an inertial cubical lattice using rigidrods. If the rods are short enough, then at first they willfit together well. But as the grid gets larger, the latticewill have to resist tidal accelerations, and the rods can-not all be inertial. This will cause stresses in the latticeand it will not be quite cubical.

In the last section we saw that the terrestrial redshift experiment provides evidence that clocks in a small inertial lattice can be synchronized. Actually, due to small differences in the gravity at different places in the lattice, an attempt to synchronize the clocks with the one at the origin with the procedure of Sec. 1.3 will not quite work. However, we can hope that the procedure will work with as small an error as desired by restricting the lattice to a small enough region of a spacetime.

A small (nearly) cubical lattice with (nearly) synchronized clocks is called a local inertial frame at E, where E is the event at the origin of the lattice when the clock there reads zero. In smaller regions around E, the lattice is more cubical and the clocks are more nearly synchronized. Local inertial frames are in free fall.

The Local Inertial Frame Postulate for a Curved Spacetime

A local inertial frame can be constructed at any event E of a curved spacetime, with any orientation, and with any inertial object at E instantaneously at rest in it.



We shall find that local inertial frames at E provide an intuitive description of properties of a curved spacetime at E. However, in order to study a curved spacetime as a whole, we need global coordinates, defined over the entire spacetime. There are, in general, no natural global coordinates to single out in a curved spacetime, as inertial frames were singled out in a flat spacetime. Thus we attach global coordinates in an arbitrary manner. The only restrictions are that different events must receive different coordinates and nearby events must receive nearby coordinates. In general, the coordinates will not have a physical meaning; they merely serve to label the events of the spacetime.

Often one of the coordinates is a “time” coordinate and the other three are “space” coordinates, but this is not necessary. For example, in a flat spacetime it is sometimes useful to replace the coordinates $(t, x)$ with $(t + x, t - x)$.

The Global Coordinate Postulate for a Curved Spacetime

The events of a curved spacetime can be labeled with coordinates $(y^0, y^1, y^2, y^3)$.

In the next two sections we give the metric and geodesic postulates of general relativity. We first express the postulates in local inertial frames. This local form of the postulates gives them the same physical meaning as in special relativity. We then translate the postulates to global coordinates. This global form of the postulates is unintuitive and complicated but is necessary to carry out calculations in the theory.

We can use arbitrary global coordinates in flat as well as curved spacetimes. We can then put the metric and geodesic postulates of special relativity in the same global form that we shall obtain for these postulates for curved spacetimes. We do not usually use arbitrary coordinates in flat spacetimes because inertial frames are so much easier to use. We do not have this luxury in curved spacetimes.

It is remarkable that we shall be able to describe curved spacetimes intrinsically, i.e., without describing them as curved in a higher dimensional flat space. Gauss created the mathematics necessary to describe curved surfaces intrinsically in 1827. G. B. Riemann generalized Gauss’ mathematics to curved spaces of higher dimension in 1854. His work was extended by several mathematicians. Thus the mathematics necessary to describe curved spacetimes intrinsically was waiting for Einstein when he needed it.




Fig. 2.5 shows that local planar frames provide curved surface dwellers with a convenient way to express infinitesimal distances on a curved surface.

The Metric Postulate for a Curved Surface, Local Form

Let point Q have coordinates $(dx^1, dx^2)$ in a local planar frame at P. Let ds be the distance between the points. Then

$$ds^2 = (dx^1)^2 + (dx^2)^2. \tag{2.5}$$

Even though the local planar frame extends a finite distance from P, Eq. (2.5) holds only for infinitesimal distances from P.

We now express Eq. (2.5) in terms of global coordinates. Set the matrix

$$f = (f_{mn}) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Then Eq. (2.5) can be written

$$ds^2 = \sum_{m,n=1}^{2} f_{mn}\, dx^m dx^n. \tag{2.6}$$

Henceforth we use the Einstein summation convention by which an index which appears twice in a term is summed without using a Σ. Thus Eq. (2.6) becomes

$$ds^2 = f_{mn}\, dx^m dx^n. \tag{2.7}$$

As another example of the summation convention, consider a function $f(y^1, y^2)$. Then we may write the differential $df = \sum_{i=1}^{2} (\partial f/\partial y^i)\, dy^i = (\partial f/\partial y^i)\, dy^i$.

Let P and Q be neighboring points on a curved surface with coordinates $(y^1, y^2)$ and $(y^1 + dy^1, y^2 + dy^2)$ in a global coordinate system. Let Q have coordinates $(dx^1, dx^2)$ in a local planar frame at P. We may think of the $(x^i)$ coordinates as functions of the $(y^j)$ coordinates, just as cartesian coordinates in the plane are functions of polar coordinates: $x = r\cos\theta$, $y = r\sin\theta$. This gives meaning to the partial derivatives $\partial x^i/\partial y^j$. From Eq. (2.7), the distance from P to Q is

$$\begin{aligned}
ds^2 &= f_{mn}\, dx^m dx^n \\
&= f_{mn} \left( \frac{\partial x^m}{\partial y^j}\, dy^j \right) \left( \frac{\partial x^n}{\partial y^k}\, dy^k \right) \quad (\text{sum on } m, n, j, k) \\
&= \left( f_{mn} \frac{\partial x^m}{\partial y^j} \frac{\partial x^n}{\partial y^k} \right) dy^j dy^k \\
&= g_{jk}(y)\, dy^j dy^k,
\end{aligned} \tag{2.8}$$



where we have set

$$g_{jk}(y) = f_{mn} \frac{\partial x^m}{\partial y^j} \frac{\partial x^n}{\partial y^k}. \tag{2.9}$$

Since the matrix $(f_{mn})$ is symmetric, so is $(g_{jk})$. Use a local planar frame at each point of the surface in this manner to translate the local form of the metric postulate, Eq. (2.5), to global coordinates:

The Metric Postulate for a Curved Surface, Global Form

Let $(y^i)$ be global coordinates on the surface. Let ds be the distance between neighboring points $(y^i)$ and $(y^i + dy^i)$. Then there is a symmetric matrix $(g_{jk}(y^i))$ such that

$$ds^2 = g_{jk}(y^i)\, dy^j dy^k. \tag{2.10}$$

$(g_{jk}(y^i))$ is called the metric of the surface with respect to the coordinates $(y^i)$.

Exercise 2.2. Show that the metric for the (φ, θ) coordinates on the sphere in Eq. (2.4) is $ds^2 = R^2 d\phi^2 + R^2 \sin^2\phi\, d\theta^2$, i.e.,

$$g(\phi, \theta) = \begin{pmatrix} R^2 & 0 \\ 0 & R^2 \sin^2\phi \end{pmatrix}. \tag{2.11}$$

Do this in two ways:

a. By converting from the metric of a local planar frame. Show that for a local planar frame whose $x^1$-axis coincides with a circle of latitude, $dx^1 = R \sin\phi\, d\theta$ and $dx^2 = R\, d\phi$. See Fig. 2.10.

b. Use Eq. (2.4) to convert $ds^2 = dx^2 + dy^2 + dz^2$ to (φ, θ) coordinates.

Exercise 2.3. Consider the hemisphere $z = (R^2 - x^2 - y^2)^{1/2}$. Assign coordinates (x, y) to the point (x, y, z) on the hemisphere. Find the metric in this coordinate system. Express your answer as a matrix. Hint: Use $z^2 = R^2 - x^2 - y^2$ to compute $dz^2$.
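The translation of Eq. (2.9) can be tried out numerically in the simplest case, the flat plane, where cartesian coordinates serve as the planar frame and polar coordinates (r, θ) play the role of global coordinates. The sketch below is illustrative only; the function names and the sample point are assumptions, not part of the text. It forms $g_{jk} = f_{mn}\,(\partial x^m/\partial y^j)(\partial x^n/\partial y^k)$ and should reproduce the familiar polar metric $ds^2 = dr^2 + r^2 d\theta^2$.

```python
import math

def jacobian(r, theta):
    """a[n][k] = d x^n / d y^k for x = (r cos θ, r sin θ), y = (r, θ)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def metric_from_jacobian(a, f):
    """Eq. (2.9): g_jk = f_mn a[m][j] a[n][k], summed over m and n."""
    size = len(a)
    return [[sum(f[m][n] * a[m][j] * a[n][k]
                 for m in range(size) for n in range(size))
             for k in range(size)]
            for j in range(size)]

# Planar-frame metric f of Eq. (2.7): the 2x2 identity matrix.
f = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical sample point (r, θ) = (2, 0.7), chosen only for illustration.
g = metric_from_jacobian(jacobian(2.0, 0.7), f)
print(g)  # close to [[1, 0], [0, r**2]] with r = 2
```

The result is symmetric, as the text notes it must be, and its components depend on the point, unlike those of f.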

We should not think of a vector as a list of components $(v^i)$, but as a single object v which represents a magnitude and direction (an arrow), independently of any coordinate system. If a coordinate system is introduced the vector acquires components. They will be different in different coordinate systems.

Similarly, we should not think of a metric as a list of components $(g_{jk})$, but as a single object g which represents infinitesimal distances, independently of any coordinate system. If a coordinate system is introduced the metric acquires components. They will be different in different coordinate systems.

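Exercise 2.4. Let $(y^i)$ and $(\bar{y}^i)$ be two coordinate systems on the same surface, with metrics $(g_{jk}(y^i))$ and $(\bar{g}_{pq}(\bar{y}^i))$. Show that

$$\bar{g}_{pq} = g_{jk} \frac{\partial y^j}{\partial \bar{y}^p} \frac{\partial y^k}{\partial \bar{y}^q}. \tag{2.12}$$

Hint: See Eq. (2.8).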



The Metric Postulate for a Curved Spacetime, Local Form

Let event F have coordinates $(dx^i)$ in a local inertial frame at E. If E and F are lightlike separated, let ds = 0. If the events are timelike separated, let ds be the time between them as measured by any (inertial or noninertial) clock. Then

$$ds^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2. \tag{2.13}$$

The metric postulate asserts a universal light speed and a formula for proper time in local inertial frames. (See the discussion following Eq. (1.10).) This is another instance of the key to general relativity: a small region of a curved spacetime is very much like a small region of a flat spacetime.

We now translate the metric postulate to global coordinates. Eq. (2.13) can be written

$$ds^2 = f_{mn}\, dx^m dx^n, \tag{2.14}$$

where

$$f = (f_{mn}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

Using Eq. (2.14), the calculation Eq. (2.8), which produced the global form of the metric postulate for curved surfaces, now produces the global form of the metric postulate for curved spacetimes.

The Metric Postulate for a Curved Spacetime, Global Form

Let $(y^i)$ be global coordinates on the spacetime. If neighboring events $(y^i)$ and $(y^i + dy^i)$ are lightlike separated, let ds = 0. If the events are timelike separated, let ds be the time between them as measured by a clock moving between them. Then there is a symmetric matrix $(g_{jk}(y^i))$ such that

$$ds^2 = g_{jk}(y^i)\, dy^j dy^k. \tag{2.15}$$

$(g_{jk}(y^i))$ is called the metric of the spacetime with respect to $(y^i)$.




Curved surface dwellers find that some curves in their surface are straight in local planar frames. They call these curves geodesics. Fig. 2.8 shows that the equator is a geodesic but the other circles of latitude are not. To traverse a geodesic, a surface dweller need only always walk “straight ahead”. A geodesic is as straight as possible, given that it is constrained to the surface.

As with the geodesic postulate for flat surfaces Eq. (1.14), we have

The Geodesic Postulate for a Curved Surface, Local Form

Parameterize a geodesic with arclength s. Let point P be on the geodesic. Then in every local planar frame at P

$$\ddot{x}^i(P) = 0, \quad i = 1, 2, \tag{2.16}$$

where a dot denotes differentiation with respect to s.

We now translate Eq. (2.16) from the local planar frame coordinates x to global coordinates y to obtain the global form of the geodesic equations. We first need to know that the metric $g = (g_{ij})$ has an inverse $g^{-1} = (g^{jk})$.

Exercise 2.5. a. Let the matrix $a = (\partial x^n/\partial y^k)$. Show that the inverse matrix $a^{-1} = (\partial y^k/\partial x^j)$.

b. Show that Eq. (2.9) can be written $g = a^t f a$, where t means “transpose”.

c. Prove that $g^{-1} = a^{-1} f^{-1} (a^{-1})^t$.

Fig. 2.8: The equator is the only circle of latitude which is a geodesic.

Introduce the notation $\partial_k g_{im} = \partial g_{im}/\partial y^k$. Define the Christoffel symbols:

$$\Gamma^i_{jk} = \tfrac{1}{2}\, g^{im} \left[ \partial_k g_{jm} + \partial_j g_{mk} - \partial_m g_{jk} \right]. \tag{2.17}$$

Note that $\Gamma^i_{jk} = \Gamma^i_{kj}$. The $\Gamma^i_{jk}$, like the $g_{jk}$, are functions of the coordinates.

Exercise 2.6. Show that for the metric Eq. (2.11)

$$\Gamma^\phi_{\theta\theta} = -\sin\phi \cos\phi \quad \text{and} \quad \Gamma^\theta_{\theta\phi} = \Gamma^\theta_{\phi\theta} = \cot\phi.$$

The remaining Christoffel symbols are zero.

You should not try to assign a geometric meaning to the Christoffel symbols; simply think of them as what appears when the geodesic equations are translated from their local form Eq. (2.16) (which have an evident geometric meaning) to their global form (which do not):

The Geodesic Postulate for a Curved Surface, Global Form

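Parameterize a geodesic with arclength s. Then in every global coordinate system

$$\ddot{y}^i + \Gamma^i_{jk}\, \dot{y}^j \dot{y}^k = 0, \quad i = 1, 2, \tag{2.18}$$

where a dot denotes differentiation with respect to s.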
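To see the global form at work, here is a minimal sketch that integrates the geodesic equations in a case where the answer is known in advance: the flat plane in polar coordinates (r, θ). For the metric $ds^2 = dr^2 + r^2 d\theta^2$, Eq. (2.17) gives $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, with the other symbols zero. The integrator (a standard fourth-order Runge-Kutta step), the starting point, and the step count are illustrative assumptions. The geodesic should come out as a straight line when converted back to cartesian coordinates.

```python
import math

def rhs(state):
    """Geodesic equations for ds^2 = dr^2 + r^2 dθ^2, written as a first-order system.

    With Γ^r_θθ = -r and Γ^θ_rθ = Γ^θ_θr = 1/r the equations read
        r'' = r θ'^2,    θ'' = -(2/r) r' θ'.
    """
    r, th, dr, dth = state
    return (dr, dth, r * dth * dth, -2.0 * dr * dth / r)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h in the arclength s."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, s_end, steps):
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at (r, θ) = (1, 0), heading in the θ direction with unit speed.
# In cartesian terms this is the point (1, 0) moving straight "up".
r, th, dr, dth = integrate((1.0, 0.0, 0.0, 1.0), s_end=1.0, steps=1000)
x, y = r * math.cos(th), r * math.sin(th)
print(x, y)  # the straight line x = 1 predicts a point near (1, 1)
```

The coordinates r and θ both change along the path, yet the curve is straight; the Christoffel terms are exactly what compensates for the curvilinear coordinates.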



Appendix 10 translates the local form of the postulate to the global form. The translation requires an assumption. A local planar frame at P extends to a finite region around P. Let $(f_{mn}(x))$ represent the metric in this coordinate system. According to Eq. (2.7), $(f_{mn}(P)) = f$, the constant matrix of Eq. (2.7). It is a mathematical fact that there are coordinates satisfying this relationship and also

$$\partial_i f_{mn}(P) = 0 \tag{2.19}$$

for all m, n, i. They are called geodesic coordinates. See Appendix 9. A function with a zero derivative at a point is not changing much at the point. In this sense Eq. (2.19) states that the metric $(f_{mn}(x))$ stays close to the constant matrix f near P. Since a local planar frame at P is constructed to approximate a planar frame as closely as possible near P, surface dwellers assume that a local planar frame at P is a geodesic coordinate system satisfying Eq. (2.19).

Exercise 2.7. Show that the metric of Exercise 2.3 satisfies Eq. (2.19) at (x, y) = (0, 0).

Exercise 2.8. Show that Eq. (2.18) reduces to Eq. (2.16) for local planar frames.

Exercise 2.9. Show that the equator is the only circle of latitude which is a geodesic. Of course, all great circles on a sphere are geodesics. Use the result of Exercise 2.6. Before using the geodesic equations you must parameterize the circles with s.

The Geodesic Postulate for a Curved Spacetime, Local Form

Worldlines of inertial particles and pulses of light can be parameterized so that if E is on the worldline, then in every local inertial frame at E

$$\ddot{x}^i(E) = 0, \quad i = 0, 1, 2, 3, \tag{2.20}$$

where a dot denotes differentiation with respect to the parameter. For inertial particles we may take the parameter to be s.

The worldlines are called geodesics.

Fig. 2.9: The path of an inertial particle in a lattice stuck to the Earth and in an inertial lattice. The dots are spaced at equal time intervals.

The geodesic postulate asserts that in a local inertial frame inertial particles and light move in a straight line at constant speed. (See the remarks following Eq. (1.17).) The worldline of an inertial particle or pulse of light in a curved spacetime is as straight as possible (in space and time; see the remarks following Eq. (1.16)), given that it is constrained to the spacetime. The geodesic is straight in local inertial frames, but looks curved when viewed in an “inappropriate” coordinate system. See Fig. 2.5.



Einstein’s “straightest worldline in a curved spacetime” description of the path is very different from Newton’s “curved path in a flat space” description.

An analogy might make the difference more vivid. Imagine two surface dwellers living on a sphere. They start at nearby points on their equator and walk north along a circle of longitude (which is a geodesic). As they walk, they come closer together. They might attribute this to an attractive gravitational force between them. But we see that it is due to the curvature of their surface.

We now assume that the metric of a local inertial frame at E satisfies Eq. (2.19) for the same reasons as given above for local planar frames. Then Appendix 10 translates the local form of the geodesic postulate for curved spacetimes to the global form:

The Geodesic Postulate for a Curved Spacetime, Global Form

Worldlines of inertial particles and pulses of light can be parameterized so that in every global coordinate system

$$\ddot{y}^i + \Gamma^i_{jk}\, \dot{y}^j \dot{y}^k = 0, \quad i = 0, 1, 2, 3, \tag{2.21}$$

where a dot denotes differentiation with respect to the parameter. For inertial particles we may take the parameter to be s.


2.6 The Field Equation


Previous sections of this chapter explored similarities between small regions of flat and curved surfaces and between small regions of flat and curved spacetimes. This section explores differences.

The local forms of our curved surface postulates show that in many ways a small region of a curved surface is like a small region of a flat surface. Surface dwellers might suppose that all differences between the regions vanish as the regions become smaller. This is not so. To see this, pass geodesics through a point P in every direction.

Fig. 2.10: $K = 1/R^2$ on a sphere.

Exercise 2.10. Show that there is a unique geodesic through every point of a curved surface in every direction. Hint: Use a basic theorem on the existence and uniqueness of solutions of systems of differential equations.

Connect all the points at distance r from P along the geodesics, forming a “circle” C of radius r. Let C(r) be the circumference of the circle. Define the curvature K of the surface at P:

$$K = \frac{3}{\pi} \lim_{r \to 0} \frac{2\pi r - C(r)}{r^3}. \tag{2.22}$$

Clearly, K = 0 for a flat surface. From Fig. 2.10, C(r) < 2πr for a sphere and so K ≥ 0. From Fig. 2.10, we find

$$C(r) = 2\pi R \sin\phi = 2\pi R \sin(r/R) = 2\pi R \left[ r/R - (r/R)^3/6 + \dots \right].$$

A quick calculation shows that $K = 1/R^2$. The curvature is a difference between regions of a sphere and a flat surface which does not vanish as the regions become smaller.

Fig. 2.11: $K = -1/R^2$ on a pseudosphere.

The surface of revolution of Fig. 2.11 is a pseudosphere. The horizontal “circles of latitude” are concave inward and the vertical “lines of longitude” are concave outward. Thus C(r) > 2πr and so K ≤ 0. Exercise 2.14 shows that $K = -1/R^2$, where R is a constant. Despite the examples, in general K varies from point to point in a curved surface.

Distances and geodesics are involved in the definition of K. But distances determine geodesics: distances determine the metric, which determines the Christoffel symbols Eq. (2.17), which determine the geodesics Eq. (2.18). Thus we learn an important fact: distances determine K. Thus K is measurable by surface dwellers.
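As a rough illustration of how surface dwellers might measure K, the sketch below evaluates the quotient in Eq. (2.22) for circles of shrinking radius on a sphere, using the circumference $C(r) = 2\pi R \sin(r/R)$ quoted above. The sphere radius and the sample radii are arbitrary illustrative choices. The estimates should approach $1/R^2$ from below as r decreases.

```python
import math

def circumference_on_sphere(r, R):
    """Circumference of a geodesic circle of radius r on a sphere of radius R."""
    return 2 * math.pi * R * math.sin(r / R)

def curvature_estimate(r, R):
    """The quotient in Eq. (2.22) before the limit r -> 0 is taken."""
    return (3 / math.pi) * (2 * math.pi * r - circumference_on_sphere(r, R)) / r**3

R = 2.0  # illustrative sphere radius, so that 1/R**2 = 0.25
estimates = [curvature_estimate(r, R) for r in (1.0, 0.1, 0.01)]
print(estimates)  # values increasing toward 0.25
```

The deficit 2πr − C(r) shrinks like r³, which is why it must be divided by r³ to leave a finite, nonzero limit.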

Exercise 2.11. Show that a map of a region of the Earth must distort distances. Take the Earth to be perfectly spherical. Make no calculations.



Roll a flat piece of paper into a cylinder. Since this does not distort distances on the paper, it does not change K. Thus K = 0 for the cylinder. Viewed from the outside, the cylinder is curved, and so K = 0 seems “wrong”. However, surface dwellers cannot determine that they now live on a cylinder without circumnavigating it. Thus K “should” be zero for a cylinder.

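The formula expressing K in terms of distances was given by Gauss. If $g_{12} = 0$, then

$$K = -(g_{11} g_{22})^{-1/2} \left\{ \partial_1 \left( g_{11}^{-1/2}\, \partial_1 g_{22}^{1/2} \right) + \partial_2 \left( g_{22}^{-1/2}\, \partial_2 g_{11}^{1/2} \right) \right\} \qquad (\partial_i \equiv \partial/\partial y^i). \tag{2.23}$$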

Exercise 2.12. Show that Eq. (2.23) gives $K = 1/R^2$ for a sphere of radius R. Use Eq. (2.11).

Exercise 2.13. Generate a surface of revolution by rotating the parameterized curve y = f(u), z = h(u) about the z-axis. Let (r, θ, z) be cylindrical coordinates and parameterize the surface with coordinates (u, θ). Use $ds^2 = dr^2 + r^2 d\theta^2 + dz^2$ to show that the metric is

$$\begin{pmatrix} f'^2 + h'^2 & 0 \\ 0 & f^2 \end{pmatrix}.$$

Exercise 2.14. If $y = R e^{-u}$ and $z = R \int_0^u (1 - e^{-2t})^{1/2}\, dt$, then the surface of revolution in Exercise 2.13 is the pseudosphere of Fig. 2.11. Show that $K = -1/R^2$ for the pseudosphere.

Exercise 2.15. If y = 1 and z = u, then the surface of revolution in Exercise 2.13 is a cylinder. Show that K = 0 for a cylinder using Eq. (2.23).

The local forms of our curved spacetime postulates show that in many respects a small region of a curved spacetime is like a small region of a flat spacetime. We might suppose that all differences between the regions vanish as the regions become smaller. This is not so.

To see this, refer to Fig. 2.7. Let Δr be the small distance between objects at the top and bottom of the cabin and let Δa be the small tidal acceleration between them. In the curved spacetime of the cabin Δa ≠ 0, which is different from the flat spacetime value Δa = 0. But in the cabin Δa → 0 as Δr → 0; this difference between a curved and flat spacetime does vanish as the regions become smaller. But also in the cabin Δa/Δr ≠ 0, again different from the flat spacetime Δa/Δr = 0. This difference does not vanish as the regions become smaller: in the cabin, using Eq. (2.1), $\Delta a/\Delta r \to da/dr = 2\kappa M/r^3 \neq 0$.
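To get a feel for the sizes involved, the following sketch puts numbers into the tidal gradient $2\kappa M/r^3$. It assumes that Eq. (2.1) is the Newtonian acceleration $a = \kappa M/r^2$, and it uses SI units with illustrative values that are not from the text: the Earth's gravitational parameter κM, a cabin about 400 km above the surface, and a cabin height of 2 m.

```python
# Assumed illustrative values (SI units), not taken from the text.
GM_EARTH = 3.986004418e14   # kappa * M for the Earth, in m^3 / s^2
R_CABIN = 6.771e6           # distance of the cabin from the Earth's center, in m
CABIN_HEIGHT = 2.0          # separation Delta r of the two objects, in m

def newtonian_acceleration(r):
    """Magnitude of the acceleration toward the Earth, a = kappa M / r^2 (assumed Eq. (2.1))."""
    return GM_EARTH / r**2

# Gradient da/dr = 2 kappa M / r^3 from the text.
tidal_gradient = 2 * GM_EARTH / R_CABIN**3

# Direct difference in acceleration between the bottom and top of the cabin.
delta_a = (newtonian_acceleration(R_CABIN)
           - newtonian_acceleration(R_CABIN + CABIN_HEIGHT))

print(tidal_gradient)            # roughly 2.6e-6 per second squared
print(delta_a / CABIN_HEIGHT)    # nearly the same number
```

Halving the cabin height halves Δa, but the ratio Δa/Δr stays put, which is the point of the paragraph above.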



The metric and geodesic postulates describe the behavior of clocks, light, and inertial particles in a curved spacetime. But to apply these postulates, we must know the metric of the spacetime. Our final postulate for general relativity, the field equation, determines the metric. Loosely speaking, the equation determines the “shape” of a spacetime, how it is “curved”.

We constructed the metric in Sec. 2.4 using local inertial frames. There is obviously a relationship between the motion of local inertial frames and the distribution of mass in a curved spacetime. Thus, there is a relationship between the metric of a spacetime and the distribution of mass in the spacetime. The field equation gives this relationship. Schematically it reads

$$\left[ \text{quantity determined by metric} \right] = \left[ \text{quantity determined by mass/energy} \right]. \tag{2.24}$$

To specify the two sides of this equation, we need several definitions. Define the Ricci tensor

$$R_{jk} = \Gamma^p_{tk} \Gamma^t_{jp} - \Gamma^p_{tp} \Gamma^t_{jk} + \partial_k \Gamma^p_{jp} - \partial_p \Gamma^p_{jk}. \tag{2.25}$$

Don’t panic over this complicated definition: You need not have a physical understanding of the Ricci tensor. And while the $R_{jk}$ are extremely tedious to calculate by hand, computers can readily calculate them for us.

As with the metric g, we will use $\mathbf{R}$ to designate the Ricci tensor as a single object, existing independently of any coordinate system, but which in a given coordinate system acquires components $R_{jk}$.

Define the curvature scalar $R = g^{jk} R_{jk}$.

We can now specify the left side of the schematic field equation Eq. (2.24): the quantity determined by the metric is the Einstein tensor

$$\mathbf{G} = \mathbf{R} - \tfrac{1}{2} R\, \mathbf{g}. \tag{2.26}$$

The right side of the field equation is given by the energy-momentum tensor T. It represents the source of the gravitational field in general relativity. All forms of matter and energy, including electromagnetic fields, and also pressures, stresses, and viscosity contribute to T. But for our purposes we need to consider only matter of a special form, called dust. In dust, matter interacts only gravitationally; there are no pressures, stresses, or viscosity. Gas in interstellar space which is thin enough so that particle collisions are infrequent is dust.

We now define T for dust. Choose an event E with coordinates $(y^i)$. Let ρ be the density of the dust at E as measured by an observer moving with the dust. (Thus ρ is independent of any coordinate system.) Let ds be the time measured by the observer between E and a neighboring event on the dust’s worldline with coordinates $(y^i + dy^i)$. Define

$$T^{jk} = \rho\, \frac{dy^j}{ds} \frac{dy^k}{ds}. \tag{2.27}$$
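A small sketch may make Eq. (2.27) more concrete. It assumes the simplest possible setting, which is not spelled out at this point in the text: a flat spacetime with inertial coordinates $(t, x^1, x^2, x^3)$, units with c = 1, and dust moving along the $x^1$ axis with speed v. Then $dt/ds = \gamma = (1 - v^2)^{-1/2}$ and $dx^1/ds = \gamma v$. The function name and sample values are hypothetical.

```python
import math

def dust_energy_momentum(rho, v):
    """Components T^jk = rho (dy^j/ds)(dy^k/ds) of Eq. (2.27).

    Assumes flat spacetime, inertial coordinates, c = 1, and dust moving
    along the x^1 axis with coordinate speed v (|v| < 1).
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    u = [gamma, gamma * v, 0.0, 0.0]   # dy^j/ds along the dust worldline
    return [[rho * u[j] * u[k] for k in range(4)] for j in range(4)]

T_rest = dust_energy_momentum(rho=1.0, v=0.0)
T_moving = dust_energy_momentum(rho=1.0, v=0.6)

print(T_rest[0][0])     # only T^00 = rho is nonzero for dust at rest
print(T_moving[0][0])   # rho * gamma**2
print(T_moving[0][1])   # rho * gamma**2 * v
```

The matrix is symmetric by construction, and for dust at rest in the coordinates the only nonzero component is $T^{00} = \rho$.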



We can now state our final postulate for general relativity:

The Field Equation

G = −8πκT. (2.28)

Here κ is the Newtonian gravitational constant of Eq. (2.1).

The field equation is the centerpiece of Einstein’s theory. It relates the curvature of spacetime at an event, represented by G, to the density and motion of matter and energy at the event, represented by T. In our applications, we will specify T, and then solve the equation for G.

The definitions of G, R, and $\Gamma^i_{jk}$ (Eqs. (2.26), (2.25), and (2.17)) show that G, like the curvature K of a surface, involves the metric components $g_{jk}$ and their first and second partial derivatives. Thus the field equation is a system of second order partial differential equations in the unknown $g_{jk}$.

Appendix 13 gives a plausibility argument, based on reasonable assumptions, which leads from the schematic field equation Eq. (2.24) to the field equation Eq. (2.28). It is by no means a proof of the equation, but it should be convincing enough to make us anxious to confront the theory with experiment in the next chapter.

The geodesic equations and the field equation are valid in all coordinate systems. They are examples of our ability to describe a curved spacetime intrinsically, i.e., without reference to a higher dimensional flat space in which it is curved. You should be impressed with the power of this mathematics!

If ρ = 0 at some event, then the field equation is $R_{jk} - \tfrac{1}{2} R\, g_{jk} = 0$. Multiply this by $g^{jk}$:

$$0 = g^{jk} \left( R_{jk} - \tfrac{1}{2} R\, g_{jk} \right) = g^{jk} R_{jk} - \tfrac{1}{2} R\, g^{jk} g_{jk} = R - \tfrac{1}{2} R \cdot 4 = -R.$$

Substitute R = 0 into the field equation to obtain

The Vacuum Field Equation

R = 0 . (2.29)

At events in a spacetime where there is no matter we may use this vacuum field equation.

Before we can use the field equation we must deal with a technical matter. The indices on the metric and the Ricci tensor are subscripts: $g_{jk}$ and $R_{jk}$. The indices on the energy-momentum tensor are superscripts: $T^{jk}$. By convention, the placement of indices indicates how components transform under a change of coordinates.¹

¹ This is a convention of tensor algebra. Tensor algebra and calculus are powerful tools for computations in general relativity. But we do not need them for a conceptual understanding of the theory. Appendix 11 gives a short introduction to tensors.



Subscripted components $a_{jk}$ transform covariantly. If bars denote components and coordinates in a second coordinate system $(\bar{y}^i)$, then

$$\bar{a}_{pq} = a_{jk} \frac{\partial y^j}{\partial \bar{y}^p} \frac{\partial y^k}{\partial \bar{y}^q}. \tag{2.30}$$

According to Exercise 2.4, the $g_{jk}$ transform covariantly. The $R_{jk}$ also transform covariantly. (Do not attempt to verify this at home!) The curvature scalar R is the same in all coordinate systems. Thus the left side of the field equation, $G_{jk} = R_{jk} - \tfrac{1}{2} R\, g_{jk}$, transforms covariantly.

Superscripted components $a^{jk}$ transform contravariantly:

$$\bar{a}^{pq} = a^{jk} \frac{\partial \bar{y}^p}{\partial y^j} \frac{\partial \bar{y}^q}{\partial y^k}. \tag{2.31}$$

Exercise 2.16. Show that the $T^{jk}$ transform contravariantly.

Since the $G_{jk}$ and $T^{jk}$ transform differently, we cannot take $G_{jk} = -8\pi\kappa T^{jk}$ as the components of the field equation: even if this equation were true in one coordinate system, it need not be in another. The solution is to raise the covariant indices to contravariant indices: $G^{mn} = g^{mj} g^{nk} G_{jk}$.

Exercise 2.17. Show that if the $G_{jk}$ transform covariantly, then the $G^{mn}$ transform contravariantly.

Then we can take the components of the field equation to be $G^{jk} = -8\pi\kappa T^{jk}$. If this equation is valid in any one coordinate system, then, since the two sides transform the same between coordinate systems, it is true in all. (Alternatively, we can lower the contravariant indices to covariant indices: $T_{mn} = g_{mj} g_{nk} T^{jk}$, and use the equivalent equation $G_{jk} = -8\pi\kappa T_{jk}$.)

Curvature. The field equation gives a simple and elegant relationship between the curvatures K of certain surfaces through an event and the density ρ at the event. To obtain it we must use Fermi normal coordinates, discussed in Appendix 12.

The metric $(f_{mn}(x))$ of a local inertial frame at E satisfies $(f_{mn}(E)) = f$, where f is the constant matrix of Eq. (2.14). The metric of a geodesic coordinate system satisfies in addition Eq. (2.19), $\partial_i f_{mn}(E) = 0$. And in a spacetime, the metric of a Fermi normal coordinate system satisfies further $\partial_0 \partial_i f_{mn}(E) = 0$.

Consider the element of matter at an event E. According to the local inertial frame postulate, the element is instantaneously at rest in some local inertial frame at E, which we take to have Fermi normal coordinates. Let $K_{12}$ be the curvature of the surface formed by holding the time coordinate $x^0$ and the spatial coordinate $x^3$ fixed, while varying the other two spatial coordinates, $x^1$ and $x^2$. Appendix 12 shows that $G^{00} = -(K_{12} + K_{23} + K_{31})$. For matter at rest in the frame, Eq. (2.27) gives $T^{00} = \rho$. Thus from the field equation

$$K_{12} + K_{23} + K_{31} = 8\pi\kappa\rho.$$

In particular, if ρ = 0, then $K_{12} + K_{23} + K_{31} = 0$.


Chapter 3

Spherically Symmetric Spacetimes

3.1 Stellar Evolution

Several applications of general relativity described in this chapter involve observations of stars at various stages of their life. We thus begin with a brief survey of the relevant aspects of stellar evolution.

Clouds of interstellar gas and dust are a major component of our Milky Way galaxy. Suppose some perturbing force causes a cloud to begin to contract by self gravitation. If the mass of the cloud is small enough, then gas pressures and/or mechanical forces will stop the contraction and a planet sized object will result. For a larger cloud, these forces cannot stop the contraction. The cloud will continue to contract and become hotter until thermonuclear reactions begin. The heat from these reactions will increase the pressure in the cloud and stop the contraction. A star is born! Our own star, the Sun, formed in this way 4.6 billion years ago. A star will shine for millions or billions of years until its nuclear fuel runs out and it begins to cool. Then the contraction will begin again. The subsequent evolution of the star depends on its mass.

For a star of ∼1.4 solar masses or less, the contraction will be stopped by a phenomenon known as degenerate electron pressure, but not until enormous densities are reached. For example, the radius of the Sun will decrease by a factor of $10^2$, to about 1000 km, and its density will thus increase by a factor of $10^6$, to about $10^6$ g/cm³! Such a star is a white dwarf. They are common. For example, Sirius, the brightest star in the sky, is part of a double star system. Its dim companion is a white dwarf. The Sun will become a white dwarf in about 5 billion years.

A white dwarf member of a binary star system often accretes matter from its companion. If it accretes sufficient matter, it will erupt in a thermonuclear explosion, called a Type Ia supernova, destroying the star. At its brightest, a Type Ia supernova has a luminosity over a billion times that of the Sun.


For a star over 1.4 solar masses, degenerate electron pressure cannot stop the contraction. Eventually a supernova of another kind, called Type II, occurs. The outer portions of the star blow into interstellar space. A supernova was recorded in China in 1054. It was visible during the day for 23 days and outshone all other stars in the sky for several weeks. Today we see the material blown from the star as the Crab Nebula.

One possible result of a Type II supernova is a neutron star. In a neutron star, fantastic pressures force most of the electrons to combine with protons to form neutrons. Degenerate neutron pressure prevents the star from collapsing further. A typical neutron star has a radius of 10 km and a density of $10^{14}$ g/cm³!

Neutron stars manifest themselves in two ways. They often spin rapidly, up to nearly 1000 times a second. For poorly understood reasons, there can be a small region of the star that emits radio and optical frequency radiation in a narrow cone. If the Earth should happen to be in the direction of the cone periodically as the star rotates, then the star will appear to pulse at the frequency of rotation of the star, rather like a lighthouse. The neutron star is a pulsar. The first pulsar was discovered in 1968; many are now known. There is a pulsar at the center of the Crab Nebula.

There are also neutron stars which emit X-rays. They are always members of a binary star system. We briefly discuss these systems in Sec. 3.6.

Even degenerate neutron pressure cannot stop the contraction if the star is too massive. A massive star can collapse to a black hole. Sec. 3.6 is devoted to black holes.

We close our catalog of remarkable astronomical objects with quasars, discovered in 1963. Quasars sit at the center of some galaxies, mostly distant. A typical quasar emits 100 times the energy of our entire Milky Way galaxy from a region $10^{17}$ times smaller!


3.2 The Schwarzschild Metric


In this section we solve the field equation for the metric of a spacetime around a spherically symmetric object such as the Sun. This will enable us in the next section to compare general relativity’s predictions for the motion of planets and light in our solar system with observations.

Exercise 3.1. Show that in spherical coordinates the flat spacetime metric Eq. (1.13) becomes

$$ds^2 = dt^2 - dr^2 - r^2 d\Omega^2, \tag{3.1}$$

where from Eq. (2.11), $d\Omega^2 = d\phi^2 + \sin^2\phi\, d\theta^2$ is the metric of the unit sphere. Hint: Don’t calculate; think geometrically.

How can Eq. (3.1) change in a curved spherically symmetric spacetime? In such a spacetime:

• The θ and φ coordinates can have their usual meaning. (As in a curved surface, angles can be measured in the usual way in a curved spacetime.)

• dθ and −dθ produce the same ds, as do dφ and −dφ. Thus none of the terms dr dθ, dr dφ, dθ dφ, dt dθ, or dt dφ can appear in the metric.

• The surface $t = t_o$, $r = r_o$ has the metric of a sphere, although not necessarily of radius $r_o$.

Thus the metric is of the form

$$ds^2 = e^{2\mu} dt^2 - 2u\, dt\, dr - e^{2\nu} dr^2 - r^2 e^{2\lambda} d\Omega^2, \tag{3.2}$$

where µ, ν, u, and λ are unknown functions of t and r, but not, by symmetry, of θ or φ. We write some of the coefficients as exponentials for convenience.

The coordinate change $\bar{r} = r e^{\lambda}$ eliminates the $e^{2\lambda}$ factor. (Then rename the radial coordinate r.) A change in the t coordinate eliminates the dt dr term. See Appendix 14. These coordinate changes put the metric in a simpler form:

$$ds^2 = e^{2\mu} dt^2 - e^{2\nu} dr^2 - r^2 d\Omega^2. \tag{3.3}$$

This is as far as we can go with symmetry and coordinate changes. How can we determine the remaining unknowns µ and ν in the metric? Well, the field equation determines the metric. And since we are interested only in the spacetime outside the central object, we may use the vacuum field equation R = 0. From the discussion following the statement of the field equation, Eq. (2.28), this is a system of second order partial differential equations in the unknowns µ and ν. Let’s solve them.

In components the vacuum field equation says $R_{jk} = 0$. The $R_{jk}$ are best calculated with a computer. We first use $R_{tr} = -2\, (\partial\nu/\partial t)/r$. Since $R_{tr} = 0$, $\partial\nu/\partial t = 0$, i.e., ν depends only on r.



Now

$$\begin{aligned}
R_{tt} &= \left( -\mu'' + \mu'\nu' - \mu'^2 - 2\mu' r^{-1} \right) e^{2(\mu - \nu)} \\
R_{rr} &= \left( -\mu'' + \mu'\nu' - \mu'^2 + 2\nu' r^{-1} \right) \\
R_{\phi\phi} &= \left( -r\mu' + r\nu' + e^{2\nu} - 1 \right) e^{-2\nu},
\end{aligned} \tag{3.4}$$

where primes indicate differentiation with respect to r.

Set $R_{tt} = 0$ and $R_{rr} = 0$, cancel the exponential factor, and subtract: $\mu' + \nu' = 0$. Now $R_{\phi\phi} = 0$ implies $2r\nu' + e^{2\nu} - 1 = 0$, or $(r e^{-2\nu})' = 1$. Integrate: $e^{-2\nu} = 1 - 2m/r$, where 2m is a constant of integration. Since ν does not depend on t, neither does m. The metric Eq. (3.3) is now of the form

$$ds^2 = \left( 1 - \frac{2m}{r} \right) e^{2(\mu + \nu)} dt^2 - \left( 1 - \frac{2m}{r} \right)^{-1} dr^2 - r^2 d\Omega^2.$$
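The integration step can be spot-checked numerically. The sketch below takes the claimed solution $e^{-2\nu} = 1 - 2m/r$, i.e. $\nu(r) = -\tfrac{1}{2}\ln(1 - 2m/r)$, approximates ν′ by a central difference, and evaluates the left side of $2r\nu' + e^{2\nu} - 1 = 0$ at a few radii outside r = 2m. The value m = 1, the sample radii, and the step size are illustrative assumptions; the residuals should be zero up to finite-difference error.

```python
import math

def nu(r, m):
    """The claimed solution: e^{-2 nu} = 1 - 2m/r, valid for r > 2m."""
    return -0.5 * math.log(1.0 - 2.0 * m / r)

def residual(r, m, h=1e-5):
    """Left side of 2 r nu' + e^{2 nu} - 1 = 0, with nu' from a central difference."""
    nu_prime = (nu(r + h, m) - nu(r - h, m)) / (2.0 * h)
    return 2.0 * r * nu_prime + math.exp(2.0 * nu(r, m)) - 1.0

# Illustrative choice m = 1 and a few radii with r > 2m.
residuals = [residual(r, 1.0) for r in (3.0, 10.0, 100.0)]
print(residuals)  # each entry is tiny compared with the individual terms
```

A check like this does not replace the derivation, but it catches sign slips in the differential equation or in the integrated form.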

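Since $\mu' + \nu' = 0$, µ + ν is a function only of t. Thus the substitution $d\bar{t} = e^{\mu + \nu} dt$ eliminates the $e^{2(\mu + \nu)}$ factor. (Then rename the time coordinate t.) We arrive at our solution, the Schwarzschild metric, obtained by Karl Schwarzschild in 1916:

Schwarzschild Metric

$$ds^2 = \left( 1 - \frac{2m}{r} \right) dt^2 - \left( 1 - \frac{2m}{r} \right)^{-1} dr^2 - r^2 d\Omega^2. \tag{3.5}$$

In matrix form the metric is

$$g = (g_{jk}) = \begin{pmatrix} 1 - \frac{2m}{r} & 0 & 0 & 0 \\ 0 & -\left( 1 - \frac{2m}{r} \right)^{-1} & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2 \sin^2\phi \end{pmatrix}.$$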

We shall see later that

$$m = \kappa M, \tag{3.6}$$

where κ is the Newtonian gravitational constant of Eq. (2.1) and M is the mass of the central object. For the Sun m is small:

$$m = 4.92 \times 10^{-6}\ \text{sec} = 1.47\ \text{km}.$$
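The two values quoted for the Sun can be reproduced with a short calculation. The sketch assumes the usual conversion out of units with c = 1: in SI units m is $\kappa M/c^3$ when expressed as a time and $\kappa M/c^2$ when expressed as a length. The numerical constants are standard reference values, not taken from the text.

```python
# Assumed SI reference values, not taken from the text.
GM_SUN = 1.32712440018e20   # kappa * M for the Sun, in m^3 / s^2
C = 299792458.0             # speed of light, in m / s

m_seconds = GM_SUN / C**3           # m expressed as a time
m_km = GM_SUN / C**2 / 1000.0       # m expressed as a length

print(m_seconds)  # about 4.9e-6 sec
print(m_km)       # about 1.5 km
```

For comparison, the radius of the Sun is several hundred thousand kilometers, so the condition r ≫ 2m holds comfortably everywhere outside it.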

If r 2m, then the Schwartzschild metric Eq. (3.5) and the flat spacetimemetric Eq. (3.1) are nearly identical. Thus if r 2m, then for most purposes rmay be considered radial distance and t time measured by slowly moving clocks.

Exercise 3.2. Show that the time $ds$ measured by a clock at rest at $r$ is related to the coordinate time $dt$ by

$$
ds = \left(1 - \frac{2m}{r}\right)^{1/2} dt. \tag{3.7}
$$

Exercise 3.3. Show that the Schwarzschild metric predicts the sum of the results of Exercises 1.10 (time dilation) and 2.1 (gravitational redshift) for the result of the Hafele-Keating experiment: $\Delta s_a - \Delta s_r = \left(\tfrac12 (v_a^2 - v_r^2) + gh\right)\Delta t$.



Gravitational Redshift. Emit pulses of light radially outward from $(t_e, r_e)$ and $(t_e + \Delta t_e, r_e)$ in a Schwarzschild spacetime. Observe them at $(t_o, r_o)$ and $(t_o + \Delta t_o, r_o)$. Since the Schwarzschild metric is time independent, the worldline of the second pulse is simply a time translation of the first by $\Delta t_e$. Thus $\Delta t_o = \Delta t_e$. Let $\Delta s_e$ be the time between the emission events as measured by a clock at rest at $r_e$, with $\Delta s_o$ defined similarly. By Eqs. (1.6) and (3.7) an observer at $r_o$ finds a redshift

$$
z = \frac{\Delta s_o}{\Delta s_e} - 1 = \frac{(1 - 2m/r_o)^{1/2}\,\Delta t_o}{(1 - 2m/r_e)^{1/2}\,\Delta t_e} - 1 = \left(\frac{1 - 2m/r_o}{1 - 2m/r_e}\right)^{1/2} - 1. \tag{3.8}
$$

Exercise 3.4. Show that if $r_e \gg 2m$ and $r_o - r_e$ is small, then Eq. (3.8) reduces to Eq. (2.2), the gravitational redshift formula derived in connection with the terrestrial redshift experiment. Note that $c = 1$ in Eq. (3.8).

For light emitted at the surface of the Sun and received at Earth, Eq. (3.8) gives $z = 2 \times 10^{-6}$. This is difficult to measure but it has been verified within 7%. Light from a star with the mass of the Sun but a smaller radius will, by Eq. (3.8), have a larger redshift. For example, light from the white dwarf companion to Sirius has a gravitational redshift $z = 3 \times 10^{-4}$. And a gravitational redshift $z = .35$ has been measured in X-rays emitted from the surface of a neutron star.
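The solar value can be checked by evaluating Eq. (3.8) directly. This is a minimal sketch: the solar radius and the Earth-Sun distance are assumed values not given in the text, and the function name `gravitational_redshift` is illustrative.

```python
import math


def gravitational_redshift(m, r_e, r_o):
    """Redshift z of Eq. (3.8) for emission at r_e and observation at r_o.

    m, r_e and r_o must be in the same length units.
    """
    return math.sqrt((1.0 - 2.0 * m / r_o) / (1.0 - 2.0 * m / r_e)) - 1.0


# m for the Sun is from the text; the two distances are assumed values in km.
M_SUN_KM = 1.47
R_SUN_KM = 6.96e5    # emission at the solar surface
AU_KM = 1.496e8      # observation at Earth

z_sun = gravitational_redshift(M_SUN_KM, R_SUN_KM, AU_KM)
print(f"z = {z_sun:.2e}")
```

The result is dominated by the term $m/r_e$, since $r_o \gg r_e$; moving the observer farther out changes it very little.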

Exercise 3.5. Show that a clock on the Sun will lose 63 sec/year compared to a clock far from the Sun. The corresponding losses for the white dwarf and neutron star just mentioned are 2.6 hours and 95 days, respectively.

An accurate measurement of the gravitational redshift was made in 1976 by an atomic clock in a rocket. The reading of the clock was compared, via radio, with one on the ground during the two hour flight of the rocket. Of course the gravitational redshift changed with the changing height of the rocket. After taking into account the Doppler redshift due to the motion of the rocket, the gravitational redshift predicted by general relativity was confirmed within 7 parts in $10^5$.



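Geodesics. To determine the motion of inertial particles and light in a Schwarzschild spacetime, we must solve the geodesic equations Eq. (2.21). They are:

$$
\ddot t + \frac{2m/r^2}{1 - 2m/r}\,\dot r\,\dot t = 0 \tag{3.9}
$$

$$
\ddot r + \frac{m(1 - 2m/r)}{r^2}\,\dot t^{\,2} - \frac{m/r^2}{1 - 2m/r}\,\dot r^2 - r(1 - 2m/r)\left(\dot\phi^2 + \sin^2\!\phi\;\dot\theta^2\right) = 0 \tag{3.10}
$$

$$
\ddot\theta + \frac{2}{r}\,\dot r\,\dot\theta + 2\cot\phi\;\dot\phi\,\dot\theta = 0 \tag{3.11}
$$

$$
\ddot\phi + \frac{2}{r}\,\dot r\,\dot\phi - \sin\phi\cos\phi\;\dot\theta^2 = 0. \tag{3.12}
$$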

Exercise 3.6. What is the Christoffel symbol $\Gamma^t_{rt}$?

From spherical symmetry a geodesic lies in a plane. Let it be the plane

$$
\phi = \pi/2 . \tag{3.13}
$$

This is a solution to Eq. (3.12).

Exercise 3.7. Use Eqs. (3.10) and (3.5) to show that $ds = (1 - 3m/r)^{1/2}\,dt$ for a circular orbit at $\phi = \pi/2$. (Cf. Exercise 3.2.) This shows that a particle (with $ds > 0$) can have a circular orbit only for $r > 3m$ and that light (with $ds = 0$) can orbit at $r = 3m$. (Of course the central object must be inside the $r$ of the orbit.)

Integrate Eqs. (3.9) and (3.11):

$$
\dot t\,(1 - 2m/r) = B \tag{3.14}
$$

$$
\dot\theta\, r^2 = A, \tag{3.15}
$$

where $A$ and $B$ are constants of integration. Physically, $A$ and $B$ are conserved quantities: $A$ is the angular momentum per unit mass along the geodesic and $B$ is the energy per unit mass.

Exercise 3.8. Differentiate Eqs. (3.14) and (3.15) to obtain Eqs. (3.9) and (3.11). (Remember that $\phi = \pi/2$.)

Substitute Eqs. (3.13)-(3.15) into Eq. (3.10) and integrate:

$$
\frac{\dot r^2}{1 - 2m/r} + \frac{A^2}{r^2} - \frac{B^2}{1 - 2m/r} = -E =
\begin{cases}
\ \ 0 & \text{for light} \\
-1 & \text{for inertial particles,}
\end{cases}
\tag{3.16}
$$

where $E$ is a constant of integration. The values given for $E$ are verified by substituting Eqs. (3.13)-(3.16) into the Schwarzschild metric Eq. (3.5), giving $(ds/dp)^2 = E$. For light $ds = 0$ (see Eq. (2.16)) and so $E = 0$. For an inertial particle we may take $p = s$ (see Eq. (2.21)) and so $E = 1$.

The partially integrated geodesic equations Eqs. (3.13)-(3.16) will be the starting point for the study of the motion of planets and light in the solar system in the next section.
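One way to gain confidence in the first integrals is to integrate the second order equations numerically and watch the constants. The sketch below integrates Eqs. (3.9)-(3.11) in the plane $\phi = \pi/2$ with a classical Runge-Kutta step and monitors $A$, $B$, and $E$ from Eqs. (3.14)-(3.16). The integrator, the step size, and the initial data ($m = 1$, $r = 10$, $A = 4$) are illustrative assumptions, not part of the text.

```python
import math


def geodesic_rhs(state, m):
    """Eqs. (3.9)-(3.11) in the plane phi = pi/2, written as a first order system.

    state = (t, r, theta, t_dot, r_dot, theta_dot); dots are d/dp.
    """
    t, r, theta, t_dot, r_dot, theta_dot = state
    f = 1.0 - 2.0 * m / r
    t_ddot = -(2.0 * m / r**2) / f * r_dot * t_dot                 # Eq. (3.9)
    r_ddot = (-m * f / r**2 * t_dot**2 + (m / r**2) / f * r_dot**2
              + r * f * theta_dot**2)                               # Eq. (3.10)
    theta_ddot = -(2.0 / r) * r_dot * theta_dot                     # Eq. (3.11)
    return (t_dot, r_dot, theta_dot, t_ddot, r_ddot, theta_ddot)


def rk4_step(state, h, m):
    """Advance the state by one classical Runge-Kutta step of size h in p."""
    def nudge(k, a):
        return tuple(s + a * ki for s, ki in zip(state, k))
    k1 = geodesic_rhs(state, m)
    k2 = geodesic_rhs(nudge(k1, h / 2), m)
    k3 = geodesic_rhs(nudge(k2, h / 2), m)
    k4 = geodesic_rhs(nudge(k3, h), m)
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def constants_of_motion(state, m):
    """Return (A, B, E) from Eqs. (3.15), (3.14), and (3.16)."""
    t, r, theta, t_dot, r_dot, theta_dot = state
    f = 1.0 - 2.0 * m / r
    A = theta_dot * r**2
    B = t_dot * f
    E = B**2 / f - r_dot**2 / f - A**2 / r**2
    return A, B, E


# Hypothetical initial data: r_dot = 0 at r = 10 with A = 4, and t_dot chosen
# so that E = 1 (an inertial particle with p = s).
m, r0, A0 = 1.0, 10.0, 4.0
f0 = 1.0 - 2.0 * m / r0
theta_dot0 = A0 / r0**2
t_dot0 = math.sqrt((1.0 + r0**2 * theta_dot0**2) / f0)
state = (0.0, r0, 0.0, t_dot0, 0.0, theta_dot0)

start = constants_of_motion(state, m)
for _ in range(4000):
    state = rk4_step(state, 0.1, m)
end = constants_of_motion(state, m)
drift = max(abs(a - b) for a, b in zip(start, end))
print("A, B, E at start:", start)
print("largest drift after integration:", drift)
```

With these initial data the orbit is bound, so $r$ stays well outside $2m$ and the drift reflects only the integration error.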



Summary. It was a long journey to obtain the geodesic equations, so it might be useful to outline the steps used:

• We obtained a general form for the metric of a spherically symmetric spacetime, Eq. (3.2).

• The Ricci tensor involves second derivatives of a metric. In Eqs. (3.4) we wrote some of the components of this tensor for the spherically symmetric metric.

• We wrote the vacuum field equations for the metric by setting the components of the Ricci tensor to zero.

• We solved the vacuum field equations to obtain the Schwarzschild metric, Eq. (3.5).

• The geodesic equations involve first derivatives of a metric. In Eqs. (3.9)-(3.12) we wrote the geodesic equations for the Schwarzschild metric. They are of second order.

• We integrated the second order geodesic equations to obtain first order Eqs. (3.13)-(3.16).

The Constant m. We close this section by evaluating the constant $m$ in the Schwarzschild metric Eq. (3.5). Consider radial motion of an inertial particle. By Eq. (3.15), $A = 0$. Thus Eq. (3.16) becomes

$$
\frac{dr}{ds} = \pm\left(B^2 - 1 + \frac{2m}{r}\right)^{1/2}, \tag{3.17}
$$

the sign chosen according as the motion is outward or inward. Differentiate Eq. (3.17) and substitute Eq. (3.17) into the result:

$$
\frac{d^2r}{ds^2} = -\frac{m}{r^2}. \tag{3.18}
$$

For a distant slowly moving particle, $ds \approx dt$. Since the Newtonian theory applies to this situation, Eqs. (2.1) and (3.18) must coincide. Thus $m = \kappa M$, in agreement with Eq. (3.6).
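The differentiation leading to Eq. (3.18) is short enough to write out. As a sketch, apply the chain rule to Eq. (3.17) and then use Eq. (3.17) once more to cancel the square root:

$$
\frac{d^2r}{ds^2}
= \pm\frac12\left(B^2 - 1 + \frac{2m}{r}\right)^{-1/2}\left(-\frac{2m}{r^2}\right)\frac{dr}{ds}
= -\frac{m}{r^2}\;\frac{dr/ds}{\pm\left(B^2 - 1 + 2m/r\right)^{1/2}}
= -\frac{m}{r^2}.
$$

The result does not depend on $B$ or on the choice of sign, so it holds for both outward and inward radial motion.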


3.3 The Solar System Tests


This section compares the predictions of general relativity with observations of the motion of planets and light in our solar system. Three general relativistic effects have been measured: perihelion advance, light deflection, and light delay.

Perihelion Advance. We first solve the geodesic equations for the planets. Set $u = 1/r$. By Eq. (3.15),

$$
\dot r = \frac{dr}{du}\,\frac{du}{d\theta}\,\dot\theta = -u^{-2}\,\frac{du}{d\theta}\,A\,r^{-2} = -A\,\frac{du}{d\theta}.
$$

Substitute this into Eq. (3.16), multiply by $1 - 2mu$, differentiate with respect to $\theta$, and divide by $2A^2\,du/d\theta$:

$$
\frac{d^2u}{d\theta^2} + u = mA^{-2}E + 3mu^2. \tag{3.19}
$$

According to Eq. (3.16), $E = 1$ for the planets.

Exercise 3.9. Among the planets, Mercury has the largest ratio $3mu^2/mA^{-2}$ of the terms on the right side of Eq. (3.19). Show that it is less than $10^{-7}$.

As a first approximation, solve Eq. (3.19) without the small $3mu^2$ term:

$$
u = mA^{-2}\left[1 + e\cos(\theta - \theta_p)\right]. \tag{3.20}
$$

This is the polar equation of an ellipse with eccentricity $e$ and perihelion (point of closest approach to the Sun) at $\theta = \theta_p$. The same equation is derived from Newton’s theory in Appendix 8. Thus, although Einstein’s theory is conceptually entirely different from Newton’s, it gives nearly the same predictions for the planets. This is necessary for any theory of gravity, as Newton’s theory is very accurate for the planets.

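Set $\theta = \theta_p$ in Eq. (3.20):

$$
mA^{-2} = r_p^{-1}(1 + e)^{-1}. \tag{3.21}
$$

To obtain a more accurate solution of Eq. (3.19), substitute Eqs. (3.20) and (3.21) into the right side of Eq. (3.19) and solve:

$$
\begin{aligned}
u = {} & r_p^{-1}(1 + e)^{-1}\left\{1 + e\left[\cos(\theta - \theta_p) + 3m\,r_p^{-1}(1 + e)^{-1}\,\theta\,\sin(\theta - \theta_p)\right]\right\} \\
& + m\,r_p^{-2}(1 + e)^{-2}\left\{3 + \tfrac12 e^2\left[3 - \cos(2(\theta - \theta_p))\right]\right\}.
\end{aligned}
\tag{3.22}
$$

Drop the last term, which stays small because of the factor $m\,r_p^{-2}$. For small $\alpha$,

$$
\cos\beta + \alpha\sin\beta \approx \cos\beta\cos\alpha + \sin\beta\sin\alpha = \cos(\beta - \alpha).
$$

Use this approximation in Eq. (3.22):

$$
u = r_p^{-1}(1 + e)^{-1}\left\{1 + e\cos\left[\theta - \left(\theta_p + \frac{3m\theta}{r_p(1 + e)}\right)\right]\right\}. \tag{3.23}
$$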



Since $3m/r_p(1 + e)$ is small, this may be considered the equation of an ellipse with perihelion advanced by $6m\pi/r_p(1 + e)$ per revolution. (Cf. Eq. (3.20).)

Exercise 3.10. Show that for Mercury the predicted perihelion advance is 43.0 arcsecond/century.

The perihelion of Mercury advances 574 arcsec/century. In 1845 Le Verrier showed that Newton’s theory explains the advance as small gravitational effects of other planets – except for about 35 arcsec, for which he could not account. This was refined to 43 arcsec by the time Einstein published the general theory of relativity, and was the only discrepancy between Newton’s theory and observation known at that time. Less than an arcminute per century! This is the remarkable accuracy of Newton’s theory. The explanation of the discrepancy was the first observational verification of general relativity. Today this prediction of general relativity is verified within .1%.
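The size of the relativistic advance follows from the per-revolution formula $6m\pi/r_p(1+e)$ once Mercury's orbit is specified. The sketch below assumes values for Mercury's semimajor axis, eccentricity, and period, and a slightly more precise $m$ for the Sun than the rounded 1.47 km; none of these numbers appear in the text, and the names are illustrative.

```python
import math


def perihelion_advance_per_orbit(m, r_p, e):
    """Perihelion advance in radians per revolution: 6*pi*m / (r_p*(1 + e))."""
    return 6.0 * math.pi * m / (r_p * (1.0 + e))


# Assumed data (not from the text).
M_SUN_KM = 1.4766          # m for the Sun; the text rounds this to 1.47 km
A_MERCURY_KM = 5.791e7     # semimajor axis of Mercury's orbit
E_MERCURY = 0.2056         # eccentricity
PERIOD_DAYS = 87.97        # orbital period

r_p = A_MERCURY_KM * (1.0 - E_MERCURY)       # perihelion distance
orbits_per_century = 36525.0 / PERIOD_DAYS
advance_rad = perihelion_advance_per_orbit(M_SUN_KM, r_p, E_MERCURY) * orbits_per_century
advance_arcsec = math.degrees(advance_rad) * 3600.0
print(f"{advance_arcsec:.1f} arcsec/century")
```

Note that $r_p(1 + e) = a(1 - e^2)$ for an ellipse with semimajor axis $a$, so the advance depends on the orbit only through $a$ and $e$.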

Fig. 3.1: Deflection of light by the sun.

Light Deflection. General relativity predicts that light passing near the Sun will be deflected. See Fig. 3.1. Consider light which grazes the Sun at $\theta = 0$ when $u = u_p = 1/r_p$. Set $E = 0$ in Eq. (3.19) for light to give the geodesic equation

$$
\frac{d^2u}{d\theta^2} + u = 3mu^2. \tag{3.24}
$$

As a first approximation solve Eq. (3.24) without the small $3mu^2$ term: $u = u_p\cos\theta$. This is the polar equation of a straight line. Substitute $u = u_p\cos\theta$ into the right side of Eq. (3.24) and solve to obtain a better approximation:

$$
u = u_p\cos\theta + \tfrac12\,m\,u_p^2\,(3 - \cos 2\theta).
$$

Set $u = 0$ ($r = \infty$; $\theta = \pm(\pi/2 + \delta/2)$) and approximate

$$
\cos(\pi/2 + \delta/2) = -\sin(\delta/2) \approx -\delta/2, \qquad \cos(\pi + \delta) = -\cos\delta \approx -1
$$

to obtain the observed deflection angle $\delta = 4m/r_p$. (The Newtonian equation Eq. (2.1) predicts half of this deflection.)
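The perturbative result $\delta = 4m/r_p$ can be compared with a direct numerical integration of Eq. (3.24). The sketch below works with the dimensionless variable $w = u/u_p$ and the small parameter $\epsilon = m/r_p$, integrates from $\theta = 0$ until $w$ reaches zero, and reads off $\delta$ from $\theta = \pi/2 + \delta/2$. The Runge-Kutta scheme, the step size, and the solar values are assumptions made for illustration.

```python
import math


def deflection(eps, h=1e-3):
    """Deflection angle delta (radians) from Eq. (3.24) in the form w'' + w = 3*eps*w**2.

    Here w = u/u_p and eps = m/r_p, with w = 1 and w' = 0 at theta = 0.
    """
    def rhs(w, v):
        return v, -w + 3.0 * eps * w * w

    theta, w, v = 0.0, 1.0, 0.0
    while True:
        k1w, k1v = rhs(w, v)
        k2w, k2v = rhs(w + 0.5 * h * k1w, v + 0.5 * h * k1v)
        k3w, k3v = rhs(w + 0.5 * h * k2w, v + 0.5 * h * k2v)
        k4w, k4v = rhs(w + h * k3w, v + h * k3v)
        w_new = w + h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        if w_new <= 0.0:
            # Linear interpolation for the angle at which w = 0 (r = infinity).
            theta_zero = theta + h * w / (w - w_new)
            return 2.0 * (theta_zero - math.pi / 2)
        theta, w, v = theta + h, w_new, v_new


# Assumed solar values in km (not from the text): m and the solar radius as r_p.
eps_sun = 1.4766 / 6.96e5
delta_arcsec = math.degrees(deflection(eps_sun)) * 3600.0
print(f"numerical: {delta_arcsec:.3f} arcsec, 4m/r_p: {math.degrees(4 * eps_sun) * 3600:.3f} arcsec")
```

For a ray grazing the Sun, $\epsilon$ is about $2 \times 10^{-6}$, so the numerical and perturbative values should agree closely.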

Exercise 3.11. Show that for the Sun, $\delta = 1.75$ arcsec.

Stars can be seen near the Sun only during a solar eclipse. Sir Arthur Stanley Eddington organized an expedition to try to detect the deflection during the eclipse of 1919. He confirmed the prediction of general relativity within about 20%. Later eclipse observations have improved the accuracy, but not by much.

In 1995 the predicted deflection was verified within .1% using radio waves emitted by quasars. Radio waves have two advantages. First, radio telescopes thousands of kilometers apart can work together to measure angles more accurately than optical telescopes. Second, an eclipse is not necessary, as radio sources can be detected during the day. The accuracy of such measurements is now better than $10^{-4}$ arcsec. Light arriving from a quasar at an angle of 90° from the sun is deflected by $4 \times 10^{-3}$ arcsec by the sun, and so the deflection can be measured.

Fig. 3.2: Gravitational lens.

A spectacular example of the gravitational deflection of light was discovered in 1979. Two quasars, 6 arcsec apart, are in fact the same quasar! Fig. 3.2 shows the cause of the double image of the quasar. The galaxy is a gravitational lens. Many multiple (up to 7) image quasars are now known. It is not necessary that the quasar be exactly centered behind the galaxy for this to occur. But in many cases this is nearly so; then the quasar appears as one or several arcs around the galaxy. And in at least one case the quasar appears as a ring around the galaxy! A gravitational lens can also brighten a distant quasar by up to 100 times, enabling astronomers to study them better.

Fig. 3.3: Radar echo delay.

Light Delay. Radar can be sent from Earth, reflected off a planet, and detected upon its return to Earth. If this is done as the planet is about to pass behind the Sun, then according to general relativity, the radar’s return is delayed by the Sun’s gravity. See Fig. 3.3.

To calculate the time $t$ for light to go from Earth to Sun, use Eq. (3.14):

$$
\frac{dr}{dp} = \frac{dr}{dt}\,\frac{dt}{dp} = \frac{dr}{dt}\,B\left(1 - \frac{2m}{r}\right)^{-1}.
$$

Set $r = r_p$ and $\dot r|_{r = r_p} = 0$ in Eq. (3.16):

$$
A^2B^{-2} = r_p^2\left(1 - \frac{2m}{r_p}\right)^{-1}.
$$

Divide Eq. (3.16) by $B^2$, substitute the above two equations into the result, separate the variables, and integrate:

$$
t = \int_{r_p}^{r_e}\left[\frac{\left(1 - \frac{2m}{r}\right)^{-2}}{1 - \frac{1 - 2m/r}{1 - 2m/r_p}\left(\frac{r_p}{r}\right)^2}\right]^{1/2} dr. \tag{3.25}
$$

In Appendix 15 we approximate the integral:

$$
t = (r_e^2 - r_p^2)^{1/2} + 2m\ln(2r_e/r_p) + m. \tag{3.26}
$$



The first term, $(r_e^2 - r_p^2)^{1/2}$, is, by the Pythagorean theorem, the time required for light to travel in a straight line in a flat spacetime from Earth to Sun. The other terms represent a delay of this flat spacetime $t$.

Add analogous delay terms for the path from Sun to Planet, double to include the return trip, and find for Mercury a delay of $2.4 \times 10^{-4}$ sec. One difficulty in performing the experiment lies in determining the positions of the planets in terms of the $r$ coordinate of the Schwarzschild metric accurately enough to calculate the flat spacetime time. This difficulty and others have been overcome and the prediction verified within 5%. Signals from the Viking spacecraft on Mars have confirmed this effect within .1%.
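The quoted delay for Mercury can be reproduced from the delay terms of Eq. (3.26). The sketch below takes the ray to graze the Sun, so that $r_p$ is the solar radius, and uses assumed mean distances for Earth and Mercury; these numbers and the function name `one_way_delay` are illustrative and not from the text.

```python
import math


def one_way_delay(m, r_far, r_p):
    """Delay terms of Eq. (3.26): 2*m*ln(2*r_far/r_p) + m, in the time units of m."""
    return 2.0 * m * math.log(2.0 * r_far / r_p) + m


# m for the Sun in seconds is from the text; the distances (km) are assumed values.
M_SUN_SEC = 4.92e-6
R_SUN_KM = 6.96e5        # closest approach r_p, taken as the solar radius
R_EARTH_KM = 1.496e8     # Earth-Sun distance
R_MERCURY_KM = 5.79e7    # Mercury-Sun distance (mean)

# Earth-to-Sun leg plus Sun-to-planet leg, doubled for the return trip.
round_trip_delay = 2.0 * (one_way_delay(M_SUN_SEC, R_EARTH_KM, R_SUN_KM)
                          + one_way_delay(M_SUN_SEC, R_MERCURY_KM, R_SUN_KM))
print(f"round trip delay = {round_trip_delay:.2e} sec")
```

Only ratios of distances enter the logarithm, so the distances may be in any common unit while $m$ carries the units of time.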

Quasars can vary in brightness on a timescale of months. We see variation in the southern image of the original double quasar about 1½ years after the same variation of the northern image. The southern image of the quasar is 1 arcsec from the lensing galaxy; that of the northern is 5 arcsec. Thus the light of the southern image travels less distance than that of the northern, causing a delay in the northern image. But this is more than compensated for by the gravitational delay of the light of the southern image passing closer to the lensing galaxy.

Geodetic Precession. Another effect of general relativity, Willem de Sitter’s geodetic precession, is motivated in Figs. 3.4 and 3.5. In Fig. 3.4, a vector in a plane is moved parallel to itself from A around a closed curve made up of geodesics. The vector returns to A with its original direction. In Fig. 3.5 a vector on a sphere is moved parallel to itself (i.e., parallel to itself in local planar frames), from A around a closed curve made of geodesics. The vector returns to A rotated through the angle α. This is a manifestation of curvature.

Fig. 3.4: A parallel transported vector returns to A with its original direction.

Fig. 3.5: A parallel transported vector returns to A rotated through angle α.

In a flat spacetime the axis of rotation of an inertial gyroscope moves parallel to itself in an inertial frame. The same is true in a curved spacetime in local inertial frames. But in a curved spacetime the orientation of the axis with respect to the distant stars can change over a worldline. This is geodetic precession. The rotating Earth-Moon system is a “gyroscope” orbiting the Sun. The predicted geodetic precession is ∼ .02 arcsec/yr. The lunar laser experiment has confirmed this to better than 1%.


3.4 Kerr Spacetimes


The Schwarzschild metric describes the spacetime around a spherically symmetric object. In particular, the object cannot be rotating, as a rotating object has, e.g., an axis of rotation. The Kerr metric, discovered only in 1963 by Roy Kerr, describes the spacetime outside a rotating object which is symmetric around its axis of rotation and unchanging in time:

Kerr Metric

$$
\begin{aligned}
ds^2 = {} & \left(1 - \frac{2mr}{\Sigma}\right)dt^2 + \frac{4mar\sin^2\phi}{\Sigma}\,dt\,d\theta - \frac{\Sigma}{\Delta}\,dr^2 \\
& - \Sigma\,d\phi^2 - \left(r^2 + a^2 + \frac{2ma^2r\sin^2\phi}{\Sigma}\right)\sin^2\phi\;d\theta^2,
\end{aligned}
\tag{3.27}
$$

where $m = \kappa M$, as in the Schwarzschild metric; $a = J/M$, the angular momentum per unit mass of the central object; $\Sigma = r^2 + a^2\cos^2\phi$; $\Delta = r^2 - 2mr + a^2$; and the axis of rotation is the $z$-axis. The parameter $a$ is restricted to $0 \le a \le 1$.

Exercise 3.12. Show that if the central object is not spinning, then the Kerr metric Eq. (3.27) reduces to the Schwarzschild metric Eq. (3.5).

Note the surfaces $t = \text{const}$, $r = \text{const}$ in a Kerr spacetime do not have the metric of a sphere.

Gravitomagnetism. All of the gravitational effects that we have discussed until now are due to the mass of an object. Gravitomagnetism is the gravitational effect of the motion of an object. Its name refers to an analogy with electromagnetism: an electric charge at rest creates an electric field, while a moving charge also creates a magnetic field.

The Lense-Thirring or frame dragging gravitomagnetic effect causes a change in the orientation of the axis of rotation of a small spinning object in the spacetime around a large spinning object. An Earth satellite is, by virtue of its orbital motion, a spinning object, whose axis of rotation is perpendicular to its orbital plane. The Kerr metric predicts that the axis of a satellite in polar orbit at two Earth radii will change by ∼ .03 arcsec/yr. This is a shift of ∼1 m/yr in the intersection of the orbital and equatorial planes. Observations of two LAGEOS satellites in non-polar orbits have detected this within, it is claimed, 10% of the prediction of general relativity. This is the first direct measurement of gravitomagnetism.

Fig. 3.6: The outer shell and rockets keep the conscience inertial.

The Gravity Probe B experiment, launched in polar orbit in April 2004, will accurately measure frame dragging on gyroscopes. The experiment is a technological tour-de-force. The gyroscopes are rotating ping-pong ball sized spheres made of fused quartz, polished within 40 atoms of a perfect sphere, and cooled to near absolute zero with liquid helium. To work properly, the gyroscopes must move exactly on a geodesic. In particular, atmospheric drag and the solar wind must not affect the gyroscopes’ orbit. To accomplish this, the satellite is drag free: the gyroscopes are in an inner free floating part of the satellite, its conscience. The conscience is protected from outside influences by an outer shell. See Fig. 3.6. Small rockets on the shell keep it centered around the conscience when the shell deviates from inertial motion. Clever!

The predicted frame dragging precession of a gyroscope whose axis is perpendicular to its orbit is .04 arcsec/year. The experiment will test this to about 1%. The experiment will also measure geodetic precession. The predicted geodetic precession of a gyroscope whose axis is in the plane of its orbit is 7 arcsec/year. The experiment will test this to better than one part in $10^4$. Results are expected in 2007.

The frame dragging effects of the rotating Earth are tiny. But frame dragging is an important effect in accretion disks around a rotating black hole. These disks are a distant relative of the rings of Saturn. They form when a black hole in a binary star system attracts matter from its companion, forming a rapidly rotating disk of hot gas around it. There is evidence that frame dragging causes the axis of some disks to precess hundreds of revolutions a second.

Satellites in the same circular equatorial orbit in a Kerr spacetime but with opposite directions take different coordinate times $\Delta t_\pm$ for the complete orbits $\theta \to \theta \pm 2\pi$. (Note that the orbits do not return the satellite to the same point with respect to the spinning Earth’s surface.) We derive this gravitomagnetic clock effect from the $r$-geodesic equation for circular equatorial orbits in the Kerr metric:

$$
\dot t^{\,2} - 2a\,\dot t\,\dot\theta + \left(a^2 - \frac{r^3}{m}\right)\dot\theta^2 = 0. \tag{3.28}
$$

Solve for $dt/d\theta$:

$$
\left(\frac{dt}{d\theta}\right)_{\pm} = \pm\left(\frac{r^3}{m}\right)^{1/2} + a, \tag{3.29}
$$

where $\pm$ correspond to increasing and decreasing $\theta$.

Integrate Eq. (3.29) over one orbit. This multiplies the right side of the equation by $\pm 2\pi$: $\Delta t_\pm = 2\pi(r^3/m)^{1/2} \pm 2\pi a$. Then $\Delta t_+ - \Delta t_- = 4\pi a$; an orbit in the sense of $a$ takes longer than an opposite orbit.

Exercise 3.13. Show that for the Earth, $\Delta t_+ - \Delta t_- = 2 \times 10^{-7}$ sec. This has not been measured.
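The step from Eq. (3.28) to Eq. (3.29) is the solution of a quadratic in $dt/d\theta$. As a short check, divide Eq. (3.28) by $\dot\theta^2$ and complete the square:

$$
\left(\frac{dt}{d\theta}\right)^2 - 2a\,\frac{dt}{d\theta} + a^2 - \frac{r^3}{m} = 0
\quad\Longrightarrow\quad
\left(\frac{dt}{d\theta} - a\right)^2 = \frac{r^3}{m}
\quad\Longrightarrow\quad
\frac{dt}{d\theta} = a \pm \left(\frac{r^3}{m}\right)^{1/2}.
$$

The term $a$ enters with the same sign for both directions of travel, while the integration limits $\pm 2\pi$ have opposite signs. That is what produces the difference $4\pi a$ between the two orbital times.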


3.5 The Binary Pulsar


We have seen that the differences between Einstein’s and Newton’s theories of gravity are exceedingly small in the solar system. Large differences require strong gravity and/or large velocities. Both are present in a remarkable binary star system discovered in 1974: two 1.4 solar mass neutron stars are about a solar diameter apart and have an orbital period of eight hours. Imagine!

By great good fortune, one of the neutron stars is a pulsar spinning at 17 revolutions/sec. The pulsar’s pulses serve as the ticking of a very accurate clock in the system, rivalling the most accurate clocks on Earth. Several relativistic effects have been measured by analyzing the arrival times of the pulses at Earth. These measurements provide the most probing tests of general relativity to date, earning the system the title “Nature’s gift to relativists”.

The periastron advance of the pulsar is 4.2°/year – 35,000 times that of Mercury. (Perihelion refers to the Sun; periastron is the generic term.)

Several changing relativistic effects affecting the arrival times at Earth of the pulsar’s pulses have been measured: the gravitational redshift of the rate of clocks on Earth as the Earth’s distance from the Sun changes over the course of the year, the gravitational redshift of the time between the pulses as the pulsar moves in its highly elliptical (eccentricity = .6) orbit, time dilation of the pulsar clock due to its motion, and light delay in the pulses.

Most important, the system provides evidence for the existence of gravitational waves, predicted by general relativity. Gravitational waves are propagated disturbances in the metric caused by matter in motion. They travel at the speed of light and carry energy away from their source. Violent astronomical phenomena, such as supernovae and the big bang (see Sec. 4.1), emit gravitational waves. So far, none has been directly detected on Earth. (But see Sec. 4.5.)

General relativity predicts that the orbital period of the binary pulsar system decreases by $10^{-4}$ sec/year due to energy loss from the system from gravitational radiation. The prediction is based on the full field equation, rather than the vacuum field equation used earlier in this chapter, and provides the only precision test of the full field equation to date. It is confirmed within .3%.

As gravitational waves carry energy away from the system, the stars come closer together. They will collide in about 300 million years, with an enormous burst of gravitational radiation. The orbital period just before the collision will be of the order of $10^{-3}$ sec!

An even more spectacular binary system of neutron stars was discovered in 2003: both neutron stars are pulsars, with an orbital period of 2.4 hours. A few years of observation of this system should greatly improve the accuracy of some existing tests of general relativity and also provide new tests.


3.6 Black Holes


The $(1 - 2m/r)^{-1}dr^2$ term in the Schwarzschild metric Eq. (3.5) has a singularity at the Schwarzschild radius $r = 2m$. For the Sun, the Schwarzschild radius is 3 km and for the Earth it is .9 cm, well inside both bodies. Since the Schwarzschild metric is a solution of the vacuum field equation, valid only outside the central body, the singularity has no physical relevance for the Sun or Earth.
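The two radii follow from $r = 2m = 2\kappa M$ with the factors of $c$ restored. The sketch below assumes SI values for the gravitational parameters $GM$ of the Sun and Earth and for the speed of light, none of which are given in the text; the function name is illustrative.

```python
C = 299792458.0  # speed of light in m/s (assumed value, not from the text)


def schwarzschild_radius_m(gm):
    """Schwarzschild radius r = 2m = 2*G*M/c**2 in meters, given G*M in m^3/s^2."""
    return 2.0 * gm / C**2


# Assumed gravitational parameters G*M in SI units (not from the text).
GM_SUN = 1.32712440018e20
GM_EARTH = 3.986004418e14

r_sun_km = schwarzschild_radius_m(GM_SUN) / 1000.0
r_earth_cm = schwarzschild_radius_m(GM_EARTH) * 100.0
print(f"Sun: {r_sun_km:.2f} km, Earth: {r_earth_cm:.2f} cm")
```

Both values are far smaller than the radii of the bodies themselves, which is why the vacuum solution never reaches $r = 2m$ for the Sun or Earth.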

We may inquire however about the properties of an object which is inside its Schwarzschild radius. The object is then a black hole. The $r = 2m$ surface is called the (event) horizon of the black hole.

Exercise 3.14. The metric of a plane using polar coordinates is $ds^2 = dr^2 + r^2 d\theta^2$. Substitute $u = 1/r$ to obtain the metric $ds^2 = u^{-4}du^2 + u^{-2}d\theta^2$. This metric has a singularity at the origin, even though there is no singularity in the plane there. It is called a coordinate singularity.

Introduce Painlevé coordinates with the coordinate change

$$
d\bar t = dt + \frac{(2mr)^{1/2}}{r - 2m}\,dr
$$

in the Schwarzschild metric, Eq. (3.5), to obtain the metric

$$
ds^2 = d\bar t^{\,2} - \left(dr + \left(\frac{2m}{r}\right)^{1/2} d\bar t\right)^2 - r^2\,d\Omega^2.
$$

The only singularity in this metric is at $r = 0$. This shows that the singularity at $r = 2m$ in the Schwarzschild metric is only a coordinate singularity; there is no singularity in the spacetime. An observer crossing the horizon would not notice anything special. The $r = 0$ singularity is a true spacetime singularity.
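The Painlevé form can be checked by direct substitution. As a sketch, write $\bar t$ for the new time coordinate (the bar is a notational assumption here), put $f = 1 - 2m/r$ and $g = (2mr)^{1/2}/(r - 2m)$, so that $dt = d\bar t - g\,dr$. The $t$-$r$ part of Eq. (3.5) becomes

$$
f\,(d\bar t - g\,dr)^2 - f^{-1}dr^2 = f\,d\bar t^{\,2} - 2fg\;d\bar t\,dr + \left(fg^2 - f^{-1}\right)dr^2 .
$$

Since $fg = (2m/r)^{1/2}$ and $fg^2 - f^{-1} = (2m/r - 1)/f = -1$, this equals

$$
\left(1 - \frac{2m}{r}\right)d\bar t^{\,2} - 2\left(\frac{2m}{r}\right)^{1/2}d\bar t\,dr - dr^2
= d\bar t^{\,2} - \left(dr + \left(\frac{2m}{r}\right)^{1/2}d\bar t\right)^2 .
$$

The factor $(1 - 2m/r)^{-1}$ has cancelled, which is why nothing singular remains at $r = 2m$.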

Suppose a material particle or pulse of light moves outward (not necessarily radially) from $r_1 = 2m$ to $r_2$. Since $ds^2 \ge 0$ and $d\Omega^2 \ge 0$, the Schwarzschild metric Eq. (3.5) shows that $dt \ge (1 - 2m/r)^{-1}dr$. Thus the total coordinate time $\Delta t$ satisfies

$$
\Delta t \ge \int_{2m}^{r_2}\left(1 - \frac{2m}{r}\right)^{-1}dr = \int_{2m}^{r_2}\left(1 + \frac{2m}{r - 2m}\right)dr = \infty ;
$$

neither matter nor light can escape a Schwarzschild black hole! The same calculation shows that it takes an infinite coordinate time $\Delta t$ for matter or light to enter a black hole.
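To see why the integral diverges, one can write out the antiderivative. This is a routine calculus step, sketched here with a lower limit $r_1 > 2m$ that is then sent to $2m$:

$$
\int_{r_1}^{r_2}\left(1 + \frac{2m}{r - 2m}\right)dr = (r_2 - r_1) + 2m\ln\frac{r_2 - 2m}{r_1 - 2m} \;\longrightarrow\; \infty \quad\text{as } r_1 \to 2m .
$$

The divergence is only logarithmic, which is consistent with the exponential approach to $r = 2m$ in coordinate time that is the subject of Exercise 3.15.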

Exercise 3.15. Suppose an inertial object falling radially toward a black hole is close to the Schwarzschild radius, so that $r - 2m$ is small. Use Eqs. (3.14) and (3.17) and approximate to show that $dr/dt = (2m - r)/2m$. Integrate and show that $r - 2m$ decreases exponentially in $t$.


According to Eq. (3.8), the redshift of an object approaching a Schwarzschild black hole increases to infinity as the object approaches the Schwarzschild radius at $r = 2m$. A distant observer will see the rate of all physical processes on the object slow to zero as it approaches the Schwarzschild radius. In particular, its brightness (rate of emission of light) will dim to zero and it will effectively disappear.

On the other hand, an inertial object radially approaching a Schwarzschild black hole will cross the Schwarzschild radius in a finite time according to a clock it carries! To see this, observe that by the metric postulate Eq. (2.15), $dr/ds$ is the rate of change of $r$ measured by a clock carried by the object. By Eq. (3.17), $|dr/ds|$ increases as $r$ decreases. Thus the object will cross the Schwarzschild radius, never to return, in a finite proper time $\Delta s$ measured by its clock.

Fig. 3.7: The horizon and static limit of a Kerr black hole.

For a Schwarzschild black hole, $g_{rr} = \infty$ and $g_{tt} = 0$ on the surface $r = 2m$, the horizon. We have seen that there is no singularity in the spacetime there. For a Kerr black hole, $g_{rr} = \infty$ on the surface $\Delta = 0$, and $g_{tt} = 0$ on the surface $\Sigma = 2mr$. Fig. 3.7 shows the axis of rotation of the Kerr black hole, the inner surface $\Delta = 0$ (called the horizon), and the outer surface $\Sigma = 2mr$ (called the static limit). There is no singularity in the spacetime on either surface.

Exercise 3.16. a. Show that the horizon is at $r = m + (m^2 - a^2)^{1/2}$.

b. Show that the static limit is at $r = m + (m^2 - a^2\cos^2\phi)^{1/2}$.

As with the Schwarzschild horizon, it is impossible to escape the Kerr horizon, even though it is possible to reach it in finite proper time. The proofs are more difficult than for the Schwarzschild metric. They are not given here.

The crescent shaped region between the horizon and the static limit is the ergosphere. In the ergosphere, the $dt^2$, $dr^2$, $d\phi^2$, and $d\theta^2$ terms in the Kerr metric Eq. (3.27) are negative. Since $ds^2 \ge 0$ for both particles and light, the $dt\,d\theta$ term must be positive, i.e., $a\,(d\theta/dt) > 0$. In the ergosphere it is impossible to be at rest, and motion must be in the direction of the rotation!


General relativity allows black holes, but do they actually exist? The answer is almost certainly “yes”. At least two varieties are known: solar mass black holes, of a few solar masses, and supermassive black holes, of $10^6 - 10^{10}$ solar masses.

We saw in Sec. 3.1 that the Earth, the Sun, white dwarf stars, and neutron stars are stabilized by different forces. No known force can stabilize a supernova remnant larger than ∼3 solar masses; it will collapse to a black hole.

Since black holes emit no radiation, they cannot be observed directly. They must be inferred from their effects on their environment.

Several black holes of a few solar masses are known in our galaxy. They are members of X-ray emitting binary star systems in which a normal star is orbiting with a compact companion. Orbital data show that in some of these systems, the companion’s mass is small enough to be a neutron star. But in others, the companion is well over the three solar mass limit for neutron stars, and is therefore believed to be a black hole.

During “normal” periods, gas pulled from the normal star and heated violently on its way to the compact companion causes the X-rays. Some of the systems with a neutron star companion exhibit periods of much stronger X-ray emission. Sometimes the cause is gas suddenly crashing onto the surface of the neutron star. Sometimes the cause is a thermonuclear explosion of material accumulated on the surface. Systems with companions over three solar masses never exhibit such periods. Presumably this is because the companions are black holes, which do not have a surface.

Many, if not most, galaxies have a supermassive black hole at their center. X-ray observations of several galactic nuclei indicate that matter orbits the nuclei at 3-10 Schwarzschild radii at 10% of the speed of light. Quasars are apparently powered by a massive black hole at the center of a galaxy.

There is a supermassive black hole at the center of our galaxy. A star has been observed orbiting the hole with eccentricity .87, closest approach to the hole 17 light hours, and period 15 years. This implies that the black hole is ≈ 4 million solar masses.
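The mass estimate can be reproduced with Kepler's third law, which in units of AU, years, and solar masses reads $M = a^3/P^2$. The sketch below is Newtonian and neglects the mass of the orbiting star. It takes the closest approach to be 17 light hours, the value consistent with the quoted mass and period. The unit conversion and the function name are assumptions for illustration.

```python
def central_mass_solar(closest_approach_au, eccentricity, period_years):
    """Mass in solar masses from Kepler's third law, M = a**3 / P**2 (a in AU, P in years)."""
    semimajor_axis_au = closest_approach_au / (1.0 - eccentricity)
    return semimajor_axis_au**3 / period_years**2


# Assumed conversion (not from the text): one light hour in astronomical units,
# computed as (km travelled by light in an hour) / (km per AU).
LIGHT_HOUR_AU = 299792.458 * 3600.0 / 1.495978707e8

mass = central_mass_solar(17.0 * LIGHT_HOUR_AU, 0.87, 15.0)
print(f"M = {mass:.2e} solar masses")
```

Because the mass scales as the cube of the orbit size, the estimate is very sensitive to the closest approach distance and to the eccentricity.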

Most stars rotate. Due to conservation of angular momentum, a rotating star collapsed to a black hole will spin rapidly. Also, angular momentum can be transferred from an accretion disk to its black hole. Since the Kerr metric describes a rotating black hole, it is of fundamental astrophysical importance.

The black hole at the center of our galaxy is spinning rapidly: there is good evidence that the angular momentum parameter $a$ in the Kerr metric, Eq. (3.27), is about half of the maximum possible value $a = 1$.


Chapter 4

Cosmological Spacetimes

4.1 Our Universe I

This chapter is devoted to cosmology, the study of the universe as a whole. Since general relativity is our best theory of space and time, we use it to construct a model of the spacetime of the entire universe, from its origin, to today, and into the future. This is an audacious move, but the model is very successful, as we shall see.

We begin with a description of the universe as seen from Earth. The Sun is but one star of 200 billion or so bound together gravitationally to form the Milky Way galaxy. The Milky Way is but one of tens of billions of galaxies in the visible universe. It is about 100,000 light years across. Galaxies cluster in groups of a few to a few thousand. Our Milky Way is a member of the local group, consisting of over 30 galaxies. The nearest major galaxies are a few million light years away. Galaxies are inertial objects.

Fig. 4.1: Balloon analogy to the universe.

The Universe Expands. All galaxies, except a few nearby, recede from us with a velocity $v$ proportional to their distance $d$ from us:

$$
v = Hd. \tag{4.1}
$$

This is the “expansion of the universe”. Eq. (4.1) is Hubble’s law; $H$ is Hubble’s constant. Its value is 22 (km/sec)/million light years, within 5%.

An expanding spherical balloon, on which bits of paper are glued, each representing a galaxy, provides a very instructive analogy. See Fig. 4.1. The balloon’s two dimensional surface is the analog of our universe’s three dimensional space. From the viewpoint of every galaxy on the balloon, other galaxies recede, and Eq. (4.1) holds. The galaxies are glued, not painted on, because they do not expand with the universe.


We see the balloon expanding in an already existing three dimensional space, but surface dwellers are not aware of this third spatial dimension. For both surface dwellers in their universe and us in ours, galaxies separate not because they are moving apart in a static fixed space, but because space itself is expanding, carrying the galaxies with it.

We cannot verify Eq. (4.1) directly because neither $d$ nor $v$ can be measured directly. But they can be measured indirectly, to moderate distances, as follows.

The distance $d$ to objects at moderate distances can be determined from their luminosity (energy received at Earth/unit time/unit area). If the object is near enough so that relativistic effects can be ignored, then its luminosity $\ell$ at distance $d$ is its intrinsic luminosity $L$ (energy emitted/unit time) divided by the area of a sphere of radius $d$: $\ell = L/4\pi d^2$. Thus if the intrinsic luminosity of an object is known, then its luminosity determines its distance. The intrinsic luminosity of various types of stars and galaxies has been approximately determined, allowing an approximate determination of their distances.
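Inverting the relation $\ell = L/4\pi d^2$ gives $d = (L/4\pi\ell)^{1/2}$. As a small sanity check of the method on a nearby object, the sketch below recovers the Earth-Sun distance from the Sun's power output and the solar flux received at Earth. Both numbers are assumed values not given in the text, and the function name is illustrative.

```python
import math


def distance_from_luminosity(intrinsic_luminosity, received_luminosity):
    """Invert l = L / (4*pi*d**2) for d; W and W/m^2 give a distance in meters."""
    return math.sqrt(intrinsic_luminosity / (4.0 * math.pi * received_luminosity))


# Assumed illustrative values (not from the text).
L_SUN_WATTS = 3.828e26    # intrinsic luminosity of the Sun
FLUX_AT_EARTH = 1361.0    # luminosity received at Earth, W / m^2

d = distance_from_luminosity(L_SUN_WATTS, FLUX_AT_EARTH)
print(f"d = {d:.3e} m")
```

The same function applies to any object whose intrinsic luminosity is known, as long as relativistic effects can be ignored.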

The velocity v of galaxies to moderate distances can be determined fromtheir redshift. Light from all galaxies, except a few nearby, exhibits a redshiftz > 0. By definition, z = fe/fo−1 (see Eq. (1.7)). If fe is known (for example ifthe light is a known spectral line), then z is directly and accurately measurable.Eq. (4.11) shows that the redshifts are not Doppler redshifts, but expansionredshifts, the result of the expansion of space.

However, to moderate distances we can ignore relativistic effects and determine v using the approximate Doppler formula z = v from Exercise 1.6 b. Substituting z = v into Eq. (4.1) gives z = Hd, valid to moderate distances.¹ This approximation to Hubble’s law was established in 1929 by Edwin Hubble.

Since z increases with d, we can use z (which is directly and accurately measurable) as a proxy for d (which is not). Thus we say that a galaxy is “at”, e.g., z = 2.

The Big Bang. Reversing time, the universe contracts. From Eq. (4.1) galaxies approach us with a velocity proportional to their distance. Thus they all arrive here at the same time. Continuing into the past, galaxies were crushed together. The matter in the universe was extremely compressed, and thus at an extremely high temperature, accompanied by extremely intense electromagnetic radiation. Matter and space were taking part in an explosion, called the big bang. The recession of the galaxies that we see today is the continuation of the explosion. The big bang is the origin of our universe. It occurred ∼ 13 billion years ago. The idea of a hot early universe preceding today’s expansion was first proposed by Georges Lemaître in 1927.

There is no “site” of the big bang; it occurred everywhere. Our balloon again provides an analogy. Imagine it expanding from a point, its big bang. At a later time, there is no specific place on its surface where the big bang occurred.

Due to the finite speed of light, we see a galaxy not as it is today, but as it was when the light we detect from it was emitted. When we look out in space, we look back in time! This allows us to study the universe at earlier times. Fig. 4.2 illustrates this. Telescopes today see some galaxies as they were when the universe was less than 1 billion years old.

¹ The precise relation between z and d is given by the distance-redshift relation, derived in Appendix 17.

The Cosmic Background Radiation. In 1948 Ralph Alpher and Robert Herman predicted that the intense electromagnetic radiation from the big bang, much diluted and cooled by the universe’s expansion, still fills the universe. In 1965 Arno Penzias and Robert Wilson, by accident and unaware of the prediction, discovered microwave radiation with the proper characteristics. This cosmic background radiation (CBR) is strong evidence that a big bang really occurred. Sec. 4.4 describes the wealth of cosmological information in the CBR.

Fig. 4.2: The redshift z of a galaxy vs. the age of the universe (in $10^9$ years) when the light we see from it was emitted.

The CBR has a 2.7 K blackbody spectrum. The radiation started its journey toward us ∼ 380,000 years after the big bang, when the universe had cooled to about 3000 K, allowing nuclei and electrons to condense into neutral atoms, which made the universe transparent to the radiation.

To a first approximation the CBR is isotropic, i.e., spherically symmetric about us. However, in 1977 astronomers measured a small dipole anisotropy: a relative redshift in the CBR of z = −.0012 cos θ, where θ is the angle from the constellation Leo.

Exercise 4.1. Explain this as due to the Earth moving toward Leo at 370 km/sec with respect to an isotropic CBR. In this sense we have discovered the absolute motion of the Earth.

The relative redshift is modulated by the orbital motion of the Earth around the Sun. After correcting for the dipole anisotropy, the CBR is isotropic to 1 part in 10,000.
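The size of the dipole can be checked numerically. This is a rough sketch, assuming the low-velocity Doppler approximation z = v (in units with c = 1) applied along the line of sight; the function name is illustrative and the check is not a full solution of Exercise 4.1.

```python
import math

C_KM_PER_SEC = 299_792.458       # speed of light
EARTH_SPEED_KM_PER_SEC = 370.0   # speed toward Leo quoted in Exercise 4.1

def dipole_redshift(theta_rad, speed=EARTH_SPEED_KM_PER_SEC, c=C_KM_PER_SEC):
    """Approximate Doppler shift seen at angle theta from the direction of motion.

    Looking along the motion (theta = 0) the radiation is blueshifted, so z < 0;
    looking backward (theta = pi) it is redshifted by the same amount.
    """
    return -(speed / c) * math.cos(theta_rad)

for degrees in (0, 90, 180):
    print(degrees, dipole_redshift(math.radians(degrees)))
```

The amplitude v/c for 370 km/sec is close to the .0012 quoted for the measured anisotropy.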

Further evidence for a big bang comes from the theory of big bang nucleosynthesis. The theory predicts that from a few seconds to a few minutes after the big bang, nuclei formed, almost all hydrogen (75% by mass) and helium-4 (25%), with traces of some other light elements. Observations confirm the prediction. This is a spectacular success of the big bang theory. Other elements contribute only a small fraction of the mass of the universe. They are mostly formed in stars and are strewn into interstellar space by supernovae and by other means, to be incorporated into new stars, their planets, and any life which may arise on the planets.

The big bang is part of the most remarkable generalization of science: The universe had an origin and is evolving at many interrelated levels. For example, galaxies, stars, planets, life, and cultures evolve.


4.2 The Robertson-Walker Metric


The CBR is highly isotropic. Galaxies are distributed approximately isotropically about us. Redshift surveys of hundreds of thousands of galaxies show that our position in the universe is not special. Our cosmological model assumes that the universe is exactly isotropic about every point. This smears the galaxies into a uniform distribution of matter. We assume that this does not affect what we are interested in: the large scale structure of the universe.

We first set up a coordinate system (t, r, φ, θ) for the universe. As the universe expands, the density ρ of matter decreases. Place a clock at rest in every galaxy and set it to some agreed time when some agreed value of ρ is observed at the galaxy. The t coordinate is the (proper) time measured by these clocks.

Choose any galaxy as the spatial origin. Isotropy demands that the φ and θ coordinates of other galaxies do not change in time. Choose any time $t_0$. Define the r coordinate at $t_0$ to be any quantity that increases with increasing distance from the origin. Galaxies in different directions but at the same distance are assigned the same r. Now define the r coordinate of galaxies at times other than $t_0$ by requiring that r, like φ and θ, does not change. Isotropy ensures that all galaxies at a given r at a given t are at the same distance from the origin.

The coordinates (r, φ, θ) are called comoving. Distances between galaxies change in time not because their (r, φ, θ) coordinates change, but because their t coordinate changes, which changes the metric.

Our balloon provides an analogy. Recall that $d\Omega^2 = d\phi^2 + \sin^2\phi\, d\theta^2$ is the metric of a unit sphere. If the radius of the balloon at time t is S(t), then the balloon’s metric at time t is

$$ds^2 = S^2(t)\, d\Omega^2. \tag{4.2}$$

If S(t) increases, then the balloon expands. Distances between the glued-on galaxies increase, but their comoving (φ, θ) coordinates do not change.

Exercise 4.2. Derive the analog of Eq. (4.1) for the balloon. Let d be the distance between two galaxies on the balloon at time t, and let v be the speed with which they are separating. Show that v = Hd, where $H = S'(t)/S(t)$. Hint: Peek ahead at the derivation of Eq. (4.9).

Exercise 4.3. Substitute $\sin\phi = r/(1 + r^2/4)$ in Eq. (4.2). Show that the metric with respect to the comoving coordinates (r, θ) is

$$ds^2 = S^2(t)\, \frac{dr^2 + r^2\, d\theta^2}{(1 + r^2/4)^2}. \tag{4.3}$$

Again, distances between the glued-on galaxies increase with S(t), but their comoving (r, θ) coordinates do not change.



The metric of an isotropic universe, after suitably choosing the comoving r coordinate, is:

Robertson-Walker Metric

$$ds^2 = dt^2 - S^2(t)\, \frac{dr^2 + r^2\, d\Omega^2}{(1 + k r^2/4)^2} \equiv dt^2 - S^2(t)\, d\sigma^2 \qquad (k = 0, \pm 1). \tag{4.4}$$

The elementary but somewhat involved derivation is given in Appendix 16. The field equation is not used. We will use it in Sec. 4.4 to determine S(t). The constant k is the curvature of the spatial metric dσ: k = −1, 0, 1 for negatively curved, flat, and positively curved space respectively.

The metric was discovered independently by H. P. Robertson and A. G. Walker in the mid-1930s. You should compare it with the metric in Eq. (4.3).

Distance. Emit a light pulse from a galaxy to a neighboring galaxy. Since ds = 0 for light, Eq. (4.4) gives dt = S(t) dσ. By the definition of t, or from Eq. (4.4), t is measured by clocks at rest in galaxies. Thus dt is the elapsed time in a local inertial frame in which the emitting galaxy is at rest. Since c = 1 in local inertial frames, the distance between the galaxies is

$$S(t)\, d\sigma. \tag{4.5}$$

This is the distance that would be measured by a rigid rod. The distance is the product of two factors: dσ, which is independent of time, and S(t), which is independent of position. Thus S is a scale factor for the universe: if, e.g., S(t) doubles over a period of time, then so do the distances between galaxies.

The metric of points at coordinate radius r in the Robertson-Walker metric Eq. (4.4) is that of a sphere of radius $S(t)\, r/(1 + k r^2/4)$. From Eq. (4.5) this metric measures physical distances. Thus the sphere has surface area

$$A(r) = \frac{4\pi r^2 S^2(t)}{(1 + k r^2/4)^2} \tag{4.6}$$

and volume

$$V(r) = S(t) \int_0^r \frac{A(\rho)}{1 + k\rho^2/4}\, d\rho. \tag{4.7}$$

If k = −1 (negative spatial curvature), then r is restricted to 0 ≤ r < 2. The substitution ρ = 2 tanh ξ shows that V(2) = ∞, i.e., the universe has infinite volume. If k = 0 (zero spatial curvature), 0 ≤ r < ∞ and the universe has infinite volume. If k = 1 (positive spatial curvature), then A(r) increases as r increases from 0 to 2, but then decreases to zero as r → ∞. The substitution ρ = 2 tan ξ shows that $V(\infty) = 2\pi^2 S^3(t)$; the universe has finite volume! The surface of our balloon provides an analogy. It has finite area and positive curvature. Circumferences of circles of increasing radius centered at the North pole increase until the equator is reached and then decrease to zero.
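The finite volume for k = 1 can be checked numerically. The sketch below evaluates Eq. (4.7) with the substitution ρ = 2 tan ξ suggested in the text, using a plain midpoint rule; the function name and the step count are arbitrary choices made for illustration.

```python
import math

def closed_universe_volume(scale_factor=1.0, steps=100_000):
    """Midpoint-rule estimate of V(infinity) from Eq. (4.7) with k = +1.

    With rho = 2 tan(xi), xi runs over the finite interval [0, pi/2) and
    d(rho) = 2 sec^2(xi) d(xi) = 2 (1 + rho^2/4) d(xi).
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h
        rho = 2.0 * math.tan(xi)
        w = 1.0 + rho * rho / 4.0                                   # 1 + k rho^2/4, k = 1
        area = 4.0 * math.pi * rho**2 * scale_factor**2 / w**2      # Eq. (4.6)
        d_rho = 2.0 * w * h
        total += area / w * d_rho
    return scale_factor * total

print(closed_universe_volume(1.0), 2.0 * math.pi**2)
```

The two printed numbers should agree closely, and multiplying S by 2 should multiply the volume by 8, as expected for a factor of $S^3$.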


4.3 The Expansion Redshift


Hubble’s Law. At time t the distance from Earth at r = 0 to a galaxy at $r = r_e$ is, by Eq. (4.5),

$$d = S(t) \int_0^{r_e} d\sigma. \tag{4.8}$$

This is the distance that would be obtained by adding the lengths of small rigid rods laid end to end between Earth and the galaxy at time t. (Yes, I know that this is not a practical way to measure the distance!)

Differentiate to give the velocity of the galaxy with respect to the Earth:

$$v = S'(t) \int_0^{r_e} d\sigma.$$

Divide to give Hubble’s law v = Hd, Eq. (4.1), where

$$H(t) = \frac{S'(t)}{S(t)}. \tag{4.9}$$

Hubble’s “constant” H(t) depends on t. But for a fixed t (e.g., today), v = Hd holds for all distances and directions.

From v = Hd we see that if d = 1/H ≈ 13.6 billion light years, then v = 1, the speed of light. Galaxies beyond this distance are today receding from us faster than the speed of light. What about the rule “nothing can move faster than light”? In special relativity the rule applies to objects moving in an inertial frame, and in general relativity to objects moving in a local inertial frame. There is no local inertial frame containing both the Earth and such galaxies.

The distance d = 1/H is at z = 1.5. This follows from the distance-redshift relation, derived in Appendix 17.
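To connect 1/H ≈ 13.6 billion light years with the units astronomers usually quote, the following sketch converts it to km/sec per megaparsec and evaluates v = Hd in units of c. The conversion factor between megaparsecs and light years and the function names are assumptions added here for illustration.

```python
C_KM_PER_SEC = 299_792.458
LIGHT_YEARS_PER_MPC = 3.2616e6   # one megaparsec in light years (assumed conversion)

def hubble_constant_km_s_mpc(hubble_distance_gly=13.6):
    """H in km/sec/Mpc, given the distance 1/H in billions of light years."""
    return C_KM_PER_SEC * LIGHT_YEARS_PER_MPC / (hubble_distance_gly * 1e9)

def recession_speed_in_c(distance_gly, hubble_distance_gly=13.6):
    """v = H d with c = 1; both distances are in billions of light years."""
    return distance_gly / hubble_distance_gly

print(hubble_constant_km_s_mpc())      # H today in km/sec/Mpc
print(recession_speed_in_c(13.6))      # exactly the speed of light
print(recession_speed_in_c(20.0))      # beyond 1/H: v > 1
```

A value of v greater than 1 here is not a speed measured in any local inertial frame, which is why it does not conflict with the rule discussed above.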

The Expansion Redshift. We now obtain a relationship between the expansion redshift of light from a galaxy and the size of the universe when the light was emitted. Suppose that light signals are emitted toward us at events $(t_e, r_e)$ and $(t_e + \Delta t_e, r_e)$ and received by us at events $(t_o, 0)$ and $(t_o + \Delta t_o, 0)$. Since ds = 0 for light, Eq. (4.4) gives

$$\frac{dt}{S(t)} = \frac{dr}{1 + k r^2/4}. \tag{4.10}$$

Integrate this over both light worldlines to give

$$\int_{t_e}^{t_o} \frac{dt}{S(t)} = \int_0^{r_e} \frac{dr}{1 + k r^2/4} = \int_{t_e + \Delta t_e}^{t_o + \Delta t_o} \frac{dt}{S(t)}.$$

Subtract $\int_{t_e + \Delta t_e}^{t_o}$ from both ends to give

$$\int_{t_e}^{t_e + \Delta t_e} \frac{dt}{S(t)} = \int_{t_o}^{t_o + \Delta t_o} \frac{dt}{S(t)}.$$

If $\Delta t_e$ and $\Delta t_o$ are small, this becomes

$$\frac{\Delta t_e}{S(t_e)} = \frac{\Delta t_o}{S(t_o)}.$$

Use Eq. (1.6) to obtain the desired relation:

$$z + 1 = \frac{S(t_o)}{S(t_e)}. \tag{4.11}$$

This wonderfully simple formula directly relates the expansion redshift to the universe’s expansion. Since S = 0 at the big bang, the big bang is at z = ∞.

A galaxy at z ≈ 7 is the most distant known. According to Eq. (4.11), the galaxy was 1/8 as far away when it emitted the light we see today. The light was emitted ∼ 750 million years after the big bang. It is brightened by a factor of at least 25 by a foreground gravitational lensing galaxy.

According to Eq. (1.7), the observed frequency of light from the galaxy is 1/8 the emitted frequency. The observed rate of any other physical process detected in the galaxy would, by Eq. (1.6), also be 1/8 the actual rate.

Distant supernovae exhibit this effect. Type Ia supernovae all have about the same duration. But supernovae out to z = .8 (the farthest to which measurements have been made) last about 1 + z times longer as seen from Earth.

Suppose that an object at distance z emits blackbody radiation at temperature T(z). According to Planck’s law, the object radiates at frequency $f_e$ with intensity proportional to

$$\frac{f_e^3}{e^{h f_e / k T(z)} - 1},$$

where h is Planck’s constant and k is Boltzmann’s constant. Radiation emitted at $f_e$ is received by us redshifted to $f_o$, where from Eq. (1.7), $f_e = (z + 1) f_o$. Thus the intensity of the received radiation is proportional to

$$\frac{f_o^3}{e^{h f_o / k T(0)} - 1},$$

where T(0) is defined by

$$z + 1 = \frac{T(z)}{T(0)}. \tag{4.12}$$

The observed spectrum is also blackbody, at temperature T(0).

As stated in Sec. 4.1, the CBR was emitted at ≈ 3000 K and is observed today at 2.7 K. Thus from Eq. (4.12), its redshift z ≈ 3000 K/2.7 K ≈ 1100.

Analysis of spectral lines in a quasar at z = 3 shows that they were excited by an ∼11 K CBR. Eq. (4.12) predicts this: T(3) = (3 + 1)T(0) ≈ 11 K. From Eq. (4.11), the universe was then about 1/(1 + 3) = 25% of its present size.
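The numbers quoted in this section all follow from Eqs. (4.11) and (4.12) by simple arithmetic. A small sketch, with illustrative function names that are not part of the text:

```python
CBR_TEMPERATURE_TODAY_K = 2.7

def scale_ratio(z):
    """S(t_e)/S(t_o) from Eq. (4.11): relative size of the universe at emission."""
    return 1.0 / (1.0 + z)

def observed_rate(emitted_rate, z):
    """Any rate (frequency, supernova evolution) is observed slowed by 1 + z."""
    return emitted_rate / (1.0 + z)

def cbr_temperature_at(z, t_today=CBR_TEMPERATURE_TODAY_K):
    """T(z) = (1 + z) T(0), Eq. (4.12)."""
    return (1.0 + z) * t_today

def redshift_from_temperatures(t_emitted, t_observed):
    """Solve Eq. (4.12) for z."""
    return t_emitted / t_observed - 1.0

print(scale_ratio(7))                           # galaxy at z = 7: 1/8 of today's distance
print(cbr_temperature_at(3))                    # CBR temperature seen by the z = 3 quasar
print(redshift_from_temperatures(3000.0, 2.7))  # redshift of the CBR
```

The last value comes out slightly above 1100, consistent with the rounded figure in the text.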


4.4 Our Universe II


The Robertson-Walker metric Eq. (4.4) is a consequence of isotropy alone; the field equation was not used in its derivation. There are two unknowns in the metric: the curvature k and the expansion factor S(t). In this section we first review the compelling observational evidence that k = 0, i.e., that our universe is spatially flat. Together with other data, this will force us to modify the field equation. We then solve the modified field equation to determine S(t).

The Universe Is Flat. In 1981 Alan Guth proposed that there was a period of exponential expansion in the very early universe. This inflation started perhaps $10^{-34}$ sec after the big bang, lasted perhaps $10^{-35}$ sec, and expanded the universe by a factor of perhaps $10^{26}$! As fantastic as inflation seems, it explained several puzzling cosmological observations and steadily gained support over the years. By 1990, most cosmologists believed that inflation occurred. Since then more evidence for inflation has accumulated. Inflation predicts a spatially flat universe, i.e., that k = 0 in the Robertson-Walker metric Eq. (4.4).

There is more direct evidence for a spatially flat universe from the CBR. When the CBR was emitted, matter was almost, but not quite, uniformly distributed. The variations originated as random quantum fluctuations. We can “see” them because regions of higher density were hotter, leading to tiny temperature anisotropies (∼ 60 μK, a few parts in $10^5$) in the CBR. The size of these regions does not depend on the curvature of space, but their observed angular size does. In a flat universe, the average temperature difference as a function of angular separation peaks at .8 arcdegree. Observations in 2000 found the peak at .8 arcdegree, providing compelling evidence for a spatially flat universe.

Fig. 4.3: Angular size on a sphere.

To understand how curvature can affect angular size, consider the everyday experience that the angular size of an object decreases with distance. The rate of decrease is different for differently curved spaces. Imagine surface dwellers whose universe is the surface of the sphere in Fig. 4.3 and suppose that light travels along the great circles of the sphere. According to Exercise 2.9 these circles are geodesics. For an observer at the north pole, the angular size of an object of length D decreases with distance, but at a slower rate than on a flat surface. And when the object crosses the equator, its angular size begins to increase. The object has the same angular size α at the two positions in the figure. On a pseudosphere the angular size decreases at a faster rate than on a flat surface.

The angular size-redshift relation gives the relationship between the angular size and redshift of objects in our expanding universe. Its graph is shown in Fig. 4.5. It has a minimum at z = 1.6.

The Field Equation. Despite the strong evidence for a flat universe, we shall see that the field equation Eq. (2.28) applied to the Robertson-Walker metric Eq. (4.4) with k = 0 gives results in conflict with observation. Several solutions to this problem have been proposed. The simplest is to add a term Λg to the field equation:

$$G + \Lambda g = -8\pi\kappa T, \tag{4.13}$$

where Λ is a constant, called the cosmological constant. This is the most natural change possible to the field equation. In fact, the term Λg has appeared before, in the field equation Eq. (A.16). We eliminated it there by assuming that spacetime is flat in the absence of matter. Thus the new term represents a gravitational effect of empty space; it literally “comes with the territory”. If Λ is small enough, then its effects in the solar system are negligible, but it can dramatically affect cosmological models, as we shall see.

To apply the new field equation, we need the energy-momentum tensor T. In Sec. 4.2 we spread the matter in the universe into a uniform distribution of density ρ = ρ(t). Since galaxies interact only gravitationally, it is reasonable to assume that the matter is dust. (This is unreasonable in the very early universe.) Thus T is of the form Eq. (2.27). For the comoving matter, dr = dθ = dφ = 0, and from Eq. (4.4), dt/ds = 1. Thus $T^{tt} = \rho$ is the only nonzero $T^{jk}$.

The field equation Eq. (4.13) for the Robertson-Walker metric Eq. (4.4) with k = 0 and the T just obtained reduces to two ordinary differential equations:

$$-3\left(\frac{S'}{S}\right)^2 + \Lambda = -8\pi\kappa\rho, \qquad (tt \text{ component}) \tag{4.14}$$

$$S'^2 + 2 S S'' - \Lambda S^2 = 0. \qquad (rr,\ \phi\phi,\ \theta\theta \text{ components}) \tag{4.15}$$

Before solving the equations for S, we explore some of their consequences as they stand.

From Eqs. (4.14) and (4.9),

$$\rho + \frac{\Lambda}{8\pi\kappa} = \rho_c, \tag{4.16}$$

where

$$\rho_c = \frac{3H^2}{8\pi\kappa} \tag{4.17}$$

is called the critical density. Today $\rho_c \approx 10^{-29}$ g/cm³, the equivalent of a few hydrogen atoms per cubic meter. This is $10^7$ times less dense than the best vacuum created on Earth.
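The quoted value of the critical density can be reproduced from Eq. (4.17). This is a sketch under stated assumptions: it takes 1/H = 13.6 billion years from Sec. 4.3, uses the cgs value of κ from Appendix A.1, and assumes round values for the length of a year and the mass of a hydrogen atom.

```python
import math

KAPPA_CGS = 6.67e-8            # cm^3/(g sec^2), Newtonian constant (Appendix A.1)
SECONDS_PER_YEAR = 3.156e7     # assumed conversion
HYDROGEN_MASS_G = 1.67e-24     # assumed mass of one hydrogen atom, in grams

def critical_density_cgs(hubble_time_years=13.6e9):
    """rho_c = 3 H^2 / (8 pi kappa), Eq. (4.17), in g/cm^3."""
    hubble = 1.0 / (hubble_time_years * SECONDS_PER_YEAR)   # H in 1/sec
    return 3.0 * hubble**2 / (8.0 * math.pi * KAPPA_CGS)

def hydrogen_atoms_per_cubic_meter(density_cgs):
    """Convert g/cm^3 to an equivalent number of hydrogen atoms per m^3."""
    return density_cgs * 1.0e6 / HYDROGEN_MASS_G

rho_c = critical_density_cgs()
print(rho_c, hydrogen_atoms_per_cubic_meter(rho_c))
```

Both results are consistent with the order-of-magnitude statements in the text: roughly $10^{-29}$ g/cm³ and a few atoms per cubic meter.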

Like the other two terms in Eq. (4.16), the term Λ/8πκ has the dimensions of a density. It is the density of a component of our universe called dark energy. We’ll have more to say about dark energy soon. According to Eq. (4.16) the densities of matter and dark energy add to $\rho_c$. But general relativity does not determine their individual contributions to $\rho_c$. This has been determined by observation. We discuss the results.


Matter. Two accurate determinations of ρ were announced in 2003. One used precise measurements of the anisotropies in the CBR. The other used measurements of the 3D distribution of hundreds of thousands of galaxies. These two very different kinds of measurements both give $\rho = .30\rho_c$, within a few percent. This confirmed earlier estimates based on gravitational lens statistics. Some of this matter is in known forms such as protons, neutrons, electrons, neutrinos, and photons. But most is in a mysterious component of our universe called dark matter.

The measurements show that known forms of matter contribute only $.05\rho_c$ to ρ. This is consistent with two earlier, less accurate, determinations. One used spectra of distant galactic nuclei to “directly” measure the density of known forms of matter in the early universe. The other used the theory of big bang nucleosynthesis, which expresses the density of ordinary matter as a function of the abundance of deuterium in the universe. The abundance was measured in distant pristine gas clouds backlit by even more distant quasars.

It is gratifying to have excellent agreement between the CBR and deuterium measurements, as the deuterium was created a few minutes after the big bang and the CBR was emitted hundreds of thousands of years later. Today over 90% of this matter is intergalactic hydrogen. Only a small fraction is in stars.

Dark matter contributes the remaining $.25\rho_c$ to ρ. Astronomers agree that dark matter exists and is five times as abundant as ordinary matter, but no one knows what it is. Gravitational lensing is used to map its distribution. Since ordinary and dark matter enter the field equations in the same way, via ρ in Eq. (4.14), they behave the same gravitationally.

There is insufficient ordinary matter in galaxies to hold them together gravitationally; dark matter is required to prevent them from flying apart. Nor would galaxies have formed without dark matter. They formed from the small density variations seen in the CBR. Any density variations of ordinary matter in the early universe were smoothed out by interactions with intense electromagnetic radiation. But dark matter neither emits nor absorbs electromagnetic radiation. (Hence its name.) Thus variations within it could survive, grow, and evolve into galaxies. Computer simulations of the formation and evolution of galaxies from the time of the CBR to today produce the distribution of galaxies we see today. They must incorporate dark matter to do this.

Dark Energy. From Eq. (4.16) and $\rho = .3\rho_c$, the density of dark energy is $\Lambda/8\pi\kappa = \rho_c - .3\rho_c = .7\rho_c$. As with dark matter, the nature of dark energy is not known.

Exercise 4.4. Show that $\Lambda \approx 1.2 \times 10^{-35}/\text{sec}^2$.

It would be hasty to accept that $\Lambda/8\pi\kappa = .7\rho_c$, i.e., that 70% of the mass-energy of the universe is in some totally unknown form, on the basis of the above “it is what is needed to satisfy Eq. (4.16)” argument. Fortunately, there is more direct evidence for dark energy.

All type Ia supernovae have about the same peak intrinsic luminosity L. The luminosity-redshift relation, derived in Appendix 17, expresses the observed luminosity ℓ of an object as a function of its intrinsic luminosity and redshift. The function depends on Λ. Fig. 4.4 graphs ℓ vs. z for a fixed L and three values of Λ. The points (z, ℓ) for supernovae in the range z ≈ .5 to z ≈ 1.5 lie closest to the $\Lambda/8\pi\kappa = .7\rho_c$ curve, and definitely not on the Λ = 0 curve. This is strong evidence that Λ ≠ 0.

Fig. 4.4: The luminosity-redshift relation. Upper: Λ = 0; middle: $\Lambda/8\pi\kappa = .7\rho_c$; lower: $\Lambda/8\pi\kappa = \rho_c$.

Fig. 4.5: The angular size-redshift relation for $\Lambda/8\pi\kappa = .7\rho_c$.

The effect of Λ has been probed to greater distances using certain very compact sources of radio frequency radiation. The angular size-redshift relation, also derived in Appendix 17, expresses the angular size α of an object as a function of its redshift. These sources lie on the graph of the angular size-redshift relation for $\Lambda/8\pi\kappa = .7\rho_c$, Fig. 4.5, out to z = 3.

Additional independent evidence for the existence and density of dark energy comes from gravitational lens statistics and from a phenomenon called the integrated Sachs-Wolfe effect.

The Expansion Accelerates. A quick calculation using Eq. (4.15) shows that $\left(S S'^2 - \Lambda S^3/3\right)' = 0$. Thus from Eq. (4.14),

$$\rho S^3 = \text{constant}. \tag{4.18}$$

According to Eq. (4.18), ρ is inversely proportional to the cube of the scale factor S. This is what we would expect. In contrast, the density of dark energy, Λ/8πκ, is constant in time (and space). Thus the ratio of the dark energy density to the matter density increases as the universe expands. At early times matter dominates and at later times dark energy dominates.
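One way to fill in the quick calculation behind Eq. (4.18), using only the product rule and Eqs. (4.14) and (4.15), is sketched here; the intermediate steps are a reconstruction and are not spelled out in the text:

$$\begin{aligned}
\left(S S'^2 - \tfrac{1}{3}\Lambda S^3\right)' &= S'^3 + 2 S S' S'' - \Lambda S^2 S' = S'\left(S'^2 + 2 S S'' - \Lambda S^2\right) = 0 && \text{by Eq. (4.15)}, \\
S S'^2 - \tfrac{1}{3}\Lambda S^3 &= \tfrac{8}{3}\pi\kappa\, \rho S^3 && \text{by Eq. (4.14) multiplied by } -S^3/3 .
\end{aligned}$$

The first line says that the left side of the second line is constant in time, so $\rho S^3$ is constant as well.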

This has an important consequence for the expansion of the universe. Solve Eq. (4.14) for $S'^2$ and substitute into Eq. (4.15), giving $S'' = S(\Lambda - 4\pi\kappa\rho)/3$. At early times ρ is large and so $S'' < 0$; the expansion decelerates. This is no surprise: matter is gravitationally attractive. At later times the dark energy term dominates, and $S'' > 0$; the expansion accelerates! This is a surprise: dark energy is gravitationally repulsive.

Exercise 4.5. Show that the transition from decelerating to accelerating occurred at z ≈ .7. That is, the light we receive from z = .7 galaxies was emitted at the time of the transition.
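A numerical check of the transition redshift is sketched below. It assumes that $S'' = 0$ exactly when $\Lambda = 4\pi\kappa\rho$, that ρ scales as $(1 + z)^3$ by Eqs. (4.18) and (4.11), and that the fractions .30 and .70 refer to today's critical density. The function name is illustrative, and this is a check of the number rather than a full solution of the exercise.

```python
def transition_redshift(matter_fraction=0.30, dark_energy_fraction=0.70):
    """Redshift at which S'' changes sign.

    S'' = 0 when rho = Lambda / (4 pi kappa) = 2 * (Lambda / 8 pi kappa).
    In units of today's critical density this reads
        matter_fraction * (1 + z)^3 = 2 * dark_energy_fraction.
    """
    return (2.0 * dark_energy_fraction / matter_fraction) ** (1.0 / 3.0) - 1.0

print(transition_redshift())
```

With the default fractions the result rounds to the value .7 stated in the exercise.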

According to the luminosity-redshift relation, supernovae at z ≈ .5 are ∼30% dimmer than they would be if Λ were 0. They are dimmer because their light has farther to travel to get to us, as we accelerate away from them. Supernovae from z = 1 to z ≈ 1.5 (the limit of the data) show the opposite effect: they are brighter than they would be if the expansion had always been accelerating.

S(t). We now solve the field equations (4.14) and (4.15) for S(t). Multiply Eq. (4.14) by $-S^3$ and use Eq. (4.18):

$$3 S S'^2 - \Lambda S^3 = 8\pi\kappa\rho_o S^3(t_o), \tag{4.19}$$

where an “o” subscript means that the quantity is evaluated today. Solve:

$$S^3(t) = \frac{8\pi\kappa\rho_o}{\Lambda} \sinh^2\!\left(\frac{\sqrt{3\Lambda}}{2}\, t\right) S^3(t_o). \tag{4.20}$$

Since the big bang is the origin of the universe, it is convenient to set t = 0 when S = 0. This convention eliminates a constant of integration in S. The graph of S(t) is given in Fig. 4.6. As expected, the universe decelerates at early times and accelerates at later times.

Fig. 4.6: S (in units of $S(t_o)$) as a function of t (in $10^9$ years).

Exercise 4.6. Show that Eq. (4.20) implies that the age of the universe is $t_o = 13.1$ billion years.

Exercise 4.7. Show that Eq. (4.20) implies that in the future the universe will undergo exponential expansion.

Fig. 4.2 graphed the redshift z of a galaxy against the time t it emitted the light we see from it. This relationship between z and t is obtained by substituting Eq. (4.11) into Eq. (4.20).
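Eq. (4.20) is easy to evaluate numerically. The sketch below assumes $\rho_o = .3\rho_c$, $\Lambda/8\pi\kappa = .7\rho_c$, and 1/H = 13.6 billion years today, so that $8\pi\kappa\rho_o/\Lambda = .3/.7$ and $\sqrt{3\Lambda}/2 = 1.5\sqrt{.7}\,H$. The names are illustrative, and the code is a numerical check rather than the derivation asked for in the exercises.

```python
import math

MATTER_FRACTION = 0.30        # rho_o / rho_c today
DARK_ENERGY_FRACTION = 0.70   # (Lambda / 8 pi kappa) / rho_c today
HUBBLE_TIME_GYR = 13.6        # 1/H today, in billions of years

# sqrt(3 Lambda)/2 with Lambda = 3 * DARK_ENERGY_FRACTION * H^2, per billion years.
RATE = 1.5 * math.sqrt(DARK_ENERGY_FRACTION) / HUBBLE_TIME_GYR

def scale_factor(t_gyr):
    """S(t)/S(t_o) from Eq. (4.20), with t in billions of years since the big bang."""
    ratio = MATTER_FRACTION / DARK_ENERGY_FRACTION
    return (ratio * math.sinh(RATE * t_gyr) ** 2) ** (1.0 / 3.0)

def age_gyr():
    """Time at which S(t)/S(t_o) = 1, i.e. today."""
    return math.asinh(math.sqrt(DARK_ENERGY_FRACTION / MATTER_FRACTION)) / RATE

def redshift_at_emission(t_gyr):
    """z from Eq. (4.11) for light emitted at time t."""
    return 1.0 / scale_factor(t_gyr) - 1.0

print(age_gyr())
print(redshift_at_emission(0.75))
```

Under these assumptions the age comes out close to the 13.1 billion years of Exercise 4.6, and light emitted about 750 million years after the big bang has a redshift close to 7, in line with Sec. 4.3.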

Summary. The universe had a big bang origin 13 billion years ago. It is spatially flat, expanding, and the expansion is speeding up. It consists of 5% known matter, 25% dark matter, and 70% dark energy. Each of these numbers is supported by independent lines of evidence.

Our cosmological model is based on general relativity with a cosmological constant. The construction of a model which fits the observations so well is a magnificent achievement. But this “standard cosmological model” crystallized only at the turn of the millennium. And we are left with two mysteries: the nature of dark matter and dark energy. They constitute 95% of the “stuff” of the universe. We are in “a golden age of cosmology” in which new cosmological data are pouring in. We must be prepared for changes, even radical changes, in our model.

Horizons. Light emitted at the big bang has only travelled a finite distance since that time. Our particle horizon is the sphere from which we receive today light emitted at the big bang. Galaxies beyond the particle horizon cannot be seen by us today because their light has not had time to reach us since the big bang! Conversely, galaxies beyond the particle horizon cannot see us today. (Actually we can only see most of the way to the horizon, to the CBR.)

From Eqs. (4.8) and (4.4) with ds = 0 and k = 0, the distance to the particle horizon today is

$$S(t_o) \int_{t=0}^{t_o} \frac{dt}{S(t)}.$$

Exercise 4.8. Show that with S(t) from Eq. (4.20) this distance is ≈ 45 billion light years. Actually, the distance to the horizon is highly uncertain, as it is sensitive to the duration and amount of inflation, which is uncertain, and which Eq. (4.20) does not take into account.

The distance today to the particle horizon of another time $t_1$ is $S(t_o) \int_{t=0}^{t_1} dt/S(t)$. Since the integral increases with $t_1$, we can see, and can be seen by, more galaxies with passing time. However, the integral converges as $t_1 \to \infty$. Thus there are galaxies which will remain beyond the particle horizon forever. The distance today to the most distant galaxies that we will ever see is $S(t_o) \int_{t=0}^{\infty} dt/S(t) \approx 61$ billion light years.

Light emitted today will travel only a finite coordinate distance as t → ∞. Our event horizon is the sphere from which light emitted today reaches us as t → ∞. We will never see galaxies beyond the event horizon as they are today because light emitted by them today will never reach us!

Exercise 4.9. Show that the distance to the event horizon today is $S(t_o) \int_{t=t_o}^{\infty} dt/S(t)$. This is ≈ 16 billion light years.
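The horizon integrals can be estimated numerically from Eq. (4.20). The sketch below makes several assumptions: $\rho_o = .3\rho_c$, $\Lambda/8\pi\kappa = .7\rho_c$, 1/H = 13.6 billion years, a midpoint rule with the substitution $t = u^3$ to tame the integrable singularity at t = 0, and a cutoff at 20 Hubble times in place of t = ∞. All names, step counts, and the cutoff are illustrative choices.

```python
import math

MATTER_FRACTION = 0.30
DARK_ENERGY_FRACTION = 0.70
HUBBLE_TIME_GYR = 13.6
RATE = 1.5 * math.sqrt(DARK_ENERGY_FRACTION) / HUBBLE_TIME_GYR   # sqrt(3 Lambda)/2
T_NOW = math.asinh(math.sqrt(DARK_ENERGY_FRACTION / MATTER_FRACTION)) / RATE

def scale_factor(t_gyr):
    """S(t)/S(t_o) from Eq. (4.20)."""
    ratio = MATTER_FRACTION / DARK_ENERGY_FRACTION
    return (ratio * math.sinh(RATE * t_gyr) ** 2) ** (1.0 / 3.0)

def particle_horizon_gly(steps=20_000):
    """S(t_o) * integral of dt/S(t) from 0 to t_o, using t = u^3 and midpoints.

    With c = 1, a time in billions of years is a distance in billions of light years.
    """
    u_max = T_NOW ** (1.0 / 3.0)
    h = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += 3.0 * u * u / scale_factor(u ** 3) * h   # dt = 3 u^2 du
    return total

def event_horizon_gly(steps=200_000, cutoff_hubble_times=20.0):
    """S(t_o) * integral of dt/S(t) from t_o to a large cutoff standing in for infinity."""
    t_max = T_NOW + cutoff_hubble_times * HUBBLE_TIME_GYR
    h = (t_max - T_NOW) / steps
    total = 0.0
    for i in range(steps):
        total += h / scale_factor(T_NOW + (i + 0.5) * h)
    return total

print(particle_horizon_gly(), event_horizon_gly())
```

Under these assumptions the two distances land near the ≈ 45 and ≈ 16 billion light years quoted in Exercises 4.8 and 4.9, and their sum is near the ≈ 61 billion light years given for the most distant galaxies we will ever see.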

The table shows various redshifts z and distances d (in $10^9$ light years) discussed above. They are related by the distance-redshift relation, derived in Appendix 17. The S″ = 0 column is the decelerating/accelerating transition. The v = c column is where galaxies recede at the speed of light.

         S″ = 0   v = c   Min ∠ size   Event horizon   CMB    Particle horizon   Farthest ever see
    z    .7       1.5     1.6          1.8             1100   ∞                  –
    d    7.8      13.6    14.5         16              44     45??               61


4.5 General Relativity Today


General relativity extended the revolution in our ideas about space and time begun by special relativity. Special relativity showed that space and time are not absolute and independent, but related parts of a whole: spacetime. General relativity showed that a spacetime containing matter is not flat and static, but curved and dynamic, not only affecting matter (according to the geodesic postulate), but also affected by matter (according to the field equation). In consequence, Euclidean geometry, thought inviolate for over twenty centuries, does not apply (exactly) in a gravitational field. The Robertson-Walker spacetimes illustrate these points particularly clearly.

Despite its fundamental nature, general relativity was for half a century after its discovery in 1915 out of the mainstream of physical theory because its results were needed only for the explanation of the few minute effects in the solar system described in Sec. 3.3. Newton’s theory of gravity accounted for all other gravitational phenomena. But since 1960 astronomers have made several spectacular discoveries: quasars, the CBR, neutron stars, the binary pulsar, gravitational lenses, the emission of gravitational radiation, black holes, detailed structure in the CBR, and the accelerating expansion of the universe. These discoveries have revealed a wonderfully varied, complex, and interesting universe. They have enormously widened our “cosmic consciousness”. And we need general relativity to help us understand them.

General relativity has passed every test to which it has been put: It reduces to special relativity, an abundantly verified theory, in the absence of significant gravitational fields. It reduces to Newton’s theory of gravity, an exceedingly accurate theory, in weak gravity with small velocities. Sec. 2.2 cited experimental evidence and logical coherence as reasons for accepting the postulates of the theory. The theory explains the minute details of the motion of matter and light in the solar system; the tests of the vacuum field equation described in Sec. 3.3 (the perihelion advance of Mercury, light deflection, and light delay) all confirm the predictions of general relativity with an impressive accuracy. The lunar laser experiment confirms tiny general relativistic effects on the motion of the Earth and moon. Gravitomagnetism has been detected. The emission of gravitational radiation from the binary pulsar system provides quantitative confirmation of the full field equation. And the application of general relativity on the grandest possible scale, to the universe as a whole, appears to be successful.

As satisfying as all this is, we would like more evidence for such a fundamental theory. This would give us more confidence in applying it. The difficulty in finding tests for general relativity is that gravity is weak (about $10^{-40}$ as strong as electromagnetism), and so astronomical sized masses, over which an experimenter has no control, must be used. Also, Newton’s theory is already very accurate in most circumstances. Rapidly advancing technology has made, and will continue to make, more tests possible.



Foremost among these are attempts to detect gravitational waves. The hope is that they will someday complement electromagnetic waves in providing information about the universe. Sec. 3.5 described observations of the binary pulsar which show indirectly that gravitational waves exist. The planning, construction, and commissioning of several gravitational wave detectors are under way. The waves are extremely weak and so are difficult to detect. One detector will be able to measure changes in the distance between two test masses of 1/100 of the diameter of a proton! We await the first detection.

The most unsatisfactory feature of general relativity is its conception of matter. The density ρ in the energy-momentum tensor T represents a continuous distribution of matter, entirely ignoring its atomic and subatomic structure. Quantum theory describes this structure. The standard model is a quantum theory of subatomic particles and forces. The theory, forged in the 1960s and 1970s, describes all known forms of matter, and the forces between them, except gravity. The fundamental particles of the theory are quarks, leptons, and force carriers. There are several kinds of each. Protons and neutrons consist of three quarks. Electrons and neutrinos are leptons. Photons carry the electromagnetic force.

The standard model has been spectacularly successful, correctly predictingthe results of all particle accelerator experiments (although many expect dis-crepancies to appear soon). The theory of big bang nucleosynthesis uses thestandard model. However, in cosmology the standard model leaves many mat-ters unexplained. For example, inflation, dark matter, and dark energy are allbeyond the scope of the theory.

Quantum theory and general relativity are the two fundamental theoriesof contemporary physics. But they are separate theories; they have not beenunified. Thus they must be combined on an ad hoc basis when both gravityand quantum effects are important. For example, in 1974 Steven Hawkingused quantum theoretical arguments to show that a black hole emits subatomicparticles and light. A black hole is not black! The effect is negligible for a blackhole with the mass of the Sun, but it would be important for very small blackholes.

The unification of quantum theory and general relativity is the most impor-tant goal of theoretical physics today. Despite decades of intense effort, the goalhas not been reached. String theory and loop quantum gravity are candidates fora theory of quantum gravity, but they are very much works in progress. Successis by no means assured.

Despite the limited experimental evidence for general relativity and the gulfbetween it and quantum theory, it is a necessary, formidable, and important toolin our quest for a deeper understanding of the universe in which we live. Thisand the striking beauty and simplicity – both conceptually and mathematically– of the unification of space, time, and gravity in the theory make generalrelativity one of the finest creations of the human mind.

Appendices

A.1 Physical Constants. When appropriate, the first number given uses (light) seconds as the unit of distance.

$c = 1 = 3.00 \times 10^{10}$ cm/sec = 186,000 mi/sec (Light speed)

$\kappa = 2.47 \times 10^{-39}$ sec/g $= 6.67 \times 10^{-8}$ cm$^3$/(g sec$^2$) (Newtonian gravitational constant)

$g = 3.27 \times 10^{-8}$/sec = 981 cm/sec$^2$ = 32.2 ft/sec$^2$ (Acceleration Earth gravity)

Mass Sun $= 1.99 \times 10^{33}$ g

Radius Sun = 2.32 sec $= 6.96 \times 10^{10}$ cm = 432,000 mi

Mean radius Mercury orbit $= 1.9 \times 10^{2}$ sec $= 5.8 \times 10^{12}$ cm = 36,000,000 mi

Mercury perihelion distance $= 1.53 \times 10^{2}$ sec $= 4.59 \times 10^{12}$ cm

Eccentricity Mercury orbit = .206

Period Mercury $= 7.60 \times 10^{6}$ sec = 88.0 day

Mean radius Earth orbit $= 5 \times 10^{2}$ sec $= 1.5 \times 10^{13}$ cm = 93,000,000 mi

Mass Earth $= 5.97 \times 10^{27}$ g

Radius Earth $= 2.13 \times 10^{-2}$ sec $= 6.38 \times 10^{8}$ cm = 4000 mi

Angular momentum Earth $= 10^{20}$ g sec $= 10^{41}$ g cm$^2$/sec

$H_o = 2.3 \times 10^{-18}$/sec = (22 km/sec)/$10^{6}$ light year (Hubble constant)
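The first number in each entry follows from the cgs value by dividing out powers of c. The following is a minimal sketch of that conversion in Python, using only values listed above; the function names are illustrative and do not come from the text.

```python
C = 3.00e10  # light speed in cm/sec, from the list above


def length_in_sec(cm):
    """Distance in (light) seconds: divide by c."""
    return cm / C


def accel_in_per_sec(cm_per_sec2):
    """Acceleration in 1/sec: divide by c."""
    return cm_per_sec2 / C


def kappa_in_sec_per_g(cm3_per_g_sec2):
    """Newtonian gravitational constant in sec/g: divide by c^3."""
    return cm3_per_g_sec2 / C**3


if __name__ == "__main__":
    print(f"Radius Sun = {length_in_sec(6.96e10):.2f} sec")
    print(f"g          = {accel_in_per_sec(981):.2e} /sec")
    print(f"kappa      = {kappa_in_sec_per_g(6.67e-8):.2e} sec/g")
```

The same division by c (or c cubed for κ) reproduces the other length entries in the list to the precision quoted.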

A.2 Approximations. The approximations are valid for small x.

$$(1 + x)^n \approx 1 + nx \qquad \left[\text{e.g., } (1 + x)^{1/2} \approx 1 + x/2 \ \text{ and } \ (1 + x)^{-1} \approx 1 - x\right]$$

$$\sin(x) \approx x - x^3/6$$

$$\cos(x) \approx 1$$
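As a quick numerical illustration of how good these approximations are, the sketch below compares each one with the exact value at x = 0.01. The sample value and the helper names are arbitrary choices, not part of the text.

```python
import math


def binomial_approx(x, n):
    """(1 + x)^n ~ 1 + n x for small x."""
    return 1 + n * x


def sin_approx(x):
    """sin(x) ~ x - x^3/6 for small x."""
    return x - x**3 / 6


x = 0.01  # arbitrary small sample value
err_sqrt = abs((1 + x) ** 0.5 - binomial_approx(x, 0.5))
err_inv = abs((1 + x) ** -1 - binomial_approx(x, -1))
err_sin = abs(math.sin(x) - sin_approx(x))
err_cos = abs(math.cos(x) - 1)

for name, err in [("sqrt", err_sqrt), ("inverse", err_inv),
                  ("sin", err_sin), ("cos", err_cos)]:
    print(f"{name:8s} error = {err:.2e}")
```

The errors of the binomial and cosine approximations are of order x squared, while the sine approximation, which keeps the cubic term, is accurate to order x to the fifth.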

A.3 Clock Synchronization. Let 2T be the time, as measured by a clock at the origin O of the lattice, for light to travel from O to another node Q and return after being reflected at Q. Emit a pulse of light at O toward Q at time $t_O$ according to the clock at O. When the pulse arrives at Q set the clock there to $t_Q = t_O + T$. According to Eq. (1.3) the clocks at O and Q are now synchronized.

Synchronize all clocks with the one at O in this way.

To show that the clocks at any two nodes P and Q are now synchronized with each other, we must make this assumption:

The time it takes light to traverse a triangle in the lattice is independent of the direction taken around the triangle.

See Fig. A.7. In an experiment performed in 1963, W. M. Macek and D. T. M. Davis, Jr. verified the assumption for a square to one part in $10^{12}$. See Appendix 4.

Fig. A.7: Light traversing a triangle in opposite directions.

Reflect a pulse of light around the triangle OPQ. Let the pulse be at O, P, Q, O at times $t_O, t_P, t_Q, t_R$ according to the clock at that node. Similarly, let the times for a pulse sent around in the other direction be $t'_O, t'_Q, t'_P, t'_R$. See Fig. A.7. We have the algebraic identities

$$\begin{aligned}
t_R - t_O &= (t_R - t_Q) + (t_Q - t_P) + (t_P - t_O)\\
t'_R - t'_O &= (t'_R - t'_P) + (t'_P - t'_Q) + (t'_Q - t'_O).
\end{aligned} \tag{A.1}$$

According to our assumption, the left sides of the two equations are equal. Also, since the clock at O is synchronized with those at P and Q,

$$t_P - t_O = t'_R - t'_P \quad \text{and} \quad t_R - t_Q = t'_Q - t'_O.$$

Thus, subtracting the equations Eq. (A.1) shows that the clocks at P and Q are synchronized:

$$t_Q - t_P = t'_P - t'_Q.$$

A.4 The Macek-Davis Experiment. In this experiment, light was reflected in both directions around a ring laser, setting up a standing wave. A fraction of the light in both directions was extracted by means of a half silvered mirror and their frequencies compared. See Fig. A.8.

Fig. A.8: Comparing the round trip speed of light in opposite directions.

Let $t_1$ and $t_2$ be the times it takes light to go around in the two directions and $f_1$ and $f_2$ be the corresponding frequencies. Then $t_1 f_1 = t_2 f_2$, as both are equal to the number of wavelengths in the standing wave. Thus any fractional difference between $t_1$ and $t_2$ would be accompanied by an equal fractional difference between $f_1$ and $f_2$. The frequencies differed by no more than one part in $10^{12}$.

(If the apparatus is rotating, say, clockwise, then the clockwise beam will have a longer path, having to “catch up” to the moving mirrors, and the frequencies of the beams will differ. The experiment was performed to detect this effect.)

A.5 Spacelike Separated Events. By definition, spacelike separated events E and F can be simultaneously at the ends of an inertial rigid rod – simultaneously in the sense that light flashes emitted at E and F reach the center of the rod simultaneously, or equivalently, E and F are simultaneous in the rest frame I′ of the rod. Since the events are simultaneous in I′, they are neither lightlike separated (|∆x| = |∆t| in I) nor timelike separated (|∆x| < |∆t| in I). Thus |∆x| > |∆t|, i.e., |∆t/∆x| < 1.

Fig. A.9: $\Delta s^2 = \Delta t^2 - \Delta x^2$ for spacelike separated events E and F. See the text.

For convenience, let E have coordinates E(0, 0). Fig. A.9 shows the worldline W′ of an inertial observer O′ moving with velocity v = ∆t/∆x in I. (That is not a typo.) L± are the light worldlines through F. Since c = 1 in I, the slope of L± is ±1. Solve simultaneously the equations for L− and W′ to obtain the coordinates R(∆x, ∆t). Similarly, the equations for L+ and W′ give S(−∆x, −∆t).

According to Eq. (1.10), the proper time, as measured by O′, between the timelike separated events S and E, and between E and R, is $(\Delta x^2 - \Delta t^2)^{1/2}$. Since the times are equal, E and F are, by definition, simultaneous in an inertial frame I′ in which O′ is at rest.

Since c = 1 in I′, the distance between E and F in I′ is $(\Delta x^2 - \Delta t^2)^{1/2}$. This is the length of a rod at rest in I′ with its ends simultaneously at E and F. By definition, this is the proper distance |∆s| between the events. This proves Eq. (1.10) for spacelike separated events.
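The construction above can be checked numerically. The sketch below assumes events are written as (t, x) pairs in I, with E = (0, 0) and F = (∆t, ∆x), and takes L+ and L− to be the lines of slope +1 and −1 through F; the function name and the sample numbers are illustrative only.

```python
import math


def simultaneity_check(dt, dx):
    """Return S, R and the proper times S->E and E->R for spacelike (dt, dx)."""
    v = dt / dx                      # velocity of O' in I; W' is the line x = v t
    # L+ : x - dx = +(t - dt);  L- : x - dx = -(t - dt). Intersect each with W'.
    t_s = (dx - dt) / (v - 1.0)      # L+ meets W'
    t_r = (dx + dt) / (v + 1.0)      # L- meets W'
    s_event = (t_s, v * t_s)
    r_event = (t_r, v * t_r)
    tau_se = math.sqrt(s_event[0] ** 2 - s_event[1] ** 2)  # proper time S to E
    tau_er = math.sqrt(r_event[0] ** 2 - r_event[1] ** 2)  # proper time E to R
    return s_event, r_event, tau_se, tau_er


if __name__ == "__main__":
    s_event, r_event, tau_se, tau_er = simultaneity_check(dt=3.0, dx=5.0)
    print("S =", s_event, " R =", r_event)
    print("proper times:", tau_se, tau_er)
```

For ∆t = 3 and ∆x = 5 both proper times come out equal to $(5^2 - 3^2)^{1/2} = 4$, as the argument requires.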

A.6 Moving Rods. If an inertial rod is pointing perpendicular to its direction of motion in an inertial frame, then its length, as measured in the inertial frame, is unchanged. If the rod is pointing parallel to its direction of motion in the inertial frame, then its length, as measured in the inertial frame, contracts. Why this difference? An answer can be given in terms of Einstein’s principle of relativity: identical experiments performed in different inertial frames give identical results. We show that the principle demands that the length be unchanged in the perpendicular case, but not in the parallel case.

Consider identical rods R and R′, both inertial and in motion with respect to each other. According to the inertial frame postulate, R and R′ are at rest in inertial frames, say I and I′.

Fig. A.10: Rod perpendicular to its direction of motion in I.

Perpendicular case. First, situate the rods as in Fig. A.10. Suppose also that the ends A and A′ coincide when the rods cross. Then we can compare the lengths of the rods directly, independently of any inertial frame, by observing the relative positions of the ends B and B′ when the rods cross. According to the principle of relativity, R and R′ have the same length. For if, say, B′ passed below B, then this would violate the principle: a rod moving in I is shorter, while an identical rod moving in I′ is longer.

Parallel case. Now situate the rods as in Fig. A.11. Wait until A and A′ coincide. Consider the relative positions of B and B′ at the same time. But “at the same time” is different in I and I′! What happens is this: At the same time in I, B′ will not have reached B, so R′ is shorter than R in I. And at the same time in I′, B will have passed B′, so R is shorter than R′ in I′. This is length contraction. There is no violation of the principle of relativity here: in each frame the moving rod is shorter.

There is no contradiction here: The lengths are compared differently in the two inertial frames, each frame using the synchronized clocks at rest in that frame. Thus there is no reason for the two comparisons to agree. This is different from Fig. A.10, where the lengths are compared directly. Then there can be no disagreement over the relative lengths of the rods.

Fig. A.11: Rod parallel to its direction of motion in I.

We calculate the length contraction of R′ in I. Let the length of R′ in I and I′ be D and D′, respectively. Let v be the speed of R′ in I. Emit a pulse of light from A′ toward B′. In I, B′ is at distance D from A′ when the light is emitted.

Since c = 1 in I, the relative speed of the light and B′ in I is 1 − v. Thus in I the light will take time D/(1 − v) to reach B′. Similarly, if the pulse is reflected at B′ back to A′, it will take time D/(1 + v) to reach A′. Thus the round trip time in I is

$$\Delta x^0 = \frac{D}{1 - v} + \frac{D}{1 + v} = \frac{2D}{1 - v^2}.$$

According to Eq. (1.11), a clock at A′ will measure a total time

$$\Delta s = (1 - v^2)^{1/2}\, \Delta x^0$$

for the pulse to leave and return.

Since c = 1 in I′ and since the light travels a distance 2D′ in I′, ∆s is also given by

$$\Delta s = 2D'.$$

Combine the last three equations to give $D = (1 - v^2)^{1/2} D'$; the length of R′, as measured in I, is contracted by a factor $(1 - v^2)^{1/2}$ of its rest length.
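The three equations can be checked against each other numerically. The sketch below starts from the contracted length $D = (1 - v^2)^{1/2} D'$, computes the round trip time in I and the time on the clock at A′, and confirms that the latter equals 2D′. The function name and sample values are illustrative assumptions.

```python
import math


def round_trip_check(v, d_rest):
    """Return (D, dx0, ds) for a rod of rest length d_rest moving at speed v (c = 1)."""
    d = math.sqrt(1 - v**2) * d_rest   # contracted length D in I
    dx0 = d / (1 - v) + d / (1 + v)    # outbound plus return time in I
    ds = math.sqrt(1 - v**2) * dx0     # time elapsed on the clock at A', Eq. (1.11)
    return d, dx0, ds


if __name__ == "__main__":
    d, dx0, ds = round_trip_check(v=0.6, d_rest=1.0)
    print(f"D = {d:.3f}, round trip in I = {dx0:.3f}, clock at A' = {ds:.3f}")
```

With v = 0.6 and D′ = 1 the rod has length 0.8 in I, the round trip takes 2.5 in I, and the clock at A′ records 2 = 2D′.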

A.7 The Two Way Speed of Light. This appendix describes three experiments which together show that the two way speed of light is the same in different directions, at different times, and in different places.

The Michelson-Morley Experiment. Michelson and Morley compared the two way speed of light in perpendicular directions. They used a Michelson interferometer that splits a beam of monochromatic light in perpendicular directions by means of a half-silvered mirror. The beams reflect off mirrors and return to the half-silvered mirror, where they reunite and proceed to an observer. See Fig. A.12.

Fig. A.12: A Michelson interferometer.

Let 2T be the time for the light to traverse an arm of the interferometer and return, D the length of the arm, and c the two way speed of light in the arm. Then by the definition of two way speed, 2T = 2D/c. The difference in the times for the two arms is $2D_1/c_1 - 2D_2/c_2$. If f is the frequency of the light, then the light in the two arms will reunite

$$N = f \left( \frac{2D_1}{c_1} - \frac{2D_2}{c_2} \right) \tag{A.2}$$

cycles out of phase.

If N is a whole number, the uniting beams will constructively interfere and the observer will see light; if N is a whole number + 1/2, the uniting beams will destructively interfere and the observer will not see light.

Rotating the interferometer 90° switches $c_1$ and $c_2$, giving a new phase difference

$$N' = f \left( \frac{2D_1}{c_2} - \frac{2D_2}{c_1} \right).$$

As the interferometer rotates, light and dark will alternate N − N′ times. Set $c_2 - c_1 = \Delta c$ and use $c_1 \approx c_2 = c$ (say) to give

$$N - N' = 2f (D_1 + D_2) \left( \frac{1}{c_1} - \frac{1}{c_2} \right) \approx 2f\, \frac{D_1 + D_2}{c}\, \frac{\Delta c}{c}. \tag{A.3}$$

In Joos’ experiment, $f = 6 \times 10^{14}$/sec, $D_1 = D_2 = 2100$ cm, and $|N - N'| < 10^{-3}$. From Eq. (A.3),

$$\left| \frac{\Delta c}{c} \right| < 6 \times 10^{-12}.$$
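The bound follows from solving Eq. (A.3) for ∆c/c. A minimal sketch of the arithmetic, using the quoted figures from Joos’ experiment and the value of c from Appendix 1 (the function name is illustrative):

```python
C = 3.00e10  # light speed in cm/sec, from Appendix 1


def fractional_bound(fringe_bound, f, d1, d2, c=C):
    """Solve Eq. (A.3) for |dc/c| given a bound on |N - N'|."""
    return fringe_bound * c / (2 * f * (d1 + d2))


bound = fractional_bound(fringe_bound=1e-3, f=6e14, d1=2100, d2=2100)
print(f"|dc/c| < {bound:.1e}")
```

The computed value rounds to the $6 \times 10^{-12}$ quoted above.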

The Brillet-Hall Experiment. In this experiment, the output of a laser was reflected between two mirrors, setting up a standing wave. The frequency of the laser was servostabilized to maintain the standing wave. This part of the experiment was placed on a granite slab that was rotated. Fig. A.13 shows a schematic diagram of the experiment.

Fig. A.13: The Brillet-Hall experiment.

Let 2T be the time for light to travel from one of the mirrors to the other and back, f be the frequency of the light, and N be the number of wavelengths in the standing wave. The whole number N is held constant by the servo. Then

$$T f = N. \tag{A.4}$$

But 2T = 2D/c, where D is the distance between the mirrors and c is the two way speed of light between the mirrors. Substitute this into Eq. (A.4) to give Df = cN. Thus any fractional change in c as the slab rotates would be accompanied by an equal fractional change in f. The frequency was monitored by diverting a portion of the light off the slab and comparing it with the output of a reference laser which did not rotate. The fractional change in f was no more than four parts in $10^{15}$.

The Kennedy-Thorndike Experiment. Kennedy and Thorndike used a Michelson interferometer with arms of unequal length. Instead of rotating the interferometer to observe changes in N, they observed the interference of the uniting beams over the course of several months. Any change dc in the two way speed of light (due, presumably, to a change in the Earth’s position or speed) would result in a change dN in N: if we set, according to the result of the Michelson-Morley experiment, $c_1 = c_2 = c$ in Eq. (A.2), we obtain

$$dN = 2f\, \frac{D_1 - D_2}{c}\, \frac{dc}{c}.$$

In the experiment, $f = 6 \times 10^{14}$/sec, $D_1 - D_2 = 16$ cm, and $|dN| < 3 \times 10^{-3}$. Thus $|dc/c| < 6 \times 10^{-9}$.

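A.8 Newtonian Orbits. Let the Sun be at the origin and $\mathbf r(t)$ be the path of the planet. At every point $\mathbf r$ define orthogonal unit vectors

$$\mathbf u_r = \cos\theta\, \mathbf i + \sin\theta\, \mathbf j, \qquad \mathbf u_\theta = -\sin\theta\, \mathbf i + \cos\theta\, \mathbf j.$$

Fig. A.14: The polar unit vectors.

See Fig. A.14. Differentiate $\mathbf r = r\,\mathbf u_r$ to give the velocity

$$\mathbf v = \frac{d\mathbf r}{dt} = \frac{dr}{dt}\,\mathbf u_r + r\,\frac{d\mathbf u_r}{d\theta}\frac{d\theta}{dt} = \frac{dr}{dt}\,\mathbf u_r + r\,\frac{d\theta}{dt}\,\mathbf u_\theta.$$

A second differentiation gives the acceleration

$$\mathbf a = \left[ \frac{d^2 r}{dt^2} - r \left( \frac{d\theta}{dt} \right)^2 \right] \mathbf u_r + \left[ r^{-1}\, \frac{d\left( r^2\, d\theta/dt \right)}{dt} \right] \mathbf u_\theta.$$

Since the acceleration is radial, the coefficient of $\mathbf u_\theta$ is zero, i.e.,

$$r^2\, \frac{d\theta}{dt} = A, \tag{A.5}$$

a constant. (This is Kepler’s law of areas, i.e., conservation of angular momentum. Cf. Eq. (3.15).)

Using Eqs. (2.1) and (A.5), we have

$$\mathbf a = -\frac{\kappa M}{r^2}\,\mathbf u_r = -\frac{\kappa M}{A}\frac{d\theta}{dt}\,\mathbf u_r = \frac{\kappa M}{A}\frac{d\mathbf u_\theta}{dt}.$$

Integrate to obtain $\mathbf v$ and equate it to the expression for $\mathbf v$ above:

$$\frac{\kappa M}{A}\,(\mathbf u_\theta + \mathbf e) = \frac{dr}{dt}\,\mathbf u_r + r\,\frac{d\theta}{dt}\,\mathbf u_\theta,$$

where the vector $\mathbf e$ is a constant of integration. Take the inner product of this with $\mathbf u_\theta$ and use Eq. (A.5) again:

$$\frac{\kappa M}{A^2}\,[1 + e \cos(\theta - \theta_p)] = \frac{1}{r}, \tag{A.6}$$

where $e = |\mathbf e|$ and $\theta - \theta_p$ is the angle between $\mathbf u_\theta$ and $\mathbf e$. This is the polar equation of an ellipse with eccentricity e and perihelion (point of closest approach to the Sun) at $\theta = \theta_p$.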
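Eq. (A.6) can be tested against a direct numerical integration of the Newtonian equation of motion. The sketch below is an illustration under stated assumptions, not part of the derivation: it uses units with κM = 1, starts the planet at perihelion with an arbitrary tangential speed, integrates with the velocity Verlet method (a choice made here for simplicity), and compares the largest distance reached with the aphelion distance predicted by Eqs. (A.5) and (A.6).

```python
import math


def simulate_max_radius(kappa_m=1.0, r_p=1.0, v_p=1.2, dt=1e-3, steps=20000):
    """Integrate a = -kappa_m * r_vec / r^3 from perihelion; return the largest r reached."""
    x, y = r_p, 0.0          # start at perihelion on the x axis
    vx, vy = 0.0, v_p        # velocity is purely tangential there

    def accel(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -kappa_m * px / r3, -kappa_m * py / r3

    ax, ay = accel(x, y)
    r_max = r_p
    for _ in range(steps):   # velocity Verlet step
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        r_max = max(r_max, math.hypot(x, y))
    return r_max


def predicted_aphelion(kappa_m=1.0, r_p=1.0, v_p=1.2):
    """Aphelion distance from Eqs. (A.5) and (A.6)."""
    a_const = r_p * v_p                       # A = r^2 dtheta/dt, evaluated at perihelion
    e = a_const**2 / (kappa_m * r_p) - 1      # Eq. (A.6) at theta = theta_p
    return a_const**2 / (kappa_m * (1 - e))   # Eq. (A.6) at theta = theta_p + pi


if __name__ == "__main__":
    print("simulated aphelion:", simulate_max_radius())
    print("predicted aphelion:", predicted_aphelion())
```

The default step count covers more than one full orbit for these sample values, so the maximum radius found is the aphelion distance.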

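A.9 Geodesic Coordinates. By definition, the metric $\mathbf f(x) = (f_{mn}(x))$ of a local planar frame at P satisfies $(f_{mn}(P)) = \mathbf f^{\circ}$. Given a local planar frame with coordinates $x$, we construct a new local planar frame with coordinates $\bar x$ in which the metric $(\bar f_{mn}(\bar x))$ additionally satisfies $\bar\partial_i \bar f_{mn}(P) = 0$. The coordinates are called geodesic coordinates.

We first express the derivatives of the metric in terms of the Christoffel symbols by “inverting” the definition Eq. (2.17) of the latter in terms of the former. Multiply Eq. (2.17) by $(g_{ji})$ and use the fact that $(g_{ji}\, g^{im})$ is the identity matrix to obtain $g_{ji}\Gamma^i_{kp} = \tfrac12 (\partial_p g_{kj} + \partial_k g_{jp} - \partial_j g_{kp})$. Exchange j and k to obtain $g_{ki}\Gamma^i_{jp} = \tfrac12 (\partial_p g_{jk} + \partial_j g_{kp} - \partial_k g_{jp})$. Add the two equations to obtain the desired expression:

$$\partial_p g_{jk} = g_{ji}\Gamma^i_{kp} + g_{ki}\Gamma^i_{jp}. \tag{A.7}$$

Define new coordinates $\bar x$ by

$$x^m = \bar x^m - \tfrac12 \Gamma^m_{st}\, \bar x^s \bar x^t,$$

where the Christoffel symbols are those of the x-coordinates at P. Differentiate this and use $\Gamma^m_{it} = \Gamma^m_{ti}$:

$$\frac{\partial x^m}{\partial \bar x^j} = \frac{\partial \bar x^m}{\partial \bar x^j} - \tfrac12 \Gamma^m_{sj}\, \bar x^s - \tfrac12 \Gamma^m_{jt}\, \bar x^t = \delta^m_j - \Gamma^m_{sj}\, \bar x^s.$$

Substitute this into Eq. (2.9):

$$\bar f_{jk}(\bar x) = f_{jk}(x) - f_{jn}(x)\,\Gamma^n_{sk}\, \bar x^s - f_{mk}(x)\,\Gamma^m_{sj}\, \bar x^s + \text{2nd order terms}.$$

This shows already that the $\bar x$ are inertial frame coordinates: $\bar{\mathbf f}(P) = \mathbf f(P) = \mathbf f^{\circ}$. Differentiate the above equation with respect to $\bar x^p$, evaluate at P, and apply Eq. (A.7) to $\mathbf f$:

$$\bar\partial_p \bar f_{jk}(P) = \partial_s f_{jk}(P)\, \frac{\partial x^s}{\partial \bar x^p} - \left( f_{jn}(P)\,\Gamma^n_{pk} + f_{mk}(P)\,\Gamma^m_{pj} \right) = \partial_p f_{jk}(P) - \left( f_{jn}(P)\,\Gamma^n_{pk} + f_{mk}(P)\,\Gamma^m_{pj} \right) = 0.$$

Thus $\bar x$ is a geodesic coordinate system.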

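A.10 The Geodesic Equations. This appendix translates the local form of the geodesic equations Eq. (2.16) to the global form Eq. (2.18). Let $(f_{jk})$ represent the metric with respect to a local planar frame at P with geodesic coordinates $(x^i)$. Then from Eq. (2.6), $(f_{jk}(P)) = \mathbf f^{\circ}$. And from Eq. (2.19), $\partial_i f_{jk}(P) = 0$.

Let $(y^i)$ be a global coordinate system with metric $\mathbf g(y^i)$. We use the notation

$$y^i_r = \frac{\partial y^i}{\partial x^r}, \qquad x^u_v = \frac{\partial x^u}{\partial y^v}, \qquad y^i_{rs} = \frac{\partial^2 y^i}{\partial x^s\, \partial x^r}, \qquad x^u_{vw} = \frac{\partial^2 x^u}{\partial y^w\, \partial y^v}.$$

Differentiate $\dot y^i = y^i_q\, \dot x^q$ and use Eq. (2.16):

$$\ddot y^i = y^i_q\, \ddot x^q + y^i_{qr}\, \dot x^r \dot x^q = y^i_{qr}\, x^r_j\, \dot y^j\, x^q_k\, \dot y^k.$$

In terms of components, the formula $\mathbf g^{-1} = \mathbf a^{-1}\, \mathbf f^{-1} (\mathbf a^{-1})^t$ from Exercise 2.5c is

$$g^{im} = y^i_r\, f^{rs}\, y^m_s.$$

Eq. (2.12) shows that Eq. (2.9) holds throughout the local planar frame, not just at P. Apply $\partial_i$ to Eq. (2.9), and use Eq. (2.19). Evaluate at P:

$$\partial_i g_{jk} = (\partial_p f_{mn})\, x^p_i\, x^m_j\, x^n_k + f_{mn}\, x^m_{ji}\, x^n_k + f_{mn}\, x^m_j\, x^n_{ki} = f_{mn} \left( x^m_{ji}\, x^n_k + x^m_j\, x^n_{ki} \right).$$

Note that $(x^u_m\, y^m_s) = (\partial x^u / \partial x^s)$ is the identity matrix. Substitute the last two displayed equations in the definition Eq. (2.17) of the Christoffel symbols. Because mixed partial derivatives commute and $\mathbf f$ is symmetric, four of the six second-derivative terms cancel in pairs and the remaining two are equal:

$$\begin{aligned}
\Gamma^i_{jk} &= \tfrac12\, g^{im} \left( \partial_k g_{jm} + \partial_j g_{mk} - \partial_m g_{jk} \right)\\
&= \tfrac12\, y^i_r\, f^{rs}\, y^m_s\, f_{uv} \left( x^u_{jk} x^v_m + x^u_j x^v_{mk} + x^u_{mj} x^v_k + x^u_m x^v_{kj} - x^u_{jm} x^v_k - x^u_j x^v_{km} \right)\\
&= y^i_r\, f^{rs}\, y^m_s\, f_{uv}\, x^u_m\, x^v_{kj}\\
&= y^i_r\, f^{rs}\, f_{uv}\, x^u_m\, y^m_s\, x^v_{kj}\\
&= y^i_r\, f^{rs}\, f_{sv}\, x^v_{kj}\\
&= y^i_r\, x^r_{kj}.
\end{aligned}$$

Substitute for $\ddot y^i$ and $\Gamma^i_{jk}$ from above to give the global form Eq. (2.18) of the geodesic equations:

$$\ddot y^i + \Gamma^i_{jk}\, \dot y^j \dot y^k = \left( y^i_{qr}\, x^q_k\, x^r_j + y^i_r\, x^r_{kj} \right) \dot y^j \dot y^k = \partial_k \left( y^i_r\, x^r_j \right) \dot y^j \dot y^k = 0.$$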

A.11 Tensors. Tensors are a generalization of vectors. The components $a_i$ of a vector are specified using one index. Many objects of interest have components specified by more than one index. For example, the metric $g_{ij}$ has two. Another example is the Riemann (curvature) tensor, which has four:

$$R^p_{\ rsq} = \Gamma^p_{ts}\, \Gamma^t_{rq} - \Gamma^p_{tq}\, \Gamma^t_{rs} + \partial_s \Gamma^p_{rq} - \partial_q \Gamma^p_{rs}. \tag{A.8}$$

We shall see that the metric and the Riemann tensor are tensors. A scalar is a tensor with a single component, i.e., it is a number, and so requires no indices.

As with a vector, we should think of a tensor as a single object, existing independently of any coordinate system, but which in a given coordinate system acquires components. There is a fully developed algebra and calculus of tensors. This formalism is a powerful tool for computations in general relativity, but we do not need it for a conceptual understanding of the theory.

A tensor index is written as a superscript (called a contravariant index), or as a subscript (called covariant). A superscript p index can be lowered to a subscript t index by multiplying by $g_{tp}$ and summing. For example, $R_{trsa} = g_{tp}\, R^p_{\ rsa}$. Similarly, an index can be raised using $g^{pt}$: $R^p_{\ rsa} = g^{pt}\, R_{trsa}$. Lowering an index is a combination of two tensor operations: a tensor product $g_{tu}\, R^p_{\ rsa}$, followed by a contraction $g_{tp}\, R^p_{\ rsa}$, formed by summing over a repeated upper and lower index.

The Ricci tensor, Eq. (2.25), is a contraction of the Riemann tensor: $R_{jk} = R^p_{\ jkp}$.

The components of a vector change under a rotation of coordinates. Similarly, the components of a tensor change under a change of coordinate systems. The formula for transforming the components from $y$ to $\bar y$ coordinates can be discerned from the example of the Riemann tensor:

$$\bar R^a_{\ bcd} = R^p_{\ rsq}\, \frac{\partial \bar y^a}{\partial y^p}\, \frac{\partial y^r}{\partial \bar y^b}\, \frac{\partial y^s}{\partial \bar y^c}\, \frac{\partial y^q}{\partial \bar y^d}.$$

(A direct verification of this is extremely messy.) There is a derivative for each index and the derivatives are different for upper and lower indices.

We adopt the transformation law as our definition of a tensor: “a tensor is something which transforms as a tensor”. Eq. (2.12) shows that the metric is a tensor. A scalar tensor has the same value in all coordinate systems. Not all objects specified with indices form a tensor. For example, the Christoffel symbols do not.

Our rather old-fashioned definition of a tensor is not very satisfactory, as it gives no geometric insight into what a tensor is. However, it is the best we can do without getting into heavier mathematics. Exercise A.1 gives practice with the algebra of tensors.

Exercise A.1.
a. Show that the coordinate differences $(dy^i)$ form a tensor.
b. Show that the product of two tensors is a tensor.
c. Show that the contraction of a tensor is a tensor.
d. Show that raising and lowering a tensor index are inverse operations.

Exercise A.2. This exercise shows that vectors in $R^n$ are tensors. Let $(y^i) = (y^i(\mathbf x))$ be global coordinates with metric $\mathbf g$. Fix a point P.
a. Define a basis $(\boldsymbol\partial_i)$ at P by $\boldsymbol\partial_i = \partial \mathbf x / \partial y^i$. The vector $\boldsymbol\partial_i$ is tangent to the curve obtained by varying $y^i$ while holding the other y’s fixed. Expand a vector $\mathbf v$ with respect to this basis: $\mathbf v = v^i\, \boldsymbol\partial_i$. Show that the $v^i$ are components of a contravariant tensor.
b. Define a second basis $(\mathbf{dy}^j)$ by $\mathbf{dy}^j = \nabla y^j$. The vector $\mathbf{dy}^j$ is orthogonal to the surface $y^j$ = constant. Expand $\mathbf v$ with respect to this basis: $\mathbf v = v_j\, \mathbf{dy}^j$. Show that the $v_j$ are components of a covariant tensor.
c. Show that the two bases are dual: $\langle \boldsymbol\partial_i, \mathbf{dy}^j \rangle = \delta^j_i$, where $\langle \cdot, \cdot \rangle$ is the inner product.
d. Show that $v_j = g_{ij}\, v^i$.
e. Show that $\langle \mathbf v, \mathbf w \rangle = g_{ij}\, v^i w^j$.

We give only an example of the calculus of tensors. Consider the geodesic equations Eq. (2.21). The derivatives $(\dot y^i)$ are the components of a tensor. The second derivatives $(\ddot y^i)$ are not. The Christoffel symbols on the left side of the geodesic equations “correct” this: the quantities $(\ddot y^i + \Gamma^i_{jk}\, \dot y^j \dot y^k)$ are the components of a tensor, the “correct” derivative of $(\dot y^i)$. The right sides of the geodesic equations are zero, the components of the zero tensor. Thus the geodesic equations form a tensor equation, valid in all coordinate systems.

A.12 Fermi Normal Coordinates. Recall from Appendix 9 that the metric $\mathbf f$ of a geodesic coordinate system at E in spacetime satisfies $(f_{mn}(E)) = \mathbf f^{\circ}$ and $\partial_i f_{mn}(E) = 0$. In spacetime, a Fermi normal coordinate system satisfies in addition $\partial_0\, \partial_i f_{mn}(E) = 0$. We do not give a construction of these coordinates.

In Fermi normal coordinates the expansion of $\mathbf f$ to second order at E is

$$\begin{aligned}
f_{00} &= 1 - R_{0l0m}\, x^l x^m + \dots\\
f_{0i} &= 0 - \tfrac23 R_{0lim}\, x^l x^m + \dots\\
f_{ij} &= -\delta_{ij} - \tfrac13 R_{iljm}\, x^l x^m + \dots,
\end{aligned} \tag{A.9}$$

where none of i, j, l, m is 0; the $R_{iljm}$ are components of the Riemann tensor, Eq. (A.8), at E; and $\delta_{ij} = 1$ if i = j, 0 otherwise. Proof: The constant terms in the expansion express $(f_{mn}(E)) = \mathbf f^{\circ}$. The linear terms vanish because $\partial_i f_{mn}(E) = 0$. The quadratic terms do not include l = 0 or m = 0 because $\partial_0\, \partial_i f_{mn}(E) = 0$. And a calculation (with a computer!) of the Riemann tensor from the metric shows that the $R_{iljm}$ are components of that tensor. (Several symmetries of the Riemann tensor facilitate the calculation:

$$R_{iljm} = -R_{ilmj}, \qquad R_{iljm} = -R_{lijm},$$

$$R_{iljm} = R_{jmil}, \qquad R_{iljm} + R_{imlj} + R_{ijml} = 0.$$

The first two identities imply that $R_{ilmm} = 0$ and $R_{iijm} = 0$.)

A calculation of the Einstein tensor from the expansion Eq. (A.9) gives

$$G_{00} = R_{1212} + R_{2323} + R_{3131}. \tag{A.10}$$

Eq. (2.23) gives the curvature K of a surface when $g_{12} = 0$. The general formula is

$$K = \frac{R_{1212}}{\det(\mathbf g)}. \tag{A.11}$$

Now consider the surface formed by holding the time coordinate $x^0$ and a spatial coordinate, say $x^3$, fixed, while varying the other two spatial coordinates, $x^1$ and $x^2$. From Eq. (A.9) the surface has metric

$$f_{ij} = \delta_{ij} + \tfrac13 R_{iljm}\, x^l x^m + \dots,$$

where i, j, l, m vary over 1, 2. A calculation shows that the 1212 component of the Riemann tensor for this metric is $-R_{1212}$. (Remember that R is the Riemann tensor for the spacetime, not the surface.) Thus from Eq. (A.11), the curvature of the surface at the origin is

$$K_{12} = \frac{-R_{1212}}{\det(\delta_{ij})} = -R_{1212}. \tag{A.12}$$

Putting together Eqs. (A.10) and (A.12),

$$G_{00} = -(K_{12} + K_{23} + K_{31}).$$

A.13 The Form of the Field Equation. We give a plausibility argument, based on reasonable assumptions, which leads from the schematic field equation Eq. (2.24) to Einstein’s field equation Eq. (2.28).

We take Eq. (2.24) to be a local equation, i.e., it equates the two quantities event by event.

Consider first the right side of Eq. (2.24), “quantity determined by mass/energy”. It is reasonable to involve the density of matter. In special relativity there are two effects affecting the density of moving matter. First, the mass of a body moving with speed v increases by a factor $(1 - v^2)^{-1/2}$. According to Eq. (1.11), this factor is $dx^0/ds$. Second, from Appendix 6, an inertial body contracts in its direction of motion by the same factor and does not contract in directions perpendicular to its direction of motion. Thus the density of moving matter is

$$T^{00} = \rho\, \frac{dx^0}{ds}\, \frac{dx^0}{ds}, \tag{A.13}$$

where ρ is the density measured by an observer moving with the matter. As emphasized in Chapter 1, the coordinate difference $dx^0$ has no physical significance in and of itself. Thus Eq. (A.13) is only one component of a whole:

$$T^{jk} = \rho\, \frac{dx^j}{ds}\, \frac{dx^k}{ds}. \tag{A.14}$$

This quantity represents matter in special relativity. A local inertial frame is in many respects like an inertial frame in special relativity. Thus at the origin of a local inertial frame we replace the right side of Eq. (2.24) with Eq. (A.14). Transforming to global coordinates, the right side of Eq. (2.24) is the energy-momentum tensor $\mathbf T$. Thus the field equation is of the form

$$\mathbf G = -8\pi\kappa\, \mathbf T, \tag{A.15}$$

where according to Eq. (2.24) the left side is a “quantity determined by metric”. We denote it $\mathbf G$ in anticipation that it will turn out to be the Einstein tensor. And −8πκ was inserted for later convenience.

We shall make four assumptions which will uniquely determine $\mathbf G$.

(i) From Eq. (2.24), $\mathbf G$ depends on $\mathbf g$. Assume that the components of $\mathbf G$, like the curvature K of a surface, depend only on the components of $\mathbf g$ and their first and second derivatives.

(ii) In special relativity $\partial T^{jk}/\partial x^j = 0$. (For k = 0 this expresses conservation of mass and for k = 1, 2, 3 it expresses conservation of the kth component of momentum.) Assume that $\partial T^{jk}/\partial x^j = 0$ at the origin of local inertial frames.

According to a mathematical theorem of Lovelock, our assumptions already imply that Eq. (A.15) is of the form

$$A \left[ \mathbf R - \tfrac12\, \mathbf g\, R \right] + \Lambda\, \mathbf g = -8\pi\kappa\, \mathbf T, \tag{A.16}$$

where A and Λ are constants.

It remains only to determine Λ and A.

(iii) Assume that a spacetime without matter is flat. (We will revisit this assumption in Sec. 4.4.) In a flat spacetime the Christoffel symbols Eq. (2.17), the Ricci tensor Eq. (2.25), and the curvature scalar R all vanish. Thus Eq. (A.16) becomes $\Lambda \mathbf g = 0$. Therefore Λ = 0.

(iv) Assume that Einstein’s theory reduces to Newton’s when the latter is accurate, namely when gravity is weak and velocities are small. We will show that this implies that A = 1. Then Eq. (A.16) will be precisely the field equation Eq. (2.28).

We use results from Chapter 4 with Λ = 0. Solve Eq. (4.15) for $S''$ and use Eq. (4.14):

$$S'' = -\frac{4\pi\kappa\rho\, S}{3}. \tag{A.17}$$

Consider a small ball centered at the origin. Using Eq. (4.5), the ball has radius $S(t) \int d\sigma = S(t)\, \sigma$ (σ is constant).

Now remove the matter from a thin spherical shell at radius $S(t)\, \sigma$. The shell is so thin that the removal changes the metric in the shell negligibly. An inspection of the derivation of the Schwarzschild metric in Section 3.2 shows that the metric in the shell is entirely determined by the total mass inside the shell. In particular, we may remove the mass outside the shell without changing the metric in the shell. Do so.

Apply the Newtonian equation Eq. (2.1) to an inertial particle in the shell:

$$S'' \sigma = -\kappa\, \frac{\tfrac43 \pi\, (S\sigma)^3\, \rho}{(S\sigma)^2}.$$

This is the same as Eq. (A.17); Einstein’s and Newton’s theories give the same result.

But if A ≠ 1 in Eq. (A.16), then the left side of Eq. (A.17) would be multiplied by A. Thus the coincidence of the two theories in this situation requires A = 1.
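The agreement between the Newtonian equation for the particle in the shell and Eq. (A.17) can be seen by cancelling the common factors of S and σ. The step, written out:

```latex
S''\sigma
  = -\kappa\,\frac{\tfrac{4}{3}\pi\,(S\sigma)^{3}\,\rho}{(S\sigma)^{2}}
  = -\frac{4\pi\kappa\rho}{3}\,S\sigma
\quad\Longrightarrow\quad
S'' = -\frac{4\pi\kappa\rho\,S}{3}.
```

The constant σ drops out, which is why the result does not depend on the size of the ball chosen.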

A.14 Eliminating the dtdr Term. We make a coordinate change in Eq. (3.2) to eliminate the dtdr term. From the theory of differential equations there is an integrating factor I(t, r) and then a function τ(t, r) so that

$$d\tau = I\,(e^{2\mu}\,dt - u\,dr).$$

Then

$$dt = e^{-2\mu}\,(I^{-1}\,d\tau + u\,dr).$$

Substitute for dt; the dτdr terms cancel:

$$e^{2\mu}\,dt^2 - 2u\,dt\,dr - e^{2\nu}\,dr^2 = I^{-2} e^{-2\mu}\,d\tau^2 - \left( u^2 e^{-2\mu} + e^{2\nu} \right) dr^2 \equiv e^{2\bar\mu}\,d\tau^2 - e^{2\bar\nu}\,dr^2.$$

Drop the bars and rename τ to t and this becomes

$$e^{2\mu}\,dt^2 - e^{2\nu}\,dr^2.$$

There is no dtdr term.

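A.15 Approximating an Integral. Retaining only first order terms in the small quantities $m/r$ and $m/r_p$ we have

$$\begin{aligned}
\frac{(1 - 2m/r)^{-2}}{1 - \dfrac{1 - 2m/r}{1 - 2m/r_p}\,(r_p/r)^2}
&\approx \frac{(1 + 2m/r)^2}{1 - (1 - 2m/r)(1 + 2m/r_p)\,(r_p/r)^2}\\
&\approx \frac{1 + 4m/r}{1 - (1 - 2m/r + 2m/r_p)\,(r_p/r)^2}\\
&= \frac{1 + 4m/r}{\left( 1 - (r_p/r)^2 \right) \left( 1 - \dfrac{2m r_p}{r\,(r + r_p)} \right)}\\
&\approx \left( 1 - (r_p/r)^2 \right)^{-1} (1 + 4m/r) \left( 1 + \frac{2m r_p}{r\,(r + r_p)} \right)\\
&\approx \left( 1 - (r_p/r)^2 \right)^{-1} \left( 1 + \frac{4m}{r} + \frac{2m r_p}{r\,(r + r_p)} \right).
\end{aligned}$$

We now see that Eq. (3.25) can be replaced with

$$t = \int_{r_p}^{r_e} \left( 1 - \left( \frac{r_p}{r} \right)^2 \right)^{-1/2} \left( 1 + \frac{4m}{r} + \frac{2m r_p}{r\,(r + r_p)} \right)^{1/2} dr.$$

Expand the second factor to first order, $1 + 2m/r + m r_p / (r\,(r + r_p))$, and integrate to give

$$t = \left( r_e^2 - r_p^2 \right)^{1/2} + 2m \ln \frac{r_e + \left( r_e^2 - r_p^2 \right)^{1/2}}{r_p} + m \left( \frac{r_e - r_p}{r_e + r_p} \right)^{1/2}.$$

Use $r_p \ll r_e$ to obtain Eq. (3.26).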
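The closed form can be checked against a direct numerical integration. The sketch below assumes the first-order integrand $(1 - (r_p/r)^2)^{-1/2}\,(1 + 2m/r + m r_p/(r(r + r_p)))$ integrated from $r_p$ to $r_e$. To avoid the integrable singularity at $r = r_p$ it substitutes $r = r_p \cosh u$, a device chosen here and not taken from the text, and then applies Simpson’s rule. The sample values of m, $r_p$, and $r_e$ are arbitrary.

```python
import math


def closed_form(m, r_p, r_e):
    """The integrated expression for t given above."""
    root = math.sqrt(r_e**2 - r_p**2)
    return (root
            + 2 * m * math.log((r_e + root) / r_p)
            + m * math.sqrt((r_e - r_p) / (r_e + r_p)))


def numeric(m, r_p, r_e, n=2000):
    """Simpson's rule in the variable u, where r = r_p cosh(u); n must be even."""
    u_max = math.acosh(r_e / r_p)

    def g(u):
        r = r_p * math.cosh(u)
        # (1 - (r_p/r)^2)^(-1/2) dr becomes r_p cosh(u) du under the substitution
        return r_p * math.cosh(u) * (1 + 2 * m / r + m * r_p / (r * (r + r_p)))

    h = u_max / n
    total = g(0.0) + g(u_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3


if __name__ == "__main__":
    print("closed form:", closed_form(1.0, 10.0, 100.0))
    print("numerical  :", numeric(1.0, 10.0, 100.0))
```

The two values agree to many digits, which confirms the three elementary integrals used in the last step.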

A.16 The Robertson-Walker Metric. This appendix derives theRobertson-Walker for an isotropic universe, Eq. (4.4).

In Chapter 3 we showed that the metric of a spherically symmetric spacetimecan be put in the form Eq. (3.2):

ds2 = e2µdt2 − e2νdr2 − r2e2λd Ω 2, (A.18)

where the coefficients do not depend on φ or θ.Since the coordinates are comoving, dr = dφ = dθ = 0 for neighboring events

on the worldline of a galaxy. Thus by Eq. (A.18), ds2 = e2µdt2. But ds = dt,since both are the time between the events measured by a clock in the galaxy.Thus e2µ = 1. Our metric is now of the form

ds2 = dt2 −(e2νdr2 − r2e2λd Ω 2

)(A.19)

If the universe is expanding isotropically about every galaxy, then it seemsthat the spatial part of the metric could depend on t only through a commonfactor S(t), as in the metric of our balloon, Eq. (4.2). We demonstrate this.

Consider light sent from a galaxy at (t, r, φ, θ) to a galaxy at (t + dt, r +dr, φ, θ). From Eq. (A.19), dt2 = e2νdr2. Thus the light travel time betweenP (r, φ, θ) and Q(r+dr, φ, θ) is eν(t,r)dr. The ratio of this time to that measuredat another time t′ is given by eν(t,r)/eν(t′,r). Similarly, the ratio of the timesbetween P and R(r, θ, φ + dφ) is eλ(t,r)/eλ(t′,r). By isotropy at P the ratios areequal:

eν(t,r)

eν(t′,r)=

eλ(t,r)

eλ(t′,r). (A.20)

(For example, if the time from the galaxy at P to the nearby galaxy at Q doublesfrom t to t′, then the time from P in another direction to the nearby galaxy atR must also double.)

The two sides of Eq. (A.20) are independent of r. For if two r’s gave differentratios, then there would not be isotropy half way between. Fix t′. Then bothmembers of Eq. (A.20) are a function only of t, say S(t). Set ν(t′, r) = ν(r) toobtain eν(t,r) = S(t) eν(r). Similarly, eλ(t,r) = S(t) eλ(r). Thus the metric Eq.(A.19) becomes

ds2 = dt2 − S 2(t)(e2νdr2 + r2e2λd Ω 2

). (A.21)

The coordinate change r = reλ(r) puts Eq. (A.21) in the form

ds2 = dt2 − S 2(t)(e2νdr2 + r2d Ω 2

)= dt2 − S 2(t) dσ2. (A.22)

We now determine ν in Eq. (A.22).Exercise A.3. Show that an application of Eq. (2.23) shows that the

curvature K of the half plane θ = θ0 in the dσ portion of the metric satisfiesν ′e−2ν = Kr.As above, K is independent of r. Integrating thus gives e−2ν = C−Kr2, whereC is a constant of integration.

98

A.16 The RobertsonWalker Metric

Fig. A.15: Evaluatingthe integration constantC.

It only remains to determine C. From Eq.(A.22) we can label the sides of an infinitesimalsector of an infinitesimal circle centered at the ori-gin in the φ = π/2 plane. See Fig. A.15. Since thelength of the subtended arc is the radius times thesubtended angle, eν(0) = 1. Thus C = 1.

Set $k = 1, 0, -1$ according as $K > 0$, $K = 0$, $K < 0$. Then $k$ indicates the sign of the curvature of space. If $K \neq 0$, substitute $\bar r = r\,|K|^{1/2}$ and $\bar S = S\,|K|^{-1/2}$. Dropping the bars, we obtain

$$ds^2 = dt^2 - S^2(t)\left(\frac{dr^2}{1 - k r^2} + r^2 d\Omega^2\right) \qquad (k = 0, \pm 1).$$

The substitution $r = \bar r/(1 + k\bar r^2/4)$ gives the Robertson-Walker metric.
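The algebra behind this last substitution is short enough to sketch. The block below assumes the metric just obtained, with $dr^2/(1 - kr^2)$ as its radial term, and writes $\bar r$ for the new radial coordinate. The final line is the isotropic form that results, which is presumably the form of the Robertson-Walker metric quoted in the main text.

```latex
\begin{align*}
r &= \frac{\bar r}{1 + k\bar r^2/4}, \qquad
dr = \frac{1 - k\bar r^2/4}{(1 + k\bar r^2/4)^2}\,d\bar r, \\
1 - kr^2 &= \frac{(1 + k\bar r^2/4)^2 - k\bar r^2}{(1 + k\bar r^2/4)^2}
          = \frac{(1 - k\bar r^2/4)^2}{(1 + k\bar r^2/4)^2}, \\
\frac{dr^2}{1 - kr^2} + r^2 d\Omega^2
  &= \frac{d\bar r^2 + \bar r^2 d\Omega^2}{(1 + k\bar r^2/4)^2}, \\
ds^2 &= dt^2 - S^2(t)\,\frac{d\bar r^2 + \bar r^2 d\Omega^2}{(1 + k\bar r^2/4)^2}.
\end{align*}
```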

A.17 Redshift Relations. This appendix relates three physical parameters of a distant object to its redshift. We obtain these relationships for a flat universe ($k = 0$) with a cosmological constant $\Lambda$, as described in Sec. 4.4.

The Distance-Redshift Relation. The distance-redshift relation $d(z)$ expresses the distance $d$ to a distant object today as a function of its redshift.

To obtain the relation, suppose that light is emitted toward us at event $(t_e, r_e)$ and received by us today at event $(t_o, 0)$. Evaluate Eq. (4.14) at $t_o$, use Eq. (4.9), and rearrange: $8\pi\kappa\rho(t_o) = 3H_o^2 - \Lambda$. Thus from Eqs. (4.18) and (4.11),

$$8\pi\kappa\rho(t) = 8\pi\kappa\rho(t_o)\,\frac{S^3(t_o)}{S^3(t)} = (3H_o^2 - \Lambda)(z + 1)^3.$$

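Thus Eq. (4.14) gives

$$\left(\frac{S'}{S}\right)^2 = \frac{8\pi\kappa\rho(t)}{3} + \frac{\Lambda}{3} = H_o^2\left[\Omega_\Lambda + (1 - \Omega_\Lambda)(z + 1)^3\right], \tag{A.23}$$

where $\Omega_\Lambda \equiv \Lambda/(3H_o^2)$. (Using Eq. (4.17), Eq. (4.16) can be written $\rho/\rho_c + \Omega_\Lambda = 1$. Thus $\Omega_\Lambda$ is the dark energy fraction of the matter-energy density of the universe.)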

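For light emitted at time $t$ and received by us at $t_o$, Eq. (4.11) can be written $S(t_o) = (z(t) + 1)\,S(t)$. Differentiate with respect to $t$, which gives $S'/S = -(1 + z)^{-1}\,dz/dt$, and substitute into Eq. (A.23):

$$(1 + z)^{-2}\left(\frac{dz}{dt}\right)^2 = H_o^2\left[\Omega_\Lambda + (1 - \Omega_\Lambda)(z + 1)^3\right]. \tag{A.24}$$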

From Eq. (4.10) with $k = 0$, $dt = S(t)\,dr = S(t_o)(1 + z)^{-1}dr$. And from Eqs. (4.8) and (4.4) with $ds = 0$ and $k = 0$, $d = S(t_o)\,r_e$. Substitute these into Eq. (A.24), separate variables, and integrate to give the distance-redshift relation:

$$d(z) = S(t_o)\,r_e = H_o^{-1}\int_0^z \left[\Omega_\Lambda + (1 - \Omega_\Lambda)(\zeta + 1)^3\right]^{-1/2} d\zeta. \tag{A.25}$$
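The integral in Eq. (A.25) has no simple closed form for general $\Omega_\Lambda$, so in practice it is evaluated numerically. The following is a minimal sketch, not part of the original text, that uses the composite Simpson rule. The function names, the `steps` parameter, and the sample value $\Omega_\Lambda = 0.7$ are illustrative assumptions. With $c = 1$ and `hubble = 1` the result is in units of $H_o^{-1}$.

```python
def integrand(zeta, omega_lambda):
    """Integrand of Eq. (A.25): [Omega_L + (1 - Omega_L)(zeta + 1)^3]^(-1/2)."""
    return (omega_lambda + (1.0 - omega_lambda) * (zeta + 1.0) ** 3) ** -0.5


def distance(z, omega_lambda, hubble=1.0, steps=1000):
    """Approximate d(z) of Eq. (A.25) with the composite Simpson rule.

    `steps` is a hypothetical tuning parameter (number of subintervals);
    Simpson's rule requires it to be even.
    """
    if steps % 2:
        raise ValueError("steps must be even")
    h = z / steps
    # End points carry weight 1, odd interior points 4, even interior points 2.
    total = integrand(0.0, omega_lambda) + integrand(z, omega_lambda)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(i * h, omega_lambda)
    return total * h / 3.0 / hubble


if __name__ == "__main__":
    # Omega_Lambda = 0.7 is an illustrative value, not taken from this appendix.
    for z in (0.1, 0.5, 1.0, 3.0):
        print(z, distance(z, omega_lambda=0.7))
```

Two limiting cases give a quick sanity check, since the integral can then be done by hand: for $\Omega_\Lambda = 1$ the integrand is 1 and $d = z/H_o$, while for $\Omega_\Lambda = 0$ it is $(\zeta + 1)^{-3/2}$ and $d = 2H_o^{-1}\left[1 - (1 + z)^{-1/2}\right]$.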

The Angular Size-Redshift Relation. The angular size-redshift relation $\alpha(z)$ expresses the angular size $\alpha$ of a distant object as a function of its redshift.

To obtain the relation, send a pulse of light from one side of the object to the other. Let $(t_e, r_e, \theta_e, \phi_e)$ and $(t_e + dt, r_e, \theta_e, \phi_e + d\phi)$ be the coordinates of the emission and reception events. From Eq. (4.4), $dt = r_e S(t_e)\,d\phi$. Since $c = 1$, the size of the object $D = dt$. Also, $\alpha = d\phi$. Thus using Eq. (4.11),

$$\alpha = \frac{D}{r_e S(t_e)} = \frac{(z + 1)D}{r_e S(t_o)}.$$

Substitute from the distance-redshift relation, Eq. (A.25), to obtain the angular size-redshift relation:

$$\alpha(z) = \frac{(z + 1)H_o D}{\displaystyle\int_0^z \left[\Omega_\Lambda + (1 - \Omega_\Lambda)(\zeta + 1)^3\right]^{-1/2} d\zeta}. \tag{A.26}$$

The Luminosity-Redshift Relation. The luminosity-redshift relation $\ell(z)$ expresses the luminosity $\ell$ of light received from a distant object as a function of its redshift.

To obtain the relation, suppose that light of intrinsic luminosity $L$ is emitted toward us at event $(t_e, r_e)$ and received by us at event $(t_o, 0)$. Replace the approximate relation $\ell = L/4\pi d^2$ from Sec. 4.1 with this exact relation:

$$\ell = \frac{L}{4\pi(1 + z)^2 d^2}. \tag{A.27}$$

The new $1 + z$ factors are most easily understood using the photon description of light. The energy of a photon is $E = hf$ (an equation discovered by Einstein), where $h$ is Planck’s constant. According to Eq. (1.7), the energy of each received photon is diminished by a factor $1 + z$. And according to Eq. (1.6), the rate at which photons are received is diminished by the same factor.

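Substitute $d$ from the distance-redshift relation, Eq. (A.25), into Eq. (A.27) to obtain the luminosity-redshift relation:

$$\ell(z) = \frac{H_o^2 L}{4\pi(1 + z)^2\left(\displaystyle\int_0^z \left[\Omega_\Lambda + (1 - \Omega_\Lambda)(\zeta + 1)^3\right]^{-1/2} d\zeta\right)^2}. \tag{A.28}$$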
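Eqs. (A.26) and (A.28) share the same integral, so both can be evaluated with one numerical helper. The sketch below is an illustration rather than the author's method. It uses the composite Simpson rule, and all function and parameter names are hypothetical. Units follow the text ($c = 1$), and both functions require $z > 0$ because the integral vanishes at $z = 0$.

```python
import math


def redshift_integral(z, omega_lambda, steps=1000):
    """Simpson approximation of the integral in Eqs. (A.25), (A.26), (A.28).

    `steps` (assumed even) is a hypothetical accuracy parameter.
    """
    def f(zeta):
        return (omega_lambda + (1.0 - omega_lambda) * (zeta + 1.0) ** 3) ** -0.5

    h = z / steps
    total = f(0.0) + f(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0


def angular_size(z, size, omega_lambda, hubble=1.0):
    """Eq. (A.26): alpha(z) = (z + 1) H_o D / integral, for z > 0."""
    return (z + 1.0) * hubble * size / redshift_integral(z, omega_lambda)


def received_luminosity(z, intrinsic, omega_lambda, hubble=1.0):
    """Eq. (A.28): l(z) = H_o^2 L / (4 pi (1 + z)^2 integral^2), for z > 0."""
    integral = redshift_integral(z, omega_lambda)
    return hubble ** 2 * intrinsic / (4.0 * math.pi * (1.0 + z) ** 2 * integral ** 2)


if __name__ == "__main__":
    # Omega_Lambda = 0.7 is an illustrative value, not taken from this appendix.
    for z in (0.5, 1.0, 2.0, 4.0):
        print(z, angular_size(z, 1.0, 0.7), received_luminosity(z, 1.0, 0.7))
```

In the special case $\Omega_\Lambda = 0$ the integral is $2\left[1 - (1 + z)^{-1/2}\right]$, which equals 1 at $z = 3$; there $\alpha = 4H_oD$ and $\ell = H_o^2L/64\pi$. In that same case $\alpha(z)$ is not monotonic: it reaches a minimum at $z = 1.25$, so more distant objects of a fixed size $D$ appear larger beyond that redshift.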

Index

accelerometer, 13, 30
accretion disk, 60, 64
angular size, 72
angular size-redshift relation, 72, 75, 101
approximating an integral, 97
approximations, 80

balloon analogy, 65, 66, 68, 69
big bang nucleosynthesis, 67, 74, 79
black hole, 49, 62
blackbody radiation, 67, 71
Boltzmann’s constant, 71
Braginsky, V. B., 31
Brecher, K., 24
Bridgman, P. W., 17
Brillet-Hall experiment, 23, 87

Christoffel symbols, 40
clock, 11
comoving coordinates, 68
contravariant index, 47, 91
coordinate singularity, 62
coordinates, 14, 34
Copernicus, Nicholas, 28
cosmic background radiation (CBR), 67, 71, 72, 74
cosmological constant, 73
cosmology, 65
covariant index, 47, 91
crab nebula, 49
critical density, 73
curvature, 43, 47, 93
curvature scalar, 45
curve, 12
curved spacetime, 11

dark energy, 73, 74, 76, 79
dark matter, 74, 76, 79
degenerate electron pressure, 48
de Sitter, Willem, 58
deuterium, 74
Dicke, R., 31
dipole anisotropy, 67
distance-redshift relation, 66, 70, 77, 100, 101
double quasar, 57
drag free satellite, 60
dust, 45, 73

Eddington, Sir Arthur Stanley, 56
Einstein summation convention, 37
Einstein tensor, 45, 93
Einstein, Albert, 11, 29, 30
eliminating the dt dr term, 96
energy-momentum tensor, 45, 73
ergosphere, 63
event, 11
event horizon, 77

Fermi normal coordinates, 47, 93
field equation, 43, 45, 46, 50, 72, 94
flat spacetime, 11
frame dragging, 59

galaxy, 65
galaxy cluster, 65
Galilei, Galileo, 28
Galle, J., 28
Gauss, C. F., 30, 44
general relativity, 11
geodesic, 25, 26, 40–42
geodesic coordinates, 41, 47, 89, 90
geodesic equations, 53, 90
geodesic postulate, 13, 25, 26, 30, 31, 40–42
geodetic precession, 58, 60
global coordinate postulate, 35, 36
global coordinates, 34, 36
GPS, 33
gravitational binding energy, 31
gravitational lens, 57, 71, 74
gravitational lens statistics, 74, 75
gravitational waves, 61, 79
gravitomagnetic clock effect, 60
gravitomagnetism, 59
Gravity Probe B, 59

Hafele-Keating experiment, 12, 22, 23, 32, 33, 51
Hils-Hall experiment, 24
horizon, 62, 63
Hubble’s constant, 65, 70
Hubble’s law, 65, 66, 70
Hubble, Edwin, 66

inertial force, 13, 30
inertial frame, 13, 14
inertial frame postulate, 13, 14, 17, 30, 36
inertial object, 13, 30, 65
inflation, 79
integrated Sachs-Wolfe effect, 75
intrinsic description, 36, 46
intrinsic luminosity, 66
isotropic universe, 68

Joos, G., 23, 86

Kennedy-Thorndike experiment, 23, 87
Kepler, Johannes, 28
Kerr metric, 59, 64

LAGEOS satellites, 59
Le Verrier, U., 28, 56
length contraction, 20, 32, 84, 94
Lense-Thirring effect, 59
light deflection, 56
light delay, 57
light retardation, 61
light second, 19
light speed, 24, 39
lightlike separated, 18, 19
local inertial frame, 35
local planar frame, 34, 41
loop quantum gravity, 79
Lovelock’s theorem, 94
luminosity, 66
luminosity-redshift relation, 74, 76, 101
lunar laser experiment, 31, 58

Macek-Davis experiment, 81, 82
manifold, 35
mass, 94
Mercury, 56
metric postulate, 13, 18, 19, 22, 30, 37–39
Michelson interferometer, 86, 87
Michelson-Morley experiment, 23, 86
Milky Way, 65
Minkowski, Hermann, 19
muon, 23

neutron star, 49, 52, 61, 64
Newton, Isaac, 28
newtonian gravitational constant, 28, 51
newtonian orbits, 88

Painlevé coordinates, 62
particle horizon, 77
periastron advance, 61
perihelion advance, 55
photon, 101
physical constants, 80
planar frame, 14
planar frame postulate, 14, 34
Planck’s constant, 71, 101
Planck’s law, 71
point, 11
postulates, global form, 36
postulates, local form, 36
Pound, R. V., 32
principle of relativity, 84
proper distance, 18, 83
proper time, 18, 20, 39
pseudosphere, 43, 44
Ptolemy, Claudius, 27
pulsar, 49, 61

quantum theory, 79
quasar, 49, 56–58, 64, 71, 74

Rebka, G. A., 32
redshift, 16
    Doppler, 17, 20, 24, 32
    expansion, 17, 66, 68, 70
    gravitational, 17, 32, 33, 51, 61, 63
Ricci tensor, 45, 91
Riemann tensor, 91, 93
Riemann, G. B., 36
Robertson-Walker metric, 69, 72, 98
rod, 11, 13, 14, 21, 34, 35, 84, 94
round trip light speed, 82

scalar, 91
Schwarzschild metric, 51
Schwarzschild radius, 62
Schwarzschild, Karl, 51
Sirius, 48, 52
Snider, J. L., 32
spacelike separated, 18, 20, 83
spacetime, 11, 19
spacetime diagram, 15
special relativity, 11
standard model, 79
static limit, 63
stellar evolution, 48
STEP, 31
string theory, 79
supernova, 48, 49, 61, 64, 67, 71, 74
surface, 11
surface dwellers, 13
synchronization, 14, 15, 17, 81

tensors, 46, 91
terrestrial redshift experiment, 32, 35, 52
tidal acceleration, 35, 44
time dilation, 20, 23, 32, 51, 61
timelike separated, 18, 20, 21

universal light speed, 19, 21, 23, 39
universal time, 12, 29

vacuum field equation, 46, 50

white dwarf, 48, 52, 64
worldline, 12
