229
Binary Stars Steven N. Shore Indiana University, South Bend I. Historical Introduction II. Some Preliminaries III. Classification of Close Binary Systems IV. Evolutionary Processes for Stars in Binary Systems V. Mass Transfer and Mass Loss in Binaries VI. X-Ray Sources and Cataclysmics VII. Formation of Binary Systems VIII. Concluding Remarks GLOSSARY Accretion disk Structure formed when the material ac- creting onto a star has excess angular momentum, form- ing a circulating disk of material supported by internal pressure and heated by turbulent stresses. Lagrangian points Stable points in the orbit of a third body in a binary system; the inner Langrangian point, L 1 , lies along the line of centers and marks the Roche limit for a tidally distorted star. Main sequence Phase of hydrogen core burning; first sta- ble stage of nuclear processing and longest epoch in the evolution of a star. Mass function Method by which the mass of an unseen companion in a spectroscopic binary can be estimated using the radial velocity of the visible star and the or- bital period of the binary. Orbital parameters Inclination, i , of the orbital plane to the line of sight; P , the period of revolution; e, the eccentricity of the orbit; q , the mass ratio. Red giant Stage of helium core burning; subsequent to the subgiant stage. Subgiant Stage of hydrogen shell burning, when the deep envelope initiates nuclear processing around an inert helium core produced by main sequence hydrogen core fusion. This is the transition stage in the expansion of the envelope from the main sequence to the red giant phase. Units Solar mass ( M ), 2 × 10 33 g; solar radius ( R ), 7 × 10 10 cm. BINARY STARS are gravitationally bound stars, formed simultaneously or by capture, that orbit a common cen- ter of mass and evolve at the same time. These stars are formed with similar initial conditions, although often quite different masses. Visual binaries are both sufficiently sep- arated in space and sufficiently near that their angular mo- tion can be directly observed. Spectroscopic binaries are unresolved systems for which motion is detected using 77

Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary StarsSteven N. ShoreIndiana University, South Bend

I. Historical IntroductionII. Some PreliminariesIII. Classification of Close Binary SystemsIV. Evolutionary Processes for Stars

in Binary SystemsV. Mass Transfer and Mass Loss in Binaries

VI. X-RaySourcesandCataclysmicsVII. Formation of Binary SystemsVIII. Concluding Remarks

GLOSSARY

Accretion disk Structure formed when the material ac-creting onto a star has excess angular momentum, form-ing a circulating disk of material supported by internalpressure and heated by turbulent stresses.

Lagrangian points Stable points in the orbit of a thirdbody in a binary system; the inner Langrangian point,L1, lies along the line of centers and marks the Rochelimit for a tidally distorted star.

Main sequence Phase of hydrogen core burning; first sta-ble stage of nuclear processing and longest epoch in theevolution of a star.

Mass function Method by which the mass of an unseencompanion in a spectroscopic binary can be estimatedusing the radial velocity of the visible star and the or-bital period of the binary.

Orbital parameters Inclination, i , of the orbital planeto the line of sight; P , the period of revolution; e, theeccentricity of the orbit; q , the mass ratio.

Red giant Stage of helium core burning; subsequent tothe subgiant stage.

Subgiant Stage of hydrogen shell burning, when the deepenvelope initiates nuclear processing around an inerthelium core produced by main sequence hydrogen corefusion. This is the transition stage in the expansion ofthe envelope from the main sequence to the red giantphase.

Units Solar mass (M), 2 × 1033 g; solar radius (R),7 × 1010 cm.

BINARY STARS are gravitationally bound stars, formedsimultaneously or by capture, that orbit a common cen-ter of mass and evolve at the same time. These stars areformed with similar initial conditions, although often quitedifferent masses. Visual binaries are both sufficiently sep-arated in space and sufficiently near that their angular mo-tion can be directly observed. Spectroscopic binaries areunresolved systems for which motion is detected using

77

Page 2: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

78 Binary Stars

radial velocity variations of spectral lines from the com-ponent stars. If the orbital plane is inclined at nearly 90

to the line of sight, the components will display mutualeclipses. Depending on the orbital period and mass, thestars may share a common envelope (contact), be in a stateof unidirectional mass transfer between the components(semidetached), or evolve without mass transfer but withmutual gravitational perturbation (detached). In semide-tached systems, depending on the rate of mass transfer andthe nature of the accreting object, a hot accretion disk willbe formed. If the companion star is gravitationally col-lapsed, being a neutron star or black hole, X-ray emissionwill result. Accretion onto white dwarf or neutron starsresults in flash nuclear reactions that can trigger the novaevent.

I. HISTORICAL INTRODUCTION

At the close of the eighteenth century, William Herschelargued that the frequency of close visual pairs was larger inany area of the sky than would be expected by chance. Onthis basis, it was suggested that binary stars—that is, phys-ically gravitationally bound stellar systems—must exist.Prior to the discovery of Neptune, this was the most dra-matic available demonstration of the universality of New-ton’s theory of gravitation. Herschel, Boole, and othersextended this study to the demonstration that clustering isa general phenomenon of gravitational systems. The dis-covery of the wobble in the proper motion of Sirius ledBessel, in the 1840s, to argue for the presence of a low-mass, then unseen companion; it was discovered about20 years later by Clarke. Routine observations of visualbinary star orbits were being made by the end of the cen-tury. For very low mass stars, the method is still employedin the search for planets through proper motion perturba-tions, much like the method by which Neptune was dis-covered, although velocity variations have now supplantedthis search strategy.

About the same time as Herschel’s original work,Goodricke and Piggott observed the photomeric variationsin the star β Persei, also known as Algol. The variations,he argued, were due to the system being an unresolved,short-period binary system, with one star considerablybrighter than the other but otherwise about the same phys-ical size. They postulated that the light variations wereconsistent with eclipses, and that were we able to resolvethe system, we would see two stars orbiting a common cen-ter of mass with about a three-day period. The dramaticconfirmation of this picture came with the discovery, byHartmann and Vogel at the end of the last century, of ra-dial velocity variations in this system consistent with theeclipse phenomenology. By mid-century, in part as a re-

sult of the work at Bamberg under Argelander, large-scalesearches for variable stars began to produce very largesamples of stars with β Persei-like behavior. By the mid-1920s, much of the theory of geometric eclipses had beendeveloped. Russell and Shapley, in particular, includedthe effects of reflection (scattering) and ellipsoidal (tidal)distortions.

Most theoretical work on binary stars is the product ofthe past 70 years. Methods for the analysis of eclipses,based on light curve fitting by spheroids, were developedby Russell and Merrill in the first two decades of this cen-tury. Atmospheric eclipses were first discussed by Kopalin the 1930s. The study of mass transfer in binary sys-tems was initiated largely by Struve in the mid-1930s, andthe applications of orbital dynamics to the study of masstransfer began in the 1940s with Kuiper’s study of β Lyrae.Using large-scale computer models, stellar evolution in bi-nary star systems was first studied in detail in the 1960sby Paczynski and collaborators in Warsaw, Plavec andcolleagues in Prague. Hydrodynamic modeling has onlyrecently been possible using realistic physics and remainsa most interesting area of study. Much recent work on bi-nary star evolution and hydrodynamics has been spurredby the study of binary X-ray sources. Following the dis-covery of binarity for several classical novas, by Walkerin the 1950s, the connection between low-mass X-ray bi-naries and cataclysmics has been central to the study ofevolution of stars undergoing mass exchange.

II. SOME PRELIMINARIES

The broadest separation between types of binary stars is onthe basis of observing method. For widely separated sys-tems, which are also close to us, the stars appear physicallydistinct on the sky. If the orbital periods are short enough(that is, less than millennia), it is possible to determine theplane of the projected orbit by observing directly the mo-tion of the stars. For those systems in which the luminosityratios (and possibly mass ratios) are large, it is possibleto obtain orbital characteristics for the two members byobserving periodic wobbling in the proper motion of thevisible member. Such methods are frequently employed inthe search for planetary-like companions to high propermotion stars (that is, stars with large transverse velocitiesto the line of sight, such as Barnard’s star). For systems oflow proper motion and possibly long period, speckle in-terferometry, intensity, and Michelson interferometry, aswell as lunar occultations, can be useful in separating com-ponents and at least determining luminosity ratios. Suchmethods, extended to the near infrared, have been recentlyemployed in the search for brown dwarf stars, objects ofJupiter-sized mass.

Page 3: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary Stars 79

When the system is unresolved, even at separationsof several milliarcseconds, it is necessary to employspectroscopic methods to determine the composition andmotion of the constituent stars. These are the spectroscopicbinaries, by far the largest group so far studied. Two meth-ods of analysis, which are happily sometimes complemen-tary, can be used—observation of radial velocity variationsof the components and eclipse phenomena.

A. Spectroscopic Binary Velocity Curves

Consider two stars in a circular orbit about the center ofmass. Regardless of the perturbations, we can say that theseparation of the two stars of mass M1 and M2 is a =r1 + r2 in terms of separation of the stars from the cen-ter of mass. The individual radii are the distances of thecomponents from the center of mass:

r1/r2 = M2/M1 (1)

The velocity ratio is given by V2/V1 = M1/M2 for a circu-lar orbit, where V is the orbital velocity. By Kepler’s law,

G M = ω2a3 (2)

where ω is the orbital frequency, given by 2π/P where Pis the period, and M is the total mass of the system. Now,assume that the inclination of the plane of the orbit to theobserver is i , that the maximum observed radial velocityfor one star is given by K , and that we observe only oneof the stars. Then,

K 3 P/2πG = M32 sin3i

/M2 = f (M) (3)

The function f (M) is called the mass function and dependsonly on observable parameters, the maximum radial ve-locity of one of the stars, and the period of the orbit. If M1

is the mass of the visible star, f (M) serves to delimit themass of the unseen companion. If both stars are visible,then,

K1/K2 = M2/M1 (4)

independent of the inclination. Thus, if both stars can beobserved, both the mass ratio and the individual massescan be specified to within an uncertainty in the orbital in-clination using f (M). The mass function permits a directdetermination of stellar masses, independent of the dis-tance to the stars. This means that we can obtain one ofthe most important physical parameters for understandingstellar evolution merely by a kinematic study of the stars.

If the orbit is eccentric, departures from simple sinu-soidal radial velocity variations with time are seen. Theeccentricity of the orbit also introduces another symmetry-breaking factor into the velocity variations, because the an-gle between the observer and the line of apsides, the line

that marks the major axis of each ellipse, determines theshapes of the velocity variation curve with orbital phase.

B. Eclipsing Binary Light Curves

The orbital plane of eclipsing stars lies perpendicular tothe plane of the sky. Depending on the relative sizes of thestars, the orbital inclination over which eclipses can occuris considerable, but in general only a small fraction of theknown binary systems will be seen to undergo eclipses.The variation in light serves two purposes. It permits adetermination of the relative radii of the stars, since theduration of ingress and the duration of eclipse depend onthe difference in the sizes of the stars. That is, if t1 isthe total time between first and last contact, and t2 is theduration of totality, assuming that the eclipse is annular ortotal, then,

t1t2

= rg + rs

rg − rs(5)

where rs is the smaller and rg is the greater radius, respec-tively. The diminution in light from the system dependson the relative surface brightness of the stars, which inturn depends on the surface (effective) temperature, Teff.Eclipses will not be total if the two stars are not preciselyin the line of sight, unless they differ considerably in ra-dius, so that the mark of totality is that the light does notvary during the minimum in brightness.

C. Distortions in Photometricand Velocity Curves

Several effects have been noted that distort the light curveand can be used to determine more physical informationabout the constituent stars.

1. Reflection Effect: External Illumination

Light from one component of a close binary can be scat-tered from the photosphere and outer atmosphere of theother, producing a single sinusoidal variation in the systembrightness outside of eclipse. This reflection effect is use-ful in checking properties of the atmospheres of the stars.If the illuminating star is significantly higher temperature,it can also produce a local heating, which alters the atmo-spheric conditions of the illuminated star. Such an effectis especially well seen in X-ray sources, particularly HZHer = Her X-1, which varies from an F-star to an A-starspectrum as one moves from the unilluminated side to thesubstellar point facing the X-ray source.

Page 4: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

80 Binary Stars

2. Photospheric Nonuniformities: Starspots

A similar phenomenon has been noted in the RS CVn stars,where it is caused by the presence of large-scale, activemagnetic regions, called starsports, on the stellar surfaces.Unlike reflection, these dark regions migrate with timethrough the light curve as the active regions move with thedifferential rotation of the stellar envelope, analogouslyto the motion of sunspots. Chemically peculiar magneticstars also show departures from simple eclipse profiles,because of locally cooler photospheric patches, but theseappear to be stable in placement on the stellar surface.

3. Circumstellar Material

The presence of disks or other circumstellar matter alsodistorts the light curves and can alter the radial velocityvariations as well. In Algol systems, this is especially im-portant. The timing of eclipses indicates a circular orbit,while the radial velocity variations are more like that ofa highly eccentric one. The explanation lies in the factthat here is considerable optical depth in the matter in theorbit, which results in the atomic lines producing a dis-tortion in the radial velocity variations. Many of the WSerpentis stars show this effect. It is most noticeable ineclipsing systems because these present the largest pathlength through material in the orbital plane. In some cases,atmospheric eclipses can also distort the lines because ofstellar winds and convection cells intercepted by the line ofsight. These motions, however, are generally small com-pared with the radial velocity and so alter the photometry(light-curve instabilities during eclipse are well markedin the ζ Aur stars) but do not seriously affect the radialvelocity determinations.

4. Ellipsoidal Distortions: Tidal Interaction

If the stars are close enough together, their mutual gravita-tional influences raise tides in the envelopes, distorting thephotospheres and producing a double sinusoidal continu-ous light variation outside of eclipse. Many of these sys-tems also suffer from reflection-effect distortion, so thereare many equivalent periods in these systems, dependingon whether or not they eclipse.

Departures from symmetric minima should accompanyexpansion of the stars within their tidal surfaces. As thephotosphere comes closer to the tidal-limiting radius, theRoche limit, the star becomes progressively more distortedfrom a symmetric ellipsoid and the photometric variationsbecome more like sinc curves. An additional feature isthat as the stars become larger relative to the Roche limitthey subtend a greater solid angle and display increasingreflection effect from the companion.

5. Limb Darkening

Stellar surfaces are not solid, and they have a continuousvariation in surface brightness as one nears the limb. Thiseffect, called limb darkening, is produced by the temper-ature gradient of the outer stellar atmosphere comparedwith the photospheres. The effect of limb darkening onlight curves is to produce a departure from the behaviorof simple, uniform spheres most notable in the softeningof the points of contact during eclipse. It is one of the bestways available for measuring the temperature gradients ofstellar atmospheres.

6. Apsidal Motion: Orbital Precession

The additional effect of the tidal distortion is that the starsare no longer simple point sources, but produce a perturba-tion on the mutual gravitational attraction. The departureof the gravitational potential from that of two point massesproduces apsidal motion, the slow precession of the lineconnecting the two stars. This rate depends on the de-gree of distortion of the two stars, which in turn providesa measure of the degree of central concentration of thestars. Such information is an important input for stellarevolutionary models. One system that has been especiallywell studied is α Virginis (Spica). An additional sourceof apsidal motion is the emission of angular momentumfrom the system, and the presence of a third body.

7. Third Light

Either because of circumstellar material in the orbitalplane, which is not eclipsed but which scatters light fromthe binary components, or because of the presence of afaint third body in the system that is unresolved, someadditional light may be present at a constant level in theeclipsing binary light curve. Frequently, high-resolutionspectroscopy is able to reveal the lines of the companionstar, as in Algol, but often it remains a problem to figureout the source for the nonphotospheric contributions to thelight curve. This is simply added as an offset in the deter-minations of eclipse properties in most methods of light-curve analysis. Such emission may also arise from shocksin accretion disks and from intrinsic disk self-luminosity.

III. CLASSIFICATION OF CLOSEBINARY SYSTEMS

There are several distinctive classes of binary stars, dis-tinguished nominally by their prototypes, usually the firstobserved or best known example of the phenomenology.In several cases, however, overlaps in the properties of the

Page 5: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary Stars 81

various systems make the prototypical separation confus-ing and less useful. Two main features distinguish classesof stars: the masses of the components and the separations.Alternatively, the period of the binary and the evolution-ary status of the components are useful, and we shall usethese alternately as needed.

The broad distinction among various binaries is whetherthe stars are physically separated by sufficient distance thatthe photospheres are distinct, in which case they are calleddetached, or have been significantly tidally distorted andmay be in the process of mass transfer in some form.This latter class divides into those that have only one startransferring mass, the semidetached, and those with bothstars immersed in a common envelope of gas that is mu-tually exchanged, the contact systems. This classification,first developed by Kuiper, has proven to be a most gen-eral taxonomic tool for distinguishing the various physicalprocesses that occur at different stages in the evolution ofclose binaries. It is most important to note that a binary,depending on its initial period, mass, and mass ratio maypass through any or all of these stages at some time in itslife. This is due to the expansion of stellar envelopes asstars evolve.

A. The Roche Surface

The key element in binary star evolution is the role of theRoche limit, which was first introduced in the three-bodyproblem. It is known in the celestial mechanics literature asa zero-velocity surface, but we will treat it as the boundingequipotential for a self-gravitating star:

(r ) = −G M1

r1− G M2

r2− 1

2r2ω2 (6)

where ri = |r − xi | for masses located at positions x1 =M2a/M and x2 = M1a/M for a circular orbit with thestellar separation a. This is the potential in the corotatingframe with frequency ω. There are five equilibrium pointswhere the gradient of vanishes, three of which are alongthe line of centers. Two are peripheral to the masses andlie as the critical points along equipotentials that envelopeboth stars. These are saddle points for which particle tra-jectories are unconditionally unstable. Two other points,L4 and L5, lie diametrically opposite each other perpen-dicular to the line of centers. These are quasi-equilibriumpoints for which local orbits are possible because of theCoriolis acceleration. The Roche lobe is the equipotentialthat passes through L1, called the inner Lagrangian point,that lies between the masses along the line of centers. Massloss is inevitable for the star that is contacting its Rochelobe. As a star’s radius approaches the Roche surface, thebody becomes more distorted and eventually, when it con-tacts L1, the inner Lagrangian point, it loses mass to the

companion and possibly from the system. This actuallyoccurs before the photosphere reaches RRL in the absenceof magnetic fields or other constraints on the flow. Severalanalytic approximations have been derived for the radiusof the equivalent sphere whose volume equals that of theappropriate lobe. In general, the Roche radius depends onthe mass of the components and q through RRL = f (q)a,where f (q) is provided by functional approximate fits tothe exact calculations. Two compact, though restricted,approximations are

f (q) = 0.2 + 0.38 log q (0.5 ≤ q ≤ 20)

(7)f (q) = 0.462

(q

1 + q

)1/3

(0 ≤ q ≤ 0.5)

Binary systems are theoretically distinguished by thesizes of the components relative to their respective Rochelobes, although a system in its lifetime may pass throughseveral or none of these, depending on the separation ofthe stars and their masses. Stars that are in mutual contactwith, or overflowing, their critical equipotential surfacesare called common envelope or contact systems. Whenonly one star’s radius equals its Roche radius, the sys-tem is called semidetached. Finally, if both stars are sep-arate, however much they may be distorted, the binary isdetached. Geometrical methods for treating light curveshave been developed based on this idea, each treating theshape of the star and the distribution of temperature overthe surface more or less phenomenologically (by scalingmodel atmospheres to the local conditions on tidal ellip-soids or Roche surfaces and piecing together the surface).The tidal distortion is also important for the internal struc-ture of the stars and must be handled in a more detailedway than the axisymmetric case, but similar problems arisenonetheless.

The Roche surface is the star’s response to the tide raisedby its companion. There are two sorts of tides. One is theequilibrium tidal surface that is instantaneously in hydro-static equilibrium everywhere in the star. The baroclinicityof the surfaces, as in the rotationally distorted case, inducesslow circulation that ultimately redistributes angular mo-mentum as well as energy. This produces circularizationby loss of orbital angular momentum through viscosity andis most efficient for stars with deep convective envelopesbecause the turbulence acts like an effective viscosity. Theother is a dynamical tide that acts like a nonradial pulsa-tion of the envelope and produces faster internal flows,internal mixing, and redistribution of angular momentum.In the Earth–Moon system, the Moon rotates with thesame period as its orbit—that is, synchronously—whilethe Earth rotates more rapidly. Any point on Earth mustthen experience a periodic tidal acceleration since, in the

Page 6: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

82 Binary Stars

rotating frame, any locale on the planet is carried throughthe alternating extrema of the perturbing force. This slowlyalters the lunar orbit. The solid Earth is not distorted suffi-ciently to dissipate its rotational energy, hence the rotationonly slowly approaches the lunar orbital period throughtidal friction. The physical mechanism must be somethinglike this. Stars are fluid and fluids yield to shear. Hence, theinduced tidal distortion produces flows because the grav-itational potential develops along equipotential surfaces.These flows transport momentum in the rotating frame,leading toward solid body rotation and synchronism de-pending on the internal viscosity, the precise nature ofwhich is currently not known.

B. Evolutionary Stages of Starsin Close Binaries

On reaching the Roche surface, the stellar envelope is pre-sumed to become unbound, and mass loss or transfer is ini-tiated on a hydrodynamic timescale. The star is genericallyreferred to as the loser or donor, terms usually applied inthe case of mass transfer between the binary components.The companion is called the gainer, which implies someamount of accretion. The alternative—to call one star theprimary and the other the secondary—is based on the rel-ative contributions to the combined spectroscopic and/orphotometric properties and does not capture the physicalnature of the interaction. For detached systems on the mainsequence, these terms also describe the respective massesbut that correspondence breaks down once the stars begintheir ascent of the H–R diagram.

The taxonomic distinctions for the different evolution-ary cases are based on the stage at which the nomenclatureapplies. Case A occurs before the terminal main sequence,when the loser is still undergoing core hydrogen burn-ing. This is a slow nuclear stage, and not very sensitiveto the stellar mass. For mass transfer to occur requiresvery small orbital separation because of the small stellarradii and the time scale for stellar expansion is very slow.Case B occurs after hydrogen core exhaustion and dur-ing a relatively rapid stage of radial expansion, either inthe traverse across the Hertzsprung gap (hydrogen shellburning) or on first ascent of the giant branch but beforehelium core ignition. Case C is a late stage, when thestar has developed a helium core and is on or near the gi-ant branch or asymptotic giant branch (helium shell burn-ing). Two mass-transfer cases are distinguished, as well.In conservative mass transfer, the process can be studiedin a straightforward way because we neglect any net massor angular momentum losses. The Roche surface wouldrecede into the loser were it not for the increase in theseparation between the components that results from thechange in the mass ratio. The loser maintains contact with

RRL until contact is broken by expansion of the system andthereafter the more evolved star continues as if it were asingle star. In the event of mass loss from the gainer, theprocess of mass transfer may be reinitiated at some laterstage, but this is unlikely. Dropping these assumptions ofconstant binary mass and total angular momentum makesthe scenario more realistic but leads to a dizzying range ofphenomenological models, each marked by the adoptionof specific mechanisms for breaking constancy of one orboth of these quantities.

1. Common Envelope Evolution

Since the Roche surface represents the limit of a set ofbounding equipotentials, it is a surface along which flowscan occur but that a star can maintain as an equilibriumshape. A particularly important state is encountered bythe binary if the radius of the outer layers of both starsexceeds RRL , since matter does not have to catastrophi-cally flow toward either one from the other component.Instead, if both stars are in contact with L1, a low-speedcirculation can, in principle, be established because ofthe pressure reaction from the companion’s outer layers.The resulting optically thick common envelope should,for contact systems, behave as if the layer sits on top of avery strange equipotential surface, one where the surfacegravity depends on both colatitude and colongitude. Theouter bounding surface through which some mass is cer-tainly lost from the binary passes through either L2 andL3, but this is small compared with the flow that mustpass through L1 to maintain thermal balance. This is theobserved situation in the W UMa stars. Marked by contin-uous light variations, these stars appear to be surroundedby a common atmosphere, but they have otherwise stablecores that place them on the main sequence. Observa-tionally, although the stars have different masses (q ≈ 0.5and luminosities up to a factor of 10), their envelopeshave nearly uniform temperature requiring very efficientheat transport between the components while leaving thestellar interior otherwise unaffected. The coolest W UMastars must have common convective envelopes, where thetemperature gradient adjusts to the large variation in thesurface gravity due to the angular gradients in the equipo-tential. How this happens has been a controversial questionfor decades and remains an important unsolved problemin stellar hydrodynamics. The current consensus is thatthe envelopes are never precisely in thermal equilibriumand that the mass transfer fluctuates between the com-ponents. One way to picture this is that the mass-transferrate exceeds the thermal time-scale for the entire envelopewhich drives thermal oscillations that periodically overfillthe Roche lobe of the gainer, or rather the star that is theaccreter at that moment.

Page 7: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary Stars 83

Main sequence contact systems are only one example ofevolutionary stages where common envelopes occur. Anycircumstellar matter that completely engulfs the compan-ion and is optically thick is, in effect, a common envelope.Cataclysmic binaries are extremely close systems withperiods of less than 1 day and at least one degeneratecomponent. Their origin is linked in current hypothesesto a relatively late stage in the evolution of one of themore massive components, since there are no main se-quence progenitors with these characteristics. There aretwo obvious ways to form a white dwarf. One is to invokemagic and drive the envelope off during planetary neb-ula formation in the post-AGB phase. The other occursin a common envelope. If one star engulfs the other asit evolves, and this can happen for virtually any systemwith orbital periods of less than a few weeks on the mainsequence, the lower mass companion will find itself orbit-ing within a dense circumstellar environment produced bythe more massive star. Differential motion leads to heating,which scales as v3

orba−1 ∼ a−5/2, and transfer of angularmomentum between the lower mass component and theengulfing envelope. Consequently, if the heating is suf-ficient, the more massive star is stripped of its envelopewith the resulting loss of binding energy, and the com-panion spirals inward. The result is a white dwarf witha much less evolved companion in a very short periodsystem.

C. Some Prototype Subclassesof Binary Systems

In this section, we summarize some general properties ofimportant subclasses of close binary stars. Several of thesehave been discussed previously as well, in order to placethem in a more physical context.

1. W Ursa Majoris Systems

These are main sequence contact binaries. They are typ-ically low mass, of order 1 − 2M, with orbital periodsfrom about 2 hr to 1 day. The envelopes are distinguishedas being in either radiative or convective equilibrium. Thechief observational characteristics are that they show con-tinuously variable light curves and line profile variationsindicative of uniform temperature and surface brightnessover both stars, although the mass ratio ranges from 0.1 to1. Surface temperatures range from about 5500 to 8000 K.The lower mass systems are called W type, the higher masssystems are A type; the W-type envelopes are convective.Typical of this class are W UMa, TX Cnc, and DK Cyg.There may be massive analogs of this class, although thelight curves are more difficult to interpret.

2. RS Canes Venaticorum Starsand Active Binaries

Close binaries with periods less than 2 weeks induce syn-chronous rotation via tidal coupling on time scales of order108—109 yrs. For main sequence stars, this generally re-sults in slow rotation compared with that observed in sin-gle stars; for evolved stars, the opposite holds. The RS CVnand related stars show rapid rotation of a cool evolved starthat displays enhanced dynamo activity. These stars aremarked by exceptionally strong chromospheres and coro-nas, sometimes having ultraviolet (UV) and X-ray fluxesgreater than 103 times that observed in normal G–K giants(cool giants). The photometric behavior of these systemsis marked by the appearance of a dark wave (large activeregions) which migrates through the light curve toward adecreasing phase, suggestive of differential rotation. Theactive stars have deep convective envelopes. Several sub-giant systems, notably V471 Tau, have white dwarf com-panions, although most systems consist of detached sub-giant or giant primaries and main sequence secondaries.With the exception of HR 5110, most of these systems aredetached. Other representative members of this class areAR Lac, Z Her, and WW Dra. Typically, the mass ratiosare very near unity, although this may be a selection effect.

An analog class, the FK Com stars, shows many ofthe RS CVn characteristics, especially enhanced chromo-spheric and coronal activity and rapid rotation, but it ap-pears to be a class of single stars. The FK Com stars are ar-gued to have resulted from the common envelope phase ofan evolved system leading to accretion of the companion.

Both of these subclasses are especially notable as ra-dio sources, often displaying long-time-duration (days toweeks) flares with energy releases of some 107 times thatof the largest solar flares. The dMe stars are the low-massanalogs of these systems, although not all of these are bina-ries. It appears that the binarity is most critical in produc-ing more rapid than normal rotation, which is responsiblefor the enhanced dynamo activity.

3. z Aurigae Stars and Atmospheric Eclipses

These systems consist of hot, main sequence stars, typi-cally spectral type B, and highly evolved giants or super-giants with low surface temperatures (G or K giants). Forseveral eclipsing systems, notably ζ Aur, 31 Cyg, and 32Cyg, eclipses of the hot star can be observed through thegiant atmosphere. The B star thus acts like a probe throughthe atmosphere of the giant during eclipse, almost like aCAT scan. Observations of photometric and spectroscopicfluctuations during eclipse provide a unique opportunityfor studying turbulence in the envelopes of evolved stars.The systems are long period, although for several, notably

Page 8: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

84 Binary Stars

22 Vul, there is evidence for some interaction between thestars due to accretion of the giant wind by the main se-quence star in the form of an accretion wake. The mostextreme example of this subclass is ε Aur, a 27-yr-periodbinary with an unseen supergiant or hypergiant evolvedcool star accompanied by an early-type companion.

4. Algol Binaries

These are the classic semidetached systems. They aremarked by evidence for gas streams, distortions of theeclipse profiles due to instabilities in the circumstellarmaterial on the time scale of several orbits, and some-times enhanced radio and X-ray flaring activity of themore evolved star. For several systems, notably U Cep,the stream ejected from the giant hits the outer envelopeof the accreting star and spins it up to very high veloc-ity. For others, the stream circulates to form an accretiondisk about the companion, which is heated both by turbu-lent viscosity and the impact of the stream in its periph-ery. These systems typically show inverted mass ratios,in that the more evolved star has the lower mass. Theyhave masses ranging from about 1M each to greater than5M for the constituent stars. Among the best examplesof these systems are SX Cas, W Ser, and β Lyr.

5. Symbiotic Stars

These systems are so named because of the observationof strong emission lines from highly ionized species andcool absorption lines of neutral atoms and molecules inthe same spectrum. They consist of a highly evolved coolgiant or supergiant and either a main sequence, subdwarf,or collapsed companion. Several of the systems, notablyR Aqr, show pulsating primary stars with periods of hun-dreds of days. The orbital periods are typically quite long,of order one year; the mass ratios have not generally in-verted except in those systems where a white dwarf isestablished as a member. The ionizing source appears tobe an accretion disk about the companion star, fed by astellar wind and perhaps gas streams in the system. Radioand optical jets have been observed emanating from sev-eral systems, especially CH Cyg and R Aqr. Many of thephenomena observed in these systems are similar to thoseobserved in recurrent novae like RSOph TCrB and V3890Sgr, which have red giant companions. These systems alsodisplay unstable light curves, presumably attributable toinstabilities in the accretion disks. The most extreme ex-amples of this class, having the longest periods and themost evolved red components, are the VV Cep stars.

6. Cataclysmic Variables

These systems typically consist of low-mass main se-quence stars of less than 1M, and either white dwarf or

neutron star companions. They display outbursts of thenova type when flash nuclear reactions release sufficientenergy to expel the outer accreted layers off the surfaceof the collapsed star, or show unstable disks that appearto account for the dwarf novas. These systems will be dis-cussed later at greater length. Among the best examplesof this class are U Gem, SS Cyg, and OY Car for thedwarf novae; GK Per and DQ Her for the classical novae;and AM Her and CW 1103 + 254 for the magnetic whitedwarf accreters. The low-mass X-ray binaries share manyof the same characteristics without the extreme photomet-ric variability (for example, Cyg X-2 and Sco X-1).

IV. EVOLUTIONARY PROCESSESFOR STARS IN BINARY SYSTEMS

Normally, stars evolve from the main sequence, duringwhich time their energy generation is via core hydrogenfusion, to red giants, when the star is burning helium in itscore, at roughly constant mass. While stellar winds carryoff some material during the main sequence stages of mas-sive stars, most stars do not undergo serious alteration oftheir masses until the postgiant stages of their lives. Thisis only achieved when the escape velocity has been re-duced to such an extent that the star can impart sufficientmomentum to the outer atmosphere for a flow to be ini-tiated. Envelope expansion reduces the surface gravity ofthe star, so that the radiative acceleration due to the highluminosity of stars in the postgiant stage, or the extremeheating affected by envelope convection in the outer stel-lar atmosphere, provides the critical momentum input. Ina binary system, however, the star is no longer necessarilyfree to expand to any arbitrary radius. The presence of acompanion fixes the maximum radius at which matter canremain bound to a star.

Should the primary (that is, more massive) star in abinary have a radius that exceeds this value, it will developa cusp along the line of centers at the inner Lagrangianpoint, L1. Nuclear processes continue in the stellar interior,driving the increase in the envelope radius, so that eventhough the mass of the star is decreasing, the center ofmass of the system shifts toward the companion star andthe continued expansion of the primary causes the masstransfer to be maintained. Inexorably, the mass ratio willcontinue to increase until the star is so sufficiently strippedof matter that it becomes smaller than the instantaneousRoche lobe. At this point, the mass transfer stops.

The evolution of the system is determined by the frac-tion of the lost mass that is accreted by the second starand the fraction of both mass and angular momentum ofthe system that is lost through the outer contact surface.The loss of mass from one of the stars alters its surface

Page 9: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary Stars 85

composition as successive layers are peeled off. It is gener-ally assumed that the star will appear as nitrogen enhanced,because the outer layers of the CNO-burning shell will beexposed to view if enough of the envelope is removed.The OBN and WN stars are assumed to be the result ofsuch processing. In addition, the alteration of the mass ofthe star will produce a change in the behavior of turbu-lent convection in the envelope, although the details arepresently very uncertain. The enhancement of turbulentmixing should be responsible for exposing the effects ofnuclear processing of even deeper layers to view, but thishas yet to be fully explored.

The behavior of the mass loser with time is significantlysensitive to whether the envelope is convective or radiative,that is, to whether the primary mode of energy transport isby mass motions or photons. In turn, these are sensitive tothe temperature gradient. If the envelope is convective, thereduction in mass causes the envelope to expand. If radia-tive, the envelope will contract on mass being removed.Consequently, the mass transfer is unstable if the envelopeis convective, and the star will continue to dump mass ontothe companion until it becomes so reduced in mass thatits envelope turns radiative. The instability, first describedby Bath, may be responsible for the extreme mass-transferevents seen in symbiotic stars and may also be implicatedin some nova phenomena.

V. MASS TRANSFER AND MASSLOSS IN BINARIES

Mass transfer between components in a binary systemtakes place in two ways, by the formation of a stream or awind. Either can give rise to an accretion disk, dependingon the angular momentum of the accreting material. In theζ Aur systems and in most WR binaries, the accretion iswindlike. This also occurs in some high-mass X-rays bi-naries (HMXRB), notably Cir X-1. In these, the accretionradius is given by the gravitational capture radius for thewind, which varies as:

R ∼ M/ν2

wind (8)

where M is the mass of the accreting star and νwind is thewind velocity at the gainer. The formation of an accretionwake has been observed in several systems, notably the ζ

Aur systems. The wake is accompanied by a shock. Shouldthe star have a wind of its own, however, the material fromthe primary loser will be accelerated out of the systemalong an accretion cone, with little actually falling ontothe lower mass component. Such wind–wind interactionis observed in Wolf–Rayet systems, notably V444 Cyg,and is responsible for strong X-ray emission.

The formation of a stream is assured if the mass loss rateis low and the star losing mass is in contact with the Rochesurface. In this case, the L1 point in the binary acts tofunnel the mass into a narrow cone, which then transportsboth angular momentum and mass from the loser to thegainer.

The atmosphere of the mass losing star has a finite pres-sure, even though at the L1 point the gravitational accel-eration vanishes; thus, the mass loss becomes supersonicinterior to the throat formed by the equipotentials, anda stream of matter is created between the stars. The factthat the center of mass is not the same as the center offorce (that is, L1) means that the stream has an excessangular momentum when it is in the vicinity of the sec-ondary or mass gaining star. It thus forms an accretion diskaround the companion. However, if the ejection velocityis great enough, the size of the companion large enoughcompared with the separation of the stars, and the massratio small enough, the stream will impact the photosphereof the gainer. Instead of the formation of a stable disk, thestream is deflected by the stellar surface, after producingan impacting shock, with the consequence that the outerlayers of the gainer are sped up to nearly the local orbitalspeed, also called the Keplerian velocity:

νK = (G M2/R2)1/2 (9)

Some mass (the fraction is not well known) will also belost through the outer Lagrangian point, L3, on the rearside of the loser from the gainer along the line of centers.The mass of the system as a whole is therefore reduced.This means that the matter can also carry away angularmomentum from the system. If the reduced mass of thesystem is given by

µ = M1 M2/M (10)

where M is the total mass, then the angular momentum ofthe binary is

J = µa2ω = G2/3 M1 M2 M−1/3ω−4/3 (11)

where

1

µ= 1

M1+ 1

M2

is the reduced mass. The change in the angular momentumof the system then produces a period change

FJ = δM1

M1

(1 − 1

3

M1

M

)+ δM2

M2

(1 − 1

3

M2

M

)+ 4

3

δP

P

(12)

for a fraction FJ of angular momentum lost from the sys-tem and an amount of δM of mass lost. Notice that theperiod evolution is very sensitive to both the amount of

Page 10: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

86 Binary Stars

angular momentum lost and to the fraction of the masslost from M1, which is lost from the binary system.

The loss of angular momentum from the system is oneof the currently unknown physical properties of variousmodels. It is the most critical problem currently facingthose studying the long-term behavior of the mass trans-fer in binaries, since it is the controlling factor in theorbital evolution. Two classes of models have been pro-posed, those in which much of the angular momentum ofthe stream is stored in the accretion disk or in the spun-up stellar envelope of the gainer and those in which theJ is carried out of the system entirely. Magnetic fieldscan also act to transport angular momentum, and the de-gree of spin–orbit coupling between the components alsoaffects the overall system evolution. As a result, much ofthe detailed behavior of mass exchanging or semidetachedsystems is not yet fully understood.

VI. X-RAYSOURCESANDCATACLYSMICS

The presence of a collapsed component in the system altersmuch of the observable behavior of binaries. In particular,the signature of mass accretion onto a white dwarf, neutronstar, or black hole is X-ray emission. With the discoveryof binary X-ray sources in the late 1960s, following thelaunch of UHURU, the first all-sky X-ray survey satel-lite, the field has rapidly grown. Early observations wereinterpreted as accretion onto white dwarf stars, but thephysical details of the accretion processes onto specificcompact objects have been refined so that it is now possi-ble to distinguish many of the marks of a specific gainerby the observable behavior.

X-ray emission results from accretion onto a collapsedstar because of the depth of its gravitational potential well.As the infalling matter traverses the accretion disk, it heatsup because of collisions with rapidly revolving matter andradiates most of its energy away. If the disk is opticallythick, this radiation will appear at the surface of the ac-cretion disk as a local blackbody emitter at a temperaturecharacteristic of the local heating rate for the matter in thedisk. Since the material is slowly drifting inward, becauseof loss of angular momentum through viscosity-like inter-actions within the disk, the heating can be likened to thatresulting from a turbulent medium that is capable of ra-diating away kinetic energy gained from infall. The massdistribution is not radially uniform, and the vertical extentof the disk is determined by pressure equilibrium in the zdirection, so that the surface area and temperature vary asfunctions of distance from the central star. As a result, theflux merging from the surface and seen by a distant ob-server is an integrated one, summing up different regionsof the disk which have different temperatures. The emer-

gent spectrum of the material is not that of a blackbody,nor even very similar to a star. In general, it will appear tobe a power law distribution with frequency, looking non-thermal but in fact reflecting the run of temperature andpressure in the disk.

A. Accretion Disks Processes

If only one star is in contact with RRL , then secular masstransfer happens. That will now occupy our attention.Mass loss must occur whenever a star comes into con-tact with its Roche surface as a stream directed toward thecompanion. The net gravitational acceleration vanishes atL1. Gas in the envelope of a star whose radius equals RRL

therefore generates a pressure-driven acceleration that atL1 reaches the sound speed. The result is a highly super-sonic flow that launches toward the companion with thesound speed from the L1 point and is deflected by theCoriolis acceleration. The stream orbits the companionand collides with itself, again supersonically, forming anoblique shock that eventually deflects it into an orbit. Ul-timately, a disk is formed, the structure of which we nowtreat.

Mass lost through the L1 point generally carries angularmomentum, so it so cannot fall directly onto the gainer.Even if it were coming from the precise center of mass,the Coriolis acceleration in the corotating frame causesthe stream to deviate from radial infall. The condition fordisk formation is that the angular momentum is sufficientto send material into orbit around the gainer. If the angularmomentum is too small, the stream may directly impact thegainer. The result is local shock heating and a deflection ofthe stream by the gainer’s atmosphere. If a disk does form,this interaction point is moved out, but it is still present.The reason is that the stream, which is flowing superson-ically, cannot adjust its structure on the slower sonic timescale. As a result, it slams into the circulating materialand forms a standing shock. Since it is an oblique impact,this region refracts the shock and produces a contact, slip,discontinuity at the boundary.

Assuming hydrostatic equilibrium, the disk’s verticalstructure is governed by the tidal component of theacceleration:

1

p

d P

dz= −G M

r2

z

r= −v2

K

r2z (13)

The simplest estimate of the disk thickness comes fromassuming that it is vertically isothermal, so P = pc2

s , withcs being the sound speed. The density therefore displaysa gaussian vertical profile with a scale height:

z0 = cs

vKr (14)

Page 11: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary Stars 87

This is a thin disk since generally cs/vK 1, althoughbecause of the radial dependence of the angular velocity,the thickness increases with increasing distance from thecentral body.

For accretion to occur, the disk gas must lose angularmomentum and drift radially inward. If the material re-mains dissipationless in the disk, and cannot somehow loseangular momentum, it will continue to circulate and ac-cumulate, storing angular momentum that it gained frominfall by the stream. This may actually happen in somesystems, and it is important in the stability of the plane-tary ring systems observed in the solar system, but it willnot explain accretion. This is why much of the effort inaccretion disk theory goes into understanding the mecha-nisms that generate viscosity, which is parameterized bythe quantity η. How does this work? The shear in an ax-isymmetric flow has the explicit form

σrφ = ∂vφ

∂r− vφ

r= r

dr(15)

and, with representing the surface density, the viscoustorque is

τ = 1

r

∂r

[ηr2 dω

dr

](16)

For any fluid, the volumetric energy dissipation rate is

d E

dt∼ η

∫σrφTrφ dV (17)

where V is the volume and the stress, Trφ , is assumed to beproportional to the pressure through Trφ − = αP , where α

is a free parameter that is usually assumed to be a constantin space, although not necessarily a constant in time. Theviscosity is given by η ≈ α ∼ cs z0. This is the so-calledα-disk. With this parameterization, you see that η dependson the pressure which in turn connects the mass transferthrough the continuity equation to the vertical structure ofthe disk. For constant α, the heating also depends on thepressure.

These disks are thermally unstable if the mass accretionrate is changed over time and can react much like a pul-sating star. The local energy generation depends on α, andany local heating produces a decrease in because of theincrease of z0. This, in turn, leads to smaller optical depthsand subsequent cooling. The collapse of the disk verticallymeans the plane is “mass loaded” and this matter must beadvected inward by the torque. Therefore, in the coolingstate, or if α is sufficiently small, the mass transfer canbe large toward the gaining body. If, on the other hand,the efficiency of radiation is reduced or the cooling in thevertical direction is large, the disk can remain hot and themass will not effectively transfer to the inner region. Ifthe disk is optically thin, it can be treated as a circulat-

ing gas that immediately radiates any energy gained byits drift onto the gainer, and the emission should resultfrom an accretion shock that may develop at the surfaceof the mass-accepting star. However, for many systems,especially cataclysmic variables, the depth of the gravi-tational potential and the opacity of the disk conspire toproduce complex time dependence in the disk structure.The vertical temperature gradient governs the surface den-sity through the condition of hydrostatic equilibrium. Re-member that, even in the thin disk approximation, the diskthickness depends on T through the scale height. You cansee this by using as the dependent variable and rewritingthe vertical radiative diffusion equation as

F = −4acT 3

∂T

∂(18)

Since the opacity depends on both temperature and den-sity, the vertical structure of the disk feeds back into therate of mass accretion. This is the basis for the disk oscil-lations that are observed in nonlinear models for the disksof cataclysmic variables. These arise mainly because ofrecombination effects when the column density is high.The oscillations are the response to changes in the massinflow.

With all sources of dissipation included, the total grav-itational potential energy of each parcel of infalling gaswill ultimately be radiated, so that an estimate for the to-tal luminosity of the system is Lacc ∼ GMM/R, where R

is the stellar radius. The temperature should reach aboutTacc ∼ GM/κ R. Thus, for a compact body, there are twoimportant conclusions to be drawn from this exercise. Thefirst is that a WD, neutron star, or black hole in a binarysystem should produce emissions ranging from soft tovery hard X-rays. The second is that such a source caneat only so fast. The Eddington luminosity sets a limit onthe emission a star can support simultaneously in hydro-static and radiative equilibrium. For a black hole, whoseradius depends only on its mass, this means that M ∼ Mso the luminosity of such a source is a measure of its massaccretion rate. For less extreme objects, the luminosity de-pends on the stellar radius and mass, so that M ∼ R. Fora white dwarf, for instance, this means that M ∼ M−1/3.In other words, some of the mass will not wind up onthe star if it is accreted too quickly. Mass loss from thebinary may happen through jets driven from the disk. Itis possible to observationally detect accretion disks, forinstance in eclipsing binaries where the absorption linesfrom the environmental gas are seen in projection againstthe stellar photosphere. Emission lines from the circulat-ing material have characteristic double peaks separatedby the projected orbital velocity of the inner portions ofthe disk. Finally, the continuum emissions from a vis-cously heated disk show a power law with increasing flux

Page 12: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

88 Binary Stars

to shorter wavelengths, Fν ∼ ν1/3 for frequency ν. This isbecause the emission is weighted toward the inner regionby temperature but modified by the increased surface areafor the cooler outer parts of the flow.

1. Magnetorotational Instabilityand Viscosity in Disks

Since disks rotate differentially, even a weak magneticfield can have profound consequences for their structurebecause of dynamo action. The field can be amplified andcommunicates a torque over a large distance. This pro-duces a very strong instability that is inherent to all con-ducting media, the magnetorotational instability (MRI).Imagine that the disk is threaded by a weak magnetic fieldthat lies in the φ and z directions (neglecting the radialcomponent). The combined effects of buoyancy and ro-tation, which amplify any seed field with time, leads tothe instability. Even ignoring buoyancy, the disk is stillunstable unless

d

dr> 0 (19)

Even Keplerian motion fails this stability criterion! TheRayleigh criterion, that the angular momentum must in-crease outward, may be met by the circulation, yet, evenif the seed magnetic field is vanishingly small, it growsthrough shear and the MRI grows. Although it is not clearfrom our development how this leads to turbulence, theMRI clearly produces strong, growing fluctuations in thedensity and velocity that likely become turbulent. We haveneglected dissipation, but as soon as the velocity fluctua-tions grow large enough they will certainly drive heatingthroughout the disk by the strong viscous coupling. Wewill close our treatment here with a few additional re-marks. This instability is very general, and any ionizedshearing medium is likely to experience some form of theMRI. This includes galactic-scale disks as well as proto-stellar accretion disks.

B. Cataclysmic Variables and CompactObjects in Close Binaries

The cataclysmic variables are a heterogeneous type ofclose binary, unified phenomenologically by their propen-sity for “violent” behavior. What they all have in commonis that one of the components is a compact, degeneratestar—a white dwarf, neutron star, or black hole. The ar-ray of types is as dizzying as with any other area of as-tronomy. Disk-dominated systems, for which the compactstar (white dwarf) is overwhelmed with the accretion en-vironment, include the SU Uma stars, for which the diskproduces a complex light variation due to a superhump

that arises from the hot spot in the disk where the streamimpacts its periphery. The excess emission progresses inretrograde through the light curve (that is, it moves towardearlier phase with time) and the systems undergo occa-sional outbursts. These systems are some of the shortestperiod binaries known; WZ Sge has an orbital period ofonly about 1 hour. At least one nova, V1974 Cyg, ap-pears to also be a related system. The U Gem systems areeruptive disk systems that undergo nearly periodic out-bursts that obey a period–amplitude relation. Magneticwhite dwarfs are the gainers in the AM Her systems, acategory that also includes some novae such as V1500Cyg and DQ Her.

Cataclysmics binaries with neutron stars include mostlow-mass X-ray binaries (LMXRB) such as Cyg X-2 andSco X-1. These systems have low-mass (red dwarf) com-panions and resemble the white dwarf cataclysmics inmany respects. A more massive system, Her X-1 = HZHer, has a late A loser and a magnetized neutron stargainer. High-mass X-ray binaries (HMXRB) have moremassive (OB) losers, such as Vela X-1, 4U 0900-17, andφ Per. The primary is sometimes a Be star, meaning that itshows either a strong wind or evidence for an extended cir-cumstellar environment. Cataclysmics with white dwarfgainers include dwarf nova systems such as WZ Sge, UGem, and SS Cyg. Classical and some recurrent novae fallinto this category, as well, having cool low-mass loserssupplying the mass.

C. Novae and X-Ray Bursts:Surface Nuclear Explosions

Classical novae are optically detected by swift increasesin brightness of order 10 magnitudes from quiescence tomaximum in a mere few days and decays on time scales ofbetween weeks and months. They eject shells with massesbetween 10−7M and 10−4M at speeds between about1000 to 10,000 km sec−1. The speed class is distinguishedby t2 or t3, the time required for the optical light curve tofall between 2 and 3 magnitudes from maximum. Suchsystems do not show repeated outbursts. If they do, ontime scales of decades to centuries, they are classified asrecurrents. Novae are all binary stars in which the gaineris a white dwarf. The loser can be a compact star; anM-type star that can be either a main sequence star orone that is hydrogen poor, indicative of some strippingbefore or during the evolution of the system; or a giantthat is similar to the ones observed in symbiotic stars.No classical system is of the latter variety, but recurrents(V3890 Sgr, RS Oph, and T CrB) are. On the other hand,recurrents can overlap the properties of classical systems(LMC 1990 No. 2 and U Sco have compact companions).X-ray novae, which have much shorter optical duration

Page 13: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary Stars 89

and emit most strongly at high energy, arise on accretingneutron stars in systems that otherwise closely resemblelow mass X-ray binaries. Several are known to repeat,but on shorter recurrence intervals than recurrent novae.All show rapid rises in the optical and no opaque stageanalogous to classical novae.

Piling mass onto a white dwarf produces a very differentresponse than is seen for a normal star. The pressure in-creases by an amount P at the base of the accreted layerof mass M , depending on the star’s surface gravity:

P = G MM

4π R4(20)

so the temperature steadily increases by compression. Ata critical threshold, Pc ≈ 2 × 1019 dyne cm−2, the gasachieves the conditions that ignite nuclear reactions. De-pending on the white dwarf’s composition, these may bereactions on either carbon or heavier elements, but theseare details. The critical value of M depends on the massof the white dwarf, since its radius is mass dependentthrough R ∼ M−1/3. The scenario results in the scalingrelation M ∼ M−7/3, and the higher mass white dwarfdoes not have to accrete as much before the critical con-ditions are reached.

Nuclear reactions start at the base of the accreted layer,which is partially degenerate. The resulting release of en-ergy by the reactions does not increase the pressure al-though the temperature rises continually, resulting in athermonuclear runaway. You have seen this before, buthere it is happening at the surface of a star! The Fermienergy, εF , corresponds to temperatures of approximately108 K which can be reached as the nuclear source increasesin luminosity. The layer crosses the degeneracy thresholdwhen T ≥ εF/κ , and at this point the layer rapidly expandsand the reactions stop. In a stellar interior, at the time ofthe helium flash, the bulk of the mass has such a long ther-mal time scale that the core expands but without a largechange in the envelope. Not so here. The stellar radiusswells and a deep convection zone forms. The convectionis driven by the usual condition that, in order to transportthe flux being produced by nuclear reactions, which isthe Eddington luminosity, the temperature gradient mustbe superadiabatic. This turbulence drives deep mixing,which transports heavy elements into the burning zone,thereby providing fuel for the reactions and enhancing thenucleosynthesis. Products of the reactions, in particular13N and 15O, are β unstable and decay on time scales ofabout 100 sec, short compared with the sound travel timethrough the now bloated envelope of the star. When thesedecay, they typically release 10 MeV/nucleon, sufficientto heat the accreted layer and blow it off the star. The cru-cial condition is, however, that there must be a substantialproduction of these nuclei, and this requires that the ac-

creted gas be hydrogen rich and that the white dwarf mustbe overabundant in target CNO nuclei. Consequently, thelayer is ejected very quickly, with speeds exceeding theescape velocity from the white dwarf of the order of sev-eral thousand km sec−1. The rapid expansion is accom-panied by a dramatic increase in the optical brightness ofthe star, the rise indicating the onset of the classical novaoutburst. When this happens in a stellar core during thehelium flash, the overlying layers respond to the increasedenergy release on the thermal time scale but otherwise re-main essentially hydrostatic. In a nova, there is nothingto prevent the explosion from ejecting the outer envelope.Novae come in two distinct varieties based on ejecta abun-dance patterns, and these are the observational support forthe nucleosynthesis we have invoked to explain the explo-sion. Most novae show strongly enhanced CNO elementsand little else that has been changed. These appear to comefrom roughly 1-M white dwarfs, with progenitor massesin the range of a few solar masses.These pre-nova starsmust have been able to complete helium core burning butthen can have their envelopes removed through mass losscaused by the binary. Another group, the ONeMg subclass,show strongly enhanced abundances of Ne and heavier el-ements, for which higher core temperatures in the pre-novastar are required and must therefore come from a relativelyrarer progenitor. These more massive gainers are thoughtto be closer to the Chandrasekhar limit and therefore torepresent possible precursors to SN Ia events.

In current scenarios for type Ia supernova explosions,gainers at or near the mass limit for a white dwarf, calledthe Chandrasekhar mass, appear to be involved. One ofthe likely possibilities is not substantially different than anova explosion, with the key exception that the trigger isnot hydrogen accretion but carbon ignition. The tempera-ture sensitivity of this reaction is such that once it begins inthe white dwarf, the energy release produces an explosionrather than a fizzle. The debate is currently over whetherthe nuclear reaction proceeds as a deflagration wave oras an actual detonation. In the latter, the expansion of theshock and the local energy release power the continuedexpansion and subsequent burning. In the former, the pro-cess is subsonic and leads to sufficient energy release thatthe star disrupts by thermal conduction.

On a neutron star, the same fundamental process occursexcept that the degeneracy is far greater and therefore liftsonly after much a much higher luminosity is reached forthe nuclear source. This is the mechanism for X-ray bursts.Nucleosynthesis can proceed much further because of thehigher densities and temperatures, since the Fermi level is>109 K, compared with (1 − 3) × 108 K for the most mas-sive white dwarf in a nova system. Rapid proton captureprovides nuclear processing far beyond the iron peak, upto 95Mo, without disrupting the star. Very low mass loss is

Page 14: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

90 Binary Stars

expected, and the luminosities reach the Eddington valuefor a few seconds. Even less mass will be ejected than for awhite dwarf, approximately 10−11 M, and, consequently,the event is much faster than for a classical nova, secondscompared to days.

D. Black Holes in Binary Systems

The search for binary black hole candidates began inthe late 1960s with the (now obvious) suggestion byZel’dovich and Thorne that X-ray emission would be a sig-nal of accretion onto a compact object. Using only radialvelocities, a mass can be obtained for the unseen compo-nent, and if this exceeds the maximum possible for a stableneutron star, the only alternative must be a black hole. Thefirst such system, Cyg X-1, was discovered in 1972. Radialvelocity measurements of HD 269858, the optical coun-terpart, by Bolton and Webster and Murdin showed thatthis O star is a binary with a period of 5.6 days and a largemass function that exceeded reasonable upper limits for aneutron star companion. Since then, many such systemshave been detected, often accompanying the observationof X-ray nova outbursts. These are listed in Table I. TheX-ray signature includes a very hard source (with a powerlaw that extends to >100 keV) and, for the X-ray no-vae, strong variability at all wavelengths, often includinga nonthermal radio source and even relativistic jets.

A fundamental difference between a black hole asgainer and other compact objects is the inner boundarycondition. While a neutron star or white dwarf presents asurface onto which the mass must ultimately settle, a blackhole is an open flow. If the cooling time for the gas is longenough, specifically if the collision time becomes shortcompared to the freefall time at any point in the flow, the

TABLE I Properties of Some Binary Black Hole Candidatesa

System f (M) (M) P (days) Mx (M)

HDE 226868 = Cyg X-1 0.24 5.6 >7

GS2023+33 = V404 Cyg 6.08 6.46 10–14

G2000+25 5.0 0.34 6–18

H1705-25 4.9 0.52 5–8

J1655-40 3.24 2.62 6.5–7.8

A0620-00 3.0 0.32 6±3

GS 1124-68 3.01 0.43 4.5–6.2

GRO J0422+32 1.21 0.21 >4

4U 1543-47 0.22 1.12 2.7–7.5

LMC X-3 2.3 1.7 >7

a Data from Blandford, R., and Gehrels, N. (1999). Phys. Today 52(6),41; Charles, P. (1999). In “Observational Evidence for Black Holesin the Universe” (S. K. Chakrabarti, ed.), p. 279, Kluwer Academic,Dordrecht/Norwell, MA; Grindlay, J. E. (2000). In “Astrophysical Quan-tities,” 4th ed. (A. N. Cox, ed.), Springer–Verlag, Berlin.

radiative efficiency of the gas will drop. This is the expla-nation for the lack of emission from the innermost partsof the disk, a model called advection-dominated accretionflows. Since the cross-section for thermal bremsstrahlung(light emitted following thermalizing collisions betweenions and electrons) decreases with increasing particle ki-netic energy, the gas becomes progressively hotter fromcollisions but cannot remove the extra energy as it col-lapses through the Schwarzschild radius. This is a genericplasma process and happens regardless of the mass of theaccreting hole. In yet another way, binary systems becomelaboratories—they reveal processes that are also expectedto dominate the observable emission from black holes inactive galactic nuclei.

The total luminosity is determined by the rate of infall,since it is assumed that accretion is the primary source foremission of radiation, and the mean temperature is givenby the shock at the surface of the gainer. As an approximateestimator, the temperature can be obtained from the virialtheorem

Ts = GM

m pk R≈ 107 M

RK (21)

where in the latter equation M and R are in solar units.The luminosity is given by the rate of accretion, so that

L = GM

RM (22)

where M is the mass accretion rate. The maximum lumi-nosity that a source of radiation can emit without drivingsome of the material off by radiation pressure, the so-called Eddington limit, is given by

LEdd = 3 × 104 M

ML (23)

so that the accretion can only be stable as long as L ≤LEdd. Most mass estimates for the gainer in X-ray binariesare limited by this luminosity.

The temperature of the emergent spectrum can serveas a guide to the nature of the accreting star. The lowerthe temperature (that is, the softer the X-ray spectrum),the lower the M/R ratio. White dwarfs have M/R ≈ 102,and neutron stars and black holes have M/R ≥ 105 sothat, in general, the more collapsed the gainer, the higherthe temperature and the luminosity for the same rate ofmass accretion. The continuous spectrum from an ac-cretion disk can be approximated as a product of localemission processes. Dissipation depends on the shear andviscosity. If the disk is vertically geometrically thin andoptically thick, each annulus radiates locally like a black-body with a temperature that depends on the rate of shear.Since the luminosity due to viscous dissipation depends onthe shear, Sr,φ = r (dω/dr ), through Ldiss ∼−ηS2

rφ , where

Page 15: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

Binary Stars 91

η is the bulk viscosity, then the rate of radiation varies asLdiss ∼ r−3 for Keplerian disks (those with ω ∼ r−3/2). As-suming that the local temperature is given by the rateof radiative loss, T (r ) ∼ r−3/4. The inner region of thedisk radiates most of the energy but occupies a relativelysmaller area than the outer disk, so its area-weighted con-tribution to the emergent spectrum is reduced. The fre-quency of the spectral peak of any annulus depends di-rectly on the temperature so the inner portion of the diskcontributes to the higher frequency portion of the spec-trum. Integrating over the disk, the emergent spectrum isseen to be Fν ∼ ν1/3, where ν is the frequency and Fν is themonochromatic flux. The approximate power law depen-dence of the spectrum is an indication of the local natureof the emission process. More detailed models, includingthe effects of radiative transfer vertically through the disk,show that the emergent spectrum looks like a combinationof a power law and a very hot stellar atmosphere. The im-portance of this spectral dependence is that different partsof the spectrum probe processes in different portions ofthe disk.

One fundamental feature of accretion disks in closebinaries is that they are unstable over a wide range oftemperatures and mass accretion rates. The theory oftime-dependent accretion disks is still under development.However, it is clear that limit-cycle variations in tempera-ture, and therefore in luminosity, are possible for a varietyof disks. Especially applicable in cataclysmics such as SSCyg and other dwarf novae, the oscillations of accretiondisks are partly driven by the viscosity and its dependenceon the local physical properties of the disk. If the mass-transfer rate should increase, the matter will be stored inthe disk for some period of time, though the details arenot well understood. Angular momentum must be dissi-pated before the material can accrete onto the central star.This is true regardless of the nature of the mass gainer.Viscosity acts both to dissipate energy and to redistributeangular momentum within the disk. As material falls ontothe central body, the accretion produces a rise in tempera-ture and luminosity of the shock and also of the inner partof the disk. The variation of the high-frequency portionof the spectrum relative to the lower frequencies there-fore probes the rate of release of mass and energy in thedisk during a mass-transfer event. The applicability of thispicture to a wide variety of binary systems. Such as sym-biotics, dwarf novae, and Algols, ensures that work willcontinue on this important physical process.

Disk formation is also different in different types ofsystems. The gravitational potential for a white dwarf islower, so the circulation velocity of the material is alsolower. The efficiency of heating being reduced, the disksare cooler and the opacity is also higher. Consequently, thedisks are sensitive to small fluctuations in the local heating

rate and may be thermally unstable. This results in bothflickering and alteration of the vertical structure of the diskwith time. Also, the heating at the inner boundary of bothneutron stars and white dwarfs differs from that seen forblack holes, since there is actually no surface in the latteron which matter can accumulate. Consequently, neutronstars and white dwarfs are more sensitive to the accumu-lation of matter on their surfaces and can undergo flashnuclear processing, the initiation of nuclear fusion underlow-pressure, upper boundary conditions, which serves toblow matter off their surfaces on time scales correspond-ing to the rate of expansion of the outer layers because ofthe release of nuclear binding energy. In effect, these starsbehave like the cores of stars without the presence of theoverlying matter, while black holes, which may go throughsome slow nuclear processing of matter as it falls into thecentral region, cannot initiate flash processes. Thus, no-vas and dwarf novas appear to be due to non-black-holesystems.

The details of mass accretion can only be understood bydetailed modeling of the accretion disk processes. Thesedepend critically on the nature of the environment of thecompact object, because the presence of strong magneticfields appears to significantly alter the details of the accre-tion. A magnetic field exerts a pressure against the matterinfalling to the surface and can serve to channel the matterto the poles of the accreter. In addition, it alters the pres-sure and density conditions at the interaction region in theplane of the accretion disk, forming an extended layer atlarge distance from the central star and can thus lower theeffective temperature emerging from the shock region.

VII. FORMATION OF BINARY SYSTEMS

This is one of the major areas of study in stellar evolutiontheory, because at present, there are few good examplesof pre-main-sequence binary stars so the field remainsdominated by theoretical questions. Protostellar formationbegins with the collapse of a portion of an interstellarcloud, which proceeds to form a massive disk. Throughviscosity and interaction with the ambient magnetic field,the disk slowly dissipates its angular momentum. If thedisk forms additional self-gravitating fragments, they willcollide as they circulate and accrete to form more massivestructures. Models show that such disks are unstable to theformation of a few massive members, which then accreteunincorporated material and grow.

Classical results for stability analysis of rapidly rotat-ing homogeneous objects point to several possible alter-natives for the development of the core object. One is thecentral star in the disk, if it is still rapidly rotating, maydeform to a barlike shape, which can pinch off a low-mass

Page 16: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: ZBU Revised Pages

Encyclopedia of Physical Science and Technology EN002E-52 May 25, 2001 13:43

92 Binary Stars

component. The core may undergo spontaneous fissioninto fragments, which then evolve separately. Simulationsshow, however, that the fission scenario does not yieldnearly equal mass fragments. Such systems more likelyresult either from early protostar fragmentation and diskaccretion in the first stages of star formation or of coa-lescence of fragments during some intermediate stage ofdisk fragmentation before the cores begin to grow.

Binary star formation appears to be one of the avenuesby which collapsing clouds relieve themselves of excessangular momentum, replacing spin angular momentumwith orbital motion of the components. However, whilethe distribution of q may be a clue to the mechanism offormation, even this observational quantity is very poorlydetermined. The discovery of debris disks around severalintermediate-mass, main sequence stars, especially β Picand α Lyr = Vega, has fueled the speculation that planetarysystems may be an alternative to the formation of binarystars for some systems. Statistical studies show that radialvelocity variations are observed in many low-mass, solar-type stars, but the period and mass ratio distribution forthese systems is presently unknown.

VIII. CONCLUDING REMARKS

Binary stars form the cornerstone of the understanding ofstellar evolution. They are the one tool with which we candetermine masses and luminosities for stars independentof their distances. They present theaters in which manyfascinating hydrodynamic processes can be directly ob-served, and they serve as laboratories for the study of theeffects of mass transfer on the evolution of single stars. Bythe study of their orbital dynamics, we can probe the deepinternal density structures of the constituent stars. Finally,binary stars provide examples of some of the most strikingavailable departures from quiescent stellar evolution. The

development of space observation, using X-ray (EXOSAT,EINSTEIN, GINGA. ASCF, ROSAT, and Chandra) andultraviolet (IUE) satellites, has in the past decades, pro-vided much information about the energetics and dynam-ics of these systems. The coming decade, especially dueto observations made with the Hubble Space Telescope,NGSI, and SIRTF, promises to be a fruitful one for binarystar research.

SEE ALSO THE FOLLOWING ARTICLES

GALACTIC STRUCTURE AND EVOLUTION • NEUTRON

STARS • STAR CLUSTERS • STARS, MASSIVE • STELLAR

SPECTROSCOPY • STELLAR STRUCTURE AND EVOLUTION

• SUPERNOVAE

BIBLIOGRAPHY

Batten, A. (1973). “Binary and Multiple Star Systems,” Pergamon,Oxford.

de Loore, C., and Doom, C. (1992). “Structure and Evolution of Singleand Binary Stars,” Kluwer Academic, Dordrecht/Norwell, MA.

Frank, J., King, A. R., and Raine, D. (1992). “Accretion Power in Astro-physics,” 2nd ed., Cambridge Univ. Press, London.

Kenyon, S. J. (1986). “The Symbiotic Stars,” Cambridge Univ. Press,London and New York.

Popper, D. M., Ulrich, R. K., and Plavec, M., eds. (1980). “Close BinarySystems: IAU Symp. 88,” Reidel, Dordrecht, Netherlands.

Pringle, J. (1981). “Accretion disks in astrophysics,” Annu. Rev. Astron.Astrophys. 19, 137.

Shore, S. N., Livio, M., and van den Heuvel, E. P. J. (1994). “InteractingBinary Systems,” (H. Nussbaumer and A. Orr, eds.), Springer–Verlag,Berlin.

Shu, F., and Lubrow, S. (1981). “Mass, angular momentum, and energytransfer in close binary systems,” Annu. Rev. Astron. Astrophys. 19,277.

van der Kamp, P. (1986). “Dark Companions of Stars,” Reidel, Dordrecht,Netherlands.

Page 17: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic InflationEdward P. TryonHunter College and Graduate Center of The CityUniversity of New York

I. Einstein’s Theory of GravitationII. Cosmological PuzzlesIII. Geometry of the CosmosIV. Dynamics of the CosmosV. Contents, Age, and Future of the Cosmos

VI. Horizon and Flatness ProblemsVII. Unified Theories and Higgs Fields

VIII. Scenarios for Cosmic Inflation

GLOSSARY

Comoving Moving in unison with the overall expansionof the universe.

Cosmic microwave background (CMB) Emitted whenthe universe was ∼300,000 years old, this electromag-netic radiation fills all of space and contains detailedinformation about the universe at that early time.

Cosmic scale factor The distance between any two co-moving points is a constant fraction of the cosmic scalefactor a(t), which grows in unison with the cosmicexpansion. The finite volume of a closed universe isVu(t) = 2π2a3(t).

Cosmological constant Physical constant in the fieldequations of general relativity, corresponding to aneffective force that increases in proportion to dis-tance and is repulsive for positive . Its effects areequivalent to those of constant vacuum energy andpressure.

Critical density Curvature of space is positive, zero, or

negative according to whether the average density ofmass-energy is greater than, equal to, or less than acritical density determined by the Hubble parameterand cosmological constant.

Curvature constant A constant k that is +1, 0, or −1for positive, zero, or negative spatial curvature, respec-tively.

Doppler effect Change in observed frequency of any typeof wave due to motion of the source relative to the ob-server: motion of the source away from an observer de-creases the observed frequency by a factor determinedby the relative velocity.

False vacuum Metastable or unstable state predicted bygrand unified theories, in which space has pressureP and mass-energy density ρ related by P = −c2ρ;the gravitational effects mimic a positive cosmologicalconstant.

Field Type of variable representing quantities that may,in principle, be measured (or theoretically analyzed) atany point of space–time; hence, fields possess values

779

Page 18: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

780 Cosmic Inflation

(perhaps zero in some regions) at every point of space–time.

Higgs fields In unified theories, field quantities that, whenall are zero, have positive energy density and negativepressure giving rise to false vacuum. At least one Higgsfield must be nonzero in the state of minimum energy(true vacuum), which breaks an underlying symmetryof the theory.

Horizon distance Continually increasing distance be-yond which neither light nor any causal influence couldyet have traveled since the universe began.

Hubble parameter Ratio H of the recessional velocityof a comoving object to its distance from a comovingobserver; by Hubble’s law, this ratio has the same valuefor all such objects and observers. Its present value iscalled Hubble’s constant.

Inflaton fields In theories of elementary particles, thegeneric term for fields with a false vacuum state thatcould give rise to inflation; e.g., Higgs fields.

Riemannian geometry The geometry of any curvedspace or space-time that is approximately flat over shortdistances.

Robertson–Walker (RW) metric: In homogeneous,isotropic, cosmological models, describes space–timeand its evolution in terms of a curvature constant and(evolving) cosmic scale factor.

Second-rank tensor A 4 × 4 array (i.e., matrix) of quan-tities whose values depend, in a characteristic way, onthe reference frame in which they are evaluated.

Stress-energy tensor The source of gravity in generalrelativity, a second-rank tensor whose components in-clude the energy and momentum densities, pressure,and other stresses.

Work-energy theorem Any change in energy of a systemequals the work done on it; as a corollary, the energyof an isolated system is constant.

COSMIC INFLATION refers to a conjectured, early pe-riod of exponential growth, during which the universe isbelieved to have increased in linear dimensions by a fac-tor exceeding ∼1030. The inflationary period ended beforethe universe was ∼10−30 seconds old, after which cosmicevolution proceeded in accord with the standard big bangmodel. Inflation is based on the general theory of relativ-ity together with grand unified (and several other) theoriesof the elementary particles. The latter predict a peculiarstate called the false vacuum, which rendered gravitationrepulsive for a very early and brief time, thereby caus-ing inflation. Several long-standing cosmic puzzles areresolved by this conjecture.

Standard theories of inflation predict that our universe isspatially flat. This prediction is strikingly supported by re-

cent studies of the cosmic microwave background (CMB).Inflation also makes quantitative predictions about tiny de-viations from perfect uniformity at very early times, theslight concentrations of matter that later grew into galax-ies and larger structures. These predictions also appear tobe confirmed by recent, detailed studies of the CMB.

I. EINSTEIN’S THEORY OF GRAVITATION

In Newton’s theory of gravitation, mass is the source ofgravity and the resulting force is universally attractive.The empirical successes of Newton’s theory are numerousand familiar. All observations of gravitational phenom-ena prior to this century were consistent with Newtoniantheory, with one exception: a minor feature of Mercury’sorbit. Precise observations of the orbit dating from 1765revealed an advance of the perihelion amounting to 43 arcsec per century in excess of that which could be explainedby the perturbing effects of other planets. Efforts weremade to explain this discrepancy by postulating the exis-tence of small, perturbing masses in close orbit around thesun, and this type of explanation retained some plausibil-ity into the early part of the twentieth century. Increasinglyrefined observations, however, have ruled out such a pos-sibility: there is not sufficient mass in close orbit aroundthe sun to explain the discrepancy.

In 1907, Einstein published the first in a series of papersin which he sought to develop a relativistic theory of grav-itation. From 1913 onward, he identified the gravitationalfield with the metric tensor of Riemannian (i.e., curved)space–time. These efforts culminated in his general the-ory of relativity (GTR), which was summarized and pub-lished in its essentially final form in 1916. The scope andpower of the theory made possible the construction ofdetailed models for the entire universe, and Einstein pub-lished the first such model in 1917. This early model wassubsequently retracted, but the science of mathematicalcosmology had arrived.

The GTR reproduces the successful predictions ofNewtonian theory and furthermore explains the perihelionadvance of Mercury’s orbit. Measurement of the bendingof starlight in close passage by the sun during the solareclipse of 1919 resulted in further, dramatic confirmationof GTR, which has since been accepted as the standard the-ory of gravitation. Additional successes of GTR includecorrect predictions of gravitational time dilation and, be-ginning in 1967, the time required for radar echoes toreturn to Earth from Mercury and artificial satellites.

There are no macroscopic phenomena for which thepredictions of GTR are known (or suspected) to fail. Ef-forts to quantize GTR (i.e., to wed GTR with the principlesof quantum theory) have met with continuing frustration,

Page 19: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 781

primarily because the customary perturbative method forconstructing solutions to quantum equations yields diver-gent integrals whose infinities cannot be canceled againstone another (the quantized version of GTR is not renor-malizable). Hence there is widespread belief that micro-scopic gravitational phenomena involve one or more fea-tures differing from those of GTR in a quantized form.In the macroscopic realm, however, any acceptable the-ory of gravitation will have to reproduce the successfulpredictions of GTR. It is reasonable (though not strictlynecessary) to suppose that all the macroscopic predictionsof GTR, including those that remain untested, will be re-produced by any theory that may eventually supersede it.

For gravitational phenomena, the obvious borderlinebetween macroscopic and microscopic domains is pro-vided by the Planck length (LP):

LP ≡ (Gh/c3)1/2 = 4.0 × 10−33 cm (1)

where G is Newton’s constant of gravitation, h is Planck’sconstant, and c is the speed of light. Quantum effects ofgravity are expected to be significant over distances com-parable to or less than LP. According to the wave–particleduality of quantum theory, such short distances are probedonly by particles whose momenta exceed Planck’s con-stant divided by LP. Such particles are highly relativistic,so their momenta equal their kinetic energies divided byc. Setting their energies equal to a thermal value of orderkT (where k is Boltzmann’s constant and T the absolutetemperature), the Planck temperature (TP) is obtained:

TP ≡ ch/kLP = 3.6 × 1032 K (2)

where “K” denotes kelvins (degrees on the absolute scale).For temperatures comparable to or greater than TP, effectsof quantum gravity may be important.

The fundamental constants of quantum gravity can alsobe combined to form the Planck mass (MP):

MP ≡ (hc/G)1/2 = 5.5 × 10−5 g (3)

The Planck mass and length give rise to the Planck densityρP:

ρP ≡ MP/

L3P = 8.2 × 1092 g/cm3 (4)

For mass densities comparable to or greater than ρP, theeffects of quantum gravity are expected to be significant.(Definitions of LP, TP, MP, and ρP sometimes differ fromthose given here by factors of order unity, but only theorders of magnitude are important in most applications.)

According to the standard big bang model, the veryearly universe was filled with an extremely hot, densegas of elementary particles and radiation in (or near)thermal equilibrium. As the universe expanded, the tem-perature and density both decreased. Calculations indi-cate that the temperature fell below TP when the uni-

verse was ∼10−45 sec. old, and the density fell belowρP at ∼10−44 sec. Hence currently unknown effects ofquantum gravity may have been important during the first∼10−44 sec or so of cosmic evolution. For all later times,however, the temperature and density were low enoughthat GTR should provide an adequate description of grav-itation, and our theories of elementary particles togetherwith thermodynamics and other established branches ofphysics should be applicable in a straightforward way. Weshall assume this to be the case.

As remarked earlier, GTR identifies the gravitationalfield with a second-rank tensor, the metric tensor of curvedspace–time. The source of gravitation in GTR is thereforenot simply mass (nor mass-energy), but another second-rank tensor: the stress-energy tensor. Hence stresses con-tribute to the source of gravitation in GTR, the simplestexample of stress being pressure.

It is commonly said that gravitation is a universallyattractive force, but such a sweeping statement is notsupported by the field equations of GTR. It is true thatthe kinds of objects known (or believed) to constitutethe present universe attract each other gravitationally, butGTR contains another possibility: granted an extraordi-nary medium wherein pressure is negative and sufficientlylarge, the resulting gravitational field would be repulsive.This feature of GTR lies at the heart of cosmic inflation:it is conjectured that during a very early and brief stageof cosmic evolution, space was characterized by a statecalled a false vacuum with a large, negative pressure. Afalse vacuum would mimic the effects of a positive cosmo-logical constant in the field equations, as will be explainedin some detail in Section IV.G. Cosmic dimensions wouldhave increased at an exponential rate during this brief pe-riod, which is called the inflationary era.

The concept of negative pressure warrants a brief ex-planation. A medium characterized by positive pressure,such as a confined gas, has an innate tendency to expand.In contrast, a medium characterized by negative pressurehas an innate tendency to contract. For example, considera solid rubber ball whose surface has been pulled out-ward in all directions, stretching the rubber within. Moregenerally, for a medium characterized by pressure P , theinternal energy U contained within a variable volume V isgoverned by U = −PV for adiabatic expansion or con-traction (a result of the work-energy theorem, where Uand V denote small changes in U and V , respectively).If U increases when V increases, then P is negative.

The possibility of an early false vacuum with resultingnegative pressure and cosmic inflation was discovered in1979 by Alan H. Guth, while he was studying grand unifiedtheories (GUTs) of elementary particles and their interac-tions. (The relevant features of GUTs will be described inSection VII.C.) The resulting inflation is believed to have

Page 20: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

782 Cosmic Inflation

increased cosmic linear dimensions by a factor of ∼1030

in the first ∼10−30 sec, and probably by very much more.Over the two decades since Guth’s initial discovery,

it has been noted that inflation is possible in many dif-ferent versions of GUTs. Furthermore, superstring theoryand several other types of particle theory contain featuressimilar to those of GUTs, and give rise to comparableinflation. Inflation therefore encompasses a class of theo-ries, including at least fifty distinct versions that have beenproposed since Guth’s initial model.

All the theories of elementary particles that predict in-flation remain speculative, as does inflation itself. We shallsee, however, that inflation would explain several observedfeatures of the universe that had previously defied expla-nation. Inflation also makes a quantitative prediction aboutdeviations from perfect uniformity in the early universe,deviations that later gave rise to the clumping of matterinto galaxies and clusters of galaxies. This prediction hasrecently been strikingly confirmed by detailed studies ofthe cosmic background radiation. Empirical evidence foran early period of cosmic inflation is therefore quite sub-stantial, whatever the details of the underlying particletheory might prove to be.

In common with the initial model of Guth, the greatmajority of inflationary theories predict that the averagecurvature of space is (virtually) zero, corresponding to flatuniverses. Such theories are called “standard inflation.” Afew theories predict nonzero, negative (hyperbolic) spa-tial curvature, which, if small, cannot be ruled out bypresent observations. The latter theories are sometimescalled “open inflation” (a potentially misleading name, be-cause flat and hyperbolic universes are both infinite, andtraditionally have both been called “open”). Most predic-tions of standard and open inflation are virtually the same,however, so we shall only distinguish between them wherethey differ.

We shall next enumerate and briefly describe someheretofore unexplained features of the universe that maybe understood in terms of cosmic inflation, and then wewill proceed to a detailed presentation of the theory.

II. COSMOLOGICAL PUZZLES

A. The Horizon Problem

Both the special and general theories of relativity precludethe transmission of any causal agent at a speed faster thanlight. Accepting the evidence that our universe has a finiteage, each observer is surrounded by a particle horizon:no comoving particle beyond this horizon could have yetmade its presence known to us in any way, for there hasnot yet been sufficient time for light (or anything else)to have traveled the intervening distance. Regions of the

universe outside each others’ horizons are causally dis-connected, having had no opportunity to influence eachother. The distance to the horizon naturally increases withthe passage of time. In the standard big bang model, anorder-of-magnitude estimate for the horizon distance issimply the speed of light multiplied by the time that haselapsed since the cosmos began. (Modifications to thissimple estimate arise because the universe is expandingand because space–time is curved; see Section VI.A.)

Astronomical observations entail the reception and in-terpretation of electromagnetic radiation, that is, visiblelight and, in recent decades, radiation from nonvisibleparts of the spectrum such as radio waves, X-rays, andγ -rays. We shall refer to that portion of the cosmos fromwhich we are now receiving electromagnetic radiation asthe observable universe; it is that fraction of the wholewhich may be studied by contemporary astronomers.

The universe was virtually opaque to electromagneticradiation for times earlier than ∼300,000 yr (as will beexplained). Therefore, the present radius of the observ-able universe is less than the present horizon distance—∼300,000 ly less in the standard big bang model, or verymuch less if cosmic inflation occurred at the very earlytime characteristic of inflationary models. The cosmosis currently ∼14 billion years old, and the present ra-dius of the observable universe is ∼50 billion light-years.(Sources of light have been moving away from us whilethe light traveled toward us.) Approximately 1011 galax-ies lie within the observable universe. About 10% of thesegalaxies are found in groups called clusters, some of whichform larger groups called superclusters (i.e., clusters ofclusters).

When averaged over suitably large regions, the universeappears to be quite similar in all directions at all distances,i.e., isotropic and homogeneous. The distribution of matteris considerably less homogeneous than was supposed onlytwenty years ago, however. Over the last two decades,continually improving technology has made possible farmore detailed surveys of the sky, and surprising featureshave been discovered.

Over distances of roughly 100 million light-years, theconcentration of matter into galaxies, clusters, and su-perclusters is roughly a fractal pattern. No fractal pat-tern holds over larger distances, but new and larger struc-tures have become apparent. Clusters and superclustersare arrayed in filaments, sheets, and walls, with nearlyempty voids between them. Diameters of the voids typi-cally range from 600 to 900 million light-years. One sheetof galaxies, the “Great Wall,” is about 750 million light-years long, 250 million light-years wide, and 20 millionlight-years thick. The largest voids are nearly 3 billionlight-years across. In very rough terms, the distribution ofmatter resembles an irregular foam of soap bubbles, except

Page 21: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 783

that many of the voids are interconnected as in a sponge.No pattern is evident in the locations or sizes of the wallsand voids—their distribution appears to be random. In astatistical sense, the universe still appears to be isotropicand homogeneous.

More dramatic and precise evidence for isotropy is pro-vided by the cosmic background radiation, whose exis-tence and properties were anticipated in work on the earlyuniverse by George Gamow and collaborators in the late1940s. These investigators assumed that the early universewas very hot, in which case a calculable fraction of the pri-mordial protons and neutrons would have fused togetherto form several light species of nuclei [2H (deuterium),3He, 4He, and 7Li] during the first few minutes after thebig bang. (Heavier nuclei are believed to have formedlater, in the cores of stars and during stellar explosions.)The abundances predicted for these light nuclei agreedwith observations, which confirmed the theory of theirformation.

After this early period of nucleosynthesis, the universewas filled with a hot plasma (ionized gas) of light nu-clei and electrons, together with a thermal distributionof photons. When the universe had expanded and cooledfor ∼300,000 yr, the temperature dropped below 3000 K,which enabled the nuclei and electrons to combine andform electrically neutral atoms that were essentially trans-parent to radiation. (At earlier times the plasma was vir-tually opaque to electromagnetic radiation, because thephotons could travel only short distances before beingscattered by charged particles: the observable universe ex-tends to the distance where we now see it as it was at theage of ∼300,000 years. This is called the time of last scat-tering.) The thermal photons that existed when the atomsformed traveled freely through the resulting gas, and laterthrough the vast reaches of empty space between the starsand galaxies that eventually formed out of that gas. Thesephotons have survived to the present, and they bathe everyobserver who looks toward the heavens. (If a television setis tuned to a channel with no local station and the bright-ness control is turned to its lowest setting, ∼1% of theflecks of “snow” on the screen are a result of this cosmicbackground radiation.)

The photons now reaching earth have been travelingfor ∼14 billion yr and hence were emitted by distant partsof the universe receding from us with speeds near that oflight. The Doppler effect has redshifted these photons, butpreserved the blackbody distribution; hence the radiationremains characterized by a temperature.

Extending earlier work with Gamow, R. A. Alpher andR. C. Herman described the properties of this cosmic back-ground radiation in 1948, estimating its present temper-ature to be ∼5 K (with its peak in the microwave por-tion of the spectrum). No technology existed, however,

for observing such radiation at that time. In 1964, ArnoA. Penzias and Robert W. Wilson accidently discoveredthis radiation (at a single wavelength) with a large horn an-tenna built to observe the Echo telecommunications satel-lite. Subsequent observations by numerous workers es-tablished that the spectrum is indeed blackbody, with atemperature of 2.73 K. The present consensus in favor ofa hot big bang model resulted from this discovery of therelic radiation.

The cosmic background radiation is isotropic within∼one part in 105 (after the effect of earth’s velocity withrespect to distant sources is factored out). This high de-gree of isotropy becomes all the more striking when onerealizes that the radiation now reaching earth from op-posite directions in the sky was emitted from regions ofthe cosmos that were over 100 times farther apart thanthe horizon distance at the time of emission, assuming thehorizon distance was that implied by the standard big bangmodel. How could different regions of the cosmos thathad never been in causal contact have been so preciselysimilar?

Of course the theoretical description of any dynamicalsystem requires a stipulation of initial conditions; onemight simply postulate that the universe came into beingwith perfect isotropy as an initial condition. An initialcondition so special and precise begs for an explanation,however; this is called the horizon problem.

B. The Flatness Problem

In a sweeping extrapolation beyond the range of obser-vations, it is often assumed that the entire cosmos is ho-mogeneous and isotropic. This assumption is sometimescalled the Cosmological Principle, a name that is perhapsmisleading in suggesting greater sanctity for the assump-tion than is warranted either by evidence or by logic. (In-deed, current scenarios for cosmic inflation imply that theuniverse is distinctly different at some great distance be-yond the horizon, as will be explained in Section VIII.B.)The existence of a horizon implies, however, that the ob-servable universe has evolved in a manner independent ofvery distant regions. Hence the evolution of the observ-able universe should be explicable in terms of a model thatassumes the universe to be the same everywhere. In lightof this fact and the relative simplicity of models satisfyingthe Cosmological Principle, such models clearly warrantstudy.

In 1922, Alexandre Friedmann discovered three classesof solutions to the field equations of GTR that fulfillthe Cosmological Principle. These three classes exhaustthe possibilities (within the stated assumptions), as wasshown in the 1930s by H. P. Robertson and A. G. Walker.The three classes of Friedmann models (also called

Page 22: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

784 Cosmic Inflation

Friedmann–Robertson–Walker or Friedmann–Lemaıtremodels) are distinguished from one another by the cur-vature of space, which may be positive, zero, or nega-tive. The corresponding spatial geometries are spherical,flat, and hyperbolic, respectively. Einstein’s field equa-tions correlate the spatial geometry with the average den-sity ρ of mass-energy in the universe. There is a crit-ical density ρc such that spherical, flat, and hyperbolicgeometries correspond to ρ > ρc, ρ = ρc, and ρ < ρc, re-spectively. A spherical universe has finite volume (spacecurves back on itself), and is said to be “closed”; flat andhyperbolic universes have infinite volumes and are said tobe “open.”

Astronomical observations have revealed no depar-tures from Euclidean geometry on a cosmic scale.Furthermore, recent measurements of the cosmic mi-crowave background reveal space to be so nearly flat thatρ = (1.0 ± 0.1)ρc. This is a striking feature of our uni-verse, given that the ratio ρ/ρc need only lie in the rangefrom zero to infinity. The ratio can be arbitrarily largein closed universes, which furthermore can be arbitrarilysmall and short-lived. The ratio can be arbitrarily close tozero in hyperbolic universes, with arbitrarily little matter.Flat universes are quite special, having the highest possi-ble value for ρ/ρc of any infinite space. Our universe isremarkably close to flat, and the question of why is calledthe flatness problem (first pointed out, in detail, by RobertH. Dicke and P. James E. Peebles in 1979).

C. The Smoothness Problem

Although the observable universe is homogeneous on alarge scale, it is definitely lumpy on smaller scales: thematter is concentrated into stars, galaxies, and clusters ofgalaxies. The horizon problem concerns the establishmentof large-scale homogeneity, whereas the smoothness prob-lem concerns the formation of galaxies and the chain ofevents that resulted in the observed sizes of galaxies. Thata problem exists may be seen in the following way.

A homogeneous distribution of matter is rendered un-stable to clumping by gravitational attraction. Any chanceconcentration of matter will attract surrounding matter toit in a process that feeds on itself. This phenomenon wasfirst analyzed by Sir James Jeans, and is called the Jeansinstability. If the universe had evolved in accord with thestandard big bang model since the earliest moment thatwe presume to understand (i.e., since the universe was10−44 sec old), then the inevitable thermal fluctuations indensity at that early time would have resulted in clumps ofmatter very much larger than the galaxies, clusters, and su-perclusters that we observe. This was first noted by P. J. E.Peebles in 1968, and the discrepancy is called the smooth-ness problem.

D. Other Problems

In addition to the questions or problems thus far described,there are others of a more speculative nature that maybe resolved by cosmic inflation. We shall enumerate andbriefly explain them here. The first is a problem that arisesin the context of GUTs.

Grand unified theories predict that a large number ofmagnetic monopoles (isolated north or south magneticpoles) would have been produced at a very early time(when the universe was ∼10−35 sec old). If the universehad expanded in the simple way envisioned by the standardbig bang model, then the present abundance of monopoleswould be vastly greater than is consistent with observa-tions. (Despite extensive searches, we have no persua-sive evidence that any monopoles exist.) This is called themonopole problem (see Sections VII.C and VIII.A).

The remaining questions and problems that we shallmention concern the earliest moment(s) of the cosmos. Inthe standard big bang model, one assumes that the universebegan in an extremely dense, hot state that was expand-ing rapidly. In the absence of quantum effects, the density,temperature, and pressure would have been infinite at timezero, a seemingly incomprehensible situation sometimesreferred to as the singularity problem. Currently unknowneffects of quantum gravity might somehow have renderedfinite these otherwise infinite quantities at time zero. Itwould still be natural to wonder, however, why the uni-verse came into being in a state of rapid expansion, andwhy it was intensely hot.

The early expansion was not a result of the high tem-perature and resulting pressure, for it can be shown thata universe born stationary but with arbitrarily high (posi-tive) pressure would have begun immediately to contractunder the force of gravity (see Section IV.D). We knowthat any initial temperature would decrease as the uni-verse expands, but the equations of GTR are consistentwith an expanding universe having no local random mo-tion: the temperature could have been absolute zero forall time. Why was the early universe hot, and why was itexpanding?

A final question lies at the very borderline of science,but has recently become a subject of scientific specula-tion and even detailed model-building: How and why didthe big bang occur? Is it possible to understand, in sci-entific terms, the creation of a universe ex nihilo (fromnothing)?

The theory of cosmic inflation plays a significant rolein contemporary discussions of all the aforementionedquestions and/or problems, appearing to solve some whileholding promise for resolving the others. Some readersmay wish to understand the underlying description of cos-mic geometry and evolution provided by GTR. Sections

Page 23: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 785

III and IV develop the relevant features of GTR in consid-erable detail, using algebra and, occasionally, some basicelements of calculus. Section V describes the content, age,and future of the universe. Section VI explains the hori-zon and flatness problems in detail. Most of the equationsare explained in prose in the accompanying text, so that areader with a limited background in mathematics shouldnevertheless be able to grasp the main ideas. Alternatively,one might choose to skip these sections (especially in afirst reading) and proceed directly to Section VII on uni-fied theories and Higgs fields, without loss of continuity orbroad understanding of cosmic inflation (the glossary maybe consulted for the few terms introduced in the omittedsections).

III. GEOMETRY OF THE COSMOS

A. Metric Tensors

General relativity presumes that special relativity is validto arbitrarily great precision over sufficiently small re-gions of space–time. GTR differs from special relativity,however, in allowing for space–time to be curved overextended regions. (Curved space-time is called Rieman-nian, in honor of the mathematician G. F. B. Riemannwho, in 1854, made a seminal contribution to the the-ory of curved spaces.) Another difference is that whilegravitation had been regarded as merely one of severalforces in the context of special relativity, it is conceptuallynot a force at all in GTR. Instead, all effects of gravita-tion are imbedded in, and expressed by, the curvature ofspace–time.

In special relativity, there is an invariant interval s12

separating events at t1, r1 and t2, r2 given by

(s12)2 = −c2(t1 − t2)2 + (x1 − x2)2 + (y1 − y2)2

+ (z1 − z2)2 (5)

where x , y, and z denote Cartesian components of theposition vector r. If t1 = t2, |s12| is the distance between r1

and r2. If r1 = r2, then |s12|/c is the time interval betweent1 and t2. [Some scientists define the overall sign of (s12)2

to be opposite from that chosen here, but the physics isnot affected so long as consistency is maintained.]

The remarkable relationship between space and time inspecial relativity is expressed by the fact that the algebraicform of s12 does not change under Lorentz transforma-tions, which express the space and time coordinates in amoving frame as linear combinations of space and time co-ordinates in the original frame. This mixing of space withtime implies a dependence of space and time on the refer-ence frame of the observer, which underlies the counter-

intuitive predictions of special relativity. The flatness ofthe space-time of special relativity is implied by the exis-tence of coordinates t and r in terms of which s12 has the al-gebraic form of Eq. (5) throughout the space-time (calledMinkowski space–time, after Hermann Minkowski).

The most familiar example of a curved space is the sur-face of a sphere. It is well-known that the surface of asphere cannot be covered with a coordinate grid whereinall intersections between grid lines occur at right angles(the lines of longitude on earth are instructive). Similarly, acurved space–time of arbitrary (nonzero) curvature cannotbe spanned by any rectangular set of coordinates such asthose appearing in Eq. (5). One can, however, cover anysmall portion of a spherical surface with approximatelyrectangular coordinates, and the rectangular approxima-tion can be made arbitrarily precise by choosing the re-gion covered to be sufficiently small. In a like manner,any small region of a curved space–time can be spannedby rectangular coordinates, with the origin placed at thecenter of the region in question. The use of such coordi-nates (called locally flat or locally inertial) enables oneto evaluate any infinitesimal interval ds between nearbyspace–time points by application of Eq. (5).

To span an extended region of a curved space–time witha single set of coordinates, it is necessary that the coor-dinate grid be curved. Since one has no a priori knowl-edge of how space–time is curved, the equations of GTRare typically expressed in terms of arbitrary coordinatesxµ (µ = 0, 1, 2, and 3). The central object of study be-comes the metric tensor gµν , defined implicitly by

(ds)2 = gµν dxµ dxν (6)

together with the assumption that space–time is locallyMinkowskian (and subject to an axiomatic proviso thatgµν is symmetric, i.e., gµν = gνµ).

In principle, one can determine gµν at any given pointby using Eq. (5) with locally flat coordinates to deter-mine infinitesimal intervals about that point, and then re-quiring the components of gµν to be such that Eq. (6)yields the same results in terms of infinitesimal changesdxµ in the arbitrary coordinates xµ. The values so deter-mined for the components of gµν will clearly depend onthe choice of coordinates xµ, but it can be shown that thevariation of gµν over an extended region also determines(or is determined by) the curvature of space–time. It is cus-tomary to express models for the geometry of the cosmosas models for the metric tensor, and we shall do so here.

B. The Robertson–Walker Metric

Assuming the universe to be homogeneous and isotropic,coordinates can be chosen such that

Page 24: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

786 Cosmic Inflation

(ds)2 = −c2(dt)2 + a2(t)

(dr )2

1 − kr2

+ r2[(dθ )2 + sin2 θ (dϕ)2]

(7)

as was shown in 1935 by H. P. Robertson and, indepen-dently, by A. G. Walker. The ds of Eq. (7) is called theRobertson–Walker (RW) line element, and the correspond-ing gµν (with coordinates xµ identified as ct , r , θ , and ϕ)is called the Robertson–Walker metric. The cosmologicalsolutions to GTR discovered by Friedmann in 1922 are ho-mogeneous and isotropic, and may therefore be describedin terms of the RW metric.

The spatial coordinates appearing in the RW line el-ement are comoving; that is, r , θ , and ϕ have constantvalues for any galaxy or other object moving in unisonwith the overall cosmic expansion (or contraction). Thust corresponds to proper (ordinary clock) time for comov-ing objects. The radial coordinate r and the constant k arecustomarily chosen to be dimensionless, in which casethe cosmic scale factor a(t) has the dimension of length.The coordinates θ and ϕ are the usual polar and azimuthalangles of spherical coordinates. Cosmic evolution in timeis governed by the behavior of a(t). [Some authors de-note the cosmic scale factor by R(t), but we shall re-serve the symbol R for the curvature scalar appearing inSection IV.A.]

The constant k may (in principle) be any real number. Ifk = 0, then three-dimensional space for any single value oft is Euclidean, in which case neither a(t) nor r is uniquelydefined: their product, however, equals the familiar radialcoordinate of flat-space spherical coordinates. If k = 0, wemay define

r ≡ |k|1/2r and a(t) ≡ a(t)/|k|

in terms of which the RW line element becomes

(ds)2 = −c2(dt)2 + a2(t)

(dr )2

1 ∓ r2

+ r2[(dθ )2 + sin2 θ (dϕ)2]

(8)

where the denominator with ambiguous sign correspondsto 1 − (k/|k|)r2. Note that Eq. (8) is precisely Eq. (7) witha = a, r = r , k = ± 1. Hence there is no loss of generalityin restricting k to the three values 1, 0, and −1 in Eq. (7),and we shall henceforth do so (as is customary). Thesevalues for k correspond to positive, zero, and negativespatial curvature, respectively.

C. The Curvature of Space

If k = ±1, then the spatial part of the RW line element dif-fers appreciably from the Euclidean case for objects withr near unity, so that the kr2 term cannot be neglected. Theproper (comoving meter-stick) distance to objects withr near unity is of order a(t). Hence a(t) is roughly thedistance over which the curvature of space becomes im-portant. This point may be illustrated with the followingexamples.

The proper distance rP(t) to a comoving object withradial coordinate r is

rp(t) = a(t)∫ r

0

dr ′

(1 − kr ′2)1/2(9)

The integral appearing in Eq. (9) is elementary, beingsin−1 r for k = +1, r for k = 0, and sinh−1r for k = −1.Thus the radial coordinate is related to proper distanceand the scale factor by r = sin(rp/a) for k = +1, rp/a fork = 0, and sinh(rp/a) for k = −1.

Two objects with the same radial coordinate but sepa-rated from each other by a small angle θ are separatedby a proper distance given by Eq. (7) as

s = arθ (10)

For k = 0, r = rp/a and we obtain the Euclidean results = rpθ . For k = +1 (or −1), s differs from theEuclidean result by the extent to which sin(rp/a) [orsinh(rp/a)] differs from rp/a. Series expansions yield

s = rpθ

1 − k

6(rp/a)2 +

[(rp/a)4

](11)

where “[x4]” denotes additional terms containing 4th andhigher powers of “x” that are negligible when x is smallcompared to unity. Equation (11) displays the leading-order departure from Euclidean geometry for k = ±1, andit also confirms that this departure becomes important onlywhen rp is an appreciable fraction of the cosmic scale fac-tor. Standard inflation predicts the scale factor to be enor-mously greater than the radius of the observable universe,in which case no curvature of space would be detectableeven if k = 0. Open inflation could produce a wide rangeof values for the scale factor, with no upper limit to therange.

Equation (7) implies that the proper area A of a sphericalsurface with radial coordinate r is

A = 4πa2r2 (12)

and that the proper volume V contained within such asphere is

V = 4πa3∫ r

0dr ′ r ′2

(1 − kr ′2)1/2(13)

Page 25: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 787

The integral appearing in Eq. (13) is elementary for allthree vlaues of k, but will not be recorded here.

When k = 0, Eqs. (9), (12), and (13) reproduce the fa-miliar Euclidean relations between A, V , and the properradius rp. For k = ±1, series expansions of A and V interms of rp yield

A = 4πr2p

1 − k

3(rp/a)2 +

[(rp/a)4

](14)

V = 4π

3r3

p

1 − k

5(rp/a)2 +

[(rp/a)4

](15)

For any given rp, the surface area and volume of a sphereare smaller than their Euclidean values when k = +1, andlarger than Euclidean when k = −1.

The number of galaxies within any large volume ofspace is a measure of its proper volume (provided the dis-tribution of galaxies is truly uniform). By counting galax-ies within spheres of differing proper radii (which mustalso be measured), it is possible in principle to determinewhether proper volume increases less rapidly than r3

p , atthe same rate, or more rapidly; hence whether k = +1, 0,or −1. If k = 0, detailed comparison of observations withEq. (15) would enable one to determine the present valueof the cosmic scale factor.

Other schemes also exist for comparing observationswith theory to determine the geometry and, if k = 0, thescale factor of the cosmos. Thus far, however, observationshave revealed no deviation from flatness. Furthermore, re-cent precise measurements of the cosmic microwave back-ground indicate that space is flat or, if curved, has a scalefactor at least as large as the radius of the observable uni-verse (∼50 billion light-years, explained in Section VI.A).

For k = 0 and k = −1, the coordinate r spans the en-tire space. This cannot be true for k = +1, however, be-cause (1 − kr2) is negative for r > 1 . To understand thissituation, a familiar analog in Euclidean space may behelpful. Consider the surface of a sphere of radius R,e.g., earth’s surface. Let r , θ , and ϕ be standard spher-ical coordinates, with r = 0 at Earth’s center and θ = 0at the north pole. Longitudinal angle then correspondsto ϕ. Line elements on the surface can be described byR dθ and R sin θ dϕ, or by Rdρ/

√1 − ρ2 and Rρ dϕ,

where ρ = sin θ . In the latter case, the radial line ele-ment is singular at ρ = 1, which corresponds to the equa-tor at θ = π/2. This singularity reflects the fact that alllines of constant longitude are parallel at the equator. Italso limits the range of ρ to 0 ≤ ρ ≤ π/2, the northernhemisphere.

The Robertson-Walker metric for k = 1 is singular atr = 1 because radial lines of constant θ and ϕ becomeparallel at r = 1. In fact this three-dimensional space isprecisely the 3-D “surface” of a hypersphere of radius a(t)

in 4-D Euclidean space. With r = 0 at the “north pole,” the“equator” lies at r = 1, so the coordinate r only describesthe “northern hemisphere.” A simple coordinate for de-scribing the entire 3-D space is u ≡ sin−1 r , which equalsπ/2 at the “equator” and π at the “south pole.” The analogsof Eqs. (7)–(15) may readily be found by substituting sin ufor r , wherever r appears. The entire space is then spannedby a finite range of u, namely 0 ≤ u ≤ π .

Such a universe is said to be spherical and closed,and a(t) is called the radius of the universe. Thoughlacking any boundary, a spherical universe is finite. Theproper circumference is 2πa, and the proper volume isV (t) = 2π2a3(t).

In a curved space, lines as straight as possible are calledgeodesics. Such lines are straight with respect to locallyflat regions of space surrounding each segment, but curvedover longer distances. In spaces described by a Robertson-Walker metric, lines with constant θ and ϕ are geodesics(though by no means the only ones). On earth’s surface,the geodesics are great circles, e.g., lines of constant lon-gitude. Note that adjacent lines of longitude are parallel atthe equator, intersect at both poles, and are parallel againat the equator on the opposite side. A spherical universe isvery similar. Geodesics (e.g., rays of light) that are parallelin some small region will intersect at a distance of one-fourth the circumference, in either direction, and becomeparallel again at a distance of half the circumference. Ifone follows a geodesic for a distance equal to the circum-ference, one returns to the starting point, from the oppositedirection.

For k = 0 and k = −1, space is infinite in linear ex-tent and volume; such universes are said to be open. Asremarked earlier, a universe with k = 0 has zero spatialcurvature for any single value of t ; such a universe issaid to be Euclidean or flat. The RW metric with k = 0does not, however, correspond to the flat space–time ofMinkowski. The RW coordinates are comoving with aspherically symmetric expansion (or contraction), so thatthe RW time variable is measured by clocks that are mov-ing relative to one another. Special relativity indicates thatsuch clocks measure time differently from the standardclocks of Minkowski space–time. In particular, a set ofcomoving clocks does not run synchronously with anyset of clocks at rest relative to one another. Furthermore,the measure of distance given by the RW metric is thatof comoving meter sticks, and it is known from specialrelativity that moving meter sticks do not give the sameresults for distance as stationary ones.

Since both time and distance are measured differentlyin comoving coordinates, the question arises whetherthe differences compensate for each other to reproduceMinkowski space–time. For k = 0, the answer is negative:such a space–time can be shown to be curved, provided

Page 26: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

788 Cosmic Inflation

the scale factor is changing in time so that comoving in-struments actually are moving.

If k = −1, space has negative curvature: geodesics (e.g.,light rays) that are parallel nearby diverge from each otheras one follows them away. An imperfect analogy is thesurface of a saddle; two “straight” lines that are locallyparallel in the seat of the saddle will diverge from eachother as one follows them toward the front or back of thesaddle. (The analogy is imperfect because the curvatureof space is presumed to be the same everywhere, unlikethe curvature of a saddle, which varies from one region toanother.) A universe with k = −1 is called hyperbolic.

D. Hubble’s Law

It is evident from Eq. (9) that the proper distance rp to acomoving object depends on time only through a multi-plicative factor of a(t). It follows that

rp/rp = a/a ≡ H (t) (16)

rp = Hrp (17)

where dots over the symbols denote differentiation withrespect to time, for example, f ≡ d f/dt . Thus r p equalsthe velocity of a comoving object relative to the earth (as-suming that earth itself is a comoving object), being posi-tive for recession or negative for approach. An importantproviso, however, arises from the fact that the concept ofvelocity relative to earth becomes problematic if the objectis so distant that curvature of the intervening space–timeis significant. Also, rp is (hypothetically) defined in termsof comoving meter sticks laid end to end. All of thesecomoving meter sticks, except the closest one, are mov-ing relative to earth, and are therefore perceived by us tobe shortened because of the familiar length contraction ofspecial relativity. Thus rp deviates from distance in the fa-miliar sense for objects receding with velocities near thatof light. For these reasons, rp only corresponds to velocityin a familiar sense if rp is less than the distance over whichspace–time curves appreciably and rp is appreciably lessthan c.

A noteworthy feature of Eq. (17) is that rp exceeds cfor objects with rp > c/H . There is no violation of thebasic principle that c is a limiting velocity, however. Asjust noted, rp only corresponds to velocity when smallcompared to c. Furthermore, a correct statement of theprinciple fo limiting velocity is that no object may pass byanother object with a speed greater than c. In Minkowskispace–time the principle may be extended to the relativevelocity between distant objects, but no such extension todistant objects is possible in a curved space–time.

Equations (16) and (17) state that if a = 0, then all co-moving objects are receding from (or approaching) earth

with velocities that are proportional to their distances fromus, subject to the provisos stated above. One of the mostdirect points of contact between theoretical formalismand observation resides in the fact that, aside from lo-cal random motions of limited magnitude, the galaxiesconstituting the visible matter of our universe are reced-ing from us with velocities proportional to their distances.This fact was first reported by Edwin Hubble in 1929,based on determinations of distance (later revised) madewith the 100-in. telescope that was completed in 1918on Mt. Wilson in southwest California. Velocities of thegalaxies in Hubble’s sample had been determined by V.M. Slipher, utilizing the redshift of spectral lines causedby the Doppler effect.

The function H (t) defined by Eq. (16) is called theHubble parameter. Its present value, which we shall denoteby H0, is traditionally called Hubble’s constant (a mis-nomer, since the “present” value changes with time, albeitvery slowly by human standards). Equation (17), govern-ing recessional velocities, is known as Hubble’s law.

The value of H0 is central to a quantitative understand-ing of our universe. Assuming that redshifts of spectrallines result entirely from the Doppler effect, it is straight-forward to determine the recessional velocities of distantgalaxies. Determinations of distance, however, are muchmore challenging.

Most methods for measuring distances beyond theMilky Way are based on the appearance from earth of ob-jects whose intrinsic properties are (believed to be) known.Intrinsic luminosity is the simplest property used in thisway. Apparent brightness decreases at increasing distance,so the distance to any (visible) object of known lumi-nosity is readily determined. Objects of known intrinsicluminosity are called “standard candles,” a term that issometimes applied more generally to any kind of distanceindicator.

Given a set of identical objects, the relevant propertiesof (at least) one of them must be measured in order to usethem as standard candles. This can be a challenge. Forexample, the brightest galaxies in rich clusters are prob-ably similar, representing an upper limit to galactic size.Since they are luminous enough to be visible at great dis-tances (out to ∼1010 ly), they would be powerful standardcandles. Determining their intrinsic luminosity (i.e., cali-brating them) was difficult, however, because the nearestrich cluster is (now known to be) ∼5 × 107 light-yearsaway (the Virgo cluster, containing ∼2500 galaxies). Thisdistance is much too great for measurement by trigono-metric parallax, which is ineffective beyond a few thou-sand light-years. Also note that even when calibrated, thebrightest galaxies are useless for measuring distances lessthat ∼5 × 107 light-years, the distance to the nearest richcluster.

Page 27: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 789

In order to calibrate sets of standard candles spanninga wide range of distances, a cosmic distance ladder hasbeen constructed. Different sets of standard candles formsuccessive rungs of this ladder, with intrinsic luminosityincreasing at each step upward. The bottom rung is basedon parallax and other types of stellar motion. The resultingdistances have been used to determine the intrinsic lumi-nosities of main-sequence stars of various spectral types,which form the second rung. Every rung above the first iscalibrated by using standard candles of the rung immedi-ately below it, in a bootstrap fashion. Since any errors incalibration are compounded as one moves up the ladder,it is clearly desirable to minimize the number of rungsrequired to reach the top.

A key role has been played by variable stars (thethird rung), among which Cepheids are the brightest andmost important. Cepheids are supergiant, pulsating stars,whose size and brightness oscillate with periods rangingfrom (roughly) 1 to 100 days, depending on the star. In1912, a discovery of historic importance was published byHenrietta S. Leavitt, who had been observing 25 Cepheidsin the Small Magellenic Cloud (a small galaxy quite nearthe Milky Way).

Leavitt reported that the periods and apparent bright-nesses of these stars were strongly correlated, beingroughly proportional. Knowing that all of these stars wereabout the same distance from earth, she concluded thatthe intrinsic luminosities of Cepheids are roughly propor-tional to their periods. She was unable to determine theconstant of proportionality, however, because no Cepheidswere close enough for the distance to be measured byparallax.

Over the next decade, indirect and laborious methodswere used by others to estimate the distances to several ofthe nearest Cepheids. Combined with the work of Leavitt,a period-luminosity relation was thereby established, en-abling one to infer the distance to any Cepheid from itsperiod and apparent brightness. Only a small fraction ofstars are Cepheids, but every galaxy contains a great manyof them. When Cepheids were observed (by Hubble) inAndromeda in 1923, the period-luminosity relation re-vealed that Andromeda lies far outside the Milky Way.This was the first actual proof that other galaxies exist.

Until recently, telescopes were only powerful enoughto observe Cepheids out to ∼8 million light-years. At thisdistance (and considerably beyond), random motions ofgalaxies obscure the systematic expansion described byHubble’s law. Two more rungs were required in the dis-tance ladder in order to reach distances where motionis governed by the Hubble flow. (This partly explainswhy Hubble’s original value for H0 was too large by∼700%.) Different observers favored different choicesfor the higher rungs, and made different corrections for

a host of complicating factors. During the two decadesprior to launching of the Hubble Space Telescope (HST)in 1990, values for H0 ranging from 50 to 100 km/sec/Mpcwere published (where Mpc denotes megaparsec, with1 Mpc = 3.26 × 106 ly = 3.09 × 1019 km). The reporteduncertainties were typically near ±10km/sec/Mpc, muchtoo small to explain the wide range of values reported.Unknown systematic errors were clearly present.

The initially flawed mirror of the HST was correctedin 1993, and the reliability of distance indicators hasimproved substantially since then. Our knowledge ofCepheids, already good, is now even better. Far more im-portant, the HST has identified Cepheids in galaxies outto ∼80 million light-years, nearly10 times farther thanhad previously been achieved. This tenfold increase inthe range of Cepheids has placed them next to the high-est rung of the distance ladder. Every standard candle ofgreater luminosity and range has now been calibrated withCepheids, substantially improving their reliabilities.

Ground-based astronomy has also become much morepowerful over the last decade. Several new and very largetelescopes have been built, with more sensitive light de-tectors (charge-coupled devices), and with computer soft-ware that minimizes atmospheric distortions of the im-ages. Large areas of the sky are surveyed and, when objectsof special interest are identified, the HST is also trainedon them. Collaboration of this kind has made possible thecalibration of a new set of standard candles, supernovaeof type Ia (SNe Ia, or SN Ia when singular).

Type SNe Ia occur in binary star systems containing awhite dwarf and a bloated, nearby giant. The dwarf grad-ually accretes matter from its neighbor until a new burstof nuclear fusion occurs, with explosive force. The peakluminosity equals that of an entire galaxy, and is nearlythe same for all SNe Ia. Furthermore, the rate at which theluminosity decays is correlated with its value in a knownway, permitting a measurement of distance within ±6%for each SN Ia. This precision exceeds that of any otherstandard candle.

Because of their enormous luminosities, SNe Ia canbe studied out to very great distances (at least 12 billionlight-years with present technology). In 1997, observa-tions of SNe Ia strongly suggested that our universe isexpanding at an increasing rate, hence that > 0 (where denotes the cosmological constant). Subsequent studieshave bolstered this conclusion. An accelerating expansionis the most dramatic discovery in observational cosmol-ogy since the microwave background (1964), and perhapssince Hubble’s discovery of the cosmic expansion (1929).

Many calculations in cosmology depend on the value ofH0 , which still contains a range of uncertainty (primarilydue to systematic errors, implied by the fact that differentstandard candles yield somewhat different results). It is

Page 28: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

790 Cosmic Inflation

therefore useful (and customary) to characterize H0 by adimensionless number h0 (or simply h), of order unity,defined by

H0 = h0 × 100 km/sec/Mpc (18)

An HST Key Project team led by Wendy L. Freedman,Robert C. Kennicutt, Jr., and Jeremy R. Mould has usednine different secondary indicators whose results, whencombined, yield h0 = 0.70 ± 0.07. Type SNe Ia were oneof their nine choices and, by themselves, led to h0 = 0.68±0.05. The preceding values for h0 would be reduced by∼0.05 if a standard (but debatable) adjustment were madeto correct for differences in metallicity of the Cepheidsused as primary indicators.

As previously suggested, there are many devils in thedetails of distance measurements, and highly respectedobservers have dealt with them in different ways. For sev-eral decades Allan Sandage, with Gustav A. Tammannand other collaborators, has consistently obtained resultsfor h0 between 0.5 and 0.6. Using SNe Ia as secondaryindicators, their most recent result is h0 = 0.585 ± 0.063.Several other teams have also obtained h0 from SNe Ia,with results spanning the range from 0.58 to 0.68.

In light of the preceding results and other studies far toonumerous to mention, it is widely believed that h0 lies inthe range

h0 = 0.65 ± 0.10 (19)

Before leaving this section on cosmic geometry, we notethat a space–time may contain one or more finite regionsthat are internally homogeneous and isotropic while differ-ing from the exterior. Within such a homogeneous region,space–time can be described by an RW metric (which,however, tells one nothing about the exterior region). Thistype of space–time is in fact predicted by theories of cos-mic inflation; the homogeneity and isotropy of the observ-able universe are not believed characteristic of the whole.

IV. DYNAMICS OF THE COSMOS

A. Einstein’s Field Equations

A theory of cosmic evolution requires dynamical equa-tions, taken here to be the field equations of GTR:

Rµν − 12 gµν R + gµν = 8πGc−4Tµν (20)

where Rµν denotes the Ricci tensor (contracted Riemanncurvature tensor), R denotes the curvature scalar (con-tracted Ricci tensor), denotes the cosmological con-stant, and Tµν denotes the stress-energy tensor of all formsof matter and energy excluding gravity. The Ricci tensorand curvature scalar are determined by gµν together withits first and second partial derivatives, so that Eq. (20) is

a set of second-order (nonlinear) partial differential equa-tions for gµν . As with all differential equations, initialand/or boundary conditions must be assumed in order tospecify a unique solution. (A definition of the Riemanncurvature tensor lies beyond the scope of this article. Weremark, however, that some scientists define it with an op-posite sign convention, in which case Rµν and R changesigns and appear in the field equations with opposite signsfrom those above.)

All tensors appearing in the field equations are symmet-ric, so there are 10 independent equations in the absenceof simplifying constraints. We are assuming space to behomogeneous and isotropic, however, which assures thevalidity of the RW line element. The RW metric containsonly one unknown function, namely the cosmic scale fac-tor a(t), so we expect the number of independent fieldequations to be considerably reduced for this case.

Since the field equations link the geometry of space–time with the stress-energy tensor, consistency requiresthat we approximate Tµν by a homogeneous, isotropicform. Thus we imagine all matter and energy to besmoothed out into a homogeneous, isotropic fluid (or gas)of dust and radiation that moves in unison with the cos-mic expansion. Denoting the proper mass-energy densityby ρ (in units of mass/volume) and the pressure by P , theresulting stress-energy tensor has the form characteristicof a “perfect fluid”:

Tµν = (ρ + P/c2)UµUν + Pgµν (21)

where Uµ denotes the covariant four-velocity of a comov-ing point: Uµ = gµν dxν/dτ , with τ denoting proper time(dτ = |ds|/c).

Under the simplifying assumptions stated above, onlytwo of the 10 field equations are independent. They aredifferential equations for the cosmic scale factor and maybe stated as(

a

a

)2

= 8πG

3ρ + c2

3− k

(c

a

)2

(22)

a

a= −4πG

3

(ρ + 3P

c2

)+ c2

3(23)

where a and a denote first and second derivatives, re-spectively, of a(t) with respect to time. Note that the leftside of Eq. (22) is precisely H 2, the square of the Hubbleparameter.

B. Local Energy Conservation

A remarkable feature of the field equations is that the co-variant divergence (the curved–space-time generalizationof Minkowski four-divergence) of the left side is identi-cally zero (Bianchi identities). Hence the field equations

Page 29: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 791

imply that over a region of space–time small enough to beflat, the Minkowski four-divergence of the stress-energytensor is zero. A standard mathematical argument thenleads to the conclusion that energy and momentum areconserved (locally, which encompasses the domain of ex-periments known to manifest such conservation). Hencewithin the context of GTR, energy and momentum conser-vation are not separate principles to be adduced, but ratherare consequences of the field equations to the full extentthat conservation is indicated by experiment or theory.

If a physical system is spatially localized in a space–time that is flat (i.e., Minkowskian) at spatial infinity, thenthe total energy and momentum for the system can bedefined and their constancy in time proven. The condi-tion of localization in asymptotically flat space–time isnot met, however, by any homogeneous universe, for sucha universe lacks a boundary and cannot be regarded as alocalized system. The net energy and momentum of anunbounded universe defy meaningful definition, whichprecludes any statement that they are conserved. (Fromanother point of view, energy is defined as the ability todo work, but an unbounded universe lacks any externalsystem upon which the amount of work might be defined.)

In the present context, local momentum conservation isassured by the presumed homogeneity and isotropy of thecosmic expansion. Local energy conservation, however,implies a relation between ρ and P , which may be stated as

d

dt(ρa3) = − P

c2

d

dt(a3) (24)

[That Eq. (24) follows from the field equations may be seenby solving Eq. (22) for ρ and differentiating with respectto time. The result involves a, which may be eliminated byusing Eq. (23) to obtain a relation equivalent to Eq. (24).]

To see the connection between Eq. (24) and energyconservation, consider a changing spherical volume Vbounded by a comoving surface. Equation (13) indicatesthat V is the product of a3 and a coefficient that is inde-pendent of time. Multiplying both sides of Eq. (24) by thatcoefficient and c2 yields

d

dt(ρc2V ) = −P

dV

dt(25)

The Einstein mass-energy relation implies that ρc2 equalsthe energy per unit volume, so ρc2V equals the energyU internal to the volume V . Hence Eq. (25) is simply thework-energy theorem as applied to the adiabatic expansion(or contraction) of a gas or fluid: U = −PV . Sinceenergy changes only to the extent that work is done, (local)energy conservation is implicit in this result.

By inverting the reasoning that led to Eq. (24), it can beshown that the second of the field equations follows fromthe first and from energy conservation; that is, Eq. (23)

follows from Eqs. (22) and (24). [To reason from Eqs. (23)and (24) to (22) is also possible; k appears as a constant ofintegration.] Hence evolution of the cosmic scale factor isgoverned by Eqs. (22) and (24), together with an equationof state that specifies the relation between ρ and P .

C. Equation for the Hubble Parameter

The present expansion of the universe will continue forhowever long H remains positive. Since H ≡ a/a, Eq. (22)may be written as

H 2 = 8πG

3ρ + c2

3− k

(c

a

)2

(26)

The density ρ is inherently positive. If ≥ 0 (as seemscertain), then H > 0 for all time if k ≤ 0. Open universes,flat or hyperbolic, therefore expand forever. A closed, fi-nite universe may (or may not) expand forever, dependingon the relation between its total mass Mu and (as willbe discussed).

If P = 0, Eq. (24) implies thatρ varies inversely with thecube of a(t), which simply corresponds to a given amountof matter becoming diluted as the universe expands:

P = 0: ρ = ρ0

(a0

a

)3

(27)

where ρ0 and a0 denote present values.The density ρ has been dominated by matter under

negligible pressure (i.e., P ρc2) since the universe was∼106 years old, so Eq. (27) has been a good approxima-tion since that early time. Recalling again that H ≡ a/a,Eqs. (26) and (27) imply

a2 = 8πGρ0a30

3a+ c2a2

3− kc2 (28)

The scale factor a(t) increases as the universe expands.Open universes expand forever, and their scale factorsincrease without limit. If = 0, Eq. (28) implies thata2 → −kc2 as t → ∞. From Eqs. (9) and (16), we see thatevery comoving object has r p → 0 in a flat (k = 0) uni-verse, while r p → c × sinh−1 r in a hyperbolic (k = −1)universe.

If > 0 in an open universe, then a/a → c√

/3 asa → ∞. In this limit,

a = a (with ≡ c√

/3) (29)

The solution is

a(t) = a(tin) exp[(t − tin)] (30)

where tin denotes any initial time that may be convenient.Equation (29) will be satisfied to good approximation aftersome minimum time tmin has passed, so a(t) will be givenfor all t ≥ tmin by Eq. (30), with tin = tmin.

Page 30: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

792 Cosmic Inflation

Equation (30) implies an exponential rate of growth forrp in the final stage of expansion, with a doubling time tdgiven by

td = ln 2

c

√3

(31)

Equation (29) then implies exponential growth for r p aswell, with the same doubling time.

Next we consider the closed, k = +1 case in somedetail, where a(t) is called the radius of the universe.The total mass of such a universe is finite and given byMu = ρVu = ρ(2π2a3). Assuming zero pressure for sim-plicity, Mu is constant, and Eq. (26) implies

H 2 = 4GMu

3πa3+ c2

3− c2

a2(32)

If = 0, H will shrink to zero when the radius a(t)reaches a maximum value

amax = 4GMu

3πc2(33)

The time required to reach amax can be calculated, and isgiven by

t(amax) = 2GMu

3c3(34)

It is remarkable that both amax and the time required toreach it are determined by Mu . The dynamics and geom-etry are so interwoven that the rate of expansion does notenter as an independent parameter.

After reaching amax, the universe will contract (H < 0)at the same rate as it had expanded. Equation (23) ensuresthat the scale factor will not remain at amax, and Eq. (28)implies that the magnitude of H is determined by the valueof a(t).

If the universe is closed and > 0, a more careful anal-ysis is required. A detailed study of Eq. (32) reveals thatthe present expansion will continue forever if

≥(

πc2

2GMu

)2

(35)

If the inequality holds, the size of the universe will in-crease without limit. As for the open case, a/a → c

√/3

as a → ∞. After some time t0 has passed, Eq. (30) willdescribe a(t), and the final stage of expansion will be thesame as for the open case. Both rp and r p will grow atexponential rates, with a doubling time given by Eq. (31).

If the equality in Eq. (35) holds, then the universe willexpand forever but approach a maximum, limiting sizeamax as t → ∞, with amax = 2GMu/πc2 (exactly 3/2 aslarge as for = 0.) If the inequality (35) is violated (small), then the universe will reach a maximum size in a finitetime, after which it will contract (as for the special casewhere = 0).

We note in passing that despite the name hyperbolic,a completely empty universe with = 0 and k = −1has precisely the flat space-time of Minkowski. EmptyMinkowski space-time is homogeneous and isotropic,and therefore must correspond to an RW metric. Withρ = = 0, Eq. (28) implies that k = −1 [and that a = c, inwhich case Eq. (23) implies that P = 0, as does Eq. (24).]The difference in appearance between the Minkowski met-ric and the equivalent RW metric results from a fact notedearlier, namely that RW coordinates are comoving. Theclocks and meter sticks corresponding to the RW metricare moving with respect to each other, so they yield resultsfor time and distance that differ from those ordinarily usedin a description of Minkowski space-time.

To make explicit the equivalence between empty, hy-perbolic space-time and Minkowski space-time, we recallthat Eq. (28) implies a = c. Since a = 0 at t = 0, it fol-lows that a(t) = ct for the hyperbolic case in question. Itmay readily be verified that the Minkowski metric trans-forms into the corresponding RW metric with rM = rct ,tM = t(r2 + 1)1/2, where rM and tM denote the usual coor-dinates for Minkowski space-time.

D. Cosmic Deceleration or Acceleration

We saw in Section IV.B that Eq. (23) contains no infor-mation beyond that implied by Eq. (22) and local energyconservation. Equation (23) makes a direct and transparentstatement about cosmic evolution, however, and thereforewarrants study. Recalling again that the proper distancerp(t) to a comoving object depends on time only througha factor of a(t), we see that Eq. (23) is equivalent to astatement about the acceleration rp of comoving objects(where again the dots refer to time derivatives):

rp =−4πG

3

(ρ + 3P

c2

)+ c2

3

rp (36)

Note that rp is proportional to rp, which is essential to thepreservation of Hubble’s law over time. A positive rp cor-responds to acceleration away from us, whereas a negativerp corresponds to acceleration toward us (e.g., decelera-tion of the present expansion). The same provisions thatconditioned our interpretation of rp as recessional veloc-ity apply here; rp equals acceleration in the familiar senseonly for objects whose distance is appreciably less thanthe scale factor (for k = 0) and whose rp is a small fractionof c.

As a first step in understanding the dynamics containedin Eq. (34), let us consider what Newton’s laws wouldpredict for a homogeneous, isotropic universe filling aninfinite space described by Euclidean geometry. In par-ticular, let us calculate the acceleration rp (relative to acomoving observer) of a point mass at a distance rp away

Page 31: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 793

from the observer. (The distinction of “proper” distanceis superfluous in Newtonian physics, but the symbol rp isbeing used to maintain consistency of notation.)

In Newton’s theory of gravitation, the force F betweentwo point masses m and M varies inversely as the square ofthe distance d between them: F = GmM/d2. If more thantwo point masses are present, a principle of superpositionstates that the net force on any one mass is the vector sumof the individual forces resulting from the other masses.

While it is not obvious (indeed, Newton invented calcu-lus to prove it), the inverse–square dependence on distancetogether with superposition implies that the force exertedby a uniform sphere of matter on any exterior object is thesame as if the sphere’s mass were concentrated at its cen-ter. It may also be shown that if a spherically symmetricmass distribution has a hollow core, then the mass exte-rior to the core exerts no net force on any object inside thecore. (These theorems may be familiar from electrostatics,where they also hold because Coulomb’s law has the samemathematical form as Newton’s law of gravitation.)

The above features of Newtonian gravitation may beexploited in this study of cosmology by noting that in theobserver’s frame of reference, the net force on an object ofmass m (e.g., a galaxy) at a distance rp from the observerresults entirely from the net mass M contained within the(hypothetical) sphere of radius rp centered on the observer:

M = (43πr3

p

)ρ (37)

The force resulting from M is the same as if it were con-centrated at the sphere’s center, and there is no net forcefrom the rest of the universe exterior to the sphere. Com-bining Newton’s law of gravitation with his second law ofmotion (that is, acceleration equals F/m) yields

rp = −GM/

r2p = −4πG

3ρrp (38)

where Eq. (35) has been used to eliminate M , and theminus sign results from the attractive nature of the force.

Note that Eq. (36) agrees precisely with the contributionof ρ to rp in the Einstein field equation (36); this part ofrelativistic cosmology has a familiar, Newtonian analog.In GTR, however, stresses also contribute to gravitation,as exemplified by the pressure term in Eq. (36). If = 0,then a cosmological term contributes as well.

In ordinary circumstances, pressure is a negligiblesource of gravity. To see why, we need only compare thegravitational effect of mass with the effect of pressure fill-ing the same volume. For example, lead has a density ρ =11.3 × 103 kg/m3. In order for pressure filling an equal vol-ume to have the same effect, the pressure would need tobe P = c2ρ/3 = 3.39 × 1020 N/m2 (where 1 N = 0.225 lb).This is ∼3 × 1015 times as great as atmospheric pressure,and ∼109 times as great as pressure at the center of theearth.

The density ρ appearing in the field equations is posi-tive, so its contribution to r p in Eq. (36) is negative: thegravitation of mass-energy is attractive. Positive pressurealso makes a negative contribution to r p, but pressure maybe either positive or negative. Equation (36) implies thatnegative pressure causes gravitational repulsion.

If = 0, its contribution to r p has the same sign as .If > 0 and a time arrives when

>4πG

c2

(ρ + 3P

c2

)(39)

then r p > 0, which corresponds to an accelerating expan-sion. The density and pressure both decrease as the uni-verse expands, whereas is constant. After the first mo-ment when the inequality (39) is satisfied, the universewill expand forever, at an accelerating rate. A time willtherefore come when Eq. (29) is satisfied, after which rp

and r p will both grow at exponential rates, in accordancewith Eqs. (29)–(31). There is recent evidence that our uni-verse has in fact entered such a stage, as will be discussedin Section V.E.

The name of the big bang model is somewhat mislead-ing, for it suggests that the expansion of the cosmos is a re-sult of the early high temperature and pressure, rather likethe explosion of a gigantic firecracker. As just explained,however, positive pressure leads not to acceleration “out-ward” of the cosmos, but rather to deceleration of anyexpansion that may be occurring. Had the universe beenborn stationary, positive pressure would have contributedto immediate collapse. Since early high pressure did notcontribute to the expansion, early high temperature couldnot have done so either. (It is perhaps no coincidence thatthe phrase big bang was initially coined as a term of deri-sion, by Fred Hoyle—an early opponent of the model. Themodel has proven successful, however, and the graphicname remains with us.)

E. The Growth of Space

It is intuitively useful to speak of ρ, P , and as leading togravitational attraction or repulsion, but the geometricalrole of gravitation in GTR should not be forgotten. Thegeometrical significance of gravitation is most apparentfor a closed universe, whose total proper volume is finiteand equal to 2π2a3(t). The field equations describe thechange in time of a(t), hence of the size of the universe.From a naive point of view, Hubble’s law describes cosmicexpansion in terms of recessional velocities of galaxies;but more fundamentally, galaxies are moving apart be-cause the space between them is expanding. How else anincreasing volume for the universe?

The field equation for a(t) implies an equation for rp,which has a naive interpretation in terms of accelerationand causative force. More fundamentally, however, a and

Page 32: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

794 Cosmic Inflation

rp represent the derivative of the rate at which space isexpanding. Our intuitions are so attuned to Newton’s lawsof motion and causative forces that we shall often speak ofvelocities and accelerations and of gravitational attractionand repulsion, but the geometrical nature of GTR shouldnever be far from mind.

The interpretation of gravity in terms of a growthor shrinkage of space resolves a paradox noted above,namely, that positive pressure tends to make the universecollapse. Cosmic pressure is the same everywhere, anduniform pressure pushes equally on all sides of an objectleading to no net force or acceleration. Gravity, however,affects the rate at which space is growing or shrinking, andthereby results in relative accelerations of distant objects.One has no a priori intuition as to whether positive pres-sure should lead to a growth or shrinkage of space; GTRprovides the answer described above.

F. The Cosmological Constant

For historical and philosophical reasons beyond the scopeof this article, Einstein believed during the second decadeof this century that the universe was static. He realized,however, that a static universe was dynamically impossibleunless there were some repulsive force to balance the fa-miliar attractive force of gravitation between the galaxies.The mathematical framework of GTR remains consistentwhen the cosmological term gµν is included in the fieldequations (20), and Einstein added such a term to stabilizethe universe in his initial paper on cosmology in 1917.

Although the existence of other galaxies outside theMilky Way was actually not established until 1923,Einstein (in his typically prescient manner) had incorpo-rated the assumptions of homogeneity and isotropy in hiscosmological model of 1917. With these assumptions, thefield equations reduce to (22) and (23). The universe isstatic if and only if a(t) = 0, which requires (over anyextended period of time) that a(t) = 0 as well. Applyingthese conditions to Eqs. (22) and (23), it is readily seenthat the universe is static if and only if

= 4πG

c2

(ρ + 3P

c2

)(40)

k

(c

a

)2

= 4πG

(ρ + P

c2

)(41)

Since our universe is characterized by positive ρ and pos-itive (albeit negligible) P , the model predicts positive val-ues for and k. Hence the universe is closed and fi-nite, with the scale factor (radius of the universe) givenby Eq. (37) with k = 1. For example, if ρ ∼ 10−30 g/cm3

(roughly the density of our universe) and P = 0, thenEq. (41) yields a cosmic radius of ∼1010 ly, ∼1/5 thesize of our observable universe.

The static model of Einstein has of course been aban-doned, for two reasons. In 1929, Edwin Hubble reportedthat distant galaxies are in fact receding in a systematicway, with velocities that are proportional to their distancefrom us. On the theoretical side, it was remarked by A. S.Eddington in 1930 that Einstein’s static model was intrin-sically unstable against fluctuations in the cosmic scalefactor: if a(t) were momentarily to decrease, ρ would in-crease and the universe would collapse at an acceleratingrate under the newly dominating, attractive force of ordi-nary gravity. Conversely, if a(t) momentarily increased,ρ would decrease and the cosmological term would be-come dominant, thereby causing the universe to expandat an accelerating rate. For this combination of reasons,Einstein recommended in 1931 that the cosmological termbe abandoned (i.e., set = 0); he later referred to the cos-mological term as “the biggest blunder of my life.” From acontemporary perspective, however, only the static modelwas ill-conceived. The cosmological term remains a vi-able possibility, and two recent observations indicate that > 0 (as will be discussed in Section V.E).

We have seen that Einstein was motivated by a desireto describe the cosmos when he added the gµν term tothe field equations, but there is another, more fundamen-tal reason why this addendum is called the cosmologicalterm. Let us recall the structure of Newtonian physics,whose many successful predictions are reproduced byGTR. Newton’s theory of gravity predicts that the grav-itational force on any object is proportional to its mass,while his second law of motion predicts that the result-ing acceleration is inversely proportional to the object’smass. It follows that an object’s mass cancels out of theacceleration caused by gravity. Hence the earth (or anyother source of gravity) may be regarded as generatingan acceleration field in the surrounding space. This fieldcauses all objects to accelerate at the same rate, and maybe regarded as characteristic of the region of space.

The GTR regards gravitational accelerations as morefundamental than gravitational forces, and focuses atten-tion on the acceleration field in the form of gµν as the fun-damental object of study. The acceleration field causedby a mass M is determined by the field equations (20),wherein M contributes (in a way depending on its posi-tion and velocity) to the stress-energy tensor Tµν appear-ing on the right side. The solution for gµν then determinesthe resulting acceleration of any other mass that might bepresent, mimicking the effect of Newtonian gravitationalforce. The term gµν in Eq. (20), however, does not de-pend on the position or even the existence of any matter.Hence the influence of gµν on the solution for gµν cannotbe interpreted in terms of forces between objects.

To understand the physical significance of the cosmo-logical term, let us consider a special case of the fieldequations, namely their application to the case at hand:

Page 33: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 795

a homogeneous, isotropic universe. As we have seen, thefield equations are then equivalent to Eqs. (26) and (34).From the appearance of on the right sides, it is clear that contributes to the recessional velocities of distant galax-ies (through H ), and to the acceleration of that recession.This part of the acceleration cannot be interpreted in termsof forces acting among the galaxies for, as just remarked,the cosmological term does not presume the existence ofgalaxies.

The proper interpretation of the cosmological term re-sides in another remark made earlier: that the recessionof distant galaxies should be understood as resulting fromgrowth of the space between them. We conclude that thecosmological term describes a growth of space that issomehow intrinsic to the nature of space and its evolutionin time. Such a term clearly has no analog in the physicspredating GTR.

G. Equivalence of Λ to VacuumEnergy and Pressure

An inspection of Eqs. (22) and (23) reveals that the contri-butions of are precisely the same as if all of space werecharacterized by constant density and pressure given by

ρV = c2

8πG(42)

PV = − c4

8πG= −c2ρV (43)

From a more basic point of view, Eqs. (42), (43), and (21)yield

Tµν = PV gµν = −c2ρV gµν (44)

Comparing Eq. (44) with the field equations in their mostfundamental form, i.e., Eq. (20), the equivalence between and the above ρV and PV is obvious. We also note thatlocal energy conservation is satisfied by a constant ρV

with PV = −c2ρV , as may be seen from Eq. (24). In factall known principles permit an interpretation of in termsof the constant ρV and PV given by Eqs. (42) and (43).

Is there any reason to suppose that Tµν might containa term such as that described above, equivalent to a cos-molgical constant? Indeed there is. In relativistic quantumfield theory (QFT), which describes the creation, annihi-lation, and interaction of elementary particles, the vacuumstate is not perfectly empty and void of activity, becausesuch a perfectly ordered state would be incompatible withthe uncertainty principles of quantum theory. The vac-uum is defined as the state of minimum energy, but it ischaracterized by spontaneous creation and annihilation ofparticle–antiparticle pairs that sometimes interact beforedisappearing.

For most purposes, only changes in energy are phys-ically significant, in which case energy is only definedwithin an arbitrary additive constant. The energy of thevacuum is therefore often regarded as zero, the simplestconvention. In the context of general relativity, however,a major question arises. Does the vacuum of QFT re-ally contribute nothing to the Tµν appearing in Einstein’sfield equations, or do quantum fluctuations contribute andthereby affect the metric gµν of spacetime?

Let us consider the possibility that quantum fluctuationsact as a source of gravity in the field equations of GTR.These fluctuations occur throughout space, in a way gov-erned by the local physics of special relativity. The energydensity of the vacuum should therefore not change as theuniverse expands. Denoting the vacuum density by ρV

[with ρV = (energy density)/c2)], the corresponding pres-sure PV can be deduced from Eq. (24). Assuming thatρV is constant, Eq. (24) implies PV = −c2ρV , in perfectaccordance with Eq. (43).

If ≥ 0 (as seems certain on empirical grounds), thenthe corresponding vacuum has ρV ≥ 0 and PV ≤ 0. Neg-ative pressure has a simple interpretation in this context.Consider a (hypothetical) spherical surface expanding inunison with the universe, so that all parts of the surfaceare at rest with respect to comoving coordinates. Assum-ing the vacuum energy density to be constant, the energyinside the sphere will increase, because of the increasingvolume. Local energy conservation implies that positivework is being done to provide the increasing energy, hencethat an outward force is being exerted on the surface.

No such force would be required unless the region in-side the sphere had an innate tendency to contract, which isprecisely what is meant by negative pressure. The requiredoutward force on the surface is provided by negative pres-sure in the surrounding region, which exactly balances theinternal pressure (as required in order for every part of thesurface to remain at rest with respect to comoving coordi-nates). A constant ρV and the PV of Eq. (43) describe thissituation perfectly. It is not yet known how to calculateρV , so the value of ρV and the corresponding remainempirical questions.

The theory of cosmic inflation rests on the assumptionthat during a very early period of cosmic history, spacewas characterized by an unstable state called a “false vac-uum.” Over the brief period of its existence, this falsevacuum had an enormous density ρFV ∼ 1080 g/cm3, withPFV

∼= −c2ρFV . This caused the cosmic scale factor togrow at an exponential rate [as in Eqs. (29)–(31)], with adoubling time td ∼ 10−37 sec.

Even if the false vacuum lasted for only 10−35 sec, about100 doublings would have occurred, causing the universeto expand by a factor of ∼2100 ≈ 1030. Recall that in curvedspaces, departures from flatness at a distance rp are oforder (rp/a)2 (as explained in Section III.C). Standard

Page 34: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

796 Cosmic Inflation

inflation predicts that even if space is curved, the scalefactor is now so large that the observable universe is vir-tually flat. [Hyperbolic (“open”) inflation is assumed tohave occurred in a space with negative curvature. The in-flation, while substantial, may not have lasted long enoughto render the observable universe ∼ flat.]

V. CONTENTS, AGE, AND FUTUREOF THE COSMOS

A. The Critical Density

The field equation (26) implies a relation between H , ρ, ,and the geometry of space, since the latter is governed byk. For present purposes, it is convenient (and customary)to express in terms of the equivalent vacuum density ρV ,given by Eq. (42). The sum of all other contributions willbe denoted by ρM . (“M” is short for matter and radiation.The latter is negligible at present, but dominated ρM atvery early times.) We next consider Eq. (26) with = 0and ρ = ρM + ρV .

Setting k = 0 in Eq. (26) and solving for ρ, we obtainρ for the intermediate case of a flat universe. Its value iscalled the critical density, and is presently given by

ρc = 3H 20

8πG= (1.88 × 10−29g/cm3)h2

0 (45)

where h0 is defined by Eq. (18). Inspection of Eq. (26)reveals that k is positive, zero, or negative if ρ is greaterthan, equal to, or less than ρc, respectively. We note inpassing that ρc is extremely small. For h0

∼= 0.65, ρc iscomparable to a density of five hydrogen atoms per cubicmeter of space. This is far closer to a perfect vacuumthan could be achieved in a laboratory in the foreseeablefuture.

To facilitate a comparison of individual densities withthe critical density, it is customary to define a set of num-bers i by

ρi = iρc (46)

where the index i can refer to any of the particular kindsof density. For example, ρM = Mρc, so M = ρM/ρc isthe fraction of ρc residing in matter (and radiation). Thetotal density is described at present by

0 ≡ M + V (47)

Equation (26) now implies that k > 0, k = 0, or k < 0if 0 > 1, 0 = 1, or 0 < 1, respectively. If k = 0, thepresent value of the scale factor is readily seen to be

k = ±1: a0 = c

H0

√k

0 − 1(48)

(Recall from Section III.B that if k = 0, then a0 has nophysical significance, and can be chosen arbitrarily.)

Standard inflation does not predict the geometry of theentire universe, only that the observable universe is veryclose to flat. As was shown in Section III.C, this wouldbe true for either type of curved space if a0 were verymuch larger than the radius of the observable universe.Equation (48) reveals that even if k = ±1, the observableuniverse can be arbitrarily close to flat if 0 is sufficientlyclose to 1.

Note that inflation does not predict whether the uni-verse is finite (k = +1) or infinite (k ≤ 0), for reasons statedabove and also because the observable universe is not re-garded as representative of the whole. In the latter case, theRW metric (upon which our discussion is based) wouldnot be applicable to the entire universe.

B. Kinds of Matter

The nature and distribution of matter are complex subjects,only partially understood. Broadly speaking, the princi-pal sources of ρM are ordinary matter and other (exotic)matter. Electrons and atomic nuclei (made of protons andneutrons) are regarded as ordinary, whether they are sep-arate or bound together in atoms. The universe has zeronet charge, so electrons and protons are virtually identicalin their abundance. (Charged particles are easily detected,and all other charged particles that we know of are short-lived.)

Protons and neutrons have nearly equal masses, and are∼1800 times as massive as electrons. Virtually all the massof ordinary matter therefore resides in nuclei. Protons andneutrons are members of a larger group of particles calledbaryons, and cosmologists often refer to ordinary matteras baryonic matter (despite the fact that electrons are notbaryons). All other baryons are very short-lived and rare,so any significant amount of other matter is nonbaryonic.We shall denote the densities of baryonic and nonbary-onic matter by ρB and ρNB , respectively. The total, presentdensity of matter (and radiation) may now be expressed asρM

∼= ρB + ρNB , where the approximation is valid withina tiny fraction of 1%.

We have no evidence that stars and planets are con-structed of anything but ordinary matter. It follows that anyother matter whose density is a significant fraction of ρc

must be electrically neutral, and also immune to the strongnuclear force (the strong interaction): its presence wouldotherwise be conspicuous in stars and planets. Such mat-ter can interact only weakly with ordinary matter, throughgravity and perhaps through the weak nuclear force (theweak interaction), and conceivably through some otherweak force that remains to be identified.

Neutrinos are particles with the above properties.They are also abundant, having been copiously produced

Page 35: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 797

during the early period of nucleosynthesis studied byGamow and coworkers (big bang nucleosynthesis, orBBN). Their rest masses were long thought to be zero(a convoluted piece of jargon meaning that, like photons,they always travel at the speed of light). The expansionand cooling of the universe would have reduced the en-ergy of such relic neutrinos to a negligible amount, as hashappened with the relic photons that now make up thecosmic microwave background. Recent experiments haverevealed, however, that neutrinos have small but nonzerorest masses. Even with very small masses, neutrinos are soabundant that their density ρν could be a significant frac-tion of ρc. [The Greek letter ν (nu) is a standard symbol forneutrinos.]

There is gravitational evidence (to be described) forthe existence of electrically neutral, weakly interactingparticles other than neutrinos. No such particles have yetbeen observed in laboratories, so their identity and de-tailed properties remain unknown. Such matter is there-fore called exotic, and its density will be denoted by ρX

(however many varieties there may be). We then haveρNB = ρν + ρX . (Neutrinos are sometimes regarded as ex-otic matter, but the present distinction is useful.)

C. Amounts of Matter

Stars are made of baryonic matter, as are interstellar andintergalactic clouds of gas and dust. Bright stars (brighterthan red dwarfs, the dimmest of main sequence stars) areeasily seen and studied, and are found to contain only∼0.5% of ρc. Clouds of gas and dust contain several timesas much, but the amount has been difficult to measure.In fact our best estimate for the total amount of baryonicmatter is based on the relative abundances of nuclei formedduring the early period of BBN.

The fraction of primordial baryonic matter convertedinto 2H (deuterium) during BBN depended strongly onthe density of baryons at that early time, in a (presumably)known way. Intergalactic clouds of gas have never been af-fected by nucleosynthesis in stars, so they should be puresamples of the mixture produced by BBN. By studyinglight from distant quasars that has passed through inter-galactic clouds, the relative abundance of deuterium wasmeasured in 1996 by David R. Tytler, Scott Burles, andcoworkers. On the basis of that and subsequent measure-ments, it is now believed that

B = (0.019 ± 0.001)/

h20 = 0.045 ± 0.009 (49)

The amount of baryonic matter is therefore ∼9 timesgreater than appears in bright stars.

It has become apparent in recent decades that the uni-verse contains a great deal of dark (nonluminous) matter.As indicated above, only ∼1/9 of ordinary matter is in

bright stars, and most of the remainder is cold (or dilute)enough to be dark. There is also gravitational evidencefor ∼7 times as much dark matter as there is baryonicmatter. The evidence resides primarily in the dynamics ofgalaxies and galactic clusters, which are held together bygravitational forces. The motions of stars in galaxies, andof galaxies in clusters, reveal the strength of gravity withinthem. The total amount of mass can then be inferred.

We first consider orbital motion around some large,central mass, such as the motion of planets around oursun. Under the simplifying assumption of circular orbits,Newton’s laws of gravitation and motion (the latter beingF = ma, with centripetal acceleration given by a = v2/r )

imply that

v =√

G M/r (50)

where v and r denote the speed around, and radius of, theorbit, and M denotes the mass of the sun.

We can test this model for the solar system by see-ing whether orbital speeds and radii of the nine planetsare related by the proportionality v ∝ 1/

√r , as predicted

by Eq. (50). The orbits pass this test (making allowancesfor the fact that orbits are actually ellipses). Larger or-bits are indeed traversed at smaller speeds, in the waypredicted by Newton’s laws. Since Eq. (50) implies thatM = v2r/G, the sun’s mass can be determined from thespeed and radius of a single orbit, with confidence that allorbits yield the same answer (because v ∝ 1/

√r ). This is

how we know the value of M.The Milky Way has a small nucleus that is very densely

populated with stars (and may contain a black hole of∼106 M). Centered about this nucleus is a “nuclearbulge” that is also densely populated, with a radius of∼104 ly. The visible disk has a radius of ∼5 × 104 ly, andit becomes thinner and more sparsely populated as oneapproaches the edge. For the preceding reasons, it waslong supposed that most of the galaxy’s mass is less than2 × 104 ly from the center. For stars in circular orbits out-side this central mass, orbital speeds should then be a de-creasing function of radius, roughly satisfying v ∝ 1/

√r

(as for planets in the solar system).Beginning in the 1970s, Vera Rubin and others began

systematic studies of orbital speeds, out to the edge of ourgalaxy and beyond (where a relatively small number ofstars are found). To everyone’s surprise, the speeds didnot decrease with increasing distance. In fact the speedsat 5 × 104 ly (the visible edge) are ∼10% greater than at2 × 104 ly. Furthermore, the speeds are even larger (bya small amount) at 6 × 104 ly, beyond the visible edge.Studies of this kind indicate that beyond ∼2 × 104 ly, theamount of mass within a distance r of the center is roughlyproportional to r , out to 105 ly or so (twice the radius ofthe visible disk).

Page 36: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

798 Cosmic Inflation

Studies of other spiral galaxies have revealed similarproperties. The Milky Way and other spirals are now be-lieved to have roughly spherical distributions of dark mat-ter, containing most of the mass and extending far beyondthe disk, with densities falling off (roughly) like 1/r2 be-yond the nuclear bulge. Most of this dark matter is prob-ably nonbaryonic (as will soon be explained).

The largest clusters contain thousands of galaxies, so itis plausible (but not certain) that little dark matter extendsbeyond their visible boundaries. There are three distinctways to estimate the total masses of large clusters. Theoriginal method (first used by Fritz Zwicky in 1935) isbased on the motions of galaxies within a cluster, fromwhich the strength of gravity and total mass can be in-ferred. The second method exploits the presence of hot,X-ray emitting, intracluster gas. The high temperature pre-sumably results from motion caused by gravity, so the to-tal mass can be inferred from the spectrum of the X-rays.The third method is based on gravitational lensing. Thelight from more distant galaxies bends as it passes by (orthrough) a cluster, and detailed observations of this effectenable one to estimate the total mass. All three methodsyield similar results.

The total luminosities of nearby clusters are easily de-termined, and the mass-to-luminosity ratios are about thesame for all large clusters that have been studied. If oneassumes this ratio to be representative of all galaxies (only∼10% of which reside in clusters), then the average den-sity of matter is readily obtained from the total luminosityof all galaxies within any large, representative region ofspace. This line of reasoning has been applied to the rele-vant data, yielding M = 0.19 ± 0.06. This value for M

would be an underestimate, however, if the matter insidelarge clusters were more luminous than is typical of mat-ter. This would be true if an appreciable amount of darkmatter extends beyond the visible boundaries of large clus-ters, or if an atypically large fraction of baryonic matterinside of large clusters has formed bright stars. Both pos-sibilities are plausible, and at least one appears to be thecase.

In addition to emitting X-rays, the hot intraclus-ter gas slightly alters (by scattering) the cosmicmicrowave radiation passing through it (the Sunyaev-Zel’dovich effect). Since only charged matter generates oraffects radiation, the X-rays and slightly diminished CMBreaching earth both contain information about the amountof baryonic matter. The most precise inference from suchstudies is the ratio of baryonic to nonbaryonic massin large clusters; namely ρB/ρM = (0.075 ± 0.002)/h3/2

0 .Assuming this ratio to be typical of matter every-where, it can be combined with Eq. (49) to ob-tain M = (0.253 ± 0.02)/

√h0 = 0.31 ± 0.03. This is re-

garded as the most reliable value for M . There is a long

history of underestimating uncertainties in this field, how-ever, so it seems appreciably safer to say that

M = 0.31 ± 0.06 = (6.9 ± 1.9)B (51)

In terms of mass, only ∼1/7 of matter is baryonic, andonly ∼1/60 is in bright stars. Luminous matter is quitespecial, even less revealing than the tips of icebergs (whichdisplay ∼1/40 of the whole, and reveal the compositionof the remainder).

D. Nonbaryonic Matter

By mass, ∼6/7 of matter is nonbaryonic. Neutrinos havebeen studied for decades, and are part of the “standardmodel” of elementary particles. As mentioned previously,there is recent evidence that neutrinos have nonzero restmasses (the Super-Kamiokande detection of neutrino os-cillations, beyond the scope of this article). The presentdata place a lower limit on ν , namely ν ≥ 0.003. Lab-oratory experiments have not ruled out the possibility thatneutrinos account for all the nonbaryonic matter, but thereare reasons to believe otherwise.

Indirect evidence for the nature of the nonbaryonic mat-ter arises in computer models for the progressive concen-tration of matter into galaxies, clusters of galaxies, super-clusters, sheets (or walls) of galaxies, and voids. All ofthese structures evolved as gravity progressively ampli-fied small deviations from perfect uniformity, beginningat a very early time. (The original inhomogeneities arebelieved to have been microscopic quantum fluctuationsthat were magnified by inflation.)

When the temperature dropped below ∼3,000 K, theelectrons and nuclei were able to remain together as elec-trically neutral atoms, forming a transparent gas. This hap-pened ∼300,000 yr after the big bang, the time of lastscattering. The photons then present have been travelingfreely ever since, and now form the cosmic microwavebackground. Studies of the CMB reveal that deviationsfrom perfect uniformity at the time of last scattering wereon the order of δρM/ρM ∼ 10−5. Those slight inhomo-geneities have since been amplified by gravity into all thestructures that we see today.

Computer models for the growth of structure requireassumptions about the exotic dark matter. This matter isusually assumed to consist of stable elementary particlesthat are relics from a very early time. Because they areweakly interacting, only gravity should have affected themappreciably (gravity is the only weak force acting overlarge distances).

Since space was virtually transparent to exotic par-ticles, their speeds would have played a key role instructure formation. High speeds would have inhibitedtheir being captured by growing concentrations of matter,

Page 37: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 799

unless those concentrations were unusually extended andmassive. Computer simulations indicate that fast weakly-interacting particles would only have formed very largeconcentrations, of cluster and even supercluster size,whereas slow particles would first have formed smallerconcentrations of galactic size. The scenario wherein con-centrations of cluster size preceded the formation of galax-ies is called “top down” growth of structure. If galaxiesformed first and later clumped into clusters, the growth iscalled “bottom up.”

Since light travels at a finite speed, astronomers seethings as they were in the increasingly distant past as theylook out to increasingly greater distances. The evidenceis very clear that galaxies formed first, with only ∼10%of them clumping later into clusters. Structure formationoccurred from the bottom up. It follows that nonbary-onic particles were predominately slow. Of course “fast”and “slow” are relative terms, and the average speeds ofall types of massive particles decreased as the universeexpanded. These simple labels are useful in a brief dis-cussion, however, and they have fairly definite meaningsamong workers in the field. In further jargon, fast nonra-diating (uncharged) particles are referred to as “hot darkmatter,” and slow ones as “cold dark matter.” Bottom-upstructure formation implies that dark matter is predomi-nately cold.

There are three species of neutrino (electron, muon, andtau-neutrinos). All three species should have reached ther-mal equilibrium with other matter and photons before theuniverse was one second old. (The early, high density ofmatter more than compensated for the weakness of neu-trino interactions.) For this and other reasons, the presentabundance of relic neutrinos is believed to be known, andν can be expressed as

ν =∑

i

mi c2

93.5h20 eV

=∑

i

(0.025 ± 0.005)mi c2

1 eV(52)

where the index i refers to the three species (1 eV =1 electronvolt = 1.60 × 10−19 joule).

If only one species had a nonzero mass, with mi c2 ∼=39 eV, the result would be ν

∼= 1. (For comparison, thelightest particle of ordinary matter is the electron, withmec2 = 5.11 × 105 eV.) Since NB

∼= 0.26, the upper limiton neutrino mass-energies is on the order of 10 eV. Itis the combination of weak interactions with very smallmasses that has frustrated efforts to determine the massof even one species of neutrino. The Super-Kamiokandeexperiment is only sensitive to differences in mass betweendifferent species. The observed mass difference impliesν ≥ 0.003, where the equality would hold if only onespecies had nonzero mass.

Because neutrinos are so light and were in thermal equi-librium at an early, hot time, they are the paradigm of“hot dark matter.” They slowed as the universe expanded,but at any given temperature, the lightest particles werethe fastest. Computer simulations rule out neutrinos as thepredominate form of nonbaryonic matter. Recalling thelower limit discussed previously, we conclude that

0.003 ≤ ν < 12NB (53)

The two most favored candidates for exotic matter areaxions and neutralinos. These arise in speculative, butplausible, theories that go beyond the standard model ofelementary particles. (Neutralinos occur in supersymmet-ric theories, which receive tentative support from a veryrecent experiment on the precession of muons.) For quitedisparate reasons, both qualify as cold dark matter.

Axions are predicted to be much lighter than neutrinos.Unlike neutrinos, however, they would never have beenin thermal equilibrium, and would always have movedslowly (for abstruse reasons beyond the scope of this ar-ticle). In sharp contrast, neutralinos would have been inthermal equilibrium when the universe was very youngand hot. Neutralinos (if they exist) are much more massivethan protons and neutrons, however, and their large masseswould have rendered them (relatively) slow-moving at anygiven temperature. Other candidates for exotic “cold darkmatter” have been proposed, but will not be mentionedhere.

E. Vacuum Energy

In 1998, two teams of observers, one led by SaulPerlmutter and the other by Brian P. Schmidt, announcedthat the cosmic expansion appears to be accelerating. Thisrevolutionary claim was based on observations of typeSNe Ia over a wide range of distances, extending out to∼6 × 109 ly. Distances approximately twice as great havesince been explored using SNe Ia as standard candles.

Redshifts are usually described in terms of a red-shiftparameter z = λ0/λe − 1, where λ0 denotes the observedwavelength and λe the emitted wavelength. With space-time described by a RW metric, it is readily shown thatz = a(t0)/a(te) − 1, where t0 and te denote the (present)time of observation and the earlier time of emission, re-spectively. Supernovae of type Ia have been observed withredshifts as large as z ≈ 1, corresponding to a(t0) ≈ 2a(te).Such light was emitted when the universe was only halfits present size, and roughly half its present age.

Since the intrinsic luminosity of SNe Ia is known, theapparent brightness reveals the distance of the source,from which the time of emission can be inferred (subject topossible corrections for spatial curvature). The evolution

Page 38: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

800 Cosmic Inflation

over time of a(t) has been deduced in this way, and indi-cates an accelerating expansion. Since ρM has decreasedover time while ρV remained constant, they have affecteda(t) in different ways. Thus M and V can both be de-duced from detailed knowledge of a(t). Present SNe Iadata indicate that M ≈ 0.3 and V ≈ 0.7, with uncer-tainties in both cases on the order of ±0.1.

The CMB radiation provides a snapshot of the universeat the time of last scattering, ∼300,000 yr after the bigbang. The CMB reaching us now has come from the out-ermost limits of the observable universe, and has a red-shift z ∼ 1100. It has an extremely uniform temperatureof T = (2.728 ± 0.002)K in all directions, except for tinydeviations on the order of δT/T ∼ 10−5 over small regionsof the sky. These deviations are manifestations of slightinhomogeneities in density of order δρM/ρM ∼ 10−5 at thetime of last scattering. Gravity produced these slight con-centrations of matter from much smaller ones at earliertimes, in ways that are believed to be well understood. Inparticular, the spectrum (probability distribution) of dif-ferent sizes of inhomogeneity is calculable in terms ofmuch earlier conditions.

The angle subtended by an object of known size at aknown distance depends on the curvature of space, asdiscussed in Section III.C. (For example, the circumfer-ence of a circle differs from 2πr in a curved space.) Inthe present context, the “objects of known size” were thelargest concentrations of matter, whose diameters weregoverned by the distance that density waves could havetraveled since the big bang (the sound horizon at the timeof last scattering).

The speed of density waves is a well-understood func-tion of the medium through which they travel, and thesound horizon at last scattering has been calculated with(presumably) good accuracy. The angular diameters ofthe largest concentrations of matter, manifested by δT/T ,therefore reveal the curvature of space. Within observa-tional uncertainties, the data confirm the foremost pre-diction of standard inflation, namely that space is veryclose to flat. There is no evidence of curvature. TheCMB data indicate that space is flat or, if curved, thata0 ≥ 3c/H0 ∼ 45 billion light-years. Equation (48) thenimplies that

0 = 1.0 ± 0.1 (54)

From Eq. (51) and the SNe Ia result cited above, itfollows that

V = 0.69 ± 0.12 (55)

Spatial flatness and vacuum energy are the most impor-tant and dramatic discoveries in observational cosmologysince the CMB (1964).

There has been some speculation that the mass-energydensity filling all of space might not be strictly constant,

but change slowly over time. (Such an energy field issometimes called “quintessence.”) Denoting the densityof such a hypothetical field by ρQ , observations wouldimply a large, negative pressure, PQ < −0.6c2ρQ . Presentdata are wholly consistent with the simplest possibility,however, the one described here in detail: a constant ρV

with PV = −c2ρV , perfectly mimicking a cosmologicalconstant.

F. Age and Future of the Cosmos

We now consider the evolution over time of a flat universe,under the simplifying assumption that PM = 0 (an excel-lent approximation since the first 10 million years or so).Equations (22), (27), (42), and (43) then imply

t = 2

3H0

∫ x(t)

0

dx ′√V x ′2 + M

(56)

where t denotes the time since the big bang, x(t) ≡[a(t)/a0]3/2, denotes the present value of the cosmic scalefactor a(t), and M and V are present values, withM + V = 1.

The scale of time is set by H0, where tH ≡ 1/H0 iscalled the Hubble time. To understand its significance,suppose that every distant galaxy had always been movingaway from us at the same speed it has now. For motion atconstant speed v, the time required to reach a distance d ist = d/v. According to Hubble’s law, v = H0d [Eq. (17)],so the time required for every galaxy to have reached itspresent distance from us would be d/v = 1/H0 = tH . ThustH would be the age of the universe if galactic speeds hadnever been affected by gravity. (It is sometimes called the“Hubble age.”) With H0 given by Eqs. (18) and (19),

tH ≡ 1

H0= (15.1 ± 2.4) × 109 yr (57)

Denoting the actual age of the universe by t0 (the presenttime), we note that x(t0) = 1. For the simple case whereM = 1 (and V = 0), Eq. (56) yields t0 = 2

3 tH ∼ 10 ×109 yr (less than the age of the oldest stars). For V > 0,Eq. (56) implies

t

tH= 2

3√

Vln

(√V (a/a0)3 +

√V (a/a0)3 + M√

M

)(58)

expressing t/tH as a function of a/a0. Using the observedvalue M = 0.31 ± 0.06 (with V = 1 − M ) and settinga = a4

0 , we obtain the present age

t0 = (0.955 ± 0.053)tH = (14.4 ± 2.4) × 109 yr (59)

The oldest stars are believed to be (12 ± 2) × 109 years old,and they are expected to have formed ∼2 × 109 yr after

Page 39: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 801

the big bang. The discovery of substantial vacuum energy(or something very similar) has increased the theoreticalage of the universe comfortably beyond that of the oldeststars, thereby resolving a long-standing puzzle.

The time when the cosmic expansion began acceleratingis readily determined. Replacing in Eq. (23) with theρV and PV of Eqs. (42) and (43), we obtain

a

a= −4πG

3

(ρM − 2ρV + 3PM

c2

)(60)

With PM∼= 0, the acceleration (a > 0) began when ρM fell

below ρV . Using Eq. (27) for ρM , this happened when(a/a0)3 = M/2V 0.22. Denoting the correspondingtime by ta , Eq. (58) yields ta 0.52tH 0.55t0.

When a/a0 reaches 2, ρM will have shrunk (by dilution)to ∼6% of ρV . This will happen at t 1.7tH 1.8t0. Fromroughly that time onward, the universe will grow at anexponential rate, as described by Eqs. (29)–(31) and (42).The doubling time will be

td = ln 2√V

tH ≈ 13 × 109 yr (61)

The ratio a(t)/a0 is displayed in Fig. 1, which should bereliable for t/t0 ≥ 10−3.

FIGURE 1 Cosmic expansion: the distance between any pair ofcomoving objects is proportional to the cosmic scale factor a(t).Its present value is denoted by a0, and the present time by t0.For example, two objects were half as far apart as now whena/a0 = 1/2, which occurred at t ≈ 0.4t0. They will be twice as farapart as now when a/a0 = 2, which happen at t ≈ 1.8t0. The graphis only valid for t/t0 ≥ 10−3 (after inflation).

VI. HORIZON AND FLATNESS PROBLEMS

A. The Horizon Problem

The greatest distance that light could (in principle) havetraveled since the universe began is called the horizon dis-tance. Since no causal influence travels faster than light,any two regions separated by more than the horizon dis-tance could not have influenced each other in any way. Ob-servations reveal the universe to be flat (or nearly so) out tothe source of the CMB now reaching us. We shall assumefor simplicity that space is precisely flat (k = 0, = 1).

Every light path is a “null geodesic,” with (ds)2 = 0in Eq. (7). With k = 0, c dt = ±a(t) dr . Let us place our-selves at r = 0, and determine the proper distance rp = a0rto an object whose light is now reaching us with redshiftz = (a0/a) − 1. Since dt = da/a and a = aH ,

rp = a0c∫ a0

a

da′

a′2 H (a′)(62)

where H (a) is given by Eq. (26) in terms of ρ(a). Weshall initially assume the pressure of matter and radiationto be negligible, in which case ρ ∝ 1/a3. Equation (26)then implies

H 2 = H 20

[M

(a0

a

)3

+ V

](63)

where is being represented by ρV as in Eq. (42) (and, asbefore, M and V denote present values). Equations (62)and (63) imply

rp(z) = c

H0f (z),

(64)

f (z) ≡ 2∫ 1

1/√

z+1

dy√M + V y6

where y ≡ √a/a0.

The integral defining f (z) cannot be evaluated in termsof elementary functions. For small z, however, one candefine x ≡ 1 − y, expand the integrand in powers of x ,and thereby obtain

f (z) = z[1 − 3

4M z + (z2)]

(65)

where (z2) denotes corrections of magnitude ∼z2. Thetwo terms displayed in Eq. (65) are accurate within 1%for z ≤ 1 and M ∼ 0.3. (Despite appearances, Eq. (65)involves V , through our assumption that V = 1 − M .)

Next we note that f (∞) − f (z) is twice the integralover y from 0 to 1/

√z + 1. For large z, y remains small,

Page 40: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

802 Cosmic Inflation

and the integrand can be expanded as a power series in y.We thereby obtain

f (z) = f (∞)− 2√M (z + 1)

1− O

[V

14M (z + 1)3

](66)

where the correction shown is exact below order1/(z + 1)6. If one keeps this correction term, Eq. (66) isvalid within 0.2% for z ≥ 1 and M ∼ 0.3. The value off (∞) is readily determined by numerical integration ofEq. (64).

For times earlier that tEQ ∼ 3 × 105 yr (the time ofequality between low and high-pressure ρM ), radiationand highly relativistic particles contribute more to ρM thandoes matter with negligible pressure. (It appears to be coin-cidental that this is close to the time of last scattering, whenthe universe became transparent and the CMB set forth onits endless journey.) The transition is, of course, a smoothone. For simplicity, however, we assume that PM = 0 fort > tEQ , in which case Eq. (64) would be valid. For t < tEQ ,we assume that PM = c2ρM/3, which is the pressure ex-erted by radiation and highly relativistic particles.

We also impose continuity on a and a at tEQ or, equiv-alently, on a and H . Finally, the CMB was emitted att ∼ tEQ , and has zCMB ∼ 1100. The value of zEQ can onlybe estimated, but it seems certain that zEQ ≥ 100. It fol-lows that ρV ≤ 10−6ρM at tEQ , and we shall neglect it fort ≤ tEQ .

For PM = c2ρM/3, Eq. (24) implies that ρM ∝ 1/a4. Itthen follows from Eq. (26) that 2aa = da2/dt is constantfor t ≤ tEQ . We are presently concerned with the standardbig bang (without inflation), wherein ρ ∝ 1/a4 all the wayback to t = 0 (when a = 0). We therefore have

t ≤ tEQ :

(a

aEQ

)2

= t

tEQ, H ≡ a

a= 1

2t(67)

where aEQ ≡ a(tEQ).Equation (67) holds at early times regardless of what

the present universe is like, an extraordinary fact of greatsimplicity and usefulness.

Equation (67) implies that a2 H is constant for t ≤ tEQ ,so the integral in Eq. (60) is trivial for t ≤ tEQ . Imposingcontinuity on H , Eq. (63) can be used to express HEQ

in terms of H0, M , and a0 (neglecting V at tEQ). Wethereby obtain

z ≥ zEQ :

rp(z) = c

H0

[f (∞) − 1√

M (zEQ + 1)

(z + zEQ + 2)

z + 1

](68)

For M = 0.31 (with V = 0.69), a numerical integrationyields f (∞) = 3.26. As zEQ ranges from 100 to the (un-

realistic) value of ∞, rp(∞), varies by less than 6%, andrp(zEQ) differs from rp(∞) by less than 6%. Note thatrp(∞) is the present standard horizon distance dSH (t0),i.e., the horizon distance if there had been no inflation.For zEQ ∼ 1000, we obtain

dSH (t0) = rp(∞) ≈ 3.2c

H0≈ 48 × 109 ly (69)

A 20% uncertainty in M would only render the factorof 3.2 uncertain by ∼10%. It may seem puzzling thatdSH (t0) ∼ 3 × (age of universe), but the cosmic expansionhas been moving the source away from us while the lighttraveled toward us. This effect has been amplified by ac-celeration of the expansion.

The distance to the source of the CMB is of particu-lar interest, for reasons that will become apparent. SincezCMB ≈ 1100, the percentage difference between rp(zCMB)and dSH (t0) is small, and we have seen that it varies lit-tle with any choice of zEQ ≥ 100. For zEQ = zCMB = 1100,Eq. (68) yields rp(zCMB) ≈ 3.1(c/H0) ≈ 47 × 109 ly. TheCMB sources in opposite directions from us are there-fore separated by ∼90 × 109 ly. Proper distances betweencomoving points scale like a/a0 = z + 1, so sources in op-posite directions were ∼8 × 107 ly apart when they emit-ted the CMB we now observe. This is a fact of extremeinterest, as we shall see.

The CMB reaching us now began its journey attL S ≈ 300,000 yr, where tLS denotes the time of last scatter-ing. We wish to determine dSH (tLS), the (standard) horizondistance at that time. We consider light emitted from r = 0at t = 0, and consider the proper distance dH (t) = a(t)r (t)it has traveled by time t :

dH (t) = ca(t)∫ t

0

dt ′

a(t ′)(70)

Using Eq. (67), we obtain

t ≤ tEQ : dSH (t) = 2ct, (71)

a wonderfully simple result. If tLS ≤ tEQ , Eq. (71) wouldimply that dSH (tL S) ≈ 600,000 ly.

For a considerable time later than tEQ , a/a0 re-mained small enough that ρV ρM . (For example, withV /M ≈ 7/3, ρV ≤ 10−2ρM for a/a0 ≤ 1

6 . This is satis-fied for t ≤ 0.7t0 ≈ 1010 year, as may be seen from Fig. 1.)With ρ ∝ 1/a3, Eq. (63) implies that da3/2 /dt is constant,hence that a3/2 = αt + β for constant α and β. Continuityof a and H at tEQ determines these constants, with theresults

t ≥ tEQ, ρV = 0: a(t) = aEQ

(3t + tEQ

4tEQ

)2/3

, (72)

dSH (t) = c(3t + tEQ)

1 −

[tEQ

2(3t + tEQ)

]1/3

(73)

Page 41: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 803

Now recall that if tLS ≤ tEQ , Eq. (71) yields dSH (tLS) ≈600,000 ly. If tLS ≥ tEQ , Eq. (73) yields dSH ≈ 630,000 ly(within 5%), for any tEQ ≥ 50,000 yr (a very conservativelower bound). In either case, dSH (tLS) was less than 1% of80 million ly, the distance between sources of the CMB inopposite directions from us when the CMB was omitted.This is the horizon problem described in Section II.A.

Next consider a simple inflationary model. Let tband te denote the times when inflation begins andends, respectively, with te 1 sec. For t ≤ tb, (a/ab)2 =t/tb, with H = 1/2t as in Eq. (67). For tb ≤ t ≤ te, a =abexp(t − tb), with H = , a constant [recall Eqs. (29)and (30)]. Continuity of H implies = 1/2tb. Thedoubling time td is related to by = ln 2/td , sotd = (2 ln 2)tb.

Inflation ends at te with ae = abexpt and H = ,where t ≡ te − tb. The density then reverts to its earlierbehavior ρ ∝ 1/a4, and Eq. (26) again implies that da2/dtis constant. The general solution is a2 = ξ t + η, where ξ

and η are fixed by continuity of a and H at te. One read-ily finds that a2 = a2

e [2(t − te) + 1], but a more elegantsolution is at hand.

From the end of inflation onward, the universe evolvesprecisely like a model without inflation, but which hadthe same conditions at te. Denoting time in the lattermodel by t ′, H = 1/2t ′, and H = at t ′

e = 1/2. Thescale factor was zero at t ′ = 0 in this model, so (a/ae)2 =t ′/t ′

e. Since t ′e = 1/2 when t = te, the relation is t ′ =

(t − te) + td/ ln 4.Using dr = c dt/a(t), it is straightforward to calculate

r (tb), r (te) − r (tb), and r (t ′) − r (t ′e). We thereby obtain

dH (t ′)2ct ′ = (et − 1)

√2td

t ′ ln 2+ 1 (74)

(presuming the actual horizon distance dH is given bythe inflationary model). The horizon problem is solved ifdH (tLS) rp(zCMB)/zCMB . This inequality needs to be astrong one, in order to explain how the CMB arriving fromopposite directions can be so precisely similar. Combinedwith Eq. (74), this requires inflation by a factor

et rp(zCMB)

zCMBc√

td tLS(75)

where factors of order unity have been omitted. Fortd ∼ 10−37 sec,

et 1028 (76)

is required. Resolution of the horizon problem is displayedin Figure 2.

FIGURE 2 Solution to the horizon problem: the size of the ob-served universe in the standard and inflationary theories. The ver-tical axis shows the radius of the region that evolves to becomethe presently observed universe, and the horizontal axis showsthe time. In the inflationary theory, the universe is filled with afalse vacuum during the inflationary period, marked on the graphas a vertical gray band. The gray line describing the standard hotbig bang theory is drawn slightly thicker than the black line for theinflationary theory, so that one can see clearly that the two linescoincide once inflation has ended. Before inflation, however, thesize of the universe in the inflationary theory is far smaller thanin the standard theory, allowing the observed universe to cometo a uniform temperature in the time available. The inset, magni-fied by a factor of 10, shows that the rapid expansion of inflationslows quickly but smoothly once the false vacuum has decayed.(The numerical values shown for the inflationary theory are notreliable, but are intended only to illustrate how inflation can work.The precise numbers are highly uncertain, since they depend onthe unknown details of grand unified theories.) [From “The Infla-tionary Universe” by Alan Guth. Copyright © 1997 by Alan Guth.Reprinted by permission of Perseus Books Publishers, a memberof Perseus Books, L.L.C.]

B. The Flatness Problem

Let us now define

(t) ≡ ρ(t)/ρc(t) (77)

where ρc(t) is the critical density given by Eq. (45) with H0

replaced by H (t) = a/a. As emphasized by Robert Dickeand P. James E. Peebles in 1979, any early deviation of from unity would have grown very rapidly with thepassage of time, as Figure 3 displays. Replacing by theequivalent ρV , Eq. (26) implies

− 1

= 3kc2

8πGρa2≤ 3kc2

8πGρM a2(78)

Page 42: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

804 Cosmic Inflation

FIGURE 3 The evolution of omega for the first 30 sec. The curvesstart at one second after the big bang, and each curve representsa different starting value of omega, as indicated by the numbersshown on the graph. [From “The Inflationary Universe” by AlanGuth. Copyright © 1997 by Alan Guth. Reprinted by permission ofPerseus Books Publishers, a member of Perseus Books, L.L.C.]

where the inequality holds for ρV ≥ 0. Since ρ =(/M )ρM ,

0 − 1

0=

(M0

0

)3kc2

8πGρM0a20

(79)

where the subscript “0” denotes present values.The pressure PM of matter (and radiation) became neg-

ligible when the universe was ∼106 years old. Sincethat early time, ρM ρM0(a0/a)3 [as in Eq. (27)].Equations (78) and (79) therefore imply

− 1

≤ 0 − 1

M0

(a

a0

)(80)

which actually holds for all times since inflation, becausethe left side decreases even more rapidly than the rightside as one goes back into the period when PM > 0. Asexplained in Section V, it appears that M0 ≥ 0.2 and|0 − 1| ≤ 0.2, in which case

| − 1|

≤ a

a0(81)

for all times since inflation. The CMB began its journeyat tL S ≈ 300,000 yr, and now has zCMB ≈ 1100. It followsthat |L S − 1| < 10−3.

We do not receive any light emitted before tL S . Todetermine at earlier times, we must study the time-dependence of the scale parameter a. We begin by notingthat Eq. (26) (with = 0) can be written as H 2 = H 2 −k(c/a)2, which implies that k(c/a)2 = [( − 1)/] ×H 2. The k-term in Eq. (26) is therefore negligible when| − 1| 1, and we shall drop it for t ≤ tL S .

For all times earlier than tLS, ρV was completely neg-ligible. It is also known that for times earlier than

tER ∼ 40,000 yr (the end of the radiation era), radiationand/or relativistic particles dominated ρM , exerting a pres-sure PM ρM c2/3. Thus ρ ρER(aER/a)4, and Eq. (78)implies

t ≤ tER : − 1

= ER − 1

ER

(a

aER

)2

(82)

We are here considering the standard model (without in-flation), so a2 ∝ t .

Neither matter nor radiation dominated for times be-tween tLS and tER . Since tER < tLS , however, |ER − 1| <|LS − 1| < 10−3. Equation (82) then implies

t ≤ tER : | − 1| < 10−3 t

tER(83)

Since tER = 1.3 × 1012 sec, it follows that | − 1| < 10−15

at t = 1 sec.It is striking that was so close to unity at 1 sec, but

there is no physical significance to the time of 1 sec.Had there been no inflation, the inequality (83) shouldhave been valid all the way back to the Planck timetP ≡ L P/c ∼ 10−43 sec. Quantum gravity would have beensignificant at earlier times, and is only partially under-stood, but Eq. (83) implies

t = tPl : | − 1| < 10−58 (84)

We have no reason to suppose that quantum gravity wouldlead to a value for near unity, let alone so extremelyclose. The present age of the universe is ∼1060 × tPl , andit is a profound mystery why now differs from unity (ifat all) by less than 20%. This is the “flatness problem.”

The Hubble parameter H ≡ a/a remains constant dur-ing any period of exponential growth [recall Eqs. (29) and(30)]. If a(t) increased by a factor of 10N during a pe-riod of exponential growth, a generalization of Eq. (48)to arbitrary times would imply that | − 1| decreased bya factor of 102N . To remove the necessity for to havebeen close to unity at the Planck time, inflation by a fac-tor ≥1030 is required. It seems unlikely that such growthwould have been the bare minimum required to explainwhy is near unity after 1060 × tPl , so standard inflationpredicts growth by more than 30 powers of 10 (alreadyrequired to solve the horizon problem), probably manymore. It would follow that 0 is extremely close to unity,which standard inflation predicts.

VII. UNIFIED THEORIESAND HIGGS FIELDS

A. Quantum Theory of Forces

In quantum field theory (QFT), forces between particlesresult from the exchange of virtual quanta, i.e., transient,

Page 43: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 805

particlelike manifestations of some field that gives rise tothe force. Electromagnetic forces arise from the exchangeof virtual photons, where photon is the name given toa quantum or energy packet of the electromagnetic field.The weak nuclear force results from the exchange of weakvector bosons, of which there are three kinds called W+,W−, and Z. The strong nuclear force results, at the mostfundamental level, from the exchange of eight colored glu-ons (a whimsical, potentially misleading name in that suchparticles are without visible qualities; color refers here toa technical property). From the viewpoint of QFT, gravi-tation may be regarded as resulting from the exchange ofvirtual gravitons. The four types of force just enumeratedencompass all of the fundamental interactions currentlyknown between elementary particles.

It is well-known that a major goal of Einstein’s laterwork was to construct a unified field theory in which twoof the existing categories of force would be understoodas different aspects of a single, more fundamental force.Einstein failed in this attempt; in retrospect, the experi-mental knowledge of elementary particles and their inter-actions was too fragmentary at the time of his death (1955)for him to have had any real chance of success. The intel-lectual heirs of Einstein have kept his goal of unificationhigh on their agendas, however. With the benefit of con-tinual, sometimes dramatic experimental advances, andmathematical insights from a substantial number of the-orists, considerable progress in unification has now beenachieved. A wholly unexpected by-product of contempo-rary unified theories is the possibility of cosmic inflation.To understand how this came about, let us briefly examinewhat is meant by a unification of forces.

Each of the four categories of force enumerated abovehas several characteristics that, taken together, serve todistinguish it from the other types of force. One obviouscharacteristic is the way in which the force between twoparticles depends on the distance between them. For ex-ample, electrostatic and (Newtonian) gravitational forcesreach out to arbitrarily large distances, while decreas-ing in strength like the inverse square of the distance.Forces of this nature (there seem to be only the two justnamed) are said to be long range. The weak force, incontrast, is effectively zero beyond a very short distance(∼2.5 × 10−16 cm). The strong force between protonsand neutrons is effectively zero beyond a short range of∼1.4 × 10−13 cm, but is known to be a residual effect ofthe more fundamental force between the quarks (elemen-tary, subnuclear particles) of which they are made. It isthe force between quarks that is mediated by colored glu-ons (in a complicated way with the remarkable propertythat, under special conditions, the force between quarks isvirtually independent of the distance between them).

The range of a force is intimately related to the massof the virtual particle whose exchange gives rise to the

force. Let us consider the interaction of particles A andB through exchange of virtual particle P . For the sake ofdefiniteness, suppose that particle P is emitted by A andlater absorbed by B, with momentum transmitted from Ato B in the process.

The emission of particle P appears to violate energyconservation, for P did not exist until it was emitted. Thisprocess is permitted, however, by the time-energy uncer-tainty principle of quantum theory: energy conservationmay be temporarily violated by an amount E for a du-ration t , provided that E × t ≈ h-- (where h-- denotesPlanck’s constant divided by 2π ). If particle P has a rest-mass m p, then E ≥ m pc2, so the duration is limited tot ≤ h-- /m pc2. The farthest that particle P can travel dur-ing this time is ct ≤ h-- /m pc. This limits the range of theresulting force to a distance of roughly h-- /m pc (called theCompton wavelength of particle P).

The preceding discussion is somewhat heuristic, butis essentially correct. If a particle being exchanged hasm > 0, then the resulting force drops rapidly to zero fordistances exceeding its Compton wavelength; conversely,long-range forces (i.e., electromagnetism and gravitation)can only result from exchange of “massless” particles.(Photons and gravitons are said to be massless, meaningzero rest-mass. Calling them “massless” is a linguisticconvention, because such particles always travel at thespeed of light and hence defy any measurement of masswhen at rest. They may have arbitrarily little energy, how-ever, and they behave as theory would lead one to expectof a particle with zero rest-mass.)

There are other distinguishing features of the four typesof force, such as the characteristic strengths with whichthey act, the types of particles on which they act, and theforce direction. We shall not pursue these details furtherhere, but simply remark that two (or more) types of forcewould be regarded as unified if a single equation (or setof equations) were found, based on some unifying prin-ciple, that described all the forces in question. Electricityand magnetism were once regarded as separate types offorce, but were united into a single theory by James ClerkMaxwell in 1864 through his equations which show theirinterdependence. From the viewpoint of QFT, electromag-netism is a single theory because both electrical and mag-netic forces result from the exchange of photons; forcesare currently identified in terms of the virtual quanta orparticles that give rise to them.

B. Electroweak Unificationvia the Higgs Mechanism

When seeking to unify forces that were previously re-garded as being of different types, it is natural to ex-ploit any features that the forces may have in common.Einstein hoped to unify electromagnetism and gravitation,

Page 44: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

806 Cosmic Inflation

in considerable measure because both forces shared thestriking feature of being long-range. As we have noted,Einstein’s efforts were not successful. In the 1960s, how-ever, substantial thought was given to the possibility ofunifying the electromagnetic and weak interactions.

The possibility of unifying the weak and electromag-netic interactions was suggested by the fact that both inter-actions affected all electrically charged particles (albeit indistinctly different ways), and by the suspicion that weakinteractions might result from exchange of a hypotheti-cal set of particles called “weak vector bosons.” (Vectorbosons are particles with intrinsic angular momentum, or“spin,” equal to h-- , like the photon.)

The electromagnetic and weak forces would be re-garded as unified if an intimate relationship could be foundbetween the photon and weak vector bosons, and betweenthe strengths of the respective interactions. (More specifi-cally, in regrettably technical terms, it was hoped that thephoton and an uncharged weak vector boson might trans-form into one another under some symmetry group ofthe fundamental interactions, and that coupling strengthsmight be governed by eigenvalues of the group genera-tors. The symmetry group would then constitute a unifyingprinciple.)

The effort to unify these interactions faced at least twoobstacles, however. The most obvious was that photons aremassless, whereas the weak interaction was known to beof very short range. Thus weak vector bosons were necess-rily very massive, which seemed to preclude the intimatetype of relationship with photons necessary for a unifiedtheory. A second potential difficulty was that while exper-iments had suggested (without proving) the existence ofelectrically charged weak bosons, now known as the W +

and W −, unification required a third, electrically neutralweak boson as a candidate for (hypothetical) transforma-tions with the neutral photon. There were, however, noexperimental indications that such a neutral weak boson(the Z ) existed.

Building on the work of several previous investiga-tors, Steven Weinberg (in 1967) and Abdus Salam (in-dependently in 1968) proposed an ingenious unificationof the electromagnetic and weak interactions. The fea-ture of principle interest to us here was the remarkablemethod for unifying the massless photon with massiveweak bosons. The obstacle of different masses was over-come by positing that, at the most fundamental level, therewas no mass difference. Like the photon, the weak vec-tor bosons were assumed to have no intrinsic mass. How,then, could the theory account for the short range of theweak force and corresponding large effective masses ofthe weak bosons? The theory did so by utilizing an inge-nious possibility discovered by Peter Higgs in 1964, calledthe Higgs mechanism for (effective) mass generation.

A QFT is defined by the choice of fields appearing in itand by the Lagrangian chosen for it. The Lagrangian is afunction of the fields (and their derivatives) that specifiesthe kinetic and potential energies of the fields, and ofthe particles associated with them (the field quanta). Inmore general terms, the Lagrangian specifies all intrinsicmasses, and all interactions among the fields and particlesof the theory. The Langrangian contains two kinds ofphysical constants that are basic parameters of the theory:the masses of those fields that represent massive particles,and coupling constants that determine the strengths ofinteractions.

A theory wherein the Higgs mechanism is operative hasa scalar field variable (r, t) (a Higgs field). This ap-pears in the Lagrangian in places where a factor of masswould appear for certain particles, if those particles hadintrinsic mass. The potential energy of the Higgs field isthen defined in such a way that its state of minimum en-ergy has a nonzero, constant value 0. This state of lowestenergy corresponds to the vacuum. Particles with no in-trinsic mass thereby acquire effective masses proportionalto 0. These effective masses also contain numerical fac-tors that depend on the type of particle, giving rise to thedifferences in mass that are observed.

The Higgs mechanism seems like a very artificial way toexplain so basic a property as mass. When first proposed,it was regarded by many as a mathematical curiosity withno physical significance. The Higgs mechanism plays acentral role in the Weinberg–Salam electroweak theory,however, for the weak bosons are assumed to acquire theirlarge effective masses in precisely this way. Furthermore,the Weinberg–Salam theory has been experimentally con-firmed in striking detail (including the existence of theneutral weak Z boson, with just the mass and other prop-erties predicted by the theory). Verification of the theoryled to Weinberg and Salam sharing the 1979 Nobel Prizein Physics with Sheldon Lee Glashow, who had laid someof the ground-work for the theory.

Only a single Higgs field has been mentioned thus far,but the Higgs mechanism actually requires at least two:one that takes on a nonzero value in the vacuum, plusone for each vector boson that acquires an effective massthrough the Higgs mechanism. [There are four real (ortwo complex) Higgs fields in the Weinberg–Salam theory,since the W +, W −, and Z weak bosons all acquire effectivemasses in this way.] Furthermore, there are many differentstates that share the same minimum energy, and hencemany possible vacuum states for the theory.

As a simple example, consider a theory with two Higgsfields 1 and 2 and a potential energy–density functionV given by

V (1, 2) = b(2

1 + 22 − m2

)2(85)

Page 45: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 807

where b and m are positive constants. (We shall seethat m corresponds to the 0 mentioned previously.)The minimum value of V is zero, which occurs when2

1 + 22 = m2. This is the equation for a circle of radius

m, in a plane where the two axes measure 1 and 2, re-spectively. Every point on this circle corresponds to zeroenergy, and could serve as the vacuum.

A trigonometric identity states that cos2 ψ + sin2 ψ ≡1, for arbitrary angle ψ . The circle of zero energy in the1, 2, plane is therefore described by

1 = m cosψ(86)

2 = m sinψ

where ψ ranges from 0 to 360. Any particular angle ψ0

corresponds to a particular choice of vacuum state. It isuseful to consider a pair of Higgs fields ′

1 and ′2 defined

by

′1 ≡ 1 sinψ0 − 2 cosψ0

(87)′

2 ≡ 1 cosψ0 + 2 sinψ0

Comparing Eq. (86) for ψ = ψ0 with Eq. (87), we see that′

1 = 0 and ′2 = m. Hence one may say that “the vacuum”

is characterized by ′1 = 0 and ′

2 = m, so long as onebears in mind that some particular vacuum correspond-ing to a particular value of ψ has been selected by nature,from a continuous set of possible vacuums. Some intrinsi-cally massless particles acquire effective masses from thefact that ′

2 = m under ordinary conditions, as discussedpreviously.

Theories containing Higgs fields are always constructedso that the Higgs fields transform among one another underthe action of a symmetry group, i.e., a group of transfor-mations that leaves the Langrangian unchanged. [For thesimple example just described, with V given by Eq. (85),the symmetry group is that of rotations in a plane with per-pendicular axes corresponding to 1 and 2: the group iscalled U (1).]

Higgs fields are always accompanied in a theory by(fundamentally) massless vector bosons that transformamong one another under the same symmetry group. Theprocess whereby nature chooses some particular vacuum(corresponding in our simple example to a particular valuefor ψ) is essentially random, and is called spontaneoussymmetry breaking. Once the choice has been made, theoriginal symmetry among all the Higgs fields is broken bythe contingent fact that the actual vacuum realized in na-ture has a nonzero value for some particular Higgs field butnot for the others. At the same time, the symmetry amongthe massless vector bosons is broken by the generation ofeffective mass for one (or more) of them, which in turnshortens the range of the force resulting from exchange ofone (or more) of the vector bosons.

Theories involving the Higgs mechanism are said topossess a hidden symmetry. Such theories contain a sym-metry that is not apparent in laboratories, where nature hasalready selected some particular vacuum, thereby con-cealing the existence of other possible vacuums. Further-more, every possible vacuum endows a subset of intrin-sically massless particles with effective masses, therebyobscuring their fundamental similarity to other particlesthat remain massless. Exchanges of the effectively mas-sive particles result in forces of limited range (∼h-- /mc),whereas any particles that remain massless generate long-range forces (electromagnetism being the only one, ex-cept for gravity). Extraordinary imagination was requiredto realize that electromagnetism and the weak inter-action might be different aspects of a single underly-ing theory. Actual construction of the resulting elec-troweak theory may reasonably be regarded as the work ofgeniuses.

It seems possible, even likely, that in other regionsof space–time far removed from our own, the choice ofvacuum has been made differently. In particular, thereis no reason for the selection to have been made in thesame way in regions of space–time that were outside eachothers’ horizons at the time when selection was made.The fundamental laws of physics are believed to be thesame everywhere, but the spontaneous breaking of sym-metry by random selections of the vacuum state shouldhave occurred differently in different “domains” of theuniverse.

We have remarked that the state of minimum energy hasa nonzero value for one (or more) of the Higgs fields. If weadopt the convention that this state (the vacuum) has zeroenergy, it follows that any region of space where all Higgsfields vanish must have a mass-energy density ρH > 0.

Our example with the Higgs potential described byEq. (85) would have

ρH = bm4/c2 (88)

in any region where 1 = 2 = 0. Work would be requiredto expand any such region, for doing so would increasethe net energy. Such a region is therefore characterized bynegative pressure PH that, by the work-energy theorem,is related to ρH by PH = −c2ρH [as may be seen fromEq. (25) for constant ρ and P].

Of all the fields known or contemplated by particle the-orists, only Higgs fields have positive energy when thefields themselves vanish. This feature of Higgs fields runsstrongly against intuition, but the empirical successes ofthe Weinberg-Salam theory provide strong evidence forHiggs fields. If all Higgs fields were zero throughout thecosmos, the resulting ρH and negative PH would perfectlymimic a positive cosmological constant, causing an expo-nential rate of growth for the universe. The original theory

Page 46: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

808 Cosmic Inflation

of cosmic inflation is based on the premise that all Higgsfields were zero at a very early time. The rate of growthwould have been enormous, increasing the size of the uni-verse by many powers of ten during the first second ofcosmic history. This extraordinary period of cosmic infla-tion ended when the Higgs fields assumed their presentvalues, corresponding to our vacuum.

The Higgs fields relevant to inflation arise in grand uni-fied theories (GUTs) of elementary particles and their in-teractions. The possibility of inflation was first discoveredwhile analyzing a GUT, and many of the subsequent mod-els for inflation are based on GUTs. Inflation also occursin supersymmetric and superstring theories of elementaryparticles, where Higgs fields have analogs called inflatonfields (inflation-causing fields). The Higgs fields of GUTshave most of the essential features, however, and we shalldiscuss them first.

C. Grand Unified Theories (GUTs)

As evidence was mounting for the Weinberg–Salam the-ory in the early 1970s, a consensus was also emergingthat the strong force between quarks is a result of the ex-change of eight colored gluons, as described by a theorycalled quantum chromodynamics. Like the photon and theweak bosons, gluons are vector bosons, which makes fea-sible a unification of all three forces. [Gravitons are notvector bosons (they have a spin of twice h-- ), so a unifica-tion including gravity would require a different unifyingprinciple.]

A striking feature of the standard model is that thestrengths of all three forces are predicted to grow closer asthe energy increases, becoming equal at a collision energyof ∼2 × 1016 GeV (somewhat higher than reported earlier,because of recent data). This merging of all three strengthsat a single energy suggested very strongly that the forcesare unified at higher energies, by some symmetry that isspontaneously broken at ∼2 × 1016 GeV. A symmetry ofthis kind would also explain why electrons and protonshave precisely equal electrical charges (in magnitude). Ex-periments have established that any difference is less thanone part in 1021, a fact that has no explanation unless elec-trons are related by some symmetry to the quarks of whichprotons are made.

Guided by these considerations and by insights gainedfrom the earlier electroweak unification, Howard M.Georgi and Sheldon Glashow proposed the first GUT ofall three forces in 1974 [a group called SU(5) was assumedto describe the underlying symmetry]. The Higgs mech-anism was adduced to generate effective masses for allparticles except photons in the theory (gravitons were notincluded). In the symmetric phase (all Higgs fields zero),electrons, neutrinos, and quarks would be indistinguish-

able. The theory also predicted an (effectively) supermas-sive X boson, whose existence was required to achieve theunification.

Following the lead of Georgi and Glashow, numerousother investigators have proposed competing GUTs, dif-fering in the underlying symmetry group and/or other de-tails. There are now many such theories that reproduce thesuccessful predictions of electroweak theory and quantumchromodynamics at the energies currently accessible inlaboratories, while making different predictions for veryhigh energies where we have little or no data of any kind.Thus we do not know which (or indeed whether any) ofthese GUTs is correct. They typically share three featuresof significance for cosmology, however, which we shallenumerate and regard as general features of GUTs.

Grand unified theories involve the Higgs mechanism(with at least 24 Higgs fields), and contain numerous freeparameters. The energy where symmetry is broken maybe set at the desired value (∼2 × 1016 GeV) by assigningappropriate values to parameters of the theory. This energydetermines the gross features of the Higgs potential, andalso the approximate value of ρH (the mass-energy densitywhen all Higgs fields are zero). The precise value of ρH issomewhat different for different GUTs, and also dependson some parameters whose values can only be estimated.Numerical predictions for ρH and the resulting inflationtherefore contain substantial uncertainties. We shall see,however, that the observable predictions of inflation arescarcely affected by these uncertainties.

The density ρH is given roughly (within a few powersof ten) by

ρH ∼ (1016 GeV)4/(h--3c5) ∼ 1080 g/cm3 (89)

This density is truly enormous, ∼1047 times as great as ifour entire sun were compressed into a cubic centimeter.In fact ρH is comparable to the density that would arise ifall the matter in the observable universe were compressedinto a volume the size of a hydrogen atom. As indicatedby Eq. (89), this enormous value for ρH results from thevery high energy at which symmetry appears to be broken.

If all Higgs fields were zero throughout the cosmos, thenEqs. (42) and (89) would imply an enormous, effectivecosmological constant

H ∼ 1057 m−2 (90)

Any universe whose expansion is governed by a positive will grow at an exponential rate. From Eq. (31), we seethat the doubling time corresponding to H is

td ∼ 10−37 sec (91)

At this rate of growth, only 10−35 sec would be required for100 doublings of the cosmic scale factor (which governs

Page 47: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 809

distances between comoving points, i.e., points moving inunison with the overall expansion). Note that 2100∼1030.If inflation actually occurred, it is quite plausible that ouruniverse grew by a factor much larger than 1030, during avery brief time.

We have remarked that the value of ρH is uncertain bya few powers of ten. If ρH were only 10−10 times as largeas indicated by Eq. (89), then td would be 105 times largerthan given by Eq. (91). Even with td ∼ 10−32 sec, how-ever, 100 doublings would occur in 10−30 sec. Whether10−30 sec were required, or only 10−35 sec, makes no ob-servable difference.

A second important feature of GUTs concerns baryonnumber. Baryon number is a concept used to distinguishcertain types of particles from their antiparticles. Protonsand neutrons are assigned a baryon number of +1, incontrast with −1 for antiprotons and antineutrons. Grandunified theories do not conserve baryon number, whichmeans that matter and antimatter need not be created (ordestroyed) in equal amounts.

Nonconservation of baryon number is a high-energyprocess in GUTs, because it results from emission or ab-sorption of X , bosons, whose mc2 energy is ∼1016 GeV.This is far beyond the range of laboratories, so it cannotbe studied directly. When the universe was very young,however, it was hot enough for X bosons to have beenabundant. A universe created with equal amounts of mat-ter and antimatter could have evolved at a very early timeinto one dominated by matter, such as our own. Such aprocess is called baryogenesis.

In 1964, Steven Weinberg presciently remarked thatthere was no apparent reason for baryon number to beexactly conserved. He had recently shown that the QFTof any long-range force is inconsistent, unless the forceis generated by a strictly conserved quantity. Electromag-netism is just such a force, which “explains” why elec-tric charge is precisely conserved. Gravitation is another(the only other) such force, which “explains” why energyis conserved (more precisely, why ∂µTµν = 0). There isno long-range force generated by a baryon number, how-ever, so it would be puzzling if it were exactly conserved.Weinberg discussed the possibility of baryogenesis in theearly universe in general terms, but was limited by theabsence of any detailed theory for such processes. An-drei Sakharov published a crude model for baryogenesisin 1967, but no truly adequate theory was available beforeGUTs.

In 1978, Motohiko Yoshimura published the first de-tailed model for baryogenesis based on a GUT, and manyothers soon followed. It was typically assumed that theuniverse contained equal amounts of matter and antimat-ter prior to the spontaneous breaking of symmetry, and thatbaryogenesis occurred as thermal energies were dropping

below the critical value of ∼1016 GeV. Such calculationsseem capable of explaining how matter became dominantover antimatter, and also why there are ∼109 times asmany photons in the cosmic microwave background asthere are baryons in the universe.

The apparent success (within substantial uncertainties)of such calculations may be regarded as indirect evidencefor GUTs (or theories with similar features, such as su-persymmetric and superstring theories). More direct andpotentially attainable evidence would consist of observ-ing proton decay, which is predicted by various GUTsto occur with a half-life on the order of 1029–1035 yr(depending on the particular GUT and on the values ofparameters that can only be estimated). No proton de-cay has yet been observed with certainty, however, andcareful experiments have shown the proton half-life to ex-ceed 1032 yr (which rules out some GUTs that have beenconsidered).

A third feature of GUTs results from the fact that thereis no single state of lowest energy but rather a multiplic-ity of possible vacuum states. The choice of vacuum wasmade by nature as the universe cooled below ∼1029 K,when thermal energies dropped below ∼1016 GeV at acosmic age of ∼10−39 sec. Since different parts of theuniverse were outside each others’ horizons and had notyet had time to interact in any way, the choice of vac-uum should have occurred differently in domains thatwere not yet in causal contact. These mismatches of thevacuum in different domains would correspond to sur-facelike defects called domain walls and surprisingly, aswas shown in 1974 by G. t’ Hooft and (independently) byA. M. Polyakov, also to pointlike defects that are magneticmonopoles.

If the universe has evolved in accord with the stan-dard big bang model (i.e., without inflation), then suchdomain walls and magnetic monopoles should be suffi-ciently common to be dramatically apparent. None haveyet been observed with certainty, however, which may ei-ther be regarded as evidence against GUTs or in favor ofcosmic inflation: inflation would have spread them so farapart as to render them quite rare today.

D. Phase Transitions

The notion of phase is most familiar in its application tothe three states of ordinary matter: solid, liquid, and gas.As heat is added to a solid, the temperature gradually risesto the melting point. Addition of further energy equal tothe latent heat of fusion results in a phase transition to theliquid state, still at the melting point. The liquid may thenbe heated to the boiling point, where addition of the latentheat of vaporization results in a transition to the gaseous

Page 48: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

810 Cosmic Inflation

phase. The potential energy resulting from intermolecularforces is a minimum in the solid state, distinctly greater inthe liquid state, and greater still for a gas. The latent heatsof fusion and vaporization correspond to the changes inpotential energy that occur as the substance undergoestransition from one phase to another.

In QFT, the state of minimum energy is the vacuum,with no particles present. If the theory contains Higgsfields, then one of them is nonzero in the vacuum. Thevacuum (if perfect) has a temperature of absolute zero,for any greater temperature would entail the presence ofa thermal (blackbody) distribution of photons. A temper-ature of absolute zero is consistent, of course, with thepresence of any number of massive particles.

If heat is added to some region of space it will resultin thermal motion for any existing particles, and also ina thermal distribution of photons. If the temperature israised to a sufficiently high level, then particle–antiparticlepairs will be created as a result of thermal collisions: themc2 energies of the new particles will be drawn from thekinetic energies of preexisting particles. The Higgs fieldthat was a nonzero constant in the vacuum would retainthat value over a wide range of temperatures, however,and this condition may be regarded as defining a phase: itimplies that certain vector bosons have effective masses,so that the underlying symmetry of the theory is brokenin a particular way.

If heat continues to be added, then the temperature andthermal energy will eventually rise to a point where theHiggs fields no longer remain in (or near) their state oflowest energy: they will undergo random thermal fluctu-ations about a central value of zero. When this happens,the underlying symmetry of the theory is restored: thosevector bosons that previously had effective masses willbehave like the massless particles they fundamentally are.This clearly constitutes a different phase of the theory, andwe conclude that nature undergoes a phase transition asthe temperature is raised above a critical level sufficientto render all Higgs fields zero.

Roughly speaking, this symmetry-restoring phase tran-sition occurs at the temperature where the mass-energyper unit volume of blackbody radiation equals the ρH cor-responding to the vanishing of all Higgs fields; an equipar-tition of energy effects the phase transition at this temper-ature when latent heat equal to ρHc2 is added. For the ρH

given by Eq. (89), the phase transition occurs at a tempera-ture near 1029 K. Like all phase transitions, it can proceedin either direction, depending on whether the temperatureis rising or falling as it passes through the critical value.

The relation between phase transitions and symmetrybreaking may be elucidated by noting that ordinary solidsbreak a symmetry of nature that is restored when the solidmelts. The laws of physics have rotational symmetry, as do

liquids: there is no preferred direction in the arrangementof molecules in a liquid. When a liquid is frozen, however,the rotational symmetry is broken because solids are crys-tals. The axes of a crystal may be aligned in any direction;but once a crystal has actually formed, its particular axessingle out particular directions in space.

It is also worth noting that rapid freezing of a liquid typ-ically results in a moderately disordered situation wheredifferent regions have their crystal axes aligned in dif-ferent directions: domains are formed wherein the rota-tional symmetry is broken in different ways. We may sim-ilarly expect that domains of broken-symmetry phase wereformed as the universe cooled below 1029 K, with the sym-metry broken in different ways in neighboring domains. Inthis case the horizon distance would be expected to governthe domain size.

There is one further aspect of phase transitions that is ofcentral importance for cosmic inflation, namely, the phe-nomenon of supercooling. For a familiar example, con-sider water, which normally freezes at 0C. If a sampleof very pure (distilled) water is cooled in an environmentfree of vibrations, however, it is possible to cool the sam-ple more than 20C below the normal freezing point whilestill in the liquid phase. Liquid water below 0C is obvi-ously in an unstable state: any vibration is likely to triggerthe onset of crystal formation (i.e., freezing), with conse-quent release of the latent heat of fusion from whateverfraction of the sample has frozen.

When supercooling of water is finally terminated bythe formation of ice crystals, reheating occurs: releaseof the crystals’ heat of fusion warms the partially solid,partially liquid sample (to 0C, at which temperature theremaining liquid can be frozen by removal of its heat offusion). Such reheating of a sample by released heat offusion is to be expected whenever supercooling is endedby a phase transition (though not all systems will have theirtemperature rise as high as the normal freezing point).

Some degree of supercooling should be possible in anysystem that undergoes phase transitions, including a GUTwith Higgs fields. A conjecture that supercooling (andreheating) occurred with respect to a GUT phase transitionin the early universe plays a crucial role in theories ofcosmic inflation, which we are now prepared to describein detail.

VIII. SCENARIOS FORCOSMIC INFLATION

A. The Original Model

We now assume that some GUTs with the features de-scribed earlier (in Section VII.C) provide a correct de-scription of particle physics, and consider the potential

Page 49: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 811

consequences for the evolution of the cosmos. Accordingto the standard big bang model, the temperature exceeded1029 K for times earlier than 10−39 sec. Hence the op-erative GUT would initially have been in its symmetricphase wherein all Higgs fields were undergoing thermalfluctuations about a common value of zero.

As the temperature dropped below the critical value of∼1029 K, there are two distinct possibilities (with gra-dations between). One possibility is that the GUT phasetransition occurred promptly, in which case the expansion,cooling, and decrease in mass-energy density of the uni-verse would have progressed in virtually the same way asin the standard model. The GUT symmetry would havebeen broken in different ways in domains outside eachothers’ horizons, and one can estimate how many domainwalls and magnetic monopoles would have resulted. Astandard solution to the field equations (Section VI.A)tells one how much the universe has expanded since then,so it is a straightforward matter to estimate their presentabundances.

No domain walls have yet been observed, nor magneticmonopoles with any confidence (there have been a fewcandidates, but too few for other explanations to be ruledout). The abundances predicted by the preceding line ofreasoning vastly exceed the levels consistent with observa-tions. For example, work by John Preskill and others indi-cated in 1979 that monopoles should be ∼1% as commonas protons. Their magnetic properties would make themeasy to detect, but no search for them has succeeded. Fur-thermore, monopoles are predicted to be ∼1016 times asmassive as protons. If they were ∼1% as common as pro-tons, their average density in the universe would be ∼1012

times the critical density. In reality, the observed densityof monopoles is vastly less than predicted by GUTs witha prompt phase transition. This was called the “monopoleproblem.”

The monopole problem was discovered by elemen-tary particle theorists soon after the first GUT was in-troduced, and appeared for several years to constitute ev-idence against GUTs. In 1980, however, Alan H. Guthproposed a remarkable scenario based on GUTs that heldpromise for resolving the horizon, flatness, monopole, andsmoothness problems, and perhaps others as well. Guthconjectured that as the universe expanded and cooled be-low the GUT critical temperature, supercooling occurredwith respect to the GUT phase transition. All Higgs fieldsretained values near zero for a period of time after it be-came thermodynamically favorable for one (or a linearcombination) of them to assume a nonzero value in thesymmetry-breaking phase.

Supercooling would have commenced when the uni-verse was ∼10−39 sec old. The mass-energy density ofall forms of matter and radiation other than Higgs fields

would have continued to decrease as the universe ex-panded, while that of the Higgs fields retained the constantvalue ρH corresponding to zero values for all Higgs fields.Within ∼10−38 sec, ρH and the corresponding negativepressure PH = −ρHc2 would have dominated the stress-energy tensor, which would then have mimicked a large,positive cosmological constant. The cosmic scale factorwould have experienced a period of exponential growth,with a doubling time of ∼10−37 sec. Guth proposed thatall this had happened, and called the period of exponen-tial growth the inflationary era. (The numbers estimatedby Guth were somewhat different in 1980. Current valuesare used here, with no change in the observable results.)

The Higgs fields were in a metastable or unstable stateduring the inflationary era, so the era was destined to end.Knowledge of its duration is obviously important: Howmany doublings of the scale factor occurred during thisanomalous period of cosmic growth? The answer to thisquestion depends on the precise form of the Higgs poten-tial, which varies from one GUT to another and unfor-tunately depends, in any GUT, on the values of severalparameters that can only be estimated.

Guth originally assumed that the Higgs potential hadthe form suggested by Fig. 4a. For such a potential, azero Higgs field corresponds to a local minimum of theenergy density, and the corresponding state is called a falsevacuum. (The true vacuum is of course the state whereinthe Higgs potential is an absolute minimum; that is thebroken-symmetry phase where one of the Higgs fields hasa nonzero constant value.)

FIGURE 4 Higgs potential energy density for (a) original infla-tionary model, and (b) new inflationary model. For a theory withtwo Higgs fields, imagine each curve to be rotated about its ver-tical axis of symmetry, forming a two-dimensional surface. A truevacuum would then correspond to any point on a circle in thehorizontal plane.

Page 50: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

812 Cosmic Inflation

In the classical (i.e., nonquantum) form of the theory, afalse vacuum would persist forever. The physics is anal-ogous to that of a ball resting in the crater of a volcano.It would be energetically favorable for the ball to be at alower elevation, hence outside the volcano at its base. Inclassical physics there is no way for the ball to rise sponta-neously over the rim and reach the base, so the ball wouldremain in the crater for all time.

In quantum physics, the story changes: a ball in thecrater is represented by a wave packet, which inevitablydevelops a small extension or “tail” outside the crater.This feature of quantum mechanics implies a finite prob-ability per unit time that the ball will “tunnel through” thepotential energy barrier posed by the crater’s rim and ma-terialize outside at a lower elevation, with kinetic energyequal to the decrease in potential energy. Such quantum-mechanical tunneling through energy barriers explains thedecay of certain species of heavy nuclei via the sponta-neous emission of alpha particles, and is a well-establishedphenomenon.

If there exists a region of lower potential energy, foreither alpha particles or the value of a Higgs field, then“decay” of the initial state via tunneling to that regionis bound to occur sooner or later. The timing of individ-ual decays is governed by rules of probability, in accor-dance with the statistical nature of quantum-mechanicalpredictions.

The decay of the false vacuum was first studied and de-scribed by Sidney R. Coleman. Initially all Higgs fieldsare zero, but one or another of them tunnels through theenergy barrier in a small region of space and acquires thenonzero value corresponding to the true vacuum, therebycreating a “bubble” of true vacuum in the broken symme-try phase.

Once a bubble of true vacuum has formed, it grows ata speed that rapidly approaches the speed of light, con-verting the surrounding region of false vacuum into truevacuum as the bubble expands. As the phase transition pro-ceeds, energy conservation implies that the mass-energydensity ρH of the preexisting false vacuum is convertedinto an equal density of other forms of mass-energy, whichGuth hoped would become hot matter and radiation thatevolved into our present universe.

A low rate of bubble formation corresponds to a highprobability that any particular region of false vacuum willundergo many doublings in size before decay of the falsevacuum ends inflation for that region. Guth noted that ifthe rate of bubble formation were low enough for inflationto have occurred by a factor of ∼1030 or more, then the ob-servable portion would have emerged from small enougha region for all parts to have been in causal contact be-fore inflation. Such causal contact would have resulted

in a smoothing of any earlier irregularities via processesleading toward thermal equilibrium, thereby explaining(one hopes) the homogeneity and isotropy now observed:the horizon problem would be solved (and perhaps thesmoothness problem as well).

It was furthermore noted by Guth that comparablegrowth of the cosmic scale factor could solve the flatnessproblem. Recall the definition of in Section VI.B: ≡ρ/ρc, where ρ and ρc denote the actual and critical den-sities of mass-energy, respectively. If the scale factor hadgrown by a factor of 1030 or more during the period ofinflation, any difference between and unity prior to in-flation would have been reduced by a factor of 1060 or moreby its end. [Equation (48) holds at all times]. R. Dicke andP. J. E. Peebles have shown that could not have differedfrom unity by more than one part in 1015 when the universewas one second old, if the present density of matter lies be-tween 20 and 200% of the critical density (as it does, withρM ≈ 0.3ρc). Perhaps of even greater interest, they showedthat if had differed from unity by more that one part in1014 at one second, then stars would never have formed.(For > 1, the universe would have collapsed too soon.For < 1, matter would have been diluted too soon by theexpansion.) This need for extraordinary “fine-tuning” atearly times is the “flatness problem.” Inflation by a factor≥1030 could explain why was so near unity at 1 sec.

If such inflation has occurred, the cosmic scale factor a0

would be 1030 (or more) times greater than in the standardmodel. Even if space is curved, the curvature only becomessignificant over distances comparable to the value of a0

(as explained in Section III.C). Inflation could render a0

so much larger than the radius of the observable universethat no curvature would be apparent.

The rate at which bubbles form depends on the partic-ular GUT and, within any GUT, on parameters includingsome whose values can only be estimated. Guth assumedthat inflation is the reason why our universe is nearlyflat, hence that bubble formation was slow enough for∼100 or more doublings to occur (2100∼1030). With adoubling time td ∼ 10−37 sec, this would have requiredonly ∼10−35 sec.

It would seem coincidental if 100 doublings had oc-curred but not, for example, 105 or more, which wouldhave reduced | − 1| by another factor of 210 ≈ 1000 (ormore). Standard inflation predicts that considerably moredoublings have occurred than are required to explain theobvious features of our universe (e.g., the presence ofstars and us). Standard inflation therefore predicts enoughdoublings to render the observed universe indistinguish-able from flat. Even 107 doublings would require only∼10−30 sec, and this would be inflation by a factor of∼103,000,000.

Page 51: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 813

Guth sought to understand whether a region now thesize of the observable universe was more likely to haveevolved from few or many bubbles, and he encountereda major shortcoming of his model. The role played bybubbles in the phase transition led to a bizarre distributionof mass-energy, quite unlike what astronomers observe.The problem may be described as follows.

We have remarked that the ρH of the false vacuum isconverted into an equal amount of mass-energy in otherforms as the phase transition proceeds. It can be shown,however, that initially the mass-energy is concentrated inthe expanding walls of the bubbles of true vacuum, inthe form of moving “kinks” in the values of the Higgsfields. Conversion of this energy into a homogeneous gasof particles and radiation would require a large number ofcollisions between bubbles of comparable size. Bubblesare forming at a constant rate, however, and grow at nearlythe speed of light. If enough time has passed for there tobe many bubbles, there will inevitably be a wide rangein the ages and therefore sizes of the bubbles: collisionsamong them will convert only a small fraction of the en-ergy in their walls into particles and radiation. In fact itcan be shown that the bubbles would form finite clustersdominated by a single largest bubble, whose energy wouldremain concentrated in its walls.

One could explain the homogeneity of the observableuniverse if it were contained within a single large bubble.There remains, however, the problem that the mass-energywould be concentrated in the bubble’s walls: the interior ofa single large bubble would be too empty (and too cold atearly times) to resemble our universe. In the best spirit ofscience, Guth acknowledged these problems in his initialpublication, and closed it with an invitation to readers tofind a plausible modification of his inflationary model, onethat preserved its strengths while overcoming its initialfailure to produce a homogeneous, hot gas of particlesand radiation that might have evolved into our presentuniverse.

We note in passing that Andre D. Linde, working at theLebedev Physical Institute in Moscow, had discovered theprincipal ingredients of inflation prior to Guth. Further-more, Linde and Gennady Chibisov realized in the late1970s that supercooling might occur at the critical tem-perature, resulting in exponential growth. To their misfor-tune, however, they were not aware of the horizon andflatness problems of standard cosmology. When they re-alized that bubble collisions in their model would lead togross inhomogeneities in the universe, they saw no reasonto publicize their work.

Alexei Starobinsky, of the Landau Institute for Theo-retical Physics in Moscow, proposed a form of inflationin a talk at Cambridge University in 1979. His focus was

on avoidance of an initial singularity at time zero, how-ever, and no mention was made of the horizon, flatness,or monopole problems. Although well received, his talkdid not generate enough excitement for word of it to reachthe United States. By the time it was published in 1980,Guth had already given many lectures emphasizing howinflation might solve problems of standard cosmology. Itwas this emphasis that made the potential significanceof Starobinsky’s and Linde’s work apparent to a wideaudience.

B. The New Inflationary Model

A major improvement over Guth’s original model wasdeveloped in 1981 by A. D. Linde and, independently, byAndreas Albrecht with Paul J. Steinhardt. Success hingeson obtaining a “graceful exit” from the inflationary era,i.e., an end to inflation that preserves homogeneity andisotropy and furthermore results in the efficient conversionof ρH into the mass-energy of hot particles and radiation.

The new inflationary model achieves a graceful exit bypostulating a modified form for the potential energy func-tion of Higgs fields: the form depicted in Fig. 4b, whichwas first studied by Sidney Coleman in collaboration withErick J. Weinberg. A Coleman–Weinberg potential hasno energy barrier separating the false vacuum from truevacuum; instead of resembling the crater of a volcano,the false vacuum corresponds to the center of a ratherflat plateau. (Though not a local minimum of energy, theplateau’s center is nevertheless called a false vacuum forhistorical reasons.)

Supercooling of a Coleman–Weinberg false vacuum re-sults in the formation of contiguous, causually connecteddomains throughout each of which the Higgs fields gradu-ally evolve in a simultaneous, uniform way toward a singlephase of true vacuum. The initial sizes of these domains arecomparable to the horizon distance at the onset of super-cooling, ∼10−28 cm. The mass-energy density of the falsevacuum was that of null Higgs fields, ρH ∼ 1080 g/cm3.Despite this enormous density, the initial mass of the do-main in which our entire universe lies was ∼10−4 g.

Within each domain the evolution of a Higgs field awayfrom its initial value of zero is similar to the horizontal mo-tion of a ball that, given a slight nudge, rolls down the ini-tially flat but gradually increasing slope of such a plateau.The equations describing evolution of the Higgs fieldshave a term corresponding to a frictional drag force for arolling ball; in addition to the initial flatness of the plateau,this “drag term” serves to retard the evolution of a Higgsfield away from zero toward its true-vacuum, broken-symmetry value. The result is called a “slow-rollover”transition to the true vacuum.

Page 52: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

814 Cosmic Inflation

As always, there are different combinations of valuesof Higgs fields that have zero energy and qualify as truevacua; different domains undergo transitions to differenttrue vacua, corresponding to the possibility of a ball rollingoff a plateau in any one of many different directions.

As a Higgs field (or linear combination thereof)“rolls along” the top of the potential energy plateau, themass-energy density remains almost constant despite anychange in the volume of space. The corresponding pres-sure is large and negative, so inflation occurs. The dou-bling time for the cosmic scale factor is again expected tobe ∼10−37 sec. The period of time required for a Higgsfield to reach the edge of the plateau (thereby terminatinginflation) can only be estimated, but again inflation by afactor of 1030, or by very much more, is quite plausible.As in the “old inflationary model” (Guth’s original sce-nario), new “standard” inflation assumed that the actualinflation greatly exceeded the amount required to resolvethe horizon and flatness problems.

When a Higgs field “rolls off the edge” of the energyplateau, it enters a brief period of rapid oscillations aboutits true-vacuum value. In accordance with the field-particleduality of QFT, these rapid oscillations of the field corre-spond to a dense distribution of Higgs particles. The Higgsparticles are intrinsically unstable, and their decay resultsin a rapid conversion of their mass-energy into a widespectrum of less massive particles, antiparticles, and radi-ation. The energy released in this process results in a hightemperature that is less than the GUT critical temperatureof ∼1029 K only by a modest factor ranging between 2and 10, depending on details of the model. This is calledthe “reheating” of the universe.

The reheating temperature is high enough for GUTbaryon-number–violating processes to be common, whichleads to matter becoming dominant over antimatter. Re-heating occurs when the universe is very young, probablyless than 10−30 seconds old, depending on precisely howlong the inflationary era lasts. From this stage onward, theobservable portion evolves in the same way as in the stan-dard big bang model. (A slow-rollover transition is slowrelative to the rate of cooling at that early time, so that su-percooling occurs, and slow relative to the doubling timeof ∼10−37 sec; but it is extremely rapid by any macro-scopic standard.)

The building blocks of the new inflationary model havenow been displayed. As with the old inflationary model,there is no need to assume that any portion of the universewas initially homogeneous and isotropic. It is sufficient toassume that at least one small region was initially expand-ing, with a temperature above the GUT critical tempera-ture. The following chain of events then unfolded.

As long as the temperature exceeded the critical tem-perature of ∼1029 K, all Higgs fields would have under-

gone thermal fluctuations about a common value of zero,and the full symmetry of the GUT would have been mani-fest. The assumed expansion caused cooling, however, andthe temperature fell below 1029 K when the universe was∼10−39 sec old. Supercooling began, and domains formedwith diameters comparable to the horizon distance at thattime, ∼10−28 cm, with initial masses of ∼10−4 g.

As supercooling proceeded, the continuing expansioncaused dilution and further cooling of any preexisting par-ticles and radiation, and the stress-energy tensor becamedominated at a time of roughly 10−37 sec by the virtuallyconstant contribution of Higgs fields in the false vacuumstate. With the dominance of false-vacuum energy andnegative pressure came inflation: the expansion entereda period of exponential growth, with a doubling time of∼10−37 sec. The onset of inflation accelerated the rateof cooling; the temperature quickly dropped to ∼1022 K(the Hawking temperature), where it remained throughoutthe latter part of the inflationary era because of quantumeffects that arise in the context of GTR.

For the sake of definiteness, assume that the domain inwhich we reside inflated by a factor of 1050, which wouldhave occurred in ∼2 × 10−35 sec. With an initial diame-ter of ∼10−28 cm and mass of ∼10−4 g, such a domainwould have spanned ∼1022 cm when inflation ended (asin Fig. 2), with a total mass of ∼10146 g. All but one part in∼10150 of this final mass was produced during inflation, asthe mass-energy of the false vacuum grew in proportion tothe exponentially increasing volume. (Guth has called this“the ultimate free lunch.”) The radius of our presently ob-servable universe would have been ∼60 cm when inflationended (based on the methods of Section VI.A, assumingthat tEQ ≥ tLS).

Inflation ended with a rapid and chaotic conversion ofρH into a hot (T ≥ 1026 K) gas of particles, antiparticles,and radiation. Note that if baryon number were strictlyconserved, inflation would be followed for all time byequal amounts of matter and antimatter (which would haveannihilated each other almost completely, with conversionof their mass-energy into radiation). The observable uni-verse, however, is strongly dominated by matter. Henceinflation requires a GUT (or a theory sharing key featureswith GUTs) not only to explain an accelerating expansion,but also to explain how the symmetry between matter andantimatter was broken: through baryon-number–violatingprocesses at the high temperature following reheating.

C. New, Improved Inflation

In the standard big bang model, thermal fluctuations indensity at the Planck time would gradually have been am-plified, by the concentrating effect of gravity, into presentclumps of matter much larger than anything we see. This

Page 53: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 815

is the smoothness problem, first noted by P. J. E. Peeblesin 1968. When inflation was proposed in 1980, Guth andothers hoped that it would solve this problem. The end ofinflation creates the initial conditions for the ensuing bigbang, so the density perturbations at the end of inflationwere the basic issue. During the Nuffield Workshop onthe Very Early Universe in the summer of 1982, however,it became clear that the simplest GUT, the SU(5) modelof Georgi and Glashow, was unacceptable: it gave rise tovalues for δρ/ρ that were too large by a factor of ∼106.

As other GUTs were analyzed with similar results, aconsensus gradually emerged that no symmetry-breakingHiggs field in a GUT could give rise to acceptable inflation.The size of δρ/ρ at the end of inflation depends criticallyon details of the Higgs potential. The potential of a GUTHiggs field, like that in Figure 4b, would have to be widerand flatter by factors ∼1013 in order for the resulting δρ/ρ

to have given rise to the stars and galaxies that we see.Another problem requiring a much wider and flatter po-

tential became apparent. As the early universe was cool-ing toward the GUT critical temperature, all Higgs fieldswould indeed have undergone thermal fluctuations aboutaverage values of zero, because of cancellations betweenpositive and negative values. For Higgs potentials thatmeet the requirements of GUTs, however, the averagemagnitudes would have been comparable to those in atrue vacuum. The creation of a false vacuum by coolingto the critical temperature would have been extremely un-likely. “Unlikely” is not “impossible,” but plausibility ofthe theory would be undermined by the need for a specialinitial condition.

In contrast, if the potential were 1013 times wider andflatter, cooling below the critical temperature should haveresulted in a false vacuum over some tiny region of space,which might then have inflated into a universe like that inwhich we find ourselves. We are therefore led to considertheories of this kind.

The term inflaton (in-fluh-tonn) refers to any hypothet-ical field that could cause inflation. We are interested, ofcourse, in inflatons that might have produced our universe.A scalar field (having no spatial direction) φ seems bestsuited (perhaps two of them). The underlying particle the-ory must convert false-vacuum energy into a hot soup ofappropriate particles at the end of inflation, in a gracefulexit producing density perturbations of the required kind.Finally, the particle theory must contain a mechanism forbaryogenesis, making matter dominant over antimatter atan early time.

The potential V (φ) could have an extremely broad, al-most flat region surrounding φ = 0, with V (0) > 0. Anattractive alternative was proposed by Andrei Linde, how-ever, in 1983. The inflaton potential of Linde’s chaotic in-flation has the shape of a very wide, upright bowl, with the

true vacuum at its center (φ = 0). Space-time could havebeen in a highly disordered, chaotic state at the Plancktime. The theory only requires that a single, infinitesimalregion had a value for φ far enough up the wall to have pro-duced the desired inflation while “rolling” slowly towardthe center. Like chaotic inflation, most inflaton models as-sume that inflation began very near the Planck time, inpart to explain why the initial microcosm did not collapsebefore inflation began.

A broad range of particle theories can be constructedwith suitable inflaton fields, including supersymmetrictheories, supergravity theories, and superstring theory.Further discussion would take us beyond the range of thisarticle. Improved models employing a wide range of infla-tons share all the successes of the “new inflationary model”of 1982, however, and those predicting a flat universe willbe included when we speak of “standard inflation.”

D. Successes of Inflation

Our universe lies inside a domain with a uniform vac-uum state, corresponding to a particular way in which any(now) hidden symmetry of the underlying particle theorywas broken. Standard inflation predicts that this domaininflated by a factor much greater than 1030, in which caseour universe (almost certainly) lies deeply inside. Nei-ther domain walls nor magnetic monopoles would thenbe found within the observable universe (except for a fewthat might have resulted from extreme thermal fluctuationsafter inflation ended). Standard inflation therefore solvesthe monopole problem. The horizon problem is solved aswell, as illustrated in Fig. 2.

The flatness problem has an interesting history. Wheninflation was proposed in 1980, the matter yet found byastronomers corresponded to a density of matter ρM thatwas ∼10% of the critical density ρc. Astronomers hadby no means completed their search for matter, however.When inflation was proposed, it seemed possible that fu-ture discoveries would bring ρM up to ρc, as required bystandard inflation’s prediction of flatness. With the pas-sage of time, the presence of additional (dark) matter wasindicated by observations, but there has never been evi-dence that ρM is even half of ρc. The currently observedvalue is ρM = (0.31 ± 0.06)ρc, and it seems virtually cer-tain that ρM ≤ ρc/2 (Section V).

By the early 1990s, the apparent discrepancy betweenρM and ρc led to a serious consideration of “open” in-flationary models, wherein ρM < ρc and space has nega-tive curvature. Such models are ingenious and technicallyviable, but require special assumptions that seem ratherartificial.

As evidence mounted that ρM < ρc, however, stan-dard inflation acquired a bizarre feature of its own. Its

Page 54: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

816 Cosmic Inflation

prediction of flatness implied the existence of positive vac-uum energy, in order to bring the total density up to ρc.This would be equivalent to a positive cosmological con-stant (Section IV.F and G), a concept that Einstein aban-doned long ago and few had been tempted to resuscitate.Such vacuum energy would have caused an accelerationof the cosmic expansion, beginning when the universe wasroughly half its present age (Section V.F).

In 1998 two teams of astronomers, one led by SaulPerlmutter and the other by Brian P. Schmidt, reportedan astonishing discovery. They had observed supernovaeof type Ia (SNe Ia) over a wide range of distances outto ∼12 billion light-years (redshift parameter z ≈ 1, seeSection VI.A). Their data strongly indicated that the cos-mic expansion is, indeed, accelerating. This conclusionhas been confirmed by subsequent studies of SNe Ia.Furthermore, the amount of vacuum energy required toexplain the acceleration appears to be just the amountrequired (±10%) to make the universe flat: ρV ≈ 0.7ρc,hence (ρM + ρV ) ≈ ρc (details in Section V.E).

The vacuum energy implied by supernovae data hasbeen strikingly confirmed by studies of the CMB, whichhas been traveling toward us since the time of last scatter-ing: tLS ≈ 300,000 yr. The CMB is extremely uniform, butcontains small regions with deviations from the averagetemperature on the order of δT/T ∼ 10−5. These providea snapshot of temperature perturbations at tLS , which werestrongly correlated with density perturbations.

Prior to tLS , baryonic matter was strongly ionized. Thefree electrons and nuclei were frequently scattered byabundant photons, which pressurized the medium. Den-sity waves (often called acoustic waves) will be excited byperturbations in any pressurized medium, and the speedof such waves is determined (in a known way) by themedium’s properties. At any given time before tLS , therewas a maximum distance that such waves could yet havetraveled, called the acoustic horizon distance.

As the acoustic horizon expanded with the passage oftime, it encompassed a growing number of density per-turbations, which excited standing waves in the medium.The fundamental mode acquired the greatest amplitude,spanning a distance comparable to the acoustic horizondistance. This is called the first acoustic peak. Pertur-bations in the CMB should contain this peak, with anangular diameter reflecting the acoustic horizon distanceat tLS .

The angular diameters of perturbed regions in the CMBdepend not only on the linear dimensions of their sources,but also on the curvature (if any) of space. Sources ofthe CMB are presently a distance rp(zCMB) ≈ 47 × 109 lyfrom us (Section VI.A). If space is curved, angular diam-eters of perturbations in the CMB would be noticeably af-

fected unless the cosmic scale factor a0 satisfied (roughly)a0 ≥ rp(zCMB) (Section III.B). From Eqs. (48) and (69), wesee that this would require |0 − 1| ≤ 0.11.

BOOMERanG (Balloon Observations of MillimetricExtragalactic Radiation and Geomagnetics) results pub-lished in 2000 are portrayed in Fig. 5. The index refersto a multipole expansion (spherical harmonics, averagedover m for each ). If space were flat, the angular diameterof the first acoustic peak would be ≈0.9, correspondingto peak ≈ 180/0.9 ≈ 200.

With ρM ≈ 0.3ρc but no vacuum energy, space wouldbe negatively curved, giving rise to peak ≈ 500. This isclearly ruled out by the data. For 0 near unity, the predic-tion can be expressed as peak ≈ 200/

√0, and the data

in Fig. 5 indicate peak = (197 ± 6). At the 95% confi-dence level, the authors concluded that 0.88 ≤ 0 ≤ 1.12.The best-fitting Friedmann solution for spacetime (Sec-tion IV.A) has ρM = 0.31ρc and ρV = 0.75 , reproducingthe SNe Ia result for vacuum energy within the margin oferror. A more resounding success for standard inflation isnot easily imagined.

A sweeping, semiquantitative prediction about densityperturbations is made by inflation. In the absence of quan-tum fluctuations, inflation would rapidly drive the falsevacuum to an extremely uniform state. Quantum fluctu-ations are ordinarily microscopic in size and very short-lived, but rapid inflation would negate both of these usualproperties.

Inflation would multiply the diameters of quantum fluc-tuations, and furthermore, do this so rapidly that oppositesides would lose causal contact with each other before the

FIGURE 5 Evidence for flatness and vacuum energy (cosmo-logical constant). Angular distribution of (δT/T)2 of the cosmicmicrowave background, measured by instruments in balloonslaunched from McMurdo Station, Antarctica, by the BOOMERanGProject. Curve indicates best-fitting Friedmann solution for space-time, with cosmological parameters M = 0.31, V = 0.75, andHubble parameter h0 = 0.70. At the 95% confidence level, thestudy concluded that 0.88 ≤ 0 ≤ 1.12, where 0 ≡ M + V = 1for flat space. The positive V means positive vacuum energy,corresponding to a cosmological constant that would acceleratethe cosmic expansion. The index refers to a multipole expan-sion in terms of spherical harmonics Ym(θ, ϕ), where m has beensummed over for each . [Adapted from P. de Bernardis, et al.Reprinted by permission from Nature 404, 955 (2000) MacMillanMagazines Ltd.]

Page 55: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 817

fluctuation vanished. The resulting density perturbationswould be “frozen” into the expanding medium, therebyachieving permanency. By inflation’s end, the sizes of per-turbations would span an extremely broad range. The ear-liest fluctuations would have been stretched enormously,the latest very little, and all those from intermediate timesby intermediate amounts. The magnitudes of the resultingδρ/ρ are very sensitive to details of the inflaton potential,but the mechanism described above results in a distribu-tion of sizes that depends rather little on details of thepotential.

The kind of probability distribution described aboveshould be apparent in the CMB, since density perturba-tions correspond closely to perturbations in the radiation’stemperature. The Cosmic Background Explorer satellite,better known as COBE (co-bee), was launched in 1989to study the CMB with unprecedented precision. Its mis-sion included measuring deviations from uniformity inthe CMB’s temperature. Over two years were required togather and analyze the data displayed in Fig. 6.

These data were first presented at a cosmology confer-ence in Irvine, CA, in March of 1992. The long awaitedresults struck many attendees as a historic confirmation ofinflation, and none have since had reason to feel otherwise.Many subsequent studies have confirmed and refined thedata in Fig. 6, but we present the earliest results becauseof their historical significance.

The density perturbations predicted by inflation are sim-ilar to those required to explain the evolution of structureformation in our universe. Indeed, one may reasonablyhope that some particular inflaton potential will be whollysuccessful in describing further details of the CMB soonto be measured, and also in explaining details of the struc-tures we observe.

Inflation provides the only known explanations for sev-eral puzzling, quite special features of our universe. Stan-dard inflation has also made two quantitive predictions,both of which have been strikingly confirmed (at leastwithin present uncertainties). Only time can tell, but theprognosis for inflation seems excellent, in some formbroadly similar to that described here.

E. Eternal Inflation, Endless Creation

Throughout inflation, the inflation field φ is on a downhillslope of the potential V (φ). The false-vacuum density ρFV

is therefore not strictly constant, but gradually decreasesas φ approaches the end of inflation. Since td ∝ 1/

√ρFV

[Eqs. (31) and (42)], the rate of exponential growth alsotends to decrease, all else being equal. In our discussionof density perturbations, however, we noted that quantumfluctuations in φ result in perturbations δρ that are frozen

FIGURE 6 COBE nonuniformity data. The data from the COBEsatellite gives the temperature T of the background radiation forany direction in the sky. The experimental uncertainties for anyone direction are large, but statistically meaningful quantities canbe obtained by averaging. The COBE team considered two direc-tions separated by some specific angle, say 15, and computedthe square of the temperature difference, (T1 − T2)2, measuredin microkelvins (10−6 K). By averaging this quantity over all pairsof directions separated by 15, they obtained a statistically reli-able number for that angle. The process was repeated for anglesbetween 0 and 180. The computed points are shown as smalltriangles, with the estimated uncertainty shown as a vertical lineextending above and below the point. The gray band shows thetheoretically predicted range of values corresponding to the scale-invariant spectrum of density perturbations arising from inflation.The effect of the earth’s motion through the background radiationhas been subtracted from both the data and the prediction, as hasa specific angular pattern (called a quadrupole) which is believedto be contaminated by interference from our own galaxy. Sinceinflation determines the shape of the spectrum but not the magni-tude of density perturbations, the magnitude of the predicted grayband was adjusted to fit the data. [From “The Inflationary Uni-verse” by Alan Guth. Copyright © 1997 by Alan Guth. Reprintedby permission of Perseus Books Publishers, a member of PerseusBooks, L.L.C.]

into permanency by the rapid expansion. A fascinatingquestion therefore arises.

For simplicity, suppose that φ is precisely uniformthroughout space at the beginning of inflation. Quantumfluctuations will immediately result in perturbations thatbecome frozen into space. In regions where δρ > 0, the ex-pansion rate will be slightly greater than average, and suchregions will fill an increasing fraction of space. Withinthese growing regions, later fluctuations will produce newregions where δρ > 0, compounding the original deviationfrom average density and rate of expansion. One expectsnegative δρ as often as positive, but a string of positivescauses that region of space to grow more rapidly, increas-ing the volume of space where a randomly positive δρ

would lengthen the string. This can continue indefinitely.Of course the average ρ is decreasing while the region

described above continues rising above the average, so

Page 56: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

818 Cosmic Inflation

the outcome is unclear. Will inflation come to an end insuch a region, sometime later than average, or will therebe regions where ρFV never shrinks to zero and inflationnever ends?

The answer clearly depends on the size, frequency, andmagnitude of quantum fluctuations, and also on details ofthe inflaton potential. Alexander Vilenkin answered thisquestion for new inflationary models (Section VIII.B) in1973. His extraordinary conclusion was that once “newinflation” has begun, there will always be regions whereit continues. Andrei Linde established this result for awide class of “improved” inflationary models, includinghis own “chaotic inflation,” in 1986. This phenomenon iscalled eternal inflation, and appears to be characteristic ofvirtually all inflationary models that might be relevant toour universe.

Eternal inflation is accompanied by endless formula-tion of “pockets” of true vacuum. Each is surrounded byeternally inflating false vacuum, containing within it allthe other pockets yet created. A detailed analysis revealsthat the number of pockets of true vacuum grows at anexponential rate, with a doubling time comparable to thatof the surrounding false vacuum (typically much less than10−30 sec).

If standard inflation is correct, our observable universelies deep within a single such pocket. The number of suchpockets would double a minimum of 1030 times each sec-ond. This corresponds to the number of pockets (and uni-verses contained therein) increasing by a minimum factorof (101,000,000,000)3, every second, for eternity. The possi-bility of truth being stranger than fiction has risen to a newlevel.

Given that standard inflation implies endless creation,human curiosity naturally wonders whether such inflationhad any beginning. A definitive answer remains elusive.Under certain very technical but plausible assumptions,however, Arvind Borde and Alexander Vilenkin proved in1994 that inflation cannot extend into the infinite past. Abeginning is required, which renews an ancient question.

F. Creation ex nihilo

We have seen that the observable universe may haveemerged from an extremely tiny region that experiencedinflation and then populated the resulting cosmos with par-ticles and radiation created from the mass-energy of thefalse vacuum. An ancient question arises in a new context:How did that tiny region come into being, from which theobservable universe emerged? Is it possible to understandthe creation of a universe ex nihilo (from nothing)?

Scientific speculation about the ultimate origin of theuniverse appears to have begun in 1973, with a proposalby Edward P. Tryon that the universe arose as a vacuum

fluctuation, i.e., a spontaneous quantum transition from anempty state to a state containing tangible matter and en-ergy. The discovery of this theoretical possibility sharestwo features with Guth’s discovery of inflation. The firstis that neither investigator was considering the question towhich an answer was found. Guth was a particle theorist,seeking to determine whether GUTs failed the test of re-ality by predicting too many monopoles. He had not setout to study cosmology, let alone to revolutionize it.

In Tryon’s case, he had never imagined it possible toexplain how and why the big bang occurred. Universeswith no beginning had been quite popular, in large mea-sure because they nullified (by fiat) the issue of creation.When the steady-state model was undermined by discov-ery of the CMB in 1965, the eternally oscillating modelreplaced it as the favored candidate, despite the mysteryof how a universe could have bounced back from eachcollapse and furthermore retained its uniformity throughinfinitely many bounces. The only alternative was a uni-verse of finite age, which seemed incomprehensible froma scientific point of view.

Interests in cosmology, quantum theory, and particlephysics arose in Tryon’s undergraduate years. Graduatestudies commenced in 1962 at the University of California,Berkeley, where his courses included quantum field theoryand general relativity. Classical and quantum theories ofgravitation were the subjects of his dissertation.

It had long been a mystery why mass plays two quitedifferent roles. It describes resistance to acceleration, orinertia, in Newton’s laws of motion. In Newton’s law ofgravitation, F = GmM/r2, mass plays the role of gravita-tional “charge” (analogous to electric charge in Coulomb’slaw). This puzzle had led Ernst Mach in the 1880s to con-jecture that inertia might somehow be a result of inter-action with the rest of the universe (Mach’s Principle).Tryon found this possibility tantalizing, and took a breakfrom his thesis project one afternoon to see if he couldinvent a quantitative theory.

Denoting inertial mass by m I and gravitationalmass (or “charge”) by mG , Newton’s gravity becomesF = mG MG/r2 in suitably chosen units. The two typesof mass are presumably related by

m I = f (r , t)mG, (92)

where f is a scalar field expressing the inertial effect of allmatter in the universe. An obvious constraint on any suchtheory is the empirical fact that f 1/

√G near earth.

Guided by dimensional analysis and simplicity, Tryonguessed that f might be the magnitude of a relativisticanalog to gravitational potential, divided by c2:

f = |VG |/c2, (93)

Page 57: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 819

where the gravitational potential energy (GPE) of a massm would be

GPE = −mG |VG |, (94)

since VG < 0. In the nonrelativistic (Newtonian) limit, onewould then have ∇2 f = −4πρG/c2, where ρG denotes thedensity of gravitational mass or “charge,” and ∇2 is theLaplacian differential operator. The desired analog is

f 2 f = 4π

c2T µ

µ , (95)

where 2 is the invariant d’Alembertian of general rela-tivity, T µ

µ ≡ gµνTµν , and Tµν is the inertial stress-energytensor (so that ρG appears in Tµν/ f ).

For a simple test of the theory, Tryon used the Tµν ofEq. (21) and the critical density ρc given by Eq. (45), as-suming zero pressure. (The actual density was known tobe comparable to ρc, perhaps even equal, and calculationswere simpler in a flat space). In a rough estimate for f ,the universe was treated as static. Since no causal influ-ence could have reached us from beyond the horizon, thevolume integral over ρc was cut off at the horizon distancedSH = 2c/H0 [Eqs. (62) and (64)]. The result was a cut-offversion of Newton’s gravity:

f 2 ≈ (MI

/c2

) × 〈1/r〉, (96)

where MI denotes the total (inertial) mass within the hori-zon, and 〈1/r〉 = 3/2dSH is the average value of 1/r .

With MI given by ρc, Eq. (96) yields f ≈ √3/G, which

is the result desired within a factor of√

3. It would bedifficult to exaggerate the adrenaline that was producedby this discovery. Given the crudity of the calculation, itseemed likely that a more careful analysis would yield theexact result desired for some density near ρc, perhaps thedensity of our own universe.

When Tryon met with his mentor the next day, he wasdisappointed to learn that this approximate relation be-tween ρc and G was already known. In fact Carl Brans andRobert Dicke had published a similar (but far more sophis-ticated) theory a few years earlier (1961). One differencewas that unlike Tryon’s crude theory, theirs contained anadjustable parameter ω, and became identical with GTRin the limit as ω → ∞.

Active interest in the Brans-Dicke theory was soon un-dermined by observations, all of which were consistentwith standard GTR. The Brans-Dicke theory could notbe ruled out by such data, since ω could always be cho-sen large enough to obtain agreement. In the absence ofany discrepancy between observations and GTR, however,GTR remained the favored candidate. Tryon followedthese developments closely, and gradually abandoned theMachian interpretation of the relation ρc between and G.He retained a deep conviction, however, that the nearly

critical density of our universe was no coincidence. Hebelieved it to be a cryptic manifestation of some truth offundamental importance, and he resolved to revisit the is-sue from time to time, hoping eventually to decode themessage.

Tryon’s research centered on the strong interactionsfor several years, but he occasionally spent a few hourswrestling with the significance of a density near ρc. Oneafternoon in 1973, an epiphany occurred. To his utter sur-prise and astonishment, his “mind’s eye” saw a brilliantflash of light, and he realized instantly that he was witness-ing the birth of the universe. In that same instant, he real-ized how and why it happened: the universe is a vacuumfluctuation, with its cosmic size and duration explained bythe relation between ρc and G that he had struggled for solong to understand.

For pedagogical purposes, textbooks on quantum fieldtheory discuss a few simple models which, for a vari-ety of reasons, fail to describe physical reality. One suchmodel contains a field ψ with a potential V (ψ) = λψ3,where λ is a constant. This model is immediately re-jected, because it has no state of minimum energy. (Forλ = ±|λ|, V (ψ) → −∞ as ψ → ∓∞). The existence of astate of minimum energy, called the vacuum, is tradition-ally regarded as essential to any realistic quantum fieldtheory of elementary particles.

Consider now a distribution of particles with total massM . Adding Einstein’s Mc2 to Newtonian approximationsfor kinetic and potential energies, one obtains a totalenergy

E = M(c2 + 1

2 〈ν2〉) − 12 GM2 × 〈1/r〉, (97)

where angled brackets denote average values (of velocity-squared and inverse separation, respectively). Note thatE has no minimum value: E → −∞ as M → ∞.Gravitational potential energy is precisely the kind thathas traditionally been rejected in quantum field theory,because such a theory has no state of minimum energy.An initially empty state with E = M = 0 would sponta-neously undergo quantum transitions to states with E = 0but arbitrary M > 0, subject only to the constraint that

〈1/r〉 = 2(c2 + 1

2 〈ν2〉)GM

(98)

Quantum field theory has no stable vacuum when gravityis included.

A vacuum fluctuation from M = 0 to a state with M > 0would require a form of quantum tunneling, since E > 0for values of M between 0 and the final value. Such tun-neling is a completely standard feature of quantum theory,however. We have already considered one example, the in-evitable tunneling of a Higgs field from the false vacuumof Fig. 4a to the true vacuum.

Page 58: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

820 Cosmic Inflation

In quantum field theory, pairs of particles (e.g. electronsand positrons) are continuously appearing in the mostempty space possible (Section IV.G). Energy conservationis violated by an amount E for a duration t . This is per-mitted (actually implied) by quantum mechanical uncer-tainties, however, subject to the constraint that Et ≈ h-- .Eventually, by chance, enough particles would sponta-neously appear close enough together for them to havezero net energy, because of cancellation between posi-tive mass-energy and negative gravitational energy. Sucha state would have E = 0, and could last forever.

Now recall Tryon’s guess that the f in Eq. (92) might bethe magnitude of the cosmic gravitational potential. Thiswould require that f 1/

√G but, in a crude calculation

assuming the density to be ρc, he had obtained f √3/G.

Equations (92)–(94) then imply that m I c2 + GPE ≈ 0 forevery piece of matter, i.e., a universe with density near ρc

could have zero energy. Such a universe could have orig-inated as a spontaneous vacuum fluctuation. Conversely,such an origin would explain why our universe has a den-sity near ρc.

More generally, creation ex nihilo (from nothing) wouldimply that our universe has zero net values for all con-served quantities (“the quantum numbers of the vacuum”).The only apparent challenge to this prediction arises fromthe fact that our universe is strongly dominated by mat-ter (i.e., antimatter is quite rare). Laboratory experimentshave revealed that the symmetry between matter and anti-matter is not exact, however, and most theories of elemen-tary particles going beyond the standard model containmechanisms for matter to have become dominant overantimatter at a very early time (baryogenesis, see Sec-tion VII.C ).

The preceding discussion has used Newtonian approx-imations for energy, and these are only suggestive. Wenote, however, that Newtonian theory resembles GTR farmore than one might guess. For a matter-dominated uni-verse, Newton’s and Einstein’s physics predict the samerate of slowing for the expansion, as was shown in SectionIV.D. As another example, the escape velocity from thesurface of a planet of mass M and radius R is given byNewtonian mechanics as νesc = √

2GM/R. For νesc = c,the relation between R and M is precisely that of GTR de-scribing a black hole (where R is called the Schwarzschildradius).

The field equation (22) is of particular interest. With a(t)denoting the cosmic scale factor and a ≡ da/dt , Eq. (22)implies

(kc2 + a2) − 8πG(ρa3)

3a= 0 (99)

Now consider a closed, finite universe (k = +1). Theproper volume of such a universe (Section III.C) is

V (t) = 2π2a3(t), and the total mass-energy is M(t) =ρ(t)V (t). [M is constant for zero pressure, but otherwisenot; see Eq. (24).] With rp(t) denoting proper distance(Section III.C) and r p ≡ drp/dt corresponding to veloc-ity, Eq. (97) multiplied by M yields

M(c2 + 0.36

⟨r2

p

⟩) − 0.55GM2 × 〈1/rp〉 = 0, (100)

where a2 and 1/a have been expressed in terms of

⟨r2

p

⟩ = 1

V

∫dV

(r2

p

) = 2a2

π

∫ π

0du(u sin u)2, (101)

〈1/rp〉 = 1

V

∫dV

1

rp= 2

πa

∫ π

0du

sin2 u

u(102)

(see Section III.C). Note that the density, pressure, andtotal mass M are arbitrary in the above analysis. [In par-ticular, Eqs. (99)–(102) are valid in both standard and in-flationary models.]

Apart from slight differences in numerical coefficients,the left side of Eq. (100) is identical in form to the rightside of Eq. (97). This suggests that, in some sense, allclosed universes have zero energy. This interpretation isbuttressed by the fact that such universes are finite andself-contained, so that no gravitational flux lines couldextend outward from them into any surrounding space inwhich they might be embedded. The energy of a universehas no rigorous definition within the context of GTR (Sec-tion IV.B), but the preceding observations are neverthelessquite suggestive.

We also note that if dominated by matter, a closed uni-verse expands to a maximum size in a finite time and thencollapses back to zero size (Section IV.C). This is the typ-ical behavior of a vacuum fluctuation. For the above rea-sons, and also because physical processes are expected togive rise to finite systems, Tryon’s conjecture of 1973 wasaccompanied by a prediction that our universe is closedand finite.

It is a remarkable fact that creation ex nihilo could havebeen proposed at any time since the early 1930s. All thetheoretical ingredients had been in place. There had been along period when the magnitude and grandeur of the cos-mos had befogged the normally clear vision of physicists,however. A few examples will illustrate this point.

Before Hubble’s discovery of the cosmic expansion in1929, Einstein had believed the universe to be infinitely oldand static. This view was incompatible with the physicsof even his time, however, in at least three respects.

1. In 1823, Heinrich Olbers had pointed out that theentire night sky should be as bright as the surface of atypical star in such a model. (Olbers’ paradox: Hisreasoning was that every line of sight would end on astar, a valid argument. In more modern terms, the

Page 59: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

Cosmic Inflation 821

entire universe would have reached a state of thermalequilibrium, in which case all of space would be filledby light as intense as that emitted from the surfaces ofstars.)

2. Conservation of energy would be grossly violated bystars that shone forever.

3. Over an infinite span of time, gravitationalinstabilities would have caused all stars to merge intoenormous, dense concentrations of matter, as wasevident from work published in 1902 by Sir JamesJeans (the Jeans instability).

Even Einstein, with few equals in the history of science,felt these considerations to be outweighed by the philo-sophical appeal of an infinitely old and static universe.

Hubble’s discovery of the cosmic expansion resultedin a period of intense speculation about the nature of theuniverse. In the 1930s, several inventors of cosmologicalmodels based their conjectures on a so-called “cosmolog-ical principle,” namely that the universe must (or should )be homogeneous and isotropic. This was extended in thelate 1940s to the “perfect cosmological principle,” mean-ing that the universe must (or should ) also be eternal andunchanging. The steady-state model of cosmology wasbased on this “principle,” and was widely favored in pro-fessional circles until empirical evidence undermined itin the 1960s. We remark here (as occasional skeptics didat the time) that both of the aforementioned “principles”were of a highly dubious nature, because (1) nothing sup-ported them except observations which they pretended toexplain, and (2) they had no generality, referring only toa single system (the entire universe).

The preceding examples are not meant as criticisms ofthe brilliant scientists who held such views, but rather asillustrations of how the mystique of the cosmos exemptedit from the objective standards routinely applied to lessersystems.

Early in his graduate studies (about 1963), Tryon at-tended a colloquium given by a 30-year-old professor thenunknown to him. In straightforward and compelling terms,this professor elucidated several ways in which neutrinosmight play critical roles in astrophysics. The lecture in-spired Tryon, and transformed his scientific worldview.For the first time, he realized that enormous structures,perhaps even the entire universe, could (and should ) beanalyzed in terms of the same principles that govern phe-nomena on earth, including the principles of microscopicphysics. The professor’s name was Steven Weinberg. Itwas Tryon’s great good fortune that Weinberg later ac-cepted him as a thesis student. In retrospect, Tryon haslong felt deeply indebted to Weinberg for opening his mindto the possibility that the entire universe might have arisenas a spontaneous vacuum fluctuation.

We come now to the second striking feature shared bythe psychological and theoretical breakthroughs of Guthand Tryon. In his highly engaging history and surveyof inflation (see bibliography), Guth describes how hebecame involved in the research that culminated in hisdiscovery.

In early 1979, Guth was a young research physicist atCornell University, preoccupied with issues unrelated tohis later work. He has written that in April of 1979,

“my attitude was changed by a visit to Cornell by Steven Wein-berg . . . . Steve gave two lectures on how grand unified theoriescan perhaps explain why the universe contains an excess of mat-ter over antimatter . . . . For me, Weinberg’s visit had a tremen-dous impact. I was shown that a respectable (and even a highlyrespected) scientist could think about things as crazy as grandunified theories and the universe at 10−39 sec. And I was shownthat really exciting questions could be asked.”

The day after Weinberg’s visit ended, Guth immersed him-self in a study of GUTs in the early universe. By year’s end,he had discovered the possibility of cosmic inflation, andhe knew immediately that he should take the possibilityseriously.

In truth, Weinberg influenced and inspired a generationof young physicists, the first generation entirely liberatedfrom the disorienting mystique of the cosmos. Throughwidespread lectures, writing, and the extraordinary powerand clarity of his vision, Weinberg emboldened a genera-tion to apply the principles of microscopic physics to theearly universe.

We have now reached the extraordinary position wherethe following are regarded as plausible, even likely inthe minds of many: (1) the universe originated as aspontaneous quantum fluctuation, (2) quantum fluctua-tions in the false vacuum lead to eternal inflation and end-less creation, and (3) quantum fluctuations in the falsevacuum were the primordial deviations from homogene-ity that evolved into galaxies and the largest structures inthe universe. These and related ideas are sometimes re-ferred to as “quantum cosmology,” which is surely amongthe most striking paradigm shifts in the history of science.Steven Weinberg may fairly be regarded as the spiritualfather of quantum cosmology.

One can only speculate about the stage upon which ouruniverse was born. If it was infinite in extent (in spaceand/or “time”), however, vacuum fluctuations giving riseto universes like ours would be inevitable if the probabil-ity were different from zero (no matter how small) on afinite stage. Indeed, such creations would be more numer-ous than any finite number—such would be the nature ofan infinite stage. It is nevertheless true that this quantumtheory of creation became much more widely accepted

Page 60: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN003K-149 June 14, 2001 11:8

822 Cosmic Inflation

after Guth’s proposal of inflation, because inflation ex-plains how even a microscopic fluctuation could ballooninto a cosmos.

A mathematically detailed scenario for creationex nihilo was published in 1978 by R. Brout, F. Englert, andE. Gunzig. This model assumed a large, negative pressurefor the primordial state of matter, giving rise to exponen-tial growth that converted an initial microsciopic quantumfluctuation into an open universe. This model was a signif-icant precursor to inflationary scenarios based on GUTs,as well as being a model for the origin of the universe.

During the years since Guth’s proposal of inflation, nu-merous creation scenarios have been published, differingprimarily in their details. An especially interesting andsimple one is that of Alexander Vilenkin (1982). Vilenkinsought to eliminate the need for any kind of stage uponwhich creation occurred, i.e., he sought a more literal cre-ation from nothing.

Vilenkin noted that a closed universe has finite volume:V (t) = 2π2a3(t). For a = 0, the volume is zero and, if sucha universe is completely empty, it is then a candidate fortrue “nothingness”—a mere abstraction. He proposed thatsuch an abstraction could experience quantum tunnelingto a microscopic volume filled with false vacuum, whichwould then inflate to cosmic proportions. We do not (andmay never) know whether our universe was born withinsome larger physical reality or emerged from a pure ab-straction, but both scenarios seem plausible.

Creation ex nihilo suggests (without implying) a radicalidealism, in the sense of ideals being abstract principles.To illustrate the point, it may be helpful to consider thealternative (and usual) view.

Well before a child acquires language and the ability tothink abstractly, it has become aware of tangible matterin its surroundings. As the child develops, it furthermorediscerns patterns of regularity—splinters hurt, chocolatetastes good, and day alternates with night. Given that mat-ter exists, it must behave in some way, and one learns fromexperience and perhaps the sciences how very regular thatbehavior is. The unarticulated hierarchy in one’s mind is

(1) matter exists, (2) it must do something, and (3) it seemsto follow some set of rules. Matter is the fundamental re-ality, without which it would make no sense for there tobe principles governing it.

The maximal version of creation ex nihilo inverts thishierarchy: abstract principles (ideals) are the fundamentalreality. Tangible matter and energy are manifestations ofa deeper reality, the principles that gave rise to their exis-tence. This remains, of course, an open question, but it isone of the many issues that quantum cosmology has givenus to contemplate.

SEE ALSO THE FOLLOWING ARTICLES

CELESTIAL MECHANICS • COSMIC RADIATION • COS-MOLOGY • DARK MATTER IN THE UNIVERSE • GALAC-TIC STRUCTURE AND EVOLUTION • HEAT TRANSFER •QUANTUM THEORY • RELATIVITY, GENERAL • RELATIV-ITY, SPECIAL • STELLAR STRUCTURE AND EVOLUTION •UNIFIED FIELD THEORIES

BIBLIOGRAPHY

Guth, A. (1997). “The Inflationary Universe,” Perseus Books, New York.Krauss, L. M. (2000). “Quintessence: The Mystery of the Missing Mass

in the Universe,” Basic Books, New York.Liddle, A. R. (1999). “An Introduction to Modern Cosmology,” Wiley,

New York.Liddle, A. R., and Lyth, D. H. (2000). “Cosmological Inflation and Large-

Scale Structure,” Cambridge Univ. Press, Cambridge, U.K.Linde, A. D. (1990). “Inflation and Quantum Cosmology,” Academic

Press, San Diego.Peacock, J. A. (1999). “Cosmological Physics,” Cambridge Univ. Press,

Cambridge, U.K.Peebles, P. J. E. (1993). “Principles of Physical Cosmology,” Princeton

Univ. Press, Princeton, New Jersey.Weinberg, S. (1972). “Gravitation and Cosmology: Principles and Ap-

plications of the General Theory of Relativity,” Wiley, New York.Weinberg, S. (1977). “The First Three Minutes,” Basic Books, New

York.Weinberg, S. (1992). “Dreams of a Final Theory,” Pantheon Books, New

York.

Page 61: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the UniverseSteven N. ShoreIndiana University

Virginia TrimbleUniversity of California, Irvine, and Universityof Maryland, College Park

I. Historical PrologueII. Kinematic Studies of the GalaxyIII. Large-Scale Structure of the GalaxyIV. Dark Matter in the Local GroupV. Binary Galaxies

VI. The Local SuperclusterVII. Clusters of Galaxies

VIII. Cosmological Constraints and Explanationsof the Dark Matter

IX. Future ProspectsAppendix: The Discrete Virial Theorem

GLOSSARY

Baryons Protons and neutrons that are the basic con-stituents of luminous objects and which take part innuclear reactions. These are strongly interacting par-ticles, which feel the electromagnetic and nuclear, orstrong, forces.

Cosmic background radiation (CBR) Relic radiationfrom the Big Bang, currently having a temperature of2.74 K. It separated out from the matter at the epoch atwhich the opacity to scattering of the universe, becauseof cooling and expansion, fell to small enough valuesthat the probability of interaction between the matterand the primordial photons became small compared

with unity. It is essentially an isotropic background andits temperature (intensity) fluctuations serve to placelimits on the scale of the density fluctuations in theuniverse at a redshift of about 1000. The redshift isdefined as λ/λ0 = 1 + z, where λ is the currently ob-served wavelength of a photon and λ0 is the wavelengthat which it was emitted from a distant object.

Deceleration parameter Constant that measures boththe density of the universe, compared with the criticaldensity necessary for a flat space–time metric (called = ρ/ρc), and the departure of the velocity–redshiftrelation from linearity (q0).

Faber–Jackson relation Relation between the bolomet-ric luminosity of a galaxy and its core velocity

191

Page 62: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

192 Dark Matter in the Universe

dispersion. This permits the determination of the intrin-sic luminosity of elliptical galaxies without the need toassume a population model for the galaxy. It is used inthe measurement of distances to galaxies independentof their Hubble types.

Hubble constant (H0) Constant of proportionality forthe rate of velocity of recession of galaxies as a func-tion of redshift. Its current value is 71 ± 8 km sec−1

Mpc−1. Often, in order to scale results to this empiri-cally determined constant, it is quoted as h = H0/(100km sec−1 Mpc−1). The inverse of the Hubble constant isa measure of the age of the universe, called the Hubbletime, and is about 1.5 × 1010 years.

Leptons Lightest particles, especially the electron, muon,and tau and their associated neutrinos. These particlesinteract via the weak and electromagnetic forces (elec-troweak).

Tully–Fisher relation Relation between the maximumorbital velocity observed for a galaxy’s 21-cm neutralhydrogen (the line width) and the bolometric luminos-ity of the galaxy.

Units Solar mass (M), 2 × 1033 g; solar luminosity(L), 4 × 1033 erg sec−1; parsec (pc), 3 × 1018 cm (3.26light years).

DARK MATTER is the subluminous matter currently re-quired to explain the mass defect observed from visiblematerial on scales ranging from galaxies to superclustersof galaxies. The evidence shows that the need for someform of invisible but gravitationally influential matter ispresent on length scales larger than the distance betweenstars in a galaxy. This article examines the methods usedfor determining galaxies and clusters of galaxies and dis-cusses the cosmological implications of various explana-tions of dark matter.

I. HISTORICAL PROLOGUE

The existence of a species of matter that can not be seen butmerely serves some mechanical function in the universecan be traced to Aristotelian physics and was clerly sup-ported throughout the development of physical models-prior to relativity. The concept of the aether, at first some-thing apart from normal material in being imponderableeventually metamorphosed into that of a fluid mediumcapable of supporting gravitation and electromagnetic ra-diation. However, since it was the medium that was respon-sible both for generating and transmitting such forces, itsnature was intimately tied to the overall structure of theuniverse and could not be separated from it. It had to be as-sumed and could not easily be studied. Nineteenth century

attempts, using high-precision measurements, to observethe anisotropy of the propagation of light because of themotion of the earth, the Michelson–Morley experiment be-ing the best known, showed that the aether’s mechanicalproperties could not conform to those normally associatedwith terrestrial fluids.

A version of the search for some cosmic form of darkmatter began in the late 17th century with Halley’s ques-tion of the origin of the darkness of the night sky. Laterenunciated in the 19th century as Olber’s paradox, it re-quired an understanding of the flux law for luminous mat-ter and was argued as follows: in an infinite universe,with an infinite number of luminous bodies, the night skyshould be at least as bright as the surface of a star, if not in-finitely bright. Much as in current work on dark matter, theargument was based on the assumption of the premise of acosmological model and of the dynamical (for in this casephenomenological), character of a particular observable.

In order to circumvent the ramifications of the paradoxwithout questioning its basic premises, F. Struve, in the1840s, argued for the existence of a new kind of matterin the cosmos, one capable of extinguishing the light ofthe stars as a function of their distance, hence path lengththrough the medium. The discovery of dark nebulae byBarnard and Wolf at the turn of the century, of stellarextinction by Kapteyn about a decade later, and of thereddening of distant globular clusters by Trumpler in the1930s served to support the contention that this new classof matter was an effective solution to the missing lightproblem. However, as J Hershel realized very soon afterStruve’s original suggestion of interstellar dust, this cannotbe a viable solution for the paradox. In an infinite universe,the initially cold, dark matter would eventually come intothermal equilibrium. The resulting glow would eventuallyreach, if not the intensity of the light of a stellar surface,an intensity still substantially above the levels required toreproduce the darkness of the sky between the stars.

An additional thread in this history is the explanation ofthe distribution of the nebulae, those objects we now knowto be extragalactic. In detailed statistical studies. Charlier,Lundmark, and later Hubble, among others, noted a zoneof avoidance, which was located within some 20 of thegalactic plane. Several new forces were postulated to ex-plain this behavior, all of which were removed by the dis-covery of the particulate nature of the dust of the interstel-lar medium and the expansion of the universe. However,the explanatory power of the concepts of dark matter andunknown forces of nature serves as a prototype for many,of the current questions, although the circumstances andnature of the evidence have dramatically altered in time.

The origin of the current problem of dark matter (here-after called DM) really starts in the late 1930s with thediscovery of galaxy clusters by Shapley. The fact that the

Page 63: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 193

universe should be filled with galaxies that are riding onthe expanding space–time was not seen as a serious prob-lem for early cosmology, but the fact that these objectscluster seems quite difficult to understand. Their distribu-tion appears to be very anisotropic, and was seen at thetime to be the evidence that the initial conditions of theexpansion may be drastically altered by later gravitationaldevelopments of the universe.

About the same time, Zwicky, analyzing the distributionof velocities in clusters of galaxies, argued that the struc-tures on scales of megaparsecs (Mpc) cannot be boundwithout the assumption of substantial amounts of nonlu-minous material being bound along with the galaxies. Hiswas also the first attempt to apply the virial theorem tothe analysis of the mass of such structures. Oort, in hisanalysis of the gravitational acceleration perpendicular tothe galactic plane, also concluded that only about half ofthe total mass was in the form of visible stars.

With the advent of large-scale surveys of the velocitiesof galaxies in distant clusters, the problem of DM has be-come central to cosmology. It is the purpose of this articleto examine some of the issues connected with current cos-mological requirements for some form of nonluminous,gravitationally interacting matter and the evidence of itspresence on scales from galactic to supercluster.

II. KINEMATIC STUDIES OF THE GALAXY

The Milky Way Galaxy (the Galaxy) is a spiral, consistingof a central spheroid and a disk of stars extending to severaltens of kiloparsecs. A halo is also present and appears toenvelope the entire system extending to distances of over50 kpc from the center. It is this halo that is assumed to bethe seat of most of the DM. Since it is essentially spherical,it contributes to the mass that lies interior to any radiusand also to the vertical acceleration. The stars of whichit is composed have high-velocity dispersions, of order150 km sec−1, and are therefore distributed over a largervolume while still bound to the Milky Way.

Galactic studies of dark matter come in two varieties:local, meaning the solar neighborhood (a region of severalhundred parsecs radius), and large-scale, on distances ofmany kiloparsecs. The study of the kinematics of stars inthe Galaxy produced the first evidence for “missing mass,”and there is more detailed information available for theMilky Way than for any other system. Further, many ofthe methods that have been applied, or will be applied infuture years, to extragalactic systems have been developedfor the Galaxy. So that the bases of the evidence can bebetter understood, this section will be a more detailed dis-cussion of some of the methods used for the galactic massdetermination.

A. Local Evidence

The local evidence for DM in the Galaxy comes from thestudy of the vertical acceleration relative to the galacticplane, the so-called “Oort criterion.” This uses the fact thatthe gravitational acceleration perpendicular to the plane ofthe Galaxy structures the stellar distribution in that direc-tion and is dependent on the surface density of material inthe plane. The basic assumption is that the stars, which arecollisionless, form a sort of “gas” whose vertical extent isdependent on the velocity dispersion and the gravitationalacceleration. For instance, taking the pressure of the starsto be ρσ 2, where ρ is the stellar density and σ is theirvelocity dispersion in the z direction, then vertical hydro-static equilibrium in the presence of an acceleration Kz

gives the scale height of an “isothermal” distribution:

z0 ≈ σ 2/

Kz . (1)

This represents the mean height that stars, in their oscilla-tions through the plane, will reach. It is sometimes calledthe scale height of the stellar distribution.

The gravitational potential is determined by the Poissonequation:

∇2 = −4πGρ = ∇ · K, (2)

where G is the gravitational constant and K is the totalgravitational acceleration. Separation of this equation intoradial and vertical components, assuming that the systemis planar and axisymmetric, permits the determination ofthe vertical acceleration. It can be assumed that the starsin the plane are orbiting the central portions of the galaxy,and that

Kr = 2(r )

r. (3)

Here, is the circular velocity at distance r from thegalactic center. Under this assumption, the Poisson equa-tion becomes

∂zKz = −4πGρ + 2

r

d

dr, (4)

where the radial gradient of the circular velocity is de-termined from observation. From the observation of theradial gradient in the rotation curve, it is possible to de-termine the vertical acceleration from dynamics of starsabove the plane and thus determine the space mass densityin the solar neighborhood.

The argument proceeds using the fact that the verticalmotion of a star through the plane is that of a harmonicoscillator. The maximum z velocity occurs when the starpasses through the plane. From the vertical gradient in thez component of the motion, Vz , one obtains the acceler-ation, and the mass density follows from this. Assumingthat stars start from free fall through the plane, their totalenergy is given by

Page 64: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

194 Dark Matter in the Universe

0 + 12 V 2

z,0 = 12 V 2

z , (5)

so that assuming that Vz,0 = 0, in other words, that the starfalls from a large distance having started at rest, it followsthat the maximum velocity through the plane of a galaxy isdetermined by the potential, thus the acceleration, at theextremum of the orbit (apogalactic distance). The verti-cal acceleration is a function of the surface mass density.Thus, star counts can be used to constrain, from luminousmaterial, the mass in the solar neighborhood. Note thatKz ∼ − 4πG , where is the surface density, given by

=∫ z

0ρ(r, z′) dz′. (6)

It should therefore be possible to determine the gravita-tional acceleration from observations of the vertical dis-tribution of stars, obtain the required surface density, andcompare this with the observations of stars in the plane.

The mass required to explain Kz observed in the solarneighborhood, about 0.15M pc−3, exceeds the observedmass by a factor of about two. This is not likely madeup by the presence of numerous black holes or more ex-otic objects in the plane since, for instance, the stabilityof the solar system is such as to preclude the numerousencounters with such gravitational objects in its lifetime.It appears that there must be a considerable amount ofuniformly distributed, low-mass objects or some other ex-planation for the mass.

Recent observations with the FUSE satellite in the ul-traviolet and with ISO in the infrared reveal significantabundances of H2 in the interstellar medium. ISO data forthe edge-on galaxy NGC 891 show enough in this coldand shocked gas to account for the disk (only) compo-nent of the missing mass. It thus appears plausible thatfor the relatively low mass component of disk galaxies—the disk—there is no evidence for a low-velocity exoticsource of dark matter. The same, however, is not true forthe halo, as we will soon discuss.

The velocity dispersion of stars is given by the gravi-tational potential in the absence of collisions. Thus, fromthe determination of the mass density in the solar neigh-borhood with distance off the plane, the vertical compo-nent of the gravitational acceleration can be found, lead-ing to a comparison with the velocity dispersion of starsin the z direction. The resulting comparison of the dy-namic and static distributions determines the amount ofmass required for the acceleration versus the amount ofmass observed.

Most determinations of Kz find that only one-third toone-half of the mass required for the dynamics is observedin the form of luminous matter. In spite of several attemptsto account for this mass in extremely low luminosity Mdwarf stars, the lowest end of the mass distribution of

stellar masses, the problem remains. The mass of the ob-served halo is not sufficient to explain the vertical accel-eration, and the mass function does not appear to extendto sufficiently low masses to solve this problem.

Another local determination of the mass of the Galaxy,this time in the plane, can be made from the observationof the maximum velocity, relative to the so-called localstandard of rest, of stars orbiting the galactic center. Theargument is best illustrated for a point mass. In the caseof a circular orbit about a point, the orbital velocity, ,is given by centrifugal acceleration being balanced by thegravitational attraction of the central mass,

(r ) =(

G M

r

)1/2

, (7)

and it is easily seen that the escape velocity is vesc(r ) =√2(r ) for all radii. The coefficient is slightly lower in

the case of an extended mass distribution, but the argu-ment follows the same basic form. The local standard ofrest (LSR) is found from the velocity of the sun relative tothe most distant objects, galaxies and quasi-stellar objects(QSOs), and the globular cluster system of our galaxy. Themaximum orbital velocity observed in the solar neighbor-hood should then be about 0.4vLSR, or in the case of oursystem about 70 km sec−1 in the direction of solar motion.For escape from the system, at a distance of 8.5 kpc witha 0 = 250 km sec−1, the escape velocity from the solarcircle should be only about 300 km sec−1. The mass ofthe Galaxy can be determined from the knowledge of thedistance of the sun from the galactic center (determinedfrom mass distributions and from the brightness of vari-able stars in globular clusters) and from the shape of therotation curve as a function of distance from the galacticcenter.

B. Large-Scale Determinations

From the stellar orbits, one knows the rate of galactic dif-ferential rotation in the vicinity of the sun, a region about3 kpc in radius and located about 8.5 kpc from the galacticcenter. The measurement of galactic rotational motion canbe greatly extended through the use of millimeter molec-ular line observations, such as OH masers, and the 21-cmline of neutral hydrogen. Most of the disk is accessible, butat the expense of a new assumption. It is assumed that thegas motions are good tracers of the gravitational field andthat random motions are small and unimportant in com-parison with the rotational motions. It is also assumed thatthe gas is coplanar with the stellar distribution and that themotion is largely circular, with no large-scale, noncircu-lar flows being superimposed. These assumptions are notobviously violated in the Galaxy, but may be problem-atic in many extragalactic systems, especially barred spiral

Page 65: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 195

galaxies. The maximum radial velocity along lines of sightinterior to the solar radius occurs at the tangent points tothe orbits. This method of tangent points presumes thatthe differential rotation of the Galaxy produces a gradientin the radial velocity along any line of sight through thedisk for gas and stars viewed from the sun and that themaximum velocity occurs at a point at which the line ofsight is tangent to some orbit. The distance can be deter-mined from knowledge of the distance of the sun from thegalactic center, about 8 kpc. Using HI gives a measureof the mass interior to the point r and thus the cumulativemass of the Galaxy. Recent work has extended this to in-clude large molecular clouds, which also orbit the galacticcenter.

The determination of the orbits and distances of starsand gas clouds outside of the solar orbit is difficult sincethe tangent point method cannot be applied for extrasolarorbits, but from the study of molecular clouds and 21-cmabsorption and stellar velocities of standard stars it appearsthat the rotation curve outside of the solar circle is still flat,or perhaps rising. The argument that this implies a con-siderable amount of extended, dark mass follows from anextension of the argument for the orbital velocity about apoint mass, 2(r ) ∼ M(r )/r . Since the observations sup-port = const, it appears that M(r ) ∼ r . The scale lengthfor the luminous matter is small, about 5 kpc, and this issubstantially smaller than the radial distances where therotation curve has been observed to still be flat (of order15 kpc). The same behavior has been observed in externalgalaxies, as we will discuss shortly. Thus, there is a sub-stantial need for the Galaxy to have a large amount of darkmatter in order to account for dynamics on scales largerthan 10 kpc.

As mentioned, mass measurements of the Galaxy fromstellar and gas rotation curves beyond the distance of thesun from the center are subject to several serious prob-lems, which serve as warnings for extragalactic studies.One is that there appear to be substantial departures ofthe distribution from planarity. That is, the outer parts ofthe disk appear to be warped. This means that much of themotion may not be circular and there may be sizable verti-cal motions, which means that the motions are not strictlyindicative of the mass. For the inner galactic plane, thereis some evidence of noncircular motion perhaps caused bybarlike structures in the mass distribution, thus affectingthe mass studies.

An extension of the Oort method for the vertical accel-eration, now taken to the halo stars, can be used to measurethe total mass of the Galaxy. One looks at progressivelymore distant objects. These give handles on several quan-tities. For instance, the maximum velocity of stars fallingthrough the plane can be compared with the maximumdistances to which stars are observed to be bound to the

galactic system. Halo stars of known intrinsic luminosity,such as RR Lyrae variable stars, can be studied with somecertainty in their intrinsic brightness. From the observedbrightness, the distance can be calculated. Thus, the dis-tance within the halo can be found. From an observationof the vertical component of the velocity, one can obtainthe total mass of the galactic system lying interior to thestar’s orbit. The same can be done for the globular clus-ters, the most distant of which should be nearly radiallyinfalling toward the galactic center. These methods give awide range of masses, from about 4 × 1011 M to as highas 1012 M. to distances of 50–100 kpc.

The phenomenological solution to the problem of sub-luminous mass increasing in fraction of the galactic pop-ulation with increasing length scale measured assumesthat the mass-to-light ratio, M/L , is a function of dis-tance from the galactic center. For low-mass stars, thisnumber is of order 1–5 in solar units (M/L), but forthe required velocity distribution in the galaxy, this mustbe as high as 100. There are few objects, except Jupiter-sized or smaller masses, which have this property. Thereason is simple—nuclear reactions that are responsiblefor the luminosity of massive objects, greater than about0.08M, cannot be so inefficient as to produce this highvalue. Even the fact that the flux distribution can be redis-tributed in wavelength because of the effects of opacity inthe atmospheres of stars of very low mass will still leavethem bright enough to observe in the infrared, if not thevisible, and their total (bolometric) luminosities are stillhigh enough to provide M/L ratios that are less than about10. Black holes, neutron stars, or cold white dwarfs wouldappear good candidates within the Galaxy. But limits canbe set on the space density of these objects from X-raysurveys since they would accerete material from the inter-stellar medium and emit X rays as a result of the infall.The observed X-ray background and the known rate ofproduction of such relics of stellar evolution are both toolow to allow these objects to serve as the sole solution tothe disk DM problem. Some other explanation is required.

III. LARGE-SCALE STRUCTUREOF THE GALAXY

A. Multiwavelength Observations

1. Radio and Far-Infrared (FIR) Observations

Neutral and molecular hydrogen (H I and H2), form only asmall component of the total mass of the galactic system.Both 21-cm and CO (millimeter) observations can only ac-count for about one-third of the total galactic mass beingin the form of diffuse or cloud matter. The IRAS satel-lite, which performed an all-sky survey between 12 and

Page 66: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

196 Dark Matter in the Universe

100 µm, did not find a significant population of opticallyunseen but FIR bright point sources in the galaxy. Instead,it showed that the emission from dust in the diffuse in-terstellar medium is consistent with the amount of neutralgas present in the plane and that there is not a very sizablecomponent of the galaxy at high-galactic latitude. Thisseverely constrains conventional explanations for the DMsince any object would come into equilibrium at tempera-tures of the same order as an interstellar grain and wouldlikely be seen by the IRAS satellite observations. As wementioned, however, spectra obtained with ISO show dif-fuse molecular gas in unexpectedly large abundance inspirals.

2. X-Ray Observations

These show the interstellar medium to possess a very hot,supernova-heated phase with kinetic temperatures of or-der 106−107 K. But here again it is only a small fractionof the total mass of luminous matter. The spatial extent ofthis gas is greater than that revealed by the IR and radiosearches but is still consistent with the galactic potentialderived from the optical studies. As mentioned, X-ray ob-servations also constrain the space density of collapsed butsubluminous objects, such as black holes, through limitson the background produced by such objects accretinginterstellar gas and dust.

3. Ultraviolet Observations

The stars that produce the UV radiation are the most mas-sive or most evolved members of the disk, being eitheryoung stars or the central stars of planetary nebulae. Theyhave very low M/L ratios, are easily seen in the UV, andwould not be likely candidates for DM. There is no com-pelling evidence from other wavelength regions that themissing matter is explained by relatively ordinary matter.Baryonic matter at temperatures from 107 K to below themicrowave background of 3 K can be ruled out by thecurrently available observations.

B. The Components of the GalacticDisk and Halo

Star counts as a function of magnitude in specific direc-tions (that is, as a function of lII and bII, the galactic longi-tude and latitude, respectively) provide the best informa-tion on the mass of the visible matter and the structure ofthe galaxy. This has been discussed most completely byBahcall and Soniera (1981). It depends on the luminosityfunction for stars, their distribution in spectral type, andtheir evolution. However, in the end, it provides more ofa constraint on the evolution of the stellar population than

on the question of whether the system is supported by asubstantial fraction of dark matter.

Although such studies concentrate on the luminous mat-ter, the reckoning of scale heights for the different stellarpopulations provides an estimating method for the verticaland radial components of the acceleration—direct massmodeling. Such studies are also sensitive to the details ofreddening from dust both in and off of the galactic plane,metallicity effects as a function of distance from the galac-tic center, and evolutionary effects connected with the pro-cesses of star formation in different parts of the galaxy.Supplementing the space density studies with space mo-tions (proper motion, tangential to the line of sight, can befound for some classes of high-velocity stars and can beadded to the radial velocities to give an overall picture ofthe stellar kinematics) permits both a kinematic and pho-tometric determination of the galactic mass. These studieshave extended the determination of the vertical accelera-tion to distances of more than 50 kpc into the halo.

Another constraint on the mass of the Galaxy comesfrom the tidal radii of the Magellanic Clouds and from themasses and mass distributions of globular clusters. TheMagellanic Clouds (the Clouds) are fairly large comparedwith their distances from the galactic center. They there-fore feel a differential gravitational acceleration that coun-teracts their intrinsic self-gravity and tends to rip themapart as they move around the Galaxy. The maximum dis-tance that a star can be from the Clouds before it is morebound to the Galaxy than the Clouds, or conversely theminimum distance to which the Clouds can approach theGalaxy without suffering tidal disruption, provides a mea-sure of the total mass of the Galaxy. The distribution of theglobular clusters does not provide any strong support for aDM halo. In addition, the clusters appear to be bound, eventhough they are the highest mass separate components ofthe galactic system, without involking any nonluminousmass. In order to account for the dynamics and to be con-sistent with the known ages of the clusters, the typicalM/L ratio is about 3, characteristic of low-mass stars. Soit appears that objects of order 106−108 M, and with sizesof up to 100 pc, are not composed of large quantities ofdark mass.

C. Theoretical Studies

The spiral structure of disk galaxies has been a long-standing problem since the discovery of the extragalac-tic nature of these objects. The first suggestion that thepatterns seen in spirals might be due to some form of in-trinsic collective mode of a disk gravitational instabilitywas from Lindblad and was elaborated by Lin and Shu(1965) as the “quasistationary density wave” model. Thepicture requires a collective mode in which the stars in the

Page 67: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 197

disk behave like a self-gravitating but collisionless fluid(there is no real viscous dissipation), which clumps in aspatially periodic wave. This wave of density feeds backinto the gravitational potential, which then serves as themeans for supporting the disk wave structure.

It was soon realized that these waves are not stationary.Further, because of angular momentum transport by tidalcoupling of the inner and outer parts of the disk, the waveswill wind up and dissipate in the absence of a continuingforcing. Halos have been shown to help stabilize the disk,suggesting that perhaps the very existence of spiral galax-ies is an indication of some DM being distributed over alarge volume of the Galaxy. Ostriker and Peebles (1973),by analysis of the stability of disks against collapse intobarlike configurations, came up with an independent ar-gument supporting the need for an extensive halo havingat least two-thirds the total mass of a disk galaxy. Theircriterion was established by the stability analysis of a self-gravitating ellipsoid, showing that a disk is unstable to theformation of a bar if the ratio of the kinetic energy tothe gravitational potential is large, that is, if T/W > 0.14.Here, T is the kinetic energy of the stars and W is thegravitational potential energy. This was perhaps the firstpaper to argue for the existence of a hot halo on the basisof the stability of a simple model system. Further model-ing supports this conclusion, which now forms one of thecornerstones of galactic structure models: if the halo is notmassive enough to stabilize the disk of a spiral galaxy, thesystem will collapse to a bar embedded in a more extendedspheroid of stars.

The implication is that, since there are many disks thatdo not possess large-scale, barlike structures, there may bea halo associated with many of these systems. In addition,the fact that a disk is observed is indication enough thatthere should be a considerable quantity of mass associatedwith it. Thus, the search was stimulated for rotation curvesthat remain flat or non-Keplerian (not point-mass-like) atlarge distances from the galactic center.

IV. DARK MATTER IN THE LOCAL GROUP

A. Magellanic Clouds

The total mass of the Galaxy can be determined from thestability of the Magellanic Clouds. These two Irr galax-ies orbit the Milky Way, being slowly tidally disruptedby the interaction with the disk of the Galaxy. The tidalradius of the Clouds and the fact that they have been sta-ble and bound to each other for far longer than a singleorbit (these satellites of the Galaxy appear as old as theMilky Way) provide an upper limit to the mass of theGalaxy and Clouds to a distance of about 60 kpc. This is

not far from that obtained from the study of halo stars andclusters, and provides an upper limit of about 1012 M.The required M/L is thus about 30. The total mass of theMagellanic Clouds may be individually underestimated,but this appears to be a good limit on the total mass ofthe system and is in qualitative agreement with the massrequired to explain the dynamics of the Local Group (thegalaxy cluster to which the Galaxy belongs) as well.

B. M 31 and the Rotation Curvesof External Galaxies

The study of the rotation curve of the Andromeda Galaxy,M 31, the nearest spiral galaxy to ours and one whichforms a sort of binary system with the Milky Way, showsthat the rotation curve is flat to the distance at which thestars are too faint to determine the rotation curve. TheM/L ratio is close to 30, about the same as that foundfor the Galaxy. In order to place this result in context, itis necessary to describe the process whereby the rotationcurves are determined for extragalactic systems.

First, one can assume that the disk is radially supportedby the revolution of collisionless stars about the mass ly-ing interior to their radial distance from the galactic cen-ter. This assumption makes the mass within a distancer, M = M(r ), a function of radius, and then allows the de-termination of the orbital velocity assuming that the meaneccentricity of the orbits is small or zero. A slit is placedalong, and at various angles to, the symmetry axes of thegalaxy, and the radial velocity of the stars is determined.The disk is assumed to be circular, so the inclination ofthe galaxy can be determined from the ellipticity of theprojected disk in the line of sight. There are several meth-ods for fitting the mass model to this rotation curve, mostof which assume a power-law form for the rotation curvewith distance and fit the coefficients of the density dis-tribution required to produce the observed velocities bythe solution of the dynamical equations in the radial di-rection. This density profile is then integrated to give themass interior to a given radius.

Several assumptions go into this method, most of whichappear theoretically well justified. The first is that the sys-tem is collisionless. There is little evidence that the starsin a galaxy undergo close encounters—they are simplytoo small in comparison with the distances that separatethem. However, there are very massive objects present inthe disk, the Giant Molecular Clouds, which may havemasses as great as 107 M and which appear to control thevelocity dispersion of the constituent stars. These cloudsare also the sites for star formation, and as such they formthe basis for the interaction of the gas and star componentsof the Galaxy. They may also transfer angular momentumthrough the Galaxy, making the disk appear viscous. Such

Page 68: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

198 Dark Matter in the Universe

interactions act much like encounters to change the veloc-ity distribution to something capable of supporting a flatrotation curve. However, this remains speculation.

The observation of the rotation curve using neutral hy-drogen line emission is a better way of determining themass of a galaxy at a large distance. First, it appears thatthe H I distribution is considerably more extended thanthat of the bright stars; H I maps extend, at times, to fouror five times the optical radius of the disk. Even at theselarge distances, studies show that the rotation curves re-main flat! This is a very difficult observation to understandif the sole means of support for the rotation is the mecha-nism just described. Rather, it would seem that some formof extended mass distribution of very high M/L objectsis required. The radial extent of such masses may be asgreat as 100 kpc, where the optical radii of these galaxiesis smaller than 30 kpc.

The measurement of the rotation curve for the LocalGroup galaxies is complicated by the fact that the systemsare quite close and therefore require low-resolution ob-servations to determine the global rotation curves. This isalso an aid in that the small-scale structure can be studiedin detail in order to determine the effects of departuresfrom circular orbits on the final mass determination.

The study of radial variations of velocity curves is es-sentially the same for external galaxies as for ours. Oneassumes that there is a density distribution of the formρ∗(r, z) of the stars and that this is the only contributor tothe rotation curve for the Galaxy. One then obtains (r ),from which one can solve the equation of motion for radialsupport alone:

2 = 4π (1 − e2)1/2∫ r

0

ρ(ξ )ξ 2 dξ

(r2 − ξ 2e2)1/2, (8)

where e is the eccentricity of the spheroid assumed forthe mass distribution. One can then expand ρ in a powerseries of the form

ρ(r ) = a−2

r2+ a−1

r+ a0 + · · · (9)

and perform a similar expansion for the velocity at a givendistance. The solution is then mapped term by term. Oncefinished, one can then integrate the entire mass of the diskby assuming that the spheroid is homogeneous, so that

M(r ) = 4π (1 − e2)1/2∫ r

0ρ(ξ )ξ 2 dξ. (10)

Such models are, of course, extremely simplified (themethod was introduced by Schmidt in the 1950s and is stilluseful for exposition), but they can illustrate the problemof the various assumptions one must make in obtainingquantitative estimates of the mass from the rotation curve.Since the rotation curves remain flat, one can also assumethat the M/L ratio is a function of distance from the galac-

tic center. This is folded into the final result after the massto a given radius has been determined.

Unlike our system, where one cannot be sure what isor is not a halo star, one can determine with some easethe velocity dispersions for face-on galaxies perpendicu-lar to the galactic plane with the distance from the centerof the system. In a procedure much like the Oort methoddescribed for the Galaxy, it is possible to place constraintson the amount of missing mass in the disk by using thedetermination of surface densities from the dynamical in-formation. External galaxies appear to have values similarto our system for the dark matter in the plane, about one-half of the matter being observed in stars.

Many disk systems show warps in their peripheries;M 31, the Galaxy, and other systems show distortions ofboth their stellar and H I distributions in a bending of theplane on opposite parts of the disk. This argues that thehalos of these galaxies cannot be too extensive, beyondthe observable light, or one would not be able to observesuch warps. Calculations show that these will damp, fromdynamical friction (dissipation) and also viscous effects,if the halo is too extensive. However, in the case of some ofthe current models for the distribution of DM in clusters, itis not clear whether this can also constrain a more evenlydistributed DM component (one less tied to the galaxiesbut more evenly distributed in the clusters than the halos).

V. BINARY GALAXIES

Binary galaxies may also be used to determine the M/Lof the DM. The process is much like that employed fora binary star, with the exception that the galaxies do notactually execute an orbit within the time scale or our ob-servations. Instead, one must perform the determinationstatistically. For a range of separations and velocities, onecan determine a statistical distribution of the M/L val-ues, which can then be compared with the mean separa-tions and masses obtained from the rotation curves of thegalaxies involved. Again, there is strong evidence for alarge amount of unseen matter, often reaching M/L val-ues greater than 100. The method has also been extendedto small galactic groups, with similar results.

Using large samples of binary galaxies, so that the ef-fects of orbital eccentricity, inclination, and mass transfercan be accounted for, recent studies have shown that abouta factor of a three times the observed mass is required toexplain binary galaxy dynamics. Summarized, the esti-mates of M/L range from about 10 to 50. The wide rangereflects uncertainties in both the Hubble constant, which isrequired for the specification of the separation in physicalunits, and the projection of the velocities of the galaxies inthe line of sight. Unlike clusters, binary galaxies are rather

Page 69: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 199

sensitive to the projection factors, since one is dealing onlywith two objects at a time.

Measurement of the masses of the component galaxiescan also add information to this picture, and there havebeen efforts to obtain the rotation curves for the galaxiesin binaries and small groups. The interpretation of the ve-locity differences, the quantity which, along with the pro-jected separation, gives the mass of the system, requiressome idea of the potential in which the galaxies move. Ifthere is very extended DM around each of the componentgalaxies, the use of point-mass or simple spheroidal mod-els will fail to give meaningful masses. Also, since someof the binary galaxies may be barred spirals and othersmay show subtle and not yet observed tidal effects, it islikely that the mass estimates from these systems shouldbe taken with considerable caution.

Small groups of galaxies, the so-called Hickson groups,can be used to provide an example of dynamical interac-tion which lies between the clusters and the binaries. Here,however, the picture is also complex. Interactions betweenthe constituent galaxies can alter their mass distributionsand affect their orbits in a manner that vitiates the determi-nation of the orbital properties for the member galaxies.If these groups are bound and stable, the M/L must be oforder 100. However, Rubin provided evidence, from thestudy of individual galaxies, that the rotation curves donot show the flat behavior of isolated galaxies. It is pos-sible that the halos of the galaxies have been altered inthe groups. Deep photographs of the groups provide somehandle on this. Complicated mass distributions of the lumi-nous matter are observed, indicating that the galaxies havebeen strongly interacting with one another. Mergers, col-lisions, and tidal interactions at a distance are all indicatedin the images. Clearly, more detailed study is required ofthese objects before they can be used as demonstrationsof the need for DM.

VI. THE LOCAL SUPERCLUSTER

A. Virgo and the Virgocentric Flow

The Local Group sits on the periphery of the Virgo su-percluster, containing several hundred bright galaxies andthe Virgo cluster centered on the giant elliptical galaxyM 87, which contains as much as 1014 M. Because of itsmass and proximity, its composition and dynamics havebeen well studied. Virial analyses indicate that the totalcluster is bound by DM, which amounts to about 10 timesthe observed mass (a figure that is typical of clusters andwhich will be discussed in more detail later). Because ofits closeness, however, the Virgo supercluster has an evenmore dramatic effect on motion of the Local Group, whichcan be used to independently estimate its total mass.

From the motion of the Local Group with respect to theVirgo cluster, it is possible to estimate the total mass ofthe cluster. The argument is a bit like that of Newton’sapple analogy. In the free expansion of the universe, localregions may be massive enough to retard the flow of thegalaxies away from each other. This is responsible for thestability of groups of galaxies in the first place. They arelocally self-gravitating. In the case of the motion of theGalaxy toward Virgo, there are several ways of measuringthis velocity.

One of the most elegant is the dipole anisotropy of thecosmic background radiation (CBR). The idea of using theCBR as a fixed light source relative to which the motion ofthe observer can be obtained is sometimes called the “newaether drift.” The motion of the observer produces a varia-tion in the effective temperature of the background that isdirectly proportional to the redshift of the observer relativeto the local Hubble flow. For motions of the order of 300km sec−1, for example, the variation in the temperatureof the CBR should be about 1 mK, with a dipole (that is,consine) dependence on angle. The CBR shows a dipoletemperature distribution, about one-third of which can beexplained by the motion of the Local Group toward Virgo,about 250–300 km sec−1, called the virgocentric velocity.Since the Local Group is part of the supercluster systemthat contains Virgo, it is not surprising that we are at leastpartially bound to the cluster. The total mass implied bythis motion is consistent with the virial mass determinedfor the cluster, which implies a considerable amount ofdark matter (about 10 times the observable mass). Thedipole anisotropy, however, is only partially, explained bythe virgocentric flow. There must be some other largerscale motion, about a factor of two larger, responsible formost of the variation in the CBR. This seems to resultfrom the bulk motion of the local supercluster toward theHydra-Centaurus supercluster (lII ≈ 270, bII ≈ 30). Thisbrings up the question of large-scale deviations, on thescale of superclusters, of the expansion from the Hubblelaw.

B. The Large-Scale Streaming in the LocalGalaxies: Dark Matter and the Hubble Flow

Distance determination can be a problem for galaxies out-side of the Local Group and Virgo cluster. This has todo with our inability to resolve individual stars and HII regions. Several methods have evolved that allow fordetermination of the intrinsic brightness of galaxies inde-pendent of their Hubble flow velocities. The Tully–Fisherrelation correlates the absolute magnitude of a galaxy, ei-ther in the blue or in the infrared, with the width of the21-cm line. Although this depends on the inclination ofthe galaxy, this can be taken into account from optical

Page 70: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

200 Dark Matter in the Universe

imaging, and it provides a calibration for the intrinsic lu-minosity of the galaxy from which, using the apparentmagnitude, the distance can be obtained directly, with-out cosmological assumptions. Another calibrator, espe-cially useful for elliptical and gas-poor galaxies, is theFaber–Jackson relation. This uses the velocity dispersionobserved for the core of elliptical galaxies and spheroidsto determine the intrinsic luminosity of the parent galaxy.It is representative of a wide class of objects, indicative ofdissipative formation of the systems, and is

L ∼ σ n, (11)

where σ is the velocity dispersion of the nucleus and n ≈ 4from most studies.

Using these methods, it is possible to obtain the dis-tance to a galaxy independent of the Hubble velocity; onecan then look for systematic deviations from the isotropyof the expansion of the galaxies in the vicinity of the Lo-cal Group. Observations show that there is a large-scaledeviation of galaxy motion in the vicinity of the Galaxy.Deviations at large displacement to the virgocentric ve-locity of order 600–1000 km sec−1 show that there is alarge departure from the Hubble law relative to the uni-form expansion that is usually assumed. The characteristicscale associated with this deviation is about 50–100 Mpc,much larger than the size usually associated with clus-ters of galaxies but on a scale of superclusters. No masscan clearly be identified with this gravitationally perturb-ing concentration, but it has been argued that it may bea group of galaxies of order 1014 M, about the size as-sociated with a very large cluster or small supercluster ofgalaxies. The mystery is its low luminosity, but it is lo-cated near the galactic plane, which would account for thedifficulty in observing its constituent galaxies. This maybe indicative of other large-scale deviations on a largerscale of gigaparsecs.

VII. CLUSTERS OF GALAXIES

From the first recognition of galaxies as stellar systemslike the Milky Way, it has been clear that their distributionis markedly inhomogeneous. Early studies by Shapley in-dicated large (factor of two) density fluctuations in theirdistribution, while later work has expanded the complexityof this distribution to larger density contrasts. A variety oflarge-scale structures is observed in the galactic clustering,which gives rise to the idea of a hierarchy. First, there areclusters, megaparsec scale concentrations of luminous ob-jects with total masses typically of 1013−1014 M. Thesecontrast with voids, the most famous of which is the BootesVoid, in which the density of galaxies is about 1/10 that ofthe averages. Clusters of galaxies cluster themselves into

structures called superclusters, and it appears that eventhese may have some clustering characteristics, althoughthey are sufficiently rare that the statistical information isshaky at best. Clusters can be dynamically and morpho-logically separated from the background of galaxies andthese can be studied more or less in isolation from oneanother.

Several catalogs are available for galaxy clusters, gener-ally identified with the names Abell and Zwicky. In addi-tion, large-scale sky counts of galaxy frequency per squaredegree have been made by Shane and Wirtanen (the Licksurvey), Zwicky, and the Jagellonian surveys. These donot contain any dynamical information; they are numbercounts only. But from information about the spatial cor-relation on the two-dimensional surface of the sky, thethree-dimensional properties of galaxies can be crudelydetermined.

This situation greatly improved with the Center for As-trophysics redshift survey, a complete radial velocity studyfor galaxies within 15,000 km sec−1 of the Local Group(covering scales < 0.1c). With velocity information avail-able, it is possible to place the galaxies in question inthree-dimensional space and to look at the detailed dis-tribution of both the galaxies and clusters on the scale of100 Mpc. This has given the first evidence for the natureof the large-scale structure of the universe and the needto consider dark matter in the context of both clusters andsuperclusters of galaxies. First, we examine the basis forthe determination of mass within the clusters and the ar-guments on the reality of clustering. Then, we discuss theways in which this information can be used to address thedistribution of, and need for, the DM on scales comparablewith the size of the visible universe.

The problem of cold versus hot dark matter has beenaddressed by a number of techniques. Large-scale struc-tures have been detailed in the neighborhood of the LocalGroup. Two big structures, dubbed the Great Attractor(Lynden-Bell et al., 1988) and the Great Wall (Geller andHuchra, 1990), have been discovered on size scales of or-der 100 Mpc. The presence of a large-scale deviation fromthe velocities of the Hubble flow was the signature of theAttactor, a comparatively local region possibly associatedwith a supercluster.

The Harvard Center for Astrophysics redshift survey ofabout 15,000 galaxies in the northern hemisphere has re-vealed several superstructures and, in general, a distinctlyinhomogeneous distribution for local luminous matter.Its implication for dark matter is that, at least on thescale of clusters and individual galaxies, there is consid-erable power in the small-scale length for which CDMis likely responsible. A statistically more distant samplehas, however, been analyzed using the IRAS database.These infrared-detected galaxies provide a sample with

Page 71: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 201

very different selection criteria than the optically based,and optically based, surveys previously available. The pro-cedure for analyzing such datasets is to follow up on the IRdetections with random samples of redshifts of a compara-tively large number of galaxies (of order several thousand)from which distances are obtained. The results, reportedby Saunders et al. (1991), find a significant excess in thegalaxy correlation function on scales in excess of 20 Mpc,a size large in comparison with that expected for ordinaryCDM. Precisely how this will stand up in extended surveysbased on deeper cuts of the existing catalogs, and how thiscompares with the source counts from COBE and otherIR missions, remains to be seen.

A. The Virial Theorem

One of the first suggestions that there is a compelling needto include unseen matter in the universe came from theoriginal determination of velocity dispersions in clustersof galaxies. The argument is quite similar to that used fora gas confined in a gravitational field. Galaxies in a clustermay collide, a problem to which we will later return, butfor the moment we will assume that the galaxies are inindependent orbits about the center of mass of the clusters.The velocity dispersion is the result of the distributionof orbits of the galaxies about the cluster center, and theradial extent of the galaxies in space results from the factthat they are bound to the cluster. A dynamical system inhydrostatic equilibrium obeys the virial theorem, whichstates that

2T + W = 0, (12)

where T is the kinetic energy and W is the gravitationalpotential energy. The total energy of the system must be atmost zero, since the system is assumed to be bound. There-fore, the total energy is given by E = T + W = 1

2 W < 0,and the velocity dispersion is given approximately by

σ 2 = G Mvirial

〈R〉 . (13)

The mass determined from the statistical distribution oforbits depends on the size of the system, and this requiressome estimate of the density profile of the cluster. Therms value of the group radius usually suffices for the com-ponent of the radial velocity. It is assumed that the orbitsare randomly distributed and that they have not been se-riously altered since the formation of the cluster. Also, anassumption is embedded in this equation for the disper-sion: the mass distribution has remained unaltered on atime scale comparable with the orbital time scale for thegalaxies about the center of mass.

The argument that clusters must be bound and stablecomes from the distribution of galaxy velocities within the

clusters. The dynamical time scale, also called the crossingtime, is the period of a galaxy orbit through the center ofthe cluster potential. The characteristic time scales are oforder 109 years, much shorter than the Hubble time (theexpansion time scale for the universe, taken as the inverseof the Hubble constant).

There have been numerous simulations of the evolutionof galaxies in clusters, most of which show evidence forencounters of the cluster members. There is also strongevidence for interaction in the numerous disturbed galax-ies in these clusters and the presence of cD galaxies, giantellipticals with extended halos, in the cluster centers. Thelatter are assumed to grow by some form of accretion, alsocalled cannibalism, of the neighboring galactic systems.

B. Interactions in Groups of Galaxies

Galaxies in clusters collide. This simple fact is respon-sible for many of the problems in interpreting the virialtheorem because it assumes that the dynamical system isintrinsically dissipationless. Collisions redistribute angu-lar momentum and mass and perturb orbits in stochasticand time-dependent ways. None of these effects can beincluded in the virial theorem formulation, but instead re-quire a full Fokker–Planck treatment of the dynamics.

Observations of small groups of galaxies, the Hicksongroups, reveal a complex array of interactions—shells,bridges, and tails are found for many of the members ofthese small groups. The extent to which this alters the massdistribution of the groups is not known nor is it knownhow this affects the determination of the M/L ratio forthe group.

For clusters of galaxies, the situation is even more com-plicated. Many cD galaxies have been found to be embed-ded in shell systems, which are believed to arise fromaccretion of low-mass disk galaxies by the giant ellip-ticals. The shells have been used to argue for the massdistribution of the cDs, although the detailed mass distri-bution cannot be determined from the use of the shells.Instead, they serve as a warning about using simple colli-sionless arguments for the evolution of the cluster galaxydistribution.

C. The Distribution of Clustersand Superclustering

The two-point correlation function, ξ (r ), is the measureof the probability that there will be two galaxies within afixed distance r of each other. The probability of findinga galaxy in volume element dV1 within a distance r ofanother in volume dV2 is given by

d P(1, 2) = n2[1 + ξ (r1,2)] dV1 dV2, (14)

Page 72: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

202 Dark Matter in the Universe

where n is the mean number density of galaxies. Defined inthis way, it is a good measurement of the trend of galaxiesto cluster. If there is a clustering, the correlation is asymp-totically vanishing at large distance, very positive at smalldistance, and negative for some interval. It is only a re-quirement that the cumulant be positive definite (normal-izable). The two-point function is not, however, the solediscriminant of clustering. The problem with the equationsof motion is that, in an expanding background and underthe influence of gravity, the mass points evolve in a highlynonlinear way. The equations of motion do not close atany order, and there is a hierarchy to the distribution ofthe correlations of different orders.

The correlation function seems to have a nearly univer-sal dependence on separation, ξ (r ) ∼ r−1.8, whether forgalaxy–galaxy or cluster–cluster correlations. The coef-ficient is different, which indicates that the hierarchy isnot quite exact and that there may be further differenceslurking on the scale of superclusters. It is not likely, how-ever, that this will be easily studied, considering the sizeof the samples and the extreme difficulty in determiningthe extent to which these largest scale structures can beseparated from the clusters.

D. The Cosmic Microwave Background

The November 1989 launch of the Cosmic BackgroundExplorer, COBE, revolutionized the study of large-scalestructure. Designed to operate in the far-infrared and mil-limeter portions of the spectrum, COBE performed an all-sky survey of the diffuse cosmic background radiation(CBR).

The CBR is truly Planckian; that is, it is a uniform-temperature blackbody spectrum. The temperature is2.735 K, with an error of less than 60 mK for the wave-length region between 400 µm and 1 cm. The dipoleanisotropy, caused by motion of the local standard of resttoward the Virgo cluster, is easily detected in the data withan amplitude of 3.3 mK (with less than a 10% error) in thedirection α = 11h .1 and δ = −6o.3. The lack of a strongquadrupole signature in the background is also very im-portant. This places limits on the distribution of large massconcentrations at large redshift and also on strong massconcentrations at distances comparable to the Virgo clus-ter (about 10–100 Mpc).

An important limit provided by the shape of the spec-trum is the variation of temperature over the sky due toscattering of the CBR by hot gas. Called the Sunyaev–Zeldovich effect, relativistic or near-relativistic electronswith temperature Te. scatter the background photons,boosting their energies to ν ≈ kBTe/h, where h is Planck’sconstant and kB is the Boltzmann constant. This scatter-ing removes the photons from the long-wavelength part of

the CBR, resulting in a change in the intensity with posi-tion. The limit provided by the COBE results is that y, thedimensionless Sunyaev–Zeldovich parameter, is ≤10−3,ruling out a substantial contribution by a hot intraclustermedium to the temperature of the CBR.

The major result from COBE was the detection oflarge-scale (>10) temperature fluctuations, T/T ≈(29 ± 1)−(35 ± 2) µK (the smaller amplitude correspondsto the larger angular scales). These are the signature ofthe fluctuations in the matter distribution imposed duringinflation that cross the horizon at the decoupling epochand produce local perturbations in the gravitational field.Even without the further constraint of galaxy and clusterformation, these temperature variations require a largedark matter component. First, they are very low ampli-tude. There is little time for them to grow large enough todrive clustering on scales of tens of megaparsecs unlessthere is something massive but unseen. Second, they domake it out of the earliest stages of the expansion withoutdamping. And most significantly, they are very large scale.

VIII. COSMOLOGICAL CONSTRAINTSAND EXPLANATIONS OF THEDARK MATTER

Several candidates have been suggested for the particularconstituents of DM. Since this is a field of enormous vari-ety and unusual richness of ideas, many of which changeon short life times, only a generic summary will be pro-vided. This should be sufficient to direct the interestedreader to the literature.

A. Cosmological Preliminaries

In attempting to find a likely candidate for the DM, onemust first look at some of the constraints placed on modelsby the cosmological expansion. First, the particles mustsurvive the process of annihilation and creation that dom-inates their statistical population fluctuations during theearly phases of the expansion. The cosmology is given bythe Friedmann–Robertson–Walker metric for a homoge-nous, isotropically expanding universe with radial coordi-nate r and angular measures θ and φ:

ds2 = dt2 − R2(t) dr2

1 − kr2− R2(t)(dθ2 + sin2 θ dφ2), (15)

where R(t) is the scale factor, the rate of the expansionbeing given by

d

dt(ρV ) + p

dV

dt= 0, (16)

Page 73: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 203

which is the entropy equation, with the volume V ∼ R3

and ρ and p being the energy density and pressure, re-spectively, and

H 2(t) =(

R

R

)2

= 8πGρ

3− k

R2, (17)

which is essentially the Hubble law. The value of k iscritical here to the argument, since it is the factor thatdetermines the curvature of the universe. The critical so-lution, with = 1, has a deceleration parameter q0 = 0.Here, = ρ/ρc, where ρc is the density needed to givea flat space–time, ρc = 3H 2

0 /8πG ≈ 2 × 10−29h2 g cm−3,where H0 is the present value of the Hubble constantand h = H0/(100 km sec−1 Mpc−1). This solution is fa-vored by inflationary cosmologies. During the radiation-dominated era, the energy density varies like R−4 and thepressure is given by 1

3 ρ, so that during the earliest epoch,ρ ∼ t−2. ∼ T −2.

The horizon is given by

lH =∫ t

t0

dt ′

R(t ′). (18)

Perturbations in the expanding universe have a chanceof becoming important when they grow to the scale ofthe horizon. Others will damp because they cannot becausally connected and will simply locally evolve anddie away. The expansion then produces fluctuations in thebackground radiation on the scale of the horizon at theepoch at which regions begin to coalesce.

The entropy of the expanding universe is constant, sothat the number density of particles at the time of theirformation is related to the temperature of the background,ni T −3 = const, from which we can define the ratio of thephoton to baryon number density as a measure of the en-tropy, S = nγ /nb = 109, where nγ ∼ T 3. For particle cre-ation, the particles will freeze out of equilibrium when theenergy in the background is small compared with theirrest mass energy, or when T ≤ mi c2/kB, where kB is theBoltzmann constant. Thus, for particles that are very mas-sive, during the radiation-dominated phase, the tempera-ture can be very high indeed. Put another way, the equi-librium temperature is defined as Teq = 1011mi (GeV) K.Since the temperature during the radiation-dominated eravaries like R−1, this implies that R/R0 ∼ 10−11mi , whereR0 is the present radius. The point is that the particleswhich may constitute nonbaryonic explanations for theDM are created in a very early phase of the universe,one which presented a very different physical environ-ment than anything we currently know directly. The cos-mic background radiation has a temperature of 2.735 K atthe present epoch.

B. The Cosmological Constant

The cosmological constant, , was first introduced byEinstein in an attempt to maintain Mach’s principle thatlocal inertia should be determined by the distribution ofmatter on the largest scale. Although ignored wheneverpossible, the constant is easily included in the field equa-tions through an added term /3 on the right-hand sideof Eq. (17). This behaves like a negative pressure, driv-ing the expansion, and is presumed to arise from vacuumenergy. Observations had, until recently, been consistentwith a vanishingly small . But recent data for type Ia su-pernovae strongly point to a value of ≈ 0.7 (althoughstill with = matter + = 1), which in light of the nu-cleosynthesis limits would make this term the dominantcontributor to the energy density. The data show an addi-tional acceleration is required to explain why supernovaeof this type, which have been shown to be standard candleswith intrinsically small luminosity dispersions, are fainterand more distant for z > 0.5 than would be predicted fromstandard cosmological models with = 1.

C. Global Constraints on the Presenceand Nature of Dark Matter

1. Nucleosynthesis

There are few direct lines of evidence for the total massdensity of the universe. One of the most direct comes fromthe analysis of the abundance of the light elements. Specif-ically, in the early moments of the Big Bang, within thefirst few minutes, the isotopes of hydrogen and heliumfroze out. While 4He can be generated in stars by nuclearreactions, deuterium cannot. In the Big Bang, however, itcan be produced by several different reactions,

p + n → 2D + γ, p + p + e → 2D + γ,

3He + γ → 2D + n,

and the deuterium can be destroyed very efficiently by

2D + n → 3H, 2D + p → 3He + γ.

The critical feature of all of these reactions is that theydepend on the rate of expansion of the universe duringthe nucleosynthetic epoch. The drop in both the densityand temperature serves to throttle the reactions, producingseveral useful indicators of the density.

The argument continues: if the rate of expansion is veryrapid, that is, if the density is much lower than closure, thenthe deuterium can be rapidly produced but not effectivelydestroyed because of the drop in the reaction rate for thesubsequent parts of the nuclear network. However, if thedensity is high, the rate of expansion will be slow enoughfor the chain of reactions to go to equilibrium abundances

Page 74: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

204 Dark Matter in the Universe

of 4He because of the consumption of the 2D. Thus, theD/H ratio, the mass fraction of 4He, and the 3He/4He ratiocan be used to constrain the value of = ρ/ρc, where ρc isthe critical density required for a flat universe (2 × 10−29h2

g cm−3). The larger is , the smaller is the D/H ratio. Cur-rent observations of the primordial abundances of heliumisotopes and of deuterium provide matter ≈ 0.1−0.3.

This appears to provide a number significantly lowerthan that obtained from the virial masses for clusters andfrom clustering arguments. Since the baryons in the earlyuniverse determined the abundances of the elements thatemerged from the Big Bang, the abundances of the lightisotopes provide a strong constraint on the fraction of DMthat can be in baryonic form, luminous or not.

2. Isotropy of the CBR

Recent balloon observations, especially theBOOMERANG experiment, have probed the intermedi-ate angular scales of the CBR fluctuations. These datashow a strong peaking on a scale of about 1, which cor-responds to an angular wave number of l ≈ 200. This ispresumably revealing the acoustic (pressure) fluctuationsthat were generated during the inflationary epoch andsurvived by stretching through the expansion to z ≈ 1000and indicate the size of the horizon. For angular scalessmaller than about 7 arcmin × 1/2, the perturbations aredamped because they are lower than the Silk mass, seebelow, Eq. (23). The spectrum is best reproduced by anopen CDM model (CDM with a cosmological constant).Further improvements in the angular correlation tests willbe achieved after MAP and PLANCK are launched inthis decade.

3. The Formation of Galaxiesand Clusters of Galaxies

Recent work on the distribution of galaxies has centeredon the idea that the visible matter does not represent thedistribution of mass overall. The picture that is develop-ing along these lines, called “biased galaxy formation,”takes as its starting point the assumption that galaxiesare unusual phenomena in the mass spectrum. One usu-ally assumes that the galaxies are the result of perturba-tions in the expanding universe at some epoch during theradiation-dominated period and therefore are representa-tive of the matter distribution. However, biasing arguesthat the galaxies are the product of unusually large densityfluctuations and that the subsequent development of theseperturbations is dominated by the smoother backgrounddistribution of dark matter—that which never coalescesinto galactic mass objects. Several mechanisms have beensuggested, all of which may be reasonable in some part

of the evolution of the early universe. One picture usesexplosions, or bubbles, formed from the first generationof stars and protogalaxies to redistribute mass, alter thestructure of the microwave background, and erase the ini-tial perturbation spectrum.

Gravitational clustering alone is insufficient to explainthe dramatic density contrasts observed between clustersand voids, unless one invokes large perturbations at theepoch of decoupling of matter and radiation. The currentdensity contrast is of order 2–10, which implies that at therecombination era the perturbations in the density must beof order 10−3. Either the fluctuations in the density at thisperiod are isothermal, in which case the density variationis not reflected in the temperature fluctuations in the cos-mic background radiation, or there must be some othermechanism for the formation of the galaxies. In an adi-abatic variation, δT/T = (1/3) × δρ/ρ, which is clearlyruled out by observations on scales from arcseconds todegrees to a level of 10−5. If the density fluctuations aresmaller than this value, there is not enough time for themto grow to the sizes implied by the large-scale structureby the present epoch. Further, quasars also appear to showclustering and to be members of clusters, at z = 1−4, sothis pushes the growth of nonlinear perturbations to evenearlier periods in the history of the universe. If the densityof matter is really the cosmological critical value, that is,if = 1, then there cannot be a simple explanation of thedistribution of luminous matter simply by the effects ofprimordial fluctuations.

Biasing mechanisms can be combined with the DM inassuming that the massive particles are the ones whichshow the large-scale perturbations. Here, the differencebetween hot and cold DM scenarios shows up most clearly.If the matter is formed hot and stays hot, it will damp out allof the small-scale perturbations. If formed cold, these willbe the ones that will grow most rapidly, with gravitationaleffects later sorting the smallest fragments of the Big Banginto the larger scale hierarchical structures of clusters andsuperclusters. The topologies of the resultant universesdiffer between these two extremes, with the cold mattershowing the larger contrast between voids and clusters andan apperance that is best characterized by filaments ratherthan lumps. It should be added that biasing can act on bothscenarios and that the final appearance of the simulationscan depend on the statistical method chosen to determinethe biasing.

4. Cold versus Hot Dark Matter: New Results

The choice between cold versus hot dark matter scenariosdepends on the smoothness of the mass distribution butparticularly on the concentration of galaxies within thelarge-scale filaments that link clusters. Too hot a DM

Page 75: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 205

component and all the substructure washes out, too coldand it grows too quickly. As with Goldilocks, there isalways a third option—warm DM—but the choice ofparticle is less clear. The distinction between the differentscenarios rests on whether the particles are relativisticand/or massless, or thermal and/or baryonic, in whichcase the former support the largest physical perturbationsbut prevent the coalescence of substructure, while thelatter exacerbate its growth.

D. The Particles from the Beginning of Time

The initial scale at which all of the particle theories startis with the Planck mass, the scale at which quantum per-turbations are on the same scale for gravity as the eventhorizon. This gives

mPlanck =(

hc

2πG

)1/2

≈ 1.7 × 1019 GeV. (19)

This is the scale at which the ultimate unification of forcesis achieved, independent of the particle theories. It dependsonly on the gravitational coupling constant. It is then aquestion of where the next grand unification scale occurs(the GUT transition). This is usually placed at 1012−1015

GeV, considerably later and cooler for the background.The particles of which the background is assumed to becomposed are usually assumed to arise subsequent to thisstage in the expansion, and will therefore have masses wellabove those of the baryons, in general, but considerablybelow that of the Planck scale.

1. Inflationary Universe and Soon After

The most recent interest by theorists in the need for DMcomes from the inflationary universe model. In this pic-ture, originated by Guth and Linde, the universal expan-sion begins in a vacuum state that undergoes a rapid ex-pansion leading to an initially flat space–time, which thensuffers a phase transition at the time when the temperaturefalls below the Planck mass. At this point, the universe re-inflates, now containing matter and photons, and it is thisstate which characterizes the current universe. One of thecompelling questions that this scenario addresses is whythe universe is so nearly flat. That is, the universe, were itsignificantly different from a critical solution at the initialinstant, would depart by large factors from this state bythe present epoch. The same is true for the isotropy. Theuniverse at the time of appearance of matter must havebeen completely isotropized, something which cannot beunderstood within the framework of noninflationary pic-tures of the Big Bang.

The fact that, in these models, = 1 is a requirement,while all other determinations of the density parameter

yield far smaller values (open universes), fuels the searchfor some form of nonbaryonic DM. The ratio /observed

is approximately 10–100, of the same order as that re-quired from clusters of galaxies.

During the expansion, particles can be created and an-nihilated through interactions; thus,

dn

dt= −D(T )n2 − 3

R

Rn + C(t), (20)

where the first term is the annihilation, which is temper-ature dependent; the second is the dilution of the numberdensity because of the expansion; and the third is the cre-ation term, which depends on time and (implicitly) temper-ature. Let us examine what happens if the rate of creationdepends on a particle whose rate of creation is in equilib-rium with its destruction. Then, if a two-body interactionis responsible for the creation of new particles and theyhave a mass m, the Saha equation provides the number inequilibrium,

neq ∼ (mT )3/2 exp(−m/T ), (21)

and the rate of particle production becomes a function ofthe particle mass. Thus, the more massive particles canbe destroyed effectively in the early universe but will sur-vive subsequently because of the rapid expansion of thebackground and dilution of the number density. Any DMcandidate will therefore have a sensitive dependence on itsmass for the epoch at which the separation from radiationoccurs as well as the strength of the particle’s interactionwith matter and radiation.

Following their freeze-out phase, the particles willbe coupled to the expanding radiation and matter untiltheir path length becomes comparable with the horizonscale. This phase, called decoupling, also depends on thestrength of the interaction with the radiation and matter.After this epoch, the particles can freely move in the uni-verse, which is now “optically thin” to them. It is thisfree-streaming that determines the scale over which theparticles can allow for perturbations to grow. For instance,in the case of the massive cold particles, those createdwith very low velocity dispersions, the scale over whichthey can freely move without interacting gravitationallywith the baryonic matter is determined by the mass of thebaryon clumps. For cold DM, this means that the slowparticles, those with energies of less than 10 eV, can betrapped within galactic potentials. Those with energies of100 eV would still effectively be trapped by clusters. If theparticles are hot, they cannot be trapped by these clumpsand can, in fact, erase the mass perturbations that wouldlead to the formation of these smaller structures.

As mentioned in discussing galaxy formation in gene-ral, simulations of the DM show that for the hot parti-cles, the mass scale that survives the primordial

Page 76: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

206 Dark Matter in the Universe

perturbations and damping is the size of superclusters,about 1015−1016 M. The superclusters thus form first,then these fragment and separate from the backgroundand form clusters and their constituent galaxies from the“top down.” If the particles are cold, they cannot erase thesmallest scales of the perturbations, on the order of galaxysize (1012−1013 M), and clusters and superclusters formgravitationally from the “bottom up.” The problem isthat there is little observational evidence to distinguishwhich of these is the most likely explanation. In effect,it is the simulations of the galactic interactions, placedin an expanding cosmology and with very restrictiveassumptions often being made about the interactionsbetween galaxies within the clusters thus formed, whichare used as argumentation for one or the other scenario.In fact, it is for this reason that the models must becalled scenarios since there cannot be detailed analyticsolution of the complex problem and the solutions areonly sufficient within the limitations of the preciseassumptions that have been applied in the calculations.

The critical mass for gravitational instability and galaxyformation is the Jeans mass, the length (and thereforemass) scale at which material perturbations become self-gravitating and separate out in the expanding background.For species in equilibrium with radiation, it depends onboth the number density of the dominant particle and itsmass as well as on the temperature of the background radi-ation. In the case of the expanding universe, since duringthe radiation-dominated era the mass density and temper-ature are intimately and inseparably linked, the Jeans massdepends only on the mass of the particle that is assumedto be the dominant matter species:

MJ ∼ ρi

(a2

s

Gρi

)3/2

=(

T 3

Gm2i ni

)1/2

≈ 3 × 10−9m−1i,GeV M, (22)

where as is the sound speed and ni is the number densityof the dominant species. The smaller the mass of the par-ticle, the larger the initially unstable mass. This collapsemust compete against the fact that viscosity from the in-teraction with background photons tends to damp out thefluctuations in the expanding medium. The critical massfor stability against dissipation within the radiation back-ground is given by the Silk mass, below which the photons,by scattering and the effective viscosity that is represents,damp perturbations. This gives a mass of

Ms = 3 × 1013(h2)−5/2 M. (23)

Below this mass, which is of the same order as a galacticmass, perturbations are damped out during the early stagesof the expansion. This provides a minimum scale on whichgalaxies can be conceived to be formed. Other particles,

however, can also serve to damp out the perturbations atan even earlier epoch if they are hot enough.

2. The Particle Zoo

a. Baryons. Baryons, the protons and neutrons, arethe basic building blocks of ordinary matter. They interactvia the strong and electromagnetic forces and constitutethe material out of which all luminous material is com-posed.

The basic form of baryonic matter, hot or cold gas, canbe ruled out on several counts. One has been discussedin the section on nucleosynthesis in the Big Bang; thebaryonic density is constrained by isotopic ratios to besmall. Another constraint is provided by the absorptionlines formed from cold neutral gas through the intergalac-tic medium in lines of sight to distant quasars. There are notsufficiently high column densities observed along any ofthe lines of sight to explain the missing mass. The absorp-tion from Lyman α, the ground-state transition of neutralhydrogen, is sufficiently strong and well enough observedthat were the gas in neutral form, it would certainly showthis transition viewed against distant light sources. Whilesuch narrow absorption lines are seen in QSO spectra,they are not sufficiently strong to account alone for thedark matter.

In addition, hot gas is observed in clusters of galaxies inthe form of diffuse X-ray halos surrounding the galaxiesand cospatial with the extent of their distribution. Hereagain, the densities required to explain the observationsare not sufficient to bind the clusters or to account for thelarger scale missing mass since the X-ray emission seemsto be confined only to the clusters and not to pervade theuniverse at large scale.

It is possible that the matter could be in the form ofblack holes formed from ordinary material in a collapsedstate, but the required number, along with the difficul-ties associated with their formation, makes this alterna-tive tantalizing but not very useful at present. Of course,football- to planet-sized objects could be present in verylarge numbers, forming the lowest mass end of the massspectrum that characterizes stars. Aside from the usualproblem of not being able theoretically to form these ob-jects in the required numbers, there are also the constraintsof the isotropy of the microwave background and of theupper limit to the contribution of such masses to the in-frared background. Since these objects would behave likeordinary matter, they should come into thermal equilib-rium with the background radiation and hence be observ-able as fluctuations in the background radiation on thescale of clusters of galaxies and smaller. Such a variationin the temperature of the CBR is not observed. In addi-tion, it would have been necessary for the baryons to have

Page 77: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 207

participated in the nucleosynthetic events of the first fewminutes, and this also constrains the rate of expansion torule out these as the primary components of the universe.If inflation did occur and the universe is really an = 1world, then the critical missing mass cannot be made upby baryons without doing significant damage to the under-standing of the nuclear processing of the baryonic matter.

b. Neutrinos. The electron neutrino (νe) is the light-est lepton, and in fact the lightest fermion, making it aninteresting particle for explaining DM. It cannot decay intolower mass species if it is only available in one helicity, andthus would survive from the epoch at which it decoupledfrom matter and radiation during the radiation-dominatedera of the expansion.

It is well known that neutrino processes, being mod-erated by the weak force, permit the particles to escapedetection by many of the classical tests. That these areweak particles means that they decouple from the expan-sion sooner than the photons and can freely stream onscales larger than those of the photons, also remaininghotter than the photon gas and therefore more extended intheir distribution if they accrete onto galaxies or in clus-ters of galaxies. Should the neutrino have a large enoughmass, about 30 eV, the predicted abundance of the threeknown species is sufficient to close the universe and pos-sibly account for the gross features of the missing mass.Limits place this maximum mass for the electron neutrinospecies as ≤10 eV. Other problems with this explanationinclude the rates of nucleosynthesis in the early universe,particularly the neutron to proton ratio, which is fixed bythe abundance of light leptons and the number of neutrinoflavors.

The argument that the νe mass must be nonvanishing isimportant. If the neutrino is massive, it may be nonrela-tivistic. This implies that it decoupled from the microwavebackground at some time before the baryons, but later thanthe massless particles. Since the particle decoupling de-pends on the temperature of the radiation, this determinesthe mean free-streaming velocity. The capture efficiencyof baryons to this background of particles is dependent ontheir mass and velocity. The escape velocity from a galaxyis several hundred km sec−1, while for galaxy clusters it ismuch higher. Thus, for galactic halo DM to be explainedby νe, it is necessary that the particles be cold, that is,they must have free-streaming velocities less than 100 kmsec−1. The restriction of the particle mass to less than10 eV by the observations of the width of the burst of neu-trinos from Supernova 1987 A in the Large MagellanicCloud likely adds fuel to the argument that the neutrino isan unlikely candidate for DM.

If νe is the dominant component of DM, an analog ofthe Silk mass is possible, below which fluctuations are

damped because of the weak interaction of matter withthe relic neutrinos. The density parameter is now given by = 0.01h−2mν (eV) and

Mν = 5 × 1014(νh2

)−2M, (24)

which is closer to the scale of clusters than galaxies. Withcurrent limits, it looks as if νe can neither provide ν = 1nor seriously alter the mass scale for galaxy formation,but may nonetheless be important in nucleosynthesis.

c. Alternative particles. After exhausting classicalexplanations of DM, one must appeal to far more exoticcandidates. It is this feature of cosmology that has becomeso important for particle theorists. Denied laboratory ac-cess to the states of matter that were present in the earlystages of the Big Bang, they have attempted to use the uni-verse as a probe of the conditions of energy and length thatdominate the smallest scales of particle interactions at afundamental level. Above the scale of energy representedby proton decay experiments, the only probable site for thestudy of these extreme states of matter is in the cosmos.

Many grand unified theories, or GUTs, contain brokensymmetries that must be mediated by the addition of newfields. In order to carry these interactions, particles haveto be added to the known spectrum from electroweak andquantum chromodynamics (QCD) theory. One such par-ticle, the axion, is a moderator of the charge-parity (CP)violation observed for hadrons. It is a light particle, butnot massless, which is weakly interacting with matter andshould therefore decouple early in its history from thebackground radiation. Being of light mass and stable, itwill survive to the epoch of galaxy formation and mayprovide the particles needed for the formation of the grav-itational potentials that collect matter to form the galaxy-scale perturbations.

It is perhaps easiest to state that the axion has the at-tractive feature that its properties can largely be inferredfrom models of DM behavior rather than the other wayaround, and that the formation of these particles, or some-thing much like them, is a requirement in most GUTs.Simulations of galaxy clustering make use of these andother particles with very generalized properties in orderto limit the classes of possible explanations for the DM ofgalaxies and clusters of galaxies.

Supersymmetry, often called SUSY, is the ultimateexploitation of particle symmetries—the unification offermions and bosons. Since there appears to be a character-istic scale of mass at which this can be effected and sincesupersymmetry assigns to each particle of one type a part-ner of the opposite type but of different mass, it is possiblethat the different particles decouple from the primordialradiation at different times (different temperature scales

Page 78: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

208 Dark Matter in the Universe

at which they are no longer in thermal equilibrium) andsubsequently freely move through the universe at large.This provides a natural explanation for the mass scalesobserved in the expanding hierarchy of the universe.

The photon is the lightest boson, massless and with inte-ger spin. It is a stable particle. Its supersymmetric (SUSY)partner is the photino, the lowest mass of the SUSY zoo.The photino freezes out of the background at very hightemperatures, of order Tphotino ≈ 4(mSUSY,f/100 (GeV)4/3

MeV, where mSUSY,f is the mass of the SUSY particle(a fermion) by which the photino interacts with the back-ground fermions. If the mass is very small, less than 4 MeV,the decoupling occurs early in the expansion, leading torelativistic particles and free streaming on a large scale;above this, the particle can come into equilibrium withthe background through interactions that will reduce itsstreaming velocity, and annihilation processes, like thoseof monopoles, will dominate its equilibrium abundance.The final abundance of the photino is dominated by thestrength of the SUSY interactions at temperatures higherthan Tphotino.

If the gravitino, the partner of the graviton, exists, then itis possible for the photino to decay into this SUSY fermionand a photon. Limits on the half-life can be as short as sec-onds or longer than days, suggesting that massive photinosmay be responsible for mediating some of the nucleosyn-thesis in the late epoch of the expansion. If the photinosdecay into photons, this raises the temperature of the back-ground at precisely the time at which it may alter the for-mation of the light isotopes, photodisintegrating 4He andchanging the final abundances from nucleosynthesis. Re-cent attempts to place limits on the processes have shownthat the photinos, in order to be viable candidates for DM,must be stable, or have a large mass (greater than a fewGeV) if they are nonrelativistic, or have a very low mass(<100 eV). There are problems with large numbers oflow-mass particles for the same reason there are problemswith νe, since these would tend to wash out much of thesmallscale structure while not being well bound to galaxyhalos.

Strings are massive objects that are the product ofGUTs. They are linear, one-dimensional objects that be-have like particles. Arguments about their structure andbehavior have shown that, as gravitating objects, they canserve as sites for promoting galaxy formation, and maybe responsible for the biasing of the DM to form potentialwells that accrete the baryons out of which galaxies arecomposed. There is no experimental support for their ex-istence, but there is considerable interest in them from thepoint of view of particle theories. Strings have the prop-erty that they can either come in infinite linear or closedforms. The closed forms serve as the best candidates forinclusion in the DM scenarios.

IX. FUTURE PROSPECTS

The current status of DM is very confusing, especiallyin light of the wealth of possible models for its expla-nation. None of the particles presently known can ex-plain the behavior of matter on scales above the galactic,while there are a number of hypothetical particles that cando so for larger scales. The masses of galaxies are suchthat, in order to explain their halos, the particles of whichthey are composed must be nonrelativistic, that is, theymust have velocities less than a few hundred km sec−1.Thus, cold DM appears the best candidate for the con-stituents of galactic halos. On the scale of clusters andsuperclusters, however, it is still not clear whether this isa firm limit. Biased galaxy formation seems a viable ex-planation of the distribution of luminous matter, but hereagain there is a wealth of explanation and not much datawith which to test it. A summary of the evidence for DMis given in Table I, and a list of candidates is given inTable II.

The best candidate for testing some of the models isspace observation. The Hubble Space Telescope is a high-resolution optical and ultraviolet spectroscopic and imag-ing instrument capable of reaching 30th magnitude andof performing high-resolution observations of galaxies toat least z = 4. Surveys of redshift distributions in clustersof galaxies with very high velocity resolution should aidgreatly in the details of virial mass calculations, whilethe imaging should delineate the extent to which interac-tions between cluster galaxies have played a role in theformation of the observed galactic distributions. The Cos-mic Background Explorer (COBE), launched in Novem-ber 1989, was designed to look at the isotropy of the mi-crowave background, near the peak of the CBR spectrum.COBE has placed strict limits on the scales on which thebackground shows temperature variations, thereby delim-iting the scale of adiabatic perturbations in the expandinguniverse at the recombination epoch.

TABLE I Summary of Evidence for Dark Matter in DifferentEnvironments

Evidence 〈M/L〉 Ω

Galactic stars and clusters 1 0.001

Vertical galactic gravitational acceleration 2–3 0.002

Disk galaxy dynamics 10 0.01

Irregular and dwarf galaxies 10–100 0.101–0.1

Binary galaxies and small groups 10–100 0.01–0.1

Rich clusters and superclusters 100–300 0.2 ± 0.1

Large-scale structure 1000 0.5–1.0

Baryosynthesis 10–100 ≤0.1

Inflationary universe 1000h 1.0

Page 79: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

Dark Matter in the Universe 209

TABLE II Nonbaryonic Dark-Matter Candidates and Their Propertiesa

Candidate/particle Approximate mass Predicted by Astrophysical effects

G(R) — Non-Newtonian gravitation Mimics DM on large scales

(cosmological constant) — General relativity Provides = 1 without DM

Axion, majoron, Goldstone boson 10−5 eV QCD; PQ symmetry breaking Cold DM

Ordinary neutrino 10–100 eV GUTs Hot DM

Light higgsino, photino, 10–100 eV SUSY/SUGR Hot DMgravitino, axino, sneutrinob

Para-Photon 20–400 eV Modified QED Hot/warm DM

Right-handed neutrino 500 eV Superweak interaction Warm DM

Gravitino, etc.b 500 eV SUSY/SUGR Warm DM

Photino, gravitino, axino, mirror- keV SUSY/SUGR Warm /cold DMparticle, simpson neutrinob

Photino, sneutrino, higgsino, MeV SUSY/SUGR Cold DMgluino, heavy neutrinob

Shadow matter MeV SUSY/SUGR Hot/cold (like baryons)

Preon 20–200 TeV Composite models Cold DM

Monopoles 1016 GeV GUTs Cold DM

Pyrgon, maximon, Perry Pole, 1019 GeV Higher dimension theories Cold DMnewtorites, Schwarzschild

Supersymmetric strings 1019 GeV SUSY/SUGR Cold DM

Quark nuggets, nuclearites 1015 g QCD, GUTs cold DM

Primordial (mini) black holes 1015−30 g General relativity Cold DM

Cosmic strings, domain walls 108−10 M GUTs Promote galaxy formation, thoughsmall contributor to

a QCD, quantum chromodynamics; PQ, Peccei–Quinn; GUTs, grand unified theories; SUSY, supersymmetry; SUGR, supergravity; QFD, quantumelectrodynamics.

b Of these various supersymmertric partners predicted by assorted versions of SUSY/SUGR, only one, the lightest, can be stable and contribute to, but the theories do not at present tells us which one it will be or the mass to be expected.

Laboratory tests of particle theories should also con-tribute to the DM problem. The plans for the Supercon-ducting Supercollider show that it should be able to detectthe Higgs boson, responsible for the mass of particles ingrand unified theories, and this should also feed the mod-eling of supersymmetric interactions. As the knowledgeof the behavior of the electroweak bosons increases, weshould also have a clearer picture of the role played bySUSY in the early universe and in the generation of the par-ticles which are responsible for (at least) some of the DM.

On a final note, it may be said, possibly, we have this allwrong. The study of matter that cannot be seen but onlyfelt on very large scales is obviously one driven by modelsand calculations. These have assumptions built into themthat may or may not be justified in the context of the par-ticular studies. Therefore, it may be the case that DM ismore an expression of our ignorance of the details of theuniverse at large distances and on the cosmological scalethan we currently believe. However, the need for invok-ing DM is very widespread in astrophysics: it is requiredin many explanations, models for it come in many vari-eties, its presence is indicated by different methods of massdetermination, and it is a phenomenon that involves only

classical dynamics. Most simple alternatives proposed sofar have been very specifically tailored to the individualproblems and are often too ad hoc to serve as fundamen-tal explanations of all of the phenomena that require thepresence of some form of DM. This is a field rich in spec-ulation, but it is also a field rich in quantitative results. Asmore data are accumulated on the dynamics of galaxiesand clusters of galaxies, we will clearly be able to distin-guish between the “everything we know is wrong” schooland that which detects the fingerprint of the early universeand its processes in the current world.

APPENDIX: THE DISCRETEVIRIAL THEOREM

Consider the motion of a particle about the center of massof a cluster. The gravitational potential is the result of theinteraction with all particles j , not the same as i , for thei th particle:

≡ −G∑i< j

m j

Ri j, (25)

Page 80: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 16:3

210 Dark Matter in the Universe

where the masses are allowed to be different. The equa-tions of motion for this particle is

mi vi = −mi∇. (26)

Taking the product of this equation with the position ofthe i th particle (the scalar product) gives∑

i

rimi r = −2T + 1

2I, (27)

where I is the moment of inertia of the cluster and T is thekinetic energy of the particles. For discrete particles, thisis easily calculated. Now assume that Ri j is the scalar dis-tance between the i th and j th particles, Then the equationof motion yields the discrete vitrial theorem:

I =∑

i

miv2i − G

∑i

∑i< j

mi m j

Ri j= 2T + W. (28)

Here W is the gravitational energy of the cluster, summedover all masses. This is the usual form of the virial theorem,with the additional term for the secular variation of themoment of inertia included. It is usually assumed thatthe system has been started in equilibrium, so that thegeometry of the configuration is constant. This impliesthat one can ignore the variation in I . Now, assume thatall particles have the same mass, m. This gives

−G∑i< j

m2i 〈Ri j 〉−1 +

∑i

mi⟨v2

i

⟩ = 0. (29)

We obtain only a two-dimensional spatial picture and aone-dimensional velocity picture of a three-dimensionaldistribution of galaxies in a cluster. The virial theoremapplies to the fullphase space of the constituent massesand is three dimensional, of course, in spite of the fact thatwe can only see the line-of-sight velocities. Therefore, inorder to obtain mass estimates from the virial constraint,we must make some assumption about the isotropy ofthe velocity distribution. The simplest and conventionalassumption is that the radial velocity is related to the meansquare velocity by 〈v2

i 〉 = 3〈V 2rad,i 〉 when averaged over

the various orientations of the cluster. The virial massestimator is then given by

MVT = 3π N

2G

( ∑i V 2

rad,i∑i< j ρ−1

i j

), (30)

where now N is the number of (identical) particles andρi j is the projected separation of the objects i and j in thesky. The coefficient results from assuming that these arerandomly oriented and that 〈R−1

i j 〉 = (2/π )ρ−1i j .

Several assumptions have been built into this deriva-tion, among which are the assumptions of isotropy of theorbits of stars about the center of the cluster and the sta-bility of the shape of the cluster. Notice that there will be

an additional term in the equation of motion if there isan oscillation of the cluster with time and that this willreduce the estimated virial mass (although it will in gen-eral be a small term since it is inversely proportional tothe square of the Hubble time). An additional potential iscontributed by the DM, but it may not enter into the virialargument in the same way as this discrete particle picture.In the preceding discussion, the constituent moving galax-ies were assumed to be the only cause of the gravitationalfield of the clusters. Now if there is a large, rigid massdistribution that forms the potential in which the galax-ies move, the mass estimates from the virial theorem areincorrect. There will always be an additional term, WDM,which adds to the gravitational energy of the visible galax-ies but does not contribute to the mass, and this can alterthe determination of the mass responsible for the bindingof the visible tracers of the dynamics. One should exercisecaution before accepting uncritically the statements of thevirial arguments by carefully examining whether they arebased on valid, perhaps case-by-case, assumptions.

SEE ALSO THE FOLLOWING ARTICLES

COSMIC INFLATION • COSMOLOGY • DENSE MATTER

PHYSICS • GALACTIC STRUCTURE AND EVOLUTION •NEUTRINO ASTRONOMY • STELLAR STRUCTURE AND

EVOLUTION

BIBLIOGRAPHY

Aaronson, M., Bothun, G., Mould, J., Huchra, J., Schommer, R. A.,and Cornell, M. E. (1986). “A distance scale from the infrared mag-nitude/HI velocity–width relation. V. Distance moduli to 10 galaxyclusters, and positive detection of bulk supercluster motion towardthe microwave anisotropy,” Astrophys. J. 302, 536.

Athanassoula, E., Bosma, A., and Papaioannou, S. (1987). “Halo param-eters of spiral galaxies,” Astron. Astrophys. 179, 23.

Bahcall, ., and Soniera, . (1981).Bahcall, N. A., Ostriker, J. P., Perlmutter, S., and Steinhardt, P. J. (1999).

“The cosmic triangle: Revealing the structure of the universe,” Science284, 1481.

Bergstrom, L. (1999) “Non-baryonic dark matter,” Nucl. Phys. B (Proc.Suppl.) 70, 31.

Blumenthal, G. R., Faber, S. M., Primack, J. R., and Rees, M. J. (1984).“Formation of galaxies and large-scale structure with cold dark mat-ter,” Nature 311, 517.

Boesgaard, A. M., and Steigman, G. (1985). “Big bang nucleosynthesis:Theories and observations,” Annu. Rev. Astron. Astrophys. 23, 319.

Dekel, A., and Rees, M. J. (1987). “Physical mechanisms for biasedgalaxy formation,” Nature 326, 455.

Dressler, A. (1984). “The evolution of galaxies in clusters,” Annu. Rev.Astron. Astrophys. 22, 185.

Faber, S. M., and Gallagher, J. S. (1979). “Masses and mass-to-lightratios of galaxies,” Annu. Rev. Astron. Astrophys. 17, 135.

Geller, M., and Huchra, J. (1990). Science 246, 897.

Page 81: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL Final Pages

Encyclopedia of Physical Science and Technology EN004F-162 June 8, 2001 23:33

Dark Matter in the Universe 211

Holt, S. S., Bennett, C. L., and Trimple, V. (eds.). (1991). “After theFirst Three Minutes,” American Institute of Physics Press, NewYork.

Kashlinsky, A., and Jones, B. J. T. (1991). “Large-scale structure of theuniverse,” Nature 349, 753.

Lynden-Bell, D., et al. (1988). Ap. J. 326, 19.Lin, and Shu. (1965).Ostriker, and Peebles. (1973).Partridge, B. (2000), “The universe as a laboratory for gravity,” Class.

Quantum Grav. 17, 2411.Peebles, J. (1980). “Large-Scale Structure of the Universe,” Princeton

University Press, Princeton, NJ.Peebles, P. J. E. (1993). “Principles of Physical Cosmology,” Princeton

University Press, Princeton, NJ.Rubin, V. (1983). “Dark matter in spiral galaxies,” Sci. Am. 248 (6), 96.Saunders, W., et al. (1991). “The density field of the local universe,”

Nature 349, 32.

Silk, J. (2000). “Cosmology and structure formation,” Nucl. Phys. B(Proc. Suppl.) 81, 2.

Sato, K., (ed.). (1999). “Cosmological Parameters and the Evolution ofthe Universe,” Kluwer, Dordrecht, The Netherlands.

Spiro, M., Aubourg, E., and Palanque-Delabroille, N. (1999), “Baryonicdark matter,” Nucl. Phys. B (Proc. Suppl.) 70, 14.

Trimble, V. (1988). “Dark matter in the universe: Where, what, why?”Contemp. Phys. 29, 373.

van Moorsel, G. A. (1987). “Dark matter associated with binary galax-ies,” Astron. Astrophys. 176, 13.

Vilenkin, A. (1985). “Cosmic strings and domain walls,” Phys. Rep. 121,264.

White, S. D. M., Frenk, C. S., Davis, M., and Efstathiou, G. (1987).“Clusters, filaments, and voids in a universe dominated by cold darkmatter,” Astrophys. J. 313, 505.

Zeld’ovich, Ya. B. (1984). “Structure of the universe,” Sov. Sci. Rev. EAstrophys. Space Phys. 3, 1.

Page 82: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 13:9

Galactic Structure and EvolutionJohn P. HuchraHarvard-Smithsonian Center for Astrophysics

I. Galaxy Morphology—The Hubble SequenceII. Galactic Structure

III. Integrated Properties of GalaxiesIV. Galaxy Formation and EvolutionV. Summary

GLOSSARY

Globular cluster A dense, symmetrical cluster of 105 to106 stars generally found in the halo of the galaxy.Globular clusters are thought to represent remnants ofthe formation of galaxies.

Halo The extended, generally low surface brightness,spherical outer region of a galaxy, usually populatedby globular clusters and population II stars. Sometimesalso containing very hot gas.

H II region A region of ionized hydrogen surroundinghot, usually young, stars. These regions are distin-guished spectroscopically by their very strong emissionlines.

Hubble constant The constant in the linear relation be-tween velocity and distance in simple cosmolical mod-els. D = V/H0, where H0 is given in km sec−1 Mpc−1.Distance and velocity are measured in Megaparsecs andin kilometers per second, respectively. Current valuesfor H0 range between 50 and 100 in those units.

Magnitude Logarithmic unit of relative brightnessor luminosity. The magnitude scale is defined as−2.5∗ log ( flux) + Constant so that brighter objectshave smaller magnitudes. Apparent magnitude is

defined relative to the brightness of the A0 star Vega(given magnitude = 0) and absolute magnitude isdefined as the magnitude of an object if it were placedat a distance of 10 parsecs.

Metallicity The average abundance of elements heavierthan H and He in astronomical objects. This is usuallymeasured relative to the metal abundance of the Sunand quoted as “Fe/H,” since iron is a major source ofline in optical spectra.

Missing mass The excess mass found from dynamicalstudies of galaxies and systems of galaxies over andabove the mass calculated for these systems from theirstellar content.

Parsec, kiloparsec, Megaparsec 1 parsec is the distanceat which 1 Astronomical Unit (the Earth orbit radius)subtends 1 arcsec. 1 pc = 3.086 × 1018 cm = 3.26 lightyears.

Population I, II Stars in the galaxy are classified intotwo general categories. Population II stars were formedat the time of the galaxy’s formation, are of lowmetallicity, and are usually found in the halo or inglobular clusters. Population I stars, like the sun, areyounger, more metal rich and form the disk of thegalaxy.

369

Page 83: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

370 Galactic Structure and Evolution

Solar mass, luminosity

L = 3.826 × 1033 ergs sec−1;

M = 1.989 × 1033 gm.

Surface brightness Luminosity per unit area on the sky,usually given in magnitudes per square arcsecond. InEuclidean space, surface brightness is distance inde-pendent since both the apparent luminosity and areadecrease as the square of the distance.

THE OBJECT of the study of galaxies as individual ob-jects is twofold—(1) understand the present morpholog-ical appearences of galaxies and their internal dynamics,and (2) to understand the integrated luminosity and en-ergy distributions of galaxies—in both the framework andthe timescale of a cosmological model and evolution. Themorphology of a galaxy is determined by its formationand dynamical evolution, mass, angular momentum con-tent, and the pattern of star formation that has occurredin it. The Brightness and spectrum of a galaxy are deter-mined by its present stellar population which is in turna function of the galaxy’s star formation history and theevolution of those stars which is also related to the dynam-ical history of the galaxy and its environment. Significantadvances have been made in detailed quantitative model-ing of galaxy morphology and internal dynamics. Galaxyform and luminosity evolution are much less well under-stood despite their great importance for the use of galaxiesas cosmological probes.

The study of galactic structure and evolution began onlyin the early 20th century with the advent of large reflectorsand new photographic and spectroscopic techniques thatallowed astronomers to determine crude distances to ex-ternal galaxies and place them in the model universes be-ing developed by Einstein, de Sitter, Friedmann, Lemaitreand others. The key discoveries were Harlow Shapley’s ofa relation between the period and luminosity of Cepheidvariable stars, and Edwin Hubble’s use of that relation todetermine distances to the nearest bright galaxies. Hubblethen was able to calibrate other distance indicators (suchas the brightest stars in galaxies, which are 10–100 timesbrighter than Cepheids) to estimate distances to furthergalaxies. This led to his discovery in 1929 of the redshift-distance relation which not only established the expansionof the universe as predicted by Einstein but also establishedthat the age or timescale of the universe was much greaterthan 100 million years. Current observational estimates forthe age of the universe range between 14 and 17 billionyears in the basic Hot Big Bang model, the Friedmann–Lemaitre cosmological model which has been favored forthe last few decades.

In the past few years, the parameters of the cosmologi-cal model have been narrowed down by astronomers andphysicists; the current view is that the Einstein–DeSitterglobally flat model is most likley correct, that the Universewill probably expand forever, and that matter makes upabout one third of the “content” of the Universe with onesixth of that (or about 4–5% of the total mass–energy den-sity of the Universe) in ordinary baryonic matter—the stuffwe are made of. The work of astronomers who concentrateon the study of galaxies is to understand the formation andevolution of galaxies and larger structures such as galaxyclusters from the time ∼15 billion years ago when, as seenin the Cosmic Microwave Background, the Universe wasuniform and homogeneous to one part in 100,000.

I. GALAXY MORPHOLOGY—THE HUBBLESEQUENCE

The initial steps in the study of galactic structure were thedevelopment of classification schemes that could be tiedto physical properties such as angular momentum content,gas content, mass, and age. Another of Hubble’s majorcontributions to extragalactic astronomy was the introduc-tion of such a scheme based only on the galaxy’s visual (or,more correctly, blue light) appearance. This morphologi-cal classification scheme, known as the Hubble Sequence,has been modified and expanded by Allan Sandage, Gerardde Vaucouleurs, and Sidney van den Bergh. The HubbleSequence now forms the basis for the study of galacticstructure. There are other classification schemes for galax-ies based on such properties as the appearance of theirspectra or the existence of star like nuclei or diffuse halos;these are of more specialized use.

Hubble’s basic scheme can be described as a “tuningfork” with elliptical galaxies along the handle, the twofamilies of spirals, normal and barred, along the tines, andirregular galaxies at the end (see Fig. 1). Elliptical galaxies

FIGURE 1 Hubble’s “tuning fork” diagram (adapted from Hubble,E. 1936.) The Realm of the Nebulae.

Page 84: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 371

FIGURE 2 The EO galaxy NGC 3379 and its comparison SBOgalaxy, NGC 3384. This image was obtained with a Charge Cou-pled Device (CCD) camera at the F. L. Whipple Observatory byS. Kent. Slight defects in the image are due to cosmetic defectsin the CCD.

are very regular and smooth in appearance and containlittle or no dust or young stars. They are subclassified byellipticity which is a measure of their apparent flattening.Ellipticity is computed as e = 10(a −b)/a, where a and bare the major and minor axis diameters of the galaxy. Atypical elliptical galaxy is shown in Fig. 2. A galaxy witha circular appearance has e = 0 and is thus classified E0.These are at the tip of the tuning fork’s handle. The flattestelliptical galaxies that have been found have e ≈ 7, andare designated E7. Both normal and barred spiral galaxiesrange in form from “early” type, designated Sa, throughtype Sb, to “late” type Sc. Spiral classification is based onthree criteria: (1) the ratio of luminosity in the central bulgeto that in the disk, (2) the winding of the spiral arms, and(3) the contrast or degree of resolution of the arms intostars and H II regions. Early type galaxies have tightlywound arms of low contrast and large bulge-to-disk ratios.A typical spiral galaxy is shown in Fig. 3. The importanttransition region between elliptical and spiral galaxies isoccupied by lenticular galaxies that are designated S0. Itwas originally hypothesized that spiral galaxies evolvedinto S0 galaxies and then ellipticals by winding up theirarms and using up their available supply of gas in starformation. This is now known to be false.

Irregular galaxies are split into two types. Type I orMagellanic irregulars are characterized by an almost com-plete lack of symmetry, no nucleus, and are usually re-solved into stars and H II regions. The Magellanic Clouds,the nearest galaxies to our own, are examples of this type.Type II irregular galaxies are objects that are not easilyclassified—galaxies that have undergone violent dynam-

ical interactions, star formation events, “explosions,” orhave features uncharacteristic of their underlying classsuch as a strong dust absorption lane in an elliptical galaxy.

The modifications and extensions to Hubble’s schemeadded by Sandage and de Vaucouleurs include the ad-dition of later classes for spirals, Sd and Sm, between

FIGURE 3 (a) The Sb spiral galaxy Messier 81 (=NGC 3031),and (b) the Sc spiral M101 (=NGC 5457 a.k.a. the Pinwheel). TheCCD images courtesy of S. Kent.

Page 85: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

372 Galactic Structure and Evolution

TABLE I De Vaucouleurs’ T Classification

T Type Description

−6 Compact elliptical

−5 Elliptical, dwarf elliptical E, dE

−4 Elliptical E

−3 Lenticular L−, SO−−2 Lenticular L, SO

−1 Lenticular L+, SO+0 SO/a, SO-a

1 Sa

2 Sab

3 Sb

4 Sbc

5 Sc

6 Scd

7 Sd

8 Sdm

9 Sm, Magellanic Spiral

10 Im, Irr I, Magellanic Irregular, Dwarf Irregular

11 Compact Irregular, Extragalactic HII Region

classes Sc and Irr I, the further subclassification of spiralsinto intermediate types such as S0/a, Sab, and Scd, andthe inclusion of information about inner and outer ringstructures in spirals. S0 galaxies have also been subdi-vided into classes based either on the evidence for dustabsorption in their disks, or, for SB0’s, the intensity oftheir bar components. Van den Bergh discovered that thecontrast and developement of spiral arms were correlatedwith the galaxy’s luminosity. Spirals and irregulars can bebroken into nine luminosity classes (I, I–II, II, . . . V), butthere is considerable scatter (≈0.6–1.0 magnitude) andthus considerable overlap in the luminosities associatedwith each class. De Vaucouleurs also devised a numericalscaling for morphological type, called the T type and il-lustrated in Table I, for his Second Reference Catalog ofBright Galaxies.

The most recent “modification” to the Hubble sequenceis van den Bergh’s recognition of an additional sequencebetween lenticular galaxies and normal spirals. This se-quence of spirals, dubbed Anemic, has objects designatedAa, ABa, Ab, etc. These Anemic spirals have diffuse spi-ral features, are usually of low surface brightness, and aregas poor relative to normal spirals of the same form.

II. GALACTIC STRUCTURE

The quantitative understanding of galactic structure isbased on only two observables. These are the surfacebrightness distribution (including the structure of the armsin spiral galaxies) and the line-of-sight (radial) velocity

field. The surface brightness at any point in a galaxy’simage is the integral along the line-of-sight through thegalaxy of the light produced by stars and hot gas. Mea-surements of the velocity field are made spectroscopically(either optically or in the radio at the 21-cm emissionline of neutral H), and represent the integral along theline-of-sight of the velocities of individual objects (stars,gas clouds) times their luminosity. Dust along the line-of-sight in a galaxy obscures the light from objects be-hind it; in dusty, edge-on spiral galaxies only the near sideof the disk can be observed in visible light. Extinctionby dust is a scattering process and thus is a function ofwavelength; all galaxies are optically thin at radio wave-lengths. Elliptical galxies are considered optically thin atall wavelengths.

A. Elliptical Galaxies

Most galaxies can be readily decomposed into two mainstructural components, disk and bulge. Elliptical galaxiesare all bulge. The radial brightness profile of the bulgecomponent is usually parameterized by one of three laws.The earliest and simplest is an empirical relation calledthe Hubble law,

µ(r ) = µo(1 + r/ro)−2,

and is parametrized in terms of a central surface bright-ness, µo, and a scale length, ro, at which the brightnessfalls to half its central value. At large radius, the profilesare falling as 1/r2. At radii less than the scale length, theprofile flattens to µo. Giant elliptical galaxies typicallyhave µo ≈ 16 mag/arcsec2. A better relation is the empir-ical r1/4 law proposed by de Vaucouleurs

µ(r ) = µe exp[−7.67

((r/re)1/4 − 1

)],

where re is the effective radius and corresponds to theradius that encloses 1/2 of the total integrated luminosityof the galaxy, andµe is the surface brightness at that radius,approximately 1/2000 of the central surface brightness.The third relation is semiempirical and was derived fromdynamical models calculated by King to fit the brightnessprofiles of globular star clusters. These models can beparametrized as

µ(r ) = µK[(

1 + r2/

r2c

)−1/2 − (1 + r2

t

/r2

c

)−1/2]2,

where rc again represents the core radius where the surfacebrightness falls to ≈0.5, rt is the truncation or tidal radiusbeyond which the surface brightness rapidly decreases,and µK is approximately the central surface brightness.Isolated elliptical galaxies are best fit by models withrt/rc ≈ 100–200. Small elliptical galaxies residing in thegravitational potential wells of more luminous galaxies(like the dwarf neighbors of our own galaxy) are tidally

Page 86: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 373

stripped and have rt /rc ≈ 10. Figure 4 shows examples ofthe Hubble and de Vaucouleurs laws and the King models.

Dwarf elliptical galaxies, designated dE, are low lumi-nosity, very low surface brightness objects. Because oftheir low surface brightness, they are difficult to identifyagainst the brightness of the night sky (airglow). The near-est dwarf elliptical galaxies, Ursa Minor, Draco, Sculp-tor, and Fornax, are satellites of our own galaxy, and areresolved into individual stars with large telescopes. Al-though dE galaxies do not contribute significantly to thetotal luminosity of our own Local Group of galaxies, theydominate its numbers (Table II). As mentioned earlier,dwarf galaxies are often tidally truncated by the gravita-tional field of neighboring massive galaxies. If M is themass of the large galaxy, m is the mass of the dwarf, andR is their separation, then the tidal radius, rt , is given by

rt = R

(m

3M

)1/3

.

Frequently the central brightest galaxy in a galaxy clus-ter exhibits a visibly extended halo to large radii. Theseobjects were first noted by W. W. Morgan and are desig-nated “cD” galaxies in his classification scheme. Unlikeordinary giant E galaxies, whose brightness profiles ex-hibit the truncation characteristic of de Vaucouleurs orKing profiles at radii of 50 to 100 kpc, cD galaxies haveprofiles which fall as 1/r2 or shallower to radii in excessof 100 kpc. “D” galaxies are slightly less luminous and

FIGURE 4 The three commonly fit surface brightness profiles forelliptical galaxies and the bulge component of spiral galaxies. TheHubble law (solid line) has core radius r0 = 1.0 unit, the de Vau-couleurs law (dash-dot line) has an effective radius re = 5.0 units,and the King model (dashed line) has a core radius rc = 1.0, anda tidal radius r t = 80.0 units. All three profiles have been scaledvertically in surface brightness to approximately agree at r = 5.0units (log r = 0.7).

TABLE II Presently Known Local Group Members

Distance LuminosityName Type BTa (kpc) (109 L)

Andromeda Sb 4.38 730 14.7

Milky Way Sbc — — (10.0)b

M33 Scd 6.26 900 3.9

LMC SBm 0.63 50 2.2

SMC Im 2.79 60 0.4

NGC 205 E5 8.83 730 0.24

M32 E2 9.01 730 0.20

IC 1613 Im 10.00 740 0.09

NGC 6822 Im 9.35 520 0.08

NGC 185 dE3 10.13 730 0.07

NGC 147 dE5 10.37 730 0.06

Fornax dE 9.1 130 0.006

And I dE 13.5 730 0.003

And II dE 13.5 730 0.003

And III dE 13.5 730 0.003

Leo I dE 11.27 230 0.0026

Sculptor dE 10.5 85 0.007

Leo II dE 12.85 230 0.0006

Draco dE 12.0 80 0.00016

LGS 3 ? (17.5)b 730 0.00008

Ursa Minor dE 13.2 75 0.00005

Carina dE — 170 —

a BT is the total, integrated blue apparent magnitude.b Quantities in parentheses are estimated.

have weaker halos; D galaxies are found at maxima inthe density distribution of galaxies. The extended halos ofthese objects are considered to be the result of dynamicalprocesses that occur during either the formation or duringthe subsequent evolution of galaxies in dense regions.

The internal dynamics of spheroidal systems (E galax-ies and the bulges of spirals) is understood in terms ofa self-gravitating, essentially collisionless gas of stars.These systems appear dynamically relaxed; they are basi-cally in thermodynamic equilibrium as isothermal sphereswith Maxwellian velocity distributions. Faber and Jacksonnoted in 1976 that the luminosities of E galaxies were wellcorrelated with their internal velocity dispersions.

L ≈ σ 4.

Here, the velocity dispersion, σ , is a measure of the ran-dom velocities of stars along the line-of-sight.

The collisional or two-body relaxation time, tr , for starsis approximately

tr ∼ 2 × 108 V 3

M2ρyears,

Page 87: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

374 Galactic Structure and Evolution

where V is the mean velocity in km sec−1, ρ is the densityof stars per cubic parseconds and M is the mean stellarmass in M. This relaxation time in galaxies is 1014 to 1018

years—much longer than the age of the universe. Two-body relaxation is generally not important in the internaldynamics of galaxies, but is significant in globular clusters.

The relaxed appearance of galactic spheroids is proba-bly due to the process called violent relaxation. This is astatistical mechanical process described by Lynden-Bell,where individual stars primarily feel the mean gravita-tional potential of the system. If this potential fluctuatesrapidly with time, as in the initial collapse of a galaxy, thenthe energy of individual stars is not conserved. The resultsof numerical experiments are similar to galaxy spheroids.

For the past decade and a half, the determination andmodeling of the true shapes of elliptical galaxies has beena major problem in galaxian dynamics. Most ellipticalgalaxies are somehat flattened. Early workers assumedthat this flattening was rotationally supported as in diskgalaxies (see following). Bertola and others observed,however, that the rotational velocites of E galaxies areinsufficient to support their shapes (Fig. 5). The velocitydispersion is a measure of the random kinetic energy inthe system.

To resolve this problem, Binney, Schwarzchild, andothers suggested that E galaxies might be prolate (cigarshaped) or even triaxial systems instead of oblate (disk-like) spheroids. Current work favors the view that mostflattened elliptical galaxies are triaxial and have inter-nal stellar velocity distributions which are anisotropic.

FIGURE 5 The ratio of rotation velocity to central velocity dis-person versus the ellipticity, e, for elliptical galaxies. Solid linesare oblate and prolate spheroid models with isotropic velocity dis-tributions from Binney. Data points are from Bertola, Cappacioli,Illingworth, Kormendy and others.

A small fraction do rotate fast enough to support theirshapes.

B. S0 Galaxies

Spiral and S0 galaxies can be decomposed into two surfacebrightness components, the bulge and disk. Disks havebrightness profiles that fall exponentially

µ(r ) = µ0 exp(−r/rs).

Bulges have been described earlier. Disks are rotating;their rotational velocity at any radius is presumed to bal-ance the gravitational attraction of the material inside. Atypical rotation curve (velocity of rotation as a function ofradius) is shown in Fig. 6.

Disks are not infinitely thin. The thickness of the diskdepends on the balance between the surface mass densityin the disk (gravitational potential) and the kinetic energyin motion perpendicular to the disk. This can be a functionof both the initial formation of the system and its laterinteraction with other galaxies, etc. Tidal interaction withother galaxies will “puff up” a galactic disk of stars. Disksof S0 galaxies are composed of old stars and do not exhibitany indications of recent star formation and associated gasor dust—that is part of the definition of an S0.

Bulges of S0 galaxies and spirals do rotate and appear tobe simply rotationally flattened oblate spheroids. We willreturn to this point later when we discuss galaxy formation.

C. Spiral Galaxies

Unlike the smooth and uncomplicated disks of S0 galax-ies, the disks of spiral galaxies generally exhibit significant

FIGURE 6 A typical rotation curve for a moderately bright spiralgalaxy like our own (adapted from the work of V. Rubin et al.and M. Roberts). For reference, the rotational velocity of our owngalaxy at the approximate radial distance of the sun, 220 km sec−1

at 8 pc, is marked with an .

Page 88: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 375

amounts of interstellar gas and dust and distinct spiral pat-terns. The disk rotates differentially (see Fig. 6) and thespiral pattern traces the distribution of recent star forma-tion in the galaxy. Regions of recent star formation havea higher surface brightness than the background disk. Al-though the spiral is a result of the rotation of the materialin the disk, the pattern need not (and generally does not)rotate with the same speed as the material.

The rotation of a spiral galaxy can be described in termsof a velocity rotation curve, v(r ), or the angular rotationrate, (r ), where v = r. In 1927, Oort first measuredthe local differential rotation of our galaxy by studyingthe motions of nearby stars. He described that rotation interms of two constants, now known as Oort’s Constants,which are measures of the local shear (A) and vorticity (B):

A = −r

2

d

dr, B = − 1

2r

d

dr(r2).

The values adopted by the IAU (International Astro-nomical Union) in 1964 are A = 15 km sec−1 kpc−1 andB = −10 km sec−1 kpc−1, with the Sun at a distance ofro = 10 kpc from the center of the galaxy, and the rotationrate at the sun V = 250 km sec−1. Current estimatessupport a smaller Sun-Galactic-center distance (∼8 kpc)and a smaller rotation velocity (∼230 km sec−1) as wellas slightly different values for A and B.

Studies of rotation in spiral galaxies are undertaken bylong-slit spectroscopy at differenct position angles in theoptical or by either integrated (total intensity, big beam)measurements or interferometric mapping in the 21-cmline of neutral hydrogen. The neutral hydrogen (HI) inspiral galaxies is primarily found outside their central re-gions, reaching a maximum in surface density several kilo-parseconds from the center. The gas at the center is mostlyin the form of molecular hydrogen, H2, as deduced fromcarbon monoxide (CO) maps. From such detailed studiesby Rubin, Roberts, and others we know that the rotationcurves for spirals generally rise very steeply within a fewkiloparseconds of their centers then flatten and stay at analmost constant velocity as far as they can be measured.This result is rather startling because the luminosity ingalaxies is falling rapidly at large radii. If the light andmass were distributed similarly, then the rotational ve-locity should fall off as 1/R at large radii as predictedby Kepler’s laws. Only a small number of galaxies showthe expected Keplerian falloff, leading to the conclusionthat the mass in spiral galaxies is not distributed as thelight.

As in elliptical galaxies, Fisher and Tully noted in 1976that the luminosities of spiral galaxies are correlated withtheir internal motions, in particular rotational velocity.The best form of this relation is seen in the near infrared

FIGURE 7 The relation between the maximum rotation veloc-ity of spiral galaxies and their absolute infrared (H band = 1.6 µ)magnitude [Adapted from Aaronson, M., Mould, J., and Huchra,J. P. (1980) Astrophys. J. 237, 655.]

where the effects of internal extinction by dust on theluminosities are minimized (Fig. 7). There the relation isapproximately

L = (V )4,

where V is the full width of the H I profile measuredat either 20 or 50% of the peak. As the H I distributionpeaks outside the region where the rotational velocity hasflattened, the H I profile is sharp sided and is double peakedfor galaxies inclined to the line of sight.

The regularity of the spiral structure seen in these galax-ies is exceptional. If the spiral pattern was merely tied tothe matter distribution, differential rotation would “erase”it in a few rotation periods. A typical rotation time is afew hundred million years, or ≈1/100th the age of theuniverse. To explain the persistence of spiral structure, Linand Shu introduced the Density Wave theory in 1964. Inthis model, the spiral pattern is the star formation producedin a shock wave induced by a density wave propagatingin the galaxy disk. The spiral pattern is in solid body rota-tion with a pattern speed p. The main features of such adensity wave are the corotation radius, where the patternand rotation angular speeds are the same, and the innerand outer Lindblad resonances, where

p = ± κ/m;

m is the mode of oscillation and κ is the epicyclic fre-quency in the disk,

Page 89: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

376 Galactic Structure and Evolution

FIGURE 8 Typical two-armed spiral density wave pattern. Solidspiral represents the shock front in the gas. This follows closelybehind the gravitational potential perturbation. The outer dashedcircle is the corotation radius, where the matter in the disk is rotat-ing at the same speed as the spiral pattern. The inner, thin circlerepresents the inner Lindblad resonance.

κ2 = r−3 d(r42)

dr,

or

κ = 2

(1 − A/B)1/2.

Figure 8 depicts the features of a density wave. The spi-ral pattern only exists between the Lindblad resonances.Galaxies in which a single mode dominates are called“Grand Design” spirals.

Two alternative models have been proposed to accountfor spiral patterns, the Stochastic Star Formation (SSF)model of Seiden and Gerola, and the tidal encountermodel. In the first, regions of star formation induce star for-mation in neighboring regions. With proper adjustment ofthe rotation and propagation timescales, reasonable spiralpatterns result. In the second, spiral structure is the result ofgalaxy interactions. A possible example of this process isshown in Fig. 9. It is likely that all three processes operatein nature, with density waves producing the most regu-lar spirals, SSF producing “flocculent” spirals, and tidalinteractions producing systems like the Whirlpool. A sig-nificant fraction (∼10%) of all galaxies show some formof interaction with their neighbors. Not all such systemscontain spiral galaxies, however.

D. Irregular Galaxies

Galaxies are classified as Irregular for several differentreasons. In the morphological progression of the Hubblesequence the true Irregulars are the Magellanic Irregulars,the Irr I’s, which are galaxies with no developed spiralstructure that are usually dominated by large numbers

of star forming regions. Irregular II’s and other pecu-liar objects have been extensively cataloged by Arp andhis coworkers, by Vorontsov–Velyaminov and by Zwicky.These objects are usually given labels to describe theirpeculiar properties such as “compact,” which usually in-dicates abnormally high surface brightness or a very steepbrightness profile, “posteruptive,” which usually indicatesthe existence of jets or filaments of material near thegalaxy, “interacting,” or “patchy.”

There are also galaxies in the form of rings which arethought to be produced by a slow, head-on collision oftwo galaxies, one of which must be a gas-rich spiral.The collision removes the nucleus of the spiral leaving anearly round ripple of star formation similar to the ripplesproduced when a rock is dropped into a lake. Such ringgalaxies almost always have compact companion galaxieswhich are the likely culprits.

The Magellanic irregulars are almost always dwarfgalaxies (low luminosity), are very rich in neutralhydrogen, and have relatively young stellar populations.Their internal kinematics may show evidence for regularstructure or may be chaotic in nature. These galaxiesare generally of low mass; the largest such systemshave internal velocity dispersions (usually measuredby the width of their 21-cm hydrogen line) less than100 km sec−1. Their detailed internal dynamics have onlybeen poorly studied until now. There are some indicationsthat star formation proceeds in these galaxies as in theSSF theory mentioned earlier; however evidence alsoexists for H II regions aligned with possible shock fronts.Although they do not contribute significantly to the totalluminosity density of the universe, these galaxies and thedwarf ellipticals dominate the total number of galaxies.

III. INTEGRATED PROPERTIESOF GALAXIES

A. Luminosity Function

The luminosity function or space density of galaxies, φ(L)is the number of galaxies in a given luminosity range perunit volume. This function is usually calculated from mag-nituded limited samples of galaxies with distance infor-mation. Distances to all but the nearest galaxies are deter-mined from their radial velocities and the Hubble constant.(Note that in the Local Supercluster where the velocityfield is disturbed by the gravity of large mass concen-trations it is necessary to apply additional corrections todistances measured this way). Figure 10 is the differentialluminosity function for field galaxies derived from a recentlarge survey of galaxy redshifts. The luminosity functionis nearly flat at faint magnitudes and falls exponentially

Page 90: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 377

FIGURE 9 The interacting galaxy pair NGC 5194 + 5195, also known as M51 or the Whirlpool Nebula. This galaxypair is also known as Arp 85, VV 1, and Ho 526 from the catalogs of peculiar, interacting and binary galaxies of Arp,Vorontsov–Velyaminov and Holmberg. The bright spiral, NGC 5194, is classified Sbc; its companion, NGC 5195, isclassified SBO P. (a) is a low contrast display of the image to show the interior structure of the galaxies; (b) is a highcontrast display of the same image to show the effects of interaction on the outer parts of the galaxies. (CCD photocourtesy of S. Kent.)

at the bright end. A useful parametrization of the φ(L),derived by Schechter, is

φ(L) = φ0L−1(L/L∗)α exp(−L/L∗).

L∗ is the characteristic luminosity near the knee, φ0 is thenormalization and α is the slope at the faint end. For aHubble constant of 100 km sec−1 Mpc−1 and with blue(B) magnitudes from the Zwicky Catalog of Galaxies and

FIGURE 10 The differential galaxy luminosity function, φ(L), inunits of galaxies per magnitude interval per cubic megaparsec,derived from the Center for Astrophysics Redshift Survey.

of Clusters of Galaxies

L∗B ≈ 8.6 × 109L, M∗

B ≈ −19.37,

φ0 ≈ 0.015 gal Mpc−3,

and

α ≈ −1.25.

M∗B is the characteristic blue absolute magnitude. The

Schechter function has the interesting property that theintegral luminosity density, useful in cosmology, is justgiven by

Lint = φ0L∗(α + 2)

where is the incomplete gamma function.Figure 11 shows the luminosity function of galaxies as

a function of morphological type.

B. Spectral Energy Distributions

The observed integrated spectra of normal galaxies arefunctions of stellar population, star formation rate, meanmetal content, gas content, and dust content. These prop-erties correlate with, and in some cases are causuallyrelated to, galaxy morphology. In our own galaxy thereare two relatively distinct populations of stars as discov-ered by Baade in 1944. Population II stars are the oldmetal poor stars that form the halo of the galaxy. Globu-lar clusters are population II objects. Population I stars

Page 91: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

378 Galactic Structure and Evolution

FIGURE 11

are younger and more metal rich. They form the diskof the galaxy. The sun is an intermediate age populationI star.

Figure 12a is an example of the optical spectral en-ergy distribution of an old, gasless stellar population. Thistype of population is typical of the bulge and old diskpopulations in galaxies. It is dominated by the light ofG and K giant stars. The strongest spectral features areabsorption lines of (originating in the atmospheres of thegiant stars) calcium, iron, magnesium, and sodium, usu-ally in low ionization states, as well as molecular bandsof cyanogen, magnesium hydride and, in the red, titaniumoxide. The very strong, factor of 2, break in the appar-ent continuum shortward of 4000 A is the primary distin-guishing characteristic of high redshift normal galaxies.The strengths of the absorption features in old bulge pop-ulations is primarily a function of metallicity. In popula-tions where star formation has taken place recently (ontimescales of ≤109 years), most absorption features in theintegrated spectra will appear weaker due to the contribu-tion to the continuum emission from hotter, weaker linedA and F stars. The Balmer lines of hydrogen, which arestrength in A and F stars, increase in strength. Table III lists

some of the common and strong absorption and emissionlines seen in normal galaxy spectra.

Figure 12b is the spectrum of a galaxy who’s light isdominated by very young stars and hot gas. The spectrumresembles that of an H II region. This is an irregular galaxythat is undergoing an intense “burst” of star formation. Itsoptical spectrum is dominated by strong, sharp emissionlines of H, He, and the light elements N, O, and S from thephotoionized gas. The continuum is primarily from hot Oand B stars with a small contribution from the free–free,two-photon, and Paschen and Balmer continuum emissionfrom the gas.

Population synthesis is the attempt to reproduce theobserved spectral energy distributions of galaxies by thesummation of spectra of well-observed galactic stars,model stars, and models for the emission from the gas pho-toionized by hot stars in the synthesis. An important pa-rameter in such studies is the Initial Mass Function (IMF),which is the differential distribution of stars as a functionof mass in star forming regions. The simplest approxima-tion for the IMF is a powerlaw in the mass, M ,

N (M) d M = (M/M)−α.

Page 92: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 379

FIGURE 12 Galaxy Spectral energy distributions. (a) NGC4486B, typical of an old, evolved stellar population. The promi-nent absorption features of Ca, Na, Mg, the CN molecule and theG band blend of Fe and Cr are marked. (b) NGC 4670, typical ofa strong emission line galaxy or extragalactic H II region. The Oand H emission lines are marked.

The value of α found by Salpeter for the solar neighbor-hood is ≈2.35. The real IMF is slightly steeper at the highmass end (M > 10 M), and slightly flatter at the low massend (M > 1 M). Results from population synthesis indi-cate that the mix of stars in most luminous galaxies is verysimilar to our own in age and metallicity.

C. Galaxy Masses

The masses of galaxies are measured by a variety of tech-niques based on measurements of system sizes and relative

TABLE III Common Spectral Features in Galaxies

λ0 Element Commentsa

Absorption lines3810+ CN (molecular band)

3933.68 Ca II K

3968.49, 3970.08 Ca II + He H

4101.75 Hδ

4165+ CN (molecular band)

4226.73 Ca I

4305.5 Fe + Cr G band

4340.48 Hδ

4383.55 Fe I4861.34 Hβ F

5167.3, 5172.7, 5183.6 Mg I b

5208.0 MgH (molecular band)

5270.28 Fe I

5889.98, 5895.94 Na I D

8542.0 Ca II

Emission linesb

3726.0 [O II]

3729.0 [O II]

3868.74 [Ne III]

3889.05 Hζ

3970.07 Hε

4101.74 Hδ

4340.47 Hγ

4363.21 [O III]

4861.34 Hβ F

4958.91 [O III]

5006.84 [O III]

5875.65 He I

6548.10 [N II]

6562.82 Hα C

6583.60 [N II]

6717.10 [S II]

6731.30 [S II]

a Letters in comments refer to Fraunhoffer’s original designation forlines in the solar spectrum.

b Brackets around species indicate that the transition is forbidden.

motion, and the assumption that the system under studyis gravitationally bound. Masses are sometimes inferredfrom studies of stellar populations, but these are extremelyuncertain as the lower mass limit of the IMF is essen-tially unknown. It is customary to quote the mass-to-lightratios for galaxies rather than masses. This is both be-cause galaxy masses span several orders of magnitude soit is easier to compare M/L since mass and luminosityare reasonably well correlated for individual morpholog-ical classes, and because the average value of M/L is keyin the determination of the mean matter density in the

Page 93: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

380 Galactic Structure and Evolution

universe. In specifying masses or mass-to-light ratios forgalaxies, it is necessary to specify the Hubble constant,as scale lengths vary as the distance while luminositiesvary inversely as the square of the distance. It is alsouseful to specify the magnitude system used for deter-mining luminosity both because different systems mea-sure light to different radii and because galaxies have awide range of colors and are rarely the same color asthe Sun.

Masses of individual spiral galaxies are determinedfrom their rotation curves. Spiral galaxies are circularlysymmetric, so the apparent radial velocities relative totheir centers can be corrected for inclination. Then thevelocity of the outermost point of the rotation curve is ameasure of the enclosed mass. If the mass were distributedspherically, the problem would reduce to the classical cir-cular orbit of radius, R, around point mass, M ,

1

2mV 2 = GmM

R

where m is the test particle mass, and V is its orbital ve-locity, implying

M = 1

2

V 2 R

G.

Because the actual distribution is flattened, a small cor-rection must be applied. If the mass distributions of spiralgalaxies fall with radius as their optical light, then rotationcurves would turnover to a Keplerian falloff, V ∝ R−1/2.Observations of neutral hydrogen rotation curves made at21 cm now extend beyond the optical diameters of manygalaxies and are still flat. This implies that the masses ofspirals are increasing linearly with radius, which in turn,implies that the local mass-to-light ratio in the outer partsof spirals is increasing exponentially.

Masses for individual E galaxies are determined fromtheir velocity dispersions and measures of their core radiior effective radii. For a system in equilibrium, the VirialTheorem states that, averaged over time,

+ 2T = 0,

where T is the system kinetic energy and is the sys-tem gravitational potential energy. For a simple sphericalsystem and assuming that the line-of-sight velocity dis-persion, σ , is a measure of the mass weighted velocitiesof stars relative to the center of mass

Mσ 2 = G∫ R

0

M(r ) d M

r.

If the galaxy is well approximated by a de VaucouleursLaw, then = −0.33G M2re. More accurate mass-to-lightratios can be determined for the cores of elliptical galaxiesalone by comparison with models.

Masses for binary galaxies are also derived from sim-ple orbital calculations, but, unlike rotation curve masses,are subject to two significant uncertainties. The first isthe lack of knowledge of the orbital eccentricity (orbitalangular momentum). The second is the lack of informa-tion about the projection angle on the sky. The combi-nation of these two problems make the determination ofthe mass for an individual binary impossible. The formulaM = (V 2 R)/(2G) produces a minimum mass, the “true”mass is

MT = V 2 R

2G

1

cos3 i cos2 φ,

where i is the angle between the galaxies and the plane ofthe sky and φ is the angle between the true orbital velocityand the plane defined by the two galaxies and the observer.The correct determination of the masses of galaxies in bi-nary systems requires a large statistical sample to averageover projections as well as a model for the selection ef-fects that define the sample (e.g., very wide binaries aremissed; the effect of missing wide pairs depends on thedistribution of orbital eccentricities).

Mass-to-light ratios for galaxies in groups or clustersare also determined from the dimensions of the system andits velocity dispersion via the Virial Theorem or a variantcalled the Projected Mass method. The Virial Theoremmass for a cluster of N galaxies with measured velocitiesis

MV T = 3π N

2G

∑Ni V 2

i∑i< j 1/ri j

,

where ri j is the separation between the i th and j th galaxy,and Vi is the velocity difference between the i th galaxyand the mean cluster velocity. The projected mass is

MP = f p

G N

(N∑i

V 2i ri

),

where ri is the separation of the i th galaxy from the cen-troid. The quantity f p depends on the distribution of orbitaleccentrities for the galaxies and is equal to 64/π for radialorbits and 32/π for isotropic orbits.

Table IV summarizes the current state of mass-to-lightdeterminations for individual galaxies and galaxies in sys-tems via the techniques discussed earlier. These are scaledto a Hubble Constant of 100 km sec−1, and the Zwicky

Page 94: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 381

TABLE IV Mass-to-Light Ratios for Galaxiesa

Type Method M/L

Spiral Rotation Curves 12

Elliptical Dispersions 20

All Binaries 100

All Galaxy Groups 350

All Galaxy Clusters 400

a In solar units, M/L.

catalog magnitude system used earlier for the LuminosityFunction.

In almost all galaxies, the expected mass-to-light ra-tio from population synthesis is very small, on the or-der of 1 ore lower in solar units, because the light ofthe galaxy is dominated by either old giant stars, withluminosities several hundred Suns and masses less thanthe Sun, or by young, hot main sequence stars with evenlower mass-to-light ratios. The large mass-to-light ratiosthat result from the dynamical studies have given rise towhat is called the “missing mass” problem. Astronomersas yet have not been able to determine what constitutes themass that binds clusters and groups of galaxies and formsthe halos of large galaxies. Possibilities include extremered dwarf (low luminosity) stars, massive stellar remnants(black holes), exotic elementary particles (axions, neutri-nos, etc.), and the possibilty that the dynamical state ofclusters is much more complex than the existing simplemodels.

D. The Fundamental Plane

Much of the work on global properties of galaxies overthe last 20 years has centered on finding global relation-ships like the Faber–Jackson relation and Tully–Fisherrelation between velocity dispersion or rotation velocityand galaxy luminosity. The drivers for this are twofold:first to search for redshift independent distance estima-tors, and second to discover if such relations hold cluesfor the study of galaxy formation. The most useful ofsuch relations discovered so far is the fundamental planefor early type galaxies, particularly elliptical galaxies. El-lipticals form a multiparameter family. The parametersthat describe the global properties of elliptical galaxiesas discovered by principal component analysis are thegalaxy’s size, surface brightness (related to stellar den-sity), and velocity dispersion (related to mass). A typ-ical example of such a relation is shown in Fig. 13,from Jorgensen et al. (1996). The equation describing thisrelation is

log re = 1.35 log σ − 0.82 log 〈I 〉e,

where re is the effective (or half-light) radius, σ is the line-of-sight velocity dispersion and 〈I 〉e is the mean surfacebrightness (luminosity inside the effective radius dividedby the area). There are other variants of this relation such asthe Dn − σ relation which has been used by several groupsto study the motions of galaxies relative to the uniformexpansion of the Universe and thus derive estimates of themean mass density of the Universe relative to the criticaldensity.

E. Gas Content

Neutral hydrogen (H I) was first detected in galax-ies in 1953 by Kerr and coworkers in the 21-cm(1420.40575 MHz) radio emission line. Since then it hasbeen found that almost every spiral and irregular galaxycontains considerable neutral gas. In spiral galaxies,the H I emission distribution usually shows a centralminimum and peaks in a ring which covers the prominentspiral arms. The fractional gas mass ranges from a fewpercent for early type spirals (Sa galaxies) to more than50% for some Magellanic irregulars. Neutral hydrogen isonly rarely detected in elliptical an S0 galaxies. The gasmass fraction in these objects is usually less than 0.1%.In spiral galaxies, H I can usually be detected at 21 cmout to two or three times the radius of the galaxy’s opticalimage on the Palomar Schmidt Sky Survey plates.

Ionized gas is found in galaxies in H II regions (H IIrefers to singly ionized hydrogen), in nuclear emissionregions, and in the diffuse interstellar medium. H II regionsare regions of photoionized gas around hot, usually youngstars. The gas temperature is of the order of 104 K , anddepends on the surface temperatures of the ionizing starsand cooling processes in the gas which depend stronglyon its element abundances. Low metallicity H II regionsare hotter than those with high metallicity. Ionized gasis often seen in the nuclei of galaxies and is either theresult of star formation (as in H II regions) or the result ofphotoionization by a central nonthermal energy source. Inour galaxy, there is a diffuse interstellar medium composedof gas and dust. Much of the volume (although not muchmass) of the galaxy is in this state with the gas ionizedby the diffuse stellar radiatation field. Because its densityis so low, ionized gas in this state takes a long time torecombine. (To a first approximation, the recombinationtime for diffuse, ionized hydrogen is

τ ∼ 105

ne,

where ne is the electron density in cm−3, and τ is in years.)Ionized gas is also found in supernova remnants.

Carbon monoxide (CO), which is found in cool molecu-lar clouds in the galaxy, has been detected in several other

Page 95: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

382 Galactic Structure and Evolution

FIGURE 13 The M/L ratio as a function of the luminosity, the mass, the velocity dispersion and the combination0.22 log rc + 0.49 log σ. log M/L = 2 log σ − log 〈I 〉e − log rc + constant. rc is in kpc; M/L, mass and luminosity in solarunits. For H0 = 50 km sec−1 Mpc−1 and Mass = 5σ 2rc/G the constant is −0.73. The dashed lines in (a), (b) and (d)show the selection effect due to a limiting magnitude of −20.45 mag in Gunn r . This is the limit for the Coma clustersample. (From the Royal Astronomical Society, 280, 173, 1996.)

nearby spiral and irregular galaxies and even in higherredshift starburst galaxies. CO clouds are associated withregions of both massive and low mass star formation. Insprial galaxies, the CO emission often fills in the centralH I minimum. There is an excellent correlation betweenthe gas content of a galaxy and its current star formationrate as measured by integrated colors or the strength ofemission from H II regions.

F. Radio Emission

All galaxies that are actively forming stars emit radio ra-diation. This emission is from a combination of free–freeemission from hot, ionized gas in H II regions, and fromsupernova remnants produced when massive young starsreach the endpoint of their evolution. In addition, cer-tain galaxies, usually ellipticals, have strong sources ofnonthermal (i.e., not from stars, hot gas, or dust) emis-sion in their nuclei. These galaxies, called radio galaxies,output the greatest part of their total emission at radioand far-infrared wavelengths. The radio emission is usu-

ally synchrotron radiation and is characterized by largepolarizations and self-absorption at long wavelengths. Amore detailed treatment of central energy sources can befound in another chapter.

G. X-Ray Emission

As noted earlier all galaxies emit X-ray radiation fromtheir stellar components—X-ray binaries, stellar chromo-spheres, young supernova remnants, neutron stars, etc.More massive objects, particulary elliptical galaxies, haverecently been found by Forman and Jones with the EinsteinX-ray Observatory to have X-ray halos, probably of hotgas. A small class of the most massive elliptical galaxieswhich usually reside at the centers of rich clusters of galax-ies also appear to be accreting gas from the surroundinggalaxy cluster. This has been seen as cooler X-ray emis-sion centered on the brightest cluster galaxy which sitsin the middle of the hot cluster gas. This phenomenon iscalled a “cooling flow,” and results when the hot cluster gascollapses on a central massive object and becomes dense

Page 96: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 383

enough to cool efficiently. This process is evidenced bystrong optical emission lines as well as radio emission.Cooling flows may be sites of low mass star formation atthe centers of galaxy clusters.

Active galactic nuclei—Seyfert 1 and 2 galaxies (dis-coverd by C. Seyfert in 1943), and quasars are also usuallystrong X-ray emitters, although the majority are not strongradio sources. The X-ray emission in these galaxies is alsononthermal and is probably either direct synchrotron emis-sion or synchrotron-self-Compton emission.

IV. GALAXY FORMATIONAND EVOLUTION

A. Galaxy Formation

The problem of galaxy formation is one that remains asyet unsolved. The fundamental observation is that galaxiesexist and take many forms. The interiors of most galax-ies are many orders of magnitude more dense than thesurrounding intergalactic medium.

In the hot Big Bang model, the simplest descriptionof galaxy formation is the gravitational collapse of den-sity fluctuations (perturbations) that are large enough to bebound after the matter in the universe recombines. At earlytimes, the atoms in the universe—mostly hydrogen andhelium—are still ionized so electron scattering is impor-tant and radiation pressure inhibits the growth or forma-tion of perturbations. Before recombination, the universeis said to be in the Radiation Era, because of the domi-nance of radiation pressure. The period of recombination,that is, hydrogen and helium atoms are formed from pro-tons, α particles, and electrons, is called the DecouplingEra, because the universe becomes essentially transparentto radiation. After that the universe is Matter Dominated asgravitational forces dominate the formation of structure.The cosmic microwave background radiation, postulatedby Gamow and colleagues in the 1950s and detected byPenzias and Wilson in 1965, is the relic radiation fieldfrom the primeval fireball and represents a snapshot of theuniverse at decoupling.

In this simple picture, matter is distributed homoge-neously and uniformly before decoupling because theradiation field will tend to smooth out perturbations inthe matter. After decoupling statistical fluctuations willform in either the matter density or the velocity field (tur-bulence). The amplitude of fluctuations which are largeenough will grow, and the fluctuations can fragment and, ifgravitationally bound, collapse to form globular clusters,galaxies, or larger structures. A simple criterion for thegrowth of fluctuations in a gaseous medium was derivedby Jeans in 1928. The Jeans wavelength, λJ , is given by

λJ = cs

)1/2

,

where cs is the sound speed in the medium, and ρ is itsdensity. The sound speed is

cs ∼ c/√

3

in the radiation dominated era, and

cs =(

5kT

3m P

)1/2

after recomination. If a fluctuation is larger than that in theJeans length, gravitational forces can overcome internalpressure. The mass enclosed in such a perturbation, the“Jeans mass,” is just

MJ ≈ ρλ3J ≈ c3

s

G3/2ρ1/2.

Before recombination, this mass is a few times 1015 M,which is comparable to the mass of a cluster of galaxies.After recombination, the Jeans mass plunges to ≈106 M,or about the mass of a globular star cluster. The amplitude(δρ/ρ) required for fluctuations to become gravitationallybound and collapse out of the expanding universe isdependent on the mean mass density of the universe. Theratio of the actual mass density to the density required toclose the universe is usually denoted .

= 8πGρ

3H 2o

,

where Ho is the Hubble Constant. If is large, i.e., nearunity, small perturbations can collapse.

In the 1970s work on a variety of problems dealing withthe existence and form of large-scale structures in the uni-verse made it clear that the formation of galaxies and largerstructures had to be considered together. Galaxies clusteron very large scales. Peebles and collaborators introducedthe description of clustering in terms of low-order corre-lation functions. The two-point correlation function, ξ (r ),is defined in terms of

δP = N [1 + ξ (r )] δV,

where δP is the excess probability of finding a galaxyin volume δV , at radius r from a galaxy. N is the meannumber density of galaxies. Current best measurements ofthe galaxy 2-point correlation function indicate that it canbe approximated as a power law of form

ξ (r ) = (r/ro)−γ ,

with and index and correlation length (amplitude) of

γ = 1.8, and ro = 5h−1 Mpc,

Page 97: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

384 Galactic Structure and Evolution

(where h is the Hubble Constant in units of 100 km sec−1

Mpc−1). Power has been found in galaxy clustering onthe largest scales observed to date (∼100 Mpc), and theamplitude of clustering of clusters of galaxies is 5 to 10times that of individual galaxies. In addition, work in thelast decade has shown that there are large, almost emptyregions of space, called Voids, and that most galaxies areusually found in large extended structures that appear fila-mentary or shell like, with the remainder in denser clusters.

There are currently two major theories for the forma-tion of galaxies and large-scale structures in the universe.The oldest is the gravitational instability plus heirarchi-cal clustering picture primarily championed by Peeblesand coworkers. This is a “bottom-up” model where thesmallest structures, galaxies, form first by gravitationalcollapse and then are clustered by gravity. This model hasdifficulty in explaining the largest structures we see, es-sentially because gravity does not have time in the ageof the universe to significantly affect structure on verylarge scales. A more recent theory, often called the pan-cake theory, is based on the assumption that the initialperturbations grow adiabatically, as opposed to isother-mally, so that smaller, galaxy-sized perturbations are in-tially damped. In this model the larger structures form firstwith galaxies fragmenting out later. If dissipation-less ma-terial is present, such as significant amounts of cold or hotdark matter (e.g., massive neutrinos), collapse will usu-ally be in one direction first, which produces flattened orpancake-like systems. Models of this kind are somewhatbetter matches to the spatial observations of structure, butthe hot dark matter models fail to produce the observedrelative velocities of galaxies. The Hubble flow, or gen-eral expansion of the universe, is fairly cool, with galaxiesoutside of rich, collapsed clusters, moving only slowly(σ < 350 km sec−1) with respect to the flow.

There are alternative models of galaxy formation, forexample, the explosive hypothesis of Ostriker and Cowie.In this model, a generation of extremely massive, pre-galactic stars is formed and goes supernova, producingspherical shocks that sweep and compress material soonafter recombination. These shells then fragment and thefragments collapse to form galaxies. There is some recentevidence that favors this model.

The starting point for all of the aforementioned modelsare the observations of small-scale fluctuations in the mi-crowave background radiation. Perturbations produced ei-ther before or after recombination appear as perturbationsin the microwave background on scales of a few arc min-utes. One of the major cosmological results of the 1990swas the discovery of these fluctuations at amplitides ofone part in a hundred-thousand, T/T < 10−5, with theCOBE (Cosmic Background Explorer) satellite. More re-cently, higher spatial resolution observations with ground-

based telescopes at the South Pole and with balloon-bornetelescopes have measured the power spectrum of thesefluctuations and appear to strongly confirm that the Uni-verse is geometrically flat.

This is exactly as predicted by an amplification of theBig Bang model called Inflation, which was introducedin the early 1980s by A. Guth and later P. Steinhardt andA. Linde. In inflationary models, the dynamics of the earlyuniverse is dominated by processes described in th GrandUnified Theories (GUTS). In these models, the universeundergoes a period of tremendous inflation (expansion)at a time near 10−35 sec after the Big Bang. If inflation iscorrect, galaxy formation becomes easier because fluctua-tions in the very early universe are inflated to scales whichare not damped out and can exist before decoupling. Theproblem of galaxy formation and the formation of large-scale structure is then the problem of following the growthof the perturbations we see in the early Universe until theybecome the objects we see today.

B. Population Evolution

The appearance of a galaxy changes with time (evolves)because its stellar population changes. This occurs be-cause the characteristics of individual stars change withtime. and because new stars are being formed in mostgalaxies. A galaxy’s appearance thus reflects its integratedstar formation history and the evolution of its gaseous con-tent. The population evolution of galaxies is relatively wellunderstood although detailed models exist for our galaxyand only a few others.

After the initial “formation” of the galaxy, the highermass stars in the first generation evolve more rapidlythan the lower mass stars. For example, the evolutionarytimescale for a 100 M star is only a few million years,while that for a 1 M star is nearly 10 billion years. Ele-ments heavier than hydrogen and helium are produced inthe cores of these stars and are then ejected into the inter-stellar medium, either by stellar mass loss or supernova ex-plosions, thus enriching the metal content of the gas. Starsformed later and later have increasing metallicities—theoldest and most metal poor stars in our own galaxy haveheavy element abundances only 10−3 to 10−4 solar. Theyoungest stars have abundances a few times solar. It ispossible for the average metallicity of a galaxy’s gas todecrease with time if it accretes primordial low metallic-ity gas from the intergalactic medium faster than its highmass stars eject material.

Elliptical galaxies are objects in which most of the gaswas turned into stars in the first few percent of the age ofthe universe. Only a few exceptional ellipticals—galaxieswith cooling flows, for example—show any sign of currentstar formation. These are objects in which the initial star

Page 98: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 385

formation episode was extremely efficient, with almostall of the galaxy’s gas being turned into stars. At present,elliptical galaxies, probably get very slightly fainter andredder as a function of time. Their light is dominated byred giant stars that have just evolved off the main sequence,and the number of these stars is a slowly decreasing func-tion of time for an initial mass function with the Salpeterslope. These stars are between 0.5 and 1 M. A com-peting process, main sequence brighteneing, can occur insystems where stars still on the main sequence contributesignificantly to the light—that is, for systems with steepinitial mass functions. Stars on the main sequence evolveup it slightly, becoming brighter and hotter, just beforeevolving into red giants.

Spiral and irregular galaxies have integrated star for-mation rates that are more nearly constant as a functionof time. In these objects, the gas is used up slowly or re-plenished by infall. It is probable that in many of thesegalaxies, star formation is episodic, occurring in burstscaused by either passage of spiral density waves throughdense regions, significant infall of additional gas, or in-teraction with another galaxy. The photometric propertiesof these galaxies are usually determined by the ratio ofthe amount of current star formation to the integrated starformation history. The optical light in objects with as lit-tle as 1% of their mass involved in recent, <108-year-old,star formation will be dominated by newly formed stars.Galaxies with constant star formation rates get brighter asa function of time for any reasonable assumption aboutthe initial mass function.

It is common to model the evolutionary history of galax-ies by assuming an unchanging initial mass function andparameterizing the star formation rate in terms of the avail-able gas mass or density, or in terms of a simple functionalform, usually and exponential. Examples of the possiblecolor and luminosity evolution of an elliptical and a spi-ral galaxy are shown in Fig. 14. These models assume anexponentially decreasing star formation rate,

(t) ∼ Ae−βt ,

where β is the inverse of the decay time (β = 0 is a con-stant star formation rate). Properties of galaxies along theHubble sequence are well approximated by a continuousdistribution of exponentials, from constant star formationrates (Sd and Im galaxies) to initial bursts with little sub-sequent star formation (E and S0 galaxies).

C. Dynamical Evolution

There are several processes responsible for the dynam-ical evolution of galaxies, tidal encounters and colli-sions, mergers, and dynamical friction. If galaxies wereuniformly distributed in space, the probability, Pi , that a

FIGURE 14 Luminosity and color evolution for two model galax-ies. (a) The color U–B represents the logarithmic difference(in magnitudes) between two broad bandpasses approximately800 A wide centered on 3600 A(U) and 4400 A(B). More negativeU–B’s are bluer. (b) The variable Mv is the absolute visual magni-tude. Model A has a star formation rate that is nearly constant asa function of time. This might be typical of a spiral galaxy. Model Bhas an exponentially decreasing star formation rate—15 e-foldsin 15 billion years. This would be typical of an E or SO galaxy witha small amount of present day star formation.

galaxy would have undergone a close interaction or mergerin time t is

Pi = π R2〈vrel〉Nt,

where R is the size of a galaxy, 〈vrel〉 is the mean relativevelocity, and N is the number density of galaxies. For theaverage bright galaxy R is about 10 kpc (for a HubbleConstant of 100 km sec−1 Mpc−1, 〈vrel〉 is about 300 kmsec−1 and Pi is thus less than 10−4 in a Hubble time. Galax-ies are clustered, however, and this significantly increasesthe probability that any individual object has undergonean interaction. As stated earlier, on the order of 10%, allgalaxies show some evidence of dynamical interaction.

Tidal encounters and collisions that produce observableresults are still relatively rare events in the field. Someexamples of such events were discussed earlier, e.g., theWhirlpool and ring galaxies. Detailed models of individ-ual events have been made by Toomre and others and themodels compare well with observed structures and ve-locity fields. Encounters at large relative velocity usuallyproduce small effects because the interaction time is short

Page 99: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

386 Galactic Structure and Evolution

and the stellar components of galaxies can easily passthrough one another. Encounters have large effects whenthe relative velocities are comparable or smaller than theinternal velocity fields of the galaxies involved. The effectsof encounters between spiral galaxies are also enhancedif the spin and orbital angular momenta of the galaxiesare aligned. Fast collisions, however, may be possibly re-sponsible for sweeping gas from early type galaxies ingalaxy clusters, although other mechanisms (ablation bythe hot intracluster medium, for example) probably dom-inate. Tidal encounters between protogalaxies were orig-inally thought to produce most of the observed angularmomentum in the universe; however, numerical simula-tions fail to produce the observed amount. The origin ofangular momentum remains a mystery.

It was similary proposed that mergers of spiral galax-ies might produce elliptical galaxies, a process whichcould explain the larger fraction of early type galax-ies seen in rich clusters, where interactions are morelikely to take place. Unfortunately, the photometric prop-erties of elliptical galaxies as well as their relatively largeglobular cluster populations strongly argue against thishypothesis. As in tidal encounters, the “efficiency” of

FIGURE 15

mergers depends on the relative encounter velocity; en-counters at low relative velocities are much more likelyto produce mergers than fast encounters. In particular, theline-of-sight velocity dispersion of galaxies in rich clus-ters is on the order of 1000 km sec−1, which is muchhigher than the stellar dispersions internal to the individualgalaxies.

Dynamical friction is a specialized case of “encounter”whereby a satellite object slowly loses its orbital energywhen orbiting inside the halo of a more massive galaxy.An object moving in the halo of a galaxy produces a grav-itational wake which itself can exert drag on the movingobject. This effect was first described by Chandrasekharin 1960. An object of mass M moving through a uniformhalo of stars of volume density n with velocity v will suffera drag of

dv

dt= −4πG2 Mnv−2 [φ(x) − xφ′(x)] ln ,

where φ is the error function, x = 2−1/2v/σ , and is theratio of the maximum and minimum impact parametersconsidered. σ is the velocity dispersion of the stars in thehalo.

Page 100: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS/GRD P2: GQT Final

Encyclopedia of Physical Science and Technology EN006K-271 June 29, 2001 12:1

Galactic Structure and Evolution 387

Mergers via dynamical friction as well as direct merg-ers of more massive galaxies almost certainly occur at thevery centers of rich clusters of galaxies where the centralcD galaxy is often accompanied by a host of satellitegalaxies. Such processes probably account for the cDgalaxy’s extended halo, depressed central surface bright-ness, and excess luminosity relative to unperturbed brightellipticals.

D. The High Redshift Universe

The launch of the Hubble Space Telescope in 1990 andthe advent of a large number of new and powerful 8-mclass ground-based optical telescopes has significantly im-proved our ability to see the Universe as it was a billionyears or so after the Big Bang. Perhaps the best exam-ple of our view is the Hubble Deep field (Fig. 15). Inthis image, we see many galaxies at refshifts of 3 to 4,or as they would appear when the Universe is only a lit-tle older than a billion years. These objects are generallymorphologically peculiar and show signs of both strongstar formation and intense dynamical interaction. This andother observations indicate that the rate of formation ofstars in the Universe probably peaked when the Universewas about 1/5th its present age and has declined signifi-cantly in the last 5–10 billion years. They also indicate thatmerging of galaxies and parts of galaxies is an importantaspect of the evolutionalry scenario at redshifts greaterthan 1.

V. SUMMARY

As of this writing, no individual objects have been ob-served at redshifts greater than 6, so there is a largeunexplored region of time and space between the time ofrecombination (the formation of the Cosmic microwavebackground at a redshift of ∼1000 or an age of a few100,000 years) and the first observable objects. These“dark ages” will be explored with a new set of space-borne telescopes such as SIRTF (the Space InfraRed Tele-

scope Facility) and the NGST (Next generation SpaceTelescope).

SEE ALSO THE FOLLOWING ARTICLES

COSMIC INFLATION • COSMOLOGY • INTERSTELLAR

MATTER • QUASARS • SOLAR PHYSICS • STAR CLUSTERS

• STELLAR SPECTROSCOPY • STELLAR STRUCTURE AND

EVOLUTION • SUPERNOVAE

BIBLIOGRAPHY

Bertin, G. (2000). “Dynamics of Galaxies,” Cambridge University Press,Cambridge, UK.

Binggeli, B., and Buser, R. (eds.) (1995). “The Deep Universe,” Springer,Berlin, Germany.

Binney, J., and Merrifield, M. (1998). “Galactic Astronomy,” Princeton,Princeton, NJ.

Binney, J., and Tremaine, S. (1987). “Galactic Dynamics,” Princeton,Princeton, NJ.

Bok, B. J., and Bok, P. (1974). “The Milky Way,” Harvard UniversityPress, Cambridge.

Bothun, G. (1998). “Modern Cosmological Observations and Problems,”Taylor and Francis, London, UK.

Corwin, H., and Bottinelli, L. (eds.) (1989). “The World of Galaxies,”Springer-Verlag, New York.

Elmegreen, D. M. (1998). “Galaxies and Galactic Structure,” PrenticeHall, Upper Saddle River, NJ.

Ferris, T. (1982). “Galaxies,” Stewart, Tabori and Chang, New York.Hodge, P. (1986). “Galaxies and Cosmology,” Harvard University Press,

Cambridge.Hodge, P. (ed.), (1984). “The Universe of Galaxies,” Freeman and Co.,

San Francisco, CA.Peebeles, P. J. E. (1971). “Physical Cosmology,” Princeton University

Press, Princeton NJ.Sandage, A. (1961). “The Hubble Atlas of Galaxies,” Carnegie Institution

of Washington, Washington, DC.Sandage, A., Sandage, M., and Kristian, J. (eds.) (1975). “Galaxies and

the Universe,” University of Chicago Press, Chicago.Shu, F. (1982). “The Physical Universe,” University Science Books, Mill

Valley, CA.Silk, J. (1989). “The Big Bang,” Freeman and Co., San Francisco, CA.Sparke, L. S., and Gallagher, J. S. (2000). “Galaxies in the Universe,”

Cambridge University Press, Cambridge, UK.

Page 101: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

Interstellar MatterDonald G. YorkUniversity of Chicago

I. The Interstellar Medium—An OverviewII. Physical Environment in Interstellar SpaceIII. Physical ProcessesIV. Diagnostic TechniquesV. Properties of the Interstellar Clouds

VI. Properties of the Intercloud MediumVII. The Interstellar Medium in Other Galaxies

GLOSSARY

Absorption Removal of energy by atoms, molecules, orsolids from a beam of radiation, with re-emission atwavelengths other than the absorbing wavelength.

Cosmic rays Atomic nuclei or electrons accelerated toenergies of more than 1 MeV by unknown processes.

Dissociation Breakup of a molecule into smallermolecules or atoms.

Emission Process whereby excited energy states in atomsand molecules produce radiation as electrons within theparticles relax to lower energy states.

Galaxy Aggregate of stars (numbered 106–1012) gravita-tionally bound in orbits typically 10 kpc in size.

Intercloud medium Space and material between the in-terstellar clouds, generally thought of as the largest vol-ume in the interstellar medium.

Interstellar clouds Condensations in the interstellarmedium of various densities and temperatures, typi-cally several parsecs to several tens of parsecs in size.

Interstellar medium Space between the stars and thedust, gas, and fields that fill it.

Ionization Removal of electrons from atoms or mole-cules by any of several processes.

Nucleosynthesis Formation of heavy elements fromlighter elements through fusion.

Parsec Measure of distance, equal to 3×1018 cm or about3 light-years.

Polarization Preferential rather than random alignmentof the electric vector of incoming radiation.

Recombination Capture of an electron by an atomic ormolecular ion.

Scattering Redirection of light without changing itswavelength.

Shock front Pressure discontinuity within which low-energy particles are accelerated and ionized ordissociated.

AS FAR AS WE KNOW, galaxies can be regarded asthe main building blocks of the universe. They are thecentral feature in modern astronomical research, all otherfields being aimed at understanding where they came fromand what goes on within them. The galaxies we can see

45

Page 102: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

46 Interstellar Matter

are made up of stars, but the largest part of the volumeof the galaxies is filled with dust and gas—the most purevacuum in nature, save only the space between the galaxiesthemselves. By mass, the interstellar material accounts forabout 10% of our galaxy. The space between the starsis referred to as the interstellar medium. According tomodern theories, the existence of a visible galaxy as anactive, star-forming entity depends upon the passage ofatoms made in stars out into this medium. The processesthat occur therein are thought to lead to formation of newgenerations of stars.

The detailed process of removal of atoms from starsand their aggregation into condensing clouds that formnew stars in interstellar space is complex. The elaborationof the process is a major thrust in modern astronomicalresearch. We attempt here to describe, from an empiricalpoint of view, what is known about the interstellar mediumand the central role played by the gas and dust found therein the unfolding of the larger story of the life, birth, anddeath of galaxies. Following an overview, the article pro-ceeds to describe the environment of interstellar space,the physical processes thought to be important, and thediagnostic techniques used in the field. The central ideason the nature of the interstellar medium are then summa-rized. Near the end of the article, the particular details ofthe region near the sun are generalized to elaborate on ar-eas of research on interstellar material in other galaxiesand to discuss the cosmological importance of interstellarmedium research.

I. THE INTERSTELLARMEDIUM—AN OVERVIEW

A. Clouds

The interstellar medium in our galaxy contains four read-ily detectable components, three of which are describedin terms of clouds of atoms and molecules. The bright-est components, are the so-called HII regions, regionsof ionized hydrogen that emit radiation when the pro-tons (H+) and electrons recombine. The regions are con-stantly reionized by radiation from nearby stars. Theyare readily visible near stars with temperatures above30,000 K (O stars). The Great Nebula in Orion, aroundthe multiple star θ Orionis, is the best-known exam-ple of the HII region because it is visible to the nakedeye.

The second component consists of dark clouds that arevery cold and emit no visible radiation, but do emit far-infrared (100-µm) and millimeter molecular line radia-tion. Absorption of background star light by dust in theseclouds is responsible for the dark bands along the Milky

Way in Scorpio and in Cygnus (the Cygnus rift) and forthe famous dark spot in the southern sky known as theCoal Sack.

A third component, detectable only with sophisticatedinstrumentation, consists of diffuse clouds. These arelower in mass than either HII regions or dark clouds andare warmer than dark clouds but are not warm enough toemit significant radiation. They contain too little dust tocause discernible extinction to the human eye.

B. The Intercloud Medium

A fourth component is a very hot phase, with T ∼ 106 K,thought to fill much of the void between the stars. Thisphase has been detected only indirectly, and its direct de-tection represents the most important frontier in the field.Gas with T ∼ 3 × 105 K (detected in the form of OVI orO+5) is ubiquitous and is thought to imply the presence ofthe hotter 106 K material. Soft X rays detected from gasnear the sun may arise in the 106 K gas. X rays from su-pernova remnants at 107 K are easily detectable. As theseremnants expand and cool, they should produce large cav-ities at 106 K.

Figure 1 illustrates the location of interstellar gas of var-ious types in a schematic spiral galaxy, seen from above.

FIGURE 1 Relative location, within the spiral arms of a galaxy,of various types of interstellar material. The space between inter-stellar clouds is known as the intercloud medium. Solar-like starspermeate the spiral arms, but are omitted here for clarity.

Page 103: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

Interstellar Matter 47

II. PHYSICAL ENVIRONMENTIN INTERSTELLAR SPACE

A. Galactic Radiation Field

Interstellar gas is ionized by radiation from many sources.The dominant source of high-energy photons capable ofionizing hydrogen (E > 13.6 eV) is the O and B stars(T > 104 K). These stars primarily reside in spiral arms,but they produce a diffuse source of radiation with thespectral shape of a 25,000 K black body and an energydensity of about 1 eV/cm3 (equivalent to the radiationdetected from a single B1 star at a distance of about 30light-years). Since the hottest, most massive stars formin groups, there are local maxima in this radiation field,discernible by the copious emission of HII regions.

The only place unaffected by this pervasive field is theinterior of dark clouds. Hydrogen atoms at the exterior ofclouds shield interior atoms from the high-energy photons(E > 13.6 eV, the ionization potential of HI), but lower en-ergy photons may penetrate. However, in the dark clouds,dust at the exterior provides a continuous opacity source,and so even visible light cannot penetrate.

Other sources of radiation include white dwarfs, hotcores of dying stars, normal stars like the sun, and distantgalaxies and QSOs (the diffuse extragalactic background).The first two sources produce high-energy photons, but notin copious amounts. Normal stars are an important sourceof photons only at λ > 4000 A.

While X-ray sources permeate space, the X rays them-selves have little measurable effect on interstellar clouds.X rays probably serve to warm and ionize cloud edges andto produce traces of high-ionization material.

Supernovae from time to time provide radiation burstsof great intensity, but, except perhaps in the distant halo,these have little global effect. (The mechanical impact ofthe blast wave from the explosion, on the other hand, isvery important, as discussed later.)

B. Magnetic Field

A weak magnetic field exists in space, though its scalelength and degree of order are uncertain. The strength isroughly 10−6 G. The field was originally discerned whenit was discovered that light from stars is linearly polar-ized (∼1%). From star to star the polarization changesonly slowly in strength and direction, suggesting that thepolarizing source is not localized to the vicinity of thestars, but rather is interstellar in origin. It is thought thatthe field aligns small solid particles, which in turn leadto polarization of starlight. Since many elements are keptmostly ionized by the radiation field in space, the mag-netic field constrains the motion of the gas (ions) and maybe responsible for some leakage of gas from the galaxy.

C. Cosmic Rays

Very high energy particles (106–1021 eV) penetrate mostof interstellar space. In the dark clouds, where ionizingphotons do not penetrate, collisions between atoms andcosmic rays provide the only source of ions. Cosmic raysprobably provide the basis of much of the chemistry indark clouds, since, except for H2, the molecules seenare the result of molecule–ion exchange reactions. Col-lisions between cosmic rays and atoms are thought to pro-duce most of the observed lithium, beryllium, and boronthrough spallation. The energy density of cosmic rays isabout 1 eV/cm3, similar to that of stellar photons. Cosmicrays interact with atoms, leading to pion production. Thesedecay to γ rays. The diffuse glow of γ rays throughoutthe galactic disk is accounted for by this process.

D. Cosmic Background Radiation

Light from other galaxies, from distant quasars, and evenfrom the primordial radiation has a measurable impact onthe interstellar medium. There are about 400 photons/cm3

(or about 1 eV/cm3, since λ ∼ 1 mm) in our galaxy fromthe radiation bath of the Big Bang, now cooled to 2.71 K.At the remote parts of the galaxy, the integrated lightof the extragalactic nebulae at 5000 A, and of the dis-tant quasars at E > 13.6 eV (λ > 912 A) is comparable togalactic sources of such radiation.

E. Mechanical Energy Input

Stars, in the course of their evolution, propel particles intospace, often at thousands of kilometers per second. Thesemay come from small stellar flares; from massive dumpsof the envelope of one star onto a binary companion, lead-ing to an explosion (novae or supernovae); from windsdriven by, among other things, radiation pressure, espe-cially in the hottest stars; or from detonation of stellarinteriors, leading to supernovae. These particle flows gen-erate pressure on the interstellar clouds, and, in regions ofmassive stars and supernovae, substantially heat the gasand reshape the clouds. The random motion of interstellarclouds is largely the result of the constant stirring causedby supernovae.

III. PHYSICAL PROCESSES

A. Absorption and Emission of Radiation

Up to a point the physics of interstellar clouds is easy toascertain. Atoms and simple molecules are easy to de-tect spectroscopically. Various effects lead to excitationor ionization of atoms and molecules. The extremely low

Page 104: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

48 Interstellar Matter

densities (10−3–105 particles/cm3) and the resulting greatdistances between atoms (millimeters to meters) meansthey seldom interact. When they do, the restoring radia-tive processes are usually fast enough that the atoms returnspontaneously to their original energy states. The net re-sult is quantized emission that is detectable at the earth.The strength and wavelength of the emission can be usedto discern the density, temperature, abundance, velocity,and physical environment of the emitting region.

The same physics implies that most atoms and mole-cules are in their ground electronic state, with very fewof the higher energy states populated by electrons. Hence,when light from a given star is intercepted by an atom,only a very restricted band of photons corresponds to anenergy that the atom can absorb. In these few cases, the ab-sorption is followed immediately by return of the electron,excited by the absorbed photon, to its original state, andsubsequent re-emission of an identical photon. There isonly a tiny chance that the re-emitted photon will have ex-actly the same path of travel as the one originally absorbed.To an observer using proper equipment, it appears that aphoton has been removed from the beam. Since there are1017–1021 atoms in a 1-cm2 column toward typical stars,the many repetitions of the absorption/re-emission pro-cess leave a notch in the spectrum called an “absorptionline” because of the appearance of the features as seen ina spectrograph—dark lines across the bright continuousspectrum of a star. The absorption line strengths can beused to measure the physical conditions of the absorbingregion.

B. Formation and Destruction of Molecules

In the densest regions (10–105 atoms/cm3), molecules areknown to exist in their lowest electronic states. The mostabundant molecule, H2, is thought to be formed on the sur-faces of grains, small-sized (<1-µm) particles discussedlater. Neutral hydrogen and molecular hydrogen are ion-ized by cosmic rays (see preceding discussion) as areother species such as neutral oxygen. A complicated chainof reactions (ion–molecular reactions) then ensues, sincemolecules and ions exchange electronic charge at rapidrates under the conditions present in interstellar clouds.Neutral molecules are then formed by recombination ofionized molecules and ambient electrons.

Some molecules are too abundant to be formed in thereaction scheme noted (such as ammonia, NH3). Perhapsformation on grains is important in such cases. While theproduction of H2 on grains can be generalized, productionof other molecules cannot be predicted without specificknowledge of the makeup of the grains, which we nowlack (see later discussion).

Some molecules may be formed in diffuse clouds byshock-triggered processes. Molecules are destroyed by

chemical reactions (to make other molecules) or by ra-diation. In general, the higher the photodestruction crosssection of a molecule, the more deeply it will be buried(in an equilibrium between destruction and formation pro-cesses) in the denser regions of clouds.

The most ubiquitous molecules are H2 and CO (10−5–0.5 molecules per H atom). Two- and three-atom mole-cules are common (e.g., CH, CN, HCN, and HCO+, with10−13 to 10−8 molecules per H atom). The most massivemolecule now known is HC13N. However, there is spec-ulation that molecules as large as C60 may be present inlarge numbers. Because the nature of grains is poorly un-derstood, it is possible that some observations that requiresmall grains could be accounted for by large molecules(see later discussion).

C. Ionization and Recombination

Atoms and molecules in clouds are infrequently hit byphotons or particles (atoms, molecules, or grains). Whensufficient energy exists in the collision, an electron may beremoved (ionization), producing a positively charged par-ticle. Coulomb attraction between these particles and freeelectrons leads to recombination, reinstating the originalstate of charge. Recombination may be to an excited stateof the neutral species. Subsequent decay of the electron tothe ground state, through the quantized energy levels ofthe neutral atom or molecule, leads to emission of photons.The best-known examples are the Balmer lines of hydro-gen seen in recombination from HII regions, the strongestof which is the Hα line at 6563 A, which leads to the red ap-pearance of some emission nebulae in color photographs.Sometimes the recombination process leads to populationof atoms in “forbidden” states not formally permitted tobe populated by absorption from the ground state becauseof quantum mechanical selection rules. These states arenormally suppressed by collisional de-excitation in lab-oratory gases, but at the densities encountered in space(<104 ions/cm3), radiative decay occurs before collisionalde-excitation can occur. The best-known cases are theoxygen forbidden lines, particularly λ5007 of O+2, whichgives a greenish appearance to some nebulae.

D. Shock Fronts

Discontinuities in pressure occur in the interstellarmedium, caused by explosions of stars, for instance. Prop-agation of such a pressure shock through the medium, intoclouds in particular, leads to observable consequences.Shocks propagating at velocities above 300 km/sec ran-domly thermalize the gas impinged upon to temperaturesabove 106 K, where cooling by radiation of a gas withcosmic or solar abundance is inefficient. Such shocks arereferred to as adiabatic and lead to growth of large cavities

Page 105: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

Interstellar Matter 49

of hot, ionized gas, transparent to radiation of most wave-lengths. Eventually, particularly when a dense cloud is hitby a shock, the shock speed is reduced; hence it thermal-izes gas to a lower temperature, and cooling is more im-portant. The radiation that cools the gas is just the recom-bination radiation discussed earlier. Thus, if a supernovaexplosion occurs in a low-density region containing higherdensity (hence cooler) clouds, the propagating shock wavewill lead to observable radiation from cooling just wherethe clouds are. Supernova remnants, such as Vela or theCygnus Loop, are thus largely filamentary in visible lightdue to recombination radiation from cloud edges (e.g., Hα

or λ5007; see previous subsection).

E. Dust Scattering

Several observation of stars indicate the presence of dustparticles in space. Historically, the first and chief indica-tion of their presence is that stars of the same temperature(based on stellar spectra) have different colors. In gen-eral, fainter, more distant stars are redder. Some cloudshave so much dust they extinguish all background stars atλ < 2 µm.

Models that try to account for the detailed features ofthe reddening suggest the existence of silicate grains andperhaps some graphite grains. Sizes range from 100 to1000 A (0.01–0.1 µm) with perhaps a core of silicatesand a mantle or outer shell of amorphous ices. Such grainscould produce extinction by direct scattering into direc-tions away from the line of sight of the observer, bypure absorption in the bulk of the grains (with reradia-tion into longer wavelengths), and by interference of re-fracted and scattered radiation on the observer’s side of thegrains. Such grains, if irregular in shape, can be aligned bythe interstellar magnetic field, leading to polarization ofstarlight.

F. Grain Formation and Destruction

Since the exact makeup of the grains is unknown, it is notclear how they form. Specific models for grain makeup al-low one to pose questions as to how certain types of grainsmight form. For instance, cool stars are known to expandand contract periodically (over periods of months). At themaximum expansion, the outer envelopes reach tempera-tures below 1300 K. At the relevant densities, solid par-ticles can condense out of the gas. The content of thegrains depends on the composition of the stellar atmo-spheres, but the atmospheres seem varied enough to pro-duce, in separate stars, silicate- and carbon-based cores.Such grains may be separated from the infalling, warminggas during the contraction of the atmosphere by radiationpressure, depending upon residual electric charge on thegrains.

Subsequently, grains may grow directly in interstellarclouds. In this case, small-grain cores (or perhaps largemolecules) must serve as seeds for growth of the remain-der of the grain by adhesion of atoms, ions, and moleculeswith which the seed grains collide. The type of grainsformed would then depend on the charge on the grainsand the charge on various ions and on the solid-state prop-erties of the grain surface and of monolayers that buildup on it.

G. Heating and Cooling of the Gas

Cooling processes in interstellar gas can be directly ob-served because cooling is mainly through line radiation.Diffuse clouds are cooled mainly by radiation from up-per fine structure levels of species such as C+, which areexcited by collisions of atomic or ionic species with elec-trons, neutral hydrogen, hydrogen ions, or H2. In denseclouds, this process is important in atoms such as carbon.In hot, ionized regions (HII regions, T ∼ 104 K) recombi-nation radiation of H0 and O+ carries away energy fromthe gas, whereas at temperatures of 105 K (shocks in super-nova remnants) recombination and subsequent radiationfrom forbidden levels of O+2 and other multiply ionizedspecies dominates.

Heating of the interstellar gas is poorly understood. Theheating sources observable (X rays, cosmic rays, exother-mic molecular reactions) are inadequate to explain thedirectly observed cooling rates, given the temperaturesobserved in diffuse clouds (T ∼ 100 K). The prime can-didate for the primary heating mechanism is photoelec-tron emission from grains caused by normal starlight atλ 3000 A striking very small grains.

Heating of dense molecular clouds (T ∼ 10 K) is alsouncertain. Obvious gas-phase processes appear to be inad-equate. An additional possibility in such clouds is that starformation activity (directly by infrared radiation or indi-rectly by winds or shocks caused by forming stars) heatsthe clouds. Localized shocked regions with T ∼ 25 K havenow been directly observed in dense molecular cloudsthrough emission from rotationally and vibrationallyexcited H2.

IV. DIAGNOSTIC TECHNIQUES

A. Neutral Hydrogen Emission

The spin of the nucleus (proton) of hydrogen and theelectron can be parallel or antiparallel. This fact splitsthe ground state of hydrogen into two levels so close to-gether that the populations are normally in equilibrium.In practice this means that the higher energy state is morefrequently populated and that the resulting emission, inspite of its low probability, yields a detectable emission

Page 106: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

50 Interstellar Matter

at λ = 21 cm. In many cases, when the column density ineach cloud (or velocity component) is not too high, thepower detected at 21 cm is directly proportional to thenumber of hydrogen atoms on the line of sight of the mainlobe of the radiation pattern of the radio telescope. Thetotal number of atoms is generally expressed as a columndensity, NHI, in units of atoms per square centimeter. Thedetection is not dependent on the volume density of the gas(denoted as nHI atoms/cm3) and yields no direct informa-tion on the distance of the emitting atoms: all atoms in thevelocity range to which the receiver channel is sensitiveare counted. Generally, the local expansion of the uni-verse removes atoms not in our galaxy from the receiverfrequency by Doppler shifting their 21-cm emission tolonger wavelengths.

B. Molecular Emission

Molecules, although generally in the ground electronicstate, are excited by collisional or radiative processesto higher rotational and vibration levels. Of the some40 molecular species known in interstellar clouds, mostare detected in the millimeter-wavelength region throughrotational excitation. Hydrogen has been detected in rota-tional and vibrational emission and may be detectable inelectronic emission (fluorescence).

If the mechanism populating the higher level of a transi-tion is known, the emission strength gives some informa-tion about the mechanism. For instance, if emission fromseveral different upper levels of a molecule is detectable,the relative population of the states can be determined.Given molecular constants and the temperature, the den-sity of hydrogen can be determined if excitation is bycollisional processes.

There are several unidentified emission features in thenear infrared that are apparently related to interstellar ma-terial, some of which have been attributed to molecules inthe solid phase or to polycyclic aromatic hydrocarbons.Identification of these features is being actively pursuedin several laboratories.

C. Atomic Emission

Recombination lines and forbidden emission lines ofatoms allow determination of abundances, densities, andtemperatures in HII regions. Optical telescopes provide themain data for these lines. Eventually, fine structure emis-sion in the infrared will give us detailed abundance datainside molecular clouds. Such data are currently availableonly for selected regions.

D. Atomic and Molecular Absorption

Absorption lines, already explained, are used to derivecolumn densities of many species. Because of collisional

de-excitation mechanisms, H2CO is seen in absorptionagainst the microwave cosmic background of only 3 K.Twenty-one-centimeter radiation is absorbed by H atomsin the lowest hyperfine state against background radio con-tinuum sources. Resonance absorption lines of moleculessuch as C2, H2, CN, and CH+ are seen in the optical and ul-traviolet spectral regions. Resonance (ground-state) tran-sitions of most of the first 30 elements in atomic or ionicform are seen as optical or ultraviolet absorption in thespectra of distant stars. In special circumstances, the de-gree of absorption is related linearly to the number ofatoms leading to derived column densities (particles persquare centimeter) to be compared with correspondingvalues of N for hydrogen. The ratio N (X)/N (HI) givesthe abundance of the species X, though in practice sev-eral ionization states of the species X must be accountedfor. Conversely, some ions of heavy elements may bedetected when hydrogen is ionized (HI unobservable),and the number of ionized hydrogen atoms must be ac-counted for (using, for instance, knowledge of the elec-tron density and the intensity of the Balmer recombinationradiation).

When ratios N (X)/N (HI) are available for clouds, theyare frequently referred to solar abundance ratios. If thereare 1/10 as many atoms of a certain kind per H atom asfound in the sun, the element is said to be depleted by afactor of 10 in interstellar space.

E. X-Ray Emission and Absorption

X-ray absorption lines have not yet been detected from thehot gas mentioned earlier. While this is one of the mostimportant measurements to be done in the study of inter-stellar material, it awaits the arrival of a new generation ofvery large X-ray telescopes. X-ray absorption edges havebeen detected, but these are contaminated by circumstellarabsorption in the X-ray source itself, with little possibilityof the velocity distinction possible in resonance absorptionlines.

X-ray emission arises from radiative recombinationand collisional excitation followed by radiative decay.Broadband X-ray emission from interstellar gas at 0.1-to 1-keV energies has been detected. High-quality spectraof the resolved emission lines from the hot gas are justbecoming available. Such measurements are extremelyimportant because the large hydrogen column densitiesat distances >100–200 pc absorb such soft X-ray pho-tons. Any detection will thus refer only to very nearbygas, which can therefore be studied without confusionfrom more distant emission. Broadband X-ray emissionat higher energies from the diffuse hot gas cannot yet beseparated from a possible diffuse continuum from distantQSOs and Seyfert Galaxies.

Page 107: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

Interstellar Matter 51

Higher energy (>1-keV) X-ray lines and continuumhave been detected from supernova remnants and fromvery hot (T ∼ 107 K) gas falling into distant clusters ofgalaxies.

F. γ-Ray Emission

As noted earlier, cosmic rays interact with interstellarclouds to produce γ rays. A knowledge of the distribu-tion of interstellar clouds and of the observed distributionof diffuse γ radiation may lead to a detailed knowledge ofthe distribution of cosmic rays in the galaxy. Our currentknowledge is based largely on the cosmic rays detecteddirectly at one point in the galaxy (earth).

V. PROPERTIES OF THEINTERSTELLAR CLOUDS

Given the many possibilities for detection of radiationemitted or modified by interstellar gas, astronomers havepieced together a picture of the interstellar medium. Inmany cases, the details of the physical processes are vague.In most cases, detailed three-dimensional models cannotbe constructed or are very model dependent. On the otherhand, in some cases, sufficient knowledge exists to learnabout other areas of astrophysics from direct observations.The example of the cosmic-ray distribution has alreadybeen given. Others are mentioned subsequently.

Interstellar clouds are complex aggregates of gas atcertain velocities, typically moving at ±6 to 20 km/secwith respect to galactic rotation, itself ∼250 km/sec overmost of the galaxy. Each cloud is a complex mixture ofa volume of gas in a near pressure equilibrium and ofisolated regions affected by transient pressure shocks orradiation pulses from star formation or from supernovaexplosions. Clouds are visible in optical, UV, or X-rayemission (or continuum scattering) when they happen tobe close to hot stars and are otherwise detectable in ab-sorption (molecules or atoms) or emission from low-lyingexcited levels (0.01 eV) or from thermal emission of thegrains in the clouds.

A. Temperatures

Molecular clouds are as cold as 10 K. Diffuse cloudsare typically 100 K. HII regions have T ∼ 8000 K, de-pending on abundances of heavy elements that providethe cooling radiation. Low-column-density regions with10,000 < T < 400,000 K are seen directly, presumablythe result of heating at the cloud edges from shocks, X rays,and thermal conduction. Isolated regions with T > 106 Kare seen near sites of supernova explosions.

B. Densities

Densities, as determined from direct observation of ex-cited states of atoms and molecules, are generally in-versely proportional to temperature, implying the exis-tence of a quasi-equilibrium state between the variousphases of the medium. The effects of sources of disequilib-rium in almost all cases last 107 years, or less than 1/10of a galactic rotation time, itself 1/10 of the age of the sun.The product nT (cm−3K) is ∼3000 to within a factor ofthree where good measurements exist. Thus, the molecular(dark) clouds have n > 102 cm−3, while in diffuse clouds,n < 102 cm−3. Higher densities (up to 105 cm−3) occurin disequilibrium situations such as star-forming regionsinside dense clouds and in HII regions.

C. Abundances: Gas and Solid Phases

By measuring column densities of various elements withrespect to hydrogen, making ionization corrections asnecessary, abundances of elements in interstellar diffuseclouds can be determined. Normally, the abundances arecompared with those determined in the sun.

Different degrees of depletion are found for differentelements. Oxygen, nitrogen, carbon, magnesium, sulfur,argon, and zinc show less than a factor of two depletion.Silicon, aluminum, calcium, iron, nickel, manganese, andtitanium show depletions of factors of 5–1000. Correla-tions of depletion with first ionization potential or with thecondensation temperature (the temperature of a gas in ther-mal equilibrium at which gas-phase atoms condense intosolid minerals), have been suggested, but none of thesescenarios actually fits the data in detail.

The pattern of depletion suggests no connection withnucleosynthetic processes. Those elements that are de-pleted are presumed to be locked into solid material, calledgrains. Such particles are required by many other obser-vations attributed to interstellar gas, as discussed earlier.In principle, the unknown makeup of the grains can be de-termined in detail by noting exactly what is missing in thegas phase. However, since there must be varying sizes andprobably types of grains and since the most heavily de-pleted elements do not constitute enough mass to explainthe total extinction per H atom, most of the grains by massmust be in carbon and/or oxygen. Establishing the exactmass of the grains amounts to measuring the depletionsof C and O accurately. Although these measurements cannow be made with the Hubble Space Telescope, problemsof interpretation of deletions remain.

The grain structure (amorphous or crystalline) is notknown. There are unidentified broad absorption features,called diffuse interstellar bands, that have been attributedto impurities in crystalline grains. However, these features

Page 108: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

52 Interstellar Matter

may be caused by large molecules. It has been arguedthat even if grains are formed as crystalline structures,bombardment by cosmic rays would lead to amorphousstructures over the life of the galaxy.

Theories of grain formation are uncertain. A generalscenario is that they are produced in expanding atmo-spheres of cool supergiants, perhaps in very small “seed”form. They may then acquire a surface layer, called a man-tle, probably in the form of water ice and solid CH4, NH3,etc. This growth must occur in cold, dense clouds. Thedetailed process and the distribution of atoms betweenminerals and molecules in solid phase are unknown.

D. Evolution

Interstellar clouds can be large, up to 106 solar masses,and are often said to be the most massive entities in thegalaxy. In this form, they may have a lifetime of morethan 108 years. They are presumably dissipated as a resultof pressure from stars formed within the clouds. Over thelifetime of the galaxy, interstellar clouds eventually turninto stars, the diffuse clouds being the residue from the starformation process. Growth of new molecular clouds fromdiffuse material is poorly understood. Various processesto compress the clouds have been suggested, including aspiral density wave and supernova blast waves. No onemechanism seems to dominate and several may be ap-plicable. However, the existence of galaxies with up to50% of their mass in gas and dust and of others with lessthan 1% of their mass in interstellar material leads to theinference that diffuse material and molecular clouds areeventually converted into stars.

VI. PROPERTIES OF THEINTERCLOUD MEDIUM

A. Temperatures

As already suggested, the medium between the clouds isat temperatures greater than 104 K. The detection of softX rays from space indicates that temperatures of 106 Kare common locally. Detection of ubiquitous OVI absorp-tion suggests there are regions at ∼3 × 105 K. Variousobservations suggest widespread warm neutral and ion-ized hydrogen. Attempts to explain this 104 K gas as anapparent smooth distribution caused by large numbers ofsmall clouds with halos have not been successful. Thusthere is evidence for widespread intercloud material at avariety of temperatures, though large volumes of gas near80,000 K are excluded. The more tenuous diffuse clouds,however formed, may be constantly converted to inter-cloud material through evaporation into a hotter medium.

B. Densities

In accordance with previous comments, all indicationsare that approximate pressure equilibrium applies in in-terstellar space. The above temperatures then imply in-tercloud densities of 10−1–10−3 H atoms/cm3. While di-rect density measurements in such regions are possibleat T < 30,000 K, through studies of collisionally excitedC+ and N+, direct determinations in hotter gas are spec-troscopically difficult. X-ray emission has not yet beenresolved into atomic lines and so is of limited diagnos-tic value. Direct measurements of ions from 104 to 107 Kin absorption over known path lengths must be combinedwith emission line data to fully derive the filling factor,hence the density, at different temperatures. Since emis-sion lines arise over long path lengths, velocities must beused to guarantee the identity of the absorption and emis-sion lines thus observed. Such data will not be availablefor several years.

C. Abundances

The deplection patterns already noted in the discussion ofclouds appear qualitatively in all measures of gas at 104 K,ionized or neutral. In general, the intercloud gas is less de-pleted. Perhaps shocks impinging on this diffuse mediumlead to spallation of grains and return of some atoms tothe gas phase. Data on hotter gas come mainly from ab-sorption lines of OVI, CIV, and NV. Since ionization cor-rections are not directly determinable and since the totalH+ column densities are not known at the correspondingtemperatures of 5 × 104 to 5 × 105 K, abundances are notavailable. However, the ratios C/O and N/O are the sameas in the sun. It is not known whether elements such asiron, calcium, and aluminum are depleted in this hot gas.Optical parity-forbidden transitions of highly ionized ironor calcium may some day answer this important question.

D. Evolution

The evolution of the intercloud medium depends on theinjection of ionization energy through supernova blastwaves, UV photons, and stellar winds. A single supernovamay keep a region of 100-pc diameter ionized for 106 yearsbecause of the small cooling rate of such hot, low-densitygas. Ionizing photons from O stars in a region free ofdense clouds may ionize a region as large as 30–100 pcin diameter for 106 years before all the stellar nuclearfuel is exhausted. Thus in star-forming regions of galax-ies with low ambient densities and with supernova rates of1 per 106 years per (100 pc)3 and/or comparable rates ofmassive star formation, a nearly continuous string of over-lapping regions of ∼104–106 K can be maintained. When

Page 109: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

Interstellar Matter 53

lower rates of energy input prevail, intercloud regions willcool and coalesce, forming new clouds. In denser regions,comparable energy input may not be enough to ionize theclouds, except perhaps near the edges of the dense region,for periods as long as 108 years.

VII. THE INTERSTELLAR MEDIUMIN OTHER GALAXIES

While much is unknown about the actual balance of mech-anisms that affect the distribution of gas temperatures anddensities in our own galaxy, the facts that are known makeinterstellar medium observations in other galaxies an im-portant way to determine properties of those galaxies as awhole. A few examples are mentioned here.

A. Supernova Rates

Supernovae are difficult to find because the visible super-novae occur only once per 30–300 years. Regions withhigher rates are often shrouded in dust clouds or extin-guished by nearby dust clouds. Thus, little is known aboutsupernova rates and their global effects on galaxies andon the orgin of elements. Nucleosynthesis in massive starsand in the supernova explosions, with subsequent distribu-tion to the diffuse interstellar medium, may have occurredat variable rates, perhaps much more frequently in theearly stages of galaxy formation than now.

Studies of interstellar media in other galaxies can shedlight on these subjects. X-ray emission and atomic lineemission can reveal the presence of supernova remnants,which, since they last up to 105 years or more, are eas-ier to find than individual supernovas, which last onlya few months. Absorption line measurements and emis-sion line measurements can be used to determine abun-dances in other galaxies. Absorption line measurementsreveal the velocity spread of interstellar gas, the stirring ef-fect caused by supernovae integrated over 106–107 years.(Quasi-stellar objects, clumps of O stars, or the rare super-nova can be used for background sources in such absorp-tion line studies.) By making the above-noted interstellarmedium studies on samples of galaxies at different red-shifts, the history of galaxy formation can be discernedover a time interval of roughly three-fourths the expan-sion age of the universe.

B. Cosmic-Ray Fluxes

The origin of cosmic rays is unknown. However, they ac-count for a large amount of the total energy of the galaxy.In situ measurements are only possible near earth. How-ever, cosmic rays provide the only explanation of theobserved abundances of boron, beryllium, and lithium

(∼10−9–10−10 atoms/H atom). Thus the abundance of anyof these elements is a function of the integrated cosmic-ray flux over the lifetime of the galaxy. Variations in theratio [B/H] would imply different cosmic-ray fluxes. Com-parison of [B/H] with other parameters related to galaxyhistory (mass, radius, total Hα flux, and interstellar clouddispersions) may indicate the history of cosmic rays.

C. History of Element and Grain Formation

As is clear from previous sections, studies of various kindsof clouds in our own galaxy have not led to a clear empir-ical picture of how the interstellar medium changes withtime. The trigger or triggers for star formation are poorlyunderstood, and the history of the clouds themselves isunknown in an empirical sense. Numerous problems existon the theoretical side as well.

Study of interstellar media in other galaxies should bevery important in changing this situation. Absorption andemission measurements that provide data on individualclouds and on ensembles of clouds should allow classi-fications of the gas phase in galaxies that can be com-pared with other classifications of galaxies by shape andtotal luminosity. Key questions include: Do elements formgradually over time, or are they created in bursts at thebeginning of the life of the galaxy? Do earlier galaxieshave the same extinction per hydrogen atom as is foundin our own galaxy, or are differences seen perhaps be-cause grains require long growth times to become largeenough to produce extinction at optical wavelengths? Dothe many unidentified interstellar features (optical absorp-tion, IR emission and absorption) occur at all epochs, orare they more or less present at earlier times? When didthe first stars form? How does star formation depend onthe abundance of heavy elements?

D. Cosmological Implications

For various reasons the interstellar medium has proved tobe fruitful ground for determining cosmological quanti-ties. Without detailed comparison with other techniques,a few examples are given.

The light elements hydrogen and helium are thoughtmainly to be of nonstellar origin. They are currentlythought to be formed in the Big Bang. Expansion of theearly cosmic fireball leads to a small interval of time whenthe gas has the correct temperatures for fusion of hydro-gen to deuterium and of deuterium to helium (3He and4He). The helium reactions are so rapid that deuteriumremains as a trace element, the abundance of which is de-pendent on the density of matter at the time of nucleosyn-thesis. By knowing the expansion rate of the universe, thederived density can be extrapolated to current densities.

Page 110: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: LLL/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008E-717 June 29, 2001 12:33

54 Interstellar Matter

Comparison with the total light emitted by stars in galaxiessuggests that there are 10 times fewer protons and neutronsthan implied by the abundance of deuterium. The searchfor the “missing” baryons is now focused on material inhot gas (1050–1070 K), already noted in this article as beinghard to detect.

The amount of helium present is similarly important inderiving the properties of matter at the time of nucleosyn-thesis. In principle, the helium abundance today providesa test of fundamental particle physics because it dependson the number of neutrino types (currently thought to bethree) and on the validity of general relativity.

Deuterium abundances are currently best determined inUV absorption line experiments in local interstellar matter.Helium abundances are best determined by measurementsof optical recombination lines of He2+ and He+ from HIIregions in dwarf galaxies with low metal abundance.

The preceding comments suggest how interstellarmedium studies allow a view of the very earliest stagesof the formation of the universe through measurementsof relic abundances of deuterium and hydrogen. Earliercomments related the importance of such studies to ob-serving and understanding the formation and evolution ofgalaxies through studies of abundances and star formationrates. A third example is the study of clustering of galaxiesat different redshifts. Since galaxies are very dim at cos-mological distances and since they may not have centralpeaks in their light distributions, they are difficult to detectat z > 2. Even at z = 1, only the most luminous galaxies aredetectable. However, interstellar medium observations ofabsorption lines depend not on the brightness of the galaxy,but on the brightness of an uncorrelated, more distant ob-ject, say a QSO. Thus, galaxies at very high redshift canbe studied. Many QSOs side by side will pass through anumber of galaxies in the foreground, and, with adequatesampling, the clustering of absorption lines reflects theclustering of galaxies on the line of sight.

Depending on the total matter density of the universe,the galaxies at high redshift will be clustered to a compa-rable or lesser degree than they are today. Thus, changesin clustering between high-z galaxies (interstellar media)and low-z galaxies (direct images and redshifts) reflectthe mean density of the universe. The material measured

in this way includes any particles with mass, whereas thedensity measurement discussed earlier, using deuterium,counts only neutrons and protons. The difference betweenthe mass density determined from changes in clusteringand the density determined from deuterium give a mea-sure of the amount of “dark” matter in the universe, that is,matter not made of protons and neutrons. Neither estimateof the mean density (one from deuterium, the other fromclustering) explains the apparent density needed to ac-count for the geometry of space. Either some form of mat-ter not clustered with galaxies, or equivalently, a nonzeroenergy of the vacuum of space may explain the discrep-ancy. Current estimates are that 0.5% of the mass-energyis in visible stars, 5% is in other forms of matter madeof protons and neutrons, 25% is in matter not made ofprotons and neutrons, and 70% is in vacuum energy.

SEE ALSO THE FOLLOWING ARTICLES

COSMIC RADIATION • COSMOLOGY • GALACTIC STRUC-TURE AND EVOLUTION • INFRARED ASTRONOMY • STAR

CLUSTERS • STELLAR SPECTROSCOPY • STELLAR STRUC-TURE AND EVOLUTION • SUPERNOVAE • ULTRAVIOLET

SPACE ASTRONOMY

BIBLIOGRAPHY

Bally, J. (1986). “Interstellar molecular clouds,” Science, 232, 185.Boesgaard, A. M., and Steigman, G. (1985). “Big Bang nucleosynthesis:

Theories and observations,” Annu. Rev. Astron. Astrophys. 23, 319.Frisch, P. F. (2000). “The galactic environment of the sun,” Am. Sci. 88,

52.Krauss, L. M. (1999). “Cosmological antigravity,” Sci. Am. 280(1),

52.Mathis, J. S. (1990). “Interstellar dust and extinction,” Annu. Rev. Astron.

Astrophys. 28, 37.Savage, B. D., and Sembach, K. (1996). “Interstellar abundances from

the absorption line observations with the Hubble Space Telescope,”Annu. Rev. Astron. Astrophys. 34, 279.

Shields, G. A. (1990). “Extragalactic HII regions,” Annu. Rev. Astron.Astrophys. 28, 525.

Spitzer, L. (1990). “Theories of hot interstellar gas,” Annu. Rev. Astron.Astrophys. 28, 37.

Page 111: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

Neutron StarsSteven N. ShoreIndiana University, South Bend

I. Historical and Conceptual IntroductionII. Formation of Neutron StarsIII. Equations of State: Neutron Stars as NucleiIV. Equations of StructureV. Pulsars

VI. Glitches: Evidence for Superfluids in NeutronStar Interiors

VII. Cooling ProcessesVIII. Neutron Stars in Binary Systems

IX. Magnetic Field Generation and DecayX. Future Prospects

GLOSSARY

Glitch Discontinuous change in the rotational frequencyof a pulsar.

Millisecond pulsars Weakly magnetized neutron stars,either isolated or in binary systems in which there is nomass transfer between components, which have periodsnear the rotational stability limit.

Pulsar Radio pulsars are isolated or nonaccreting binaryneutron stars with strong (>109 G) oblique magneticfields. X-ray pulsars are strongly magnetized neutronstars imbedded in accretion disks, which are fed frommass-losing companions in close binary systems.

Units Solar mass (M), 2 × 1033 g; solar radius (R),7 × 1010 cm.

URCA process Neutrino energy loss mechanismwhereby electron capture on nuclei produces neutri-

nos, followed by subsequent beta decay of the resultantnucleus with an additional neutrino emitted. The namederives from the Casino de Urca, at which gamblerslost their money little by little but without seemingto be aware of it. The process was first described byGamow and Schoenberg.

NEUTRON STARS are the product of supernovae (SN),gravitational collapse events in which the core of a massivestar reaches nuclear densities and stabilizes against furthercollapse. They occur in a limited mass range, from about0.5 to 4 M, being the product of stars between about 5and 30 M. If they are rapidly rotating and magnetized,their radio emission can be observed as pulsars. Some oc-cur in binary systems, where mass accretion from a com-panion produces X-ray and gamma-ray emission over a

.419

Page 112: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

420 Neutron Stars

wide range of accretion rates. If the rate is in the properrange, periodic nuclear reactions can be initiated, whichcreate episodic mass ejection events called bursters. Mag-netic neutron stars in binary systems produce pulsed X-rayemission without strong radio emission. Neutron stars areunifications of nuclear and relativistic physics, requiringgravitation theory to explain their structure and serving asuseful laboratories for constraining the nuclear equationof state.

I. HISTORICAL AND CONCEPTUALINTRODUCTION

The first work on dense equations of state, following theadvent of quantum theory in the 1920s, showed that itis possible for the pressure of a gas to depend solely onthe density. This implies that it is possible to constructstable cold configurations of matter, independent of theprevious thermal history of the matter. The first applicationof this idea to stellar evolution was by Landau who, in aprescient 1931 paper, discussed the possibility of the endstate of evolution being composed of a sphere in which thepressure support is entirely provided by electrons in thisextreme state of matter. Referred to as degenerate, thecondition results from the fact that, in a quantum gas, theelectrons are constrained to have only one particle per“box” of phase space. Landau pointed out that it might bepossible for the configuration to depend only on the massof the particle, which determines the momentum at a givenenergy, and thus to have objects in which the particle isneutral and heavy, but Fermi-like.

Chandrasekhar, in the early 1930s, derived an upperlimit to the mass of such a configuration, above which thepressure exerted by the electrons is always insufficient toovercome the pull of gravity. The ultimate fate of a moremassive object would be collapse. While he did not followthe evolution of such a state, it was clear from this workthat the upper mass for such stars, called white dwarfs, isof order the mass of the sun. He argued that more massivestars cannot end their lives in hydrostatic equilibrium butmust gravitationally collapse.

The discovery of the neutron and Yukawa’s meson the-ory clarified the basic properties of nuclear matter. Appli-cation of quantum statistics demonstrated that the neutronis a fermion, a spin of 1

2 . It is consequently subject to thePauli exclusion principle, like the electron, but since itsmass is considerably heavier than the electron, by abouta factor of 2000, degenerate neutron configurations areconsiderably denser than white dwarfs (WD). If a neutronsphere (NS) were formed, it would a sufficiently compactat a given mass that the Newtonian calculations previouslyused for stellar structure would not suffice for its descrip-

tion. Following the work of Tolman, who in 1939 derivedthe interior structure metric for a homogeneous nonro-tating mass, Oppenheimer and Volkoff, in the same year,derived the equation for relativistic hydrostatic equilib-rium. They showed that there is a maximum mass abovewhich no stable configuration for a neutron star is pos-sible for a degenerate neutron gas. While Chandrasekharhad presented a limiting mass, which is an asymptoticupper limit for infinite central density, Oppenheimer andVolkoff showed that the masses above about 1 M mustcollapse. A subsequent paper by Oppenheimer and Snydershowed that the ultimate fate of more massive objects wasto rapidly collapse past their Schwarzschild radii and be-come completely relativistic, singular objects, later calledblack holes (BH) by Wheeler. By the end of the 1930s, itwas understood that the end state of stellar evolution waseither a stable degenerate configuration of electrons (WD),neutrons (NS), or catastrophic gravitational collapse (BH).This picture has not been significantly altered with time.

Following these calculations, Zwicky and Baade nom-inated supernovae as the sites for neutron star formation.Having first shown that there is a class of explosive stellarevents with energies significantly above the nova outburst,by about three orders of magnitude, which they dubbed“supernovae,” they showed that the magnitude of energyobserved could be provided by the formation of a neutronstar. They argued that collapse of the core of the star fromabout the radius of the sun to the order of 10 km wouldbe sufficient to eject the envelope of the star (by a processat the time unknown) with the right magnitude of energyobserved in the SN events. The only object that could thusbe formed is a neutron star.

The first detailed calculations for the structure of suchobjects awaited the advent of computers in the early 1960sand the improvement in the equations of state for nuclearmatter. The fact that degenerate equations give an upperlimit of about 0.7 M is in part an artifact of the softnessof the equation of state at lower than nuclear densities.However, from the energy level structure in nuclei, it ispossible to specify more exact forms for the nucleon–nucleon (N–N) interaction potential, which can be usedto calculate the equation of state. The nuclear, or strong,force depends on the exchange of integer spin particles,π mesons. It is thus possible to have an attractive potentialat large separations, which becomes progressively morerepulsive at short distances. The central densities, whichresult in such models, are somewhat higher near the max-imum mass, but they do not change the fact that there isa maximum in the stable mass of such a star. The inclu-sion of exotic particles as the outcome of N–N scatteringchanges the details of the equation of state at the highestdensities, but is still incapable of removing this maximumentirely.

Page 113: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

Neutron Stars 421

Neutron stars remained theoretical entities untilNovember 28, 1967, when, using a radio telescope de-signed for the study of scintillation of radio sources at awavelength of 3.7 m, Bell and Hewish discovered the firstpulsar CP 1919 + 21. They observed a rapidly varying,periodic, pulsed, point radio source with pulse frequencyof about 1 Hz. In rapid succession, several more of theseobjects were discovered. By the end of the year, Gold hadsuggested that a pulsar is a rapidly rotating magnetizedneutron star. The pulses, he argued, result from the mag-netic field being inclined to the rotation axis; it is fromthe magnetic polar regions that the signal is radiated. Theshort duration of the pulses compared with the interpulseperiod supports this view. Pacini showed that such starsmust spin down via the emission of low-frequency ro-tationally generated electromagnetic waves based on thepower requirements for the synchrotron luminosity of theCrab Nebula, and such a spindown was indeed observedby Radakrishnan in the Vela pulsar in 1969. The polar-ization and frequency characteristics of the radio pulsesleft no doubt as to their nonthermal origin, and impliedthe presence of extremely strong magnetic fields, of order1012 G (the field of the earth is about 1 G). The forma-tion of such strong fields had previously been suggestedby Ginzberg in 1963 as the consequence of flux freezingduring stellar gravitational collapse.

The association of PSR 0532 + 27 with the Crab Neb-ula, the remnant of the supernova of 1054, and its opticalidentification with a faint blue star of featureless contin-uum and pulsed optical output near the center of symmetryof the nebula, showed that the initial guess for the originof neutron stars is likely correct. However, with the excep-tion of the Vela pulsar and one or two others (the matter isstill in doubt), pulsars are generally not found imbeddedin supernova remnants.

The discovery of binary X-ray sources came with thedetermination of the period of Cen X-3, in the early 1970sfollowing the launch of the UHURU satellite. The spec-trum of several other sources, and the mass determina-tions, notably Her X-1 = Hz Her, have been used for thedetermination of both the magnetic and mass propertiesof neutron stars. These sources are regular X-ray pulsars,formed by the structuring of the mass accretion region intoa column that corotates with the magnetic field of the starand has a period of order a few seconds. The observationof cyclotron lines in the X-ray from these stars shows thatthe magnetic fields are >1010 G as expected for neutronstars. None has a mass above 4 M regardless of the massof the companion.

The first binary pulsar, PSR 1913 + 16, was discov-ered by Manchester and Taylor in 1974. With its superbtiming stability relative to the errors associated with ve-locity determinations for mass-accreting X-ray pulsars, it

helped to set, for the first time, accurate mass limits forneutron stars. Only upper limits are possible, given uncer-tainties in the mass of the companion. About a half dozenmore of these have since been found. The discovery ofPSR 1937 + 214, a pulsar with a period of about 1 msec,in 1982 was soon followed by several others, includingseveral members of binary systems. These pulsars havecomparatively weak magnetic fields of order 109 G, andappear to result from the spin-up of neutron stars by accre-tion during phases of rapid mass transfer in a close binarysystem.

II. FORMATION OF NEUTRON STARS

Neutron stars are formed by gravitational collapse of in-termediate mass stars. Nucleosynthesis in such stars canproceed exothermally to build elements up to, but not past,56Fe because of the progressive increase in the binding en-ergy of nuclei until this point. Following Fe synthesis, theprimary loss mechanisms for the stellar core become elec-tron captures onto seed nuclei via the URCA process:

e + (Z , A) → (Z − 1, A) + νe, (1)

where Z is the atomic number and A the atomic mass,followed by the subsequent decay:

(Z − 1, A) → (Z , A) + e + νe, (2)

which represents a bulk rather than surface cooling. Be-cause the neutrino opacity of normal stellar envelopes issmall, these particles can freely escape from the core inrapid order, producing rapid cooling that is very densityand temperature sensitive. The core is thus forced to con-tract, increasing the reaction rates and producing acceler-ating losses. Finally, with increasing neutronization, thecore reaches a critical density at which the free-fall timebecomes short compared with the sound speed, and thesubsequent gravitational collapse cannot be avoided.

The free protons, which result from the destruction ofnuclei as the density passes ≈1011 g cm−3 and electroncapture that converts them to neutrons, alter the equa-tion of state. Should the core mass be small enough, lessthan the maximum mass allowable for a neutron star, thecollapse will halt for the core, and the subsequent re-lease of gravitational potential energy, which is essentiallythe gravitational binding energy of the final configuration(the collapse takes the core from about 106 km to about10 km) amounts to ≈1053 erg available for the lifting of theenvelope.

Most of this energy is radiated away in a neutrino burstthat accompanies the formation of the stable core. Neu-trino emission from the newly formed neutron star reac-celerates the shock, which initially stalls, and powers the

Page 114: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

422 Neutron Stars

expulsion of the stellar envelope. The visible supernovaevent that is the product of the collapse contains but asmall portion of the energy liberated by the core formation.While the details depend on the evolutionary history ofthe progenator star, and the visibility of the shock de-pends as much on the environment of the collapsed staras the energetic processes powering the ejection, it is gen-erally thought that the supernova event that heralds theend of the collapse should be visible. The neutrino burstdetected from SN 1987A in the Large Magellanic Cloud,with about the right order of magnitude and spectrum, aswell as duration (a few seconds), provides strong supportfor the general picture as outlined here.

There are two critical physical problems connected withthe formation of the stable configuration that will becomea neutron star: how does the equation of state halt thecollapse of the core; and how does the final configurationlook subsequent to the ejection of the envelope? To answerthe first, one must examine the equations of state, since itis the products of the nucleon–nucleon interaction. Then,one must consider the conditions for equilibrium, bothmechanical and thermal, and what constraints they provideon the interior structure.

III. EQUATIONS OF STATE: NEUTRONSTARS AS NUCLEI

Nuclear physics meets the cosmos nowhere as dramati-cally as in the context of the interior structure of neutronstars. Since these stars are in essence nuclei, albeit severalkilometers across, delimiting their density profiles is anexcellent check on our understanding of the structure ofnuclear matter.

The simplest equation of state, the relation betweenpressure, density, and temperature, comes from applica-tion of the Pauli exclusion principle to neutrons. Sinceneutrons are spin- 1

2 particles, their phase space is strictlylimited to single occupation. To show how a simple equa-tion of state can be derived, consider a gas of quantummechanical, chargeless particles that interact only via thePauli principle. That is, the number of particles in a vol-ume of phase space, h3, where h is Planck’s constant, islimited by

p3F n−1 = h3, (3)

which defines the fermi momentum, pF, for nucleon den-sity n. The distance between the particles is essentiallythe de Broglie wavelength. These particles exert a mutualpressure, resulting from the quantum condition, so thatP ≈ n(p2

F/2m) ≈ h2n5/3/m. Notice that this is indepen-dent of temperature, a condition which is called degener-ate. The pressure comes primarily from the unavailability

of phase space “bins” into which particles can be placed.The degeneracy of a gas is measured by

α = p2F

2mkBT, (4)

where kB is the Boltzmann constant. T is the tempera-ture, and m is the mass of the particle in question. If α

is large, the quantum effects dominate the available phasespace volume for the gas and the temperature plays norole in determining the pressure. The pressure is really theaverage of this over the total distribution function for

P ≈∫ EF

0

E3/2 d E

exp[(E/kT ) − α] + 1. (5)

Therefore, the pressure is strongly dependent on the den-sity, and not temperature, unless the total thermal energysignificantly exceeds the Fermi energy.

White dwarfs and neutron stars are similar, in this pic-ture, because both are composed of spin- 1

2 particles. Theirdifferences arise chiefly from the mass of the supportingparticles, electrons being the primary supporters for whitedwarfs. There is, however, a more profound difference be-tween these two stars—neutrons, being baryons, interactwith each other via the strong force. Thus, the details ofnuclear potentials play critical roles in the equilibrium ofsuch extreme states of matter.

In order to remain bound in nuclei, neutrons must in-teract attractively for some separations; the stability ofmatter, however, demands that the core turn repulsive (orat worst have vanishing potential) at very small distances.The finite sizes of nuclei, of order a few fermi (about10−13 cm), comes from the large mass of the π meson,the particle principally responsible for the transmission ofthe strong force. Degeneracy alone yields a soft equationof state, that is, one which asymptotically vanishes at in-finite density. The maximum mass such a gas can supportagainst collapse must be much smaller than one whichis due to a strongly repulsive interaction at small separa-tion (high density limit). The star that results from a softpressure law will consequently be smaller in radius andhave a higher central density than for a strongly repul-sive short-range potential, called a hard equation of state.However, it is only once the neutrons and protons becomefree that such considerations must be included in structurecalculations. Below the density at which this occurs, thenuclei are formed into an ionic lattice, through which adegenerate electron gas flows. This portion of the neutronstar, the crust, is a solid ranging from densities of about108 to 1011 g cm−3. The electrons at these very high den-sities are not only degenerate but form pairs called BCSsuperconductors.

The neutrons are not bound in nuclei in neutron starcores. That is because the free energy is greater for the

Page 115: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

Neutron Stars 423

bound than the “unbound” state, and the neutrons simplydiffuse, or “drip” out of the nuclei. This occurs well abovethe neutronization density, at about 4 × 1011 g cm−3. Itssignificance for neutron star structure is that until this den-sity is reached, the system can be thought of as a normalionic gas, or indeed solid, but at higher densities no suchstate is possible. Pairing becomes important in the reac-tions of the nucleons, again at densities of the order ofthe neutron drip state, and a superfluid can result. This in-volves the formation of N–N pairs, with their subsequentbehavior as a bosonic system. The analogous situation oc-curs much earlier in density for the electrons, which formpairs as in a superconductor and also behave as a super-fluid within the ionic lattice that can be set up from thenuclei.

This superfluid is one of the most important compo-nents in the interior of the neutron star, not because of itseffect on pressure so much as its other dynamical prop-erties. For instance, the heat capacity of a superfluid isessentially zero. The nucleons are completely degenerateand therefore do not have large thermal agitation. It hasthe property of rotating as a quantum liquid, which meanscertain properties such as the vorticity are quantized, andtherefore exercises a strong dynamical effect on the cou-pling between the outer and core regions of the neutronstar. In effect, the superfluid represents a collective dy-namical mode for the neutrons, similar to that observedin nuclei when many-particle states are included in thecalculation of nuclear structure. On the role it plays in therotational history of the star, we shall say more later.

The core region dominates the structure of the neutronstar. It is here that the stiffness of the nuclear matter equa-tion of state is most strongly felt. Since the density is about1–3 particles fm−3, about 1015 g cm−3, it is essential thatthe small distance part of the nuclear potential be properlyevaluated. The pressure is given by the gradient, with sep-aration, of the N–N potential; in fact, the maximum massof the star is essentially determined by the gradient in theN–N potential at distances less than 1 fm. The steeper thegradient in the potential, the stiffer the equation of state;at the same pressure, a lower density is required. The starcan be more distended and have a higher moment of in-ertia, a less centrally condensed structure, a lower densitythroughout the envelope, and a higher maximum mass thanfor a softer potential.

The interior structure can, in fact, be observationallystudied. The rotational properties of a neutron star dependon its equation of state through its moment of inertia. Ifthe magnetic field can be determined independently, thespin-down rate for pulsars gives an estimate of this quan-tity. The maximum mass is dependent on the N–N poten-tial. Various models provide substantially different results.While the softest equations of state give a lower limit of

about 0.7–1 M for the maximum mass, the stiffest give4–5 M.

IV. EQUATIONS OF STRUCTURE

In order to go from the equation of state to the structure ofa neutron star, it is essential that general relativity play arole. Electrons, being lower mass than neutrons, producedegenerate configurations that are large, with radii aboutthe size of the earth (about 0.01 R). Nucleons, however,are about 2000 times more massive, so that for degeneracyto be reached and for the N–N potential to play a part intheir support, the final configurations must be very muchmore compact. Neutron stars are so close to the stabilitylimit for a compact configuration and their gravitationalbinding energies are so close to its rest mass, that rela-tivistic rather than Newtonian equations are required todescribe their structure.

To begin with, the mass equation must take into ac-count the fact that the rest mass and the energy density,P , are both involved with the total density. The metric, forsimplicity, can be assumed to be that of an isolated, non-rotating (or slowly rotating) point mass outside of the star,which has an increasing mass with the distance from thecenter of the star (the so-called Schwarzschild interior so-lution). Take the mass to be scaled in terms of geometricalunits, so that G = c = 1. Then the equation for the massinterior to some point is given by integration on sphericalshells:

dm

dr= 4πr2ρ. (6)

The pressure gradient is given by the reaction of the matterto compression, which depends on the equation of state.Thus,

dP

dr= − (ρ + P)(m + 4πPr3)

r2(1 − 2m/r ). (7)

The gravitational potential, actually the term in the metricthat provides the redshift, is given by

dr− 1

ρ + P

dP

dr. (8)

This last term is the closest link to the actual metric, sinceds2 = −e2φdt2 + 1/[1 − (2m/r )]dr2. For a point mass, thevacuum field has a singularity at r = 2m, which in physicalunits is r∗ = 2 GM/c2, the Schwarzschild radius.

For a star with a surface gravity of 1014 cm sec−2, forinstance, the gravitational redshift is about 0.2. This red-shift reduces the intensity of the radiation seen by a distantobserver in a given energy band by a factor of (1 + z)3 andaffects the temperature determined from the flux. Thus, inall of the calculations of neutron star properties, it should

Page 116: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

424 Neutron Stars

be borne in mind that the observational quantities are thosedetected at infinity, that is by a distant observer, but thatthe interior values are calculated in the rest frame of theneutron star.

These equations, originally based on Tolman’s interiorsolution for the metric of a relativistic nonrotating mass,form the basis of the Oppenheimer–Volkov solution. Asone moves progressively farther from the stellar core, theincreasing interior mass causes an increasing curvatureof space time, which in turn affects the dependence ofthe pressure on the distance. The gravitational mass,the amount of mass that is felt by the overlying matteroutside of a radius r , increases more slowly than wouldbe expected in the Newtonian model, and the pressuretherefore changes appropriately.

The equation of state is inserted into these equations,for which the baryon density n is the independent “coor-dinate,” with the initial conditions that the density at thecenter, ρc, is a free variable, with M(0) = 0. The mass ofthe configuration is determined by the boundary conditionthat ρ(R) = 0 and P(R) = 0, while M(R) = M . Most ofthe mass of the star is taken up in binding energy, which be-comes progressively larger as the mass is increased. Suchbehavior is already understood from white dwarf mod-els, since degenerate equations of state behave similarly.As mass is added to the star, the star shrinks. The limit-ing mass is when the radius becomes small enough for ablack hole to develop, past which no stable configurationis possible. This is why neutron stars lead necessarily tocollapse if they are made massive enough. There is no sta-ble state between them and catastrophe should they beginto collapse.

The maximum mass is a product of general relativ-ity. While for a Newtonian star an infinite central den-sity is asymptotically approached leading to a progres-sively smaller star (which becomes asymptotically small),there is a lower limit to the size of a relativistic object, theSchwarzschild radius. Degenerate matter produces an ap-proximate mass–radius relation MR3 = constant, so thatfor R = r∗ there is a fixed mass. Above this mass, a blackhole is the inevitable result.

The upper mass limit for neutron stars can be set by theobservation that, for at least the X-ray pulsars, the mass ofthe degenerate object must exceed that of a white dwarfand is always lower than 4 or 5 M. For Cyg X-1 andLMC X-3, the masses appear inconsistent with this valueand likely represent cases of black holes as the sources ofthe X-ray emission.

V. PULSARS

The first observations of radio pulsars served to confirmthe existence of neutron stars—over 400 are now known

(Table I). The detailed modeling of these objects has cen-tered on the mechanisms that are capable of producingthe emission. It is clear that they are rapidly rotating col-lapsed stars, with periods as short as a few milliseconds.In order to discuss how observations of pulsars help to elu-cidate the interior properties of neutron stars, it is usefulto look at some of the observational constraints they canprovide.

First, the fact that they shine means that pulsars pos-sess strong magnetic fields; that they pulse is the result oftheir rotation and the fact that the magnetic field is not ax-isymmetric. A rotating, magnetized neutron star loses ro-tational energy by emission of magnetic dipole radiation:

dErot

dt= 2µ2

3ω4 sin2 β, (9)

where µ is the magnetic moment (µ = B0 R30, R0 is the

stellar radius, and B0 is the surface magnetic field) and β isthe angle between the magnetic moment and the rotationalaxis. The change in the rotation period, Pnot, is given by

Prot P rot ∼ B2 R60

I, (10)

where I is the moment of inertia. Secular variations of themagnetic field, such as decay due to the finite conductivityof the interior matter, produce a change in the decelerationrate. For instance, if the field is exponentially decreasingwith time,

ω ∼ B20 e−2t/τω3, (11)

where τ is the decay time for the field. Should the align-ment angle of the field also depend on time, the pulseprofile and the rate of spindown will also change. Unfor-tunately, the time scales implied by the models are verylong, and while these provide some probes of the electro-magnetic properties of neutron star interiors, they are notwell understood presently.

In a few other ways, pulsars provide us with windowsinto the interiors of neutron stars, most dramatically in theform of glitches. These will be discussed later in moredetail. Glitches are discontinuous changes in the rate ofrotation of the star, by ω/ω ≈ 10−7–10−6 superimposedon the secular spindown of isolated pulsars. In addition,however, there is timing noise, which is a fluctuation inthe rotation rate of the star manifested by changes in thearrival time of pulses not connected with the properties ofthe interstellar medium. This may be caused by small ther-mal effects, within the stellar superfluid, which producerandom walking between pinning sites of vortices.

Pulsar emission may also reveal properties of neu-tron star crust, although now only within the context ofspecific models. The Ruderman–Sutherland model usesthe fact that as a magnetic star rotates, it generates alatitude-dependent electric field in the surrounding space.

Page 117: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

Neutron Stars 425

TABLE I Radio Pulsar Mass Summary

Median 68%Star mass (M) Central limits 95% Central limits Notesa

Double neutron star binaries

J1518 + 4904:

Pulsar 1.56 +0.13/−0.44 +0.20/−1.20 GR, RO

Companion 1.05 +0.45/−0.11 +1.21/−0.14 GR, RO

Average 1.31 ±0.035 ±0.07 GR

B1534 + 12:

Pulsar 1.339 ±0.003 ±0.006 GR

Companion 1.339 ±0.003 ±0.006 GR

B1913 + 16:

Pulsar 1.4411 ±0.00035 ±0.0007 GR

Companion 1.3874 ±0.00035 ±0.0007 GR

B2127 + 11C:

Pulsar 1.349 ±0.040 ±0.080 GR

Companion 1.363 ±0.040 ±0.080 GR

Average 1.3561 ±0.0003 ±0.0006 GR

B2303 + 46:

Pulsar 1.30 +0.13/−0.46 +0.18/−1.08 GR, RO

Companion 1.34 +0.47/−0.13 +1.08/−0.15 GR, RO

Average 1.32 ±0.025 ±0.05 GR

Neutron star/white dwarf binaries

J0437 − 4715 — — <1.51 x , Pbm2

J1012 + 5307 1.7 ±0.5 ±1.0 Opt

J1045 − 4509 — — <1.48 Pbm2

J1713 + 0747 1.45 ±0.31 ±0.62 Pbm2, GR

1.34 ±0.20 ±0.40 Pbm2, GR, Opt

B1802 − 07 — <1.39 <1.45 GR

1.26 +0.08/−0.17 +0.15/−0.67 GR, Pbm2

J1804 − 2718 — — <1.73 Pbm2

B1855 + 09 1.41 ±0.10 ±0.20 GR

J2019 + 2425 — — <1.68 Pbm2

Neutron star/main—sequence binaries

J0045 − 7319 1.58 ±0.34 ±0.68 Opt, MS

aAssumptions made in mass estimate: (GR) general relativistic binary model; (RO) random orbital ori-entation, or inclination angle uniform in cos i ; (Pbm2) core mass orbital period relation; (x) propermotion-induced change in the projected semimajor axis; (Opt) optical companion observations; (MS) main-sequence stellar model.

Crust electrons experience an acceleration, initially flow-ing freely away from the pulsar in the magnetic polar cone,but being trapped closer to the magnetic equator. The pul-sar environment fills with charge, which alters the electricfield at the stellar surface. Electrons are being ripped outof the crust faster than they can be resupplied from chargemigration within the star so that a potential drop builds upover the polar cap. When the potential drop at the poles ex-ceeds that of the rest mass energy for pair creation, 2mec2,

positron–electron pairs, which produce cascades at the po-lar caps, are created. These, in turn, are responsible for theobserved emission. The emission is confined to the polarregion, and it is therefore the obliquity of the field that isresponsible for the visibility of the pulsar.

There is no evidence for any thermal emission fromany known pulsar that is not in a mass-accreting binarysystem. The neutron star cooling is sufficiently fast anduniform over the stellar surface that there is no evidence

Page 118: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

426 Neutron Stars

for pulsed thermal X-rays. Instead, any X-ray emissioncan be attributed either to small synchrotron nebulas inthe vicinity of the neutron star or the nonthermal high-energy emission from the polar cascade.

The spindown rate of a pulsar, all other things beingequal, is a measure of the age of the star. Provided themagnetic field does not change with time, the age can beestimated from II/II. Usually, this gives ages of about 103–106 years. There is one class of pulsar, however, for whichthis estimate is completely unreliable, the milli-secondpulsars. For these, the rate of spindown is extremely long,over 108 years, although their periods are incredibly small.The answer is to be found, in part, from the very low valuesof their magnetic fields and in part is due to their rotationbeing the artifact of their histories. These neutron starshave been, or still are, members of close binary systemsin which accretion has torqued the neutron star up to ro-tational frequencies near the limit of stability. We shallreturn to this point in Section VIII.

VI. GLITCHES: EVIDENCE FORSUPERFLUIDS IN NEUTRONSTAR INTERIORS

At densities in excess of the neutron drip density, the freenucleons feel an attractive potential. This favors the for-mation of pairs, which behave like bosons and form asuperfluid. Pairing is favored at densities between 1011

and 1014 g cm−3. This fluid feels no viscosity, except bycollisions with the normal fluid, which serves to drag onthe particles and couple them to the rotation of the star asa whole. A rotating superfluid quantizes vorticity. Thesevortices have the property of pinning to irregularities in thecrustal substructure by threading through nuclei. Shouldthere be sufficient thermal agitation of the vorticies, theywill creep from one nuclear site to another.

The surface is continually slowing down by the emis-sion of low-frequency (the rotation frequency of the star)electromagnetic waves causing the angular velocity gra-dient to become large across the superfluid mantle. Thisin turn generates a shear which will produce a drift of thepinning sites. Major depinning events result in glitches;the glitching rate depends on the temperature of the nor-mal fluid and serves as a probe of the internal tempera-ture of the neutron star mantle. Postglitch relaxation ofthe rotational frequency depends critically on scatteringproperties in the superfluid for the phonons generated bythe event. In effect, the star rings for a while as it settlesdown to a new rotationally stable state. The detailed be-havior of this system has yet to be completely understood,although phenomenological models have been successfulin predicting that the glitch behavior should be a propertyof only young, that is, still relatively hot, neutron stars.

The reader might think this contradicts the fact that theequation of state for the system is degenerate. However,that does not mean that there are no random motions for theparticles; it simply implies that the equation of state doesnot have a temperature dependence. In addition, neutronsin the superfluid are no longer constrained by Fermi statis-tics, and so have a different behavior than the electrons orfree nucleons.

VII. COOLING PROCESSES

Neutron star matter is completely optically thick to pho-tons. Therefore, only the surfaces can cool by the emissionof such radiation. If this were the only mode for the de-cay in the temperature of such objects, one would expectthem to be easily detected by X-ray observations for quitea long time after their formation, of order 106 yr, since theirformation surface temperatures are in excess of 1010 K.Other processes than photon emission, however, are farmore important in the cooling of such objects, specifi-cally those associated with the emission of neutrinos. Forvery dense matter, the cross sections for the production ofelectron and muon neutrinos by collision processes amongnucleons are high. The URCA processes, in which neutri-nos are the sole emitted particles, have been shown to beimportant. Specifically, processes of the form

n + n → n + p + e + νe (12)

and

n + p + e → n + n + νe (13)

which are modified URCA processes in which the neu-tron bystander absorbs some of the momentum neces-sary for the emission, have been shown to be important.Bremsstrahlung processes, in which the exit channels areessentially the same as the incoming particles with the ad-dition of the radiation of the excess momentum acquiredduring the collision, can be of the form:

n + p → n + p + νe + νe (14)

n + n → n + n + νe + νe. (15)

The URCA processes may occur in the crust, where entirenuclei are present:

e + (Z , A) → e + (Z , A) + νe + νe. (16)

One of the earliest suggested processes, by Bahcall andWolf, is

π− + n → n + e + νe (17)

a weak interaction, like the nn or pn interactions, which forhigh enough interaction energy is replaced by the (µ, νµ)

Page 119: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

Neutron Stars 427

pair. While the rates for π− and URCA processes havemodest temperature dependence of T 6, the nn and npbremsstrahlung processes are more temperature sensitive,varying like T 8.

Superfluids play a role in altering the rates of thebremsstrahlung processes. This is because of pairing ofthe baryons, which introduces a gap energy for the freeparticles so that the rates are suppressed by a factor ofexp(− j/kT ) where j is the BCS gap for either pro-tons or neutrons ( j = p, n, respectively). Since this gapenergy depends on the density, it alters the predicted ratesof cooling in a way that can probe the interior structure ofthe star.

It must be kept in mind that, since neutron stars are rel-ativistic objects, the gravitational metric plays a role in thecooling. Neutrinos and photons escaping from the interiorsuffer a redshift simply because of the gravitational field.This reduces their luminosity at the stellar surface andaffects the cooling rate. Their observed surface tempera-ture and luminosity are related to their “actual” values byTse−φ and Lse2φ , respectively. Because the structure of thestar is virtually independent of temperature, however, theredshift can be determined from the interior model andthen included in the calculation of the luminosity after thefact.

No young pulsars, which are isolated neutron stars orbinary neutron stars that are not accreting matter fromcompanions, show observable thermal X-ray emission, in-cluding the Crab pulsar. The cooling time estimates fromall current calculations indicate that the stars should bedetectable for at least 103–104 years, yet the Crab pul-sar is less than 1000-year old and is still unobservable insoft X-rays; its inferred surface temperature is less than2 × 106 K. This is at odds with most of the current theoriesfor neutrino formation, and the properties of the coolingfunction for neutron stars serve as a useful test of ideasin neutrino and intermediate energy nuclear physics. Onepossible explanation is that rapid cooling takes place viacharged pion condensation because of the rapid increasein the neutrino emission. This occurs at densities of order3 × 1014 g cm−3.

Recent calculations have included the effects of an at-mosphere, although the calculations of the opacities at theprobable densities and temperatures associated with neu-tron stars in the cooling phase are difficult to obtain. Thegravities are of order 1016 cm2 sec−1, in comparison withthose of a normal star (about 104) and a white dwarf (108).These show, however, that most of the emission is typicalof a blackbody, and that flux redistribution because of theopacity edges at ionization limits alters the shape of theflux profile but cannot explain the lack of detection of ther-mal emission from young pulsars. Magnetic fields can actto alter the cooling rates as well, but this tends to increasethe visibility of the star by suppressing many of the

cooling and conduction processes. The apparently rapidcooling associated with neutron stars remains a seriousproblem.

VIII. NEUTRON STARS INBINARY SYSTEMS

The number of pulsars known to be in binary systems issteadily growing, including several of the most rapidly ro-tating objects known—the millisecond pulsars. The pres-ence of neutron stars in binary systems was signaled bythe discovery of Cen X-3 and Her X-1, both of whichdisplay rapid X-ray pulsation. Accreting material from acompanion, which is losing mass because of tidal inter-actions with the neutron star, is funneled into a column atthe magnetic poles. The accretion columns are tied to thestellar surface and corotate with the star. The subsequentdiscoveries of the gamma burst sources, the rapid burster,and the X-ray burst sources have added to the picture ofbinaries.

If the neutron star does not possess a strong magneticfield, matter accreted from a companion will pile up onthe surface until it undergoes nuclear reactions. Since theseoccur with underlying degenerate matter, the heat conduc-tivity away from the site is very poor. The consequence isa thermonuclear runaway, resulting in the matter blowingoff the surface and producing a characteristic burst profile.The maximum luminosity of the burst is at the 1Eddingtonlimit, the luminosity at which the radiative acceleration ex-ceeds that of gravity. The burst signature is a very rapid riseto maximum light (of order 10 msec) and a slower (<1 sec)decay time. Many of the burst sources have been observedto flare on numerous occasions. Some show quasiperiodicvariations in the X-ray, as seen by EXOSAT and RXTE.These presumably come from accretion onto the magne-tosphere of rapidly rotating neutron stars, with the devel-opment of a two-stream and Rayleigh–Taylor instabilityat the boundary layer. This results in the occasional ex-citation of X-ray variations as the matter rains onto thestellar surface, mediated by the magnetic field.

Since neutron star formation requires a supernova event,one might think it unlikely that the binary could survivesuch a cataclysm. Yet clearly many have. This leads to thesuggestion that the amount of mass lost from the systemmust be small compared with the masses of the stars re-maining, and that the events must systematically be liketype I SN events. This implies that the collapsed star hadbegun as either a white dwarf or helium star that was in-duced to collapse, perhaps by the excess accretion of mat-ter. No such events have yet been observed, so the modelremains untested except by statistics on the binaries. Manyare circular orbits (a few notable exceptions exist, such asCir X-1), and this requires efficient tidal dissipation on

Page 120: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

428 Neutron Stars

time scales of 106 years to circularize the orbit. The bi-nary X-ray sources have been found in globular clusters,which also contain burst sources, so that the formation ofneutron stars is clearly not limited only to the young pop-ulation of the galaxy. The sources observed for globularclusters may be the result of tidal capture of the neutronstar.

Finally, X-ray novae and burst sources are analogs ofclassical novae, except the mass gainer is a neutron star.Matter accumulates from a tidally distorted close compan-ion and ignites when the pressure reaches the flash pointfor hydrogen burning, reaching temperatures of 109 K orhigher. Little mass is required, about 10−11 M, for lessthan the accretion necessary to bigger an explosion ona white dwarf because of the enormous gravity. Nuclearprocessing during the burst can bypass iron, extending asfar as M or beyond.

IX. MAGNETIC FIELD GENERATIONAND DECAY

While there are some strongly magnetic main sequencestars with fields of several kilogauss, most normal starsshow upper limits of about 100 G. If the flux of such astar is somehow conserved in the collapse process, theresultant field would scale as R−2

∗ , so that the expectedfield of a neutron star should be of order 1012 G. Whilethe details of the magnetic field freezing process have yetto be understood, the fact that this is the right order ofmagnitude for many neutron star fields is very interestingand indicates that some form of flux freezing must play arole in the stars.

Direct detection of the fields is accomplished throughcyclotron emission by electrons in accreting systems. Themeasured fields, ∼1012 G agree with the strengths inferredfor isolated radio pulsars.

A central problem for neutron star studies is the gen-eration of the wide range observed for pulsar magneticfields. The millisecond pulsars are all objects that haveintrinsically low fields, of order 109 G or so, and havebeen spun up by accretion during their earlier histories.It is important to note that the subsequent decay of therotation is dependent on the magnetic field strength andconfiguration, so that the weaker the field, the longer thetime over which the star will be a rapid rotator.

A model that generates strong magnetic fields after thecollapse uses the currents generated in the crust due tothe thermal gradients present immediately after the su-pernova event and while the neutron star is cooling. It isassumed that a small magnetic field (by pulsar standards)

may be present initially after the collapse. Since in a solid,heat is transported by electrons moving within the crystallattice, a crustal temperature gradient produces a net elec-tron current, the Nernst effect. The field saturates when theohmic dissipation overwhelms the amplifications at aboutthe value observed for pulsars, about 1012 G. The fieldsreach peak strength within the crust, not at the surface,where they become essentially quantum limited, at about1014 G. The problem is that this is a local, not global, ef-fect, and it is not clear how the field manages to organizethe large-scale dipolar structure thus far observed for thesesystems. Further, there is the problem of the obliquity. Noobvious alternative models have yet been proposed. Whatone knows is that many neutron stars do possess strongfields, the weakest of which almost coincide in strengthwith those of the strongest white dwarf stars, and thatmany of these fields are present even in stars that havebeen around for a while. The decay times for the fieldsappear to be long, but here again there are not sufficientdata to judge the matter.

The fields of many neutron stars are highly inclinedto the rotation axis. Some of this is due to the discoveryprocedure—one looks for periodically variable emissionlines, pulsed X-ray continuum, or pulsed radio emission,so one systematically discovers the highly oblique fields.Highly oblique fields are known to occur in main sequencestars and in white dwarfs and have even recently beendiscovered for planets (e.g., Uranus), so they may not bea special feature of magnetic field formation in neutronstars. It remains, however, an unexplained phenomenonfor pulsars.

Several important departures from normal matter resultfrom the strength of the surface magnetic fields. One is thatthe electrons in the degenerate gas at the surface quantizein their orbits about the magnetic field lines. The resultis that each of the particles, depending on its momentum(and recall that since there is complete degeneracy in theelectron gas this ensures that there will be only one particleper magnetic level, or Landau orbital), occupies an energylevel. Thus, the electron gas has a strongly polarized re-sponse and also highly anisotropic properties. There areseveral important consequences of this. For one, heat con-ductivity is strongly direction dependent. Even with thegas being degenerate, there will still be far more efficientheat transfer parallel than perpendicular to the magneticfield lines. This in turn affects the cooling of the neutronstar and alters the details of the temperature gradient. Inaddition, there is also a strong perturbation felt by the elec-trons still tied to the ions in the crust. This perturbationcauses the electron clouds to align with the magnetic field,producing one-dimensional crystals as a dominant crustalstructure. There is also an effect on the coupling of the

Page 121: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GPA/GRD P2: GLM/GUE QC: FYD Final Pages

Encyclopedia of Physical Science and Technology EN010E-479 July 16, 2001 15:43

Neutron Stars 429

crust and core via the current represented by the electrongas in the mantle of the neutron star.

X. FUTURE PROSPECTS

Several major problems clearly remain, some of whichcut to the core of our understanding of the formationand subsequent evolution of neutron stars. No extra-galactic supernovas have yet been observed to producepulsars, and, with the exception of the Crab and Velapulsars and RCW 103, supernova remnants do not ap-pear to have associated pulsars. Additionally, no pointX-ray sources are associated with Cas A (which ex-ploded in the 1670s as determined by the dynamicsof the ejecta and the γ -ray detection of 44Ti decay byCOMPTEL) or the remnants of SN 1604, SN 1572, orSN 1006.

Supernova models have still to explain the cutoff be-tween stars that will yield stable final cinders and thosethat continue to collapse to form black holes. The ef-fects of mass loss in the progenitor structure are stillunclear.

The origin and evolution of pulsar magnetic fields,while qualitatively explained by several models, haveyet to produce a reason for the large obliquities in-ferred for neutron star magnetic fields. The interaction ofthese oblique fields with accreting matter in close, mass-exchanging binary systems is still poorly understood. Inparticular, much work is still required to pull the observa-tions of the different regions of the accretion disk together,a task that involves correlating observations at X-ray, ul-traviolet, optical, and radio wavelengths. The discoveryof quasi-periodic X-ray sources has begun the probe ofneutron star magnetospheric structures, but much is leftto do.

The details of the internal properties of neutron stars,especially the equation of state for the core, the solidifi-cation of nuclear matter, whether or not pion condensatesor other exotic forms of matter occur, and the effects ofsuperstrong magnetic fields on the equation of state, areamong the important problems that will need to be under-stood before a complete model for neutron stars can bedetermined.

At this juncture, more than 20 years after the discoveryof pulsars, one can at least say that neutron stars exist. Theyare likely to remain the best available physical laboratory

for the study of most extreme stable states of matter forsome time to come.

SEE ALSO THE FOLLOWING ARTICLES

BINARY STARS • MAGNETIC FIELDS IN ASTROPHYSICS •NEUTRINOS • NUCLEAR PHYSICS • PULSARS • RELATIV-ITY, GENERAL • STELLAR STRUCTURE AND EVOLUTION •SUPERNOVAE

BIBLIOGRAPHY

Alpar, M. A., Buccheri, R., Ogelman, H., and van Paradijs, J. (eds.)(1998). “Many Faces of Neutron Stars,” Kluwer, Dordrecht, TheNetherlands.

Alpar, M. A., Kiziloglu, U., and van Paradijs, J. (eds.) (1995). “TheLives of Neutron Stars: Proc. NATO ASI,” Kluwer, Dordrecht, TheNetherlands.

Arnett, W. D. (1997). “Supernovae and Nucleosynthesis,” PrincetonUniv. Press, Princeton, NJ.

Bethe, H. A. (1971). Annu. Rev. Nuclear Sci. 21, 93.Blandford, R. D., Applegate, J. H., and Hernquist, L. (1983). Thermal

origin of neutron star magnetic fields, M.N.R.A.S. 204, 1025.Drechsel, H., Kondo, Y., and Rahe, J. (eds.) (1987). “Cataclysmic Vari-

ables: Recent Multi-frequency Observations and Theoretical Devel-opments, Reidel, Dordrecht, Netherlands.

Glendenning, N. K. (1997). “Compact Stars: Nuclear Physics, ParticlePhysics, and General Relativity, Springer-Verlag, Berlin.

Joss, P. C., and Rappaport, S. A. (1984). Neutron stars in interactingbinary systems, Annu. Rev. Astron. Astrophys. 22, 537.

Lyne, A., and Graham-Smith, F. (1998). “Pulsar Astronomy,” 2nd Ed.,Cambridge Univ. Press, Cambridge.

Manchester, R. N., and Taylor, J. H. (1977). “Pulsars,” Freeman, SanFrancisco, CA.

Meszaros, P. (1992). “High-Energy Radiation from Magnetized NeutronStars,” Univ. of Chicago Press, Chicago.

Michel, F. C. (1991). “Theory of Neutron Star Magnetospheres,” Univ.of Chicago Press, Chicago.

Oppenheimer, J. R., and Volkoff, G. M. (1939). On massive neutroncores, Phys. Rev. 55, 374.

Pethick, C. J., and Ravenhall, D. G. (1995). Ann. Rev. Nucl. Part. Sci.45, 429.

Prakash, M., Bonibaci, I., Prakash, M. et al. (1997). Phys. Rep. 280, 1.Shapiro, S., and Teukolsky, S. (1983). “Black Holes, White Dwarfs

and Neutron Stars: The Physics of Compact Objects,” Wiley (Inter-science), New York.

Thorsett, S. E., and Chakrabarty, D. (1999). “Neutron stars mass mea-surements. I, Radio pulsars,” Apj, 512, 288.

Tsuruta, S. (1998). “Thermal properties and detectability of neutron stars,II. Thermal evolution of rotation-powered neutron stars,” Phys. Rep.292, 1.

Page 122: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

PulsarsF. Curtis MichelRice University

I. IntroductionII. Distribution of PulsarsIII. Exceptional PulsarsIV. Where Do Pulsars Come From?V. Nature of Pulsars

VI. The PulsesVII. Theory of Pulsars

GLOSSARY

Dispersion measure Observational measure of the ex-tent to which radio pulses are delayed by propagationthrough the interstellar medium; used to estimate dis-tances to individual pulsars.

Drifter Pulsar that exhibits drifting subpulses.Drifting subpulse Subpulse that moves systematically

across the integrated pulse profile.Integrated pulse profile Apparent pulse shape (intensity

versus time) of a pulsar as determined by averagingtogether a large number of otherwise weak and noisyindividual pulses.

Neutron star A star more massive than ∼1.4 times themass of the sun cannot become a white dwarf, but in-stead must collapse (once it has exhausted its supply ofhydrogen) to what is in effect a single atomic nucleus∼10 km in diameter.

Nulling A number of consecutive pulses may be missingin any long train of pulses. A few pulsars exhibit this“nulling.”

Subpulse Persistent feature within the integrated pulseprofile, often noticeable in single pulses.

Supernova Explosion of a star that temporarily producesa “star” as bright as an entire galaxy (∼1011 ordinarystars). Left behind are an expanding shell of gas and aremnant collapsed object often observed as a pulsar.

White dwarf Star that has exhausted its internal sourcesof energy and has therefore shrunk to a dense spherecomparable in size to the earth.

ONE OF THE very important discoveries in astronomyof modern times has been that of the radio pulsars: stellarobjects that emit high-intensity pulses of coherent radioemission. These astronomical “clocks” can be extraordi-narily stable and provide important probes of interstellarspace besides being enigmatic objects themselves.

I. INTRODUCTION

In astronomy, pulsar denotes an object emitting sharp,rapid pulses of radio emission with clocklike periodicityof about 1 sec. The discovery of pulsars by Antony Hewishand Jocelyn Bell, announced in 1968, earned both of them

267

Page 123: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

268 Pulsars

a permanent place in the history of astronomy and Hewishwon a Nobel prize.

Pulsars are now believed to be neutron stars, stars ofmass comparable to that of the sun but collapsed to asphere of only ∼10 km in radius (i.e., nearly 100,000times smaller than our own sun). These neutron stars arethought to be intensely magnetized and in rapid rotation.The rotation serves two roles. First, the rotation of thehighly conducting neutron through its own magnetic fieldinduces strong electric fields, which pull charged particlesfrom the surface; second, this magnetic field is seen fromdifferent aspects by a distant observer on the earth as theneutron star rotates. Theories generally assume that theradio emission is concentrated into a beam, with electronsbeing accelerated out of the magnetic polar “caps” andemitting radio waves, which sweep the sky like a light-house beam as the star rotates.

A second type of rapidly pulsating object was later dis-covered to emit X-rays in certain binary systems. Theseobjects, called pulsating X-ray sources, are also believedto be rotating magnetized neutron stars, but in this casethe neutron star orbits a companion star that is transfer-ring mass from its outer atmosphere onto the neutron star.This mass influx is diverted by the magnetic field and fallsprimarily on the poles, which are heated to millions ofdegrees by this bombardment and emit X-rays. Again, ro-tation of the neutron star modulates the X-rays seen onearth.

As a class, the two are distinct both observationally(X-ray emission versus radio emission) and theoretically(accretion of mass onto a neutron star versus ejection ofcharged particles). However, a few pulsars emit both ra-dio waves and X-rays, notably, the so-called Crab pulsarlocated in the center of the Crab nebula.

II. DISTRIBUTION OF PULSARS

Radio pulsars are dispersed among the ordinary stars inthe galaxy, and 99% of them are single objects not in bi-nary systems, again unlike the pulsating X-ray sources andalso unlike ordinary stars, half of which are binary. Nor arethey visible as stars even to the best telescopes. Virtuallyall the detectable energy output is in radio waves, with aspectrum of emission that declines rapidly with increasingfrequency. This decline would in itself make radio pulsarstoo feeble to be seen at the much higher frequencies ofthe visible spectrum. Most of the stars in the galaxy areconcentrated in the disk of the galaxy (the Milky Way),and pulsars are markedly concentrated in the same way.The distances to pulsars can be estimated from their ob-served dispersion measure. Because radio waves interactwith the very few electrons in outer space (only ∼30 per

liter!), the lower frequency waves in the spectrum travelmore slowly, which causes a sharp pulse to be smearedout; the same thing happens to the sharp pulse createdby a lightning stroke, giving the Whistler phenomenonwhen the waves are detected at large distances. The dis-persion measure corresponds to the amount of correctionfor delay necessary to reconstruct a sharp pulse, whichalso determines the integrated electron density along thepath traveled. The above estimate for the electron den-sity then gives an idea of the distance, the closest being∼80 parsecs (pc) away and some are detected as far awayas 55,000 pc.

Most pulsars do not have proper names but are namedaccording to their place in the sky. Thus, the Crab pul-sar is also known as PSR 0531 + 21, meaning a pulsar(PSR) found at a right ascension of 5 hr and 31 min and adeclination of 21 north (plus).

Although ∼700 pulsars have been discovered, thefastest ones are among the most unusual, and it is instruc-tive to examine the properties of the latter. Owing to theprecession of the equinoxes, fixed objects in the sky driftto slightly new positions steadily over time. Astronomershave to periodically update these coordinates (whichcan simply be typed into the controls to point a moderntelescope at that spot). Although the pulsar “coordinatenames” were never meant to be the way of actuallylocating them, radio astronomers nevertheless changedfrom the 1950 coordinate system to the 2000 one a fewyears ago, which renumbered all the “names.” The newlabels are preceded by “J,” while the old ones are assigned“B.” All the names here are the original discovery names(“B”), unless explicitly noted.

III. EXCEPTIONAL PULSARS

The fastest known pulsar is called the Millisecond pulsar,or PSR 1937 + 214 (here the extra digit refines the decli-nation to 21.4 north), and has a period of only 1.558 msec,which is quite close to the maximum rotation rate that aneutron star could theoretically have without flying apartowing to the centrifugal forces exceeding gravity. The sur-face velocity for a typical neutron star according to the-ory would then be 4 × 109 cm/sec, or 13% the velocityof light. Although extremely rapid, the radio emissionsfrom this pulsar are very similar to those of most otherpulsars.

The pulse shape of this pulsar is shown in Fig. 1, show-ing intensity (I ) and degree of linear polarization (L). Al-though most of the energy is in a concentrated spike, thereis a distinctive notch in that spike, and also an interpulseis seen halfway between successive main pulses (one fullcycle of emission is shown). Indeed, as a class, pulsars

Page 124: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

Pulsars 269

FIGURE 1 Pulse shape of PSR 1937 + 214. Although this pulsaris currently the fastest known, the pulse shape is not particularlyremarkable. Total intensity is I and linearly polarized intensity is L,showing that the main pulse (right) is about 33% polarized nearthe maximum. The width of the main pulse is about 10, whichis also fairly typical, and interpulses (left), although not exhibitedby most pulsars, are not uncommon. The spacing of interpulsesvery nearly halfway between successive main pulses is often inter-preted as the observer seeing both magnetic poles of the pulsar,first one and then the other as the neutron star rotates. The posi-tion angle is the axis of polarization projected on the sky (usuaullymeasured from north), and here we see that the two pulseshave just about orthogonal polarization. A “swing” in polarizationcorresponds to a rapid change in position angle with longitude(not seen here). [Reprinted by permission from Stinebring, D. R.,Boriakoff, V., Cordes, J. M., Deich, W., and Wolszczan, A. (1984).“Birth and Evolution of Neutron Stars: Issues Raised by Millisec-ond Pulsars” (S. P. Reynolds, and D. R. Stinebring, eds.), p. 35,NRAO, Green Bank, WV.]

are similar to fingerprints; each is readily identifiable assuch but is nevertheless distinct on close examination.Most, for example, do not have the interpulse, and manywould have multiple “components” if we were to viewthe notch as separating two distinct components, etc. Thispulsar is a particularly accurate clock, and the pulse periodhas been determined to high precision: 1.5578064488737(±6) msec. Like virtually all other pulsars, it is slowingdown, but only very slowly (about 10−19 sec/sec).

A second millisecond pulsar, PSR 1957 + 20, has al-most the same period (1.607 msec) and has the extraordi-nary property of being in a binary system and eclipsing itscompanion. The companion is heated by the pulsar radi-ation and is seen as a variable star at the orbital period of9.2 hr. For a likely pulsar mass of around 1.4 solar masses,the mass of the companion turns out to be only 0.022 solarmasses, the lightest known star. The eclipses cannot be en-tirely due to the companion moving in front of the pulsarbecause they are too large, corresponding to distances atwhich matter would not be gravitationally bound to thecompanion. Many think that the companion is essentiallya comet evaporating in this system (but the eclipses arequite symmetric, which is a problem). Another idea isthat plasma in a magnetosphere about the companion isthe occulting agent. We will discuss where the estimatefor the pulsar mass comes from in Section IV.

An even “slower” pulsar, PSR 1821−24 at 3.054 msec(about 20,000 rpm!), is one of an interesting new class thatis found near the centers of globular clusters, M28 in thiscase (the designation is from a catalog compiled by thecomet hunter Messier, who was frustrated by the variousother fuzzy objects in the sky). This pulsar is not in abinary system, but the field pulsar PSR 1855 + 09 (period5.362 msec) is, with a period of 12 days. A “field” objectis one that can pop up anywhere, as opposed to objectsassociated with clusters, etc. The companion is unseen,and its properties are largely unknown, other than that themass probably exceeds 0.2 solar masses. PSR 1953 + 29,which was discovered shortly after the millisecond pulsarwith a period of 6.133 msec, is also a field pulsar but alsoin a binary system, here with a period of 117 days (aboutthat of Mercury about the sun) and a similarly unseencompanion of approximately the same minimum mass.PSR 1620−26 is an 11.076 “millisecond” 191-day binarypulsar in the globular cluster M4.

This list of millisecond pulsars is growing continually,especially for pulsars in globular clusters. Once one pul-sar is discovered, the dispersion measure to the globularcluster is known. These distant pulsars are too faint to givedirectly detectable pulses, and the data must be processedwith a very large number of trial values for period anddispersion measure until a pulse can be found. Knowingthe dispersion measure, therefore, greatly reduces the ef-fort to search for additional pulsars in the globular cluster.For example, the globular cluster M15 is now known toharbor at least five pulsars designated PSR 2127 + 11A,B, C, D, and E in order of discovery, with periods of 110,56, 30, 4.80, and 4.65 msec, respectively, one of which isthought to be binary and one of which (the slowest) hasan increasing period, possibly because it is in a very widebinary system. Such systems may require years of obser-vation to determine the orbital properties. Although the

Page 125: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

270 Pulsars

periods are also in the order of discovery, this seems to bea coincidence.

Until 1982, the fastest known pulsar was the Crabpulsar (PSR 0532 + 21, now known as J0534 + 2200), at33.1 msec, which sits centered on what is probably theremnant of a historical supernova observed by the Chinesein 1054 AD. This remnant, the Crab nebula, is expandingat a measurable rate consistent with such a birthdate.The pulsar is remarkable in that it is also a source ofvisible light, which is pulsed at the same 33-msec periodas the radio, too fast for the eye to follow! As withtypical pulsars the radio emission declines rapidly withfrequency, and some researchers believe a separate mech-anism may cause the pulsar to become visible again atvisible frequencies. [Roughly speaking, radio frequenciesare of the order 109 cycles per second (hertz), whereasvisible light is ∼1015 Hz.] Moreover, the high-frequencypart of the spectrum extends into X-ray and γ -rayenergies (of the order of 1021 Hz), again pulsed. The Crabnebula itself has been discovered to be jumping about inresponse to the pulsar in the optical, and images can beviewed at http://oposite.stsci.edu/pubinfo/pr/96/22.html.New X-ray satellite images show similar structures:http://chandra.harvard.edu/photo/0052/index.html.

A similar pulsar is PSR 0540 − 693, with a period of50 msec and also surrounded by a nebula, resembling quiteclosely the Crab pulsar except for the distance. This pulsaris in the Large Magellanic Cloud (LMC) some 55 kpcaway. Like the Crab pulsar, it emits visible pulses.

A very important pulsar, altrough slightly slower at59.0 msec, is again one in a binary system and is oftencalled the Hulse–Taylor binary pulsar (PSR 1913 + 16)after its discoveres. (About 20% of the known pulsarshave right ascensions of 19 hr, which is largely a selectioneffect owing to the fact that the look direction of the giantfixed radio telescope at Arecibo rotates across the MilkyWay at this location.) This pulsar was the first binary pul-sar to be discovered and is possibly the most important.The orbital period is only 7 hr and 45 min, and generalrelativistic effects are important. This binary system is thefirst single-line binary (i.e., only the pulsar is detected)for which all of the orbital elements have been deduced(by means of the general relativity theory), including ob-servation of the advance of perihelion (the same effectexplained for Mercury) at a rate entirely consistent withtheory. More importantly, the binary pair are spiraling to-gether at a rate consistent with energy loss by gravitationalradiation. Unlike the 6.1-msec binary pulsar, the unseencompanion also has a mass of 1.4 solar masses, suggestingthat it may also be a neutron star.

Another fast pulsar discovered early on (PSR 0833 −45) is the Vela pulsar associated with the Vela supernovaremnant, with a period of 89.2 msec. Like the Crab pulsar,

it has high-frequency emissions. Unlike any other pulsar,however, it exhibits extremely large “glitches,” wherein itsperiod abruptly decreases by a small but readily detectableamount of about one part in a million. These events repeatat an irregular interval of ∼3 years and do not have exactlythe same behavior each time.

This listing of pulsars illustrates a number of importantobservational inferences, detailed in Section IV. Most ofthe next dozen or so pulsars having periods between 100and 200 msec tend to be isolated pulsars without strikingor unusual properties.

IV. WHERE DO PULSARS COME FROM?

The Crab pulsar and surrounding supernova envelopestrongly support theoretical arguments that neutron starsform in supernova events. Numerical simulations indicatethat a very massive star (10 times that of the sun, say)grows a core as it evolves until it essentially has a densewhite dwarf at its center surrounded by an “atmosphere,”which is the rest of the (giant) star. The core providesenergy to the star by accreting more atmosphere, but thehuge luminosity of such stars (on order of 10,000 timesthat of the sun) rapidly removes the energy until the coregrows so massive that it collapses and forms a neutronstar. This collapse releases a sudden burst of energy, whichejects the outer shell of the core and the atmosphere—therest of the massive star. It had already been shown yearsago by S. Chandrasekhar that a white dwarf could be nomore massive than about 1.4 times the mass of the sunwithout collapsing to a point, so the collapse of the coreis inevitable. This theoretical analysis was dramaticallyconfirmed in 1987 when a giant star in the LMC became asupernova, designated SN 1987A. The reason that whitedwarfs can only be so massive is that they are supportedalmost entirely by the pressure of degenerate electrons.This is the same pressure that effectively gives atomsfinite sizes instead of the electrons being bound arbitrarilytightly. Planets are thus held up by electron degeneracypressure in the guise of the near-incompressibility of solidmatter. But near-incompressibility is not good enough,and at high enough pressures the electrons become rela-tivistic, which signals a slight weakening in their abilityto hold up stars, and they fail to do so beyond the aboveChandrasekhar mass limit (an entirely theoretical calcu-lation that follows directly from quantum mechanics andspecial relativity). The core does not, however, collapseto a point but stops when the nucleons are pushed closeenough together to themselves become degenerate, whichin direct parallel with atoms is when the density reachesthat of nuclear matter instead of ordinary matter. This isour neutron star. There is a mass limit for neutron stars just

Page 126: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

Pulsars 271

as for white dwarfs, but it is harder to calculate becauseelectrons stay electrons under compression but nucleibegin to change into exotic forms (“strange” particles)under such circumstances in ways not yet fully under-stood. Most estimates give limiting masses around twicethat of the sun, and most astrophysicists assume that theneutron star would then collapse to become a black hole.

This understanding of how pulsars are created is ad-equate for most of the observed pulsars but cannot becomplete. The problem is those found in globular clus-ters. The globular clusters are thought to be among theoldest objects in the galaxy, and massive stars must lastonly a short time (if the sun lasts billions of years, a star10,000 times brighter must burn up within millions). Sothe pulsars formed by the above mechanism (a “type II”supernova; one in which spectral evidence for hydrogenis present—presumably the outer envelope) would havebeen formed early on and would by now be spinning veryslowly, not faster, than most pulsars. Thus, surprisingly,there must be another mechanism as well by which theseexotic-seeming objects can be made. That there is a sec-ond mechanism is further supported by the fact that somany of these fast pulsars are in binary systems, whilethe “ordinary” pulsars seen scattered about are virtuallyall single objects. Supernova do, however, occur in sys-tems like globular clusters, namely, the elliptical galaxies.These systems lack the gas and dust with which to man-ufacture new massive stars and in this respect resembleclosely the globular clusters, except for being much largerand containing vastly more stars.

It is thought that the second mechanism for making su-pernovae (again surprising to have two different sourcesof such exotic events) is the evolution of white dwarfs’ bi-nary systems. Binary star systems are common, and mostold stars become white dwarfs, so such systems are notuncommon. If the two orbit one another close enough totransfer matter, one white dwarf loses matter and expands(because it is no longer weighted down by the lost mass),while the other gains matter and shrinks; again, we canhave a white dwarf pushed over the Chandrasekhar masslimit. These explosions are probably the “type I” super-novae (no hydrogen detected; white dwarfs have “burned”all their hydrogen already). Calculations suggest that thecollapsing white dwarf might be entirely incinerated (ithas not been supporting a massive atmosphere all thistime and consequently should have more internal fuel left).But under the right circumstances, a neutron star might beleft, and in this sort of evolutionary sequence the remnantshould be rapidly spinning. A popular alternative is in-stead that an old pulsar has been captured and is spun upby accretion of matter from the companion. This modelrequires that the magnetic fields of pulsars decay away (atopic of current debate). It also makes it difficult to explain

why some of the millisecond pulsars are single objects (al-though the eclipsing binary pulsar has a very light compan-ion of 0.02 solar masses, so accretion or some other pro-cess could remove most or all of the companion’s mass).

V. NATURE OF PULSARS

Pulsars are thought to be formed in supernova events.There are actually only a few pulsar–supernova associ-ations. Many young supernova remnants do not containdetectable pulsars, and most pulsars are not in supernovaremnants. The first is partly a selection effect because theremnants themselves are intense radio sources, and onlythe brighter pulsars can be seen against such a background.Even if the pulsar is bright, its beam might not sweep theearth. The reason for the second is that the remnant ex-pands and fades over a period of ∼104 years, whereas thepulsars typically live on for a few million years. On theother hand, the weakly magnetized (so-called millisec-ond) pulsars survive on time scales leading back to thevery formation of our galaxy.

The fastest pulsars provide the strongest test of the ro-tating neutron star hypothesis. Early theories involvingwhite dwarfs as the pulsar object were ruled out by theshort time scales involved; such stars are simply too largeto behave coherently for periods much less than 1 sec.

All pulsars are observed to be slowing down. Again,this observation is consistent with a rotating object forwhich the source of free energy is the kinetic energy storedin rotation. The energy output can then be calculated byestimating the moment of inertia of the neutron star (es-sentially its mass times the square of its radius) and us-ing the observed rate of increase of period to determinehow rapidly this energy is declining. The rate of changeof the periods spans a large range, but a change of pe-riod of ∼10−15 sec/sec is representative. Again, for theweakly magnetized millisecond pulsars, the correspond-ing period changes are more like 10−20. A pulsar with aperiod of 1 sec (again typical; the pulsars noted above arethe fastest) has a stored energy of 1046 ergs, and thereforethe energy loss rates are ∼1031 ergs/sec. These figures arealways much larger by a comfortable factor (∼105) thanthe energy output in radio waves. Most of the pulsar en-ergy output seems to be invisible and is thought to be inthe form of a relativistic magnetized “wind” flowing awayfrom this pulsar. However, the entire Crab nebula super-nova remnant radiates power comparable to that lost fromthe pulsar, suggesting that it is the wind from the Crabpulsar that lights up the nebula.

The pulsating X-ray sources behave quite differently;they have quite large spin-up and spin-down rates, so thatthe period of a given source may increase over about 1 year

Page 127: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

272 Pulsars

and then decline. The spin-up is thought to be caused byangular momentum carried in by the accreted matter. Thespin-down mechanism is less certain but is usually at-tributed to electromagnetic coupling of the neutron star,assuming it has a strong magnetic field just as the radiopulsars do, to an orbiting disk of matter. Indeed, the ro-tation periods of many of these binary X-ray sources aremuch longer than what a radio pulsar would ever spindown to in the age of the galaxy (about a 25-sec period).

The glitch phenomenon has proved difficult to explain.Most theories concentrate on some change in the phys-ical body of the neutron star itself. A “starquake” thatreleased internal stresses and allowed the neutron star toshrink slightly would explain the spin-up seen in the Velaglitches, but these events happen too frequently. The starcannot endlessly shrink. Present models argue instead foran interior core, which is decoupled from the crust and ro-tates more rapidly. The glitches might then be abrupt brak-ing phenomena that transfer angular momentum from thecore to the crust and spin up the latter. The strong mag-netic field, however, would lock the crust and core intocorotation, so these models impose special conditions onthis field (e.g., it would have to reside entirely in the crustand be excluded from the core). The Crab pulsar also dis-plays these events, but with much smaller amplitude andmuch more frequently. However, PRS 1509 − 26 (also as-sociated with a supernova remnant) has an exceptionallylarge spin-down rate of 1.5 × 10−12 sec/sec and yet hasshown no glitches whatsoever. A few slow old pulsars(PSR 1641 − 45 and PSR 1325 − 43) have experiencedsingle glitches of amplitude intermediate between thoseof the Crab and Vela.

With these and a few other exceptions, pulsars have ex-tremely stable slowing-down rates, and if their pulse ratesare corrected for spin-down, they rival the finest atomicclocks. The location of a pulsar in the sky can be deter-mined to high accuracy (comparable to that for visiblestars) using the above clocklike properties. Owing to theearth’s orbital motion about the sun, the pulsar clock ap-pears to run fast and slow at different times of the year.This variation can be removed only if the pulsar is locatedat a very specific location in the sky. The amplitude ofthis period variation gives the declination, and the phasegives the right ascention. In general, there are no starsfound at these well-determined locations that seem likelyvisible counterparts (except for the Crab, Vela, and PSR0540 − 693, which display optical emission).

VI. THE PULSES

Pulsars would not sound like the good clocks that they areif they could be heard. There are large pulse-to-pulse varia-

tions, and occasional pulses may even be missing. Severalhundred individual bursts of radio emission added togetheron top of one another form a generally stable integratedpulse profile, but this profile serves more as a “window”through which a variety of puzzling pulse variations arewitnessed. In general, the individual pulses bear little re-semblance to the integrated profile, often as sharp spikesappearing more or less randomly within the window. Insome cases, however, the behavior is quite coherent, witha smaller pulse (subpulse) appearing at one edge of thewindow and steadily working its way across to disappearat the other edge. These drifting subpulses are found inperhaps 10% of the pulsars, with PSR 0809 + 74 being aclassic example. Another pulsar (PSR 0826 − 34) is a re-markable example in that it has an extremely wide profilewithin which can be seen four or five subpulses at once.These subpulses are spaced about 30 apart (defining 360

as one pulse period) and move back and forth together. Theinterpretation is that we are looking nearly along the spinaxis of a neutron star that is also magnetized nearly alongthe spin axis, and therefore, we are almost constantly in thepulsar emission beam. Both of these drifters also displaynulling (as do some nondrifters), a phenomenon in whichthe pulsar disappears, becoming entirely undetectable for10 to even 1000 periods before reappearing abruptly. Onepulsar, PSR 0904 + 77, has never been seen since its firstdetection. If not a spurious observation, it is an extremeexample of nulling.

The phenomenon of nulling shows that pulsar actioncan cease, and indeed, it must if all pulsars slow down.The inevitable consequence should be an accumulation ofolder and slower pulsars, whereas observationally the av-erage pulsar has a period of ∼0.5 sec and yet the slowestpulsar (PSR J1951 + 1122 at present) has a period of only5.1 sec. The pulsating binary X-ray sources, on the otherhand, have periods of 700 sec or longer. Apparently, pul-sars as such must rapidly disappear once they slow below1 sec or so. Pulsars with a 1-sec period have slowing-downrates of ∼10−14 sec/sec, implying that they will live foronly ∼1014 sec = 3 × 106 years. Since most of the time isspent as a slow pulsar, the typical pulsar lifetime is still ex-pected to be of the order of a few million years regardlessof how fast the pulsar was rotating at birth.

The slowing down of the fastest pulsars is much lessthan average, a seeming contradiction. It would take∼4 × 108 years for the 1.558-msec pulsar to become evena 3-msec pulsar and extraordinarily long for it to becomea 4-sec pulsar. In the magnetized rotating neutron starpicture, it is the strong magnetic field that couples therotating star to its surroundings. Thus, an unusually weakmagnetic field would account for the small slowing-downrate. Indeed, it is only when a rapid pulsar has a weakmagnetic field that we can ever hope to observe it, unless

Page 128: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

Pulsars 273

it is seen shortly after birth. The Crab pulsar is 1000 yearsold and spins at 33 msec. If we scale the lifetime it couldhave had (with its apparently strong magnetic field of∼4 × 1012 G) with a spin rate of 1.5 msec, we find that itwould double its period in only a few years. Thus, onlyin a fresh supernova could one hope to see a stronglymagnetized millisecond pulsar. Such events are rare;centuries can pass between known supernova events inour own galaxy, and it is not evident that the surroundingdense cloud of debris would permit the pulsar to shinethrough early on. Supernovas are more frequently seen inother galaxies (there are so many of them), but associatedpulsars have not been detected (possibly they are simplynot bright enough to be seen at such huge distances—megaparsecs). Unexpectedly, supernova SN 1987A in thenearby LMC has yet to show any evidence that it containsa pulsar, even 13 years after the event. Owing to theirlong lifetimes, the millisecond pulsars are largely a classthat have accumulated in the galaxy, whereas the stronglymagnetized field pulsars are a young population, and wedo not see the old ones because they would have spundown and become too faint (even if no other effect of ag-ing entered: cooling, possible decay of the magnetic field,etc.). Ironically, pulsars are more readily observable if themagnetic field is not too strong, even though presumablypulsars require strong fields to operate in the first place.

VII. THEORY OF PULSARS

A large amount of circumstantial evidence, as shown in thepreceding section, now exists on the behavior of pulsars.Yet the very fact that we can detect them at all is poorlyunderstood. The radio luminosity is ∼1028 ergs/sec fora typical pulsar, and yet a neutron star is only ∼10 kmin radius, which would require it to be an extremely effi-cient antenna. (If much more energy than that in the radiospectrum went into making the system glow in the visi-ble spectrum, some should have been seen as dim stars.)The simplest picture of how so much radiation might beproduced in such a small region is the “bunching” hypoth-esis, which assumes that the energetic electrons producingthe radio waves are bunched together. In a simple geom-etry one might consider a spherical bunch of N particles.At long wavelengths the electrons radiate as if they werefused into a single particle of charge Ne, giving emis-sion N times more intense than N separate electrons ra-diating independently. At wavelengths much shorter thanthe electron spacing, the electrons would radiate more orless independently despite the bunching. Thus, bunchinggives a natural qualitative account of why pulsar emissionis intense to low frequencies and declines rapidly at highfrequencies.

It was early on suggested by P. Sturrock that the pulsarmechanism had something to do with an exotic combi-nation of physical mechanisms wherein electrons are ac-celerated in the huge electric fields generated by the neu-tron star rotating through its own magnetic field. Theseelectrons are constrained to follow these curved mag-netic field lines and consequently emit curvature radia-tion, analogous to the emission of synchrotron radiationby electrons moving in curved trajectories around mag-netic field lines (a process thought to be unimportant inpulsar magnetospheres because the electron would almostinstantaneously lose such energy and it would not be re-placed). Indeed, γ -rays are detected from the Crab andVela pulsars possibly produced by this mechanism. Aneven more exotic process possible in such huge mag-netic fields is the conversion of γ -rays into electron-positron pairs. It is thus possible that a single electronwould emit a γ -ray and thereby produce a second elec-tron (and positron). Unfortunately, it was not understoodwhy bunches might form. Moreover, popular models ofthe time assumed that the magnetosphere was entirelyfilled with plasma pulled from the stellar surface by thehuge electric fields. As a result, the magnetic field lineswere thought to be largely shorted out by this plasma,which reduced the acceleration of the electrons to levelsunpromising for the above mechanism. Another problemwas that the bunching was assumed to result from someinstability, and, given the small scale of a neutron starsystem together with the high velocities of relativistic ve-locities, such instabilities had to be extraordinarily fastdeveloping.

A considerable theoretical effort was invested in ana-lyzing this magnetized rotating neutron star model. At firstit was thought that the simplest possible model, a dipolemagnetic field aligned with the spin axis, would emit par-ticles from the polar caps like a pulsar and be a valuabletest bed for radiation models. Indeed, most radiation mod-els have assumed acceleration of electrons from the po-lar caps as a starting point (often attempting to includepair production). It was later realized that such particleemission could only be a transient phenomenon, becauseonly electrons would be lost from the system. The resul-tant electrostatic charging of the neutron star eventuallywould halt any further loss of electrons. Positive particlesthat might neutralize this charging should also be emit-ted, but they would be trapped very close to the star on theclosed equatorial magnetic field lines. Although the modelseemed disappointingly unlike a pulsar, it is neverthelessinteresting because the neutron star ends up surroundedby an equatorial torus of positive charge and two cloudsof electrons over the polar caps, with vacuum elsewhere.It is unusual to find plasmas that consist only of elec-trons because the self-repulsion of the electrons should

Page 129: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

274 Pulsars

blow the plasma apart. However, laboratory workers havesucceeded in producing just such plasmas by trappinglarge numbers of particles in magnetic Penning traps.These nonneutral plasmas have the remarkable property ofbehaving like liquids, with a sharp boundary separating aconstant density of particles from the vacuum surroundingthem. However, there is no reason to expect pulsars to bealigned rotators; that was merely a mathematical simpli-fication, and many wondered if an inclined rotator mightbe fundamentally different.

A promising new approach takes advantage of whatseemed to be a disadvantage, the plasma voids in the mag-netosphere. Here, electrons can indeed be accelerated tothe energies envisioned by Sturrock. But such a processwould simply fill up the magnetosphere of an aligned rota-tor. For an inclined rotator, however, large amplitude elec-tromagnetic waves form high above the neutron star polarcaps. As a consequence, the magnetospheres of inclinedrotators can never entirely fill up because the plasma isdriven away by the radiation pressure from waves createdby the neutron star itself. Thus, electrons can be acceler-ated by strong electric fields in the void regions created bythe electromagnetic waves. Consider a downward-movingelectron. It radiates γ -rays, which, if close enough to thestar, form pairs. In the strong field, the positron is im-mediately stopped and ejected, while the new electron isessentially a companion of the original electron and in thestrong electric field is almost immediately accelerated tothe same energy (the energy is now limited by that rateat which the γ -rays are being emitted). But two electronsquickly become 4, 16, 256, etc., and a highly chargeddownward-moving bunch is created from a single initiat-ing charge. Of course, the upward-moving positrons dothe same thing, giving a breakdown in the huge electricfields that create upward- and downward-moving sheets ofcharge. Thus, it is possible that the bunching (now some-thing of a misnomer because it is not ambient plasma thatis being bunched at all) is an intrinsic part of the dischargeprocess. Much more needs to be worked out before such asimple picture can be tested and compared to the abundantdata on pulsars.

In all of these models one expects a variation in theapparent (projected) direction of the magnetic field asthe emission region swings past the observer. A largenumber of pulsars indeed show a systematic rotation inthe polarization direction of their radiation during eachpulse, with the rate being fastest at the center of the pulse(which need not be the point of maximum brightness).These observations are broadly consistent with theoryassuming a simple tilted dipole magnetic field, and at-tempts have even been made to deduce the actual anglesbetween spin axis and observer plus magnetic field forsome pulsars. Unfortunately, the rotation rate gives only

one number, whereas there are two unknown angles tobe solved for, so some additional assumption must bemade.

The simplest expectation would be that the radio emis-sion is polarized parallel to the magnetic field lines.(Curvature radiation from relativistic electrons followedcurved magnetic field lines would have this behavior.) Infact, one observes in many pulsars abrupt 90 changes inpolarization at certain points in the pulse, so if the radia-tion was originally along the field lines, it must suddenlybecome orthogonal to them, which is not a property ofcurvature radiation. The attractive idea of relativistic elec-trons somehow bunching and radiating as they move oncurved magnetic field lines is therefore incomplete. It maybe that the more or less static nonneutral plasma thought tobe trapped near the neutron star complicates the emissionmechanism in interesting ways.

Returning to the supernova origin hypothesis, the super-nova event rate seems broadly consistent with that neces-sary to maintain pulsars against their apparent death rate,with there being considerable uncertainty in each. (Forexample, only ∼0.1% of the pulsars in our galaxy havebeen detected, and the supernova rate is actually an av-erage over supernovas in other galaxies apparently simi-lar to our own.) These rates are thought to be about oneper 50 years. It is not believed that the observed pulsarscentered on supernova remnants are chance associations(particularly the Crab pulsar, which seems to be excitingthe nebula). Supernova remnants occupy ∼1% of the skyin a 10 wide band along the Milky Way. The pulsarsare similarly distributed, showing that they are producedmainly by stars in the disk of the Milky Way and thatthey are seen at distances that are large compared withthe thickness of this disk. A few of the approximately700 observed pulsars could therefore be accidentally inline with a remnant. However, the fact that the pulsarsseen in these remnants are also very fast (<100 msec) re-duces significantly the chances of accidental association.Presently, three of the six pulsars faster than 100 msec arein supernova remnants. Since the millisecond pulsars liveso long, there is little hope of still finding the smokinggun.

The supernova association is also complicated by un-certainty in the pulsar beam geometry. Presumably, oneobserves a one-dimensional slice through the beam crosssection it sweeps across the earth, but there is no informa-tion about the extent of the beam above and below. If thebeam is roughly circular, there must be a large number ofunseen pulsars, because the beams are observed to be sonarrow (∼10 of the latitude if rotating), and statisticallyonly about one in five would be expected to sweep overthe earth, about the same ratio as remnants with pulsars tothose without.

Page 130: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB/GPA/GRD P2: GPA/MBQ QC: GNB Final Pages

Encyclopedia of Physical Science and Technology EN013I-620 July 26, 2001 19:33

Pulsars 275

Extinct pulsars may be resuscitated as γ -ray burstsources long after they fade from view. These sourcesemit single isolated but intense bursts of γ -rays lasting∼1 sec. There is little concensus on what produces thesebursts, with models ranging from an asteroid falling ontothe neutron star to rapid accretion of material from adisk. One such event, the brightest ever observed, whichoccurred on March 5, 1979, has even been localized neara supernova remnant designated N49 in the LMC. Thisis tantalizing but not necessarily supportive of the ideathat the source is an extinct neutron star, because thenthe remnant should long since have dissipated. It has re-cently been suggested that some such events might betriggered by the formation of a class of very strongly mag-netized pulsars (“Magnetars”), with magnetic fields about1000 times as strong as typical pulsars. In contrast, themillisecond pulsars are weaker by about this same factor.Alternatively, the association could be accidental, with theγ -ray burst source being actually much closer than averagerather than being much farther than average, which wouldexplain the brightness. However, recent observations sug-

gest just the opposite, namely, that the γ -ray bursts maybe from huge (“cosmological”) distances, which if truewould render the above burst a very weak one, bright onlybecause it was so “close.”

SEE ALSO THE FOLLOWING ARTICLES

GAMMA-RAY ASTRONOMY • GRAVITATIONAL WAVE AS-TRONOMY • NEUTRON STARS • STELLAR STRUCTURE AND

EVOLUTION • SUPERNOVAE • X-RAY ASTRONOMY

BIBLIOGRAPHY

Lyne, A. G., and Smith, F. G. (1998). “Pulsar Astronomy,” CambridgeUniv. Press, New York.

Manchester, R. N., and Taylor, J. H. (1977). “Pulsars,” Freeman, SanFrancisco.

Michel, F. C. (1991). “Theory of Neutron Star Magnetospheres,” Univ.of Chicago Press, Chicago.

Smith, F. G. (1977). “Pulsars,” Cambridge Univ. Press, New York.

Page 131: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

QuasarsB. M. PetersonS. MathurP. S. OsmerM. VestergaardOhio State University

I. Quasars: The Most Luminous Active GalacticNuclei

II. The Energy SourceIII. Observed Properties of QuasarsIV. Structure of QuasarsV. Finding Quasars

VI. Quasars and the Universe

GLOSSARY

Accretion disk A gaseous disk surrounding a source thatis powered by gravitational accretion. Viscosity heatsthe accretion disk, causing high temperatures that re-sult in thermal radiation from the disk and allowingdissipation of angular momentum.

Eddington luminosity The maximum luminosity thatcan be obtained from spherical mass accretion. Higherrates produce higher luminosities, which lead to radia-tion pressure (on an ionized gas) exceeding self-gravity.

Gravitational lensing Distortion of space–time nearlarge masses that causes deflection of light paths; themass acts as a lens and can produce multiple imagesof objects located behind the lens.

Nonthermal emission Emission due to processes that de-pend on physical conditions other than the temperatureof the source, such as synchrotron radiation (relativistic

electrons spiraling in magnetic fields) or the inverseCompton mechanism (scattering of photons off rela-tivistic electrons, increasing photon energies).

Schwarzschild radius The radius of the event horizon ofa nonrotating black hole (RS = 2G M/c2).

Very-long-baseline interferometry (VLBI) Techniqueof aperture synthesis over long radio antenna separa-tions (hundreds of kilometers or more). Angular res-olution of milliarcseconds can be achieved by usingreceivers across the Earth.

Virial theorem For a system in which angular momen-tum is conserved, the time-averaged relationship be-tween the kinetic energy K and potential energy U is2K + U = 0. This can be used to measure the massesof complex stellar systems.

THE TERM “quasar,” shorthand for “quasi-stellar ra-dio source,” was coined as a description of point-like (or

465

Page 132: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

466 Quasars

star-like) radio sources that are out of the galactic plane butcould not be clearly identified with extragalactic nebulae(external galaxies) when they were first discovered in earlyradio surveys of the sky. Accurate radio positions were ob-tained for a small number of these high-galactic-latitudesources, showing that they were clearly star-like in the op-tical. Optical spectra of these sources, which showed onlyvery broad low-contrast features, did little to clarify theirnature. In 1963, Maarten Schmidt realized that the broadfeatures in the spectrum of the source 3C 273 were in factthe hydrogen Balmer-series emission lines that are seenin the spectra of galactic nebulae, but at an uncommonlylarge redshift, z = 0.158, where z = (λ − λ0)/λ0 and λ andλ0 are the observed and laboratory wavelengths, respec-tively, of the emission line. The emission lines in quasarsare much broader than those in any other astrophysicalsource except supernova remnants: Their widths implyinternal velocity dispersions of a few to several thousandsof kilometers per second.

I. QUASARS: THE MOST LUMINOUSACTIVE GALACTIC NUCLEI

Redshifts as large as that measured for 3C 273 had beenobserved for some of the most distant known clusters ofgalaxies, but what was extraordinary about 3C 273 wasthe great luminosity implied by its redshift. If quasar red-shifts are a consequence of the expansion of the Universe,the luminosity of 3C 273 is more than 100 times greaterthan that of a typical giant galaxy. This result was evenmore alarming when it was realized how compact thesesources are. Quasars can undergo significant changes intheir brightness on time scales of days to weeks. Sincethe parts of the emitting region that are varying must becausally connected, light travel time sets an upper limit tothe size of the varying region. The problem with quasarsis thus immediately apparent: They are objects as small asthe Solar System (a few light days or light weeks across),but more luminous than an entire galaxy of stars (hundredsof thousands of light years across).

While quasars were discovered on account of their radioemission, it soon became apparent that quasars also tendedto be bluer than stars and that optical searches for blue ob-jects at high galactic latitude might be an efficient way offinding them (Section V.A). Searches for objects with “ul-traviolet (UV) excesses” revealed a surprising number ofthese sources, too many to account for with known radiosources. Indeed, the original “radio-loud” quasars turnedout to be outnumbered by their “radio-quiet” counterpartsby a factor of 10 to 20; these became known as quasi-stellarobjects, or QSOs, although nowadays the terms “quasar”and “QSO” are generally used interchangeably. The ex-

treme properties of the original radio-loud quasars relativeto typical QSOs probably contributed to the difficulty inrecognizing the similarity of QSOs to the puzzling Seyfertgalaxies. These galaxies had been isolated by Carl Seyfertin the early 1940s on the basis of their high central surfacebrightness (i.e., star-like central nuclei). They share manyproperties with QSOs, notably the strong, broad emissionlines in their UV and optical spectra. The basic differ-ence between QSOs and Seyfert galaxies is their luminos-ity: The somewhat arbitrary demarcation between them isnow taken to be an absolute B magnitude of −23.0 mag;brighter sources are QSOs, and fainter sources are Seyfertgalaxies. Collectively, QSOs and Seyfert galaxies are of-ten referred to generically as active galactic nuclei, orAGNs.

Unlike stars and systems of stars, quasars are extremelyluminous in every waveband in which they have been ob-served, as shown in the typical spectral-energy distributionshown in Fig. 1. Broad peaks in the spectral energy distri-bution occur (1) in the hard X-ray (2–10 keV) region, (2)in the UV region, and (3) in the mid-infrared. Relative tothe higher energy bands, the amount of energy radiated perunit bandwidth at radio wavelengths is lower by a factorof about 100 for the radio-loud objects and by a factor ofabout 104 for radio-quiet sources. The various local peaksin the distribution reflect different physical processes, aswill be described below.

B. Taxonomy of Active Galaxies

Active galactic nuclei have traditionally been subdividedinto a number of subcategories based on differences invarious observed properties. As noted above, higher lu-minosity AGNs are called quasars or QSOs and lowerluminosity AGNs are called Seyfert galaxies. There are,in fact, two types of Seyfert galaxies: Type 1 Seyfertsare those with quasar-like UV/optical spectra whose ma-jor features are a nonstellar continuum and both broadand narrow emission lines. Type 2 Seyfert galaxies lackthe broad components of the emission lines; in at leastsome cases, these are known to be type 1 objects in whichthe broad components are obscured from our view. Thereare also broad-line radio galaxies and narrow-line radiogalaxies that are the radio-loud analogues of the type 1and type 2 Seyferts. Finally, there is a subclass of radio-loud quasars whose emission at all wavelengths is dom-inated by nonthermal emission from relativistic jets ori-ented close to the line of sight. These objects are referredto as blazars. Among blazars are two major subclasses:BL Lacertae objects, in which the broad emission linesthat characterize quasar spectra are too weak to be de-tectable by normal means, and optically violent variables,which have much more quasar-like spectra but undergo

Page 133: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

Quasars 467

FIGURE 1 Mean spectral energy distributions (SEDs) for radio-quiet (solid line) and radio-loud (dashed line) QSOs.The horizontal axis is photon frequency, and the corresponding wavelengths (or energies in the X-rays region) arelabeled on the top axis. The vertical axis is luminosity in ergs per second. Major spectral features referred to in the textare labeled. [Adapted from Peterson (1997), based on data from Elvis, M. et al. (1994). Astrophys. J. Suppl. 95, 1.]

strong and rapid continuum variations. At least some ofthe differences among various classes of AGNs appear tobe due to differences in orientation rather than differencesin the underlying physics, as discussed in Section VI.B.

II. THE ENERGY SOURCE

A. Supermassive Black Holes

The quasar problem, as noted above, is to try to explainhow such an enormous amount of energy can be gener-ated in such a small volume. It was recognized early thatgravitational accretion by supermassive black holes wasa possible explanation, and indeed this is the only ex-planation that remains viable after some four decades ofresearch. That QSOs are massive can be inferred from theEddington limit, which is the minimum mass required fora gravitationally bound, self-luminous object to be stableagainst radiation pressure. In units appropriate for quasars,ME = 8 × 105(L/1044)M, where L is the central sourceluminosity in units of ergs s–1. Thus, a bright quasar witha luminosity of 1046 ergs s–1 must have a mass greaterthan 108 M. An equivalent way to think of this is to de-fine the Eddington luminosity LE = 1.3 × 1038(M/M)ergs s–1, which is the maximum luminosity of a sourceof mass M that is powered by spherical accretion. This

is therefore a useful benchmark in considering accretionrates and efficiencies. The mass accretion rate required tosustain the Eddington luminosity can thus be written asd ME/dt = 2.2 (M/108)M year–1, where M is the massof the quasar in solar masses. Thus, a modest accretionrate can sustain even high-luminosity QSOs.

The argument that supermassive black holes reside inAGNs continues to get stronger. Probably the best directevidence is provided by the kinematics of megamasersin the Seyfert 2 galaxy NGC 4268. The Keplerian orbitalmotions of the individual megamaser spots imply a centralsource mass of 3.6 × 107 M within the inner 0.1 pc ofthe galaxy. The light-travel, time-delayed response ofthe broad emission lines to continuum variations alsoprovides evidence for supermassive central sources: Ifthe kinematics of the emission-line gas are dominated bythe central source, then the central mass can be estimatedfrom the virial theorem, i.e., M ≈ RV 2/G, where R isthe size of the line-emitting region, as determined bytime delays, and V is the Doppler width of the line. Sincethe different lines arise primarily at different distancesfrom the central ionizing source, multiple lines in a singleAGN ought to show a relationship between line width andtime delay V ∝ R–1/2; for the few AGNs in which theresponse of multiple lines has been measured, this virialrelationship seems to hold and implies masses in the range

Page 134: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

468 Quasars

106–108 M. Additional evidence for supermassive blackholes comes from X-ray spectroscopy of the iron Kα lineat 6.4 keV; in some sources, this line is both redshifted(relative to the AGN itself) and asymmetric. Both of thesecharacteristics can be relativistic signatures that arisewithin a few Schwarzschild radii (Rs = G M/c2) of a blackhole. This interpretation of the line profile characteristicsis not yet firm, but since relativistic effects are involved,X-ray observations of the iron Kα line hold the mostpromise for actually proving the existence of supermassiveblack holes.

B. Accretion Disks

Fundamentally the luminosity of AGNs seems to derivefrom conversion of gravitational energy into radiation. Welack, however, a detailed understanding of how this pro-cess actually occurs. It is generally assumed that the radia-tion is emitted from an “accretion disk” that forms aroundthe black hole. Accretion disks are expected to form asa mechanism for dissipating angular momentum, and areknown to exist around stellar-mass compact objects. Fora 108-M black hole, the radiation from the inner accre-tion disk will thus be in the far-UV part of the spectrum(Section III.C).

Is there evidence that accretion disks do indeed existin AGNs? Certainly, the spectral energy distributions ofAGNs peak somewhere in the UV to soft X-ray range, asexpected for their masses and mass accretion rates. But theobservational details do not agree well with this simplestof accretion-disk theories. Expected signatures such asLyman edges (due to onset of hydrogen opacity shortwardof 912 A) and continuum polarization are not observed.Also, the shape of the continuum in the UV/optical is apoor match to the simple prediction that the specific lumi-nosity per unit frequency should follow the thin-disk pre-diction, Lν ∝ ν1/3. On the other hand, in the case of NGC7469, one of the best-studied variable Seyfert 1 galaxiesacross the UV/optical spectrum, brightness variations atlonger wavelengths follow those at shorter wavelengths.The observed behavior is consistent with a thin-disk struc-ture in which the variations are driven by radiation froman on-axis source (i.e., so-called “reprocessing models”):Variations are seen first at the shortest wavelengths, whicharise in the inner, hotter regions of the disk, and later atlonger wavelengths, which arise further out. However, it isalso widely acknowledged that the thin-disk model is toonatıve; the accretion-disk itself is probably immersed in ahot corona that inverse-Compton upscatters UV photonsinto X-rays (Section III.A.2), which in turn play a role inheating the disk and modifying its temperature structure.Detailed successful self-consistent models of the contin-uum source are still lacking.

C. Jets and Radio Structure

As noted earlier, quasars were first identified on the ba-sis of their radio emission, which is synchrotron emis-sion (Section III.A.2) from plasma originating in the nu-clear regions and fed to extended “lobes” through “jets,”which are long linear radio structures. The jets are colli-mated plasma outflows that are at least mildly relativisticin the most energetic sources. The electrons radiate overthe whole electromagnetic spectrum. While most of theenergy of a jet is radiated at high energies (detected ateven TeV energies), jets have been most thoroughly stud-ied at radio wavelengths on account of their high contrastin the radio and on account of the high angular resolu-tion and sensitivity achievable with radio interferometrictechniques.

Radio sources typically have a double-lobed morphol-ogy and are considerably larger than the host galaxy at op-tical wavelengths (i.e., they have sizes of hundreds of kilo-parsecs). Details of this morphology are correlated withsource luminosity: Lower power sources (Fanaroff-Rileytype I, or FR I) are brighter in the central parts of the lobes,and higher power (FR II) sources are edge-brightened.When radio structures are observed in the plane of thesky, emission from the extended lobes is most prominentand these sources are referred to as “lobe-dominated.”As the observer’s line of sight gets closer to the jet axis,emission from the jet becomes relatively more prominent,and these are referred to as “core-dominated” sources. Ifone observes very close to the jet axis, the relativisticallyboosted and beamed emission from the jet source swampsthe light from the host galaxy and the inner regions of theAGN; such sources are highly variable and are known as“blazars.” In some core-dominated sources, shock frontsare observed within the jet structures, and these seem topropagate at speeds in excess of the speed of light. Thisapparent superluminal motion is a consequence of rela-tivistic motion within a few degrees of the line of sight,basically because in the observer’s reference frame thefast moving shock is moving towards us almost as fast asits radiation. The mechanisms by which jets are formedand collimated are not understood, but they are prob-ably linked to the accretion-disk structure by magneticfields.

D. Alternative Models

The early history of quasar astronomy featured livelydebates on their fundamental nature. Early argumentswere focused on the nature of the emission-line redshifts.If these were not of cosmological origin, then the quasarsmight be closer than supposed, thus greatly reducingtheir luminosities and concomitant high energy densities.

Page 135: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

Quasars 469

Of course, the role of beamed radiation from jets was notfully appreciated in the early years, and the quasar energyproblem is now regarded as “solved” at least qualitativelythrough accretion by supermassive black holes. Within thelast ten years, the only serious challenge to the black-holeparadigm has been “nuclear starburst” models, which failto account for some quasar characteristics such as rapidX-ray variability. Nevertheless, a small minority stillmaintain that quasar redshifts are of noncosmologicalorigin. Evidence cited includes apparent angular correla-tions between quasar positions and those of nearby brightgalaxies and a suggested relationship between them.Most astronomers discount these argument because,for example, of direct detection of the host galaxiesof quasars which show that the quasars and their hostgalaxies do indeed have the same redshifts.

III. OBSERVED PROPERTIES OF QUASARS

As noted earlier, AGNs emit radiation across the en-tire electromagnetic spectrum, from the lowest detectableradio frequencies through high-energy X-rays. Gammarays are detected in some blazars, and the highest-energyγ -rays are detected indirectly by the Cerenkov showersthat they cause in the Earth’s atmosphere.

Typical spectral energy distributions (SEDs) are shownin Fig. 1 for both radio-loud and radio-quiet nonblazarAGN in units showing the energy emitted at a given fre-quency. An almost constant power per frequency decadeis emitted from 100 µm to at least 100 keV. To a first ap-proximation, the SED can be approximated as power lawFν ∝ ν−α , where 0 α 1. Deviations from this simpleform are evident in various spectral bands; from high tolow photon energies, the notable broad-band spectral fea-tures and their probable origin are

1. X-ray emission (ν ≈ 1018 Hz). X-rays are predomi-nantly a nonthermal emission that arises in the immediatevicinity of the black hole or in relativistic jets. Soft X-raysare those in the energy range 0.1 hν 2 keV, those athigher energies are hard X-rays.

2. The “big blue bump” (ν ≈ 1015–1017 Hz). As notedearlier, this is the feature that is most plausibly identi-fied with the accretion disk. It is observed in the short-wavelength optical (∼4000 A) through the ultraviolet (i.e.,wavelengths as short as 912 A). It is unobservable atshorter wavelengths due to the opacity of our own galaxyto hydrogen-ionizing photons in the extreme ultraviolet(EUV).

3. The infrared bump (ν ≈ 1013–1014 Hz). AGN emis-sion at infrared wavelengths seems to be dominated bythermal emission from dust in the immediate vicinity of

the active nucleus. The thermal dust emission spectrumturns over at short infrared wavelengths, which leavesa local minimum in AGN SEDs known as the 1-µminflection.

4. The submillimeter break (ν ≈ 5 × 1012 Hz). This fea-ture occurs where thermal emission from dust grains be-comes highly inefficient.

The two gaps in the SEDs shown in Fig. 1 are due (1)to the absorption by the neutral hydrogen in our galaxy(the EUV gap) and (2) to absorption by water vapor in theEarth’s atmosphere (the IR-mm gap). The observed AGNproperties are outlined in more detail below. In the caseof blazars, the SEDs show simpler structure and can peakin virtually any part of the spectrum. The emission fromblazars arises in relativistic jets that are directed approxi-mately in the direction of the observer.

A. Continuum Emission

1. Gamma Rays

Strong γ -ray emission is detected in blazars only. Gamma-ray emission is strongly correlated with radio emissionand radio variability. The γ -rays are almost certainly rel-ativistically beamed emission that arises in jets (SectionII.C), and therefore are seen only when the jet is directedtowards the observer. The γ -ray emission is also highlyvariable. Gamma rays are probably the result of inverse-Compton upscattering of lower energy photons. The originof the seed photons is not known, but might be either thelow-frequency synchrotron photons or photons from theaccretion disk.

2. X-Rays

X-ray emission is a ubiquitous property of AGNs. Theamount of energy emitted in the X-rays is a good fractionof the total bolometric luminosity of a quasar (Fig. 1).As noted above, the X-ray spectrum is usually describedas a power law with a spectral index typically aroundα ≈ 0.9. Superposed on top of the power law are a hardtail at energies higher than several keV which is thought tobe due to reflection from an ionized gas and a soft excessat energies less than 2 keV which may be the high-energytail of the big blue bump. Emission features are sometimesseen, notably the iron Kα line at about 6.4 keV, but com-monly only in lower luminosity AGNs. In some cases,there is evidence of emission lines due to ions of iron,oxygen, and other elements. Due to absorption in our owngalaxy, X-ray spectra generally turn over at low energies(∼0.1 keV). Absorption features are sometimes seenin soft X-ray spectra, arising in the AGNs themselves.

Page 136: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

470 Quasars

Spectral features are absent at the highest energies, whichcompletely ionize all abundant species (up through iron).The X-ray flux from quasars varies on time scales ofdays or hours and in some cases even minutes. Thissuggests that the X-ray emission originates very close tothe central black hole.

The power-law nature of the spectrum indicates thatthe X-ray emission mechanism is primarily nonthermal.The physical processes involved could be one or moreof the following nonthermal processes that can naturallyproduce a power-law spectrum:

1. Synchrotron radiation. Also known as magneto-bremsstrahlung, synchrotron radiation is emitted by rel-ativistic electrons as they spiral around magnetic fieldlines.

2. Inverse-Compton radiation. Relativistic electronslose energy to photons that scatter off them. The “seedphotons” are lower energy (infrared, for example) radia-tion, perhaps from the accretion disk.

3. Synchrotron self-Compton radiation. This combinesthe other two processes: Radio photons generated by thesynchrotron process are upscattered into the X-ray pho-tons by further interactions with the relativistic electrons.

3. UV/Optical

The big blue bump (BBB) dominates the bolometric en-ergy output of nonblazar AGNs. This feature is widely be-lieved to be optically thick thermal emission from a thinaccretion disk, generated by viscous heating in the differ-entially rotating disk annuli. An alternative explanationis that the emission is optically thin free–free emission(bremsstrahlung); while this better accounts for the ob-served optical-UV SED slope (α ≈ 0.3), this requires anemitting volume that is probably too large to explain rapidcontinuum variability.

The classical “thin accretion disk” is geometrically thinbut optically thick (i.e., opaque) and thus radiates locallymore or less like a blackbody. The temperature structurenot too close to the inner radius is T (r ) = 6.3 × 105 (d M/

dt)1/4E (M/108)−1/4 (r/Rs)−3/4 K, where (d M/dt)E is the

mass accretion rate in units of the Eddington rate. For a108-M black hole, the radiation from the inner accretiondisk will thus be in the far-UV part of the spectrum.

Each annular region of the accretion disk radiates as ablackbody with the local temperature, and the sum overall the annuli is thought to make up the observed BBB. Forthe expected disk temperatures (∼105–106 K) the emis-sion should peak in the unobservable EUV region, andthe observed SEDs appear to do just that (Fig. 1). Morecomplicated models are required to account for the high-energy emission: For example, there is probably a hot

corona overlying the disk, though it is probably ratherpatchy (covering about 20% or so of the disk). The coronainverse-Compton scatters UV-optical radiation from thedisk to produce the X-rays. Some of the X-rays, in turn,are absorbed by the accretion disk and increase its temper-ature locally. The interaction between the disk and coronais a complicated problem in magnetohydrodynamics, notunlike the problem of the solar corona, and is not wellunderstood.

4. Infrared Emission and ThermalDust Reprocessing

Figure 1 shows an inflection point in the SEDs of non-blazar AGNs at 1 µm, between the BBB and the broadinfrared bump, which peaks at ∼25–60 µm. Sharp cut-offs between 100 µm and 1 mm are typically observed inradio-quiet quasars. The IR bump is due to thermal emis-sion by warm extranuclear dust. The dust absorbs higherenergy photons from the AGN and re-radiates in the in-frared. The evidence for a thermal origin is severalfold:

1. Constancy of location is demonstrated by the inflec-tion point in the spectrum. Dust fully exposed to the in-tense UV radiation of an AGN will be destroyed once itis heated to the sublimation point, which is about 2000K for graphite grains. The hottest dust will be closest tothe AGN and near the grain sublimation temperature. Thisleads naturally to a high-energy cut-off at 1 µm.

2. In a few AGNs, the infrared continuum apparentlyvaries in response to changes in the UV continuum, with atime delay that is attributable to light-travel time betweenthe UV source (the accretion disk) and the IR-emittingdust. The grain sublimation radius is consistent with thelight-travel time measured.

3. The far-IR spectra of AGNs show little or no vari-ability on short time scales, consistent with the cooler dustresiding at larger distances from the UV source.

4. The steepness of the submillimeter break, typicallya 5- to 6-decade drop in specific luminosity in radio-quietquasars, implies a thermal origin. The spectral index ofthe break is sometimes very steep, with α < −2.5. Syn-chrotron self-absorption cannot yield such steep spectralindices, while dust emission can.

In core-dominated radio-loud quasars (i.e., nearly face-onor blazar-type objects), the IR continuum is dominated bythe high-energy tail of the synchrotron emission. The far-IR emission is anisotropically emitted, beamed along withthe radio emission. In the highly inclined, lobe-dominated,radio-loud quasars, the radio and IR continua are notdirectly related and a discontinuity is evident in theirSEDs.

Page 137: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

Quasars 471

FIGURE 2 The UV/optical spectrum of the Seyfert 1 galaxy Markarian 335. Strong, broad emission lines that covera wide range of ionization are characteristic of the spectra of active galactic nuclei. Narrow emission lines (such asthe [O III] l l 4959, 5007 doublet seen immediately longward of the hydrogen Hb l 4861 line) are also seen in AGNspectra. The feature labeled “small blue bump” is comprised of blended Fe II broad lines and Balmer continuumemission. [Adapted from Peterson (1997), based on data from Zheng, W. et al. (1995). Astrophys. J. 444, 632.]

B. Emission Lines

1. Basic Properties

The predominant features of the ultraviolet and opticalspectra of nonblazar AGNs are strong, broad emissionlines (Fig. 2). The emission lines are produced by pho-toionization of the line-emitting gas by high-energy con-tinuum radiation from the AGN.

Emission lines in AGN are classified as narrow or broadbased on their line widths, usually as characterized bythe full width at half maximum (FWHM) expressed as aDoppler width in the rest frame of the AGN. The narrowlines (FWHM in the range 500 to 1000 km s−1) are emit-ted by an extended, low-density (≤106 cm−3) narrow-lineregion (NLR) typically at distances ∼102–104 pc fromthe central source. The characteristic velocity dispersionof the NLR gas is similar to or only somewhat larger thanthat of the stellar component in the host galaxy. In the mostnearby AGN, the NLR is spatially resolved, showing a bi-conical region that is illuminated by the central continuumsource.

The NLR is the most distant and extended line-emittingregion where photoionization by the central source dom-inates over other heating processes.

The broad emission lines (FWHM in the range ∼1000to a few times 10,000 km s−1) arise in a compact, higherdensity (∼1011 cm−3) broad-line region (BLR). On the ba-sis of simple energetic considerations, the BLR interceptsabout 10% of the radiation emitted by the central source.

Because the BLR is comprised of high-density gas (so thatthe recombination time scale is very short, and the gas cantherefore respond quickly to a change in ionizing radia-tion) and is both optically thick and close to the continuumsource, the broad emission lines vary in flux in response tovariations of the continuum. The emission-line response,however, is time delayed relative to the continuumvariations on account of light travel-time effects withinthe BLR. The characteristic time scale for emission-lineresponse is days to months, depending on (1) which emis-sion lines are observed (since in a given source the highestionization lines respond more rapidly than the lowerionization lines), and (2) the luminosity of the source,since the BLR size is larger in more luminous sources:The size of the hydrogen Hβ λ4861 emitting region,for example, scales approximately as R = 30 (L/1044)1/2

light days, where L is the optical continuum luminosity inergs s−1.

The BLR is usually assumed to be comprised of a largenumber of discrete clouds of gas. The relative strengths ofvarious emission lines (see below) indicate that the line-emitting gas is at a temperature of about 20,000 K, butif the total line widths were interpreted as purely thermalbroadening, unrealistically high temperatures (T > 109 K)would be inferred. This suggests that the line widths aredue to bulk supersonic motions of individual clouds thateach emit line radiation with much smaller widths (about10 km s−1, the thermal width for more typical nebulartemperatures). The smoothness of the broad-line profiles

Page 138: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

472 Quasars

implies that the number of line emitting clouds must bevery large for stochastic clumping to be negligible.

While the BLR kinematics and structure are largelyunknown, there is fairly good evidence that the gas isgravitationally bound to the central source; that is, the gasis probably orbiting the central source and is specificallyunlikely to represent pure radiatively driven outflow. Out-flow is still certainly possible, although either the radialmotion of the gas is small compared to the orbital motion(i.e., the gas spirals outward) or most of the line emissioncomes for gas that is moving very close to escape velocity.If the line-emitting clouds are gravitationally bound to thesource (i.e., in “virial motion”), then the central mass canbe determined from the virial theorem, as described in Sec-tion II.B. The broad emission lines are also a means tostudy the important but unobservable EUV region as theBLR reprocesses the EUV photons emitted by the hottestparts of the accretion disk into the observable UV andoptical lines.

2. Gas Densities, Temperatures, and Metallicities

As noted above, the line emission is primarily a result ofphotoionization by the central source. Heating of the gasby shocks or particle collisions is less important. This isevident by the relatively low, constant gas temperatures(T ≈ 104 K) and by the large range in ionization levels in-dicated by the emitted line spectra (Fig. 2). Shock-heatedgas, on the contrary, shows a narrow range of ionizationswith strong emission from collisionally excited lineswith T 30,000 K. These signatures are not commonlyobserved.

Line ratios are useful for determining the density andtemperature in the low-density NLR gas. Electron densi-ties ne can be measured using “forbidden” line-doubletsarising from different levels of the same multiplet, suchas [OII] λ3729/λ3926 and [SII] λ6716/λ6731. The NLRdensities are found to be in the range 102–104 cm−3, typ-ically with ne ≈ 2 × 103 cm−3. Flux ratios of lines fromthe same ion with very different excitation potentials aretemperature sensitive. For AGNs, the best line ratio fortemperature determination is [OIII] λλ4959, 5007/[OIII]λ4363, as all of these lines are relatively strong. NLRelectron temperatures measured this way are in the range(1–2.5) × 104 K, with an average value Te ≈ 1.6 × 104 K.The BLR density is too high for these line ratios to be use-ful temperature diagnostics; the critical density for theselines (above which collisional de-excitation of the upperstate becomes important and the line emissivity grows onlylinearly with density) is about 106 cm–3, so these “forbid-den” lines are very weak (“collisionally suppressed”) inbroad-line spectra. There are also no good density diag-nostics for the BLR, but the most successful photoion-

ization equilibrium models suggest densities in the rangene ≈ 109–1012 cm–3 and temperatures of about 20,000 K.The presence of CIII λ977 sets a strong upper limit ofTe ≈ 25,000 K, ruling out shock heating.

The most abundance-sensitive line ratios, least affectedby the density, temperature, and local radiation field in thegas, involve lines of different elements that are formed un-der similar conditions, preferrably in the same region. Forexample, NV λ1240/CIV λ1549 is representative of therelative C/N abundance in the BLR, while the NV/HeIIλ1640 flux ratio can provide robust lower limits on theN/He abundance and thus the overall metallicity. Thisholds great promise for mapping the early metallicity pro-duction in the Universe and its cosmological evolution ingalactic nuclei by observing quasars at low to very highredshifts (Section VI.D). Studies of C/N, for example,show that the metallicities increase with redshift abovethe approximately solar values measured in nearby AGN.This may be a natural consequence of the denser and moremassive systems present at earlier epochs.

3. Emission-Line Profiles

Active galactic nuclei emission lines have non-Gaussianprofiles. The narrow lines have more pronounced basesthan a Gaussian with the same half width, and the broadwings tend to be blueward asymmetric (i.e., with moreemission short of line center) while the line core is moresymmetric. This indicates that the highest velocity gashas a net motion towards us and/or that the line emis-sion is subject to some type of obscuration, such as ab-sorption by dust, that suppresses the red wing. Indeed,dust-sensitive emission-line flux ratios, such as Hα/Hβ,confirm the presence of dust in the NLR. The strength ofthe asymmetric wing is correlated with the presence ofnuclear radio jets. The high velocity wings on the narrowemission lines probably represent material that has beenentrained in radio outflows from the nucleus.

The situation is more complicated for the broad emis-sion lines, which have approximately logarithmic profiles(i.e., F( V ) ∝ – ln δV , where V is the Doppler veloc-ity relative to the line center), with subtle differences seenamong objects. The observed profile asymmetries reveala complex picture but appear somewhat related to the ra-dio properties of the sources. Radio-loud quasars exhibitpredominantly redward (i.e., towards longer wavelengths)asymmetries, and radio-quiet quasars show mostly blue-ward asymmetries, yet these are only statistical trends, notrigid rules. The broad-line asymmetries are not understoodbut suggest either that the BLR has different kinemat-ics in different types of AGNs or that the BLR comprisemultiple, distinct physical components and different com-ponents dominate in different types of AGNs.

Page 139: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

Quasars 473

The low-ionization lines, the Balmer lines included,have emission-line peaks that are approximately at theAGN systemic redshift, defined by the forbidden lines orabsorption lines in the host galaxy. The high-ionizationlines are blueshifted relative to the low-ionization lines.The velocity shift is observed to be stronger for face-onradio-loud quasars, which are also the most Doppler-boosted sources. There is no definitive picture to explainthis, but it indicates that the bulk motion of the low-ionization gas may be different from that of the high-ionization gas in the BLR. Detailed interpretations arestrongly dependent on the assumptions of the unknownBLR geometry and structure.

While the line profiles of the highest luminosity sourcestend to be fairly smooth and structureless (on small scales),the broad-line profiles of lower luminosity objects can bequite complex. Some emission lines, usually but not al-ways in low-luminosity Seyfert galaxies, have a double-peaked structure, with the two peaks displaced approxi-mately symmetrically around the systemic redshift of theAGN. As noted above, emission-line fluxes vary in re-sponse to changes in the continuum brightness. Emission-line profiles are also observed to vary with time, but theline profile changes have not been successfully linked toany other property of AGNs.

4. Continuum/Emission-Line Correlations

Correlations between various emission-line and contin-uum properties of AGNs are of great potential importancedetermining the predominant physical properties at workin these sources. Many such correlations have been found.Perhaps the most important and well-known is the “Bald-win Effect,” an inverse relationship between the broad-line equivalent width∗ of CIV λ1549 and the underlyingcontinuum luminosity. This effect is observed over morethan 6 orders of magnitude in AGN continuum luminosity.Part of its importance derives from its potential as a cos-mological “standard candle.” If we can deduce the sourceluminosity from the observed line equivalent width, thena comparison with the observed flux allows a determina-tion of the “luminosity distance,” which in turn dependson various cosmological parameters. It is thus possible toinfer the values of cosmological parameters if the lumi-nosity of an object is known from its spectrum. However,the observed Baldwin relationship shows significant scat-ter, the origin of which is not yet completely identified,although source variability and inclination effects seem tocontribute. The origin of the Baldwin effect itself is not

∗The equivalent width of a line is essentially the amount of localcontinuum emission, measured in wavelength units, that would give thesame total flux as the emission line. Equivalent width is thus a measureof line strength, not line width.

understood. Possible explanations include that with in-creasing luminosity (1) the covering factor (fraction of thecontinuum radiation intercepted by the BLR) decreases,(2) the ratio of ionizing photon density to electron den-sity (the “ionization parameter” that determines the basiccharacteristics of the emission-line spectrum) decreases,(3) the metallicity increases, (4) the SED peak shifts tolower energies, and (5) the accretion-disk inclination sys-tematically decreases.

If, as inferred from several considerations above, thecentral continuum source is responsible for photoioniz-ing the BLR and NLR gas, certain intimate relationshipsbetween the ionizing continuum and the emission linesare expected. As most of the ionizing continuum lies inthe EUV and soft X-ray regime, this is not currently fullyexplored, though we can try to infer some characteris-tics of the EUV spectrum from the observable UV/opticalspectrum. Operationally this amounts to searching forcorrelations among various continuum and emission-linespectral properties. One way to do this is by principal com-ponent analysis, which can be used to identify complexsets of correlations that appear in many AGNs. The bestknown set of these correlations, known as the Boroson–Green Eigenvector 1, reveals that as the strength of theoptical broad-line FeII emission increases, the narrow-line [OIII] λ5007 strength decreases, and the Hβ linewidth and profile asymmetry both decrease. At one ex-treme of the Eigenvector 1 properties are the narrow-line Seyfert 1 galaxies, which have unusually narrow(FWHM 2000 km s−1) Balmer lines and strong Fe IIemission. These sources also show very rapid and strongX-ray variability. The physical basis of these correlationsis not understood, but it is anticipated that they are drivenprimarily by one or two fundamental parameters, such ascentral black-hole mass and black-hole accretion rate. Forexample, the best current explanation for the narrow-lineSeyfert 1 phenomenon is that these are relatively low-masssystems that are accreting at a higher than normal rate.

C. Absorption in AGNs

Very distant quasars often show resonance absorptionlines (e.g., hydrogen Lyα λ1215, C IV λλ1548, 1550,Mg II λλ2795, 2802) at redshifts smaller than the quasaremission-line redshift. In most cases, these arise in mate-rial that is unrelated to the quasar and is simply gas at lowercosmological redshift that happens to fall along our lineof sight to the quasar (see Section VI.B). Some absorp-tion features, however, especially in lower redshift objects,clearly arise in the immediate vicinity of the AGN.

Narrow (FWHM < few hundred km s–1) “intrinsic”resonance-line absorption is found within a velocity∼5000 km s–1 of the AGN systemic redshift and is often

Page 140: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

474 Quasars

superposed on the emission-line profiles. Such featuresare almost always blueshifted relative to the emissionlines and thus represent some kind of outflow fromthe source. These features are very common in the UVspectra of Seyfert galaxies.

Broad absorption lines (BALs) constitute a more spec-tacular class of absorption as the line widths can reachseveral 104 km s–1. The absorption troughs are always lo-cated blueward of the emission lines and can have P-Cygniprofiles,∗ and in some cases the absorption appears tobe completely detached from the broad emission lines,indicating that the absorption does not simply occur inthe broad emission-line gas along the line of sight to thecontinuum source. Spectroscopy and polarization studiessuggest that BALs arise in high-density, high-velocity, op-tically thick equatorial outflows that cover ∼10% of thecontinuum source, as BALs are detected in only ∼10% ofQSOs (Section VI.B). All known BAL quasars were radio-quiet until the FIRST survey, which is based on deep radioobservations, began turning up significant numbers withradio emission. The BAL objects yield information on theinner regions of quasars.

High-velocity (V 3000 km s–1) absorption lines,BALs included, are only observed in high-luminosityQSOs, never in Seyferts. This indicates that the energeticsof the outflow are closely related to the luminosity of theAGN.

Absorption features that arise in the nuclear regions ofAGNs are also seen in X-ray spectra. Their presence isstrongly correlated with the narrow-absorption featuresin the UV. The X-ray absorption arises in fairly highlyionized gas referred to as “warm absorbers.” Before thelaunch of the Chandra X-Ray Observatory, the strongestabsorption features observed in the X-ray band were theedges of ionized carbon, oxygen, neon, and iron. Thesedemonstrate that there is a large amount ionized matterin the nuclear regions of AGNs, possibly just outside theBLR. The Chandra X-Ray Observatory has now observedabsorption lines due to highly ionized oxygen and otherspecies. These lines provide more powerful diagnostic ofthe physical conditions of matter close to the nuclear blackhole than the absorption lines observed in the UV.

D. Dust

Dust is known to be present in AGNs, as noted above, al-though little is known about its properties or distribution

∗The spectra of stars with massive outflowing winds, such asP-Cygni, have strong emission lines that have self-absorption due tothe cooler diffuse gas that is seen projected against the background star.The absorption occurs in the blueshifted gas that is superposed againstthe star, so the emission lines are strongly self-absorbed in their bluewings.

in AGNs. The dust extinction as a function of energy de-pends on properties such as grain size and composition,but is observed to consistently increase towards higherenergies in the UV-optical region. AGN spectra lack thestrong graphite absorption at 2175 A that is often seenin galactic sources. Silicate dust absorption (10 µm) andemission (19 µm) features are also rarely observed, exceptin a few Seyfert galaxies. Emission-line flux ratios, suchas Balmer line Hα/Hβ, show that dust is mixed with theNLR gas. Dust is destroyed inside the BLR by the ener-getic photons, unless it is strongly shielded, so that, rela-tive to the galactic interstellar medium, the BLR gas is lessdepleted in heavier elements: An order of magnitude en-hancement of elements, such as iron, silicon, magnesium,aluminum, and carbon, is found in the BLR compared togalactic gaseous nebulae containing dust grains.

As discussed in Section IV.B, the central regions ofAGNs are thought to be at the center of a dusty torus thatobscures the view of the AGN in the equatorial plane ofthe system. The large columns of cold dust and gas that aresometimes detected along the line of sight may originatein this torus.

E. Radio-Loud/-Quiet Dichotomy

The radio emission from radio-quiet quasars is severalorders of magnitudes fainter than that from radio-loudquasars (Sections I and II.C), and their weak radio struc-tures, observed so far, are either amorphous or show signsof small jet-like structures. These structures may be dueto weak collimation of the radio plasma. The powerful,large-scale radio structures in radio-loud quasars are, onthe contrary, believed to be due to a strong collimation ofthe radio plasma, probably by magnetic fields.

It has long been a puzzle why the SEDs of radio-loudand radio-quiet classes are so similar over most of the elec-tromagnetic spectrum, yet display such dramatically dif-ferent radio properties. However, all across the spectrumdetailed differences exist. The radio-loud quasars (1) havenonthermal synchrotron emission contributing to their IRemission, (2) are stronger X-ray sources, and (3) haveX-ray spectra that are flatter (i.e., spectral index α closeto zero). The excess X-ray emission may be linked to thenonthermal radio emission. The radio-quiet quasars tendto have (1) stronger optical FeII emission lines, (2) weakerforbidden narrow lines (such as [OIII] λλ4959, 5007), and(3) a much higher frequency of occurrence of the BALphenomenon. How these differences relate more directlyto the radio-luminosity and the central radio-generatingmechanism is still unknown.

Most QSOs, irrespective of their radio power, are nowfound to be hosted by elliptical galaxies. This is con-sistent with the lack of luminosity differences between

Page 141: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

Quasars 475

the two QSO classes (the lower-luminosity Seyferts aregenerally found in spiral galaxies), but leaves open theissue of why and how radio-loud quasars can producethe powerful, large-scale radio emission, while radio-quietquasars cannot. This indicates that it is not the host galaxyproperties that play a vital role for the generation of power-ful radio emission, but rather some fundamental propertiesand/or structures of the AGN central regions, such as thecombined properties of the magnetic field, the accretiondisk.

IV. STRUCTURE OF QUASARS

The inner structure of AGNs is far too small to be ob-served directly: Even in the case of the nearest AGNs,the BLR and accretion-disk structure project to angularsizes smaller than tens of microarcseconds. Milliarcsec-ond radio structures in jets can be observed using very-long-baseline interferometry (VLBI), and some narrow-line region structure can be observed with Hubble SpaceTelescope (with angular resolution of about 0.05 arcsec-onds) or even with ground-based telescopes (with angularresolution generally no better than about 0.3 arcseconds).Thus, much of what we know about the inner structure ofAGNs is inferred indirectly from, for example, polariza-tion and variability studies.

A. Interrelations Among Spectral Components

While the UV-optical emission in AGNs is generally be-lieved to arise in an accretion disk, the origin of theX-rays is not understood. As noted earlier, strong, rapidX-ray variability implies that the X-ray emitting regionis very compact. Furthermore, certain X-ray spectral fea-tures, specifically the iron Kα line and the Compton re-flection hump, suggest that about 50% of the X-ray energyis reprocessed (i.e., absorbed and re-emitted without ad-ditional energy input) by an optically thick plasma. Thishas led to heuristic models in which the X-ray emitting re-gion lies on the accretion disk axis and irradiates the diskfrom above. This suggests that the variations in the X-rayband will induce variations in the emitted spectrum of theaccretion disk. In this case, the X-ray variations drive thevariations at longer wavelengths, with a time delay due tolight travel-time effects. The expectation is that the UVvariations that arise in the inner hotter part of the accre-tion disk should thus lead the longer wavelength variationsthat arise in the cooler outer disk. This effect has been de-tected in only one AGN, the Seyfert 1 galaxy NGC 7469,and the interpretation is somewhat problematic becausethe observed hard X-ray variations are poorly correlatedwith the UV-optical variations. In any case, the UV-optical

variations are closely coupled in time, so closely that vari-ations attributable to disk instabilities (which propagate onthe much slower viscous time scale) as seen in accretiondisks around stellar-mass collapsed objects can be ruledout. Whatever the agent that causes continuum variability,the signal must surely be propagated at light speed.

Also, as noted earlier, IR continuum variations followthose in the UV-optical on a time scale consistent withlight-travel time within the dust sublimation radius. Inother words, the IR flux arises in dust as close to theAGN as it can be and still survive, and this dust absorbsAGN continuum radiation and reprocesses it into IR emis-sion. Variations at the shortest IR wavelengths lead thoseat longer IR wavelengths, consistent with the hottest gasarising closest to the UV/optical continuum source.

B. Dusty Tori and AGN Unification Models

In Section I.B, we briefly described a rich and varied AGNtaxonomy. Given the similarities and differences amongthe various types of AGNs, we need to ask whether allthese sources are intrinsically different from one other, orsimply appear to be different when viewed in differentways. A long-standing question is why Seyfert 2 galaxiesdo not have broad lines: Is the BLR actually missing inthese sources, or is it somehow hidden from our view? Inat least some Seyfert 2s, the latter has been shown to be thecase. The polarized UV/optical spectra of these Seyfert 2galaxies show broad components. The simplest interpre-tation of this phenomenon is that our direct line of sight tothe continuum source is blocked by dust. Light from thecentral source and the BLR is seen only because it is scat-tered or reflected; the wavelength-independent continuumpolarization observed in the best-studied sources suggeststhat the scattering is done by electrons rather than by gas.This is the basis of what are now known as AGN unifi-cation models. In these models, all the radio-quiet AGNsare intrinsically similar to each other, but their externalappearance depends on the observer’s viewing angle. Theintrinsic differences among various AGNs are then, in themost optimistic case, limited to luminosity (which pre-sumably depends on black-hole mass) and radio-loudness(which may be related to a parameter such as black-holespin rate). It is assumed that the central regions of AGNsare at the center of a thick torus of gas and dust, as shownschematically in Fig. 3: When one observes the AGN closeto the torus axis, one sees a Seyfert 1 spectrum. When oneviews the AGN close to the torus plane, the nucleus isseen only in scattered light, and a Seyfert 2 spectrum isobserved.

A similar unification exists for radio-loud objects.Blazars can be neatly fit into this unification scheme bysupposing that they represent AGNs observed directly

Page 142: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

476 Quasars

FIGURE 3 A schematic view of the current paradigm for innerregions of an AGN. The main components are labeled in the bot-tom half of the figure. At the center is a supermassive black hole(SMBH) and an accretion disk that accounts for most of the ob-served continuum radiation. Surrounding this is a system of fairlyhigh-density clouds, the broad-line region (BLR), that accounts forthe strong broad-emission features, such as those seen in Fig. 2.This system is embedded in the center of torus that is opaquein the UV/optical and near IR. The narrow-line region (NLR) is asystem of lower density clouds on scales comparable to or largerthan the torus. Radiation from the accretion disk can only driveemission lines in clouds that are not shielded from the ionizingradiation by the countinuum source (i.e., those clouds within the“ionization cone” boundaries, which are shown in the figure asdashed lines). In radio-loud objects, there is a jet structure com-prised of outflowing plasma, along the accretion-disk axis. Labelsin the top half of the figure show the AGN classification that willresult based on different viewing angles: a radio-loud AGN withthe jet pointed at the observer will likely be classified as a blazar.Sources observed off-axis out with an unobscured view of the nu-cleus will be type 1 sources; those in wihch the continuum sourceand BLR connot be observed directly will be type 2 sources.

down the axis of the system so that the observed radiationis dominated by the beamed emission from the relativisticjet. Furthermore, core-dominated radio morphologiesrepresent sources seen close to their axes, but lobe-dominated radio morphologies are those in which theradio axis is closer to the plane of the sky.

Can unification models account for all the diversity seenamong AGNs as simple orientation effects? At the presenttime it is unclear. While it is certainly true that someSeyfert 2 galaxies are intrinsically Seyfert 1 galaxies, itis not yet clear that all Seyfert 2 galaxies can be explainedin terms of the standard unification model described here.

C. Environment of Quasars:Hosts and Neighbors

About 5 to 10% of galaxies contain an active nucleus asdescribed here. What makes galaxies that harbor quasarsdifferent? Are all galaxies active at some point in time, or isnuclear activity an unusual phenomenon that occurs onlyin certain galaxies? On the other hand, does the presenceof a quasar affect its host galaxy?

Most luminous quasars are observed to be present inmassive galaxies. In both active and nonactive galaxies,the central black-hole mass correlates well with the lumi-nosity of the nuclear bulge component of the host galaxy,though it is not yet clear whether or not the relationship isthe same for active and nonactive galaxies.

Seyfert host galaxies are generally spiral galaxies, al-though it appears that the most luminous quasars reside inelliptical galaxies. Whether or not Seyfert galaxies havea disproportionate number of nearby companion galax-ies is also an open issue: tidal interactions with neigh-boring galaxies can disturb the gravitational potential andresult in matter flowing inwards towards the nucleus. Ithas thus been proposed that interactions between galaxiesmight be responsible for triggering activity in the galacticnuclei.

Only a decade ago, the key question about AGN envir-onments might well have been “Why do some massivegalaxies have black holes in their nuclei and others ap-parently do not?” Within the last several years, primarilyas a result of Hubble Space Telescope observations, it hasbecome clear that most galaxies have massive black holesat their centers, and the question might now be posedas “Why are the black holes in some galaxies active andnot active in others?” The answers to these questionsare of fundamental importance to understanding theAGN phenomenon. Because quasars are seen in suchabundance at high redshifts (Section VI.A), it is widelybelieved that the formation of quasars is intimately relatedto the formation of galaxies. Nuclear black holes mightform when galaxies form and go through a high-accretionphase during which they are quasars. At some point,the fueling process is arrested, and the quasar fades andeventually becomes a quiescent black hole.

V. FINDING QUASARS

A. General Principles and Historical Techniques

Compared to stars and galaxies, quasars are rare objectsat optical wavelengths. Much effort has been devoted todeveloping effective ways to find quasars so that adequatesamples can be assembled for studies of their nature andevolution. The general principle for finding quasars is to

Page 143: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

Quasars 477

make use of the differences in their SEDs compared tostars and galaxies; quasars are luminous over a much widerrange in frequency (Fig. 1).

As mentioned in Section I.A, quasars were first iden-tified because of their strong radio emission. Initial radiosurveys produced catalogs of radio-loud quasars, an im-portant class of quasars but one that represents only 5 to10% of the optically selected population. Interestingly, asthe sensitivity of radio telescopes improved, deep radiocatalogs now contain many radio-quiet quasars, whichhave reduced radio emission but are not radio silent.Radio-selected quasars are important both because of theirphysical nature and because radio emission is virtually un-affected by dust clouds that can block the optical emissionof quasars.

At optical wavelengths, quasars were first separatedfrom stars by their “ultraviolet excess,” i.e., quasars werestrong ultraviolet emitters compared to most stars. Thistechnique is most effective for redshifts up to about 2.2.At higher redshifts, the Lyα emission moves beyond theultraviolet passband into redder bands and quasars losetheir characteristic UV excess.

B. Multicolor Techniques

With the advent of digital sky surveys and computerized,multicolor techniques, it became possible to find quasarsefficiently at redshifts greater than 2.2 by using filters thatcover a wide range of wavelengths and applying new al-gorithms to separate foreground stars and distant quasarsby the differences in their SEDs. This approach encoun-ters the most difficulty around redshift 3, where the dif-ferences in the SEDs of quasars and local stars are lessthan at higher redshifts. At redshifts greater than 4, thecontinuum shortward of Lyα is significantly depressed byabsorption from intervening clouds of neutral hydrogen,which makes quasar SEDs stand out from those of all starsexcept those with the lowest temperatures.

C. Slitless Spectroscopy

In the mid-1970s a different approach was developed forfinding quasars that made use of low-resolution slitlessspectra on which the emission lines of quasars could bedetected directly. The technique was especially effectivefor quasars at high redshifts because Lyα is the strongestemission feature in quasar spectra and can be seen at opti-cal wavelengths for redshifts between 2 and 7. As did themulticolor approach, the slitless spectrum technique ben-efited from improvements in detectors and digital surveytechniques, and it has also been used successfully for ob-jects with redshifts less than 2 because of SED informationcontained in the data.

It should be noted that all survey techniques for quasarshave selection biases, and considerable effort is required todetermine the selection efficiency of surveys and the cor-responding corrections for incompleteness. However, thisis now done with increasing accuracy for modern digitalsurveys.

D. X-Ray Surveys

The advent of powerful X-ray observatories enables a newand powerful technique for finding quasars based on theirX-ray emission. Indeed, it can be argued that the strongX-ray emission of quasars and AGNs provides the bestway to distinguish them from stars and normal galaxies.Historically this was of practical use only for the brightestobjects, but the deepest X-ray surveys carried out with theROSAT satellite now achieve higher surface densities ofobjects than the deepest optical surveys published to date.However, high-accuracy positions for the X-ray sourcesare critical for the identification of the correct source, be-cause confusion problems are severe at such low flux lev-els. We may expect important new results both on thequasar and lower luminosity AGN populations and on thenature of faint X-ray sources as even deeper surveys withthe Chandra and XMM observatories are carried out andfollowed up with spectroscopy and SED measurements atother wavelengths.

The optical/UV techniques of finding quasars may failfor quasars surrounded by large amount of obscuring mat-ter. Dust reddens quasars, changing their SEDs and mak-ing them fainter in the UV. Such obscuration and redden-ing is could be present in a large fraction of quasars. Sincethe hard X-ray flux is relatively insensitive to absorption bydust, X-rays seem to be more useful in finding the “hid-den” quasars. As a result, many new quasars have beendiscovered as a result of X-ray surveys. More recent mis-sions such as Chandra and XMM–Newton are expected tofind thousands of new quasars. In particular, hundreds ofhigh-redshift quasars are expected to be found by the newX-ray telescopes. Also, some objects that are especiallyprominent in X-rays (e.g. narrow-line Seyfert 1 galaxies)are found more easily in X-ray surveys.

E. Current Status

The number of known quasars has increased rapidly inrecent years owing to the development of the techniquesdescribed above and their application in large surveys. Asof March 2000, the Veron catalog lists 18,104 quasars,BL Lac objects, and quasars. The Anglo-Australian 2dFQuasar Survey now has redshifts for 10,000 quasarsin a program that is planned to observe more than25,000 quasars for the study of their spatial distribution

Page 144: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

478 Quasars

and other properties. The Sloan Digital Sky Survey(SDSS) will cover 10,000 deg2 of the sky and be thelargest quasar survey ever done; it should obtain spectraof 100,000 quasars.

VI. QUASARS AND THE UNIVERSE

A. Evolution and Luminosity Function

Quasars provided our first view of objects in the Universeat redshifts greater than 0.5 and within two years of theirdiscovery the first object, 3C 9, at a redshift greater than 2.The lookback time to 3C 9 was of order 10 billion years,or 80% of the age of the Universe. Quasars offered greatpotential as cosmological probes because of their highluminosities and large redshifts.

Schmidt noted in the late 1960s that the fraction ofhigh-redshift quasars in his samples was unusually high.Upon analyzing their distribution in space, he showedthat the quasar population was significantly nonuniformand increased significantly with redshift. The effect out toredshift 2 was dramatic: The derived space density wasa thousand times greater than the local value. Alterna-tively, quasars at high redshift were significantly moreluminous than their local counterparts. In any case, theevidence for evolution was very strong and provided oneof the main arguments at the time in favor of Big Bangcosmology.

Subsequently, major efforts to determine the luminos-ity function of quasars and its evolution with redshift havebeen carried out. Large samples covering a wide range inredshift and luminosity and having well-understood selec-tion properties are needed. Current results indicate that thespace density of quasars reaches a maximum near redshift3 and declines steeply toward higher redshift. A straight-forward interpretation of the results is that we have de-tected the time of greatest quasar activity. However, thispresumes that there is no significant amount of interven-ing absorption by dust clouds along the line of sight tothe quasars. This is an important question because somequasars and nearby AGNs are known to be heavily ob-scured by dust clouds in their host galaxies. However, asurvey of radio-selected quasars at high redshift, whichare not sensitive to dust obscuration, also shows a peakin their space density similar to that of optically selectedquasars.

In recent years, the advent of 8- to 10-m class tele-scopes and the success of programs such as the HubbleDeep Field have provided important new tools for discov-ering and studying the most distant objects in the Universe.The results for the first galaxies confirmed with slit spec-troscopy to have redshifts greater than 5 were publishedin 1998. Then the first quasars with z > 5 were discovered

in the following year. At the time of writing, the SDSS hasjust yielded a quasar with z = 5.8. The light from this ob-ject was emitted when the Universe was less than a billionyears old. We may expect continuing important discover-ies in the near future about the nature of the first galaxiesand quasars to appear in the Universe.

B. Quasar Absorption Lines

Because of their great luminosity and high redshifts,quasars enable the detection and analysis of cold gas alongtheir lines of sight to the Earth via absorption-line spec-troscopy, gas that would in general not be observable oth-erwise. Furthermore, the lines of sight provide fair samplesof the Universe. Thus, the study of quasar absorption linesis a major subject in itself. Absorption lines were found inall quasars with z > 2 because (1) the strong ultraviolet res-onance lines of abundant ions in the intergalactic mediumwere redshifted to wavelengths that could be observedwith ground-based telescopes, and (2) the number densityof lines was found to increase with redshift. Subsequently,the ultraviolet capability of the Hubble Space Telescopehas enabled extensive studies of absorption lines in quasarswith redshifts less than 2.

There are three general classes of absorption-line sys-tems, depending on the distance of the absorbing gas fromthe central region of the quasar. In the first class are thoseintrinsic absorption systems that arise in gas very closeto the active nucleus and appear to be some kind of man-ifestation of the nuclear activity. This includes the BALquasars discussed earlier (Section III.C.)

The second class, associated absorption-line systems,consists of quasars with narrower lines of elements such ashydrogen, carbon, and magnesium at redshifts close to thatof the quasar itself. The lines are thought to be producedin gas that is either associated with the host galaxy orneighboring galaxies.

The third class, intervening systems, refers to the ab-sorption lines produced by clouds of gas distant from theemitting quasar and unrelated to it. These systems, whichare numerous, provide fundamental information about thenature and evolution of the intergalactic medium at red-shifts from 0 to beyond 5. The intervening systems arethemselves grouped into three different categories accord-ing to the strength of the absorbing systems.

The strongest absorption, with neutral hydrogen col-umn densities in excess of 1020 atoms cm−2, occurs inthe damped Lyα systems, in which the Lyα absorption isstrong enough to show damping wings in the line profile.This absorption arises when the line of sight traverses adisk galaxy or dense filaments of intergalactic gas. Theseabsorption systems also show lines from heavier elements,including carbon, silicon, magnesium, and iron, among

Page 145: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

Quasars 479

others. The inferred abundance of metals is typically aboutone tenth solar, indicating that the gas has undergone lessnuclear processing than is observed in nearby galaxies,which are being seen at a later stage of their evolution.

Intermediate absorption systems have a column den-sity of approximately 1017 neutral hydrogen atoms cm−2,high enough to show a discontinuity at the Lyman limit ofhydrogen. They also have absorption lines from carbon,magnesium, and other metals and are believed to arise inthe halos of galaxies. Their metal abundances can be aslow as 1% solar.

Finally, the low column density systems are the mostnumerous and are referred to as the Lyα forest, becauseof the large number of lines that are seen in the spectrumshortward of the Lyα emission line. They arise in cloudsof gas in the intergalactic medium with a range of columndensities from about 1012 neutral hydrogen atoms cm−2

up to the densities of the intermediate systems. Their num-ber density increases steeply with increasing redshift, andby redshift 3, for example, they produce a significant de-pression of the continuum emission shortward of the Lyα

emission. Observations of the Lyα forest are being used inconjunction with theoretical modeling of the intergalacticmedium for both cosmological studies and the develop-ment of large-scale structure in the Universe.

C. Gravitational Lenses

In 1979, Walsh, Carswell, and Weymann discovered thefirst gravitational lens Q0957 + 561, a quasar at z = 1.41that showed two images separated by 6 arcsec. Gravita-tional lensing occurs when two objects are nearly perfectlyaligned along the line of sight. The gravitational field ofthe nearer object bends the light of the background objectand produces several effects, including multiple imagingand magnification of the brightness. Gravitational lensingcan provide information on the expansion rate and geom-etry of the Universe and about the distribution of mass inthe Universe, particularly dark matter.

D. Implications

The formation and evolution of quasars at high redshift isone part of the larger subject of the formation of structurein the Universe and the formation of galaxies. Quasarsat high redshift mark the sites of the active black holesin the nuclei of galaxies. Furthermore, the spatial distri-bution and clustering properties of quasars are related tothe development of large-scale structure in the Universein models in which quasars (and galaxies) form at sites ofhigh-density fluctuations in the distribution of dark mat-ter. Advances in observational capabilities, which now en-able both galaxies and quasars to be observed to redshifts

beyond 5, and developments in the theory of galaxy andstructure formation are enabling a unified study of all thesetopics. The large surveys mentioned above such as theSDSS and 2dF will provide very important observationaldata for these topics.

Quasars are related to cosmology in another way—theyare a major source of the radiation that ionizes the inter-galactic medium, which contains most of the hydrogenand helium in the Universe. Observations of the absorp-tion line systems described above show that the IGM andthe clouds producing the Lyα forest are highly ionized athigh redshift. Otherwise absorption from neutral hydrogenwould block the continuum light of quasars at wavelengthsshortward of the Lyα emission (the Gunn–Peterson ef-fect). A main research question is when did this reion-ization occur and what caused it—hot stars in galaxiesor quasars and AGNs? In this view, reionization is con-nected with the appearance of the first luminous objectsin the Universe. While it had been thought that quasarswere the main sources of ionization at high redshift, therapid decline in their space density at z > 3 and the subse-quent discovery of numerous galaxies at z ≥ 3 with rapidrates of star formation have made it apparent that ioniza-tion from hot stars is also an important source of ioniz-ing radiation. Their total ionizing flux could exceed thatof quasars. The relative contributions of quasars and hotstars to ionization at z > 3 are uncertain at present. DeepX-ray and optical surveys and the Next Generation SpaceTelescope are expected to provide valuable data for thisproblem.

Quasars are also connected to the chemical evolutionof galaxies. The emission-line spectra of quasars showfeatures from elements such as carbon, nitrogen, oxygen,helium, silicon, magnesium, and iron. The general simi-larity of the emission-line spectra over the redshift intervalfrom 0 to 5 is striking. Although the derivation of chemi-cal abundances from the spectra is a difficult topic, currentevidence indicates that the abundances of elements suchas nitrogen are solar or possibly greater, even at the high-est redshifts. This is an indication that the emission-linegas was enriched by earlier generations of stars.

Finally at low redshifts, recent observations show thatinactive black holes are common in the spheroids of nearbygalaxies. They provide important information about theendpoints of quasar and AGN activity in galaxies, whichoccur when no material remains to be accreted by thecentral black hole. It is plausible that the decline in spacedensity of quasars from redshift 2 to 0 is a consequence ofthe expansion of the Universe and the conversion of gasin galaxies to stars. The material available for accretiondeclines with advancing time. The masses and frequencyof occurrence of black holes in nearby galaxies provideimportant constraints about the formation and growth of

Page 146: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GSS Final Pages

Encyclopedia of Physical Science and Technology EN013H-629 July 26, 2001 19:48

480 Quasars

the black holes with time and the efficiency of the accretionprocess.

SEE ALSO THE FOLLOWING ARTICLES

COSMOLOGY • GALACTIC STRUCTURE AND EVOLUTION

• ULTRAVIOLET SPACE ASTRONOMY • X-RAY ASTRON-OMY • X-RAY, SYNCHROTRON RADIATION, AND NEU-TRON DIFFRACTION

BIBLIOGRAPHY

Antonucci, R. R. J. (1993). “Unified models for active galactic nucleiand quasars,” Ann. Rev. Astronomy Astrophys. 31, 473.

Blandford, R. D., Netzer, H., and Woltjer, L. (1990). In “Active GalacticNuclei” (T. J.-L. Courvoisier and M. Mayor, eds.), Spring-Verlag,Berlin.

Hamann, F., and Ferland, G. (1999). “Elemental abundances in quasis-tellar objects: star formation and galactic nuclear evolution at highredshifts,” Ann. Rev. Astronomy Astrophys. 37, 487.

Kembhavi, A. K., and Narlikar, J. V. (1999). “Quasars and Active Galac-tic Nuclei: An Introduction,” Cambridge University Press, Cambridge,U.K.

Mushotzky, R. F., Done, C., and Pounds, K. A. (1993). “X-Ray spectraand time variability of active galactic nuclei,” Ann. Rev. AstronomyAstrophys. 31, 717.

Osterbrock, D. E. (1989). “Astrophysics of Gaseous Nebulae and ActiveGalactic Nuclei,” University Science Books, Mill Valley.

Peterson, B. M. (1997). “An Introduction to Active Galactic Nuclei,”Cambridge University Press, Cambridge, U.K.

Rees, M. J. (1984). “Black hole models for active galactic nuclei,” Ann.Rev. Astronomy Astrophys. 22, 471.

Urry, C. M., and Padovani, P. (1995). “Unified schemes for radio-loud active galactic nuclei,” Publ. Astronom. Soc. Pacific 107,803.

Weedman, D. W. (1986). “Quasar Astronomy,” Cambridge UniversityPress, Cambridge, U.K.

Page 147: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

Star ClustersSteven N. ShoreIndiana University South Bend

I. Historical IntroductionII. Galactic Open Clusters

III. AssociationsIV. Globular ClustersV. Concluding Remarks

GLOSSARY

Association Loosely gravitationally bound group of co-eval stars, including massive (OB) stars and often nu-merous subgroups of more tightly bound systems. Gen-erally associated with massive molecular clouds andsites of recent star formation.

Hertzprung–Russell diagram (H-R diagram) Plot ofloci of stars by surface temperature or color versus lu-minosity or absolute magnitude.

Horizontal branch Stage of helium core burning. Inevolved clusters, this appears as a nearly horizontalgrouping of stars (i.e., nearly constant luminosity) inthe Hertzprung–Russell diagram.

Main sequence Stage of a star’s evolution when energyis generated via hydrogen processing in the stellar core.

OB stars Massive stars, typically more massive than∼10 M, which have short mainsequence lifetimes andevolve into luminous supergiants and, probably, Wolf–Rayet stars. Generally found in associations.

Pre-main-sequence star Star in the stage of evolution,before hydrogen core ignition, when energy is still be-ing generated by gravitational contraction and massaccretion.

Stellar populations Differentiation between stars of dif-ferent ages, determined from their metal abundancesand distribution. Population I stars are young stars con-fined to the disk of the galaxy and of metal abundancesnear the solar value; Population II stars are the oldeststars in the galaxy and reside primarily in the galactichalo, with metallicities much less than the Sun’s.

STAR CLUSTERS are gravitationally bound, presum-ably coeval groups of stars. Clusters can be morphologi-cally distinguished between open or galactic and globularon the basis of both the overall geometry and the stellardensity. Globular clusters, which reside primarily in thehalo of the galaxy, are associated with an older populationand are typically an order of magnitude more metal poorthan the disk. They contain upwards of 105 members withtypical radii of the order of 1 to 10 parsacs (pc). Openclusters are usually of much lower mass, containing of theorder of 100 to 1000 members, and are among the recentlyformed stars in the galaxy. They span the age and metal-licity range of the Population I stars. The OB associationsare looser groups of massive stars and contain the mostrecently formed stars. These are thought to be a possible

715

Page 148: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

716 Star Clusters

extension to the lower mass open-cluster sequence. Thepopulous blue clusters are observed in a number of ex-ternal galaxies, notably the Magellanic Clouds. The useof clusters for the study of stellar evolution and metalsproduction in galactic history is facilitated by the fact thatthey display a mass range that is similar to that observedfor disk stars and appear to form stars sufficiently rapidlythat there is no great spread in their initial chemical com-position. These objects provide the basic test of modelsfor stellar evolution theory and also fundamental distancecalibrators.

I. HISTORICAL INTRODUCTION

The first observations of a cluster can be found in the ear-liest compilations of the constellations by the Greeks. TheHyades, Pleiades, and Praesepae are all mentioned. Theseare the most visible northern open clusters, the first twolocated in Taurus and the last in Cancer. That these starsare in fact of higher density than the rest of the field wasinitially recognized by Galileo in 1610, in the first useof the astronomical telescope. In the Siderius Nuncius hementions the Pleiades specifically, showing that it has aconsiderably larger population than can be seen with thenaked eye and larger even than the neighboring galacticplane. A more complete collection of cluster data can befound in the first telescopic surveys by Halley in the 17thcentury and Messier about a century later. These revealedthe existence of several classes of clusters—specificallythe globular and compact open varieties. For a time, itwas generally believed that all such objects were nebu-lous, in agreement with the Kant—Laplace nebular hy-pothesis, which appeared to predict the preplanetary stagethese objects represented. The resolution by Herschel andlater observers of many of these nebulae into stars for atime precluded the general acceptance of the idea of starformation from gaseous matter, but the spectroscopic ob-servations of Huggins and Secchi in the 1860s assisted indistinguishing clusters from nebulae.

In the first years of the 20th century, observations ofthe globular clusters at Harvard under S. Bailley and laterH. Shapley showed that there is a distinct class of variablestar associated with the clusters. These are the RR Lyraestars, a group of horizontal branch stars with periods of ∼ 1

2day and that display (like many pulsating variables) a dis-tinctive period–luminosity relation. In addition, cepheidvariables were also observed and these enabled Shapleyto determine that the globular clusters form a sphericaldistribution about the galactic center and that we are dis-placed from that center. The current value is 8.5 kpc.

In the 1930s, R. Trumpler discovered that the most dis-tant of the clusters also appeared to be more reddened than

the local stars. This led to the discovery of interstellar dustand further served to correct the determination of the dis-tance scale. The discovery of globular clusters in the halosof external galaxies has continued at a rapid pace in recentyears, extending their use as distance indicators.

Observations of the colors, motions, and brightnessesof cluster stars had become sufficiently advanced by themid-1940s to reveal that the stars are in fact coeval.M. Schwarzschild and F. Hoyle, and later A. Sandage,exploited this property of the members to test models forstellar evolution. The argument proceeds as follows. Ifthe stars are formed at the same time and have differ-ent masses, then the most massive ones should evolve themost rapidly toward the red giant phase. By using the lu-minosity and temperature of the brightest main-sequencestar, one should therefore be able to find a correspond-ing mass for the most evolved hydrogen-core-burning starfrom which the age of the cluster follows. E. Salpeter ex-tended this to the determination of the initial mass function(IMF) by determining the relative number of stars of eachmass that are needed to synthesize the morphology of theH-R diagram of the cluster. Fitting a power law shows thatmany more low-mass than higher mass stars are formed inthese clusters. The same is true for the globular clusters,although there is a narrower mass range.

The globular clusters were shown by W. Baade to havethe same population mix as, and similar metallicities to,the stars found in the halo of the galaxy and M31, theAndromeda galaxy. Specifically, when a sample of high-velocity field stars (presumed to be the stars that have themost extended distribution and thus are moving statisti-cally faster) is obtained, their overall properties closelymatch those of the globular clusters. The distributions inspace, around the disk of the galaxy, are also quite sim-ilar. From the low metallicities and the extended spatialdistributions, Baade and subsequent investigators arguedthat the globular clusters were formed in a rapid sequenceof events in the early history of the galaxy, thus forming asample of the stages before the formation of the disk stars.

Observations from space began for these objects in thelate 1960s with the launch of OAO-2, which performedUV photometric observations in many of the globular clus-ters and individual stars in open clusters. Additional workby International Ultraviolet Explorer (IUE) has shownthat a number of globular clusters possess UV-brightcores. These have been resolved using the WFPCZ on theHubble Space Telescope (HST) revealing a populationof blue stragglers and post-AGB stars (see below).EINSTEIN observations have revealed the existence of aclass of bright X-ray sources, at first thought to be centralblock holes and now believed to be low-mass binaries,in ∼10% of the globular clusters. None of the open orglobular clusters appears to be a γ -ray source.

Page 149: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

Star Clusters 717

II. GALACTIC OPEN CLUSTERS

A. Introduction

The young stellar population of the galaxy, Population I,whose metal abundance is about the same as that of theSun, appears to form only loose clusters, consisting ofsome hundreds of members, which have ages ranging fromonly a few hundreds of thousands of years to about theage of the oldest globular clusters, of the order of 1010 yr.Among the youngest are NGCs 2244, 2264, and 6530 (allwith ages of less than a few tens of millions of years), andamong the oldest are NGC 188, NGC 2506, Mel 66, andM 67, which are as old as the oldest disk stars (about10 Gyr). The mechanism of formation appears to be thesame as that of the associations; that is, the stars areformed from a parent molecular cloud in which many stel-lar masses are present. If we take the mass of the typicalcluster to be several hundred solar masses, this places alower bound to the mass of the parent cloud. The forma-tion time for the stars in the cluster appears to be quiteshort, of the order of the contraction time for a star of afew solar masses, though we return to this point below.

The stars in the cluster all have a common proper motionand thus were formed in a gravitationally bound system.The mass of the cluster can be estimated as follows. Thereis a simple relation between the total kinetic and gravita-tional potential energies of a group of bound objects in agravitational field, called the virial theorem, which is that

2T + = 0 (1)

where

T = 12 Nm∗σ 2

is the mean kinetic energy of the N stars in the cluster,which are all assumed to have about the same mass m∗,and σ is the velocity dispersion, and

= −G N 2m2∗〈R〉−1

is the mean gravitational potential energy, where 〈R〉 is themean radius of the stars from the center of the cluster. Theagreement between the calculated mass required to bindthe cluster and the observed masses of the stars (inferredfrom their luminosities and spectral types) shows that theclusters were formed as gravitationally bound systems.1

1The total energy of the cluster is given by

E = T + < 0,

which shows that the clusters are bound (that gravity wins out). It shouldalso be added that the virial theorem holds for any gravitationally boundcluster, even of galaxies.

B. Isochrones and the Interpretationof H-R Diagrams for Clusters

The fact that the stars were all formed at approximatelythe same time means that the cluster represents a snapshotof stellar evolution for the masses of the component stars.Assume that we have two stars of different mass, M1 > M2,which start their life at the same time. The more massivestar will evolve faster owing to the higher rate of corenuclear burning and thus will become a red giant in ashorter time than the lower mass star. If we know that thestars on the main sequence, the hydrogen-core-burningstage of evolution, have no more than some mass, sayMmax, then we can place a lower limit on both the age andthe mass of the evolved star. If we know, for example, thatM2 is a main-sequence star and M1 is a red giant, thena lower limit on the mass of M1 can be determined. Byfitting stellar evolutionary tracks to the entire ensemble ofcluster stars, one can determine the number of stars in agiven mass range that were formed at the time of formationof the cluster, the age of the cluster, the initial chemicalcomposition, and the distance to the cluster through theknowledge of the masses and absolute luminosities of thecomponent stars.

In order to employ isochrones, however, the cluster’sdistance must be determined. The raw data are simply thevisual apparent brightness (or V magnitude) of the starplotted versus its color, or B − V (the more negativeB − V , the bluer is the star; the lower the V magnitude, thebrighter is the star). Figure 1 shows the color–magnitude orH-R diagram for a moderately young open cluster, M11.Note the well-populated main sequence and the scatteringof a few red giants to the right of the figure. Figure 2 showsthe old disk cluster M67. Here, the subgiant branch is wellpopulated, and the turnoff point is at about the age of thesun.

FIGURE 1 Young open cluster, M11. (Courtesy of B. Anthony-Twarog.)

Page 150: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

718 Star Clusters

FIGURE 2 Old open cluster, M67. (Courtesy of B. Anthony-Twarog.)

An additional use of isochrones concerns the determi-nation of physical properties of envelope and core con-vection. In the stars more massive than ∼1.5 M, the coreis dominated by carbonnitrogenoxygen (CNO) process-ing on the main sequence and is consequently completelyconvective. The envelopes are radiative, so that there is aturbulent interface between the two regions. Overshootingof the turbulent cells from the core can promote mixing offresh, hydrogen-rich material into the nuclear processingregion, with the consequent increase in the mass of thechemically helium-rich core with time. This produces, atthe end of the main-sequence stage, a more rapid thanexpected contraction of the core and a gap in the H-R dia-gram of the cluster immediately after the main sequence.Several clusters, notably M67 and NGC 188, have beenused for the study of the interiors models, but the defini-tive answer must await the development of better interiorsmodels. The post-main-sequence gap, as this feature ofthe diagrams is known, is one of the few probes we have,using the aggregate population of the clusters, of the de-tailed properties of the interiors of stars more massive thanthe Sun. This gap can be clearly seen at V = 13 in Fig. 2,the color–magnitude diagram for M67. An example ofisochrone fitting to the Hyades is shown in Fig. 3.

C. Initial Mass Function

Since all of the stars were formed from the same compo-sition and at the same time in the same place in the galaxy,one can use this important feature to trace the chemical his-tory of the galaxy. The older clusters in the disk have beenshown to have a slightly lower metallicity than the Sun,whereas the ones just now being formed have a slightlyhigher or about the same value of metal abundance. This isone of the key observations indicating that the metallicityof the disk of the galaxy has been secularly changing in

FIGURE 3 Theoretical isochrones for the Hyades color trans-formed to match photometry data; absolute magnitudes based onHipparcos parallax measurements. The models used the Stanieroequation of state (1998). The mixing length parameter (ratio ofconvection mixing length to scale height) is 1.3 in these mod-els with the curves labeled by age (the top-leftmost curve is theyoungest). Notice how the giants and subgiants constrain the ageas well as the main sequence turnoff point. [From the Monthly No-tices of the Royal Astronomical Society, 320, 66–72, 2001, withpermission.]

time. In addition, it appears that for the lifetime of the disk,during which the open clusters have been formed, therehas been little change in the so-called initial mass function.This is the population distribution of the stellar masses.

The actually measured quantity for a cluster is the num-ber of stars in a given range of color and visual magnitude.Stellar atmosphere models are used to determine the trans-formation between color and effective (surface) tempera-ture, and this provides the bolometric (total) luminosity.The stars are then placed on a “theoretician’s” H-R dia-gram (luminosity versus effective temperature) and tracedback to the main sequence. Thereafter, their masses (as-suming no mass loss has substantially altered their abun-dances or masses) can be specified, and the number perunit mass can be determined. This is the initial mass func-tion of the cluster.

The observational picture is still only approximatelyknown, but there is strong evidence that the IMF for clusterstars is the same as that for field stars and that for at leastthe past 1010 yr it has remained stable, with few high-massand many low-mass stars being formed. Simple power lawfits to the distribution show that

(M) dM ∼ M−α dM,

with α being 2.35. There is still considerable uncertaintyin the exponent (±0.5). A more accurate fit for field starsis a log-normal distribution. However, the basic result, thatthe massive stars are considerably rarer than the low-mass

Page 151: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

Star Clusters 719

ones like the Sun, is unchanged by this detail. The detailsof the origin of the mass function are, presently, unknown.Qualitatively, it is attributed to a stochastic fragmentationcascade to some minimum stable mass. It is also likelythat it entails a kind of fragmentation–coagulation phe-nomenon.

The lower mass of star formation is also an importantparameter for stellar evolution theory. The lower end ofthe main sequence, representing the minimum mass atwhich stable nuclear reactions can be initiated in the stellarcore, appears to be at ∼0.1 M. The minimum mass fromfragmentation is of the same order, so the census of thispopulation in clusters is a very important task.

Another check on stellar evolution theory provided byopen clusters is that there are white dwarf stars present inmany of the older clusters. These are extremely compactremnants of stellar evolution, formed from stars of aboutthe mass of the Sun. They are no longer generating lu-minosity by nuclear processes, so their presence indicatestheir cooling time and rate of formation. Again, it is im-portant to be able to observe these in systems of knownage, since then both the ages of the white dwarfs and themasses of the progenitor stars can be specified. There areno pulsars associated with any cluster presently known,nor are clusters generally associated with supernova rem-nants. In general, the massive stars appear to have a con-nection with associations and not the open clusters, andthis may be an additional clue to the formation mechanismfor such systems.

In some clusters, notably the Pleiades (among theyounger group of open clusters), there is a sequence abovethe main sequence, in the lower mass star region of the H-Rdiagram, that may be an indication of a continuing starformation in the cluster. The blue stragglers may also rep-resent such a group. The bounds on the time scale requiredfor star formation to proceed in these systems is thus of theorder of the premain-sequence contraction time for a verylow-mass star to the main-sequence lifetime of a roughly3 M star. This is from about 107 to 108 yr.

D. Rotation in Open Clusters

Studies of stellar rotational velocity in clusters have shownthat there is also a distribution of angular momentumamong young stars, such that the angular momentum Jis larger for the more massive stars. Although at presentwe do not know why this is the case, the study of starsin clusters greatly aids the determination of this impor-tant parameter for star formation theory. The lower massstars are typically slow rotators, owing to loss of angularmomentum during the main-sequence phase through stel-lar winds, but in some clusters there is an excess of rapidrotation on the lower main sequence.

One of the few relations that appears to hold for rotationhas been determined by A. Skumanich. It is that the rateof chromospheric activity appears to decline with age andthat, if this is due to the spin-down of the stars owing toangular momentum loss via a stellar wind, then the rotationfrequency should vary as ∼ t−1/2, where t is the age.For the youngest clusters, like the Pleiades, the low-massstars also display strong stellar coronas, suggesting that theenhanced dynamo activity presumed to be the origin of theX-ray-emitting region is connected with the statisticallyhigher rotation rate of these stars due to their age.

Recent observations of rapidly rotating lower-main-sequence stars in the Pleiades and possibly other youngclusters may be an indication of a long time (in comparisonwith the age of the upper-main-sequence stars) of forma-tion for the cluster members. In associations, this appearsto be short in the subgroups of the systems but in galacticclusters may take as long as 108 yr. Much remains to bedone in relation to this problem.

E. Binary Frequency in Open Clusters

There is still considerable uncertainty about both the fre-quency of occurrence of, and distribution of mass ratio in,binary systems in clusters. The problem is that few long-term surveys of clusters have been undertaken to deter-mine the presence of small-amplitude velocity variationsin main-sequence stars. About half of the stars in the fieldare binaries, and the mass ratio for many of these systemsis nearly unity. (It is still uncertain what the true value is.)There appears to be a dearth of binaries among the Pleiadesstars, but the claimed excess in IC 4665 has been shownto be incorrect. There may be some clusters for which thebinary statistics are unusually high or low compared withthis value, but this is a problem for the future.

F. Star Formation in Clusters

The youngest clusters, such as the Trapezium, σ ori, andNGC 2264, are still imbedded within their parent clouds.Many of the individual stars in these young clusters stillhave dust shells and are contracting toward their initiationof hydrogen core burning. In a few such young clusters,one can see stars evolving both toward (the lower massstars) and away from (the highest mass ones) the main se-quence, thus providing a check on the pre-main-sequencecontraction time and the rate of stellar evolution on themain sequence. There is also some evidence, from thefact that the lower mass stars evolve more slowly through-out their lifetimes, that the star formation in open clustersmay take place over a time scale comparable to the age ofthe cluster. In some clusters, like the Pleiades and NGC752, there is evidence, albeit weak at the moment, that

Page 152: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

720 Star Clusters

there are several epochs of star formation. The spread inthe rotational and luminosity properties of the lower mainsequence is larger than would be expected for a sample ofstars formed simultaneously.

Some clusters display a population of stars that lie abovethe main-sequence turnoff points of the massive stars, theso-called blue stragglers. These may be stars that havebeen the product of mass transfer between the membersof close binary systems, in which one of the stars has beenmoved above the turnoff point due to mass transfer froma companion that is evolving toward the red giant phase.However, there is little evidence for binarity among thesestars. An interesting alternative is that they have somehowbecome completely chemically mixed and are evolving ashomogeneous configurations. The evolution of such stars,rather than being toward the cooler parts of the H-R dia-gram, is almost along the main sequence toward slightlysmaller and hotter configurations as the stars age as a re-sult of the global increase in the mean molecular weightof the gas in the interior. Normally, stars massive enough(∼1.5 M or greater) to burn hydrogen via the CNO cy-cle would have convective, completely mixed cores whosemean composition would be considerably heavier than thestellar envelope (which is radiative). As a result, as thecores contract under the increase in the mass of the con-stituent species, the released gravitational binding energyis capable of lifting the envelope. This causes the star to in-crease, overall, in radius and move toward lower surfacetemperatures (the mechanism responsible for producingred giants). However, the mechanism whereby the mixingis initiated and maintained remains conjectural. The bluestragglers remain one of the important puzzles in the oth-erwise largely coherent picture of cluster evolution. Theymay even be additional evidence for multiple epochs ofstar formation.

G. Abundance Studies

An important use of galactic clusters is the study of theage dependence of the metallicity of the stars in the galac-tic disk. There are several elements for which it is espe-cially important to know the absolute, rather than statisti-cal, ages. One such element is lithium. It is well known thatlithium is destroyed by nuclear processing in solar-typestars. The element is mixed into the core by convectionand totally consumed by proton nuclear processes. Thus,it is a useful constraint on the rate of mixing in stellar en-velopes as well as a valuable potential age discriminator infield stars. Studies of intermediate-age clusters show thatthe time scale for this mixing to occur is on the order of1 Gyr and is also consistent with the change in the rotationrate for solar-type stars due to angular momentum loss viastellar winds.

We have no direct determination of the deuterium abun-dance in cluster stars, but this would also serve as a usefulconstraint on both the rate of consumption of deuteriumvia a mechanism similar to that of lithium destruction andalso on the primordial deuterium abundance.

The abundance of heavy metals appears to vary amongcluster stars, but this is likely due primarily to processesconnected with mixing and diffusion that are intrinsic tothe stellar envelope. As yet, there is no clear measure ofthe intrinsic dispersion in initial abundances among starsin a cluster. The abundances overall follow the trend of thefield stars—the youngest clusters are a bit more metal richthan the Sun, whereas the oldest clusters are considerablymore metal poor.

H. Variable Stars in Clusters

Several classes of pulsating variable stars are observed inopen clusters. The main-sequence population comprisesthe δ Scuti stars, which are ∼ 1.5 M and are late A andearly F stars. These have periods of about an hour andamplitudes that are typically several hundredths of a mag-nitude. They are the same type as the field population, buttheir properties can be best studied, as for all pulsators,when their distances can be independently determined andtheir absolute magnitudes assigned. Perhaps the most im-portant aspect of clusters, however, for the purposes ofboth stellar evolution and cosmology is that a few clustershave δ Cepheid or classical Cepheid variables as mem-bers. These stars, which are about 3–5 M, are evolvedstars that pulsate with periods of several days to severalmonths and have large amplitudes (typically near a magni-tude). They have a regular period–luminosity relation fromwhich their absolute brightnesses can be determined giventheir period of variation, a comparatively easily measuredquantity. They form the anchor of the extragalactic dis-tance scale. It is, in part, the fact that the Cepheids occurin clusters that has led to one of the most interesting ques-tions of stellar evolution theory. When one determinesthe masses of the stars, they typically have evolution-ary masses that are greater than indicated by pulsation-theoretical calculations. Although a number have beenshown to be members of binary systems, thereby alter-ing their placement in the H-R diagram, not all have beenshown to be such. Thus, again, clusters provide one of themost important handles on the absolute properties of thesestars.

Flare stars and T Tau stars are associated with the you-ngest clusters. The flare stars, which are chromospherica-lly active M dwarfs, may be more frequent because ofgreater dynamo activity in the newly formed systems.These stars are rapidly rotating systems and may not yethave spun down to their more sedate rotation rates due to

Page 153: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

Star Clusters 721

breaking by stellar wind-driven angular momentum loss.One serious complication in this observational picture isthat at present there are few surveys of such activity inclusters.

I. Magnetic Fields in Cluster Stars

Additional important information provided by galacticclusters is the occurrence of magnetic fields in main-sequence stars. Only a few clusters have so far been studiedin any detail for direct observation of magnetism, but thereis some indirect information provided by the presence ofAp stars. These are stars with anomalously high (com-pared with the average Population I star and with the Sun)abundances of heavy metals—especially the rare earthsand silicon—which possess strong magnetic fields. Theyoccur in clusters having ages greater than ∼107 yr, sug-gesting that the magnetic fields are present but that thereis some lag time after formation for the development ofthe abundance anomaly. These stars are massive, typicallygreater than 2 M.

For the lower mass stars, the statistics for the pres-ence of chromospheric activity and X-ray emission forma useful tool with which to study the development ofdynamo-generated magnetic fields in stars. The youngestsystems, like the Pleiades, have already well-developedcoronal activity indicators, and it seems that most starsin the youngest clusters on the lower main sequence havesome X-ray emission associated with them. Indeed, inagreement with the fact that the mean rotational velocitydecreases with time, the oldest clusters show weaker dy-namo activity among the lower main sequence stars thando the young clusters. Again, because the ages of thesestars can be determined independently from the H-R dia-gram for the cluster, the study of the aggregate propertiesis far more valuable than the detailed work on field starsfor which such information is only statistically available.There is only weak evidence at presence for the decay ofthe strong magnetic fields of the upper main sequence starswith age, although the time scale for such a phenomenonshould be of the order of 109 yr or fewer due to diffusivelosses. This remains one of the most important areas ofresearch in open cluster populations.

III. ASSOCIATIONS

A. OB Associations

The most massive stars in the galaxy do not appear toform in the tight configurations that open clusters repre-sent. Instead, they form dynamically fragile associationsof several thousand solar masses. Some, like the Sco–Cen association, contain open clusters (NGC 6231). Oth-

ers, like Orion, contain tight multiple-star systems of theTrapezium variety, which contain ∼100 stars but whichin total have masses about like those of very small openclusters. A few show evidence not only of having formedover a considerable period of time, of the order of 107 yr,but also of forming in sequences (e.g., Cep OB3 and OriOB1). Many OB associations, these two groups being pro-totypical, are still associated with their molecular cloudsand show evidence of continuing star formation.

The physics governing the division between cluster andassociation is not well understood although it is clear thata continuous hierarchy exists that connects star formationon different scales. One suggestion is that the formation ofmassive stars stops the continued formation of the lowermain sequence objects and, by altering the thermal anddynamic structure of the environment, brings the system’sstar-forming activity to a halt. A related point is that onlythe most massive molecular clouds may have sufficientmass and stability to allow for the formation of OB asso-ciations, whereas the average cloud may serve as the nurs-ery for the lower mass objects. If this is the case, the OBassociations should show at the same age as the clusters asystematic preference for the upper main sequence stars, atest that has yet to be completed. The problem is that thereare few open clusters known from optical observation tobe of sufficient youth with which to test this idea. Instead,infrared satellite: observations of clusters with IRAS andISO should be used to determine the mass function for thestars forming currently in clouds of different mass.

Another important feature of the associations is thatthey can evaporate. Due to the gravitational interactionsof a bound star in the group with the others in the vicin-ity, there can occasionally be a large enough kick froma close encounter that the star is sent at escape velocityor higher out of the system. The main reason is that theassociations are rather fluffy, from the gravitation pointof view—quite distended even considering the masses ofthe individual stars. The escape of stars from the clustersproduces a contraction of the cluster core, increasing therate of encounters and gravitationally “heating” the starsin the association. The contraction of the core of the asso-ciation therefore increases, with a further increase in therate of escape of the stars. This evaporative dissipation ofthe cluster, first discussed by S. Chandrasekhar, L. Spitzer,and J. Oort in the 1940s and 1950s, is the main mechanismfor the dissolution of the associations.

It is likely that the evaporation time scale for the asso-ciations is always short, of the order of the main-sequencetime for the B stars, and may be important for understand-ing the formation mechanism for these systems. The OriOB1 and Cep OB3 associations have among the best de-termined dynamic ages, and these agree quite well withthe inferred stellar ages from evolutionary models.

Page 154: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

722 Star Clusters

B. R and T Associations

If the open cluster or association is still connected withdust, it is sometimes called an R association, for reflection.Dust has the property of scattering, as well as absorbing,starlight. The presence of B and A stars in an association,in the absence of massive stars, does not provide sufficientUV radition to ionize the environment. A result is thatthe neighborhood dust survives, without the formation ofan emission nebula, which then scatters the starlight andforms an extended filamentary diffuse nebula. These Rassociations are among the youngest clusters in the galaxy,having ages a bit greater than those of the OB associations.

Finally, if the association contains T Tauri stars, whichare pre-main-sequence objects, it is sometimes called a Tassociation. The purpose of separating these groups was tocall attention to the star-forming activity associated withthem. However, in light of the recent observations by ISOthat star formation is present in the molecular clouds con-nected with many or most of the OB associations in thegalaxy, even if it were optically invisible to us, there seemslittle reason to continue this designation. The T Tau starsare well emerged from their parent cloud material, and theindication of ongoing star formation is better provided bythe wealth of point IR sources observed to be associatedwith many of the OB associations in the galaxy.

IV. GLOBULAR CLUSTERS

A. Introduction and Basic Properties

The most striking property of globular clusters is theiroverall optical appearance. They form nearly spherical,compact systems that are enormous aggregates of stars.It is this feature that was first selected as their distin-guishing characteristic. The morphologically distinguish-ing features of individual clusters are the degree of centralconcentration, the so-called Shapley classes (on a scale ofI to XII, depending on decreasing concentration), and theellipticity. The latter, though originally looked to as pro-viding evidence of the rotation of the cluster as a whole,actually appears due to both the tidal interaction with thegalaxy as a whole and the formation of nonaxisymmetricinternal velocity and spatial structures.

The frequently employed model for the overall massdistribution, the so-called King model, assumes that theglobular cluster is an isothermal sphere of stars. This as-sumption means that the velocity dispersion is assumed tobe homogenous and isotropic. The stars behave as if theyare particles in the mutually generated gravitational poten-tial well moving around at uniform temperature. The rateof escape of stars from such a system is seen to be lowerthat the rate of escape from the open clusters because of

the relative compactness of the potential and of the massof the cluster. The central portion of such spheres is calledthe core radius, a measure of the distance over which thesurface brightness of the cluster falls to about half its cen-tral value. The Shapley classes are not well correlated withthis more quantitative measure of concentration, for whichreason the former is preferred as a characterization of theculster morphology.

B. Integrated Properties of the ClusterSystem of the Galaxy and Comparisonwith External Galaxies

An important feature of the globular clusters is that theycan be studied as if they were extremely bright stars and,because of their compactness and brightness, can be ob-served to very large distances. This includes the halos ofexternal galaxies. The cluster system in our galaxy appearsto consist of two components, the halo and disk systems,which are distinguished by their metal abundances andspatial distributions.

For the halo system, the mean integrated absolute vi-sual magnitudes of the disk and halo clusters are about thesame, 〈MV〉 = −6.9, to within ∼20%. For external galax-ies, only a few systems have so far been observed. Theseinclude most of the galaxies in the Local Group (the clus-ter to which the galaxy belongs) and NGC 5128 = Cen A,a nearby active and morphologically extremely peculiargalaxy.

There exists in the Large Magellanic Cloud a popula-tion of young, globularlike value clusters, the so-calledpopulous blue clusters. These appear to be still in a modeof active star formation. Clusters having ages as young as0.01 Gyr are observed. Recent observations of several blueclusters show that they can have ages as great as severalgigayears but, again, that they are not as old as the glac-tic globular clusters. Similarly, these have been observedin the even more metal poor Small Magellanic Cloud.They typically have metallicities at most 10% of the solarvalue, comparable with the most metal-rich galactic clus-ters. This latter point is likely the most important clue totheir origin, since the Magellanic Clouds, being of lowermass than the galaxy, also have an overall lower abundanceof metals.

For M31, the clusters appear to have an excess of blueevolved stars for the same metallicities (determined fromintegrated spectra), but much remains to be done on thepopulations of clusters in general. Their luminosity distri-bution is not markedly different from that of our galaxy.For M33, the mean magnitude is about the same as in ourgalaxy, and the population of the LMC-like blue clusters islower, more in agreement with our experience locally. Thesame appears to be true for the clusters about the active

Page 155: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

Star Clusters 723

peculiar elliptical galxy NGC 5128. The cluster system ofM87, the central giant elliptical galaxy in the Virgo clus-ter, has also shown that many of the clusters have a highermetallicity than those in our galaxy and that there is a sub-stantial population of brighter systems than we possess.There appears to be no radial gradient in the abundances.More detailed information on external galatic systems willhave to await further high-resolution space observations.Strongly interacting starburst galaxies, such as Arp 220,show such compact blue clusters in currently triggered starformation. And the Galactic Center also shows massiveongoing star formation in dense compact clusters analo-gous to R 136 in the LMC H II region 30 Doradus.

C. Metallicities and Space Distribution

The globular clusters are characteristic members of theold population, Population II, of the galaxy. They aredistributed in a halo around the disk, although some arepresent in the disk. Their metallicities range from the low-est values observed in the oldest open clusters to ∼10−3

the solar metallicity (which is ∼ 0.02 by mass). The usualmeans of quoting metallicity for these clusters, as with theopen galatic clusters, is in terms of the Fe/H ratio. This isnormally quoted as a differential measure relative to theSun, as [Fe/H] = log(Fe/H) − log(Fe/H).

An important observational fact, still not well under-stood, is that there are no galatic globular clusters with thecharacteristics of the Population I stars and that they do notnow appear to form in the disk. Their distribution can beseparated into two fairly distinct groups, differentiated onthe basis of metallicity. One is distributed approximatelyspherically around the disk, centered on the galactic cen-ter, and concentrated toward the bulge of the galaxy. Theseare the most metal poor clusters, having values of Z , themetal fractional mass abundance of metals, ranging from10−4 Z to about 10−1 Z.

The metallicity scale for the globular clusters is notas well established as that for the open systems, sincethey are intrinsically fainter and only the more evolvedstars can in fact be individually analyzed. These tend tobe stars on the red giant branch, which may have under-gone internal mixing processes and so may not be repre-sentative of the entire cluster abundances. Furthermore,the metallicity determined from integrated cluster spec-tra, in which the entire system is treated as if it were asingle star, are severely affected by the morphology ofthe cluster H-R diagram. Specifically, the population ofthe blue end of the horizontal branch can cause spuri-ous abundance results when photometric indices are used.The lines from these stars are extremely strong and thestars are far brighter than the main sequence so that theyalso wash out the contribution from these fainter, although

more numerous, stars. However, despite the uncertainty,it appears that even the most metal-rich globular clusters,like 47 Tuc, barely if at all reach the current abundancesof the disk. Dynamical evidence from halo stars points toaccretion of stars from galactic collisions with the Milkyway. The same is indicated by the bimodal globular clus-ter distribution, which separates into a group that is moreconfined toward the Galactic plane, such as ω Cen, whilemost are more broadly distributed and follow the halo massprofile.

Observations of the distribution of the cluster suggestthat the break point in the flattening of the cluster systemoccurs at [Fe/H] = −1 to −1.5 (depending on the metal-licity scale employed). The spatial distribution of the moremetal-rich globulars appears flattened, with a scale height(distance above the Galactic plane) of the order of 0.5 kpc.These are far fewer in number than the more metal poorsystems and appear to have the metallicity of the oldestfield stars in the disk, the so-called old Population I dis-tribution. They too are concentrated toward the galacticcenter.

D. H-R Diagrams and Aggregate Properties

Two important observational features dominate the studyof these clusters. One is the fact that the morphology ofthe H-R diagram appears to depend on the metallicity ofthe stars. The stage of the helium core burning is repre-sented in clusters by a horizontal branch (HB), which liesabout a magnitude or so above the main sequence andstretches across the diagram from red to blue, connectingwith the nearly vertical red giant branch on the cool end.The lower the mean metallicity of the cluster, the bluer thehorizontal branch is. This is reflected in the B/(R + B) ra-tio for the HB stars. Since the presence of RR Lyrae starsis a function of the morphology of the HB, the clusters oflower metallicity have a larger population of these vari-ables. In addition, there appears to be a relation betweenthe HB morphology and distance of the cluster from thegalactic center such that the lower metallicity stars and thebluer HB clusters lie farther from the center. The relationis a weak one, but it nonetheless appears to indicate thatthe objects of higher metallicity lie closer to the galacticdisk.

There is one other metallicity-dependent parameter forglobular clusters. The brightness of the tip of the giantbranch relative to the HB is also a function of the meancluster metallicity, although it may depend as well on theinitial helium abundance. One serious problem in under-standing this is that there is a considerable spread in metalabundance, especially of CNO, among the stars on the gi-ant branches of several clusters. The study is hamperedby the fact that only the brightest stars, in the nearest

Page 156: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

724 Star Clusters

TABLE I Metallicity of Select Globular Clusters

NGC Other name [Fe/H] B/(B + R) log(Z/Z)

104 47 Tuc −1.1 0.0 −0.4

5272 M3 −1.6 0.5 −1.2

5904 M5 −1.1 0.8 −0.8

6205 M13 −1.4 1.0 −1.2

6341 M92 −2.1 1.0 −1.7

7078 M15 −1.8 0.8 −1.4

clusters, can be studied presently. Nonetheless, there isa spread of upward of an order of magnitude in the abun-dance of CN in ω Cen and in 47 Tuc, the two brightest suchclusters.

The integrated spectra of the clusters can also be ob-served. The most important parameter here is S, the dif-ference between the spectral type provided by the hydro-gen and the metallic lines (especially Ca II H and K). Thiscan be derived for both individual stars (e.g., halo high-velocity stars) and for the integrated light of the clusters.It is an excellent indicator of metallicity. The metallicitiesof globular clusters are summarized in Table I.

E. Ages

The globular clusters are the oldest members of the galac-tic system that we can study in detail. Their ages are in-ferred to be about 11 to 13 Gyr, with no large age spreadeven with the large observed metallicity dispersion. It isclear that the epoch of formation of these objects lastedonly for a brief interval, in contrast to the ongoing activ-ity of star formation in the disk of the galaxy. Both theoptically observed color–magnitude diagrams and the UVproperties of the stellar populations of the clusters givesimilar ages. It appears that the disk increase in abun-dance was very rapid indeed, taking no more than a fewgigayears.

F. Luminosity Functions

The mass function for globular clusters is heavily weightedtoward the low-mass stars because of evolutionary effects.The turnoff points for most of the clusters are well below1 M, leading to functions that can be easily comparedwith only the solar neighborhood in the disk of the galaxy.Until the introduction of charge-coupled devices (CCDs),the determination of this population was hampered by thefaintness of the stars; however, recently a few clustershave been studied to lower than 0.3 M. One of the mostthoroughly studied is M13, the Hercules globular cluster.It has been taken to MV = +9.5. The IMF is steeper above0.65 M than for the galactic disk, flattening out for the

lower mass (≤0.5 M) objects compared with the field.The intrinsic width of the main sequence is quite narrow,of the same order as observed for the most populous openclusters, less than 0.1 magnitude, and there is no evidencefor a sizable population of equal-mass binary stars. Thishas been seen in a number of open clusters as well. The lowspread, even in light of the rotation of stars (see discussionfor the open galactic clusters), is in part due to the extremeage of the systems (these stars would have slowed in theirrotation by now) and the fact that lower mainsequencestars are intrinsically slow rotators anyway. There is alsono evidence for a second epoch of star formation in theclusters, which would agree with the narrow spread in theirvarious evolved members across the H-R diagram.

G. Initial Mass Function

The globular clusters now consist of only lowmass stars,but this is likely due to their epoch of observation andformation. If they were formed in the early stages of theevolution of the galaxy and all of the stars within themwere born at the same time, then we would not see onlystars lower in mass than the sun due to stellar evolution.We therefore cannot use them as direct confirmation ofthe stability of the IMF with time, since all of the massivestars have disappeared.

H. Variable Stars

The primary variable stars in these clusters are the RRLyrae type, which have periods of about 0.3–0.6 day. Orig-inally called cluster variables, they have been extensivelycataloged in the past several decades. The light curves fallinto two types: the ab, which have sharp rising portionsand slow declines, and the c type, which is more nearly si-nusoidal. The ab type tend to be longer periods and havelarger amplitude. First recognized by Bailly in the late19th century, they are apportioned differently in differentglobular clusters. This is the Oosterhoff dichotomy. Thebluer the horizontal branch, the larger is the ratio of cto ab. There appears to be some single parameter, as yetunknown, that controls both the population of the HB ingeneral and the population of instability strip in particular.Both metallicity and initial helium composition have beenimplicated. The RR Lyr stars have nearly the same abso-lute magnitude, about +0.7, and thus allow the distancesto the clusters to be determined independent of parallaxmeasurements. Since the light of the cluster is dominatedby the HB and the red giants, these colors and syntheticpopulation models can be used to determine the distancesto galaxies in which the clusters can be located. This isyet another use of the globulars as distance indicators onthe cosmological scale, one that will be increasingly im-portant with the advent of the HST.

Page 157: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

Star Clusters 725

I. X-Ray Emission

About 10% of the known globular clusters in the galaxy areX-ray sources. Not all of them, however, show the burstingactivity with which the class is usually identified. The burstsources are due to accretion onto a neutron star of matterfrom a low-mass companion, which may have formed abinary by means of capture in the cluster’s lifetime. Thereis no evidence from X-ray data for the presence of diffusegas in globular clusters. In every cluster observed withROSAT and Chandra the XR emission has been resolvedinto stellar sources that are cataclysmic binaries.

J. Binary Stars and X-Ray Sources

Binary stars have been directly detected in globular clus-ters by light variations and direct radial velocity measure-ments. These are mainly WUMa-type common envelopesystems on the main sequence, but novae and other cata-clysmics are found. Millisecond pulsars, that are producedby spinup of binary neutron stars, have also been detected,particularly in 47 Tuc.

Since the formation of neutron-star-containing binarysystems involves supernova explosions, it is probable thatthey are formed from the remnants of low-mass stars,which have been pushed past the Chandrasekhar limit-ing mass for stable white dwarfs rather than from recentlyformed massive objects. The presence of such stars in theclusters is also strong support for the mechanism of neu-tron star binary formation for SN type I in the galaxy.

Simulations argue, however, that such systems shouldbe quite rare in globulars. The lifetime for a binary againsttidal disruption from a close encounter with another staris so short that there should be no longer period (days orweeks) separations permitted in these systems. The bestmodel for the formation of any close binaries still ap-pears to be tidal capture of the companion. Blue stragglerssupport this. They are likely merger remnants resultingfrom capture and subsequent tidal dissipation of angularmomentum.

K. Diffuse Gas and Star Formation

There is no compelling evidence for diffuse interstellargas in globular clusters. Sensitive searches have failed toturn up extended emission, and there are no known in-stances of planetary nebulae in any cluster, although in atleast one case there is a chance superposition of a fore-ground nebula with M15. So there is again strong supportfor the contention that such clusters cannot and do notundergo star formation in the present epoch. Finally, theevidence for blue stars in the cores of these clusters comesfrom the HB objects and not from new massive stars. The

clusters contain red giants that should have strong stellarwinds, of the order of 10−7 M yr−1. However, the fact thatthe material does not accumulate within the cluster alsosuggests strongly that it is swept out on time scales shortcompared with the cluster lifetime. Such a mechanism asthe ram pressure-induced stripping of the clusters as theypass through the interstellar medium in the plane of thegalaxy has been successful in explaining the absence ofthis extended material. Material may also be blown out bysupernovas or removed via a galactic shock due to the tidalpotential of the plane. The gravitational binding energy ofglobulars is such that if the gas is heated by supernovasand other hydrodynamic heating sources, it will simply beblown out of the cores of the clusters and lost, unlike clus-ters of galaxies in which the matter can be retained to muchhigher temperatures. Again, considerable work remains tobe done on this important problem in the chemical historyof the galaxy.

V. CONCLUDING REMARKS

At present, from IRAS and ISO observations and relatedwork on the chemical evolution of the galaxy, it is cer-tain that stars arise in clouds of considerable mass—fromhundreds to millions of solar masses. Very likely, they alsoarise in association with other stars, perhaps by stimulatedprocesses such as supernova-induced collapse of clouds orspontaneously. Clusters hold the most important key to anunderstanding of the formation of stars and ultimately ofthe chemical input for all of subsequent galactic history.The question of the evolution of the mass function forstar formation is best answered with cluster data, since wehave the additional certainty that the stars are essentiallysimultaneously formed (on the time scale of the galacticlifetime). This has direct implications in cosmology, theinterpretation of the spectra of distant galaxies at earlierepochs in their histories.

The comparison of galactic and extragalactic clustersand associations is likely to yield an important test ofstellar evolution: the dependence of stellar populationproperties on the environment. This will have to awaithigh-resolution observations from the HST and relatedfuture large telescope projects. CCD imaging has greatlyimproved knowledge of the lower main sequence of oldgalactic and globular clusters, and it is likely that inthe next few years age determinations will be routinelyperformed for globulars by fitting main-sequence turnoffpoints. The helium abundance determination for evolvedclusters can also be greatly improved by such obser-vations, an important cosmological parameter. Finally,the observation of spectra for individual cluster stars inthe lower main sequence will also improve as telescope

Page 158: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT Final Pages

Encyclopedia of Physical Science and Technology EN015H-724 August 2, 2001 15:59

726 Star Clusters

imaging technology and solid-state detectors are im-proved. The revision of the metallicity scale for stellarpopulations and of the understanding of the origin ofH-R diagram morphology for globular and open clustersremains an intriguing prospect for the next decade.

SEE ALSO THE FOLLOWING ARTICLES

BINARY STARS • GALACTIC STRUCTURE AND EVOLUTION

• NEUTRON STARS • STARS, MASSIVE • STARS, VARIABLE

• STELLAR SPECTROSCOPY • STELLAR STRUCTURE AND

EVOLUTION • SUPERNOVAE • X-RAY ASTRONOMY

BIBLIOGRAPHY

Bailyn, C. D. (1995). “Blue stragglers and other stellar anomalies: Im-plications for the dynamics of globular clusters,” Annu. Rev. Astron.Astrophys. 33, 133.

Blaauw, A. (1964). The O associations in the solar neighborhood. Annu.Rev. Astron. Astrophys. 2, 213.

Castellani, V., Degl’ Innocenti, S., and Prada Moroni, P. G. (2001).“Stellar models and the Hyades: The Hipparcos test,” MNRAS 320,66.

de Zeeuw, P. T. et al. (1999). “A Hipparcos census of the nearby OBassociations,” Astron. J. 117, 354.

Freeman, K., and Norris, J. (1981). Physical properties of globular clus-ters. Annu. Rev. Astron. Astrophys. 19, 319.

Goudis, C. (1982). “The Orion Complex: A Case Study of InterstellarMatter,” Reidel, Dordrecht, Holland.

Harris, G. H. (1970). “Atlas of Galactic Open Cluster Color-MagnitudeDiagrams, Vol. 4,” David Dunlap Observatory.

Hut, P. et al. (1992). “Binaries in globular clusters,” PASP 104, 981.James, K., ed. (1991). “The Formation and Evolution of Star Clusters,”

Astronomic Society of the Pacific, San Francisco.Lebreton, Y. (2000). “Stellar structure and evolution: Deductions from

Hipparcos,” Annu. Rev. Astron. Astrophys. 38, 35.Lewin, W. H. G. (1980). X-ray burst sources in globular clusters and

the galactic bulge. In “Globular Clusters” (D. Hanes and B. Madore,eds.), p. 315, Cambridge Univ. Press, New York.

Philip, A. G. D., and Hayes, D. S. (1981). “Physical Parameters of Glob-ular Clusters: IAU Colloquium 68,” Davis, New York.

Pilachowski, C. A., Sneden, C., and Wallerstein, G. (1983). The chemicalcomposition of stars in globular clusters. Astrophys. J. Suppl. 52, 214.

Sandage, A., and Roques, P. (1984). Main sequence photometry and theage of the metal-rich globular cluster NGC 6171. Astrophys. J. 89,1166.

Vanden Berg, D. A., Bolte, M., and Stetson, P. B, (1996). “The age ofthe galaclic globular cluster system,” Annu. Rev. Astron. Astrophys.34, 461.

Zinn, R. (1985). The globular cluster system of the galaxy. IV. The haloand disk subsystems. Astrophys. J. 293, 424.

Page 159: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

Stars, MassiveSteven N. ShoreIndiana University South Bend

I. IntroductionII. Mass Loss from StarsIII. Massive Stellar Evolution and NucleosynthesisIV. Effects of Massive Stars on the GalaxyV. The Luminous Blue Variables

VI. Coda

GLOSSARY

CNO cycle Thermonuclear fusion process that convertshydrogen into helium by reactions with carbon, nitro-gen, and oxygen.

Eddington limit Greatest luminosity that a self-gravitating, luminous object can achieve while stilldynamically stable. It is the luminosity at which radi-ation pressure can balance gravitational acceleration.

Eta (η) carinae Prototype luminous blue variable. Oneof the intrinsically brightest stars in the Galaxy, knownas an eruptive optical variable. Surrounded by a densenebula formed from previous eruptions, it has oftenbeen cited as a prospective supernova candidate.

Luminous blue variable (LBV) Massive supergiant thatundergoes aperiodic brightness increases of over sev-eral magnitudes, with a strong stellar wind; also calledS Doradus variable and Hubble–Sandage variable.

P Cygni lines First observed in the luminous blue vari-able P Cygni. These lines have blue-shifted absorp-tion and strong redward-shifted emission. Absorptionis formed in the expanding part of the flow in the lineof sight to the photosphere while the emission is from

the extended envelope. These profiles are indicationsof mass loss. The absorption extends to the terminal ve-locity, which is about the same as the escape velocityfrom the star.

Units Solar luminosity (L = 4 × 1033 erg sec−1); so-lar mass (M = 2 × 1033 g); solar radius (R = 7 ×1010 cm); M yr−1 = 6.3 × 1025 g sec−1.

Wolf–Rayet star Highly evolved stars with extremelystrong stellar winds and hydrogen-deficient envelopes.The surface abundances of He, C, N, and O have beenseverely altered from their normal (solar) values be-cause of mass loss. Observed as members of binarysystems, their companions may be either normal orcompact stars.

THIS REVIEW of recent work discussed the evolutionof stars with masses greater than 10 M. Such starsare the major source of mechanical and radiative energyfor the Galaxy and intrinsically the rarest stellar objects.They are the progenitors of type II supernovae, Wolf–Rayet stars, and the luminous blue variables. They possessstrong stellar winds and provide an important abundance

727

Page 160: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

728 Stars, Massive

input of heavy metals to the Galaxy through this massoutflow.

I. INTRODUCTION

The study of the most massive stars is an astrophysicalproblem belonging to this century, which was the first torecognize their existence.

The measurement of stellar masses began with the dis-covery of eclipsing binary stars and the determination ofstellar spectral types. It was realized by the middle of thelast century that stars come in a wide range of masses,ranging from about 10% of the mass of the Sun to morethan 50 M. The most massive star for which a direct de-termination of the mass is possible, Plaskett’s star, consistsof two main sequence objects with about 50 M. Throughthe observation of detached binaries, it has been possibleto obtain a fair understanding of the mass luminosity rela-tion for the main sequence, the first stage of stable stellarevolution corresponding to core hydrogen burning.

Extended stellar atmospheres were first observed in thestrong emission line Wolf–Rayet stars, in the Be stars, andin P Cygni at the end of the 19th century. The first explana-tions in terms of mass outflow were proposed in the 1930sby C. Beals. However, it was not until the late 1960s thatrocket ultraviolet spectroscopy showed that all massivestars lose mass via stellar winds. Even stars that do notshow strong optical emission lines show the signature ofoutflows in the ultraviolet on the profiles of the C IV, Si IV,and N V lines. The inclusion of this information in stellarevolution models, beginning in the 1970s, dramaticallychanged the picture of the life cycle of a massive star.

In 1953, Hubble and Sandage discovered the class ofvariables that now bears their names. These stars are thebrightest stars in a galaxy and are best observed in thenearby spirals M 31 and M 33, both members of the LocalGroup. These stars typically have luminosities in excessof 105 L and vary irregularly on time scales of decades,which makes them exceedingly difficult to follow. Be-cause they are generally faint optically, they have beendifficult to study until the advent of solid-state detectors,which permit their spectra to be measured with high pre-cision. As early as 1970, models of massive supergiantsshowed that radiation pressure produced vibrational in-stabilities that might lead to mass ejection. The extremelyrapid increase in computer capabilities has produced sig-nificant advances in these calculations.

The theory of radiative stellar winds has also developedrapidly. Radiative driving of mass outflows was first sug-gested in the 1920s by E. Milue, although it remained un-wed to hydrodynamics until after the development of stel-lar wind theory by E. Parker and others in the 1960s. The

first analytic theories were developed in the early 1970sby L. Lucy, J. Castor, and their collaborators. These havebeen substantially extended through the use of numericalmodels, which include upward of hundreds of thousands ofatomic transitions in the calculation of radiation pressure.Time-dependent calculations are now becoming feasiblefor massive stellar envelopes.

The explosion of SN 1987A in the Large MagellanicCloud has suddenly and dramatically brought this fieldinto a sharper focus. Information gained from the multi-wavelength light curves, the fact that the star was a bluesupergiant at the time of the explosion, and the fact thatthe nucleosynthesis can actually be studied in the ejectaall mean that models can be better constrained than everbefore. More than any single event in the history of thestudy of the stellar evolution, this supernova has rede-fined the questions asked by theorists about the origin anddevelopment of massive stars.

II. MASS LOSS FROM STARS

Stellar masses are not fixed, although they change forsingle stars only on long periods. Low-mass stars, like theSun, lose material through modest winds, about 10−14 Myr−1 while in the hydrogen core burning (main sequence)phase. Turbulence and magnetic field dissipation appearresponsible for such mass outflows. Red giants supportstronger outflows, likely by the same mechanism for thelower mass stars. Such a star, losing mass at the thermalrate, will have a wind of about 10−7 M yr−1. Such windsare common for convective stars in the red giant phase.

For luminous stars, the intensity of the envelope ra-diation field may be sufficiently great that radiation ex-erts a pressure on the background gas. The reason forthis is physically simple. Photons carry not only energyE = hν but also momentum, p = hν/c, where h and care Planck’s constant and the speed of light, respectively,and ν is the frequency. If a photon is scattered, its energymay be unchanged—except in the Compton effect or ifthere is resonant redistribution of the energy in the linetransition responsible for the scattering—but the momen-tum vector changes direction. Any absorption or scatteringof a photon transfer momentum from the photons to thematter.

To calculate the effective radiative acceleration, it isnecessary to take into account the effects of absorption andscattering by stellar line and continuum transitions. Theavailable lines are quite numerous and densely distributedin the ultraviolet. In this wavelength region, the radiationfield for a massive hot star is intense, and if there aremany lines near the peak of the photospheric distributionof intensity, the radiative acceleration will become quite

Page 161: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

Stars, Massive 729

large. The precise definition of the radiative accelerationis therefore the following:

grad = 1

P

d

drPrad = π

c

∑i

∫κνiFν dν, (1)

where the sum is taken over all ionic species in the stellarenvelope at a given depth with fractional abundance Xi .The density of the envelope is ρ, κν is the opacity, andFν is the monochromatic flux. This factor may be quitelarge compared with the predicted acceleration because ofelectron scattering, since the opacity is wavelength depen-dent. The most effective acceleration occurs when the peakof the photospheric flux occurs in a wavelength regime ofvery large line opacity. In such cases, the radiative accel-eration may dominate over gravity. The envelope is thenunbound and flows away from the star.

If electron scattering, also called Thomson scattering,is the only opacity source, the opacity is said to be gray,that is, independent of frequency. The radiation pressurefelt by the scatterers in ensemble is expressed as follows:

d

drPrad = Lκρ

4cr2. (2)

Here L is the luminosity and κ is the mean opacity pergram of material.

There is a critical luminosity at which the radiationpressure precisely balances the gravitational acceleration.Called the Eddington luminosity, this is given by thefollowing:

LEdd = 4πcGM

κ≈ 3 × 104

(M

M

)L (3)

for electron scattering (κ = 0.4 cm2 g−1). Above this lu-minosity, the effective gravitational acceleration at thesurface of the body is negative, so that the matter isactively accelerated outward by the radiation. The fac-tor = L/LEdd measures the departure of the gravita-tional acceleration, g, from that of the mass alone, thusgeff = g(1 − ). For a given mass, the scattering due toelectrons provides an upper limit on the stable mass, andany increase in the opacity will produce a greater tendencyof the stellar material to be lost by the star.

The equation of motion for a steady-state, sphericallysymmetric stellar wind is given by the following:

vdv

dr= − 1

ρ

dPgas

dr− GM

r2− dPrad

dr

= − 1

ρ

dPgas

dr− geff. (4)

If the gas is isothermal, the equation of state is Pgas = ρa2s ,

where as is the sound speed. The radiative acceleration isa function of depth in the atmosphere, so the flow acceler-

ates outward. If at some point in the envelope the effectivegravity is zero, the wind becomes supersonic. It is im-possible for a static solution to be achieved once the flowspeeds exceed the sound speed, so the matter has no alter-native but to leave the star and reach a terminal speed atsome distance that is large—generally a few times largerthan the escape velocity. Such an outflow is called a ra-diative stellar wind because the driving force is radiationpressure.

In steady state, the condition of the continuity equationis that ∇ ·ρv = 0. The mass loss rate for spherical (radial)flow is given by the following:

M = 4πr2ρv. (5)

The mass loss rate is fixed by the conditions at the sonicpoint, near the stellar photosphere. For low-mass, main se-quence stars, radiation pressure plays no role in the struc-ture of the envelope and thus is incapable of producingany serious mass loss. From the cool stars, such as redgiants, despite their luminosities, the radiation field doesnot produce a large acceleration. This is because of thewavelength distribution of the absorption lines with re-spect to the continuum and the fact that the average mo-mentum carried by the photons is quite low. Only if thesurface gravity is very low, for example, for supergiantstars, which are very evolved, will significant radiativedriving take place. It is generally assumed that dust playsthe primary role in coupling the radiation to matter inthe atmosphere. However, for the hot massive stars, theluminosity is carried by the ultraviolet photons, which en-counter large line opacities, and the accelerations may bevery large. In fact, predicted mass loss rates range from10−10 M yr−1 for 5 M stars to about 10−5 M yr−1 for40 or 50 M stars. For these massive stars, the effect ofthe mass loss may be quite severe. This property of themost massive stars creates most of the difficulty in presentevolutionary calculations.

The winds for massive stars reach terminal velocitiesthat are of order 3000 to 5000 km sec−1. When this mat-ter mixes with the interstellar medium, it dissipates thetransported energy and momentum. In the lifetime of thestar, the combined effects of radiation and wind amountto nearly as much energy being put into the medium as ina supernova event. We return to this point below.

III. MASSIVE STELLAR EVOLUTIONAND NUCLEOSYNTHESIS

A. Formation

The most massive stars in galaxies appear to form ingroups. This is virtually all we can say about the origin

Page 162: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

730 Stars, Massive

of these objects at present. The associations in whichthey form range in mass from several hundred to severalthousand solar masses, often heavily weighted towardthe upper mass end. In general, stars form in a varietyof clusters and associations, with a mass function that isskewed toward the lower mass end. This distribution inmass, known as the initial mass function (IMF), may be auniversal property of star formation, sampled differentlyin different regions of star formation, but this is stillcontroversial.

The distribution of the very massive stars in other gala-xies, such as the Large Magellanic Clouds (LMC), makesit clear that these stars often form in tightly bound, com-pact subgroups within more extensive complexes. Oneof the best examples is 30 Doradus, an enormous H IIcomplex near the bar of the LMC. The triggering mecha-nism is not well understood, but one point is certain: Oncea cloud begins to form massive stars, it is permanentlychanged. Such stars may induce star formation through-out the cloud, leading to its disruption on short (106 yr)time scales.

B. Main Sequence Stages

During most of their lifetimes, massive stars are dominatedby their stellar winds. Their masses are never constant, butare constantly decreasing as their luminosities change. Onthe main sequence they burn hydrogen into helium via theCNO process, during which phase they possess massiveconvective cores and radiative envelopes. The stellar lu-minosity of the mast massive main sequence stars is veryclose to the Eddington limit for most of their lives. Maeder(1987) gives the following mass–luminosity relations forthe massive stars on the main sequence:

logL

L= 2.55 log

M

M+ 1.27 (6)

for 15 ≤ M/M ≤ 40

and

logL

L= 1.84 log

M

M+ 2.42 (7)

for 40 ≤ M/M ≤ 120. It is possible that the most massivestars, above about 60 M, do not have a real main sequencestage in the sense of being globally static. Their surfacesmay be merely opaque layers in their wind. As a result, theatmosphere is likely dynamic and more extended than astationary layer. This, if correct, alters the age determina-tion from fitting the main sequence isochrones to clustersthat are computed with conventional assumptions.

Stars more massive than about 25 M lose as muchas, or more than, 10% of their mass while still in thehydrogen core burning phase of evolution. There is a

rapid increase in this rate for the stars more massive than25 M, reaching upward of 30% for 80 M stars. The mainsequence lifetime for these stars is more difficult to assessthan for lower mass objects. Because their mass contin-ually decreases even during the hydrogen core burningstage of evolution, their lifetimes are longer than wouldbe anticipated from constant-mass models:

tMS = 1.3 × 108

(M

M

)−0.86

yr (8)

for stars with masses 15 ≤ M/M ≤ 60. The greatest un-certainty in these figures is the rate of mass loss whilethe star is on the main sequence. This depends criticallyon the prescription used for calculating the relation be-tween the stellar luminosity and the mass loss in the stellarwind.

Winds are observed for luminous stars. The question isnot whether they occur, but what drives them and how thedriving depends on the stellar properties, such as mass, ra-dius, and luminosity. If driven by turbulence or some otherform of direct mechanical input, the dependence on the lu-minosity is through the surface temperature and radius ofthe star. On the other hand, for radiatively driven winds, thedriving depends on the envelope opacity and consequentlythe metal abundances. Presently, there is considerable un-certainty associated with these choices and the reader isreferred to Maeder and Chiosi (1994) for a tabulated list-ing of currently proposed theoretical and empirical lawsfor M as a function of stellar parameters.

The main sequence lifetime is also affected by coreconvection. There are still uncertainties in the physics ofstellar convective energy transport, particularly concern-ing the overshooting of convective cells at the boundary ofthe CNO-processed core and overlying radiative envelope.Mixing of envelope material, which is hydrogen rich, intothe helium core extends the lifetime of the star in the hydro-gen core burning stage and increases the core mass relativeto the envelope. It also extends the period of subsequenthelium core burning and thus alters the entire subsequentlifetime of the star. While population statistics of stellardistributions near the main sequence on the Hertzsprung–Russell diagram (HRD) seem to require some mixing dur-ing this early stage of evolution, the mechanism for suchmixing is an open problem.

As helium builds in the core, the star begins to evolvetoward the red giant branch, with increasing luminosityand larger radius. Here again, the effects of mass loss playan important role. As the surface gravity drops and theluminous output of the nuclear source increases, the stellarwind becomes more powerful. The combined effect ofcore contraction and increased mass loss produces nearlyconstant luminosity evolution across the HRD. The star’strajectory is then cutting across isochrones computed for

Page 163: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

Stars, Massive 731

constant mass models with more than 30% of the masslost by the time of carbon core ignition. The core of amassive star evolves almost independently from the massof the envelope, so the nucleosynthetic details are lesssensitive to mass loss prescription than are the surfaceproperties.

C. Post-Main-Sequence Evolution:Helium to Iron Core Phases

Helium ignites in the stellar core under nondegenerateconditions while the star is still near the main sequence.In rapid succession, He is converted to oxygen via carbonthrough the successive capture of He nuclei. Some of theO is converted to neon, 16O(α, γ )20Ne. Most of the car-bon and oxygen converts to magnesium and then siliconvia oxygen burning. Finally, the core builds to sulfur andthen to iron. Throughout this sequence, core homogeneityis maintained by convection. Table I, gives some repre-sentative numbers for the relevant time scales and mainsequence luminosities.

During these successive core nuclear burning stages,the inner envelope is also processed. The energy releasedthrough the core reactions increases the temperature at thebase of the envelope, and an onion skin effect is generated,with progressive motion of the nuclear processed materialoutward to a larger mass fraction of the star. The entireprocess is almost independent of mass above about 20 M,and at the stage at which the Fe core is produced, abouthalf of the mass has been converted from hydrogen intoheavier elements.

D. Surface Abundance Changeson Evolutionary Time Scales

It would be incorrect, however, to think that the core evo-lution is without effect on the surface properties of thestar. As the more energetic, and short-lived, phases of nu-

TABLE I Evolutionary Time Scales for 15 to 120 MStarsa,b

M / M log (L / L) ττH core ττHe core

15 4.25 11.58 2.34

20 4.60 8.46 1.89

25 4.85 6.62 1.22

40 5.34 4.53 0.86

60 5.70 3.71 0.71

85 5.98 3.25 0.76

120 6.23 2.81 0.84

a From Maeder, A. (1987). Astron. Astrophys. 173, 247.b Times given in years ×106.

clear processing are reached, the luminosity of the starincreases, thereby powering progressively more intensemass loss. This strips off the outer layers of the star andappears to bring nuclear-processed matter to the stellarsurface. The precise mechanism for this is presently un-known but is assumed to be caused by turbulent mixing.An important point to note here is that regardless of thedetails, CNO-processed material is observed to appear atthe photosphere in many evolved massive stars. There isstrong support from observations for the idea that at leastsome effect of the mass loss can be detected at the sur-face of the star long before it has stripped down to thezones that have actually been burned deep in the interior.It is also likely that an attendent dynamical mixing pro-cess, perhaps powered by shear flows in the core boundarylayer, occurs.

Another very important aspect of the evolution is that,as the mass loss rate increases with decreasing surfacegravity, the envelope begins to strip off the progressivelyprocessed layers, and eventually the redward trend of mo-tion of the star on the HRD is halted and reversed. Starsmore massive than about 50 M are predicted never to en-ter the red giant phase at all. Instead, they turn around atsurface temperatures of about 8000 K and become Wolf–Rayet stars.

E. The Wolf–Rayet Stars

The Wolf–Rayet, or WR, stars come in three varieties, de-pending on the dominant species in their optical spectra.Called WN, WC, or WO for nitrogen, carbon, or oxygen,they form a sequence of both luminosity and temperaturein the hot, high-luminosity part of the HRD. The WR starsubclasses are further subdivided into temperature classeson the basis of line strength and excitation. For instance,WN7 stars are cooler than WN3 stars. The Of stars, whichshow weak emission in optical nitrogen and ionized he-lium lines, overlap with the WN7/8 stars and may form acontinuous sequence.

Because some WR stars are binaries, we have some ideaof their masses. Their masses range from as low as 7 M(for HD 94546) to upward of 50 M (for HDE 311884).For a few eclipsing systems, such as CQ Cephei and GPCephei, the masses are better determined and lie between10 and 20 M. On the basis of their luminosities, onlylower limits would be possible by assuming mass lossat the Eddington rate. The WN stars often have normal,main sequence, massive companions. Since the WR isoften the lower mass star in the system, it is likely that itstarted out as the most massive and evolved more rapidly.The eclipsing binary V444 Cygni is an excellent exampleof such a system: its WN star is about 10 M while itscompanion is an O6 III star of about 30 M. The WC stars

Page 164: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

732 Stars, Massive

are also observed in binaries, but here the companion isfrequently a compact object, most likely a neutron star.A representative star in this class is HD 50896. The WOstars, a rare group, are the most luminous and hottest ofthe WR stars. They show extreme stages of ionization intheir optical spectra, such as O VI, and are likely the mostdenuded of the sequence.

It appears that the Wolf–Rayet stars form the sequenceof stripping in which the nitrogen enhancement is the leastextreme, with the carbon and oxygen revealing progres-sively deeper layers of nuclear-processed material. Themechanism for this process is not well understood, but thebinarity may be a clue. In the current scenario, WR starsare produced in massive close binaries by tidal interactionlimiting the maximum radius to which either componentcan expand.

As evolution drives a star toward a larger radius in aclose binary system, the mass cannot remain bound tothe star—independent of the presence of a stellar wind—beyond a critical radius, called the Roche limit. This radiusdepends on the period, total mass, and mass ratio of thecomponents of the binary. On further expansion, once thestar reaches this surface, some of the stellar envelope islost from the system as a whole and some is accreted ontothe companion. The loss of mass and angular momentumfrom the system drives the stars closer together and accel-erates the stripping process. In systems consisting of twomassive stars, wind collisions produce X-rays that havebeen observed to show orbital modulation. The mass-loserstrips farther down into its interior than would have beenpossible through radiative wind action alone, and a closebinary with a WR component is produced. Many detailsstill remain to be understood about this model before itsapplication to the origin of the Wolf–Rayet stars can bedeclared successful.

The WR stars are often surrounded by thin shell-likenebulae. These ring nebulae, formed by the action of thestellar wind on the interstellar medium, are known asbubbles. They are less dusty, and more energetic, thanthe winds from the luminous blue variables (LBVs) (seeSection V). It is likely that these shells structure the envi-ronment around the core into which the supernova ejectaplow during the explosion of the remnant of the WRstage.

F. The Final State Question: What Doesa Presupernova Star Look Like?

Is a Wolf–Rayet star the precursor to a supernova? Thisis still a controversial matter. It appears that the under-luminous historical supernova, Cas A, which explodedapparently without notice in the 17th century, was oneof these stars. It was probably a WO on the basis of the

observed abundances in the optical filaments and the lackof hydrogen in the ejecta. While exceedingly energetic,the shocks from WO stars do not have much overlyingmatter as they transit the stellar envelope, so the ejectedmaterial cools rapidly without producing an opticallybright blast wave. The normal Type II supernova arisesfrom a less evolved stage of the star, for instance a red orblue supergiant. In these stars, the envelopes are massiveand remain hot and luminous for much longer, nearly ayear, before they become optically thin and cool rapidly.The long plateau observed in the normal SN II light curveis generally ascribed to this envelope structure. Thus, thestage that precedes a supernova is set up by the rate of stel-lar mass loss, so there is still considerable work to be doneon the driving mechanisms for that phenomenon beforedefinitive answers can be given to the questions concern-ing the presupernova stages of evolution. At least one typeof γ -ray burst may result from hypernovae, in which up to1052 erg is released by the shock propagating through theWR wind.

The precise appearance of the stellar surface at the stageof Fe core formation is still a hotly debated question. Be-cause of the role played by mass loss in the final stages ofstellar evolution, the star may end up either as a blue or redstar at the end of its life. The Wolf–Rayet stars are candi-dates for the final, presupernova state. However, it shouldbe remembered that the precursor of SN1987A in the LMCwas a B3 supergiant, and model calculations show that fora 15 to 25 M star the iron core forms while the star is ared supergiant. Clearly, the end stages of massive stars area fruitful area for future work.

The formation of successively heavier nuclei by fusioncannot continue past the formation of Fe. The bindingenergy of this nucleus is the largest possible for a sta-ble isotope, so that subsequent processing can only leadto destruction of the nucleus. This involves endoergic re-actions, robbing the star of its gravitational energy andproducing the rapid disintegration of the nuclei. Coolingproceeds via neutrino emission, which is rapid enough thatthe star cannot quasistatically adjust to the energy lossesand begins to dynamically contract. The formation of Feis, therefore, the beginning of the end for a massive star.After this stage, only dynamical collapse and the conse-quent supernova explosion are possible.

IV. EFFECTS OF MASSIVE STARSON THE GALAXY

The energy input from stars occurs in two forms: (1) ra-diatively through the emission of light from their photo-spheres and (2) mechanically through the action of stellarwinds and supernovae. During the course of its life, a star

Page 165: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

Stars, Massive 733

of about 50 M puts in roughly equal amounts of energy tothe interstellar medium from winds, radiation, and its finaldemise in a supernova explosion. The principal differencebetween these modes is the time scale.

A. Winds and H II Regions

The winds from the massive stars account for the forma-tion of many of the expanding rings observed around OBassociations in the Galaxy. With typical mass loss rates of10−5 M yr−1 and wind speeds of about 3 × 103 km sec−1,the average OB star puts some 3 × 1037 erg sec−1 into theinterstellar medium, about 104 L. Over its lifetime, thisamounts to about 1051 erg. The shell thus generated bythe expansion of the wind compresses the interstellar gas,sweeping it up and slowing down in the process, In theWolf–Ravet stage, this momentum input can increase byas much as an order of magnitude.

The hottest, most massive stars ionize the medium sur-rounding them, forming so-called H II regions. These re-gions of ionized hydrogen (and other elements) are ini-tially static, with equilibrium radii determined by the rateof input of Lyman continuum photons with energies above13.6 eV. The radius of such a region, first determined byStromgren (hence called a Stromgren sphere) is given bythe following:

n2eαR3

H II = NLyC = n0κ

∫ ∞

νLyC

hνdν, (9)

where α is the recombination rate, ne is the electron den-sity, and NLyC is the number of Lyman continuum photons.Within this radius, every atom will be ionized, and RH II

is limited only by the exhaustion of the Lyman continuumphotons. The initial expansion varies as RH II ∼ t1/3 in theStromgren sphere stage.

Such a structure cannot remain in equilibrium for long,and with the continuing input of radiative and mechanicalenergy from the exciting stars, the boundary of the H IIregion begins to expand. This is augmented by the input ofthe winds from the massive OB stars. The outer portion ofthe H II region quickly becomes supersonic and developsa compressional zone. As matter is swept up by the shock,it is first compressed and then ionized, and the hot con-fined gas of the H II region begins to rarify. If driven byradiation pressure or mass loss from the stars, the radius ofthe shell expands in time as RH II ∼ t1/2, while if drivenby an impulsive event, the initially adiabatic stage ex-pands as Rshell ∼ t2/5, and the later momentum-conserving(snowplow) stage expands as Rshell ∼ t1/4. Material fromthe stellar wind mixes with the interstellar gas during thisphase. Some of the matter, if it comes from Wolf–Rayetstars, will have been processed and it is likely that the

most massive red supergiants will also enrich the mediumin CNO products.

B. Induced Star Formation

Should the star formation begin within a molecular cloud,the winds and H II regions can either destroy the cloud byheating it up through radiative and mechanical processesor they can break free of the cloud. This latter type of starformation event is called a champagne flow, analogous tothat commonly observed at weddings. The sudden releaseof pressure by the breakout of the shock from its environ-ment causes a rapid outflow of material from the hot H IIregion surrounding the OB stars. This flow further drivesa shock into the molecular cloud via momentum conser-vation and compresses the already dense material of thecloud core. Such an event may initiate collapse from grav-itational instability, at least locally to the OB stars. Thus,the massive stars are not only capable of destoying thecloud environments in which they have formed, they canalso serve as agents for propagating star formation throughthe cloud.

The formation of the massive stars is well traced bythe radio and infrared flux emitted by the H II regionsthat surround them. Even the densest parts of molecularclouds are not optically thick longward of about 60 µm,so that far infrared (FIR) and radio photons freely escapethe cloud. The argument that permits these to determinethe rate of massive star formation is as follows. Everymassive star that forms an H II region will be surroundedby a thermal, radio-emitting plasma with a temperature ofabout 104 K. The rate of radio emission depends on therate of ionization, which in turn depends on the luminos-ity of the central stars. This in turn depends on the massof the stars. Each radio photon is associated with the ion-ized medium, while each recombination eventually leadsto a Lyα photon through radiative cascades. These pho-tons, which are trapped in the optically thick H II regions,eventually collide with dust grains in the cloud and the H IIregion and are absorbed. The dust reradiates this energy atequilibrium in the FIR. Thus, there is an expected correla-tion between LFIR, the far infrared luminosity, L radio, andψ , the star formation rate per unit mass.

Such a correlation is observed in regions of active starformation in the Galaxy and nearby galaxies, although itis still not completely certain what the implied rates of starformation mean. For some galaxies, called starburst galax-ies, the rate of massive star formation can reach more than103 M yr−1. This enormous rate cannot continue for verylong before molecular cloud material is consumed, but thetriggering mechanism is not currently understood. Suchsystems may be the result of collisions between galaxiesand are often seen in galactic systems undergoing merger.

Page 166: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

734 Stars, Massive

V. THE LUMINOUS BLUE VARIABLES

The discovery of the outburst of η Carinae by JohnHerschel (1847) was of singular importance for the studyof the most massive stars. During the nearly 10 yr follow-ing its discovery in outburst, this star was the brightestobject in the Galaxy, becoming one of the brightest starsin the sky after Sirius and Canopus, both of far lower in-trinsic luminosity. At its peak, η Car reached a luminosityin excess of 106 L. The total radiative output during the30 yr following the eruption may have been as large as2 × 1049 erg. The outburst was accompanied by the ejec-tion of some debris, which is now just visible in the coreof the nebula surrounding η Car, itself the product of aprevious eruption some thousands of years ago. Presently,the star has a surface temperature of about 20,000 K, aluminosity of order 7 × 105 L, and a mass loss rate ofabout 10−1 M yr−1.

η Car was not the first such variable to be observed;P Cygni had also undergone a similar brightening in the17th century, but it was the first since the advent of tele-scopic observation.

While for many years η Car has been assumed to be aprotostar, both because of its surrounding nebulosity andits location in the galactic plane, ultraviolet spectroscopicobservations by T. R. Gull, K. Davidson, and N. Walbornin 1982 showed that the filaments are enriched in nitrogen.This is the signature of a highly evolved massive star andfocused attention on the end state problem for such objects.Similarly, nitrogen-enriched filaments have been foundaround the galactic LBV star AG Car.

Herschel’s description of the event is most importantfor its comparison with the Hubble–Sandage variables.First described for the Local Group by E. Hubble andA. Sandage in 1953, these stars are the most luminousstars in their parent galaxies. They were first recognizedand studied in M31 and M33. They have since been ob-served in several other nearby systems. Several have beenknown in the Large Magellanic Cloud as S Doradus vari-ables, after their prototype, which is an A supergiant inthe LMC. These stars show large-amplitude (upward of5 magnitudes) variations on decade-long time scales. Theyare also variable, with smaller amplitude, on time scalesof months to years.

During the course of an eruption, the spectrum and lightcontinuum of the star change dramatically, although mul-tiwavelength observations show that the bolometric lumi-nosity remains constant. The optical variations are pro-duced by flux redistribution due to enormous increasesin the UV line opacity from the increased mass loss rate.While normally appearing as an OB supergiant, once theejection phase occurs, the star develops a massive windthat becomes optically thick in the Lyman continuum.

The effective temperature, given by Teff = (L/4πσ R2)1/4,drops rapidly. Here, R is the shell radius and σ is theStefan–Boltzmann constant. The bolometric luminosityremains constant, so the peak of the radiation from thispseudophotosphere shifts from the ultraviolet to the nearinfrared through the visible, which produces the markedincrease in the brightness at optical wavelengths. The shellspecturm develops strong, windlike P Cygni line profileswith expansion velocities of 200 km sec−1 up to 3000 kmsec−1, and the star begins to look like a late A supergiant.This has been well observed in this past decade with theoutburst of R 127 in the LMC and is well known for S Dorand several of the M31 variables, AE And and AF And,and also in M33. In a few cases, the shell may becomecool enough to look like an M supergiant, with a temper-ature as low as 3500 K. Variable A in M33 appears to beone of these stars. As the shell dissipates on the turnoff ofthe massive wind stage, the photosphere gradually recedesbackward in the matter, and the photosphere is again re-vealed as an OB supergiant. We do not presently knowwhat turns off the ejection event. Figure 1 shows the lo-cation of some of the LBVs known in the Galaxy, LMC,SMC, M31, and M33.

The cause of the large mass loss episodes is not wellunderstood. Massive stars are pulsationally unstable andmay display impulsive mass loss because of the ampli-tude of the pulsations. Many of the brightest stars in theMagellanic Clouds, called hypergiants by C. de Jager,show long-term fluctuations with amplitudes of about onemagnitude and time scales of months to years. These vari-ations are, however, far smaller than the eruptions thatcharacterize the Hubble–Sandage variables.

One possible picture is that the radiative winds of mas-sive stars are unstable to small changes in the mass lossrate. The temperature of the envelope depends on the opac-ity in the wind. If the mass loss rate begins to increase,perhaps because of some change in the luminosity of thestar or pulsational instability of the envelope, the densityincreases, causing the temperature in the outer parts of thewind to drop. Lines of lower than normal ionization, suchas Fe II, strengthen and produce an increased radiative ac-celeration. This in turn causes the density to increase stillmore, leading to an instability that appears to saturate atabout 10−3 M yr1. The ejection events in P Cyg and η

Car are well reproduced by this mechanism, although theseat of the initial event is not known.

The resultant light variations of the star are thus wellin agreement with the observations. As discussed above,the bolometric luminosity of the star remains constant asthe mass loss increases. The exception is η Car, which ap-pears to require an additional energy source to power itsemission, possibly mechanical due to wind–environmentinteraction. In contrast, as the envelope opacity increases,

Page 167: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

Stars, Massive 735

FIGURE 1 Location and trajectories of luminous blue variables in Local Group galaxies. The main sequence (left)is labeled by stellar mass. The bold line marks the red limit for massive stars (Humphreys-Davidson limit). Normalred supergiants are located in the shaded box, lower right. [This figure originally appeared in the Publications of theAstronomical Society of the Pacific (Humphreys and Davidson, 1994, PASP, 106, 1025), copyright 1994, AstronomicalSociety of the Pacific; reproduced with permission of the Editors.]

the peak of the envelope radiation shifts into the visi-ble and finally into the infrared. As the mass loss slack-ens off, the envelope clears and the star settles back intoa fainter visual, but brighter ultraviolet, state. The be-havior of several LBVs in the Galaxy and the LargeMagellanic Cloud has been followed with ground-basedoptical spectroscopy, in the ultraviolet by the InternationalUltraviolet Explorer satellite and HST, and now in X-rayswith Chandra; it appears to conform to this expectedbehavior.

Several LBVs have been discovered to be surroundedby massive cold shells. R 127 = HD 269858 f in the LargeMagellanic Cloud began an outburst during the 1980s andis surrounded by shells from several previous eruptions.The galactic stars He3-519 and AG Car also have suchshells. η Car has a complex nebulosity, formed from theejecta from several previous events. Imaging with the widefield camera on the Hubble Space Telescope has revealedseveral complex jetlike and lobe structures on the subarc-sec scale in the inner nebula, possibly the products of thelast eruption in the 19th century. All of these shells haveabout 10−3 to 10−2 M, a considerable amount of matterfor single ejection events. The winds from these stars musthave increased substantially during these events, whichtook place some 104 yr ago. In nearly all cases, the ejectaappear bipolar, as do the ionized rings of SN 1987A. Theordering appears to be due to an aspherical wind, equa-torially enhanced and slowed presumably by angular mo-mentum conservation. The structures are reminiscent ofplanetary nebulae.

Yet, despite these findings, the triggering mechanismfor some of the outbursts is still problematic. The varia-

tions in η Car appear to defy description. The luminosityof this star was so great during the outburst in the lastcentury that it is in excess of what is likely possible forthe known bolometric luminosity of the star. Yet anothermechanism may be at work here. It may be linked to the5.5 yr binary that lies imbedded in the ejecta.

VI. CODA

Rare, bright, and physically problematic still seems thebest characterization of the most massive stars. Their for-mation and death are not well understood, and there arestill considerable uncertainties associated with the stagesin between. They bring into sharper view the most extremephysical processes of nucleosynthesis, mass loss, and sta-bility found in steller environments. They will likely re-main true challenges to observers and theorists for decadesto come.

SEE ALSO THE FOLLOWING ARTICLES

BINARY STARS • GALACTIC STRUCTURE AND EVOLUTION

• NEUTRON STARS • PULSARS • SOLAR PHYSICS • STAR

CLUSTERS • STARS, VARIABLE • STELLAR SPECTROSCOPY

• STELLAR STRUCTURE AND EVOLUTION • SUPERNOVAE

BIBLIOGRAPHY

Abbott, D. C., and Conti, P. S. (1987). The Wolf–Rayet stars. Annu. Rev.Astron. Astrophys. 25, 113.

Page 168: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GQT/GLT P2: GPB Final pages

Encyclopedia of Physical Science and Technology EN015H-725 August 2, 2001 16:2

736 Stars, Massive

Arnett, W. D. (1997). “Supernovae and Nucleosynthesis,” Princeton, NJ,Princeton Univ. Press.

Bernasconi, P. A., and Maeder, A. (1996). Astron. Astrophys. 307, 829.Boland, W., and van Woerden, H., eds. (1985). “Birth and Evolution of

Massive Stars and Stellar Groups,” Reidel, Dordrecht.Cassinelli, J., and Lamers, H. J. G. L. M. (1999). “Theory of Stellar

Winds,” Cambridge, UK, Cambridge Univ. Press.Castor, J., Abbott, D. C., and Klein, R. (1975). Astrophys. J. 195, 157.Davidson, K. R., and Humphreys, R. M. (1997). Annu. Rev. Astron.

Astrophys. 35, 1.Davidson, K., Moffatt, A. J., and Lamers, H. J. G. L. M. eds. (1989).

“Physics of Luminous Blue Variables: IAU Colloquium 113,” Reidel,Dordrecht.

Franco, J., Terlevich, R., Lopez–Cruz, O., and Aretxaga, I., eds. (2000).“Cosmic Evolution and Galaxy Feedback: Structure, Interactions, andFeedback,” Astronomical Society of the Pacific, San Francisco.

Humphreys, R., and Davidson, K. (1994). Publ. Astron. Soc. Pacific 106,1025.

de Jager, C. (1980). “The Brightest Stars,” Reidel, Dordrecht.Kudritski, R. P., Pauldrach, A., and Puls, J. (1987). Astron. Astrophys.

173, 293.Langer, N., and El Eid, M. F. (1986). Astron. Astrophys. 167, 265.Maeder, A. (1987). Astron. Astrophys. 173, 247.Maeder, A., and Chiosi, C. (1994). Annu. Rev. Astron. Astrophys. 32,

227.Maeder, A., and Meynet, G. (1994). Astron. Astrophys. 287, 803.Nota, A., Livio, M., Clampin, M., and Schulte–Ladbeck, R. (1995).

Astrophys. J. 448, 788.Schaller, D., Schaerer, D., Meynet, G., and Maeder, A. (1992). Astron

Astrophys. Suppl. 96, 269.Shore, S. N., and Sanduleak, N. (1984). Astrophys. J. Suppl. 55, 1.Shore, S. N., Altner, B., and Waxin, I. (1996). Astron. J. 112, 2744.Sziefert, T., Humphreys, R. M., Davidson, K., Jones, T. J., Stahl, O.,

Wolf, B., and Zickgraf, F.-J. (1996). Astron. Astrophys. 314, 131.Walborn, N., and Fitzpatrick, E. (2000). Publ. Astron. Soc. Pac.

112, 50.

Page 169: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

Stars, VariableSteven N. ShoreIndiana University, South Bend

I. IntroductionII. Observations of Stellar VariabilityIII. Basic Theory of Stellar PulsationIV. Classification of Intrinsic Variable StarsV. Pre-Main-Sequence Variables

VI. Concluding Remarks

GLOSSARY

Hertzsprung–Russell diagram (H-R diagram) Plot ofsurface, or effective, temperature versus luminosity fora star. Observationally, the locus of the observed stellarpopulation in a color–magnitude plane. Theoretically,the description of the path covered by the surface of astar with time.

Instability strip Portion of the HR diagram, spanning thetemperature range from about 3000 to 8000 K and fromthe main sequence to the supergiants, in which most pe-riodic stellar pulsational instabilities are detected. Thestrip is characterized by a period–luminosity (P-L) re-lation, with increasing period corresponding to higherintrinsic brightness.

Overstability Growing, but periodic, instability charac-terized by a complex frequency.

Population I and II Characterized mainly by age andmetallicity differences. Population I is the young, stel-lar galactic component, with metallicities about thesame as the Sun’s and dynamical confinement to theplane of the Galaxy. Population II, found in the galactichalo and globular clusters, consists of older, low-mass

stars with below solar metal abundances.Secular instability A nonperiodic, or steady, growth of

the pulsation with time.Units Solar luminosity, L, 4 × 1033 ergs−1; solar mass,

M, 2 × 1033 g; solar radius, R, 7 × 1010 cm; meansolar density, 〈ρ〉, 1.4 g cm−3.

VARIABLE STARS are stars that vary in photometricoutput over any period of time. The manifestation of peri-odic behavior is rare among stars, resulting from resonancephenomena in the interior. The primary mechanism for in-trinsic stellar variability is pulsation, which may be eitherradial or, in rapidly rotating stars, nonradial. Magneticfields also play a role in structuring the variations. Massloss and shock phenomena are also associated with largeamplitude pulsators, which may affect stellar evolution.This article deals exclusively with pulsating variables.

I. INTRODUCTION

Like most astronomical discoveries, the first observationsof stellar variability were accidental. Fabricius noted, in

737

Page 170: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

738 Stars, Variable

1596, that the star o Ceti showed a range of variabilityfrom invisibility to second magnitude. In honor of this,he named the star Mira, the wonderful. This discoveryserved as confirmation of the view, first forcefully pro-mulgated in the Latin west by Tycho Brahe following the“new star” in Cassiopea of 1572, that the heavens are notstatic.

A major step was achieved in the late 18th century withGoodricke’s discovery, announced in 1782, of the varia-tions of β Persei, also called Algol. The variations wereobserved to be periodic and stable, and Goodricke ex-plained this by the model of a close, unresolved binary starsystem undergoing eclipses. This is the first explanationfor stellar variability, and one which served as an effectivestimulus to search for other binary stars. The statisticalreality of binarity and the application of Newtonian me-chanics made a potent combination. In 1784, Goodrickediscovered the variability of δ Cephei (δ Cep); in the sameyear, Piggot discovered the variability of η Aquilae (ηAql). The eclipse model was applied at the time to bothof these stars, whose periods are similar to that of Algol.Both δ Cephei and η Aql are now known to be pulsat-ing variables, unrelated to the Algol class of binaries, butthe general explanation of the variations did not becomeclear for more than 150 years after this discovery. Table I,adapted from John Herschel’s 1847 edition of Outline ofAstronomy, gives a list of some of the earliest discoveriesof what are now known to be intrinsically variable stars.

During the next century, the search for stellar vari-ability grew in importance to astronomy. The work ofArgelander at Bamberg, the establishment of the BonnerDurchmusterung star atlas, and the international efforts toconstruct the Carte du Ciel, a photographic all-sky atlas,stimulated the discovery and cataloging of variable stars.

Perhaps the most important group working at the end ofthe 19th century was that at the Harvard College Observa-tory. S. Bailey, in observing globular clusters, discovereda class of periodic variables that showed the same charac-teristics as the field variable RR Lyrae, and the search for

TABLE I Early Discoveries of Intrinsic Stellar Variables

Star name Discoverer Year

o Ceti Fabricius 1596

ψ Leonis Montanari 1667

κ Sagittarii Halley 1676

χ Cygni Kirch 1687

δ Cephei Goodricke 1784

η Aquilae Piggot 1784

R CrB Piggot 1795

α Herculis Wm. Herschel 1796

such stars was extended to the southern hemisphere withthe establishment of the Bruce telescope at the HarvardSouthern Station in Peru.

The most important discovery in this period wasconnected with observations of variable stars in the Largeand Small Magellanic Clouds, most of the work beingperformed at Harvard by H. Levitt. She realized, around1905, that these stars were generally of the same type asthe variable δ Cephei. The extraordinary discovery was therelation between the period and the apparent brightness ofthe stars, the so-called period–luminosity relation. Whilethe zero point for this relation was not then known, therelative brightness of these stars, with the same period asδ Cep and other variables in our galaxy, led to a singularlyimportant cosmological breakthrough—the realization ofthe great distance that must characterize the MagellanicClouds.

In the meantime, theoretical work on the cause forvariability continued. Russell pointed out several para-doxes resulting from the assumption of binarity as thecause for the variations of many stars. Eddington sug-gested the pulsation model for stellar variability and de-rived the linearized, adiabatic pulsation equation. Subse-quent work by Rosseland and Cowling further clarified thenature of the instabilities. The culmination of this epochof study was the encyclopedic review article by Ledouxand Walraven, in 1958, which still serves as an impor-tant starting point for all discussions of stellar variability.The role played by stellar convection zones in drivingpulsation and producing the radial velocity variations inCepheids was understood by Cox around 1963. The firstmachine calculations of nonlinear, nonadiabatic pulsa-tion models were accomplished in the mid-1960s. Aroundthe same time, the first large-scale grids of stellar inte-rior models were produced. Schwarzschild and Harm dis-covered the thermal instability of double shell sources in1965.

This article deals exclusively with pulsating variablestars, those which are mechanically and thermally unsta-ble because of intrinsic internal structural processes. Sev-eral other classes of variable stars are dealt with in thearticles “Supernovae” and “Binary Stars” (which includesa discussion of the RS CVn stars, novae, and eclipsingbinary stars).

II. OBSERVATIONS OF STELLARVARIABILITY

Photometric and radial velocity variations are the basicdata for the study of stellar variability. Light curves arethe most obvious manifestation of variations, but becauseof the problems associated with interpretation, they cannot

Page 171: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

Stars, Variable 739

be deemed sufficient to pinpoint a mechanism responsi-ble for the observed changes. Spectroscopic observations,specifically radial velocity variations, when coupled withcolor and light variability, serve to identify the causativeagent behind most stellar variability.

The detailed behavior of velocity and light changes of aCepheid variable or RR Lyrae star are usually very differ-ent from those observed in an eclipsing binary, althoughboth will show both light and radial velocity changes.Most variable stars show asymmetric light curves, usuallysteepest on the ascending branch. Many are not strictlyperiodic, some are genuinely aperiodic. There is often alag between maximum brightness and maximum radialvelocity. Most telling is that there are often both color andspectral changes, which are gradual through the photomet-ric cycle, indicating alterations in the physical structure ofthe stellar atmosphere.

The presence of radial velocity variations means, fora pulsating star, that the radius of the star is physicallychanging with time. The scale of this change is the timeintegral of the radial velocity curve. Changes in the signof the radial velocities take place at extrema of the radiuschanges, so that they can be thought of as a bounce orrecontraction. If the pulsations are strictly periodic, theradial velocity curve will be as well. The relation betweenthe phase of maximum light and maximum radial veloc-ity is most important in specifying the driving mechanismfor the pulsation. One application of the radial velocityvariations, in conjunction with the light curve, is the deter-mination of temperature by the Baade–Wesselink method.This technique assumes that at two phases for which thecolors are the same, the temperatures are also identical.From the comparison of luminosities at these points withthe radial velocities, the radii can be obtained and theamplitude of pulsation can be specified. Additional infor-mation is available using the Barnes–Evans calibration ofsurface brightness versus intrinsic red color [(V − R)0

color, in magnitudes]:

FV = 3.956 − 0.363(V − R)0 (1)

This provides an angular diameter, φ (in arc seconds), forthe star using the apparent (reddening corrected) visualmagnitude V0:

FV = 4.221 − 0.1V0 − 0.5 log φ. (2)

The radius thus obtained can be compared with that de-rived from the radial velocities.

For Cepheid variables, there is an empirical, as well astheoretical, period–luminosity–color relation that relatesthe period of pulsation of the star to its absolute mag-nitude and surface temperature. In combination with theBarnes–Evans relation, this can help place the star in itsevolutionary state, and it also serves as a fundamental cal-

ibrator for the extragalactic distance scale (see discussionin Section IV).

III. BASIC THEORY OF STELLARPULSATION

Stellar pulsation is a phenomenon resulting from depar-tures from strict hydrostatic and thermal equilibrium. Sucha state can be reached either by internal, that is to say evolu-tionary, avenues or by the presence of external perturbers.For instance, the tides observed on Earth are the result ofthe periodic forcing on the oceans and crust, on a rotatingplanet, by the gravitational pull of the Sun and Moon. Onthe other hand, the relaxation of the planet following anearthquake is internally driven.

Pulsation, when it appears on a global scale as or-dered standing wave motion, is essentially a resonant phe-nomenon. Driving is mechanical, due to convection, whichcouples the temperature gradient to mass transport bybuoyancy. In stars like the Sun, this results in a broad spec-trum of acoustic modes, trapped in the envelope. Shouldthe overlying mass be large enough as a restoring force andthe thermal and mechanical time scales in a layer be of thesame order stable pulsation—organized on the scale ofthe envelope—can result. The conditions are met only inlimited parts of the Hertzsprung–Russell (H-R) diagram,the instability strip.

For stars, the problem is that the precise nature of themechanical and thermal couplings of different parts of theinterior is difficult to specify. Stellar interiors are from uni-form or isothermal. There are steep temperature gradientsnear the surface, which is defined by the photosphere, andchemical composition changes because of the processingof the interior mass by nuclear reactions near the core. Inaddition, the ionization of the gas is a function of depth,as is the opacity, and therefore both the heat capacity andradiative losses are strongly depth dependent.

Pulsation serves as one of the few direct probes of thestate of the interior of a star. The way in which an instabil-ity, a temperature and pressure variation at some zone inthe stellar envelope, manifests itself at the stellar surfacefrom which point it can be studied by an external observeris a complex resonance phenomenon that is not completelyunderstood. Helio- and asteroseismology, both developedon the analogy of terrestrial seismology, use the acous-tic spectrum to probe the depth dependence of the soundspeed, much like a trumpet bell.

There are several time scales associated with a stellarinterior. One is the characteristic time it takes for a zoneof some physical size to achieve pressure equilibrium, thesound crossing time. This is given by ts = l/as, where as

is the sound speed that is a function of both the equation of

Page 172: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

740 Stars, Variable

state, through the ratio of specific heats, and the tempera-ture and is therefore strongly depth dependent. Another isthe time scale for radiative losses, which depends stronglyon the opacity and thus on the equation of state, density,and temperature. Finally, and most critically, there is thetime scale for a parcel of matter to oscillate in a gravita-tional field, given its local density. This last time scale, theso-called free-fall time scale, is given by the following:

t f f = (G〈ρ〉)−1/2, (3)

where 〈ρ〉 is the mean density of the star. For a star with amean density of 1 g cm−3, this time scale is about 10 min,essentially the collapse time for matter from any distanceto the center of mass. Stars have large density gradients andthis mean density is heavily weighted toward the core, so ingeneral, to account for the degree of central concentration,one can take the pulsation time scale to be related to thefree-fall time through the following:

Q = P(〈ρ〉/〈ρ〉)1/2, (4)

where 〈ρ〉 is the mean density of the Sun, about1.4 g cm−3, and P is the pulsation period. The value of Qcan be calculated from any interior model once the periodhas been determined from the pulsation equation.

In the first approximation, a star can be assumed tobe spherical. While rotation may play a role in the inter-nal structure, resulting in polar flattening and equatorialexpansion to an oblate spheroid, in general this is not im-portant. The equation of hydrostatic equilibrium can bewritten as a balance between the outward pressure gradi-ent and gravitation:

r = − 1

ρ

∂p

∂r− Gm

r2. (5)

Here r is the radius, ρ is the density, p is the pressure, andm is the mass interior to a point r .

Energy loss is assumed to be due to radiation, whichtakes place diffusively because of the large optical depthof the stellar envelope:

L = 16πacr2T 3

3κρ

∂T

∂r. (6)

Here, L is the luminosity, a and c are the Stefan constantand speed of light, respectively, κ is the opacity coefficient(a function of the density and temperature), and T is thetemperature.

The energy balance of the stellar interior is given by thechange in the entropy of the matter as a result of internalenergy generation and radiative losses. The second law ofthermodynamics:

Td S

dt= d E

dt− P

ρ2

dt, (7)

when supplemented by the equation of state p = p(ρ, T ),becomes

Td S

dt= p

ρ(3 − 1)

[d

dtln p − 1

d

dtln ρ

]

= ε − ∂L

∂m, (8)

where S is the entropy; 1 = (d ln p/d ln ρ)Ad and 3 −1 = (d ln T/d ln ρ)Ad are the adiabatic exponents for thepressure and temperature, respectively; and ε is the rate ofenergy generation per unit mass. This latter quantity, likethe opacity, is a function of the density and temperature.The opacity is, in addition, dependent on the composition.Dramatic temperature dependence results from atomic ab-sorption lines.

Finally, the explicit assumption of sphericity of the starenters in the equation for the mass interior to any radius:

∂m/∂r = 4πr2ρ. (9)

It should be noted that all of these equations will be morecomplicated in the event that the star is distorted fromsphericity, because of the coupling in the momentum equa-tion between the various components of the gravitationalacceleration on a distorted star and possible effects of ro-tation in both the momentum and energy equations.

In this enumeration of the equations of stellar struc-ture, there are two time dependent equations, for the ac-celeration and the energy generation, and two equationsof constraint, the diffusion approximation for the energyloss and the mass conservation equation. Thus, we wouldexpect that the final equation will be third-order in time,and that there will be both stable and unstable solutionsfor many possible combinations of the parameters of en-ergy generation and opacity. In fact, one point is alreadymanifest in these equations—there are three critically im-portant parameters in the pulsation characteristics of astar: the ratio of specific heats, the opacity, and the rateof energy generation. These govern the stability of thestar.

Since the radius is changing with time in a pulsatingstar, the radial coordinate is not the best one to employfor the analysis of the instability. Instead, the Lagrangianapproach is used, in which a mass zone is chosen and fol-lowed in its oscillations, rather than using the radial co-ordinate. In this way, the interior model can be calculatedin equilibrium and perturbed, and thus the perturbation isonly a function of time. For example:

r = r0

(1 + δr

r0

)= r0(1 + ξ ), (10)

where r0 is now the equilibrium position of a mass zoneq = M(r )/M, M being the stellar mass.

Page 173: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

Stars, Variable 741

The calculation of pulsational properties proceeds in adeductive way. One assumes an interior structure and askswhat the response will be for the mass if this configurationis infinitesimally perturbed. The usual argument is thatif the mass is intrinsically unstable any perturbation, nomatter how small and irrespective of its origin, will besufficient to begin pulsation. Consequently, it is assumedthat all physical parameters, T, p, and ρ, are perturbedfrom equilibrium values by a small component.

The combined effect of the perturbations of the equa-tions of stellar evolution is that a single equation can beobtained for the temporal variation of the pulsation radialamplitude:

∂3ξ

∂t3− 4πξr0

∂m[(31 − 4)P0]

− 1

r20

∂m

[16π21 P0ρ0r6

0∂ξ

∂m

]

= −4πr0∂

∂m

[ρ0(3 − 1)δ

(ε − ∂L

∂m

)], (11)

which is also known as the linearized wave equation (al-though not really a wave equation). This equation, firstderived by Eddington, is one of the best studied tools ofstellar interiors. There is an exact solution for the adiabaticversion of this equation, which gives secular instability if1 ≤ 4

3 (for a perfect gas, this ratio is 53 ). The effects of

partial ionization, which occurs in convection zones, de-crease the star’s stability. In consequence, the convectionzones, already mechanically unstable, tend to drive theenvelope pulsation.

The reason for dwelling for so long on the derivationof the pulsation equation is that there are many classes ofinstability. The pulsation is driven from different regionsof the star depending upon the state of the stellar interior.For instance, as first emphasized by Eddington, a star canbe thought of as a self-gravitating analog to a steam en-gine. If the star is to be pulsationally unstable, or what hecalled overstable, then the driving region must heat up oncompression and overcool on expansion. Thus, the restor-ing force for the pressure rises on compression, and therate of energy dissipation is most efficient at the lowertemperature, so the matter overcools. The combined ef-fects of the heat capacity and the opacity throttle the ratesof heating and cooling, and thus there are specific zoneswithin the star that are identified with the initiation andmaintenance of the oscillatory behavior of the envelope.If the radiative time scale is short enough, the pulsationsare strongly damped. In contrast, adiabatic pulsation ispossible if 1 approaches 4

3 , the limit of stability for apolytropic equation of state.

There are few stars that are driven into instability bythe nuclear burning, unless that burning is not central. Forexample, in the case of stars in the transition stage betweenmain sequence and red giant, hydrogen burning occursin a thin shell zone surrounding the hot but inert heliumcore. The temperature dependence of the CNO reaction isnot, though, by itself, sufficient to induce the instabilityobserved in stars in this stage of their lives. Instead, oncethe helium has undergone consumption in the stellar core,the double shell source that surrounds the red giant coreis unstable because of the extreme temperature sensitivityof the He and CNO processes combining to produce theobserved luminosity. This double shell source instability,first recognized by Schwarzschild and Harm in 1965, isone of the few, and still best studied, examples of nuclear-driven instabilities.

The primary cause of stellar pulsation is that the convec-tion zones of the envelope, which correspond to ionizationzones, are thermally unstable. When the zones expandand cool, the ionization drops and so does the opacity.When compressed, they heat up and the ionization goesup thereby increasing the heat capacity (the value of γ

drops). Detailed nonlinear modeling of both Cepheids andRR Lyrae stars confirms this expectation. Because of thelarge energy associated with the ionization of neutral (HeI) and ionized helium (He II) and because these zoneslie fairly deep in the stellar envelope, they serve as thesources for pulsational driving in most classes of variablestars. The explanation lies in another aspect of the drivingof the instability. If the mass overlying the driving zoneis too large, the mechanical impulse of the instability inthe convection zone will be damped before it can raise thesurface layers substantially and the star will be stable orof very low amplitude. Should the mass above the zone betoo small, however, the inertia of the zone will be smalland it will not be able to drive the compression sufficientlyon fall-back to continue the pulsation. In addition, if theconvection zone is too near the surface, the layers will beoptically thin and the energy of the pulsation will be tooeasily dissipated by radiative processes. Thus, a delicatebalance exists between the depth of the pulsational driv-ing zone and the conditions in the zone, and this servesas the simple explanation as to why most stars do notvary.

Intuition would lead one to expect that, when the staris smallest, the luminosity would be greatest since the de-crease in surface area should be more than compensated bythe increase in the surface temperature. The observationsshow, however, that maximum luminosity usually occursat maximum expansion velocity, the phase lag being about0.25 of the period. This effect is produced by the pres-ence of the convection zones, mainly the surface hydro-gen ionization region, which stores the wave of luminosity

Page 174: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

742 Stars, Variable

coming from the driving He II zone deep in the interiorand slowly releases it on expansion. This nonadiabatic be-havior is largely due to the variation in both the opacityand specific heat ratio in this zone as a function of phasein the pulsational cycle.

Nonradial pulsation will result if the star is not spheri-cal, for instance because of rapid rotation or the presenceof a close perturbing companion in a binary system. Suchpulsations are often nearly adiabatic, involving only thestellar atmosphere and, consequently, of short period andlow amplitude. In the case of the sun, for example, thereis no organized resonant radial pulsation, but the envelopeshows a rich spectrum of modes because of the mechan-ical instability of the convection zone. Nonradial modesare driven by gravity rather than thermal instabilities andare often more sensitive to the equation of state than to theopacity (the β Cep stars may be an exception to this).

IV. CLASSIFICATION OF INTRINSICVARIABLE STARS

Classes of variable stars are generally named after thefirst discovered, or prototypical, member of the class. Sub-sequent to the recognition of a given phenomenology, itsometimes appears that the namesake is not the best rep-resentative of the class, but the name persists. As in alltaxonomy, there is a dichotomy between the “splitters”(those who relish the proliferation of subclasses for everydeparture from the precisely defined properties of a class)and “lumpers,” and in this article, the latter presentationis followed. In this article, a broad separation is made be-tween the radial and nonradial pulsators, driven in largemeasure by the differences in the theoretical models re-quired to understand their behavior.

Variable stars are named according to a standard in-ternational rule. The stars are named in order of discov-ery. The first star discovered in a given constellation isnamed R (the first letter that had not previously been usedin star catalogs during the 19th century), if not alreadynamed with some other designation (such as Greek orBayer names). The single letters are exhausted with Z,and the sequence restarts with RR through ZZ, then goesfrom AA through QZ before starting as V335 and con-tinuing. The central clearing house for all variable stardesignations and properties is maintained by the Interna-tional Astronomical Union in Moscow, which publishesthe General Catalogue of Variable Stars.

A. Radial Pulsators

Most radial pulsators are confined to a nearly vertical stripon the Hertzsprung–Russell diagram, called the instabil-

ity strip. In the range of temperature represented by thestrip, the stars have thermally and mechanically unstableenvelopes.

1. Delta Scuti Variables

The Delta Scuti (δ Sct) stars are main sequence and slightlyevolved, intermediate mass stars, A-type stars that showsmall amplitude (0.003 to 0.9 magnitudes) pulsations in aperiod range from about 0.01 to 0.2 days. Their lumi-nosities range from 10 to 100 L. They are primarilydriven by the surface convection zones, and only a smallamount of mass is involved with the pulsational activityof the star. Several classes of stars coexist in the H-R di-agram with these stars, the δ Del stars, Am stars, and Apstars (see Section IV.B.3). The maximum radial expansionvelocity closely matches the phase of maximum light towithin about 1/10th of the period. These variables are nu-merous but difficult to detect because of their short periodsand small amplitudes; however, because many are foundin clusters, their evolutionary status is well understood.They form the lower luminosity (main sequence) end ofthe Cepheid instability strip.

2. RR Lyrae Stars

The RR Lyrae (RR Lyr) stars are identified with a uniquestage in the life of a low-mass star, the horizontal branch,helium core burning period of evolution. As such, theyshow a small range in luminosity, the most variableproperty being their effective temperature as a function ofmass. The locus of stellar masses on the horizontal branchis determined by helium core mass. The RR Lyr stars rangein spectral type from A3 to A6 and have absolute mag-nitudes of about 0.m5, about 102 L. They are low-massstars, 0.5 to 0.7 M. The pulsation is driven by instabilitiesin the He II convection zone. The RR Lyrae stars displayseveral different light curves, which are correlated withtheir periods. The RR(b) stars show both fundamental andfirst overtone pulsations, the typical period ratio beingabout P1/P0 = 0.75; the prototype of this class is AQ Leo.For those RR Lyr stars with only a one observed period,Bailey originally divided the “cluster variables” into twomain subclasses. The RRab stars have periods rangingfrom 0.3 to 1.2 days and amplitudes typically from 0.5to 2 magnitudes. They are characterized by “sawtooth”(steep ascending branches) light curves. The prototype ofthe class, RR Lyr, is a member of this subclass. The RRcstars show small amplitudes, generally less than 0.8 mag-nitudes, short periods from 0.2 to 0.5 days, and essentiallysinusoidal light curves. As in the δ Sct and δ Cep stars, themaximum expansion velocity corresponds to maximumlight.

Page 175: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

Stars, Variable 743

The RR Lyr stars are best studied as members of glob-ular clusters, where they appear in large numbers be-cause of the sizable population of the horizontal branch inthese evolved stellar systems. They can be classified ac-cording to a metallicity index, first proposed by Preston,which compares the spectral type assigned from the metallines, especially Ca II, to that obtained from the hydrogenBalmer lines. This S index is a useful indication of metalabundance and shows that the RR Lyr stars are low-massPopulation II stars, related in their pulsational propertiesto the W Vir and RV Tau stars, but of very low mass.

3. Delta Cepheid Variables: Classical Cepheids

These are the prototype variable stars and the first class forwhich the period luminosity relation was discovered. Theyare generally moderately massive stars, being above 2 or3 M, evolved, and in the stage of helium core burning.They reside in the instability strip, in fact defining theposition of the strip on the H-R diagram, and have singlemodes of pulsation with periods between about 1 day and2 months.

Most δ Cep stars have periods ranging from 1 to about140 days, with amplitudes ranging from 0.01 to 2 mag-nitudes. In general, their variations are greater at shorterwavelength and longer period. The stars are usually spec-tral type F at maximum, and G or K at minimum. Theirluminosities range from 5 × 102 L to about 2 × 104 L,depending on their mass and evolutionary status. Theclassical Cepheids are Population I stars, occasionally re-siding in open galactic clusters from which their approx-imate masses of 3 to 10 M are obtained. Several havebeen found to be members of binary systems, notablySU Cas. Their radial velocity variations are characteris-tic: with only a small phase shift (about 0.1 in phase),the maximum expansion velocity corresponds to maxi-mum light. δ Cep light curves are usually distinctive, beingsteep on the ascending branch with little structure to thecurve.

A subset of the δ Cep stars, called Cep(B) or bumpCepheids, show multiple periodicity, usually with P1/

P0 = 0.7 and P0 between 2 and 7 days. TU Cas and V367Sct are typical of this subclass.

Finally, the DCepS subclass shows amplitudes that arealways less than 0.5 magnitudes and almost symmetriclight curves. Their periods rarely exceed 7 days. The well-studied star SU Cas is typical of this subclass. These starsseem to be pulsating in the first harmonic (hence, thesmaller amplitudes and shorter periods), while classicalCepheids appear to be pulsating in the fundamental mode.

The Cepheids arise from a high-mass population, andas such they do not represent a unique stage in the life ofa star. Following hydrogen core exhaustion, stars greater

than about 3 M will cross the instability strip at least once.The higher mass stars, greater than about 7 M, will tra-verse the strip several times, including the post-helium-core burning stage, each time with increasing luminosity;as many as 5 crossings have been calculated for stars abovethis mass.

Theroetical models provide the following calibrationfor the Cepheid period–luminosity relation:

PF ∼(

L

L

)0.83( M

M

)−0.66

T −3.45eff , (12)

where PF is the fundamental period and Teff is the effectivetemperature. Turning this relation around, one can derivea mass for the Cepheid from theoretical models, calledthe pulsational mass, if the luminosity, surface tempera-ture, and period are known. Few masses are well known,but the pulsational mass determinations are quite sensitiveto the temperature determinations for these stars, while theevolutionary masses seem to agree with the few stars forwhich masses have been obtained. For example, for SUCyg, the Cepheid mass is about 6 M, which agrees withthe evolutionary mass.

Feast and Catchpole (see Heck and Caputo, 1999) haverecently rederived the period-luminosity relationship fromHipparcos parallax data:

〈MV 〉 = −2.81 log P − (1.43 + 0.1) (13)

where 〈MV 〉 is the mean absolute magnitude of theCepheid averaged over its pulsational cycle and P is theperiod in days. The slope derives from the LMC, the zeropoint comes from parallax measurements toward abouttwo dozen Galactic Cepheids.

The δ Cep stars are also important as distance indica-tors, in large measure because of their place in the insta-bility strip; as a result, an enormous effort has, in the past40 years, been put into the determination of intrinsic prop-erties of these stars, especially the study of their massesand luminosities. The discovery of large numbers of binarystars among the Cepheid variables has been an importantclue to these properties. Many of the companions are stillon the main sequence and are of suficiently early hot spec-tral type that they must be moderately massive. They arebest studied using ultraviolet spectra, in which the spec-trum of the Cepheid variable does not appear. Masses canbe determined from a comparison of the Cepheid radialvelocity curve, obtained from the optical spectrum, withthat for the companion, obtained from the UV. In conse-quence, the lower mass of the Cepheid can be understood,and the time scale for the evolution of the variable spec-ified within limits. The binaries will also serve as usefulcalibration of the evolutionary and pulsational mass de-terminations, once a large enough number of them havebeen identified.

Page 176: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

744 Stars, Variable

An additional determination of intrinsic properties isprovided by the use of the infrared. For stars of surfacetemperatures less than about 8000 K, the peak of the spec-tral distribution falls in the red. Therefore, the use of theIR, wavelengths of 1 to 3 µm, can give a clear indicationof the bolometric variability of the star and is not sen-sitive to changes in the shape of the spectrum. Further,since the infrared is far less sensitive to the effects of in-terstellar reddening than the optical, the zero point for themagnitudes can be more reliably determined. The slope ofthe relationship depends on the wavelength at which thevariations are observed, but the mean stellar magnitudeas a function of log P appears to be invariant betweengalaxies, independent of stellar metallicity, for classicalCepheids.

4. W Virginis Stars: Population II Cepheids

These are the Population II analogs of the classicalCepheids. They arise from a lower mass population, fromabout 0.5 to 0.8 M, but being similarly internally struc-tured, they display the same mechanical and photometricproperties as the more massive Population I stars. Theyrange in period from 0.5 to 35 days and have amplitudesin the range 0.3 to 1 magnitude. These stars also dis-play emission in hydrogen Balmer lines. There are twosubclasses identified among the W Vir stars. The CWAstars, like W Vir, have periods greater than 8 days; CWBstars, like BL Her, have shorter periods. In BL Her, whichhas also served as the prototype of a subclass of W Virstars, bumps are observed after maximum on the descend-ing portion of the light curves.

The W Vir stars are fainter than classical Cepheids atthe same period by between 0.7 and 2 magnitudes, a factthat has important cosmological consequences. Since theextragalactic distance scale rests in large measure on theCepheid variables as distance calibrators for relativelynearby galaxies, the period–luminosity relation is used todetermine the distance modulus, m j − M j . Here, m j andM j are the apparent and absolute magnitude in the j thwavelength band. With only the period and apparent mag-nitude in hand, one can determine the distance to a galaxyknowing, from the P-L (or period–luminosity–color orP-L-C) relation, the intrinsic luminosity of the star, henceits distance.

5. Long-Period Variables

a. Mira variables. Mira variables, named after o Cet,are emission-line cool giants, generally arising from a pop-ulation of low to intermediate mass (1 to 2 M) stars.Miras typically have luminosities of about 103 L andtemperatures less than 4000 K. They are found among

the Me, Ce, and Se stars, often displaying time vari-able emission lines of both atomic and molecular species.These stars have some of the largest amplitudes observedamong pulsating variables, ranging from 2.5 to 11 mag-nitudes in the optical, but generally considerably smallerin the infrared (at a wavelength of a few micrometers,the amplitude is less than one magnitude). The periodsrange from 80 to over 1000 days, although many of thelongest period systems are quite irregular in their lightcurve behavior. A characteristic of the light variations isthe change in the light curve structure with time. Mirasin general do not appear to be rigorously periodic, withamplitudes that do not repeat from one cycle to anotherat any wavelength and considerable evidence for massloss and shock processes accompanying the pulsationcycle.

The Miras are perhaps the best studied examples ofnonlinear pulsators, having amplitudes large enough toproduce the ejection of matter and strongly nonadiabaticbehavior of the envelopes. One interesting aspect of thesestars is that their pulsation periods are sufficiently long,and their amplitudes large enough, that time-dependentconvection may be important in modeling their internalstructure. Considerable theoretical attention is currentlybeing paid to these stars.

b. RV Tauri stars. The RV Tau stars are interme-diate temperature (F and G) supergiants with luminosi-ties of approximately 104 L, highly evolved, and conse-quently of long period. They are often associated with oldPopulation I or Population II. Their periods range from30 to 1000 days, and their amplitudes are about 3 to 4magnitudes.

The RV Tau stars are distinctive because of their lightcurves, which on the short-period end resemble classicalCepheids, but which for the long-period stars are quitedistinctive. The longer period subclass, called RVb stars,shows variable mean magnitudes, with a variation in theamplitude of maximum between successive cycles in aperiodic way. The RV Tau star is the prototype of this class.The RVa stars, typified by AC Her, show no variation inthe mean magnitude.

c. R Corona Borealis stars. The R CrB stars are hy-drogen deficient, helium rich, and often carbon rich, high-luminosity stars, characterized by a semiregular behav-ior, which is unique among variable stars. They undergoquasiperiodic fadings during which time they can decreasein brightness by more than 1 to about 10 magnitudes ona time scale of 30 to several hundred days. The intervalsbetween these events are irregular; some low amplitudeperiodicity of less than a year may be present as well.These episodes are marked by the ejection of considerable

Page 177: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

Stars, Variable 745

amounts of dust, as determined from ultraviolet satelliteobservations of their spectra. Optically, they display highpolarization. The mechanism responsible for this behav-ior is not understood presently. The R CrB stars, whichare poorly understood, also span much of the temperaturerange of the stellar population, occupying spectral typesBe to the carbon stars. These stars show very high lumi-nosities, about 5 × 104 L.

d. Semiregular variables. Finally, there are high-luminosity stars that show photometric and radial velocitychanges on time scales of months to years that do notappear to be periodic, although there is a characteristicinterval of time during which the brightness of the star isobserved to change. The supergiants are among the bestexamples of this class of variable, perhaps the best knownbeing the M supergiant Betelguese (α Orionis). In manyways, these stars resemble the RV Tau and Mira variables,but without the regularity of variability associated withthese other stars.

The SRc, or supergiant semiregular variables, have am-plitudes of up to 1 magnitude and periods from tens tothousands of days. Their luminosities range from 104

to 105 L, and temperatures are generally in the range of2000 to 4000 K. The SRc’s overlap with the R CrB stars.The OH/IR stars, red supergiants that display OH maseremission and large middle and far infrared luminosities,may be related to these stars. Other subclasses of thesemiregular variables are the SRa and SRb stars, giantswith similar periods but lower luminosities, and PV Telstars, which are helium-rich, long-period supergiants oflow amplitude (several tenths of a magnitude). The SRaand SRb stars overlap with the Mira variables.

6. Luminous Blue Variables

The luminous blue variables (LBVs) are a recently rec-ognized class of massive supergiants. Also called S Dorvariables, after their Large Magellanic Cloud prototype,these are the most luminous stars in a galaxy and easilyidentified in extragalactic systems. Their luminosities areoften about 106 L, and inferred masses are in excess of30 M Their amplitudes range from less than 1 magnitudeto over 5. Also called Hubble–Sandage variables, they dis-play long (years) time scales for temperature and luminos-ity changes, often ejecting shells that form pseudophoto-spheres, causing their spectral types to change from O toas late as F. The galactic supergiant P Cyg, well knownfor its high rate of mass loss and irregular light variations,is a galactic member of this class. Others recently recog-nized as LBVs are AG Car and HR Car in the Galaxy, HD269858f in the LMC, R40 in the SMC and AE And andAF And in M 31.

B. Nonradial Pulsators

1. β Cephei Stars

The name of this class is dependent on which side of theAtlantic one is on. The Europeans typically refer to theseas β Canes Majoris stars, while the North Americans typ-ically use β Cep as the prototype.

These stars are the prototypes of nonradial pulsation.They do not show simple mode structure, but often pul-sate in several different modes at once. Their periods areextremely short, about 3 to 6 hr. Several Be stars (emissionline B stars that are generally rapid rotators) are also inthis class, the most notable being λ Eri. The best studiedsubclass, the short-period group, lies around spectral typeB2-3 IV–V, displays periods from 0.02 to 0.04 days, andhas amplitudes between 0.015 and 0.025 magnitudes. Theshort period and the high-mode number are indicative ofatmospheric or shallow envelope pulsation, but the detailsof the driving mechanism for these stars are not currentlyunderstood.

The β Cep stars show an instability strip unrelated tothat for the classical Cepheids. It is roughly parallel to themain sequence, and at luminosity class IV, characteristicof post-hydrogen-core exhaustion stars. The pulsation isopacity-driven at temperatures of about 105 K, a result thatfollows from the revision of stellar opacity that was re-quired to explain the 5 Cep stars.

Apparently related to these stars is a class of massivepulsators, which generally lies between O8 and B6, cover-ing the range in luminosity from main sequence to super-giant. These nonradial pulsators show periods between0.1 and about 1 day and small amplitudes, usually lessthan 0.3 magnitudes. Maximum brightness correspondsto minimum radius, indicative of adiabatic pulsation butdistinctly different from the normal behavior observed forthe radial pulsators. Spectral lines in these stars show theeffects of distortion of the photosphere, varying in shapeduring the pulsation period in a fashion that indicates dis-tortion of the stellar surface.

2. White Dwarf Stars

Several classes of variables are found among degeneratestars. The hottest are called GW Vir stars, or PG 1159-035objects, after their prototype. These stars are extremelyhot helium white dwarfs, typically with temperatures ofabout 105 K. They show small amplitude pulsation, whichis usually observed in many nonradial modes. Because ofthe dependence of the period on the temperature of the at-mosphere, the star’s period can change during the courseof centuries as the dwarf cools. The pulsation periods areabout 500 sec, indicative of nonradial modes; radial pul-sation periods for degenerate stars are several orders of

Page 178: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

746 Stars, Variable

magnitude smaller than this. Several of the central stars ofplanetary nebulae have been found to be members of thisclass, with long periods up to about 2000 sec, indicatingextensive atmospheres and temperatures that may be ashigh as 150,000 K.

The cooler class, which lies in the instability strip andwas one of the few predicted classes of variable stars, is theZZ Ceti stars. These are DA-type white dwarf stars, withtemperatures between 104 K and 2 × 104 K. Here, the driv-ing is due mainly to the hydrogen convection zone. Periodsare about 500 to 1000 sec, and masses are about 0.6 M.The DBVs, helium-rich analogs of the ZZ Cet stars, havetemperatures between 2 × 104 K and 3 × 104 K but oth-erwise similar properties. In both cases, the amplitudesare between a few tens and hundreds of millimagnitudes.Among the cataclysmic systems, there are many whitedwarf systems that show multiple periodicities, but it isnot clear whether these should be ascribed to pulsation.

3. The Rapid Magnetic Pulsators

These are main-sequence A type stars with strong mag-netic fields that inhabit the instability strip at the sameplace as the δ Sct stars. The difference between these andthe normal variables of the main sequence is that theydisplay time variable signatures of pulsation, which aredue to the site of surface displacement. They provide aunique chance to study the interaction between pulsationand stellar magnetism.

The periods range from about 5 to 25 min, indicative ofatmospheric pulsation, and they show very small ampli-tudes (typically less than 0.m01. The pulsation appears tobe active only in the vicinity of the magnetic poles. Thus,as the star rotates, the polar region crosses the line ofsight and the pulsation is observable. When the magneticequator crosses the observer’s sight line, the pulsation dis-appears. The mode of the pulsation is very high, typicallyl = 20–40. The presumed driving is essentially the sameas for the δ Sct stars, that is, the hydrogen convectionzone is unstable. According to Shibahashi and Cox, themodification introduced by the magnetic field is to sup-press pulsation at the magnetic equator by the addition ofa strong restoring force, while producing an overstabilityat the poles. About a dozen such systems are known, allconfined to stars with magnetic fields of several hundredto several thousand gauss.

V. PRE-MAIN-SEQUENCE VARIABLES

The variability of stars in the stage before the onset ofhydrogen core burning is not presently well understood.Intrinsic variations are observed in both the T Tauri stars,

which are the young, late-type stellar population associ-ated with, but not imbedded in, their parent molecularclouds, and the FU Orionis stars. The latter are notable forlarge amplitude outbursts, sometimes of more than 5 mag-nitudes, taking place over a period of months to years.These have been recently explained as arising from an ac-cretion disk instability in the environment of the formingstar. Neither of these classes of stars appears to be pulsa-tionally unstable, although detailed modeling is difficult.All stars in evolutionary stages are characterized by emis-sion line spectra and strong stellar winds. One subclass,the YY Ori stars, shows reverse stellar wind profiles, withabsorption on the red side of the emission line suggestiveof mass accretion.

The Herbig Ae/Be stars, more massive pre-main se-quence counterparts of the τ Tau stars, have also beenpredicted to display an instability strip for 1 to 4 M(Marconi and Palla, 1998). The stars for which such vari-ations have been detected, HR 5999, V351 Ori, and HD35929, have similar periods to δ Sct stars.

VI. CONCLUDING REMARKS

As a tool for the study of stellar interiors, pulsation theoryand observations of stellar variability are only now ma-turing. Increased instrumental sensitivity is allowing thestudy of very small scale, nonperiodic oscillations in solar-type stars, the field of helio- and asteroseismology; and it isnow possible to compute physically interesting nonlinear,nonadiabatic pulsation models for a wide variety of stel-lar interior conditions. The effects of magnetic fields androtation are being included in theoretical computations.Improvements in detectors, especially the widespread useof CCDs, are quickening the pace of discovery and studyof faint variables. Improved model atmospheres allow forthe study of stellar abundances with increased precision.The advent of true flux calibration spectra, again usingCCD detectors, allows for direct comparison with mod-els and much improved radial velocity determinations atmany different wavelengths.

The field of variable star research is also a fruitful onefor both amateurs and professionals with access to smalltelescopes; it is a field in which every newly determinedlight curve makes a substantive contribution to our knowl-edge of the cause and behavior of pulsating stars.

SEE ALSO THE FOLLOWING ARTICLES

BINARY STARS • NEUTRON STARS • PULSARS • STAR

CLUSTERS • STARS, MASSIVE • STELLAR SPECTROSCOPY

• STELLAR STRUCTURE AND EVOLUTION

Page 179: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GRB Final pages

Encyclopedia of Physical Science and Technology EN015H-726 August 2, 2001 16:5

Stars, Variable 747

BIBLIOGRAPHY

Brown, T. M., and Gilliland, R. C. (1994). “Asteroseismology,” Ann.Rev. Astr. Ap. 32, 37.

Cox, J. P. (1980). “Theory of Stellar Pulsation,” Princeton Univ. Press,Princeton, NJ.

Cox, A. N., Sparks, W., and Starrfield, S. G., eds. (1987). “Stellar Pul-sation,” Springer-Verlag, Berlin/New York.

Feast, M. (1999). “Cepheids as distance indicators,” PASP, III, 775.Gautschy, A., and Saio, H. (1996). “Stellar pulsation across the HR

diagram: Part 2,” Annu. Rev. Astron. Astrophys. 34, 551.Hansen, C. J. (1978). Secular stability, Annu. Rev. Astron. Astrophys. 16,

15.Heck, A., and Caputo, F., eds. (1999), “Post-Hipparcos Cosmic Can-

dles,” Kluwer, Dordrecht.

Hoffmeister, C., Richter, G., and Wenzel, W. (1985). “Variable Stars,”Springer-Verlag, Berlin/New York.

Ibanoglu, Ca, ed. (2000). “Variable Stars as Essential AstrophysicalTools (NATO Science Series. Series C, Vol. 544),” Kluwer, Dordrecht.

Kholopov, P. N., ed. (1985). “General Catalogue of Variable Stars: 4thEdition” (3 Vols.), NAUKA, Moscow.

Kurtz, D. W. (1990). “Rapidly pulsating Ap stars,” Annu. Rev. Astr.Astrophys. 28, 607.

Ledoux, P., and Walraven, T. (1958). In “Handbuch der Physik”(E. Flugge, ed.), Vol. 51, p. 353, Springer-Verlag, Berlin/New York.

Marconi, M., and Palla, F. (1998). Ap J (Letters). 507, L141.Rosseland, S. (1949). “The Pulsation Theory of Variable Stars,” Oxford

Univ. Press, London/New York.Unno, W., Osaki, Y., and Shibahashi, H. (1979). “Nonradial Oscillations

of Stars,” Tokyo Univ. Press, Tokyo.

Page 180: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and EvolutionPeter BodenheimerLick Observatory, University of California,Santa Cruz

I. IntroductionII. Observational InformationIII. Physics of Stellar InteriorsIV. Stellar Evolution before the Main SequenceV. The Main Sequence

VI. Stellar Evolution beyond the Main SequenceVII. Final States of Stars

VIII. Summary: Important Unresolved Problems

GLOSSARY

Brown dwarf Object in the substellar mass range (0.01–0.075 solar masses) which, during its entire evolution,never attains interior temperatures high enough to ac-count for 100% of its luminosity by nuclear conversionof hydrogen (1H) to helium.

Degenerate gas Gas in which the elementary parti-cles of a given type fill most of their availablemomentum quantum states as determined by thePauli exclusion principle. Electron degeneracy oc-curs in the cores of highly evolved stars and inwhite dwarfs; neutron degeneracy occurs in neutronstars.

Effective temperature Surface temperature of a star cal-culated from its luminosity and radius under the as-sumption that it radiates as a black body.

Galactic cluster Gravitationally bound group of a fewhundred or a few thousand stars found in the disk ofthe galaxy. In a given cluster all stars are assumed

to have been formed at about the same time, but awide range of ages is represented among the variousclusters.

Globular cluster Compact, gravitationally bound groupof 105–106 stars, generally found in the halo of a galaxyand formed early in the history of the galaxy.

Helioseismology Study of the internal structure of the sunthrough observation and analysis of the small oscilla-tions at its surface.

Hertzsprung-Russell diagram A plot whose ordinateindicates the luminosity of a star and whose abscissaindicates the effective temperature of the star. A givenstar at a given time is represented by a point in this dia-gram. The evolution of a star is represented by a curve(or track) in this diagram.

Horizontal branch Sequence of stars on theHertzsprung-Russell diagram of a globular clus-ter, above the main sequence. The stars all haveapproximately the same luminosity and are in theevolutionary phase where helium is burning in the core.

45

Page 181: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

46 Stellar Structure and Evolution

Luminosity Total rate of radiation of electromagnetic en-ergy from a star, in all wavelengths and in all directions.Unit: energy per unit time.

Main sequence Sequence, or band, of stars in theHertzsprung-Russell diagram, on which a large frac-tion of stars fall, running diagonally from high lumi-nosity and high effective temperature to low luminosityand low effective temperature and associated with theevolutionary phase in which the stars burn hydrogen tohelium in their cores.

Neutrino Subatomic neutral particle produced in starschiefly in beta-decay reactions, but also by other pro-cesses, which travels at the speed of light and interactsonly very weakly with matter.

Neutron star Highly compressed remnant of the evolu-tion of a star of high mass. Its main constituent is freeneutrons, the degenerate pressure of which supports thestar against gravitational collapse. The mean density iscomparable to that of nuclear matter, 1014 g cm−3.

Nova Sudden, temporary, but recurrent brightening ofa star by factors ranging from 10 to 106, occurringin a binary system where, in most cases, a whitedwarf is accreting mass from a main-sequence com-panion. The outbursts are caused either by insta-bility in the accretion disk surrounding the whitedwarf (dwarf nova) or by nuclear reactions in thematerial recently accreted onto its surface (classicalnova).

Nucleosynthesis Production of the elements through nu-clear reactions in stars or in the early universe.

Protostar Object in transition between interstellar andstellar densities, during which it undergoes hydrody-namic collapse and is observable primarily in the in-frared part of the spectrum.

Red giant Post-main-sequence star whose radius ismuch larger than its main-sequence value and whoseeffective temperature generally falls in the range3000–5000 K.

Supernova Sudden increase in luminosity of a star, bya factor of up to 1010, followed by a slower declineover a time of months to years. The event is causedby explosion of the star and dispersal of much of itsmatter.

White dwarf Compact star (mean density, 106 g cm−3)representing the final stage of evolution of a star of lowto moderate mass. It is supported against its gravity bythe pressure of degenerate electrons.

Zero-age main sequence Line in the Hertzsprung-Russell diagram corresponding to the points wherestars of different masses first arrive on the main se-quence and representing the first moment, for a givenstar, when 100% of its luminosity is provided by thefusion of protons to helium.

STELLAR STRUCTURE is the study of the internalproperties of a star, such as its temperature, density, andrate of energy production, and their variation from cen-ter to surface. The structure can be determined through acombination of theoretical calculations, based on knownphysical laws, and observational data. Stellar evolutionrefers to the change in these physical properties with time,again as determined from both theory and observation. Thethree main phases of stellar evolution are (1) the pre-main-sequence phase, during which gravitational contractionprovides most of the star’s energy; (2) the main-sequencephase, in which nuclear fusion of hydrogen to helium inthe central region provides the energy; and (3) the post-main-sequence phase, in which hydrogen burning awayfrom the center, as well as the burning of helium, carbon,or heavier elements, may provide the energy. The evo-lutionary properties are strongly dependent on the initialmass of the star and, to some extent, on its initial chemicalcomposition. The end point of the evolution of a star canbe a white dwarf, a neutron star, or a black hole. In accor-dance with this picture, a star can be defined as a gaseousobject that, at some point in its evolution, obtains 100%of its energy from the fusion of protons (1H) to heliumnuclei. The boundary in mass between stars and substel-lar objects (also known as brown dwarfs) falls at about0.075 solar masses (M), below which the 100% energycondition is never fulfilled.

I. INTRODUCTION

The study of the structure and evolution of the stars, whichconstitute the major fraction of the directly observablemass in the universe, is of critical importance for the under-standing of the production of the chemical elements heav-ier than helium, of the evolution of the solar system andother planetary systems, of the structure and energetics ofthe interstellar medium, and of the evolution of galaxies asa whole. The structure of a star is determined by the inter-action of a number of basic physical processes, includingnuclear fusion; the theory of energy transport by radia-tion, convection, and conduction; atomic physics involv-ing especially the interaction of radiation with matter; andthe equation of state and thermodynamics of a gas. Theseprinciples, combined with basic equilibrium relations andassumed mass and chemical composition, allow the con-struction of mathematical models of stars that give, as afunction of distance from the center, the temperature, den-sity, pressure, and rate of change of chemical compositionby nuclear reactions. A star is not static, however; it mustevolve in time, driven by the loss of energy from its sur-face, primarily in the form of radiation. This energy is pro-vided by two fundamental sources—nuclear energy and

Page 182: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 47

gravitational energy—and in the process of providing thisenergy the star undergoes major changes in its structure. Tofollow this evolution mathematically requires the solutionof a complicated set of equations, a solution that requiresthe use of high-speed computers to obtain sufficient detail.The goal of the calculations is to obtain a complete evo-lutionary history of a star, as a function of its initial massand chemical composition, from its birth in an interstellarcloud to its final state as a compact remnant or possibly asan object completely disrupted by a supernova explosion.

The heart of the study of stellar structure and evolu-tion is, however, the comparison of models and evolu-tionary tracks with the observations. There are numer-ous ways in which such comparisons can be made, forexample, by use of the Hertzsprung-Russell (H–R) dia-grams of star clusters, the mass–luminosity relation onthe main sequence, the abundances of the elements at dif-ferent phases of evolution, and the mass–radius relationfor white dwarfs. There are many exotic stars that the the-ory is not yet able to fully explain, such as pulsars, novae,X-ray binaries, some kinds of supernovae, or stars showingrapid mass loss. These systems provide a challenge for thefuture theorist. However, the general outline of the phasesof stellar evolution has by now fallen into place througha complex interplay between theoretical studies, obser-vations of stars, and laboratory experiments, particularlythose required to determine nuclear reaction rates. The pe-riod of development of ideas concerning the structure ofstars extends more than 100 years into the past. Amongthe noteworthy historical developments were the clarifi-cation by Sir Arthur Eddington (1926) of the physics ofradiation transfer; the development by S. Chandrasekhar(1931) of the theory of white dwarf stars and the derivationof their limiting mass; and the work of H. Bethe (1939)and others, which established the precise mechanisms bywhich the fusion of hydrogen to helium provides most ofthe energy of the stars. However, much of the detailed de-velopment of the subject has occurred since 1955, spurredby the availability of high-speed computers and by theextension of the observational database from the opticalregion of the spectrum into the radio, infrared, ultraviolet,X-ray, and gamma-ray regions. Numerous scientists havecollaborated to advance our knowledge of the physics ofstars in all phases of their evolution.

II. OBSERVATIONAL INFORMATION

The critical pieces of observational data include the lumi-nosity, effective temperature, mass, radius, and chemicalcomposition of a star. A further fundamental piece of in-formation that is required to obtain much of this data isthe distance to the star, a quantity that in general is diffi-

cult to measure because even the nearest star is 2.6 × 105

astronomical units (AU) away, where the AU, the meandistance of the earth from the sun, is 1.5 × 1013 cm. TheAU, measured accurately by use of radar reflection offthe surface of Venus, is used as the baseline for trigono-metric determinations of stellar distances. The apparentshift ( parallax) in the position of a star, against the back-ground defined by more distant stars or galaxies, whenviewed from different points in the earth’s orbit, allowsthe distance to be determined. The parsec (pc) is definedas the distance of a star with an apparent positional shiftof 1 sec of arc on a baseline of 1 AU and has the valueof 3.08 × 1018 cm. The nearest stars (the Alpha Centauritriple system) are about 1.3 pc away. The most accurateavailable database of parallaxes was obtained from theHipparcos satellite, which measured 120,000 stars withan accuracy down to 0.001 arcsec. Thus, the distance outto which accurate distance measurements are available isabout 150 pc, still small compared with the distance to thecenter of our galaxy (≈8000 pc). Thus, for most stars indi-rect determinations of distances must be used, based, forexample, on period–luminosity relations for variable starsor on properties of the spectrum and apparent luminosityof a star compared to those of a star with a very similarspectrum and known distance.

A. Luminosity

The standard unit of luminosity is that of the sun, whichis obtained by a direct measurement of the amount ofenergy S, received per unit area per unit time, over allwavelengths, outside the earth’s atmosphere, at the meandistance of the earth from the sun. This quantity, knownas the solar constant, is then converted into the solar lu-minosity by using the formula

L = 4πd2S = 3.86 × 1033 erg/sec,

where d = 1 AU. For other stars, in principle, the stellarflux S (energy per unit area per unit time) received at theearth is corrected for absorption in the earth’s atmosphereand in interstellar space and is extended to include allwavelengths of radiation. If the star’s distance d is known,its luminosity follows from L = 4πd2S.

B. Effective Temperature

A number of different methods are used to determine theeffective (surface) temperature, most of which are basedon the assumption that the stars radiate into space witha spectral energy distribution that approximates a blackbody. For a few stars, such as the sun, whose radius R canbe measured directly, the value of Teff is obtained from Land R by use of the black-body relation L = 4π R2σ T 4

eff,

Page 183: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

48 Stellar Structure and Evolution

where σ is the Stefan-Boltzmann constant. Otherwise thetemperature may be estimated by four different methods.

1. The detailed spectral energy distribution is mea-sured, and the temperature of the black-body distributionthat best fits it is found.

2. The wavelength λmax of maximum intensity in thespectrum is measured, and the temperature is found fromWien’s displacement law for a black body: λmaxT =2.89 × 107, if λmax is expressed in angstroms.

3. The “color” of the star is obtained by measurementof the stellar flux in two different wavelength bands. Forexample, the color “B − V” is obtained by comparing thestar’s flux in a wavelength band centered at 4400 A andabout 1000 A broad (blue) with that in a band of similarwidth centered at 5500 A (visual). In principle, the trans-mission properties of the B and V filters could be used inconnection with black-body curves to determine the tem-perature. In practice, stars are not perfect black bodies,and the temperature is determined by comparison with aset of standard stars.

4. From the strength in the stellar spectrum of absorp-tion lines of various chemical elements, the spectral typeand thereby the temperature can be determined by com-parison with a set of standard stars. This method doesnot depend on the black-body assumption, and the tem-perature is, in principles, determined from the degree ofexcitation and ionization of the atoms.

C. Radius

The radius can be measured directly for only a few stars.In the case of the sun, the angular size can be measuredand the distance is known. In the case of Sirius and a fewother stars, the angular diameter can be measured by use ofinterferometry or by high-resolution direct imaging withthe Hubble Space Telescope. For certain eclipsing binarysystems, if the orbital parameters are known and the lightvariation with time can be accurately measured, the radiiof both components can be determined. In all other casesthe radius must be estimated from measurements of L andTeff and the formula L = 4π R2σ T 4

eff.

D. Mass

A stellar mass can be measured directly only if the staris a member of a binary system, in which case Kepler’sthird law can be applied. If M1 and M2 are the stellarmasses, in units of the solar mass M, P is the orbitalperiod of the system in years, and a is the semimajor axisof the relative orbit in astronomical units, then the lawstates that (M1 + M2)P2 = a3. In the case of the sun, theorbital periods and distances of the planets can be used for

an accurate determination of solar masses. If both com-ponents of a binary are visible (visual binary) and if theangular separation as well as P can be determined, thenthe sum of the masses follows from Kepler’s law as longas the distance is known. The individual masses can befound if the relative distances of the stars from the centerof mass can be measured. If the binary system has sucha close separation that the components cannot be visuallyresolved, it may be possible to resolve them spectroscop-ically. If the Doppler shifts of spectral lines as a functionof time can be measured for both components, then theperiod can be determined as well as the mass ratio andminimum masses of the components. If in addition thesystem shows an eclipse, then the individual masses canbe determined. There are relatively few systems with therequired orbital characteristics to allow reasonably accu-rate mass determinations.

E. Abundances

The relative abundances of the elements in the solar sys-tem (at approximately the time of its formation) have beencompiled by E. Anders and N. Grevesse, based in mostcases on laboratory measurements of the oldest meteoriticmaterial and in some cases from the strengths of absorp-tion lines in the solar atmosphere. Abundances for selectedelements are given in Table I. In stellar evolution theory,the fractional abundance of H by mass is known as X ,that of He is known as Y , and that of all other elementsare known as Z . In stars, the abundances are determined

TABLE I Solar System Abundances

Log abundance bynumber of atoms Fractional

Element (H = 1012) abundance by mass

1 H 12.00 0.704

2 He 11.00 0.279

3 Li 3.31 9.89 × 10−9

6 C 8.55 2.97 × 10−3

7 N 7.97 9.12 × 10−4

8 O 8.87 8.28 × 10−3

10 Ne 8.07 1.66 × 10−3

11 Na 6.33 3.43 × 10−5

12 Mg 7.58 6.45 × 10−4

13 Al 6.47 5.56 × 10−5

14 Si 7.55 6.96 × 10−4

16 S 7.21 3.63 × 10−4

20 Ca 6.36 6.41 × 10−5

24 Cr 5.67 1.70 × 10−5

26 Fe 7.51 1.26 × 10−3

28 Ni 6.25 7.29 × 10−5

Page 184: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 49

from the strengths of absorption lines in the spectrum com-bined with other parameters in the stellar atmosphere andcompared with solar values. In practically all measuredsystems, the values of X and Y are deduced to be verysimilar to the solar values. However, Z can vary, and inthe oldest stars and globular clusters, it can fall below thesolar value by a factor of more than 100.

F. Hertzsprung-Russell Diagram

The Hertzsprung-Russell (H–R) diagram is a plot of lu-minosity versus surface temperature for a set of stars. Al-though the data can be plotted in various forms, the sam-

FIGURE 1 H–R diagram for the 100 brightest stars (open circles) and the 90 nearest stars (filled circles). [Reprintedwith permission from Jastrow, R., and Thompson, M. H. (1984). “Astronomy: Fundamentals and Frontiers,” 4th ed.,Wiley, New York. 1984, Robert Jastrow.]

ple H–R diagram shown here (Fig. 1) gives data convertedfrom observed quantities to L and Teff. Most of the starslie along the main sequence, which represents the locus ofstars during the phase of hydrogen burning in their cores,with increasing Teff corresponding to increasing mass. Thestars well below the main sequence are in the white dwarfphase, having exhausted their nuclear fuel. The stars in theupper-right part of the diagram are red giants (e.g., Alde-baran); these are stars that have exhausted their centralhydrogen and are now burning hydrogen in a shell regionaround the exhausted core. Some of them are also burninghelium in a core or in a shell. The number of stars in agiven region of the H–R diagram is roughly proportional

Page 185: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

50 Stellar Structure and Evolution

FIGURE 2 Observed masses of main-sequence stars in units ofthe solar mass as determined from binary orbits plotted againsttheir luminosities in units of the solar luminosity (filled circles).The open circles are subgiants or giants. The observational er-rors in mass are less than 2%. [Reprinted with permission fromAndersen, J. (2000). In “Unsolved Problems in Stellar Evolution”(M. Livio, ed.), Cambridge Univ. Press, Cambridge, UK. Reprintedwith permission of Cambridge, University Press.]

to the evolutionary time spent in that region; thus, themain-sequence or core hydrogen-burning phase is that inwhich stars spend most of their lifetime.

G. Mass–Luminosity Relation

For stars known to be on the main sequence, the obser-vational information on M and L can be combined toproduce a reasonably smooth mass–luminosity relation(Fig. 2). The slope of this relation is not constant, how-ever. Empirically, the luminosity increases approximatelyas M3 for stars in the region of 10–20 M and as M4.5

for stars between 1–2 M. A few additional masses areavailable for stars less massive than the sun, but the ac-curacy is not as good as for those shown in Fig. 2. Theo-retical models, with composition close to that of the sunat the zero-age main sequence, agree well with the ob-served points. There are two reasons for the scatter in theobserved points: first, some of the observed stars are notstrictly at zero age but have evolved slightly, with a corre-sponding increase in L; and second, there are small differ-ences in metal abundance among the stars, which affectthe luminosities. Because stars evolve and change theirluminosity considerably while retaining the same mass,a mass–luminosity relation cannot be specified for mostregions of the H–R diagram, only near the zero-age mainsequence.

H. Stellar Ages

Although the age of the solar system (and therefore pre-sumably the sun) can be determined accurately from ra-dioactive dating of the oldest moon rocks and meteorites,the ages of other stars cannot be determined directly. Sev-eral indirect methods exist, which depend for the most partupon the theory of stellar evolution, and generally, thereis some uncertainty in the derived ages.

1. The H–R diagrams of galactic or globular clusterscan be compared with evolutionary calculations. The starsin a cluster are assumed to be all of the same age; to havethe same composition; and, except for the very nearestclusters, to all lie at the same distance from the earth. Theobserved cluster diagram is compared with a theoreticalline of constant age, known as an isochrone, obtained bycalculating the evolution in the H–R diagram of a set ofstars with different masses but the same composition andby connecting points on their evolutionary tracks that cor-respond to the same elapsed times since formation. Thisprocedure is illustrated in Figs. 3 and 4, which show howthe age is determined for two different clusters. Fig. 3shows the Hyades, a nearby galactic cluster with a distanceof 46.3 pc. The age is determined from the positions in the

FIGURE 3 H–R diagram for the Hyades cluster. The luminosityis given in terms of the absolute visual magnitude Mv, and theobserved color (B − V) is an indicator of surface temperature (in-creasing to the left). The filled circles with error bars correspondto the observations of the individual single stars, whose distanceshave been determined from Hipparcos parallaxes. The symbolswith arrows correspond to the two components of a spectroscopicbinary. The dotted curve corresponds to the theoretical zero-agemain sequence for the composition of this cluster. Other curvesare theoretical isochrones, that is, predictors of how the H–R dia-gram should look at the indicated ages. [Reprinted with permissionfrom Perryman, M. A. C., et al. (1998). Astron. Astrophys. 331, 81. European Southern Observatory.]

Page 186: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 51

FIGURE 4 H–R diagram for the stars in the globular cluster M92.The axes have the same meaning as in Fig. 3. The filled circlesare averaged observed stellar positions. The solid curves are the-oretical isochrones for the ages (top to bottom) of 8, 10, 12, 14, 16,and 18 Gyr. The evolved stars in the upper-left part of the diagramare on the “horizontal branch,” where they are burning He in theircores. [Adapted with permission from Pont, F., Mayor, M., Turon,C., and VandenBerg, D. A. (1998). Astron. Astrophys. 329, 87. European Southern Observatory.]

H–R diagram of the stars near the “turnoff” point, usuallyidentified as the point of highest effective temperature onthe main sequence. The time spent by a star on the mainsequence decreases with increasing mass; thus, stars withmasses above that corresponding to the turnoff mass havealready left the main sequence and have evolved to the redgiant region. In this case the age is determined to be 625Myr, with an error of ±50 Myr. Fig. 4 shows the globularcluster M92, one of the oldest objects in the galaxy. Itsdistance is about 9000 pc, and the abundance of metals,such as iron, is less than 1/100 that of the sun. The dis-tance to the cluster is determined by fitting the locationof its main sequence in the H–R diagram to the locationsof nearby stars of the same (reduced) metallicity whosedistances are known from Hipparcos parallaxes. The agederived from isochrone fitting is 14 ± 1.2 Gyr. Note thatthe luminosity of the turnoff is much fainter than that forthe Hyades cluster. In the cluster fitting method, the ageof the stars is really being determined from the nuclearburning time scale.

2. Main-sequence stars like the sun have chromo-spheric activity, analogous to solar activity, such as flares,prominences, and chromospheric emission lines, whichdeclines with age. An often-used indicator of chromo-spheric activity is the strength of two emission features inthe Ca lines between 3900 and 4000 A. The age-activityrelation is calibrated by use of objects whose ages havebeen determined by other methods, such as the sun andthe Hyades cluster.

3. Certain special types of stars are known to be young,that is, with ages of 1 to a few million years. One exam-ple is the T Tauri stars, which are identified by their highlithium abundances, the presence of emission lines of hy-drogen, their irregular variability, their association withdark clouds (star-forming regions) in the galaxy, and theirlocation in the H–R diagram above and to the right of themain sequence. Typical masses are 0.5–2 M, and typicalvalues of Teff are around 4000 K. A second example is themassive stars near the upper end of the main sequence,with L/L ≈ 105. They are also known to be young be-cause their nuclear burning time scale is only a few Myr.

4. The stars in the galaxy have been roughly divided,according to age, into two populations. The PopulationI stars are associated with the galactic disk, have rela-tively small space motions with respect to the sun’s, andhave metal (such as iron) abundances similar to the sun’s.Although precise ages cannot be determined, this popu-lation is younger as a group than the Population II stars,which are distributed in the galactic halo and in globu-lar clusters and have high space velocities and low metalabundances. These objects were formed early in the his-tory of the galaxy (see Fig. 4), and their low metal abun-dance supports the point of view that the elements heav-ier than helium have been synthesized in the interiors ofthe stars. Subsequently, some of the synthesized materialwas ejected from the stars in supernova explosions and inwinds from evolved stars, so that later generations of starsform from interstellar material that has been gradually en-riched in the heavy elements.

III. PHYSICS OF STELLAR INTERIORS

A. Time Scales

The dynamical time scale, the contraction time scale, thecooling time scale, and the nuclear time scale are impor-tant at different stages of stellar evolution. If the star is notin hydrostatic equilibrium, it will evolve on the dynam-ical time scale given by tff = [3π/(32Gρ)]1/2; this is, infact, the free-fall time from an initial density ρ. Examplesof unstable stars that evolve on this time scale are proto-stars or the evolved cores of massive stars, which undergophotodissociation of the iron nuclei and consequently areforced into gravitational collapse. The resulting supernovaoutburst, of course, also takes place on a dynamical timescale. Another example is the light variation in Cepheidvariables, which is caused by radial oscillations about anequilibrium state; the oscillation period is of the sameorder as tff. Characteristic time scales are 105 years for theprotostar collapse, 0.1 sec for the collapse of the iron core,and 10 days for the pulsation period of a Cepheid.

Page 187: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

52 Stellar Structure and Evolution

Stars in hydrostatic equilibrium but without a substan-tial nuclear energy source undergo a slow contractionwith the release of gravitational energy. The associatedtime scale, known as the Kelvin-Helmholtz time scale, isgiven by the gravitational energy divided by the luminos-ity: tKH ≈ G M2/(RL), where R is the final radius and Lis the average luminosity. Stars contracting to the mainsequence evolve on this time scale as do, for example,stars at the end of the main-sequence phase when they runout of hydrogen fuel in the core. For the present sun, thevalue of tKH is about 3 × 107 years, which represents thetime required for it to contract to its present size withoutany contributions from nuclear reactions. Because tKH,as well as the other time scales discussed later, dependon L , they are controlled by the time required for en-ergy to be transported from the interior of the star to thesurface.

Stars that derive all their energy from nuclear burningevolve on a nuclear time scale, which can be estimatedfrom the total available nuclear energy divided by theluminosity. For the case of hydrogen burning, four pro-tons, each with a mass of 1.008 atomic mass units (amu),combine to form a helium nucleus which has 4.0027amu. The difference in mass of 0.0073 amu per protonis released as energy. The corresponding time scale istnuc = 0.007Mc2/L , where M is the amount of mass ofH that is burned and c is the velocity of light. For the sunon the main sequence, X ≈ 0.71, L ≈ L, and about 14%of the hydrogen is actually burned, so the estimate givestnuc ≈ 1. × 1010 years. In subsequent evolutionary phasesthe sun continues to burn hydrogen but at a higher rate.For a star of 30 M, L ≈ 2 × 105 L, about half of thehydrogen is burned, and tnuc ≈ 5 × 106 years. For heliumburning, where three helium nuclei combine to producecarbon, the energy release per atomic mass unit is a factor10 smaller than for hydrogen burning, and correspond-ingly, the time spent by a star in the core helium burningphase is a factor of 5–10 shorter than its main sequencelifetime, depending on the luminosity and the amount ofhydrogen burning going on in a shell at the same time.

A final time scale associated with stellar evolution is thecooling time scale. For white dwarf stars that can no longercontract and also have no more nuclear fuel available, theradiated energy must be supplied by cooling of the hotinterior. The electrons by this time are highly degenerate(see later discussion), so the available thermal energy isonly that of the ions. The time scale can be estimated from

tcool ≈ Ethermal/L = 1.5RgT M/(µAL),

where T is the mean internal temperature, Rg is the gasconstant, and µA is the mean atomic weight (in amu) ofthe ions. For example, the cooling time of a white dwarfof 0.7 M composed of carbon from the beginning of

the white dwarf phase to a luminosity of L = 10−3 Lis 1.15 × 109 years. The cooling time scale also appliesto substellar objects once they have contracted to theirlimiting radius.

B. Equation of State of a Gas

In the pre-main-sequence and main-sequence stages ofthe evolution of most stars, the ideal gas equation holds.The pressure of the gas is given by P = NkT , where kis the Boltzmann constant and N is the number of freeparticles per unit volume. If X, Y, and Z are the massfractions of H, He, and heavy elements, respectively, andthe gas is fully ionized, then

N = (2X + 0.75Y + 0.5Z )ρ/mH,

where mH is the mass of the hydrogen atom. It has been as-sumed that the number of particles contributed per nucleusis 2 for H, 3 for He, and A/2 (an approximation) for theheavy elements, where A is the atomic weight. In the outerlayers of the star, N must be adjusted to take into accountpartial ionization. The internal energy for an ideal gas is1.5kT per particle or 1.5kT N/ρ = 1.5P/ρ per unit mass.The equation of state can also be written P = RgρT/µ,where µ, the mean atomic weight per free particle, is givenby µ−1 = 2X + 0.75Y + 0.5Z for a fully ionized gas.

Under conditions of high temperature and low density,the pressure of the radiation must be added. Accordingto quantum theory, a photon has energy hν, where ν isthe frequency and h is Planck’s constant, and momentumhν/c. The radiation pressure is the net rate of transfer ofmomentum per unit area, normal to an arbitrarily orientedsurface. Under conditions in stellar interiors where the ra-diation is nearly isotropic and the system is near thermo-dynamic equilibrium, it can be shown that the radiationpressure is given by PR = 1

3 aT 4, where a, the radiationdensity constant, equals 7.56 × 10−15 erg cm−3 degree−4.

At high densities an additional physical effect must beconsidered, the phenomenon of degeneracy. The effect is aconsequence of the Pauli exclusion principle, which doesnot permit more than one particle to occupy one quan-tum state at the same time; it applies to elementary parti-cles with half-integral spin, such as electrons or neutrons.In the case of electrons bound in an atom, the princi-ple governs the distribution of electrons in various energystates. If the lowest energy states are filled, any electronsadded subsequently must go into states of higher energy;only a discrete set of states is available. The same prin-ciple applies to free electrons. In a momentum intervaldpx dpy dpz and in a volume element dx dy dz, the num-ber of states is ( 2

h3 ) dpx dpy dpz dx dy dz, where the factorof 2 arises from the two possible directions of the elec-tron spin. Degeneracy occurs when most of these states,

Page 188: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 53

up to some limiting momentum p, are occupied by par-ticles. Complete degeneracy is defined as the situation inwhich all states are occupied up to the limiting momen-tum p0 and all higher states are empty. Under this as-sumption, if the particles are electrons, their pressure canbe derived to be Pe = 1.0036 × 1013(ρ/µe)5/3 dyne cm−2,where µe, the mean atomic weight per free electron, is2/(1 + X ) for a fully ionized gas. This formula is valid ifpo is low enough so that the electron velocities are not rel-ativistic. In the limiting case that all electrons are movingwith the velocity of light, the corresponding expressionis Pe = 1.2435 × 1015(ρ/µe)4/3 dyne cm−2. In either casemost of the electrons are forced into such high momen-tum states that their pressure, defined again as the rateof transport of momentum per unit area, is much higherthan the ideal gas pressure. The electrons in stellar inte-riors become degenerate when the density reaches 103–106 g cm−3, depending on T . The protons or neutronsbecome degenerate only at much higher densities, about1014 g cm−3; these densities are characteristic of neutronstars, and it is the degenerate pressure of the neutrons thatsupports the star against gravity. The regions in densityand temperature where ideal gas, radiation pressure, andelectron degeneracy dominate in the equation of state areshown in Fig. 5.

C. Energy Sources

From the previous discussion of time scales, it is clearthat only two fundamental energy sources are available

FIGURE 5 Density–temperature diagram. The double lines sep-arate regions in which different physical effects dominate in theequation of state. Solid lines, denoted by H, He, and C, respec-tively, indicate the central conditions in stars that are undergo-ing burning of hydrogen, helium, and carbon in their cores. Thedashed line represents the structure of the present sun, while thesolid dot gives typical central conditions in a white dwarf of 0.8 M.The circled cross gives central conditions in a star of 25 M justbefore the collapse of the iron core.

to a star. Gravitational energy can be released either ona dynamical or a Kelvin-Helmholtz time scale. Nuclearenergy in moderate-mass stars is produced by conversionof hydrogen to helium or by conversion of helium to car-bon and oxygen. Only in the most massive stars do nuclearreactions proceed further to the synthesis of the heavier el-ements up to iron and nickel. During cooling phases starsdraw on their thermal energy, which, however, has beenproduced in the past as a consequence of nuclear reac-tions or gravitational contraction. As the star evolves, theenergy is taken up by radiation, heating of the star, expan-sion, neutrino emission, mass loss, ionization of atoms,and dissociation of nuclei. Here, we discuss in somewhatmore detail the nuclear energy source.

The nuclear processes involve reactions between charg-ed particles. At the temperatures characteristic of nuclearburning regions in stars, the gas may safely be assumedto be fully ionized. The Coulomb repulsive force betweenparticles of like charge presents a barrier for reactions be-tween the ions. Particles must come within 10−13 cm ofeach other before the strong (but short-range) nuclear at-tractive force overcomes the Coulomb force. The energythat is required for particles to pass through the Coulombbarrier is Ecoulomb = Z1 Z2e2/r , where r is the requiredseparation of 10−13 cm, e is the electronic charge, andZ1 and Z2 are, respectively, the charges on the two parti-cles involved, in units of the electronic charge. Evidently,favorable conditions for reactions will occur most read-ily for particles of small charge, but even for two protonsEcoulomb = 1000 keV. The only source for the energy isthe thermal energy of the particles, which, however, in thetemperature range around 107 K is only 3

2 kT ≈ 1 keV. Itwould seem that nuclear reactions under these conditionswould not be possible.

However, three factors contribute to a small but non-zero probability of reaction. (1) The particles have aMaxwell velocity distribution, so that a small fraction ofthem have thermal energies much higher than the average.(2) According to the laws of quantum mechanics, there isa small probability that a particle with much less energythan required can actually “tunnel” through the Coulombbarrier. (3) A star has a very large number of particles, sothat even if an individual particle has a very small proba-bility of reacting, enough reactions do occur to supply therequired energy. It turns out that in main-sequence starsparticles with energies of about 15–30 keV satisfy the tworequirements of existing in sufficient numbers and hav-ing enough energy to have a reasonable change of passingthrough the Coulomb barrier.

At temperatures of 1 to 3 × 107 K, which character-ize the interiors of most main-sequence stars, only re-actions between particles of low atomic number Z needbe considered, usually those involving protons. However,

Page 189: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

54 Stellar Structure and Evolution

even if the Coulomb barrier is overcome, there is stillonly a small probability that a nuclear reaction will oc-cur. This nuclear probability varies widely from one re-action to another, and it can be determined theoreticallyfor only the simplest systems; in general, these proba-bilities must be measured for each reaction in laboratoryexperiments at low energy. An intense effort was madeduring the period 1960–1980 to make measurements ofreasonable accuracy of all the nuclear reactions that areimportant in stars. Nevertheless, in some cases measure-ments cannot be made at the low energies appropriatefor main-sequence stars, and the probabilities have tobe extrapolated from measurements at higher energy. Thegroup headed by W. A. Fowler at the California Instituteof Technology led in the task of carrying out these difficultmeasurements.

Two reaction sequences have been identified that resultin the conversion of four protons into one helium nucleus,the proton-proton (pp) chains and the CNO cycle. Thereactions of the main branch of the pp chains (PP I) are asfollows:

1H + 1H → 2D + e+ + ν (1)2D + 1H → 3He + γ (2)

3He + 3He → 4He + 1H + 1H (3)

The symbol e+ denotes a positron; ν is the neutrino; and2 D is the deuterium nucleus, composed of one proton andone neutron. The positron immediately reacts with an elec-tron, with the annihilation of both and the production ofenergy. The total energy production of the sequence is 26.7MeV or 4.27 × 10−5 erg, corresponding to the mass dif-ference between four protons and one 4He nucleus, timesc2. Each sequence produces two neutrinos, which imme-diately escape from the star, as they interact only veryweakly with matter, carrying with them a certain fractionof the energy. The remainder of the energy is depositedlocally in the star, typically in the form of gamma rays.

Reaction (1) turns out to be very improbable, as the con-version of a proton into a neutron requires a beta decay.The reaction rate is so slow that it cannot be measuredexperimentally and must be calculated theoretically. Atthe center of the sun, a proton has a mean lifetime of al-most 1010 years before it reacts with another proton; thisreaction controls the rate of energy production, as subse-quent reactions are more rapid. The neutrino in Reaction(1) carries away an average energy of 0.265 MeV. Once2D is formed, it immediately (within 1 sec) captures an-other proton to form 3He. This reaction can be measuredin the laboratory down to energies close to those where re-actions can occur in stars. Any 2D initially present in thestar will be burned by this reaction at temperatures around106 K. When Reactions (1) and (2) have occurred twice,

the two resulting nuclei of 3He combine to form 4He andtwo protons. There is perhaps a 10% uncertainty in thisreaction rate and in most others in the pp chains.

The second branch of the pp chains (PP II) occurs whenthe 3He nucleus produced in Reaction (2) reacts with 4He.The sequence of reactions is then

3He + 4He → 7Be + γ (4)7Be + e− → 7Li + ν (5)7Li + 1H → 4He + 4He (6)

Under conditions of the solar interior, Reaction (4) pro-ceeds at approximately one-sixth of the rate of Reaction(3), and as the temperature increases it becomes relativelymore important. Reaction (5) involves an electron captureon the 7Be and the production of a neutrino with energy0.86 MeV (90% of the time) or 0.38 MeV (10% of thetime). The 7Li nucleus immediately captures a proton andproduces two nuclei of 4He. This reaction also results inthe destruction of any 7Li that may be present at the timeof the star’s formation, since it becomes effective at tem-peratures of 2.8 × 106 K or above. Note that the secondbranch of the pp chains has an average neutrino loss ofabout 1.1 MeV, but that otherwise the net result is thesame: the conversion of four protons to a 4He nucleus.

The third branch of the pp chains (PP III) occurs if aproton, rather than an electron, is captured by 7Be:

7Be + 1H → 8B + γ (7)8B → e+ + ν + 8Be → 4He + 4He (8)

Less than 0.1% of pp-chain completions in the sun oc-cur through Reactions (7) and (8); however, they are veryimportant because the neutrino produced has an averageenergy of 6.7 MeV, high enough to be detectable in allcurrent solar neutrino experiments.

The total rate of energy generation by the pp chainsinvolves a complicated calculation of the rates of all thereactions given above, which are functions of the temper-ature, density, and concentration of the species involved.If T6 is the temperature in units of 106 K, then under theassumption of equilibrium, that is, the abundances of theintermediate species 2D, 3He, 7Be, and 7Li reach steadystate, the energy generation rate for the pp chains can bewritten

εpp = 2.38×106ψg11ρX2T −2/36

× exp(−33.80T −1/3

6

)erg g−1 sec−1, (9)

where ρ is the density, X is the fraction of hydrogen bymass, and

g11 = 1 + 0.0123 T 1/36 + 0.0109 T 2/3

6 + 0.0009 T6.

Page 190: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 55

The function ψ is a correction factor, between 1 and 2,that accounts for (1) the increase by a factor of 2 in theenergy production rate, with respect to that for PP I, thatoccurs when the reactions go through PP II or PP III,and (2) the different neutrino energies in the three chains.For the center of the present sun, ψ ≈ 1.5. For temper-atures lower than and densities higher than those of thepresent solar center, the effect of electron shielding as wellas departure from equilibrium may have to be taken intoaccount.

Protons can also interact with the CNO nuclei, but be-cause the Coulomb barrier is higher these reactions be-come more important than the pp chains only at tempera-tures above 2 × 107 K. The reactions are the following:

12C + 1H → 13N + γ (10)13N → 13C + e+ + ν (11)

13C + 1H → 14N + γ (12)14N + 1H → 15O + γ (13)

15O → 15N + e+ + ν (14)15N + 1H → 12C + 4He (15)

The net effect is the conversion of four protons into ahelium nucleus, two positrons (which annihilate), and twoneutrinos (which carry away energies of 0.71 and 1.0 MeV,respectively). The CNO nuclei simply act as catalysts;however, their relative abundances change as a result ofthe operation of the cycle. Note from Table I that 12C isconsiderably more abundant than 14N in the solar system.However, the rate of Reaction (10), which uses up 12C,under stellar conditions is about 100 times faster than thatof Reaction (13), which uses up 14N. All other reactionsare much faster than these two. The reaction chain tends toreach equilibrium, in which each CNO nuleus is producedas fast as it is destroyed. The equilibrium is obtained onlywhen most of the 12C and other participating nuclei areconverted to 14N. This change in relative abundances couldbe observed if the layers in which the reactions occur couldlater be mixed to the surface of the star.

A secondary branch of the CNO cycle affects the abun-dance of 16O. Starting at 15N, the reactions are

15N + 1H → 16O + γ (16)16O + 1H → 17F + γ (17)

17F → 17O + e+ + ν (18)17O + 1H → 14N + 4He (19)

This branch accounts for only 0.1% of the completions,but the 16O is generated less rapidly than it is convertedinto 14N by Reactions (17)–(19), so that when the branchcomes into equilibrium the abundance ratio O/N will be

reduced. The overall effect of the CNO cycle, apart fromits energy generation, is the conversion of 98% of the CNOisotopes into 14N. Once the cycle reaches equilibrium, theenergy generation can be calculated by the rate of the slow-est reaction in the main chain (Reaction 13) multiplied bythe energy released over the whole cycle (24.97 MeV, afterneutrino losses):

εCNO = 8.67×1027g14,1ρX XCNOT −2/36

× exp( − 152.28T −1/3

6

)erg g−1 sec−1, (20)

where XCNO is the total mass fraction of all CNO isotopesand

g14,1 = 1 + 0.0027 T 1/36 − 0.00778 T 2/3

6 − 0.000149 T6.

At T6 ≈ 25, the energy generation goes approximately asT 17.

The synthesis of elements heavier than helium hasproved in the past to be a considerable problem, primarilybecause there is no stable isotope of atomic mass 5 or 8.Thus, if two 4He nuclei react to produce 8Be, the nucleuswill immediately decay back to He. If a proton reacts with4He, the result is 5Li, which also is unstable. Helium burn-ing must, in fact, proceed through a three-particle reaction,which can be represented as follows:

4He + 4He ↔ 8Be (21)8Be + 4He ↔ 12C + γ + γ (22)

Although the 8Be is unstable, its decay is not instanta-neous. Under suitable conditions, e.g., T = 108 K andρ > 105 g cm−3, it was shown by E. Salpeter that a suf-ficient equilibrium abundance exists so that a third 4Henucleus can react with it [Reaction (22)]. It was also pre-dicted by F. Hoyle that in order for this triple-alpha processto proceed at a significant rate, Reaction (22) must be reso-nant; that is, there must be an excited nuclear state accessi-ble in the range of stellar energies whose presence greatlyenhances the probability that the reaction will occur. Thisresonance was later found in experiments performed at theKellogg Radiation Laboratory at the California Institute ofTechnology. The amount of energy produced per reactionis 7.275 MeV, or 0.606 MeV per atomic mass unit, a fac-tor of 10 less than in hydrogen burning (per atomic massunit). An expression for the rate of energy production is

ε3α = 5.09 × 1011ρ2Y 3T −38

× exp( − 44.027T −1

8

)erg g−1 sec−1, (23)

where T8 = T/108 K, Y is the mass fraction of 4He, andelectron screening has not been included. This reaction hasa very steep temperature sensitivity, with ε ∝ T 40 at T8 = 1.A further helium burning reaction, which occurs at slightlyhigher temperatures than the triple-alpha process, is

Page 191: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

56 Stellar Structure and Evolution

12C + 4He → 16O + γ, (24)

with an energy production of 7.162 MeV per reaction andan uncertain reaction rate. Reactions (21), (22), and (24)probably produce most of the carbon and oxygen in theuniverse.

D. Energy Transport

The transport of energy outward from the interior of astar to its surface depends in general on the existenceof a temperature gradient. Heat will be carried by vari-ous processes from hotter regions to cooler regions; theprocesses that need to be considered include (1) radia-tion transport, (2) convective transport, and (3) conduc-tive transport. Neutrino transport must be considered onlyin exceptional circumstances when the density is above1010 g cm−3; at lower densities, neutrinos produced in starssimply escape directly without interacting with matter. Ineach case a relation must be found between the energyflux Fr , defined as the energy flow per unit area per unittime at a distance r from the center of the star, and thetemperature gradient dT/dr .

Energy transport by radiation depends on the emissionof photons in hot regions of the star and absorption ofthem in slightly cooler regions. The radiation field may becharacterized by the specific intensity Iν , which is definedso that Iν cos θ dν dω dt d A is the energy carried by abeam of photons across an element of area d A in time dtin frequency interval ν to ν + dν into an element of solidangle dω, in a direction inclined by cos θ to the normalto d A. The equation of transfer shows how the intensityof a beam is changed as it interacts with matter. The massemission coefficient jν is defined so that jν ρ dV dν dω dtis the energy emitted by the volume element dV into dω

in time dt in the frequency range dν. The correspondingmass absorption coefficient κν is defined so that the energyabsorbed in the same intervals is κν ρ Iν dV dν dω dt . Ifwe consider both the absorption and the emission of theradiation passing through a cylinder with length ds andcross section d A, the equation of transfer becomes

d Iνds

= −κνρ Iν + jνρ. (25)

Suppose that the direction ds is inclined by an angle θ tothe radial direction in the star so that the projected distanceelement dr = ds cos θ . Also, define the optical depth τν by

dτν = −κνρdr (26)

and the equation of transfer becomes

cos θd Iνdτν

= Iν − jνκν

. (27)

In the stellar interior the mean free path of a photon beforeit is absorbed is only 1 cm or less. Thus, a typical photon isabsorbed at practically the same temperature as it is emit-ted. These conditions are so close to strict thermodynamicequilibrium, where the specific intensity is given by thePlanck function Bν(T ), that the ratio jν/κν can be shownto be the same as it is in strict thermodynamic equilibrium,namely, jν/κν = Bν(T ). The equation of transfer in stellarinteriors can thus be expressed as

cos θd Iνdτν

= Iν − Bν(T ). (28)

This equation can be integrated over all frequencies andsolved for the flux in terms of the temperature gradient forconditions appropriate to the stellar interior, that is, wherethe temperature change is negligible over the mean freepath of a photon, to give

Fr = −(4ac)(3κρ)−1T 3(dT/dr ), (29)

where c is the velocity of light and a is the radiation den-sity constant. Here, κ is the absorption coefficient aver-aged over frequency according to the so-called Rosselandmean:

1

κ=

∫ ∞0

d Bν (T )/dT dν

κν,a [1 − exp(−hν/kT )]+κν,s∫ ∞0 d Bν(T )/dT dν

. (30)

Here, κν,a refers to processes of true absorption, whichhave to be corrected for induced emission, and κν,s refersto scattering processes. Equation (29) is known as the dif-fusion approximation for radiation transfer, an appropriatenomenclature because a photon is absorbed almost imme-diately after it is emitted, so that an enormous number ofabsorptions, re-emissions, and scatterings must occur be-fore the energy of a photon as transmitted to the surface.The time required for energy to diffuse in this mannerfrom the center of the sun to its surface is about 105 years.The quality of the radiation changes during this process.As the photons diffuse to lower temperatures, their energydistribution corresponds closely to the Planck distributionat the local T , because matter and radiation are well cou-pled. Thus, the gamma rays produced by nuclear reactionsat the center are gradually transformed into optical pho-tons by the time the energy reaches the surface.

As Eq. (29) indicates, the energy transport in a radia-tive star is controlled by the opacity κ . Numerous atomicprocesses contribute to this quantity, and in general, thestructure of a star can be calculated only with the aid ofdetailed tables of the opacity, calculated as a function ofρ, T , and the chemical composition. Starting at the high-est temperatures characteristic of the stellar interior andproceeding to lower temperatures, the main processes are(1) electron scattering, also known as Thomson scatter-ing, in which a proton undergoes a change in direction but

Page 192: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 57

no change in frequency during an encounter with a freeelectron; (2) free-free absorption, in which a photon is ab-sorbed by a free electron in the vicinity of a nucleus, withthe result that the photon is lost and the electron increasesits kinetic energy; (3) bound-free absorption on metals,also known as photoionization, in which the photon is ab-sorbed by an atom of a heavy element (e.g., iron) and oneof the bound electrons is removed; (4) bound-bound ab-sorption of a heavy element, in which the photon inducesan upward transition of an electron from a lower quantumstate to a higher quantum state in the atom; (5) bound-free absorption on H and He, which generally occurs nearstellar surfaces where these elements are being ionized;(6) bound-free and free-free absorption by the negativehydrogen ion H−, which forms in stellar atmospheres inlayers where H is just beginning to be ionized (example,the surface of the sun); (7) bound-bound absorptions bymolecules, which can occur only in the atmospheres of thecooler stars (Teff < 4000 K, although even the sun showsa few molecular features in its spectrum); and (8) absorp-tion by dust grains, which can occur in the early stagesof protostellar evolution and possibly in the atmospheresof brown dwarfs, at temperatures below the evaporationtemperature of grains (1400–1800 K).

The Thomson scattering from free electrons (in units ofcm2 g−1) is given by κ = 0.2(1 + X ). For stars this pro-cess is important generally in regions above the upper-most double line in Fig. 5, corresponding to the interiorsof massive stars. The free-free absorptions and the bound-free absorptions on heavy elements have the approximatedependence κ ∝ ρT −3.5. As one moves outward in a starfrom higher to lower T , the number of bound electronsper atom increases and, correspondingly, κ increases. Amaximum in κ occurs in the range 104 K < T < 105 K de-pending on density; this range corresponds to the zoneswhere H and He, the most abundant elements, undergoionization. Below this range in T , the above dependenceis no longer valid, and κ drops rapidly with decreasing T ,reaching a mimimum of about 10−2 at T = 2000 K. Atlower T , where grains exist, the opacities are higher, onthe order of 1 cm2 g−1.

Under certain conditions, the temperature gradientgiven by Eq. (29) can be unstable, leading to the onsetof convection. Suppose that a small element of material ina star is displaced upward from its equilibrium position.It may reasonably be assumed that the element remainsin pressure equilibrium with its surroundings. Thus, if thematerial has the equation of state of an ideal gas, an up-wardly displaced element with density less than that of thesurroundings will have a temperature greater than that ofthe surroundings. If, as the element rises, its density de-creases more rapidly than that of the surroundings, it willcontinue to feel a buoyancy force and will continue to rise.

In this case the layer is unstable to convection, and heat istransported outward by the moving elements themselves.If, however, the density of the upward-moving elementdecreases less rapidly than that of the surroundings, thedensities soon will be equalized, there will be no buoyancyforce, and the layer will be stable.

The condition for occurrence of convection in a star canbe expressed in another form. If the temperature gradientin a layer, calculated under the assumption of radiationtransfer, is steeper than the adiabatic temperature gradient,then convection will occur:

|dT/dr | = 3κρFr (4acT 3)−1 > |dT/dr |ad. (31)

A layer with high opacity, or opacity rapidly increasinginward, is therefore likely to be unstable; this situation canoccur near the surface layers of cool stars. A layer with ahigh rate of nuclear energy generation in a small volume isalso likely to be unstable because of high Fr ; this situationis likely to occur in the cores of stars more massive than thesun, where the temperature-sensitive CNO cycle operates.

Convection involves complicated, turbulent motionswith continuous formation and dissolution of elements ofall sizes. The existence of convection zones is important inthe calculation of stellar structure because the overturningof material in almost all phases of evolution is much morerapid than the evolution time, so that the entire zone can beassumed to be mixed to a uniform chemical compositionat all times. It is also important for heat transport, which isaccomplished mainly by the largest elements. No satisfac-tory analytic theory of convection exists, although three-dimensional numerical hydrodynamical simulations, forexample, of the outer layers of the sun, have been ableto attain spatial resolution high enough so that they canreproduce many observed features. For long-term stellarevolution calculations, a very simplified “mixing length”theory is employed. It is assumed that a convective ele-ment forms, travels one mixing length vertically, and thendissolves and releases its excess energy to the surround-ings. The mixing length is approximated by αH , where α

is a parameter of order unity and H is the distance overwhich the pressure drops by a factor e. By use of this the-ory the average velocity of an element, the excess thermalenergy, and the convective flux can be estimated. In mostsituations in a stellar interior the estimates show that it isadequate to assume that dT/dr = (dT/dr )ad. Although anexcess in dT/dr over the adiabatic value is required forconvection to exist, convection is efficient enough to carrythe required energy flux even if this excess is negligiblysmall. There is a small uncertainty in stellar models be-cause of possible overshooting, and resulting mixing, ofconvective elements beyond the boundary of a formal con-vection zone; otherwise, the uncertainty in the use of themixing length theory is limited to the surface layers of

Page 193: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

58 Stellar Structure and Evolution

stars, where in fact the actual |dT/dr | in a star can besignificantly greater than |dT/dr |ad. The parameter α canbe calibrated by comparison of calculated mixing lengthswith the size of the granular elements in the solar photo-sphere that represent solar convective elements. Also, theradius of a theoretical model of the sun, which is sensitiveto α, can be compared with and adjusted to the observedradius. In this manner it has been determined that the pa-rameter α should be in the range 1–2. Calibration can bealso made for stars in other phases of evolution by the useof numerical hydrodynamical models; however, it is oftenassumed that stars evolve with the same α that has beenobtained for the sun.

Conduction of heat by the ions and electrons is, in gen-eral, inefficient in stellar interiors because the density ishigh enough that the mean free path of the particles issmall compared with the photon mean free path. In the in-teriors of white dwarfs and the evolved cores of red giants,however, the electrons are highly degenerate, their meanfree paths are very long, and conduction becomes a moreefficient mechanism than radiative transfer. The flux isrelated to the temperature gradient by Fr = −K dT/dr ,where K , the conductivity, is calculated by a compli-cated theory involving the velocity, collision cross section,mean free path, and energy carried by the electrons. Nu-merical tables are constructed for use in stellar structurecalculations.

E. Basic Equilibrium Conditions

During most phases of evolution, a star is in hydrostaticequilibrium, which means that the force of gravity on amass element is exactly balanced by the difference in pres-sure on its upper and lower surface. In spherical symmetry,an excellent approximation for most stars, this conditionis written

∂ P

∂ Mr= −G Mr

4πr4, (32)

where Mr is the mass within radius r, ρ is the density,and P is the pressure. A second equilibrium condition isthe mass conservation equation in a spherical shell, whichsimply states that

∂r

∂ Mr= 1

4πr2ρ. (33)

The third condition is that of thermal equilibrium, whichrefers to the situation where the star is producing enoughenergy by nuclear processes to exactly balance the loss ofenergy by radiation at the surface. Then,

L =∫ M

0ε d Mr , (34)

where M is the total mass and ε is the nuclear energygeneration rate per unit mass after subtraction of neutrinolosses. This condition holds on the main sequence, but inmany stages of stellar evolution it does not, because thestar may be expanding, contracting, heating, or cooling.The more general equation of conservation of energy, ef-fectively the first law of thermodynamics, must then beused:

∂Lr

∂ Mr= ε − ∂ E

∂t− P

∂V

∂t, (35)

where V = 1/ρ; E is the internal energy per unit mass;and Lr = 4πr2 Fr , the total amount of energy per unit timecrossing a spherical surface at radius r . If ε = 0, the starcontracts and obtains its energy from the third term on theright-hand side.

To obtain a detailed solution for the structure and evo-lution of a star, one must solve Eqs. (32), (33), and (35)along with a fourth differential equation which describesthe energy transport. Equation (29) for radiation transportcan be re-written, with Mr as the independent variable, ina form which can be used for all three types of transport:

∂T

∂ Mr= − G Mr T

4πr4 P∇, (36)

where, if the energy transport is by radiation,

∇ = ∇rad = 3

16πGac

κLr P

Mr T 4=

(∂ log T

∂ log P

)rad

. (37)

If transport is by conduction, an equivalent “conduc-tive opacity” κcond can be obtained from the conductiv-ity K . However, each point in the star must be testedto see if the condition for convection is satisfied, and ifso, ∇ in Eq. (36) is replaced by the adiabatic gradient∇ad = (∂ log T/∂ log P)ad in the interior or by a “true” ∇obtained from mixing-length theory in the surface lay-ers. The four differential equations are supplemented byexpressions for the equation of state (see Section III.B),nuclear energy generation (see Section III.C), and opacityκ (see Section III.D) as functions of ρ, T , and chemicalcomposition.

Four boundary conditions must be specified. At the cen-ter the conditions r = 0 and Lr = 0 at Mr = 0 are applied.At the surface (Mr = M) the photospheric boundary con-ditions can be used for approximate calculations:

L = 4π R2σ T 4eff and κ P = 2

3g, (38)

where g is the surface gravity. For detailed comparisonwith observations, however, model atmospheres are usedas surface boundary conditions and are joined to the inte-rior calculation a small distance below the photosphere.

To start an evolutionary calculation, one specifies an ini-tial model by giving its total mass M and the distribution of

Page 194: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 59

chemical composition with mass fraction. A typical start-ing point is the pre-main-sequence phase where the com-position is uniform. A sequence of models is calculated,separated by small intervals of time. If nuclear burning isimportant, the change of composition caused by reactions(e.g., conversion of H to He) is calculated at each layer.The system of equations is solved numerically, generallyby a method developed by L. G. Henyey, in which first-order corrections to a trial solution are determined simul-taneously for all variables at all grid points; the process isrepeated until the corrections become small.

A final important equilibrium condition is the virial the-orem, which, for a non-rotating, non-magnetic sphericalstar in hydrostatic equilibrium, can be written

2Ti + W = 0. (39)

Here, W is the gravitational energy, −qG M2/R, whereR is the total radius and q is a constant of order unitythat depends on the mass distribution, and Ti is the totalinternal energy

∫ M0 Ed Mr . This expression can be used

to estimate stellar internal temperatures for the case ofan ideal gas. Then Ti = 1.5(Rg/µ)T M , where T is themean temperature, and T ≈ G Mµ/(3R Rg) ≈ 5 × 106 Kfor the case of the sun. This expression also shows that acontracting, ideal-gas star without nuclear sources heatsup, with T increasing approximately as 1/R.

IV. STELLAR EVOLUTION BEFORETHE MAIN SEQUENCE

Early stellar evolution can be divided into three stages :star formation, protostar collapse, and slow (Kelvin-Helmholtz) contraction. The characteristics of these threestages are summarized in Table II. As the table indicates,there is a vast difference in physical conditions betweenthe time of star formation in an interstellar cloud and thetime when a star arrives on the main sequence and be-gins to burn hydrogen. An increase in mean density by22 orders of magnitude and in mean temperature by 6orders of magnitude occurs. In the star formation and pro-

TABLE II Major Stages of Early Stellar Evolutiona

Internal TimePhase Size (cm) Observations Density (g cm−3) temperature (K) (year)

Star formation 1020–1017 Radio 10−22–10−19 10 107

Protostellar collapse 1017–1012 Infrared 10−19–10−3 10–106 106

Slow contraction 1012–1011 Optical 10−3–1.0 106–107 4 × 107b

aReproduced with permission from Bodenheimer, P. (1983). “Protostar collapse.” Lect. Appl. Math. 20, 141. ©AmericanMathematical Society.

bFor 1 solar mass.

tostar collapse stages, it is not sufficient to assume thatthe object is spherical, and a considerable variety of phys-ical processes must be considered. The problem involvessolution of the equations of hydrodynamics, including ro-tation, magnetic fields, turbulence, molecular chemistry,gravitational collapse, and radiative transport of energy.Some two- and three- dimensional hydrodynamic simu-lations have been performed for the star formation phaseand for the protostar collapse phase, with the aim of in-vestigating the following, as yet unsolved, problems: (1)What initiates collapse? (2) What is the rate and efficiencyof the conversion of interstellar matter into stars? (3) Whatdetermines the distribution of stars according to mass? (4)What is the mechanism for binary formation, and what de-termines whether a star will become a member of a doubleor multiple star system or a single star?

A. Star Formation

It is clear that star formation is limited to regions of unusualphysical conditions compared with those of the interstel-lar medium on the average. The only long-range attrac-tive force to form condensed objects is the gravitationalforce. However, a number of effects oppose gravity andprevent contraction, including gas pressure, turbulence,rotation, and the magnetic field. The chemical composi-tion of the gas, the degree of ionization or dissociation,the presence or absence of grains (condensed particlesof ice, silicates, carbon, or iron with characteristic size5 × 10−5 cm), and the heating and cooling mechanismsin the interstellar medium all have an important influenceon star formation. The regions where star formation is ob-served to occur is in the molecular clouds, where they canform in a relatively isolated mode, as in the Taurus regionor, more commonly, in clusters (Fig. 6).

We consider first only the effects of thermal gas pressureand gravity in an idealized spherical cloud with uniformdensity ρ, uniform temperature T , and mass M . The self-gravitational energy is W = −0.6G M2/R, where R is theradius. The internal energy is Ti = 1.5RgT M/µ, where µ

is the mean molecular weight per particle. Collapse will be

Page 195: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

60 Stellar Structure and Evolution

FIGURE 6 The Trifid Nebula in Sagittarius (M20, NGC 6514),a region of recent star formation (Shane 120-in. reflector). (LickObservatory photograph.)

possible roughly when |W | > Ti ; that is, when the cloudis gravitationally bound, a criterion that has been verifiedby numerical calculations. Thus, for collapse to occur theradius must be less than RJ = 0.4G Mµ/(RgT ), known asthe Jeans length. By eliminating the radius in favor of thedensity, we obtain the Jeans mass, which is the minimummass a cloud with density ρ and temperature T must havefor collapse to occur:

MJ =(

2.5RgT

µG

)3/2(4

3πρ

)−1/2

= 8.5 × 1022

(T

µ

)3/2

ρ−1/2g. (40)

This condition turns out to be quite restrictive. Atypical interstellar cloud of neutral H has T = 50 K,

ρ ≈ 1.7 × 10−23 g cm−3, and µ ≈ 1, and the correspond-ing MJ = 3600 M. The actual masses are far belowthis value, so the clouds are not gravitationally bound.On the other hand, typical conditions in an observedmolecular cloud are T = 10 K, ρ ≈ 1.7 × 10−21 g cm−3,and µ ≈ 2, with MJ ≈ 10 M. Fragmentation into stellarmass objects seems quite possible.

However, this analysis must be modified to take into ac-count rotation and magnetic fields. The angular momen-tum problem can be stated in the following way. The angu-lar momentum per unit mass of a typical molecular cloudwith mass 104 M and a radius of a few parsecs is 1024 cm2

sec−1. The rotational velocities of young stars that have

just started their pre-main-sequence contraction indicatethat their J/M ≈ 1017 cm2 sec−1. Evidently, some frag-ments of dark cloud material must lose 7 orders of mag-nitude in J/M before they reach the stellar state. The an-gular momentum reduction may occur at several differentevolutionary stages and by different physical processes.During the star formation stage, it has been proposed thatif the matter is closely coupled to the magnetic field, thetwisting of the field lines can generate Alfven waves thatwould transfer angular momentum from inside the cloudto the external medium. It has been estimated that signif-icant reduction of angular momentum could take place in106–107 years. Once the density increases to about 10−19

g cm−3, the influence of the magnetic force on the bulk ofthe matter becomes relatively small, because the densityof charged particles becomes very low. However, obser-vations of rotational motion at this density indicate thatJ/M has already been reduced to ≈1021 cm2 sec−1. Thesedense regions, known as molecular cloud cores, have sizesof about 0.1 pc and temperatures of about 10 K; they areobserved in the radio region of the spectrum, often throughtransitions of the ammonia molecule. They are the regionswhere the onset of protostellar collapse is likely to occur.

However, the magnetic force also opposes collapse, anda magnetic Jeans mass can be estimated by equating themagnetic energy of a cloud with its gravitational energy:

MJ,m ≈ 103 M

(B

30 µG

)(R

2 pc

)2

, (41)

where B is the magnetic field. In typical molecular cloudregions, with ρ ≈ 10−21 g cm−3, R ≈ 2 pc, and observedB ≈ 30 µG, magnetic effects are seen to be much more sig-nificant than thermal effects in preventing collapse. How-ever, in the cores, the observed field, although not accu-rately determined, is still ≈30 µG and the radius is only 0.1pc, giving a value of MJ,m ≈ 2.5 M, while MJ ≈ 1 M.The cores actually have masses of a few solar masses, so,again, they are likely regions to collapse. The value of Bdoes not increase in the cores as compared with the av-erage cloud material because during the compression thepredominantly neutral matter is able to drift relative to thefield lines.

There are actually two scenarios which could explainhow collapse is initiated. The first is based on the dis-cussion above: molecular cloud material gradually diff-uses with respect to magnetic field lines until a situation isreached, in a molecular cloud core, where the core mass isabout the same as both the thermal and the magnetic Jeansmasses. Rotation is not important at this stage because ofmagnetic braking, and collapse can commence. Of course,if angular momentum is conserved, rotational effects willsoon become very important as collapse proceeds. Thesecond picture, known as induced star formation, relies

Page 196: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 61

on a more impulsive event, such as the collision betweentwo clouds or the sweeping of a supernova shock across acloud, to initiate collapse. The resulting compression can,under the right conditions, lead to more rapid diffusion ofthe magnetic field, more rapid decay of turbulence, andenhanced cooling of the cloud, thereby reducing the mag-netic, turbulent, and thermal energies, respectively, andincreasing the likelihood of collapse.

B. Protostar Collapse

Once collapse starts, the subsequent evolution can be di-vided into an isothermal phase, an adiabatic phase, and anaccretion phase. A typical initial condition is a cloud corewith radius 2 × 1017 cm, T = 10 K, M = 2 M, and meandensity ≈1 × 10−19 g cm−3 (cloud cores are actually ob-served to be somewhat centrally condensed). During theisothermal phase, the cloud is optically thin to its owninfrared radiation. It tends to heat by gravitational com-pression, but the cooling radiation generated by gas-graincollisions rapidly escapes the cloud, and the tempera-ture remains at about 10 K. The pressure cannot increaserapidly enough to bring the cloud to hydrostatic equilib-rium, and instability to collapse continues. Numerical sim-ulations show that even if the cloud initially has uniformdensity, a density gradient is soon set up the propagationof a rarefaction wave inward from the outer boundary;soon the denser central regions are collapsing much fasterthan the outer regions, and the protostar becomes highlycondensed toward the center with a density distributionclose to ρ ∝ r−2. During this process the thermal Jeansmass (Eq. 40) rapidly decreases with the increase in den-sity. Thus, in principle, fragmentation of the cloud intosmaller masses can occur. Calculations that include rota-tion show that the cloud tends to flatten into a disklikestructure, and once it has become sufficiently flat, the cen-tral region of the cloud can break up into orbiting subcon-densations with orbital specific angular momentums of1020–1021 cm−2 sec−1, comparable to those of the widerbinary systems (Fig. 7). That fragmentation during theprotostar collapse is a reasonable mechanism for binaryformation is supported by the observational evidence thatmany, if not most, very young stars are already in binarysystems at the early phases of the pre-main-sequence con-traction, with binary frequency similar to that observedfor stars on the main sequence. A few binary protostarshave also been detected. In the fragmentation process, eachof the fragments still retains some angular momentum ofspin and is unstable to collapse on its own, but most of thespin angular momentum of the original cloud goes intothe orbital motion of the binary; the process thus providesanother mechanism for solving the angular momentumproblem.

FIGURE 7 Numerical simulation of binary formation in a collaps-ing, rotating cloud. Contours of equal density are shown in theequatorial (x, y ) plane of the inner part of the cloud. The maxi-mum density in the fragments is 10−12 g cm−3, and the minimumdensity at the outer edge is 10−18 g cm−3. Velocity vectors havelength proportional to speed, with a maximum value of 2.8 × 105

cm sec−1. The forming binary has a separation of about 600 AU.The length scales are given in centimeters.

Once the density in the center of the collapsing cloudreaches about 10−13g cm−3, that region is no longer trans-parent to the emitted radiation, and much of the releasedgravitational energy is trapped and heats the cloud, mark-ing the beginning of the adiabatic phase. By this time theJeans mass (Eq. 40) has decreased to about 0.01 M. Theheating results in an increase in the Jeans mass above thisminimum value. The collapse time is still close to the free-fall time, and, as the density increases further, this timebecomes short compared with the time required for radia-tion of diffuse out of the central regions. The collapse thenbecomes nearly adiabatic, and the pressure increases rela-tive to gravity quite rapidly, so that a small region near thecenter approaches hydrostatic equilibrium. As this regioncontinues to compress, the temperature rises to 2000 K. Atthat point the molecular hydrogen dissociates and therebyabsorbs a considerable fraction of the released gravita-tional energy. As a result, the center of the cloud becomesunstable to further gravitational collapse, which continuesuntil most of the molecules have dissociated at a densityof 10−2 g cm−3 and a temperature of 3 × 104 K. The col-lapse is again decelerated and a core forms in hydrostaticequilibrium. Initially, this core contains only ≈1% of theprotostellar mass; it serves as the nucleus of the formingstar.

Even if the cloud had previously fragmented, the frag-ments will collapse through the adiabatic phase. Once

Page 197: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

62 Stellar Structure and Evolution

the final core has formed, the accretion phase begins. Atfirst, infalling material has relatively little angular momen-tum, and it joins the core, falling through a shock front atits outer edge. However, material arriving later will havehigher and higher angular momentum, so after some timeit will not be able to join the core but will start forminga flattened disk surrounding the core, which can grow toa size of several hundred astronomical units. Thus, theprotostar, at a typical stage of its development, consists ofa slowly rotating central star, a surrounding disk, and anopaque envelope of collapsing material which accretes pri-marily onto the disk. Once the disk becomes comparablein mass to the central core, it can become gravitationallyunstable. In some situations the instability could possiblyresult in the formation of subcondensations, with massesin the brown dwarf range, in orbit around the core. In othercases, the instability would produce spiral density waves,which would result in transport of angular momentum out-ward and mass inward, with the result that disk materialcollects onto the core and allows it to grow in mass. Oncethe disk mass decreases to less than 10% of the mass of thecore, it becomes gravitationally stable. However, observa-tions of young objects show that their disks, which are inthe mass range of 1–10% that of the star, still continue toaccrete onto the star. The mechanism by which they doso is not fully understood, and hydrodynamic as well ashydromagnetic instabilities are under investigation. As-sociated with the accretion of mass onto the star is thegeneration of jets and outflows, generally perpendicularto the plane of the disk, observed in most protostars, andoriginating through a complicated interaction between ro-tation, accretion, and magnetic effects near the interfacebetween star and disk.

As material accretes onto the star, either directly orthrough a disk, it liberates energy at a rate given approxi-mately by L = G Mc M/Rc, where Mc and Rc are the coremass and radius, respectively, and M is the mass accretionrate. This energy is transmitted radiatively through the in-falling envelope, where it is absorbed and re-radiated inthe infrared. The typical observed protostar has a spectralenergy distribution peaking at 60–100 µm, although thepeak wavelength and the intensity of the radiation will de-pend on the viewing angle with respect to the rotation axis.Protostars viewed nearly pole-on will have bluer and moreintense radiation than those viewed nearly in the plane ofthe disk. The time for the completion of protostellar evo-lution is essentially the free-fall time of the outer layers,which lies in the range 105–106 years, depending on theinitial density. As the accretion rate slows down in the laterphases and the infalling envelope becomes less and lessopaque, the observable surface of the protostar declinesin luminosity and increases in temperature. Once the in-falling envelope becomes transparent, the observer can see

through to the core, which by now can be identified in theH–R diagram as a star with a photospheric spectrum. Thelocus in the diagram connecting the points where stars ofvarious masses first make their appearance is known asthe birth line.

C. Pre-main-Sequence Contraction

Once internal temperatures are high enough (above 105 K)that hydrogen is substantially ionized, the star is able toreach an equilibrium state with the pressure of an idealgas supporting it against gravity. Rotation, which was veryimportant during the protostar stage, has now become neg-ligible. The star radiates from the surface, and this energyis supplied by gravitational contraction, which can be re-garded as a passage through a series of equilibrium states.The virial theorem shows that half of the released gravi-tational energy goes into radiation and the other half intoheating of the interior, as long as the equation of state re-mains ideal. The calculation of the evolution can now beaccomplished by solution of the standard structure equa-tions (32), (33), (35), and (36).

The solutions are shown in the H–R diagram in Fig. 8.These tracks start at the birth line for each mass. For ear-lier times, a hydrostatic stellar-like core may exist, but itis hidden from view by the opaque infalling protostellarenvelope. Note that the sun arrives on the H–R diagram

FIGURE 8 Evolutionary tracks for various masses during thepre-main-sequence contraction. Each track is labeled by the cor-responding stellar mass (in solar masses, M). For each track,the evolution starts at the birth line (upper solid line) and endsat the zero-age main sequence (lower solid line). Loci of constantage are indicated by dotted lines. [Reproduced by permission fromPalla, F., and Stahler, S. (1999). Astrophys. J. 525, 772. TheAmerican Astronomical Society.]

Page 198: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 63

with about five times its present luminosity, and the highermass stars actually do not appear until they have alreadyreached the main sequence. The results of the calculationsshow in general that the stars first pass through a convec-tive phase (vertical portions of the tracks) and later a ra-diative phase (relatively horizontal portions of the tracks).The relative importance of the two phases depends on thestellar mass. During the convective phase, energy transportin the interior is quite efficient, and the rate of energy lossis controlled by the thin radiative layer right at the stellarsurface. The opacity is a very strongly increasing functionof T in that layer, and that fact combined with the photo-spheric boundary conditions (38) can be shown to resultin a nearly constant Teff during contraction. As the surfacearea decreases, L drops and Teff stays between 2000 and4000 K, with lower masses having lower Teff. As the starcontracts, the interior temperatures increase and in most ofthe star the opacity decreases as a function of T . The stargradually becomes stable against convection, starting atthe center, because the radiative gradient [Eq. (37)] dropsalong with the opacity and soon falls below ∇ad. Whenthe radiative region includes about 75% of the mass, therate of energy release is no longer controlled by the sur-face layer, but rather by the opacity of the entire radiativeregion. At this time, the tracks make a sharp bend to theleft, and the luminosity increases gradually as the averageinterior opacity decreases.

Contraction times to the main sequence, starting atthe birth line, for various masses are given in Table III.A summary of the evolution of the sun during thepre-main-sequence, main-sequence, and early post-main-sequence phases is given in Fig. 9. The highermass stars have relatively high internal temperature andtherefore relatively low internal opacities and are ableto radiate rapidly. The contraction times are short, andbecause the luminosity is relatively constant during theradiative phase, during which they spend most of theirtime, the contraction time is well approximated by the

TABLE III Evolutionary Times (Year)

Mass Pre-main-sequence Main-sequence To onset of Core He(M) contraction lifetime core He burning lifetime

30 — 4.8 × 106 1.0 × 105 5.4 × 105

15 — 1.0 × 107 3.0 × 105 1.6 × 106

9 — 2.1 × 107 9.1 × 105 3.8 × 106

5 2.3 × 105 6.5 × 107 4.8 × 106 1.6 × 107

3 2.0 × 106 2.2 × 108 2.9 × 107 6.6 × 107

2 8.5 × 106 8.4 × 108 3.1 × 108 —

1 3.2 × 107 9.2 × 109 2.5 × 109 1.1 × 108

0.5 1.0 × 108 1.5 × 1011 — —

0.3 1.8 × 108 7.1 × 1011 — —

0.1 3.7 × 108 5.9 × 1012 — —

FIGURE 9 Sketch of the pre-main-sequence (solid line) and post-main-sequence (dashed line) evolution of a star of 1 M. Evolu-tionary times (years) are given for the filled circles. The presentsun is indicated by a cross. The term “nebula” refers to the cir-cumstellar disk. [Adapted with permission from Bodenheimer, P.(1989). In “The Formation and Evolution of Planetary Systems”(H. Weaver and L. Danly, eds.), p. 243, Cambridge Univ. Press,Cambridge, UK. Reprinted with permission of Cambridge Univer-sity Press.]

Kelvin-Helmholtz time tKH ≈ G M2/(RL). A star of 1M spends 107 years, on the vertical track, known asthe Hayashi track after the Japanese astrophysicist whodiscovered it. He also pointed out that the region to theright of the Hayashi track is “forbidden” for a star of agiven mass in hydrostatic equilibrium, at any phase ofevolution. For the next 2 × 107 years, the star is primarilyradiative, but it maintains a thin outer convective envelopeall the way to the main sequence. The final 107 year ofthe contraction phase represents the transition to the mainsequence, during which the nuclear reactions begin to beimportant at the center, the contraction slows down, and,as the energy source becomes more concentrated toward

Page 199: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

64 Stellar Structure and Evolution

the center, the luminosity declines slightly. For the lowermass stars, the evolution is entirely along the Hayashitrack. Stars of 0.3 M or less remain fully convective allthe way to the main sequence. Because the luminosityvaries continuously during the contraction, tKH is deter-mined by integration along the track. It takes a star of0.1 M about 4 × 108 years to reach the main sequence.

A number of comparisons can be made between thecontraction tracks and the observations.

1. As predicted by Hayashi, no stars in equilibrium areobserved to exist in the forbidden region of the H–R dia-gram.

2. The lithium abundances of stars that have justreached the main sequence can be compared with cal-culations of the depletion of lithium in the surface lay-ers during the contraction. Lithium is easily destroyedby reactions with protons at temperatures above aboutT = 2.8 × 106 K. The lower mass stars have outer con-vection zones extending down to this T during the con-traction; the higher mass stars do not. Because the materialin the convection zone is completely mixed, the low-massstars would be expected on the average to have less surface(observable) Li when they arrive at the main sequence, inagreement with observations.

3. The youngest stars, known as the T Tauri stars, havehigh lithium abundances; are associated with dark cloudswhere star formation is taking place; display irregular vari-ability in light, as well as mass loss that is much more rapidthan that of most main-sequence stars; and have excess in-frared emission over that expected from a normal photo-sphere, indicating the presence of a surrounding disk, andexcess ultraviolet radiation, indicating accretion onto thestar. All these characteristics are consistent with youth;their location in the H–R diagram falls along Hayashitracks for stars in the mass range 0.5–2.0 M. The diskcharacteristics vanish after ages of a few million to 107

years, putting a constraint on the time available for theformation of gaseous giant planets.

TABLE IV Zero-Age Main Sequence

Mass (M) log L/L Log Teff (K) R/R Tc (106 K) ρc (g cm−3) Mcore/M Menv/M Energy source

30 5.15 4.64 6.6 36 3.0 0.60 0 CNO

15 4.32 4.51 4.7 34 6.2 0.39 0 CNO

9 3.65 4.41 3.5 31 10.5 0.30 0 CNO

5 2.80 4.29 2.3 27 17.5 0.23 0 CNO

3 2.00 4.14 1.7 24 40.4 0.18 0 CNO

2 1.30 4.01 1.4 21 68 0.12 0 CNO

1 −0.13 3.76 0.9 14 90 0 0.02 PP

0.5 −1.44 3.56 0.44 9 74 0 0.4 PP

0.3 −2.0 3.53 0.3 7.5 125 0 1.0 PP

0.1 −3.3 3.43 0.1 5 690 0 1.0 PP

4. Dynamical masses obtained from pre-main-sequence eclipsing binaries are an important tool forcalibration of the pre-main-sequence theoretical tracks.

5. The H–R diagrams for young stellar clusters can becompared with lines of constant age (Fig. 8). The resultsshow that there is a considerable scatter of observed pointsabout a single such line, indicating that star formation in agiven region probably occurs continuously over a periodof about 107 years.

V. THE MAIN SEQUENCE

A. General Properties

The zero-age main sequence (ZAMS) is defined as the timewhen nuclear reactions first provide the entire luminosityradiated by a star. The chemical composition is assumed tobe spatially homogeneous at this time, although actuallyin the centers of higher mass stars, some 12C is convertedto 14N by Reactions (10)–(12) before that time. Later, thenuclear reactions result in the conversion of hydrogen tohelium, with the change occurring most rapidly at the cen-ter. The characteristics of a range of stellar masses on theZAMS are given in Table IV; the assumed composition isX = 0.71, Y = 0.27, Z = 0.02 (subscripts c refer to centralvalues). The following general points can be made.

1. The equation of state in the interior is close to thatof an ideal gas. There are deviations only at the high-massend, where radiation pressure becomes important, and atthe low-mass end, where electron degeneracy begins to besignificant.

2. The radius increases roughly linearly with mass atthe low-mass end and roughly as the square root of themass above 1 M.

3. The theoretical mass–luminosity relation agrees wellwith observations (Fig. 2); the location of the theoreticalZAMS in the H–R diagram is also in good agreement withthe observations.

Page 200: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 65

4. The central temperature increases with mass, as ex-pected from the virial theorem. The higher mass stars burnhydrogen on the CNO cycle, and because of the steep T de-pendence of these reactions and the corresponding strongdegree of concentration of the energy sources to the center,they have convective cores. The fractional mass of thesecores increases with total mass. Comparisons with clusterH–R diagrams suggests that a small amount of convec-tive overshooting occurs at the edge of these cores. Thelower mass stars run on the pp chains, and because ofthe more gradual T dependence, they do not have con-vective cores. However, because they have cooler surfacelayers and therefore steeply increasing opacity going in-ward from the surface, they have convection zones in theirenvelopes, which become deeper as the mass decreases.At 0.3 M, this convection zone extends all the way tothe center. The lowest mass stars do not have high enoughTc to allow 3He to react with itself [Reaction (3)]; there-fore (at least on the ZAMS), the pp chain proceeds onlythrough Reactions (1) and (2).

5. There is no substantial evidence for stars above100 M. The upper limit for stellar masses arises eitherfrom the fact that radiation pressure from the core canprevent further accretion of envelope material during theprotostellar phase when the core reaches some criticalmass or from the fact that main-sequence stars above agiven mass are pulsationally unstable which, in combina-tion with strong radiation pressure, tends to result in rapidmass loss from their surfaces.

6. The lower end of the main sequence occurs at about0.075 M. Below that mass nuclear reactions cannot pro-vide sufficient energy. Physically, the limit arises fromthe fact that as low-mass objects contract, they approachthe regime of electron degeneracy before Tc becomes highenough to start nuclear burning. As the electrons are forcedinto higher and higher energy states, much of the gravita-tional energy supply of the star is primarily used to pro-vide the required energy to the electrons. The ions, whosetemperatures determine nuclear reaction rates, are still andideal gas, but they reach a maximum temperature and thenbegin to cool. Their thermal energy is required, along withthe gravitational energy, to supply the radiated luminos-ity. Substellar objects below the limit simply contract toa limiting radius and then cool. Extensive observationalsearches for such objects, known as “brown dwarfs,” haveresulted in the discovery of many strong candidates.

B. Evolution on the Main Sequence

During the burning of H in their cores, stars move veryslowly away from the ZAMS. The main-sequence life-times, up to exhaustion of H at the center, are given inTable III. High-mass stars go through this phase in a few

million years, while stars below about 0.8 M (with solarcomposition) have not had time since the formation of thegalaxy to evolve away from the main sequence. The rate atwhich energy is lost at the surface, which determines theevolutionary time scale, is controlled by the radiative opac-ity through much of the interior. The stellar structure ad-justs itself to maintain thermal equilibrium, which meansthat the energy production rate is exactly matched by therate at which energy can be carried away by radiation. Iffor some reason the nuclear processes were producing en-ergy too rapidly, some of this energy would be depositedas work in the inner layers, these regions would expandand cool, and the strongly temperature-dependent energyproduction rate would return to the equilibrium value.

The structure of the star on the main sequence changesduring main-sequence evolution as a consequence ofthe change in composition. In the case of upper main-sequence stars the H is depleted uniformly over the entiremass of the convective core, while in the lower mass stars,which are radiative in the core, the H is depleted mostrapidly in the center. The conversion of H to He results ina slight loss of pressure because of a decease in the numberof free particles per unit volume; as a result, the inner re-gions contract very slowly to maintain hydrostatic equilib-rium. The conversion also reduces the opacity somewhat,which tends to cause a slow increase in L . The outer lay-ers see no appreciable change in opacity; a small amountof the energy received from the core is deposited therein the form of work, and these layers gradually expand.For example, in the case of the sun, L has increased from0.7 to 1.0 L since age zero, R has increased from 0.9 to1.0 R, Tc has increased from 14 × 106 to 15.7 × 106 K,and ρc has increased from 90 to 152 g cm−3. In the caseof high-mass, stars when X goes to zero at the center, itdoes so over the entire mass of the convective core, and thestar is suddenly left without fuel. A rapid overall contrac-tion takes place, until the layers of unburned H outside thecore reach temperatures high enough to burn. For solar-mass stars, the main-sequence evolution does not end sosuddenly, because the H is depleted only one layer at atime, and the transition to the red giant phase takes placerelatively slowly.

An important calibration for the entire theory of stellarevolution is the match between theory and observation ofthe sun. The procedure is to choose a composition and toevolve the model sun from age zero to the present solar ageof 4.56 × 109 years . The value of Z is well constrained byobservations of photospheric abundances; thus, if the solarL does not match the model L , the abundance of He in themodel can be used as a parameter to be adjusted to pro-vide agreement. Once L of the model is satisfactory, theradius can be adjusted to match the observed R by vary-ing the mixing-length parameter α that is used to calculate

Page 201: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

66 Stellar Structure and Evolution

the structure of the outer part of the convection zone. Afurther check is provided by the results of helioseismol-ogy, a method of probing the interior structure of the sunby making precise measurements of the frequencies of itsoscillations at the surface. The best-known periods are ofthe order of 5 min. The technique is able to measure thesound speed as a function of depth far into the sun, as wellas the location of the bottom of the convection zone. Mod-els that start with the composition consistent with that ofTable I (X = 0.70, Y = 0.28, Z = 0.02) and α = 1.8, andwhich include the slow settling of He downward duringthe evolution, provide excellent agreement with the re-sults of helioseismology and the observed Z , L , and R.The present sun (see Table V) has Y = 0.24 in the outerlayers; the convection zone includes the outer 29% of theradius with T = 2.2 × 106 K at its lower edge.

C. The Solar Neutrino Problem

The results of terrestrial solar neutrino detection experi-ments can be summarized as follows:

1. The chlorine experiment, 37Cl + ν → 37Ar + e− withdetection threshold 0.81 MeV, detects primarilyneutrinos produced by 8B decay [Reaction (8)]. Along series of experiments going back to 1970 showsa detection rate only one-third that predicted by thestandard solar model.

TABLE V Standard Solar Model

M(r ) / M r / R T (106K) ρ (g cm−3) P (erg cm−3) L(r ) / L X

0.000 0.00 15.7 152 2.33 × 1017 0.0 0.34

0.009 0.045 15.0 132 2.05 × 1017 0.073 0.40

0.041 0.078 13.9 104 1.63 × 1017 0.282 0.48

0.117 0.120 12.3 73.3 1.10 × 1017 0.602 0.58

0.217 0.159 10.8 51.5 7.17 × 1016 0.822 0.65

0.308 0.191 9.69 38.3 4.89 × 1016 0.920 0.68

0.412 0.226 8.60 26.9 3.08 × 1016 0.971 0.69

0.500 0.255 7.79 19.5 2.03 × 1016 0.990 0.70

0.607 0.299 6.82 12.2 1.12 × 1016 0.998 0.71

0.702 0.345 5.96 7.30 5.85 × 1015 1.000 0.71

0.804 0.411 4.97 3.48 2.32 × 1015 1.000 0.71

0.902 0.518 3.81 1.13 5.82 × 1014 1.000 0.72

0.927 0.562 3.42 0.726 3.35 × 1014 1.00 0.72

0.962 0.656 2.69 0.304 1.11 × 1014 1.000 0.72

0.976a 0.714 2.18 0.185 5.53 × 1013 1.000 0.74

0.990 0.804 1.33 0.088 1.59 × 1013 1.000 0.74

0.998 0.900 0.60 0.026 2.10 × 1012 1.000 0.74

0.9997 0.949 0.28 0.008 3.06 × 1011 1.000 0.74

1.0000 1.000 0.0058 2.8 × 10−7 1.20 × 105 1.000 0.74

aBase of convection zone.

2. The neutrino-electron scattering experiment carriedout by the Superkamiokande project has a highdetection threshhold (about 6 MeV) and can detectonly neutrinos from Reaction (8). Results show abouthalf the expected detection rats, but the experimentcan determine from which direction the neutrinosarrive, and it is clear that they come from the sun.

3. Two independent gallium experiments, the SAGEproject and the GALLEX project, detect neutrinos by71Ga + ν → 71Ge + e− with detection threshhold0.234 MeV. The experiments can detect neutrinosfrom all three branches of the pp chain; in particular,about half of the neutrinos produced by Reaction (1),which are by far the most numerous of the solarneutrinos, are detectable by this experiment. Theactual detection rate in these experiments, which havebeen calibrated through the use of a terrestrialneutrino source of known production rate, is slightlymore than half of the expected rate. The discrepanciesin all case are far greater than the experimental ortheoretical uncertainties.

The following possibilities have been considered as solu-tions to the solar neutrino problem.

1. The solar model is inappropriate. The productionrate of the neutrinos produced by 8B is very sensitive tothe temperature near the center of the sun; those produced

Page 202: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 67

by the p + p reaction are also sensitive, but less so. A de-crease in this temperature in the models would bring thepredictions more into line with the experiments. One ofthe main parameters in the solar model is the abundanceof He. A decrease in its assumed value (and replacementby H) results in an increase in the opacity, so that a com-pensating decrease in Z is required so that the solar lumi-nosity will be matched at the solar age. The value of Tc isreduced because in the presence of the increased value ofX , Eq. (9) shows that the energy requirements of the suncan be met with a lower temperature and density. How-ever, the value of Z is well enough known observationallythat a change in Y sufficient to bring about agreementwould result in a required value of Z that is outside theobservational uncertainties. Furthermore, the value of Yis well constrained through helioseismology as a functionof radius almost all the way to the center of the sun. Theagreement between the observationally determined soundspeed as a function of radius and the standard solar modelis within 1%. In general, a change in the solar model suf-ficient to bring about consistency with the solar neutrinoexperiments will result in inconsistency with at least oneother well-observed property of the sun.

2. The nuclear reaction rates, when extrapolated to stel-lar energies, are incorrect. The experimental rate of Re-action (7) is particularly difficult to determine, and it isdirectly proportional to the production rate of neutrinosfrom 8B. However, the quoted uncertainty in all the rates,typically 10%, is not large enough to solve the problem.

3. Through so-called “neutrino oscillations,” the elec-tron neutrinos produced in the sun are converted to muonor tau neutrinos before they reach the earth; the detectionexperiments now running are insensitive to these othertypes. This possibility is the subject of intensive researchby particle physicists. There is some direct experimen-tal evidence that muon neutrinos undergo oscillations, butnone so far for electron neutrinos.

4. The sun undergoes periodic episodes of mixing ofthe chemical composition in the interior. Some hydrogenoutside the burning region would be mixed downward,providing extra fuel and increasing the energy generationrate given by Eq. (9). Some energy is deposited in theform of work in the burning layers, they expand and coolto re-establish the balance between L and the nuclear en-ergy generation rate. The cooler sun then produces fewerdetectable neutrinos. It is also possible that the sun is con-tinuously mixed by a slow diffusive process; in either case,the mechanism for mixing is not understood, and there isno evidence for it in the helioseismological results. Thesolution to the solar neutrino experiment, may come aboutthrough different experimental techniques. For example,the SNO experiment, which uses heavy water (D2O) ratherthan ordinary water as a detecting agent, in principle will

be able to determine whether solar neutrinos have under-gone oscillations.

VI. STELLAR EVOLUTION BEYONDTHE MAIN SEQUENCE

Following the exhaustion of H at the center of a star, majorstructural adjustments occur; the physical processes thatresult in the transition to a red giant were first made clearby the calculations of F. Hoyle and M. Schwarzschild. Thecentral regions, and in some cases the entire star, contractuntil the hydrogen-rich layers outside the exhausted corecan be heated up to burning temperatures. Hydrogenburning becomes established in a shell source, whichbecomes thinner in terms of mass and hotter as the starevolves. The central region, inside the shell, continuesto contract, while the outer layers expand considerablyas the star evolves to the red giant region. The value ofTeff decreases to 3500–4500 K, and a convection zonedevelops in the outer layers and becomes deeper as thestar expands. When T in the core reaches 108 K, heliumburning begins, resulting in the production of 12C and 16O.Evolutionary time scales up to and during helium burningare given in Table III. Whether further nuclear burningstages occur that result in the production of still heavierelements depends on the mass of the star. For massesless than 9 M, interior temperatures never become highenough to burn the C or O. The star ejects its outer enve-lope, goes through a planetary nebula phase, and ends itsevolution as a white dwarf. The more massive stars burnC and O in their cores, and nucleosynthesis proceeds tothe production of an iron-nickel core. Collapse of the corefollows, resulting in a supernova explosion, the ejectionof a large fraction of the mass, and the production of aneutron star remnant for masses up to about 25 M andpossibly a black hole for higher masses. The details of theevolution are strongly dependent on mass. The followingsections describe the post-main-sequence evolution ofstars of Population I with masses of 1, 5, and 25 M andof a Population II star of about 0.8 M, which represents atypical observable star in a globular cluster. Evolutionarytracks for the Population I stars are shown in Fig. 10.

A. Further Evolution of 1 M Stars

The transition phase to the red giant region, during whichthe star evolves to the right in the H–R diagram, takesplace on a time scale that is short compared with the main-sequence lifetime, but long enough so that stars undergo-ing the transition should be observable in old clusters. Asthe convective envelope deepens, the evolutionary trackchanges direction in the H–R diagram, and the luminosityincreases considerably while Teff decreases only slowly.

Page 203: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

68 Stellar Structure and Evolution

FIGURE 10 Post-main-sequence evolutionary tracks in the H–R diagram for stars of 1.0, 5.0, and 25 M under theassumption that the metal abundance Z is comparable to that in the present sun. The heavy portions of each curveindicate where major nuclear burning stages occur in the core. The label RGB refers to the red giant branch, AGBrefers to the asymptotic giant branch, and PN refers to the planetary nebula phase. Helium burning occurs on thehorizontal branch only if Z is much less than the solar value. [Reproduced with permission from Iben, I., Jr. (1991).Astrophys. J. Suppl. 76, 55. The American Astronomical Society.]

The inner edge of the convection zone moves inward to apoint just outside the hydrogen-burning shell; inside theshell is the dense, burned-out core consisting mainly of Heand increasing in mass with time. Temperatures in the shellincrease to the point where the CNO cycle takes over asthe principal energy source. The outer convective envelopebecomes deep enough so that it reaches layers in whichC has been converted to N by Reactions (10)–(12), whichproceed at a temperature slightly less than that requiredfor the full cycle to go into operation. The modification ofthe C to N ratio at the surface of the star, which is expectedbecause of convective mixing, has been verified in somecases by observations of abundances in red giant stars. Theoxygen surface abundance is not affected. This process isknown as the first dredge-up. As the sun evolves to highluminosity on the red giant branch, it undergoes a massloss episode, which, depending on the uncertain mass lossrate, could result in the loss of the outer 25% of the mass.

Helium burning in the core finally begins whenits mass is about 0.45 M. At this time, L ≈ 2 ×103 L, Tc = 108K, and ρc = 8 × 105 g cm−3. At the cen-ter, the electron gas is quite degenerate; thus, the total gaspressure depends strongly on density but not on tempera-ture. The helium-burning reaction rate is very sensitive toT ; when the reaction starts, the local region heats some-what and the reaction rate increases, resulting in furtherheating. Under normal circumstances, the increased Twould result in increased pressure, causing the region toexpand to the point where the rate of energy generationmatched the rate at which the energy could be carriedaway. However, under degenerate conditions the pressuredoes not respond, only very slight expansion occurs, andthe region simply heats, resulting in a runaway growth ofthe energy generation. This thermal instability is knownas the helium flash, during which the luminosity canincrease to 1011 L for a brief period at the flash site. This

Page 204: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 69

enormous luminosity is absorbed primarily in the gradualexpansion of the core, whose density decreases by a factor40, and the luminosity at the surface does not increase. Onthe contrary, the expansion of the inner regions results in adecrease in the temperature at the hydrogen shell source,a reduction in the energy generation, a reduction of thesurface luminosity, and a contraction of the outer layers tocompensate. The temperature in the core rises to the pointwhere it is no longer degenerate, cooling can occur, andthe nuclear reaction rate once more becomes regulated.Relatively little He is actually burned during the flash, andthe star settles down near the red giant branch, but witha reduced L = 40 L (see Fig. 10). Energy productioncomes from both core helium burning and shell hydrogenburning. Although a convection zone develops in the re-gion of the helium flash, it does not link up with the outerconvection zone, and so the products of helium burningare not mixed outward to the surface of the star at thistime.

Helium burns in the core, converting it to C and O. Whenthe helium is exhausted in the core, that region begins tocontract until helium burning is established in a shell re-gion. The hydrogen-burning shell is still active; thus, thestar now has a double shell source surrounding the core,and the convective envelope still extends inward to a point

FIGURE 11 The outer radius (Reff) of a model of the evolution of the sun, as a function of time, starting at thezero-age main sequence (solid line). The dashed line refers to the pre-main-sequence phase. The present sun isindicated by an open circle. Dotted lines indicate the orbital distances of the inner planets. These orbits move outwardat times when the sun undergoes rapid mass loss. At the end of the simulation, the sun’s mass has been reducedto 0.54 M. The oscillations in the right-hand part of the diagram are caused by helium shell flashes. [Adapted withpermission from Sackmann, I.-J., Boothroyd, A. I., and Kraemer, K. E. (1993). Astrophys. J. 418, 457. The AmericanAstronomical Society.]

just outside the hydrogen-burning shell. The star resumesits upward climb in luminosity along the asymptotic gi-ant branch (AGB; see Fig. 10). The shell sources becomethinner, hotter, and closer together. The core becomes de-generate, and the increase in its temperature stops becausethe energy released by contraction must go into lifting theelectrons into higher and higher energy states and becauseof neutrino losses. The structure of the star is divided intothree regions: (1) the degenerate C/O core, which growsin mass to about 0.6 M, in ρc to over 106 g cm−3, andmaintains a radius of about 109 cm; (2) the hydrogen-and helium-burning shells, which are separated by a verythin layer of He and which contain only a tiny fraction ofthe total mass; and (3) the extended convective envelope,which still has essentially its original abundances of Hand He and a mean density of only about 10−7 g cm−3,expands to a maximum size of about 200 R (see Fig. 11).The helium-burning shell is subject to a thermal instabil-ity during which the energy generation increases locallyto peak values of 106 L. These events, known as heliumshell flashes, repeat with a period of about 105 years andresult in surface luminosity changes of factors of 5–10.

As the star increases in luminosity, up to a maximum ofabout 5000 L, it again develops an increasingly strongstellar wind, as a result of which it undergoes further mass

Page 205: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

70 Stellar Structure and Evolution

loss. Although the mechanism behind this final mass ejec-tion is not completely understood, it is probable that thestar first becomes pulsationally unstable. In fact, manyof the stars in this part of the H–R diagram are variablein light with periods of about a year. As the luminosityincreases, the amplitude of the oscillations increases andeventually grows to the point where a small amount ofmass at the outer edge is brought to escape velocity. Theformation of grains in the outer, very cool layers of thesepulsating stars may contribute to the mass loss; the ra-diation pressure on them can result in a stellar wind. Inany case, practically the entire hydrogen-rich envelope isejected, leaving behind the core, the shell sources, and athin layer of hydrogen-rich material on top. The star nowenters the planetary nebula phase.

The system now consists of a compact central star andan expanding diffuse envelope, known as the planetarynebula. The star evolves to the left in the H–R diagram ata nearly constant luminosity of about 5000 L, with Teff

increasing from the red giant value of about 4000 to over105 K. Once Teff exceeds 30,000 K, the ultraviolet radia-tion results in fluorescence in the nebula, causing it to glowin optical light (Fig. 12). The evolution time from the redgiant region to maximum Teff is about 105 years, duringwhich time the radius decreases from its maximum valueto about 0.1 R (Fig. 11). The nuclear energy source fromone or both shells is still active. The hydrogen-rich outerenvelope decreases in mass as the fuel is burned, and even-tually, when the envelope mass decreases to ∼10−4 M,

FIGURE 12 The giant planetary nebula NGC 7293 (Shane120-in. reflector). (Lick Observatory photograph.)

it is no longer hot enough to burn. The outer layers con-tract until the radius approaches a limiting value, at whichpoint only a very slight amount of additional gravitationalcontraction is possible because practically the entire star ishighly electron degenerate, and the high electron pressuresupports it against gravity. Essentially, the only energysource left is from the cooling of the hot interior. Fromthis point the star evolves downward and to the right inthe H–R diagram and soon enters the white dwarf region.Thus, the final state of the evolution of 1 M is a whitedwarf of about 0.6 M.

B. Evolution of 5 M Stars

The evolution of higher mass stars differs from that of thelower mass stars in several respects. On the main sequencethe star of 5 M burns hydrogen on the CNO cycle andhas a convective core that includes about 23% of the massat first, but which decreases in mass with time. The His uniformly depleted in the core. With the exhaustion ofH in the core, the star undergoes a brief overall contrac-tion that leads to heating and ignition of the shell source;the convection zone has now disappeared. As the shellnarrows, the central regions interior to it contract, whilethe outer regions expand. The star rapidly crosses the re-gion between the main sequence and the red giant branch.Because of this rapid crossing, very few stars would beexpected to be observed in this region, and for that reasonit is known as the Hertzsprung gap. As Teff decreases, aconvection zone develops at the surface and advances in-ward; when it includes about half the mass, the star turnsupward in the H–R diagram. When log L/L = 3.1 andthe hydrogen-exhausted core includes about 0.75 M, he-lium burning begins at the center. Because this region isnot yet degenerate, the helium flash does not occur and theburning is initiated smoothly. A central convection zonedevelops because of the strong temperature dependenceof the triple-alpha reaction. The structure of the star nowconsists of (1) the convective helium-burning region, (2)a helium-rich region that is not hot enough to burn He, (3)a narrow hydrogen-burning shell source, and (4) an ex-tended outer envelope with essentially unmodified H andHe abundances. However, the outer convection zone hasextended downward into layers where some C has beenconverted into N through previous partial operation of theCNO cycle. The relative depletion of C and enhancementof N could now be observable at the stellar surface. Thisfirst dredge-up is analogous to that which occurs in 1 M(see Fig. 13 at a time of 6 × 107 years).

As helium burning progresses, the central regionsexpand, resulting in a decrease in temperature at thehydrogen-burning shell and a slight drop in luminosity.The expansion of the outer layers is reversed, and the

Page 206: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 71

FIGURE 13 The evolution of the internal structure of a star of 5 M with composition similar to that of the sun. Theabscissa gives the age, starting from the zero-age main sequence. A vertical line through the diagram correspondsto a model at a given time. The figure shows only the inner half of the mass of the star. Cloudy regions indicateconvection zones. Hatched regions are those with significant nuclear energy production. Regions which are non-uniform in composition are dotted. Breaks in the diagram correspond to changes in the time scale. [Adapted withpermission from Kippenhahn, R., and Weigert, A. (1990). “Stellar Structure and Evolution,” Springer-Verlag, Berlin. Springer-Verlag.]

star evolves to higher Teff. The energy production is di-vided between the core and the shell, with the former be-coming relatively more important as the star evolves. Thetotal lifetime during helium burning is about 1.4 × 107

years, about 20% of the main-sequence lifetime. AsFig. 10 shows, most of this time is spent in a regionwhere Teff ≈ 6000 − 8000 K and log L/L ≈ 3, where theCepheid variables are observed; they are interpreted asstars of 5–9 M in the phase of core helium burning. Theevolutionary calculations show that for solar metallicityonly stars in this mass range make the excursion to the leftin the H–R diagram during He burning. Lower mass starsstay close to the giant branch during this phase.

As He becomes depleted in the core, the region con-tracts and heats in order to maintain about the same levelof energy production. The contraction results in a slightincrease in T at the hydrogen-burning shell. The energyproduction increases there, and a small amount of thisexcess energy is deposited in the outer envelope in theform of mechanical work, causing expansion. The evo-lution changes direction in the H–R diagram and headsback toward the red giant region. The He is used up at thecenter, a helium-burning shell source is established, andthe star resumes its interrupted climb up the giant branch.The surface convection zone develops again and reachesdeep within the star to layers that have previously beenenriched in He. In this second dredge-up the ratio of He

to H is increased at the surface, and both C and O are de-creased with respect to N. As the star rises to higher L , itsstructure includes the following regions: (1) a degeneratecore consisting of a mixture of 12C and 16O, (2) a narrowhelium-burning shell source, (3) a thin layer composedmostly of He, and (4) the extended outer convective enve-lope, somewhat enriched in He and with modified CNOratios relative to the original composition. The H-burningshell goes out, to become re-established later on duringHe shell flashes. The structure of the interior of the star asa function of time up to about this point is illustrated inFig. 13.

The star increases in L up to about 2 × 104 L, andthe mass of the C/O core increases to 0.85 M. A seriesof thermal flashes, analogous to those found for 1 M,occurs in the He-burning shell; during the flash cycle, themain energy production alternates between an H-burningshell and an He-burning shell. The flashes occur in non-degenerate layers, unlike the He core flash in the lowermass stars. After several flashes, the third dredge-up oc-curs. Some of the C produced in the He-burning shell canbe mixed to the surface of the star, eventually produc-ing a C star. Ejection of the envelope now takes place byprocesses similiar to those described for 1 M. The totalmass lost during the evolution of the star is estimated to be4.15 M; the remaining C/O white dwarf has a mass of0.85 M. The same result will occur for stars up to 9 M,

Page 207: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

72 Stellar Structure and Evolution

with the maximum core mass being about 1.1 M. Thesecond dredge-up occurs only for stars in the mass rangefrom about 4.5 to 9 M; the third dredge-up occurs onlyin stars from about 2 to 9 M.

C. Evolution of 25 M Stars

The high-mass stars evolve similarly to the stars of5 M during main-sequence and immediate post-main-sequence evolution, except that the fraction of the masscontained in the convective core is larger, and the veryhighest mass stars may lose mass even on the main se-quence. The 25-M star evolves toward the red giant re-gion, but helium burning at the center occurs when Teff

is still in the range 104 to 2 × 104 K, and carbon burn-ing starts soon afterward. The star expands to become ared giant with L/L = 2 × 105 and Teff = 4500 K. A se-quence of several new nuclear burning phases now occursin the core, rapidly enough that little concurrent changeoccurs in the surface characteristics. Carbon burns at about9 × 108 K, neon burns at 1.75 × 109 K, oxygen burns at2.3 × 109 K, and silicon burns at 4 × 109 K. A central coreof about 1.5 M, composed of iron and nickel, builds up,surrounded by layers that are silicon rich, oxygen rich,and helium rich, respectively. Outside these layers is theenvelope, still with its original composition. The layersare separated by active shell sources. The temperature atthe center reaches 7 × 109 K and the density is 3 × 109 gcm−3 (see Fig. 5). However, the sequence of nuclear reac-tions that has built the elements up to the iron peak groupin the core can proceed no further. These elements havebeen produced with a net release of energy at every step,a total of 8 × 1018 ergs g−1 of hydrogen converted to iron.However, to build up to still heavier elements, a net inputof energy is required. Furthermore, the Coulomb barrierfor the production of these elements by reactions involvingcharged particles becomes very high. Instead, an entirelydifferent process occurs in the core. The temperature, andalong with it the average photon energy, becomes so highthat the photons can react with the iron, breaking it upinto helium nuclei and neutrons. This process requires anet input of energy, which must ultimately come from thethermal energy of the gas. The pressure therefore does notrise fast enough to compensate for the increasing forceof gravity and the core begins a catastrophic gravitationalcollapse. On a time scale of less than 1 sec, the centraldensity rises to 1014 g cm−3 and the temperature rises to3 × 1010 K. As the density increases, the degenerate freeelectrons are captured by the nuclei, reducing the elec-tron pressure and further contributing to collapse. At thesame time, there is very rapid energy loss from neutri-nos. The point is reached where most of the matter is inthe form of free neutrons, and when the density becomes

high enough, their degenerate pressure increases rapidlyenough to stop the collapse. At that point a good fractionof the original iron core has collapsed to a size of 106

cm and has formed a neutron star, nearly in hydrostaticequilibrium, with a shock front on its outer edge throughwhich material from the outer parts of the star is fallingand becoming decelerated. The core collapse is thoughtto be the precursor to the event known as a supernova oftype II.

The question of what happens after core collapse isone of the most interesting in astrophysics. Can at leastpart of the gravitational energy released during the col-lapse be transferred to the envelope and result in its ex-pansion and ejection in the form of a supernova? Presentindications are that it is possible and that the shock willpropagate outward into the envelope. A large fraction ofthe gravitational energy is released in the form of neutri-nos, produced during the neutronization of the core. Mostof these neutrinos simply escape, but the deposition of asmall fraction of their energy and momentum in the layersjust outside the neutron star is crucial to the ejection pro-cess. Assuming that the shock does propagate outward,it passes through the various shells and results in furthernuclear processing, including production of a wide va-riety of elements up to and including the iron peak. Italso accelerates most of the material outside the originaliron core outward to escape velocities. When the shockreaches the surface of the red giant star, the outermostmaterial is accelerated to 10,000 km sec−1, and the deeperlayers reach comparable but somewhat smaller velocities.Luminosity, velocity, and Teff as a function of time in nu-merical simulations of such an outburst agree well withobservations. The enormous luminosity arises, in the ear-lier stages, from the rapid release of the thermal energyof the envelope. At later times, most of the observed ra-diation is generated by the radioactive decay of the 56Nithat is produced mainly by explosive silicon burning inthe supernova shock. Supernova observations are best fitwith total explosion energies of about 1051 erg. The CrabNebula (Fig. 14) is consistent with this energy and an ex-pansion velocity of 10,000 km sec−1. Another good testof the theory of stellar evolution is the calculation of therelative abundances of the elements between oxygen andiron in the ejected supernova envelope. Integration of theyields of these elements over the range of stellar massesthat produce supernovae gives values that are consistentwith solar abundance ratios.

D. Evolution of Low-Mass Starswith Low-Metal Abundance

The oldest stars in the galaxy are characterized by metalabundances that are 0.1–0.01 that of the sun and by a

Page 208: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 73

FIGURE 14 Crab Nebula in Taurus (M1, NGC 1952) in red light,the remnant of the supernova explosion in AD 1054 (Shane120-in. reflector). (Lick Observatory photograph.)

distribution in the galaxy that is roughly spheroidal ratherthan disklike. These stars can be found in globular clus-ters or in the general field; in particular, the H–R diagramsof globular clusters (Fig. 15) give important informationon the age of the galaxy and the helium abundance at thetime the first stars were formed. There are observationaldifficulties in determining the properties of these stars,since most of them are very distant. For example, it hasnot been possible to determine observationally the mass–metallicity–luminosity relation for their main sequence.However, observations of the locations in the H–R diagramof a few nearby low-metal stars show that they fall belowthe main sequence, defined by stars of solar metal abun-dance. This information, combined with detailed analysisof the H–R diagrams of globular clusters, indicates thatthe helium abundances in the old stars are slightly lessthan that of the sun: Y = 0.24 ± 0.02.

The age estimates of globular clusters, based on detailedcomparisons of observed H–R diagrams with theoreticalevolutionary tracks (see Fig. 4), range from 10 to 15 ×109 years. Recent work, based on improved observations,Hipparcos parallaxes for nearby low-metal stars which areused to calibrate the distances to the globular clusters, andimproved theoretical tracks, favors ages in the range 13 to14 × 109 years for the clusters of lowest metallicity. Thehigh-mass stars in these clusters have long ago evolvedto their final states, mostly white dwarfs. The mass of thestars that are now evolving off the main sequence and be-

coming relatively luminous red giants is about 0.8 M.On the main sequence, however, these stars have approx-imately the solar luminosity because of their lower metalcontent, and hence lower opacity, when compared withstars of normal metal abundance. The evolution duringthe main-sequence phase and the first ascent of the giantbranch is very similar to that of 1-M stars described pre-viously. Hydrogen burning occurs on the pp chains; thereis no convection in the core; and when hydrogen is ex-hausted in the center, the energy source shifts to a shelland the star gradually makes the transition to the red gi-ant region. The core contracts and becomes degenerate,and the envelope expands and becomes convective. Thefirst dredge-up occurs, and a comparison can be made be-tween theoretical and observed element abundances at thisstage. Indications are that there is a disagreement in thesense that observed abundances change at earlier phases ofevolution than the theory suggests. The implication is thatsome additional mechanism besides convection is causingmixing into the deep interior. A helium flash in the coreoccurs when its mass is 0.45 M and the total luminosityis above 103 L.

When the helium flash is completed and the core is nolonger degenerate, the star settles down to a stable statewith He burning in the core and H burning in a shell. How-ever, the location in the H–R diagram during this phasediffers from that of the low-mass stars with solar metals.The star evolves rapidly to a position on the horizontalbranch (HB), with Teff considerably higher than that onthe giant branch and with L at about 40 L, considerablyless than that at the He flash (see Figs. 10 and 15). Calcula-tions show that as the assumed value of Z is decreased fora given mass, the position on the model on the HB shiftsto the left. This behavior is in agreement with observa-tions of globular clusters, which show that the leftwardextent of the HB is well correlated with observed Z , inthe sense that smaller Z corresponds to an HB extendingto higher Teff. In theoretical models, as Z is increased to10−2 the HB disappears altogether, and the He-burningphase occurs very close to the red giant branch. This cal-culation is also in agreement with the observed fact thatold clusters with solar Z do not have an HB. The HB itselfis not an evolutionary track. The spread of stars along theHB in a cluster of given Z represents a spread in stellarmass, ranging in a typical case from about 0.6 M at theleft-hand edge to about 0.8 M at the right. The mean HBmass is less than the mass of the stars that are just leavingthe main sequence in a given cluster. The implication isthat stars lose mass on the red giant branch during the pe-riod of their evolution just prior to the He flash and that thetotal mass lost varies from star to star. The mass loss prob-ably occurs by a mechanism similar to that which drivesthe solar wind, but much stronger.

Page 209: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

74 Stellar Structure and Evolution

FIGURE 15 Color-magnitude diagram of a low-metal globular cluster composed of data from about five differentactual clusters. The various features are ZAMS, zero-age main sequence; WD, white dwarfs; TO, turnoff; SG, subgiantbranch; RGB, red giant branch; HB, horizontal branch; rr, RR Lyrae variables; EHB, extreme horizontal branch; andAGB, asymptotic giant branch. The location of the various features can be used to compare theory with observations.[Figure courtesy of William E. Harris.]

The phase of core helium burning on the HB lasts about108 years, after which the star develops a double shellstructure and evolves back toward the red giant branch,which during the following phase of evolution is referredto as the asymptotic giant branch. The evolution from thispoint on is very similar to that of the star of 1 M withZ = 0.02; ascent of the asymptotic giant branch, develop-ment of a carbon/oxygen core that becomes increasinglydegenerate, development of a pulsational instability in anda wind from the convective envelope, ejection of the enve-lope, evolution through the planetary nebula phase, and,finally, the transition to the white dwarf phase.

VII. FINAL STATES OF STARS

A. Brown Dwarfs

Although objects below 0.075 M are not, strictly speak-ing, stars, their general observational parameters are nowbecoming available, and the physics that must be includedin theoretical models is very similar to that for the lowest

mass stars. For example, the equation of state must includenon-ideal Coulomb interactions as well as partial electrondegeneracy, and the surface layer calculation must includethe effects of molecules. Substellar objects never attain in-ternal temperatures high enough that nuclear burning cansupply their entire radiated luminosity. They can, however,burn deuterium by Reaction (2), and this energy sourcecan be significant for a short period of time. Below about0.06 M, internal temperatures never become high enoughto burn Li by Reaction (6); thus, most brown dwarfs shouldshow Li lines in their spectra. However, above that mass,the Li does burn near the center, and because the objectsare fully convective, the depletion of Li becomes evidentat the surface. This lithium test has been successfully usedobservationally to identify brown dwarfs.

The brown dwarfs contract to the point where the elec-trons become degenerate. For 0.07 M, the value of Teff

during contraction is about 2900 K; for 0.02 M, it isabout 2500 K. Following that time the contraction be-comes very slow and the objects approach an asymptoticvalue of the radius, depending on mass and composition.At 0.07 M, the time to contract to maximum central

Page 210: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 75

temperature, which corresponds to significant electron de-generacy, is about 3 × 108 years; for lower masses, it isless. Beyond this point the object cools, with decreasingTeff and L . A typical brown dwarf after an evolution time of3 × 109 years will have Teff = 1000–1500 K and L/L =10−5 to 10−6 and is therefore very difficult to observe.Nevertheless, a number of them have been identified, forexample, the companion to Gl 229, which falls exactlyinto this range and is thought to have a mass of about0.03–0.05 M. The brown dwarfs are distinguished fromwhite dwarfs first by their very cool surface temperaturesand second by the fact that their internal composition ispractically unchanged from the time of formation; that is,it is about 71% hydrogen by mass. As a result, the radiiare larger than those of white dwarfs, with values near theend of contraction of about 0.1 R over the mass range0.01–0.08 M. The dividing line between brown dwarfsand planets is arbitrary. One possibility is the mass be-low which deuterium burning is no longer possible, about0.012 M. Or they could be distinguished by their forma-tion mechanism: planets form by accretion in protostellardisks, while brown dwarfs form in the same way as stars,by fragmentation in collapsing interstellar clouds.

B. White Dwarfs

Observational parameters of the most commonly ob-served white dwarfs are Teff = 10,000–20,000 K andlog L/L = −2 to −3. The objects, therefore, lie well be-low the main sequence, and the deduced radii fall in therange 10−2 R, with mean densities above 106 g cm−3. Atthese high densities, most of the mass of the object fallsin the region of complete electron degeneracy; only a thinsurface shell is non-degenerate. Only a few white dwarfsoccur in binary systems with sufficiently accurate obser-vations so that fundamental mass determinations can bemade. Examples are Sirius B, Procyon B, and 40 EridaniB, with masses of 1.05, 0.63, and 0.43 M, respectively.

The structure of a white dwarf can be determined ina straightforward way under the assumptions of hydro-static equilibrium and uniform composition, with the pres-sure supplied entirely by the degenerate electrons. The ionpressure and the effect of the thin surface layer are smallcorrections. Because the pressure is simply a functionof density—for example, P ∝ ρ5/3 in the nonrelativisticlimit—it can be eliminated from Eq. (32). The structurecan then be calculated from the solution of Eqs. (32) and(33). The results show that for a given composition thereis a uniquely defined relation between mass, radius, andcentral density. In other words, if the central density isconsidered to be a parameter, for each value of it therecorresponds a unique value of the mass and of the radius.This theoretical mass–radius relation, appropriate for a

FIGURE 16 Theoretical mass–radius relation derived byS. Chandrasekhar for fully degenerate white dwarfs (solid line)compared with a few observed points (open circles). The massesof Sirius B and 40 Eridani B are known to within 5%; all radii andthe mass of Procyon B are less accurately known.

typical C/O white dwarf or for a helium white dwarf, isshown in Fig. 16, where it is compared with the observa-tions of a few white dwarfs. If a more accurate equationof state is included, the line is shifted slightly, depend-ing on composition. Note that the radius decreases withincreasing mass and vanishes altogether at 1.46 M.

This upper limit to the mass of a white dwarf was firstderived by S. Chandrasekhar and corresponds to infinitecentral density in the formal solution of the equations. Infact, however, the physical process that limits the mass ofa white dwarf is capture of the highly degenerate electronsby the nuclei, predominantly carbon and oxygen. Supposethat a white dwarf increases in mass upward toward thelimit. The central density goes up, the degree of electrondegeneracy goes up, and a critical density is reached, about1010 g cm−3 for carbon, where electron capture starts. Asfree electrons are removed from the gas, the pressure is re-duced and the star can no longer maintain hydrostatic equi-librium. Collapse starts, more electron capture takes place,and (assuming no nuclear burning takes place) neutron-rich nuclei are formed. Collapse stops when the neutrondegeneracy results in a very high pressure, in other words,when a neutron star has been formed. The white dwarfmass limit is reduced by about 10%, depending upon com-position, from the value of 1.46 M given earlier.

The theoretical mass–radius relation agrees, within theuncertainties, with the observations. The location of thetheoretical white dwarfs in the H–R diagram also agreeswith observations. From the L and Teff of observed whitedwarfs, one can obtain radii and from them the corre-sponding masses, using the M–R relation. The masses ofwhite dwarfs, determined in this manner, range from 0.42to 0.7 M. These objects have presumably evolved from

Page 211: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

76 Stellar Structure and Evolution

stars of somewhat higher original mass, as indicated inFig. 10. Their composition is therefore that of the evolvedcores of these stars, mainly carbon and oxygen with onlya thin layer of hydrogen or helium on the outside. Theinternal temperatures are not high enough to burn nuclearfuel and contraction is no longer possible, so the only en-ergy source available is the thermal energy of the ionsin the interior. The star’s L , Teff, and internal tempera-ture all decrease as the star evolves at a constant radius.The energy transport in the interior is by conductionby the free electrons, which are moving very rapidly andhave long collisional mean free paths. The conduction isefficient with the result that the interior is nearly isother-mal; the energy loss rate is controlled by the radiativeopacity in the thin outer nondegenerate layers. These lossrates are sufficiently low that the cooling times are gener-ally several billion years.

C. Neutron Stars

These objects, whose existence as condensed remnantsleft behind by supernova explosions was predicted byF. Zwicky and W. Baade in 1934, were first discovered byBell and Hewish in 1967 from their pulsed radio radiation.The typical mass is 1.4 M, corresponding to the mass ofthe iron core that collapses for stars in the 10- to 20-Mrange, the typical radius is 10 km, and the mean densityis above 1014 g cm−3, close to the density of the atomicnucleus. At this high density the main constituent is freeneutrons, with a small concentration of protons, electrons,and other elementary particles. The surface temperatureis very difficult to determine directly from observations.Neutron stars form as collapsing cores of massive stars andreach temperatures of 109–1010 K during their formation.However, they then cool very rapidly by neutrino emis-sion; within one month Teff is less than 108 K, and within105 years it is less than 106 K. Their luminosity soon be-comes so low (∼10−5 L) that they would not be detectedunless they were very nearby. The fact that neutron starsare observable arises from the remarkable properties that(1) they are very strongly magnetic, (2) they rotate rapidly,and (3) they emit strong radio radiation in a narrow coneabout their magnetic axes. The rotation axis and the mag-netic axis are not coincident, and so if the star is orientedso that the radio beam sweeps through the observer’s in-strument, he or she will observe radio pulses separatedby equal time intervals. The observed pulsars are thus in-terpreted as rapidly spinning neutron stars with rotationperiods in the range 4.3 to 1.6 msec, with an average of0.79 sec. The pulsar in the Crab nebula (Fig. 14) has aperiod of 1/30 sec. The origin of the beamed radiation,which is also observed in the optical and X-ray regions ofthe spectrum, is not known. The electromagnetic energy

is emitted at the expense of the rotational energy, and therotation periods are gradually becoming longer.

D. Black Holes

Neutron stars also have a limiting mass, but it is not knownas well as that for white dwarfs because the physics of theinterior, in particular the equation of state, is not fullyunderstood. A reasonable estimate is 2–3 M. If the col-lapsing core of a massive star exceeds this limit, the col-lapse will not be stopped at neutron star densities becausethe pressure gradient can never be steep enough to balancegravity. The collapse continues indefinitely, to infinite den-sity, and a black hole is formed. Stars with original massesin the range 10–20 M form cores of around 1.4 M; theywill become supernovae and leave a neutron star as a com-pact remnant. Stars above 25–30 M have larger cores atthe time of collapse, and the amount of mass that collapsesprobably exceeds the neutron star mass limit; the result isa black hole.

E. Supernovae

Only very small fraction of stars ever become supernovae.Two main types are recognized. The supernovae associ-ated with the collapse of the iron core of massive stars arecalled type II. They tend to be observed in the spiral armsof galaxies, where recent star formation has taken place.The luminosity at maximum light is about 1010 L, andthe decline in luminosity with time takes various formswith a typical decline time of a few months. The observedtemperature drops from about 15,000 to 5000 K duringthe first 30–60 days, and the expansion velocity decreasesfrom 10,000 to 4000 km sec−1 over the same time period.The abundances of the elements are similar to those in thesun, with hydrogen clearly present. The type I supernovae,on the other hand, tend to be observed in regions associatedwith an older stellar population, for example, in ellipticalgalaxies. The luminosity at maximum is about the same asthat for type II, and the curves of the decline in light withtime are all remarkably similar, with a rapid early declinefor 30 days and then a more gradual exponential decay inwhich the luminosity decreases by a factor 10 in typically150 days. Temperatures are about the same as in type II,the expansion velocities are somewhat higher, and the ele-ment abundances are quite different: there is little evidencefor hydrogen, or helium and the spectrum is dominated bylines of heavy elements. The origin of the type I supernovais controversial, but one promising model, which appliesto the so-called type Ia, is that it represents the end pointof the evolution of a close binary system. Just before theexplosion the system consists of a white dwarf near thelimiting mass and a companion of relatively low mass,

Page 212: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

Stellar Structure and Evolution 77

which lasts for several Gyr on the main sequence. Whenthe companion evolves, it expands and at some point trans-fers mass to the white dwarf, which therefore is accretingmaterial that is hydrogen rich. The hydrogen burns nearthe surface, producing a layer of helium that increasesin mass. By the time the helium is hot enough to burn,it does so under degenerate conditions. The temperatureincreases rapidly to the point where the carbon ignites ex-plosively. Much of the material is processed to heavierelements such as iron, nickel, cobalt, and silicon. It stillis not entirely certain whether the whole star blows up orwhether a neutron star or white dwarf remnant is left. Thelight emitted is produced primarily from the radioactivedecay of 56Ni. In any case, the type Ia supernova is es-sentially a nuclear explosion, while the type II supernovais driven ultimately by the rapid release of gravitationalenergy during core collapse.

VIII. SUMMARY: IMPORTANTUNRESOLVED PROBLEMS

The study of stellar structure and evolution has resultedin major achievements in the form of quantitative andqualitative agreement with observed features, in the pre-main-sequence, main-sequence, and post-main-sequencephases. A number of aspects of the physics need to be im-proved, and improvement in many cases means extendingthe theory from the relatively simple one-dimensional cal-culations into two- and three-dimensional hydrodynamicsimulations. Considerable uncertainty is caused by a lackof a detailed theory of convection that can be applied instellar interiors. On the main sequence this uncertaintycould result in relatively minor changes in the surfacelayers of stars, mainly in the mass range 0.8–1.1 M.Three-dimensional hydrodynamic simulations of convec-tion near the surface of sunlike stars provide very goodagreement with the observed properties of solar granu-lation and can provide a calibration for the value of themixing length to be used in long-term calculations overbillions of years. However, the question of overshootingat the edges of interior convection zones still needs tobe solved. In pre-main-sequence and post-main-sequencestars, which have deep extended convection zones, a bet-ter theory could result in some major changes. There areseveral examples of situations where observations suggestmixing of the products of nucleosynthesis to the surface ofstars, but the theoretical models of convection zones do notprovide sufficiently deep mixing. Other possible mixingmechanisms, including overshoot, need to be considered.Improvement in laboratory measurements of nuclear re-action rates, particularly for the pp chains, could reducethe theoretical error bar in the rate of solar neutrino pro-

duction. The continued discrepancy in the solar neutrinodetection rates, along with the excellent agreement be-tween the solar model and the helioseismological results,suggests that neutrino oscillations may actually be the so-lution to the puzzle. New experiments under way, such asSNO, should provide further clues on this question.

The assumption of mass conservation is built into muchof the theory discussed in this article. However, it is knownfrom observation that various types of stars are losingmass, including the T Tauri stars in the early pre-main-sequence contraction phase, the O stars on the upper mainsequence, and many red giant stars. The physics of theejection process is not well explained, especially in thecase of rapid mass loss leading to the planetary nebulaphase. Also, in the early phases of evolution, the massloss, often in the form of bipolar ouflows and jets, is asso-ciated with simultaneous accretion of mass onto the starthrough a disk. The physics of the mass and angular mo-mentum transfer at this stage needs to be clarified.

Binary star evolution has not been treated here, butnumerous fascinating issues remain for further study.If the two components interact, the problem again in-volves three-dimensional hydrodynamics. The first dif-ficulty is the formation process. Most stars are found indouble or multiple systems, but their origin, presumably aresult of fragmentation of collapsing, rotating interstellarclouds, is not well understood, particularly for the closebinaries. A further problem is the explanation of short-period systems consisting of a white dwarf and a nearbylow-mass main-sequence companion. These systems areassociated with nova outbursts and possibly supernovaeof type I. The problem is that the orbital angular mo-mentum of such a binary is much smaller than that ofthe original main-sequence binary from which the systemmust have evolved. A goal of current research is to un-cover the process by which the loss of angular momen-tum occurs. A likely candidate is the “common envelope”phase. An expanding red giant envelope interacts with itsmain-sequence companion, so that the latter spirals intothe giant, losing angular momentum and energy becauseof frictional drag and gravitational torques. The energylost from the orbital motion is deposited in the envelope,resulting in its ejection, leaving the main-sequence starin a short-period orbit about the white dwarf core of thegiant. Clarification of this process requires detailed three-dimensional hydrodynamic simulations.

The effects of rotation and magnetic fields have also notbeen discussed in this article. However, their interaction inthe context of stellar evolution is certain to produce somevery interesting modifications to existing theory. Duringthe phase of star formation these processes are central,and consideration of them is crucial for the clarification ofthe processes of binary formation and generation of jets.

Page 213: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GTV Final Pages

Encyclopedia of Physical Science and Technology EN016C-736 July 31, 2001 13:56

78 Stellar Structure and Evolution

During main-sequence evolution, rotational and magneticenergies, although present, are almost certainly small com-pared with gravitational energy, and their overall effect onthe structure is therefore insignificant. However, the cir-culation currents induced by rotation could be importantwith regard to mixing and the exchange of matter betweenthe deep interior and the surface layers. During post-main-sequence phases, the cores of stars contract to very highdensities, and if angular momentum is conserved in themfrom the main-sequence phase, they would be expected torotate very rapidly by the time the core becomes degener-ate. However, white dwarfs are rotating slowly, suggestingthat at some stage almost all of the angular momentumis transferred, by an as yet unexplained process, out ofthe cores of evolving stars. The same problem applies tothe rotation of neutron stars.

A number of other problems present themselves. Thecomplicated physical processes involved in the generationof supernovae explosions and the formation of neutronstars require further study, in some cases involving three-dimensional hydrodynamics. Here, rotational and mag-netic effects again become important, as well as neutrinotransfer, development of jets, and the possible generationof gamma-ray bursts. The observational and theoreticalstudy of stellar oscillations can provide substantial infor-mation on the structure and evolution of stars. For ex-ample, the interior of the sun has been probed through theanalysis of its short-period oscillations, and the techniquescan be extended to other stars. For many types of stars (in-cluding the sun!), the physical mechanism producing theoscillations is not understood. The determination of theages of globular clusters, presumably containing the old-

est stars in the galaxy, through the use of stellar evolution-ary tracks, is a particularly interesting problem as it helpsto constrain cosmological models. It is clear that the studyof stellar evolution involves a wide range of interactingphysical processes and that the findings are of importancein numerous other areas of astronomy and physics.

SEE ALSO THE FOLLOWING ARTICLES

BINARY STARS • DARK MATTER IN THE UNIVERSE •GALACTIC STRUCTURE AND EVOLUTION • NEUTRINO AS-TRONOMY • NEUTRON STARS • STAR CLUSTERS • STARS,MASSIVE • STARS, VARIABLE • STELLAR SPECTROSCOPY

• SUPERNOVAE

BIBLIOGRAPHY

Bahcall, J. N. (1989). “Neutrino Astrophysics,” Cambridge Univ. Press,Cambridge, UK.

Bowers, R., and Deeming, T. (1984). “Astrophysics I. Stars,” Jones &Bartlett, Boston.

Clayton, D. D. (1983). “Principles of Stellar Evolution and Nucleosyn-thesis,” Univ. of Chicago Press, Chicago.

Goldberg, H. S., and Scandron, M. (1981). “Physics of Stellar Evolutionand Cosmology,” Gordon & Breach, New York.

Kippenhahn, R. (1983). “100 Billion Suns,” Basic Books, New York.Kippenhahn, R., and Weigert, A. (1990). “Stellar Structure and Evolu-

tion,” Springer-Verlag, Berlin.Phillips, A. C. (1994). “Physics of Stars,” Wiley, Chichester.Shklovskii, I. S. (1978). “The Stars: Their Birth, Life, and Death,” Free-

man, San Francisco.Tayler, R. J. (1994). “The Stars: Their Structure and Evolution,” Cam-

bridge Univ. Press, Cambridge, UK.

Page 214: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

SupernovaeDavid BranchUniversity of Oklahoma

I. Basic ObservationsII. The Interpretation of SpectraIII. Thermonuclear SupernovaeIV. Core-Collapse SupernovaeV. Supernova Remnants

VI. Supernovae as Distance Indicatorsfor Cosmology

VII. Prospects

GLOSSARY

Absolute magnitude The apparent magnitude that a starwould have if it were at a distance of 10 pc.

Apparent magnitude A logarithmic measure of thebrightness of a star as measured through, e.g., ablue or visual filter. A difference of five magni-tudes corresponds to a factor of 100 in observed flux;one magnitude corresponds to a factor of 2.512. In-creasing magnitude corresponds to decreasing bright-ness.

Light curve Apparent magnitude, absolute magnitude, orluminosity, plotted against time.

Luminosity The rate at which a star radiates energy intospace. The luminosity of a supernova usually refersonly to the thermal radiation in the optical, ultraviolet,and infrared bands, and does not include nonthermal γ

rays, X-rays, and radio waves.Neutron star A very compact star that has a radius of

the order of 10 km and a density that exceeds that ofnuclear matter. The maximum mass is thought to be

less than 3 solar masses. A neutron star is one possiblefinal state of the core of a massive star (the other beinga black hole).

Parsec The distance at which a star would have an annualtrigonometric parallax of 1 sec of arc. One parsec (pc)equals 206,625 astronomical units, or 3.1 × 1018 cm.Distances to the nearest stars are conveniently ex-pressed as pc; distances across our Galaxy, as kilo-parsecs (kpc); distances to external galaxies, as mega-parsecs (Mpc).

Photosphere The layer in a star from which photons es-cape to form a continuous spectrum; equivalently, thelayer at which the star becomes opaque to an externalobserver’s line of sight.

Stellar wind The steady flow of matter from a (nonex-ploding) star into surrounding space.

White dwarf A compact star whose internal pressureis provided by degenerate electrons. A typical whitedwarf has a radius comparable to that of Earth but adensity of the order of 107 g/cm3. The maximum mass,the Chandrasekhar mass, is 1.4 solar masses. A white

335

Page 215: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

336 Supernovae

dwarf is the final state of a star that forms with less than8 solar masses.

A SUPERNOVA is a catastrophic explosion of a star. Theluminosity of a bright supernova can be 1010 times that ofthe Sun. The explosion throws matter into space at a fewpercent of the speed of light, with a kinetic energy of 1051

erg—the energy equivalent of 1028 Mtons of TNT. Theejected matter is enriched in heavy elements, some thatwere synthesized by nuclear fusion reactions during theslow preexplosion evolution of the star and some that weresynthesized during the explosion itself. Thus, supernovaedrive the nuclear evolution of the universe.

Some supernovae are the complete thermonuclear dis-ruptions of white dwarf stars; others are initiated by thegravitational collapse of the cores of massive stars. Thelatter eject only their outer envelopes and leave behindcompact stellar remnants—neutron stars and black holes.The ejected matter of a supernova sweeps up and heatsinterstellar gas to form an extended supernova remnant.

Supernovae are valuable indicators of extragalactic dis-tances. They are used to establish the extragalactic dis-tance scale (the Hubble constant) and to probe the historyof the cosmic expansion (deceleration and acceleration).

I. BASIC OBSERVATIONS

A. Discovery

At least five of the temporary “new” stars that suddenlyappeared in the sky during the last millennium—in 1006,1054, 1181, 1572, and 1604—are now known to have beensupernova explosions in our Galaxy. Some of these werebright enough to be seen by eye even in daylight. Radio,X-ray, and optical emission from the extended remnantsof these historical Galactic supernovae is now detected.Hundreds of other extended remnants of supernovae thatoccurred in the Galaxy within the last 100,000 years alsoare recognized.

No Galactic supernova has been discovered since 1604,just a few years before the invention of the telescope. Thesupernova remnant known as Cassiopeia A, the bright-est radio source in the sky beyond the solar system, wasproduced by a supernova that occurred near 1680, but theevent was subluminous and either was not noticed or wasdismissed as an ordinary variable star. Most of the super-novae that occur in our Galaxy are not detected by ob-servers on Earth, owing to extinction of their light by dustgrains that are concentrated within the disk of the Galaxy.

On February 24, 1987, an extraordinary astronomi-cal event took place when a supernova appeared in the

Large Magellanic Cloud (LMC), a small irregular satel-lite galaxy of our Galaxy. At a distance of only 50 kpc(160,000 light years), the LMC is the nearest externalgalaxy. At its brightest, SN 1987A reached the third appar-ent visual magnitude and was easily visible to the nakedeye (from sufficiently southern latitudes). SN 1987A wasthe brightest supernova since 1604, and it became by farthe most well-observed supernova. Its aftermath will beobserved long into the future, perhaps as long as there areastronomers on Earth.

Until another outburst is seen in our Galaxy, or in one ofour satellite galaxies, studies of the explosive phases of su-pernovae must be based on more distant events. The studyof extragalactic supernovae began in August 1885, when astar of the sixth apparent visual magnitude appeared nearthe nucleus of the Andromeda Nebula, now known to bethe nearest large external galaxy, at a distance of less than0.8 Mpc. It was not until 1934 that the titanic scale of suchevents was generally recognized, and the term supernovabegan to be used. Systematic photographic searches forsupernovae began in 1936. The following 35 years wereprimarily exploratory, as some of the basic characteristicsof the various supernova types were established, and sta-tistical information was gathered. A convention for desig-nating each event eventually was adopted: e.g., SN 1987Awas the first supernova to be discovered in 1987, andSN 1979C was the third of 1979. By the end of 1989, 687extragalactic supernovae had been seen. During the late1990s the discovery rate increased dramatically, owingmainly to searches with modern detectors (CCDs) for verydistant supernovae, which can be discovered in batches.SN 1999Z was followed by SN 1999aa to SN 1999az, thenSN 1999ba to SN 1999bz, and so on, to SN 1999gu. Bythe end of 1999, 1447 supernovae had been found (Fig. 1).

B. Types

The classification of supernovae is based mainly on the ap-pearance of their optical spectra (Figs. 2 and 3). Type I su-pernovae (so named because the first well-observed eventsof the 1930s were of this kind) show no conspicuous spec-tral features produced by hydrogen; Type II supernovae doshow obvious hydrogen lines. Type I supernovae are sub-divided according to other spectroscopic characteristics:those of Type Ia have a very distinctive spectral evolutionthat includes the presence of a strong red absorption lineproduced by singly ionized silicon; the spectra of Type Ibsupernovae include strong lines of neutral helium; neitherionized silicon nor neutral helium is strong in the spectraof Type Ic. Type II supernovae that have especially narrowlines are called Type IIn. Type II events also are dividedinto Type II-P and Type II-L according to the shapes oftheir light curves.

Page 216: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

Supernovae 337

FIGURE 1 CCD images of SN 1991T in its parent galaxyNGC 4527, displayed at two contrast levels. [Reproduced by per-mission from Filippenko, A. V., et al. (1991). Astrophys. J. 384,L15.]

FIGURE 2 Photospheric-phase optical spectra of supernovae ofvarious types, obtained 1 week after the time of maximum bright-ness (1 week after the time of explosion in the case of SN 1987A).[Reproduced by permission from Branch, D., Nomoto, K., andFilippenko, A. V. (1991). Comments Astrophys. XV, 221.]

FIGURE 3 Nebular-phase optical spectra of supernovae, ob-tained 5 months after the time of maximum brightness (5 monthsafter the time of explosion in the case of SN 1987A). [Reproducedby permission from Branch, D., Nomoto, K., and Filippenko, A. V.(1991). Comments Astrophys. XV, 221.]

C. Light Curves

A typical supernova reaches its maximum brightnessabout 20 days after explosion. At its brightest, a nor-mal Type Ia supernova (SN Ia) reaches an absolute vi-sual magnitude of −19.5 and has a luminosity exceeding1043 erg/sec, billions of times that of the Sun. SNe Ia(plural) are highly homogeneous with respect to peak ab-solute magnitude as well as other observable properties.Supernovae of the other types show more observationaldiversity, and almost all of them are less luminous thanSNe Ia. The time-integrated luminosity of a bright super-nova exceeds 1049 erg, while the typical kinetic energy is1051 erg; thus the radiative efficiency of a supernova islow.

Characteristic light-curve shapes of the various super-nova types are compared in Fig. 4. The light curve of aSN Ia consists of an initial rise and fall (the early peak) thatlasts about 40 days, followed by a slowly fading tail. Thetail is nearly linear when magnitude is plotted against time,but since magnitude is a logarithmic measure of bright-ness, the tail actually corresponds to an exponential decayof brightness, The rate of decline of the SN Ia tail corre-sponds to a half-life of about 50 days.

The light curves of many SNe II interrupt the initial de-cline from their peaks to enter a “plateau” phase of nearlyconstant brightness for several months; these are desig-nated Type II-P. Others show a nearly linear decline (inmagnitudes) from peak, with little or no plateau; these are

Page 217: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

338 Supernovae

FIGURE 4 Schematic light curves of supernovae of varioustypes. [Reproduced by permission from Wheeler, J. C. (1990).In “Supernovae” (J. C. Wheeler, T. Piran, and S. Weinberg, eds.),p. 1, World Scientific, Singapore.]

called Type II-L. Some SNe II, especially the SNe IIn,have slowly fading light curves that fit neither the SN II-Pnor the SN II-L categories. Most SNe II eventually en-ter a linear tail phase, many with a rate of decline thatcorresponds to a half-life near 77 days.

SN 1987A was subluminous, and its light curve was un-usual. It was observationally conspicuous only because itoccurred in a very nearby galaxy; most such subluminousevents in distant galaxies go undiscovered. This illustratesthe importance of a very strong observational selection ef-fect: in our observational sample of supernovae, luminousevents are highly overrepresented relative to subluminousones (Fig. 5).

D. Sites, Rates, and Stellar Populations

Galaxies are classified as spirals and ellipticals. SNe II,SNe Ib, and SNe Ic are found in spirals, usually in thespiral arms. SNe Ia are found in both kinds of galaxies, andthose in spirals show little, if any, tendency to concentrateto the arms.

To estimate the relative production rates of the differ-ent types of supernovae in the various kinds of galaxies,several observational selection effects must be taken intoaccount. The most important is the bias in favor of thediscovery of luminous events. For example, because oftheir high peak luminosities, SNe Ia are the most numer-ous type in the observational sample, but they actuallyoccur less frequently than SNe II. Another important ef-fect is that it is difficult to discover supernovae in spiralgalaxies whose disks are oriented such that we observethem from the side. When these and additional selectioneffects are allowed for, it is found that in spirals, SNe II are

most frequent, followed by SNe Ic, SNe Ib, and SNe Ia,which occur at comparable rates. In spirals, the rates ofall supernova types are correlated with galaxy color: thebluer the galaxy—and, by inference, the higher the starformation rate within the last billion years—the higher thesupernova rate. Elliptical galaxies produce only SNe Ia,at a lower rate than spirals.

Absolute supernova rates are more uncertain than rel-ative rates because the relevant selection effects are stillmore difficult to evaluate. In large spiral galaxies, the meaninterval between supernovae is about 25 years. A similarrate is derived for our Galaxy itself, from the number ofhistorical supernovae that have been seen, but this involvesa large correction for events that have been missed owingto our location in the dusty disk of the Milky Way. The factthat the remnants of all of the historical Galactic super-novae are within a few kiloparsecs of the Sun, while theradius of the galaxy is 15 kpc, confirms that only (someof) the nearest Galactic supernovae of the past millenniumhave been noticed by observers on Earth.

The observation that SNe II, SNe Ib, and SNe Ic appearin the arms of spirals indicates that they are produced bymassive stars. Only stars that are formed with a mass thatis greater than about 8 solar masses have nuclear lifetimesshort enough—less than about 30 million years—that theyexplode before drifting out of the arms in which they wereborn.

FIGURE 5 Absolute blue magnitudes of supernovae are plot-ted against the distance modulus of their parent galaxies. Thedistance modulus is five times the logarithm of the distance, ex-pressed in units of 10 pc. The slanted dashed lines correspond toapparent blue magnitudes of 20 and 25. The horizontal dashedline is at the absolute magnitude of a normal SN Ia, MB = −19.5.For each supernova, the absolute magnitude is derived from thebrightest apparent magnitude that was observed, which was notin all cases at the time of maximum brightness. [Courtesy ofD. Richardson, University of Oklahoma.]

Page 218: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

Supernovae 339

The observation that SNe Ia occur in elliptical galax-ies, but also in spirals in proportion to the recent rate ofstar formation, appears at first to be paradoxical. Ellipti-cal galaxies stopped forming stars billions of years agoand now contain few stars that are much more massivethan the Sun. On the other hand, the correlation betweenthe SN Ia rate and the recent star-formation rate in spiralsimplies that most SNe Ia in spirals are produced by starsthat are fairly short-lived and, therefore, born moderatelymassive. The resolution of the paradox is that a SN Ia isproduced by a white dwarf that accretes matter from a bi-nary companion star until it is provoked to explode. Thiscan account for the low but nonzero SN Ia rate in ellipti-cals as well as for the correlation of the SN Ia rate withgalaxy color in spirals (Section III.C).

II. THE INTERPRETATION OF SPECTRA

A. The Continuous Spectrum

The shape of a star’s thermal continuous spectrum is de-termined primarily by the temperature at the photosphere.The absolute brightness depends also on the radius of thephotosphere. In a supernova, the temperature and radius ofthe photosphere change with time. When a star explodes,its matter is heated and thrown into rapid expansion. Theluminosity abruptly increases in response to the high tem-perature, but most of this energy is radiated as an X-rayand ultraviolet “flash” rather than as optical light. As thesupernova expands, it cools. For about 3 weeks the opticallight curve rises as the radius of the photosphere increases,and an increasing fraction of the radiation from the coolingphotosphere goes into the optical band. Maximum opticalbrightness occurs when the temperature is near 10,000 K.At this time the radius of the photosphere is 1015 cm, or70 AU, almost twice the radius of Pluto’s orbit. The matterdensity at the photosphere is low, near 10−16 g/cm3.

After maximum light, further cooling causes the lightcurve to decline. Owing to the expansion and consequentgeometrical dilution, the photosphere recedes with respectto the matter, i.e., matter flows through the photosphere,and deeper and deeper layers of the ejecta are graduallyexposed. Eventually, at a time that depends mainly on theamount of matter ejected but that ordinarily is a matterof months, the ejected matter becomes optically thin, thecontinuous spectrum fades away, and the “photosphericphase” comes to an end. The supernova then enters thetransparent “nebular” phase.

B. The Spectral Lines

Supernova spectral lines carry vital information on thetemperature, velocity, and composition of the ejected

FIGURE 6 Top: A schematic drawing of the photosphere(shaded) and the surrounding atmosphere of a supernova. Anobserver to the left would see an emission line formed in region Eand an absorption line formed in region A; region O would be oc-culted. Bottom: The shape (in flux versus wavelength) of a charac-teristic supernova spectral line is shown. [Courtesy of K. Hatano,University of Oklahoma.]

matter. However, because of the fast ejection velocities,typically 10,000 km/sec—3% of the speed of light—thespectral features are highly Doppler-broadened and over-lapping, and extracting information from the spectra isdifficult.

During the photospheric phase, broad absorption andemission lines form outside the photosphere and are super-imposed on the continuous spectrum. A schematic modelthat is useful for understanding the formation of spectrallines during the photospheric phase is shown at the topin Fig. 6. The central region is the photosphere, and out-side the photosphere is the line-forming region. The entiresupernova is in differential expansion, with velocity pro-portional to distance from the center. This simple velocitylaw is a natural consequence of matter being ejected witha range of velocities and then coasting without further ac-celeration; after a time, each matter element has attaineda distance from the center in proportion to its velocity.

As a photon travels through the expanding atmosphere,it redshifts in wavelength with respect to the matter thatit is passing through (as a photon redshifts in the expand-ing universe). Thus a photon can redshift into resonancewith an electronic transition in an atom or ion and beabsorbed. In the very useful approximation of pure scat-tering, the absorbed photon is immediately reemitted, butin a random direction. From the point of view of an exter-nal observer, a photon scattered by a moving atom will beDoppler-shifted with respect to the rest wavelength of thetransition. A photon scattered into the observer’s line ofsight from the right of the vertical line in Fig. 6 comes from

Page 219: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

340 Supernovae

an atom that has a component of motion away from theobserver and is seen to be redshifted; a photon scatteredfrom the left of the vertical line is seen to be blueshifted.Therefore the region labeled E produces a broad, symmet-rical emission line superimposed on the continous spec-trum and centered on the rest wavelength of the transition.Region O is occulted by the photosphere. In region A,photons originally directed toward the observer can bescattered out of the line of sight. Because it has a com-ponent of motion toward the observer, matter in regionA produces a broad, asymmetric, blueshifted, absorptioncomponent.

The shape of the full line profile is as shown at thebottom in Fig. 6. This kind of profile, having an emis-sion component near the rest wavelength of the transi-tion and a blueshifted absorption component, is charac-teristic of expanding atmospheres and is referred to as a“P Cygni profile” (after a bright star whose atmosphere isin rapid but nonexplosive expansion). The precise shapeof a line profile in a supernova spectrum depends primar-ily on how the matter density decreases with radius andon the velocity at the photosphere. The higher the veloc-ity, the broader the profile and the more the absorptioncomponent is blueshifted.

In supernovae the velocities are so high that line profilesusually overlap. Physically, this corresponds to multiplescattering in the atmosphere: after a photon is scattered byone transition, it continues to redshift and may come intoresonance with another transition and be scattered againand again before it escapes from the atmosphere.

During the nebular phase that follows the photosphericphase, there is no continuum to serve as a source ofphotons to be scattered. However, an ion can be ex-cited by a collision with a free electron and respond byemitting a photon. Thus the spectrum during the nebularphase consists of broad, symmetric, overlapping emissionlines.

C. Composition

The spectrum of a SN II near the time of its maximumbrightness is almost a smooth featureless continuum. Sub-sequently, as the temperature falls and the ionized hydro-gen begins to recombine, the Balmer lines of hydrogenstrengthen. Further cooling causes the spectrum to be-come much more complicated as many additional linesdevelop, from ions such as singly ionized calcium, singlyionized iron, and neutral sodium. Detailed analysis of thestrengths of the spectral features during the early photo-spheric phase indicates that the composition of the outerlayers of a SN II resembles that of the Sun and ordinarystars: hydrogen and helium are most abundant and only asmall fraction of the matter is in the form of heavy ele-ments. The deeper, slower-moving matter, which becomes

observable at later times, shows strong lines of elementssuch as oxygen, calcium, and magnesium, and analysis re-veals that the composition is dominated by such elements.In general, the deeper the layer, the heavier the elementsof which it consists.

The spectra of SNe Ib lack hydrogen lines but they dodevelop strong lines of neutral helium. The compositionstructure is inferred to be much like that of a SN II, ex-cept for the absence of the outer hydrogen-rich layers. InSNe Ib, the temperature is not high enough to produceoptical lines of helium by ordinary thermal excitation; anonthermal source is required. This is provided by theradioactive decay of 56Ni that was synthesized in the ex-plosion, and then the decay of its daughter nucleus, 56Co(Section III.A); γ rays from the radioactive decays scatteroff thermal electrons to produce a population of fast (non-thermal) electrons, which in turn cause the excitation ofhelium. The spectra of SNe Ic do not have strong heliumlines, either because the outer helium layer is absent (inwhich case the outer layers consist mainly of a mixture ofcarbon and oxygen) or because nonthermal excitation isineffective. Otherwise the composition structure of SNe Icresembles that of SNe II and Ib.

The spectral evolution of SNe Ia is quite different. Nearthe time of its maximum brightness the spectrum of a SN Iais dominated by lines of singly ionized silicon, sulfur, andcalcium. Weeks later the spectrum becomes dominated byoverlapping lines from various ionization states of iron andcobalt. The inferred composition structure is very differentfrom that of ordinary stellar matter; the outer layers of aSN Ia are a mixture of elements of intermediate mass,from oxygen to calcium, and the deeper layers consistmainly of iron and neighboring elements in the periodictable (“iron-peak” elements).

Because the line blending in supernova spectra is so se-vere, the process of extracting information from observedspectra usually entails calculating “synthetic spectra” ofmodel supernovae and comparing them with observation.The parameters of the model are varied until the syn-thetic and observed spectra are in satisfactory agreement;at that point, the physical conditions and composition ofthe model are accepted as an approximation to the actualconditions and composition of the supernova. An exampleof a comparison of a synthetic spectrum and an observedone is shown in Fig. 7.

D. Polarization

Information about the shape of a supernova can be ob-tained by observing the extent to which its light is polar-ized, but the number of supernovae for which polariza-tion measurements have been made is rather small. If asupernova is spherically symmetric, its light will be ob-served to be unpolarized, and this appears to be the case for

Page 220: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

Supernovae 341

FIGURE 7 A synthetic spectrum (dashed line) is compared to an optical spectrum of the Type Ic SN 1994I obtained4 days before the time of maximum brightness. The narrow absorption line near 5900 A was produced by interstellarsodium atoms in the parent galaxy, M51. The synthetic spectrum has excess flux at short wavelengths because notall relevant spectral lines were included in the calculation. [Reproduced by permission from Millard, J., et al. (1999).Astrophys. J. 527, 746.]

SNe Ia. The light from core-collapse supernovae, however,is observed to be polarized, which indicates that core-collapse supernovae are asymmetric. The observed degreeof polarization, which is typically at the level of 1%, indi-cates that if the shape is ellipsoidal, then the axial ratio isabout 1.5.

III. THERMONUCLEAR SUPERNOVAE

A. Nickel and Cobalt Radioactivity

When a star explodes its matter is heated suddenly, butexpansion then begins to cause cooling, approximatelyadiabatically, with the temperature inversely proportionalto the radius. Unless the initial radius of the progenitorstar is much larger than that of the Sun, the ejected mattercools too quickly ever to attain the large (1015 cm), hot(104 K) radiating surface that is required to account forthe peak optical brightness of a supernova. Some contin-uing source of energy must be provided to the expandinggas, to offset partially the adiabatic cooling. For SNe Ia,the continuing energy source is the radioactive decay ofabout 0.6 solar masse of 56Ni, and then its daughter nu-cleus, 56Co. The former has a half life of 6 days and decaysby electron capture with the emission of a γ ray; the meanenergy per decay is 1.7 MeV. Then 56Co decays with a

half-life of 77 days to produce stable 56Fe, the most abun-dant isotope of iron in nature. The 56Co decay is usually byelectron capture with γ -ray emission, but sometimes it isby positron emission. The mean energy per decay is 3.6 eV,with 4%, on average, carried by the kinetic energy of thepositrons.

Numerical hydrodynamical calculations have shownthat the light curve of a SN Ia can be reproduced providedthat a total ejected mass of somewhat more than 1 solarmass includes 0.6 solar mass of 56Ni that was freshly syn-thesized by nuclear reactions during the explosion. Dur-ing the first month after the explosion, the supernova isoptically thick to the γ rays from radioactivity; they areabsorbed and thermalized in the ejected matter and pro-vide the continuing source of energy to keep it hot. As thesupernova expands, the γ rays have a higher probability ofescaping rather than being absorbed. The positrons from56Co decay can be trapped by even very small magneticfields, long enough to deposit their kinetic energy intothe gas by means of collisions before they mutually anni-hilate with electrons to produce more γ rays. The fractionof the γ -ray energy that is absorbed, together with the ki-netic energy of the positrons, powers the tail of the SN Ialight curve. Because the fraction of the γ rays that is ab-sorbed decreases with time, the optical light curve declinesmore rapidly than the 77-day half-life of 56Co.

Page 221: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

342 Supernovae

B. The Fate of Low-Mass Stars

The sites of SNe Ia indicate that they are produced bystars that were born with less than 8 solar masses. Sucha star fuses hydrogen into helium in its core during itsmain-sequence phase, at a temperature of about 107 K.When the hydrogen in the core is exhausted, the star ex-pands its envelope to become a red giant; during this phasethe core contracts until it is able to fuse the helium to amixture of carbon and oxygen, at 108 K. When helium isexhausted, the core contracts again. The carbon–oxygencore would fuse to heavier elements if the temperatureapproached 109 K, but this is not achieved because thedensity in the core becomes sufficiently high (106 g/cm3)that gravity is balanced by the pressure of degenerateelectrons—electrons that strongly resist further compres-sion because their momentum distribution is determinedby the Pauli exclusion principle. The star becomes a sta-ble carbon–oxygen white dwarf that has a radius of onlya few percent of the radius of the Sun, comparable to theradius of Earth. The maximum mass of a white dwarfis 1.4 solar masses—the Chandrasekhar mass. It appearsthat most, if not all, stars that are born with less than 8solar masses manage to lose enough mass during theirred giant phases, by means of stellar winds and planetary-nebula ejection, to become white dwarfs. Thus it is dif-ficult to account for SNe Ia as the explosions of singlestars.

C. Accreting White Dwarfs

A more promising explanation for the origin of SNe Ia ap-peals to the more complicated evolution of stars in closebinary systems. A pair of stars forms, and the more mas-sive one evolves first to become a white dwarf. At somepoint in the evolution of the second star, it expands andtransfers matter to the surface of the white dwarf, eventu-ally provoking it so explode. The time delay between theformation of the binary system and the explosion is de-termined by the initial mass of the less massive star. Thismodel can account for the occurence of SNe Ia in ellipticalgalaxies (the second star has low mass, so binary systemsthat formed long ago are producing SNe Ia now) as wellas for the correlation between the SN Ia rate and the re-cent rate of star formation in spirals (binaries in which theless massive star’s mass is not so low also are producingSNe Ia now). A related possibility is that both membersof the binary form white dwarfs, which gradually spiraltogether because of orbital decay caused by the emissionof gravitational radiation, and eventually merge to tem-porarily assemble a super-Chandrasekhar mass. Whethersuch a configuration would explode as a SN Ia rather than

collapse to a neutron star is not yet known. In this model,the spiral-in phase would add an additional time delaybetween formation and explosion.

Computer simulations of the response of a carbon–oxygen white dwarf to the accretion of matter lead to avariety of outcomes, depending on the initial mass of thewhite dwarf and the composition and rate of arrival ofthe accreted matter. Certain combinations of these param-eters do lead to predictions that correspond well to theobserved properties of SNe Ia. If the accretion rate ofhydrogen-rich matter is of the order of 10−7 solar massesper year, fusion reactions in the accreted layers convert hy-drogen to helium, and then helium to carbon and oxygen,thus producing an increasingly massive carbon–oxygenwhite dwarf. As its mass approaches 1.4 solar masses,the white dwarf contracts and heats until carbon begins tofuse near its center. The ignition of a nuclear fuel in a gasthat is supported by the pressure of degenerate electronscauses a thermonuclear instability, because the increas-ing temperature of the nuclei fails to raise the pressureenough to cause a compensating expansion, while it doescause the fuel to burn at an increasing rate. The innerportions of the white dwarf are incinerated to a state ofnuclear statistical equilibrium, becoming primarily 56Nibecause this is the most tightly bound nucleus that hasequal numbers of protons and neutrons, and there is notenough time for weak interactions to change the proton-to-neutron ratio of the carbon–oxygen fuel. A nuclear flamefront propagates outward through the white dwarf in abouta second, heating and accelerating the matter, and the en-tire star is disrupted. The outer layers, because of theirlower densities, are fused only to elements of intermedi-ate mass, such as magnesium, silicon, sulfur, and calcium(Fig. 8). Thus the thermonuclear white dwarf model canaccount for the SN Ia composition structure that is inferredfrom spectroscopy. The uniformity, or near–uniformity, ofthe amount of mass ejected (1.4 solar masses) is respon-sible for the remarkable observational homogeneity ofSNe Ia.

In this model of a SN Ia, the energy produced by fu-sion, less the energy needed to unbind the star, results ina final kinetic energy of about 1051 erg. However, for rea-sons given earlier, the explosion becomes optically brightonly because of the smaller amount of energy that is re-leased gradually by the radioactive decay of 56Ni and56Co. In short, energy released promptly by nuclear fu-sion explodes the star, while energy released slowly byradioactivity makes the explosion shine. Because the finalcomposition of a SN Ia is largely iron and neighboringelements in the periodic table, SNe Ia make an importantcontribution to the amount of iron-peak elements in theuniverse.

Page 222: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

Supernovae 343

FIGURE 8 The calculated composition of an exploded white dwarf, displayed as element mass fraction plottedagainst ejection velocity. The fraction of the ejected mass that is interior to each velocity is shown at the top. Thisfigure is for 32 days after the explosion; the composition below 10,000 km/sec changes with time owing to radioactivedecays, and most of the cobalt eventually decays to iron. [Reproduced with permission from Branch, D., Doggett, J. B.,Nomoto, K., and Thielemann, F. K. (1985). Astrophys. J. 294, 619.]

IV. CORE-COLLAPSE SUPERNOVAE

A. Supergiant Progenitors

SNe II, SNe Ib, and SNe Ic occur in spiral arms and there-fore come from massive stars. In its late evolution, a starhaving an initial mass in the range of 8 to about 30 solarmasses becomes a red supergiant, developing a compact,dense, heavy-element core surrounded by an extendedhydrogen-rich envelope that may swell to 1000 times thesolar radius, approaching 1014 cm. If an instability in thecore suddenly releases a large amount of energy, a shockwave may form and propagate outward, heating and eject-ing the star’s envelope. As it expands, the envelope coolsnearly adiabatically. Nevertheless, because the initial ra-dius of a red supergiant is so large, the envelope can attainthe large, hot radiating surface that is required to pro-duce the peak optical brightness of a supernova, withoutrequiring a delayed input of energy from radioactivity.Hydrodynamical computer simulations of the response ofa red supergiant to a sudden release of energy at its cen-ter predict light curves and other properties that are ingood agreement with observations of SNe II-P, the mostcommon kind of SN II. The light-curve plateau is a phaseof diffusive release of thermal energy deposited in the

envelope by the shock. To account for the observed peakluminosity, the expansion velocity, and the duration of theplateau phase, the progenitor needs to have a large radiusand eject roughly 10 solar masses that carries a kinetic en-ergy of 1051 erg. After the plateau phase, the light-curvetail is powered by 56Ni and 56Co decay. The γ rays areefficiently trapped and thermalized by the large ejectedmass of a SN II-P, so the decline rate of the optical tailcorresponds closely to the 56Co half-life of 77 days. Toaccount for the light curve of a SN II-L, the ejected massshould be only a few solar masses, and as in a SN Ia,γ rays increasingly escape so the tail declines somewhatfaster than the 77-day half-life.

The progenitor of a SN Ib is a massive star that loses itshydrogen-rich envelope prior to core collapse, either bymeans of a stellar wind (if the initial mass is more thanabout 30 solar masses) or by mass transfer to a binarycompanion star. A star that lacks a hydrogen envelopehas a smaller radius than a red supergiant, so the SN Iblight curve, like that of a SN Ia, is powered primarilyby radioactive decay. Because the amount of 56Ni that isejected by an SN Ib is typically only 0.1 solar mass, aSN Ib is not as bright as a SN Ia. Similar statements applyto the progenitors of SNe Ic. Some of the SN Ic progenitorsmay be bare carbon–oxygen cores that lose most or all of

Page 223: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

344 Supernovae

their helium layers prior to collapse, and others may ejecthelium but lack the nonthermal excitation that is neededto produce helium lines in optical spectra.

B. Collapse and Explosion

The basic explosion mechanism of SNe II, Ib, and Ic is thesame. The sudden energy release at the center of the pro-genitor star is caused by the inevitable gravitational col-lapse of its highly evolved core. In massive stars, carbon isnondegenerate when it ignites, and it fuses nonexplosivelyto a mixture of oxygen, neon, and magnesium. Subsequentnuclear burning of these and heavier elements is unableto disrupt the increasingly tightly bound stellar core, andthe core eventually becomes iron (Fig. 9). Iron is the ulti-mate nuclear “ash,” from which no nuclear energy can beextracted. Instead, high-energy photons characteristic ofthe high core temperature photodisintegrate the iron en-dothermically, decreasing the energy and pressure in thecore and causing it to collapse at a fraction of the speed oflight, from the size of a planet to a radius of 50 km. Duringthe collapse electrons are forced into nuclei to form a fluidof neutrons, and as the density exceeds the nuclear density(more than 1014 g cm−3) the pressure rises enormously toresist further compression.

The collapse releases a gravitational potential energyof more than 1053 erg. The resulting temperatures and

FIGURE 9 The calculated composition of a star of 15 solar masses, just before core collapse. The amount of massinterior to each point is plotted on the horizontal axis; note the change of scale at 4.5 solar masses. [Courtesy ofS. E. Woosley, University of California at Santa Cruz, and T. A. Weaver, Lawrence Livermore National Laboratory.]

densities are so high that neutrinos are created copiously,and because they interact with matter only weakly, theydiffuse out of the core in seconds and carry away almost allof the energy. To produce a supernova, however, about 1%of the gravitational energy must somehow go into ejectingthe matter outside the core. Just how this occurs is not yetunderstood. One possibility is a “prompt” hydrodynamicalexplosion. Owing to its infall velocity, the collapsing coreovershoots its equilibrium density, rebounds (“bounces”),transfers energy mechanically to matter that is falling inmore slowly from above, and produces an outgoing shockwave. The energy in the shock tends to be consumed, how-ever, in photodisintegrating heavy elements just outsidethe core, causing the shock to stall before it can eject theouter layers of the star. A related possibility is a “delayed”hydrodynamical explosion, in which the small fraction ofneutrinos that are absorbed in the matter just outside thecore deposits enough energy to revive the stalled shock.Or it may be that the explosion mechanism is fundamen-tally different, e.g., a magnetohydrodynamical event inwhich the increasingly rapid rotation and the amplifyingmagnetic field of the collapsing core combine to causethe ejection of relativistic jets of matter along the rotationaxis, which drive shocks into the envelope and cause ejec-tion. This could be the reason for the significant departuresfrom spherical symmetry that are implied by the observedpolarization of the light from core collapse supernovae. It

Page 224: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

Supernovae 345

also could be related to the association of at least somecore-collapse supernovae with at least some γ -ray bursts,which became apparent with the spatial and temporal co-incidence of the peculiar Type Ic SN 1998bw and theγ -ray burst of April 25, 1998. In SN 1998bw and sev-eral other core-collapse events, the kinetic energy carriedby the ejected matter appears to have been more than 10times that of a typical supernova. Such “hyperenergetic”events may be produced by the collapse of the cores ofvery massive stars (more than 30 solar masses) to formblack holes of about 7 solar masses.

Shock waves raise the temperature of the deep layers ofthe ejected matter high enough to cause fusion reactions.As a result of this “explosive nucleosynthesis” the inner-most layers of the ejected matter are fused to 56Ni and otherother iron-peak isotopes, and the surrounding layers arefused to elements of intermediate mass. The low-densityouter layers of the progenitor star—whether hydrogen, he-lium, or carbon and oxygen—are not fused. Core-collapseevents are the main producers of intermediate-mass ele-ments in nature, and they also make a significant contri-bution to the production of iron-peak elements.

Whether the collapsed star ends as a neutron star or ablack hole depends on whether its final mass exceeds themaximum mass of a neutron star.

C. Circumstellar Interaction

The progenitor of a core-collapse supernova loses massby means of a stellar wind. For a steady-state wind thathas a constant velocity and a constant mass-loss rate, thedensity of the circumstellar matter is proportional to themass-loss rate, inversely proportional to the wind velocity,and inversely proportional to the square of the distancefrom the star. The wind velocity ordinarily is compara-ble to the escape velocity from the photosphere of thestar—of the order of 10 km/sec for a red supergiant and1000 km/sec for a blue one—so the circumstellar mat-ter of a red supergiant is much more dense than that ofa blue one. When a core-collapse supernova occurs, boththe ejected matter and the radiated photons interact withthe circumstellar matter. The violent collision between thehigh-velocity ejecta and the circumstellar matter is calleda hydrodynamic interaction. The interaction between thesupernova photons and the circumstellar matter is called aradiative interaction. These interactions can have observ-able consequences in various parts of the electromagneticspectrum.

Hydrodynamic circumstellar interaction has been de-tected most often by observations at radio wavelengths.The spectrum of the radio emission indicates that it is syn-chrotron radiation from electrons that are accelerated torelativistic velocities in a very hot (up to 109 K) interaction

region. The higher the density of the circumstellar mat-ter, the longer the radio “light curve” takes to rise to itspeak, because initially the circumstellar matter outside theinteraction region, which has been ionized by the X-rayand ultraviolet flash, absorbs the radio photons by free–free processes. Thermal X-rays from the hot interactionregion also have been detected in a smaller number ofcases. The fraction of a supernova’s total luminosity thatis emitted in the radio and X-ray bands ordinarily is verysmall, but observation of the radio and X-ray emissioncombined with modeling of the interaction provides valu-able information on the mass-loss rate of the progenitorstar. Rates in the range of 10−6 to 10−4 solar masses peryear have been inferred. Optical light radiated from theinteraction region is responsible for the slow decay rateof the light curves of SNe IIn. Unlike most supernovae,which seldom are followed observationally for more thana year after explosion, some circumstellar-interacting su-pernovae are observed decades after explosion, at optical,radio, and X-ray wavelengths.

Radiative interactions are manifested most clearly byrelatively narrow emission and absorption lines that ap-pear in optical spectra (which cause the supernova to beclassified Type IIn) and, especially, in ultraviolet spectra[which can only be obtained from above Earth’s atmo-sphere with instruments such as the Hubble Space Tele-scope (HST ); Fig. 10]. The widths of the circumstellarlines provide information on the wind velocity of the pro-genitor, and analysis of the line strengths provides infor-mation on the relative abundances of the elements in thewind. The ionization state of the circumstellar matter alsoprovides information on the amount of ionizing radiationthat was emitted by the explosion, something that is notobservable directly.

Some supernovae emit excess infrared radiation, wellbeyond the amounts expected from their photospheres, bythermal radiation from cool (1000 K), small (10−5 cm)solid particles (dust grains) in the circumstellar medium.The grains are heated by absorbing optical and ultravioletradiation from the supernova and respond by emitting inthe infrared. Infrared observations provide information onthe nature and spatial distribution of the dust grains.

D. Supernova 1987A

SN 1987A was discovered during the predawn hours ofFebruary 24, 1987, as a star of the fifth apparent magnitudein place of a previously inconspicuous twelfth-magnitudestar known as Sanduleak (Sk) −69 202 (star number 202near −69 declination in a catalog of stars published byN. Sanduleak in 1969). Sk −69 202 was the first super-nova progenitor star whose physical properties could bedetermined from observations that had been made prior to

Page 225: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

346 Supernovae

FIGURE 10 Ultraviolet spectra of the Type IIn SN 1998S obtained with the Hubble Space Telescope. [Courtesy ofP. Challis, Harvard–Smithsonian Center for Astrophysics and Supernova INtensive Study (SINS) team.]

its explosion. It was a supergiant star of spectral type B3,with an effective temperature of 16,000 K and a luminos-ity 105 times that of the Sun. The luminosity indicates thatthe mass of its core of helium and heavier elements was 6solar masses, which in turn implies that the initial mass ofthe star was about 20 solar masses. By the time of the ex-plosion, presupernova mass loss had reduced the mass toabout 14 solar masses. From the luminosity and effectivetemperature, the radius of Sk −69 202 was inferred to be40 times that of the Sun.

As discussed above, intrinsically luminous SNe II arethe explosions of red supergiants, which have very largeradii. It is well understood why an explosion of a lessextended star such as Sk −69 202 would produce a sub-luminous supernova (see Section IV.A). A more difficultquestion is, Why did Sk −69 202 explode as a relativelycompact blue supergiant rather than as a very extendedred one? Observations of the circumstellar matter ofSN 1987A indicate that its progenitor did go through a redsupergiant phase, but it ended some 40,000 years prior tothe explosion. There are several possible reasons why theprogenitor star contracted and heated to become a less ex-tended blue supergiant, including the possibility that theevolutionary history of Sk −69 202 involved a merger oftwo stars in a close binary system.

Within days of the optical discovery of SN 1987A, itwas realized that a neutrino burst had been recorded byundergound neutrino detectors in Japan and the UnitedStates, a few hours before the optical discovery. Because

neutrinos interact only weakly with matter (each person onEarth was harmlessly perforated by 1014 neutrinos fromSN 1987A), only a total of 18 neutrinos, with arrival timesspread over 12 sec, was detected with certainty. The totalneutrino energy emitted by SN 1987A was 2 × 1053 erg,and the mean neutrino energy was 12 MeV, as expected,so the neutrino signal confirmed the basic theory of corecollapse. It also provided valuable information for particlephysics, including an upper limit of 16 eV to the mass ofthe electron neutrino. The 12-sec duration of the neutrinoburst suggested that the core initially formed a neutronstar, because black-hole formation would have terminatedthe burst earlier. However, the neutron star has not yet beendetected directly. It is possible that “fall-back” of matterthat was ejected very slowly has caused the neutron starto collapse to a black hole.

The spectrum of SN 1987A has been observed farinto the nebular phase, and over a very broad range ofwavelengths. Analysis of optical and infrared spectra ofSN 1987A has established that, as expected, the composi-tion varied from primarily hydrogen and helium in the out-ermost layers to mainly heavier elements in the innermostlayers. The brightness during the tail of the light curveindicated that the amount of ejected 56Ni was 0.07 solarmass. The optical and infrared spectra, as well as the un-expectedly early leakage of γ rays and X-rays (producedby γ rays that transferred much of their energy to elec-trons), indicated that a sustantial amount of compositionmixing (light elements into the deeper layers and heavier

Page 226: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

Supernovae 347

elements, including 56Ni, into the outer layers) took place.The element mixing process still is not well understood.An excess of infrared radiation that began to develop ayear after explosion indicated that a small fraction of theheavy elements in the ejected matter had condensed intodust grains.

SN 1987A showed various signs of early circumstellarinteraction, at ultraviolet and radio wavelengths, but the in-teraction was weak because the wind of a blue supergiant isfast and the circumstellar density is correspondingly low.After a few years, however, it became clear that SN 1987Awas at the center of a ring of higher density matter, whichappears elliptical on the sky but actually is circular, andtilted with respect to our line of sight by 42. The matterin the ring, which is expanding at 10 km/sec, presum-ably originated from the slow, dense wind of Sk −69 202when it was in its red supergiant phase. The reason thatit is confined to a ring rather than a spherical shell is notwell understood. The later discovery of two larger, fainterrings (Fig. 11), as well as additional complexity in thecircumstellar environment, is suggestive of a complicatedbinary-star history of Sk −69 202 involving interactingstellar winds. The time variation of ultraviolet emission

FIGURE 11 A Hubble Space Telescope image of SN 1987A (center), its inner ring, and its two fainter outer rings.[Reproduced by permission from Burrows, C. J., et al. (1995). Astrophys. J. 452, 680.]

lines emitted by ions in the inner ring, as they recombinedfollowing ionization by the supernova radiation, indicatedthat the radius of the ring is 0.6 light year; when com-bined with the observed angular size of the ring (0.9 secof arc), this provides a valuable independent confirmationthat the distance to the LMC is 50 kpc. In the late 1990s,the first signs of an imminent collision between the high-velocity supernova ejecta and the slowly expanding innerring began to appear. This interaction will cause the emis-sion from SN 1987A to increase dramatically, across theelectromagnetic spectrum, and provide an opportunity toprobe further the distribution of the circumstellar matterof SN 1987A.

V. SUPERNOVA REMNANTS

The matter ejected by a supernova sweeps up interstellargas. After a time of the order of a century, the mass of theswept-up matter becomes comparable to the mass of theejected matter, and the latter begins to decelerate, with en-ergy being converted from kinetic into other forms. Thus,a hot region of interstellar interaction develops, and an

Page 227: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

348 Supernovae

FIGURE 12 A radio image of the shell-type remnant of Kepler’s supernova of 1604. The image was made at awavelength of 20 cm using the Very Large Array radiotelescope at the National Radio Astronomy Observatory.[Reproduced by permission from Matsui, Y., et al. (1984). Astrophys. J. 287, 295.]

extended “shell-type” supernova remnant (SNR) forms.The remnants of the supernovae of 1006, 1572 (TychoBrahe’s supernovae), 1604 (Johannes Kepler’s), and 1680(Cas A) are well-observed examples of shell-type SNRs(Fig. 12).

Galactic supernova remnants are among the brightestradio sources on the sky. The emission is synchrotron ra-diation produced by relativistic electrons that are accel-erated in the hot interaction region. Because radio wavesare unaffected by interstellar dust grains, radio SNRs areuseful for statistical studies of the spatial distribution andrate of supernovae in the Galaxy. Thermal X-rays also aredetected from some SNRs, but most SNRs are not detectedin this way because of absorption by interstellar matter;the X-ray spectra provide important information on thecomposition of the ejected matter.

The famous Crab Nebula, the remnant of the super-nova of 1054, is the prototype of a less common kindof supernova remnant, called a “filled center” SNR, ora plerion (from the Greek word for full). The emissionfrom the Crab includes spectral lines from matter that isexpanding at only about 1000 km/sec. Another compo-nent of emission, which extends from the radio throughthe optical to the X-ray region, is nonthermal synchrotronradiation. This indicates the presence of highly relativis-tic electrons and the need for continued energy input intothe central regions of the remnant; the source of this is apulsar that rotates 30 times per second and generates therelativistic electrons. Searches for signs of high-velocity

matter associated with the Crab Nebula have not yet beensuccessful.

Some SNRs also can be detected in relatively nearbyexternal galaxies, but because of their small angular sizesthey cannot be studied in as much detail.

VI. SUPERNOVAE AS DISTANCEINDICATORS FOR COSMOLOGY

A. The Extragalactic Distance Scale

The universe is expanding. The radial velocity of a galaxyrelative to us is proportional to the distance of the galaxyfrom us; thus the cosmic expansion can be represented bythe “Hubble law”: v = H0d, where v is the radial veloc-ity (ordinarily expressed as km/sec), d is the distance (asMpc), and H0 is the Hubble constant (as km/sec/Mpc) atthe present epoch. The reciprocal of the Hubble constantgives the Hubble time—the time since the Big Bang originof the expansion, assuming deceleration to have been neg-ligible. Radial velocities are easily measured, so determin-ing the value of H0 reduces to the problem of measuringdistances to galaxies. In practice, because galaxies haverandom motions, it is necessary to determine distances tofairly remote galaxies, at distances of hundreds of mega-parsecs. Measuring the distances to supernovae providesan alternative to the classical approach of estimating thedistances to galaxies themselves.

Page 228: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

Supernovae 349

There are several ways to estimate distances to super-novae that require some physical understanding of theevents but that have the advantage of being independent ofall other distance determinations in astronomy. For exam-ple, the observed properties of SNe Ia indicate that theyeject about 0.6 solar mass of 56Ni, so the peak luminos-ity of an SN Ia can be calculated from the decay rate of56Ni and compared with the observed flux to yield the dis-tance. Alternatively, the luminosity of a supernova can becalculated from its temperature (as revealed by its spec-trum) and the radius of the photosphere (as the product ofthe expansion velocity and the time elapsed since explo-sion). Both methods usually have given longer distancesand therefore lower values of H0, near 60 km/sec/Mpc,than most other methods.

Another important approach takes advantage of thenear homogeneity of the peak absolute magnitudes ofSNe Ia, to use them as “standard candles” for distancedeterminations. The top panel in Fig. 13 illustrates thenear-homogeneity, and the bottom panel shows how wellSNe Ia can be standardized even further by correcting thepeak magnitudes with the help of an empirical relation

FIGURE 13 A “Hubble diagram” for a sample of well-observedSNe Ia. Top: The apparent visual magnitude of the supernovaat maximum brightness is plotted against the logarithm of theparent-galaxy radial velocity (km/sec). If SNe Ia had identical peakabsolute magnitudes, the points would fall on the straight line.Bottom: A correction for a relation between the peak magnitudesand the light-curve decay rate has been applied, to reduce thescatter about the straight line. [Reproduced by permission fromHamuy, M., et al. (1996). Astron. J. 112, 2398.]

between the peak brightness and the rate of decline of thelight curve. The HST has been used to find Cepheid vari-able stars in nearly a dozen galaxies in which SNe Ia wereobserved in the past. The period–luminosity relation forCepheids gives the distance to the parent galaxy; the dis-tance and the apparent magnitude of the supernova givesthe absolute magnitude of the supernova (MV = −19.5);and that absolute magnitude, when assigned to SNe Ia thatappear in remote galaxies, gives their distances. In thisway, H0 has been determined to be about 60 km/sec/Mpc,in good agreement with the supernova physical methods.The corresponding Hubble time is 17 Gyr.

B. Deceleration and Acceleration

The self-gravity of the mass in the universe causes a de-celeration of the cosmic expansion. If the matter densityand the deceleration exceeded a critical amount, the ex-pansion eventually would cease, and be followed by con-traction and collapse. A way to determine the amount ofdeceleration that actually has occurred is to compare theexpansion rate in the recent past, based on observations ofthe “local” universe, to the expansion rate in the remotepast, based on extremely distant objects (“high-redshift”objects, because a large distance means large radial ve-locities and Doppler shifts toward longer wavelengths).Because SNe Ia are such excellent standard candles, andbecause they are so luminous that they can be seen at highredshifts, they are suitable for this task. This has beenthe main motivation for searching for batches of remoteSNe Ia, beginning in the mid-1990s. Results based on thefirst dozens of high-redshift SNe Ia to have been discov-ered (which actually occurred deep in the past, when thedistances between galaxies were only one-half to two-thirds of what they are now) indicate that deceleration isinsufficient to bring the expansion to a halt. Evidently, theuniverse will expand forever. In fact, the SN Ia data indi-cate that although the cosmic expansion was deceleratingin the past, it is accelerating now. An acceleration wouldrequire the introduction of Einstein’s notorious “cosmo-logical constant,” or some other even more exotic agentthat would oppose the deceleration caused by gravity. Theacceleration is inferred from the observation that high-redshift SNe Ia are 20 to 50% dimmer, observationally,than they would be in the absence of acceleration. Ob-servations of many more SNe Ia, and an improved under-standing of the physics of SNe Ia, are needed to draw afirm conclusion that acceleration really is occurring.

VII. PROSPECTS

Our understanding of supernovae began to develop onlyafter about 1970. Observations with modern detectors,

Page 229: Encyclopedia of Physical Science and Technology - Stars and Stellar Systems

P1: GNH/MAG P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN016A-754 March 26, 2002 11:22

350 Supernovae

primarily in the optical region but increasingly in otherwavelength bands as well, provided high-quality data forcomparison with the results of detailed numerical models.A basic understanding of the nature of supernova spectraand light curves, and detailed computer simulations of theexplosions, emerged from the interplay between observa-tion and theory. SN 1987A attracted much attention to thestudy of supernovae. During the 1990s dramatic progresswas made in using SNe Ia to measure the Hubble constantand to probe the cosmic deceleration, and the discoveryof an apparent acceleration has caused excitement amongcosmologists and particle physicists, and even more inter-est in supernovae.

Further important observational advances are forth-coming. Ground-based optical astronomy will contributeautomated searches with supernova-dedicated telescopes,to increase the discovery rate greatly. New powerfulinstruments such as the Next Generation Space Telescope(the NGST, successor to the HST ), and perhaps even a largeorbiting supernova-dedicated observatory, will provideobservations of both nearby and very remote events. Newsensitive instruments also will detect large numbers ofneutrinos, and perhaps even gravitational waves, from su-pernovae in our Galaxy and the nearest external galaxies.The flow of observational data will stimulate further theo-retical modeling. Improved physical data will be fed intonew generations of fast computers to produce increasinglyrealistic models of the explosions. Astronomers will inten-

sify their efforts to understand better which kinds of starsproduce supernovae, and how they do it; to evaluate moreprecisely the role of supernovae in the nuclear evolution ofthe cosmos; and to exploit supernovae to probe the historyand future expansion of the universe with ever-greater pre-cision. These are grand goals—and they can be achieved.

SEE ALSO THE FOLLOWING ARTICLES

COSMOLOGY • GAMMA-RAY ASTRONOMY • GRAVITA-TIONAL WAVE ASTRONOMY • INFRARED ASTRONOMY •NEUTRINO ASTRONOMY • NEUTRON STARS • STELLAR

SPECTROSCOPY • STELLAR STRUCTURE AND EVOLUTION

• ULTRAVIOLET SPACE ASTRONOMY

BIBLIOGRAPHY

Arnett, D. (1996). “Supernovae and Nucleosynthesis,” Princeton Uni-versity Press, Princeton NJ.

Barbon, R., Buondi, V., Cappellaro, E., and Turatto, M. (1999). Astron.Astrophys. Suppl. Ser. 139, 531.

Branch, D. (1998). Annu. Rev. Astron. Astrophys. 36, 17.Filippenko, A. V. (1997). Annu. Rev. Astron. Astrophys. 35, 309.Mann, A. K. (1997). “Shadow of a Star: the Neutrino Story of Supernova

1987A,” W. H. Freeman, San Francisco.Ruiz–Lapuente, P., Canal, R., and Isern, J. (eds.) (1997).“Thermonuclear

Supernovae,” Kluwer, Dordrecht.