

© 2007 The Paleontological Society. All rights reserved. 0094-8373/07/3301-0005/$1.00

Paleobiology, 33(1), 2007, pp. 98–115

Inferring phenotypic evolution in the fossil record by Bayesian inversion

Bjarte Hannisdal

Abstract. This paper takes an alternative approach to the problem of inferring patterns of phenotypic evolution in the fossil record. Reconstructing temporal biological signal from noisy stratophenetic data is an inverse problem analogous to subsurface reconstructions in geophysics, and similar methods apply. To increase the information content of stratophenetic series, available geological data on sample ages and environments are included as prior knowledge, and all inferences are conditioned on the uncertainty in these geological variables. This uncertainty, as well as data error and the stochasticity of fossil preservation and evolution, prevents any unique solution to the stratophenetic inverse problem. Instead, the solution is defined as a distribution of model parameter values that explain the data to varying degrees. This distribution is obtained by direct Monte Carlo sampling of the parameter space, and evaluated with Bayesian integrals. The Bayesian inversion is illustrated with Miocene stratigraphic data from the ODP Leg 174AX Bethany Beach borehole. A sample of the benthic foraminifer Pseudononion pizarrensis is used to obtain a phenotypic covariance matrix for outline shape, which constrains a model of multivariate shape evolution. The forward model combines this evolutionary model and stochastic models of fossil occurrence with the empirical sedimentary record to generate predicted stratophenetic series. A synthetic data set is inverted, using the Neighbourhood Algorithm to sample the parameter space and characterize the posterior probability distribution. Despite small sample sizes and noisy shape data, most of the generating parameter values are well resolved, and the underlying pattern of phenotypic evolution can be reconstructed, with quantitative measures of uncertainty. Inversion of a stratigraphic series into a time series can significantly improve our perception and interpretation of an evolutionary pattern.

Bjarte Hannisdal. Department of the Geophysical Sciences, University of Chicago, 5734 South Ellis Avenue, Chicago, Illinois 60637. E-mail: [email protected]

Accepted: 28 August 2006

Introduction

Reconstructions of tempo and mode (the rate and pattern) of phenotypic evolution from the fossil record are necessarily based on indirect and highly incomplete data (Simpson 1944; Eldredge and Gould 1972; Gingerich 1983; Hendry and Kinnison 2001; Roopnarine 2003). Species-level patterns are usually derived from stratophenetic series, composed of skeletal shape measurements from fossil samples arranged in stratigraphic succession (e.g., Gingerich 1976; Malmgren et al. 1983; Reyment 1985; Cheetham 1986). Taphonomic modifications, low temporal resolution, and restricted geographic coverage generally limit fossil samples to providing only indirect measures of population variance and per-generation rates of evolution. Information on trait heritability and reaction norms from extant representatives is rarely available. In addition, observed stratophenetic patterns can be confounded by the environmental and temporal heterogeneity of the sedimentary sequences yielding the fossils, and by ubiquitous sampling biases, frustrating attempts to quantify tempo and mode using conventional methods such as hypothesis tests based on univariate random-walk statistics (Sheets and Mitchell 2001; Hannisdal 2006).

Analysis and interpretation of stratophenetic data depend on assumptions about the ecology and evolution of the organism, and about the environmental history recorded in the sedimentary deposits. These assumptions are often described qualitatively, but can equally well be expressed probabilistically as parameters of a stochastic model. The analysis of stratophenetic data is thus fundamentally an inverse problem: given an observed stratophenetic series, available geological information, and a model mapping underlying evolutionary parameters to observed fossil data, what are the most plausible parameters of evolution supported by the data, and how uncertain is our reconstruction of the evolutionary pattern?

This paper presents a method for solving this kind of stratophenetic inverse problem. The geological context is provided by paleoenvironmental and chronostratigraphic data on Miocene shallow-marine sequences from a sedimentary core. An empirical phenotypic covariance matrix is derived from a sample of benthic foraminifera and used to constrain a model of multivariate phenotypic evolution. A simulated evolutionary trajectory is combined with stochastic models of fossil occurrence and filtered through the empirical sedimentary record to generate synthetic stratophenetic data. These synthetic data are then run through an inversion procedure to see whether the generating parameters of phenotypic evolution can be recovered. The inversion is a simultaneous multiparameter estimation that uses Monte Carlo (i.e., random number) techniques to sample the parameter space, searching for parameter values that generate data-fitting predictions. Bayesian inference on the entire set of sampled parameter values is then used to characterize the posterior probability distribution and retrieve measures of uncertainty as well as the degree to which the data are able to resolve evolutionary parameters. The following sections describe each step of the inversion method, and readers interested in the technical details and equations are referred to the corresponding sections of the online supplementary material at dx.doi.org/10.1666/06038.s1.

Specifying Prior Information

An important part of the inversion is to specify the initial constraints and uncertainty in model parameters and data. This is needed to avoid solutions (i.e., data-fitting models) that are unreasonable or incompatible with what we know about the real system. All the observed data are recorded stratigraphically (i.e., as a function of depth in core), but phenotypic evolution must be modeled as a function of time (i.e., number of generations). This, in addition to the stochastic nature of both fossil preservation and evolution models, severely inflates the non-uniqueness of the inverse problem, because a large number of different temporal patterns can give rise to predicted stratophenetic patterns that fit the observations equally well. We need to limit the space of plausible models or hypotheses by giving higher probability to those that agree more closely with prior knowledge. Such knowledge can be based on observations that are not part of the stratophenetic data (e.g., geological data on the age and environment of the samples), general principles (e.g., a species will be more abundant in its preferred environment), and knowledge of the plausible range of values a parameter can take (e.g., directional selection cannot be so strong as to produce biologically impossible shapes). We can quantify this information in terms of probability distributions, referred to as the prior probability distribution.

Stratigraphic Information. This study uses data from the Ocean Drilling Program (ODP) Leg 174AX borehole at Bethany Beach, Delaware (Miller et al. 2003; Browning et al. 2006). The Bethany Beach sedimentary core includes over 350 m of Miocene strata, encompassing 12 third-order depositional sequences. The Neogene stratigraphy of the U.S. mid-Atlantic continental margin is characterized by a series of forestepping sequences with well-developed clinoforms that extend more than 100 km across the margin (Greenlee et al. 1992; Miller and Sugarman 1995; Steckler et al. 1999). ODP targeted these Neogene sequences and supplemented offshore sites by drilling the coastal plain in Leg 150X (Miller and Mountain 1996) and more recently in Leg 174AX (Miller et al. 1998). Available data from Bethany Beach include sedimentary grain-size distributions, bio-, chrono-, and sequence stratigraphy, Sr isotopic age estimates, facies analysis, and paleobathymetry estimates. This provides a high-quality geological framework for a paleontological study of within-lineage evolutionary change in a shallow-marine setting.

FIGURE 1. Miocene stratigraphic data from ODP Leg 174AX Bethany Beach, Delaware, and synthetic stratophenetic data. Formation and sequence names follow Browning et al. (2006). A, Cumulative percentage of the three siliciclastic fractions; m = mud, fs = fine sand, cs = medium and coarser sand. Blank intervals indicate core sections not recovered. B, Mean grain size calculated from the data in A (intervals of missing section are interpolated for visual purposes). Horizontal lines correspond to sequence boundaries. C, Paleobathymetric estimates based on litho- and biofacies. D, Simulated sample sizes. Samples smaller than five specimens are not included in the analysis. E, PC1 scores from a principal component analysis of the sample mean φ* shape functions. F, Age-depth plot for the Miocene section based on Sr isotopic analysis of mollusk shells. Open circles are interpreted as diagenetically altered or reworked samples. G, Reconstructed shapes based on the mean φ* shape function of each sample. H, Evolutionary trajectories of the population mean score on each of the first four principal components. The generating S* parameter values (μ, σ pairs) are listed above the panel, scaled by a factor of 10⁴. Data in A, C, and F from Miller et al. (2003).

Figure 1 shows the subset of the Bethany Beach data used here. The grain-size distribution (Fig. 1A) is calculated from the cumulative percentages within the siliciclastic sediments only (i.e., excluding glauconite, shells, mica, and other constituents as reported by Miller et al. 2003). The mean grain size (Fig. 1B) is estimated by taking the average absolute grain size of a sample with the given proportions of grain-size fractions (i.e., mud, fine sand, and coarse sand, Fig. 1A), and assuming grains are uniformly distributed within the absolute size range of each fraction (e.g., 0–63 μm for mud). Facies associations in the Bethany Beach Miocene section range from fluvial/upper estuarine sands through upper/lower shoreface sands to offshore muds. These facies suggest a wave-dominated shoreline, a setting that is sensitive to relative sea-level changes, and the upsection facies changes tend to follow transgressive-regressive patterns (Miller et al. 2003). Water depths (Fig. 1C) are interpreted on the basis of lithofacies and/or established benthic foraminiferal biofacies, and range from the shoreline to 80 m depth, with errors from ±10 m (inner neritic) to ±30 m (outer neritic) (Pekar et al. 2001; Van Sickel et al. 2004). Oxygen isotope studies have shown a close correspondence between δ¹⁸O increase (a proxy for glacioeustatic lowering) and New Jersey sequence boundaries (Sugarman and Miller 1997; Miller et al. 2005).
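The mean grain-size estimate described above can be sketched in a few lines. This is a minimal illustration rather than the procedure actually used for Figure 1B: only the 0–63 μm mud range is stated in the text, so the two sand ranges below, the dictionary name SIZE_RANGES_UM, and the function name mean_grain_size are assumptions made for the example.

```python
# Size range of each siliciclastic fraction, in micrometres.
# Only the mud range (0-63) comes from the text; both sand ranges are assumed.
SIZE_RANGES_UM = {
    "mud": (0.0, 63.0),
    "fine_sand": (63.0, 250.0),      # assumed bounds
    "coarse_sand": (250.0, 2000.0),  # assumed bounds
}


def mean_grain_size(fractions):
    """Weighted mean grain size (micrometres) from fraction proportions.

    `fractions` maps fraction names to percentages (or any proportions);
    they are renormalized, so glauconite, shells, etc. can simply be left out.
    """
    total = sum(fractions.values())
    if total <= 0:
        raise ValueError("fractions must sum to a positive number")
    mean = 0.0
    for name, share in fractions.items():
        low, high = SIZE_RANGES_UM[name]
        # A uniform distribution on [low, high] has its mean at the midpoint.
        mean += (share / total) * 0.5 * (low + high)
    return mean
```

Under the uniform assumption each fraction contributes the midpoint of its size range, so a pure mud sample comes out at 31.5 μm regardless of the assumed sand bounds.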

The Bethany Beach age model supplements biostratigraphy with a Sr isotope chronostratigraphy based on mollusk shells (Miller et al. 2003). Figure 1F shows the Sr age-depth plot for the Miocene section. The large standard errors (±0.4 to ±1.2 Myr) on the sample ages play an important part in quantifying the uncertainty associated with inferring temporal patterns from stratigraphic data. By including the uncertainty in sample ages and bathymetric estimates as part of the prior information, all inferences are conditioned on this uncertainty.

FIGURE 2. Outline-based shape quantification. A, P. pizarrensis (SEM), lower Calvert Formation (lower Miocene), ODP Leg 174AX Bethany Beach, Delaware. The peripheral outline in side view is digitized in two segments defined by two landmarks: (1) the intersection of the final chamber wall with the periphery and (2) the aperture. Thus, the aperture always corresponds to the same element of the shape vector. B, φ* shape functions based on 120 points along the outline going counterclockwise from landmark 1 (sample from lower Calvert Formation, n = 173). The mean and covariance matrix of this sample is the basis for the phenotypic evolution model. C, The first four “eigenshapes,” derived from a singular value decomposition of the shape functions (B). Each eigenshape is scaled by its eigenvalue, the proportional value of which is listed on the right. The first eigenshape gives an estimate of the sample mean shape. D, Sample outlines with the estimated mean shape (white) superimposed.

Phenotypic Covariance. Nonionid benthic foraminifera of the species Pseudononion pizarrensis are very common in inner to middle neritic facies in the Miocene sequences of the mid-Atlantic shelf (Miller et al. 2003). Because of its abundance in the core samples, P. pizarrensis is the target of a stratophenetic study in progress. Details on the morphology, taxonomy, paleoecology, and evolution of this species will thus be addressed in a future paper. Here I use the shape of the peripheral outline of the test in side view to build an evolutionary test case with known parameters to assess the performance of the inversion method. Clearly, the peripheral outline measures only limited aspects of overall morphology, and suffers from a lack of ontogenetic control in forms with indeterminate growth. Nevertheless, the outline is able to capture subtle changes in chamber shape and growth pattern, and recent studies show that external shape can be used to distinguish “cryptic” species in benthic foraminifera previously interpreted as ecophenotypes (e.g., Hayward et al. 2004).

Each outline is first digitized in two segments defined by two landmarks on the periphery (Fig. 2A), and then transformed into a normalized shape function φ* (Zahn and Roskies 1972) based on net angular deviation between successive points along the outline (Fig. 2B, eq. A.5). P. pizarrensis ranges through to the present, and ideally, samples covering the geographic range of the extant species could be used as an independent source of within-species phenotypic covariance. Here a sample of 173 P. pizarrensis from the lower Calvert Formation in the Bethany Beach core is used to calculate an initial mean shape and an empirical phenotypic covariance matrix for the outline shape functions. An estimate of the sample mean shape is obtained by singular value decomposition of the matrix of φ* shape functions (Fig. 2C, eq. A.6) (see eigenshape analysis [Lohmann 1983; MacLeod 1999]). The phenotypic covariance matrix is used to constrain the multivariate phenotypic evolution model described below. The sample mean shapes and stratophenetic series used to evaluate the performance of the inversion technique are thus synthetic sample means (Fig. 1G) generated by evolving the initial empirical mean shape according to a set of evolutionary parameters, constrained by the empirical phenotypic covariances.
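The shape quantification above can be illustrated with a short sketch. It is not a reproduction of eqs. A.5 and A.6, which are in the supplementary material: it assumes a single closed outline traversed counterclockwise with roughly equally spaced points (rather than the two-landmark segments used in the paper), and the function names phi_star and eigenshapes are hypothetical.

```python
import numpy as np


def phi_star(xy):
    """Normalized cumulative turning-angle function of a closed outline.

    xy: (n, 2) array of outline points, counterclockwise, roughly equally
    spaced. The linear trend of a circle is subtracted, so a circle maps to
    a function that is zero everywhere.
    """
    xy = np.asarray(xy, dtype=float)
    edges = np.roll(xy, -1, axis=0) - xy             # edge k: point k -> k+1
    heading = np.arctan2(edges[:, 1], edges[:, 0])
    turn = np.diff(heading, append=heading[:1])      # angular change per vertex
    turn = (turn + np.pi) % (2.0 * np.pi) - np.pi    # wrap into [-pi, pi)
    n = len(xy)
    return np.cumsum(turn) - 2.0 * np.pi * np.arange(1, n + 1) / n


def eigenshapes(shape_functions):
    """SVD of a (specimens x points) matrix of shape functions.

    Returns the eigenshapes (rows of vt) and the singular values. For a
    sample of similar shapes, the first eigenshape is proportional to an
    estimate of the mean shape function.
    """
    _, singular_values, vt = np.linalg.svd(
        np.asarray(shape_functions, dtype=float), full_matrices=False
    )
    return vt, singular_values
```

With the shape functions stacked row by row, an empirical covariance matrix of the kind described in the text could be obtained with np.cov(shape_functions, rowvar=False).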

Model Prior. The model prior probability distribution used here is a uniform distribution over the range of each parameter (Table 1). Parameter ranges are determined by exploring the upper and lower limits for which the forward model produces realistic predictions. Given that the upper and lower bounds are the only constraints we have on the model parameters, a uniform distribution imposes the least amount of additional information on the prior (Jaynes 2003), and represents a constant over the parameter space.

TABLE 1. Forward model parameters and their ranges.

Parameter   Min. value    Max. value    Definition
mu1         −5 × 10⁻⁵     5 × 10⁻⁵      Directionality* PC1
mu2         −5 × 10⁻⁵     5 × 10⁻⁵      Directionality PC2
mu3         −5 × 10⁻⁵     5 × 10⁻⁵      Directionality PC3
mu4         −5 × 10⁻⁵     5 × 10⁻⁵      Directionality PC4
sig1        0             0.005         Variability* PC1
sig2        0             0.005         Variability PC2
sig3        0             0.005         Variability PC3
sig4        0             0.005         Variability PC4
Nmax        1 × 10⁵       5 × 10⁵       Maximum abundance†
prob        5 × 10⁻⁴      0.008         Preservation probability†
pd          10            80            Preferred depth (m)
dt          5             50            Depth tolerance (m)
pg          10            200           Preferred grain size (μm)
gt          10            300           Grain size tolerance (μm)
pres        50            400           Peak preservation (Nmax × prob)

* Directionality and variability refer to the mean and standard deviation of the distribution of selection differentials projected into principal component space (see Fig. 3 and eq. A.12).

† The absolute scaling of Nmax and prob is arbitrary, because their product, pres, is the parameter of interest.

The prior thus represents what we know about the evolutionary parameters before observing the stratophenetic data. The goal of the inversion is to update this prior in light of the data and thereby narrow the range of plausible parameter values. In the Bayesian approach, Bayes’s theorem is used to measure the impact of new data on the prior probability, resulting in a posterior probability distribution (PPD), which is conditional on both data and prior information (see online supplementary material, eqs. A.1–A.4).
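A uniform prior of this kind is simple to express in code. The sketch below takes its bounds from Table 1 but is otherwise an assumption: Nmax and prob are omitted in favor of their product pres, and the names PRIOR_RANGES, sample_prior, and log_prior are hypothetical. The Neighbourhood Algorithm used in the paper is not reproduced here; the sketch only shows independent draws from the prior and its constant log density.

```python
import math

import numpy as np

# Parameter bounds as listed in Table 1. Nmax and prob are left out because
# only their product, pres, is treated as a parameter of interest.
PRIOR_RANGES = {
    "mu1": (-5e-5, 5e-5), "mu2": (-5e-5, 5e-5),
    "mu3": (-5e-5, 5e-5), "mu4": (-5e-5, 5e-5),
    "sig1": (0.0, 0.005), "sig2": (0.0, 0.005),
    "sig3": (0.0, 0.005), "sig4": (0.0, 0.005),
    "pd": (10.0, 80.0), "dt": (5.0, 50.0),
    "pg": (10.0, 200.0), "gt": (10.0, 300.0),
    "pres": (50.0, 400.0),
}


def sample_prior(rng):
    """Draw one parameter set uniformly within the Table 1 bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PRIOR_RANGES.items()}


def log_prior(params):
    """Log density of the uniform prior: constant inside the box, -inf outside."""
    total = 0.0
    for name, (lo, hi) in PRIOR_RANGES.items():
        if not lo <= params[name] <= hi:
            return -math.inf
        total -= math.log(hi - lo)
    return total
```

Because the prior is constant inside the bounds, the posterior within that region is proportional to the likelihood alone; the prior acts only by excluding parameter values outside the box.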

The Forward Model

The forward model consists of a set of stochastic parameters that describe (1) the abundance and preservation and (2) the multivariate phenotypic evolution of a benthic foraminifer. Combined with the geological data from the sedimentary core, the model generates predicted stratophenetic series comprising the number of preserved fossils and the multivariate mean shape for each sample.

Species Abundance and Preservation. The model assumes that a benthic species’ habitat preference and abundance during life are associated with water depth and substrate sediment grain size, either directly for physical/biomechanical reasons or indirectly as a result of disturbance or nutrient availability, and further that the association between abundance and environment is Gaussian, with a single peak representing an optimal combination of environmental properties. Holland (1995, 2000) modeled the probability of fossil occurrence as a Gaussian response curve using three parameters: preferred depth, depth tolerance and maximum abundance. The same parameters are used here, but the curve is expanded into a bivariate Gaussian by adding substrate response parameters (preferred grain size and grain size tolerance; Table 1). When scaling the peak of the curve to the maximum abundance, this model predicts the abundance of the species on the basis of grain sizes and estimated water depth of a given sample (eq. A.7). The number of individuals preserved as fossils is then determined by the product of the abundance and the per capita preservation probability, which is assumed to represent the intrinsic fossilization potential of the organism as well as the probability of collection. This product enters into the model as the “rate” parameter of a Poisson process (eq. A.8). Thus, the absolute values of the abundance and preservation probability parameters are not important for the purpose of this paper, whereas the value of their product is considered the preservation parameter of interest (Table 1). For more discussion and examples of simulations using a similar model, see Hannisdal (2006).
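The abundance and preservation model can be sketched as follows. Equations A.7 and A.8 are not reproduced in this section, so the exact functional form is an assumption: the tolerances are treated as Gaussian standard deviations, depth and grain size are assumed to act independently (no cross term), and the function names are hypothetical. Parameter names follow Table 1.

```python
import numpy as np


def expected_abundance(depth, grain, pd, dt, pg, gt, nmax):
    """Bivariate Gaussian response to water depth (m) and grain size (um).

    The peak of the surface is scaled to nmax and sits at (pd, pg);
    dt and gt are used as standard deviations (an assumption).
    """
    z2 = ((depth - pd) / dt) ** 2 + ((grain - pg) / gt) ** 2
    return nmax * np.exp(-0.5 * z2)


def preserved_fossils(rng, depth, grain, pd, dt, pg, gt, nmax, prob):
    """Number of preserved specimens in one sample.

    The Poisson rate is abundance times per capita preservation probability,
    so at the environmental optimum the rate equals nmax * prob (pres).
    """
    rate = expected_abundance(depth, grain, pd, dt, pg, gt, nmax) * prob
    return rng.poisson(rate)
```

For example, with a generator from np.random.default_rng(), a sample at the preferred depth and grain size yields counts scattered around nmax * prob, while a sample several tolerances away from the optimum is almost always barren.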

Phenotypic Evolution. Lande (1976, 1979) showed that the multivariate response to selection depends on the additive genetic (heritable) covariances among traits as well as the magnitude and direction of selection acting on the traits. Neither the additive genetic covariance matrix nor the vector of selection coefficients acting on each of the phenotypic traits is available in the fossil record. However, we can estimate the phenotypic covariance matrix, and because the genetic covariance is the product of phenotypic covariance and heritability, we can model the change in multivariate mean shape by treating the product of selection coefficients and heritabilities as a single parameter vector (eq. A.10). This paper takes an approach similar to that described by Polly (2004), who also showed how modeling of multivariate shape evolution can be done more efficiently by using the eigenvectors of the phenotypic covariance matrix to project a multivariate mean shape onto its principal component axes (eqs. A.9–A.12). Thus, a parameter vector of projected selection differentials S* is used to drive phenotypic evolution in principal component shape space. Different modes of evolution are simulated by drawing the elements of the S* vector from different underlying distributions. Furthermore, it is possible to use the eigenvalues associated with each principal component vector to limit the number of elements of S* needed. For instance, in the sample used to generate the phenotypic covariance matrix, the first eigenvector accounts for 92% of the variance (Fig. 2C), and only the first four elements of S* are used. This restriction reduces the dimensionality of the parameter space that needs to be sampled in the inversion.

A probability distribution is assigned to each element of S*, and values are drawn randomly from these distributions every new generation. The projected population mean shape is then updated accordingly, and the result is easily rotated back into the original coordinate space. S* distributions are assumed to be Gaussian and are thus defined by their mean and variance, which become the evolutionary parameters of interest. Along each principal component axis, directional selection occurs when the S* distribution has a non-zero mean. Randomly fluctuating selection occurs with zero mean and non-zero variance. Drift can occur if both the mean and variance are zero, whereby any change from one generation to the next is due to sampling effects of a relatively small effective population size. The model is easily expanded to include stabilizing selection and speciation (e.g., Polly 2004), but these processes are not implemented here. Furthermore, the current implementation assumes no connection between morphology and the observed environmental variables, although this can also be included as a reaction norm and/or via environmentally induced selection pressures. Figure 3 is an example of how multivariate shape evolution can be modeled by using these parameters. For each of the four principal components, S* values are drawn from the probability densities shown in Figure 3A–D. Figure 3E–G shows the initial population in principal component space (gray), the population after 100 generations (black), and the directional trend in population mean phenotype (white on black line). The reconstructed initial and final mean shapes are superimposed to show the shape change involved (Fig. 3H). In this example, the final mean shape shows a less rounded chamber wall and more acute angles at the aperture and at the intersection of the chamber wall with the periphery. The task of inferring the mode of phenotypic evolution is equivalent to estimating the directionality and variability parameters (mean and variance) of the S* distributions.
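The generation-by-generation update in principal component space can be illustrated with a short sketch. It is a simplification made for this example, not eqs. A.9–A.12: each generation the mean score on each retained axis is shifted by one Gaussian draw of S*, with the Table 1 parameters mu and sig used as the mean and standard deviation, and sampling drift is left out. The function names are hypothetical.

```python
import numpy as np


def evolve_pc_displacement(rng, mu, sig, n_generations):
    """Trajectory of the population mean in PC space, relative to the start.

    mu, sig: per-axis mean and standard deviation of the S* distribution.
    Returns an (n_generations, n_axes) array of cumulative displacements.
    """
    mu = np.asarray(mu, dtype=float)
    sig = np.asarray(sig, dtype=float)
    steps = rng.normal(mu, sig, size=(n_generations, mu.size))
    return np.cumsum(steps, axis=0)


def displacement_to_shape(initial_mean_shape, eigenvectors, displacement):
    """Rotate a PC-space displacement back into the original shape space.

    eigenvectors: (n_points, n_components) matrix with eigenvectors of the
    phenotypic covariance matrix in its columns; only the first
    len(displacement) columns are used.
    """
    k = len(displacement)
    return np.asarray(initial_mean_shape, dtype=float) + eigenvectors[:, :k] @ displacement
```

In this sketch a non-zero mu produces a directional trend, while mu equal to zero with non-zero sig produces an unbiased random walk, matching the distinction between directional and randomly fluctuating selection drawn in the text.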

FIGURE 3. Multivariate phenotypic evolution model. A–D, Gaussian probability density functions for the S* selection differentials projected into principal component space, on the first four PC axes. The parameters (μ and σ) defining the Gaussians determine the tempo and mode of evolution. E–G, Principal component scores of the initial population (gray), and the final population (black), after 100 generations. The population mean shape trajectory (white on black line) shows a persistent directional trend in PC space. H, The initial and final mean shapes transformed back into original coordinate space.

Synthetic Stratophenetic Series. Given the Miocene stratigraphic data from the Bethany Beach borehole and outline shape covariance from a sample of P. pizarrensis, the forward model described above generates predicted stratophenetic data (Fig. 1). In this case the parameter values are chosen to simulate a middle neritic, muddy-substrate species subject to directional selection along the first four principal component axes. The model runs successively through the 206 core samples. The first step of a model run is to assign the age and water depth of a sample by picking values randomly from a Gaussian distribution defined by the estimated mean and standard error values of the paleobathymetric and age estimates (Fig. 1C,F). Samples positioned between estimated points have their mean assigned by linear interpolation and their variance derived from the maximum error estimates. An ordering criterion is used to ensure that sample ages are not reversed. Errors in the mean grain size are assumed to be negligible. Given the water depth and grain size of the sample, an abundance value is calculated and passed to the preservation model to determine the number of preserved fossils. Note that because only a single core is used here, only local abundance is modeled, which means that the total population size does not vary in response to habitat availability (see Hannisdal 2006). The next step is to update the population mean phenotype by running the evolution model over the number of generations (years) between samples. Despite large fluctuations in selection coefficients on a generational time scale (the variance of the S* distribution is orders of magnitude larger than the mean), the number of generations between each sample is generally large enough to cause a relatively persistent net trend in the population mean shape. Finally, preserved shapes are found by sampling the preserved number of individuals from the population distribution along each principal component axis and calculating the sample mean shape.
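The first step of a model run, assigning sample ages under uncertainty, can be sketched as below. Several details are assumptions because the text does not specify them: every sample is given the largest standard error among the dated points, and the ordering criterion is implemented by sorting the drawn ages, which is only one of several ways to prevent age reversals. The function name and arguments are hypothetical.

```python
import numpy as np


def draw_sample_ages(rng, sample_depths, dated_depths, dated_ages, dated_se):
    """Draw one age per sample, with ages never decreasing downcore.

    sample_depths: depths of the core samples, in increasing order.
    dated_depths, dated_ages, dated_se: depth, mean age, and standard error
    of the dated points (depths increasing, ages increasing with depth).
    """
    sample_depths = np.asarray(sample_depths, dtype=float)
    # Mean age of each sample by linear interpolation between dated points.
    mean_age = np.interp(sample_depths, dated_depths, dated_ages)
    # Assumption: use the maximum error estimate for every sample.
    se = np.full_like(mean_age, np.max(dated_se))
    ages = rng.normal(mean_age, se)
    # Assumed ordering criterion: sort so that deeper samples are never younger.
    return np.sort(ages)
```

Water depths could be drawn in the same way from the paleobathymetric estimates and their errors, without the ordering step.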

The result is an “observed” stratophenetic series of 80 preserved samples, their sample sizes (Fig. 1D), and the corresponding sample mean φ* shape functions (reconstructed as outlines in Figure 1G). Most of the shape change throughout the series is associated with the angles at the two landmark positions (Fig. 2A), which are linked to the curvature of the final chamber wall and the rate of increase in chamber height through the last whorl (analogous to whorl expansion rate in mollusk shells). The sampled shape change is most easily visualized stratigraphically as PC1 scores from a principal component analysis of the matrix of sample mean shapes (Fig. 1E). Despite the fairly persistent directionality of the generating model (Fig. 1H) and an overall shift apparent in the fossil series, small sample sizes cause the fossil shapes to show large fluctuations in PC1 scores. The question we want to answer is, Given these 80 sample mean shapes, the sample sizes, and the geological data (Fig. 1A–G), to what extent can we reconstruct the underlying evolutionary pattern (Fig. 1H)?

Defining Data-Model Misfit

The next step is to specify how probable the data are, given a particular model and prior information, via a scaled misfit function. Note that the number of model parameters is fixed; only their values vary. The problem of varying the number of parameters (model selection) is discussed below. Given a set of model parameters, the forward model generates a predicted stratophenetic series, which is compared with the observed stratophenetic series. The data vector includes the number of fossils per sample (most samples are barren) and the mean sample shape function for each sample. The misfit function needs to take into account the uncertainties in the data, which are associated with (1) systematic errors in obtaining data from fossils that are actually present in a sample (e.g., discarded specimens) and (2) errors in outline shape digitization caused by misalignment of specimens (orientation error). The first component is poorly defined, but a standard error of ±5 specimens is used here as an uncertainty envelope on the fossil sample sizes. The second component is estimated from digitizing a set of 100 images of the same specimen subjected to random tilt angles and azimuths on the SEM mount. Tilt angles are drawn from a Gaussian distribution with a mean of zero and a variance of 25, emulating mounting error caused by the relief of the P. pizarrensis test. This provides an estimate of the sensitivity of each point along the outline to orientation error. The data noise vector thus contains both sample size error and shape error, and the data-model misfit involves the sum of separate misfit measures scaled differently for the two data types (eq. A.13). This allows the sample sizes to indicate whether fluctuations in the shape data are likely to be “noise” rather than “signal.”
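A two-part scaled misfit of this kind might look as follows. This is only a sketch under stated assumptions: eq. A.13 is not reproduced in this section, so a simple least-squares form is assumed for both terms, and the function name, argument names, and the default shape noise of one are hypothetical.

```python
import numpy as np

def scaled_misfit(n_obs, n_pred, shape_obs, shape_pred,
                  sigma_n=5.0, sigma_shape=None):
    """Sum of two scaled least-squares misfits (assumed form, not eq. A.13)."""
    n_obs = np.asarray(n_obs, dtype=float)
    n_pred = np.asarray(n_pred, dtype=float)
    shape_obs = np.asarray(shape_obs, dtype=float)
    shape_pred = np.asarray(shape_pred, dtype=float)
    if sigma_shape is None:
        # Placeholder noise level for each point along the outline.
        sigma_shape = np.ones(shape_obs.shape[1])
    # Sample-size term: residuals scaled by the +/-5 specimen envelope.
    size_term = np.sum(((n_obs - n_pred) / sigma_n) ** 2)
    # Shape term: only samples with fossils in both series can be compared.
    both = (n_obs > 0) & (n_pred > 0)
    resid = (shape_obs[both] - shape_pred[both]) / sigma_shape
    shape_term = np.sum(resid ** 2)
    return size_term + shape_term
```

Scaling each residual by its own noise level puts sample counts and outline coordinates on a common, dimensionless footing before they are summed.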

With data, model, prior information and a misfit function in place, we have the elements needed to form the PPD that defines the solution to the inverse problem (eq. A.4). However, the relationship between data and model (the forward relation) does not exist in analytical form as a mathematical function. Thus, to solve the necessary integrals over the parameter space, we need to use Monte Carlo methods to sample the PPD directly.

Sampling the Parameter Space

The stratophenetic inverse problem does not have a solution in the classical sense. Instead we are seeking a family of parameter values and the degree to which they match the observed data. Efficient deterministic algorithms (e.g., gradient methods such as steepest descent) cannot be used here because they require the derivatives of a mathematical forward model function with respect to the model parameters. Thus, a stochastic algorithm is needed that explores the parameter space exhaustively through a large number of realizations of the model. This paper uses the Neighbourhood Algorithm (NA) developed by Sambridge (1999a) to sample the parameter space. The NA is a type of Monte Carlo direct-search algorithm like genetic algorithms and simulated annealing, but uses a different approach briefly outlined below. For a full description of the NA, see Sambridge (1999a, 2001).

FIGURE 4. Ensemble of 10⁵ models sampled by the Neighbourhood Algorithm (NA). The shading represents data-model misfit (white = best fit, not including the white background), illustrating how the NA concentrates sampling in data-fitting regions of the parameter space. Each panel corresponds to a unique pair of parameters, and the axes represent the parameter ranges (Table 1). The optimal model and the true model are each indicated by a marker symbol.

The NA sampling algorithm can be distilled into four basic steps: (1) sample an initial set of ns parameter values uniformly distributed in parameter space; (2) compute the data misfit for the current sample of ns parameter values and find the subset of nr values with the lowest misfit; (3) sample ns new values by performing a uniform random walk in the nearest-neighbor region (Voronoi cell) of each of the nr chosen samples; (4) return to step 2. The NA thus guides the search in parameter space by using previously sampled parameter values (for which the forward problem has been solved). An approximate misfit function is constructed by setting the misfit value to a constant within each Voronoi cell, and the NA uses this approximation (instead of actual forward modeling) to generate the next samples. Parameter values are evaluated only on the basis of rank misfit, independent of the absolute scale of the misfit values, and several computational advantages of the NA stem from its simple underlying geometric concept. Only two tuning parameters are required, ns and nr, and the performance of the algorithm depends essentially on the ratio ns/nr. The NA is used to generate a large ensemble of parameter values that preferentially sample data-fitting regions of the parameter space, and the information necessary to solve the inverse problem is extracted from this ensemble.
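The four steps can be sketched in a few lines of Python. This is a simplified illustration written for this text, not Sambridge's code: it assumes that the nr best samples are chosen from all samples generated so far, that each walk consists of one sweep over the axes starting from the cell's own sample, and that parameters are comparably scaled; all function names are hypothetical.

```python
import numpy as np

def cell_interval(points, k, x, axis, lo, hi):
    """Segment of the line through x along `axis` lying inside Voronoi cell k."""
    diff = points - x
    diff[:, axis] = 0.0
    d_perp = np.sum(diff ** 2, axis=1)   # squared distance ignoring `axis`
    pk = points[k, axis]
    dp = points[:, axis] - pk
    with np.errstate(divide="ignore", invalid="ignore"):
        # Coordinate at which the line is equidistant from point k and point j.
        t = (points[:, axis] ** 2 - pk ** 2 + d_perp - d_perp[k]) / (2.0 * dp)
    upper = min(hi[axis], np.min(t[dp > 0], initial=np.inf))
    lower = max(lo[axis], np.max(t[dp < 0], initial=-np.inf))
    return lower, upper

def walk_in_cell(points, k, lo, hi, rng):
    """One axis-by-axis uniform random-walk sweep inside the cell of points[k]."""
    x = points[k].copy()
    for axis in range(x.size):
        lower, upper = cell_interval(points, k, x, axis, lo, hi)
        if upper > lower:  # guard against round-off in degenerate cells
            x[axis] = rng.uniform(lower, upper)
    return x

def neighbourhood_search(misfit, lo, hi, ns=20, nr=5, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    points = rng.uniform(lo, hi, size=(ns, lo.size))     # step 1
    values = np.array([misfit(p) for p in points])
    for _ in range(n_iter):                              # step 4: repeat
        best = np.argsort(values)[:nr]                   # step 2: rank by misfit
        new = np.array([walk_in_cell(points, best[i % nr], lo, hi, rng)
                        for i in range(ns)])             # step 3
        points = np.vstack([points, new])
        values = np.concatenate([values, [misfit(p) for p in new]])
    return points, values
```

Because only the ranking of misfit values enters step 2, rescaling the misfit function by any monotonic transformation leaves the sampled ensemble unchanged.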

With the data from Figure 1 as the observed stratophenetic data, including the geological prior information, and the 14-dimensional parameter space defined in Table 1, the NA can search for parameter values that fit the data. Setting the tuning parameters to ns = 200 and nr = 40 and running 500 iterations produces an ensemble of 10⁵ parameter space samples (Fig. 4). Each panel of Figure 4 corresponds to a unique pair of model parameters, with each dot representing a sampled parameter value and the shading representing the misfit value (white = best fit to the data). The optimal (best-fit) model and the true generating model are each marked with a separate symbol. For most parameters the NA tends to concentrate sampling in regions at or near the true parameter value, most strikingly in the case of the directionality parameters (e.g., Fig. 4A,B). For some parameters the sampling misses the true value (e.g., sig2; Fig. 4C), and/or fails to concentrate sampling very well (e.g., Fig. 4F). Note that for the abundance (Nmax) and preservation probability (prob) parameters, the data-fitting models tend to fall along a line containing both the optimal and the true model (Fig. 4E). This line simply corresponds to the best-fit value of the product of the two parameters, which is the preservation parameter (pres). The next step in solving the inverse problem is to characterize and retrieve information from this ensemble.

Ensemble Inference

The goal of the inversion is to make quantitative inferences about the temporal pattern of phenotypic change from a large set of alternative model parameter values with their corresponding fits to the observed stratophenetic data. In the Bayesian approach, commonly used measures of information are based on properties of the PPD and take the form of integrals over the multidimensional parameter space (see online supplementary material) that need to be evaluated using Monte Carlo integration techniques (Gelman et al. 1995). Ubiquitous computing power has made the Bayesian solution increasingly tractable in geophysical inverse problems (e.g., Gouveia and Scales 1998; Malinverno 2002; Mosegaard and Sambridge 2002; Gunning and Glinsky 2004), and similar approaches are used in phylogenetic reconstruction (e.g., Huelsenbeck et al. 2001) and ecosystem modeling (e.g., Dowd and Meyer 2003).

Numerical estimates of the Bayesian integrals are obtained using multidimensional Monte Carlo integration, based on summing over a large number of discrete samples in model space. In general, an ensemble of parameter values sampled using a search algorithm (e.g., Fig. 4) will follow an unknown distribution and cannot be used directly for Monte Carlo integration. However, the Voronoi diagram of nearest-neighbor cells already constructed by the NA search algorithm can be used to interpolate between points in a multidimensional space (Sambridge 1999b). Thus, the PPD can be replaced by a “neighborhood approximation” constructed from the input ensemble, and a resampled ensemble generated by importance-sampling the approximate PPD. A standard technique for importance-sampling is the Gibbs sampler, which generates a random walk in model space whose distribution asymptotically tends toward any prescribed distribution (Gelman et al. 1995). This requires no further solving of the forward problem, and effectively reduces numerical estimates of Bayesian integrals to simple averages over the resampled ensemble (Sambridge 1999b; Sambridge and Mosegaard 2002).
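The resampling idea can be sketched as a Gibbs sampler on the nearest-neighbor approximation. The sketch below is illustrative only and differs from the NA-Bayes programs: it evaluates each one-dimensional conditional on a regular grid rather than from the exact intersections of the axis with the Voronoi cells, and the function names, grid size, and starting point are assumptions.

```python
import numpy as np

def na_density(points, post, x):
    """Neighbourhood approximation: posterior value of the nearest ensemble point."""
    d2 = np.sum((points[None, :, :] - x[:, None, :]) ** 2, axis=2)
    return post[np.argmin(d2, axis=1)]

def gibbs_resample(points, post, lo, hi, n_steps=500, n_grid=200, seed=0):
    """Random walk whose samples follow the approximate PPD (grid-based sketch)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    x = points[np.argmax(post)].copy()       # start the walk at the best-fit sample
    edges = [np.linspace(lo[a], hi[a], n_grid + 1) for a in range(lo.size)]
    out = np.empty((n_steps, lo.size))
    for step in range(n_steps):
        for a in range(lo.size):
            # Conditional along axis a, all other coordinates held fixed.
            mids = 0.5 * (edges[a][:-1] + edges[a][1:])
            line = np.tile(x, (n_grid, 1))
            line[:, a] = mids
            w = na_density(points, post, line)
            j = rng.choice(n_grid, p=w / w.sum())
            x[a] = rng.uniform(edges[a][j], edges[a][j + 1])
        out[step] = x
    return out
```

No call to the forward model appears anywhere in the loop, and a Bayesian integral such as the posterior mean reduces to `out.mean(axis=0)` over the resampled ensemble.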

One immediate PPD property of interest is the posterior probability distribution of each parameter considered separately (i.e., the marginal distributions, eq. A.16; Gelman et al. 1995). If the parameter space has been adequately sampled, the shapes of the marginal distributions give a visual representation of the accuracy and precision of the parameter estimates, reflecting the amount of information in the data, and parameter sensitivity. Figure 5 shows the 1-D marginal distributions for each of the parameters, based on resampling the ensemble in Figure 4 using the NA-Bayes (NAB) programs developed by Sambridge (1999b). A total of 2 × 10⁵ samples were generated by 200 random walks, each 1000 steps long and starting from a different point in the ensemble. The maximum abundance (Nmax) and preservation probability (prob) parameters have been combined into a single parameter (pres). Its posterior is sharply peaked (Fig. 5M; note scaling of axes). The top four panels in Figure 5 correspond to the directionality parameters, and all show a peak and a posterior mean value (dotted line) close to the true value (solid line). The second row of panels corresponds to the variability parameters, some of which have strongly skewed marginals, suggesting a peak at zero, with a posterior mean that overestimates the true value. In some cases, skewed marginals have a posterior mean value very close to the true value (e.g., Fig. 5H,K). Figure 6 shows the 2-D marginals for selected pairs of parameters.


FIGURE 5. 1-D marginal posterior distributions obtained by resampling the NA ensemble (Fig. 4). Vertical solid lines show the true parameter values. Dotted lines show the posterior mean values. Plots A–L are all normalized to the same area, with the abscissa spanning the parameter range. In panel M the parameter range of “pres” is truncated and the ordinate scaled to emphasize the shape of the marginal. The prior distribution for each parameter is uniform (flat).

Overall, the shapes of these posteriors indicate that the parameters can be resolved quite well by the stratophenetic data. Peaks are well defined and close to the position of the true values. Dependencies between parameters are suggested by the shape in some cases (e.g., between preferred depth and depth tolerance, Fig. 6D). In terms of inferring the pattern of phenotypic evolution, the marginals imply an evolutionary mode of weak to moderate positive and persistent directional selection along all four PC axes, which is in agreement with the generating model (Fig. 1H).

FIGURE 6. 2-D marginal posterior distributions. Axes correspond to parameter ranges, except for “pres” (F, ordinate), where the parameter range is truncated as in Figure 5. Solid lines represent 0.1 (inner), 0.5, and 0.95 (outer) confidence contours. Shading is scaled to the peak of each plot to enhance shapes, so shading levels are not comparable between plots. The true model is indicated by a marker symbol.

FIGURE 7. Posterior model correlation matrix. Shading is saturated at ±0.6 to enhance the off-diagonal elements (weakly to moderately correlated parameters).

FIGURE 8. Model resolution matrix. Shading is saturated at ±0.5 to enhance the off-diagonal elements, regardless of their sign. A perfectly resolved parameter will have a black diagonal and white off-diagonal elements.

We also want to look closer at associations and trade-offs between parameters. For example, in order to match a given number of preserved individuals in a particular sample, an increase in the peak abundance parameter will require a corresponding decrease in the preservation probability parameter, and we would therefore expect the parameters Nmax and prob to be negatively correlated in the posterior. This relationship is also suggested by the “slope” of their ensemble distribution (Fig. 4E). Note that these are not empirical correlations and do not suggest that more abundant organisms have lower per-capita preservation probability. Parameter covariance information is obtained by calculating a posterior model covariance matrix (eq. A.17), where the diagonal elements represent the “standard errors” in the parameters, and the off-diagonal elements contain information about the covariance between them. Because the parameters differ greatly in type and dimension, parameter covariance is best visualized as a model correlation matrix, calculated by standardizing each element by the posterior standard errors (Fig. 7). This matrix shows the expected negative posterior correlation between maximum abundance and preservation probability. As mentioned above, the product of these two parameters is treated as a single preservation parameter (pres), and this transformed parameter is consequently positively correlated with the original parameters. Another distinct feature is the positive correlation between preferred depth (pd) and depth tolerance (dt), which is also suggested by their ensemble distribution (Fig. 4F) and their 2-D marginal posterior (Fig. 6D). This dependency may arise as a side effect of the forward model behavior. For example, as high pd values (around 50) approach the limit of the observed depth values (Fig. 1C), a corresponding increase in misfit can be counteracted by increasing dt. Predicted abundance will then be generated by the observed depth values falling in the left tail of the Gaussian density defined by pd and dt, without any contribution to the misfit by the right tail. No such dependency occurs in the grain-size response parameters. Minor features such as the moderate negative correlation between the variability parameters sig2 and sig3 are less easily interpreted.
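The Nmax and prob trade-off can be reproduced with a toy calculation. The sketch below assumes a resampled ensemble stored as rows of parameter vectors and uses made-up numbers, not values from this study, for a posterior in which only the product of the two parameters is constrained; the function name is hypothetical.

```python
import numpy as np

def posterior_correlation(samples):
    """Posterior covariance and correlation from resampled parameter vectors (rows)."""
    cov = np.cov(samples, rowvar=False)
    sd = np.sqrt(np.diag(cov))          # posterior standard deviations
    return cov, cov / np.outer(sd, sd)  # standardize each element

# Toy posterior: the data constrain only the product Nmax * prob.
rng = np.random.default_rng(1)
nmax = rng.uniform(50.0, 200.0, size=5000)   # poorly constrained on its own
pres = rng.normal(20.0, 1.0, size=5000)      # well-resolved product (made-up value)
prob = pres / nmax
cov, corr = posterior_correlation(np.column_stack([nmax, prob, pres]))
```

In this toy case `corr[0, 1]` is strongly negative even though neither parameter is well resolved individually, which is the same kind of non-empirical trade-off described in the text.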

The shapes of the marginal posteriors (Fig. 5) indicate how well the stratophenetic data are able to resolve the parameters. A more compact summary of this information is found by using the prior and posterior covariance matrices to calculate a model resolution matrix (Fig. 8, eq. A.18). This matrix indicates how well the model parameters can be independently resolved by the data and can be thought of as a “filter” through which we see the “true” parameters. The closer the resolution matrix approaches the identity matrix (one along the diagonal, zero elsewhere), the closer we approach the “truth” (Tarantola 1987). Conversely, if the information in the data is insufficient to constrain the value of a parameter, the posterior will be less different from the prior, and the resolution value will “leak” into the off-diagonal elements. In our case, the transformed preservation parameter is sharply resolved, and the directionality parameters (mu1–mu4) are also well resolved. The variability parameters, particularly on PC2 and PC3, are influenced by other parameter estimates, which is also the case for the familiar Nmax and prob, as well as the depth response parameters. Note that the posterior covariance matrix depends on the calculation of the posterior mean parameter values (eq. A.17). If the PPD is Gaussian then the mean value is located at the peak of the PPD (e.g., Fig. 5A), but in non-linear problems the PPD is often more complex (e.g., Fig. 5E), and interpretation of the correlation and resolution matrices may be less straightforward (Taran­tola 1987).
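A resolution matrix of this kind can be computed from the resampled ensemble in a few lines. The sketch assumes the form R = I − C_prior⁻¹ C_post, which is a common definition following Tarantola (1987); eq. A.18 itself is not reproduced in this section, so the match is an assumption. It also assumes uniform priors, for which the prior covariance is diagonal with entries (range)²/12; the function name is hypothetical.

```python
import numpy as np

def resolution_matrix(post_samples, lo, hi):
    """R = I - inv(C_prior) @ C_post for uniform priors on [lo, hi] (assumed form)."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    c_prior = np.diag((hi - lo) ** 2 / 12.0)      # variance of a uniform prior
    c_post = np.cov(post_samples, rowvar=False)   # posterior covariance from samples
    return np.eye(lo.size) - np.linalg.solve(c_prior, c_post)
```

A parameter whose posterior is much narrower than its prior gives a diagonal value near one, whereas a parameter whose posterior is as wide as its prior gives a value near zero.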

FIGURE 9. Ensemble-based inferred pattern of phenotypic evolution. Plots A–D correspond to the temporal pattern of population mean shape change along the first four PC axes (cf. Fig. 3). Solid line represents the true pattern (Fig. 1H), dotted line represents the ensemble mean trajectory, and dashed lines are an approximate error envelope representing uncertainty in both age (abscissa) and phenotype (ordinate) calculated as two times the standard error in the ensemble of predicted trajectories.

The final goals are reconstructing the temporal pattern of phenotypic evolution and quantifying the uncertainty associated with this reconstruction. Again, these are approached by assuming that all solutions of the forward model carry information, and that the entire model ensemble should be used to characterize the predicted evolutionary pattern. Figure 9 summarizes this information for each of the four principal component axes. The solid line represents the true population mean trajectory (cf. Fig. 1H), the dotted line is the population mean trajectory calculated by averaging all the trajectories in the ensemble, and the dashed lines represent the approximate “error envelope” calculated as two standard errors over the ensemble. Note that the error shows spread on both axes, in order to simultaneously represent uncertainty in both the age and the population mean phenotypic value. The width of the error envelope thus reflects the shape of the marginals for the directionality parameters (Fig. 5A–D) and model stochasticity as well as age model error (Fig. 1F), and the true model falls well within this range in each case. In reality, the uncertainty envelope around the reconstructed pattern will expand and contract through time, as a result of variable uncertainty in the geological prior information and variable preservation through the stratigraphic section. However, because the evolution model used here is time-homogeneous and constrained to start at zero, the ensemble of predicted patterns does not properly reflect the time-varying “confidence.” Future work will focus on an expanded inversion approach to characterizing a time-varying posterior, as discussed below.
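The ensemble summary along one PC axis can be sketched as follows. The sketch assumes that all predicted trajectories have been evaluated on a common time grid and covers only the phenotype (ordinate) spread, not the age uncertainty; it also takes the “standard error” of the text to mean the standard deviation across the ensemble, which is an assumption. The function name is hypothetical.

```python
import numpy as np

def ensemble_envelope(trajectories):
    """Ensemble mean trajectory with a band of two standard deviations.

    trajectories: array of shape (n_models, n_times) for one PC axis.
    """
    trajectories = np.asarray(trajectories, dtype=float)
    mean = trajectories.mean(axis=0)
    spread = 2.0 * trajectories.std(axis=0, ddof=1)
    return mean, mean - spread, mean + spread
```

Extending the band to the age axis would require applying the same summary to the sampled ages of each time step.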

Discussion

Despite sparse fossil data with apparently little information beyond noise, biologically relevant temporal patterns can be retrieved, albeit with considerable uncertainty. The fact that we can resolve directional selection parameters from the series shown in Figure 1E is encouraging, and underscores the distinction between temporal and stratigraphic patterns. From the marginal posterior distributions and the resolution matrix, we find that although the selection variability parameters are less well resolved than the directionality parameters, the tendency toward low variability implies an evolutionary mode of persistent directionality. Note that persistence does not mean constant selection pressure, but represents a statistical property detectable on the scale of millions of generations. This inference is all the more surprising given that the stratophenetic series at face value is dominated by large fluctuations from sample to sample that would seem to swamp any underlying trend. However, the large fluctuations tend to be associated with small sample sizes, and the misfit function treats these values as more likely the result of noise. Thus, when seeking to interpret stratophenetic series as evidence of phenotypic evolution, the pattern we want to explain is not the stratigraphic pattern in Figure 1E directly, but the inferred temporal pattern (Fig. 9), along with the observed shape changes (Fig. 1G). Although the P. pizarrensis test case presented here involves an artificially imposed trend, it shows how inversion of a stratophenetic series into a time series can dramatically alter our perception of the pattern, from random fluctuations to persistent directional evolution.

Inverse problems are encountered in several areas of analytical paleobiology. Examples include inference of taxonomic rates and patterns of origination and extinction (Raup 1991; Foote 2001, 2003); quantitative biostratigraphy (Alroy 1994; Sadler 2001); and likelihood approaches in stratigraphic phylogenetics (Wagner 1998) and stratophenetics (Roopnarine 2005; Hunt 2006). Recent studies also apply formal model selection criteria (Connolly and Miller 2001; Foote 2003, 2005; Hunt 2006). This paper is related to these studies through its basic approach of data-fitting and parameter estimation, although the inversion techniques and terminology used here are more common in the geophysics literature. Thus, the method described here differs in several ways from other approaches to quantifying tempo and mode in the fossil record: (1) Rather than testing the stratigraphic pattern for significant deviations from the expectation of a random null model (Raup and Crick 1981; Bookstein 1987; Gingerich 1993; Roopnarine 2001), the goal is to characterize a family of temporal patterns based on their fit to the observations. (2) Geological information on time and environment is explicitly included in the analysis, and all inferences are conditioned on the uncertainty in this information. (3) The data-model misfit is based on the full multivariate shape as well as preserved sample size, rather than a single character or a univariate summary of multivariate shape (cf. Gingerich 1976; Malmgren et al. 1983; Reyment 1985; Cheetham 1986). This enables the detection of subtle shape variation and the effects of sampling error. Moreover, the misfit takes into account the uncertainty in each element of the data. (4) An ensemble inference approach is adopted, which gathers information from all forward model solutions, in contrast to global optimization, where the solution is defined by the best-fit or maximum-likelihood model. This ensemble approach is preferable in the case of a stratophenetic inverse problem as defined here, where the forward model is highly stochastic and the PPD can have any arbitrary shape. However, like all Monte Carlo direct-search techniques, it is limited by the degree to which the ensemble samples the important regions of the parameter space. (5) The effects of sample size and series length (i.e., the number of samples) on the Bayesian quantities are intuitive: as sampling decreases, the data are less able to constrain parameter values, and the posterior will relax back toward the prior distribution. This behavior is a distinct advantage over conventional tests based on statistical properties of infinitely long time series, where small sample size and series length can cause grievous errors. For instance, many random-walk tests applied directly to the series in Figure 1E will reject the random null in favor of stasis (i.e., analytical stasis; Hannisdal 2006).

Despite these improvements, the present work has some obvious limitations: (1) Only a single core is used, which misses the spatial component of geological and biological heterogeneity. Multiple sections within the basin are needed to estimate geographic variation within the species. Future work will supplement the Bethany Beach core with other ODP Leg 174AX and Leg 150X boreholes forming an onshore-offshore transect across the Miocene shelf. (2) The forward model is time-homogeneous (constant parameter values) and both the phenotypic covariance matrix and the effective population size are held constant through time, which is a particular weakness for modeling phenotypic evolution. A more realistic model would allow selection regimes and drift to vary on shorter time scales (a cynic might point out that a straight line drawn from the first to the last point on Figure 1E would result in an inferred pattern very close to that of Figure 9). Ecophenotypy also needs to be parameterized, either as a direct effect of environment on morphology, or as a correlation between selection pressures and observed environmental variables. (3) In the present case of a synthetically generated pattern, the structure of the generating model is identical to the forward model, which may lead to optimistic measures of resolution. Inversion of a synthetic example is required to verify and evaluate the performance of the method (i.e., its ability to recover a known signal). A forthcoming study will apply Bayesian inversion to observed stratophenetic patterns in P. pizarrensis. Another important aspect of inverting synthetic data lies in gaining a better understanding of the forward model and its deficiencies. For instance, posterior correlations between parameters point to unanticipated behavior, such as the correlations in the depth response parameters. Furthermore, the sharp resolution of the preservation parameter (pres) suggests that the misfit function is very sensitive to this parameter, at the cost of shape information. This results in part from the conscious decision to let the misfit function use sample size to evaluate the noise in the shape data, but a modified misfit function with more realistic estimates of the error in the preserved sample size may achieve a better balance between the two data types. Alternatively, sample mean shape misfit could be explicitly weighted by some function of the observed sample size. The behavior of such alternative misfit functions needs to be explored. (4) Detailed geological information on sample ages and paleodepth is often not available to supplement stratophenetic data, particularly for Paleozoic records. In principle, the prior information on sample ages and environments for a record can always be assigned a probability distribution, but this assignment may not always be desirable. A uniform prior may have unexpected properties in high-dimensional spaces (Backus 1988; Scales and Tenorio 2001), and methods for including complex geological prior information are rarely encountered outside geophysics (but see Curtis and Wood 2004). (5) The approximation of an uncertainty “envelope” in the inferred evolutionary pattern (Fig. 9) does not properly reflect the temporal heterogeneity in geological prior information and fossil preservation. Although this does not reflect any deficiency in the inversion procedure or its applicability, it underscores the importance of a more realistic evolutionary model.

Several of these shortcomings can be addressed by an empirical and analytical expansion of the methods used here. In addition to including data from multiple sedimentary cores, the Bayesian inversion should be generalized to tackle time-heterogeneous models. The stratophenetic inverse problem can be posed at two conceptually different levels. This paper has approached the problem at the first level, which corresponds to finding the values of a given number of model parameters that fit the observations (a parameter estimation problem). In this case, the number of unknowns (parameters) is given, and the problem is of fixed dimensionality. The second level is to find the number of model parameters and corresponding model structure needed to fit the observations (a model selection problem). In this case, the number of unknowns is one of the unknowns, and the dimensionality of the problem is allowed to change. In both cases, however, we would like to know how much the model may vary while fitting the data (Backus 1988). In the Bayesian approach, model selection is theoretically a straightforward expansion of parameter estimation (Bretthorst 1996; Malinverno 2002; Jaynes 2003), but the use of Monte Carlo direct-search techniques requires algorithms that can move between parameter subspaces of different dimensionality (Green 1995). Future work will pursue model selection approaches to stratophenetic inversion that provide time-heterogeneous posterior distributions and more realistic quantification of uncertainty in the evolutionary reconstructions.

Summary

The problem of extracting patterns of phenotypic evolution from stratophenetic series is an inverse problem of estimating temporal parameters from stratigraphic observations. Analogous problems have long been addressed outside of paleontology, such as reconstructing subsurface structure in seismology or tomography, and this paper shows how the same methods can be applied. The elements needed to solve the stratophenetic inverse problem include a forward model of temporal evolution and preservation, prior information on the ages and environments of fossil samples, and estimates of the errors in the data. Sparse sampling and geological and biological heterogeneity can thus be treated as sources of information needed to distinguish signal from noise and to avoid drawing conclusions that underestimate the uncertainty involved. Because of the large uncertainty and the randomness of evolution and preservation, this paper takes an ensemble approach that draws inferences from a large number of forward model realizations and their fit to the data. Bayesian analysis of this model ensemble provides quantitative measures of how well the data are able to resolve the model parameters, parameter uncertainties, and correlations. A synthetic example demonstrates that even under severe conditions of data incompleteness and noise, the generating parameters and the temporal pattern of evolution can be retrieved with meaningful precision. These inferences are conditioned on the uncertainty in the age and environment of the samples, model stochasticity, and measurement errors. Thus, inverting a stratigraphic series into a time series can improve the way we perceive and interpret the underlying evolutionary pattern. More work is needed to expand and generalize this method, including joint analysis of multiple stratigraphic sections, exploration of higher-dimensional problems with time-varying parameters and model selection, and more realistic quantification of uncertainty.

Acknowledgments

Special thanks to M. Sambridge for allowing the use of his Neighbourhood Algorithm code. K. Miller and J. Browning generously provided core samples and data from ODP Leg 174AX Bethany Beach. Thanks to S. Kidwell and M. Foote for comments and discussion. Thoughtful reviews by D. Fox, J. Ritsema, and S. Wang helped improve the manuscript. This study was supported by a doctoral fellowship from the Research Council of Norway and by the American Chemical Society Petroleum Research Fund grant no. 41014-AC8 (S. Kidwell).

Literature CitedAlroy, J. 1994. Appearance event ordination; a new biochron-

ologic method. Paleobiology 20:191–207.Arnold, S. J., M. E. Pfrender, and A. G. Jones. 2001. The adaptive

landscape as a conceptual bridge between micro- and mac-roevolution. Genetica 112–113:9–32.

Backus, G. 1988. Bayesian inference in geomagnetism. Geo-physical Journal Oxford 92:125–142.

Bookstein, F. L. 1987. Random walks and the existence of evo-lutionary rates. Paleobiology 13:446–464.

Bretthorst, G. L. 1996. An introduction to model selection usingprobability theory as logic. In G. R. Heidbreder, ed. Maximumentropy and Bayesian methods, Santa Barbara, California,U.S.A., 1993 (Proceedings of the Thirteenth InternationalWorkshop on Maximum Entropy and Bayesian Methods).Fundamental Theories of Physics 62:1–42. Kluwer Academic,Dordrecht, The Netherlands.

Browning, J. V., K. G. Miller, P. P. McLaughlin, M. A. Kominz, P.

J. Sugarman, D. Monteverde, M. D. Feigenson, and J. C. Her-nandez. 2006. Quantification of the effects of eustasy, subsi-dence, and sediment supply on Miocene sequences, mid-At-lantic margin of the United States. Geological Society ofAmerica Bulletin 118:567–588.

Cheetham, A. H. 1986. Tempo of evolution in a Neogene bryo-zoan: rates of morphological change within and across spe-cies boundaries. Paleobiology 12:190–202.

Connolly, S. R., and A. I. Miller. 2001. Joint estimation of sam-pling and turnover rates from fossil databases: capture-mark-recapture methods revisited. Paleobiology 27:751–767.

Curtis, A., and R. Wood, eds. 2004. Geological prior informa-tion: informing science and engineering. Geological Society ofLondon Special Publication 239.

Dowd, M., and R. Meyer. 2003. A Bayesian approach to the eco-system inverse problem. Ecological Modelling 168:39–55.

Eldredge, N., and S. J. Gould. 1972. Punctuated equilibria: analternative to phyletic gradualism. Pp. 82–115 in T. J. M.Schopf, ed. Models in paleobiology. Freeman, Cooper, SanFrancisco.

Foote, M. 2001. Inferring temporal patterns of preservation,origination, and extinction from taxonomic survivorshipanalysis. Paleobiology 27:602–630.

———. 2003. Origination and extinction through the Phanero-zoic: a new approach. Journal of Geology 111:125–148.

———. 2005. Pulsed origination and extinction in the marinerealm. Paleobiology 31:6–20.

Gelman, A. B., J. S. Carlin, H. S. Stern, and D. B. Rubin. 1995.Bayesian data analysis. Chapman and Hall/CRC Press, BocaRaton, Fla.

Gingerich, P. D. 1976. Paleontology and phylogeny: patterns ofevolution at the species level in early Tertiary mammals.American Journal of Science 276:1–28.

———. 1983. Rates of evolution: effects of time and temporalscaling. Science 222:159–161.

———. 1993. Quantification and comparison of evolutionaryrates. American Journal of Science 293-A:453–478.

Gouveia, W. P. J., and J. A. Scales. 1998. Bayesian seismic wave-form inversion: parameter estimation and uncertainty anal-ysis. Journal of Geophysical Research 103-B2:2759–2779.

Green, P. J. 1995. Reversible jump Markov chain Monte Carlocomputation and Bayesian model determination. Biometrika82:711–732.

Greenlee, S., W. Devlin, K. G. Miller, G. Mountain, and P. Flem-ings. 1992. Integrated sequence stratigraphy of Neogene de-posits, New Jersey continental shelf and slope: comparisonwith the Exxon model. Geological Society of America Bulletin104:1403–1411.

Gunning, J., and M. E. Glinsky. 2004. Delivery: an open-sourcemodel-based Bayesian seismic inversion program. Computersand Geosciences 30:619–636.

Hannisdal, B. 2006. Phenotypic evolution in the fossil record:numerical experiments. Journal of Geology 114:133–153.

Hayward, B. W., M. Holzmann, H. R. Grenfell, J. Pawlowski, andC. M. Triggs. 2004. Morphological distinction of moleculartypes in Ammonia—towards a taxonomic revision of theworld’s most commonly misidentified foraminifera. MarineMicropaleontology 50:237–271.

Hendry, A. P., and M. T. Kinnison. 2001. An introduction to mi-croevolution: rate, pattern, process. Genetica 112–113:1–8.

Holland, S. M. 1995. The stratigraphic distribution of fossils. Pa-leobiology 21:92–109.

———. 2000. The quality of the fossil record: a sequence strati-graphic perspective. In D. H. Erwin and S. L. Wing, eds. Deeptime: Paleobiology’s perspective. Paleobiology 26(Suppl. to No.4):148–168.

Huelsenbeck, J. P., F. Ronquist, R. Nielsen, and J. Bollback. 2001.

Page 18: Inferring phenotypic evolution in the fossil record ...rses.anu.edu.au/.../Paleobiology_33_1_reprint.pdf · Paleobiology,33(1), 2007, pp. 98–115 Inferring phenotypic evolution in

115INFERRING EVOLUTION BY INVERSION

Bayesian inference of phylogeny and its impact on evolution-ary biology. Science 294:2310–2314.

Hunt, G. 2006. Fitting and comparing models of phyletic evo-lution: random walks and beyond. Paleobiology 32:579–602.

Jaynes, E. T. 2003. Probability theory—the logic of science. Cam-bridge University Press, Cambridge.

Lande, R. 1976. Natural selection and random genetic drift inphenotypic evolution. Evolution 30:314–334.

———. 1979. Quantitative genetic analysis of multivariate evo-lution, applied to brain: body size allometry. Evolution 33:402–416.

Lohmann, G. P. 1983. Eigenshape analysis of microfossils: a gen-eral morphometric procedure for describing changes inshape. Mathematical Geology 15:659–672.

MacLeod, N. 1999. Generalizing and extending the eigenshapemethod of shape space visualization and analysis. Paleobi-ology 25:107–138.

Malinverno, A. 2002. Parsimonious Bayesian Markov chainMonte Carlo inversion in a nonlinear geophysical problem.Geophysical Journal International 151:675–688.

Malmgren, B. A., W. A. Berggren, and G. P. Lohmann. 1983. Ev-idence for punctuated gradualism in the Late Neogene Glo-borotalia tumida lineage of planktonic foraminifera. Paleobi-ology 9:377–389.

Miller, K. G., and G. S. Mountain. 1996. Drilling and dating NewJersey Oligocene-Miocene sequences: ice volume, global sealevel, and Exxon records. Science 271:1092–1095.

Miller, K. G., and P. Sugarman. 1995. Correlating Miocene se-quences in onshore New Jersey boreholes (ODP Leg 150X)with global �18O and Maryland outcrops. Geology 23:747–750.

Miller, K. G., P. Sugarman, J.V. Browning, et al., eds. 1998. Pro-ceedings of the Ocean Drilling Program, Initial Reports,174AX. Ocean Drilling Program, Texas A&M University, Col-lege Station.

Miller, K. G., P. McLaughlin, and J. V. Browning, et al. 2003.Bethany Beach Site. Pp. 1–85 in K. G. Miller, P. Sugarman, J.V. Browning, et al., eds. Proceedings of the Ocean DrillingProgram, Initial Reports 174AX(Suppl.). Ocean Drilling Pro-gram, Texas A&M University, College Station.

Miller, K. G., M. Kominz, J. Browning, J. Wright, G. Mountain,M. Katz, P. Sugarman, B. Cramer, N. Christie-Blick, and S. Pe-kar. 2005. The Phanerozoic record of sea-level change. Science310:1293–1298.

Mosegaard, K., and M. Sambridge. 2002. Monte Carlo analysisof inverse problems. Inverse Problems 18:R29–R54.

Pekar, S. F., N. Christie-Blick, M. A. Kominz, and K. G. Miller.2001. Evaluating the stratigraphic response to eustasy fromOligocene strata in New Jersey. Geology 29:55–58.

Polly, P. D. 2004. On the simulation of the evolution of morpho-logical shape: multivariate shape under selection and drift.Palaeontologia Electronica 7(2)7A:1–28.

Raup, D. 1991. A kill curve for Phanerozoic marine species. Pa-leobiology 17:37–48.

Raup, D., and R. E. Crick. 1981. Evolution of single characters inthe Jurassic ammonite Kosmoceras. Paleobiology 7:200–215.

Reyment, R. A. 1985. Phenotypic evolution in a lineage of theEocene ostracod Echinocythereis. Paleobiology 11:174–194.

Roopnarine, P. D. 2001. The description and classification of evo-lutionary mode: a computational approach. Paleobiology 27:446–465.

———. 2003. Analysis of rates of morphological evolution. An-nual Review of Ecology, Evolution, and Systematics 34:605–632.

———. 2005. The likelihood of stratophenetic-based hypothesesof genealogical succession. Special Papers in Palaeontology73:143–157.

Sadler, P. M. 2001. Constrained optimization—approaches topaleobiologic correlation and seriation problems: user’s guideand reference manual to the CONOP program family, Version6. 1. Riverside, Calif.

Sambridge, M. 1999a. Geophysical inversion with a neighbour-hood algorithm. I. Searching a parameter space. GeophysicalJournal International 138:479–494.

———. 1999b. Geophysical inversion with a neighbourhood al-gorithm. II. Appraising the ensemble. Geophysical Journal In-ternational 138:727–746.

———. 2001. Finding acceptable models in nonlinear inverseproblems using a neighbourhood algorithm. Inverse Prob-lems 17:387–403.

Sambridge, M., and K. Mosegaard. 2002. Monte Carlo methodsin geophysical inverse problems. Reviews of Geophysics 40:1–28.

Scales, J. A., and L. Tenorio 2001. Tutorial: prior information anduncertainty in inverse problems. Geophysics 66:389–397.

Sheets, D. H., and C. E. Mitchell. 2001. Why the null matters:statistical tests, random walks and evolution. Genetica 112–113:105–125.

Simpson, G. 1944. Tempo and mode in evolution. Columbia Uni-versity Press, New York.

Steckler, M. S., G. S. Mountain, K. G. Miller, and N. Christie-Blick. 1999. Reconstruction of Tertiary progradation and cli-noform development on the New Jersey passive margin by 2-D backstripping. Marine Geology 154:399–420.

Sugarman, P. J., and K. G. Miller. 1997. Correlation of Miocenesequences and hydrogeologic units, New Jersey coastal plain.Sedimentary Geology 108:3–18.

Tarantola, A. 1987. Inverse problem theory: methods for datafitting and parameter estimation. Elsevier, Amsterdam.

Van Sickel, W. A., M. A. Kominz, K. G. Miller, and J. V. Brown-ing. 2004. Late Cretaceous and Cenozoic sea-level estimates:backstripping analysis of borehole data, onshore New Jersey.Basin Research 16:451–465.

Wagner, P. J. 1998. A likelihood approach for evaluating esti-mates of phylogenetic relationships among fossil taxa. Paleo-biology 24:430–449.

Zahn, C. T., and R. Z. Roskies. 1972. Fourier descriptors forplane closed curves. IEEE Transactions on Computers C- 21:269–281.