
INVESTIGATION

Serial Founder Effects During Range Expansion: A Spatial Analog of Genetic Drift

Montgomery Slatkin*,1 and Laurent Excoffier†,‡

*Department of Integrative Biology, University of California, Berkeley, California 94720-3140, †Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Berne, 3012 Berne, Switzerland, and ‡Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland

ABSTRACT Range expansions cause a series of founder events. We show that, in a one-dimensional habitat, these founder events are the spatial analog of genetic drift in a randomly mating population. The spatial series of allele frequencies created by successive founder events is equivalent to the time series of allele frequencies in a population of effective size $k_e$, the effective number of founders. We derive an expression for $k_e$ in a discrete-population model that allows for local population growth and migration among established populations. If there is selection, the net effect is determined approximately by the product of the selection coefficients and the number of generations between successive founding events. We use the model of a single population to compute analytically several quantities for an allele present in the source population: (i) the probability that it survives the series of colonization events, (ii) the probability that it reaches a specified threshold frequency in the last population, and (iii) the mean and variance of the frequencies in each population. We show that the analytic theory provides a good approximation to simulation results. A consequence of our approximation is that the average heterozygosity of neutral alleles decreases by a factor of $1 - 1/(2k_e)$ in each new population. Therefore, the population genetic consequences of surfing can be predicted approximately by the effective number of founders and the effective selection coefficients, even in the presence of migration among populations. We also show that our analytic results are applicable to a model of range expansion in a continuously distributed population.

RANGE expansion through a series of colonization events can produce geographic patterns in allele frequencies that are quite different from what is expected in equilibrium populations. One consequence of range expansion is the steady reduction of heterozygosity with increasing distance from the ancestral population (Austerlitz et al. 1997; DeGiorgio et al. 2011), a pattern well documented in human populations (Prugnolle et al. 2005; Ramachandran et al. 2005; Handley et al. 2007; Li et al. 2008; DeGiorgio et al. 2009; Deshpande et al. 2009). Another consequence is that some alleles may reach a high frequency because of repeated founder events (Edmonds et al. 2004), a process called genetic surfing (Excoffier et al. 2009). Even deleterious alleles may reach a high frequency because of surfing (Klopfstein et al. 2006; Travis et al. 2007; Excoffier and Ray 2008; Hallatschek and Nelson 2010; Hallatschek 2011). In addition to these simulation-based studies, the analytic theory of surfing in terms of reaction–diffusion equations has been developed (Vlad et al. 2004; Hallatschek and Nelson 2008; Hallatschek 2011). The serial-founder model has also been investigated by coalescent approaches (Austerlitz et al. 1997; DeGiorgio et al. 2011) and the theory has been used to date the onset of human expansions from Africa (Liu et al. 2006).

In this article, we show that the effect of range expansion in a one-dimensional habitat is analogous to the effect of random mating in a single population. The spatial series of allele frequencies in a one-dimensional array of populations can be predicted from the standard theory of a single population in which the effective population size is set to an effective propagule size that depends on migration and population growth in each newly founded population, and the selection coefficients are set to effective selection coefficients that depend on the number of generations between successive colonization events. We use the theory of a single population to predict several quantities, including (i) the probability that an allele present in the initial population will persist throughout the range expansion, (ii) the probability that an allele will reach high frequency after the range expansion is complete, (iii) the average allele frequency in each population, and (iv) the rate of decrease in heterozygosity with increasing distance from the founding population. The analytic approximation also allows us to relate the discrete-population model to a model of range expansion in a continuously distributed population.

Copyright © 2012 by the Genetics Society of America
doi: 10.1534/genetics.112.139022
Manuscript received December 8, 2011; accepted for publication February 1, 2012
1Corresponding author: Department of Integrative Biology, VLSB 3060, University of California, Berkeley, CA 94720-3140. E-mail: [email protected]

Genetics, Vol. 191, 171–181 May 2012

Our analytic approximation is not intended to replace simulations. In fact, even with relatively large effective propagule sizes, there is considerable stochastic variability in allele frequency after a range expansion, making it difficult to predict what will happen to any one allele. Instead, the theory is intended as a guide to intuition because it shows how each parameter in a model of range expansion influences the intensity of founder effects.

We present our results in several parts: (i) we define an idealized model of range expansion that ignores some of the complexity we allow for later, (ii) we develop analytic theory of a Wright–Fisher model of a single population that is analogous to the idealized model of range expansion, (iii) we compare analytic results for the Wright–Fisher model to simulation results for the idealized model, (iv) we define a more realistic model of range expansion and show that it can be matched to the idealized model by redefining parameters, (v) we compare simulation results of the realistic model with the predictions of the analytic theory of the Wright–Fisher model, and (vi) we discuss the relationship between discrete-population models of range expansion and range expansion in a continuously distributed population.

Idealized Model of Range Expansion

In our idealized model there are $n+1$ sites at which populations can be established. They are arranged on a line and numbered $0$ to $n$. At $t = 0$, site 0 is occupied by a diploid population of effectively infinite size in which an allele A is present in frequency $x_0$ in zygotes. Selection changes the frequency deterministically to $x_0'$ among adults. Then, $k$ adults are drawn randomly to found a new population at site 1. The propagule at site 1 grows in one generation into a population of zygotes of effectively infinite size. Selection then modifies the frequencies in populations 0 and 1, and finally $k$ adults are chosen randomly from population 1 to found population 2. This process continues for $n - 1$ more generations, with selection affecting the frequency in each established population each generation. After $n$ generations, all $n$ sites are occupied. Our concern is with the frequency trajectory, $\mathbf{x} = \{x_0, x_1, \ldots, x_n\}$, at time $n$ given that A has not been lost from the population. Note that if A is not neutral, the final $x_0$ will differ from the initial value because of selection acting for $n$ generations.
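The colonization sequence just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code behind the reported results: the function names `select` and `idealized_expansion` are hypothetical, the selection step uses the viability recursion that the text gives later as Equation 2, and the "effectively infinite" populations are represented by carrying frequencies forward with no drift after founding.

```python
import random

def select(x, s1, s2):
    """Deterministic change in the frequency of A under viability selection
    with fitnesses 1, 1 - s1, 1 - s2 (the recursion given as Equation 2)."""
    mean_fitness = 1 - s2 * x * x - 2 * s1 * x * (1 - x)
    return x * (1 - s2 * x - s1 * (1 - x)) / mean_fitness

def idealized_expansion(x0, k, n, s1=0.0, s2=0.0, rng=None):
    """Return the trajectory [x_0, ..., x_n] after n colonization events."""
    rng = rng or random.Random()
    freqs = [x0]
    for _ in range(n):
        # Selection acts in every established population each generation.
        freqs = [select(x, s1, s2) for x in freqs]
        # k diploid adults (2k gene copies) from the edge found the next site.
        copies = sum(rng.random() < freqs[-1] for _ in range(2 * k))
        # The new population grows at once to effectively infinite size,
        # so its zygote frequency equals the propagule frequency.
        freqs.append(copies / (2 * k))
    return freqs

if __name__ == "__main__":
    trajectory = idealized_expansion(x0=0.1, k=100, n=10, rng=random.Random(1))
    print(trajectory)
```

Replicates in which the last entry is 0 correspond to loss of A; conditioning on survival means discarding those replicates.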

Wright–Fisher Model of a Single Population

The idealized model of range expansion is similar to a model of random mating in a single population containing $k$ individuals. The time sequence of allele frequencies in the randomly mating population is preserved in the spatial sequence of frequencies in the model of range expansion. The only difference is that, if there is selection, then selection continues to modify the allele frequency after a population sends a propagule to found the next population.

Given the similarity of the two models, we can use the well-developed theory of drift and selection in single populations to predict what is seen after all populations are colonized. We assume a Wright–Fisher model of a population of size $k$. The relative fitnesses of individuals with 0, 1, and 2 copies of A are $1$, $1 - s_1$, and $1 - s_2$, respectively. We are concerned with the case in which A is deleterious or neutral ($s_2 \ge s_1 \ge 0$). We follow the notation of Chap. 2.12 of Ewens (2004). At any time, the population is in a state $E_i$, $i = 0, \ldots, 2k$, with $i$ copies of A present. The transition probability from $E_i$ to $E_j$ is

$$p_{ij} = \binom{2k}{j} (x')^{j} (1 - x')^{2k-j}, \tag{1}$$

where $x = i/(2k)$ and

$$x' = \frac{x\,[1 - s_2 x - s_1 (1 - x)]}{1 - s_2 x^2 - 2 s_1 x (1 - x)}. \tag{2}$$

Let $p^{(t)}_{ij}$ denote the $t$-generation transition probability. The probability that A survives for $n$ generations, given that it is present in $i$ copies initially, is

$$S(i, n) = 1 - p^{(n)}_{i0}. \tag{3}$$

For large $n$, $S(i, n)$ approaches the probability of ultimate fixation.
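Equations 1–3 translate directly into a small Markov-chain calculation. The sketch below is illustrative only; the names `transition_matrix`, `state_distribution`, and `survival_probability` are hypothetical, and plain Python lists are used instead of a linear-algebra library so the example has no dependencies.

```python
from math import comb

def select(x, s1, s2):
    """Equation 2: frequency of A after one round of viability selection."""
    return x * (1 - s2 * x - s1 * (1 - x)) / (1 - s2 * x * x - 2 * s1 * x * (1 - x))

def transition_matrix(k, s1=0.0, s2=0.0):
    """Equation 1: P[i][j] is the probability of going from i to j copies of A."""
    size = 2 * k
    matrix = []
    for i in range(size + 1):
        xp = select(i / size, s1, s2)
        matrix.append([comb(size, j) * xp ** j * (1 - xp) ** (size - j)
                       for j in range(size + 1)])
    return matrix

def state_distribution(P, i0, t):
    """Row i0 of the t-step transition matrix, obtained by repeated propagation."""
    dist = [0.0] * len(P)
    dist[i0] = 1.0
    for _ in range(t):
        dist = [sum(dist[i] * P[i][j] for i in range(len(P)))
                for j in range(len(P))]
    return dist

def survival_probability(P, i0, n):
    """Equation 3: S(i0, n) = 1 - p^(n)_{i0, 0}."""
    return 1.0 - state_distribution(P, i0, n)[0]

if __name__ == "__main__":
    P = transition_matrix(k=10, s1=0.005, s2=0.01)
    print(survival_probability(P, i0=1, n=10))
```

Propagating a single row rather than raising the full matrix to the $n$th power keeps the cost at $O(n k^2)$ per starting state, which is adequate for propagule sizes of the order used in the text.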

Given the initial number of copies of A, $i_0$, the Wright–Fisher model will generate a trajectory, $\mathbf{i} = (i_0, i_1, \ldots, i_n)$, where $i_t$ is the number of copies of A at time $t$. The probability of any trajectory given that $i_n > 0$ (i.e., that A survives) is

$$\Pr(\mathbf{i} \mid i_n > 0) = \frac{\prod_{t=0}^{n-1} p_{i_t i_{t+1}}}{S(i_0, n)}. \tag{4}$$

We can use the Wright–Fisher model to calculate other quantities of interest. One is the probability that A will reach a specified number of copies, $j$, after $t$ generations:

$$\Pr(i_n \ge j) = \sum_{i=j}^{2k} p^{(t)}_{i_0 i}. \tag{5}$$

And we can predict the decrease in heterozygosity as a function of $t$,

$$H(t) = \left(1 - \frac{1}{2k}\right)^{t} H(0), \tag{6}$$

where $H(t)$ is the probability of heterozygosity in generation $t$ (Equation 7.2.8 in Crow and Kimura 1970).
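As a quick numerical check of Equation 6, the one-generation factor $1 - 1/(2k)$ can be recovered by averaging $2x(1-x)$ over the binomial sampling of $2k$ gene copies. The sketch below assumes neutrality; the function names are hypothetical.

```python
from math import comb

def heterozygosity(h0, k, t):
    """Equation 6: expected heterozygosity after t rounds of sampling 2k copies."""
    return h0 * (1 - 1 / (2 * k)) ** t

def expected_het_after_sampling(x, k):
    """Exact expectation of 2x'(1 - x') when 2k copies are drawn at frequency x."""
    size = 2 * k
    total = 0.0
    for j in range(size + 1):
        prob = comb(size, j) * x ** j * (1 - x) ** (size - j)
        freq = j / size
        total += prob * 2 * freq * (1 - freq)
    return total

if __name__ == "__main__":
    x, k = 0.3, 50
    h0 = 2 * x * (1 - x)
    print(expected_het_after_sampling(x, k), heterozygosity(h0, k, 1))
```

The two printed values should agree up to floating-point rounding, which is the single-step statement that Equation 6 compounds over $t$ generations.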


The probability of $i$ copies of A at time $t$, given nonloss at time $n$, is

$$\Pr(i, t \mid i_n > 0) = \frac{p^{(t)}_{i_0 i}\,\bigl(1 - p^{(n-t)}_{i0}\bigr)}{S(i_0, n)}, \tag{7}$$

from which the mean and variance of the number of copies of A can be computed,

$$E(i, t \mid i_n > 0) = \sum_{i=1}^{2k} i \,\Pr(i, t \mid i_n > 0), \tag{8}$$

with a similar equation for the second moment of $i$.
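Equations 7 and 8 can be evaluated from the same $t$-step transition probabilities. The following sketch is restricted to a neutral allele to keep it short (an assumption of this example, not of the theory); `neutral_matrix`, `distribution_after`, `survival`, and `conditional_mean_copies` are hypothetical names.

```python
from math import comb

def neutral_matrix(k):
    """Equation 1 with s1 = s2 = 0, so that x' = x = i / (2k)."""
    size = 2 * k
    return [[comb(size, j) * (i / size) ** j * (1 - i / size) ** (size - j)
             for j in range(size + 1)] for i in range(size + 1)]

def distribution_after(P, start, t):
    """Distribution of the number of copies after t generations from `start`."""
    dist = [0.0] * len(P)
    dist[start] = 1.0
    for _ in range(t):
        dist = [sum(dist[i] * P[i][j] for i in range(len(P)))
                for j in range(len(P))]
    return dist

def survival(P, start, n):
    """Equation 3: probability that A is not lost within n generations."""
    return 1.0 - distribution_after(P, start, n)[0]

def conditional_mean_copies(P, i0, t, n):
    """Equation 8, with each term weighted as in Equation 7."""
    dist_t = distribution_after(P, i0, t)
    total = survival(P, i0, n)
    weighted = sum(i * dist_t[i] * survival(P, i, n - t)
                   for i in range(1, len(P)))
    return weighted / total

if __name__ == "__main__":
    P = neutral_matrix(10)
    print([conditional_mean_copies(P, 1, t, 8) for t in range(9)])
```

Dividing the returned mean by $2k$ gives the expected frequency in population $t$ conditional on survival, which is the quantity compared with simulations in the text.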

Comparison of the Idealized and Wright–Fisher Models

The Wright–Fisher model is analogous to our idealized model of range expansion. If A is neutral, $i_t$ is the number of copies of A in the propagule that founds population $t$. The assumption that each population immediately grows to an effectively infinite size ensures that the frequency of A will remain the same in each population and hence $x_t = i_t/(2k)$.

The idealized model differs slightly from the Wright–Fisher model if A is not neutral because selection will modify the frequency in population $t$ for $n - t$ more generations. This additional effect of selection makes no difference in the calculation of either $S(i_0, n)$ or $\Pr(i_n \ge j)$ because they do not depend on frequencies in the intermediate populations. To predict the effect of selection on the average frequency in the intermediate populations, we first calculated $E(x_t) = E(i_t \mid i_n > 0)/(2k)$ and then deterministically changed the frequency by applying Equation 2 for $n - t$ generations.

We simulated the idealized model to test the accuracy of the analytic approximation based on the Wright–Fisher model. Typical results are shown in Figure 1. The fit of the average allele frequency in the simulations to the expectation calculated from the analytic theory is quite good for both neutral and deleterious alleles. There is considerable variation among replicates, however; most trajectories deviate substantially from the average. Some trajectories are clinal with a roughly monotonic increase in frequency while others reach an intermediate maximum and then decrease. For Figure 1A, the predicted and observed survival probabilities are 0.159 and 0.158, and in Figure 1B, they are 0.158 and 0.147. The probability that an allele reaches at least a given frequency is also predicted accurately by the analytic theory (Figure 1, C and D).

Realistic Model: Finite Population Size and Migration

To create a more realistic model, we assume each propagule grows in one generation to size $N$ ($> k$) and remains at that size for $T$ generations before generating a propagule that colonizes the next site. We also allow for migration: each established population, including population 0, receives immigrants at a rate $m$ per generation from each neighboring population. The migration among established populations continues until all sites are occupied and the process is stopped. We are concerned with weak migration only and ignore the effect of migration on population growth and population size.

These additional features make the resulting model intractable because it is no longer analogous to a Wright–Fisher model. However, the realistic model can be approximated by the Wright–Fisher model if the parameters of that model are defined in a way that equalizes the mean and variance of the change in allele frequencies between successive colonization events. In Appendix A, we show that the variance in the change in allele frequency in a realistic model is the same as that in the Wright–Fisher model if $k$ is replaced by

$$k_e = \frac{1}{a^{T}/k + (1 - a^{T})/[N(1 - a)]}, \tag{9}$$

where $a = 1 - 2m - 1/(2N)$, which is $\approx 1 - 1/(2NF)$, where $F \approx 1/(1 + 4Nm)$ if $m$ is small. The effective number of founders, $k_e$, is similar to the effective population size ($N_e$) defined by Hallatschek and Nelson (2008) for a model of range expansion in a continuously distributed population. Like $k_e$, Hallatschek and Nelson's $N_e$ quantifies the rate of loss of heterozygosity at the leading edge of a range expansion. They estimated $N_e$ from simulations.

The composite parameter $k_e$ includes the effects of two opposing forces. Both the founder effect and the period of random mating after the population is founded increase the importance of genetic drift and hence reduce $k_e$. Immigration from the neighboring population (there is only one for the population most recently founded) reduces the variance in allele frequency and hence increases $k_e$. Whether $k_e$ is larger or smaller than $k$ depends on the balance reached. A little algebra shows that if $m \ll 1$, $N \gg 1$, and $T \ll 2N$, then $k_e > k$ if $2Nm > k - 1/2$, independently of $T$.
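Equation 9 and the threshold $2Nm > k - 1/2$ are easy to explore numerically. The helper below is a minimal sketch; the name `effective_founders` is hypothetical, and the parameter values in the usage lines are chosen to resemble those in the figure legends rather than taken from any specific analysis.

```python
def effective_founders(k, N, m, T):
    """Equation 9: effective number of founders k_e.

    k: founders per propagule, N: local population size,
    m: migration rate per neighbor, T: generations between colonizations.
    """
    a = 1 - 2 * m - 1 / (2 * N)
    return 1 / (a ** T / k + (1 - a ** T) / (N * (1 - a)))

if __name__ == "__main__":
    # No migration: drift during the T generations lowers k_e below k.
    print(effective_founders(k=100, N=10_000, m=0.0, T=5))
    # Migration on either side of the threshold 2Nm = k - 1/2.
    for m in (0.04, 0.06):
        print(m, 2 * 1000 * m, effective_founders(k=100, N=1000, m=m, T=5))
```

Because $N(1 - a) = 2Nm + 1/2$, comparing that quantity with $k$ decides whether migration more than compensates for the extra drift.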

Additional generations between successive colonization events also increase the effective strength of selection. If A is in low frequency and $s_1 \ne 0$, then only the fitness of heterozygous carriers of A is important. We show in Appendix B that, in this case, the effective selection coefficient against heterozygotes is

$$s_{1,e} = T s_1. \tag{10}$$

If $s_1 = 0$, then the effective selection coefficient against AA individuals is

$$s_{2,e} = T s_2. \tag{11}$$

When comparing simulation results to the predictions based on the Wright–Fisher model when A is not recessive, we used Equations 10 and 11 to compute the effective selection coefficients, and when A is recessive, we used Equation 11. These effective selection coefficients indicate the strength of selection in a single generation that is equivalent to $T$ generations of weaker selection.
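The equivalence expressed by Equations 10 and 11 can be illustrated by iterating Equation 2 for $T$ generations and comparing the outcome with a single generation of selection at $T s_1$ and $T s_2$. The numbers below are arbitrary illustrative values for a rare, weakly deleterious allele, and the function names are hypothetical.

```python
def select(x, s1, s2):
    """Equation 2: one generation of deterministic viability selection."""
    return x * (1 - s2 * x - s1 * (1 - x)) / (1 - s2 * x * x - 2 * s1 * x * (1 - x))

def after_generations(x, s1, s2, T):
    """Apply Equation 2 repeatedly for T generations."""
    for _ in range(T):
        x = select(x, s1, s2)
    return x

if __name__ == "__main__":
    x0, s1, s2, T = 0.01, 0.002, 0.004, 5
    multi = after_generations(x0, s1, s2, T)    # T generations of weak selection
    single = select(x0, T * s1, T * s2)         # one generation at T*s1, T*s2
    print(multi, single)
```

The approximation relies on selection being weak and the allele being rare; for larger $T s$ the two values diverge.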


Simulation Test of Realistic Model

We simulated the realistic model of range expansion described above. Each replicate begins with the frequency of A set to the specified initial frequency, $x_0$. Then one cycle of colonization results in a frequency $x_1$ in population 1. A cycle consists of the sampling of gametes to form a propagule; selection in the propagule; creation of population 1; and then $T$ generations of migration, selection, and genetic drift in populations 0 and 1. The details of one cycle are described below. If $x_1 > 0$, the next cycle begins with the formation of the propagule that will establish population 2. Cycles continue until either A is lost or all $n$ populations are established. If $x_n \ge 1/(2N)$, A survived the range expansion and the set of frequencies $\{x_0, \ldots, x_n\}$, the allele frequency trajectory, was retained for further analysis. This process was continued until a specified number of replicates in which A survived was obtained. The probability of survival was estimated to be the ratio of the number of replicates in which A survived to the total number of replicates run.

The events during one cycle are as follows. If $i$ populations have been established, then the number of copies of A in the propagule that establishes population $i+1$, $j_{i+1}$, is generated from a binomial distribution with probability $x_i$ and sample size $2k$. Thus $x_{i+1}$ in the propagule is $j_{i+1}/(2k)$. Then migration deterministically modifies all the allele frequencies according to

$$x_i' = (1 - 2m)\,x_i + m\,(x_{i-1} + x_{i+1}) \tag{12}$$

for each of the interior populations $1, \ldots, i$. In the two end populations (0 and $i+1$), $1 - 2m$ is replaced by $1 - m$, and $x_{-1}$ and $x_{i+2}$ are set to 0. Then, selection deterministically changes the frequencies in all populations to $x_j''$, for $j = 0, \ldots, i+1$, using Equation 2. Finally, the frequency of A in the next generation in each population (including population 0) is obtained by generating a binomially distributed random variate with mean $x_j''$ and sample size $2N$.
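One colonization cycle of the realistic model might be coded roughly as follows. This is a sketch under stated assumptions rather than the authors' program: the function names are hypothetical, the migration, selection, and drift steps are repeated $T$ times per cycle as one reading of the description above, and binomial sampling is done by summing Bernoulli draws, which is simple but slow for large $N$.

```python
import random

def select(x, s1, s2):
    """Equation 2: deterministic viability selection."""
    return x * (1 - s2 * x - s1 * (1 - x)) / (1 - s2 * x * x - 2 * s1 * x * (1 - x))

def binomial_frequency(x, copies, rng):
    """Frequency of A among `copies` gene copies drawn at frequency x."""
    return sum(rng.random() < x for _ in range(copies)) / copies

def migrate(freqs, m):
    """Equation 12, with 1 - m in the two end populations and zeros beyond them."""
    last = len(freqs) - 1
    out = []
    for j, x in enumerate(freqs):
        left = freqs[j - 1] if j > 0 else 0.0
        right = freqs[j + 1] if j < last else 0.0
        stay = 1 - m if j in (0, last) else 1 - 2 * m
        out.append(stay * x + m * (left + right))
    return out

def one_cycle(freqs, k, N, T, m, s1, s2, rng):
    """Found one new population, then run T generations in all populations."""
    freqs = freqs + [binomial_frequency(freqs[-1], 2 * k, rng)]  # founder event
    for _ in range(T):
        freqs = migrate(freqs, m)
        freqs = [select(x, s1, s2) for x in freqs]
        freqs = [binomial_frequency(x, 2 * N, rng) for x in freqs]  # drift
    return freqs

if __name__ == "__main__":
    rng = random.Random(3)
    freqs = [0.2]
    for _ in range(3):
        freqs = one_cycle(freqs, k=20, N=200, T=2, m=0.02, s1=0.0, s2=0.0, rng=rng)
    print(freqs)
```

A full replicate would repeat `one_cycle` until A is lost at the edge or all $n$ populations exist, keeping only trajectories with $x_n \ge 1/(2N)$.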

To compare the simulation results from the realistic model with the Wright–Fisher model, we used the analytic theory described above with $k$ replaced by the integer nearest $k_e$ (Equation 9) and $s_1$ and $s_2$ by $T s_1$ and $T s_2$. The restriction to integer values of $k$ is necessary because the Markov chain has to have an integer number of states.

Figure 2 shows two examples of the fit of the predictions of the Wright–Fisher model to the simulation results of the realistic model without migration. Figure 2, A–C, shows the results for a neutral allele and Figure 2, D–F, shows the results for a deleterious recessive allele ($s_1 = 0$, $s_2 = 0.03$). Figure 2, A and D, compares the probability that A reaches a specified frequency. Figure 2, B and E, compares the average frequencies, using the same format as Figure 1. Figure 2, C and F, compares the variances among replicates. These results are typical for other data sets. Without migration, the fit of the analytic predictions to the simulations is quite good for neutral and recessive deleterious alleles, at least for $k s_2 < 2$. The predicted variances among replicates do not fit as well as the predicted averages, which is not surprising given that second moments are more variable than first moments. For selected alleles, the predicted $\Pr(x \ge y)$ tends to be slightly larger than the simulated values and the predicted $\bar{x}$ tends to be slightly larger than the simulated values. Those tendencies are more pronounced for alleles with an additive deleterious effect on relative fitness ($s_2 = 2 s_1$), particularly if $k s_2 > 1$, which is large enough that such deleterious alleles would have little chance of increasing to high frequency. The extent to which all three predicted quantities fit the simulations is roughly similar.

With migration, the fit of the predictions to the simulations is not quite as good. Four examples are shown in Figure 3. Only the results for the average frequency are presented. With migration but no selection (Figure 3A), the analytic prediction tends to be slightly less than the average of the simulations. That tendency is also seen when there is migration and selection (Figure 3, B–D). Even when the number of colonization events is large enough that A is fixed in some of the replicates (Figure 3D), the simulated average frequency does not deviate much from the analytic prediction.

Figure 1 Comparison of analytic predictions with the simulation results for the idealized model described in the text. (A and B) Thin lines show $x_i$ for $i = 0$ to $n = 10$ for each replicate in which A was not lost before population 10. The thick black line shows the average of the 100 replicates and the thick blue line shows the expectation based on the analytic theory. (C and D) Probability that an allele reaches a frequency $y$ in the final population when the range expansion is complete. In A–D, $k = 100$ and one copy of the mutant was present in the propagule founding population 1. In A and C, $s_1 = s_2 = 0$; in B and D, $s_1 = 0.005$ and $s_2 = 0.01$.

After the range expansion is complete, continuing migration will tend to smooth patterns created by the range expansion and continuing selection will cause the frequencies of deleterious alleles to gradually decay. Once the last population is colonized, our approximations for the expansion phase no longer apply. Subsequent evolution will be governed by the effective population sizes and selection intensities in each population. If the populations are large and selection is relatively strong, the decay will roughly follow the deterministic theory of selection and migration until allele frequencies become quite small. One example is presented in Figure 4.

The analytic approximation also provides an estimate of the probability that an allele will not be lost during the range expansion. Table 1 shows that the predicted and actual probabilities for the six cases shown in Figures 2 and 3 are in reasonable agreement, although the predictions tend to be larger than the simulated values.

We can explore further the probability that A will not be lost during the range expansion. Deleterious alleles of additive effect are lost quickly but deleterious recessive alleles have a substantial probability of not being lost even if their ultimate probability of fixation is low. Figure 5 shows some typical results for deleterious alleles. The probability that an allele survives is much lower for an allele of additive effect than for one that has a recessive effect.

The similarity of the model of range expansion and a model of a single population leads to a simple prediction about the loss in heterozygosity during a range expansion. The heterozygosity of neutral loci will be reduced by a factor of $1 - 1/(2k_e)$ in each successive population.

Continuously Distributed Populations

Our results are based on a model in which populations are discrete. Much of the interest in range expansion comes from human and other populations that are continuously distributed in space. The relationship between continuous and discrete population models is not simple. As Felsenstein (1975) first noted, it is difficult to formulate a consistent model of a finite population that is continuously distributed in space and that maintains a uniform population density. The reason is that the assumption of uniform density is not compatible with the assumption that individuals reproduce independently of one another (Sawyer 1976). The usual resolution of this problem is to approximate a continuous model by a sequence of discrete models (Nagylaki 1978a,b).

To express our results in terms of a model of range expansion in a continuously distributed population, assume the populations in a discrete-population model are a distance $l$ apart. Our results predict that heterozygosity decreases as a linear function of distance, $H(i) = H_0\,(1 - i/(2k_e))$, in line with previous analytical results (DeGiorgio et al. 2011). To express this result as a function of distance, $d$, rather than population number, we write

$$H(d) = H_0 \left(1 - \frac{d}{2\kappa_e}\right), \tag{13}$$

where $\kappa_e = l k_e$ indicates the net effect of genetic drift when the leading edge of the population moves a distance $l$.

Figure 2 Comparison of analytic predictions of the Wright–Fisher model with the simulation result for the realistic model with no migration. The format is the same as in Figure 1. The analytic predictions were obtained by replacing $k$ by the integer nearest $k_e$ and $s_1$ and $s_2$ by $T s_1$ and $T s_2$. In A–F, $k = 100$, $N = 10{,}000$, $T = 5$, and one copy of the mutant was present in the propagule founding population 1. The averages are over 100 replicate simulations. In A–C, $s_1 = s_2 = 0$; in D–F, $s_2 = 0.3$ and $s_1 = 0$ (recessive deleterious alleles).

We can relate the parameters of a discrete-population model to those of a model of a continuously distributed population as follows. A continuous-population model at equilibrium is characterized by the population density ($\rho$), the root-mean-square dispersal distance ($\sigma$), and the total length of the habitat ($L$). The correspondence between discrete and continuous models in a one-dimensional habitat is well established for populations at equilibrium. If $L$ is large enough that end effects do not dominate, the heterozygosity and decrease in the probability of identity-by-descent of neutral alleles in a continuous-population model are the same as in a discrete-population model when $\rho = N/l$ and $\sigma = l\sqrt{m}$, where $l = L/n$ is the distance between adjacent populations (Malécot 1975). Therefore, to recover the parameters of the discrete-population model, for which we have an analytic approximation, $N = l\rho$, $m = \sigma^2/l^2$, and $n = L/l$, where $l$ is not yet specified.

To determine $l$, we assume that new individuals beyond the leading edge of the population are randomly sampled from $k$ individuals at or near the leading edge. Range expansion occurs as new individuals appear in such a way that their average density is $\rho$. This assumption fits the observation of Hallatschek et al. (2007) who found that, in an experimental study of range expansion in Escherichia coli, colonists appeared to come predominantly from the small number of cells at the expanding edge of the population. In a continuously distributed population expanding at a uniform rate, there is no delay corresponding to the $T$ generations allowed for in the discrete-population model. Hence $T$ can be set to 1. In the continuous model, a habitat of length $L$ is colonized in total time $t$, which corresponds to $n$ generations in the discrete-population model. Therefore, $n = t$, which we have already determined to be $L/l$, and consequently $l = L/t$.

With these assumptions, the ke is given by Equation 9with T ¼ 1 and

a ¼�12

s2

l2

�2

212lr

; (14)

Figure 3 Comparison of analytic predictionswith the simulation results for a model withfinite population size and migration. The formatis the same as in Figure 1, B and E. In A–D, k¼100, N ¼ 1000, m ¼ 0.04, and one copy of themutant was present in the propagule foundingpopulation 1. In A, s2 ¼ s1 ¼ 0 (neutral); in B,s2 ¼ 0.02 and s1 ¼ 0 (recessive deleterious); inC, s2 ¼ 0.02 and s1 ¼ 0.01 (additive deleteri-ous); in D, s2 ¼ 0.01 and s1 ¼ 0.005 (additivedeleterious).

Figure 4 Illustration of the effects of continued evolution after all popu-lation are colonized. k ¼ 100, N ¼ 1000, T ¼ 5, s2 ¼ 0.01, s1 ¼ 0.005, andm ¼ 0.05. (A) Trajectories for 100 replicates immediately after the lastpopulation is colonized. The format is the same as in Figure 1, A and B.The prediction of the analytic model is shown in A by the thick blue line.The average of the 100 replicates immediately after all populations arecolonized is shown by the thick black line in A and B. (B) The same setof trajectories 100 generations later. The thick red line shows the averagesafter 100 generations. Note the difference in the vertical scale in A and B.

where $\lambda = L/t$. Therefore

$$k_e = \frac{L}{t\left[a/k + (1 - a)/\bigl(\lambda\rho(1 - a)\bigr)\right]}. \tag{15}$$

The value of $k_e$ corresponds to $N_e$ of Hallatschek and Nelson (2008). Their simulation model is more realistic and allows for the interaction of range expansion and dispersal in a way that ours does not. We are assuming weak migration and deterministic expansion of the population front.
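Equations 14 and 15 can be evaluated directly. The following is a minimal sketch, not the authors' code: the function names and all input values are hypothetical, and the symbols $\lambda$, $\rho$, and $\sigma$ stand for the distance advanced per generation, the density, and the per generation dispersal distance. With $T = 1$ the second term in the denominator of Equation 15 reduces to $1/(\lambda\rho)$, and the sketch uses that reduced form so that the result stays finite when $a$ equals 1.

```python
def continuous_a(sigma, lam, rho):
    """Equation 14: a = (1 - sigma^2/lam^2)^2 - 1/(2*lam*rho)."""
    return (1.0 - sigma**2 / lam**2) ** 2 - 1.0 / (2.0 * lam * rho)


def continuous_ke(L, t, k, rho, sigma):
    """Equation 15 with T = 1 and lam = L/t.

    (1 - a) / (lam*rho*(1 - a)) is written as 1 / (lam*rho).
    """
    lam = L / t
    a = continuous_a(sigma, lam, rho)
    return lam / (a / k + 1.0 / (lam * rho))


# Hypothetical inputs, chosen only for illustration: a 1000-km habitat
# colonized in 100 generations, 50 founders, 20 individuals per km, and a
# dispersal distance of 1 km per generation.
ke_example = continuous_ke(L=1000.0, t=100.0, k=50.0, rho=20.0, sigma=1.0)
print(f"k_e = {ke_example:.1f} km")
```

Because $\lambda = L/t$ multiplies the whole expression, $k_e$ in the continuous model carries units of distance, which is why it can be compared with the slope of a regression on distance.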

We can estimate $k_e$ from the regression of heterozygosity with linear distance. Ramachandran et al. (2005) fitted the regression line

$$H(d) = 0.7682 - 6.52 \times 10^{-6}\, d \tag{16}$$

(where $d$ is the distance in kilometers from Addis Ababa) to heterozygosities computed for 783 microsatellite loci in a worldwide sample of 1027 individuals. From this regression we conclude $k_e = 1.5 \times 10^5$ km. To get a rough idea of what this result means, assume that modern human populations expanded a total distance, $L$, of 25,000 km in 2000 generations (i.e., 50,000 yr assuming 25 yr per generation). This implies $\lambda = 125$ km. To make further progress we need to assume a historical equilibrium density, $\rho$, which we arbitrarily set to 100/km. It is reasonable to assume that the average per generation dispersal distance, $\sigma$, was small enough that $\sigma^2/\lambda^2$ is small. Together these assumptions imply that $a$ in Equation 15 is nearly 1. Consequently,

$$\frac{1}{1/k_e + 1/12{,}500} = \frac{1.5 \times 10^5}{125} = 1.2 \times 10^3 \tag{17}$$

or $k_e \approx 1327$. Using our simple model of range expansion in humans, we conclude that the data of Ramachandran et al. (2005) imply the expansion of modern humans into Asia and North America did not require extreme founder events at the leading edge. This application is intended to illustrate how our results can be interpreted in terms of a continuously distributed population rather than to infer details of human history. Human populations did not expand their range at a uniform rate and, more importantly, the expansion had a two-dimensional component that we have not attempted to model here.
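The arithmetic behind Equation 17 can be retraced in a few lines. This sketch takes three quantities as given from the text (the regression-based estimate of $1.5 \times 10^5$ km, the 125 km advanced per generation, and the assumed density of 100/km) and sets $a = 1$; the variable names are illustrative only.

```python
# Values taken as given from the text.
ke_km = 1.5e5   # regression-based estimate of k_e, in km
lam = 125.0     # distance advanced per generation, in km
rho = 100.0     # assumed equilibrium density, individuals per km

# With a = 1, the left-hand side of Equation 17 equals ke_km / lam.
lhs = ke_km / lam              # 1200
local_size = lam * rho         # 12,500 individuals per expansion step

# Solve 1 / (1/founders + 1/local_size) = lhs for the number of founders.
founders = 1.0 / (1.0 / lhs - 1.0 / local_size)
print(round(lhs), round(local_size), round(founders))  # 1200 12500 1327
```

The result is about a tenth of the 12,500 individuals occupying one expansion step, which is the sense in which the founder events are not extreme.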

Discussion and Conclusions

We show that range expansion in a one-dimensional habitat is similar in some ways to random mating in a single population. The succession of colonization events during range expansion creates a spatial sequence of allele frequencies that is analogous to the time sequence of allele frequencies in a single population. This is true in an idealized model of range expansion and is approximately true in a more realistic model that allows for some delay before the next colonization event and for weak gene flow among established populations. The similarity of the two models allows the theory of random mating to be adapted to make analytic predictions about the consequences of range expansions. There is a strong stochastic component to the process that makes prediction of individual allele frequency trajectories difficult, but the average trajectory and the extent of variation among them are well predicted by the analytic theory.

Although our simulation results are for a model of instantaneous population growth, our analytic theory makes clear that the effective number of colonists, $k_e$, depends partly on the net effect of genetic drift between successive colonization events and hence can be defined for other models of population growth including the logistic model.

Previous work on the effects of surfing has emphasized that surfing can drive some initially rare alleles to high frequency (Travis et al. 2007; Hallatschek and Nelson 2010; Hallatschek 2011). The probability that an initially rare allele is fixed in a randomly mating population can be calculated from a diffusion approximation (Kimura 1962). Roughly speaking, deleterious alleles have a significant probability of being fixed by genetic drift if $Ns \le 1$, where $N$ is the population size and $s$ is the selection coefficient. In the context of range expansions, our approximate theory tells us that the probability that an initially rare allele is driven to fixation during a range expansion depends on the product $k_e s_e$, where $k_e$ is the effective propagule size (Equation 9) and $s_e$ is the effective selection coefficient (Equations 10 and 11). This provides an approximate way

Figure 5 Illustration of the difference between alleles with an additive (red line) and a recessive (black line) effect on fitness. Pr(survival) is the probability that an allele initially present in one copy in population 0 is still present in population $n$. For both curves, $k = n = 50$ and $s_2 = s$. For additive selection, $s_1 = s/2$ and for recessive selection, $s_1 = 0$. The results were obtained by iterating the transition matrix of the Markov chain.

Table 1 Predicted and actual probabilities that an allele initially present in one copy in the propagule founding population 1 will be present in population $n$

Parameters       Predicted   Actual
Figure 2, A–C    0.048       0.043
Figure 2, D–F    0.047       0.038
Figure 3A        0.034       0.048
Figure 3B        0.047       0.031
Figure 3C        0.039       0.007
Figure 3D        0.013       0.0012

The parameter values are given in the legends to Figures 2 and 3.

to determine whether a deleterious allele has a significant probability of surfing to a high frequency. It implies that expanding populations could accumulate deleterious mutations at a faster rate than equilibrium populations, which could potentially explain the observed excess of deleterious alleles in Europeans (Lohmueller et al. 2008) or the collapse of some invading species (Cooling et al. 2011).
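The rule of thumb $Ns \le 1$ can be illustrated with the diffusion result of Kimura (1962) for a single randomly mating population. The sketch below is an assumption-laden illustration, not part of the theory developed here: it uses genic selection with fitnesses $1$, $1 + s$, and $1 + 2s$ ($s < 0$ for a deleterious allele), sets the effective size equal to $N$, and starts the allele at one copy. It does not use the $s_1$, $s_2$ parameterization or the effective coefficients of Equations 10 and 11, and the function name is hypothetical.

```python
import math


def fixation_probability(N, s, p=None):
    """Kimura's diffusion approximation under genic selection (assumed model).

    u(p) = (1 - exp(-4*N*s*p)) / (1 - exp(-4*N*s)); p defaults to 1/(2N).
    """
    if p is None:
        p = 1.0 / (2 * N)
    if s == 0:
        return p  # neutral limit
    return math.expm1(-4 * N * s * p) / math.expm1(-4 * N * s)


N = 1000
neutral = fixation_probability(N, 0.0)
for Ns in (0.1, 1.0, 5.0):
    s = -Ns / N  # deleterious allele
    ratio = fixation_probability(N, s) / neutral
    print(f"Ns = {Ns}: fixation probability relative to neutral = {ratio:.3g}")
```

Under these assumptions the fixation probability stays within an order of magnitude or so of the neutral value while $Ns$ is at most about 1 and collapses for larger $Ns$. The analogous quantity during a range expansion is the product $k_e s_e$.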

Clines in allele frequency are often attributed to geographic variation in selection intensities. In humans, clines in the frequencies of alleles that cause monogenic diseases are observed and sometimes attributed to unknown environmental conditions (Novembre and Di Rienzo 2009). For example, the Δ508 allele of CFTR associated with cystic fibrosis (Bertranpetit and Calafell 1996) and the C282Y mutation of hemochromatosis (Lucotte and Dieterlen 2003) have a higher frequency in northern than in southern Europe. Our results show that such clines could be created by range expansion in the absence of any geographic variation in selection intensity. Several authors (Handley et al. 2007; DeGiorgio et al. 2009; Hunley et al. 2009) have emphasized the importance of nonequilibrium processes in structuring human and other populations and the need to consider neutral explanations for apparently adaptive patterns.

Our theory also predicts the slope of a gradient in heterozygosity that results from range expansion in a one-dimensional habitat and can be recast in terms of a continuous habitat. This correspondence allows us to obtain a rough estimate of the effective propagule size on the basis of the data of Ramachandran et al. (2005).

Acknowledgments

M.S. was partially supported by a grant from the U.S. National Institutes of Health, R01-GM40282. L.E. was partially supported by a Swiss National Science Foundation grant, 3100A0-126074.

Literature Cited

Austerlitz, F., B. Jung-Muller, B. Godelle, and P. H. Gouyon, 1997 Evolution of coalescence times, genetic diversity and structure during colonization. Theor. Popul. Biol. 51: 148–164.

Bertranpetit, J., and F. Calafell, 1996 Genetic and geographical variability in cystic fibrosis: evolutionary considerations, pp. 97–118 in Variation in the Human Genome, edited by D. Chadwick and E. Cardew. John Wiley & Sons, New York.

Cooling, M., S. Hartley, D. A. Sim, and P. J. Lester, 2011 The widespread collapse of an invasive species: Argentine ants (Linepithema humile) in New Zealand. Biol. Lett. DOI: 10.1098/rsbl.2011.1014.

Crow, J. F., and M. Kimura, 1970 An Introduction to Population Genetics Theory. Harper & Row, New York.

DeGiorgio, M., M. Jakobsson, and N. A. Rosenberg, 2009 Explaining worldwide patterns of human genetic variation using a coalescent-based serial founder model of migration outward from Africa. Proc. Natl. Acad. Sci. USA 106: 16057–16062.

DeGiorgio, M., J. H. Degnan, and N. A. Rosenberg, 2011 Coalescence-time distributions in a serial founder model of human evolutionary history. Genetics 189: 579–593.

Deshpande, O., S. Batzoglou, M. W. Feldman, and L. L. Cavalli-Sforza, 2009 A serial founder effect model for human settlement out of Africa. Proc. R. Soc. B 276: 291–300.

Edmonds, C. A., A. S. Lillie, and L. L. Cavalli-Sforza, 2004 Mutations arising in the wave front of an expanding population. Proc. Natl. Acad. Sci. USA 101: 975–979.

Ewens, W. J., 2004 Mathematical Population Genetics: I. Theoretical Introduction. Springer-Verlag, New York.

Excoffier, L., and N. Ray, 2008 Surfing during population expansions promotes genetic revolutions and structuration. Trends Ecol. Evol. 23: 347–351.

Excoffier, L., M. Foll, and R. J. Petit, 2009 Genetic consequences of range expansions. Annu. Rev. Ecol. Evol. Syst. 40: 481–501.

Felsenstein, J., 1975 A pain in the torus: some difficulties with models of isolation by distance. Am. Nat. 109: 359–368.

Hallatschek, O., 2011 The noisy edge of traveling waves. Proc. Natl. Acad. Sci. USA 108: 1783–1787.

Hallatschek, O., and D. R. Nelson, 2008 Gene surfing in expanding populations. Theor. Popul. Biol. 73: 158–170.

Hallatschek, O., and D. R. Nelson, 2010 Life at the front of an expanding population. Evolution 64: 193–206.

Hallatschek, O., P. Hersen, S. Ramanathan, and D. R. Nelson, 2007 Genetic drift at expanding frontiers promotes gene segregation. Proc. Natl. Acad. Sci. USA 104: 19926–19930.

Handley, L. J. L., A. Manica, J. Goudet, and F. Balloux, 2007 Going the distance: human population genetics in a clinal world. Trends Genet. 23: 432–439.

Hunley, K. L., M. E. Healy, and J. C. Long, 2009 The global pattern of gene identity variation reveals a history of long-range migrations, bottlenecks, and local mate exchange: implications for biological race. Am. J. Phys. Anthropol. 139: 35–46.

Kimura, M., 1962 On the probability of fixation of mutant genes in a population. Genetics 47: 713–719.

Klopfstein, S., M. Currat, and L. Excoffier, 2006 The fate of mutations surfing on the wave of a range expansion. Mol. Biol. Evol. 23: 482–490.

Li, J. Z., D. M. Absher, H. Tang, A. M. Southwick, A. M. Casto et al., 2008 Worldwide human relationships inferred from genome-wide patterns of variation. Science 319: 1100–1104.

Liu, H., F. Prugnolle, A. Manica, and F. Balloux, 2006 A geographically explicit genetic model of worldwide human-settlement history. Am. J. Hum. Genet. 79: 230–237.

Lohmueller, K. E., A. R. Indap, S. Schmidt, A. R. Boyko, R. D. Hernandez et al., 2008 Proportionally more deleterious genetic variation in European than in African populations. Nature 451: 994–997.

Lucotte, G., and F. Dieterlen, 2003 A European allele map of the C282Y mutation of hemochromatosis: Celtic vs. Viking origin of the mutation? Blood Cells Mol. Dis. 31: 262–267.

Malécot, G., 1975 Heterozygosity and relationship in regularly subdivided populations. Theor. Popul. Biol. 8: 212–241.

Nagylaki, T., 1978a Random genetic drift in a cline. Proc. Natl. Acad. Sci. USA 75: 423–426.

Nagylaki, T., 1978b The geographical structure of populations. Stud. Math. 16: 588–624.

Novembre, J., and A. Di Rienzo, 2009 Spatial patterns of variation due to natural selection in humans. Nat. Rev. Genet. 10: 745–755.

Prugnolle, F., A. Manica, and F. Balloux, 2005 Geography predicts neutral genetic diversity of human populations. Curr. Biol. 15: R159–R160.

Ramachandran, S., O. Deshpande, C. C. Roseman, N. A. Rosenberg, M. W. Feldman et al., 2005 Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc. Natl. Acad. Sci. USA 102: 15942–15947.

Sawyer, S., 1976 Branching diffusion processes in population genetics. Adv. Appl. Probab. 8: 659–689.

Travis, J. M. J., T. Munkemuller, O. J. Burton, A. Best, C. Dytham et al., 2007 Deleterious mutations can surf to high densities on the wave front of an expanding population. Mol. Biol. Evol. 24: 2334–2343.

Vlad, M. O., L. L. Cavalli-Sforza, and J. Ross, 2004 Enhanced (hydrodynamic) transport induced by population growth in reaction-diffusion systems with application to population genetics. Proc. Natl. Acad. Sci. USA 101: 10249–10253.

Communicating editor: W. Stephan

Appendix A

Derivation of Effective Number of Founders

The model assumes that population 1 is founded by $k$ individuals from population 0 in which the frequency of A is $x_0$. After population 1 is founded, it grows immediately to size $N$. Each generation, it receives immigrants from population 0 at a rate $m$ per generation for $T$ generations. Let the frequency of A in the newly founded population be $x$. Following the notation in Crow and Kimura (1970, Chap. 7.3), assuming the 2N copies are drawn with replacement,

$$E(x) = x_0$$

and

$$E\left[(x - x_0)^2\right] = \frac{x_0(1 - x_0)}{2k},$$

where $k$ is the number of founding individuals and $E(\cdot)$ denotes the expectation.

Next we find what happens after $T$ generations of random mating with continued immigration from population 0. First, the population size grows to $N$ by sampling with replacement from the $k$ founders, which changes $x$ to

$$x' = x + \varepsilon,$$

where $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = x(1 - x)/(2N)$. Migration changes $x'$ to

$$x'' = (1 - m)x' + m x_0.$$

Therefore

$$E(x'') = (1 - m)E(x) + m x_0 = x_0.$$

To derive a recursion equation for the variance in $x$ in generation $t$, $V(t)$, we take the square

$$(x'')^2 = (1 - m)^2 x^2 + (1 - m)^2 \varepsilon^2 + m^2 x_0^2 + 2(1 - m)^2 x\varepsilon + 2m(1 - m)x_0 x + 2m(1 - m)x_0 \varepsilon$$

and the expectation of both sides. Writing the variance of $x''$ as $V(t + 1)$, we obtain

$$V(t + 1) + x_0^2 = (1 - m)^2\left[V(t) + x_0^2\right] + (1 - m)^2\,\frac{x_0(1 - x_0) - V(t)}{2N} + 2m(1 - m)x_0^2 + m^2 x_0^2,$$

which implies

$$V(t + 1) = aV(t) + (1 - m)^2\,\frac{x_0(1 - x_0)}{2N},$$

where $a \approx 1 - 2m - 1/(2N)$ and the approximate equality assumes $m \ll 1$ and $N \gg 1$, which is the case of interest in the present application.

This is a linear recursion equation that has the solution

$$V(t) = c_1 a^t + c_2.$$

The constants $c_1$ and $c_2$ are chosen to satisfy the initial condition and the recursion itself:

$$c_1 + c_2 = V_0$$

$$c_2 = a c_2 + \frac{x_0(1 - x_0)}{2N},$$

where we are ignoring terms of order $m/N$ and $m^2$. Therefore

$$c_2 = \frac{x_0(1 - x_0)}{2N(1 - a)}$$

$$c_1 = V_0 - \frac{x_0(1 - x_0)}{2N(1 - a)}.$$

The solution has to satisfy $V_0 = x_0(1 - x_0)/(2k)$, which implies

$$c_1 = x_0(1 - x_0)\left[\frac{1}{2k} - \frac{1}{2N(1 - a)}\right]$$

$$c_2 = \frac{x_0(1 - x_0)}{2N(1 - a)}.$$

After $T$ generations

$$V(T) = x_0(1 - x_0)\left[\frac{1}{2k} - \frac{1}{2N(1 - a)}\right]a^T + \frac{x_0(1 - x_0)}{2N(1 - a)} = \frac{x_0(1 - x_0)}{2k}\,a^T + \frac{x_0(1 - x_0)}{2N(1 - a)}\left(1 - a^T\right).$$
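The closed-form variance can be checked against direct iteration of the recursion. In this sketch (function names and parameter values are illustrative assumptions), the iteration uses the coefficient $(1 - m)^2[1 - 1/(2N)]$ that multiplies $V(t)$ in the expectation equation before any approximation, whereas the closed form uses $a \approx 1 - 2m - 1/(2N)$ and drops terms of order $m/N$ and $m^2$. The two should therefore agree closely, but not exactly, when $m$ is small.

```python
def variance_by_iteration(x0, k, N, m, T):
    """Iterate V(t+1) = a*V(t) + (1-m)^2 * x0*(1-x0)/(2N) from V(0) = x0*(1-x0)/(2k)."""
    a = (1 - m) ** 2 * (1 - 1 / (2 * N))   # coefficient of V(t), unapproximated
    b = (1 - m) ** 2 * x0 * (1 - x0) / (2 * N)
    V = x0 * (1 - x0) / (2 * k)
    for _ in range(T):
        V = a * V + b
    return V


def variance_closed_form(x0, k, N, m, T):
    """V(T) from the solution above, with a approximated as 1 - 2m - 1/(2N)."""
    a = 1 - 2 * m - 1 / (2 * N)
    h = x0 * (1 - x0)
    return h / (2 * k) * a**T + h / (2 * N * (1 - a)) * (1 - a**T)


# Illustrative values only.
args = dict(x0=0.1, k=100, N=1000, m=0.01, T=5)
v_iter = variance_by_iteration(**args)
v_closed = variance_closed_form(**args)
print(f"iterated: {v_iter:.4e}  closed form: {v_closed:.4e}")
```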

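To define an effective number of founders, we solve

$$\frac{x_0(1 - x_0)}{2k_e} = \frac{x_0(1 - x_0)}{2k}\,a^T + \frac{x_0(1 - x_0)}{2N(1 - a)}\left(1 - a^T\right)$$

for $k_e$ to obtain

$$k_e = \frac{1}{a^T/k + \left(1 - a^T\right)/\bigl(N(1 - a)\bigr)}.$$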

$k_e$ may be larger or smaller than $k$. In general, $k_e$ increases with increasing $m$ but may increase or decrease with increasing $T$. Continued gene flow increases the effective number of founders because it reduces the variance in $x$ at the time the next population is founded.
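A short numerical sketch makes the dependence of $k_e$ on $m$ and $T$ concrete. The function name and the parameter grid are illustrative assumptions; the formula is the expression for $k_e$ derived above, with $a \approx 1 - 2m - 1/(2N)$.

```python
def effective_founders(k, N, m, T):
    """k_e = 1 / (a^T / k + (1 - a^T) / (N * (1 - a))), a = 1 - 2m - 1/(2N)."""
    a = 1 - 2 * m - 1 / (2 * N)
    return 1.0 / (a**T / k + (1 - a**T) / (N * (1 - a)))


# Illustrative grid: k = 100 founders, local size N = 1000.
for m in (0.0, 0.01, 0.05):
    row = ", ".join(f"T={T}: {effective_founders(100, 1000, m, T):.1f}"
                    for T in (1, 5, 20))
    print(f"m = {m}: {row}")
```

With no migration, drift during the $T$ generations before the next founding event pushes $k_e$ below $k$. Increasing $m$ raises $k_e$ and, for these values, can lift it above $k$.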

In this derivation, we have not included the effect of migration from the newly founded population back into the source of the colonists. This back migration would change $x_0$ slightly. We justify ignoring this change because it is of the same order of magnitude as the migration rate, $m$, and hence will modify the effect of immigration on the variance in the newly founded population by a term that is only of order $m^2$.

Appendix B

Derivation of the Effective Selection Coefficients

To model the effect of selection during the $T$ generations, we assume that A is the deleterious allele and that the relative fitnesses of individuals with 2, 1, or 0 copies of A are $1 - s_2$, $1 - s_1$, and 1. The deterministic change in the frequency of A after one generation is

$$\Delta x = \frac{-s_2 x^2(1 - x) - s_1 x(1 - x)(1 - 2x)}{1 - s_2 x^2 - 2s_1 x(1 - x)},$$

where $x$ is the frequency of A. Our interest is in the case in which the selection coefficients are small and A is in low frequency. Under these assumptions, when A has intermediate dominance ($s_1 > 0$), this equation can be approximated by

$$\Delta x \approx -s_1 x,$$

which has the solution

$$x(T) = x(0)(1 - s_1)^T \approx x(0)(1 - Ts_1)$$

after $T$ generations if $s_1$ is small. With this approximation, we can summarize the net effect of selection by replacing $s_1$ by an effective selection coefficient $s_{1,e} = Ts_1$.
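As a check on the additive approximation, the full one-generation change can be iterated and compared with $x(0)(1 - Ts_1)$. The sketch assumes the fitness scheme stated above ($1 - s_2$, $1 - s_1$, and 1); the function names and numerical values are illustrative.

```python
def delta_x(x, s1, s2):
    """One-generation change in the frequency of A under viability selection."""
    num = -s2 * x**2 * (1 - x) - s1 * x * (1 - x) * (1 - 2 * x)
    den = 1 - s2 * x**2 - 2 * s1 * x * (1 - x)   # mean fitness
    return num / den


def frequency_after(x0, s1, s2, T):
    """Apply delta_x for T generations, with no drift."""
    x = x0
    for _ in range(T):
        x += delta_x(x, s1, s2)
    return x


# Illustrative values: a rare, weakly deleterious additive allele.
x0, s1, s2, T = 0.01, 0.005, 0.01, 5
exact = frequency_after(x0, s1, s2, T)
approx = x0 * (1 - T * s1)   # uses the effective coefficient T * s1
print(f"iterated: {exact:.6f}  approximation: {approx:.6f}")
```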

When A is recessive ($s_1 = 0$), the equation of change can be approximated by

$$\Delta x \approx -s_2 x^2.$$

This difference equation does not have a closed-form solution but, if $s_2$ is small, it can be approximated by a differential equation that does,

$$\frac{dx}{dt} = -s_2 x^2,$$

which has the solution

$$x(T) = \frac{x(0)}{1 + s_2 T x(0)}.$$

Once again, we can define the effective selection coefficient to be $s_{2,e} = Ts_2$.
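The recessive case can be checked in the same spirit by iterating the difference equation $\Delta x \approx -s_2 x^2$ and comparing the result with the solution of the differential equation. Names and values in this sketch are illustrative assumptions.

```python
def iterate_recessive(x0, s2, T):
    """Iterate x -> x - s2 * x^2 for T generations."""
    x = x0
    for _ in range(T):
        x -= s2 * x**2
    return x


def recessive_closed_form(x0, s2, T):
    """Solution of dx/dt = -s2 * x^2: x(T) = x(0) / (1 + s2 * T * x(0))."""
    return x0 / (1 + s2 * T * x0)


# Illustrative values: a recessive deleterious allele at 5% frequency.
x0, s2, T = 0.05, 0.02, 5
x_iter = iterate_recessive(x0, s2, T)
x_closed = recessive_closed_form(x0, s2, T)
print(f"iterated: {x_iter:.6f}  closed form: {x_closed:.6f}")
```

Only the product $Ts_2$ enters the closed form, which is what motivates the effective coefficient $s_{2,e} = Ts_2$.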