ECSAC09, Veli Lošinj, August 26th 2009

Tracking the genetic legacy of past human populations through the grid


Nicolas Ray, University of Bern / CMPG (University of Geneva & UNEP/GRID-Europe)

Page 1: Tracking the genetic legacy of past human populations through the grid

ECSAC09, Veli Lošinj, August 26th 2009

Page 2: Tracking the genetic legacy of past human populations through the grid
Page 3: Tracking the genetic legacy of past human populations through the grid

Human migrations

Adapted from Cavalli-Sforza & Feldman, 2003

[12,000]

[55,000]

Homo sapiens sapiens

Page 4: Tracking the genetic legacy of past human populations through the grid
Page 5: Tracking the genetic legacy of past human populations through the grid

1. Better understand human evolution

• Origin of modern humans (when, where, how many?)

• Relationship with other members of the Homo genus

2. Distinguish between the effects of demography and those of selection (biomedical applications)

Page 6: Tracking the genetic legacy of past human populations through the grid

Gene-specific factors: mutations, recombination, selection

A complex past demography: fluctuations in effective population size, substructure, migrations

Observed patterns of genetic diversity in contemporary populations

Page 7: Tracking the genetic legacy of past human populations through the grid
Page 8: Tracking the genetic legacy of past human populations through the grid

50 loci in non-genic regions (Chen and Li, 2001)

• About 500 bp each, 24,425 bp in total

• 30 individuals: 10 Africans, 8 Asians, 12 Amerindians

• Chimpanzee sequenced to estimate mutation rates, assuming a 6 My divergence time

Statistical Evaluation of Alternative Models of Human Evolution. Nelson Fagundes, Nicolas Ray, Mark Beaumont, Samuel Neuenschwander, Francisco Salzano, Sandro Bonatto, and Laurent Excoffier. 2007. PNAS 104: 17614-17619

Page 9: Tracking the genetic legacy of past human populations through the grid

Models

[Figure: eight candidate models, each drawn as a tree relating the AF, AS and AM populations along a time axis. Model labels: AFRIG, AFREG, ASIG, ASEG, MRE1S, MRE2S, MREBIG, MREBEG. Model families: African replacement, Assimilation, Multiregional evolution.]

Page 10: Tracking the genetic legacy of past human populations through the grid

Model parameters and priors

Page 11: Tracking the genetic legacy of past human populations through the grid

Simulations

[Figure: genealogy drawn across Africa, Asia and the Americas, with a time axis.]

Coalescence theory

A retrospective model of population genetics

Traces all copies of a gene in a sample from a population to a single ancestral copy shared by all members (MRCA)

Assumes no recombination, no selection
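The retrospective idea can be sketched in a few lines. The example below is a minimal illustration rather than the simulator used in the talk. It assumes a single constant-size population with no recombination and no selection, and measures time in units of 2N generations. The function names are hypothetical.

```python
import random

def simulate_coalescent(n_samples, rng):
    """Simulate one genealogy backward in time for n_samples gene copies.

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k(k-1)/2 (time in units of 2N generations).
    Returns (time to the MRCA, total branch length of the tree).
    """
    tmrca = 0.0
    total_length = 0.0
    for k in range(n_samples, 1, -1):
        wait = rng.expovariate(k * (k - 1) / 2.0)
        tmrca += wait               # the tree gets deeper
        total_length += k * wait    # k branches are extended by this interval
    return tmrca, total_length

def mean_tmrca(n_samples, n_replicates, seed=1):
    """Average time to the MRCA over independent genealogies."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_replicates):
        tmrca, _length = simulate_coalescent(n_samples, rng)
        total += tmrca
    return total / n_replicates
```

Under these assumptions the expected time to the MRCA is 2(1 - 1/n), so `mean_tmrca(30, 20000)` should come out close to 1.93.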

Page 12: Tracking the genetic legacy of past human populations through the grid

Simulated genealogy

Mutation model

[Figure: simulated genealogy with mutations, with example sequences TCCTTGTA…ATTGGT and ACCGAGTA…GTTGGT.]

Summary statistics

– Within population:
  • S,

– Between populations:
  • Pairwise FST
  • Global FST

– Globally:
  • S,
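Two of the statistics named on this slide can be computed directly from aligned sequences. The sketch below is illustrative only. It counts segregating sites (S) and computes a pairwise FST as one minus the ratio of mean within-population to mean between-population pairwise differences. That estimator is an assumption, since the slides do not say which one was used. All function names are hypothetical.

```python
from itertools import combinations, product

def segregating_sites(seqs):
    """S: number of alignment columns showing more than one state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def mean_within(seqs):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(seqs, 2))
    return sum(differences(a, b) for a, b in pairs) / len(pairs)

def mean_between(pop1, pop2):
    """Mean pairwise differences between sequences of two populations."""
    pairs = list(product(pop1, pop2))
    return sum(differences(a, b) for a, b in pairs) / len(pairs)

def pairwise_fst(pop1, pop2):
    """FST = 1 - (mean within-population diversity / between-population diversity)."""
    within = (mean_within(pop1) + mean_within(pop2)) / 2.0
    between = mean_between(pop1, pop2)
    return 1.0 - within / between if between > 0 else 0.0

# Toy data: two populations of two sequences each.
pop_a = ["AAAA", "AAAT"]
pop_b = ["TAAA", "TAAT"]
s_global = segregating_sites(pop_a + pop_b)   # 2 variable sites overall
fst_ab = pairwise_fst(pop_a, pop_b)           # 1 - 1.0 / 1.5
```

In the toy data, mean within-population diversity is 1.0 and between-population diversity is 1.5, giving an FST of one third.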

Page 13: Tracking the genetic legacy of past human populations through the grid

Approximate Bayesian Computations (ABC)

The rejection-sampling approach:

• Calculate summary statistics (S) for the observed data set

• Draw parameter values φ’ from prior distributions, and use them to simulate data

• Calculate summary statistics (S’) on the simulated data set and compare them to the observations: δ = ||S - S’|| (Euclidean distance)

• Accept φ’ if δ is arbitrarily small, otherwise reject the sample

The ABC approach (Beaumont et al. 2002)

Modification: a local regression is added within the set of accepted φ’ values
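The two steps above, rejection followed by a local regression, can be sketched on a toy problem. Everything specific here is an assumption made for illustration. It uses a single parameter θ with a uniform prior and a single summary statistic (mean number of segregating sites over loci), so the Euclidean distance reduces to an absolute difference. The tolerance is set as the closest fraction of simulations, and the regression is weighted with Epanechnikov weights. The simulator is a stand-in, not the model from the talk.

```python
import random

def poisson(mean, rng):
    """Poisson draw: count unit-rate arrivals falling in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_mean_s(theta, n_samples, n_loci, rng):
    """Toy simulator: mean number of segregating sites over independent loci."""
    total = 0
    for _ in range(n_loci):
        # Total tree length of a standard coalescent, in units of 2N generations.
        tree_length = sum(k * rng.expovariate(k * (k - 1) / 2.0)
                          for k in range(n_samples, 1, -1))
        total += poisson(theta / 2.0 * tree_length, rng)
    return total / n_loci

def abc_with_regression(s_obs, n_sims=2000, tolerance=0.1, prior_max=20.0,
                        n_samples=10, n_loci=10, seed=1):
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(0.0, prior_max)               # draw from the prior
        s_sim = simulate_mean_s(theta, n_samples, n_loci, rng)
        draws.append((abs(s_sim - s_obs), theta, s_sim))  # distance delta
    draws.sort()
    accepted = draws[:max(2, round(tolerance * n_sims))]  # rejection step
    d_max = accepted[-1][0]
    # Epanechnikov weights: closer simulations count more in the regression.
    weights = [1.0 - (d / d_max) ** 2 if d_max > 0 else 1.0 for d, _, _ in accepted]
    if sum(weights) == 0.0:
        weights = [1.0] * len(accepted)
    w_sum = sum(weights)
    s_mean = sum(w * s for w, (_, _, s) in zip(weights, accepted)) / w_sum
    t_mean = sum(w * t for w, (_, t, _) in zip(weights, accepted)) / w_sum
    cov = sum(w * (s - s_mean) * (t - t_mean) for w, (_, t, s) in zip(weights, accepted))
    var = sum(w * (s - s_mean) ** 2 for w, (_, _, s) in zip(weights, accepted))
    slope = cov / var if var > 0 else 0.0
    rejection = [t for _, t, _ in accepted]
    # Regression adjustment: project each accepted value onto s_obs.
    adjusted = [t - slope * (s - s_obs) for _, t, s in accepted]
    return rejection, adjusted
```

Calling `abc_with_regression(s_obs)` returns the plain rejection sample and the regression-adjusted sample side by side, so their spreads can be compared.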

Page 14: Tracking the genetic legacy of past human populations through the grid

Neuenschwander (2006)

Page 15: Tracking the genetic legacy of past human populations through the grid

COMPUTATIONAL ISSUES

1-10 million

Computer clusters

UBELIX (>500 nodes)

Zooblythii (~40 nodes)

Page 16: Tracking the genetic legacy of past human populations through the grid

For ABC, 5 million demographic simulations are necessary to obtain robust parameter estimates

Each demographic simulation is followed by n genetic simulations (n = number of loci)

Example

8 simple models, 50 loci, 30 individuals: 2 CPU-years
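The figures on this slide can be turned into a rough per-simulation cost. This is back-of-the-envelope arithmetic only. It assumes that the 2 CPU-years cover 5 million demographic simulations for each of the 8 models, and that the work splits perfectly across CPUs. The slide states neither.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

n_models = 8
sims_per_model = 5_000_000     # demographic simulations per model (assumed)
n_loci = 50                    # genetic simulations per demographic simulation
cpu_years = 2

demographic_sims = n_models * sims_per_model    # 40 million
genetic_sims = demographic_sims * n_loci        # 2 billion
cpu_seconds = cpu_years * SECONDS_PER_YEAR      # 63,072,000 s

# Average cost of one demographic simulation plus its 50 genetic simulations.
seconds_per_demographic_sim = cpu_seconds / demographic_sims

def wall_clock_days(n_cpus):
    """Elapsed days if the work is spread evenly over n_cpus processors."""
    return cpu_years * 365 / n_cpus
```

Under these assumptions each demographic simulation with its genetic simulations costs about 1.6 CPU-seconds, and 500 CPUs would finish the whole example in roughly a day and a half.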

Page 17: Tracking the genetic legacy of past human populations through the grid

Relative probabilities of models of human evolution

[Figure: relative probabilities of the eight models (AFRIG, AFREG, ASIG, ASEG, MRE1S, MRE2S, MREBIG, MREBEG), each relating the AF, AS and AM populations. Model families: African replacement, Assimilation, Multiregional evolution. Probability values printed in the figure, in extraction order: 0.218, 0.461, 0.422, 0.048, 0.069, 0.958, 0.042, 0.091, 0.909, 0.001, 0.781.]

Page 18: Tracking the genetic legacy of past human populations through the grid

[Figure: three posterior density panels, each shown next to a diagram of the model labelled with the parameters NA-AF, NAF, NAS, NAM, NbMH, NbAS, NbAM, TMH, TAS and TAM. Other values printed on the diagram: 2e-6, 1e-5, 3e-5, 5e-5 and 8,000, 4,000, 1,600.]

• Speciation time: 142 Kya (104 – 186); x-axis ticks from 5,000 to 8,000, density from 0 to 0.0004

• Out-of-Africa time: 51.1 Kya (40.1 – 70.9); x-axis ticks from 2,000 to 4,000, density from 0 to 0.0012

• Americas colonization time: 10.3 Kya (7.6 – 15.9); x-axis ticks from 400 to 1,600, density from 0 to 0.008

Page 19: Tracking the genetic legacy of past human populations through the grid
Page 20: Tracking the genetic legacy of past human populations through the grid

A complex demography

Adapted from Cavalli-Sforza & Feldman, 2003

[10,000]

[55,000]

demographic and spatial expansions

population bottlenecks

fast migration events

population isolation

secondary contacts

Page 21: Tracking the genetic legacy of past human populations through the grid

From environment to demography

Spatial resolution: 100 km

[Map: carrying capacity, from low to high.]

Page 22: Tracking the genetic legacy of past human populations through the grid

From environment to demography

[Map: friction, from low to high.]

Page 23: Tracking the genetic legacy of past human populations through the grid

Demographic simulations

Stepping-stone model (cellular automata)

[Figure: grid of cells (demes), with population size plotted against time within a cell.]
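A stepping-stone simulation on a grid of demes can be sketched as below. This is a deterministic toy under stated assumptions, not SPLATCHE's algorithm. Each generation applies logistic growth toward the local carrying capacity, then sends a fixed fraction of each deme to its four nearest neighbours in equal shares. Friction is left out for brevity. The parameter names and values are hypothetical.

```python
def step(pop, capacity, growth_rate, migration_rate):
    """Advance every deme by one generation: growth first, then migration."""
    rows, cols = len(pop), len(pop[0])
    new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            n, k = pop[i][j], capacity[i][j]
            # Logistic growth within the deme (uninhabitable if k == 0).
            grown = max(0.0, n + growth_rate * n * (1.0 - n / k)) if k > 0 else 0.0
            # Habitable nearest neighbours (up, down, left, right).
            neighbours = [(a, b)
                          for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                          if 0 <= a < rows and 0 <= b < cols and capacity[a][b] > 0]
            emigrants = migration_rate * grown if neighbours else 0.0
            new[i][j] += grown - emigrants
            for a, b in neighbours:
                new[a][b] += emigrants / len(neighbours)
    return new

def simulate(rows, cols, origin, n_generations,
             k=100.0, n0=10.0, growth_rate=0.3, migration_rate=0.05):
    """Expand from a single occupied deme over a uniform environment."""
    capacity = [[k] * cols for _ in range(rows)]
    pop = [[0.0] * cols for _ in range(rows)]
    pop[origin[0]][origin[1]] = n0
    for _ in range(n_generations):
        pop = step(pop, capacity, growth_rate, migration_rate)
    return pop
```

Replacing the uniform `capacity` grid with values derived from an environmental map is where the carrying-capacity layer would enter such a model.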

Page 24: Tracking the genetic legacy of past human populations through the grid

SPLATCHE

SPatiaL And Temporal Coalescences in Heterogeneous Environment

(http://cmpg.unibe.ch/software/splatche)

Page 25: Tracking the genetic legacy of past human populations through the grid

Vegetation maps

[Maps: present potential vegetation; vegetation at the Last Glacial Maximum.]

Ray and Adams. 2001. Internet Archaeology 11

Taking into account altitudes

Expert system

Page 26: Tracking the genetic legacy of past human populations through the grid

Demography and spatial expansion

Population density

Page 27: Tracking the genetic legacy of past human populations through the grid

Dynamic vegetation

[Figure panels: intermediate, LGM, PP.]

Page 28: Tracking the genetic legacy of past human populations through the grid

[Figure: N[t], the number of people per cell (axis from 0 to 280), plotted against generations (axis from 0 to 4,000).]

Page 29: Tracking the genetic legacy of past human populations through the grid

Genetic simulations

Page 30: Tracking the genetic legacy of past human populations through the grid

Computational issues

A fully spatially-explicit model using 500 loci in 800 individuals:

10 CPU-years

Adding long-distance dispersal:

20 CPU-years

Page 31: Tracking the genetic legacy of past human populations through the grid

SPLATCHE on the grid

early 2005: joined the Biomed VO of the EGEE project

mid 2005: tested on GILDA test bed, and deployed on the Grid

since late 2005: testing and improvement

since mid 2006: production mode and optimization

Page 32: Tracking the genetic legacy of past human populations through the grid

Use of SPLATCHE on the grid

N simulations

Independent simulations:
- the more CPUs, the better
- job failures are not that bad

GRID

Posterior distribution of demographic/genetic parameters of interest

Statistical tools

Page 33: Tracking the genetic legacy of past human populations through the grid

Optimizations

5 million simulations

GRID

• Reduction of the number of simulations (Daniel Wegmann): by MCMC. Promising results (~10 times fewer simulations)

• Submission time: multi-threaded application using up to 30 RBs (used for the WISDOM project)

• Fetching time of job outputs: in-house multi-threaded solution for checking status and getting outputs
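The pattern of multi-threaded submission and output fetching can be sketched as follows. It relies on the simulations being independent, so a failed job only costs one sample. Nothing here calls real grid middleware. `run_job` is a hypothetical stand-in that fails about one time in ten, the broker names are invented, and "RBs" is assumed to mean resource brokers.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_job(job_id, broker):
    """Hypothetical stand-in for submitting one simulation and fetching its output."""
    rng = random.Random(job_id)        # per-job generator keeps runs reproducible
    if rng.random() < 0.1:
        raise RuntimeError(f"job {job_id} failed on {broker}")
    return job_id, rng.uniform(0.0, 1.0)   # (job id, fake simulation output)

def run_batch(n_jobs, brokers, max_threads=30):
    """Spread jobs over brokers with a thread pool and collect what succeeds."""
    outputs, failed = {}, []
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = {pool.submit(run_job, j, brokers[j % len(brokers)]): j
                   for j in range(n_jobs)}
        for future in as_completed(futures):
            try:
                job_id, value = future.result()
                outputs[job_id] = value
            except RuntimeError:
                # Independent simulations: a failure just means one sample fewer.
                failed.append(futures[future])
    return outputs, sorted(failed)

outputs, failed = run_batch(200, ["rb-1", "rb-2", "rb-3"])
```

The failed job identifiers are returned so that a caller could resubmit them or simply launch extra simulations to make up the shortfall.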

Page 34: Tracking the genetic legacy of past human populations through the grid

Geographic origin of human dispersal

Ray et al. (2005) Genome Research

Page 35: Tracking the genetic legacy of past human populations through the grid
Page 36: Tracking the genetic legacy of past human populations through the grid

Mutations surfing during a range expansion

Page 37: Tracking the genetic legacy of past human populations through the grid

Mutations surfing during a range expansion

• Some mutations can travel with the wave of advance

• New mutations can reach high frequencies

• More pronounced in small populations

Klopfstein, Currat and Excoffier (2006) MBE 23(3): 482-490

Page 38: Tracking the genetic legacy of past human populations through the grid

Selection?

(2005) Science 309 (5741)

Currat, Excoffier, Maddison, Otto, Ray, Whitlock and Yeaman (2006) Science 313:172a

Page 39: Tracking the genetic legacy of past human populations through the grid

Interactions among populations

Interaction between modern humans and Neanderthals in Europe

Currat & Excoffier (2004), PLoS Biol.

Page 40: Tracking the genetic legacy of past human populations through the grid

Cane toad invasion in Australia

[Map: cane toad invasion in Australia, with a scale bar in kilometers (120 0 120 240 360) and year labels 1982, 1988, 1992, 1995, 1996, 1997, 1998 and 1999. Other map labels: KDM, NW, B, T, RE.]

• Initial introduction site in Australia: GORDONVALE (1935)

• Plausible introduction site 1: LAGOON CREEK (first sighting: 1979)

• Plausible introduction site 2: NORMANTON (first sighting: 1964)

Estoup, A., Baird, S. J. E., Ray, N., Currat, M., Cornuet, J.-M., Santos, F., Beaumont, M. A. and L. Excoffier. Combining genetic, historical and geographic data to reconstruct the dynamics of the bioinvasion of cane toad Bufo marinus. In prep

Page 41: Tracking the genetic legacy of past human populations through the grid

Take-home message

Page 42: Tracking the genetic legacy of past human populations through the grid