The region of chromosome 2 (234,876,004–234,884,481 bp; NCBI build 34) within ENr131.2q37 contains 36 SNPs, with zero obligate recombination events in the CEU samples.

A haplotype map of the human genome. HapMap, Nature 2005

“The discovery of [..] Coalescent is arguably one of the most important theoretical discoveries in all of biology over the past 50 years.” Nielsen 2016 Genetics



Page 2:

The coalescent for sample sizes greater than two

[Figure: genealogy of a sample of four lineages against Generations; T2 = 2 gens, T3 = 14 gens, T4 = 2 gens; TMRCA = 18 gens]

   

Page 3:

The coalescent for sample sizes greater than two

[Figure: genealogy of a sample of four lineages against Generations; T2 = 9 gens, T3 = 4 gens, T4 = 2 gens; TMRCA = 15 gens]
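The two genealogies above differ only in their randomly drawn interval lengths. A minimal sketch of how such intervals could be drawn, assuming the standard continuous-time approximation in which the time spent with k lineages is exponential with rate C(k, 2) / 2N per generation. The function name and the parameter `two_n` (the number of gene copies in the population) are hypothetical and do not come from the slides.

```python
import random
from math import comb

def simulate_coalescent_times(n, two_n, rng):
    """Return {k: T_k}, the generations spent with k lineages, for k = n..2.

    Assumption: T_k is exponential with rate C(k, 2) / two_n per generation.
    """
    times = {}
    for k in range(n, 1, -1):
        rate = comb(k, 2) / two_n  # per-generation coalescence rate with k lineages
        times[k] = rng.expovariate(rate)
    return times

rng = random.Random(1)  # fixed seed so the draw is repeatable
times = simulate_coalescent_times(4, 20, rng)
tmrca = sum(times.values())  # T_MRCA is the sum of all interval lengths
print(times, tmrca)
```

Re-running with a different seed gives a different tree, which is the point the paired figures make.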

   

Page 4:

The coalescent for sample sizes greater than two

From Nordborg 2000

You don’t increase the information about the tree much from increasing your sample size

From Nordborg review

There is little information about underlying processes in a single genealogy
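One way to see why larger samples add little is to look at the expected time to the most recent common ancestor. The sketch below assumes the standard neutral coalescent, where the expected time with k lineages is 1 / C(k, 2) in units of 2N generations; this formula is a textbook result and is not stated on the slide, and the function name is hypothetical.

```python
def expected_tmrca(n):
    """Expected T_MRCA for a sample of n, in units of 2N generations.

    Sums E[T_k] = 1 / C(k, 2) = 2 / (k * (k - 1)) over k = 2..n,
    which equals 2 * (1 - 1/n).
    """
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))

for n in (2, 4, 10, 100):
    print(n, round(expected_tmrca(n), 3))
```

Under these assumptions the expected depth of the tree approaches 2 as n grows, so most of the depth is already captured by a small sample.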

Page 5:

The coalescent for sample sizes greater than two

[Figure: genealogy with mutations against Generations; T2 = 9 gens, T3 = 4 gens, T4 = 2 gens; sequence labels: CCTGA, CGTGA, TCTGA, CGTCA, TCTGA, CGTCA, CGTGA, CGTGA]

Given the total length of the tree L, the expected number of segregating sites is Lμ
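As a worked instance of the Lμ statement, the sketch below computes the total branch length from the interval lengths in the figure, reading them as T4 = 2, T3 = 4 and T2 = 9 generations (the last label is taken to be T4, which is an assumption). The mutation rate `mu` is a hypothetical value chosen only for illustration.

```python
def total_tree_length(times):
    """Total branch length: each interval T_k is carried by k lineages."""
    return sum(k * t for k, t in times.items())

times = {4: 2, 3: 4, 2: 9}      # generations spent with 4, 3 and 2 lineages
L = total_tree_length(times)    # 4*2 + 3*4 + 2*9
mu = 1e-3                       # hypothetical mutation rate per generation for the locus
expected_segregating_sites = L * mu
print(L, expected_segregating_sites)
```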

Page 6:

The frequency of an allele in the sample reflects the branch in the genealogy on which the mutation occurred.

[Figure: genealogy with sequence labels CCTGA, CGTGA, CGTGT, TCTGA, TCTGA, TCTGA, CCTCA, CGTGA, CGTGA, CGTGT, CCTCA; histogram of sites by # copies (1, 2, 3)]
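The histogram by number of copies can be tabulated directly from aligned sequences. This is a minimal sketch, assuming the ancestral sequence is known and every site has at most two alleles; the sample and the ancestral sequence below are hypothetical and are not the ones drawn on the slide.

```python
def unfolded_sfs(haplotypes, ancestral):
    """Count sites by number of derived copies.

    Returns a list where entry i-1 is the number of sites at which
    exactly i sampled sequences differ from the ancestral base.
    """
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for pos, anc in enumerate(ancestral):
        derived = sum(1 for h in haplotypes if h[pos] != anc)
        if 0 < derived < n:  # skip sites that are not segregating in the sample
            sfs[derived - 1] += 1
    return sfs

sample = ["CGTGA", "CGTGT", "TCTGA", "TCTGA", "TCTGA", "CCTCA"]  # hypothetical sample
spectrum = unfolded_sfs(sample, "CGTGA")                         # hypothetical ancestor
print(spectrum)
```

A mutation on a branch leading to i sampled tips shows up as a site with i derived copies, which is how the spectrum reflects the shape of the genealogy.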

Page 7:

Frequency spectrum for derived neutral alleles

Expectations for the unfolded site-frequency spectra

Sample size   Theoretical spectrum (singletons, doubletons, ...)
2             θ
3             θ, θ/2
4             θ, θ/2, θ/3
5             θ, θ/2, θ/3, θ/4

General solution: Expected count of sites at frequency i is equal to θ/i.

Curious property: As sample size increases, number of singletons, doubletons, etc. stays the same, just a new category is added.

[Figure: sample sequences TCTGA, TCTGA, TCTGA, CCTCA, CCTCA]
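The θ/i rule and the "curious property" can be checked in a few lines. This is only a sketch of the expectation stated above; the function name and the value of θ are hypothetical.

```python
def expected_sfs(n, theta):
    """Expected unfolded spectrum for sample size n: theta / i for i = 1..n-1."""
    return [theta / i for i in range(1, n)]

theta = 6.0  # hypothetical value, chosen so the entries are round numbers
for n in (2, 3, 4, 5):
    print(n, expected_sfs(n, theta))

# Growing the sample leaves the existing classes unchanged and appends one more.
assert expected_sfs(5, theta)[:3] == expected_sfs(4, theta)
```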

Page 8:

A balance between mutation, drift and selection could explain our frequency spectrum for NS sites

Page 9:

What does the genealogy look like in a recently expanded population?

[Figure: A growing population; time axis T; population sizes labelled No and NT]

Slower coal. rate towards the tips, compared with a stationary population.

Excess of singletons
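The slower coalescence near the tips can be illustrated by tracking the chance that a pair of lineages has not yet coalesced. The sketch assumes a per-generation pairwise coalescence probability of 1 / (2N), with N allowed to change going back in time; the size histories and all numbers are hypothetical.

```python
import math

def prob_no_coalescence(size_at, generations):
    """Probability that two lineages remain distinct after the given number
    of generations back in time, where size_at(t) is the diploid size N
    at t generations before the present."""
    p = 1.0
    for t in range(generations):
        p *= 1.0 - 1.0 / (2.0 * size_at(t))
    return p

def stationary(t):
    return 1000.0                          # constant size (hypothetical)

def expanded(t):
    return 10000.0 * math.exp(-0.02 * t)   # large today, smaller in the past (hypothetical)

print(prob_no_coalescence(stationary, 100))
print(prob_no_coalescence(expanded, 100))
```

With these numbers the expanded population keeps lineages apart for longer in the recent past, which lengthens the external branches and is consistent with an excess of singletons.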

Page 10:

What does the genealogy look like if there was a recent bottleneck?

[Figure: A bottlenecked population; axes T and N; labels "neutral No" and "faster coal. here"]

Can lead to excess of intermediate freq. alleles
Leading to high variance
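The "faster coal. here" label can be made concrete by comparing how likely a pair of lineages is to coalesce during a short period at small versus large population size. The sketch again assumes a per-generation pairwise coalescence probability of 1 / (2N); the sizes and durations are hypothetical.

```python
def prob_coalesce_within(n_diploid, generations):
    """Probability that two lineages coalesce within the given number of
    generations at constant diploid size n_diploid."""
    return 1.0 - (1.0 - 1.0 / (2.0 * n_diploid)) ** generations

outside_bottleneck = prob_coalesce_within(10000, 100)  # hypothetical large size
inside_bottleneck = prob_coalesce_within(50, 100)      # hypothetical bottleneck size
print(outside_bottleneck, inside_bottleneck)
```

Lineages that happen to pass through the bottleneck without coalescing go on to coalesce much later, so some loci end up with deep trees and others with shallow ones, which is one way to read the "high variance" remark.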

Page 12:

[Figure: axes T and N; Locus 1, 2, 3, 4, 5]

Page 13:

[Figure: axes T and N]

Page 14:

[Figure: Effective population size (x10^4) against Generations back]

Page 17:

[Figure: Generations axis, from Past to Present]

● Barriers to migration amongst populations
  Allow independent drift and differentiation
  Lower rates of migration &/or smaller population sizes (higher rates of drift)

Page 18:

Drift creates allelic differentiation between populations if migration rates are low & divergence times are long
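The dependence on migration rate and population size can be illustrated with a common textbook approximation that is not given on the slides: under Wright's island model at equilibrium, FST is roughly 1 / (1 + 4Nm). The sketch below is offered under that assumption only, and it does not model divergence time; the function name and parameter values are hypothetical.

```python
def island_model_fst(n_diploid, migration_rate):
    """Approximate equilibrium FST under the island model: 1 / (1 + 4 N m).

    Assumption: many equally sized populations exchanging migrants at a
    constant per-generation rate m, with mutation ignored.
    """
    return 1.0 / (1.0 + 4.0 * n_diploid * migration_rate)

for n_diploid, m in [(1000, 0.01), (1000, 0.0001), (100, 0.0001)]:
    print(n_diploid, m, round(island_model_fst(n_diploid, m), 3))
```

Lower migration or a smaller population size both push the value towards 1, matching the direction of the statement above.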