The paternal tree of humanity
Doron M Behar, MD, PhD
Family Tree DNA, Houston, Texas
The 12th Genetic Genealogy Conference for Family Tree DNA Group Administrators, November 11-13, 2016
We are establishing it!
Together!!!
One for all and all for One
It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50-100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192-307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47-52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.
The Horowitzes
Migration path ~1400 CE
Genealogy from 1450 CE
YESHAYA HOROVSKY ISH HOROWICE
5 individuals
Horowitz genealogy
Documented
Molecular
Written vs molecular genealogy
Written: 1450 = 566 ybp; 1615 = 401 ybp
Molecular: 546 ybp; 399 ybp
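The written dates on this slide are calendar years converted to years before present (ybp). A minimal sketch of that conversion follows, assuming "present" is 2016 (the year of the talk) and assuming the two molecular estimates pair with the two written dates in the order shown; neither assumption is stated on the slide.

```python
# Assumption: "present" is 2016, the year of the talk.
REFERENCE_YEAR = 2016


def to_ybp(calendar_year, reference_year=REFERENCE_YEAR):
    """Convert a calendar year (CE) to years before present."""
    return reference_year - calendar_year


# Written dates from the slide, paired (by assumption) with the molecular estimates.
written_years = [1450, 1615]
molecular_ybp = [546, 399]

for year, molecular in zip(written_years, molecular_ybp):
    written = to_ybp(year)
    print(f"{year}: written {written} ybp, molecular {molecular} ybp, "
          f"difference {written - molecular} years")
```

Under these assumptions the written and molecular dates differ by 20 years and 2 years respectively.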
This is what you need, right?!
Good, 'cause we are building it!
Whole Y Chromosome
60M bps long
Karmin et al.: We exclude all of Chr Y outside 10.8-Mb sequence
>5x unique coverage
FTDNA: Around 11.5 to 12.5 million base pairs of reliably mapped positions of the non-recombining Y chromosome
Haplogroup Q
Koryaks
A fraction of the tree
162 variants = ~35,000 ybp
Q3
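The figure above implies a rough conversion between a count of variants and elapsed time. The sketch below works out that implied rate and applies it to another count. It assumes simple linear scaling, which is a simplification: a real estimate depends on the size of the covered region and on calibration. The function names and the 10-variant example are hypothetical.

```python
def years_per_variant(variant_count, age_ybp):
    """Average number of years represented by one variant on a branch."""
    return age_ybp / variant_count


def estimate_age(variant_count, rate):
    """Linear age estimate (in ybp) for a branch with `variant_count` variants."""
    return variant_count * rate


# Figures from the slide: 162 variants correspond to roughly 35,000 ybp.
rate = years_per_variant(162, 35_000)
print(f"about {rate:.0f} years per variant")

# Hypothetical branch with 10 variants, for illustration only.
print(f"about {estimate_age(10, rate):.0f} ybp for 10 variants")
```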
Count of mutations
Actual positions
Position   New_HG   Marker    ISOGG name
2692142    Q3       M378_eq   F711 L612
2713850    Q3       M378_eq   F713
2794289    Q3       M378_eq   #N/A
2806676    Q3       M378_eq   #N/A
4113324    Q3       M378_eq   #N/A
6679787    Q3       M378_eq   F803
6718686    Q3       M378_eq   F815
6746675    Q3       M378_eq   #N/A
6753100    Q3       M378_eq   Y1038
6986250    Q3       M378_eq   #N/A
Whole Y Chromosome: in which capture?
Message No 1:
Know the capture you are choosing!
Intra-platform performance
Inter-platform performance
Genotyping platforms: Complete Genomics, Illumina
5 samples were run on both platforms
The overlapping region is 6M bp
Are we identifying the same variants?
Message No 2:
Different platforms are overall concordant!
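One way to ask whether two platforms identify the same variants is to restrict both call sets to the overlapping region and compare them as sets. The sketch below does that with toy data. The positions, the interval, and the concordance measure (shared calls divided by all calls in the overlap) are illustrative assumptions, not the comparison actually used in the talk.

```python
def in_region(position, intervals):
    """True if `position` falls inside any (start, end) interval, inclusive."""
    return any(start <= position <= end for start, end in intervals)


def compare_platforms(calls_a, calls_b, intervals):
    """Compare two sets of (position, ref, alt) calls within the shared region."""
    a = {call for call in calls_a if in_region(call[0], intervals)}
    b = {call for call in calls_b if in_region(call[0], intervals)}
    shared = a & b
    union = a | b
    concordance = len(shared) / len(union) if union else 1.0
    return shared, a - b, b - a, concordance


# Toy data, for illustration only.
platform_a = {(1000, "C", "T"), (2500, "G", "A"), (7000, "A", "G")}
platform_b = {(1000, "C", "T"), (2500, "G", "A"), (9000, "T", "C")}
overlap = [(500, 8000)]

shared, only_a, only_b, concordance = compare_platforms(platform_a, platform_b, overlap)
print(f"shared: {len(shared)}, only A: {len(only_a)}, only B: {len(only_b)}")
print(f"concordance: {concordance:.2f}")
```

Calls outside the overlapping region are ignored on purpose: a variant that only one platform could have seen says nothing about agreement.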
Nomenclature
What is a reference genome?
The reference genome does not represent the ancestral genome!
The reference genome represents a haploid mosaic of different DNA sequences from different donors. For example, GRCh37, the Genome Reference Consortium human genome (build 37), is derived from thirteen anonymous volunteers from Buffalo, New York.
Accordingly, the Y chromosome sequence is an assembly of a few haplogroups.
Root vs Reference
Genome builds
Release name      Date of release   Equivalent UCSC version
GRCh38            Dec 2013          hg38
GRCh37            Feb 2009          hg19
NCBI Build 36.1   Mar 2006          hg18
NCBI Build 35     May 2004          hg17
NCBI Build 34     Jul 2003          hg16
The same variant can be in different positions in different genome builds.
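Because a position only has meaning together with its build, it can help to store the two together and refuse to compare coordinates across builds. The sketch below uses the release names from the table above. The `Coordinate` class and its helper functions are hypothetical names for illustration, and the example position is the GRCh38 position listed for M9 in the VCF table later in the deck.

```python
from dataclasses import dataclass

# Release name -> UCSC name, taken from the genome builds table.
UCSC_NAME = {
    "GRCh38": "hg38",
    "GRCh37": "hg19",
    "NCBI Build 36.1": "hg18",
    "NCBI Build 35": "hg17",
    "NCBI Build 34": "hg16",
}


def canonical_build(name):
    """Return the UCSC name for either a release name or a UCSC name."""
    if name in UCSC_NAME:
        return UCSC_NAME[name]
    if name in UCSC_NAME.values():
        return name
    raise ValueError(f"unknown genome build: {name}")


@dataclass(frozen=True)
class Coordinate:
    build: str
    position: int

    def __post_init__(self):
        # Normalise the build name so "GRCh38" and "hg38" compare equal.
        object.__setattr__(self, "build", canonical_build(self.build))


def same_site(a, b):
    """Compare two coordinates, refusing to mix genome builds."""
    if a.build != b.build:
        raise ValueError("coordinates are on different builds; convert one first")
    return a.position == b.position


m9 = Coordinate("GRCh38", 19568371)
print(m9 == Coordinate("hg38", 19568371))
```

Converting a position from one build to another needs a dedicated liftover tool; this sketch only guards against comparing unconverted numbers.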
Same variant, different name
Message No 3:
Speak the language!
Whole Y sequencing
[Slide graphic: a long unbroken string of raw sequenced bases (A, C, G, T)]
Now we have to align the reads to the Y chromosome reference.
Alignment against the Y chromosome reference
Coverage = 3
Quality control per variant
REF: G
ALT: A
Genotype: HOM
depth: 60
qual_base_calling: 214 (max)
qual_mapping: 60 (max)
qual_genotype: 99 (max)
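Per-variant quality control can be pictured as checking each of these fields against a minimum. The sketch below uses the field names and values from the slide; the thresholds are hypothetical placeholders, since the actual cut-offs are not given in the talk.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    ref: str
    alt: str
    genotype: str
    depth: int
    qual_base_calling: float
    qual_mapping: float
    qual_genotype: float


# Hypothetical minimums, for illustration only.
THRESHOLDS = {
    "depth": 5,
    "qual_base_calling": 30,
    "qual_mapping": 30,
    "qual_genotype": 30,
}


def failed_checks(call, thresholds=THRESHOLDS):
    """Return the names of the fields that fall below their minimum."""
    return [name for name, minimum in thresholds.items()
            if getattr(call, name) < minimum]


# Values shown on the slide.
slide_call = VariantCall("G", "A", "HOM", 60, 214, 60, 99)
# Same call at low depth (hypothetical), to show a failure.
shallow_call = VariantCall("G", "A", "HOM", 3, 214, 60, 99)

print(failed_checks(slide_call))
print(failed_checks(shallow_call))
```

Returning the list of failed fields, rather than a bare pass or fail, makes it easier to see why a position was dropped.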
Intra-platform performance
Message No 4:
Not all positions within the capture will pass QC!
Pipeline for Whole Y analysis
VCF (Variant Call Format)
Variant   GRCh38 position   Reference   Derived
P305      2842113           G           A
L1085     2922685           T           C
Z4762     3953196           A           C
V171      5030624           C           G
M523      6885478           A           G
M522      7305102           G           A
M578      7334662           C           T
L666      7702775           G           A
V221      7721262           G           T
M2308     7822141           A           T
F1154     8513272           T           C
F1206     8572376           C           T
F1329     8720990           C           T
P143      12077161          G           A
M168      12702062          C           T
P97       12774339          G           T
P108      13314368          C           T
M231      13357844          G           A
M214      13360045          T           C
M213      13414871          T           C
L1130     14549130          T           G
P14       15286718          C           T
V168      15835792          G           A
L729      17319728          A           C
F549      17401190          C           T
F3163     19069977          G           A
M9        19568371          C           G
M42       19704954          A           T
M89       19755427          C           T
L1155     20029380          G           C
L735      20977731          G           T
M526      21389038          A           C
F650      21455120          G           A
VCF (Variant Call Format)
Variant   GRCh38 position   Reference   Derived
P305      2842113           G           A
L1085     2922685           T           C
Z4762     3953196           A           C
V171      5030624           C           G
M523      6885478           A           G
M522      7305102           -           -
M578      7334662           C           T
L666      7702775           G           A
V221      7721262           G           T
M2308     7822141           A           T
F1154     8513272           T           C
F1206     8572376           C           T
F1329     8720990           C           T
P143      12077161          G           A
M168      12702062          C           T
P97       12774339          G           T
P108      13314368          C           T
M231      13357844          G           A
M214      13360045          T           C
M213      13414871          T           C
L1130     14549130          T           G
P14       15286718          C           T
V168      15835792          G           A
L729      17319728          A           C
F549      17401190          C           T
F3163     19069977          G           A
M9        19568371          C           G
M42       19704954          A           T
M89       19755427          C           T
L1155     20029380          G           C
L735      20977731          G           T
M526      21389038          A           C
F650      21455120          G           A
The position failed: nothing to worry about.
The position did not fail and shows the reference allele, which means it is a private back mutation!
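The two cases above can be told apart mechanically: a marker expected on the path either has no call at all, or has a call that matches the reference allele. The sketch below classifies three markers from the table under both scenarios. The `classify` function and the observed-allele dictionaries are hypothetical; only the positions and alleles come from the slide.

```python
def classify(expected, observed):
    """Label each expected marker by comparing the observed allele to the table.

    expected: marker -> (position, reference allele, derived allele)
    observed: position -> allele called in the sample (missing if the position failed)
    """
    statuses = {}
    for marker, (position, reference, derived) in expected.items():
        allele = observed.get(position)
        if allele is None:
            statuses[marker] = "no call"      # position failed, nothing to worry about
        elif allele == derived:
            statuses[marker] = "derived"
        elif allele == reference:
            statuses[marker] = "reference"    # candidate private back mutation
        else:
            statuses[marker] = "other"
    return statuses


# Three consecutive rows from the table.
expected = {
    "M523": (6885478, "A", "G"),
    "M522": (7305102, "G", "A"),
    "M578": (7334662, "C", "T"),
}

# Scenario 1 (hypothetical sample): M522 has no call.
failed_position = {6885478: "G", 7334662: "T"}
# Scenario 2 (hypothetical sample): M522 is called and shows the reference allele.
reference_call = {6885478: "G", 7305102: "G", 7334662: "T"}

print(classify(expected, failed_position))
print(classify(expected, reference_call))
```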
Haplogroup labeling
A0'1'2'3'4-L1085*(xV148,V168)
A1'2'3'4-V168*(xM31,P108)
A2'3'4-P108*(xL419,M42)
A4-M42*(xM60,M168)
CDEF-M168*(xM145,P143)
CF-P143*(xM130,M89)
F-M89*(xF1329)
GHIJKLT-F1329*(xM201,M578)
HIJKLT-M578*(xL901,M522)
IJKLT-M522*(xM429,M9)
KLT-M9*(xM526,P326)
K-M526*(xP331,F549)
X-F549*(xM214)
NO-M214*(xM231,M175)
N-M231*(xY6503,L735)
N-L735*(xF2930,L729)
N-L729*(xL666,M46)
N-L666*(xF1154,P43)
N-F1154
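Each label on this slide has the shape haplogroup-marker, optionally followed by `*(x...)` with a list of markers. A minimal sketch of parsing that shape is below. The regular expression and function names are assumptions made for illustration. The final check verifies something observable in the slide data: each step's marker appears in the x-list of the step before it.

```python
import re

# haplogroup "-" marker, optionally followed by "*(x" marker list ")".
LABEL = re.compile(r"(?P<haplogroup>.+)-(?P<marker>\w+)(?:\*\(x(?P<branches>[^)]*)\))?")


def parse_label(label):
    """Split a label into (haplogroup, marker, markers listed after 'x')."""
    match = LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label}")
    branches = match["branches"].split(",") if match["branches"] else []
    return match["haplogroup"], match["marker"], branches


def follows_chain(labels):
    """True if every label's marker is listed in the previous label's x-list."""
    parsed = [parse_label(label) for label in labels]
    return all(child[1] in parent[2] for parent, child in zip(parsed, parsed[1:]))


# The path shown on the slide.
PATH = [
    "A0'1'2'3'4-L1085*(xV148,V168)", "A1'2'3'4-V168*(xM31,P108)",
    "A2'3'4-P108*(xL419,M42)", "A4-M42*(xM60,M168)",
    "CDEF-M168*(xM145,P143)", "CF-P143*(xM130,M89)", "F-M89*(xF1329)",
    "GHIJKLT-F1329*(xM201,M578)", "HIJKLT-M578*(xL901,M522)",
    "IJKLT-M522*(xM429,M9)", "KLT-M9*(xM526,P326)", "K-M526*(xP331,F549)",
    "X-F549*(xM214)", "NO-M214*(xM231,M175)", "N-M231*(xY6503,L735)",
    "N-L735*(xF2930,L729)", "N-L729*(xL666,M46)", "N-L666*(xF1154,P43)",
    "N-F1154",
]

print(parse_label(PATH[3]))
print(follows_chain(PATH))
```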
Private mutations
N-F1154
Private mutations: g.2654329C>T, g.4448652G>A, g.7598733A>G
Establishing a new branch
John Smith: N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.2654329C>T, g.4448652G>A, g.7598733A>G
Mike Smith: N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.3447764C>T, g.6853865A>G
Shared path: N-M231, N-L735, N-F1206, N-F1154, N-F3163
John Smith: g.2654329C>T, g.4448652G>A, g.7598733A>G
Mike Smith: g.3447764C>T, g.6853865A>G
~1400 ybp
Establishing a new branch
John Smith: N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.2654329C>T, g.4448652G>A, g.7598733A>G
Mike Smith: N-M231, N-L735, N-F1206, N-F1154, N-F3163
g.3447764C>T, g.4448652G>A, g.6853865A>G
Shared path: N-M231, N-L735, N-F1206, N-F1154, N-F3163
New branch: g.4448652G>A
John Smith: g.2654329C>T, g.7598733A>G
Mike Smith: g.3447764C>T, g.6853865A>G
~1000 ybp
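The two scenarios above differ only in whether the testers share a private mutation. A minimal sketch of that comparison follows, using the variants listed on the slides; the function name is hypothetical. Any mutation present in both private lists moves to a new shared branch, and the rest stay private to each tester.

```python
def split_private_mutations(private_a, private_b):
    """Return (shared, remaining in a, remaining in b) for two sets of private mutations."""
    shared = private_a & private_b
    return shared, private_a - shared, private_b - shared


john = {"g.2654329C>T", "g.4448652G>A", "g.7598733A>G"}

# First scenario: nothing in common below N-F3163, so no new branch.
mike_first = {"g.3447764C>T", "g.6853865A>G"}
print(split_private_mutations(john, mike_first))

# Second scenario: g.4448652G>A is found in both testers and defines a new branch.
mike_second = {"g.3447764C>T", "g.4448652G>A", "g.6853865A>G"}
new_branch, john_private, mike_private = split_private_mutations(john, mike_second)
print(sorted(new_branch), sorted(john_private), sorted(mike_private))
```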
Message No 5:
Help is on the way! These features will be released during 2017!
Acknowledgements
Tartu, Estonian Biocentre: Lauri Saag, Monika Karmin, Hovhannes Sahakyan, Ene Metspalu, Mait Metspalu, Siiri Rootsi, Richard Villems
Genealogical peers: All Big Y friends
Family Tree DNA: Connie Bormans, Luisa Fernanda Sanchez, Brent Manning, Elliott Greenspan, Bennett Greenspan