5/26/2018 Yunusbaev_The Caucasus as an Asymmetric Semipermeable Barrier to Ancient Human Migrations
1/48
The Caucasus as an Asymmetric Semipermeable Barrier to Ancient Human Migrations
Bayazit Yunusbaev1,2,3*, Mait Metspalu1*, Mari Järve1*, Ildus Kutuev1,2, Siiri Rootsi1, Ene
Metspalu1, Doron M. Behar1, Kärt Varendi1, Hovhannes Sahakyan1,4, Rita Khusainova2,3,
Levon Yeppiskoposyan4, Elza K. Khusnutdinova2,3, Peter A. Underhill5, Toomas Kivisild1,6,
Richard Villems1
1Department of Evolutionary Biology, University of Tartu and the Estonian Biocentre, Tartu, Estonia
2Institute of Biochemistry and Genetics, Ufa Research Center, Russian Academy of Sciences, Ufa, Russia
3Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
4Institute of Molecular Biology of the Academy of Sciences of Armenia, Yerevan, Armenia
5Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, USA
6Leverhulme Centre for Human Evolutionary Studies, University of Cambridge, Cambridge, UK
*These authors contributed equally to this work
eastern from western North Caucasians. Variation within the autosomal genome is
consistent with predominantly Near/Middle Eastern origin of Caucasians, with minor
external impacts. Genetic discontinuity between North Caucasus and the East European
Plain contrasts with continuity through Anatolia and Balkans, suggesting major routes
of ancient gene flows and admixture.
The Caucasus is a mountainous region between the Black and Caspian Seas, divided by the
High Caucasus Mountain Range into the North and South Caucasus. The earliest evidence of
the dispersal of the genus Homo outside Africa comes from the Caucasus (1, 2); anatomically
modern humans appeared there at least 42,000 years ago (3). The linguistic diversity in the
Caucasus is remarkably high (4). The Abkhazian-Adyghe (Northwest (NW) Caucasian),
Nakh-Dagestanian (Northeast (NE) Caucasian) and Kartvelian (South Caucasian) language
Irrespective of their languages, in our heatmap plot of pairwise FST, the Caucasus populations
show the lowest genetic distances to one another (7), followed closely by their distance to the
populations of the Near/Middle East, Turks in particular (Figure 2A). Meanwhile, a sharp
increase of genetic distance progressing from the Caucasus to the East European Plain is evident
(Figure 2A). The Indo-European-speaking Armenians and Ossetians follow the same pattern
and do not show higher genetic similarity to Indo-European-speaking populations from
Europe or the Near/Middle East. Similarly, the populations of the Caucasus cluster together
between their neighbors according to geography on the two-dimensional plots of the principal
components (PC) of autosomal variation (Figure 2B; Figure S1). Importantly, in contrast to
the continuous transition from the Near/Middle East to the Caucasus, there is a noticeable gap
between the Caucasus and the East European Plain (Figure 2B). Geography rather than
language-based clustering can also be observed in the PC analysis of Y chromosome data
(Figure S2). [...]
We analyzed the autosomal data of the Caucasus and reference populations (5, 6) using a
structure-like (9) clustering approach (10). At K=7, the major ancestry component of the
Caucasus populations (shown in blue) has comparable presence in the Near/Middle East, but
is almost absent among the immediate northern neighbors of the Caucasus, the populations
of the East European Plain (Figure 3; Figure S4). Similarly to the blue ancestry component,
the green component is also ubiquitously present among the Caucasus populations,
irrespective of their linguistic affinities, but at much lower frequencies than blue. The green
component is most frequent in the Indus basin (Pakistan), extending to Central Asia and the
Near/Middle East, while fading away in Europe. Although structure-like clustering cannot be
readily interpreted into (human) migrations, this pattern might suggest a gene flow from
South Asia to the west and northwest. We cannot point to any well documented evidence of
such events during the historic period. However, the noticeable presence in NE Caucasus of
the Y chromosome haplogroup L3 [...] Pakistan (11) and North Iran (12) [...]
Hurrians-Urartians, the gene flow postulated here may equally well antedate the arrival of the
Armenian language to the area.
The mountain range that divides the Caucasus into the North and South Caucasus has
apparently not been an impenetrable barrier for gene flows. This is illustrated by the relative
similarity of the ancestry component patterns of the Caucasus populations on either side of the
High Caucasus Mountain Range (Figure 3). However, the dark blue ancestry component,
dominant among the Slavic-, Turkic-, and Finnic-speaking East European Plain populations,
reaches the North Caucasus (10-20%), but just barely (~5%) crosses the High Caucasus to the
three linguistically distinct South Caucasus populations: Armenians, Georgians and
Abkhazians (Figure 3). Remarkably, the decrease of Y chromosome haplogroup G and J1
frequencies towards the Eastern European populations inhabiting the area adjacent to the
North Caucasus, such as southern Russians and Ukrainians (14, 15), [...]
on distance matrices (18, 19) to analyze our whole genome data in order to test whether
factors other than geographic distance can explain the observed variation in genetic distances
between populations. Considering pairwise FST distances between populations, we tested
independent variables which are expected to increase genetic differentiation and thus impact
the linear relationship between geographic and genetic distance, defining them as putative
barriers (7, Table S4, supporting online text). Three of the putative barriers we tested first
were geographic: the Caucasus barrier between North Caucasus and Eastern Europe, the
Balkans barrier between Anatolia and Europe, and the South Asian barrier between South
Asia and the Near/Middle East. The other barriers tested separated populations that are known
isolates/outliers due to religion, language, or different origin (the Jewish groups, Kuban
Nogays, French Basques, Druze, and Burusho) from their respective surrounding
populations. Geographic distance by itself explained only 43% (coefficient of determination r2
= 0.43) of the variation in genetic distances between populations [...]
Black and Caspian Seas dated 12-14,000 years BP (20, 21) which may have served as a
natural barrier (supporting online text).
The Kuban Nogays and the Kara Nogays (Figure 1) have a special status among the
Caucasian populations due to their recent, late 18th to early 19th century arrival from the
Pontocaspian steppes (22), evident from both Y chromosome and autosomal PC plots (Figure
S2; Figure 2B) as well as ADMIXTURE analysis (Figure 3) (only the Kuban Nogays were
included in the autosomal analyses). It has been shown that the Nogays possess 40% of East
Eurasian mtDNA lineages (23). Comparing the two subpopulations with respect to
proportions of typical western Eurasian (G, J, R1a1) and eastern Y chromosome lineages (C,
D, N, O), it becomes apparent that the Kara Nogays have more (~35%) typical eastern Y
chromosome lineages, while among the Kuban Nogays the percentage is around 17% (Table
S2). [...]
levels of between-population differentiation, with an average FST of 0.113, a value which is
almost as large as the FST of 0.157 for worldwide populations (25). Our results, based on
genome-wide data, reveal instead that the populations of the Caucasus region show
between-population differentiation (average FST = 0.004) that is slightly lower than that of the Near
East (0.006) and of Europe (0.006), and are thus more consistent with the results of Bulayeva
et al. (26). Whereas the Y chromosome haploid system reveals some cases of high
differentiation between regions and/or individual populations in the Caucasus, such as the
separation of the Dagestanian-speaking NE Caucasian populations from the rest of the region
(Figure S2), the autosomal variation in the Caucasus matches geography rather than linguistic
divisions. While the variation of all three genetic systems analyzed here (autosomal, Y
chromosome, and mtDNA) shows a genetic continuity between the Caucasus and the
Near/Middle East, there is a clear discontinuity, supported by principal component,
ADMIXTURE and multiple regression analyses, between the North Caucasus and the East [...]
Semitic speakers of the Near East and the Arabian Peninsula (Figure 3), having been carried
to the Caucasian populations by route or routes unknown at present.
We conclude that irrespective of the Early Upper Paleolithic presence of anatomically modern
humans both south and north of the Caucasus, the combined high-resolution autosomal and
gender-specific genetic variation of the Caucasian populations testifies to their predominantly
southern, Near/Middle Eastern descent. Y chromosomal variants under strong founder events,
seen in particular among populations inhabiting the northern flank of the High Caucasus
Mountain Range, appear to never have expanded to the East European Plain, while the
nomadic people of the latter, once settled down predominantly on the northern slopes of the
Caucasus, have preserved, to different extents, some of their earlier genetic heritage. In sum,
though the Caucasus may well have served as a corridor for invasive expeditions in the past,
[...]
Figure legends
Figure 1. Geographical map of the populations of the Caucasus included in this study.
The language family affiliation of each population is given. Adapted from Wikipedia.
Figure 2. Pairwise FST distances and principal component analysis of the Caucasus and
neighboring populations.
(A) Pairwise FST distances between populations, ranging from red (low) to blue (high), based
on autosomal data. The populations [data from this study and the literature (5, 6)] are divided
into regional groups. (B) Plot of the first and second components of the principal component
analysis (27) of the Caucasus and neighboring populations based on autosomal data, with the
clustering of populations approximating geography. The thick lines denote probable directions
[...]
References
1. D. Lordkipanidze et al., Postcranial evidence from early Homo from Dmanisi, Georgia. Nature 449, 305-310 (2007).
2. D. E. Lieberman, Palaeoanthropology: homing in on early Homo. Nature 449, 291-292 (2007).
3. D. S. Adler et al., Dating the demise: neandertal extinction and the establishment of modern humans in the southern Caucasus. J. Hum. Evol. 55, 817-833 (2008).
4. B. Comrie, Linguistic Diversity in the Caucasus. Annu. Rev. Anthropol. 37, 131-143 (2008).
5. J. Z. Li et al., Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100-1104 (2008).
6. D. M. Behar et al., The genome-wide structure of the Jewish people. Nature 466, 238-242 (2010).
7. Materials and methods are available as supporting material on Science Online.
8. E. E. Marchani, W. S. Watkins, K. Bulayeva, H. C. Harpending, L. B. Jorde, Culture creates genetic structure in the Caucasus: autosomal, mitochondrial, and Y-chromosomal variation in Daghestan. BMC Genet. 9, 47 (2008).
9. J. K. Pritchard, M. Stephens, P. Donnelly, Inference of population structure using multilocus genotype data. Genetics 155, 945-959 (2000).
10. D. H. Alexander, J. Novembre, K. Lange, Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655-1664 (2009).
11. S. Sengupta et al., Polarity and temporality of high-resolution y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. Am. J. Hum. Genet. 78, 202-221 (2006).
21. A. A. Svitoch, Khvalynian transgression of the Caspian Sea was not a result of water overflow from the Siberian Proglacial lakes, nor a prototype of the Noachian flood. Quatern. Int. 197, 115-125 (2009).
22. M. Kolga, I. Tõnurist, L. Vaba, J. Viikberg, The Red Book of the Peoples of the Russian Empire. (NGO Red Book, Tallinn, ed. 2, 2001).
23. M. A. Bermisheva et al., Phylogeographic analysis of mitochondrial DNA in the Nogays: the high level of mixture of maternal lineages from Eastern and Western Eurasia. Mol. Biol. (Mosk.) 38, 617-624 (2004).
24. T. Zerjal et al., The genetic legacy of the Mongols. Am. J. Hum. Genet. 72, 717-721 (2003).
25. I. Nasidze et al., Alu insertion polymorphisms and the genetic structure of human populations from the Caucasus. Eur. J. Hum. Genet. 9, 267-272 (2001).
26. K. Bulayeva et al., Genetics and population history of Caucasus populations. Hum. Biol. 75, 837-853 (2003).
27. N. Patterson, A. L. Price, D. Reich, Population structure and eigenanalysis. PLoS Genet. 2, 2074-2093 (2006).
28. We thank the individuals who provided DNA samples for this study, and Mari Nelis, Georgi Hudjashov and Viljo Soo for conducting the autosomal genotyping. R.V. and D.M.B. thank the European Commission, Directorate-General for Research for FP7 Ecogene grant 205419. R.V. thanks the European Union Regional Development Fund for support through the Centre of Excellence in Genomics, the Estonian Ministry of Education and Research for the Basic Research grant SF 0270177As08, and the Swedish Collegium for Advanced Studies for support during the initial stage of this work. S.R. thanks the Estonian Science Foundation for grant 7445. E.M. thanks the Estonian Ministry of Education and Research for the Basic Research grant SF 0270177Bs08 and the [...]
[Figure 1: map of the Caucasus populations included in this study, with populations numbered on the map; image not reproduced.]
[Figure 2, panel A: heatmap of pairwise FST; image not reproduced. Regional groups: Near/Middle East, Caucasus, East Europe, West & South Europe. Populations shown: French, Tuscans, N Italians, Romanians, Bulgarians, Lithuanians, Belorussians, Ukrainians, Chuvashes, Mordvins, Russians, Kuban Nogays, N Ossetians, Kumyks, Balkars, Lezgins, Chechens, Adyghe, Abkhazians, Georgians, Armenians, Iranians, Turks, Syrians, Lebanese, Jordanians, Palestinians, Druze, Bedouins, Saudis.]
[Figure 3: ADMIXTURE ancestry proportions at K = 7; image not reproduced. Regional groups: Africa, Near East & Europe, the Caucasus (South, North), Central Asia, South Asia, East Asia. Language-family labels on the Caucasus panel: IE, KV, AA, IE, ND, TU. Populations shown: Bantu SA, Bantu NE, Yorubans, Mandenkas, Ethiopians, Egyptians, Saudis, Bedouins, Palestinians, Druze, Jordanians, Lebanese, Syrians, Turks, French, French Basques, North Italians, Tuscans, Romanians, Bulgarians, Ukrainians, Belorussians, Lithuanians, Russians, Chuvashes, Armenians, Georgians, Abkhazians, Balkars, Adyghe, N. Ossetians, Chechens, Lezgins, Kuban Nogays, Kumyks, Georgian Jews, Mountain Jews, Iraqi Jews, Iranian Jews, Iranians, Tajiks, Turkmens, Uzbeks, Uygurs, Altaians, Hazaras, Pathans, Burusho, Balochis, Brahui, Makranis, Sindhi, Yakuts, Cambodians, Dai, Lahu, Han, Daur, Oroqens, Mongols, Japanese, Mordvins, Kurds.]
Supporting Online Material
Materials and Methods
DNA samples from all the subregions and major language groups of the Caucasus were
analyzed, using whole genome, Y chromosome and mtDNA markers. The geographic
locations and language affiliations of the Caucasus populations studied are presented in
Figure 1. DNA samples were obtained from unrelated male volunteers after getting informed
consent in accordance with the guidelines of the ethical committees of the institutions
involved. DNA was purified from blood by the phenol/chloroform extraction method. DNA
concentrations were determined by spectrometry (NanoDrop products, Wilmington, DE,
We used PLINK software 1.05 (5) to filter the combined dataset to include only SNPs on the
22 autosomal chromosomes with minor allele frequency >1% and genotyping success >97%.
Because background linkage disequilibrium (LD) can affect both principal component and
structure-like analysis, we thinned the marker set by excluding SNPs in strong LD (pairwise
genotypic correlation r2>0.4) in a window of 200 SNPs (sliding the window by 25 SNPs at a
time). The final data set consisted of 210,575 SNPs and 1119 individuals that were used in
subsequent analyses.
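The filtering and thinning steps described above can be illustrated with a minimal pure-Python sketch. It is not PLINK's implementation: the greedy rule (keep the earlier SNP of a correlated pair), the function names, and the toy genotypes are all assumptions made for illustration. Only the thresholds (minor allele frequency above 1%, genotyping success above 97%, r2 above 0.4, a window of 200 SNPs moved by 25) come from the text.

```python
from statistics import mean

def maf(genotypes):
    """Minor allele frequency from 0/1/2 allele counts; None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)

def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def r_squared(a, b):
    """Squared Pearson correlation of two genotype vectors over jointly called samples."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy * sxy / (sxx * syy)

def filter_and_prune(snps, min_maf=0.01, min_call=0.97,
                     window=200, step=25, max_r2=0.4):
    """snps: per-SNP genotype lists in chromosome order. Returns indices of kept SNPs."""
    kept = [i for i, g in enumerate(snps)
            if maf(g) > min_maf and call_rate(g) > min_call]
    removed = set()
    start = 0
    while start < len(kept):
        win = [i for i in kept[start:start + window] if i not in removed]
        for pos, i in enumerate(win):
            if i in removed:
                continue
            for j in win[pos + 1:]:
                if j not in removed and r_squared(snps[i], snps[j]) > max_r2:
                    removed.add(j)  # greedy assumption: drop the later SNP of the pair
        start += step
    return [i for i in kept if i not in removed]

# Toy data (hypothetical): 5 SNPs typed in 8 individuals.
snps = [
    [0, 1, 2, 0, 1, 2, 0, 1],        # SNP 0
    [0, 1, 2, 0, 1, 2, 0, 1],        # SNP 1: identical to SNP 0, so r2 = 1
    [2, 0, 1, 1, 0, 2, 1, 0],        # SNP 2: nearly uncorrelated with SNP 0
    [0, 0, 0, 0, 0, 0, 0, 0],        # SNP 3: monomorphic, fails the frequency filter
    [1, None, None, 0, 2, 1, 0, 1],  # SNP 4: call rate 0.75, fails the success filter
]
print(filter_and_prune(snps, window=3, step=1))
```

In this toy run SNPs 3 and 4 are dropped by the quality filters and SNP 1 is dropped because it is in complete LD with SNP 0.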
We explored the population structure at K=3 to K=10. To monitor convergence between
individual runs at each K we ran ADMIXTURE one hundred times and examined the
log-likelihood (LL) values of a ten percent fraction of runs with the highest LL yield at each K.
We assumed that the global maximum had been reached if the maximum difference between
those LL values was negligible (less than one LL unit) (2, 6). Since this was the case for all
tested values of K [...]
virtually identical values, converged at respective global maxima, and are thus usable
representations of genetic structure at different levels (Figure S4).
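The convergence criterion described above (the best ten percent of runs at a given K differing by less than one log-likelihood unit) can be written as a short check. This is a sketch under stated assumptions: the function name, its parameters, and the example log-likelihoods are hypothetical, and the values would in practice be read from the ADMIXTURE run logs.

```python
def converged(log_likelihoods, top_fraction=0.10, tolerance=1.0):
    """True if the best `top_fraction` of runs span less than `tolerance` LL units."""
    if not log_likelihoods:
        raise ValueError("need at least one run")
    n_top = max(1, round(len(log_likelihoods) * top_fraction))
    top = sorted(log_likelihoods, reverse=True)[:n_top]
    return (top[0] - top[-1]) < tolerance

# Hypothetical example: 100 runs at one value of K.
# The ten best runs lie within 0.45 LL units of each other.
runs = [-1000.0 - 0.05 * i for i in range(10)] + [-1050.0 - i for i in range(90)]
print(converged(runs))
```

A run set whose top ten percent spreads over several LL units would return False, suggesting that more runs are needed at that K.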
Principal component analysis and FST
Since the principal component analysis (PCA) method assumes that markers are unlinked, we
thinned our marker set for this analysis with PLINK software 1.05 (5) according to the same
parameters used for genetic clustering analysis in order to mitigate background LD. However,
LD pruning was carried out after the exclusion of the populations from Africa (except
Egyptians), East Asia and Siberia, and also Hazaras, Kurds, Uygurs and Altaians, resulting in
a final data set of 189,747 markers and 838 individuals. PCA was carried out in the smartpca
program (8) using outlier removal procedure (18 outliers were removed, leaving 820
individuals). Pairwise genetic differentiation indices (FST values) were also estimated using
[...]
Multiple regression on distance matrices
We used multiple regression on distance matrices (MRM) (11, 12) to explore various
explanatory variables (geographic distance, barriers to gene flow) predicting the genetic distances
between populations. In this method a single dependent distance matrix Y is considered as a
function of multiple independent distance matrices Xi (independent variables), and the
statistical significance of regression coefficients for each independent variable Xi is tested
based on matrix permutations (13). The corresponding permutation procedure is described in
Legendre et al. (13) and implemented in the ecodist R package (14).
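To make the idea of regressing one distance matrix on several others concrete, the following is a minimal pure-Python sketch. It is not the ecodist implementation: it tests only the overall R2 (not each coefficient), and all function names and the toy matrices are assumptions. What it shares with the method described above is the core step of permuting the rows and columns of the response matrix together and refitting.

```python
import random

def lower_triangle(m):
    """Unfold the strictly lower triangle of a square distance matrix into a vector."""
    return [m[i][j] for i in range(len(m)) for j in range(i)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def fit(y, xs):
    """Least squares of vector y on predictor vectors xs; returns (coefficients, R2)."""
    cols = [[1.0] * len(y)] + xs  # intercept first
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[k] for b, c in zip(beta, cols)) for k in range(len(y))]
    ybar = sum(y) / len(y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_tot = sum((v - ybar) ** 2 for v in y)
    return beta, 1.0 - ss_res / ss_tot

def mrm(y_matrix, x_matrices, n_perm=999, seed=0):
    """Return coefficients, R2, and a permutation p-value for R2."""
    rng = random.Random(seed)
    xs = [lower_triangle(x) for x in x_matrices]
    beta, r2 = fit(lower_triangle(y_matrix), xs)
    n = len(y_matrix)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns together so each population keeps its own distances.
        permuted = [[y_matrix[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if fit(lower_triangle(permuted), xs)[1] >= r2:
            hits += 1
    return beta, r2, hits / (n_perm + 1)

# Toy data (hypothetical): six populations on a line, a barrier between the third and fourth.
positions = [0, 1, 3, 6, 10, 15]
side = [0, 0, 0, 1, 1, 1]
geo = [[abs(a - b) for b in positions] for a in positions]
barrier = [[0 if s == t else 1 for t in side] for s in side]
y = [[0.0 if i == j else 1 + 2 * geo[i][j] + 5 * barrier[i][j]
      for j in range(6)] for i in range(6)]
beta, r2, p = mrm(y, [geo, barrier], n_perm=999, seed=1)
print(beta, r2, p)
```

Because the toy response is built as an exact linear function of the two predictors, the fitted coefficients recover the intercept and the two slopes, and the barrier matrix visibly adds explanatory power beyond geographic distance alone.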
In order to test whether factors other than geographic distance can explain the observed
variation in genetic distances between populations and whether the contribution of each
variable is statistically significant, we considered a matrix of pairwise FST distances between
populations as a response matrix and included explanatory variables into the regression model
[...]
X3: Putative barrier separating South Asia from the Near/Middle East.
X4: Zero distances between the isolated Jewish communities and distances of 1 between the
Jews and the populations surrounding them.
X5: Putative barrier separating the Kuban Nogays and the French Basques from other
populations.
X6: Putative barrier separating the Druze from other populations.
X7: Putative barrier separating the Burusho from other populations.
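A putative barrier of the kind listed above can be encoded as a binary distance matrix. The sketch below assumes the simplest coding consistent with the description of X4: a distance of 0 between populations on the same side of the barrier and 1 between populations on opposite sides. The function name, the population subset, and the side labels are hypothetical.

```python
def barrier_matrix(populations, side):
    """0 for pairs on the same side of a putative barrier, 1 for pairs separated by it."""
    return [[0 if side[a] == side[b] else 1 for b in populations] for a in populations]

# Hypothetical example: a barrier between the North Caucasus and Eastern Europe.
populations = ["Adyghe", "Chechens", "Russians", "Ukrainians"]
side = {
    "Adyghe": "Caucasus",
    "Chechens": "Caucasus",
    "Russians": "East Europe",
    "Ukrainians": "East Europe",
}
for row in barrier_matrix(populations, side):
    print(row)
```

The same function covers an isolate such as the Druze by assigning that one population to its own side and every other population to the other.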
Y chromosome analyses
A total of 1952 samples from 24 populations from the Caucasus were analyzed for Y
chromosome markers. The samples were typed for 51 Y chromosome SNP markers, 2 of
which [M81 and M128 (15)] were found to have the ancestral state in all of the samples. The
[...] 12f2 (16), YAP (17), SRY (18), Tat (19), 92R7 (20), M9, M12 [...]
DYS388 (29, 30) and DYS461 (31). The phylogenetic network of the data obtained was
constructed with the program Network 4.5.0.0 (Fluxus-Engineering), using the median joining
algorithm. Spatial frequency maps were drawn with the program Surfer 8 (Golden Software
Inc., Cold Spring Harbor, NY, USA). Coalescence ages were calculated according to the
ASD0 method (32).
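As a rough illustration of an ASD0-type calculation, the sketch below takes the average squared difference in STR repeat counts between each sampled haplotype and a founder haplotype, averages it over loci, and divides by a per-locus mutation rate. Several things here are assumptions rather than details given in the text: the founder is taken as the per-locus median, the rate of 6.9e-4 per locus per 25-year generation is used as a default, and the function name and toy haplotypes are hypothetical.

```python
from statistics import median

def asd0_age(haplotypes, mutation_rate=6.9e-4, generation_years=25):
    """Age in years from the average squared difference to the founder haplotype.

    haplotypes: list of equal-length tuples of STR repeat counts, one tuple per sample.
    mutation_rate: assumed per-locus, per-generation rate.
    """
    n_loci = len(haplotypes[0])
    # Assumption: the founder haplotype is the per-locus median repeat count.
    founder = [median(h[k] for h in haplotypes) for k in range(n_loci)]
    asd0 = sum(
        sum((h[k] - founder[k]) ** 2 for h in haplotypes) / len(haplotypes)
        for k in range(n_loci)
    ) / n_loci
    return asd0 / mutation_rate * generation_years

# Hypothetical example: four haplotypes typed at two STR loci.
haplotypes = [(14, 12), (14, 13), (15, 12), (14, 12)]
print(round(asd0_age(haplotypes)))
```

Here each locus has one haplotype that differs from the founder by a single repeat, so the average squared difference is 0.25 at both loci.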
mtDNA analyses
The haplogroups of 2262 mtDNA samples from 24 Caucasus populations were determined by
typing HVSI and coding region markers according to the nomenclature presented in Richards
et al. (33) (Table S4). The data obtained were used to generate a PC plot of the Caucasus
populations in the context of populations from neighboring regions, using the POPSTR
software (http://harpending.humanevo.utah.edu/popstr/).
languages of the NE Caucasus cluster apart from the rest of the Caucasus peoples who are
genetically similar despite language family barriers, whereas the Nakh-speaking NE
Caucasian Chechens and Ingushes do not fall into either group, being set apart by a high
frequency of haplogroup J2a2* (Figure S2; Table S2). But this single example of concordance
of genetic and linguistic data in NE Caucasus is only observable in case of the Y
chromosome; neither autosomal nor mtDNA data support the distinctness of the Dagestanian
language group populations (Figure 2B; Figure S1; Figure S3). NE Caucasian Y
chromosomes (n=640) mostly belong to haplogroups J1* (35.3%), J2a2* (27.8%), and
R1b1b2 (9.8%); while those from NW Caucasus (n=844) mostly belong to haplogroups G2a
(45.4%), R1a1* (14.9%), and J2a* (9.1%), and those from S Caucasus (n=305) also mostly to
haplogroups G2a (41.3%), J2a* (12.1%), and R1a1* (7.9%) (Table S2). In summary, the NE
Caucasian populations are distinguished mainly by a high frequency of haplogroup J1, more
specifically J1*, which is distinct from the J1e* lineage in the Near/Middle East [...]
P58 mutation first arose, since the coalescence ages of the J1e* clade are highest in the
Near/Middle East, the estimate for the whole region being 12 000 ± 2 600 years (Table S5).
The J1e* subclade has spread to the Caucasus relatively recently, as evidenced by a low
coalescence age, the limited divergence of Caucasian J1e* STR haplotypes both from
Near/Middle Eastern STR haplotypes and from each other (Figure S7), and the generally low
frequency of this subhaplogroup in the Caucasus. The J1e* coalescence age of 5 600 ± 1 400
years, considerably lower than the estimates for other J subhaplogroups in the Caucasus,
possibly reflects a migration of Neolithic farmers from the Near East and is consistent with
the scenario proposed by Tofanelli et al. (39). The J1* clade is not uniform, but divided into
two clusters according to the number of repeats of the STR marker DYS388 (10-14 versus 15-
17 repeats), clearly distinct on median joining networks (Figure S7). The long DYS388
group is the older of the two, apparently having risen in the Near/Middle East, where it has a
coalescence age of 21 400 ± 8 900 years. [...]
The lineages of the J1 sister haplogroup J2 have spread from their Near/Middle Eastern
homeland in both the eastern and western direction, and, with some exceptions, also to the
Caucasus in the north. The haplogroup J2 subclades J2a* and J2a2* have both spread
throughout the Caucasus, although remaining at lower frequencies in NE Caucasus, with the
exception of the Chechens and the Ingush, who have a high frequency of J2a2* (Figure S8).
On the other hand, the subclades J2a2a and J2b, the latter otherwise spread from southern
Europe to India (23, 37), are practically absent in the Caucasus. Our coalescence age
estimates, probably strongly influenced by the high frequency and diversity of subclade J2a2*
among the Chechens and the Ingush, set the time of the expansion of J2a2* to the Caucasus
into the distant past, at about 12-14,000 years (Table S5).
Both the J1*(xP58) short DYS388 group and J2a2* exhibit founder effects about 12,000
years ago in the NE Caucasus, possibly related to the Khvalynian transgression [...]
presence of this lineage in the Caucasus was shown by Nasidze et al. (42), it was intriguing to
find it to be present so widely and with such a high frequency. Similarly to Anatolia (21), the
absolute majority of haplogroup G samples in the Caucasus belong to subhaplogroup G2, with
only a few (mostly Armenian) samples falling into G1. The major subclade G2 is unevenly
distributed, being very frequent in NW Caucasus and S Caucasus (covering about 45% of the
paternal lineages in both regions, with the highest incidence detected in North Ossetians, at
70%), while present in NE Caucasus with an average frequency of only 5%, ranging from 19
to 0%.
Interestingly, the decrease of both haplogroup J1 and G frequencies (the two major lineages in
the Caucasus) towards the eastern European populations inhabiting the area adjacent to NW
Caucasus, such as southern Russians and Ukrainians (43, 44), is very rapid and the borderline
sharp (Figure S5; Figure S6), indicating that [...]
In addition to physical barriers like mountains and rivers, factors such as linguistic, ethnic and
religious restrictions should be considered as potential barriers to gene flows when analyzing
human populations. Deviations from panmixia, such as assortative mating between members
of specific social groups, or admixture with highly divergent immigrant populations, can also
lead to higher genetic distances between neighboring human populations. Because all these
forces lead to higher differentiation between neighboring populations, they can be considered
as barriers to gene flow. In terms of multiple regression analysis, not considering such factors
can lead to decreased explanatory power of the simpler model if that includes only geographic
distance as a predictor.
Multiple regression analysis on distance matrices provides a convenient framework to
consider multiple, independently acting factors. Whenever such factors can be formulated as
distance matrices, the multiple regression method (MRM) can be used to test their [...]
FST analysis
The heatmap plot (Figure 2A) reveals three clusters of low genetic distance, encompassing
geographically nearby populations: the Near/Middle East, the Caucasus, and Europe.
However, Europe is not as homogeneous as the other clusters. French Basques and Volga
Basin Turkic-speaking Chuvashes are clearly more distant from their immediate neighbor
populations, while geographically somewhat southern populations, from the Atlantic to the
Black Sea (French, Italians, Bulgarians and Romanians), exhibit particularly low inter-
population genetic distances. As already mentioned, the smooth transition from the Caucasus
to Anatolia (Turks) and Iran, and from the latter to Syrians, Lebanese, Jordanians and further
Figure legends
Figure S1
Plots of the first and the second, the first and the third, and the second and the third
components of the principal component analysis of the Caucasus and neighboring populations
based on autosomal data [data from this study and the literature (1, 2)]. For population
abbreviations see Table S1.
Figure S2
Plot of the first and second principal components of Y chromosome variation in the Caucasus
and neighboring regions. Populations [data from this study and the literature (21, 23, 34, 36,
43, 45-49)] are colored according to their language group affiliations. Populations of the
proportions shown in color from K=3 to K=10. Populations introduced for the first time in
this study and analyzed together with data from Li et al. (1) and Behar et al. (2) are labeled in
color.
Figure S5
Spatial frequency distribution of the Y chromosome haplogroup G. Frequency data from this
study and the literature (21, 23, 36, 43, 45-49, 62-76) were converted into a spatial frequency
map using the Surfer software (version 8, Golden Software Inc., Cold Spring Harbor, NY,
USA), applying the kriging algorithm.
Figure S6
Spatial frequency distribution of the Y chromosome haplogroup J1 and its subclades: (A) all of
J1, (B) J1e*, and (C) J1*(xP58). Frequency data from this study and the literature (21, 23, 34, 36, [...]
Figure S8
Spatial frequency distribution of the Y chromosome haplogroup J2 and its subclades: (A) all of
J2, (B) J2a*, and (C) J2a2*. Frequency data from this study and the literature (21-23, 36, 37, 43,
45, 46, 48, 68, 78-80, 84) were converted into spatial frequency maps using the Surfer
software (version 8, Golden Software Inc., Cold Spring Harbor, NY, USA), applying the
kriging algorithm.
References
1. J. Z. Li et al., Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100-1104 (2008).
2. D. M. Behar et al., The genome-wide structure of the Jewish people. Nature 466, 238-242 (2010).
3. D. H. Alexander, J. Novembre, K. Lange, Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655-1664 (2009).
4. J. K. Pritchard, M. Stephens, P. Donnelly, Inference of population structure using multilocus genotype data. Genetics 155, 945-959 (2000).
5. S. Purcell et al., PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559-575 (2007).
6. M. Rasmussen et al., Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757-762 (2010).
7. D. H. Alexander, J. Novembre, K. Lange, ADMIXTURE 1.04 Software Manual (2010; http://www.genetics.ucla.edu/software/admixture/admixture-manual.pdf).
8. N. Patterson, A. L. Price, D. Reich, Population structure and eigenanalysis. PLoS Genet. 2, 2074-2093 (2006).
9. S. Banerjee, On geodetic distance computations in spatial modeling. Biometrics 61, 617-625 (2005).
10. H. M. Cann et al., A human genome diversity cell line panel. Science 296, 261-262 (2002).
11. B. F. Manly, Randomization and regression methods for testing for associations with [...]
23. S. Sengupta et al., Polarity and temporality of high-resolution y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. Am. J. Hum. Genet. 78, 202-221 (2006).
24. P. A. Underhill et al., in Rethinking the Human Revolution: New Behavioural and Biological Perspectives on the Origin and Dispersal of Modern Humans (McDonald Institute Monographs), P. Mellars, K. Boyle, O. Bar-Yosef, C. Stringer, Eds. (McDonald Institute for Archaeological Research, Cambridge, 2007), pp. 33-42.
25. P. A. Underhill et al., Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a. Eur. J. Hum. Genet. 18, 479-484 (2010).
26. M. F. Hammer et al., Jewish and Middle Eastern non-Jewish populations share a common pool of Y-chromosome biallelic haplotypes. Proc. Natl. Acad. Sci. U. S. A. 97, 6769-6774 (2000).
27. N. Ellis et al., A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res. 12, 339-348 (2002).
28. T. M. Karafet et al., New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res. 18, 830-838 (2008).
29. P. de Knijff et al., Chromosome Y microsatellites: population genetic and evolutionary aspects. Int. J. Legal Med. 110, 134-149 (1997).
30. M. Kayser et al., Evaluation of Y-chromosomal STRs: a multicenter study. Int. J. Legal Med. 110, 125-133 (1997).
31. P. S. White, O. L. Tatum, L. L. Deaven, J. L. Longmire, New, male-specific microsatellite markers from the human Y chromosome. Genomics 57, 433-437 (1999).
32. L. A. Zhivotovsky et al., The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am. J. Hum. Genet. 74, 50-61 (2004).
44. V. N. Kharkov et al., Gene pool structure of Eastern Ukrainians as inferred from the Y-chromosome haplogroups. Russ. J. Genet. 40, 326-331 (2004).
45. A. M. Cadenas, L. A. Zhivotovsky, L. L. Cavalli-Sforza, P. A. Underhill, R. J. Herrera, Y-chromosome diversity characterizes the Gulf of Oman. Eur. J. Hum. Genet. 16, 374-386 (2008).
46. C. Flores et al., Isolates in a corridor of migrations: a high-resolution analysis of Y-chromosome variation in Jordan. J. Hum. Genet. 50, 435-441 (2005).
47. J. R. Luis et al., The Levant versus the Horn of Africa: Evidence for bidirectional corridors of human migrations. Am. J. Hum. Genet. 74, 532-544 (2004).
48. M. Pericic, L. B. Lauc, I. M. Klaric, B. Janicijevic, P. Rudan, Review of Croatian genetic heritage as revealed by mitochondrial DNA and Y chromosomal lineages. Croat. Med. J. 46, 502-513 (2005).
49. P. A. Zalloua et al., Y-chromosomal diversity in Lebanon is structured by recent historical events. Am. J. Hum. Genet. 82, 873-882 (2008).
50. N. Al-Zahery et al., Y-chromosome and mtDNA polymorphisms in Iraq, a crossroad of the early human dispersal and of post-Neolithic migrations. Mol. Phylogenet. Evol. 28, 458-472 (2003).
51. D. M. Behar et al., Counting the founders: the matrilineal genetic ancestry of the Jewish Diaspora. PLoS One 3, e2062 (2008).
52. M. Bermisheva, K. Tambets, R. Villems, E. Khusnutdinova, Diversity of mitochondrial DNA haplotypes in ethnic populations of the Volga-Ural region of Russia. Mol. Biol. (Mosk.) 36, 990-1001 (2002).
53. S. Cvjetan et al., Frequencies of mtDNA haplogroups in southeastern Europe--Croatians, Bosnians and Herzegovinians, Serbians, Macedonians and Macedonian Romani. Coll. Antropol. 28, 193-198 (2004).
64. C. Di Gaetano et al., Differential Greek and northern African migrations to Sicily are supported by genetic evidence from the Y chromosome. Eur. J. Hum. Genet. 17, 91-99 (2009).
65. F. Di Giacomo et al., Clinal patterns of human Y chromosomal diversity in continental Italy and Greece are dominated by drift and founder effects. Mol. Phylogenet. Evol. 28, 387-395 (2003).
66. R. Goncalves et al., Y-chromosome lineages from Portugal, Madeira and Acores record elements of Sephardim and Berber ancestry. Ann. Hum. Genet. 69, 443-454 (2005).
67. M. F. Hammer et al., Dual origins of the Japanese: common ground for hunter-gatherer and farmer Y chromosomes. J. Hum. Genet. 51, 47-58 (2006).
68. T. M. Karafet et al., High levels of Y-chromosome differentiation among native Siberian populations and the genetic signature of a boreal hunter-gatherer way of life. Hum. Biol. 74, 761-789 (2002).
69. A. O. Karlsson, T. Wallerstrom, A. Gotherstrom, G. Holmlund, Y-chromosome diversity in Sweden - A long-time perspective. Eur. J. Hum. Genet. 14, 963-970 (2006).
70. R. J. King et al., Differential Y-chromosome Anatolian influences on the Greek and Cretan Neolithic. Ann. Hum. Genet. 72, 205-214 (2008).
71. I. Nasidze et al., Genetic evidence concerning the origins of South and North Ossetians. Ann. Hum. Genet. 68, 588-599 (2004).
72. I. Nasidze, T. Sarkisian, A. Kerimov, M. Stoneking, Testing hypotheses of language replacement in the Caucasus: evidence from the Y-chromosome. Hum. Genet. 112, 255-261 (2003).
73. V. N. Pimenoff et al., Northwest Siberian Khanty and Mansi in the junction of West and East Eurasian gene pools as revealed by uniparental markers. Eur. J. Hum. Genet. 16, 1254-1264 (2008).
Table S1. Samples used for whole genome analysis.
Geographic region   Population        Abbreviation   Li et al. (1)   Behar et al. (2)   This study   Total
Africa              NE Bantu          BN             11   11
                    S Bantu           BS             8    8
                    Mandenka          Mnd            22   22
                    Yoruba            Yor            21   21
North Africa        Egypt             Egy            12   12
                    Ethiopians        Eth            19   19
Near/Middle East    Saudis            Sdi            20   20
                    Bedouin           Bdn            45   45
                    Druze             Drz            42   42
                    Jordanians        Jor            20   20
                    Lebanese          Leb            7    7
                    Palestinian       Pal            46   46
                    Syrians           Syr            16   16
                    Iranians          Irn            20   20
                    Turks             Tur            19   19
Europe              French            Fre            28   28
                    French Basques    FrB            24   24
                    Kurds (sampled in Kazakhstan)   Krd   6    6
                    Uzbeks            Uzb            15   15
                    Uygur             Uyg            10   10
                    Altaians          Alt            13   13
South Asia          Hazara            Haz            22   22
                    Pathan            Ptn            22   22
                    Burusho           Bur            25   25
                    Balochi           Blo            24   24
                    Brahui            Brh            25   25
                    Makrani           Mak            25   25
                    Sindhi            Sin            24   24
Siberia             Yakuts            Yak            25   25
East Asia           Cambodians        Cam            10   10
                    Dai               Dai            10   10
                    Lahu              Lah            8    8
                    Han               Han            44   44
                    Daur              Dau            9    9
                    Oroqen            Oro            9    9
                    Mongols           Mng            10   10
                    Japanese          Jap            28   28
Total (Li et al. (1); Behar et al. (2); This study): 639; 267; 214
Grand Total: 1120
Table S2. Y chromosome data of the Caucasus populations and the phylogenetic relationships of the 51 Y chromosome markers typed.
Population sample sizes (N):
NE Caucasus: Andis 49, Avars 42, Bagvalals 28, Chamalals 27, Chechens 165, Dargins 67, Ingush 105, Kumyks 73, Lezgins 31, Tabasarans 43, Mountain Jews 10; NEC Total 640
Nogays: Kuban Nogays 87, Kara Nogays 76; Nogays & Kara Nogays Total 163
NW Caucasus: Abazins 88
Legible haplogroup labels include: C, C3c, D, E, E1b1b1a, E1b1b1c, F*, G1, G2a, H1, I1, I2*, I2a, I2b, J1*, J1e*, J2a*, J2a2*, J2a2a, J2b*, K*, L1, L2, L3, N*, N1b, T
Legible marker labels include: M20, M436, M357, M231, M76, M317, M52, M9, M438, 12f2, M170, M253, M130, YAP, M267, M48, M174, M172, M40, M201, M168, M35, M285, P15, M89, M70, M78, M123, P58, P37.2, M223, M92
(The per-haplogroup counts and the tree layout are not legible in this copy.)
Table S3. mtDNA data of the Caucasus populations.
Haplogroup columns, first panel: A B C D E F G H HV* HV0 HV1 HV2 I J K L M M
Haplogroup columns, second panel: N1d N2a N9 R R0a T*(xT1) T1 U U1 U2 U3 U4 U5 U6 U7 U8 U9 V
Population sample sizes (N):
NE Caucasus: Andis 62, Avars 61, Bagvalals 33, Chamalals 34, Chechens 176, Dargins 110, Ingush 103, Kumyks 112, Lezgins 46, Tabasarans 52, Mountain Jews 23; NEC Total 812
Nogays: Kuban Nogays 131, Kara Nogays 130; Nogays & Kara Nogays Total 261
NW Caucasus: Abazins 105, Adyghe 155, Balkars 140, Cherkessians 123, Kabardin 150, Karachays 106, N Ossetians 138; NWC Total 917
South Caucasus: Abkhazians 136, Armenians 36, Georgians 76, S Ossetians 24; SC Total 272
Legible total rows (first panel, columns in the order listed above):
Nogays & Kara Nogays Total: 13 6 20 18 0 7 13 54 6 0 1 3 7 6 5 1 5 1
SC Total: 1 0 8 7 0 1 0 57 6 0 4 1 11 9 13 0 0 2
(The remaining per-population haplogroup counts are fused in this copy and cannot be separated reliably.)
Table S4. The power of various models to predict the observed genetic distances (pairwise
FST distances) between the populations studied. The coefficient of determination r2, its
increase relative to the default model of only geographic distance explaining genetic distance,
the marginal regression coefficient, and p-value are given. Statistically significant predictors
are printed in bold.
Predictors (putative barriers)   r2       Increment in r2   Marginal regression coefficient   p-value
only geographic distance         0.4316   *                 0.0347                            0.0001
Caucasus barrier                 0.5522   0.1206            0.0055                            0.0001
Balkans barrier                  0.5058   0.0742            -0.0048                           0.0001
Mountain Jews                    0.4751   0.0435            0.0063                            0.0030
Iranian Jews                     0.4693   0.0377            0.0058                            0.0247
Burusho                          0.4523   0.0207            0.0044                            0.0596
Georgian Jews                    0.4481   0.0165            0.0039                            0.0959
Iraqi Jews                       0.4467   0.0151            0.0037                            0.1247
Druze                            0.4376   0.0059            0.0023                            0.4500
French Basque                    0.4341   0.0025            0.0016                            0.7090
South Asian barrier              0.4328   0.0012            0.0006                            0.6570
Kuban Nogays                     0.4318   0.0002            -0.0004                           0.9531
All predictors                   0.7695   0.3378            *                                 *
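The "Increment in r2" column of Table S4 can be checked by subtracting the r2 of the geography-only model from the r2 of each extended model. The following is a minimal sketch of that arithmetic using the values printed in the table; the dictionary layout, the function names, and the 0.05 significance cutoff are illustrative assumptions and are not taken from the original analysis.

```python
# Values copied from Table S4: predictor -> (r2, p-value).
BASELINE_R2 = 0.4316        # model with only geographic distance
ALL_PREDICTORS_R2 = 0.7695  # model with all predictors (no p-value given)

MODELS = {
    "Caucasus barrier": (0.5522, 0.0001),
    "Balkans barrier": (0.5058, 0.0001),
    "Mountain Jews": (0.4751, 0.0030),
    "Iranian Jews": (0.4693, 0.0247),
    "Burusho": (0.4523, 0.0596),
    "Georgian Jews": (0.4481, 0.0959),
    "Iraqi Jews": (0.4467, 0.1247),
    "Druze": (0.4376, 0.4500),
    "French Basque": (0.4341, 0.7090),
    "South Asian barrier": (0.4328, 0.6570),
    "Kuban Nogays": (0.4318, 0.9531),
}

ALPHA = 0.05  # assumed significance threshold; the table does not state one


def r2_increments(baseline, models):
    """Return predictor -> increase in r2 over the baseline, to 4 decimals."""
    return {name: round(r2 - baseline, 4) for name, (r2, _p) in models.items()}


def significant_predictors(models, alpha=ALPHA):
    """Return predictors whose p-value falls below the assumed threshold."""
    return [name for name, (_r2, p) in models.items() if p < alpha]


if __name__ == "__main__":
    for name, inc in r2_increments(BASELINE_R2, MODELS).items():
        print(f"{name:22s} {inc:.4f}")
    print("All predictors        ", round(ALL_PREDICTORS_R2 - BASELINE_R2, 4))
    print("p <", ALPHA, ":", significant_predictors(MODELS))
```

Because the sketch starts from r2 values already rounded to four decimals, two recomputed increments differ from the printed ones in the last digit (Druze: 0.0060 versus 0.0059; All predictors: 0.3379 versus 0.3378), presumably because the published increments were derived from unrounded r2 values.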
Table S5. Coalescence times with standard errors for the Y chromosome haplogroups J1 (and subhaplogroups) a
regions/populations. Median haplotypes of the 10 Y chromosome STR markers based on which the coalescence t
(32)] are also given.
Haplogroup   Region/population                                     N     TC (ky)   SE (ky)   DYS19   DYS388   DYS389I   DYS389II   DYS390
J1*          All J1* (Caucasus, Near/Middle East, Central Asia)    121   18.6      4.4       14      13       13        16         23
J1*          Caucasus                                              72    14.6      4.2       14      13       13        16         23
J1*          Near/Middle East                                      33    22.8      6.0       14      13       14        17         23
J1*,DYS388
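Table S5 reports coalescence times together with median STR haplotypes and cites (32). As a rough illustration of how an STR-based age of this kind can be computed, the sketch below takes the per-locus average squared difference in repeat count from the median haplotype, converts it to time with a mutation rate, and averages over loci. The rate of 6.9e-4 per locus per 25-year generation is the effective rate proposed in (32); that the estimates in the table were obtained with exactly this procedure, the across-locus standard error, the function names, and the toy haplotypes are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, median_low, stdev

MUTATION_RATE = 6.9e-4     # assumed: per locus per generation, following (32)
YEARS_PER_GENERATION = 25  # assumed generation time in years


def median_haplotype(haplotypes):
    """Per-locus median repeat count (median_low keeps integer allele sizes)."""
    return [median_low(locus) for locus in zip(*haplotypes)]


def coalescence_time_ky(haplotypes):
    """Return (age in ky, across-locus standard error in ky, median haplotype)."""
    founder = median_haplotype(haplotypes)
    ages_ky = []
    for locus, founder_allele in zip(zip(*haplotypes), founder):
        # Average squared difference from the founder allele at this locus.
        asd = mean([(allele - founder_allele) ** 2 for allele in locus])
        generations = asd / MUTATION_RATE
        ages_ky.append(generations * YEARS_PER_GENERATION / 1000.0)
    age = mean(ages_ky)
    # Standard error taken across loci (an assumption of this sketch).
    se = stdev(ages_ky) / sqrt(len(ages_ky)) if len(ages_ky) > 1 else 0.0
    return age, se, founder


if __name__ == "__main__":
    # Hypothetical two-locus haplotypes, not data from the study.
    toy = [(14, 13), (14, 13), (15, 13), (14, 13)]
    age, se, founder = coalescence_time_ky(toy)
    print(f"median haplotype {founder}, TC = {age:.2f} ky, SE = {se:.2f} ky")
```

With real data each haplotype would carry all 10 STR markers listed in the table header, and the median haplotype returned by the function corresponds to the DYS columns of Table S5.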
Fig S2. [Principal component plot; axes PC1 and PC2; points labeled by population.]
Fig S3. [Principal component plot; labeled axis: PC2 (19%); points labeled by population. Legend, language groups: Nakh-Dagestanian, Abkhaz-Adyghe, Turkic, Indo-European, Afro-Asiatic, Kartvelian. Symbols: Caucasus populations, Other populations.]
Fig S4. [Principal component plot; labeled axis: PC2 (12.5%); points labeled by population. Legend, language groups: Nakh-Dagestanian, Abkhaz-Adyghe, Turkic, Indo-European, Afro-Asiatic, Kartvelian. Symbols: Caucasus populations, Other populations.]
Fig S5. [Panels numbered 3 to 10; one column per population, grouped as Africa, Near East & Europe, The Caucasus (South, North), Cen. Asia, South Asia, East Asia.]
Fig S6
Fig S7
Fig S8