The Central Asian Landscape: Possible Inquiries into the Population History and Structure of Mongolia through Quantitative Genetic Analyses

R.W. Schmidt

INTRODUCTION

Mongolia, located in central Asia (see Figure 1), has been the subject of varied and

extensive genetic analyses, addressing the possible founding populations of North America

(Kolman et al., 1996; Merriweather et al., 1996), modern ethnogenetic hypotheses for

groups currently inhabiting the country and surrounding areas (Nasidze et al., 2005;

Keyser-Tracqui et al., 2006; Fu et al., 2007), the likely Y-chromosomal lineage of

Genghis Khan and his male-line descendants and the extensive geographic expansion in

which it is found (Zerjal et al., 2003), and lastly, the complex processes of unraveling the

underlying genetic variation seen in the larger regional context of central Asia (Comas et

al., 1998; Yao et al., 2000; Wells et al., 2001; Oota et al., 2002; Zerjal et al., 2002; Comas

et al., 2004; Quintana-Murci et al., 2004; Yao et al., 2004; Bennett and Kaestle 2006;

Derenko et al., 2007). The majority of these studies utilize common genetic markers,

such as mitochondrial DNA (mtDNA) and the Y chromosome, which have yielded

significant findings in anthropological genetic research (for a review see Crawford,

2007).

This paper will make use of existing research on the genetics of Mongolia and

central Asia to explore the population history and structure of a region that has been

inhabited by a diverse mixture of individuals and groups, who have occupied favorable

and unfavorable environments, and who now define clearly demarcated boundaries in the

form of nation-states. Central Asia is a vast territory located at the confluence of

historical empires and trade, crossed by the famous Silk Road with contacts to the south

in India and open to the steppes of the north. This region is essential to understanding

complex cultural phenomena such as acculturation, assimilation, languages, overlapping

economies, and ways of life that include migrations, expansions and conquests.

These topics will be investigated through current research on genetic markers

(including ancient DNA) and migration studies in Mongolian and central Asian populations

(Perez-Lezaun et al., 1999). Also, biological variation will be investigated through the

use of quantitative trait variation, which may or may not correlate with historical and

genetic findings. Few studies in Mongolian population history and structure have given

primacy to quantitative analysis. This paper will utilize quantitative trait variation in the

form of craniometric measurements as a tool to potentially understand the complex

history of Mongolia and other nomadic groups now inhabiting the central Asian

landscape.

FIGURE 1. Map of Mongolia

MATERIALS AND METHODS

For comparative purposes in evaluating quantitative craniometric data, groups

were aggregated into major geographic regions with some partitioning: China, Japan,

Mongolia, Siberia, Southeast Asia, Europe, India, West Africa, North Africa, Mideast

(includes Israel, Iran, and Iraq), Russia, and North America (see Table 1). Group

differences were assessed with Wilks’ Lambda and discriminant function classification,

with significant differences between all groups (p ≤ .001). The Mongolian groups were

aggregated because of small sample sizes. Groups were further combined by time period:

Bronze Age, Mongolian period, Hunnu and modern. In addition, one group was labeled

“test”. A discriminant function analysis was conducted to ascertain possible group

differences (n = 14) that may skew statistical interpretation. The Mongolian “Iron Age”

and Mongolian “Bronze Age” did show significant statistical differences (p < .05) and

were therefore excluded from additional analysis (see Figure 2 and Table 2).

TABLE 1. Samples used in current study

Sample              N
China             105
Japan             144
North China        54
Mongolia          109
Siberia            10
Southeast Asia     69
Europe             90
India              39
West Africa        36
North Africa       45
Middle East        40
Russia             59
North America      76
Total             876

FIGURE 2. Mongolian Classification and Group Differences (canonical discriminant functions, Function 1 versus Function 2; plotted groups: Mong "Test", Mong Modern, Mong Bronze, Mong Period, Mong Hunnu, Mong Iron Age?, N China, S China, with group centroids)

TABLE 2. R matrix values for Chinese and Mongolian Samples

Population               S China     N China       Iron?       Hunnu  Mongperiod      Bronze      Modern      "Test"
S China                   0.0000
N China                   0.1028      0.0000
Mongolia Iron?            0.0854      0.0936      0.0000
Mongolia Hunnu           -0.0672     -0.0664     -0.0769      0.0000
Mongolian period         -0.0609     -0.0584     -0.0583      0.0357      0.0000
Mongolia Bronze          -0.0401     -0.0669     -0.0673      0.0300      0.0144      0.0000
Mongolia Modern          -0.0859     -0.0820     -0.0707      0.0519      0.0372      0.0045      0.0000
Mongolia "Test"          -0.0572     -0.0720     -0.0579      0.0347      0.0296      0.0331      0.0458      0.0000

All samples were taken from the University of Michigan’s Museum of

Anthropology database kindly provided by Dr. Noriko Seguchi. Only males were used in

the analysis to facilitate statistical competence. Seventeen craniofacial measurements

were taken on all samples, with no missing data. See Table 3 for traits used in this

analysis. For definitions of measurements, see Brace and Tracer (1992). Metric variables

record inherited differences in cranial and facial form; further, configurations in facial

form remain stable over considerable periods of time, making them excellent indicators

of group similarities and differences (Brace et al., 2001).

TABLE 3. Traits used in this analysis with corresponding abbreviations

Quantitative Trait                          Abbreviation
Nasal Height                                nasoht
Nasal bone height                           nasbnht
Nasion prosthion length                     naprlng
Nasion basion                               nasbas
Basion prosthion                            baspros
Superior nasal bone width                   supnasbn
Inferior nasal bone width                   infnasbn
Nasal breadth                               nasbrdt
Frontoorbital width subtense at nasion      fowsubna
Mid orbital width subtense at rhinion       mowsubri
Bizygomatic breadth                         bizygoma
Glabella opisthocranion                     glabopis
Maximum cranial breadth                     maxbredt
Basion bregma                               basibreg
Basion rhinion                              basirhin
Width at 13 (fronto malar temporalis)       fmtfmt
Mid orbital width (width at 14)             mowidth

An analytical model has been used for this study. Quantitative variation will be

explored through the use of an R matrix analysis. R matrix analysis has become a

standard method for investigating population structure and history in both modern and

prehistoric contexts using quantitative traits due in large part to the interpretive quality of

the results (e.g. Relethford and Blangero, 1990; Relethford et al., 1997; Steadman, 2001;

Stojanowski, 2005). The R matrix (Relethford-Blangero) analysis has a number of

interpretive qualities that are useful for microevolutionary studies. Genetic distances

between pairs of populations can be estimated directly from the R matrix (Harpending

and Jenkins, 1973; Williams-Blangero and Blangero, 1989) as well as estimates of

phenotypic Fst. Genetic distances represent morphological similarity and difference

between samples and serve as an indication of the rate of migration and mate exchange,

assuming the effects of random genetic drift are minimal (Relethford, 1996). Fst is a

measure of regional estimates of microdifferentiation (heterogeneity) based on the

contemporary array of allele frequencies (or quantitative traits). Large estimates of Fst are

the result of less gene flow or smaller population sizes, and smaller estimates of Fst are

the result of extensive gene flow between subpopulations. Significance tests for Fst are

calculated from standard errors, following Relethford et al. (1997).
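
The two quantities described above can be illustrated with a short worked example. The sketch below is not RMET; it assumes the standard relationship d2(ij) = r(ii) + r(jj) - 2r(ij) for the distance between two populations and treats Fst as the mean of the diagonal r(ii) values, with equal population weights as an assumption. The function and variable names are hypothetical, and the numbers are copied from Tables 5 and 7.

```python
# Illustrative sketch only; this is not RMET. Names are hypothetical.
# Unbiased r(ii) values (distance to the regional centroid) copied from Table 5.
UNBIASED_RII = {
    "Chinese": 0.074761, "Japanese": 0.064670, "North China": 0.119473,
    "Mongolia": 0.143851, "Siberia": 0.178754, "SE Asia": 0.095308,
    "Europe": 0.077098, "India": 0.170832, "West Africa": 0.269766,
    "Mideast": 0.071217, "North Africa": 0.077273, "Russia": 0.102563,
    "North America": 0.095168,
}


def d2(r_ii, r_jj, r_ij):
    """Squared genetic distance between populations i and j from R matrix entries."""
    return r_ii + r_jj - 2.0 * r_ij


def fst(r_diagonal, weights=None):
    """Fst as the weighted mean of the R matrix diagonal (equal weights assumed by default)."""
    values = list(r_diagonal)
    if weights is None:
        weights = [1.0 / len(values)] * len(values)
    return sum(w * r for w, r in zip(weights, values))


# Mongolia and China: r(ij) = 0.038 is the above-diagonal entry in Table 7.
mongolia_china_d2 = d2(UNBIASED_RII["Mongolia"], UNBIASED_RII["Chinese"], 0.038)
regional_fst = fst(UNBIASED_RII.values())

print(f"d2(Mongolia, China) = {mongolia_china_d2:.4f}")
print(f"unbiased Fst = {regional_fst:.6f}")
```

With these inputs the distance comes to roughly 0.143, close to the tabulated 0.142 (the r(ij) entry is available only to three decimals), and the equal-weight mean of the unbiased r(ii) values agrees with the unbiased Fst reported in Table 5.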

The R matrix also has another important interpretive function that is used to generate

estimates of differential extralocal gene flow by comparing observed and expected levels

of within-sample variability (Relethford and Blangero, 1990). The residual value (the

difference between the observed and expected values) indicates the rate of external alleles

being introduced into a subpopulation from outside the mating network. Positive

residuals indicate greater than average external gene flow, and negative residuals

indicate the opposite (Reddy 2001). Taken together, these analyses provide a robust

interpretation concerning the details on patterns of group affinity and phenotypic

variation among the selected populations.
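
As a worked illustration of the residual calculation, the sketch below assumes the Relethford-Blangero expectation E[v(i)] = v(w) x (1 - r(ii)) / (1 - Fst), where v(w) is the mean within-group phenotypic variance, and assumes equal population weights. It is not the RMET implementation; the names are hypothetical and the inputs are the r(ii) and observed variance columns of Table 6.

```python
# Illustrative sketch only; this is not RMET. Names are hypothetical.
# population -> (unbiased r(ii), observed within-group variance), copied from Table 6.
TABLE_6 = {
    "Chinese": (0.074761, 0.694), "Japanese": (0.064670, 0.688),
    "North China": (0.119473, 0.703), "Mongolia": (0.143851, 1.190),
    "Siberia": (0.178754, 0.784), "SE Asia": (0.095308, 0.692),
    "Europe": (0.077098, 0.809), "India": (0.170832, 0.629),
    "West Africa": (0.269766, 0.663), "Mideast": (0.071217, 0.695),
    "North Africa": (0.077273, 0.764), "Russia": (0.102563, 0.806),
    "North America": (0.095168, 0.639),
}


def relethford_blangero(table):
    """Return {population: (expected variance, residual)} assuming equal weights."""
    n = len(table)
    fst = sum(r_ii for r_ii, _ in table.values()) / n       # mean distance to centroid
    mean_var = sum(obs for _, obs in table.values()) / n    # mean within-group variance
    results = {}
    for pop, (r_ii, observed) in table.items():
        expected = mean_var * (1.0 - r_ii) / (1.0 - fst)
        results[pop] = (expected, observed - expected)      # residual = observed - expected
    return results


results = relethford_blangero(TABLE_6)
for pop, (expected, residual) in results.items():
    print(f"{pop:<14}{expected:7.3f}{residual:8.3f}")
```

Under these assumptions the Mongolian sample has an expected variance of about 0.729 and a residual of about 0.461, in line with the values reported in Table 6.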

Raw data sets were analyzed using the quantitative genetics software RMET 5.0,

provided by John Relethford (Relethford et al., 1997). RMET allows for trait heritability

to be estimated. A heritability of 1.0 produced both minimum genetic distances and

estimates of minimum Fst that are comparable to other phenotypic studies (Hemphill,

1998; Steadman, 2001); however, because a heritability of one for craniometric variation

(which includes environmental variance) is not possible, an estimate of 0.55 was used

according to Relethford and Blangero (1990). They found that using an average of 0.55

for craniometric trait heritability did not significantly alter the results. That is, the average

heritability is a fairly robust one (although see Carson, 2006). This study has used a

heritability of 1.0 and 0.55 for comparisons. All tables shown use minimum Fst and

genetic distances (h2 = 1.0). Unless otherwise noted, the results using differential trait

heritability were similar.

RESULTS

Means and standard deviations for the Mongolian sample are shown in Table 4.

The results from the R matrix analyses are shown in Tables 5 through 7. Table 5 gives

distance to the centroid (rii) and unbiased Fst values for all 13 populations. Table 6

displays the results of the Relethford-Blangero residuals and Table 7 gives the results for

the genetic (d2) distances among all sampled populations.

TABLE 4. Means and standard deviations for 17 craniometric measurements for the Mongolian sample

Trait                                         Mean      SD
nasal height                                 53.84    3.44
nasal bone height                            27.42    3.12
nasion prosthion length                      74.78    5.25
nasion basion                               100.76    4.58
basion prosthion                             98.08     5.5
superior nasal bone width                    11.20    2.35
inferior nasal bone width                     19.0   12.50
nasal breadth                                26.66    2.24
frontoorbital width subtense at nasion       18.85    3.14
mid orbital width subtense at rhinion        17.71    3.99
bizygomatic breadth                         139.94    6.63
glabella opisthocranion                     183.81    6.77
maximum cranial breadth                     147.84    6.78
basion bregma                               130.76    5.35
basion rhinion                              103.31    5.60
width at 13 (fronto malar temporalis)       107.58    4.45
mid orbital width (width at 14)              57.32    4.93

TABLE 5. R matrix results: Genetic distance (biased and unbiased) to the centroid for all 13 populations (h2 = 1.0)

Population        Biased r(ii)   Unbiased r(ii)          se
Chinese               0.079523         0.074761    0.008804
Japanese              0.068142         0.064670    0.006959
North China           0.128733         0.119473    0.015621
Mongolia              0.147474         0.143851    0.010458
Siberia               0.228754         0.178754    0.048388
SE Asia               0.102555         0.095308    0.012334
Europe                0.082653         0.077098    0.009695
India                 0.183652         0.170832    0.021954
West Africa           0.283655         0.269766    0.028398
Mideast               0.083717         0.071217    0.014636
North Africa          0.088384         0.077273    0.014179
Russia                0.111038         0.102563    0.013897
North America         0.101747         0.095168    0.011706

Fst = 0.13002
Unbiased Fst = 0.118518
se = 0.004779

TABLE 6. R matrix results: Relethford-Blangero residuals (h2 = 1.0)

                                Within-group Phenotypic Variance
Population            r(ii)     Observed    Expected    Residual
Chinese            0.074761        0.694       0.788      -0.094
Japanese            0.06467        0.688       0.796      -0.108
North China        0.119473        0.703        0.75      -0.047
Mongolia           0.143851         1.19       0.729       0.461
Siberia            0.178754        0.784       0.699       0.085
SE Asia            0.095308        0.692        0.77      -0.078
Europe             0.077098        0.809       0.786       0.023
India              0.170832        0.629       0.706      -0.077
West Africa        0.269766        0.663       0.622       0.041
Mideast            0.071217        0.695       0.791      -0.096
North Africa       0.077273        0.764       0.786      -0.021
Russia             0.102563        0.806       0.764       0.042
North America      0.095168        0.639        0.77      -0.131

TABLE 7. Genetic distances among 13 populations used in analysis (h2 = 1.0)

Pop           China    Japan   NChina     Mong  Siberia  SE Asia   Europe    India  WAfrica  Mideast  NAfrica   Russia NAmerica
China         0.000    0.045    0.079    0.038    0.027    0.051   -0.040   -0.052   -0.021   -0.059   -0.064   -0.056   -0.028
Japan         0.049    0.000    0.056   -0.002    0.023    0.023   -0.034   -0.043    0.007   -0.035   -0.028   -0.035   -0.045
N China       0.036    0.073    0.000    0.011    0.025    0.037   -0.055   -0.068   -0.060   -0.046   -0.038   -0.040   -0.029
Mong          0.142    0.213    0.241    0.000    0.066    0.007    0.027   -0.104   -0.065   -0.064   -0.068   -0.025    0.030
Siberia       0.199    0.196    0.249    0.190    0.000   -0.041   -0.032   -0.130   -0.024   -0.086   -0.069   -0.051    0.062
SE Asia       0.068    0.114    0.141    0.226    0.356    0.000   -0.045    0.006    0.028   -0.035   -0.042   -0.042   -0.047
Europe        0.231    0.210    0.307    0.168    0.319    0.263    0.000    0.000   -0.066    0.043    0.037    0.062    0.019
India         0.350    0.322    0.426    0.522    0.610    0.255    0.247    0.000    0.079    0.073    0.059    0.016   -0.020
WAfrica       0.387    0.320    0.509    0.543    0.497    0.310    0.479    0.283    0.000    0.001   -0.019   -0.091   -0.053
Mideast       0.265    0.207    0.284    0.342    0.421    0.236    0.062    0.095    0.339    0.000    0.072    0.059    0.073
N Africa      0.279    0.197    0.273    0.357    0.393    0.257    0.080    0.130    0.385    0.004    0.000    0.073   -0.003
Russia        0.289    0.238    0.302    0.296    0.383    0.282    0.055    0.242    0.555    0.056    0.034    0.000    0.019
NAmerica      0.225    0.250    0.272    0.179    0.150    0.285    0.134    0.305    0.470    0.182    0.179    0.160    0.000

Note: Values above the diagonal are derived from the R matrix. Values below the diagonal are derived from d2 distances.

Visual representations of group affinity are given in Figures 3, 4 and 5. Figure 3 is

the genetic distance map (axes scaled by the square root of their eigenvalues) produced from

the Relethford-Blangero analysis. The first two principal coordinates account for 64.6%

of the variation. Figure 4 plots group centroids on the first two canonical variates and

Figure 5 plots group centroids on the first three canonical variates resulting from

discriminant function analysis. The first three canonical variates account for 76.8% of the

variation.

FIGURE 3. Genetic Distance Map (principal coordinates PC1 and PC2 for the 13 populations; axis annotation: 37.4%)

FIGURE 4. Plot of the first two canonical variates resulting from discriminant function analysis for 13 groups, 17 variables (axes: Function 1 and Function 2)

FIGURE 5. Plot of the first three canonical variates resulting from discriminant function analysis for 13 groups, 17 variables

DISCUSSION

Little is known about the people of Mongolia prior to the rise of Genghis Khan

(Keyser-Tracqui et al., 2006). Early in Mongolia’s history, there were many warlike tribes

inhabiting the region, usually nomadic, similar to other peoples of the central Asian

steppe. These nomadic tribes sometimes united with other peoples of the steppe, forming

large confederations that routinely threatened places like China, Europe, and the Middle

East. These confederacies rarely lasted; however, these conflicts did redistribute people

and left particular genetic impressions.

Central Asia is a vast territory that has been central to the development of human

history because of its strategic location. The territory has been a complex assembly of

peoples, cultures, and habitats. The area has been occupied since Lower Paleolithic times,

and there is evidence of Neanderthal skeletal material in Uzbekistan (Comas et al., 2004).

The genetic legacy of the Mongols was expanded with the rise of Temujin (c.

1162-1227), otherwise known as Genghis Khan (Chinggis Khaan) and later the formation

of the Yuan Dynasty (1271-1368) (Mote 1999). By 1206 all tribes had come under the

rule of Temujin, who began establishing the Mongol Empire. Genghis

Khan and his immediate successors conquered nearly all of Asia and European Russia and

sent armies as far west as the Middle East and south into Southeast Asia. This

was the largest land empire known in history (Figure 6).

FIGURE 6. Map showing the extent of the Mongol Empire circa 1294

Genghis Khan and his male-line descendants left a large genetic imprint across

the Old World by ruling large areas of Asia for many generations. Genghis Khan and his

descendants would often slaughter large segments of the population under their control,

which allowed a new genetic signature to thrive (Mote, 1999; Zerjal et al., 2003). Zerjal

et al. (2003) suggest the Mongol ruler and his male lineage may be responsible for a

“star-cluster” Y-chromosomal pattern found throughout a large geographical area

extending from Central Asia to the Pacific. This “star-cluster” formation (closely related

lineages) is found in 16 populations extending from the Pacific to the Caspian Sea and is

found at high frequency (~8%), suggesting that it does not result from an event specific to

any single population (Zerjal et al., 2003). It is possible that a form of social selection is

responsible for the observed pattern. That is, a novel form of selection, based on social

prestige (descent from Genghis Khan), favored this lineage in various human populations.

Central Asia is a major contact point for many diverse peoples. As such, the

history and development of the Mongolian population was a complex process affected by

the mixture of ethnically diverse groups (Keyser-Tracqui et al., 2006). Importantly, little is

known genetically of this region, which has played a crucial role in the history of

humankind (the Silk Road), where contacts and trade occurred between the steppe

peoples of the north and peoples of India in the south. These contacts should have

resulted in the generation of complex cultural phenomena, such as acculturation,

assimilation, language acquisition, overlapping economies, all acting upon the genetic

makeup of diverse groups found throughout central Asia.

Comas et al. (1998; 2004) found the central Asian genetic landscape to present

features (such as frequencies of certain nucleotides, levels of nucleotide diversity, mean

pairwise differences, and genetic distances) intermediate between Europe and eastern

Asia, possibly suggesting significant gene flow enhanced by the trade routes

along the Silk Road. Further, these researchers point to mtDNA eastern Asian sequences

in central Asia originating in the Mongols and/or Chinese (Comas et al., 1998). Yao et

al. (2000) examined mtDNA control region segment I and melanocortin 1 receptor

(MC1R) gene polymorphisms along the Silk Road region of China. In congruence with

Comas et al. (1998) in the larger region of central Asia, both the frequencies of the

MC1R variant and the mtDNA presented intermediate values between those of Europe

and East and Southeast Asia, suggestive of extensive admixture in this area of increased

contact and interaction.

This study makes use of quantitative trait variation, and its results

are similar to those of the genetic analyses described above. Table 6 shows the results from the

Relethford-Blangero analysis. Within-group phenotypic variance is greatest in Mongolia

(1.190), indicating greater than expected extralocal gene flow (residual = 0.461). In fact, Mongolia

has the largest positive residual of all 13 groups. This finding would suggest that significant

admixture has been occurring in Mongolia despite the relatively nomadic lifestyle of many

groups. Figures 3, 4, and 5 all suggest an intermediate position for Mongolia between

European and East Asian populations. Interestingly, although contacts have been

persistent between central Asia and India, there is little indication that gene flow has been

occurring between Mongolians and people of the Indian subcontinent. India is seen as a

consistent outlier in all three analyses, clustering closer to the Middle East and North

Africa.

Genetic distances resulting from the quantitative analyses are also informative.

The lowest d2 values for Mongolia are with China (0.142), Europe (0.168), and North America

(0.179). The R matrix values yield similar results for Mongolia, indicating a closer

genetic relationship to the Chinese groups, Southeast Asia, North America, Siberia, and

Europe. Kolman et al. (1996) suggest that central Asian groups (including Mongolia)

represent the closest link between the Old World and the New World using mtDNA

diversity. They propose that the narrow geographic corridor of east Central Asia, extending

from Mongolia to the Pacific coast may have served as a starting point for the human

migration that led to the colonization of the New World. Although this study does not

capture the more nuanced underlying variation that could support this hypothesis, the

data do suggest an affinity between Mongolians and North American Indian groups.
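
The comparison of distances discussed above can be reproduced by ranking the d2 values that involve Mongolia. The sketch below is only an illustration: the values are the below-diagonal entries for Mongolia copied from Table 7 (h2 = 1.0), and the variable names are hypothetical.

```python
# Illustrative sketch; d2 entries involving Mongolia copied from Table 7 (h2 = 1.0).
MONGOLIA_D2 = {
    "China": 0.142, "Japan": 0.213, "North China": 0.241, "Siberia": 0.190,
    "SE Asia": 0.226, "Europe": 0.168, "India": 0.522, "West Africa": 0.543,
    "Mideast": 0.342, "North Africa": 0.357, "Russia": 0.296,
    "North America": 0.179,
}

# Sort populations from the smallest to the largest distance from Mongolia.
ranked = sorted(MONGOLIA_D2.items(), key=lambda item: item[1])
nearest_three = [pop for pop, _ in ranked[:3]]

for pop, dist in ranked:
    print(f"{pop:<14}{dist:.3f}")
```

Sorted this way, China, Europe, and North America are the three smallest distances, with India and West Africa the largest.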

CONCLUSION

The analyses conducted in the present study indicate the utility of quantitative

genetic variation. Although the R matrix analysis cannot resolve the finer underlying

variation within the Mongolian population, its results are consistent with recent

genetic studies using mtDNA, Y-chromosome, and ancient DNA analyses.
