Chapter 9 Quantitative Genetics

Chapter 9 Quantitative GeneticsRead Chapter 9.

Traits such as cystic fibrosis or flower color in peas produce distinct phenotypes that are readily distinguished.

Such discrete traits, which are determined by a single gene, are the minority in nature.

Most traits are determined by the effects of multiple genes.

Continuous variationHowever, traits determined by many genes

(polygenic traits) show continuous variation.

Grain color in winter wheat is determined by three alleles at three loci.

Additive effects of genesThe genes affecting color of winter wheat interact in

a particularly straightforward way.

They have additive genetic effects.

This means that the phenotype for an individual is obtained just by summing the effects of individual alleles.

The more alleles for dark color an individual has the darker it will be

Continuous variationExamples in humans of traits that show

continuous variation include height, intelligence, athletic ability, and skin color.

Quantitative traitsFor continuous traits we cannot assign

individuals to discrete categories. Instead we must measure them.

Therefore, characters with continuously distributed phenotypes are called quantitative traits.

Quantitative traitsQuantitative traits determined by influence of

(1) genes and (2) environment.

East (1916)In early 20th century there was considerable

debate over whether Mendelian genetics could explain continuous traits.

Edward East (1916) showed it could.Studied longflower tobacco (Nicotiana

longiflora)

East (1916)East studied corolla length (part of flower) in

tobacco flowers.

Crossed pure breeding short and long corolla individuals to produce F1 generation. Crossed F1’s to create F2 generation.

East (1916)Using Mendelian genetics we can predict

expected character distributions if character determined by one gene, two genes, or more etc.

East (1916)Depending on number of genes: models

predict different numbers of phenotypes.One gene: 3 phenotypesTwo genes: 5 phenotypesSix genes: 13 phenotypes. Continuous

distribution.

East (1916)How do we decide if a quantitative trait is

under the control of many genes?In one and two locus models many F2 plants

have phenotypes like the parental strains.Not so with 6-locus model. Just 1 in 4,096

individuals will have the genotype aabbccddeeff.

East (1916)But, if Mendelian model works you should be

able to recover the parental phenotypes through selective breeding.

East selectively bred for both short and long corollas. By generation 5 most plants had corolla lengths within the range of the original parents.

East (1916)Plants in F5 generation of course were not

exactly the same size as their ancestors even though they were genetically identical.

Why?

East (1916)Environmental effects.

Depending on environment genetically identical organisms may differ greatly in phenotype.

Genetically identical plants

grown at different elevations

differ enormously

(Clausen et al. 1948)

The importance of latent variationEarly work in the 2oth century on polygenic

traits showed that new types or values of traits not seen in a parent population could appear in offspring produced by that population.

It was unclear where these new variants came from. It’s easy to see in figure A (next slide) how natural selection could favor some members of a population so that after a time the mean values of a population would increase within the range of previous variation.

The importance of latent variationHowever, it’s less clear how a population

could as a result of natural selection arrive at B in the previous slide in which the selected population is outside the range of the original population.

The key to understanding this phenomenon is to realize that when multiple genes contribute to a trait there will be many, many unique combinations of alleles that produce different phenotypes.

A population is not likely to include all of these possibilities.

Thus, a new variant can contain an assortment of alleles not seen previously. See next slide.

Gene interactionsNot all genes interact additively with the

alleles’ effects summing together.

In many cases genes interact with each other nonadditively a phenomenon we call epistasis.

Gene interactionsFor example, two loci influence coat color in

oldfield mice, but they interact epistatically.

The effect of the Mc1R allele depends on which alleles are present at the agouti locus (next slide).

Population genetics of multiple lociA locus is the physical location on a chromosome

where a gene occurs.

Different versions of a gene are called alelles.

The Hardy-Weinberg models we have discussed so far are quite simple because they consider only a single locus and its alleles.

However, many traits are controlled by the combined influence of many genes.

Population genetics of multiple lociGenes located on different chromosomes

segregate (i.e. they enter gametes) independently of each other.

However, when genes are located on the same chromosome they frequently do not segregate independently, especially if they are located close to each other on a chromosome. Such loci have a physical linkage.

Population genetics of multiple lociThe closer together two loci are on a

chromosome the less likely it is that crossing over will occur between the loci during meiosis and split them up.

In most cases they will be inherited as a pair.

Population genetics of multiple lociConsider a pair of loci located on same

chromosome.

Gene at locus A has two alleles A and a

Gene at locus B has two alleles B and b

Population genetics of multiple lociIn two-locus Hardy-Weinberg analysis we

track allele and chromosome frequencies.

Thus 4 possible chromosome genotypes are possible in previous slide:

AB, Ab, aB, abA multilocus genotype is referred to as a

haplotype (from haploid genotype).

Statistical associations between lociDoes selection on locus A affect our ability to

make predictions about evolution at locus B?

Sometimes. Depends on whether loci are in linkage equilibrium or linkage disequilibrium.

Statistical associations between lociTwo loci in a population are in linkage

equilibrium when the genotype of a chromosome at one locus is independent of the genotype at the other locus on the same chromosome.

I.e. knowing genotype at one locus is of no use in predicting genotype at the other locus.

Statistical associations between lociIn contrast two loci are said to be in linkage

disequilibrium when knowing the allele at one locus enables you to predict what the allele at the other locus likely is.

For example in a population where there are AB, Ab, and aB haplotypes, but no ab haplotypes if we know an individual has a b allele we know that individual also has at least one A allele.

Quantifying linkage disequilibriumTo measure the associations between allele

frequencies at two loci A and B we examine the haplotype frequencies at these loci.

Let fA, fB, fa and fb be the frequencies of the A, B, a and b alleles respectively.

Let hAB, hAb, haB, hab be the haplotype frequencies of AB, Ab, aB and ab haplotypes.

Quantifying linkage disequilibriumIf the allele at the A locus occurs

independently of the allele at the B locus then the haplotype frequencies will be:hAB = fAfB

hAb = fAfb

haB = fafB

hab = fafb

Quantifying linkage disequilibriumSo the expected haplotype frequency is found

just by multiplying the appropriate allele frequencies by each other.

If the frequency of allele A (fA) = 0.7 and the frequency of allele B (fB) = 0.8 then the expected haplotype frequency hAB, if the alleles are in linkage equilibrium, would be 0.56.

Coefficient of linkage disequilibriumTo measure the degree of linkage

disequilibrium we can calculate a coefficient of linkage disequilibroum (D).

For a given haplotype this is defined as the difference between the actual frequency we observe of a haplotype, e.g. AB, and the expected frequency fAfB of the same haplotype if the loci are independent.

D = hAB - fAfB

Coefficient of linkage disequilibriumWhen the alleles at each locus occur

independently then the coefficient of linkage disequilibrium will be zero. We then say the alleles are in linkage equilibrium.

If the alleles at each locus occur non-independently then the value of D will be non-zero and we say they are in linkage disequilibrium.

Coefficient of linkage disequilibriumIn a gene pool the frequencies of the alleles

are as follows: A = 0.4, a= 0.6, B=0.3 and b= 0.7.

The haplotype frequencies are AB = 0.12, Ab =0.28, aB = 0.18 and ab=0.42.

Is the population in linkage equilibrium?

Coefficient of linkage disequilibriumYes.

hAB = 0.12 fAfB = 0.3*0.4 = 0.12hAb = 0.28 fAfb = 0.4*0.7 = 0.28haB = 0.18 fafB = 0.6*0.3 = 0.18hab = 0.42 fafb = 0.6*0.7 = 0.42

For each haplotype D = zero e.g. D = hAB – fAfB = 0.12-0.12 = 0

Coefficient of linkage disequilibriumIn a second gene pool the frequencies of the

alleles are as follows: A = 0.6, a= 0.4, B=0.8 and b= 0.2

The observed haplotype frequencies are AB = 0.44, Ab =0.16, aB = 0.36 and ab=0.04.

Is this population in linkage equilibrium?

Coefficient of linkage disequilibriumNo.

hAB = 0.44 fAfB = 0.6*0.8 = 0.48hAb = 0.16 fAfb = 0.6*0.2 = 0.12haB = 0.36 fafB = 0.4*0.8 = 0.32hab = 0.04 fafb = 0.4*0.2 = 0.08

For each haplotype D not equal to zero e.g. D = hAB – fAfB = 0.44-0.48 = -0.04

Coefficient of linkage disequilibriumAnother way to calculate the coefficient of

linkage equilibrium if we just know haplotype frequencies is the following equation:

D = hABhab - hAbhaB

The value of this equation will be zero if the haplotypes are in linkage equilibrium.

Proof of the formula for linkage disequilibriumD = hABhab - hAbhaB

Let p and q be the frequencies of alleles A and a.Let s and t be the frequencies of alleles B and b.If the population is in linkage equilibrium then hAB = ps, hab = qt, hAb = pt, haB = qsTherefore rewriting the equation for linkage

disequilibrium in terms of allele frequencies we get D = psqt - ptqs which equals zero if the population is

in linkage equilibrium.Any value of D not equal to zero implies the population

is in linkage disequilibrium.

Coefficient of linkage disequilibriumIs this population, which has the following

haplotypes, in linkage equilibrium?

AB= 0.46, Ab = 0.14 aB = 0.34 ab= 0.06

Coefficient of linkage disequilibriumUse the formula: D = hABhab - hAbhaB

D = 0.46*0.06 – 0.14*0.34

D = 0.0276 – 0.0476 = -0.02

D is not equal to zero, so the population is in linkage disequilibrium.

Coefficient of linkage disequilibriumThe maximum value for D is 0.25 when AB

and ab are the only haplotypes present and each has a frequency of 0.5.

The minimum value for D is -0.25 when Ab and aB are the only haplotypes present and each has a frequency of 0.5.

This formula thus tells us not only whether a population is in linkage disequilibrium but how strong the disequilibrium is.

What creates linkage disequilibrium in populations?Multiple Mechanisms:

MutationSelection on multilocus genotypes.Genetic driftMigration

MutationA population contains only the haplotypes AB and

aB. A mutation occurs with the haplotype aB so that B

mutates to b producing the haplotype ab.

This population will have the genotype aB , AB and ab, but there will be no Ab haplotypes.

Hence, the population will be in linkage disequilibrium because of the missing Ab haplotype.

Selection on multilocus genotypes.Scenario: Either of two biosynthetic pathways is

sufficient to produce an essential molecule from two precursor molecules.

Each pathway is controlled by a single locus. The functional wild-type alleles (A & B) are dominant over the nonfunctional recessive alleles (a & b).

Only aabb individuals cannot produce the essential molecule.

Selection on multilocus genotypes.Because of selection against the aabb

genotype there will be fewer ab haplotypes than we would expect based on the allele frequencies of a and b.

Genetic driftScenario: Small population with two

genotypes AB and Ab. No copies of allele a.Single Ab chromosome mutation converts an

A to an a. This single ab chromosome puts population in linkage disequilibrium.

Scenario is drift because only in a small population would you expect to have only a single mutation of A to a. In large population you would expect many mutations of A to a and a to A.

Genetic driftScenario: a small population with AB, Ab, aB and ab

haplotypes where there is a low recombination rate between the A and B loci.

Drift can lead to the loss of alleles in a small population and haplotypes can disappear even more easily. If by chance all of one haplotype disappears then the population will have only three haplotypes.

Haplotypes need not necessarily disappear. In a small population random fluctuations in haplotype frequencies can easily lead to statistical associations between alleles and create linkage disequilibrium.

MigrationScenario: Suppose that the a & b alleles are fixed

in a mainland lizard population and the A&B alleles in an island lizard population.

Mainland thus has only the ab haplotype and the island the AB haplotype.

If some individuals migrate from the mainland to the island ab haplotypes will be introduced.

The population will be in linkage disequilibrium initially because there will be no aB and Ab haplotypes and a strong statistical association between the A and B alleles.

What eliminates linkage disequilibrium from a population?A population in linkage disequilibrium will

not stay in that state forever.

Unless no other evolutionary process prevents it (e.g. selection) linkage is broken down by recombination.

What eliminates linkage disequilibrium from a population?Sexual reproduction steadily reduces

linkage disequilibrium.

Crossing over during meiosis breaks up old combinations of alleles and creates new combinations.

Genetic recombinationGenetic recombination tends to randomize

genotypes in relation to other genotypes (i.e., it reduces linkage disequilibrium.)

Rate of decline in linkage disequilibrium is proportional to rate of recombination.

r is recombination rate, r is related to how far apart two loci are

on a chromosome.

Empirical example of genetic recombinationClegg et al. (1980) established two fruit fly

populations that were in linkage disequilibrium.

Population 1 AB and ab each 0.5 frequency.Population 2 aB and Ab each 0.5 frequency.

Empirical example of genetic recombinationPopulations of about 1,000 individuals

maintained for 48-50 generations.Flies allowed to mate freely. Populations sampled every 1-2 generations to

count frequencies of 4 haplotypes.

Empirical example of genetic recombinationCrossing-over created missing haplotypes in

each population and linkage disequilibrium disappeared.

In general, in random-mating populations sex is efficient enough at eliminating linkage disequilibrium that most alleles are in linkage equilibrium most of the time.

Practical reasons to measure linkage disequilibriumThere are two major uses of measures of

linkage disequilibrium.Can be used to reconstruct history of genes

and populationsCan be used to identify alleles recently favored

by positive selection

Reconstructing history of the CCR5-Δ32 locusHIV is the virus responsible for AIDS. It

parasitizes macrophages and T-cells of immune system. It enters by binding to two protein receptors on cell’s surface : CD4 and a coreceptor, usually CCR5.

Some people appear resistant to the virus even though exposed multiple times.

Some resistant individuals possess a mutant CCR5 co-receptor protein whose gene is missing 32 base pairs. This allele is referred to as the CCR5 Δ32 allele.

Reconstructing history of the CCR5-Δ32 locusFrequency of the CCR5-Δ32 allele is highest

in European populations (9%), but scarce or absent elsewhere.

Where did the CCR5-Δ32 allele come from and when did it originate?

Reconstructing history of the CCR5-Δ32 locusCCR5-Δ32 is located on chromosome 3 and

near two short-tandem repeat sites called GAAT and AFMB.

GAAT and AFMB are non-coding and have no effect on fitness. Both GAAT and AFMB have a number of different alleles.

Reconstructing history of the CCR5-Δ32 locusStephens et al. (1998) examined haplotypes

of 192 Europeans.Found that GAAT and AFMB alleles were in

close to linkage equilibrium with each other.

Reconstructing history of the CCR5-Δ32 locusHowever, CCR5 is in strong linkage

disequilibrium with both GAAT and AFMB.Almost all chromosomes carrying CCR5-Δ32

also carry allele 197 at GAAT and allele 215 at AFMB.

Reconstructing history of the CCR5-Δ32 locusMost likely reason for observed linkage

disequilibrium is genetic drift.Hypothesis: in past was originally only one

CCR5 allele the CCR5+ allele. Then a mutation on a chromosome with the

haplotype CCR5--GAAT-197--AFMB-215 created the CCR5Δ32 allele.

Reconstructing history of the CCR5-Δ32 locusThe CCR5Δ32 allele was favored by

selection and rose to high frequency dragging the other two alleles with it.

Since its appearance and spread, crossing over and mutation have been breaking down the linkage disequilibrium. Now about 15% of Δ32-197-215 haplotypes have changed to other haplotypes.

Reconstructing history of the CCR5-Δ32 locusBased on rates of crossing over and mutation

rates, Stephens et al. (1998) estimate the CCR5-Δ32 allele first appeared about 700 years ago (range of estimates 275-1875 years)

Reconstructing history of the CCR5-Δ32 locusBecause the CCR5-Δ32 increased in frequency so

rapidly selection must have been strong.

Most obvious candidate is an epidemic disease.

Myxoma virus a relative of smallpox uses CCR5 protein on cell surface to enter host cell, which suggests the epidemic disease that favored CCR5-Δ32 may have been smallpox.

However, timing of origin also closely matches period of bubonic plague.

Using linkage disequilibrium to detect strong positive selection.A new mutant allele will be in linkage

disequilibrium when it first appears. If it persists, it may increase in frequency.

Over time linkage disequilibrium will break down as a result of recombination from crossing over.

Linkage disequilibrium breaks down fastest for loci further apart on a chromosome because crossing over take place more often between distant loci.

Using linkage disequilibrium to detect strong positive selection.High linkage disequilibrium indicates an

allele originated recently.

Also, expect a recently mutated allele to be rare unless selection strongly favors it.

Using linkage disequilibrium to detect strong positive selection.If an allele is common, but has high linkage

disequilibrium, especially with loci that are located far away on the chromosome, this suggests that the allele has been strongly selected for and must have originated recently.

If the allele had arisen a long time ago, sex should have eliminated the linkage disequilibrium.

Using linkage disequilibrium to detect positive selection.An allele of G6PD (Glucose-6-phosphate

dehydrogenase), G6PD-202A has a high frequency (~18% in African populations) and has a high degree of linkage disequilibrium.

Thus, it appears to have been strongly selected for recently.

G6PD and malariaThere are many common G6PD deficiencies

and their distribution corresponds closely with the distribution of malaria.

Appears that G6PD-202A confers strong protection against malaria.

Documents

Chapter 9 Quantitative Genetics