Population Genetics - University of...

Preview:

Citation preview

Population Genetics

Joe Felsenstein

GENOME 453, Autumn 2015

Population Genetics – p.1/73

Godfrey Harold Hardy (1877-1947) Wilhelm Weinberg (1862-1937)

Population Genetics – p.2/73

A Hardy-Weinberg calculation

5 AA 2 Aa 3 aa

0.50 0.20 0.30

Population Genetics – p.3/73

A Hardy-Weinberg calculation

5 AA 2 Aa 3 aa

0.50 0.20 0.30

0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30

Population Genetics – p.4/73

A Hardy-Weinberg calculation

0.6 A

5 AA 2 Aa 3 aa

0.50 0.20 0.30

0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30

0.6 A 0.4 a

0.4 a

Population Genetics – p.5/73

A Hardy-Weinberg calculation

0.36 AA 0.24 Aa

0.24 Aa 0.16 aa

0.6 A

5 AA 2 Aa 3 aa

0.50 0.20 0.30

0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30

0.6 A 0.4 a

0.4 a

Population Genetics – p.6/73

A Hardy-Weinberg calculation

0.6 A

5 AA 2 Aa 3 aa

0.50 0.20 0.30

0.50 + (1/2) 0.20 (1/2) 0.20 + 0.30

0.6 A 0.4 a

0.4 a

Result:

0.36 AA

0.48 Aa0.16 aa

0.6 A

0.4 a

1/2

1/2

0.36 AA 0.24 Aa

0.24 Aa 0.16 aa

Population Genetics – p.7/73

Hardy-Weinberg mathematics

aaq 2Aapq

A

A a

a

Result:

AA Aa aa

P Q R

p

q

p qP + (1/2) Q (1/2) Q + R

p AA

q aa

p A

q a

2

22pq Aa

AAp2 Aapq

1/2

1/2

Population Genetics – p.8/73

Calculating the gene frequency (two ways)

Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa

Method 1. Calculate what fraction of gametes bear A:

AA

Aa

aa

Genotype Number

83

62

55

Genotype frequency

0.415

0.31

0.275

Fraction of gametes

0.57

0.431/2

A

a

all

1/2

all

Population Genetics – p.9/73

Calculating the gene frequency (two ways)

AA

Aa

aa

Genotype Number

83

62

55

0.57

0.43

A

a

A’s a’s

0

1100

166

62 62

228 172 = 400

228

400=

400

172=

Method 2. Calculate what fraction of genes in the parents are A:

Suppose that we have 200 individuals: 83 AA, 62 Aa, 55 aa

+

Population Genetics – p.10/73

A numerical example of natural selection

Genotypes: AA Aa aaRelative fitnesses 1 1 0.7 (viabilities)

Initial gene frequency of A = 0.2(newborns) 0.04 0.32 0.64

×1 ×1 ×0.7Survivors: 0.04 + 0.32 + 0.448 = 0.808

Genotype frequencies among the survivors (divide by the total)0.0495 0.396 0.554

gene frequencyA : 0.0495 + 0.5 × 0.396 = 0.2475a : 0.554 + 0.5 × 0.396 = 0.7525

genotype frequencies (among newborns)0.0613 0.3725 0.5663

Population Genetics – p.11/73

The algebra of natural selection

Genotype AA Aa aa

Frequency p2 2pq q2

Relative fitnesses wAA wAa waa

After selection p2 wAA 2pq wAa q2 waa

Sum = p2 wAA + 2pq wAa + q2 waa

Survivors: p2 wAA

Sum2pq wAa

Sumq2 waa

Sum

p′ = p2 wAA + pq wAa

p2 wAA + 2pq wAa + q2 waa

p′ = p(p wAA + q wAa)p2 wAA + 2pq wAa + q2 waa

p′ = p w̄A

Population Genetics – p.12/73

Is weak selection effective?

Suppose (relative) fitnesses are:

AA Aa aa

(1+s)2 1+s 1 So in this example each change of

a to A multiplies the fitnessby (1+s), so that it increases itby a fraction s.

0.01 − 0.1 0.1 − 0.5 0.5 − 0.9 0.9 − 0.99s

The time for gene frequency change, in generations, turns out to be:

change of gene frequencies

1

0.1

0.01

0.001

3.46 3.17 3.17 3.46

25.16 23.05 23.05 25.16

240.99 220.82 220.82 240.99

2399.09 2198.02 2198.02 2399.09

Age

ne fr

eque

ncy

of

generations

0

1

0.5

(1+s) (1+s)xx

Population Genetics – p.13/73

An experimental selection curve

Population Genetics – p.14/73

Rare alleles occur mostly in heterozygotes

Genotype frequencies:

0.81 AA : 0.18 Aa : 0.01 aa

Note that of the 20 copies of a,

18 of them, or 18 / 20 = 0.9 of them are in Aa genotypes

This shows a population in Hardy−Weinberg equilibrium

at gene frequencies of 0.9 A : 0.1 a

Population Genetics – p.15/73

Overdominance and polymorphism

AA Aa aa1 − s 1 1 − t

when A is rare, most A’s are in Aa, and most a’s are in aa

The average fitness of A−bearing genotypes is then nearly 1

The average fitness of a−bearing genotypes is then nearly 1−t

So A will increase in frequency when rare

when a is rare, most a’s are in Aa, and most A’s are in AA

The average fitness of a−bearing genotypes is then nearly 1

The average fitness of A−bearing genotypes is then nearly 1−s

So a will increase in frequency when rare

0 1gene frequency of A

Population Genetics – p.16/73

Underdominance and unstable equilibrium

when A is rare, most A’s are in Aa, and most a’s are in aa

The average fitness of A−bearing genotypes is then nearly 1

when a is rare, most a’s are in Aa, and most A’s are in AA

The average fitness of a−bearing genotypes is then nearly 1

1gene frequency of A

AA Aa aa11+s 1+t

The average fitness of a−bearing genotypes is then nearly 1+t

So A will decrease in frequency when rare

The average fitness of A−bearing genotypes is then nearly 1+s

So a will decrease in frequency when rare

0 Population Genetics – p.17/73

Fitness surfaces (adaptive landscapes)

Overdominance

0 1p0 1p

stable equilibrium

(gene frequency changes)

Underdominance

w__

0 1p

(gene frequency changes)

unstable equilibrium

Is all for the best in this best of all possible worlds?

w__

Population Genetics – p.18/73

A case to consider: two interacting adaptations

Bb BB

aaAaAA

bbBB Bb bb

AA 0.8 0.9 1.0Aa 0.9 1.0 0.9aa 1.0 0.9 0.8

Population Genetics – p.19/73

A fitness surface (in a haploid case)

AB 1.1

Ab 0.9

aB 0.9

ab 1

1.08

1.06

1.04

0.00

0.20

0.40

0.60

0.80

1.00

0.92

0.94

0.96

0.98

0.98

1.00

1.02

0.96

0.94

0.92

0.00 0.20 0.40 0.60 0.80 1.00

Gene Frequency of A

Gen

e F

requ

ency

of B

Population Genetics – p.20/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.21/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.22/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.23/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.24/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.25/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.26/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.27/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.28/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.29/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.30/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.31/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.32/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.33/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.34/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.35/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.36/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.37/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.38/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.39/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.40/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.41/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.42/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.43/73

Genetic drift

Time (generations)

0 1 2 3 4 5 6 7 8 9 10 11

Gen

e fr

eque

ncy

0

1

Population Genetics – p.44/73

Distribution of gene frequencies with drift

0 1

0 1

0 1

0 1

0 1

time

Note that although the individual populations

when we have infinitely many populations)

wander, their average hardly moves (not at all

Population Genetics – p.45/73

Genetic drift in some small populations

from Hartl and Clark,Principles of Population Genetics

Population Genetics – p.46/73

They’re real (lab) populations of Drosophila

from Barton et al., EvolutionPeter Buri. 1956. Gene frequency in small populations of mutantDrosophila. Evolution 10 (4): 367-402.

Population Genetics – p.47/73

Some copies happen to have more descendants

Time

Population Genetics – p.48/73

Some copies happen to have more descendants

Time

Population Genetics – p.49/73

Some copies happen to have more descendants

Time

Population Genetics – p.50/73

Some copies happen to have more descendants

Time

Population Genetics – p.51/73

Some copies happen to have more descendants

Time

Population Genetics – p.52/73

Some copies happen to have more descendants

Time

Population Genetics – p.53/73

Some copies happen to have more descendants

Time

Population Genetics – p.54/73

Some copies happen to have more descendants

Time

Population Genetics – p.55/73

Some copies happen to have more descendants

Time

Population Genetics – p.56/73

Some copies happen to have more descendants

Time

Population Genetics – p.57/73

Some copies happen to have more descendants

Time

Population Genetics – p.58/73

Averaging of gene frequencies when populations admix

Population 1 Population 2

0.2

0.8

0.7

0.3

0.60 from Pop. 1 0.40 from Pop. 2

makes

0.60 x 0.8+ 0.40 x 0.3

which is

0.60

Population Genetics – p.59/73

A cline (name by Julian Huxley)

no migrationsome

more

geographic position

gene

freq

uenc

y

0

1

S N

Population Genetics – p.60/73

Achillea

Achillea lanulosa, the common yarrow or Queen-Anne’s Lace.

Population Genetics – p.61/73

A famous common-garden experiment

Clausen, Keck and Hiesey’s (1949) common-garden experiment inAchillea lanulosa

Population Genetics – p.62/73

Zinc tolerance in a grass species

Sweet vernal grass Anthoxanthum odoratum.

Population Genetics – p.63/73

Heavy metal

Jain and Harper’s 1966 study on zinc tolerance in the grass Anthoxanthumodoratum at an abandoned zinc mine in Wales.

Population Genetics – p.64/73

Johnson and Selander’s studies on house sparrows

male female

Incredibly common, introduced to all of North America by lunatics startingin 1852 in New York City.

Population Genetics – p.65/73

Geographic differentiation of house sparrows

Body weight as a function of temperature

Population Genetics – p.66/73

House sparrows

Color of breast feathers in four locations (reflectance at different wavelengths

Population Genetics – p.67/73

Mutation RatesCoat color mutants in mice. From

Schlager G. and M. M. Dickie. 1967. Spontaneous mutations andmutation rates in the house mouse. Genetics 57: 319-330

Locus Gametes tested No. of Mutations Rate

Nonagouti 67,395 3 4.4 × 10−6

Brown 919,619 3 3.3 × 10−6

Albino 150,391 5 33.2 × 10−6

Dilute 839,447 10 11.9 × 10−6

Leaden 243,444 4 16.4 × 10−6

——- — ————-Total 2,220,376 25 11.2 × 10−6

Population Genetics – p.68/73

Mutation rates in humans

Population Genetics – p.69/73

Forward vs. back mutationsWhy mutants inactivating a functional gene will be morefrequent than back mutations

The gene

12 places can mutate to nonfunctionality

only one place can mutate back to function

function can sometimes be restored by a "second site" mutation, too

Population Genetics – p.70/73

A sequence space

For sequences of length 1000, there are 3 × 1000 = 3000 “neighbors” onestep away in sequence space.

But there are 41000 sequences, which is about 10602 in all !

No two of them are more than 1000 steps apart. Hard to draw such aspace! How do we ever evolve? Wouldn’t it be impossible to find one ofthe tiny fraction of possible sequences that would be even marginallyfunctional?

The answer seems to be that the sequences are clustered. An example ofsuch clustering is the English language, as illustrated by a word game:

W O R DW O R EG O R EG O N EG E N E

But the word BCGHcannot be made into anEnglish word

There are also only a tiny fraction of all 456,976 four-letter words that areEnglish words But they are clustered, so that it is possible to “evolve” fromone to another through intermediates.

Population Genetics – p.71/73

Mutation as an evolutionary force

10−6

and mutation rate back is the same,

0

1

0

0.5

Mutation is critical in introducing new alleles but is very slow

in changing their frequencies

If we have two alleles A and a, and mutation rate from A to a is

500,000 generations

Population Genetics – p.72/73

Estimation of a human mutation rateBy an equilibrium calculation. Huntington’s disease. Dominant. Does notexpress itself until after age 40. 1/100, 000 of people of European ancestryhave the gene. Reduction in fitness maybe 2%.

If allele frequency is q, then 2q(1 − q) of everyone areheterozygotes. So q ≃ 0.000005.

0.02 of these die. So a fraction 0.02 of all copies of the mutant allelein the population are eliminated by natural selection eachgeneration.

So the fraction of all gene copies at that locus that are mutant allelesthat are eliminated is 0.000005 × 0.02 ≃ 10−7

If we are at equilibrium between mutation and selection, this is alsothe fraction of all gene copies at that locus that have a new mutation.

So the mutation rate is in that case 10−7

Similar calculations can be done with recessive alleles, but we mustremember that in their case each death (or reduction in fitness) kills twocopies of the mutant. Population Genetics – p.73/73

Recommended