1 Evolutionary Change in Nucleotide Sequences Dan Graur


Evolutionary Change in Nucleotide Sequences

Dan Graur


So far, we described the evolutionary process as a series of gene substitutions in which new alleles, each arising as a mutation in a single individual, progressively increase their frequency and ultimately become fixed in the population.


We may look at the process from a different point of view. An allele that becomes fixed is different in its sequence from the allele that it replaces. That is, the substitution of a new allele for an old one is the substitution of a new sequence for a previous sequence.



If we use a time scale in which one time unit is larger than the time of fixation, then the DNA sequence at any given locus will appear to change with time.

1. actgggggtaaactatcggtatagatcat
2. actgggggttaactatcggtatagatcat
3. actgggggtgaactatcggtatagatcat
4. actgggggtgaactatcggtacagatcat
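The four snapshots of the locus can also be compared programmatically. The following is a minimal sketch that reports the sites at which consecutive snapshots differ, using 1-based positions. The helper name `site_changes` is illustrative and does not come from the slides.

```python
# Snapshots of the same locus at four successive time points (from the slide).
snapshots = [
    "actgggggtaaactatcggtatagatcat",
    "actgggggttaactatcggtatagatcat",
    "actgggggtgaactatcggtatagatcat",
    "actgggggtgaactatcggtacagatcat",
]


def site_changes(old, new):
    """Return (position, old_base, new_base) for every differing site (1-based)."""
    if len(old) != len(new):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


# Compare each snapshot with the one that follows it.
for step, (old, new) in enumerate(zip(snapshots, snapshots[1:]), start=1):
    print(f"{step} -> {step + 1}: {site_changes(old, new)}")
```

Site 10 changes twice (a to t, then t to g), so comparing snapshot 1 directly with snapshot 3 shows a single difference even though two substitutions occurred there.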


Nucleotide Substitution


To study the dynamics of nucleotide substitution, we must make several assumptions regarding the probability of substitution of a nucleotide by another.


Jukes & Cantor’s one-parameter model


Assumption:

• Substitutions occur with equal probabilities among the four nucleotide types.


If the nucleotide residing at a certain site in a DNA sequence is A at time 0, what is the probability, PA(t), that this site will be occupied by A at time t?


Since we start with A, PA(0) = 1. At time 1, the probability of still having A at this site is

$$P_A(1) = 1 - 3\alpha$$

where 3α is the probability of A changing to T, C, or G, and 1 – 3α is the probability that A has remained unchanged.


To derive the probability of having A at time 2, we consider two possible scenarios:

1. The nucleotide has remained unchanged from time 0 to time 2.

2. The nucleotide has changed to T, C or G at time 1, but has reverted to A at time 2.


$$P_A(2) = (1 - 3\alpha)\,P_A(1) + \alpha\left[1 - P_A(1)\right]$$


The following equation applies to any t and t + 1:

$$P_A(t+1) = (1 - 3\alpha)\,P_A(t) + \alpha\left[1 - P_A(t)\right]$$


We can rewrite the equation in terms of the amount of change in PA(t) per unit time as:

$$\Delta P_A(t) = P_A(t+1) - P_A(t) = -3\alpha P_A(t) + \alpha\left[1 - P_A(t)\right] = -4\alpha P_A(t) + \alpha$$


We approximate the discrete-time process by a continuous-time model, by regarding ΔPA(t) as the rate of change at time t:

$$\frac{dP_A(t)}{dt} = -4\alpha P_A(t) + \alpha$$


The solution is:

$$P_A(t) = \frac{1}{4} + \left(P_A(0) - \frac{1}{4}\right)e^{-4\alpha t}$$
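One way to check this solution is to measure PA(t) relative to the constant 1/4. The following is a sketch only; the shifted variable u(t) is introduced here for convenience and does not appear on the slides.

$$
u(t) = P_A(t) - \tfrac{1}{4}
\;\Rightarrow\;
\frac{du}{dt} = -4\alpha\left(u(t) + \tfrac{1}{4}\right) + \alpha = -4\alpha\,u(t)
\;\Rightarrow\;
u(t) = u(0)\,e^{-4\alpha t}
$$

Substituting u(0) = PA(0) − 1/4 back in recovers the solution for PA(t).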


If PA(0) = 0:

$$P_A(t) = \frac{1}{4} - \frac{1}{4}e^{-4\alpha t}$$

If PA(0) = 1:

$$P_A(t) = \frac{1}{4} + \frac{3}{4}e^{-4\alpha t}$$

In the Jukes and Cantor model, the probability of each of the four nucleotides at equilibrium (t = ∞) is 1/4.
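To see how closely the continuous-time solution tracks the discrete-time recurrence, the two can be evaluated side by side. This is a minimal sketch; the function names and the value of `alpha` are assumptions chosen for illustration, not values from the slides.

```python
import math


def pa_discrete(alpha, steps, pa0=1.0):
    """Iterate PA(t+1) = (1 - 3*alpha)*PA(t) + alpha*(1 - PA(t))."""
    pa = pa0
    for _ in range(steps):
        pa = (1.0 - 3.0 * alpha) * pa + alpha * (1.0 - pa)
    return pa


def pa_continuous(alpha, t, pa0=1.0):
    """Closed-form solution PA(t) = 1/4 + (PA(0) - 1/4) * exp(-4*alpha*t)."""
    return 0.25 + (pa0 - 0.25) * math.exp(-4.0 * alpha * t)


alpha = 0.001  # assumed illustrative rate per site per time unit
for t in (0, 10, 100, 1000, 10000):
    print(t, round(pa_discrete(alpha, t), 4), round(pa_continuous(alpha, t), 4))
```

For small α the two columns agree closely, and both approach 1/4 as t grows, whichever value of PA(0) is used.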


So far, we treated PA(t) as a probability.

However, PA(t) can also be interpreted as the frequency of A in a DNA sequence at time t.

For example, if we start with a sequence made of adenines only, then PA(0) = 1, and PA(t) is the expected frequency of A in the sequence at time t.

The expected frequency of A in the sequence at equilibrium will be 1/4, and so will the expected frequencies of T, C, and G.


After reaching equilibrium no further change in the nucleotide frequencies is expected to occur. However, the actual frequencies of the nucleotides will remain unchanged only in DNA sequences of infinite length. In practice, fluctuations in nucleotide frequencies are likely to occur.
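The effect of finite sequence length can be illustrated with a small simulation. The sketch below starts from an all-A sequence and applies the one-parameter model one time unit at a time. The sequence lengths, the rate `alpha`, the number of steps, and the random seed are all arbitrary choices made for illustration.

```python
import random

BASES = "ACGT"


def simulate_a_frequency(length, alpha, steps, seed=0):
    """Evolve an all-A sequence and return the frequency of A after each step.

    At each time step every site changes to each of the three other
    nucleotides with probability alpha (so it changes with probability 3*alpha).
    """
    rng = random.Random(seed)
    seq = ["A"] * length
    freqs = [1.0]
    for _ in range(steps):
        for i, base in enumerate(seq):
            if rng.random() < 3.0 * alpha:
                # Pick one of the three other nucleotides uniformly.
                seq[i] = rng.choice([b for b in BASES if b != base])
        freqs.append(seq.count("A") / length)
    return freqs


for length in (50, 2000):
    freqs = simulate_a_frequency(length, alpha=0.05, steps=200)
    tail = freqs[-50:]  # well past the approach to equilibrium
    print(length, round(min(tail), 3), round(max(tail), 3))
```

The short sequence is expected to wander further from 1/4 than the long one, because each site contributes a larger share of the observed frequency.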


Kimura’s two-parameter model


Assumptions:

• The rate of transitional substitution at each nucleotide site is α per unit time.

• The rate of each type of transversional substitution is β per unit time.


α ⁄ β ≈ 5−10


If the nucleotide residing at a certain site in a DNA sequence is A at time 0, what is the probability, PA(t), that this site will be occupied by A at time t?


After one time unit the probability of A changing into G is α, the probability of A changing into C is β, and the probability of A changing into T is β. Thus, the probability of A remaining unchanged after one time unit is:

$$P_{AA}(1) = 1 - \alpha - 2\beta$$


To derive the probability of having A at time 2, we consider four possible scenarios:


1. A remained unchanged at t = 1 and t = 2


2. A changed into G at t = 1 and reverted by a transition to A at t = 2


3. A changed into C at t = 1 and reverted by a transversion to A at t = 2


4. A changed into T at t = 1 and reverted by a transversion to A at t = 2
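Summing the four scenarios gives the probability of having A at time 2. The sketch below assumes the single-step probabilities PAG(1) = α and PAC(1) = PAT(1) = β, and extends the two-letter subscript notation of PAA(1) to the other nucleotide pairs.

$$
P_{AA}(2) = (1-\alpha-2\beta)\,P_{AA}(1) + \alpha\,P_{AG}(1) + \beta\,P_{AC}(1) + \beta\,P_{AT}(1)
= (1-\alpha-2\beta)^2 + \alpha^2 + 2\beta^2
$$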


3 probabilities

X(t) = The probability that a nucleotide at a site at time t is identical to that at time 0:

$$X(t) = \frac{1}{4} + \frac{1}{4}e^{-4\beta t} + \frac{1}{2}e^{-2(\alpha+\beta)t}$$

At equilibrium, the equation reduces to X(∞) = 1/4. Thus, as in the case of Jukes and Cantor's model, the equilibrium frequencies of the four nucleotides are 1/4.


Y(t) = The probability that the initial nucleotide and the nucleotide at time t differ from each other by a transition:

$$Y(t) = \frac{1}{4} + \frac{1}{4}e^{-4\beta t} - \frac{1}{2}e^{-2(\alpha+\beta)t}$$

Because of the symmetry of the substitution scheme, Y(t) = PAG(t) = PGA(t) = PTC(t) = PCT(t).


Z(t) = The probability that the nucleotide at time t and the initial nucleotide differ by a specific type of transversion:

$$Z(t) = \frac{1}{4} - \frac{1}{4}e^{-4\beta t}$$


Each nucleotide is subject to two types of transversion, but only one type of transition. Therefore, the probability that the initial nucleotide and the nucleotide at time t differ by a transversion is 2Z(t), twice the probability that they differ by a specific type of transversion.

X(t) + Y(t) + 2Z(t) = 1
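The three probabilities can be checked numerically. The sketch below evaluates X(t), Y(t), and Z(t) and compares them with a direct step-by-step iteration of the two-parameter scheme starting from A. The function names, the state order A, G, C, T, and the rate values are assumptions made for illustration.

```python
import math


def k2p_probabilities(alpha, beta, t):
    """Return (X, Y, Z) from the continuous-time two-parameter solution."""
    e1 = math.exp(-4.0 * beta * t)
    e2 = math.exp(-2.0 * (alpha + beta) * t)
    x = 0.25 + 0.25 * e1 + 0.5 * e2   # same nucleotide as at time 0
    y = 0.25 + 0.25 * e1 - 0.5 * e2   # differs by a transition
    z = 0.25 - 0.25 * e1              # differs by one specific transversion
    return x, y, z


def k2p_discrete(alpha, beta, steps):
    """Iterate the one-step scheme from A; returns probabilities of A, G, C, T."""
    stay = 1.0 - alpha - 2.0 * beta
    # Rows and columns are ordered A, G, C, T; A<->G and C<->T are transitions.
    step = [
        [stay, alpha, beta, beta],
        [alpha, stay, beta, beta],
        [beta, beta, stay, alpha],
        [beta, beta, alpha, stay],
    ]
    probs = [1.0, 0.0, 0.0, 0.0]
    for _ in range(steps):
        probs = [sum(probs[i] * step[i][j] for i in range(4)) for j in range(4)]
    return probs


alpha, beta = 0.001, 0.0002  # assumed rates with alpha/beta = 5
x, y, z = k2p_probabilities(alpha, beta, 100)
print("X + Y + 2Z =", x + y + 2.0 * z)
print("continuous:", round(x, 4), round(y, 4), round(z, 4))
print("discrete:  ", [round(p, 4) for p in k2p_discrete(alpha, beta, 100)])
```

With small rates the discrete and continuous values are nearly identical, and the two transversion entries (C and T) are equal, as the symmetry of the scheme requires.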


Problem with the “t” approach: the time spans involved are too long to observe, even for Methuselah, who is said to have lived 969 years (Genesis 5:27).


NUMBER OF NUCLEOTIDE SUBSTITUTIONS BETWEEN TWO DNA SEQUENCES


After two nucleotide sequences diverge from each other, each of them will start accumulating nucleotide substitutions.

If two sequences of length N differ from each other at n sites, then the proportion of differences, n/N, is referred to as the degree of divergence or Hamming distance.

Degrees of divergence are usually expressed as percentages (n/N × 100%).
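The degree of divergence is straightforward to compute for two aligned sequences. This is a minimal sketch; the function name and the 10-site example sequences are hypothetical and chosen only for illustration.

```python
def proportion_of_differences(seq1, seq2):
    """Return n/N, the fraction of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    n = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return n / len(seq1)


# Hypothetical 10-site alignment used only for illustration.
first = "ACGTACGTAC"
second = "ACGTTCGTAA"
p = proportion_of_differences(first, second)
print(f"n/N = {p:.2f} ({p * 100:.0f}%)")
```

The function assumes the sequences are already aligned and contain no gaps; handling gaps or ambiguous bases would need an explicit rule for which sites to count.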


The observed number of differences is likely to be smaller than the actual number of substitutions due to multiple hits at the same site.


13 substitutions = 3 differences


Number of substitutions between two noncoding (NOT protein coding) sequences


The one-parameter model

The probability that the two sequences are different at a site at time t is p = 1 – I(t), with I(t) the probability that they are identical at that site:

$$p = \frac{3}{4}\left(1 - e^{-8\alpha t}\right)$$

Here α is the probability of a change from one nucleotide to another in one unit time, and t is the time of divergence.


Problem: t and α are usually not known. Instead, we compute K, which is the number of substitutions per site since the time of divergence between the two sequences.


L = number of sites compared between the two sequences.
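The formulas for K and its variance under the one-parameter model are not shown in the text here, so the sketch below relies on the standard Jukes and Cantor result rather than on the slide itself. It assumes each of the two lineages accumulates 3αt substitutions per site, so that K = 6αt; solving p = (3/4)(1 − e^(−8αt)) for 8αt then gives K = −(3/4) ln(1 − 4p/3). The variance is a first-order (delta-method) approximation that treats p as a binomial proportion over L sites. All function names are illustrative.

```python
import math


def jc_expected_p(alpha, t):
    """Expected proportion of differing sites: p = 3/4 * (1 - exp(-8*alpha*t))."""
    return 0.75 * (1.0 - math.exp(-8.0 * alpha * t))


def jc_distance(p):
    """Substitutions per site, K = -(3/4) * ln(1 - 4p/3); requires p < 3/4."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_variance(p, num_sites):
    """Approximate sampling variance of K: p(1-p) / (L * (1 - 4p/3)^2)."""
    return p * (1.0 - p) / (num_sites * (1.0 - 4.0 * p / 3.0) ** 2)


# Round trip with assumed values: K recovered from p should equal 6*alpha*t.
alpha, t = 0.001, 50
p = jc_expected_p(alpha, t)
print("p =", round(p, 4), "K =", round(jc_distance(p), 4), "6*alpha*t =", 6 * alpha * t)
```

K is always at least as large as p, and the gap widens as p grows, which is the correction for multiple hits at the same site.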


In the two-parameter model:

The differences between two sequences are classified into transitions and transversions.

P = proportion of transitional differences

Q = proportion of transversional differences


$$V(K) = \frac{1}{L}\left[P\left(\frac{1}{1-2P-Q}\right)^{2} + Q\left(\frac{1}{2-4P-2Q} + \frac{1}{2-4Q}\right)^{2} - \left(\frac{P}{1-2P-Q} + \frac{Q}{2-4P-2Q} + \frac{Q}{2-4Q}\right)^{2}\right]$$


Numerical example (2P-model)
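The numbers used on this slide are not present in the text, so the following sketch uses hypothetical counts (200 sites, 20 transitional and 4 transversional differences) purely for illustration. The estimator K = (1/2) ln[1/(1 − 2P − Q)] + (1/4) ln[1/(1 − 2Q)] is the commonly cited form for the two-parameter model and is an assumption here, since it does not appear in the text above; the variance follows the V(K) expression given earlier. Function names are illustrative.

```python
import math


def k2p_distance(P, Q):
    """Two-parameter estimate of substitutions per site from P and Q."""
    if 1.0 - 2.0 * P - Q <= 0.0 or 1.0 - 2.0 * Q <= 0.0:
        raise ValueError("P and Q are too large for the logarithms to be defined")
    a = 1.0 / (1.0 - 2.0 * P - Q)
    b = 1.0 / (1.0 - 2.0 * Q)
    return 0.5 * math.log(a) + 0.25 * math.log(b)


def k2p_variance(P, Q, num_sites):
    """Sampling variance V(K) for the two-parameter estimate over L sites."""
    a = 1.0 / (1.0 - 2.0 * P - Q)
    b = 1.0 / (2.0 - 4.0 * P - 2.0 * Q) + 1.0 / (2.0 - 4.0 * Q)
    return (a * a * P + b * b * Q - (a * P + b * Q) ** 2) / num_sites


# Hypothetical counts, chosen for illustration only (not taken from the slide).
num_sites = 200
transitions = 20
transversions = 4
P = transitions / num_sites    # proportion of transitional differences
Q = transversions / num_sites  # proportion of transversional differences

K = k2p_distance(P, Q)
std_error = math.sqrt(k2p_variance(P, Q, num_sites))
print(f"P = {P:.2f}, Q = {Q:.2f}, K = {K:.3f}, standard error = {std_error:.3f}")
```

With these counts K comes out at about 0.13 substitutions per site, slightly above the observed proportion of differences P + Q = 0.12.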


There are substitution schemes with more than two parameters!