Ch.6 Phylogenetic Trees


Contents
  Phylogenetic Trees
  Character State Matrix
    Perfect Phylogeny
    Binary Character States
    Two Characters
  Distance Matrix
    Additive Trees
    Ultrametric Trees
  Agreement (Isomorphic) between Phylogenies


Phylogenetic Trees (Phylogenies)
  Explain the evolutionary history of today’s species (Figure 6.1)
  A phylogeny is a hypothesis: we do not have enough data about the distant ancestors of present-day species
  Characteristics
    Leaf: an object or a set of objects
    Interior node: a hypothetical ancestor object
    Unrooted tree
  Input data for phylogeny reconstruction falls into two main categories
    Character state matrix
    Distance matrix


Character State Matrix
  Characters have the following features
    Independent inheritance
    Homologous
  Character state matrix
    A matrix M with n rows (objects) and m columns (characters)
    Mij denotes the state that object i has for character j
    Each row is the state vector of an object


Difficulties in creating a phylogeny from a character state matrix
  Convergence or parallel evolution
    Undermines the expectation that objects sharing the same state are genetically closer than objects that do not
  Reversal
    Gains and losses of the character
  ☞ Assume that convergence and reversal do not happen, or that their number should be minimized
  Characters can be ordered or unordered, and directed


Perfect Phylogeny Problem
  For each state s of each character c, the set of all nodes u (leaves and interior nodes) for which the state is s with respect to c must form a subtree of T
  Characters are compatible if the set of objects defined by the character state matrix admits a perfect phylogeny


Example


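Perfect Phylogeny Problem
  How many different trees can we build for n objects?
  Consider only unrooted binary trees: $\prod_{i=3}^{n} (2i - 5)$
  This count grows factorially: each factor satisfies 2i - 5 ≥ i - 2, so the product is at least (n - 2)!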
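To get a feel for how quickly the search space grows, the product of (2i - 5) for i from 3 to n can be evaluated directly. The function name below is hypothetical and the sketch only evaluates the counting formula; it does not enumerate trees.

```python
def count_unrooted_binary_trees(n: int) -> int:
    """Evaluate the product of (2i - 5) for i = 3..n (labeled leaves, n >= 3)."""
    if n < 3:
        raise ValueError("the product formula is stated for n >= 3")
    count = 1
    for i in range(3, n + 1):
        count *= 2 * i - 5
    return count


for n in (3, 4, 5, 10):
    print(n, count_unrooted_binary_trees(n))
```

Already at n = 10 the count exceeds two million, which is why exhaustive search over topologies is not a practical strategy.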


Binary Character States
  Two-phase algorithm (runs in time O(nm))
    1. Decide whether the input matrix M admits a perfect phylogeny
    2. Construct one possible phylogeny
  Assume that state 0 is ancestral and state 1 is derived


Deciding perfect phylogeny
  A rooted tree T is a perfect phylogeny for input matrix M if
    To every character in M there corresponds an edge in T, and this edge marks the transition from state 0 to state 1 for that character
    Edges are labeled by their respective characters, and the root has character state vector (0, 0, …, 0)


Deciding perfect phylogeny
  Definition 6.1 For each column j of M, let Oj be the set of objects whose state is 1 for j. Let Ōj (the complement of Oj) be the set of objects whose state is 0 for j
  Lemma 6.1 A binary matrix M admits a perfect phylogeny if and only if for each pair of characters i and j the sets Oi and Oj are disjoint or one of them contains the other


Deciding perfect phylogeny
  Example: Table 6.2
    O1 = {B, D}, O2 = {B}, O3 = {D}
    O4 = {A, C, E}, O5 = {A, C}, O6 = {C}
  Testing Lemma 6.1 directly in the decision phase takes O(nm^2) time
  Figure 6.5, Algorithm Perfect Binary Phylogeny Decision, brings this down to O(nm)
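The direct test from Lemma 6.1 can be sketched as a pairwise comparison of the sets Oj. This is the straightforward O(nm^2) check, not the faster algorithm of Figure 6.5. The function name and the row/column layout (rows A to E, columns c1 to c6, filled in from the sets O1 to O6 of the example) are assumptions made for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny_pairwise(matrix):
    """Lemma 6.1 test: every pair of 1-sets is disjoint or nested."""
    m = len(matrix[0])
    # one_sets[j] plays the role of O_j: the rows whose state is 1 in column j.
    one_sets = [
        {i for i, row in enumerate(matrix) if row[j] == 1} for j in range(m)
    ]
    for a, b in combinations(one_sets, 2):
        overlapping = bool(a & b)
        nested = a <= b or b <= a
        if overlapping and not nested:
            return False
    return True


# Rows A, B, C, D, E; columns c1..c6, derived from O1..O6 of the example.
EXAMPLE = [
    [0, 0, 0, 1, 1, 0],  # A
    [1, 1, 0, 0, 0, 0],  # B
    [0, 0, 0, 1, 1, 1],  # C
    [1, 0, 1, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 0],  # E
]

print(admits_perfect_phylogeny_pairwise(EXAMPLE))
```

For the example every pair of sets is either disjoint (such as O1 and O4) or nested (such as O5 inside O4), so the check succeeds.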


Deciding perfect phylogeny

  if Lij ≠ Llj for some i, l and both Lij and Llj are nonzero then
    return FALSE

  M   c4  c1  c5  c2  c3  c6
  A    1   0   1   0   0   0
  B    0   1   0   1   0   0
  C    1   0   1   0   0   1
  D    0   1   0   0   1   0
  E    1   0   0   0   0   0

  L   c4  c1  c5  c2  c3  c6
  A   -1   0   1   0   0   0
  B    0  -1   0   2   0   0
  C   -1   0   1   0   0   3
  D    0  -1   0   0   2   0
  E   -1   0   0   0   0   0
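A minimal sketch of the decision phase, assuming the convention visible in the L table: columns are sorted by decreasing number of 1s, and an entry of L holds the sorted position of the nearest earlier 1 in the same row, -1 when there is none, and 0 where M is 0. The function name and return shape are hypothetical, and Python's built-in sort stands in for the linear-time sort that a strict O(nm) bound would need.

```python
def perfect_binary_phylogeny_decision(matrix):
    """Return (ok, order, L) for a 0/1 matrix given as a list of rows."""
    n, m = len(matrix), len(matrix[0])
    # Stable sort of column indices by decreasing number of 1s.
    order = sorted(range(m), key=lambda j: -sum(row[j] for row in matrix))
    # L is indexed by sorted position (0-based storage, 1-based values).
    L = [[0] * m for _ in range(n)]
    for i, row in enumerate(matrix):
        last = -1  # no earlier column with a 1 seen yet in this row
        for pos, j in enumerate(order, start=1):
            if row[j] == 1:
                L[i][pos - 1] = last
                last = pos
    # The matrix is rejected if a column holds two different nonzero L values.
    for pos in range(m):
        nonzero = {L[i][pos] for i in range(n) if L[i][pos] != 0}
        if len(nonzero) > 1:
            return False, order, L
    return True, order, L


# Rows A..E, original column order c1..c6.
EXAMPLE = [
    [0, 0, 0, 1, 1, 0],  # A
    [1, 1, 0, 0, 0, 0],  # B
    [0, 0, 0, 1, 1, 1],  # C
    [1, 0, 1, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 0],  # E
]

ok, order, L = perfect_binary_phylogeny_decision(EXAMPLE)
print(ok, ["c%d" % (j + 1) for j in order])
```

On the example the sorted column order comes out as c4, c1, c5, c2, c3, c6, and no column has conflicting nonzero entries, so the answer is TRUE.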


Constructing a perfect phylogeny
  Figure 6.6: Algorithm Perfect Binary Phylogeny Construction
  Running time O(nm)
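Figure 6.6 is not reproduced here, so the following is only one plausible way to build a tree once the decision phase has succeeded, not necessarily the book's procedure. Each character names the node at the lower end of its edge, a character's parent is the nearest earlier character (in the sorted column order) that is 1 in the same row, and each object hangs below the last character it carries. The names `build_perfect_phylogeny` and `"root"` are hypothetical.

```python
def build_perfect_phylogeny(matrix, objects, characters):
    """Return (parent_of_character, parent_of_object) as two dicts.

    Assumes the matrix has already been accepted by the decision phase.
    """
    m = len(characters)
    # Same column order as the decision phase: decreasing number of 1s.
    order = sorted(range(m), key=lambda j: -sum(row[j] for row in matrix))
    char_parent = {}
    leaf_parent = {}
    for obj, row in zip(objects, matrix):
        previous = "root"
        for j in order:
            if row[j] == 1:
                # The edge labeled characters[j] hangs below `previous`.
                char_parent[characters[j]] = previous
                previous = characters[j]
        leaf_parent[obj] = previous
    return char_parent, leaf_parent


EXAMPLE = [
    [0, 0, 0, 1, 1, 0],  # A
    [1, 1, 0, 0, 0, 0],  # B
    [0, 0, 0, 1, 1, 1],  # C
    [1, 0, 1, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 0],  # E
]
chars = ["c1", "c2", "c3", "c4", "c5", "c6"]
char_parent, leaf_parent = build_perfect_phylogeny(EXAMPLE, "ABCDE", chars)
print(char_parent)
print(leaf_parent)
```

Each row is scanned once, so the sketch stays within O(nm) apart from the sort. Reading the state vector of an object off the path from the root to its leaf gives back its row of M.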


Unordered binary character
  The majority state becomes 0 and the other becomes 1
  If both states are equally frequent, choose either one to be 0 and the other to be 1
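The majority rule can be sketched as a per-column recoding step. The function name is hypothetical, and the tie-breaking shown (the state seen first becomes 0) is just one of the arbitrary choices the rule allows.

```python
from collections import Counter


def recode_majority_as_zero(matrix):
    """Recode each two-state column so the more frequent state becomes 0."""
    n, m = len(matrix), len(matrix[0])
    recoded = [[0] * m for _ in range(n)]
    for j in range(m):
        column = [row[j] for row in matrix]
        counts = Counter(column)
        if len(counts) > 2:
            raise ValueError("column %d has more than two states" % j)
        # On a tie, most_common returns the state that was seen first.
        majority = counts.most_common(1)[0][0]
        for i in range(n):
            recoded[i][j] = 0 if column[i] == majority else 1
    return recoded


raw = [
    ["a", "x"],
    ["a", "y"],
    ["b", "y"],
]
print(recode_majority_as_zero(raw))
```

The recoded matrix can then be passed to the algorithms for ordered binary characters.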


Two characters
  Allow characters to be unordered and to have an arbitrary number of states, but restrict the number of characters to at most two

Definition 6.2 A triangulated graph is an undirected graph in which any cycle with four or more vertices has a chord, that is, an edge joining two nonconsecutive vertices of the cycle

Theorem 6.1 To every collection of subtrees {T1, T2, …, Tl} of a tree T there corresponds a triangulated graph and vice versa


Two characters
  Definition 6.3 An intersection graph for a collection C of sets is the graph G that we get by mapping each set in C to a vertex of G, and linking two vertices in G by an edge if the corresponding sets have a nonempty intersection

Definition 6.4 Given a graph G = (V, E) with a coloring c on V, we say that G can be c-triangulated if there exists a triangulated graph H = (V, E’), such that E ⊆ E’ and c is a valid coloring for H. In other words, any edge present in E’ but not in E must link two vertices with different colors


Two characters
  Theorem 6.2 A character state matrix M, with a character set defining a coloring c, admits a perfect phylogeny if and only if its corresponding state intersection graph (SIG) can be c-triangulated
  Theorem 6.3 A character state matrix M with only two characters admits a perfect phylogeny if and only if its corresponding SIG is acyclic
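The acyclicity test of Theorem 6.3 can be sketched with a union-find structure. The sketch assumes the SIG has one vertex per state of each character and an edge between two states whenever some object carries both; with only two characters every such edge joins a state of the first character to a state of the second. The function name and the input format (one pair of states per object) are hypothetical.

```python
def two_characters_admit_perfect_phylogeny(rows):
    """rows: one (state of character 1, state of character 2) pair per object."""
    # Distinct edges of the SIG; objects with identical state pairs share an edge.
    edges = {(("c1", s1), ("c2", s2)) for s1, s2 in rows}
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge closes a cycle
        parent[root_u] = root_v
    return True


print(two_characters_admit_perfect_phylogeny([(1, 1), (1, 2), (2, 2)]))
print(two_characters_admit_perfect_phylogeny([(1, 1), (1, 2), (2, 1), (2, 2)]))
```

The first input gives a path on four vertices, which is acyclic. The second gives a four-cycle, so no perfect phylogeny exists for it.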


Example (figure)
  Vertex labels: x1, y1, x2, z2, x3, y3, z3, y2
  Object sets: {B}, {A, B}, {A}, {B, C}, {C}, {C, D}, {D}, {A, D}


Reconstruction algorithm for two characters
  Running time O(n)
    Test for acyclicity -> O(n)
    Reconstruction of the perfect phylogeny -> O(n)


Parsimony and Compatibility
  Real character state matrices are unlikely to admit perfect phylogenies
    Experimental data always carries errors
    The assumptions (no reversals and no convergence) are sometimes violated
  Two approaches
    Parsimony criterion
      Allow reversal and convergence events, but try to minimize their occurrence
    Compatibility criterion
      Find a maximum set of characters that are compatible -> exclude the characters that cause such “problems”


Algorithms for Distance Matrices
  The problem of reconstructing trees based on comparative numerical data between n objects, given as a distance matrix M
  Consider two problems
    Reconstructing Additive Trees
    Reconstructing Ultrametric Trees


Reconstructing Additive Trees
  Metric space
    A set of objects O such that to every pair i, j ∈ O is associated a nonnegative real number dij with the following properties:
      dij > 0 for i ≠ j
      dij = 0 for i = j
      dij = dji for all i and j
      dij ≤ dik + dkj for all i, j, and k (the triangle inequality)
  A matrix M and a tree T are additive when
    T has n leaves
    Leaves are nodes of degree one; all other nodes have degree three
    All edges in the tree have nonnegative weight
    The weight of the path between any two leaves i and j equals Mij
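The four metric-space properties translate directly into a brute-force check on a distance matrix. The function name and the sample matrices are made up for illustration, and the triangle-inequality loop is the naive O(n^3) version.

```python
def is_metric(d):
    """Check positivity, zero diagonal, symmetry and the triangle inequality."""
    n = len(d)
    for i in range(n):
        if d[i][i] != 0:
            return False
        for j in range(n):
            if i != j and d[i][j] <= 0:
                return False
            if d[i][j] != d[j][i]:
                return False
    return all(
        d[i][j] <= d[i][k] + d[k][j]
        for i in range(n)
        for j in range(n)
        for k in range(n)
    )


# Leaf-to-leaf path weights in a small hypothetical tree with leaves a, b, c, d.
tree_distances = [
    [0, 3, 8, 9],
    [3, 0, 9, 10],
    [8, 9, 0, 9],
    [9, 10, 9, 0],
]
print(is_metric(tree_distances))
print(is_metric([[0, 1, 5], [1, 0, 1], [5, 1, 0]]))  # 5 > 1 + 1
```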


Reconstructing Additive Trees
  Lemma 6.2 A metric space O is additive if and only if any four objects of O can be labeled i, j, k, and l such that dij + dkl = dik + djl ≥ dil + djk
  If M is additive, T is unique (the algorithm runs in time O(n^2))
  Real-life distance matrices are rarely additive due to errors in the distance measurements
    Obtain a tree that is as close as possible to an additive tree
    Look for a formulation of this problem that is tractable
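The four-point condition of Lemma 6.2 says that, among the three ways of pairing up four objects, the two largest pair sums must be equal. A brute-force sketch over all quadruples follows; the function name, the `tol` parameter (added to absorb floating-point noise), and the sample matrices are assumptions for illustration.

```python
from itertools import combinations


def is_additive(d, tol=0.0):
    """Four-point condition: the two largest pair sums agree for every quadruple."""
    for i, j, k, l in combinations(range(len(d)), 4):
        sums = sorted([
            d[i][j] + d[k][l],
            d[i][k] + d[j][l],
            d[i][l] + d[j][k],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Path weights in a hypothetical tree: a and b on one side, c and d on the other.
additive = [
    [0, 3, 8, 9],
    [3, 0, 9, 10],
    [8, 9, 0, 9],
    [9, 10, 9, 0],
]
# Same matrix with the a-d distance perturbed from 9 to 10.
perturbed = [
    [0, 3, 8, 10],
    [3, 0, 9, 10],
    [8, 9, 0, 9],
    [10, 10, 9, 0],
]
print(is_additive(additive), is_additive(perturbed))
```

For the first matrix the three sums are 12, 18 and 18; after the perturbation they become 12, 18 and 19, which illustrates how a single measurement error breaks additivity.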


Reconstructing Ultrametric Trees
  Given two distance matrices, Ml and Mh, reconstruct an evolutionary tree such that the distances measured on the tree fit “between” these two input matrices (sandwich constraints, $M^l_{ij} \le d_{ij} \le M^h_{ij}$)
  A tree is ultrametric when it is additive and can be rooted in such a way that the lengths of all leaf-root paths are equal -> the objects being studied have evolved at an equal rate from a common ancestor
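Both requirements on a candidate set of tree distances can be checked mechanically. The sandwich check follows the constraint stated above. The ultrametric check uses the standard three-point characterization (among any three objects the two largest pairwise distances are equal), which is a well-known equivalent of the rooted-tree definition rather than something stated on the slide. All names and sample matrices are hypothetical.

```python
from itertools import combinations


def satisfies_sandwich(d, low, high):
    """True when low[i][j] <= d[i][j] <= high[i][j] for every pair."""
    n = len(d)
    return all(
        low[i][j] <= d[i][j] <= high[i][j]
        for i in range(n)
        for j in range(n)
    )


def is_ultrametric(d, tol=0.0):
    """Three-point condition: the two largest distances in every triple agree."""
    for i, j, k in combinations(range(len(d)), 3):
        _, middle, largest = sorted([d[i][j], d[i][k], d[j][k]])
        if abs(largest - middle) > tol:
            return False
    return True


candidate = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]
low = [[0, 1, 5], [1, 0, 5], [5, 5, 0]]
high = [[0, 3, 7], [3, 0, 7], [7, 7, 0]]
print(is_ultrametric(candidate), satisfies_sandwich(candidate, low, high))
```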


Reconstructing Ultrametric Trees
  Link of a and b in the MST T, written (a, b)max
    The largest-weight edge on the unique path from a to b in T
  Definition 6.5 The cut-weight of an edge e of the minimum spanning tree of Gh is given by
    $CW(e) = \max\{ M^l_{ab} \mid e = (a, b)_{\max} \}$
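A sketch of how cut-weights could be computed, assuming Gh is the complete graph on the objects with edge weights taken from Mh, and that ties for the largest edge on a path are broken by whichever edge is met first. Prim's algorithm is used for the MST, and a traversal from each object tracks the link to every other object. All function names are hypothetical.

```python
def prim_mst(weights):
    """O(n^2) Prim on a complete graph; returns a list of (u, v) tree edges."""
    n = len(weights)
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [None] * n
    best[0] = 0
    edges = []
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return edges


def cut_weights(low, high):
    """CW(e) = max of low[a][b] over the pairs (a, b) whose link is e."""
    n = len(high)
    edges = prim_mst(high)
    adjacent = {v: [] for v in range(n)}
    for u, v in edges:
        adjacent[u].append(v)
        adjacent[v].append(u)
    cw = {(min(u, v), max(u, v)): float("-inf") for u, v in edges}
    for a in range(n):
        # Each stack entry: (node, node it was reached from, link of a and node).
        stack = [(a, None, None)]
        while stack:
            node, came_from, link = stack.pop()
            if link is not None and node > a:  # count each unordered pair once
                cw[link] = max(cw[link], low[a][node])
            for nxt in adjacent[node]:
                if nxt == came_from:
                    continue
                edge = (min(node, nxt), max(node, nxt))
                if link is None or high[edge[0]][edge[1]] > high[link[0]][link[1]]:
                    stack.append((nxt, node, edge))
                else:
                    stack.append((nxt, node, link))
    return cw


high = [[0, 2, 5], [2, 0, 4], [5, 4, 0]]
low = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
print(cut_weights(low, high))
```

In this three-object example the MST consists of the edges (0, 1) and (1, 2). The pair (0, 2) has link (1, 2), so that edge's cut-weight is the larger of low[1][2] and low[0][2].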


Reconstructing Ultrametric Trees
  Reconstruction algorithm -> runs in time O(n^2)
    1. Compute an MST T of Gh
    2. Construct R
    3. Compute CW(e)
    4. Build the ultrametric tree U


Agreement between Phylogenies
  In practice it occurs quite often that two different methods applied on the same data yield different trees (in the topological sense)

Definition 6.6 We say that a tree Tr refines another tree Ts whenever Tr can be transformed into Ts by contracting selected edges from Tr. Two trees T1 and T2 agree when there exists a tree T3 that refines both


Isomorphic
  Two trees T1 and T2 are isomorphic when there is a one-to-one correspondence between their nodes such that for every pair u, v of corresponding nodes, u ∈ T1 and v ∈ T2, the objects contained in leaves below u are the same as the objects contained in leaves below v
  Binary Tree Isomorphism: Figure 6.21, runs in time O(n)
  General case (leaves contain several objects): Figure 6.22, runs in time O(n)
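The definition can be tested directly by collecting, for every node, the set of objects in the leaves below it and comparing the two collections. This is a simple sketch of the definition, not the linear-time algorithms of Figures 6.21 and 6.22. Rooted trees are assumed to be given as nested tuples, with a leaf written either as a single label or as a set of labels; that representation and the function names are hypothetical.

```python
from collections import Counter


def collect_leaf_sets(tree, out):
    """Record, for every node, the frozenset of objects in the leaves below it."""
    if isinstance(tree, tuple):  # interior node: a tuple of subtrees
        below = frozenset()
        for child in tree:
            below |= collect_leaf_sets(child, out)
    elif isinstance(tree, (set, frozenset)):  # leaf holding several objects
        below = frozenset(tree)
    else:  # leaf holding a single object
        below = frozenset([tree])
    out[below] += 1
    return below


def are_isomorphic(t1, t2):
    """True when the nodes can be matched one-to-one with equal leaf sets."""
    sets1, sets2 = Counter(), Counter()
    collect_leaf_sets(t1, sets1)
    collect_leaf_sets(t2, sets2)
    return sets1 == sets2


print(are_isomorphic((("A", "B"), "C"), ("C", ("B", "A"))))
print(are_isomorphic((("A", "B"), "C"), ("A", ("B", "C"))))
```

The first pair differs only in the order of children, so the trees match. The second pair groups the objects differently, so no node of the second tree has exactly A and B below it.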