Distance based method


PHYLOGENETIC TREE CONSTRUCTION BY DISTANCE-BASED METHOD

INTRODUCTION

A phylogenetic tree, also known as a phylogeny, is a diagram that depicts the lines of evolutionary descent of different species, organisms, or genes from a common ancestor.

• Attempts to reconstruct evolutionary ancestors
• Estimates time of divergence from an ancestor

Can be used to solve a number of interesting problems:

• Forensics
    • HIV mutates rapidly
• Predicting evolution of influenza viruses
• Predicting functions of uncharacterized genes (ortholog detection)
• Drug discovery
• Vaccine development
    • Target inferred common ancestor

HOW TO CONSTRUCT A PHYLOGENETIC TREE

Step 1: Make a multiple alignment from nucleotide or amino acid sequences (using MUSCLE or another alignment tool; BLAST can help gather the homologous sequences to align).

Step 2: Check whether the multiple alignment reflects the evolutionary process.

Step 3: Choose the method to use, then calculate the distances or use the alignment result directly, depending on the method.

Step 4: Verify the result statistically.

TYPES OF APPROACHES

CHARACTER BASED APPROACH

It makes use of all known evolutionary information, i.e. the individual substitutions among the sequences, to determine the most likely ancestral sequences.

DISTANCE BASED APPROACH

Distance-matrix methods of phylogenetic analysis explicitly rely on a measure of "genetic distance" between the sequences being classified, and therefore they require an MSA (multiple sequence alignment) as input.

Distance-based methods must transform the sequence data into a pairwise distance matrix for use during tree inference.
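The transformation from an alignment to a distance matrix can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names are hypothetical, gapped columns are simply skipped (an assumption), and the Jukes-Cantor correction is one common choice of substitution model that the slides do not name.

```python
import math
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def jukes_cantor(p):
    """Jukes-Cantor correction of a nucleotide p-distance (assumed model)."""
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)

def distance_matrix(alignment, correct=False):
    """alignment: dict mapping taxon name -> aligned sequence.

    Returns a nested dict d[a][b] holding pairwise distances.
    """
    names = list(alignment)
    d = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        p = p_distance(alignment[a], alignment[b])
        d[a][b] = d[b][a] = jukes_cantor(p) if correct else p
    return d

# Toy alignment, invented for illustration only
toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACG-TCGA"}
matrix = distance_matrix(toy)
```

The corrected distance is always at least as large as the raw proportion of differences, because the correction accounts for multiple substitutions at the same site.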

VARIOUS DISTANCE BASED METHODS

1. UPGMA
2. NJ (Neighbor Joining)
3. FM (Fitch-Margoliash)
4. Minimum evolution

UPGMA

• Stands for Unweighted Pair Group Method with Arithmetic Mean.
• Originally developed for numerical taxonomy in 1958 by Sokal and Michener.
• This method uses a sequential clustering algorithm.

This method follows a clustering procedure:

(1) Assume that initially each species is a cluster on its own.

(2) Join the two closest clusters and recalculate the distance from the joined pair to every other cluster by taking the average.

(3) Repeat this process until all species are connected in a single cluster.
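The three steps above can be sketched in Python. This is an illustrative sketch under stated assumptions: the function name, the `frozenset`-keyed input format, and the nested-tuple output are all hypothetical choices, and ties are broken by whichever pair is encountered first.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance for every pair of labels.
    Returns (tree, height): a nested tuple of labels and the height of the root.
    """
    # Step 1: each taxon is its own cluster: id -> (subtree, size, height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    d = {frozenset((i, j)): dist[frozenset((names[i], names[j]))]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Step 2: join the two closest clusters
        i, j = sorted(min(d, key=d.get))
        (ti, ni, _), (tj, nj, _) = clusters.pop(i), clusters.pop(j)
        height = d.pop(frozenset((i, j))) / 2
        # Size-weighted average = arithmetic mean over all pairs of members
        for k in clusters:
            dik = d.pop(frozenset((i, k)))
            djk = d.pop(frozenset((j, k)))
            d[frozenset((next_id, k))] = (ni * dik + nj * djk) / (ni + nj)
        clusters[next_id] = ((ti, tj), ni + nj, height)
        next_id += 1  # Step 3: repeat until a single cluster remains
    tree, _, height = next(iter(clusters.values()))
    return tree, height

# Toy distances, invented for illustration only
toy = {frozenset(p): v for p, v in {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10}.items()}
tree, root_height = upgma(["A", "B", "C", "D"], toy)
```

Each join is placed at half the distance between the two clusters, so every leaf ends up at the same distance from the root.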

CONSTRUCTION OF PHYLOGENETIC TREE

DRAWBACK

• Strictly speaking, this algorithm is phenetic: it does not aim to reflect evolutionary descent.
• It assigns equal weight to the distances and assumes a constant molecular clock, i.e. the same rate of evolution in all lineages.
• WPGMA (Weighted Pair Group Method with Arithmetic Mean) is a similar algorithm but assigns different weights to the distances.

NEIGHBOUR JOINING METHOD

Neighbor-joining methods apply general data clustering techniques to sequence analysis, using genetic distance as a clustering metric.

Developed in 1987 by Saitou and Nei.

The simple neighbor-joining method produces unrooted trees, and it does not assume a constant rate of evolution (i.e., a molecular clock) across lineages.

It begins with an unresolved star-like tree.

Each pair is evaluated for being joined, and the sum of all branch lengths of the resultant tree is calculated.

The pair that yields the smallest sum is considered the closest neighbors and is thus joined.

A new branch is inserted between them and the rest of the tree, and the branch lengths are recalculated.

This process is repeated, treating each joined pair as a single terminal, until the tree is fully resolved.
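The joining procedure can be sketched in Python. Note the assumptions: instead of computing the total branch length for each candidate pair, the sketch uses the Q-matrix criterion found in most textbook presentations of neighbor joining, which is commonly shown to select the same pair. The function name, the input format, and the internal labels `node0`, `node1`, ... are hypothetical.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """names: list of distinct taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance for every pair.
    Returns a list of (node, neighbor, branch_length) edges of an unrooted tree.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    next_internal = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: sum of its distances to all others
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Pick the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j)
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node u
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_internal}"  # hypothetical label for the new node
        next_internal += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from the new node to every remaining node
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (d[frozenset((i, k))]
                                        + d[frozenset((j, k))] - dij) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))  # final connecting branch
    return edges

# Toy five-taxon distances, invented for illustration only
toy = {frozenset(p): v for p, v in {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}.items()}
edges = neighbor_joining(["a", "b", "c", "d", "e"], toy)
```

Unlike UPGMA, the two branches created at each join may have different lengths, which is how the method accommodates unequal rates across lineages.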

DRAWBACKS

It produces only one tree and neglects other possible trees, which might be as good as NJ trees, if not significantly better.

Moreover, since errors in distance estimates are exponentially larger for longer distances, under some conditions this method will yield a biased tree.

WEIGHTED NEIGHBOUR JOINING (WEIGHBOR)

It is a more recent method, proposed in 2000 by Bruno, Socci, and Halpern.

The Weighbor criterion consists of two terms that quantify the implications of joining the pair:

1. additivity term (of external branches)
2. positivity term (of internal branches)

Weighbor gives less weight to the longer distances in the distance matrix, and the resulting trees are less sensitive to specific biases than NJ and relatively immune to the "long-branch attraction/distraction" drawbacks observed with other methods.

FITCH-MARGOLIASH METHOD

• Proposed in 1967.
• Produces unrooted trees.
• Provides a criterion for fitting trees to distance matrices.
• Uses a weighted least squares method for clustering based on genetic distance.
• Closely related sequences are given more weight in the tree construction process to correct for the increased inaccuracy in measuring distances between distantly related sequences.
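The weighting idea can be illustrated with a small scoring function. This is a sketch under assumptions: the function name and the dictionary format are hypothetical, and the weight 1/D² (the `power=2` default) is the weighting usually associated with the Fitch-Margoliash criterion; the slides do not state the exact formula.

```python
def fitch_margoliash_score(observed, tree_dist, power=2):
    """Weighted least-squares misfit between observed and tree distances.

    observed:  dict mapping a pair of taxa -> observed distance D.
    tree_dist: dict mapping the same pairs -> path length d on a candidate tree.
    Each squared error (D - d)^2 is divided by D^power, so errors on short
    (closely related) distances count more than errors on long ones.
    """
    return sum((D - tree_dist[pair]) ** 2 / D ** power
               for pair, D in observed.items())

# Toy values, invented for illustration: the same absolute error of 0.5
# on a short distance and on a long distance
observed = {("A", "B"): 2.0, ("A", "C"): 10.0}
on_tree = {("A", "B"): 2.5, ("A", "C"): 10.5}
score = fitch_margoliash_score(observed, on_tree)
```

In this toy case the error on the short distance contributes 0.0625 to the score and the error on the long distance only 0.0025. A tree search would compare this score across candidate trees and keep the one with the smallest value.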

MINIMUM EVOLUTION

First described by Kidd & Sgaramella-Zonta in 1971, then later by Rzhetsky & Nei in 1992.

Based on the assumption that the tree with the smallest sum of branch length estimates is most likely to be the true one.

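Produces unrooted metric trees.

In ME, the tree that minimizes the total tree length, which is the sum of the lengths of the branches, is regarded as the estimate of the phylogeny:

$$S = \sum_{i=1}^{2n-3} v_i$$

where n is the number of taxa in the tree and v_i is the length of the ith branch (an unrooted bifurcating tree with n taxa has 2n - 3 branches).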

DRAWBACKS

• In principle, all different tree topologies have to be investigated to find the minimum tree. However, this is impossible in practice because of the explosive increase in the number of tree topologies.
• Slower than clustering methods.
• Information is lost when characters are transformed to distances.
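To give a sense of how quickly the number of topologies grows, the count of unrooted bifurcating trees for n taxa is (2n - 5)!!, the product of the odd numbers up to 2n - 5. The short sketch below (function name hypothetical) computes it.

```python
def num_unrooted_topologies(n):
    """Number of distinct unrooted bifurcating tree topologies for n taxa.

    Equals (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5) for n >= 3.
    """
    if n < 3:
        return 1
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, num_unrooted_topologies(n))
```

For 10 taxa there are already 2,027,025 unrooted topologies, which is why exhaustive evaluation is replaced by heuristic searches in practice.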

ADVANTAGES OF DISTANCE BASED APPROACH

• Less sensitive to variations in evolutionary rate than cluster analysis
• Fast
• Can handle many sequences at a time
• Produces a reasonable estimate of phylogeny

DISADVANTAGES OF DISTANCE BASED APPROACH

More sensitive than Parsimony or Maximum Likelihood to systematic errors.

The relationship between the individual characters and the tree is lost in the process of reducing characters to distances.

Strength of the technique is dependent on accuracy of the distance estimate, and thus dependent on the model used to obtain the distance matrix.

