MULTIPLE SEQUENCE ALIGNMENT
Name – Swati Kumari
M.Sc. Bioinformatics
Roll No – 22
Central University of Bihar, Patna
CONTENTS
Introduction
Goal of MSA
General Considerations for MSA
Steps of MSA
Diagrammatic representation of steps of MSA
Applications of MSA
Software for MSA
References
INTRODUCTION
A Multiple Sequence Alignment (MSA) is an alignment of more than two biological sequences, made to find the similarities or dissimilarities between them.
It is one of the most essential tools in molecular biology and is applied to both amino acid and nucleotide (DNA, RNA) sequences.
Goal of MSA
To find similarities at the nucleotide or amino acid level.
To find functional similarities.
To find structural similarities.
To find evolutionary relationships.
General Considerations for MSA
Aligning a larger number of sequences generally gives a better result.
Two amino acid sequences need to share only about 40% similarity to be considered close in structure.
Subgroups of sequences should be prealigned separately, and one member of each subgroup should be included in the final multiple alignment.
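The similarity threshold above can be checked on a pair of aligned sequences with a short calculation. The following is a minimal sketch in Python; the two sequences are made up for illustration, and the choice to skip columns that contain a gap is an assumption (other definitions of percent identity exist).

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues over the columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned pair, for illustration only
seq1 = "MKT-AYIAKQR"
seq2 = "MKTLAYLAKHR"
print(percent_identity(seq1, seq2))
```

Here 8 of the 10 compared columns are identical, so the pair is well above the 40% mark.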
Steps of MSA
Compare all sequences pairwise.
Perform cluster analysis on the pairwise data to generate a hierarchy for alignment. This may be in the form of a binary tree.
Build the multiple alignment by first aligning the most similar pair of sequences, then the next most similar pair, and so on.
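The first two steps, and the order of alignment used in the third, can be sketched in Python. This is an illustration under stated assumptions, not how ClustalW or any other program is implemented: `difflib.SequenceMatcher` stands in for a real pairwise alignment score, average-linkage clustering is one possible choice of cluster analysis, and the four sequences are made up. The actual alignment of sequences and groups is left out.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    # Stand-in for a pairwise alignment score (assumption, for illustration)
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def guide_tree(seqs):
    """Return a binary tree (nested tuples of names) and the order of merges."""
    # Step 1: compare all sequences pairwise
    sim = {frozenset(pair): similarity(seqs[pair[0]], seqs[pair[1]])
           for pair in combinations(seqs, 2)}

    def average_similarity(members1, members2):
        total = sum(sim[frozenset((x, y))] for x in members1 for y in members2)
        return total / (len(members1) * len(members2))

    # Step 2: cluster; each key is a subtree, each value the names it contains
    clusters = {name: (name,) for name in seqs}
    merges = []
    while len(clusters) > 1:
        left, right = max(
            combinations(clusters, 2),
            key=lambda p: average_similarity(clusters[p[0]], clusters[p[1]]),
        )
        members = clusters.pop(left) + clusters.pop(right)
        clusters[(left, right)] = members
        merges.append((left, right))  # Step 3 would align in this order
    (tree,) = clusters
    return tree, merges


# Hypothetical input sequences
example = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAA",
    "C": "TTGGCCTTGG",
    "D": "TTGGCCTTCC",
}
tree, merges = guide_tree(example)
print(tree)
for left, right in merges:
    print("align", left, "with", right)
```

The most similar pair (A and B) is merged first, then C and D, and finally the two groups are joined, which is the order a progressive alignment would follow.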
Applications of MSA
Identifying new protein or gene families.
Determining the relationship between the aligned proteins.
Developing a genetic representation of a protein family.
In practical analysis:
  Mutant analysis.
  Identifying conserved primer binding sites.
Designing experiments to test and modify the function of the aligned sequences:
  Identifying amino acids crucial for function.
  Locating non-conserved regions useful for tag insertion.
  Designing degenerate primers for Polymerase Chain Reaction (PCR).
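Several of these applications come down to locating conserved or non-conserved columns in a finished alignment. A minimal sketch in Python follows; the alignment, the threshold of 1.0, and the minimum run length of 4 are assumptions chosen for illustration, and gaps are counted against conservation.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences that share the most common residue in each column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))  # gaps lower the score
    return scores


def conserved_runs(scores, threshold=1.0, min_length=4):
    """Return (start, end) half-open column ranges where every score >= threshold."""
    runs = []
    start = None
    for i, score in enumerate(scores + [-1.0]):  # sentinel closes the last run
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical nucleotide alignment
alignment = [
    "ATGCCGTAAGCT",
    "ATGCCGTTAGCT",
    "ATGCCG-AAGGT",
]
scores = column_conservation(alignment)
print(conserved_runs(scores))
```

In this example the first six columns are fully conserved, so they would be a candidate region for a primer binding site, while the remaining columns vary.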
Software for MSA
ClustalW (command-line version): the well-known multiple alignment program for nucleic acid and protein sequences.
ClustalX (graphical version): provides a window-based user interface to the ClustalW multiple alignment program.
MUSCLE (MUltiple Sequence Comparison by Log-Expectation): reported to be more accurate than T-Coffee and faster than ClustalW.
T-Coffee (Tree-based Consistency Objective Function for Alignment Evaluation): its main characteristic is that it can combine results obtained with several alignment methods.
Kalign: a very fast MSA tool that concentrates on local regions; suitable for large alignments.
References
http://www.bioinformatics.org/wiki/Multiple_sequence_alignment
http://en.wikipedia.org/wiki/Multiple_sequence_alignment
www.clustal.org/
http://www.ebi.ac.uk/Tools/msa/
Class lecture