T-COFFEE, a novel method for Multiple Sequence Alignments
Cédric Notredame
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
      ***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
      * : .* . :
Potential Uses of A Multiple Sequence Alignment?
Extrapolation
Motifs/Patterns
Phylogeny
Profiles
Structure Prediction
Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques.
Why Is It Difficult To Compute A Multiple Sequence Alignment?
A CROSSROAD PROBLEM
BIOLOGY: What is A Good Alignment?
COMPUTATION: What is THE Good Alignment?
Why Is It Difficult To Compute A Multiple Sequence Alignment?
BIOLOGY
CIRCULAR PROBLEM....
Good Sequences
Good Alignment
COMPUTATION
Dynamic Programming Using A Substitution Matrix
Progressive Alignment
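As a minimal sketch of the dynamic programming step named above, the following Python function fills a Needleman-Wunsch table for two sequences and returns the best global alignment score. The scoring function `toy_score` and the linear gap penalty are illustrative assumptions, not values from the slides; in practice the score for a residue pair would come from a substitution matrix such as BLOSUM or PAM.

```python
def global_alignment_score(a, b, score, gap=-2):
    """Return the best global (Needleman-Wunsch) alignment score of a and b.

    score(x, y) plays the role of the substitution matrix lookup;
    gap is a linear gap penalty (an assumption made for brevity).
    """
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align the two residues
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    return dp[-1][-1]


def toy_score(x, y):
    """Hypothetical stand-in for a substitution matrix: +1 match, -1 mismatch."""
    return 1 if x == y else -1
```

The table has (len(a) + 1) x (len(b) + 1) cells, which is where the L^2 cost of a single pairwise alignment comes from.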
The T-Coffee Algorithm
Progressive Alignment Principle and its Limitations…
The Extended Library Principle…
The Triplet Assumption
SEQ A
SEQ B
Weighting And Extension
Extension = Using Information from Other Sequences
Weighting = Using the Surrounding Information (Coffee)
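The extension step can be sketched as follows, assuming the triplet rule described by Notredame, Higgins and Heringa (2000): the weight of a residue pair is its primary weight plus, for every residue in a third sequence linked to both, the minimum of the two connecting weights. The dictionary layout, the function name `extend_library`, and the weights in the usage note are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def extend_library(primary):
    """Extend a primary library with triplet information.

    primary maps a pair of residues ((seq_u, pos_u), (seq_v, pos_v)) to a
    weight. Each pair is assumed to appear once and to link two different
    sequences. Returns a dict keyed by frozenset({residue_u, residue_v}).
    """
    # Symmetric view of the library: residue -> {linked residue: weight}.
    neighbours = defaultdict(dict)
    for (u, v), w in primary.items():
        neighbours[u][v] = w
        neighbours[v][u] = w

    # Start from the direct (primary) weights.
    extended = defaultdict(int)
    for (u, v), w in primary.items():
        extended[frozenset((u, v))] += w

    # Every residue c acts as an intermediate between pairs of its neighbours.
    for c, linked in neighbours.items():
        nodes = sorted(linked)
        for idx, u in enumerate(nodes):
            for v in nodes[idx + 1:]:
                if u[0] == v[0]:
                    continue  # residues of the same sequence are never aligned
                extended[frozenset((u, v))] += min(linked[u], linked[v])
    return dict(extended)
```

For example, with primary weights A1-B1 = 88, A1-C1 = 77 and C1-B1 = 100, the extended weight of A1-B1 becomes 88 + min(77, 100) = 165. A pair with no direct link can still receive a weight if a third sequence connects its two residues.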
T-Coffee Progressive Alignment
Notredame, Higgins, Heringa, 2000
Dynamic Programming Using The extended Library
Local Alignment
Global Alignment
Extension
Multiple Sequence Alignment
Mixing Local and Global Alignments
What is a library?
Extension+T-Coffee
Library Based Multiple Sequence Alignment
2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
….

3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
….
How Long Does It Take? (N sequences of average length L)
Primary Lib: O(N^2 L^2)
Extension: O(N^3 L^2)
Tree: O(N^2 L^2) + O(N^3)
Aln: O(N L^2)
N times slower than ClustalW
Validating T-Coffee
What Is BaliBase?
BaliBase is a collection of reference multiple alignments.
The structures of the sequences are known and were used to assemble the multiple alignments.
Evaluation is carried out by comparing the structure-based reference alignment with its sequence-based counterpart.
BaliBase
DALI, Sap …
Method X
Comparison
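The comparison against a reference can be sketched as a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that the tested alignment also aligns. This is a simplified illustration, not the scoring program distributed with BaliBase; the function names and the toy alignments in the usage note are assumptions.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    alignment maps a sequence name to its gapped string; all strings are
    assumed to have the same length, and '-' marks a gap.
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    next_index = {name: 0 for name in names}  # ungapped residue counters
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        # Names are visited in sorted order, so pairs are oriented consistently.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of the reference's aligned residue pairs recovered by test."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)
```

With the reference `{"s1": "AC-G", "s2": "ACTG"}` and the test alignment `{"s1": "ACG-", "s2": "ACTG"}`, two of the three reference pairs are recovered, so the score is 2/3.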
Validation Using BaliBase
T-Coffee Results
Taking T-Coffee Further:
Using Structures
Mixing Heterogeneous Information With T-Coffee
Local Alignment
Global Alignment
Multiple Sequence Alignment
Multiple Alignment
Structural Specialist
Running T-Coffee ONLINE
The T-Coffee Server
ES45, 4 Proc, 1 Gb RAM
Future…
Large Scale…
Tailor Made…
WHO ?
WHO USES T-Coffee?
Dali Domain Dictionary
Pfam
SwissProt
WHO Makes T-Coffee?
Cédric Notredame
Des Higgins
Chantal Abergel
Olivier Poirot
Orla O’Sullivan