Multiple Sequence Alignment & Phylogenetic Tree

  • 8/6/2019 Multiple Sequence Alignment & Phylogenetic Tree

    Multiple Sequence Alignment & Phylogenetic Tree

    Multiple Sequence Alignment

    Multiple sequence alignment is the process of aligning three or more sequences with each other so as to bring as many similar sequence characters as possible into the same columns.

    Why perform an MSA?

    - Visualize trends between homologous sequences:
      - Shared regions of homology
      - Regions unique to a sequence within a family
      - Structural/functional motif
    - As the first step in a phylogenetic analysis
    - Improve accuracy of structure predictions

    Progressive Multiple Alignment

    - Heuristic
    - Perform pairwise alignments
    - Align sequences to alignments or alignments to existing alignments (profile alignments)

    Iterative methods

    Several progressive alignment methods can be iterated, e.g. ClustalX

    ClustalX Algorithm

    - Perform alignments and calculate distances for all pairs of sequences
    - Construct guide tree (dendrogram) joining the most similar sequences using Neighbour Joining
    - Align sequences, starting at the leaves of the guide tree. This involves pairwise comparisons as well as comparison of a single sequence with a group of sequences (profile)
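The first step of the algorithm above (pairwise alignments and distances) can be sketched in Python. This is a minimal illustration, not ClustalX's implementation: the scoring values (`match`, `mismatch`, `gap`), the linear gap cost, the distance measure (fraction of mismatched columns, ignoring gapped columns) and the names `seq1` to `seq4` are all assumptions made for the example.

```python
from itertools import combinations

def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap cost (toy scores)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def p_distance(aligned_a, aligned_b):
    """Fraction of mismatches among columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)

# Hypothetical input: toy sequences in the spirit of the MITTENS example.
sequences = {"seq1": "MITTENS", "seq2": "KITTIES", "seq3": "SMITTEN", "seq4": "KITTENS"}

distances = {}
for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
    distances[(name_a, name_b)] = p_distance(*global_align(seq_a, seq_b))

# The most similar pair is the one a guide tree would join first.
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

The resulting distance table is the input a tree-building method such as Neighbour Joining would use to construct the guide tree.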

    Using ClustalX

    - Start with sequences in FASTA format (or an existing alignment in Clustal format)
    - [Do Alignment] on the Alignment menu
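For readers unfamiliar with FASTA input, a minimal reader might look like the sketch below. It assumes the usual convention (a `>` header line followed by one or more sequence lines) and uses the first word of the header as the identifier; the function name `parse_fasta` and the example records are hypothetical.

```python
def parse_fasta(text):
    """Return a dict mapping each record ID (first word after '>') to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequences may span several lines
    return {key: "".join(parts) for key, parts in records.items()}

# Hypothetical input file content.
example = """>seq1 first toy sequence
MITT
ENS
>seq3
SMITTEN
"""

records = parse_fasta(example)
print(records)
```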

    Consensus Sequences

    Simplest form: a single sequence which represents the most common amino acid/base in each position

    Y  D  D  G  A    V    -  E  A  L
    Y  D  G  G  -    -    -  E  A  L
    F  E  G  G  I    L    V  E  A  L
    F  D  -  G  I    L    V  Q  A  V
    Y  E  G  G  A    V    V  Q  A  L
    --------------------------------
    Y  D  G  G  A/I  V/L  V  E  A  L   (consensus)
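The consensus for the alignment on this slide can be reproduced with a short Python sketch. Two choices here are assumptions rather than anything stated on the slide: gaps are ignored when counting, and tied residues are joined with `/` in order of first appearance.

```python
from collections import Counter

# The five aligned sequences from the slide.
alignment = [
    "YDDGAV-EAL",
    "YDGG---EAL",
    "FEGGILVEAL",
    "FD-GILVQAV",
    "YEGGAVVQAL",
]

def consensus(rows, gap="-"):
    """Most common residue per column; ties are reported as 'X/Y'."""
    result = []
    for column in zip(*rows):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            result.append(gap)  # column contains only gaps
            continue
        best = max(counts.values())
        result.append("/".join(res for res, n in counts.items() if n == best))
    return result

print(" ".join(consensus(alignment)))  # Y D G G A/I V/L V E A L
```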

    ClustalX Parameters

    - Scoring matrix
    - Gap opening penalty
    - Gap extension penalty
    - Protein gap parameters
    - Additional algorithm parameters
    - Secondary structure penalties

    Score Matrices

    - Pairwise matrices and multiple alignment matrix series
    - PAM (Dayhoff), BLOSUM (Henikoff), GONNET (default), user defined
    - Transition (AG)/Transversion (C

    Gap Penalties

    - Linear gap penalties
    - Affine gap penalties: p = o + l·e, where o is the gap opening penalty, e is the gap extension penalty and l is the gap length
    - Protein-specific penalties (on by default)
      - Increase the probability of gaps associated with certain residues
      - Increase the chances of gaps in loop regions (> 5 hydrophilic residues)
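To make the difference between linear and affine gap penalties concrete, the sketch below evaluates both for the same total number of gapped positions. The affine formula follows the slide (p = o + l·e); the numeric values are illustrative assumptions and are not claimed to be ClustalX defaults.

```python
def linear_gap_penalty(length, per_residue):
    """Linear model: every gapped position costs the same."""
    return length * per_residue

def affine_gap_penalty(length, gap_open, gap_extend):
    """Affine model from the slide: p = o + l * e for a gap of length l."""
    if length == 0:
        return 0
    return gap_open + length * gap_extend

# Illustrative values only (assumed for this example).
GAP_OPEN, GAP_EXTEND, LINEAR_COST = 10, 1, 3

one_long_gap = affine_gap_penalty(6, GAP_OPEN, GAP_EXTEND)
three_short_gaps = 3 * affine_gap_penalty(2, GAP_OPEN, GAP_EXTEND)
print(one_long_gap, three_short_gaps)

# Under the linear model the two arrangements cost the same.
print(linear_gap_penalty(6, LINEAR_COST), 3 * linear_gap_penalty(2, LINEAR_COST))
```

With an affine penalty, one long gap is cheaper than several short gaps covering the same number of positions, which is why affine penalties tend to concentrate gaps into fewer, longer runs.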

    Algorithm parameters

    - Slow-accurate pairwise alignment
    - Do alignment from guide tree
    - Reset gaps before aligning (iteration)
    - Delay divergent sequences (%)

    Additional displays

    - Column scores
    - Low quality regions
    - Exceptional residues

    Multiple Alignment Strategies

    - Align pairs of sequences using an optimal method
    - Choose representative sequences to align carefully
    - Choose sequences of comparable lengths
    - Use progressive alignment programs such as ClustalX for multiple alignment
    - Progressive alignment programs may be combined
    - Review alignment by eye and edit

    [Figure: progressive alignment example]

    Sequences:
      1 MITTENS
      2 KITTIES
      3 SMITTEN
      4 KITTE

    Guide tree leaves, in order: 1, 3, 4, 2

    Sequences 1 and 3:
      1   -MITTENS
      3   SMITTEN-
      13  MITTEN

    Profile 13 and sequence 4:
      13  MITTEN
      4   KITTENS
      134 ITTEN

    Manual editing:

    - Fine adjustment of particular columns
      - Incorporate specific knowledge
    - Removal of gappy bits
      - Important for phylogenetic analysis
    - Removal of parts of/whole sequences
      - Non-homologous regions
      - Sequences included by error
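Removal of gappy columns is usually done in an alignment editor, but the idea can be sketched in Python. The function name `drop_gappy_columns` and the threshold `max_gap_fraction` are hypothetical choices for this example; the input is the five-sequence alignment from the consensus slide.

```python
def drop_gappy_columns(rows, max_gap_fraction=0.5, gap="-"):
    """Keep only columns whose fraction of gaps does not exceed the threshold."""
    keep = [
        index
        for index, column in enumerate(zip(*rows))
        if column.count(gap) / len(column) <= max_gap_fraction
    ]
    return ["".join(row[index] for index in keep) for row in rows]

alignment = [
    "YDDGAV-EAL",
    "YDGG---EAL",
    "FEGGILVEAL",
    "FD-GILVQAV",
    "YEGGAVVQAL",
]

# With a threshold of 0.3, only the column with two gaps out of five is removed.
trimmed = drop_gappy_columns(alignment, max_gap_fraction=0.3)
for row in trimmed:
    print(row)
```

The threshold is a judgment call: a strict value removes more uncertain columns but also discards data, so it is worth reviewing the trimmed alignment by eye.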

    Multiple Alignments & Phylogenetic Trees

    - You can make a more accurate multiple sequence alignment if you know the tree already
    - A good multiple sequence alignment is an important starting point for drawing a tree
    - The process of constructing a multiple alignment (unlike pairwise) needs to take account of phylogenetic relationships

    Phylogenetic Tree

    A phylogenetic tree is a representation of the evolutionary/genealogical relationships between a collection of organisms (or molecular sequences).

    Components of Tree

    - Node style
    - Terminal nodes
    - Internal nodes
    - Root
    - Branches
    - Font style

    Methods For Tree Building

    Cladistic approach
    - Maximum Parsimony method
    - Maximum Likelihood method

    Phenetic approach
    - Neighbour Joining (NJ method)
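As an illustration of the distance-based approach, the following is a compact sketch of the standard Neighbour Joining procedure. The five-taxon distance table is a small textbook-style example, not data from these slides, and the internal node names (`node1`, `node2`, ...) are invented for the output.

```python
from itertools import combinations

def neighbour_joining(names, dist):
    """Return a list of (node, neighbour, branch_length) edges.

    names: list of unique taxon labels.
    dist:  dict mapping (x, y) to the distance for every unordered pair.
    """
    d = {frozenset(pair): value for pair, value in dist.items()}
    nodes = list(names)
    edges = []
    created = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {x: sum(d[frozenset((x, y))] for y in nodes if y != x) for x in nodes}
        # Pick the pair minimising Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        best_pair, best_q = None, None
        for i, j in combinations(nodes, 2):
            q = (n - 2) * d[frozenset((i, j))] - r[i] - r[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
        i, j = best_pair
        dij = d[frozenset((i, j))]
        created += 1
        u = f"node{created}"
        # Branch lengths from i and j to the new internal node u.
        length_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        length_j = dij - length_i
        edges.append((i, u, length_i))
        edges.append((j, u, length_j))
        # Distances from u to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                ) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    x, y = nodes
    edges.append((x, y, d[frozenset((x, y))]))
    return edges

# Illustrative distance table (assumed for this example).
taxa = ["a", "b", "c", "d", "e"]
table = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7,
    ("d", "e"): 3,
}

tree_edges = neighbour_joining(taxa, table)
for node, neighbour, length in tree_edges:
    print(f"{node} - {neighbour}: {length}")
```

Neighbour Joining produces an unrooted tree; placing a root needs extra information such as an outgroup.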

    Editing a multiple sequence alignment

    - It is NOT fraud to edit a multiple sequence alignment
    - Incorporate additional knowledge if possible
    - Alignment editors help to keep the data organized and help to prevent unwanted mistakes, e.g. BioEdit, SeaView, Jalview etc.

    Profile Alignment

    - Position-specific scores
    - Allows alignment of alignments
    - Gaps introduced as whole columns in the separate alignments
    - Optimal alignment in time O(a^2 l^2)
    - Information about the degree of conservation of sequence positions is included
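Two of the ideas above, position-specific scores and gaps introduced as whole columns, can be illustrated with a small Python sketch. It is not a full profile alignment: the profile here is simply the residue frequency per column, and the score of a residue against a column is assumed to be its frequency in that column (an identity-style scoring chosen for brevity). All function names are hypothetical.

```python
from collections import Counter

def build_profile(rows):
    """Per-column symbol frequencies (gaps are counted as their own symbol)."""
    depth = len(rows)
    return [
        {symbol: count / depth for symbol, count in Counter(column).items()}
        for column in zip(*rows)
    ]

def column_score(profile_column, residue):
    """Toy position-specific score: frequency of the residue in this column."""
    return profile_column.get(residue, 0.0)

def insert_gap_column(rows, index, gap="-"):
    """Insert a gap at the same position in every row of an alignment."""
    return [row[:index] + gap + row[index:] for row in rows]

# Existing alignment of two toy sequences.
aligned = ["-MITTENS", "SMITTEN-"]
profile = build_profile(aligned)

# A conserved column scores a matching residue higher than a variable one.
print(column_score(profile[1], "M"), column_score(profile[0], "S"))

# A gap added while aligning to this alignment is applied to every row at once.
print(insert_gap_column(aligned, 1))
```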

    Thank You