Bioinformatics Computing 1 CMP 807 – Day 2 Kevin Galens

DESCRIPTION

Fundamentals of Sequence Alignment
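
The alignment scenario used in the slides (COELACANTH against PELICAN, scored +1 match, -1 mismatch, -1 gap) can be reproduced with a short script. What follows is a minimal sketch, not the course's own code: the names `align` and `score_pair`, the `local` flag, and the tie-breaking order in the traceback (diagonal first, then a gap in the second sequence) are assumptions made for illustration. With `local=False` it fills the 2-D matrix the Needleman-Wunsch way; with `local=True` it clamps cells at zero and traces back from the best cell, as in Smith-Waterman.

```python
MATCH, MISMATCH, GAP = 1, -1, -1  # scoring scheme taken from the slides


def score_pair(a, b):
    """Score one aligned pair of characters."""
    return MATCH if a == b else MISMATCH


def align(seq1, seq2, local=False):
    """Return (score, aligned_seq1, aligned_seq2).

    local=False -> global alignment (Needleman-Wunsch)
    local=True  -> local alignment (Smith-Waterman)
    """
    n, m = len(seq1), len(seq2)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    if not local:
        # Global alignment: leading gaps cost GAP each.
        for i in range(1, n + 1):
            H[i][0] = i * GAP
        for j in range(1, m + 1):
            H[0][j] = j * GAP

    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cell = max(
                H[i - 1][j - 1] + score_pair(seq1[i - 1], seq2[j - 1]),
                H[i - 1][j] + GAP,   # gap in seq2
                H[i][j - 1] + GAP,   # gap in seq1
            )
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
            H[i][j] = cell
            if cell > best:
                best, best_pos = cell, (i, j)

    # Global: trace back from the bottom-right corner.
    # Local: trace back from the highest-scoring cell until a zero is hit.
    i, j = best_pos if local else (n, m)
    score = H[i][j]
    out1, out2 = [], []
    while i > 0 or j > 0:
        if local and H[i][j] == 0:
            break
        diagonal = (
            i > 0 and j > 0
            and H[i][j] == H[i - 1][j - 1] + score_pair(seq1[i - 1], seq2[j - 1])
        )
        if diagonal:
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] + GAP:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return score, "".join(reversed(out1)), "".join(reversed(out2))
```

Calling `align("COELACANTH", "PELICAN", local=True)` recovers the ELACAN / ELICAN pair shown in the slides. The global call returns only one of the equally scoring alignments; which one depends on the assumed tie-breaking order, which is why the slides can list two results.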

Today's Objectives
  Sequence Alignment
    Global
    Local
    Substitution Matrices
  DNA Sequencing
  BLAST Algorithm
  Install Software:
    BLAST DB
    EMBOSS - emboss.open-bio.org
    ClustalW - ftp.ebi.ac.uk
  File Formats

Fundamentals of Sequence Alignment

Global Alignment: Needleman-Wunsch
  What is global alignment?
    Uses the whole length of both sequences
    Result: 1 optimal alignment
  Needleman-Wunsch:
    Uses a 2-D matrix
  Scenario:
    Align: COELACANTH and PELICAN
    +1 Match
    -1 Mismatch
    -1 Gap

Global Alignment: Needleman-Wunsch
  Resulting alignment:
    COELACANTH
    P-ELICAN--
  or
    COELACANTH
    -PELICAN--

Local Alignment: Smith-Waterman
  What is a local alignment?
    Finds the highest-scoring substring
    No assumption on sequence length
  Smith-Waterman:
    Uses a 2-D matrix
  Scenario:
    Align: COELACANTH and PELICAN
    +1 Match
    -1 Mismatch
    -1 Gap

Local Alignment: Smith-Waterman
  Resulting alignment:
    ELACAN
    ELICAN

Sequence Alignment
  More sophisticated scoring: Substitution Matrix
  PAMX (Point Accepted Mutation)
    Scaled according to evolutionary distance of closely related proteins
    PAM1 = 1% of amino acid positions have changed
    PAM250 most common
  BLOSUMX (BLOck SUbstitution Matrix)
    Scaled according to more distantly related proteins
    BLOSUM62 based on proteins with