BIOINFORMATICS FOR NUCLEOTIDE SEQUENCE ANALYSIS BIOC 21121 25-05-2015



Home page of NCBI

http://www.ncbi.nlm.nih.gov/


What can we do with NCBI?

- Find similar sequences (sequence alignment)
- Find the relatedness of a gene or protein (do they have a common ancestor?)
- Study mutations
- Design primers


WHAT IS BLAST?

- Basic Local Alignment Search Tool
- A method for rapid searching of nucleotide and protein databases
- Offers both sensitivity and speed
- Fastest and most frequently used sequence alignment tool
- Align 2 or more sequences
- Primer-BLAST


BLAST ACCESS

- NCBI BLAST: http://www.ncbi.nlm.nih.gov/BLAST/
- Canadian Bioinformatics Resource BLAST: http://cbr-rbc.nrc-cnrc.gc.ca/blast/
- European Bioinformatics Institute BLAST: http://www.ebi.ac.uk/blastall/ and http://www.ebi.ac.uk/blast2/


BLAST - PROGRAMS

- Blastp - Compares an amino acid query sequence against a protein sequence database.
- Blastn - Compares a nucleotide query sequence against a nucleotide sequence database.
- Blastx - Compares a nucleotide query sequence, translated in all six reading frames, against a protein sequence database.
- Tblastn - Compares a protein query sequence against a nucleotide sequence database translated in all six reading frames.
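The four programs above differ only in the type of the query and the type of the database. A minimal sketch of that choice as a lookup table follows; the function name `choose_blast_program` and the labels "nucleotide" and "protein" are illustrative assumptions, not part of any BLAST interface.

```python
# Lookup of BLAST program by (query type, database type), as listed above.
# The type labels and the function name are hypothetical, for illustration only.
BLAST_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",   # query is translated
    ("protein", "nucleotide"): "tblastn",  # database is translated
}


def choose_blast_program(query_type, database_type):
    """Return the BLAST program name for a query/database type pair."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"unsupported combination: {key}")
    return BLAST_PROGRAMS[key]


# Q1-style search: nucleotide query against a protein database.
print(choose_blast_program("nucleotide", "protein"))  # prints blastx
```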


Q1: Find the protein coded by the following nucleotide sequence.

Accession Number - NM_008862

TCGAAATAACGCGTGTTCTCAACGCGGTCGCGCAGATGCCTTTGCTCATC
AGATGCGACCGCAACCACGTCCGCCGCCTTGTTCGCCGTCCCCGTGCCTC
AACCACCACCACGGTGTCGTCTTCCCCGAACGCGTCCCGGTCAGCCAGCC
TCCACGCGCCGCGCGCGCGGAGTGCCCATTCGGGCCGCAGCTGCGACGGT
GCCGCTCAGATTCTGTGTGGCAGGCGCGTGTTGGAGTCTAAA

http://www.ncbi.nlm.nih.gov/
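A translated search such as Blastx first turns the nucleotide query into protein sequences in all six reading frames. The sketch below shows only that translation step, using the standard genetic code; it is not how BLAST itself is implemented, and the function names and the choice of "X" for codons containing ambiguous bases are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return a dict of the three forward and three reverse frame translations."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Toy input (hypothetical, not the Q1 sequence): ATG GCC TAA
print(six_frame_translations("ATGGCCTAA")["+1"])  # prints MA*
```

The same function can be called on the Q1 sequence with spaces removed; each of the six protein strings is what a Blastx-style search would compare against the protein database.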


BLAST Statistics

The Score and the E-value (E) are particularly helpful in making interpretations.

- Score: a good measure of the quality of an alignment.
- E-value: the expectation value, a good measure of the significance of the alignment.

The E-value is the number of different alignments, with scores equivalent to or better than S, that are expected to occur in a database search by chance.

The lower the E-value, the more significant the alignment.
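The link between score and E-value can be sketched with the Karlin-Altschul relation for bit scores, E = m × n × 2^(-S'), where m is the query length, n is the total database length and S' is the bit score. The code below is a simplified illustration: real BLAST uses effective lengths that correct for edge effects, and the function name and the example numbers are assumptions, not values from the slides.

```python
def evalue_from_bit_score(bit_score, query_length, database_length):
    """Approximate E-value as m * n * 2**(-S') for a bit score S'.

    Simplification: uses raw lengths, not BLAST's effective lengths.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical numbers: a 250-nt query against a 1,000,000,000-nt database.
for bits in (30, 50, 100):
    print(bits, evalue_from_bit_score(bits, 250, 1_000_000_000))
```

Each extra bit of score halves the E-value, and searching a larger database raises the E-value of the same alignment in proportion.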


Q2: Find the relatedness of the following receptors.

Toll-like receptor 3
- human - NM_003265
- mouse - NM_126166
- zebrafish - NM_001013269

http://www.ncbi.nlm.nih.gov/
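Relatedness between two aligned sequences is commonly summarised as percent identity. The sketch below assumes the two sequences have already been aligned (for example by BLAST's two-sequence alignment) and are passed in as equal-length strings with "-" for gaps. Gap columns are counted in the denominator, which is one convention among several; the function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same residue.

    Assumes pre-aligned, equal-length strings with '-' marking gaps.
    Gap columns count as non-identical and stay in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (not real TLR3 data): 4 identical columns out of 6.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))  # prints 66.7
```

Computing this value for each pair (human and mouse, human and zebrafish, mouse and zebrafish) gives a simple ranking of which receptors are most closely related.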


Q3: Design primers for the following gene.

Crassostrea gigas heat shock protein 70

http://www.ncbi.nlm.nih.gov/
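When candidate primers come back from a design tool, two quick checks are GC content and an approximate melting temperature. The sketch below uses the Wallace rule, Tm = 2 × (A + T) + 4 × (G + C), which is a rough estimate intended for short oligonucleotides and is not the thermodynamic model that primer design tools normally apply. The function names are assumptions; the example primer is the forward primer given in Q4.

```python
def gc_content(primer):
    """Return GC content of a primer as a percentage."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 100.0 * gc / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


forward = "GGAAACGCCTGCTGGGAG"  # forward primer from Q4
print(len(forward), round(gc_content(forward), 1), wallace_tm(forward))
# prints 18 66.7 60
```

A forward and reverse primer with similar estimated Tm values are generally easier to use together in one PCR.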


Q4: Determine the fragment size of the following PCR product.

TubA1 U12589
Alpha tubulin gene of
Forward 5’ GGAAACGCCTGCTGGGAG
Reverse 5’ AACAGTTGGAGGCTGATAAT

http://www.ncbi.nlm.nih.gov/
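The fragment size follows from where the two primers bind: the forward primer matches the template strand directly, and the reverse primer matches as its reverse complement further downstream. A minimal sketch of that arithmetic is shown below. It only handles exact matches, unlike Primer-BLAST, and the template is a made-up toy sequence built around the Q4 primers, not the real U12589 record; the function names are assumptions.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_size(template, forward, reverse):
    """Return the PCR product length in bp, or None if a primer is not found.

    Exact-match search only: forward primer on the given strand, then the
    reverse complement of the reverse primer downstream of it.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


forward = "GGAAACGCCTGCTGGGAG"
reverse = "AACAGTTGGAGGCTGATAAT"
# Hypothetical toy template: 3 nt flank, forward site, 10 nt spacer,
# reverse-primer site, 3 nt flank.
toy_template = (
    "TTT" + forward + "ACGTACGTAC" + reverse_complement(reverse) + "GGG"
)
print(amplicon_size(toy_template, forward, reverse))  # prints 48
```

For the actual exercise, the template would be the sequence downloaded for accession U12589, and the product size includes both primer lengths.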


CONCLUSIONS

- BLAST is one of the most important programs in bioinformatics (maybe all of biology)
- BLAST is based on sound statistical principles (key to its speed and sensitivity)
- A basic understanding of its principles is key for using/interpreting BLAST output
