On line (DNA and amino acid) Sequence Information Lecture 9


On line (DNA and amino acid) Sequence Information

Lecture 9


Introduction

• Annotation of genes
• Basic bioinformatics databases
• NCBI home page
• Query and return results
• DNA sequence results page
• Protein sequence results page


Bioinformatics Databases

• Biological data generated by various labs is submitted to and stored in specific databases.
• The data are nucleotide sequences (DNA and mRNA (cDNA)) and protein sequences.
• The main "primary" nucleotide sequence databases are:
– United States: GenBank (NCBI)
– Europe: Nucleotide Sequence Database (EMBL)
– Japan: DNA Data Bank of Japan (DDBJ)
• These databases also contain sequences related to:
– Expressed sequence tags (ESTs): short (800 bp) stretches of mRNA that can be used to see which genes are expressed…


Protein Databases

• The main protein database is UniProt (Universal Protein Resource).
• The UniProt Knowledgebase (UniProtKB) contains data from:
– Swiss-Prot (most up-to-date information)
– TrEMBL (translation of coding sequences)
– PIR database
• Both the nucleotide and protein databases contain much more detail than the sequences alone; this detail is referred to as annotation.
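
The TrEMBL entry in the list above refers to translation of coding sequences. As a rough illustration only (this is not how TrEMBL itself is implemented), the sketch below translates a DNA coding sequence into amino acids using the standard genetic code. The function name `translate` and the toy input sequence are assumptions made up for this example.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G
# (the ordering NCBI uses when listing translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept mRNA as well as DNA
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X = unrecognised codon
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequence: start codon, one alanine codon, stop codon.
print(translate("ATGGCCTAA"))  # MA
```

Real database pipelines also have to handle alternative genetic codes, ambiguous bases and partial coding sequences, which this sketch leaves out.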


Annotation of sequences

• Once the gene sequences have been determined, the data must be annotated (Klug 2010):
– Identify regulatory regions
– Other sequences of interest: exons/introns, coding sequences (CDS), polyA signal
– In protein annotation: the associated mRNA sequences
– Other organisms in which the DNA sequence/amino acid (AA) sequence is found
– Journals/references to where the data came from.
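
To make the idea of annotation more concrete, the sketch below stores a few annotated features (exons, coding sequences, a polyA signal) as simple records with start and end positions, then derives the intron positions and the total coding length from them. The `Feature` class, the helper functions and all coordinates are hypothetical; they only loosely mimic the feature tables found in real database records.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    kind: str   # e.g. "exon", "CDS", "polyA_signal"
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Invented annotation for an imaginary gene (not real data).
features = [
    Feature("regulatory", 1, 100),
    Feature("exon", 101, 300),
    Feature("exon", 501, 700),
    Feature("CDS", 151, 300),
    Feature("CDS", 501, 650),
    Feature("polyA_signal", 690, 695),
]


def total_length(feats, kind):
    """Sum the lengths of all features of one kind."""
    return sum(f.length for f in feats if f.kind == kind)


def introns(feats):
    """Infer introns as the gaps between consecutive exons."""
    exons = sorted((f for f in feats if f.kind == "exon"), key=lambda f: f.start)
    return [
        Feature("intron", a.end + 1, b.start - 1)
        for a, b in zip(exons, exons[1:])
    ]


print(total_length(features, "CDS"))                   # 300
print([(i.start, i.end) for i in introns(features)])   # [(301, 500)]
```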


Bioinformatics Database

• Bioinformatics databases contain information for various kinds of biological data.
• To facilitate finding information there are a number of specific search engines:
– NCBI has Entrez
– EMBL has SRS
• Consider the following query:
– What is the DNA and amino acid sequence for the following gene: Human BTEB
– More detail on the terms can be found by looking at a sample record: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord
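
The same kind of query can also be sent to Entrez programmatically through NCBI's E-utilities web service. The sketch below only builds the search URLs for the Human BTEB example and does not contact NCBI. The function name `build_esearch_url`, the choice of the `[Gene Name]` and `[Organism]` field tags, and the `retmax` value are assumptions for illustration; the slides themselves use the web search page.

```python
from urllib.parse import urlencode

# ESearch endpoint of the NCBI E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, gene: str, organism: str, retmax: int = 5) -> str:
    """Build an ESearch URL asking for records of one gene in one organism."""
    term = f"{gene}[Gene Name] AND {organism}[Organism]"
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# One URL per database: "nuccore" for nucleotide records, "protein" for proteins.
dna_url = build_esearch_url("nuccore", "BTEB", "Homo sapiens")
protein_url = build_esearch_url("protein", "BTEB", "Homo sapiens")
print(dna_url)
print(protein_url)
```

Opening either URL (for example with `urllib.request.urlopen`) would return a list of record identifiers, which then have to be fetched in a second step to obtain the sequences.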


NCBI Entrez search page


Coding section of gene

The exon/intron structure is also available in graphic form.


Other databases

• The nucleotide (GenBank and EMBL) and protein (UniProt) databases contain the "raw data" and are referred to as primary databases.
• More specific databases derive data from these and are referred to as secondary databases; examples include protein family and sequence similarity databases such as PROSITE and PRINTS.
• There are databases which contain information about specific organisms such as E. coli; these can be found using the Genomes OnLine Database (GOLD).
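
As an illustration of what a secondary database adds on top of the raw sequences, PROSITE describes protein families with short patterns that can be matched against a protein sequence. The sketch below converts a pattern written in PROSITE syntax into a Python regular expression and searches a sequence with it. The function name `prosite_to_regex` is an assumption, the pattern is a zinc-finger-style example that should be treated as illustrative, and the test sequence is invented.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a PROSITE-style pattern into a regular expression string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        at_start = element.startswith("<")  # "<" anchors to the N-terminus
        at_end = element.endswith(">")      # ">" anchors to the C-terminus
        element = element.strip("<>")
        # Split off an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":                # x = any amino acid
            core = "."
        elif core.startswith("{"):     # {ABC} = any amino acid except A, B, C
            core = "[^" + core[1:-1] + "]"
        piece = core + ("{" + repeat + "}" if repeat else "")
        parts.append(("^" if at_start else "") + piece + ("$" if at_end else ""))
    return "".join(parts)


# Illustrative zinc-finger-style pattern in PROSITE syntax.
regex = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H")
print(regex)  # C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H

# Invented sequence that contains one match, flanked by two glycines.
toy_sequence = "GG" + "CAACAAALAAAAAAAAHAAAH" + "GG"
print(re.search(regex, toy_sequence).group(0))
```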


Other databases

• Databases for specific types of sequences such as those associated with promoters and other regulatory elements.

• Others include structural databases from the Protein Data Bank

• Online Mendelian Inheritance in Man (OMIM), which contains information on human genes and genetic disorders.


Bioinformatics Search Engines

• The Entrez (NCBI) search engine retrieves information from NCBI databases and can be used to obtain other information including publications (PubMed), 3D protein structures, Online Mendelian Inheritance in Man…. A tutorial can be found at:
– Entrez: Making use of its power
• The EMBL uses the ExPASy site, which utilises the open source application Sequence Retrieval System (SRS); a tutorial can be found at:
– SRS tutorial: quick tour


Other important information sources

• PubMed: literature research: journal articles/conference proceedings/books etc.
– Search under many fields: keyword, author….
– Returns: journal articles/abstracts
– Two types: general/review.
• NCBI account: set up an NCBI account to manage previous searches….
• BTEB PubMed search found at:
– http://www.ncbi.nlm.nih.gov/pubmed?term=BTEB&cmd=DetailsSearch
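
Searching under several fields can be expressed by adding PubMed field tags to the search term. The sketch below combines a keyword with an optional author and an optional restriction to review articles, then places the term in a URL of the same form as the BTEB search link. The function names and the author name are hypothetical, and the URL pattern is taken from the slide, so it may not match the current PubMed site.

```python
from urllib.parse import urlencode

# URL pattern as given on the slide (may differ from the current PubMed site).
PUBMED_BASE = "http://www.ncbi.nlm.nih.gov/pubmed"


def build_pubmed_term(keyword, author=None, reviews_only=False):
    """Combine a keyword with optional author and review-only filters."""
    parts = [f"{keyword}[Title/Abstract]"]
    if author:
        parts.append(f"{author}[Author]")
    if reviews_only:
        parts.append("Review[Publication Type]")
    return " AND ".join(parts)


def pubmed_url(term):
    """Return a PubMed search URL for an already assembled search term."""
    return f"{PUBMED_BASE}?{urlencode({'term': term})}"


print(build_pubmed_term("BTEB", reviews_only=True))
# BTEB[Title/Abstract] AND Review[Publication Type]
print(build_pubmed_term("BTEB", author="Smith J"))  # hypothetical author
print(pubmed_url("BTEB"))
# http://www.ncbi.nlm.nih.gov/pubmed?term=BTEB
```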


BTEB PubMed search result
