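Introduction to Bioinformatics and Biological Databases
Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University
Assistant Professor 張學偉 (Hsueh-Wei Chang), 2006/08/08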
Outline
Fields of Bioinformatics
Genome Projects Today
Database issue in “Nucleic Acids Research”
Server issue in “Nucleic Acids Research”
Post-Genomic Era: Lots of Data!
“The study of genetic and other biological information using computer and statistical techniques.”
A Genome Glossary, Science, Feb 16, 2001
Bioinformatics
Bioinformatics is the discipline of biology that has evolved to gather, store, and manage the vast amounts of biological data in specialized databanks, which it then mines for knowledge.
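Fields of Bioinformatics
Database construction and integration
Sequence analysis
Structure/function analysis
Experimental data analysis
Knowledge management
ref. Academia Sinica Computing Centre Newsletter (中央研究院計算中心通訊), Vol. 19, No. 20
Bioinformatics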
Biotech and Computer Science
1953: Watson and Crick, DNA double helix discovery
1958: Computer revolution begins
1974: Stan Cohen and Herb Boyer, recombinant DNA molecule
1981: First portable computer
1990: Human Genome Project begins
1992: World Wide Web site
2003: Human genome fully mapped
Breaking point of biotechnology
The breaking point of biotechnology is the Human Genome Project
GenBank, GCG Package
Bioinformatics: hot issues
Genome Analysis
Pipeline Analysis
Genome Annotation
SNP
Data warehouse / Databases integration
New Algorithm
Literature Mining
System Biology / Microarray Analysis
The growth of GenBank (updates)
Prediction: data size doubles every 14 months
44,575,745,176 bases from 40,604,319 reported sequences (as of Dec. 15, 2004)
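The doubling prediction can be turned into a rough projection. The sketch below assumes the 14-month doubling time stays constant, which is the slide's prediction rather than a measured law; the function names are illustrative only.

```python
import math

# Figures quoted on the slide (GenBank, Dec. 15, 2004).
BASES_DEC_2004 = 44_575_745_176
DOUBLING_MONTHS = 14  # the slide's predicted doubling time


def projected_bases(months_ahead: float,
                    start: float = BASES_DEC_2004,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Project the size with N(t) = N0 * 2 ** (t / T), assuming constant doubling."""
    return start * 2 ** (months_ahead / doubling_months)


def months_to_reach(target: float,
                    start: float = BASES_DEC_2004,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Invert the formula: t = T * log2(target / N0)."""
    return doubling_months * math.log2(target / start)


if __name__ == "__main__":
    for months in (0, 14, 28):
        print(f"+{months:2d} months: {projected_bases(months):,.0f} bases")
```

Under this assumption the database quadruples in 28 months; any real projection would need to be checked against the actual release statistics.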
Biological databases
Like any other database: data organization for optimal analysis
Data is of different types:
Raw data (DNA, RNA, protein sequences)
Curated data (annotated DNA, RNA and protein sequences and structures, expression data)
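One way to picture the raw/curated distinction is as two record types, where a curated entry wraps a raw sequence and adds annotation. This is a minimal sketch, not the schema of any real database; the class names, fields, and the accession string are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class RawSequence:
    """Raw data: an identifier, a molecule type, and the residues themselves."""
    accession: str   # hypothetical identifier, not a real accession number
    molecule: str    # "DNA", "RNA", or "protein"
    residues: str


@dataclass
class CuratedEntry:
    """Curated data: a raw sequence plus annotation added by curators."""
    sequence: RawSequence
    description: str
    # Feature name -> (start, end), 1-based and inclusive.
    features: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def feature_residues(self, name: str) -> str:
        """Return the residues covered by a named feature."""
        start, end = self.features[name]
        return self.sequence.residues[start - 1:end]


raw = RawSequence("EXAMPLE0001", "DNA", "ATGGCCTAA")
entry = CuratedEntry(raw, "toy coding region",
                     {"start_codon": (1, 3), "stop_codon": (7, 9)})
```

The annotation is what makes the entry useful for analysis: the raw record alone cannot say where the start codon is.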
The growth of public domain bio-databases
[Figure: database number (y-axis, 0 to 800) by year (x-axis, 1999 to 2005)]
(The Molecular Biology Database Collection from Nucleic Acids Research)
Gene Ontology database
“The Gene Ontology (GO) project seeks to provide a set of structured vocabularies for specific biological domains that can be used to describe gene products in any organism.”
A few key points: GO is a “structured” vocabulary, which is really a specialized type of a “controlled” vocabulary.
The ontologies in GO are intended to describe three biological areas, “molecular function”, “biological processes” and “cellular components”.
GO was originally developed through the collaboration of the members of three model organism projects: SGD, the Saccharomyces Genome database; FlyBase, the Drosophila genome database; and MGD/GXD, the Mouse Genome Informatics databases.
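The difference between a "controlled" and a "structured" vocabulary can be sketched in a few lines: a controlled vocabulary only accepts known terms, and a structured one also records how terms relate. The example below is a toy subset, not the GO data model. The term IDs and names are real GO terms, but the parent links are simplified (intermediate terms are omitted), and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# term ID -> (name, parent term IDs). Simplified toy hierarchy:
# real GO has intermediate terms between these entries.
TERMS: Dict[str, Tuple[str, List[str]]] = {
    "GO:0003674": ("molecular_function", []),
    "GO:0008150": ("biological_process", []),
    "GO:0005575": ("cellular_component", []),
    "GO:0003824": ("catalytic activity", ["GO:0003674"]),
    "GO:0016301": ("kinase activity", ["GO:0003824"]),
    "GO:0005634": ("nucleus", ["GO:0005575"]),
}


def term_name(term_id: str) -> str:
    """Controlled vocabulary: only term IDs in the vocabulary are accepted."""
    if term_id not in TERMS:
        raise KeyError(f"{term_id} is not in the vocabulary")
    return TERMS[term_id][0]


def ancestors(term_id: str) -> Set[str]:
    """Structured vocabulary: follow parent links up to the root."""
    term_name(term_id)  # validates the ID
    seen: Set[str] = set()
    stack = list(TERMS[term_id][1])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(TERMS[current][1])
    return seen


def ontology_of(term_id: str) -> str:
    """Name of the root ontology (one of the three areas) a term belongs to."""
    roots = [t for t in ancestors(term_id) | {term_id} if not TERMS[t][1]]
    return TERMS[roots[0]][0]
```

With this structure, an annotation to "kinase activity" also implies the more general "catalytic activity", which a flat list of allowed words could not express.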
What GO is Not
1. GO is not a way to unify biological databases. Sharing nomenclature is a step toward unification, but is not, in itself, sufficient.
2. GO is not a dictated standard, mandating nomenclature across databases. Groups participate because of self-interest and cooperate to arrive at a consensus.
3. GO does not define homologies between gene products from different organisms. The use of the GO results in shared annotations for gene products from different organisms, and this may reflect an evolutionary relationship, but the shared annotation is in itself not sufficient for such a determination.
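Point 3 can be illustrated with a small sketch: grouping gene products by GO term shows which terms are shared across organisms, but the grouping says nothing about homology. The annotations and gene product names below are hypothetical placeholders, not real database records.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Product = Tuple[str, str]  # (organism, gene product name)

# Hypothetical annotations for illustration only.
ANNOTATIONS: Dict[Product, Set[str]] = {
    ("yeast", "product_A"): {"GO:0016301"},
    ("fly", "product_B"): {"GO:0016301", "GO:0005634"},
    ("mouse", "product_C"): {"GO:0005634"},
    ("mouse", "product_D"): {"GO:0003824"},
}


def products_by_term(annotations: Dict[Product, Set[str]]) -> Dict[str, Set[Product]]:
    """Invert the annotation table: GO term -> gene products annotated with it."""
    index: Dict[str, Set[Product]] = defaultdict(set)
    for product, terms in annotations.items():
        for term in terms:
            index[term].add(product)
    return dict(index)


def shared_across_organisms(annotations: Dict[Product, Set[str]]) -> Dict[str, Set[Product]]:
    """Keep only terms annotated to products from more than one organism."""
    return {
        term: products
        for term, products in products_by_term(annotations).items()
        if len({organism for organism, _ in products}) > 1
    }
```

A shared term here is only a shared description; deciding whether two products are evolutionarily related still requires sequence or phylogenetic evidence.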
Swimming in Data Sources
Database Integration