The Gene Ontology project
Jane Lomax
Ontology (for our purposes)
• “an explicit specification of some topic” – Stanford Knowledge Systems Lab
• Includes:
  – a vocabulary of terms (names)
  – defined logical relationships to each other
• Compile structured vocabularies describing aspects of molecular biology
• Describe gene products using vocabulary terms (annotation)
• Develop tools:
  • to query and modify the vocabularies and annotations
  • annotation tools for curators
GO Project Goals:
• Molecular Function: elemental activity or task
  (nuclease, DNA binding, transcription factor)
• Biological Process: broad objective or goal
  (mitosis, signal transduction, metabolism)
• Cellular Component: location or complex
  (nucleus, ribosome, origin recognition complex)
The Three Ontologies
DAG Structure
Directed acyclic graph: each child may have one or more parents
Every path from a node back to the root must be biologically accurate
The True Path Rule
True Path Rule
[Figure: chitin example. The diagram shows the GO process terms chitin metabolism, chitin biosynthesis, chitin catabolism, cuticle biosynthesis and cell wall biosynthesis (fungi), together with the new GO terms: cell wall chitin metabolism, cell wall chitin biosynthesis, cell wall chitin catabolism, cuticle chitin metabolism, cuticle chitin biosynthesis and cuticle chitin catabolism.]
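The True Path Rule can be illustrated with a small sketch. The two toy graphs below are assumptions made for illustration only: the term names come from the chitin example, but the parent links are guessed and are not taken from the real GO files. The helper `ancestors` is hypothetical. The idea shown is that every ancestor of a term must also hold for any gene product annotated to that term.

```python
from typing import Dict, Set

# Assumed "before" graph: one chitin biosynthesis term shared by flies and fungi.
OLD_PARENTS: Dict[str, Set[str]] = {
    "chitin metabolism": set(),
    "cuticle biosynthesis": set(),
    "cell wall biosynthesis": set(),
    "chitin biosynthesis": {
        "chitin metabolism", "cuticle biosynthesis", "cell wall biosynthesis",
    },
}

# Assumed "after" graph: new, more specific terms keep the two contexts apart.
NEW_PARENTS: Dict[str, Set[str]] = {
    "chitin metabolism": set(),
    "cuticle biosynthesis": set(),
    "cell wall biosynthesis": set(),
    "chitin biosynthesis": {"chitin metabolism"},
    "cell wall chitin biosynthesis": {"chitin biosynthesis", "cell wall biosynthesis"},
    "cuticle chitin biosynthesis": {"chitin biosynthesis", "cuticle biosynthesis"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every term reachable by following parent links (DAG walk)."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# A fungal gene product has no cuticle, so this ancestor would be untrue for it.
untrue_for_fungi = {"cuticle biosynthesis"}

old_conflicts = ancestors("chitin biosynthesis", OLD_PARENTS) & untrue_for_fungi
new_conflicts = ancestors("cell wall chitin biosynthesis", NEW_PARENTS) & untrue_for_fungi
```

In the assumed "before" graph a fungal annotation to chitin biosynthesis inherits a cuticle ancestor, so `old_conflicts` is non-empty. With the more specific terms, `new_conflicts` is empty.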
• is-a: subclass; a is a type of b
• part-of: physical part of (component); subprocess of (process)
Relationship Types
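One way to picture the two relationship types is as labelled edges in the graph. The sketch below is a minimal illustration, not the GO file format. The names `Relation`, `Edge` and `parents_of` are hypothetical, and the example edges are assumptions chosen to match the chitin example.

```python
from enum import Enum
from typing import List, NamedTuple


class Relation(Enum):
    IS_A = "is-a"        # subclass: child is a type of parent
    PART_OF = "part-of"  # child is a physical part or subprocess of parent


class Edge(NamedTuple):
    child: str
    relation: Relation
    parent: str


# Illustrative edges only; not copied from the real ontology.
EDGES: List[Edge] = [
    Edge("cell wall chitin biosynthesis", Relation.IS_A, "chitin biosynthesis"),
    Edge("cell wall chitin biosynthesis", Relation.PART_OF, "cell wall biosynthesis"),
    Edge("chitin biosynthesis", Relation.IS_A, "chitin metabolism"),
]


def parents_of(term: str, relation: Relation, edges: List[Edge]) -> List[str]:
    """Return the direct parents of a term reached through one relationship type."""
    return [e.parent for e in edges if e.child == term and e.relation is relation]
```

For example, `parents_of("cell wall chitin biosynthesis", Relation.PART_OF, EDGES)` returns only the part-of parent, while the same call with `Relation.IS_A` returns the is-a parent.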
• Not a way to unify biological databases
• Not a dictated standard
• Does not define evolutionary relationships
• Additional ontologies needed to model biology and experimentation
What GO is NOT:
• Names of gene products
• Protein domains
• Protein sequence features
• Phenotypes; diseases
• Anatomical terms (except as part of terms generated by cross-products)
Terms outside the Scope of GO
Advantages of GO
• Cross-species comparisons
  • already used by an increasing number of databases
• More comprehensive
  • many terms per gene product
  • not a strict hierarchy: many-to-many relationships possible
• Simplify querying
  • uses restricted vocabulary developed by curators and annotators
• Use of evidence codes
• Database object: gene or gene product
• GO term ID
• Reference: publication or computational method
• Evidence supporting annotation
Annotation Features:
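The annotation features listed above could be held in a small record type. This is a sketch under assumptions: the class `Annotation`, its field names and the helper `is_well_formed` are hypothetical and do not reproduce any official annotation file layout. The gene identifier and reference values are placeholders. The only format rule checked is that a GO term ID looks like `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass

# GO term IDs have the form "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class Annotation:
    db_object: str      # gene or gene product identifier (placeholder values below)
    go_id: str          # GO term ID
    reference: str      # publication or computational method
    evidence_code: str  # evidence supporting the annotation, e.g. "IDA" or "IEA"


def is_well_formed(annotation: Annotation) -> bool:
    """Check that every field is filled in and the GO ID has the expected shape."""
    fields_present = all(
        [annotation.db_object, annotation.reference, annotation.evidence_code]
    )
    return fields_present and bool(GO_ID_PATTERN.match(annotation.go_id))


# Placeholder record: the gene name and reference are made up for illustration.
example = Annotation(
    db_object="GENE_X",
    go_id="GO:0003677",
    reference="REF:placeholder",
    evidence_code="IDA",
)
```

A frozen dataclass is used here so that a record cannot be changed by accident after it has been created.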
DAG Structure
Annotate to any level within DAG
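Because annotation can be made at any level of the DAG, a query for a general term should also find gene products annotated to its more specific descendants. The sketch below assumes a toy three-term graph and three made-up genes. All names (`PARENTS`, `ANNOTATIONS`, `term_and_ancestors`, `genes_for`) are hypothetical.

```python
from typing import Dict, List, Set

# Toy graph (child -> parents), assumed for illustration.
PARENTS: Dict[str, Set[str]] = {
    "metabolism": set(),
    "chitin metabolism": {"metabolism"},
    "chitin biosynthesis": {"chitin metabolism"},
}

# Made-up genes, each annotated at a different depth of the graph.
ANNOTATIONS: Dict[str, Set[str]] = {
    "geneA": {"chitin biosynthesis"},  # most specific
    "geneB": {"chitin metabolism"},
    "geneC": {"metabolism"},           # most general
}


def term_and_ancestors(term: str) -> Set[str]:
    """Return the term itself plus everything above it in the graph."""
    result = {term}
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def genes_for(query_term: str) -> List[str]:
    """Genes annotated to the query term or to any of its descendants."""
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(query_term in term_and_ancestors(t) for t in terms)
    )
```

With this toy data, querying the general term "metabolism" finds all three genes, while querying "chitin biosynthesis" finds only the gene annotated at that specific level.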
• GO Annotations for:
• Human proteins
• All SWISS-PROT/TrEMBL proteins
• Annotation sets for completely sequenced proteomes
GOA: GO Annotation at EBI
• Methods:
• Manual curation
• SWISS-PROT keyword <-> GO term mapping
• EC number <-> GO term mapping
• InterPro entry <-> GO term mapping
GOA: GO Annotation at EBI
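The mapping-based methods can be pictured as a lookup from an existing label (a keyword, EC number or InterPro entry) to GO term IDs. This is a rough sketch, not GOA's actual pipeline: the mapping table, the protein identifier and the function `annotate_from_keywords` are all assumptions for illustration. The sketch assigns the evidence code "IEA" (inferred from electronic annotation) to every result.

```python
from typing import Dict, List, Tuple

# Illustrative keyword -> GO ID table; a real mapping file would be far larger.
KEYWORD_TO_GO: Dict[str, List[str]] = {
    "DNA-binding": ["GO:0003677"],
    "Nucleus": ["GO:0005634"],
}

# (protein, GO term ID, method used as reference, evidence code)
AnnotationRow = Tuple[str, str, str, str]


def annotate_from_keywords(
    protein_keywords: Dict[str, List[str]],
    mapping: Dict[str, List[str]],
    method: str = "keyword mapping",
) -> List[AnnotationRow]:
    """Turn existing keywords into electronic GO annotations via a lookup table."""
    rows: List[AnnotationRow] = []
    for protein, keywords in protein_keywords.items():
        for keyword in keywords:
            # Keywords with no mapping are skipped rather than guessed at.
            for go_id in mapping.get(keyword, []):
                rows.append((protein, go_id, method, "IEA"))
    return rows


# Hypothetical protein with one mapped and one unmapped keyword.
rows = annotate_from_keywords(
    {"P_HYPOTHETICAL": ["DNA-binding", "Unmapped keyword"]},
    KEYWORD_TO_GO,
)
```

The same shape would apply to EC number or InterPro mappings by swapping in a different lookup table and method name.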
• Browsers:
• DAG-Edit
• AmiGO
• “QuickGO” at EBI
• EP:GO browser
GO Tools
• Developmental processes: DAG cross-products with anatomy terms
• Physiological processes
• Relational database
• Expand relationship types
The Future of GO:
• FlyBase & Berkeley Drosophila Genome Project
• WormBase
• Saccharomyces Genome Database
• DictyBase
• Mouse Genome Informatics
• Compugen, Inc
• The Arabidopsis Information Resource
• Swiss-Prot/TrEMBL/InterPro
• Pathogen Sequencing Unit (Sanger Institute)
• PomBase (Sanger Institute)
• Rat Genome Database
• Genome Knowledge Base (CSHL)
• The Institute for Genomic Research
www.geneontology.org
The Gene Ontology Consortium is supported by NHGRI grant HG02273 (R01). The Gene Ontology project thanks AstraZeneca for financial support. The Stanford group acknowledges a gift from Incyte Genomics.