ONTO-ToolKit: enabling bio-ontology engineering via Galaxy
Aravind Venkatesan
Systems Biology group, Department of Biology, NTNU, Trondheim
aravind.venkatesan@bio.ntnu.no
Overview
• Galaxy
• Ontology for Life Sciences
• ONTO-Toolkit
• Use Cases
• Conclusion
• Future Directions
• Acknowledgment
• References
Ontology for Life Sciences
• Ontologies aid knowledge formalisation and machine interoperability
• The success of ontologies in the Life Sciences is marked by the widespread use of the Gene Ontology (GO) [1]
• Application ontologies, such as the Cell Cycle Ontology [2]
• The OBO flat file format (OBOF) [3] and the Web Ontology Language (OWL) [4] have gained wide acceptance as knowledge representation languages
ONTO-Toolkit
• A collection of tools for managing ontologies represented in the OBO file format within the Galaxy environment
• The tools are wrappers for commonly used functions provided by ONTO-PERL [5]
• ONTO-PERL was developed as part of the Semantic Systems Biology (SSB) initiative [6]
• ONTO-PERL (an OBOF-centred PERL API) comprises an extensible set of object-oriented PERL modules
• The modules provide an organised set of subroutines for working with ontologies and are fully compatible with the current OBO specification (ver. 1.2)
• The latest version (ver. 1.22) of ONTO-PERL can be downloaded directly from CPAN: http://search.cpan.org/dist/ONTO-PERL/
ONTO-PERL: An API supporting the development and analysis of bio-ontologies. Antezana E, Egana M, De Baets B, Kuiper M, Mironov V. Bioinformatics 2008; doi: 10.1093/bioinformatics/btn042
Examples of ONTO-PERL functionalities

Script                          Functionality
get_ancestor_terms.pl           Collects the ancestor terms (list of IDs) of a given term (existing ID) in the given OBO ontology.
get_child_terms.pl              Collects the child terms (list of term IDs and their names) of a given term (existing ID) in the given OBO ontology.
get_descendent_terms.pl         Collects the descendent terms (list of IDs) of a given term (existing ID) in the given OBO ontology.
get_subontology_from.pl         Extracts a sub-ontology (in OBO format) of a given ontology, with the given term ID as the root.
get_intersection_ontology.pl    Provides the intersection of the given ontologies (in OBO format).
obo2owl.pl                      OBO to OWL translator.
obo2rdf.pl                      OBO to RDF translator.
obo_trimming.pl                 Trims a given branch of an OBO ontology.
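The kind of traversal behind scripts such as get_ancestor_terms.pl and get_descendent_terms.pl can be illustrated with a small, self-contained Python sketch. It is not ONTO-PERL code and does not use its API: the parser, the function names (parse_is_a, ancestors, descendants) and the EX: identifiers are assumptions made for illustration, and only is_a links in [Term] stanzas are followed.

```python
from collections import defaultdict

# Toy OBO text with hypothetical IDs (EX:...); not taken from any real ontology.
TOY_OBO = """\
format-version: 1.2

[Term]
id: EX:0001
name: root

[Term]
id: EX:0002
name: middle
is_a: EX:0001 ! root

[Term]
id: EX:0003
name: leaf
is_a: EX:0002 ! middle
"""


def parse_is_a(obo_text):
    """Return {term_id: set(parent_ids)} built from is_a tags of [Term] stanzas."""
    parents = defaultdict(set)
    current = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest here.
            in_term = (line == "[Term]")
            current = None
        elif in_term and line.startswith("id:"):
            current = line[len("id:"):].strip()
            parents[current]  # register the term even if it has no parents
        elif in_term and current and line.startswith("is_a:"):
            # Drop the trailing "! name" comment and keep only the parent ID.
            parent = line[len("is_a:"):].split("!")[0].strip()
            parents[current].add(parent)
    return dict(parents)


def ancestors(parents, term_id):
    """Collect every term reachable by following is_a links upwards."""
    seen = set()
    stack = list(parents.get(term_id, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def descendants(parents, term_id):
    """Collect every term that has term_id among its ancestors."""
    return {t for t in parents if term_id in ancestors(parents, t)}
```

For the toy text, ancestors(parse_is_a(TOY_OBO), "EX:0003") yields the two terms above the leaf, and descendants(..., "EX:0001") yields the two terms below the root.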
ONTO-Toolkit - GALAXY
Define arguments
Use Cases
Motivation:
• To demonstrate the functionality of ONTO-Toolkit in Galaxy
• To demonstrate the usefulness of ontology engineering in the biological domain
Use Case I:
• To investigate similarities between given molecular functions
• Collect all the upstream terms (ancestors) of two given molecular function terms and identify their common ancestor terms
Chosen Ontology: Cell Cycle Ontology
Chosen Terms:
Term 1: id: CCO:F0000004, name: trans-hexaprenyltranstransferase activity
Term 2: id: CCO:F0000820, name: homogentisate 1,2-dioxygenase activity
Use Case I
Uploading an OBO ontology file, e.g. cco_S_pombe
Molecular function Term ID: CCO:F0000004
Conti…
• This step is repeated for the second term - CCO:F0000820
List of ancestor terms for the given Molecular function Term 1
List of ancestor terms for Term 2
Common ancestor terms
Gets the overlapping ancestor terms
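The overlap step of Use Case I amounts to intersecting two ancestor sets. The Python sketch below shows the idea on an invented hierarchy; it is an illustration under stated assumptions, not the ONTO-Toolkit implementation. The EX: identifiers are hypothetical (they are not CCO terms), and the helper names ancestors, depth and common_ancestors are made up for this example.

```python
# Hypothetical is_a hierarchy: child ID -> set of parent IDs.
# The EX: identifiers are invented for illustration; they are not CCO terms.
PARENTS = {
    "EX:0001": set(),          # root
    "EX:0002": {"EX:0001"},
    "EX:0003": {"EX:0002"},
    "EX:0004": {"EX:0002"},
    "EX:0005": {"EX:0003"},    # plays the role of Term 1
    "EX:0006": {"EX:0004"},    # plays the role of Term 2
}


def ancestors(parents, term_id):
    """Collect every term reachable by following is_a links upwards."""
    seen = set()
    stack = list(parents.get(term_id, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def depth(parents, term_id):
    """Length of the longest is_a path from term_id up to a root."""
    direct = parents.get(term_id, ())
    return 0 if not direct else 1 + max(depth(parents, p) for p in direct)


def common_ancestors(parents, term_a, term_b):
    """Ancestors shared by both terms, most specific (deepest) first."""
    shared = ancestors(parents, term_a) & ancestors(parents, term_b)
    return sorted(shared, key=lambda t: (-depth(parents, t), t))


print(common_ancestors(PARENTS, "EX:0005", "EX:0006"))
```

Sorting by depth makes it easy to see whether the shared terms sit only near the root, which is the kind of evidence the use case relies on when judging how closely two functions are related.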
Use Case II
• Identifying overlapping annotations for a given pair of distinct biological process terms
Chosen Ontology: Cell Cycle Ontology
Chosen Terms:
Term 1: id: CCO:P0000005, name: cell cycle checkpoint
Term 2: id: CCO:P0000069, name: mitosis
Gets the sub-ontology for the given terms
Generated sub-ontology of Term 1: CCO:P0000005
Generated sub-ontology of Term 2: CCO:P0000069
Gets the intersection of the two sub-ontologies
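The two steps of Use Case II (extract a sub-ontology per term, then intersect the results) can be sketched in Python as follows. This is a simplified illustration, not the behaviour of get_subontology_from.pl or get_intersection_ontology.pl: it assumes an ontology can be reduced to a map from term ID to is_a parents, it treats the intersection as the terms and links present in both sub-ontologies, and all EX: identifiers and function names are hypothetical.

```python
# Hypothetical is_a hierarchy: child ID -> set of parent IDs.
# EX:0005 has two parents, so it falls under both EX:0002 and EX:0003.
PARENTS = {
    "EX:0001": set(),
    "EX:0002": {"EX:0001"},
    "EX:0003": {"EX:0001"},
    "EX:0004": {"EX:0002"},
    "EX:0005": {"EX:0002", "EX:0003"},
    "EX:0006": {"EX:0005"},
}


def subontology(parents, root_id):
    """Return the parent map restricted to root_id and all its descendants."""
    # Invert the map so the hierarchy can be walked downwards from the root.
    children = {}
    for child, direct in parents.items():
        for parent in direct:
            children.setdefault(parent, set()).add(child)
    keep = {root_id}
    stack = [root_id]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in keep:
                keep.add(child)
                stack.append(child)
    # Keep only links whose parent is also inside the sub-ontology.
    return {t: {p for p in parents.get(t, set()) if p in keep} for t in keep}


def intersection(onto_a, onto_b):
    """Terms present in both ontologies, with the links they share."""
    shared = onto_a.keys() & onto_b.keys()
    return {t: onto_a[t] & onto_b[t] & shared for t in shared}


sub_a = subontology(PARENTS, "EX:0002")
sub_b = subontology(PARENTS, "EX:0003")
overlap = intersection(sub_a, sub_b)
print(sorted(overlap))
```

A non-empty overlap indicates terms that sit beneath both chosen roots, which is the signal the use case looks for between two distinct processes.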
Conclusion
• Use Case I: the results provide evidence that the two molecular functions are unrelated, as they share only high-level terms.
• Use Case II: the results suggest a possible overlap between two distinct biological processes.
• ONTO-Toolkit provides rich, ontology-driven solutions within the Galaxy framework.
Future Directions
• Provide an interface to perform SPARQL queries within Galaxy
• Provide a visualisation module
Acknowledgment
• Dr. Erick Antezana, NTNU
• Dr. Vladimir Mironov, NTNU
• Dr. Martin Kuiper, NTNU
References
1. M. Ashburner, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 25:25–29, May 2000.
2. The Cell Cycle Ontology, http://www.semantic-systems-biology.org/cco
3. The OBO Flat File Format Specification (ver.1.2), http://www.geneontology.org/GO.format.obo-1_2.shtml
4. OWL Web Ontology Language, http://www.w3.org/TR/owl-semantics/
5. ONTO-PERL: An API supporting the development and analysis of bio-ontologies. Antezana E, Egana M, De Baets B, Kuiper M, Mironov V. Bioinformatics 2008; doi: 10.1093/bioinformatics/btn042
6. Semantic Systems Biology, http://www.semantic-systems-biology.org/