Semantic Web, Moby, wikis, crowd sourcing, NLP, etc.
let a million flowers (and weeds) bloom
to create integration, rely on (automatically generated?) post hoc mappings
The result is noisy
How to create broad-coverage semantic annotation systems for biomedicine?
for science
develop high quality annotation resources in a collaborative, community effort
creating an evolutionary path towards improvement of terminologies of the sort we find elsewhere in science
Foundry alternative: prospective standardization
science basis of the GO: trained experts curating peer-reviewed literature
different model organism databases employ scientific curators who use the experimental observations reported in the biomedical literature to associate GO terms with gene products in a coordinated way
The methodology of annotations
A set of standardized textual descriptions of
cellular locations
molecular functions
biological processes
used to annotate the entities represented in the major biochemical databases
thereby creating integration across these databases and making them available to semantic search
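A minimal sketch of how shared GO terms create integration across databases, assuming a simplified annotation record. The `Annotation` class, the database names and the gene product names are hypothetical illustrations, not the format of any real annotation file; only the two GO identifiers are real terms.

```python
from collections import defaultdict
from dataclasses import dataclass

# The three kinds of description named on the slide.
ASPECTS = ("cellular_component", "molecular_function", "biological_process")


@dataclass(frozen=True)
class Annotation:
    """One association between a gene product and a GO term (simplified)."""
    database: str      # source database (hypothetical names below)
    gene_product: str  # database-local identifier (hypothetical)
    go_id: str         # GO term identifier
    aspect: str        # one of ASPECTS

    def __post_init__(self):
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect}")


def index_by_term(annotations):
    """Group gene products from all databases under the GO term they share."""
    index = defaultdict(set)
    for a in annotations:
        index[a.go_id].add((a.database, a.gene_product))
    return index


# Made-up records. GO:0005634 is "nucleus"; GO:0003677 is "DNA binding".
annotations = [
    Annotation("db_mouse", "geneA", "GO:0005634", "cellular_component"),
    Annotation("db_yeast", "geneB", "GO:0005634", "cellular_component"),
    Annotation("db_yeast", "geneC", "GO:0003677", "molecular_function"),
]

# A search on one term returns entries from both databases.
hits = index_by_term(annotations)["GO:0005634"]
```

Because both databases use the same term identifier, the search needs no database-specific mapping step.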
and also
need to extend the GO by engaging ever broader community support for the addition of new terms and for the correction of errors
need to extend the methodology to other domains, including clinical domains
this requires that we establish common rules governing best practices for creating ontologies and for using these in annotations
apply these rules to create a complete suite of orthogonal interoperable biomedical reference ontologies
shared portal + low regimentation
http://obo.sourceforge.net NCBO BioPortal
2003
A prospective standard designed to guarantee interoperability of ontologies from the very start (in contrast to post hoc mapping)
established March 2006
12 initial candidate OBO ontologies – focused primarily on basic science domains
several being constructed ab initio
by influential consortia who have the authority to impose their use on large parts of the relevant communities.
Building out from the original GO
(grid: rows = GRANULARITY; columns = RELATION TO TIME)
ORGAN AND ORGANISM
  CONTINUANT, INDEPENDENT: Organism (NCBI Taxonomy?); Anatomical Entity (FMA, CARO)
  CONTINUANT, DEPENDENT: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  OCCURRENT: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  CONTINUANT, INDEPENDENT: Cell (CL); Cellular Component (FMA, GO)
  CONTINUANT, DEPENDENT: Cellular Function (GO)
MOLECULE
  CONTINUANT, INDEPENDENT: Molecule (ChEBI, SO, RnaO, PrO)
  CONTINUANT, DEPENDENT: Molecular Function (GO)
  OCCURRENT: Molecular Process (GO)
OBO Foundry = a subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles reflecting best practice in ontology development designed to ensure
tight connection to the biomedical basic sciences
compatibility
interoperability, common relations
formal robustness
support for logic-based reasoning
The OBO Foundry http://obofoundry.org/
CRITERIA
The ontology is OPEN and available to be used by all.
The ontology is in, or can be instantiated in, a COMMON FORMAL LANGUAGE.
The developers of the ontology agree in advance to COLLABORATE with developers of other OBO Foundry ontologies where domains overlap.
UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement.
ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.
for science
if we annotate a database or body of literature with one high-quality biomedical ontology, we should be able to add annotations from a second such ontology without conflicts
AND WITHOUT THE NEED FOR MAPPINGS
orthogonality of ontologies implies additivity of annotations
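The additivity claim can be illustrated with a toy model, assuming each annotation is an `(entity, term_id)` pair and each ontology owns its own identifier prefix. The function names and the sample entity are hypothetical; GO:0005634 ("nucleus") and CL:0000540 ("neuron") are real term identifiers used only as examples.

```python
def prefix(term_id):
    """Return the ontology prefix of an ID such as 'GO:0005634'."""
    return term_id.split(":", 1)[0]


def add_annotations(existing, new):
    """Add annotations from a second ontology to an annotated resource.

    Toy model: if the two ontologies are orthogonal, their identifier
    prefixes do not overlap, so adding is a plain set union with no
    mapping table and no conflict resolution.
    """
    overlap = {prefix(t) for _, t in existing} & {prefix(t) for _, t in new}
    if overlap:
        # Overlap means both sets draw on the same ontology, so this is
        # not the "second ontology" case the slide describes.
        raise ValueError(f"annotation sets share prefixes: {sorted(overlap)}")
    return existing | new


# "sample1" is a made-up entity name.
go_annotations = {("sample1", "GO:0005634")}
cl_annotations = {("sample1", "CL:0000540")}

merged = add_annotations(go_annotations, cl_annotations)
```

Nothing in the original annotations is rewritten or reinterpreted when the second set is added; that is the sense of "additivity" sketched here.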
CRITERIA
IDENTIFIERS: The ontology possesses a unique identifier space within OBO.
VERSIONING: The ontology provider has procedures for identifying distinct successive versions to ensure BACKWARDS COMPATIBILITY with annotation resources already in common use.
The ontology includes TEXTUAL DEFINITIONS and where possible equivalent formal definitions of its terms.
CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content.
DOCUMENTATION: The ontology is well-documented.
USERS: The ontology has a plurality of independent users.
COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.*
* Smith et al., Genome Biology 2005, 6:R46
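As an illustration of the IDENTIFIERS criterion, the sketch below checks that each identifier prefix belongs to exactly one ontology. It assumes identifiers of the shape `PREFIX:digits` (for example `GO:0005634`); the `PrefixRegistry` class is hypothetical and does not describe how OBO actually administers identifier spaces.

```python
import re

# Assumed identifier shape: letters, a colon, then digits.
ID_PATTERN = re.compile(r"^([A-Za-z]+):(\d+)$")


def split_id(term_id):
    """Split 'GO:0005634' into ('GO', '0005634'); reject other shapes."""
    match = ID_PATTERN.match(term_id)
    if match is None:
        raise ValueError(f"not a PREFIX:digits identifier: {term_id!r}")
    return match.group(1), match.group(2)


class PrefixRegistry:
    """Hypothetical registry: one prefix, one owning ontology."""

    def __init__(self):
        self._owners = {}

    def register(self, prefix, ontology):
        # setdefault keeps the first owner; a different second owner
        # would break uniqueness of the identifier space.
        owner = self._owners.setdefault(prefix, ontology)
        if owner != ontology:
            raise ValueError(f"prefix {prefix} already belongs to {owner}")

    def owner_of(self, term_id):
        prefix, _ = split_id(term_id)
        return self._owners[prefix]


registry = PrefixRegistry()
registry.register("GO", "Gene Ontology")
registry.register("CHEBI", "Chemical Entities of Biological Interest")
owner = registry.owner_of("GO:0005634")
```

With unique prefixes, any annotation can be traced back to the ontology that defines its term without consulting a mapping.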
Extension Strategy – Downward Population
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO)
domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Disease, Disorder and Treatment (OGMS); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); CHEBI; Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)
OGMS
Cardiovascular Disease Ontology
Genetic Disease Ontology
Cancer Disease Ontology
Immune Disease Ontology
Environmental Disease Ontology
Oral Disease Ontology
Infectious Disease Ontology
…
BFO, OGMS, and IDO
BFO: Material Entity; Disposition; Process
OGMS: Disorder; Disease; Disease Course
IDO: Infection; Infectious Disease; Infectious Disease Course
IDO Staph Aureus
IDO MRSA
IDO Australian MRSA
IDO Australian Hospital MRSA
…
How IDO evolves
CORE and SPOKES: Domain ontologies
SEMI-LATTICE: By subject matter experts in different communities of interest.
IDOCore, IDOSa, IDOHumanSa, IDORatSa, IDOStrep, IDORatStrep, IDOHumanStrep, IDOMRSA, IDOHumanBacterial, IDOAntibioticResistant, IDOMAL, IDOHIV, IDOFLU
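The core-and-spokes arrangement can be sketched as a small extension graph. The module names come from the slide, but the `EXTENDS` edges below are assumptions made for illustration; the slide text does not state which module extends which. The point is only that in a semi-lattice a module may extend more than one parent while everything still traces back to the core.

```python
# Hypothetical "extends" edges: module -> modules it builds on.
# Names are from the slide; the structure is assumed.
EXTENDS = {
    "IDOCore": [],
    "IDOSa": ["IDOCore"],
    "IDOAntibioticResistant": ["IDOCore"],
    "IDOHumanBacterial": ["IDOCore"],
    "IDOHumanSa": ["IDOSa", "IDOHumanBacterial"],
    "IDOMRSA": ["IDOSa", "IDOAntibioticResistant"],
}


def ancestors(module, extends=EXTENDS):
    """Return every module that `module` builds on, directly or indirectly."""
    seen = set()
    stack = list(extends[module])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(extends[parent])
    return seen


mrsa_ancestors = ancestors("IDOMRSA")
```

Under these assumed edges, `IDOMRSA` has two direct parents, and both lead back to `IDOCore`, so terms defined in the core are shared by every spoke.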
Population-level ontologies
(grid: rows = GRANULARITY; columns = RELATION TO TIME)
COMPLEX OF ORGANISMS
  CONTINUANT, INDEPENDENT: Family, Community, Deme, Population
  CONTINUANT, DEPENDENT: Population Phenotype
  OCCURRENT: Population Process
ORGAN AND ORGANISM
  CONTINUANT, INDEPENDENT: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  CONTINUANT, DEPENDENT: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  OCCURRENT: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  CONTINUANT, INDEPENDENT: Cell (CL); Cellular Component (FMA, GO)
  CONTINUANT, DEPENDENT: Cellular Function (GO)
MOLECULE
  CONTINUANT, INDEPENDENT: Molecule (CHEBI, SO, RNAO, PRO)
  CONTINUANT, DEPENDENT: Molecular Function (GO)
  OCCURRENT: Molecular Process (GO)
Shown alongside the grid: Environment (EnvO, EO)
Successes
The OBO Foundry strategy for ontology collaboration and reuse is being replicated in major grant-funded projects
OBO Foundry approach extended into other domains
FUNDED
NIF Standard: Neuroscience Information Framework
ISF Ontologies: Integrated Semantic Framework for Clinical and Translational Science
ImmPort: Immunology Database and Analysis Portal
OGMS and Extensions: Ontology for General Medical Science
IDO Consortium: Infectious Disease Ontology
cROP: Common Reference Ontologies for Plants
Successes
Huge and continuing expansion in the awareness of the need for re-using ontologies
Huge and continuing expansion in ontology software created to support Foundry efforts (Ontobee, MIREOT, …)
Current status
Coordinating editors: Michael Ashburner, Chris Mungall, Suzanna Lewis, Alan Ruttenberg, Richard Scheuermann, Barry Smith
New operations committee
• https://code.google.com/p/obo-foundry-operations-committee/wiki/OutreachWG
• Mathias Brochhausen, Melanie Courtot, Melissa Haendel, Janna Hastings, Chris Mungall, Alan Ruttenberg, Ramona Walls
Ontologies admitted to full membership after first phase of reviews
• CHEBI: Chemical Entities of Biological Interest
• GO: Gene Ontology
• PATO: Phenotypic Quality Ontology
• PRO: Protein Ontology
• XAO: Xenopus Anatomy Ontology
• ZFA: Zebrafish Anatomy Ontology
Current status
Next round of candidates for review
• OGMS: Ontology for General Medical Science
• OBI: Ontology for Biomedical Investigations
• CL: Cell Ontology
• IDO: Infectious Disease Ontology