BASIC FORMAL ONTOLOGY
Robert Arp, Ph.D.
Ontology Research Group (ORG)
www.org.buffalo.edu
National Center for Biomedical Ontology (NCBO)
www.bioontology.org
(1) Philosophical Ontology
“...I can fit wholesale evolution and a creating god into my ontology without contradiction.”
“...just because it has mental existence doesn’t mean it has ontological existence.”
- Ontos (being, existence) + Logos (word, account, explanation)
- The study of what is, of the kinds and structures of objects, properties, events, processes, and relations in every area of reality
- “The branch of Metaphysics that studies the nature of existence.” Random House College Dictionary
To a certain extent, all of us are Philosophical Ontologists in that we naturally and automatically categorize any and all things in reality so as to understand, explain, control, dominate, and navigate reality.
(2) Domain Ontology
“...I’m working on an ontology for annelids.”
“...the Gene Ontology has data on that HOX gene.”
- Representation of the entities and relations existing within a particular domain of reality such as medicine, geography, ecology, or law
  E.G., Gene Ontology (GO), Foundational Model of Anatomy (FMA), Environment Ontology (EnvO)
- Opposed to ontology in the philosophical sense, which has all of reality as its subject matter
- Ideally, provides a controlled, structured vocabulary to annotate data in order to make it more easily searchable by human beings and processable by computers
ONTOLOGY:
“a representational artifact, comprising a taxonomy as its main part, whose representational units are intended to designate some combination of universals, defined classes, and certain relations between them.” *
* Smith, B., Kusnierczyk, W., Schober, D., & Ceusters, W. (2006). Towards a reference terminology for ontology research and development in the biomedical domain. Proceedings of KR-MED 2006, 1, 1-14.
REALISM-BASED ONTOLOGY:
“…built out of representational units which are intended to refer exclusively to (real) universals, and corresponds to that part of the content of a scientific theory that is captured by its constituent general terms and the interrelations between the universals denoted by these terms.” (Smith et al., 2006)
Method of Ontological Realism
• Find out what the world is like by doing science, talking to other scientists, and working continuously with them to ensure that you don’t go wrong
• Build representations adequate to this world, not to some simplified model in your laptop
Informatics: The science of information collection, categorization, management, storage, processing, retrieval, and dissemination.
“…the fundamental role of a domain ontology is to support knowledge sharing and reuse.” *
* Domingue, J., & Motta, E. (1999). A knowledge-based news server supporting ontology-driven story enrichment and knowledge retrieval. In D. Fensel & R. Studer (Eds.), Knowledge acquisition, modeling and management (pp. 104-112). Berlin: Springer.
Domain ontology contrasted with:
- Database
- Rule-Based Language
- Thesaurus
- Glossary
- Catalogue
- Inventory
- Axiomatic Theory
- Simple Taxonomy
Ontology characterized as a hybrid of:
- Taxonomy
- Axiomatic Theory
The Central Distinction
universal vs. instance
(catalogue vs. inventory)
(science text vs. diary)
(human being vs. George Bush)
(mouse brain vs. Mickey Mouse’s brain)
(cytoplasm vs. this cytoplasm under the scope)
Example Domain Ontology
Mechanism
  Doorbell
  Thermometer
  Clock
  Trap
    Animal Trap
      Rodent Trap
        Mouse Trap
          Spring-Loaded Bar Mouse Trap
          Electric Mouse Trap
          Glue Mouse Trap
        Rat Trap
      Insect Trap
      Bear Trap
      Fish Trap
    Human Trap
Example Domain Ontology
Beverage
  Alcoholic Beverage
    Beer
      Ale
        Bitter Ale
        Mild Ale
        Sweet Ale
      Lager
      Lambic Beer
    Wine
    Whisk(e)y
  Non-Alcoholic Beverage
    Soda
    Coffee
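The beverage hierarchy on this slide can be sketched in code as a set of is_a links. This is a minimal illustration, not part of the original deck: the `PARENT` dictionary and the `ancestors` and `is_a` helpers are hypothetical names, and the parent assigned to each term is an assumption read off the order of the slide's labels.

```python
# Hypothetical encoding of the beverage example:
# each term maps to its single is_a parent (assumed from the slide layout).
PARENT = {
    "Alcoholic Beverage": "Beverage",
    "Non-Alcoholic Beverage": "Beverage",
    "Beer": "Alcoholic Beverage",
    "Wine": "Alcoholic Beverage",
    "Whisk(e)y": "Alcoholic Beverage",
    "Ale": "Beer",
    "Lager": "Beer",
    "Lambic Beer": "Beer",
    "Bitter Ale": "Ale",
    "Mild Ale": "Ale",
    "Sweet Ale": "Ale",
    "Soda": "Non-Alcoholic Beverage",
    "Coffee": "Non-Alcoholic Beverage",
}


def ancestors(term):
    """Return every universal that `term` is_a, nearest parent first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_a(child, parent):
    """True if `child` is (transitively) a kind of `parent`."""
    return parent in ancestors(child)


print(ancestors("Bitter Ale"))
# → ['Ale', 'Beer', 'Alcoholic Beverage', 'Beverage']
print(is_a("Lager", "Beverage"), is_a("Coffee", "Beer"))
# → True False
```

Storing one parent per term keeps the structure a tree, so every term has exactly one path to the root.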
BORROWED FROM: http://www.bio.davidson.edu/courses/genomics/2006/martens... 3DN
A Gene Ontology Example:
Glutathione
Scientific Experiment Ontology
http://technology.newscientist.com/article/dn9288-translator-lets-computers-understand-experiments-.html
Part of a Lipid Ontology
Entity
Polyatomic Entity
Biological Entity
Biomolecules
Small Molecules
Lipid
LC Fatty Acyls
LC Docosanoids
LC Eicosanoids
LC Lipoxins
LC Hepoxilins
LC Cluvalones
Being developed by: Low, H-S., Alexander, G., Baker, C., & Wenk, M. (2008). Lipid ontology.
Available at: http://MUS.12R.lipidontology.biochem.nus.edu.sg/lipidversion3.owl.
ONTOLOGY | SCOPE | URL | CUSTODIANS
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofmann
Chemical Entities of Biological Interest (ChEBI) | molecular entities which are products of nature or synthetic products used to intervene in the processes of living organisms | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms (initially mouse, fly, zebrafish) | (under development) | Melissa Haendel, David Sutherland
Disease Ontology (DO) (Candidate member) | human diseases and associated conditions | diseaseontology.sourceforge.net | Rex Chisholm, Warren Kibbe, John Osborne, Wendy Wolf
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Gene Ontology (GO) | attributes of gene products (divided into: cellular component, molecular function, biological process) in all organisms | www.geneontology.org | Gene Ontology Consortium
Ontology for Biomedical Investigations (OBI) | design, protocol, instrumentation, data and analysis applied in functional genomics investigations | fugo.sf.net | OBI/FuGO Working Group
Phenotypic Quality Ontology (PATO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications classified on the basis of evolutionary relationships | pir.georgetown.edu/pro | Cathy Wu, Darren Natale
Relation Ontology (RO) | relations between universals and instances in biomedical ontologies | obofoundry.org/ro | Chris Mungall
RNA Ontology (RnaO) | three-dimensional structures and homologous sequence alignments and associated attributes and processes | (under development) | Ontology Consortium
Sequence Ontology (SO) | features and properties of nucleic sequences | www.sequenceontology.org | Karen Eilbeck
Because of:
- Varying perspectives, methodologies, ideas, and data
- Extraordinary depth, magnitude of data…
- Overwhelmed with data and information…
- More information than humans can handle…
A couple of problems result
(there are more…)
(1)
- How do you find your data?
- How do you understand the significance of the data you collected 3 years earlier?
- How do you reason with the data when you find it?
- How do you integrate your data with other people’s data?
(2) THE SILO EFFECT
Many domains that are non-interoperable, non-communicative, isolated, insulated, encapsulated “silos” of data
Informatics problems that contribute to SILO EFFECT:
- Dumb Beast
- Nonsense-In-Nonsense-Out
- Computer Solipsism
- Human Idiosyncrasy
- Tower of Babel
- Pressures from Insurance Companies
- Legal Pressures
** Human Error: Incorrect Thinking
THE SILO EFFECT
Three Levels to Keep Straight
• Level 1: The entities in reality, both instances and universals
• Level 2: Cognitive representations of this reality on the part of scientists
• Level 3: Publicly accessible concretizations of these cognitive representations in textual, graphical, or computational representational artifacts
** Human Error: Incorrect Thinking
PROBLEM: DE-SILOING all of this domain data so that it may be found (!), queried effectively, shared, and re-used…
SOLUTION: Formal Ontology
(3) Formal Ontology
“...This upper-level ontology should help organize these domains.”
“...IEEE just came out with the latest version of SUMO that may solve some of these problems.”
Assists in making communication between and among domain ontologies possible by providing:
- Common language
- Common formal framework for reasoning
Concerns, at least:
- Adoption of a set of basic categories of objects
- Discerning what kinds of entities fall within each of these categories of objects
- Determining what relationships hold between the different categories in the domain ontology
Formal Ontology is like a “backbone” or “spine” making communication, interoperability, and optimal dissemination of information possible between and among domain ontologies
[Figure: “From this…” and “To this…”: two diagrams of “Data” sources, each labeled “Formal Ontology, E.G., Basic Formal Ontology”]
Program Announcement Number: PAR-07-425
Title: Data Ontologies for Biomedical Research (R01)
- NIH Blueprint for Neuroscience Research (http://neuroscienceblueprint.nih.gov/)
- National Cancer Institute (NCI) (http://www.cancer.gov)
- National Center for Research Resources (NCRR) (http://www.ncrr.nih.gov/)
- National Eye Institute (NEI) (http://www.nei.nih.gov/)
- National Heart Lung and Blood Institute (NHLBI) (http://http.nhlbi.nih.gov)
- National Human Genome Research Institute (NHGRI) (http://www.genome.gov)
- National Institute on Alcohol Abuse and Alcoholism (NIAAA) (http://www.niaaa.nih.gov/)
- National Institute of Biomedical Imaging and Bioengineering (NIBIB) (http://www.nibib.nih.gov/)
- National Institute of Child Health and Human Development (NICHD) (http://www.nich.nih.gov/)
- National Institute on Drug Abuse (NIDA) (http://www.nida.nih.gov/)
- National Institute of Environmental Health Sciences (NIEHS) (http://www.niehs.nih.gov/)
- National Institute of General Medical Sciences (NIGMS) (http://www.nigms.nih.gov/)
- National Institute of Mental Health (NIMH) (http://www.nimh.nih.gov/)
- National Institute of Neurological Disorders and Stroke (NINDS) (http://www.ninds.nih.gov/)
- National Institute of Nursing Research (NINR) (http://www.ninr.nih.gov)
PAR-07-425 Purpose
“Optimal use of informatics tools… and (data) resources depends upon explicit understandings of concepts related to the data upon which they compute.”
“This is typically accomplished by a tool or resource adopting a formal controlled vocabulary and ontology.”
EXAMPLES:
Basic Formal Ontology (BFO)
Standard Upper Merged Ontology (SUMO)
Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE)
BFO is an ontology to support integration of scientific research data
SUMO contains many portions which are more properly conceived of as domain ontologies (airports, bacteria)
DOLCE is tilted towards objects of general thought and communication (fiction, mythology)
An Ontology of Ontologies
Taxonomy
Taxonomy with Formal Rules = Ontology
Philosophical Ontology (E.G., Porphyrian Tree)
Domain Ontology
- Domain Reference Ontology (E.G., Table of the Elements, Linnean, GO, FMA)
- Domain Application Ontology (E.G., Amazon.com, Library of Congress catalogue)
Formal Ontology
- Formal Reference Ontology (E.G., SUMO, DOLCE, BFO)
- Formal Application Ontology (E.G., Friend of a Friend (FOAF))
Simple Taxonomy (E.G., Thesaurus)
BFO: General Preliminaries
- Upper-Level, Top-Level, Formal...
“...applicable to all domains of objects” *
* Barry Smith and David Woodruff Smith, The Cambridge Companion to Husserl, ed. Barry Smith and David Woodruff Smith (Cambridge: Cambridge University Press, 1995), 28.
EMBRACES
- Perspectivalism
- Granularity
- Fallibility
REALISM-BASED ONTOLOGY
Universals
(1) real objects, substances, endurants, or continuants
- SNAP shots of reality
(2) real processes, activities, perdurants, or occurrents
- SPAN of time
Relations
is_a, part_of, has_participant
continuants vs. occurrents
In classifying parts of reality, we keep track of these two different kinds of entities in two different ways
[Figure: continuant (substance, object) and occurrent (process) shown against a time axis]
continuant entities
- have continuous existence in time
- preserve their identity through change
- exist in toto, if they exist at all
occurrent entities
- have temporal parts
- unfold themselves phase by phase
- exist only in their phases/stages
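The continuant/occurrent contrast can be sketched as two data types. This is only an illustration under stated assumptions, not BFO's own formalization: the class names mirror the slide, while the fields, the `add_phase` and `duration` helpers, and the numeric values are hypothetical. The data loosely follow the deck's sentence about a tumor developing in the lung over 25 years.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Continuant:
    """Exists in toto whenever it exists; keeps its identity while its qualities change."""
    name: str
    qualities: Dict[str, float] = field(default_factory=dict)


@dataclass
class Occurrent:
    """Unfolds phase by phase; exists only in its temporal parts."""
    name: str
    participants: List[Continuant]
    phases: List[Tuple[int, str]] = field(default_factory=list)  # (year, label)

    def add_phase(self, year: int, label: str) -> None:
        self.phases.append((year, label))

    def duration(self) -> int:
        """Span in years between the earliest and latest recorded phase."""
        years = [year for year, _ in self.phases]
        return max(years) - min(years) if years else 0


tumor = Continuant("tumor", {"diameter_mm": 2.0})  # hypothetical starting size
lung = Continuant("lung")
development = Occurrent("tumor development", participants=[tumor, lung])

development.add_phase(0, "onset")
tumor.qualities["diameter_mm"] = 40.0  # the quality changes; it is still the same tumor
development.add_phase(25, "late stage")

print(development.duration())  # → 25
```

The tumor is one object updated in place, whereas the development process is only ever available as an accumulating list of phases.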
Two Orthogonal, Independent, Complementary Perspectives
stocks and flows
commodities and services
product and process
anatomy and physiology
The tumor developed in the lung over 25 years.
continuants: substances, things, objects
occurrents: processes, activities
BFO: The Very Top
continuant
  independent continuant: objects, fiat objects, sites
  dependent continuant: qualities, functions, roles, dispositions
occurrent (always dependent on one or more independent continuants): processes, fiat process parts, process contexts
Example:
- independent continuant (object): mice
- dependent continuant (quality): that are black
- occurrent (process): have drugs injected in them
Example:
- independent continuant (object): LSD
- dependent continuant (quality): that is hallucinogenic
- occurrent (process): is digested in the blood stream
Example:
- independent continuant (object): kidney
- dependent continuant (function): whose function is to filter urine
- occurrent (process): filters urine
Example:
- independent continuant (object): conjunctiva
- dependent continuant (disposition): which is affected with conjunctivitis
- occurrent (process): engages in edema
Example:
- independent continuant (site): inner area, spare tire
- dependent continuant (role): acts as reservoir
- occurrent (process): colonization of mosquitoes
Three Dichotomies
• continuant vs. occurrent
• dependent vs. independent
• instance vs. universal
universals exist in reality through their instances
continuant (object)
- independent continuant (molecule, cell, organ, organism)
- dependent continuant (quality, function, disease)
occurrent (process)
- functioning, side-effect, stochastic process, ...
instances
HUMAN HEART
continuant
- human heart
- surface of the heart
- all hearts in this room
- a biopsy of the heart
- chest cavity
- pink, smooth
- stops if no circulation
- pumps blood
- prop in a display
ECG/EKG TEST
occurrent
- ECG (EKG) test
- start/end of ECG
- all ECGs in clinic
- 2nd lead attached
- activities in clinic
- s/t ECG began
- s/t region of ECG
- moment ECG began
- time occupied
REALISM-BASED ONTOLOGY
Universals
(1) real objects, substances, endurants, or continuants
- SNAP shots of reality
(2) real processes, activities, perdurants, or occurrents
- SPAN of time
Relations
is_a, part_of, has_participant
The Relations Ontology http://www.obofoundry.org/ro/
Based on: Smith, B., Ceusters, W., Klagges, B., Köhler, J., Kumar, A., Lomax, J., et al. (2005). Relations in biomedical ontologies. Genome Biology, 6, R46.
Groups and Organizations Using BFO:
- AstraZeneca - Clinical Information Science
- BioPAX-OBO
- BIRN Ontology Task Force (BIRN OTF)
- Computer Task Group Inc.
- Duke University Laboratory of Computational Immunology
- Dumontier Lab
- INRIA Lorraine Research Unit
- Kobe University Graduate School of Medicine
- Language and Computing
- National Center for Multi-Source Information Fusion
- Ontology Works
- University of Texas Southwestern Medical Center
- Science
Science Commons: Neurocommons
Neurocommons team is working to:
- Release, improve, and extend an open knowledge base of annotations to the biomedical abstracts (in RDF)
- Debug and tailor an open-source codebase for computational biology
- Gradually integrate major neuroscience databases into the annotation graph…
From: http://sciencecommons.org/projects/data/
“All the while using these efforts to further bring together the community within neuroscience around open approaches to systems biology…”
Alan Ruttenberg
http://sciencecommons.org/about/whoweare/ruttenberg/
…currently involved in a number of open biomedical ontology efforts, including:
- BioPAX: representing molecular and cellular pathways…
- Ontology for Biomedical Investigations (OBI)…
- Basic Formal Ontology (BFO) that will form the upper-level ontology for the OBO foundry…
A Few Ontologies Using BFO
- BioTop: A Biomedical Top-Domain Ontology
- Common Anatomy Reference Ontology (CARO)
- Foundational Model of Anatomy (FMA)
- Gene Ontology (GO)
- Infectious Disease Ontology (IDO)
- Ontology for Biomedical Investigations (OBI)
- Ontology for Clinical Investigations (OCI)
- Phenotypic Quality Ontology (PaTO)
- Protein Ontology (PRO)
- RNA Ontology (RnaO)
- Senselab Ontology
- Sequence Ontology (SO)
- Subcellular Anatomy Ontology (SAO)
- Vaccine Ontology (VO)
Researchers use Protégé, OBO-Edit, Microsoft Excel, or any number of other media (chalk boards) to classify entities using BFO
Lipid Ontology using BFO
Protégé
Being developed by: Low, H-S., Alexander, G., Baker, C., & Wenk, M. (2008). Lipid ontology.
Available at: http://MUS.12R.lipidontology.biochem.nus.edu.sg/lipidversion3.owl.
BFO RESOURCES
Institute for Formal Ontology and Medical Information Science (IFOMIS)
http://www.ifomis.uni-saarland.de/bfo/
Ontology Research Group (ORG)
http://org.buffalo.edu/
Step #1: Determine the purpose of the domain ontology: reference or application?
Step #2: Determine and demarcate the relevant subject-matter of the domain.
Step #3: Determine the level of granularity of the domain.
Step #4: Provide explicit statement of the intended subject-matter of the domain.
Demarcation and Determining Granularity
granularity / relation to time | continuant: independent | continuant: dependent | occurrent
organ and organism | Organism (Species Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (GO+); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
cell and cellular component | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO+) | Cellular Process (GO)
molecule | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
The Foundational Model of Anatomy (FMA) http://sig.biostr.washington.edu/projects/fm/AboutFM.html
“…the FMA is a domain ontology that represents a coherent body of explicit declarative knowledge about human anatomy.”
The Gene Ontology (GO) http://www.geneontology.org/GO.doc.shtml
“…The Gene Ontology Project has developed three structured controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components, and molecular functions in a species-independent manner.”
Step #5: Determine the most basic (a) universals and (b) relations dealt with in the domain.
Step #6: Construct a list of terms for the domain.
Step #7: Seek precision in categorizing, but go for the simpler, “low hanging fruit” first.
INFECTIOUS DISEASE ONTOLOGY http://www.bioontology.org/wiki/index.php/Infectious_Disease_Ontology
• reservoir
• host reservoir
• end reservoir
• colonization
• oral-fecal transmission
• transmission
• incubation period
• infectious disease progression
• contagious
• quality of pathogen
• epidemic
• symptom
• vehicle
• “end reservoir is_a reservoir”
• “oral-fecal transmission is_a transmission”
• “contagious is_a quality of pathogen”
• “incubation period part_of infectious disease progression”
• “colonization part_of infectious disease progression”
• vehicle located_in reservoir
• symptom preceded_by incubation period
• epidemic has_participant colonization
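The term list and relation assertions from these slides can be held as subject-relation-object triples and queried. The sketch below is a hedged illustration in Python: the terms and assertions are copied from the slides, but the `TERMS`, `TRIPLES`, `objects`, and `closure` names are hypothetical, and treating a relation as transitive inside `closure` is an assumption that holds for is_a and part_of but not for every relation.

```python
# Terms and assertions as listed on the Infectious Disease Ontology slides.
TERMS = [
    "reservoir", "host reservoir", "end reservoir", "colonization",
    "oral-fecal transmission", "transmission", "incubation period",
    "infectious disease progression", "contagious", "quality of pathogen",
    "epidemic", "symptom", "vehicle",
]

TRIPLES = [
    ("end reservoir", "is_a", "reservoir"),
    ("oral-fecal transmission", "is_a", "transmission"),
    ("contagious", "is_a", "quality of pathogen"),
    ("incubation period", "part_of", "infectious disease progression"),
    ("colonization", "part_of", "infectious disease progression"),
    ("vehicle", "located_in", "reservoir"),
    ("symptom", "preceded_by", "incubation period"),
    ("epidemic", "has_participant", "colonization"),
]


def objects(subject, relation):
    """Terms directly related to `subject` by `relation`."""
    return {o for s, r, o in TRIPLES if s == subject and r == relation}


def closure(subject, relation):
    """Terms reachable from `subject` by following `relation` repeatedly
    (only meaningful for relations assumed to be transitive)."""
    seen, stack = set(), [subject]
    while stack:
        for nxt in objects(stack.pop(), relation):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Terms used in an assertion but missing from the term list, and vice versa.
used = {t for s, _, o in TRIPLES for t in (s, o)}
undeclared = sorted(used - set(TERMS))
unrelated = sorted(set(TERMS) - used)

print(undeclared)  # → []
print(unrelated)   # → ['host reservoir']
```

A check like `unrelated` is one cheap way to spot terms that have been listed but not yet placed in the hierarchy.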
High-Hanging Fruit:
- Life
- What is “Right”
- Meaning
- Gene?
- Neuropathy?
- Cancer?
Low-Hanging Fruit:
- Cell
- Minimal Risk
- Quality of Life
- Homeotic Gene
- Cauda Equina Syndrome
- Leukemia
Step #8: Regiment the information in order to ensure logical and scientific coherence:
Avoid the Pitfalls of Incorrect Thinking (IT)
IT: Simply Getting the Facts Wrong *
FROM GO, SNOMED, BRIDG, and UMLS
(1) “extracellular region is_a cellular component”
(2) “extrinsic to membrane part_of membrane”
(3) ‘derives from’ confused with ‘develops from’
(4) “both testes is_a testis”
(5) Animal =Def. “A non-person living entity…”
(6) “An ontology is the same thing as a database…”
(7) “An ontology is just a taxonomy…”
* N.B. It may be the case that the examples of IT used in this presentation have been resolved.
IT: Lack of Clear and Coherent Definitions
FROM NCIT, BRIDG, and SNOMED:
(1) Disease Progression =Def. “Cancer that continues to grow and spread,” and “Increase in size of tumor…,” and “The worsening of a disease over time”
(2) Person =Def. “Human being”
(3) “European is_a ethnic group”
(4) “Other European in New Zealand is_a ethnic group”
(5) “Mixed ethnic census group is_a ethnic group”
IT: Circular Definitions
FROM GO and BRIDG
(1) Hemolysis of red blood cells =Def. “The processes by which an organism effects hemolysis”
Compare: Filtration of kidneys =Def. “The processes by which an organism effects filtration (of kidneys)”
(2) Ingredient =Def. “A substance that acts as an ingredient within a product. Note that ingredients may also have ingredients.”
(3) Protection from natural killer cell mediated cytolysis =Def. “The process of protecting a cell from cytolysis by natural killer cells”
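A crude screen for circular definitions like these can be automated. The following is a rough heuristic sketch, not a method from the deck: it flags a definition when it reuses content words from the term being defined. The `STOPWORDS` set and the `content_words` and `circular_words` helpers are hypothetical names, and the last, non-circular definition is a made-up contrast case.

```python
import re

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "by", "from", "in", "to", "that",
             "which", "as", "or", "and", "with", "their"}


def content_words(text):
    """Lowercased alphabetic words in `text`, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def circular_words(term, definition):
    """Content words that the definition shares with the term it defines."""
    return content_words(term) & content_words(definition)


examples = [
    ("Hemolysis of red blood cells",
     "The processes by which an organism effects hemolysis"),
    ("Ingredient",
     "A substance that acts as an ingredient within a product."),
    # Hypothetical non-circular definition, for contrast only.
    ("Hemolysis",
     "The rupture of erythrocytes with release of their contents"),
]

for term, definition in examples:
    shared = circular_words(term, definition)
    print(term, "->", sorted(shared) if shared else "no shared words")
```

The heuristic over-reports: a definition that legitimately names its genus (for example, a kind of mouse trap defined as a mouse trap with some further feature) also shares words with its term, so flagged definitions still need human review.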
IT: Examples Instead of Definitions
FROM BRIDG
(1) Adverse Event =Def.
(a) “toxic reaction”…
(b) “…untoward occurrence in a subject administered a pharmaceutical product…”
(c) “An unfavorable and unintended reaction, symptom, syndrome, or disease encountered by a subject on a clinical trial…”
(2) Defeasibility =Def. “a line of communication that is terminated,” “boundaries for software”
IT: Use-Mention Confusion
FROM BIRN, MeSH, NCIT, and HL7
(1) Mouse =Def. “Name for the species Mus musculus”
(2) “National Socialism is_a MeSH Descriptor”
(3) Conceptual Entities =Def. “An organizational header for concepts representing mostly abstract entities”
(4) Animal =Def. “a subtype of Living Subject representing any animal-of-interest to the Personnel Management domain”
(5) “living subject is_a code system”
IT: Conception/Perception vs. Reality Confusion
FROM NCIT and UMLS
(1) Living subject =Def. “An object representing an organism”
(2) Class performed activity =Def. “The description of applying, dispensing or giving agents or medications to subjects”
(3) Adverse Event =Def. “An observation of a change in the state of a subject that is assessed as being untoward…”
(4) Objective Result =Def. “An act of monitoring, recognizing and noting reproducible measurement…”
(5) “Individual allele is_a act of observation”
(6) “Cancer documentation is_a cancer”
(7) “Bacterium causes experimental model of disease”
Some Pitfalls of Incorrect Thinking to Avoid
1) Representing defined classes or particulars
2) Representing concepts rather than real entities
3) Blurring the use/mention distinction
4) Blurring the perception/reality distinction
5) Giving examples instead of definitions
6) Giving circular definitions
7) Not ensuring necessary and specific conditions
8) Equivocation
9) Using categories of non-existent entities
10) Classifying using multiple inheritance
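Pitfall 10 is easy to check mechanically once is_a assertions are written down as pairs. The sketch below assumes a simple (child, parent) representation; the `multiple_parents` helper is a hypothetical name, and the second parent given to “Mouse Trap” is invented purely to trigger the check.

```python
from collections import defaultdict


def multiple_parents(is_a_pairs):
    """Map each term with more than one is_a parent to its sorted parents."""
    parents = defaultdict(set)
    for child, parent in is_a_pairs:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}


IS_A = [
    ("Rodent Trap", "Animal Trap"),
    ("Mouse Trap", "Rodent Trap"),
    ("Rat Trap", "Rodent Trap"),
    ("Mouse Trap", "Household Item"),  # hypothetical second parent
]

print(multiple_parents(IS_A))
# → {'Mouse Trap': ['Household Item', 'Rodent Trap']}
```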
Step #9: Use basic Aristotelian structure when formulating definitions.
- Get at the essential features of an entity when defining it.
- Use a taxonomy structured by is_a relations.
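An Aristotelian definition has the form “an A is a B that C”, where B is the is_a parent (genus) and C is what distinguishes A from its siblings (differentia). The following minimal Python sketch illustrates that shape; the `Definition` class and its fields are hypothetical names, and the differentiae for the trap terms are made up for illustration rather than taken from any published ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    term: str         # A: the universal being defined
    genus: str        # B: its is_a parent in the taxonomy
    differentia: str  # C: what marks A off from other kinds of B

    def text(self) -> str:
        article = "an" if self.genus[0].lower() in "aeiou" else "a"
        return f"{self.term} =Def. {article} {self.genus} that {self.differentia}"


# Illustrative differentiae only.
definitions = [
    Definition("Rodent Trap", "Animal Trap", "is designed to catch rodents"),
    Definition("Mouse Trap", "Rodent Trap", "is designed to catch mice"),
]

for d in definitions:
    print(d.text())
# → Rodent Trap =Def. an Animal Trap that is designed to catch rodents
# → Mouse Trap =Def. a Rodent Trap that is designed to catch mice
```

Because each definition names its genus explicitly, the set of definitions and the is_a taxonomy can be checked against each other.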
Step #10: Regiment the information in order to ensure compatibility with other relevant ontologies (BFO important here).
[Figure: “Data” sources and “Formal Ontology, E.G., Basic Formal Ontology”]
Step #11: Concretize this information in a representational artifact (on paper, in Excel, using Protégé…).
Step #12: Formalize the representational artifact in a computer tractable language.
Step #13: Implement the artifact in some specific computing context.
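As one possible illustration of Steps #11 to #13, a small taxonomy can be serialized to a machine-readable text format. The sketch below writes stanzas in the style of the OBO flat-file format (the format edited with OBO-Edit); OWL would be another common target. The `to_obo` function, the `EX:` identifiers, and the term data are hypothetical, and the output covers only the `id`, `name`, and `is_a` tags.

```python
def to_obo(terms):
    """Serialize (id, name, parent_id or None) tuples as OBO-style [Term] stanzas."""
    names = {term_id: name for term_id, name, _ in terms}
    lines = ["format-version: 1.2", ""]
    for term_id, name, parent_id in terms:
        lines += ["[Term]", f"id: {term_id}", f"name: {name}"]
        if parent_id is not None:
            # The text after "!" is a human-readable comment naming the parent.
            lines.append(f"is_a: {parent_id} ! {names[parent_id]}")
        lines.append("")
    return "\n".join(lines)


# Hypothetical identifiers for the deck's trap example.
TRAP_TERMS = [
    ("EX:0000001", "animal trap", None),
    ("EX:0000002", "rodent trap", "EX:0000001"),
    ("EX:0000003", "mouse trap", "EX:0000002"),
]

print(to_obo(TRAP_TERMS))
```

Keeping the serializer separate from the term data means the same list could be written out in another format without touching the classification itself.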
- Cognizance of Informatics Problems
- Cooperation of Researchers, Doctors…
- Conferences, Colloquia, Meetings…
- Clarity of Terms and Relations
- Cogency: Counter-Example Free?
- Coherency of Domain Ontologies
- Coordination of Domain Ontologies
- Computational Tractability
- Communicability of Information
- Coding of Information Correctly
- Convenience of Accessibility to Information
- Care of Humans/Animals (First, Do No Harm)
- Comfort of Humans/Animals