Ontology at Manchester
Robert Stevens
BioHealth Informatics Group
School of Computer Science
University of Manchester
2
Ontology Research at Manchester
Language and Reasoning
Tools
Modelling
3
So what is an ontology?
Catalog/ID
Thesauri
Terms/glossary
Informal Is-a
Formal Is-a
Formal instance
Frames (properties)
General logical constraints
Value restrictions
Disjointness, inverse, part-of
Gene Ontology
Mouse Anatomy
EcoCyc
PharmGKB
TAMBIS
Arom
After Chris Welty et al
4
A Definition
o a set of logical axioms designed to account for the intended meaning of a formal vocabulary used to describe a certain (conceptualisation of) reality [Guarino 1998]
o “conceptualisation of” inserted by me
o “Logical axioms” means a formal definition of the meaning of terms in a formal language
o Formal language: something a computer can reason with
o Use symbols to make inferences
o Symbols represent things and their relationships
o Making inferences about things computationally amenable
5
OWL
• Ontologies will form the backbone of the semantic web
• OWL is the latest standard in ontology languages from the W3C
• Layered on top of RDF and RDF Schema
• Underpinned by Description Logics
6
OWL represents classes of instances
[Figure: classes A, B and C drawn as sets of instances]
7
Interpretations
• Individuals are interpreted as objects
• Classes are interpreted as sets containing objects
• Properties are interpreted as binary relations on objects
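The three bullets above can be made concrete with a toy interpretation in Python. This is only a sketch: the object names (`w`, `o`, `h1`, `h2`) and the helper functions `some` and `only` are illustrative assumptions, not part of OWL or of any reasoner's API.

```python
# A toy interpretation: a domain of objects, with classes as sets of
# objects and properties as sets of (subject, object) pairs.
# All names here are hypothetical and chosen for illustration only.
domain = {"w", "o", "h1", "h2"}

classes = {
    "Molecule": {"w"},
    "OxygenAtom": {"o"},
    "HydrogenAtom": {"h1", "h2"},
}

properties = {
    "madeOf": {("w", "o"), ("w", "h1"), ("w", "h2")},
}


def some(prop, cls):
    """Objects with at least one `prop` link to a member of class `cls`."""
    return {x for (x, y) in properties[prop] if y in classes[cls]}


def only(prop, allowed):
    """Objects whose `prop` links all end inside the set `allowed`.

    Objects with no `prop` links at all qualify vacuously.
    """
    return {
        x
        for x in domain
        if all(y in allowed for (s, y) in properties[prop] if s == x)
    }


atoms = classes["OxygenAtom"] | classes["HydrogenAtom"]
print(some("madeOf", "OxygenAtom"))  # objects made of some oxygen atom
print(only("madeOf", atoms))         # objects made only of O or H atoms
```

Note that `only` also returns the atoms themselves, because they have no `madeOf` links; this vacuous case is one reason "only" restrictions are usually paired with a "some" or cardinality restriction.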
8
Logical Descriptions
• Class: Water
• EquivalentTo: Molecule that
– madeOf 1 OxygenAtom and
– madeOf 2 HydrogenAtom and
– madeOf only (OxygenAtom or HydrogenAtom)
Class: Water
SubClassOf: Molecule that
hasBoilingPoint value 100 and
hasFreezingPoint value 0 and
hasState some Liquid
*Not beautiful modelling….!*
9
Reasoning
• These OWL descriptions can be submitted to a DL reasoner
• Translated into DL
• Checked for consistency: is what we’ve said satisfiable?
• Also infers subsumption hierarchy implied by statements
• Mistakes all too easy without help
• Formality is your friend
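A minimal sketch of the two checks named above (satisfiability and inferred subsumption), written in Python. It is not a DL reasoner: it handles only named superclasses and disjointness, and the class names and the deliberate modelling mistake (`Ice`) are assumptions made up for illustration.

```python
# Told (asserted) superclasses for each class; hypothetical example data.
told_parents = {
    "Molecule": set(),
    "Liquid": set(),
    "Solid": set(),
    "Water": {"Molecule", "Liquid"},
    "Ice": {"Water", "Solid"},  # a deliberate modelling mistake
}

# Pairs of classes asserted to share no instances.
disjoint_pairs = {frozenset({"Liquid", "Solid"})}


def ancestors(cls):
    """All classes that subsume `cls`, following told parents transitively."""
    seen = set()
    stack = list(told_parents[cls])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(told_parents[parent])
    return seen


def is_satisfiable(cls):
    """False if `cls` falls under two classes that are declared disjoint."""
    above = ancestors(cls) | {cls}
    return not any(pair <= above for pair in disjoint_pairs)


for name in told_parents:
    print(name, sorted(ancestors(name)), is_satisfiable(name))
```

Here `Ice` inherits `Liquid` through `Water` while also being asserted under `Solid`, so it can have no instances. A real reasoner finds this kind of mistake in far more expressive descriptions.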
10
Language & Reasoning
• Supporting ontology engineering by automated reasoning
– Classification
– Consistency checking
– Query answering
• Say the things you want to say and still reason
• Explain reasoning results
• Help debugging unexpected results
• Supporting modularity in ontologies
• Segmenting large ontologies into modules
11
Language & Reasoning
• Inspecting ontologies to find missing knowledge
• Scalability: Larger ontologies; faster reasoning; more instances; more expressivity
• Instance Store: Query answering over vast numbers of instances
Old Protégé (matrix wizard)
New Protégé (matrix tab)
SWOOP (crop circles)
15
ComparaGRID
16
Classifying Protein Phosphatases
• Annotating a genome’s proteins is a bottleneck
• Classifying proteins is a first step to annotation
• Tools for detecting features
• Need human knowledge to determine class membership
• Can we capture “how to recognise a phosphatase” in an ontology?
17
Definition of Tyrosine Phosphatase
Class: TyrosinePhosphatase Complete
(Protein and
- (contains atLeast-1 ProteinTyrosinePhosphataseDomain) and
- (contains 1 TransmembraneDomain))
18
Definition for R2A Phosphatase
Class: R2A Complete
(Protein and
- (contains 2 ProteinTyrosinePhosphataseDomain) and
- (contains 1 TransmembraneDomain) and
- (contains 4 FibronectinDomain) and
- (contains 1 ImmunoglobulinDomain) and
- (contains 1 MAMDomain) and
- (contains 1 Cadherin-LikeDomain) and
- (contains only (ProteinTyrosinePhosphataseDomain or TransmembraneDomain or FibronectinDomain or ImmunoglobulinDomain or Cadherin-LikeDomain or MAMDomain)))
19
Building the Ontology
• Classifications already made by biologists – based on protein functionality;
• Protein domain composition and other details in the literature;
• Some 50 classes of phosphatase, 30 protein domains and 39 relationships;
• “Value partition” of protein domains (covering and disjoint);
• Defines range of contains property;
• Literature contains knowledge of how to recognise members of each class of phosphatase.
20
Incremental Addition of Protein Functional Domains
Phosphatase catalytic
Cadherin-like
Immunoglobulin
MAM domain
Cellular retinaldehyde
Adhesion recognition
Transmembrane
Fibronectin III
Glycosylation
21
Classification of the Classical Tyrosine Phosphatases
22
What is the Ontology Telling Us?
• Each class of phosphatase defined in terms of domain composition
• We know the characteristics by which an individual protein can be recognised to be a member of a particular class of phosphatase
• We have this knowledge in a computational form
• If we had protein instances described in terms of the ontology, we could classify those individual proteins
• A catalogue of phosphatases
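The idea of classifying protein instances by their domain composition can be sketched in plain Python. This stands in for what a DL reasoner would do with the OWL definitions. It assumes that "contains N" means exactly N and that the R2A definition is closed over the listed domains; the function name `classify` and the example proteins are hypothetical.

```python
from collections import Counter

# Exact domain compositions for fully defined classes (assumed reading of
# the R2A definition: exact counts and no other domain types allowed).
EXACT_DEFINITIONS = {
    "R2A": Counter({
        "ProteinTyrosinePhosphataseDomain": 2,
        "TransmembraneDomain": 1,
        "FibronectinDomain": 4,
        "ImmunoglobulinDomain": 1,
        "MAMDomain": 1,
        "Cadherin-LikeDomain": 1,
    }),
}


def classify(domain_list):
    """Return the names of the classes whose definitions the protein meets."""
    counts = Counter(domain_list)
    found = []
    # TyrosinePhosphatase: at least one phosphatase domain and exactly one
    # transmembrane domain; other domains are allowed (no closure).
    if (counts["ProteinTyrosinePhosphataseDomain"] >= 1
            and counts["TransmembraneDomain"] == 1):
        found.append("TyrosinePhosphatase")
    # Closed definitions: the composition must match exactly.
    for name, required in EXACT_DEFINITIONS.items():
        if counts == required:
            found.append(name)
    return found


# Hypothetical protein described only by the domains a tool detected in it.
protein = (
    ["ProteinTyrosinePhosphataseDomain"] * 2
    + ["TransmembraneDomain"]
    + ["FibronectinDomain"] * 4
    + ["ImmunoglobulinDomain", "MAMDomain", "Cadherin-LikeDomain"]
)
print(classify(protein))
```

A protein that meets the R2A definition is also reported as a TyrosinePhosphatase, which mirrors the subsumption a reasoner would infer between the two class definitions.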
23
Classification of Protein Tyrosine Phosphatases
24
Results
• Human “gold standard”: Same results plus two more
• Partially annotated A. fumigatus: Better results and two new putative phosphatases
• Easily generated and compared phosphatase profiles
• Parasites
• Whole range of unexpected results: back to bioinformatics sequence analysis
25
myGrid Service Ontology
• myGrid services and workflow toolkit
• Web service discovery and composition
• Semantic content of provenance repository
• Wide use of service ontology
• Links with BioMOBY
• Workflows as knowledge management
26
Informal Modelling
• OWL is formal, but ontology has a long informal stage
• Tool-based forms of knowledge elicitation techniques such as card sorting and laddering
• Experiments with text-to-ontology tools
• With suitable text can truncate the informal stage
• Provide useful starting points for later stages
27
Casual Modelling
• OWL can be scary
• Need the equivalent of pseudo-code
• Work on concept maps as an elicitation tool
• Convertible to OWL
• Converting spreadsheets to OWL
• Converting thesauri to OWL
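As a rough illustration of the spreadsheet route, the sketch below turns rows of a two-column sheet into class frames in OWL Manchester syntax. The column names (`class`, `parent`), the function `rows_to_manchester` and the sample rows are assumptions for this example; it does not reproduce the group's actual conversion tools, and a complete ontology document would also need prefix and ontology declarations.

```python
import csv
import io


def rows_to_manchester(csv_text):
    """Turn rows of (class, parent) into Manchester-syntax class frames."""
    frames = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        frame = f"Class: {row['class']}"
        # An empty parent cell means the class has no asserted superclass.
        if row["parent"]:
            frame += f"\n    SubClassOf: {row['parent']}"
        frames.append(frame)
    return "\n\n".join(frames)


# Hypothetical spreadsheet exported as CSV text.
sheet = "class,parent\nPhosphatase,\nTyrosinePhosphatase,Phosphatase\n"
print(rows_to_manchester(sheet))
```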
28
Community Building of Ontologies
• Collaboration with University of British Columbia, Vancouver
• No money and no centre: What can you do?
• Use your community to build, extend, check facts in your ontology
• Currently running experiments
29
The Sealife Browser
• An EU project to build a Semantic Grid browser for the life sciences
• Uses ontology as background knowledge
• Dynamically link to terms on a page
• Link to tools, data, documents, etc.
• A semantic shopping cart
• Need to use a broad range of ontologies and many conversions
30
Modelling Biology & Medicine
• Describing biological phenomena
• Reconciling descriptions
• Analysing biological data
• Describing and analysing healthcare records
• Guiding annotation: Creating and filling forms
• Describing medical phenomena
31
Outside Relationships
• BioPAX
• FUGO/OBI
• Plant Ontology
• CBIO
• HL7, …
32
Training
• Introductory OWL tutorials: Non-biological
• Advanced tutorial: Biology orientated
• Hundreds trained in UK and overseas (mainly life sciences)
• Hands-on training