DESCRIPTION
My presentation to the IEEE Asia Pacific Services Computing Conference 2009 - Semantic Web Services In Practice (SWSIP 09) track. This show introduces the SADI (Semantic Automated Discovery and Integration) Framework - the replacement for our earlier explorations with the BioMoby project. In this slideshow I explore what SADI is, and why it is able to generate such interesting (and useful!) Semantic-Webby behaviours from Web Services. I also discuss our current research activities around how we are trying to exploit the SADI system to create much more natural query interfaces for Cardiovascular researchers.
’cause you can’t always GET what you want!
Web Services
vs.
Semantic Web
Web Services are not “connected” to
the Semantic Web
Why?
Web Services: XML + XML Schema
Semantic Web: RDF + OWL
Web Services: POST of SOAP-XML
Semantic Web: GET of RDF-XML
Web Services: No (rigorous) semantics
Semantic Web: Rich, flexible semantics
Web Services &
Semantic Web
Fundamentally different technologies!
>1000 X more data in the Deep Web than in Web pages
In bioinformatics this is primarily databases and
analytical algorithms
Web Service output is
critical to success
for the Semantic Web!!
RESEARCH QUESTION
WITHOUT INVENTING A NEW
TECHNOLOGY OR STANDARD
Can we define
frameworks and/or best practices
that make Web Services
“transparent” to the Semantic Web?
Semantic Web?
An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had
not anticipated.
Semantic Web?
If we cannot achieve those two things, then IMO we don’t have a “semantic web”, we only have a distributed (??), linked database… and that isn’t
particularly exciting…
What do we need to do?
Make Web Services exhibit the same “behavior” as GET
Add rich Semantics to Web Services at every level of the WS “stack”
What do we have to work with? (Without inventing new technologies or standards…)
Data
Interface annotations
Founding partner
Semantic Automated Discovery and Integration
http://sadiframework.org
History of SADI
Based on observations of the usage and behaviour of BioMoby Semantic Web Services
since 2002
SADI Observation #1:
What does ‘GET’ really do?
What “behaviour” of GET do we need to mimic
in a Web Service?
SADI Observation #1:
GET guarantees the response relates to the input URI in a very precise and predictable way
POST does not…
Web Services have a fundamentally different semantic behaviour than the Semantic Web
SADI Observation #1:
We can fix that!
(without breaking any existing rules or standards!)
SADI Observation #2:
All Web Services in Bioinformatics create implicit biological relationships
between their input and output
SADI Observation #2:
SADI: Core Proposition
Make the implicit explicit
Model all Web Services this way
SADI: Core Proposition
Web Service output must be Triples
SUBJECT URI of the output graph is the same URI
as the SUBJECT URI of the input graph that was POSTed to the Web Service
Define OWL classes that describe your input and output triples
Consequence(i.e. why SADI works!)
The Semantics of the triples from a Web Service are now
explicit and identical to the Semantics of GET
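A minimal sketch of the core proposition, using plain Python tuples as triples rather than a real RDF library. The service function, the lookup table, and the gene URI are hypothetical illustrations; the only thing taken from the slides is the rule that the subject URI of the output is the subject URI that was POSTed.

```python
# Triples are modelled as (subject, predicate, object) tuples.
# Every URI and value here is an illustrative assumption, not real service data.

IS_ENCODED_BY = "http://sadiframework.org/ontologies/predicates.owl#isEncodedBy"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical lookup table standing in for a real database or algorithm.
_FAKE_GENE_TABLE = {
    "http://lsrn.org/UniProt:P47989": "http://example.org/gene/hypothetical-gene",
}


def sadi_style_service(input_triples):
    """Return new triples whose subject is the same URI that was POSTed."""
    output = []
    for subject in sorted({s for s, _, _ in input_triples}):
        gene = _FAKE_GENE_TABLE.get(subject)
        if gene is not None:
            # The service "hangs" a new property on the input subject.
            output.append((subject, IS_ENCODED_BY, gene))
    return output


posted = [("http://lsrn.org/UniProt:P47989", RDF_TYPE,
           "http://example.org/types#Protein")]
result = sadi_style_service(posted)
```

Because the output subject equals the input subject, a client can merge `result` straight into the graph it already holds, much as it would merge the response to a GET on that URI.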
How do we exploit this?
Current Web Service definition systems such as WSDL and OWL-S
are missing a crucial annotation element…
…the semantic relationship between input and output
How do we exploit this?
With SADI services, this relationship can be automatically derived by “diff-ing” the input and output OWL classes
…because the SUBJECT node of the input and output graphs is guaranteed to be the same!
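A rough sketch of the “diff-ing” idea, assuming each OWL class has been reduced to the set of properties it requires of an individual. Real OWL class comparison needs a reasoner; the property sets here are hypothetical and only illustrate the set-difference step.

```python
# Hypothetical, simplified views of a service's input and output OWL classes:
# each class is just the set of properties an individual must carry.
input_class_properties = {"rdf:type"}
output_class_properties = {"rdf:type", "pred:isEncodedBy"}


def derive_relationship(input_props, output_props):
    """Properties the service adds to the shared subject node."""
    return output_props - input_props


added = derive_relationship(input_class_properties, output_class_properties)
```

The difference is meaningful only because both classes describe the same subject node; without that guarantee the two property sets would not be comparable.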
How do we exploit this?
We have constructed a simple prototype
Web Services registry that
provides discovery based on
annotation of these biological relationships
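A toy version of such a registry, assuming it is nothing more than an index from (input type, predicate) to service names. The class, the service names, and the registrations are hypothetical; the slides do not describe the real registry's API.

```python
from collections import defaultdict


class PredicateRegistry:
    """Hypothetical in-memory index: (input type, predicate) -> service names."""

    def __init__(self):
        self._index = defaultdict(list)

    def register(self, service_name, input_type, predicate):
        self._index[(input_type, predicate)].append(service_name)

    def find_services(self, input_type, predicate):
        """Services that attach `predicate` to individuals of `input_type`."""
        return list(self._index.get((input_type, predicate), []))

    def predicates_for(self, input_type):
        """All predicates that some service can generate for `input_type`."""
        return sorted({p for (t, p) in self._index if t == input_type})


registry = PredicateRegistry()
registry.register("hypothetical_go_annotator", "ProteinSequence", "hasGOTerm")
registry.register("hypothetical_similarity_search", "ProteinSequence", "similarTo")

services = registry.find_services("ProteinSequence", "hasGOTerm")
available = registry.predicates_for("ProteinSequence")
```

A client that knows only the rdf:type of a data element can call `predicates_for` to list what else could be discovered about it.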
Demo #1: a plug-in to the
IO-Informatics Knowledge Explorer
The Sentient Knowledge Explorer from IO Informatics
A Semantic Integration, Visualization and Search Tool
SADI Plug-in to the Sentient Knowledge Explorer
• The Knowledge Explorer has a plug-ins API that we have utilized to pass data directly from SADI to the KE visualization and exploration system
• Data elements in KE are queried for their “type” (rdf:type); this is used to discover additional data properties by querying the SADI Web Service registry
BACKGROUND of DATASET
NIST toxicity compendium study
8 toxicants
Liver, serum, urine
Genomic [Affymetrix MA] and metabolomic [LC/MS] data
• Conducted under NIST Advanced Technology Program (ATP), Award # 70NANB2H3009, as a Joint Venture between Icoria / Cogenics (Division of CLDA) and IO Informatics.
• Microarray studies were conducted under NIEHS contract # N01-ES-65406.
• The Alcohol study was conducted under NIAAA contract # HHSN281200510008C in collaboration with Bowles Center for Alcohol Studies (CAS) at UNC.
• This work was also made possible through contributions from members of IO Informatics’ Working Group on “Semantic Applications for Translational Research”.
DEMO: Knowledge Explorer Plug-in
For more information about the Knowledge Explorer surf to: http://io-informatics.com
Call SADI to provide discovery of predicates based on data-type
SADI fills-in the “Encodes” property from the discovered Web Service(s)
Discover services that provide the hasGOTerm property for Protein Sequence datatype
SADI Demo #2
Imagine there is a “virtual graph” connecting every conceivable input for every conceivable Web Service
to their respective outputs
How do we query that graph?
“SHARE”
Semantic Health And Research Environment
SADI client application
What pathways does UniProt protein P47989 belong to?
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway WHERE {
    uniprot:P47989 pred:isEncodedBy ?gene .
    ?gene ont:isParticipantIn ?pathway .
}
Recap: what we just saw
A standard SPARQL query was entered into SHARE – a SADI-based query engine
The query was interpreted to extract the properties being queried and these were passed to SADI for Web
Service discovery
SADI searched for, found, and accessed the databases and/or analytical tools capable of
generating those properties
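The recap above can be sketched in a few lines, assuming the query has already been parsed into triple patterns and that each predicate maps to exactly one discovered service. The service stand-ins and their return values (`gene:G1`, `pathway:A`, `pathway:B`) are hypothetical; SHARE's real planner is not described in the slides.

```python
# Each pattern is (subject, predicate, object); names starting with "?" are variables.
patterns = [
    ("uniprot:P47989", "pred:isEncodedBy", "?gene"),
    ("?gene", "ont:isParticipantIn", "?pathway"),
]

# Hypothetical stand-ins for discovered Web Services: predicate -> callable that
# maps a subject to the objects the service can generate for that predicate.
services = {
    "pred:isEncodedBy": lambda s: {"uniprot:P47989": ["gene:G1"]}.get(s, []),
    "ont:isParticipantIn": lambda s: {"gene:G1": ["pathway:A", "pathway:B"]}.get(s, []),
}


def resolve(patterns, services):
    """Resolve patterns in order; assumes every object is an unbound variable."""
    solutions = [{}]  # start with a single empty binding
    for subj, pred, obj in patterns:
        service = services[pred]  # "discovery" by predicate
        next_solutions = []
        for binding in solutions:
            subject = binding.get(subj, subj)  # substitute a bound variable
            for value in service(subject):  # "invocation"
                next_solutions.append({**binding, obj: value})
        solutions = next_solutions
    return solutions


answers = resolve(patterns, services)
```

No triple is stored before the query runs: every binding in `answers` is produced by a service call made while the query is being resolved.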
Recap: what we just saw
We posed, and answered, a complex database query
WITHOUT A DATABASE
(in fact, the data didn’t even have to exist...)
RDF graph browsing and querying is useful…
…but where’s the semantics??
Let’s bring in some ontologies!
My Definition of Ontology (for this talk)
Ontologies explicitly define the things that exist in “the world”
based on what properties each kind of thing must have
Ontology Spectrum
[Figure: the ontology spectrum, from least to most expressive: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (Properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, …); General Logical constraints]
Semantic Web?
An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had
not anticipated.
Dynamic
Discovery
Distributed
Interpretation
Re-interpretation
Powered by SADI
Prototype SADI-enabled SWS Client
Data exhibits “late binding”
Late binding:
“purpose and meaning” of the data is
not determined until the moment it is required
Benefit of late binding
Data is amenable to constant re-interpretation
How do we achieve this?
DO NOT PRE-CLASSIFY DATA!
just hang properties on it
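A small sketch of late binding, assuming data is held only as properties on a URI and that a “class” is whatever membership test the person asking brings along. The patient URIs, the property names, and both thresholds are made-up illustrations, not clinical guidance.

```python
# Data is stored only as properties on a URI; nothing is pre-classified.
properties = {
    "patient:1": {"systolicBP": 165, "diastolicBP": 100},
    "patient:2": {"systolicBP": 118, "diastolicBP": 76},
}


# Class definitions are supplied at query time by whoever asks the question.
# Both thresholds are illustrative assumptions.
def is_hypertensive_strict(props):
    return props.get("systolicBP", 0) >= 160


def is_hypertensive_lenient(props):
    return props.get("systolicBP", 0) >= 115


def instances_of(class_definition, data):
    """Classify on demand: membership is decided only when it is asked for."""
    return sorted(uri for uri, props in data.items() if class_definition(props))


strict = instances_of(is_hypertensive_strict, properties)
lenient = instances_of(is_hypertensive_lenient, properties)
```

The same stored properties answer both questions, which is the re-interpretation benefit: changing the definition never requires touching the data.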
[Figure: the ontology spectrum (Catalog/ID through General Logical constraints), shown again with two annotations:]
These are indexing/annotation systems
These are axioms that enable classification
Design OWL-DL ontologies describing cardiovascular concepts
Ontologies are in the “Frames+” area of the Ontology spectrum, and therefore can leverage SADI and be
“executed” as workflows
What the…???
Did you just say “execute ontologies as workflows?!”
Another interesting consequence of the SADI architecture
Ontology = Query = Workflow
WORKFLOW
QUERY:
SELECT images of mutations from genes in organism XXX that share homology to this gene in
organism YYY
Concept:
“Homologous Mutant Image”
As OWL Axioms
HomologousMutantImage is owl:equivalentTo {
Gene Q hasImage image P
Gene Q hasSequence Sequence Q
Gene R hasSequence Sequence R
Sequence Q similarTo Sequence R
Gene R = “my gene of interest” }
Those axioms combine to create an OWL Class:
homologous mutant images
QUERY:
Retrieve owl:homologous
mutant images for gene XXX
Demo #3: Discover instances of OWL classes
from data that doesn’t exist…
Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels.
PREFIX regress: <http://sadiframework.org/examples/regression.owl#>
PREFIX patients: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
    ?patient patients:creatinineLevels ?collection .
    ?collection regress:hasRegressionModel ?model .
    ?model regress:slope ?slope .
    FILTER (?slope > 0) .
    ?patient pred:latestBUN ?bun .
    ?patient pred:latestCreatinine ?creat .
}
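The `FILTER (?slope > 0)` test depends on a regression service computing a slope over the creatinine measurements. As an illustration only, here is an ordinary least-squares slope in plain Python; the slides do not say which regression method the actual service uses, and the measurement values are invented.

```python
def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Invented measurements: days since first visit, and creatinine level.
days = [0, 30, 60, 90]
creatinine = [1.0, 1.2, 1.4, 1.6]

slope = regression_slope(days, creatinine)
is_increasing = slope > 0  # the condition the FILTER expresses
```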
Show me patients whose creatinine level is increasing over time, along with their latest BUN and creatinine levels (now as an OWL class ElevatedCreatininePatient)
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX patients: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
    ?patient rdf:type patients:ElevatedCreatininePatient .
    ?patient pred:latestBUN ?bun .
    ?patient pred:latestCreatinine ?creat .
}
Start burrowing through the OWL class: we find that we need a regression model OWL class
Regression models have features like slopes and intercepts… and so on
The class is completely decomposed until a set of required Services is discovered, capable of creating all these necessary properties
Successful decomposition of the OWL class to discover the need for a LinearRegression Web Service, and so on
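The decomposition can be pictured as a recursive walk over class restrictions. This is a sketch under heavy assumptions: each class is reduced to a list of (property, class of the value) pairs, and the intermediate class names, the registry contents, and the service name are all hypothetical. Only the property names and `ElevatedCreatininePatient` come from the queries above.

```python
# Hypothetical class definitions: class name -> [(property, class of the value)].
class_requirements = {
    "ElevatedCreatininePatient": [("creatinineLevels", "RegressedCollection")],
    "RegressedCollection": [("hasRegressionModel", "PositiveSlopeModel")],
    "PositiveSlopeModel": [("slope", None)],
}

# Properties assumed to be present in the local data already.
local_properties = {"creatinineLevels"}

# Hypothetical registry: property -> service able to generate it.
registry = {
    "hasRegressionModel": "hypothetical_linear_regression",
    "slope": "hypothetical_linear_regression",
}


def required_services(owl_class, seen=None):
    """Walk the class restrictions and collect the services that must be called."""
    seen = set() if seen is None else seen
    if owl_class in seen:  # guard against cyclic definitions
        return []
    seen.add(owl_class)
    plan = []
    for prop, value_class in class_requirements.get(owl_class, []):
        if prop not in local_properties and registry[prop] not in plan:
            plan.append(registry[prop])
        if value_class is not None:
            for service in required_services(value_class, seen):
                if service not in plan:
                    plan.append(service)
    return plan


plan = required_services("ElevatedCreatininePatient")
```

The walk stops when every remaining property is either available locally or covered by a discovered service, which is the point at which the class can be “executed” as a workflow.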
VOILA!
Demo #3: Discover instances of OWL classes
from data that doesn’t exist…
SADI and CardioSHARE
OWL Class restrictions converted into workflows
SPARQL queries converted into workflows
Reasoning happening at the same time as queries are being executed
(AFAIK)
This is a completely novel behaviour!
Ontologies are reasoned
Workflows are executed
Queries are resolved
so what do we call what this client does??
“Reckoning”: dynamic discovery of instances of OWL classes through synthesis and invocation of a Web Service workflow capable of generating data described by the OWL Class restrictions, followed by reasoning to classify the data into that ontology
But OWL Classes can be (and often are) completely
“made-up”
What does that mean in the context of SADI and CardioSHARE?
Current Research
We believe that ontologies and hypotheses are, in some ways, the same “thing”…
…simply assertions about individuals that may or may not exist
If an instance of a class exists, then the hypothesis is “validated”
Current Research
Constructing OWL classes that represent patient categories mimicking clinical outcomes research experiments done on the BC Cardiac Registry
The original experiments have already been published, and therefore act as a gold-standard
We attempt to use “Reckoning” to automatically discover instances of these patient Classes and compare our results with those of the clinical researchers
CardioSHARE architecture: Increasingly complex ontological layers organize data into richer concepts, even hypotheses
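[Figure: layered architecture. Ontological layers: Blood Pressure, Hypertension, Ischemia, Hypothesis. Underlying resources: Database 1, Database 2, Analysis Algorithm XX, accessed through SADI Web “agents”.]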
Recap
SADI Semantic Web Services generate triples; the predicates of those triples are indexed... Period!!
For a given query, determine which properties are available, and which need to be
discovered/generated
Discovery of services through index queries + on-the-fly “classification” of local data against the
service interface OWL Classes
Semantic Web
An information system where machines can receive information from one source, re-interpret it, and correctly use it for a purpose that the source had
not anticipated.
What SADI + SHARE supports
Re-interpretation :
The SADI data-store simply collects properties, and matches them up with OWL Classes in a SPARQL query and/or
from individual service provider’s Web Service interface
Novel re-use:
Because we don’t pre-classify, there is no way for the provider to dictate how their
data should be used. They simply add their properties into the “cloud” and
those properties are used in whatever way is appropriate for me.
What SADI + SHARE supports
Data remains distributed – no warehouse!
Data is not “exposed” as a SPARQL endpoint, giving greater provider control over
computational resources
Yet data appears to be a SPARQL endpoint… no modification of SPARQL or
reasoner required.
Important “wins”
And all this because we simply require
that the input URI
is the same as the output URI
Join us!
SADI and CardioSHARE are Open-Source projects
Come join us – we’re having a lot of fun!!
http://sadiframework.org
Fin