UMBC, an Honors University in Maryland
Examples of Integrating Ecological Information on the Semantic Web
Joel Sachs and Cynthia Simms Parr
contact: [email protected]
Joint work with: Andriy Parafiynyk, Li Ding, Rong Pan, Lushan Han, and Tim Finin
Overview of Talk
• Introduction and background
– What are we going to integrate?
– And why?
• ELVIS
– The Food Web Constructor and the Evidence Provider
• ETHAN
– An ontology for evolutionary trees and natural history
• Tripleshop and Swoogle
• Questions
An NSF ITR collaborative project with
• University of Maryland, Baltimore County
• University of Maryland, College Park
• U. of California, Davis
• Rocky Mountain Biological Laboratory
• NASA Goddard Space Flight Center
• NBII
An invasive species scenario
• Nile Tilapia fish have been found in a California lake.
• Can this invasive species thrive in this environment?
• If so, what will be the likely consequences for the ecology?
• So… we need to understand the effects of introducing this fish into the food web of a typical California lake.
ELVIS: Ecosystem Localization, Visualization, and Information System
[Figure: the species list constructor and the food web constructor. Species shown: Bacteria, Microprotozoa, Amphithoe longimana, Caprella penantis, Cymadusa compta, Lembos rectangularis, Batea catharinensis, Ostracoda, Melanitta, Tadorna tadorna, . . . The links of Oreochromis niloticus (Nile tilapia) are marked "??".]
The problem
• We have data on what species are known to be in the location and can further restrict and fill in with other ecological models
• But we don’t know which of these the Nile Tilapia eats or what might eat it.
• We can reason from taxonomic data (similar species) and known natural history data (size, mass, habitat, etc.) to fill in the gaps.
[Figure: a food web (nodes G and S joined by a link) next to an evolutionary tree (taxa G, S, and A joined by steps).]
Food Web Constructor
Predict food web links using database and taxonomic reasoning.
In a new estuary, Nile Tilapia could compete with ostracods (green) to eat algae. Predators (red) and prey (blue) of ostracods may be affected.
Food Web Constructor generates possible links
Evidence provider gives details
Food web database
Source                 Webs   Nodes   Links
Animal Diversity Web    n/a     711    2165
EcoWEB                  212    4503   11967
Webs on the Web          19    1373   12056
Interaction Web DB       26    2139    9882
Tuesday Lake              2     101     510
Total                   259    8827   36580
4600 distinct taxa
Food web data: Cohen 1989, Dunne et al. 2006, Vazquez 2006, Jonsson et al. 2005
Evolutionary tree: Parr et al. 2004. + plants from ITIS + hierarchy of non-taxonomic nodes
Testing the algorithm
• Take each web out of the database
• Attempt to predict its links
• Compare prediction with actual data
Accuracy (percentage of all predictions that are correct): 89%
Precision (percentage of predicted links that are correct): 55%
Recall (percentage of actual links that are predicted): 47%
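The three measures can be illustrated with a minimal Python sketch. It assumes that a "prediction" is a yes/no decision for every ordered (predator, prey) pair of nodes in the held-out web; that reading, the function name `score_predictions`, and the toy web are illustrative assumptions, not the project's code or data.

```python
from itertools import product


def score_predictions(nodes, predicted, actual):
    """Return (accuracy, precision, recall) for one held-out web.

    Assumption: every ordered (predator, prey) pair of nodes is a
    candidate link, and each pair is predicted as link or non-link.
    """
    candidates = set(product(nodes, repeat=2))
    predicted = set(predicted) & candidates
    actual = set(actual) & candidates

    tp = len(predicted & actual)          # predicted and really present
    fp = len(predicted - actual)          # predicted but absent
    fn = len(actual - predicted)          # present but not predicted
    tn = len(candidates) - tp - fp - fn   # correctly predicted non-links

    accuracy = (tp + tn) / len(candidates)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy web, invented for illustration only.
nodes = ["pike", "perch", "ostracod", "algae"]
actual = {("pike", "perch"), ("perch", "ostracod"), ("ostracod", "algae")}
predicted = {("pike", "perch"), ("perch", "ostracod"), ("pike", "algae")}

accuracy, precision, recall = score_predictions(nodes, predicted, actual)
print(accuracy, precision, recall)
```

In this toy case most candidate pairs are non-links, so accuracy comes out higher than precision or recall. The same effect may explain part of the gap between the 89% accuracy and the lower precision and recall reported above.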
Evolutionary distance threshold: 2 steps up and 4 steps down
[Figure: two surface plots showing precision and recall as functions of steps up and steps down (1 to 4 each).]
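Evolutionary direction penalty not very sensitive
\[ \mathrm{Weight}_{AB} = \frac{1}{1 + (\mathrm{Distance}_{XA} \cdot \mathrm{Penalty}_{XA}) + (\mathrm{Distance}_{YB} \cdot \mathrm{Penalty}_{YB})} \]
[Figure labels: ancestor, descendant, siblings]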
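A small Python sketch of one possible reading of this weighting follows. It assumes that the weight for a candidate link A→B, inferred from a known link X→Y, is the reciprocal of one plus the penalized evolutionary distances from X to A and from Y to B. The operators are a reconstruction, and the penalty values per direction are hypothetical placeholders, not figures from the talk.

```python
# Hypothetical penalties per evolutionary direction (placeholder values).
PENALTY = {"ancestor": 1.0, "descendant": 1.0, "sibling": 2.0}


def link_weight(distance_xa, penalty_xa, distance_yb, penalty_yb):
    """Weight of a candidate link A->B inferred from a known link X->Y.

    Assumed form: 1 / (1 + Distance_XA * Penalty_XA + Distance_YB * Penalty_YB).
    """
    return 1.0 / (1.0 + distance_xa * penalty_xa + distance_yb * penalty_yb)


# Known link between the very same taxa: no distance, full weight.
w_exact = link_weight(0, PENALTY["sibling"], 0, PENALTY["sibling"])

# X is one step from A (ancestor direction), Y is two steps from B (sibling).
w_near = link_weight(1, PENALTY["ancestor"], 2, PENALTY["sibling"])

print(w_exact, w_near)
```

Under this form the weight falls smoothly from 1 as either taxon moves further from the known link, which fits the slide's remark that results are not very sensitive to the exact penalty.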
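Negative evidence discount is sensitive
\[ \mathrm{CertaintyIdx}_{XY} = \sum_{i=1}^{N} \mathrm{LinkValue}_i \left( \frac{\mathrm{weight}}{\mathrm{discount}} \right)_i \]
[Figure: recall and precision (vertical axis, 0.2 to 0.7) plotted against negative evidence discount (horizontal axis, 0 to 100).]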
Some phyla are easier to predict than others
[Figure: bar chart of recall rate (vertical axis, 0 to 0.8) by phylum: Annelida, Arthropoda, Bacillariophyta, Chordata, Mollusca.]
How can we do better predicting links?
• Trait space distance weighting: Euclidean distance in natural history N-space
• Parameterize functions from the literature that might predict links using characteristics of taxa, for example size or stoichiometry.
LinkStatus_AB = ƒ(α, size_A, size_B), ƒ(β, stoich_A, stoich_B) …
…need more data
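Trait space distance can be sketched in a few lines of Python. This assumes each taxon is described by a dictionary of numeric natural history traits on comparable scales. The trait names and values are invented for illustration, and the slide does not say how such a distance would be turned into a link weight.

```python
import math


def trait_distance(traits_a, traits_b):
    """Euclidean distance over the traits that both taxa have values for."""
    shared = sorted(set(traits_a) & set(traits_b))
    return math.sqrt(sum((traits_a[k] - traits_b[k]) ** 2 for k in shared))


# Hypothetical, already-normalised trait values (not real measurements).
taxon_a = {"size": 1.0, "depth": 2.0, "lifespan": 7.0}
taxon_b = {"size": 4.0, "depth": 6.0}

d = trait_distance(taxon_a, taxon_b)
print(d)
```

Only shared traits are compared here, which is one simple way to cope with the sparse data the slide mentions. A real weighting would also need to decide how to scale traits measured in different units.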
Status
• The goal is ELVIS (Ecosystem Localization, Visualization, and Information System): an integrated set of web services for constructing food webs for a given location.
• Background ontologies
– SpireEcoConcepts: concepts and properties to represent food webs and ELVIS-related tasks, inputs, and outputs
– ETHAN (Evolutionary Trees and Natural History): concepts and properties for ‘natural history’ information on species, derived from data in the Animal Diversity Web and other taxonomic sources
• Under development
– Connect to visualization software
– Connect to the Triple Shop to discover more data
ETHAN
Evolutionary Trees and Natural History ontology
Animal Diversity Web (http://www.animaldiversity.org)
• geographic range
• habitats
• physical description
• reproduction
• lifespan
• behavior and trophic info
• conservation status
Triples
“Esox lucius” hasMaxMass “1.4 kg”
“Esox lucius” isSubclassOf “Esox”
“Esox” eats “Actinopterygii”
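To make the triple view concrete, here is a minimal Python sketch that stores the three example triples as tuples and follows isSubclassOf upward to collect what a taxon may eat. The helpers `match` and `inherited_prey` are hypothetical names, and passing diet down from genus to species is just one simple kind of taxonomic inference, not necessarily the one ELVIS applies.

```python
# The three example triples, as (subject, predicate, object) tuples.
triples = [
    ("Esox lucius", "hasMaxMass", "1.4 kg"),
    ("Esox lucius", "isSubclassOf", "Esox"),
    ("Esox", "eats", "Actinopterygii"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def inherited_prey(triples, taxon):
    """Collect 'eats' objects for the taxon and all of its ancestors."""
    prey, seen, frontier = set(), set(), [taxon]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        prey.update(o for _, _, o in match(triples, s=current, p="eats"))
        frontier.extend(o for _, _, o in match(triples, s=current, p="isSubclassOf"))
    return prey


pike_prey = inherited_prey(triples, "Esox lucius")
print(pike_prey)
```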
Triple Shop
[Figure: example source documents (Actinopterygii.owl, webs_publisher.php?published_study=11, Esox_lucius.owl) and Swoogle (http://swoogle.umbc.edu).]
Triple Shop
What are body masses of fish that eat fish?
Enter a SPARQL query
SELECT DISTINCT ?predator ?prey ?preymaxmass ?predatormaxmass
WHERE {
  ?link rdf:type spec:ConfirmedFoodWebLink .
  ?link spec:predator ?predator .
  ?link spec:prey ?prey .
  ?predator rdfs:subClassOf ethan:Actinopterygii .
  ?prey rdfs:subClassOf ethan:Actinopterygii .
  OPTIONAL { ?predator kw:mass_kg_high ?predatormaxmass } .
  OPTIONAL { ?prey kw:mass_kg_high ?preymaxmass }
}
. . . leaving out the FROM clause
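The idea of entering a query without a FROM clause and letting the system supply the dataset can be sketched in Python as plain string handling. This is an illustration only, not the Triple Shop's implementation: the function name `add_from_clauses` and the example.org document URLs are hypothetical, and the query is shortened.

```python
import re


def add_from_clauses(query, document_urls):
    """Insert one FROM <url> line per document before the first WHERE."""
    from_block = "".join(f"FROM <{url}>\n" for url in document_urls)
    # A function replacement avoids backslash-escape surprises in URLs.
    return re.sub(
        r"\bWHERE\b",
        lambda m: from_block + m.group(0),
        query,
        count=1,
        flags=re.IGNORECASE,
    )


query = """SELECT DISTINCT ?predator ?prey
WHERE {
  ?link spec:predator ?predator .
  ?link spec:prey ?prey .
}"""

# Hypothetical dataset: URLs of RDF documents chosen for this query.
dataset = [
    "http://example.org/Esox_lucius.owl",
    "http://example.org/Actinopterygii.owl",
]

full_query = add_from_clauses(query, dataset)
print(full_query)
```

Keeping the query and the dataset separate in this way means the same query can be rerun against a different set of documents without editing its text.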
Which fish eat other fish? Show their max body masses.
Enter query w/o FROM clause!
log in
specify dataset
302 RDF documents were found that might have useful data.
We’ll select them all and add them to the current dataset.
We’ll run the query against this dataset to see if the results are as expected.
The results can be produced in any of several formats
Results
http://sparql.cs.umbc.edu/tripleshop2/
Looks like a useful dataset. Let’s save it and also materialize it in the TS triple store.
We can also annotate, save and share queries.
Work in Progress
• There are a host of performance issues
• We plan on supporting some special datasets, e.g.,
– FOAF and SPIRE data collected from Swoogle
– Definitions of RDF and OWL classes and properties from all ontologies that Swoogle has discovered
• Expanding constraints to select candidate SWDs to include arbitrary metadata and embedded queries
– FROM “documents trusted by a member of the SPIRE project”
• “Quarantine” needed to handle conflicts.
An RDF Dictionary
• We hope to develop an RDF dictionary.
• Given an RDF term, it returns a graph of the term's definition
– Term: definition from “official” ontology
– Term+URL: definition from SWD at URL
– Term+*: union definition
– Optional argument recursively adds definitions of terms in the definition, excluding RDFS and OWL terms
– Optional argument identifies more namespaces to exclude
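The lookup described above might be sketched as follows in Python, treating a "definition" as the set of triples whose subject is the term. That interpretation, the function name `define`, and the example.org vocabulary are assumptions made for illustration. The sketch covers only the single-graph case, not the Term+URL or Term+* variants.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/eco#"  # hypothetical namespace

DEFAULT_EXCLUDED = (RDFS, OWL)


def define(term, triples, recursive=False, exclude=DEFAULT_EXCLUDED):
    """Return the triples defining `term` (assumed: triples with it as subject).

    With recursive=True, also define every URI mentioned in the
    definition, skipping terms from the excluded namespaces.
    """
    definition, seen, queue = set(), set(), [term]
    while queue:
        current = queue.pop()
        if current in seen:
            continue
        seen.add(current)
        for s, p, o in triples:
            if s != current:
                continue
            definition.add((s, p, o))
            if recursive:
                for t in (p, o):
                    if t.startswith("http") and not t.startswith(tuple(exclude)):
                        queue.append(t)
    return definition


# Toy graph, invented for illustration.
graph = {
    (EX + "Pike", RDFS + "subClassOf", EX + "Fish"),
    (EX + "Fish", RDFS + "subClassOf", EX + "Animal"),
    (EX + "Animal", RDFS + "label", "animal"),
    (RDFS + "subClassOf", RDFS + "domain", RDFS + "Class"),
}

shallow = define(EX + "Pike", graph)
deep = define(EX + "Pike", graph, recursive=True)
print(len(shallow), len(deep))
```

Passing `exclude=DEFAULT_EXCLUDED + (some_namespace,)` would correspond to the optional argument that names further namespaces to skip.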
Review
• All ELVIS functionality is encapsulated as web services, and all input and output is OWL based.
– So ELVIS integrates easily with other semantic web applications, like the TripleShop.
• ELVIS as a platform for experimenting with different approaches to food web prediction.
• TripleShop as an integrating platform
• TripleShop allows researchers to semi-automatically construct datasets in response to ad-hoc queries.
• Contact [email protected] to participate.