Finding knowledge, data and answers on the Semantic Web. Tim Finin, University of Maryland, Baltimore County (UMBC, an Honors University in Maryland). http://ebiquity.umbc.edu/resource/html/id/202/ Joint work with Li Ding, Anupam Joshi, Yun Peng, Cynthia Parr, Pranam Kolari, Pavan Reddivari, Sandor Dornbush, Rong Pan, Akshay Java, Joel Sachs, Scott Cost and Vishal Doshi. http://creativecommons.org/licenses/by-nc-sa/2.0/ This work was partially supported by DARPA contract F30602-97-1-0215, NSF grants CCR007080 and IIS9875433 and grants from IBM, Fujitsu and HP.

Finding knowledge, data and answers on the Semantic Web

Page 1: Finding knowledge, data and answers on the Semantic Web

UMBC, an Honors University in Maryland

Finding knowledge, data and answers on the Semantic Web

Tim Finin
University of Maryland, Baltimore County
http://ebiquity.umbc.edu/resource/html/id/202/

Joint work with Li Ding, Anupam Joshi, Yun Peng, Cynthia Parr, Pranam Kolari, Pavan Reddivari, Sandor Dornbush, Rong Pan, Akshay Java, Joel Sachs, Scott Cost and Vishal Doshi

http://creativecommons.org/licenses/by-nc-sa/2.0/

This work was partially supported by DARPA contract F30602-97-1-0215, NSF grants CCR007080 and IIS9875433 and grants from IBM, Fujitsu and HP.

Page 2: Finding knowledge, data and answers on the Semantic Web

This talk
• Motivation
• Swoogle Semantic Web search engine
• Use cases and applications
• Observations
• Conclusions

Page 3: Finding knowledge, data and answers on the Semantic Web

Google has made us smarter

Page 4: Finding knowledge, data and answers on the Semantic Web

But what about our agents?

[Figure: software agents exchanging "tell" and "register" messages]

Agents still have a very minimal understanding of text and images.

Page 5: Finding knowledge, data and answers on the Semantic Web

But what about our agents?

A Google for knowledge on the Semantic Web is needed by software agents and programs.

[Figure: agents exchanging "tell" and "register" messages, with repeated "Swoogle" labels]

Page 6: Finding knowledge, data and answers on the Semantic Web

This talk
• Motivation
• Swoogle Semantic Web search engine
• Use cases and applications
• Observations
• Conclusions

Page 7: Finding knowledge, data and answers on the Semantic Web

• http://swoogle.umbc.edu/
• Running since summer 2004
• 1.8M RDF docs, 320M triples, 10K ontologies, 15K namespaces, 1.3M classes, 175K properties, 43M instances, 600 registered users

Page 8: Finding knowledge, data and answers on the Semantic Web

Swoogle Architecture

[Figure: architecture diagram. Stage labels: Discovery, Analysis, Index, Search Services. Component labels: Candidate URLs, Bounded Web Crawler, Google Crawler, SwoogleBot, SWD classifier, Ranking, document cache, SWD Indexer, IR Indexer, Semantic Web metadata, Web Server, Web Service. Other labels: human, machine, html, rdf/xml, the Web, Semantic Web. Legend: information flow, Swoogle's web interface.]

Page 9: Finding knowledge, data and answers on the Semantic Web

This talk
• Motivation
• Swoogle Semantic Web search engine
• Use cases and applications
• Observations
• Conclusions

Page 10: Finding knowledge, data and answers on the Semantic Web

Applications and use cases

1. Supporting Semantic Web developers
  – Ontology designers, vocabulary discovery, who’s using my ontologies or data?, use analysis, errors, statistics, etc.

2. Searching specialized collections
  – Spire: aggregating observations and data from biologists
  – InferenceWeb: searching over and enhancing proofs
  – SemNews: Text Meaning of news stories

3. Supporting SW tools
  – Triple shop: finding data for SPARQL queries

Page 11: Finding knowledge, data and answers on the Semantic Web

Use case 1

Page 12: Finding knowledge, data and answers on the Semantic Web

By default, ontologies are ordered by their ‘popularity’, but they can also be ordered by recency or size.

80 ontologies were found that had these three terms

Let’s look at this one

Page 13: Finding knowledge, data and answers on the Semantic Web

Basic Metadata
hasDateDiscovered: 2005-01-17
hasDatePing: 2006-03-21
hasPingState: PingModified
type: SemanticWebDocument
isEmbedded: false
hasGrammar: RDFXML
hasParseState: ParseSuccess
hasDateLastmodified: 2005-04-29
hasDateCache: 2006-03-21
hasEncoding: ISO-8859-1
hasLength: 18K
hasCntTriple: 311.00
hasOntoRatio: 0.98
hasCntSwt: 94.00
hasCntSwtDef: 72.00
hasCntInstance: 8.00

Page 14: Finding knowledge, data and answers on the Semantic Web

Page 15: Finding knowledge, data and answers on the Semantic Web

rdfs:range was used 41 times to assert a value.

owl:ObjectProperty was instantiated 28 times

time:Cal… defined once and used 24 times (e.g., as range)
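Counts like the ones on this slide can be derived by tallying terms over a document's triples. The following is a minimal sketch, not Swoogle's actual implementation; the triples and the `ex:` names are made-up illustration data.

```python
from collections import Counter

# Toy triples (subject, predicate, object); all hypothetical data.
TRIPLES = [
    ("ex:hasStart", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasStart", "rdfs:range", "ex:Instant"),
    ("ex:hasEnd", "rdf:type", "owl:ObjectProperty"),
    ("ex:hasEnd", "rdfs:range", "ex:Instant"),
    ("ex:Instant", "rdf:type", "owl:Class"),
]

def usage_stats(triples):
    """Count how often each term is used as a property and instantiated as a class."""
    as_property = Counter(p for _, p, _ in triples)
    instantiated = Counter(o for _, p, o in triples if p == "rdf:type")
    return as_property, instantiated

as_property, instantiated = usage_stats(TRIPLES)
print(as_property["rdfs:range"], instantiated["owl:ObjectProperty"])
```

Here "used to assert a value" is read as "appears in the predicate position" and "instantiated" as "appears as the object of rdf:type"; both readings are assumptions about what the slide's counts mean.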

Page 16: Finding knowledge, data and answers on the Semantic Web

These are the namespaces this ontology uses. Clicking on one shows all of the documents using the namespace.

All of this is available in RDF form for the agents among us.

Page 17: Finding knowledge, data and answers on the Semantic Web

Here’s what the agent sees. Note the swoogle and wob (web of belief) ontologies.

Page 18: Finding knowledge, data and answers on the Semantic Web

We can also search for terms (classes, properties) like terms for “person”.

Page 19: Finding knowledge, data and answers on the Semantic Web

10K terms associated with “person”! Ordered by use.

Let’s look at foaf:Person’s metadata

Page 20: Finding knowledge, data and answers on the Semantic Web

Page 21: Finding knowledge, data and answers on the Semantic Web

Page 22: Finding knowledge, data and answers on the Semantic Web

Page 23: Finding knowledge, data and answers on the Semantic Web

87K documents used foaf:gender with a foaf:Person instance as the subject

Page 24: Finding knowledge, data and answers on the Semantic Web

3K documents used dc:creator with a foaf:Person instance as the object
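A statistic of this kind (how many documents use a property with an instance of a given class in the subject or object position) can be sketched as a scan over per-document triples. This is an illustration under assumed data structures, not Swoogle's code; the document URLs and triples are hypothetical.

```python
def docs_using(docs, prop, cls, position):
    """Return the documents in which `prop` is used with an instance of `cls`.

    `docs` maps a document URL to its list of (s, p, o) triples;
    `position` is "subject" or "object".
    """
    hits = set()
    for url, triples in docs.items():
        # Instances of cls are the subjects explicitly typed as cls in this document.
        instances = {s for s, p, o in triples if p == "rdf:type" and o == cls}
        for s, p, o in triples:
            node = s if position == "subject" else o
            if p == prop and node in instances:
                hits.add(url)
                break
    return hits

# Hypothetical documents.
DOCS = {
    "http://example.org/a.rdf": [
        ("_:me", "rdf:type", "foaf:Person"),
        ("_:me", "foaf:gender", "female"),
    ],
    "http://example.org/b.rdf": [
        ("_:author", "rdf:type", "foaf:Person"),
        ("ex:paper", "dc:creator", "_:author"),
    ],
}

print(len(docs_using(DOCS, "foaf:gender", "foaf:Person", "subject")))
```

The sketch only counts explicitly typed instances; it does no inference, so an untyped node would be missed.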

Page 25: Finding knowledge, data and answers on the Semantic Web

Swoogle’s archive saves every version of a SWD it’s seen.

Page 26: Finding knowledge, data and answers on the Semantic Web

Page 27: Finding knowledge, data and answers on the Semantic Web

Use case 2

An NSF ITR collaborative project with
• University of Maryland, Baltimore County
• University of Maryland, College Park
• U. of California, Davis
• Rocky Mountain Biological Laboratory

Page 28: Finding knowledge, data and answers on the Semantic Web

An invasive species scenario
• Nile Tilapia fish have been found in a California lake.
• Can this invasive species thrive in this environment?
• If so, what will be the likely consequences for the ecology?
• So… we need to understand the effects of introducing this fish into the food web of a typical California lake

Page 29: Finding knowledge, data and answers on the Semantic Web

Food Webs
• A food web models the trophic (feeding) relationships between organisms in an ecology
  – Food web simulators are used to explore the consequences of changes in the ecology, such as the introduction or removal of a species
  – A location’s food web is usually constructed from studies of the frequencies of the species found there and the known trophic relations among them.
• Goal: automatically construct a food web for a new location using existing data and knowledge
• ELVIS: Ecosystem Location Visualization and Information System
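To make the idea of a food web concrete, it can be modeled as a directed graph from each consumer to what it eats. The sketch below is an illustration only, not ELVIS code; the species, the links and the function names are all hypothetical.

```python
# A food web as a mapping from each consumer to the set of things it eats.
# Species and links are illustrative only.
FOOD_WEB = {
    "algae": set(),
    "ostracod": {"algae"},
    "minnow": {"ostracod"},
    "heron": {"minnow"},
}

def remove_species(web, species):
    """Return a copy of the web without `species` and without links to it."""
    return {c: prey - {species} for c, prey in web.items() if c != species}

def add_species(web, species, eats):
    """Return a copy of the web with a new consumer and its prey."""
    new_web = {c: set(prey) for c, prey in web.items()}
    new_web[species] = set(eats)
    return new_web

def newly_without_prey(before, after):
    """Consumers that had prey before the change and have none after it."""
    return sorted(c for c in after if before[c] and not after[c])

after = remove_species(FOOD_WEB, "ostracod")
print(newly_without_prey(FOOD_WEB, after))
```

A real simulator would also track population sizes and interaction strengths; this sketch only shows the structural effect of adding or removing a node.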

Page 30: Finding knowledge, data and answers on the Semantic Web

East River Valley Trophic Web

http://www.foodwebs.org/

Page 31: Finding knowledge, data and answers on the Semantic Web

Species List Constructor
Click a county, get a species list

Page 32: Finding knowledge, data and answers on the Semantic Web

The problem
• We have data on what species are known to be in the location and can further restrict and fill in with other ecological models
• But we don’t know which of these the Nile Tilapia eats or who might eat it.
• We can reason from taxonomic data (similar species) and known natural history data (size, mass, habitat, etc.) to fill in the gaps.
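One simple way to read "reason from taxonomic data" is to borrow the known diet of the taxonomically closest species. The sketch below assumes that reading; it is not the project's actual algorithm. The lineages follow standard classification, but the diets and the function names are illustrative assumptions.

```python
# Lineages, most general rank first: class, order, family, genus.
LINEAGE = {
    "nile_tilapia": ["Actinopterygii", "Cichliformes", "Cichlidae", "Oreochromis"],
    "blue_tilapia": ["Actinopterygii", "Cichliformes", "Cichlidae", "Oreochromis"],
    "rainbow_trout": ["Actinopterygii", "Salmoniformes", "Salmonidae", "Oncorhynchus"],
}

# Known diets for species already in the database (illustrative values).
KNOWN_DIET = {"blue_tilapia": {"algae"}, "rainbow_trout": {"insect_larvae"}}

def shared_ranks(a, b):
    """Number of leading taxonomic ranks two species have in common."""
    n = 0
    for x, y in zip(LINEAGE[a], LINEAGE[b]):
        if x != y:
            break
        n += 1
    return n

def predict_diet(species):
    """Guess a diet by copying it from the closest relative with known data."""
    nearest = max(KNOWN_DIET, key=lambda other: shared_ranks(species, other))
    return nearest, KNOWN_DIET[nearest]

print(predict_diet("nile_tilapia"))
```

Natural history data such as size, mass and habitat could be used to filter or weight these guesses; that refinement is left out here.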

Page 33: Finding knowledge, data and answers on the Semantic Web

Page 34: Finding knowledge, data and answers on the Semantic Web

Food Web Constructor
Predict food web links using database and taxonomic reasoning.

In a new estuary, Nile Tilapia could compete with ostracods (green) to eat algae. Predators (red) and prey (blue) of ostracods may be affected.

Page 35: Finding knowledge, data and answers on the Semantic Web

Evidence Provider
Examine evidence for predicted links.

Page 36: Finding knowledge, data and answers on the Semantic Web

Status
• Goal is ELVIS (Ecosystem Location Visualization and Information System) as an integrated set of web services for constructing food webs for a given location.
• Background ontologies
  – SpireEcoConcepts: concepts and properties to represent food webs, and ELVIS related tasks, inputs and outputs
  – ETHAN (Evolutionary Trees and Natural History): concepts and properties for ‘natural history’ information on species derived from data in the Animal Diversity Web and other taxonomic sources
• Under development
  – Connect to visualization software
  – Connect to triple shop to discover more data

Page 37: Finding knowledge, data and answers on the Semantic Web

Use case 3

UMBC Triple Shop
• http://sparql.cs.umbc.edu/
• Online SPARQL RDF query processing with several interesting features
• Automatically finds SWDs for given queries using Swoogle backend database
• Datasets, queries and results can be saved, tagged, annotated, shared, searched for, etc.
• RDF datasets as first class objects
  – Can be stored on our server or downloaded
  – Can be materialized in a database or (soon) as a Jena model

Page 38: Finding knowledge, data and answers on the Semantic Web

Web-scale semantic web data access

[Figure: interaction diagram with three columns: agent, data access service, the Web. Agent steps: Search vocabulary, Compose query, Fetch docs, Populate RDF database, Query local RDF database. Messages: ask (“person”), inform (“foaf:Person”), ask (“?x rdf:type foaf:Person”), inform (doc URLs). Service steps: Search URIrefs in SW vocabulary, Search URLs in SWD index, Index RDF data.]
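The interaction in this figure can be sketched as a small pipeline: look up vocabulary terms for a keyword, find documents that use those terms, fetch them into a local store, then query locally. The step order is one plausible reading of the diagram, and every name, index and URL below is a hypothetical in-memory stand-in for the real service and the Web.

```python
# In-memory stand-ins (hypothetical data).
VOCABULARY = {"person": ["foaf:Person"]}                    # keyword -> URIrefs
SWD_INDEX = {"foaf:Person": ["http://example.org/a.rdf"]}   # URIref -> document URLs
WEB = {"http://example.org/a.rdf": [("_:x", "rdf:type", "foaf:Person")]}

def search_vocabulary(keyword):
    """ask("person") -> inform("foaf:Person")"""
    return VOCABULARY.get(keyword, [])

def search_documents(term):
    """ask("?x rdf:type foaf:Person") -> inform(doc URLs)"""
    return SWD_INDEX.get(term, [])

def fetch(url):
    """Fetch one document from the Web and return its triples."""
    return WEB[url]

def find_instances(keyword):
    """Run the whole flow and return subjects typed with a matching term."""
    terms = search_vocabulary(keyword)
    urls = sorted({u for t in terms for u in search_documents(t)})
    local_db = [triple for u in urls for triple in fetch(u)]  # populate local store
    return [s for s, p, o in local_db if p == "rdf:type" and o in terms]

print(find_instances("person"))
```

Keeping the final query local means the service only has to answer "which terms?" and "which documents?", while the agent does the actual data access itself.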

Page 39: Finding knowledge, data and answers on the Semantic Web

Who knows Anupam Joshi?
Show me their names, email addresses and pictures

Page 40: Finding knowledge, data and answers on the Semantic Web

The UMBC ebiquity site publishes lots of RDF data, including FOAF profiles

Page 41: Finding knowledge, data and answers on the Semantic Web

No FROM clause!

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?p2name ?p2mbox ?p2pix
FROM ???
WHERE {
  ?p1 foaf:surname "Joshi" .
  ?p1 foaf:firstName "Anupam" .
  ?p1 foaf:mbox ?p1mbox .
  ?p2 foaf:knows ?p3 .
  ?p3 foaf:mbox ?p1mbox .
  ?p2 foaf:name ?p2name .
  ?p2 foaf:mbox ?p2mbox .
  OPTIONAL { ?p2 foaf:depiction ?p2pix } .
}
ORDER BY ?p2name
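Since the query names no dataset, one way to find candidate documents is to pull the vocabulary terms out of the query and look them up in an index. The sketch below shows only the extraction step, with a regular expression rather than a real SPARQL parser; it is an assumption about how such a step could work, not the Triple Shop's implementation, and `query_terms` is a hypothetical helper. The unresolved FROM line is left out of the sample string.

```python
import re

QUERY = '''
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?p2name ?p2mbox ?p2pix
WHERE {
  ?p1 foaf:surname "Joshi" .
  ?p1 foaf:firstName "Anupam" .
  ?p1 foaf:mbox ?p1mbox .
  ?p2 foaf:knows ?p3 .
  ?p3 foaf:mbox ?p1mbox .
  ?p2 foaf:name ?p2name .
  ?p2 foaf:mbox ?p2mbox .
  OPTIONAL { ?p2 foaf:depiction ?p2pix } .
}
ORDER BY ?p2name
'''

def query_terms(query):
    """Return the full URIs of the prefixed names used in the query body."""
    prefixes = dict(re.findall(r"PREFIX\s+(\w+):\s*<([^>]+)>", query))
    body = query[query.index("{"):]          # skip the prologue and SELECT line
    names = re.findall(r"\b(\w+):(\w+)\b", body)
    return sorted({prefixes[p] + local for p, local in names if p in prefixes})

for term in query_terms(QUERY):
    print(term)
```

A regular expression is enough for this small query but would be fooled by colons inside string literals or full URIs in the body; a production version would need a proper SPARQL parser.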

Page 42: Finding knowledge, data and answers on the Semantic Web

Enter query w/o FROM clause!

log in

specify dataset

Page 43: Finding knowledge, data and answers on the Semantic Web

Page 44: Finding knowledge, data and answers on the Semantic Web

Page 45: Finding knowledge, data and answers on the Semantic Web

302 RDF documents were found that might have useful data.

Page 46: Finding knowledge, data and answers on the Semantic Web

We’ll select them all and add them to the current dataset.

Page 47: Finding knowledge, data and answers on the Semantic Web

We’ll run the query against this dataset to see if the results are as expected.

Page 48: Finding knowledge, data and answers on the Semantic Web

The results can be produced in any of several formats

Page 49: Finding knowledge, data and answers on the Semantic Web

Page 50: Finding knowledge, data and answers on the Semantic Web

Looks like a useful dataset. Let’s save it and also materialize it in the TS triple store.

Page 51: Finding knowledge, data and answers on the Semantic Web

Page 52: Finding knowledge, data and answers on the Semantic Web

We can also annotate, save and share queries.

Page 53: Finding knowledge, data and answers on the Semantic Web

Work in Progress
• There are a host of performance issues
• We plan on supporting some special datasets, e.g.,
  – FOAF data collected from Swoogle
  – Definitions of RDF and OWL classes and properties from all ontologies that Swoogle has discovered
• Expanding constraints to select candidate SWDs to include arbitrary metadata and embedded queries
  – FROM “documents trusted by a member of the SPIRE project”
• We will explore two models for making this useful
  – As a downloadable application for client machines
  – As an (open source?) downloadable service for servers supporting a community of users.

Page 54: Finding knowledge, data and answers on the Semantic Web

This talk
• Motivation
• Swoogle Semantic Web search engine
• Use cases and applications
• Observations
• Conclusions

Page 55: Finding knowledge, data and answers on the Semantic Web

Will Swoogle Scale? How?

Here’s a rough estimate of the data in RDF documents on the semantic web based on Swoogle’s crawling

System/date   Terms      Documents   Individuals   Triples     Bytes
Swoogle2      1.5x10^5   3.5x10^5    7x10^6        5x10^7      7x10^9
Swoogle3      2x10^5     7x10^5      1.5x10^7      7.5x10^7    1x10^10
2006          1x10^6     5x10^7      5x10^7        5x10^9      5x10^11
2008          5x10^6     5x10^9      5x10^9        5x10^11     5x10^13

We think Swoogle’s centralized approach can be made to work for the next few years if not longer.
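A little arithmetic on these estimates shows what the projection implies. The sketch below reads the flattened figures as powers of ten (for example, "5x107" as 5x10^7), which is an assumption about the original layout; the derived ratios are computed here and are not stated in the talk.

```python
# (triples, bytes) per row of the estimate table, read as powers of ten.
ESTIMATES = {
    "Swoogle2": (5e7, 7e9),
    "Swoogle3": (7.5e7, 1e10),
    "2006": (5e9, 5e11),
    "2008": (5e11, 5e13),
}

def bytes_per_triple(name):
    """Average storage per triple implied by one row."""
    triples, size = ESTIMATES[name]
    return size / triples

def growth(column, start, end):
    """Growth factor of one column (0 = triples, 1 = bytes) between two rows."""
    return ESTIMATES[end][column] / ESTIMATES[start][column]

for name in ESTIMATES:
    print(name, round(bytes_per_triple(name)))
print("triples, 2006 to 2008:", round(growth(0, "2006", "2008")))
```

Under this reading the table assumes roughly 100 to 140 bytes per triple and a hundredfold growth in triples between the 2006 and 2008 rows.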

Page 56: Finding knowledge, data and answers on the Semantic Web

How much reasoning should Swoogle do?
• SwoogleN (N<=3) does limited reasoning
  – It’s expensive
  – It’s not clear how much should be done
• More reasoning would benefit many use cases
  – e.g., type hierarchy
• Recognizing specialized metadata
  – e.g., that ontology A maps some terms from B to C
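As an example of the type-hierarchy reasoning mentioned here, an instance typed with a class can also be reported under every superclass of that class. The sketch below computes that closure over subclass links; it is a generic illustration, not a description of what Swoogle does, and `ex:Student` is a hypothetical class.

```python
# Direct subclass links: class -> set of direct superclasses.
# foaf:Person being a subclass of foaf:Agent comes from FOAF; ex:Student is made up.
SUBCLASS_OF = {
    "ex:Student": {"foaf:Person"},
    "foaf:Person": {"foaf:Agent"},
}

def superclasses(cls, subclass_of):
    """All direct and indirect superclasses of cls (safe against cycles)."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def inferred_types(asserted, subclass_of):
    """Asserted types plus every type implied by the subclass hierarchy."""
    types = set(asserted)
    for cls in asserted:
        types |= superclasses(cls, subclass_of)
    return types

print(sorted(inferred_types({"ex:Student"}, SUBCLASS_OF)))
```

With this closure, a search for instances of foaf:Person would also find resources typed only as `ex:Student`, at the cost of extra work at index or query time.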

Page 57: Finding knowledge, data and answers on the Semantic Web

An RDF Dictionary
• We hope to develop an RDF dictionary.
• Given an RDF term, returns a graph of its definition
  – Term definition from “official” ontology
  – Term+URL definition from SWD at URL
  – Term+* union definition
  – Optional argument recursively adds definitions of terms in definition excluding RDFS and OWL terms
  – Optional argument identifies more namespaces to exclude
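The dictionary lookup described on this slide could look roughly like the sketch below, which covers the Term+URL and Term+* modes plus the optional recursion and namespace exclusion. It is a guess at one possible interface, not the planned design: `define`, `STORE` and the example data are hypothetical, the "official ontology" mode is not modeled, and `rdf:` is excluded by default as an added assumption alongside RDFS and OWL.

```python
# Hypothetical store: document URL -> triples (subject, predicate, object).
STORE = {
    "http://example.org/onto.rdf": [
        ("ex:Student", "rdfs:subClassOf", "ex:Person"),
        ("ex:Person", "rdf:type", "owl:Class"),
    ],
    "http://example.org/other.rdf": [
        ("ex:Student", "rdfs:label", "student"),
    ],
}

def define(term, url=None, recursive=False, exclude=("rdf:", "rdfs:", "owl:")):
    """Return the set of triples that define `term`.

    url=None gives the union over all documents (Term+*); otherwise only the
    document at `url` is used (Term+URL). With recursive=True, terms that
    appear in the definition are defined too, except those in excluded namespaces.
    """
    docs = list(STORE) if url is None else [url]
    triples = [t for u in docs for t in STORE[u]]
    excluded = tuple(exclude)
    graph, todo, done = set(), [term], set()
    while todo:
        current = todo.pop()
        if current in done:
            continue
        done.add(current)
        for s, p, o in triples:
            if s != current:
                continue
            graph.add((s, p, o))
            if recursive:
                for candidate in (p, o):
                    # Crude shortcut: treat anything with a prefix colon as a term.
                    if ":" in candidate and not candidate.startswith(excluded):
                        todo.append(candidate)
    return graph

print(len(define("ex:Student", recursive=True)))
```

Passing a longer `exclude` tuple corresponds to the slide's second optional argument for naming more namespaces to skip.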

Page 58: Finding knowledge, data and answers on the Semantic Web

This talk
• Motivation
• Swoogle Semantic Web search engine
• Use cases and applications
• Observations
• Conclusions

Page 59: Finding knowledge, data and answers on the Semantic Web

Conclusion
• The web will contain the world’s knowledge in forms accessible to people and computers
  – We need better ways to discover, index, search and reason over SW knowledge
• SW search engines address different tasks than html search engines
  – So they require different techniques and APIs
• Swoogle-like systems can help create consensus ontologies and foster best practices
  – Swoogle is for Semantic Web 1.0
  – Semantic Web 2.0 will make different demands

Page 60: Finding knowledge, data and answers on the Semantic Web

For more information

http://ebiquity.umbc.edu/

Annotated in OWL