The 2009 Semantic Web Landscape Technologies, tools, and projects Lee Feigenbaum VP Technology & Standards, Cambridge Semantics Co-chair, W3C SPARQL Working Group For PRISM Forum SIG on Semantic Web May 12, 2009

Semantic Web Landscape 2009


These slides were originally a tutorial presented for the SIG preceding the May 2009 meeting of the PRISM Forum. They attempt to give a survey of the technologies, tools, and state of the world with respect to the Semantic Web as of the first half of 2009.


Page 1: Semantic Web Landscape 2009

The 2009 Semantic Web Landscape
Technologies, tools, and projects

Lee Feigenbaum
VP Technology & Standards, Cambridge Semantics
Co-chair, W3C SPARQL Working Group

For PRISM Forum SIG on Semantic Web
May 12, 2009

Page 2: Semantic Web Landscape 2009

Thanks Upfront

Much material & wisdom used with gracious permission of:

Ivan Herman, W3C Semantic Web Activity Lead
Bijan Parsia, Co-editor of the core OWL 2 specification
Ian Horrocks, Co-chair of the W3C OWL 2 Working Group
Phil Archer, Chair of the W3C POWDER Working Group

May 12, 2009 2

Page 3: Semantic Web Landscape 2009

Thanks Upfront

Much material & wisdom used with gracious permission of:

Michael Hausenblas, Evangelist for RDFa, Linked Data, and Multimedia Semantics
Fabien Gandon, Member, GRDDL and OWL 2 Working Groups
Susie Stephens, Co-chair, W3C HCLS Interest Group
Eric Prud’hommeaux, W3C team member, Semantic Web expert

May 12, 2009 3

Page 4: Semantic Web Landscape 2009

Executive Summary: The Semantic Web in 2009

May 12, 2009 4

The Semantic Web in 2009 is characterized by a healthy environment of stable, broadly implemented core standard technologies complemented by a number of continually emerging new standards.

Adopters of Semantic Web technologies in 2009 can choose from a wide range of commercial and open-source interoperable tools and systems.

Enterprise Semantic Web projects are beginning to move beyond proofs of concept to serious production implementations.

Community projects on the World Wide Web have linked hundreds of public data sets into an emergent Semantic Web.

Page 5: Semantic Web Landscape 2009

Agenda

Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa, POWDER)

May 12, 2009 5

Page 6: Semantic Web Landscape 2009

A Motivating Example: Drug Discovery

The W3C HCLS Interest Group set out to use Semantic Web technologies to receive precise answers to a complex question:

“Find me genes involved in signal transduction that are related to pyramidal neurons.”

Page 7: Semantic Web Landscape 2009

General search

223,000 hits, 0 results

May 12, 2009 7

Page 8: Semantic Web Landscape 2009

Domain-limited search

2,580 potential results

May 12, 2009 8

Page 9: Semantic Web Landscape 2009

Specific databases

Too many silos!

May 12, 2009 9

Page 10: Semantic Web Landscape 2009

A Semantic Web Approach

Integrate disparate databases…

MeSH
PubMed
Entrez Gene
Gene Ontology
…

May 12, 2009 10

Page 11: Semantic Web Landscape 2009

A Semantic Web Approach (cont’d)

…so that one query…

May 12, 2009 11

Page 12: Semantic Web Landscape 2009

A Semantic Web Approach (cont’d)

…(trivially) spans several databases…

May 12, 2009 12

Page 13: Semantic Web Landscape 2009

A Semantic Web Approach (cont’d)

…to deliver targeted results…

May 12, 2009 13

Page 14: Semantic Web Landscape 2009

What’s the trick?

1. Agreement on common terms and relationships
2. Incremental, flexible data structure
3. Good-enough modeling
4. Query interface tailored to the data model

May 12, 2009 14

Page 15: Semantic Web Landscape 2009

Names

May 12, 2009 15

Page 16: Semantic Web Landscape 2009

Branding

Semantic Web
Web of Data
Giant Global Graph
Data Web
Web 3.0
Linked Data Web
Semantic Data Web

May 12, 2009 16

Page 17: Semantic Web Landscape 2009

What is it & why do we care? (1)

“The Semantic Web”
Augments the World Wide Web
Represents the Web’s information in a machine-readable fashion
Enables…
  …targeted search
  …data browsing
  …automated agents

World Wide Web : Web pages :: The Semantic Web : Data

Page 18: Semantic Web Landscape 2009

What is it & why do we care? (2)

“Semantic Web technologies”
A family of technology standards that ‘play nice together’, including:
  Flexible data model
  Expressive ontology language
  Distributed query language
Drive Web sites, enterprise applications

The technologies enable us to build applications and solutions that were not previously possible, practical, or feasible.

Page 19: Semantic Web Landscape 2009

A Common & Coherent Set of Technology Standards

A common set of technologies:
  …enables diverse uses
  …encourages interoperability
A coherent set of technologies:
  …encourages incremental application
  …provides a substantial base for innovation
A standard set of technologies:
  …reduces proprietary vendor lock-in
  …encourages many choices for tool sets

May 12, 2009 19

Page 20: Semantic Web Landscape 2009

The (In)Famous Layer Cake

May 12, 2009 20

Page 21: Semantic Web Landscape 2009

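Semantic Web Technology Timeline

[Figure: timeline of Semantic Web technologies with marks at 1999, 2001, 2004, 2007, 2008, and 2009; the only labels that survive extraction are RIF and HCLS.]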

May 12, 2009 21

Page 22: Semantic Web Landscape 2009

2009: Where we are

As technologies & tools have evolved, Semantic Web advocates have progressed through stages:

Report on…              Execute on…
Semantic Web vision     Initial experiments
Experiments             Technology standards
Technology standards    Software packages
Software packages       Proofs of concept
Proofs of concept       Production implementations

Page 23: Semantic Web Landscape 2009

2009: Where we are (cont’d)

May 12, 2009 23

http://www.w3.org/2001/sw/sweo/public/UseCases/

Page 24: Semantic Web Landscape 2009

2009: Where we’re not

May 12, 2009 24

Semantic Web technologies are not a ‘magic crank’ for discovering new drugs (or solving other problems, for that matter)!

Image from Trey Ideker via Enoch Huang

Page 25: Semantic Web Landscape 2009

2009: Where we’re not (cont’d)

May 12, 2009 25

The Semantic Web still suffers from confusing and conflicting messages, each claiming to be the “correct” one.

XML vs. RDF?
“Ontology” vs. “ontology”?
Semantic Web vs. Linked Data?
Data integration vs. reasoning vs. KBs vs. search vs. app. development vs. …

Page 26: Semantic Web Landscape 2009

2009: Where we’re not (cont’d)

May 12, 2009 26

People with appropriate skill sets for designing & building Semantic Web solutions are not widely available.

Page 27: Semantic Web Landscape 2009

2009: Where we’re not (cont’d)

May 12, 2009 27

We don’t yet have standard solutions for privacy, trust, probability, and other elements of the Semantic Web vision.

Page 28: Semantic Web Landscape 2009

Introduction to the Semantic Web approach

Thanks to Ivan Herman for this example

How does a Semantic Web approach help us merge data sets, infer new relations, and integrate outside data sources?

May 12, 2009 28

Page 29: Semantic Web Landscape 2009

The rough structure of data integration

1. Map the various data onto an abstract data representation
   • Make the data independent of its internal representation…
2. Merge the resulting representations
3. Start making queries on the whole
   • Queries not possible on the individual data sets

May 12, 2009 29

Page 30: Semantic Web Landscape 2009

Data set “A”: A simplified book store

Books
ID                | Author | Title            | Publisher | Year
ISBN0-00-651409-X | id_xyz | The Glass Palace | id_qpr    | 2000

Authors
ID     | Name          | Home page
id_xyz | Ghosh, Amitav | http://www.amitavghosh.com

Publishers
ID     | Publisher Name | City
id_qpr | Harper Collins | London

May 12, 2009 30

Page 31: Semantic Web Landscape 2009

1st: Export your data as a set of relations
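
The export step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows come from the tables of data set “A”, but the `export_rows` helper and the rule that derives predicate names such as `a:home_page` from column headers are hypothetical, not part of the original slides.

```python
# Rows of data set "A", copied from the Books, Authors, and Publishers tables.
books = [{"ID": "ISBN0-00-651409-X", "Author": "id_xyz",
          "Title": "The Glass Palace", "Publisher": "id_qpr", "Year": "2000"}]
authors = [{"ID": "id_xyz", "Name": "Ghosh, Amitav",
            "Home page": "http://www.amitavghosh.com"}]
publishers = [{"ID": "id_qpr", "Publisher Name": "Harper Collins",
               "City": "London"}]


def export_rows(rows, prefix="a:"):
    """Turn each row into (subject, predicate, object) triples keyed by its ID."""
    triples = set()
    for row in rows:
        subject = row["ID"]
        for column, value in row.items():
            if column == "ID":
                continue
            # Hypothetical naming rule: "Home page" becomes "a:home_page".
            predicate = prefix + column.lower().replace(" ", "_")
            triples.add((subject, predicate, value))
    return triples


triples_a = export_rows(books) | export_rows(authors) | export_rows(publishers)
for triple in sorted(triples_a):
    print(triple)
```

Each non-ID cell becomes one relation, so the foreign keys (`id_xyz`, `id_qpr`) turn into edges between resources rather than values to be joined later.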

May 12, 2009 31

Page 32: Semantic Web Landscape 2009

Some notes on the data export

Data export does not necessarily mean physical conversion of the data
Relations can be virtual, generated on-the-fly at query time
  via SQL “bridges”
  scraping HTML pages
  extracting data from Excel sheets
  etc.
One can export part of the data

May 12, 2009 32

Page 33: Semantic Web Landscape 2009

Data set “F”: Another book store’s data

    A                   B                      D           E
 1  ID                  Titre                  Traducteur  Original
 2  ISBN0 2020386682    Le Palais des miroirs  A13         ISBN-0-00-651409-X
 3
 6  ID                  Auteur
 7  ISBN-0-00-651409-X  A12
11  Nom
12  Ghosh, Amitav
13  Besse, Christianne

May 12, 2009 33

Page 34: Semantic Web Landscape 2009

2nd: Export your second set of data

May 12, 2009 34

Page 35: Semantic Web Landscape 2009

3rd: start merging your data

May 12, 2009 35

Page 36: Semantic Web Landscape 2009

3rd: start merging your data (cont’d)

May 12, 2009 36

Page 37: Semantic Web Landscape 2009

4th: Merge identical resources

May 12, 2009 37

Page 38: Semantic Web Landscape 2009

Start making queries…

User of data set “F” can now ask queries like:
  “What is the title of the original version of Le Palais des miroirs?”
This information is not in data set “F”…
…but can be retrieved after merging with data set “A”!

May 12, 2009 38

Page 39: Semantic Web Landscape 2009

5th: Query the merged data set
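
A minimal Python sketch of the merge and the query, assuming both exports name the original book with the same identifier. The `urn:isbn:` URIs and the predicate names `a:title`, `f:titre`, and `f:original` are assumptions chosen for illustration; only the titles and ISBN digits come from the two data sets.

```python
ORIGINAL = "urn:isbn:000651409X"  # assumed shared URI for the English edition
FRENCH = "urn:isbn:2020386682"    # assumed URI for the French edition

triples_a = {
    (ORIGINAL, "a:title", "The Glass Palace"),
    (ORIGINAL, "a:year", "2000"),
}
triples_f = {
    (FRENCH, "f:titre", "Le Palais des miroirs"),
    (FRENCH, "f:original", ORIGINAL),
}

# Merging two graphs is a set union; the shared URI joins them.
merged = triples_a | triples_f


def original_titles(graph, french_title):
    """Follow f:titre -> f:original -> a:title through the graph."""
    books = {s for s, p, o in graph if p == "f:titre" and o == french_title}
    originals = {o for s, p, o in graph if p == "f:original" and s in books}
    return {o for s, p, o in graph if p == "a:title" and s in originals}


print(original_titles(triples_f, "Le Palais des miroirs"))  # data set "F" alone
print(original_titles(merged, "Le Palais des miroirs"))     # after the merge
```

The query returns nothing against data set “F” alone and finds the title only once the graphs are merged, which is the point of the example.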

May 12, 2009 39

Page 40: Semantic Web Landscape 2009

However, more can be achieved…

We “know” that a:author and f:auteur are really the same
But our automatic merge does not know that!
Let us add some extra information to the merged data:
  a:author is the same as f:auteur
  Both identify a Person, a category (type) for certain resources

May 12, 2009 40

Page 41: Semantic Web Landscape 2009

3rd revisited: Use the extra knowledge

May 12, 2009 41

Page 42: Semantic Web Landscape 2009

Start making richer queries!

User of data set “F” can now query:
  “What is the home page of Le Palais des miroirs’s ‘auteur’?”
The information is not in data set “F” or “A”…
…but was made available by:
  Merging data sets “A” and “F”
  Adding three simple “glue” statements

May 12, 2009 42

Page 43: Semantic Web Landscape 2009

6th: Richer queries
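
The effect of the “glue” can be sketched as follows. The sketch assumes the same illustrative URIs as before and a hypothetical `a:homepage` predicate; it models only the statement that `a:author` and `f:auteur` are the same relation, and it reaches the author through the original book, which is one reading of the question on the previous slide.

```python
BOOK = "urn:isbn:000651409X"    # assumed URI of the original book
FRENCH = "urn:isbn:2020386682"  # assumed URI of the translation
AUTHOR = "a:id_xyz"             # assumed node for the author row

merged = {
    (BOOK, "a:author", AUTHOR),
    (AUTHOR, "a:homepage", "http://www.amitavghosh.com"),
    (FRENCH, "f:titre", "Le Palais des miroirs"),
    (FRENCH, "f:original", BOOK),
}


def with_equivalent(graph, p1, p2):
    """Glue: restate every p1 triple with p2 and every p2 triple with p1."""
    swap = {p1: p2, p2: p1}
    return graph | {(s, swap[p], o) for s, p, o in graph if p in swap}


def follow(graph, starts, predicate):
    """All objects reachable from `starts` over one `predicate` edge."""
    return {o for s, p, o in graph if p == predicate and s in starts}


def auteur_home_pages(graph, french_title):
    books = {s for s, p, o in graph if p == "f:titre" and o == french_title}
    originals = follow(graph, books, "f:original")
    auteurs = follow(graph, originals, "f:auteur")  # exists only after the glue
    return follow(graph, auteurs, "a:homepage")


glued = with_equivalent(merged, "a:author", "f:auteur")
print(auteur_home_pages(merged, "Le Palais des miroirs"))  # no answer yet
print(auteur_home_pages(glued, "Le Palais des miroirs"))   # home page found
```

A query phrased entirely in data set “F” vocabulary (`f:auteur`) now succeeds against data that was only ever stated in data set “A” vocabulary.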

May 12, 2009 43

Page 44: Semantic Web Landscape 2009

We can integrate new information into our merged data set from other sources

e.g. additional information about author Amitav Ghosh

Perhaps the largest public source of general knowledge is Wikipedia

Structured data can be extracted from Wikipedia using dedicated tools

Bring in other data sources

May 12, 2009 44

Page 45: Semantic Web Landscape 2009

7th: Merge with Wikipedia data

May 12, 2009 45

Page 46: Semantic Web Landscape 2009

7th (cont’d): Merge with Wikipedia data

May 12, 2009 46

Page 47: Semantic Web Landscape 2009

7th (cont’d): Merge with Wikipedia data

May 12, 2009 47

Page 48: Semantic Web Landscape 2009

Is that surprising?

It may look like it but, in fact, it should not be…
What happened via automatic means is done every day by Web users!
The difference: a bit of extra rigour so that machines could do this, too

May 12, 2009 48

Page 49: Semantic Web Landscape 2009

What did we do?

We combined different data sets that
  …may be internal or somewhere on the Web
  …are of different formats (RDBMS, Excel spreadsheet, (X)HTML, etc.)
  …have different names for the same relations
We could combine the data because some URIs were identical
  i.e. the ISBNs in this case
We could add some simple additional information (the “glue”) to help further merge data sets
The result? Answer queries that could not previously be asked

May 12, 2009 49

Page 50: Semantic Web Landscape 2009

What did we do? (cont’d)

May 12, 2009 50

Page 51: Semantic Web Landscape 2009

The abstraction pays off because…

…the graph representation is independent of the details of the native structures
…a change in local database schemas, HTML structures, etc. does not affect the whole
  “schema independence”
…new data, new connections can be added seamlessly & incrementally

May 12, 2009 51

Page 52: Semantic Web Landscape 2009

So where is the Semantic Web?

Semantic Web technologies make such integration possible

The rest of this tutorial introduces many of these technologies.

May 12, 2009 52

Page 53: Semantic Web Landscape 2009

Agenda

Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa, POWDER)

May 12, 2009 53

Page 54: Semantic Web Landscape 2009

RDF is…

May 12, 2009 54

Resource Description Framework

Page 55: Semantic Web Landscape 2009

RDF is…

May 12, 2009 55

The data model of the Semantic Web.

Page 56: Semantic Web Landscape 2009

RDF is…

May 12, 2009 56

A schema-less data model that features unambiguous identifiers and named relations between pairs of resources.

Page 57: Semantic Web Landscape 2009

RDF is…

A labeled, directed graph of relations between resources and literal values.

RDF graphs are collections of triples
Triples are made up of a subject, a predicate, and an object
  subject --predicate--> object
Resources and relationships are named with URIs

Page 58: Semantic Web Landscape 2009

Example RDF triples

“Lee Feigenbaum works for Cambridge Semantics”
“Lee Feigenbaum was born in 1978”
“Cambridge Semantics is headquartered in Massachusetts”

Subject             | Predicate     | Object
Lee Feigenbaum      | works for     | Cambridge Semantics
Lee Feigenbaum      | born in       | 1978
Cambridge Semantics | headquartered | Massachusetts
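
As a rough illustration, the three example triples can be written down in Python. The `http://example.org/` namespace and the predicate names are hypothetical stand-ins for real URIs; the sketch shows only the shape of the data model (URI-named subjects and predicates, with objects that are either resources or literal values).

```python
from typing import NamedTuple, Union

EX = "http://example.org/"  # hypothetical namespace, for illustration only


class Triple(NamedTuple):
    subject: str             # a resource, named by a URI
    predicate: str           # a relationship, named by a URI
    object: Union[str, int]  # another resource or a literal value


graph = {
    Triple(EX + "LeeFeigenbaum", EX + "worksFor", EX + "CambridgeSemantics"),
    Triple(EX + "LeeFeigenbaum", EX + "bornIn", 1978),
    Triple(EX + "CambridgeSemantics", EX + "headquarteredIn", EX + "Massachusetts"),
}


def objects(graph, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {t.object for t in graph
            if t.subject == subject and t.predicate == predicate}


# The object of one triple can be the subject of another, so triples chain:
employers = objects(graph, EX + "LeeFeigenbaum", EX + "worksFor")
locations = set()
for employer in employers:
    locations |= objects(graph, employer, EX + "headquarteredIn")
print(locations)
```

Following two edges answers “where is Lee Feigenbaum’s employer headquartered?” without any schema having been declared up front.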

Page 59: Semantic Web Landscape 2009

Triples connect to form graphs

May 12, 2009 59

[Figure: the example triples joined into a single graph. Nodes: Lee Feigenbaum, Cambridge Semantics, 1978, Massachusetts, Boston. Edge labels: works for, born in, headquartered, lives in, capital.]

Page 60: Semantic Web Landscape 2009

Why RDF? What’s different here?

The graph data structure makes merging data with shared identifiers trivial (as we saw earlier)
Triples act as a least common denominator for expressing data
URIs for naming remove ambiguity
  …the same identifier means the same thing

May 12, 2009 60

Page 61: Semantic Web Landscape 2009

Why RDF? Incremental Integration

[Figure: a relational database contrasted with RDF. Labels: flexible graph model; URIs for naming; agile, incremental integration.]

Page 62: Semantic Web Landscape 2009

Types of RDF Tools

Triple stores
  Built on relational database
  Native RDF store
Development libraries
Full-featured application servers

Most RDF tools contain some elements of each of these.

Page 63: Semantic Web Landscape 2009

Finding RDF Tools

Community-maintained lists
  http://esw.w3.org/topic/SemanticWebTools
Emphasis on large triple stores
  http://esw.w3.org/topic/LargeTripleStores
Michael Bergman’s Sweet Tools searchable list:
  http://www.mkbergman.com/?page_id=325

May 12, 2009 63

Page 64: Semantic Web Landscape 2009

RDF Tools – (Some) Triple Stores

May 12, 2009 64

Tool           | Commercial or Open-source | Environment
Anzo           | Both                      | Java
ARC            | Open-source               | PHP
AllegroGraph   | Commercial                | Java, Prolog
Jena           | Open-source               | Java
Mulgara        | Open-source               | Java
Oracle RDF     | Commercial                | SQL / SPARQL
RDF::Query     | Open-source               | Perl
Redland        | Open-source               | C, many wrappers
Sesame         | Open-source               | Java
Talis Platform | Commercial                | HTTP (Hosted)
Virtuoso       | Both                      | C++

Page 65: Semantic Web Landscape 2009

Agenda

Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa, POWDER)

May 12, 2009 65

Page 66: Semantic Web Landscape 2009

Motivating SPARQL

May 12, 2009 66

With a query language, a client can design their own interface.

--Leigh Dodds, Talis

Page 67: Semantic Web Landscape 2009

SPARQL is…

May 12, 2009 67

SPARQL Protocol And RDF Query Language

Page 68: Semantic Web Landscape 2009

SPARQL is…

May 12, 2009 68

The query language of the Semantic Web.

Page 69: Semantic Web Landscape 2009

SPARQL is…

May 12, 2009 69

A SQL-like language for querying sets of RDF graphs.

Page 70: Semantic Web Landscape 2009

SPARQL is…

May 12, 2009 70

A simple protocol for issuing queries and receiving results over HTTP. So…

Every SPARQL client works with every SPARQL server!
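
The protocol side can be illustrated with a short Python sketch that builds the URL for a query sent as an HTTP GET, with the query text carried in the `query` parameter. The endpoint URL and the sample query are assumptions chosen for illustration, and nothing is actually sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL protocol query sent as an HTTP GET."""
    return endpoint + "?" + urlencode({"query": query})


query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
url = sparql_get_url("http://dbpedia.org/sparql", query)  # assumed endpoint
print(url)

# Any server can recover the query text from the standard `query` parameter.
assert parse_qs(urlsplit(url).query)["query"] == [query]
```

Because the request shape is fixed by the protocol rather than by the server, the same client code can be pointed at a different endpoint by changing only the URL.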

Page 71: Semantic Web Landscape 2009

Why SPARQL?

SPARQL lets us:
  Pull information from structured and semi-structured data.
  Explore data by discovering unknown relationships.
  Query and search an integrated view of disparate data sources.
  Glue separate software applications together by transforming data from one vocabulary to another.

Page 72: Semantic Web Landscape 2009

What automobiles get more than 25 miles per gallon, fit within my department’s budget, and can be purchased at a dealer located within 10 miles of one of my employees?

SELECT ?automobile
WHERE {
  ?automobile a ex:Car ;
              epa:mpg ?mpg ;
              ex:dealer ?dealer .
  ?employee a ex:Employee ;
            geo:loc ?loc .
  ?dealer geo:loc ?dealerloc .
  FILTER(?mpg > 25 && geo:dist(?loc, ?dealerloc) <= 10) .
}

[Figure labels: Web dashboard; SPARQL query; SPARQL Query Engine; Employee Directory; ERP / Budget System; EPA Fuel Efficiency Spreadsheet; Web: Dealer 1, Dealer 2, Dealer 3]

Page 73: Semantic Web Landscape 2009

SPARQL Example: Querying Wikipedia

Find me all landlocked countries with a population greater than 15 million.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX type: <http://dbpedia.org/class/yago/>
PREFIX prop: <http://dbpedia.org/property/>
SELECT ?country_name ?population
WHERE {
  ?country a type:LandlockedCountries ;
           rdfs:label ?country_name ;
           prop:populationEstimate ?population .
  FILTER ( ?population > 15000000 && langMatches(lang(?country_name), "EN") ) .
}
ORDER BY DESC(?population)

Page 74: Semantic Web Landscape 2009

SPARQL Example: Querying Wikipedia

DBPedia SPARQL Endpoint

Page 75: Semantic Web Landscape 2009

SPARQL Example: Querying Wikipedia

Page 76: Semantic Web Landscape 2009

Types of SPARQL Tools

Query engines
  Things that can run queries
  Most RDF stores provide a SPARQL engine
Query rewriters
  E.g. to query relational databases (more later)
Endpoints
  Things that accept queries on the Web and return results
Client libraries
  Things that make it easy to ask queries

May 12, 2009 76

Page 77: Semantic Web Landscape 2009

Finding SPARQL Tools

Community-maintained list of query engines
  http://esw.w3.org/topic/SparqlImplementations
Publicly accessible SPARQL endpoints
  http://esw.w3.org/topic/SparqlEndpoints
Michael Bergman’s Sweet Tools searchable list:
  http://www.mkbergman.com/?page_id=325

May 12, 2009 77

Page 78: Semantic Web Landscape 2009

(Some) SPARQL’able Data Sets

May 12, 2009 78

Page 79: Semantic Web Landscape 2009

bio2rdf.org – querying life sciences data

May 12, 2009 79

Page 80: Semantic Web Landscape 2009

bio2rdf.org – querying life sciences data

May 12, 2009 80

Page 81: Semantic Web Landscape 2009

Agenda

Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa, POWDER)

May 12, 2009 81

Page 82: Semantic Web Landscape 2009

We haven’t seen anything yet that begins to approach the long-term Semantic Web vision

Where’s the magic?

May 12, 2009 82

Page 83: Semantic Web Landscape 2009

From the explicit to the inferred

Three pieces of the Semantic Web technology stack are about describing a domain well enough to capture (some of) the meaning of resources and relationships in the domain:
  RDF Schema
  OWL
  RIF

Apply knowledge to data to get more data.

Page 84: Semantic Web Landscape 2009

RDFS is…

May 12, 2009 84

RDF Schema

Page 85: Semantic Web Landscape 2009

RDF Schema is…

Elements of:
  Vocabulary (defining terms)
    I define a relationship called “prescribed dose.”
  Schema (defining types)
    “prescribed dose” relates “treatments” to “dosages”
  Taxonomy (defining hierarchies)
    Any “doctor” is a “medical professional”
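
How such schema and taxonomy statements turn into “more data” can be sketched in Python. This is a simplified illustration that applies only three RDFS rules (subclass membership, domain, and range) to a fixpoint; the `ex:` names are hypothetical, and a real RDFS reasoner covers more rules than this.

```python
def rdfs_infer(triples):
    """Apply subclass, domain, and range rules until nothing new appears."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))  # member of the superclass too
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))  # subject gets the domain type
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))  # object gets the range type
        if new <= graph:
            return graph
        graph |= new


schema = {
    ("ex:Doctor", "rdfs:subClassOf", "ex:MedicalProfessional"),
    ("ex:prescribedDose", "rdfs:domain", "ex:Treatment"),
    ("ex:prescribedDose", "rdfs:range", "ex:Dosage"),
}
data = {
    ("ex:drSmith", "rdf:type", "ex:Doctor"),
    ("ex:treatment1", "ex:prescribedDose", "ex:dose1"),
}

inferred = rdfs_infer(schema | data) - (schema | data)
for triple in sorted(inferred):
    print(triple)
```

None of the three inferred type statements was asserted explicitly; each follows from one schema statement combined with one data statement.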

May 12, 2009 85

Page 86: Semantic Web Landscape 2009

WOL OWL is…

May 12, 2009 86

Web Ontology Language

Page 87: Semantic Web Landscape 2009

OWL is…

Elements of ontology
  Same/different identity
    “author” and “auteur” are the same relation
    two resources with the same “ISBN” are the same “book”
  More expressive type definitions
    A “cycle” is a “vehicle” with at least one “wheel”
    A “bicycle” is a “cycle” with exactly two “wheels”
  More expressive relation definitions
    “sibling” is a symmetric predicate
    the value of the “favorite dwarf” relation must be one of “happy”, “sleepy”, “sneezy”, “grumpy”, “dopey”, “bashful”, “doc”
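
Two of the relation definitions above can be approximated in a short Python sketch. It is deliberately naive: the `ex:` names and the sample data are hypothetical, and the enumeration is treated as a simple closed-world check, which glosses over the open-world semantics an actual OWL reasoner would apply.

```python
DWARFS = {"happy", "sleepy", "sneezy", "grumpy", "dopey", "bashful", "doc"}


def symmetric_closure(graph, symmetric_predicates):
    """For a symmetric predicate, (s p o) also implies (o p s)."""
    return graph | {(o, p, s) for s, p, o in graph if p in symmetric_predicates}


def outside_enumeration(graph, predicate, allowed):
    """Subject/value pairs whose value is not one of the enumerated options."""
    return {(s, o) for s, p, o in graph if p == predicate and o not in allowed}


graph = {
    ("ex:ann", "ex:sibling", "ex:bob"),
    ("ex:ann", "ex:favoriteDwarf", "grumpy"),
    ("ex:bob", "ex:favoriteDwarf", "snoozy"),  # not one of the seven
}

closed = symmetric_closure(graph, {"ex:sibling"})
print(("ex:bob", "ex:sibling", "ex:ann") in closed)
print(outside_enumeration(closed, "ex:favoriteDwarf", DWARFS))
```

The first check shows a new fact being inferred from a relation definition; the second shows a statement being flagged as suspect against one.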

May 12, 2009 87

Page 88: Semantic Web Landscape 2009

What can we do with OWL?

Answer questions of
  Consistency
    Are there any contradictions in this model?
  Classification
    What are all the inferred types of this resource?
  Satisfiability
    Are there any classes in this ontology that cannot possibly have any members?

May 12, 2009 88

Page 89: Semantic Web Landscape 2009

Building Useful Ontologies

Developing and maintaining quality ontologies is very challenging
Users need tools and services, e.g., to help check if ontology is:
  Meaningful: all named classes can have instances

http://www.aber.ac.uk/compsci/public/media/presentations/OUCL-seminar.ppt

Page 90: Semantic Web Landscape 2009

Building Useful Ontologies

Developing and maintaining quality ontologies is very challenging
Users need tools and services, e.g., to help check if ontology is:
  Meaningful: all named classes can have instances
  Correct: captures intuitions of domain experts

Page 91: Semantic Web Landscape 2009

Building Useful Ontologies

Developing and maintaining quality ontologies is very challenging
Users need tools and services, e.g., to help check if ontology is:
  Meaningful: all named classes can have instances
  Correct: captures intuitions of domain experts
  Minimally redundant: no unintended synonyms
    e.g. Banana split, Banana sundae

Page 92: Semantic Web Landscape 2009

Example: SNOMED

Large: 373,731 concepts & over 1 million terms
NHS version extended to 542,380 classes with
  19,828 additional named classes
  148,821 class drug taxonomy (primitive hierarchy)
OWL reasoner (FaCT++) classified NHS ontology
  Able to classify whole ontology in <4 hours
  Interesting results come from 19,828 additional named classes
  180 missing subClass relationships were found, e.g.:
    Periocular_dermatitis subClassOf Disease_of_face

May 12, 2009 92

Page 93: Semantic Web Landscape 2009

Example: SNOMED

May 12, 2009 93

Page 94: Semantic Web Landscape 2009

RIF is…

May 12, 2009 97

Rule Interchange Format

Page 95: Semantic Web Landscape 2009

RIF is…

Standard representation for exchanging sets of logical and business rules
Logical rules
  A buyer buys an item from a seller if the seller sells the item to the buyer
  A customer becomes a "Gold" customer as soon as his cumulative purchases during the current year top $5000
Production rules
  Customers that become "Gold" customers must be notified immediately, and a golden customer card will be printed and sent to them within one week
  For shopping carts worth more than $1000, "Gold" customers receive an additional discount of 10% of the total amount
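
To make the two “Gold” customer rules concrete, here is a small Python sketch of what they compute. RIF itself is a format for exchanging such rules between rule systems, not a Python API, so the function names and the sample purchases are purely hypothetical.

```python
def gold_customers(purchases, threshold=5000):
    """Logical rule: cumulative purchases this year above the threshold."""
    totals = {}
    for customer, amount in purchases:
        totals[customer] = totals.get(customer, 0) + amount
    return {customer for customer, total in totals.items() if total > threshold}


def gold_discount(cart_total, is_gold):
    """Production rule: 10% off carts worth more than $1000 for Gold customers."""
    if is_gold and cart_total > 1000:
        return cart_total / 10
    return 0


purchases = [("ann", 3000), ("ann", 2500), ("bob", 5000)]
gold = gold_customers(purchases)
print(gold)
print(gold_discount(1200, "ann" in gold))
print(gold_discount(1200, "bob" in gold))
```

A rule engine that imports these rules from a RIF document would be expected to reach the same conclusions, whatever its internal rule language.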

May 12, 2009 98

Page 96: Semantic Web Landscape 2009

Developing Tools and Infrastructure

Editors/environments
  OilEd, Protégé, Swoop, TopBraid, Ontotrack, …

Page 97: Semantic Web Landscape 2009

Developing Tools and Infrastructure

Editors/environments
  OilEd, Protégé, Swoop, TopBraid, Ontotrack, …
Reasoning systems
  Cerebra, FaCT++, KAON2, Pellet, Racer, CEL, …

Page 98: Semantic Web Landscape 2009

Visualizing and Publishing Vocabularies

May 12, 2009 101

Page 99: Semantic Web Landscape 2009

Reusable, public ontologies

May 12, 2009 102

Measurement Units Ontology

The Event Ontology

FOAF

Page 100: Semantic Web Landscape 2009

Agenda

Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa, POWDER)

May 12, 2009 103

Page 101: Semantic Web Landscape 2009

Fantasy Land Architecture

May 12, 2009 104

Ontology / Schema+

Custom UI

Custom UI

Custom UI

Custom UI

Custom UI

Custom UI

Page 102: Semantic Web Landscape 2009

Reality

May 12, 2009 105

Internet

OracleRDB

DB2XML

LDAP Directory

Custom UI

Custom UI

Custom UI

Custom UI

Custom UI

Custom UI

Page 103: Semantic Web Landscape 2009

GRDDL is…

May 12, 2009 106

Gleaning Resource Descriptions from Dialects of Languages

Page 104: Semantic Web Landscape 2009

GRDDL is…

May 12, 2009 107

A method for authoritatively getting RDF data from XML and XHTML documents.

Page 105: Semantic Web Landscape 2009

GRDDL is…

May 12, 2009 108

A mechanism for authoritatively deriving RDF data from families of XML and XHTML documents.

Page 106: Semantic Web Landscape 2009

GRDDL tools

Community-maintained list:
  http://esw.w3.org/topic/GrddlImplementations

Most GRDDL tools are adapters to existing RDF stores or SPARQL engines to allow loading or querying data from XML and XHTML sources.

Host System | GRDDL tool
Jena        | GRDDL Reader for Jena
RDFLib      | GRDDL.py
Redland     | (built in)
Swignition  | (built in)
Virtuoso    | GRDDL “Sponger”

Page 107: Semantic Web Landscape 2009

RDB2RDF is…

May 12, 2009 110

Relational Database to RDF

Page 108: Semantic Web Landscape 2009

RDB2RDF is…

May 12, 2009 111

A proposed W3C Working Group to define a standard way to map from relational databases to RDF (and SPARQL).

Page 109: Semantic Web Landscape 2009

RDB2RDF tools

Survey of existing approaches:
  http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf

Tool       | Mapping Approach               | Dynamic vs. Static (ETL)
Anzo       | D2RQ configuration graph       | Both
Asio Tools | OWL file, SWRL rules           | Both
Dartgrid   | XML file, visual mapper        | Dynamic
D2RQ       | D2RQ configuration file        | Both
R2O        | R2O XML file                   | Both
RDBtoOnto  | Constraint rules               | Static (ETL)
SDS        | EII Query Engine/OOM XML       | Both
Triplify   | SQL config file                | Linked Data
Virtuoso   | RDF View Meta-Schema Language  | Both

Page 110: Semantic Web Landscape 2009

What about… everything else?

May 12, 2009 113

Standards don’t yet exist, but many tools exist to derive RDF and/or run SPARQL queries against other sources of data.

Page 111: Semantic Web Landscape 2009

LDAP Directories

May 12, 2009 114

Squirrel RDF
http://jena.sourceforge.net/SquirrelRDF/

Page 112: Semantic Web Landscape 2009

Excel spreadsheets

May 12, 2009 115

Anzo for Excel
http://www.cambridgesemantics.com/products/anzo_for_excel

Page 113: Semantic Web Landscape 2009

Excel spreadsheets

May 12, 2009 116

Semantic Discovery System
http://insilicodiscovery.com/installation/index.php

Page 114: Semantic Web Landscape 2009

Web-based data sources

May 12, 2009 117

Virtuoso Sponger Cartridges
http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger

Page 115: Semantic Web Landscape 2009

Unstructured Text

May 12, 2009 118

Calais
http://www.opencalais.com/

Page 116: Semantic Web Landscape 2009

Unstructured Text

May 12, 2009 119

Zemanta Web Service
http://developer.zemanta.com/

Page 117: Semantic Web Landscape 2009

Agenda

Introduction
The data model (RDF)
The query language (SPARQL)
Adding structure & semantics (RDFS, OWL, RIF)
Working in the real world (GRDDL, RDB2RDF)
Working on the Web (Linked Data, RDFa, POWDER)

May 12, 2009 120

Page 118: Semantic Web Landscape 2009

Linked Data is…

A simple set of 4 guidelines for publishing RDF data on the Web (over HTTP)
Developed by Tim Berners-Lee in 2006

1. Use URIs as names for things
   • Globally unique identity
2. Use HTTP URIs
   • Everyone has a Web browser/client
3. When someone looks up a URI, provide useful information
   • …in the form of RDF data
4. Include links to other URIs
   • Foster discovery of additional information
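
Guidelines 2 and 3 amount to an ordinary HTTP lookup that asks for RDF. A minimal Python sketch of such a request is shown below; the resource URI and the choice of `application/rdf+xml` as the requested media type are assumptions for illustration, and the request is only constructed, not sent.

```python
from urllib.request import Request


def linked_data_request(uri, media_type="application/rdf+xml"):
    """Prepare an HTTP GET that asks the server for an RDF representation."""
    return Request(uri, headers={"Accept": media_type})


# Illustrative URI; any HTTP URI that names a thing would do.
request = linked_data_request("http://dbpedia.org/resource/Amitav_Ghosh")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the lookup; the RDF that comes back would in turn contain further URIs to follow, which is guideline 4.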

May 12, 2009 121

Page 119: Semantic Web Landscape 2009

The Linking Open Data Project is...

A community project started within the W3C Semantic Web Education & Outreach group in 2007

A wealth of existing, open Web-based data sets exposed in RDF and linked together

A growing number of publicly available SPARQL endpoints

The first steps of “The” Semantic Web? No longer easily measured or depicted!

May 12, 2009 122

Page 120: Semantic Web Landscape 2009

The LOD “cloud”, May 2007

May 12, 2009 123

Page 121: Semantic Web Landscape 2009

The LOD “cloud”, March 2008

May 12, 2009 124

Page 122: Semantic Web Landscape 2009

The LOD “cloud”, September 2008

May 12, 2009 125

Page 123: Semantic Web Landscape 2009

The LOD “cloud”, March 2009

May 12, 2009 126

Page 124: Semantic Web Landscape 2009

Application-specific portions of the cloud
Notably, bio-related data sets (in light purple), some by the W3C “Linking Open Drug Data” task force

May 12, 2009 127

Page 125: Semantic Web Landscape 2009

Sindice - Another view of data on the Web

May 12, 2009 128

Page 126: Semantic Web Landscape 2009

Tools: Publishing linked data

Many tools we’ve already seen publish RDF data according to linked data principles
  E.g. Talis platform, Virtuoso, Triplify
Others sit on top of existing systems and make the data available as Linked Data
  E.g. pubby

May 12, 2009 129

Page 127: Semantic Web Landscape 2009

Tools: the Data Browser

May 12, 2009 130

World Wide Web : Web pages :: The Semantic Web : Data

World Wide Web : Web browser :: Linked Data Web : Data browser

Page 128: Semantic Web Landscape 2009

Tabulator: Generic Data Browser

May 12, 2009 131

Page 129: Semantic Web Landscape 2009

Disco Hyperdata Browser

May 12, 2009 132

Page 130: Semantic Web Landscape 2009

OpenLink Data Explorer

May 12, 2009 133

Page 131: Semantic Web Landscape 2009

Marbles Linked Data Browser

May 12, 2009 134

Page 132: Semantic Web Landscape 2009

DBPedia Mobile

May 12, 2009 135

Page 133: Semantic Web Landscape 2009

DBPedia Mobile

May 12, 2009 136

Page 134: Semantic Web Landscape 2009

DBPedia Mobile

May 12, 2009 137

Page 135: Semantic Web Landscape 2009

DBPedia Mobile

May 12, 2009 138

Page 136: Semantic Web Landscape 2009

QDOS – your online digital status

May 12, 2009 139

Page 137: Semantic Web Landscape 2009

BBC Music Beta

May 12, 2009 140

Page 138: Semantic Web Landscape 2009

Producer-oriented Web to consumer-oriented Web

On the current Web…
  Content publishers decide what can be done with the data (via links, script)
On the Semantic Web…
  Content publishers publish actionable data
  Content consumers decide how to act on it

May 12, 2009 141

Page 139: Semantic Web Landscape 2009

UltraLink

May 12, 2009 142

UltraLink is Novartis’s solution for cross-linking over 1,500,000 biologic and chemical terms, including synonyms, taxonomies, and pointers into data repositories.

Page 140: Semantic Web Landscape 2009

UltraLink

What if an acquisition brings with it a new Web-based corpus of pathway data that uses terms not recognized by the annotators?
  New text miners must be created & deployed
  Finding & consuming data are too tightly coupled

May 12, 2009 143

Page 141: Semantic Web Landscape 2009

RDFa is…

May 12, 2009 144

RDF in Attributes

Page 142: Semantic Web Landscape 2009

RDFa is…

May 12, 2009 145

A collection of HTML attributes that allow RDF to be embedded directly in Web pages.

Page 143: Semantic Web Landscape 2009

Why RDFa?

Don’t Repeat Yourself (DRY)
In-context metadata (copy & paste)
Authoritative (no screen scraping)

May 12, 2009 146

Page 144: Semantic Web Landscape 2009

Who’s using RDFa?

May 12, 2009 147

STW Thesaurus for Economics

Page 145: Semantic Web Landscape 2009

RDFa in action

May 12, 2009 148

Page 146: Semantic Web Landscape 2009

POWDER is…

May 12, 2009 149

Protocol for Web Description Resources

Page 147: Semantic Web Landscape 2009

descriptions applied to groups of online resources

150

http://www.slideshare.net/fabien_gandon/powder-in-a-nutshell-presentation

Page 148: Semantic Web Landscape 2009

one description

many resources

151

Page 149: Semantic Web Landscape 2009

grouping mechanisms...

... list URIs
... domain names, paths
... regular expressions on URIs
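
The three grouping mechanisms can be sketched in Python as a membership test. This is loosely modeled on the idea only: the `ResourceGroup` class, its parameters, and the example URIs are hypothetical and do not reproduce POWDER’s actual vocabulary or matching rules.

```python
import re
from urllib.parse import urlsplit


class ResourceGroup:
    """A group of resources defined by URI list, host plus path, or regex."""

    def __init__(self, uris=(), hosts=(), path_prefix="", pattern=None):
        self.uris = set(uris)
        self.hosts = set(hosts)
        self.path_prefix = path_prefix
        self.pattern = re.compile(pattern) if pattern else None

    def contains(self, uri):
        if uri in self.uris:  # explicit list of URIs
            return True
        parts = urlsplit(uri)
        if parts.hostname in self.hosts and parts.path.startswith(self.path_prefix):
            return True       # domain name and path
        return bool(self.pattern and self.pattern.search(uri))  # regular expression


group = ResourceGroup(hosts={"example.org"}, path_prefix="/kids/",
                      pattern=r"\.example\.net/safe/")
description = {"audience": "children"}  # one description for the whole group

for uri in ("http://example.org/kids/games.html",
            "http://example.org/adults/x",
            "http://cdn.example.net/safe/a.png"):
    print(uri, description if group.contains(uri) else None)
```

One description object is written once and then answers queries about any individual resource that falls inside the group.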

152

Page 150: Semantic Web Landscape 2009

descriptions may be grouped

queries are on individual resources

Page 151: Semantic Web Landscape 2009

description…
• Which resources does the DR describe?
• What is the description?
• Who has created the description?
• When was the description created?
• Until when is the description considered valid?
• From when is the description considered valid?
• Does anybody agree with this description?
• Do other descriptions exist about this group of resources?

154

Page 152: Semantic Web Landscape 2009

in order to...

trust
protect
authorize
adapt
search
monitor

Page 153: Semantic Web Landscape 2009

Thanks & Questions

May 12, 2009 156

[email protected]