Introduction to the Semantic Web

Kjetil Kjernsmo <[email protected]>.

INF5100, 2013-09-11.

Creative Commons Attribution-Sharealike 3.0 Norway License.


History

Thought of from the start.

Agents on the web should help people use web data.

HTML should have given us a lot.

By 1997, it was clear that it wouldn't.

Resource Description Framework Working Group established 1998.

Convergence of three communities:

Web

Digital libraries

Computer Inference


Motivations

Connolly's Lament

The bane of my existence is doing things that I know the computer could do for me.

If the Web page with your personal calendar says you'll be in New York next Thursday, and the page with your workgroup calendar says you'll be in London all week, shouldn't the computer be able to warn you about the conflict? And shouldn't it go ahead and ask you if it's OK to cancel your flight to London and purchase this other ticket to New York?

Dan Connolly, 1998


Motivations

Simple use cases

Distributed address books

Metadata repositories

Annotation of Web resources

Data sharing

...long list...


Motivations

Distinguishing features

Economy of arbitrary combination.

The value of links.

...and a dash of reasoning.


A wide spectrum of applications

Business intelligence

Reasoning systems (see INF3580/4580)

Authentication and authorization on the web

Distributed social networks

Database applications

Sensor web


Available data

[Figure: diagram of the available linked datasets (for example DBpedia, GeoNames, MusicBrainz, UniProt, Eurostat and data.gov.uk) and the links between them, as of September 2011. Datasets are grouped by domain: Media, Geographic, Publications, User-generated content, Government, Cross-domain, Life sciences.]


Open Data vs Linked Open Data

★ Available on the web (whatever format), but with an open licence

★★ Available as machine-readable structured data (e.g. excel instead of image scan of a table)

★★★ as (2) plus non-proprietary format (e.g. CSV instead of excel)

★★★★ All the above plus, Use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★ All the above, plus: Link your data to other people's data to provide context

Tim Berners-Lee


Resource Description Framework (RDF)

From two documents to a triple.

1. Subject - Predicate - Object

2. Add semantics to the link.

3. Generalize the URI to identify anything.

4. The object can be a string (with language or datatype).
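The triple model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Literal` and `Triple` classes and the `example.org` URIs are made up for this sketch and do not come from any RDF library.

```python
from typing import NamedTuple, Optional, Union


class Literal(NamedTuple):
    """A string value with an optional language tag or datatype URI."""
    value: str
    language: Optional[str] = None
    datatype: Optional[str] = None


class Triple(NamedTuple):
    subject: str                 # a URI that can identify anything
    predicate: str               # a URI that gives the link its meaning
    object: Union[str, Literal]  # another URI, or a literal string


FOAF = "http://xmlns.com/foaf/0.1/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical URIs, used only for illustration.
alice = "http://example.org/person/alice"
bob = "http://example.org/person/bob"

graph = {
    Triple(alice, FOAF + "knows", bob),
    Triple(alice, FOAF + "name", Literal("Alice", language="en")),
    Triple(alice, FOAF + "age", Literal("42", datatype=XSD + "integer")),
}

# Triples whose object is a literal rather than a link to another resource.
literal_objects = [t for t in graph if isinstance(t.object, Literal)]
print(len(graph), len(literal_objects))
```

A set of such triples is a graph: the first triple is a link between two resources, the other two attach strings to a resource.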


Example

Turtle syntax:

    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix dc: <http://purl.org/dc/terms/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @base <http://data.semanticweb.org/> .

    </person/kjetil-kjernsmo> a foaf:Person ;
        foaf:name "Kjetil Kjernsmo"@no ;
        dc:publication </conference/iswc/2013/proceedings-2/paper-23/> .

    </conference/iswc/2013/proceedings-2/paper-23/> a foaf:Document ;
        dc:title "Introducing Statistical Design of Experiments to SPARQL Endpoint Evaluation"@en ;
        foaf:maker </person/kjetil-kjernsmo>, </person/john-s-tyssedal> ;
        dc:date "2013"^^xsd:gYear .

    </person/john-s-tyssedal> a foaf:Person ;
        foaf:name "John S. Tyssedal"@no ;
        foaf:knows </person/kjetil-kjernsmo> .
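Every short name in the Turtle example stands for a full URI. The Python sketch below shows how the `@base` and `@prefix` declarations expand. The `expand` helper is hypothetical and covers only three term forms (`<...>`, `prefix:local` and the keyword `a`); it is not a Turtle parser.

```python
from urllib.parse import urljoin

# Declarations copied from the Turtle example.
BASE = "http://data.semanticweb.org/"
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand(term: str) -> str:
    """Expand one Turtle term to a full URI (hypothetical helper)."""
    if term == "a":
        # In Turtle, "a" is shorthand for rdf:type.
        return RDF_TYPE
    if term.startswith("<") and term.endswith(">"):
        # Relative references are resolved against the base.
        return urljoin(BASE, term[1:-1])
    prefix, _, local = term.partition(":")
    return PREFIXES[prefix] + local


person = expand("</person/kjetil-kjernsmo>")
name_property = expand("foaf:name")
print(person)
print(name_property)
```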


Dynamic Schema

For databases, the schema can be created and updated at any time.

It can be formal or informal, in the form of an ontology or vocabulary.

Don't let anyone force their ontology on you!

But...

Economy makes it wise to adopt when you can

Unexpected reuse makes it wise to map when you can

...Thus, start at Linked Open Vocabularies.


Open World Assumption

There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don’t know. But there are also unknown unknowns. These are things we do not know we don’t know.

United States Secretary of Defense Donald Rumsfeld

If an assertion isn't there you do not know anything about it!
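The difference from a closed-world database can be sketched in Python. The data and both function names are assumptions made for this illustration. The open-world function can only answer "true" or "unknown"; concluding "false" would need an explicit statement or inference, which the sketch does not model.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data: a single asserted fact.
known: Set[Triple] = {("kjetil", "knows", "john")}


def holds_closed_world(triple: Triple) -> bool:
    """Closed world: whatever is not asserted is taken to be false."""
    return triple in known


def holds_open_world(triple: Triple) -> Optional[bool]:
    """Open world: a missing assertion means unknown (None), not false."""
    return True if triple in known else None


question = ("john", "knows", "alice")
print(holds_closed_world(question))
print(holds_open_world(question))
```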


SPARQL

A query language for the Semantic Web

Standardised query language:

SPARQL 1.0: 2008

SPARQL 1.1: 2013

Graph patterns matching subgraphs.

Can return tables (SELECT), RDF (CONSTRUCT, DESCRIBE) or booleans (ASK).

Remote queries over HTTP.

Most queries are simpler to write than SQL.
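A remote query is an ordinary HTTP request. The Python sketch below builds such a request with the standard library but does not send it. The endpoint URL is a placeholder, not a real service. The `query` parameter and the `application/sparql-results+json` media type are the ones defined by the SPARQL protocol and results formats.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Hypothetical endpoint; replace with the address of a real SPARQL service.
ENDPOINT = "http://example.org/sparql"

query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 10"

# The query text travels URL-encoded in the "query" parameter of a GET request.
url = ENDPOINT + "?" + urlencode({"query": query})
request = Request(url, headers={"Accept": "application/sparql-results+json"})

# Decode the URL again to confirm that the query survives the encoding.
sent_query = parse_qs(urlsplit(request.full_url).query)["query"][0]

print(request.get_method())
print(request.full_url)
# Sending it would be: urllib.request.urlopen(request).read()
```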


SPARQL

Simple examples:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    SELECT ?coauthorname WHERE {
        ?author a foaf:Person ;
            foaf:knows ?coauthor .
        ?coauthor foaf:name ?coauthorname .
    }

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    DESCRIBE ?coauthor WHERE {
        ?author a foaf:Person ;
            foaf:knows ?coauthor .
    }
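How graph patterns match subgraphs can be sketched in Python. This is a naive nested-loop matcher written for illustration, not how a real SPARQL engine is implemented. The data uses short names instead of full URIs, and `match_pattern` and `evaluate` are hypothetical helpers.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical data, loosely following the Turtle example.
DATA: List[Triple] = [
    ("kjetil", "rdf:type", "foaf:Person"),
    ("john", "rdf:type", "foaf:Person"),
    ("john", "foaf:knows", "kjetil"),
    ("kjetil", "foaf:name", "Kjetil Kjernsmo"),
    ("john", "foaf:name", "John S. Tyssedal"),
]


def match_pattern(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend the binding so that the pattern equals the triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable binds on first use and must agree with itself afterwards.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def evaluate(patterns: List[Triple], data: List[Triple]) -> List[Binding]:
    """Find every binding that satisfies all patterns at once."""
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        extended_solutions: List[Binding] = []
        for binding in solutions:
            for triple in data:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    extended_solutions.append(extended)
        solutions = extended_solutions
    return solutions


# The three triple patterns of the SELECT query.
QUERY: List[Triple] = [
    ("?author", "rdf:type", "foaf:Person"),
    ("?author", "foaf:knows", "?coauthor"),
    ("?coauthor", "foaf:name", "?coauthorname"),
]

names = [solution["?coauthorname"] for solution in evaluate(QUERY, DATA)]
print(names)
```

Each solution is one row of the result table; SELECT picks which variables become columns.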


Real, big SPARQL example

    PREFIX geo: <http://www.geonames.org/ontology#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX mo: <http://purl.org/ontology/mo/>

    SELECT ?name ?homepage ?place ?population
    WHERE {
        ?artisturi a mo:MusicArtist ;
            foaf:based_near ?placeuri ;
            foaf:name ?name .
        OPTIONAL { ?artisturi foaf:homepage ?homepage . }
        ?placeuri geo:name ?place ;
            geo:inCountry <http://www.geonames.org/countries/#NO> .
        OPTIONAL { ?placeuri geo:population ?population . }
    }
    ORDER BY ?population

Run this at http://dbtune.org/jamendo/store/.
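The OPTIONAL clauses in this query behave like a left outer join: an artist without a homepage still appears in the result, with that column left empty. A minimal Python sketch follows, assuming made-up bindings and a hypothetical `left_join` helper. It ignores FILTERs inside OPTIONAL.

```python
from typing import Dict, List

Binding = Dict[str, str]


def compatible(a: Binding, b: Binding) -> bool:
    """Two bindings are compatible if they agree on every shared variable."""
    return all(b[key] == value for key, value in a.items() if key in b)


def left_join(required: List[Binding], optional: List[Binding]) -> List[Binding]:
    """Keep every required binding, extended where an optional one fits."""
    results: List[Binding] = []
    for left in required:
        extensions = [{**left, **right} for right in optional if compatible(left, right)]
        # With no compatible optional binding, the row is kept unchanged.
        results.extend(extensions or [left])
    return results


# Hypothetical bindings: two artists, only one of them has a homepage.
artists = [
    {"?artisturi": "artist1", "?name": "Band A"},
    {"?artisturi": "artist2", "?name": "Band B"},
]
homepages = [{"?artisturi": "artist1", "?homepage": "http://example.org/band-a"}]

rows = left_join(artists, homepages)
for row in rows:
    print(row["?name"], row.get("?homepage"))
```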


SPARQL extensions

Not standardized:

Data stream queries.

Free text queries.

Various extension functions.

Spatial extensions.

Temporal extensions.

Improved federation.


My own research

Federation on the odd cases

Because most of the world is far from the norm.

It's not normal, either, it is Pareto.

Statistics are needed for join order optimization.

We can expose statistics in the Service Description,

but we must not make the description too verbose.
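Why statistics matter for join order can be shown with a common textbook heuristic: evaluate the most selective triple pattern first. The Python sketch below is not the method from this research. The per-predicate counts are invented, and `estimated_matches` and `order_patterns` are hypothetical helpers.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, str, str]

# Hypothetical per-predicate triple counts, of the kind an endpoint could publish.
PREDICATE_COUNTS: Dict[str, int] = {
    "rdf:type": 500_000,
    "foaf:name": 120_000,
    "geo:inCountry": 3_000,
}


def estimated_matches(pattern: Pattern, default: int = 1_000_000) -> int:
    """Estimate how many triples match a pattern, from its predicate alone."""
    return PREDICATE_COUNTS.get(pattern[1], default)


def order_patterns(patterns: List[Pattern]) -> List[Pattern]:
    """Put the pattern with the fewest estimated matches first."""
    return sorted(patterns, key=estimated_matches)


query = [
    ("?artist", "rdf:type", "mo:MusicArtist"),
    ("?artist", "foaf:name", "?name"),
    ("?place", "geo:inCountry", "?country"),
]

plan = order_patterns(query)
print([pattern[1] for pattern in plan])
```

A real optimizer would also consider which patterns share variables, so that it does not join unrelated patterns early. Publishing only per-predicate counts is one way to keep such statistics compact.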


Master theses available!

Database-oriented Semantic Web-theses:

Defensive measures for SPARQL endpoints (with Audun Jøsang).

OWL-driven join-order optimizations.

Implementing SPARQL in Free Software RDBMSes (in particular SQLite).

Efficiently import and index RDF data in fully indexed stores.

Design of Experiments-driven evaluation.

Hypermedia systems (with Ruben Verborgh):

Very practical JavaScript programming.

Authentication and authorization.

Backend for driving interaction.

(Linked) (Open) Data.

Traits-based programming with RDF/OWL.

Separation of Concerns in templating systems.

Sensor-web systems.

Practical Semantic Web-theses.
