RDF
principles and case studies
Diego Valerio Camarda regesta.exe
www.regesta.com
dvcama @ github&twitter
a brief introduction to linked open data
why things instead of documents
[Figure: a web made of HTML pages]
Today's Web: at least 1.85 billion indexed documents; some say 1 trillion documents are online.
The best HTML parser is actually still the HUMAN BRAIN.
Today's Web is not the Web that Tim proposed in 1998.
what about URIs and RDF a new way to publish data on the web
ids are ambiguous and suck!
› Use URIs as names for things
› Use HTTP URIs so that people can look up those names
› Use the standards (RDF, SPARQL) to provide useful information
› Include links to other URIs so that they can discover more things
linked data principles Tim Berners-Lee July 27, 2006
HTTP://yourdomain.com/something
what about URIs and RDF turning web pages into “real” data
“[…] the little animal was referred to as: ‘il tasso del tasso del Tasso’” (in Italian, “tasso” is a badger, a yew tree, and the surname of the poet Torquato Tasso: “the badger of the yew of Tasso”)
Achille Campanile
It’s time for machines (to parse pages)
badger: http://it.dbpedia.org/resource/Meles_meles
yew: http://it.dbpedia.org/resource/Taxus
the poet: http://it.dbpedia.org/resource/Torquato_Tasso
author of the sentence: http://it.dbpedia.org/resource/Achille_Campanile
A new way to design databases RDF
(aka ’define knowledge’)
Go Triples, go! the standard (old) approach
ID_P  COGNOME  NOME   REF_ID_SOCIETA  GENERE
1     Camarda  Diego  1               maschio
2     …        …      …               …

ID_SOCIETA  DENOMINAZIONE    SITO
1           Regesta.exe srl  www.regesta.com
Go Triples, go! the new (cool) approach
Subject: <http://www.regesta.com/diego>
Predicate: <http://xmlns.com/foaf/0.1/familyName>
Object: "Camarda"
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/firstName> "Diego" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/gender> "male" .
The same three triples, sharing one subject with ";":
<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" .
Go Triples, go! ok, but what is a “diego”?
Go Triples, go! it’s a person!
<http://www.regesta.com/diego> a <http://xmlns.com/foaf/0.1/Person> .
Go Triples, go! adding a Class
<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" .
<http://www.regesta.com/diego> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
Go Triples, go! building a graph
<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" ;
    <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> ;
    <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .
<http://www.regesta.com/about>
    <http://www.w3.org/1999/...#type> <http://www.w3.org/ns/org#Organization> ;
    <http://www.w3.org/2004/02/skos/core#prefLabel> "Regesta.exe srl" ;
    <http://xmlns.com/foaf/0.1/homepage> <http://www.regesta.com> .
Go Triples, go! Objects could be Subjects
considering diego and regesta:
<diego> <memberOf> <regesta>
but, <regesta> <locatedIn> <rome>
<diego> <placeOfBirth> <rome>
<rome> <parentADM> <italy>
<silvia> <placeOfBirth> <italy>
<silvia> <…> <…>
<…> <…> <…> = a knowledge graph!
[Figure: graph with the nodes diego, regesta, silvia, rome, italy]
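A minimal Python sketch of the graph built above, assuming plain strings stand in for full URIs (the short names are the slides' shorthand, and the helper names `objects` and `reachable` are made up for illustration). Because an object can itself be a subject, following links is just repeated lookup:

```python
# Each triple is (subject, predicate, object); short names stand in for URIs.
TRIPLES = {
    ("diego", "memberOf", "regesta"),
    ("regesta", "locatedIn", "rome"),
    ("diego", "placeOfBirth", "rome"),
    ("rome", "parentADM", "italy"),
    ("silvia", "placeOfBirth", "italy"),
}


def objects(subject, predicate):
    """Return every object linked from `subject` through `predicate`."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def reachable(start):
    """Return every node reachable from `start` by following triples forward."""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        for s, _, o in TRIPLES:
            if s == node and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen
```

Starting from diego, the walk passes through regesta and rome and ends in italy, even though no single triple connects diego to italy directly.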
A lot of sentences to achieve (descriptive) freedom
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/firstName> "Diego" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/gender> "male" .
<http://www.regesta.com/diego> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://www.regesta.com/diego> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com> .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/familyName> "Mazzini" .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/firstName> "Silvia" .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/gender> "female" .
<http://www.regesta.com/silvia> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://www.regesta.com/silvia> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com> .
<http://www.regesta.com> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/org#Organization> .
<http://www.regesta.com> <http://www.w3.org/2004/02/skos/core#prefLabel> "Regesta.exe srl" .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/knows> <http://www.regesta.com/diego> .
<…> <…> <…>.
<noBeer> <makeGoCrazy> <homer>.
<noTv> <makeGoCrazy> <homer>.
<noBeer> <makeGoCrazy> <homer>.
<noTv> <makeGoCrazy> <homer>.
<noBeer> …
Standards for semantic web
RDF http://www.w3.org/standards/techs/rdf
SPARQL http://www.w3.org/standards/techs/sparql
ONTOLOGIES http://www.w3.org/standards/semanticweb/ontology
Did you study HTML? Good! It's time for a new standard
It's time for a new standard: RDF
The Resource Description Framework is a general-purpose language for representing information in the Web.
It's time for a new standard: SPARQL
The SPARQL Protocol and RDF Query Language is a query language and protocol for RDF.
It's time for a new standard: Ontologies
On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
foaf:firstName
dc:title
rdfs:label
Pre:fixes (ontologies) just a few words
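A prefix is only an abbreviation: the prefixed name and the full URI identify the same term. A small Python sketch of the expansion, using the three prefixes from the slide above (the `expand` helper is hypothetical, written only to illustrate the idea):

```python
# Prefix table copied from the slide above.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name):
    """Turn a name such as 'foaf:firstName' into its full URI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    # The full URI is the namespace followed by the local name.
    return PREFIXES[prefix] + local
```

So `expand("foaf:firstName")` gives the same predicate URI that was written out in full in the triples earlier in the deck.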
Browsing the web of data
Resource Description Framework
MUST!
› SPARQL endpoint
› dereferenceable URIs
› content negotiation
› standard ports, like 80 (HTTP)
› JSONP support
SHOULD! (please do it, for me)
› up-to-date
› the endpoint URL is easy to deduce from resources
› the resources are described by dc:title or rdfs:label
› the endpoint hosts a page for humans
› the resources and the endpoint are on the same domain
One single API a world to explore
One single API interlinking
<a href=“…”>click here</a>
owl:sameAs rdfs:seeAlso
…
SELECT * {?minnesota ?banana ?sun}
SPARQL a must know query language
SPARQL group graph pattern
[Figure: the knowledge graph with the nodes diego, regesta, silvia, rome, italy]
SELECT ?person {
  ?person <placeOfBirth> ?place ;
          <memberOf> ?company .
  ?company <locatedIn> ?place .
}
Result: <diego>
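To see why only diego comes back, here is a rough Python sketch of how a group graph pattern can be evaluated over the small graph from the earlier slides. It is a naive matcher written for illustration, not how a real triplestore works; short strings stand in for URIs and the `match` function is an assumption of this sketch:

```python
TRIPLES = [
    ("diego", "memberOf", "regesta"),
    ("regesta", "locatedIn", "rome"),
    ("diego", "placeOfBirth", "rome"),
    ("rome", "parentADM", "italy"),
    ("silvia", "placeOfBirth", "italy"),
]


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns at once."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the value it was first bound to.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # All three positions agreed: try the remaining patterns.
            yield from match(rest, triples, candidate)


# The three patterns of the SELECT query above.
PATTERN = [
    ("?person", "placeOfBirth", "?place"),
    ("?person", "memberOf", "?company"),
    ("?company", "locatedIn", "?place"),
]
people = sorted({b["?person"] for b in match(PATTERN, TRIPLES)})
```

Silvia matches the first pattern but has no `memberOf` triple, so her binding is dropped; diego survives because his company is located in the place where he was born.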
SELECT ?person ?prop ?obj {
  ?person <placeOfBirth> ?place ;
          <memberOf> ?company ;
          ?prop ?obj .
  ?company <locatedIn> ?place .
}
(turn the page)
person   prop             obj
<diego>  rdf:type         foaf:Person
<diego>  foaf:firstName   "Diego"
<diego>  foaf:familyName  "Camarda"
<diego>  foaf:gender      "male"
<diego>  org:memberOf     <regesta>
SPARQL describe
DESCRIBE <diego>
(turn the page)
<diego> rdf:type foaf:Person .
<diego> foaf:firstName "Diego" .
<diego> foaf:familyName "Camarda" .
<diego> foaf:gender "male" .
<diego> org:memberOf <regesta> .
<silvia> foaf:knows <diego> .
SPARQL construct
CONSTRUCT {<diego> foaf:donaldDuck ?c} WHERE {<diego> ?b ?c .}
(turn the page)
<diego> foaf:donaldDuck foaf:Person .
<diego> foaf:donaldDuck "Diego" .
<diego> foaf:donaldDuck "Camarda" .
<diego> foaf:donaldDuck "male" .
<diego> foaf:donaldDuck <regesta> .
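CONSTRUCT builds new triples from a template, once per match of the WHERE pattern. A small Python sketch of that idea for the query above, assuming prefixed names and short strings stand in for URIs (the `construct` helper is hypothetical and covers only this one template shape):

```python
# The triples returned by DESCRIBE <diego> on the previous slide.
DATA = [
    ("diego", "rdf:type", "foaf:Person"),
    ("diego", "foaf:firstName", "Diego"),
    ("diego", "foaf:familyName", "Camarda"),
    ("diego", "foaf:gender", "male"),
    ("diego", "org:memberOf", "regesta"),
    ("silvia", "foaf:knows", "diego"),
]


def construct(data, subject, new_predicate):
    """Mimic CONSTRUCT {<s> new_predicate ?c} WHERE {<s> ?b ?c}."""
    # Keep subject and object of each match, swap in the new predicate.
    return [(s, new_predicate, o) for s, _, o in data if s == subject]


rewritten = construct(DATA, "diego", "foaf:donaldDuck")
```

The triple about silvia is left out because its subject is not diego, which is why the result on the slide has five statements rather than six.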
SPARQL minimum requirements
DISTINCT, COUNT
GRAPH, PREFIX
isBlank, isIRI, isLiteral, isNumeric
FILTER, REGEX, STR
FILTER NOT EXISTS, MINUS
ORDER BY, OFFSET, LIMIT
for other stuff: http://www.w3.org/TR/sparql11-query/
Please start negotiating content right now!
Client: Hi dude, I accept: text/html,application/xhtml+xml
Server: Great! I’ll serve you a web page (HTML page)
Client: Hi dude, I accept: application/rdf+xml
Server: Great… 303, redirect! (RDF data)
Client: Hi dude, I accept: pizza/margherita
Server: mmm… sorry (406 error)
application/rdf+xml, application/rdf+json, application/rdf+n3, application/ld+json
text/turtle, application/x-turtle, text/n3, text/rdf+n3
application/trix, application/x-trig, application/x-binary-rdf, text/x-nquads
application/xml, text/xml, application/xhtml+xml, application/json, text/plain
application/sparql-results+xml, application/sparql-results+json
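The three outcomes of the dialogue can be sketched from the server's side in Python. This is a simplified illustration, not a complete implementation: the `negotiate` function and its two type sets are assumptions of the sketch, and real negotiation also honours q-values and wildcards such as `*/*`.

```python
# A few of the media types listed above, split by what the server would do.
RDF_TYPES = {"application/rdf+xml", "text/turtle", "text/n3", "application/ld+json"}
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def negotiate(accept_header):
    """Return (status, note) for a simplified Accept header."""
    # Drop parameters such as ';q=0.9' and surrounding whitespace.
    wanted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in wanted:
        if media_type in HTML_TYPES:
            return 200, "serve the web page"
        if media_type in RDF_TYPES:
            return 303, "redirect to the RDF document"
    return 406, "not acceptable"
```

The same URI therefore answers a browser with a page and a program with data, and refuses politely when it cannot produce what was asked for.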
Please start negotiating content using CURL…
curl -L -H "Accept: application/rdf+xml" http://dati.camera.it/ocd/governo.rdf/g102
curl -L -H "Accept: text/n3" http://dati.camera.it/ocd/governo.rdf/g102
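The same request can be prepared from Python's standard library. This sketch only builds the request object and leaves the network call commented out; the `build_request` helper is a name chosen for illustration.

```python
import urllib.request

# URL taken from the curl example above.
URL = "http://dati.camera.it/ocd/governo.rdf/g102"


def build_request(url, media_type):
    """Build a GET request that asks for one specific representation."""
    return urllib.request.Request(url, headers={"Accept": media_type})


request = build_request(URL, "application/rdf+xml")
# urllib follows redirects by default, much like `curl -L`:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

Changing only the `media_type` argument (for example to `text/n3`) asks the same URI for a different serialization.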
Please start negotiating content …or a framework!
Java: Sesame / Jena
Python: RDFLib
Ruby: RDF.rb
Node.js: sparql-client
or, as I do, a simple HTTP GET + parsing the result as JSON or XML
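For the "simple HTTP GET + parsing" route, the JSON that a SPARQL endpoint returns for a SELECT query has a fixed shape (`head.vars` and `results.bindings`). A Python sketch of flattening it, using a hand-written sample rather than a real response (the sample values and the `rows` helper are illustrative assumptions):

```python
import json

# A hand-written sample in the SPARQL 1.1 Query Results JSON format;
# the values are illustrative, not fetched from a real endpoint.
SAMPLE = """
{
  "head": {"vars": ["person"]},
  "results": {"bindings": [
    {"person": {"type": "uri", "value": "http://www.regesta.com/diego"}}
  ]}
}
"""


def rows(payload):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    document = json.loads(payload)
    variables = document["head"]["vars"]
    return [
        # A variable may be unbound in a given row, so check before reading it.
        {name: binding[name]["value"] for name in variables if name in binding}
        for binding in document["results"]["bindings"]
    ]
```

This drops the `type` information of each value; keep it if the caller needs to tell URIs from literals.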
RDF data: storing and deploying
It’s slow, so keep calm
RDF will probably transform data into big data
usually: 1 record → 15 triples
e.g. Chamber of Deputies: 2,949,771 votes → 64,948,856 triples
Triplestores: I just need a SPARQL endpoint
Virtuoso
Sesame
Fuseki (Jena)
Owlim / Bigdata (Sesame)
AllegroGraph
D2R server
ARC2
…
I just really need http://yourdomain/sparql
Case studies
SPARQL magic: a query for all seasons
select distinct ?o where {?s a ?o}
select ?o (count(distinct ?s) as ?n) where {?s a ?o} group by ?o
select (count(?s) as ?n) where {?s ?p ?o}
select (count(?s) as ?n) ?class where {?s ?p ?o; a ?class} group by ?class
select distinct ?p where {?s a <http://classe>; ?p ?o}
select ?p (count(?p) as ?n) where {?s a <http://classe>; ?p ?o} group by ?p
select ?s where {?s a <http://classe>}
select ?p ?o where {<http://URI> ?p ?o}
select ?p ?o ?p1 ?o2 where {<http://URI> ?p ?o. OPTIONAL{?o ?p1 ?o2. FILTER(isBlank(?o))}}
select distinct ?s ?title where {?s a <http://classe>; dc:title ?title. FILTER(REGEX(?title, 'word', 'i'))} LIMIT 100
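Any of these queries can be sent with a plain HTTP GET by putting the text in the `query` parameter defined by the SPARQL protocol. A Python sketch that only builds the URL; the endpoint is the Chamber of Deputies one from the case studies, while the `query_url` helper and the `limit 10` are additions made for illustration:

```python
from urllib.parse import urlencode

ENDPOINT = "http://dati.camera.it/sparql"  # endpoint named in the case studies


def query_url(endpoint, query):
    """Return a GET URL carrying the query in the standard `query` parameter."""
    # urlencode percent-escapes braces, spaces and question marks for us.
    return endpoint + "?" + urlencode({"query": query})


url = query_url(ENDPOINT, "select distinct ?o where {?s a ?o} limit 10")
```

Fetching that URL with an `Accept` header such as `application/sparql-results+json` is then ordinary content negotiation.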
Case studies Chamber of deputies
http://dati.camera.it/sparql
http://storia.camera.it
From SPARQL to html
Case studies Central State Archive
http://acs.beniculturali.it/sparql
http://labs.regesta.com/reloadProject
From SPARQL to html
Useful links
W3C standards: http://www.w3.org/standards/semanticweb/
OKFN endpoints status (and list): http://sparqles.okfn.org
LodLive (a SPARQL navigator): http://en.lodlive.it
a very good intro to RDF: https://github.com/JoshData/rdfabout/blob/gh-pages/intro-to-rdf.md
Tim Berners-Lee’s “Linked Data – 5 stars ranking”: http://www.w3.org/DesignIssues/LinkedData.html
My github page: http://github.com/dvcama