LD & the Research and Education Space
Christophe Guéret
Archive Development (TV) / TSA (D&E)
@cgueret
Disclaimer: I will be experimenting with following the guidelines from Alice
Bartlett and Russell Davies (mostly)
@mildlydiverting made the Google Template
1. Linked Data
2. Research and Education Space
1. Linked Data
Linked Data, Linked Open Data, Semantic Web
Three different concepts!
Linked Data
Method for publishing data using HTTP as a storage and communication layer
More info on https://en.wikipedia.org/wiki/Linked_data
Linked Open Data
Linked Data applied to Open Data
For a definition of Open Data: http://opendefinition.org/
Semantic Web
Using the semantic capabilities of Linked Data to derive information
Why Linked Data?
Slides derived from http://www.slideshare.net/cgueret/linking-knowledge-spaces
Dealing with documents until 1989
1. Find a source for the document
2. Find a way to parse the document
3. Create links and index the content
4. Repeat on each update
Then came the Web...
Standardized, connected, decentralised, easy
[Diagram: documents hosted on one server and on another server]
Dealing with data until, well, now
1. Find a source for the data
2. Find a way to parse the data
3. Create links and index the data
4. Repeat on each update
We deal with data the way we dealt with documents 20 years ago
Many formats, no links, ETL
Okay it’s not that bad...
● We have Web APIs now
● All the APIs are RESTful
● All the APIs do JSON, or XML
But what about schemas? And links?
Linked Data = doing the Web again but for data instead of documents
It’s possible because
“Factual knowledge is a graph”
http://videolectures.net/iswc2011_van_harmelen_universal/
Concretely
“Lille is in France and called Rijsel in Dutch”
http://dbpedia.org/resource/Lille http://dbpedia.org/ontology/country http://dbpedia.org/resource/France
http://dbpedia.org/resource/Lille http://www.w3.org/2000/01/rdf-schema#label “Rijsel”@nl
Not rocket science:
● Use IRIs for identifiers
● Bind identifiers to data about them
● Bind identifiers to other identifiers
● Use IRIs for typing the links
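The four points above can be sketched in a few lines of Python. This is only an illustration, assuming a graph is held as a plain set of (subject, predicate, object) tuples; the `graph` variable and the `objects` helper are hypothetical names, not part of any library. The IRIs are the ones from the Lille example.

```python
# Hypothetical in-memory model: a graph is a set of (subject, predicate, object) triples.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

graph = {
    # An identifier bound to another identifier, through a link typed by an IRI.
    (DBR + "Lille", DBO + "country", DBR + "France"),
    # An identifier bound to data about it: a label with a language tag.
    (DBR + "Lille", RDFS + "label", ("Rijsel", "nl")),
}


def objects(graph, subject, predicate):
    """Return every object linked from `subject` through `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


countries = objects(graph, DBR + "Lille", DBO + "country")
labels = objects(graph, DBR + "Lille", RDFS + "label")
```

Because every identifier is an IRI, two such sets published by different people can be merged with a plain set union.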
Vocabularies
● QB: statistics
● PROV-O: provenance
● SIOC: social media
● Schema.org: search engine results
● And many more ...
Every data set is a graph that is part of a bigger graph
You only need to know the vocabulary used to consume it meaningfully
Vocabularies are part of the graph too!
http://dbpedia.org/ontology/country http://www.w3.org/2000/01/rdf-schema#comment “The country where the thing is located.”@en
Content negotiation
http://dbpedia.org/resource/Cardiff
http://dbpedia.org/page/Cardiff
http://dbpedia.org/data/Cardiff.ttl
Content negotiation
http://www.bbc.co.uk/programmes/b006q2x0#programme
http://www.bbc.co.uk/programmes/b006q2x0
http://www.bbc.co.uk/programmes/b006q2x0.rdf
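From the client side, content negotiation amounts to sending an `Accept` header with the request. The sketch below uses Python's standard `urllib.request` and only builds the requests without sending them; the `negotiate` helper is a hypothetical name, and the media types `text/html` and `text/turtle` are assumed to be the ones matching the HTML page and the `.ttl` file shown above.

```python
from urllib.request import Request


def negotiate(iri, media_type):
    """Build (but do not send) a GET request asking for one representation of `iri`."""
    return Request(iri, headers={"Accept": media_type})


# Same IRI, two different representations requested.
html_request = negotiate("http://dbpedia.org/resource/Cardiff", "text/html")
turtle_request = negotiate("http://dbpedia.org/resource/Cardiff", "text/turtle")
```

Passing either request to `urllib.request.urlopen` would send it. A server that supports content negotiation would then be expected to answer with, or redirect to, the matching document.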
Two major traps to avoid
● Using the same IRI for a thing and the document describing it
● Applying a license to a thing instead of applying it to a document
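One way to keep the two apart, as in the BBC example above, is to give the thing a fragment identifier and leave the document without one. The sketch below shows this with the standard `urllib.parse.urldefrag`. The licence statement is hypothetical: the choice of the Dublin Core `license` property and of a Creative Commons licence is an assumption made for illustration only.

```python
from urllib.parse import urldefrag

# The thing (a programme) is identified by an IRI with a fragment...
thing_iri = "http://www.bbc.co.uk/programmes/b006q2x0#programme"

# ...and the document describing it is the same IRI without the fragment.
document_iri, fragment = urldefrag(thing_iri)

# Hypothetical licence statement: attached to the document, never to the thing.
license_triple = (
    document_iri,
    "http://purl.org/dc/terms/license",
    "https://creativecommons.org/licenses/by/4.0/",
)
```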
And the Semantic Web then?
LOD + Semantics = Semantic Web
● Document vocabularies with logic => new data gets derived
● “Lille is in France” + “All cities in France are in Europe” => “Lille is in Europe”
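The Lille derivation can be sketched as a tiny forward-chaining loop. This is a toy illustration, not how a real reasoner works: facts are plain tuples, the `derive` function and the rule format are hypothetical, and the rule is simplified to "anything in France is in Europe", ignoring the "cities" condition.

```python
facts = {("Lille", "is in", "France")}

# Hypothetical rule format: (condition, conclusion), each a (predicate, object) pair.
# Simplified reading of "All cities in France are in Europe".
rules = [(("is in", "France"), ("is in", "Europe"))]


def derive(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (if_p, if_o), (then_p, then_o) in rules:
            for s, p, o in list(derived):
                new_fact = (s, then_p, then_o)
                if p == if_p and o == if_o and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


closure = derive(facts, rules)
```

After the call, `closure` holds the original fact together with the derived one, "Lille is in Europe".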
http://www.slideshare.net/ConnectedDataLondon/ten-years-of-linked-data-at-the-bbc
2. Research and Education Space
http://res.space
The Research and Education Space
In practice we
● Help GLAMs to publish LOD
● Crawl and index that LOD
● Provide a search interface over the data crawled
Our crawler follows links across data publishers to hunt for (properly licensed) LOD
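The idea of following links and keeping only properly licensed data can be sketched as a breadth-first traversal. This is not the actual RES crawler: the `WEB` dictionary stands in for real HTTP fetches, the `example` URLs and licence names are invented, and continuing to follow links out of unlicensed documents is an assumption of this sketch.

```python
from collections import deque

# Hypothetical stand-in for the Web: document URL -> licence and outgoing links.
WEB = {
    "http://a.example/data": {"license": "CC-BY", "links": ["http://b.example/data"]},
    "http://b.example/data": {"license": None, "links": ["http://c.example/data"]},
    "http://c.example/data": {"license": "CC0", "links": ["http://a.example/data"]},
}
ACCEPTED_LICENSES = {"CC-BY", "CC0"}


def crawl(seed):
    """Follow links breadth-first from `seed`; index only properly licensed documents."""
    indexed = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        document = WEB.get(url)
        if document is None:
            continue  # Unknown document: nothing to index or follow.
        if document["license"] in ACCEPTED_LICENSES:
            indexed.append(url)
        for link in document["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


indexed = crawl("http://a.example/data")
```

Here the document from the second publisher is visited but not indexed, because it carries no accepted licence.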
There is a set of rules used to interpret the data in a specific way
All of that is open source! Both what we use and what we code :-)
https://github.com/orgs/bbcarchdev
http://acropolis.org.uk/