Kai Eckert: A Linked Data based Infrastructure for DM2E (All WP Meeting, November 30th, 2012, Vienna)
Kai Eckert
University of Mannheim
DM2E All WP Meeting
November 30th, 2012
Vienna
A Linked Data based Infrastructure for DM2E
(WP2)
Agenda
Motivation: Follow the Linked Data Principles
Linked Data Architecture
Provenance in the Europeana Data Model
OAI-ORE vs. Named Graphs
Linked Data Publishing with Provenance
Architecture
Integrated Webclient
The Europeana Data Model (EDM)
Provenance realized by means of OAI-ORE.
Problems?
Users have to understand Proxies.
Users have to understand Aggregations.
Wouldn't named graphs be nicer?
Removing the proxies
Proxies are (proxy-) resources for the actual resources. Every data provider has an "own" resource to describe, as a placeholder.
Practical approach: we use named graphs to distinguish descriptions from different providers within our store.
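The idea can be sketched with a toy quad store in plain Python: each provider's statements go into their own named graph, so two providers can describe the same resource directly, without a proxy resource in between. This is only an illustration under stated assumptions; the graph names and the item URI are hypothetical placeholders, and a real deployment would use an RDF store with named graph support.

```python
from collections import defaultdict

# A toy quad store: graph name -> set of (subject, predicate, object) triples.
# All example.org URIs below are hypothetical placeholders, not DM2E identifiers.
store = defaultdict(set)

def add(graph, s, p, o):
    """Add one statement to the named graph of a provider."""
    store[graph].add((s, p, o))

def descriptions(resource):
    """Return {graph: [(predicate, object), ...]} for every graph describing the resource."""
    result = {}
    for graph, triples in store.items():
        matches = sorted((p, o) for s, p, o in triples if s == resource)
        if matches:
            result[graph] = matches
    return result

ITEM = "http://example.org/item/1"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Two providers describe the same resource; the graph keeps them apart.
add("http://example.org/graph/providerA", ITEM, DC_TITLE, "Faust")
add("http://example.org/graph/providerB", ITEM, DC_TITLE, "Faust. Eine Tragödie")

by_provider = descriptions(ITEM)
```

Here `by_provider` has one entry per graph, which plays the role that one proxy per provider plays in EDM.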
Removing the Aggregations?
What is an aggregation?
"Aggregations are used in Europeana to represent the complex constructs that are provided by contributors. An aggregation is associated to the object that it is about, by the property edm:aggregatedCHO."
Level of aggregation:
1 aggregation per providedCHO.
EuropeanaAggregation aggregates other aggregations (from data providers).
A Named Graph per Resource
Corresponds to the EDM Aggregations.
Fine-grained... feasible?
Named Graphs as first-class members in the model.
Statements about the aggregation that are only valid for one resource!
If we allow this, the named graph must never get lost!
A Named Graph per Collection
This information must not get lost either.
But: It is not only valid for one resource. We are now more flexible regarding the publication of the data.
But: Where are the aggregations?
One Named Graph per Provided Dataset
Naturally fits the provenance requirements: all statements stem from some dataset.
Positive aspect: Data providers do not have to care any more!
Provenance and Versioning on Collection Level
Overlapping Resource Descriptions
Crosswalk to EDM
Are we still backwards compatible?
YES :-)
Publishing
What's inside our store?
RDF Datasets per Collection, organized in Named Graphs.
NG URI scheme:
http://data.dm2e.eu/data/collection/[provider]/[collectionId]/[version]
Additional provenance statements for each collection.
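Because the named graph URI encodes provider, collection and version, a client can recover these parts from the graph name alone. A minimal sketch, assuming that none of the three components contains a slash; the class and function names are hypothetical, and the example values are placeholders:

```python
from typing import NamedTuple

# Prefix taken from the NG URI scheme above.
PREFIX = "http://data.dm2e.eu/data/collection/"

class GraphName(NamedTuple):
    provider: str
    collection_id: str
    version: str

def parse_graph_uri(uri):
    """Split a named graph URI into provider, collection ID and version."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a collection graph URI: {uri}")
    parts = uri[len(PREFIX):].split("/")
    # Assumption: exactly three non-empty path segments follow the prefix.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected [provider]/[collectionId]/[version]: {uri}")
    return GraphName(*parts)

# "providerx" and "coll1" are made-up placeholder values.
name = parse_graph_uri("http://data.dm2e.eu/data/collection/providerx/coll1/2")
```

The version is kept as a string here, since the slides do not say which form version identifiers take.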
Make it available
Web-Documents (with URI) deliver RDF, provenance is included as statements about the URI.
On client side, the document creates a new Named Graph, with the URI as name.
RESTful API (Publishing)
http://data.dm2e.eu/data/...
... collection/[provider]/[collectionID]/[version] (dm2e:Collection) => dump of one whole ingested dataset
... resource/[provider]/[collectionId]/[identifier] (edm:providedCHO) => 303 to latest version of describing Aggregation
... collection/[provider]/[collectionID]/[version]/[identifier] (ore:Aggregation) => data about a single resource
... linkset/[provider]/[linksetID]/[version] (dm2e:LinkSet) => generated links
... linkset/[provider]/[linksetID]/[version]/[provider]/[collectionID]/[identifier] (dm2e:LinkAggregation) => links for a specific resource
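The five URI patterns above can be written down as small builder functions, together with the 303 redirect from a resource URI to its latest aggregation. This is a sketch, not the DM2E implementation: the function names are hypothetical, the second [provider] in the link aggregation pattern is read here as the provider of the linked resource, and versions are assumed to be integers that can be compared numerically.

```python
BASE = "http://data.dm2e.eu/data"

def collection_uri(provider, collection_id, version):
    """dm2e:Collection: dump of one whole ingested dataset."""
    return f"{BASE}/collection/{provider}/{collection_id}/{version}"

def resource_uri(provider, collection_id, identifier):
    """edm:providedCHO: version-independent URI of a resource."""
    return f"{BASE}/resource/{provider}/{collection_id}/{identifier}"

def aggregation_uri(provider, collection_id, version, identifier):
    """ore:Aggregation: data about a single resource in one dataset version."""
    return f"{collection_uri(provider, collection_id, version)}/{identifier}"

def linkset_uri(provider, linkset_id, version):
    """dm2e:LinkSet: generated links."""
    return f"{BASE}/linkset/{provider}/{linkset_id}/{version}"

def link_aggregation_uri(link_provider, linkset_id, version,
                         provider, collection_id, identifier):
    """dm2e:LinkAggregation: links for one specific resource."""
    base = linkset_uri(link_provider, linkset_id, version)
    return f"{base}/{provider}/{collection_id}/{identifier}"

def resolve_resource(provider, collection_id, identifier, known_versions):
    """Return (status, Location) for a request to a resource URI.

    Assumption: known_versions is a non-empty list of comparable version
    numbers, and the largest one is the latest.
    """
    latest = max(known_versions)
    return 303, aggregation_uri(provider, collection_id, latest, identifier)

# Placeholder values, not real DM2E identifiers.
status, location = resolve_resource("providerx", "coll1", "item42", [1, 2, 3])
```

With these placeholder values, `location` points at the aggregation of `item42` in version 3 of the collection.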
Provenance in Documents
Generated from provenance information about datasets:
dc:creator => Data provider
dc:date => Timestamp
dm2e:version => version number
dm2e:nextVersion => link to next version of the document
dm2e:previousVersion => link to previous version
dm2e:links => link to a linkset
Optional: PROV statements for full provenance chain.
Maintained by the DM2E infrastructure.
Version always means the version of the underlying dataset.
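How such provenance statements might be generated for one document can be sketched as a function that emits N-Triples lines. Several things here are assumptions: the dm2e: namespace URI is a placeholder (the dm2e: terms are not yet defined), dc: is taken to be the Dublin Core elements namespace, all literals are written as plain strings, and the function and parameter names are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"
DM2E = "http://example.org/dm2e#"  # placeholder, NOT the real dm2e: namespace

def iri(value):
    return f"<{value}>"

def literal(value):
    """Plain N-Triples literal with minimal escaping of backslash and quote."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def provenance_ntriples(doc_uri, provider, timestamp, version,
                        previous_version=None, next_version=None, linkset=None):
    """Build the provenance statements about one document URI."""
    statements = [
        (DC + "creator", literal(provider)),
        (DC + "date", literal(timestamp)),
        (DM2E + "version", literal(version)),
    ]
    # Optional links are only emitted when the caller supplies them.
    if previous_version:
        statements.append((DM2E + "previousVersion", iri(previous_version)))
    if next_version:
        statements.append((DM2E + "nextVersion", iri(next_version)))
    if linkset:
        statements.append((DM2E + "links", iri(linkset)))
    return [f"{iri(doc_uri)} {iri(p)} {o} ." for p, o in statements]

# Placeholder document URIs and provider name.
lines = provenance_ntriples(
    "http://data.dm2e.eu/data/collection/providerx/coll1/2",
    provider="Provider X",
    timestamp="2012-11-30",
    version=2,
    previous_version="http://data.dm2e.eu/data/collection/providerx/coll1/1",
)
```

Since the statements are about the document URI itself, a client that stores the document as a named graph under that URI keeps the provenance attached to the graph.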
Consuming our Data (WP1 and WP3)
Fetch data from our URIs.
Fetch suitable linksets from our URIs (links provided with data).
Local data cleansing (recommended for WP3): Unify all URIs based on owl:sameAs links for better local querying (or use reasoning).
Client has to maintain a mapping for original URIs per Named Graph for the proper representation of annotations.
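The local cleansing step can be sketched with a union-find over the owl:sameAs links: every URI is rewritten to one canonical URI per equivalence class, and the original URIs are remembered per named graph. Choosing the lexicographically smallest URI as the canonical one is an assumption made for this sketch, as are the function names and the example URIs.

```python
def build_canonical(same_as_pairs):
    """Map every URI in the owl:sameAs pairs to one canonical URI per class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Always keep the smaller URI as root, so the root of a class
            # is its lexicographically smallest member.
            small, large = sorted((root_a, root_b))
            parent[large] = small
    return {uri: find(uri) for uri in list(parent)}

def unify(quads, canonical):
    """Rewrite subjects and objects; remember original URIs per named graph."""
    rewritten = []
    originals = {}  # graph -> {canonical URI -> set of original URIs}
    for graph, s, p, o in quads:
        new_s = canonical.get(s, s)
        new_o = canonical.get(o, o)
        for old, new in ((s, new_s), (o, new_o)):
            if old != new:
                originals.setdefault(graph, {}).setdefault(new, set()).add(old)
        rewritten.append((graph, new_s, p, new_o))
    return rewritten, originals

# Hypothetical example: one sameAs link and one statement in graph "g1".
canonical = build_canonical([("http://b.example/x", "http://a.example/x")])
quads = [("g1", "http://b.example/x", "http://purl.org/dc/elements/1.1/title", "Faust")]
rewritten, originals = unify(quads, canonical)
```

The `originals` mapping is what lets a client translate an annotation on the unified URI back to the URI that the named graph actually used.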
Annotations
Annotations are always on resource or statement level.
Subject of an annotation:
[Graph URI]/URLencode(resource URI) or [Graph URI]/URLencode(subject,predicate,object)
Example:
http://data.dm2e.eu/data/collection/[provider]/[collectionID]/[version]/[identifier]/[subject,predicate,object]
Similar to XPointer, SharedCanvas, ...
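The construction of annotation subjects can be sketched with Python's `urllib.parse.quote`. One detail is an assumption of this sketch rather than something the scheme above specifies: for statements, subject, predicate and object are each encoded separately and then joined with commas, so that a comma inside a URI or literal (encoded as %2C) cannot be confused with the separator. The function names and example URIs are hypothetical.

```python
from urllib.parse import quote, unquote

def resource_annotation_subject(graph_uri, resource_uri):
    """[Graph URI]/URLencode(resource URI)"""
    return f"{graph_uri}/{quote(resource_uri, safe='')}"

def statement_annotation_subject(graph_uri, subject, predicate, obj):
    """[Graph URI]/URLencode(subject,predicate,object)

    Assumption: each part is encoded on its own and joined with literal commas.
    """
    encoded = ",".join(quote(part, safe="") for part in (subject, predicate, obj))
    return f"{graph_uri}/{encoded}"

def split_annotation_subject(annotation_subject):
    """Recover the graph URI and the decoded parts from an annotation subject."""
    graph_uri, encoded = annotation_subject.rsplit("/", 1)
    return graph_uri, [unquote(part) for part in encoded.split(",")]

# Placeholder graph and resource URIs.
GRAPH = "http://data.dm2e.eu/data/collection/providerx/coll1/1"
on_resource = resource_annotation_subject(GRAPH, "http://example.org/item/1")
on_statement = statement_annotation_subject(
    GRAPH,
    "http://example.org/item/1",
    "http://purl.org/dc/elements/1.1/title",
    "Faust, Part One",
)
```

Because `safe=''` also encodes slashes, the encoded part is always a single path segment and can be split off again from the graph URI.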
What is missing?
Several definitions of dm2e: terms.
A vocabulary for the annotations:
Requirements of the scientists
Mark wrong statements!
Comment on statements.
...
A vocabulary for the classification of linksets:
Automatically created
Manually created
Recall-oriented (exploratory, but with wrong links)
Precision-oriented (incomplete, but high quality)
...
...
Implementation pending ;-)
Questions?
Suggestions?