
Towards Versioning of Arbitrary RDF Data


Page 1: Towards Versioning of Arbitrary RDF Data

LEDS

WWW.LEDS-PROJEKT.DE

TOWARDS VERSIONING OF ARBITRARY RDF DATA

FROMMHOLD M., NAVARRO PIRIS R., ARNDT N., TRAMP S., PETERSEN N., AND MARTIN M.

September 14, 2016


Page 2: Towards Versioning of Arbitrary RDF Data

INTRODUCTION

What is it?

A change tracking system for RDF datasets that creates invertible patches for SPARQL Update queries.

What is it not (yet)?

A full-fledged versioning system with support for versioning operations.
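To illustrate what "invertible patch" can mean, here is a minimal Python sketch. It assumes a patch is simply a pair of triple sets (removed, added) and that a graph is a plain set of string triples. The `Patch` class and the example IRIs are hypothetical and are not taken from the presented system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Hypothetical patch: the triples an update removed and the triples it added."""
    removed: frozenset
    added: frozenset

    def apply(self, graph: set) -> set:
        # Deletions are applied before insertions, as in SPARQL Update.
        return (graph - self.removed) | self.added

    def invert(self) -> "Patch":
        # Swapping the two sets yields the patch that undoes this one.
        return Patch(removed=self.added, added=self.removed)


old_name = ("ex:alice", "foaf:name", '"Alice"')
new_name = ("ex:alice", "foaf:name", '"Alice B."')

graph = {old_name}
patch = Patch(removed=frozenset({old_name}), added=frozenset({new_name}))

updated = patch.apply(graph)
restored = patch.invert().apply(updated)
```

Inversion only restores the previous state if the patch records what actually changed: every removed triple was present before the update, and every added triple was absent.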


Page 3: Towards Versioning of Arbitrary RDF Data

MOTIVATION

LEDS research project

Allow for co-evolution of RDF datasets outside and inside of an enterprise for better integration of public and private data.

http://www.leds-projekt.de/


Page 4: Towards Versioning of Arbitrary RDF Data

MOTIVATION

LUCID research project

Automatic distribution of RDF data changes in decentralized value chain networks to optimize the flow of information.

http://www.lucid-project.org/


Page 5: Towards Versioning of Arbitrary RDF Data

REQUIREMENTS

• Support for versioning operations
  • commute, revert, merge [Cassidy and Ballantine 2007]

• Detection of effective changes

• Full RDF support

• Protection against unperceived history manipulation

• Distributable patches
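The requirement "detection of effective changes" can be illustrated with a small Python sketch. It assumes graphs and update requests are plain sets of string triples. The function `effective_changes` and the example data are hypothetical, and the sketch is not the detection mechanism of the presented system.

```python
def effective_changes(before: set, deletes: set, inserts: set):
    """Return (removed, added): the triples that actually changed.

    Deleting a triple that is absent, or inserting one that is already
    present, requests a change but does not alter the graph.
    """
    # Deletions are applied before insertions, as in SPARQL Update.
    after = (before - deletes) | inserts
    return before - after, after - before


a = ("ex:s", "ex:p", "ex:a")
b = ("ex:s", "ex:p", "ex:b")
c = ("ex:s", "ex:p", "ex:c")
d = ("ex:s", "ex:p", "ex:d")

before = {a, b}
# The request deletes b and c (c is absent) and inserts a and d (a is present).
removed, added = effective_changes(before, deletes={b, c}, inserts={a, d})
```

Here only `b` is effectively removed and only `d` is effectively added, so a patch built from these two sets stays invertible.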


Page 6: Towards Versioning of Arbitrary RDF Data

FEATURES

✓ Invertible patches

✓ Change detection

✓ Patches with quad and blank node support

✓ Version integrity by signing of patches

✓ eccrev: RDF changes and revisions vocabulary
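One way to picture "version integrity by signing of patches" is a chain in which each signature also covers the signature of the previous patch, so that a later change to an earlier patch becomes detectable. The Python sketch below is only an illustration under stated assumptions: it uses an HMAC with a shared key as a stand-in for whatever signature scheme the system actually uses, and the serialization format and the functions `serialize`, `sign_patch`, and `verify_history` are hypothetical.

```python
import hashlib
import hmac


def serialize(removed, added) -> bytes:
    # Hypothetical canonical form: sorted triples, prefixed with "-" or "+".
    lines = [f"- {s} {p} {o} ." for s, p, o in sorted(removed)]
    lines += [f"+ {s} {p} {o} ." for s, p, o in sorted(added)]
    return "\n".join(lines).encode("utf-8")


def sign_patch(key: bytes, previous_signature: str, removed, added) -> str:
    # The signature covers the previous signature, chaining the patches.
    message = previous_signature.encode("utf-8") + b"\n" + serialize(removed, added)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_history(key: bytes, history) -> bool:
    previous = ""
    for removed, added, signature in history:
        expected = sign_patch(key, previous, removed, added)
        if not hmac.compare_digest(expected, signature):
            return False
        previous = signature
    return True


key = b"demo-key"  # assumption: a shared secret, for illustration only
t1 = ("ex:s", "ex:p", "ex:a")
t2 = ("ex:s", "ex:p", "ex:b")

sig1 = sign_patch(key, "", set(), {t1})
sig2 = sign_patch(key, sig1, {t1}, {t2})
history = [(set(), {t1}, sig1), ({t1}, {t2}, sig2)]

# Rewriting the first patch without re-signing breaks verification.
tampered = [(set(), {t2}, sig1), ({t1}, {t2}, sig2)]
```

With this chaining, `verify_history(key, history)` succeeds, while the tampered history fails at its first entry.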


Page 7: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 8: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 9: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 10: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 11: Towards Versioning of Arbitrary RDF Data

BLANK NODES

Page 12: Towards Versioning of Arbitrary RDF Data

BLANK NODES

Page 13: Towards Versioning of Arbitrary RDF Data

BLANK NODES

Page 14: Towards Versioning of Arbitrary RDF Data

BLANK NODES

Page 15: Towards Versioning of Arbitrary RDF Data

BLANK NODES

Tummarello et al. 2005:

Given an RDF statement s, the Minimum Self-contained Graph (MSG) containing that statement, written MSG(s), is the set of RDF statements comprised of the following:

1. The statement in question;

2. Recursively, for all the blank nodes involved by statements included in the description so far, the MSG of all the statements involving such blank nodes.
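The recursive definition above can be sketched in Python as a worklist traversal. The sketch assumes triples are tuples of strings and that blank nodes are written with a `_:` prefix. The function names and the example graph are hypothetical, and this is not the implementation used in the presented system.

```python
def is_blank(term: str) -> bool:
    # Assumption: blank nodes are written with a "_:" prefix.
    return term.startswith("_:")


def msg(statement, graph):
    """Return the Minimum Self-contained Graph of `statement` within `graph`."""
    result = {statement}
    # Blank nodes can only appear as subject or object of a statement.
    pending = [t for t in (statement[0], statement[2]) if is_blank(t)]
    seen = set()
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        for s, p, o in graph:
            if s == node or o == node:
                result.add((s, p, o))
                # Newly reached blank nodes are expanded in turn.
                pending.extend(t for t in (s, o) if is_blank(t) and t not in seen)
    return result


graph = {
    ("ex:alice", "ex:address", "_:a"),
    ("_:a", "ex:city", '"Leipzig"'),
    ("_:a", "ex:geo", "_:g"),
    ("_:g", "ex:lat", '"51.3"'),
    ("ex:bob", "ex:name", '"Bob"'),
}

address_msg = msg(("_:a", "ex:city", '"Leipzig"'), graph)
bob_msg = msg(("ex:bob", "ex:name", '"Bob"'), graph)
```

In this example `address_msg` contains the four statements connected through `_:a` and `_:g`, while `bob_msg` contains only the statement about `ex:bob`, because it involves no blank node.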

Page 16: Towards Versioning of Arbitrary RDF Data

BLANK NODES

Page 17: Towards Versioning of Arbitrary RDF Data

BLANK NODES

Now we are able to clearly identify the correct statement allowing us to revert the update.

Page 18: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 19: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 20: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 21: Towards Versioning of Arbitrary RDF Data

ARCHITECTURE

Page 22: Towards Versioning of Arbitrary RDF Data

EVALUATION

BSBM Explore and Update Benchmark (10 million statements)

[Bar chart: BSBM QMpH (axis from 0 to 12000) for three configurations: No Versioning, Versioning Async., and Versioning Sync.]

Page 23: Towards Versioning of Arbitrary RDF Data

EVALUATION

Patch Size Benchmark
  • break even point: >10000 changed statements

Changed Triples vs. SPARQL Update Execution Times

Changed triples   w/o versioning   w/ versioning
50                0.048 s          0.333 s
500               0.075 s          1.605 s
5000              0.402 s          14.905 s
50000             4.85 s           256.715 s

Page 24: Towards Versioning of Arbitrary RDF Data

FUTURE WORK

• Integration into LEDS stack

• Implementation of versioning operations [Cassidy and Ballantine 2007]

• Support for more RDF repositories
  • Virtuoso ✓
  • Stardog: provides versioning without blank node support

• Scalable hashing algorithm
  • current one based on [Carroll 2003]

• Benchmark for RDF versioning systems

Page 25: Towards Versioning of Arbitrary RDF Data


Questions?

Web: http://aksw.org/MarvinFrommhold

Email: [email protected] / [email protected]