Linked data hypercubes Dave Reynolds, Epimorphics Ltd

Linked Data Hypercubes - Semtech London


Presentation on the Data Cube vocabulary, and its uses, given at the Semantic Technologies Business Conference in London.


Linked Data - great for describing “things”

data
  e.g. Schools in England and Wales

model
  ontology development: classifications, phase of education, location, contact, reporting class sizes etc
  URI scheme
  reference data to link to: admin geography, LLSC, charity ...

publish
  convert to RDF in a triple store
  entity URIs as linked data
  SPARQL endpoint
  Linked data API

use


But what about ... data

Government budget analysis

local authority spend with suppliers

regional demographic trends

performance metrics

air quality measurements

energy consumption

Publishing tabular data as linked data

why?

how?

does it work?

Benefits

data slices and values become addressable
  annotate, explain, qualify values
  provenance for values
  trace back for derived reports

integrate, compare, slice across datasets
  common terms for dimensions and units
  common identifiers for values (regions, departments ...)
  link to non-tabular data

put the data in context
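A minimal Turtle sketch of what "addressable" can buy: once a value has a URI, others can annotate it and derived reports can cite it. Every eg: name is hypothetical and the comment text is a placeholder, not real data. The dct:source and dct:references properties come from Dublin Core terms.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix eg:   <http://example.org/ns#> .

# eg:o1 stands for the URI of one published value (one cell of a table).
# Because the cell has a URI, statements can be attached to it.
eg:o1 rdfs:comment "Placeholder note explaining or qualifying this value."@en ;
      dct:source eg:survey-2004 .          # provenance (hypothetical source)

# A derived report can point back at the values it was built from.
eg:report-2005 dct:references eg:o1 .
```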

Data cube vocabulary

collaborative development
  sponsored by data.gov.uk
simple, flexible vocabulary
mirrors core information models from:
  SDMX (Statistical Data and Metadata eXchange)
  DDI (Data Documentation Initiative)
extension to SCOVO vocabulary

Data cube model

A set of observations
  indexed by dimensions
  describing measures
  interpreted according to attributes

[Figure: a grid of observations indexed by dimension (e.g. time) and dimension (e.g. region). One cell is expanded to show its measure(s), population = 32,567, and its attributes, unit of measure = count, status = preliminary, ...]
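To make the model concrete, the cell in the figure might be written as a single observation in Turtle. This is a sketch only: the eg: names, the region and period values, and the unit and status codes are placeholders that do not come from the slides.

```turtle
@prefix qb:             <http://purl.org/linked-data/cube#> .
@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
@prefix eg:             <http://example.org/ns#> .

eg:obs1 a qb:Observation ;
    # qb:dataSet as spelled in the W3C Recommendation (the slides write qb:dataset)
    qb:dataSet eg:populationDataset ;
    # dimensions: where the cell sits in the cube (placeholder values)
    eg:refPeriod eg:someYear ;
    eg:refArea   eg:someRegion ;
    # measure: the observed value from the figure
    eg:population 32567 ;
    # attributes: how to interpret the value (placeholder codes)
    sdmx-attribute:unitMeasure eg:count ;
    sdmx-attribute:obsStatus   eg:preliminary .
```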

Data cube vocabulary

1. Top level

DataSet
  provenance and metadata
  structure
Observation
  measured values, at dimensions with attributes
  direct link to DataSet
Slice
  optional grouping by fixing dimensions
  guide to presentation
  allows for abbreviated data

[Figure: class diagram of qb:DataSet, qb:Slice, qb:Observation, qb:SliceKey and qb:DataStructureDefinition, connected by qb:structure, qb:slice, qb:subSlice, qb:observation, qb:dataset, qb:sliceStructure, qb:sliceKey and qb:component. Observations carry dimension values, measure value(s) and attribute values.]
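A hedged sketch of how a slice can abbreviate data, reusing the eg: names of the deck's life expectancy example. The slice fixes period and sex, so the observation only states the remaining dimension. eg:sliceByRegion, eg:slice1 and eg:newport are illustrative names, with eg:newport standing in for the area URI used on the slide.

```turtle
@prefix qb:             <http://purl.org/linked-data/cube#> .
@prefix rdfs:           <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> .
@prefix sdmx-code:      <http://purl.org/linked-data/sdmx/2009/code#> .
@prefix eg:             <http://example.org/ns#> .

# The slice key names the dimensions that are held fixed.
eg:sliceByRegion a qb:SliceKey ;
    rdfs:label "slice by region"@en ;
    qb:componentProperty eg:refPeriod, sdmx-dimension:sex .

# The structure definition advertises the slice key.
eg:dsd-le qb:sliceKey eg:sliceByRegion .

# The slice states the fixed dimension values once.
eg:slice1 a qb:Slice ;
    qb:sliceStructure eg:sliceByRegion ;
    eg:refPeriod <http://reference.data.gov.uk/id/year/2004> ;
    sdmx-dimension:sex sdmx-code:sex-M ;
    qb:observation eg:o1 .

eg:dataset-le1 qb:slice eg:slice1 .

# The observation then needs only the free dimension and the measure.
eg:o1 a qb:Observation ;
    # qb:dataSet as spelled in the W3C Recommendation (the slides write qb:dataset)
    qb:dataSet eg:dataset-le1 ;
    eg:refArea eg:newport ;
    eg:lifeExpectancy 76.7 .
```

A consumer that needs full observations can copy the slice's fixed values back onto each attached observation.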

Data cube vocabulary

2. Data Structure Definition

explicit definition of cube structure, inline in the data
enables
  validation
  visualization
  discovery
  abbreviation
still open world

[Figure: class diagram of qb:DataSet, qb:DataStructureDefinition and qb:ComponentSpecification, connected by qb:structure and qb:component. A component specification uses qb:dimension, qb:measure or qb:attribute, plus qb:componentRequired, qb:componentAttachment and qb:order.]

Data cube vocabulary

3. Coding

values numeric or symbolic
explicit link to coding scheme
allows for hierarchical codes
SDMX coding schemes and role markers available

[Figure: class diagram of qb:ComponentProperty, qb:DimensionProperty, qb:AttributeProperty, qb:MeasureProperty and qb:CodedProperty, with qb:codeList, qb:concept and qb:measureType linking to skos:ConceptScheme, skos:Concept, sdmx:CodeList, sdmx:Concept and sdmx:ConceptRole. Roles shown: sdmx:FrequencyRole, sdmx:CountRole, sdmx:EntityRole, sdmx:TimeRole, sdmx:MeasureTypeRole, sdmx:NonObsTimeRole, sdmx:IdentityRole, sdmx:PrimaryMeasureRole.]
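A small sketch of a coded dimension with a hierarchical code list, expressed with SKOS. The eg:areas scheme, its two codes, their notations and the eg:area property are all hypothetical and serve only to show the shape of the link from a property to its coding scheme.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix qb:   <http://purl.org/linked-data/cube#> .
@prefix eg:   <http://example.org/ns#> .

# Hypothetical code list, expressed as a SKOS concept scheme.
eg:areas a skos:ConceptScheme ;
    skos:prefLabel "Areas"@en ;
    skos:hasTopConcept eg:area-wales .

eg:area-wales a skos:Concept ;
    skos:prefLabel "Wales"@en ;
    skos:notation "W" ;
    skos:topConceptOf eg:areas ;
    skos:inScheme eg:areas .

# A narrower code: skos:broader supplies the hierarchy.
eg:area-newport a skos:Concept ;
    skos:prefLabel "Newport"@en ;
    skos:notation "W-NP" ;
    skos:broader eg:area-wales ;
    skos:inScheme eg:areas .

# The dimension property links explicitly to its coding scheme.
eg:area a rdf:Property, qb:DimensionProperty, qb:CodedProperty ;
    rdfs:label "area"@en ;
    rdfs:range skos:Concept ;
    qb:codeList eg:areas .
```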

Example

eg:dsd-le a qb:DataStructureDefinition ;
    # The dimensions
    qb:component [ qb:dimension eg:refArea ;           qb:order 1 ] ;
    qb:component [ qb:dimension eg:refPeriod ;         qb:order 2 ] ;
    qb:component [ qb:dimension sdmx-dimension:sex ;   qb:order 3 ] ;
    # The measure(s)
    qb:component [ qb:measure eg:lifeExpectancy ] ;
    # The attributes
    qb:component [ qb:attribute sdmx-attribute:unitMeasure ;
                   qb:componentAttachment qb:DataSet ] .

eg:dataset-le1 a qb:DataSet ;
    rdfs:label "Life expectancy"@en ;
    rdfs:comment "Life expectancy in Welsh Unitary authorities"@en ;
    qb:structure eg:dsd-le ;
    sdmx-attribute:unitMeasure <http://dbpedia.org/resource/Year> .

eg:o1 a qb:Observation ;
    qb:dataset eg:dataset-le1 ;
    eg:refArea admingeo:newport_00pr ;
    eg:refPeriod <http://reference.data.gov.uk/id/year/2004> ;
    sdmx-dimension:sex sdmx-code:sex-M ;
    eg:lifeExpectancy 76.7 .

Case study: Local government payments

data
  UK local authorities publish data on all spending above £500
  linked data version to enable comparison

model
  cube structure
    measure: amount net of recoverable VAT
    attributes: currency
    dimensions: time, payer, payee, expenditure code, item
  package as an ontology
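The payments cube structure could be declared along these lines. This is a sketch under assumptions, not the published Payments ontology: every eg: property name is invented for illustration, and reading "expenditure code" and "item" as two separate dimensions is itself an interpretation of the slide.

```turtle
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix eg: <http://example.org/payments#> .

eg:dsd-payments a qb:DataStructureDefinition ;
    # Dimensions (hypothetical property names)
    qb:component [ qb:dimension eg:time ;            qb:order 1 ] ;
    qb:component [ qb:dimension eg:payer ;           qb:order 2 ] ;
    qb:component [ qb:dimension eg:payee ;           qb:order 3 ] ;
    qb:component [ qb:dimension eg:expenditureCode ; qb:order 4 ] ;
    qb:component [ qb:dimension eg:item ;            qb:order 5 ] ;
    # Single measure: amount net of recoverable VAT
    qb:component [ qb:measure eg:netAmount ] ;
    # Currency stated once for the whole data set
    qb:component [ qb:attribute eg:currency ;
                   qb:componentAttachment qb:DataSet ] .
```

Attaching currency at the data set level assumes one currency per publication; it could equally be attached to each observation.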

publish
  LD API
  visualizations
  API structure mirrors cube dimensional structure

use

Case study: Environmental monitoring

data
  Environment Agency bathing water quality monitoring
    samples
    assay
    compliance assessment

model
  measures: total coliform count, enterovirus count, ..., sample classification
  dimensions: sampling point, sampling week, sampling year
  attributes: abnormal weather
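Unlike a single-measure cube, the bathing water cube carries several measures per sample, with an attribute that qualifies individual observations. A hedged sketch of such a structure follows. All eg: names are hypothetical, and treating sample classification as a measure is one reading of the slide.

```turtle
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix eg: <http://example.org/bathing-water#> .

eg:dsd-bathing-water a qb:DataStructureDefinition ;
    # Dimensions (hypothetical property names)
    qb:component [ qb:dimension eg:samplingPoint ; qb:order 1 ] ;
    qb:component [ qb:dimension eg:samplingYear ;  qb:order 2 ] ;
    qb:component [ qb:dimension eg:samplingWeek ;  qb:order 3 ] ;
    # Several measures recorded on each observation
    qb:component [ qb:measure eg:totalColiformCount ] ;
    qb:component [ qb:measure eg:enterovirusCount ] ;
    qb:component [ qb:measure eg:sampleClassification ] ;
    # Attribute that varies per sample, so it attaches to the observation
    qb:component [ qb:attribute eg:abnormalWeather ;
                   qb:componentAttachment qb:Observation ] .
```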

publish
  LD API
  visualizations
  API structure mirrors cube dimensional structure

use

Data Cube: Summary

foundational approach to publishing multi-dimensional data as linked data
enables
  addressing – annotate, explain, provenance, context
  integration – slice, dice and compare across sets
puts data in context
explicit declarative structure =>
  validation
  discovery
  automation - web APIs, visualizations, exploration tools

Acknowledgements

John Sheridan (The National Archives)
  for sponsoring the development of data cube
Richard Cyganiak, Jeni Tennison
  co-developers of the data cube vocabulary
Paul Davidson
  instigator of the Payments ontology
Stuart Williams, Ian Dickinson
  developers of the bathing water use case
Photos:
  dullhunk @ flickr, Martin Pettitt @ flickr, kikasso @ flickr, Tax_Rebate @ flickr