A Role for Provenance in Quality Assessment
Chris Baillie, Pete Edwards, and Edoardo Pignotti
Motivation
“we don’t know whether the information we find [on the Web] is accurate or not. So we have to teach people how to assess what they’ve found”
Vint Cerf, 2010
The Web of Documents has become the Web of documents, services, data, and people.
Anyone can publish anything, so we need a way to evaluate quality.
We are investigating these issues within the Internet of Things.
Sensors are now at the centre of many applications.
Example Scenario
Evaluating Data Quality
F(E, R) = Q
Entity (and context)
- To evaluate quality, we must examine the context around data
- WIQA Framework examines data content, context, and external ratings (Bizer et al. 2009)
Quality Scores
- Quality is a multi-dimensional construct
- Accuracy
- Timeliness
- Relevance
Data Requirements
- Furber and Hepp (2011) use rules to identify quality problems
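The relationship F(E, R) = Q can be read as a function that applies a set of requirements R to an entity E (with its context) and returns one score per quality dimension. The following is a minimal Python sketch of that reading, not the framework's actual implementation. The `Entity` class, the context keys, the `clamp` helper, and the timeliness rule (a 60-second window) are all hypothetical; only the relevance rule follows the formula given later in these slides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Entity:
    """A data item plus the context used to judge it (hypothetical fields)."""
    value: float
    context: Dict[str, float] = field(default_factory=dict)


# A requirement maps an entity to a score for one quality dimension.
Requirement = Callable[[Entity], float]


def clamp(x: float) -> float:
    """Keep scores within [0, 1]; clamping is an assumption of this sketch."""
    return max(0.0, min(1.0, x))


def assess(entity: Entity, requirements: Dict[str, Requirement]) -> Dict[str, float]:
    """F(E, R) = Q: apply every requirement in R to E, giving the scores Q."""
    return {dimension: rule(entity) for dimension, rule in requirements.items()}


requirements: Dict[str, Requirement] = {
    # Hypothetical rule: an observation loses all timeliness after 60 seconds.
    "timeliness": lambda e: clamp(1 - e.context["age_seconds"] / 60),
    # Mirrors the relevance rule in the slides: 1 - (distanceFromRoute / 100).
    "relevance": lambda e: clamp(1 - e.context["distance_from_route"] / 100),
}

observation = Entity(value=12.5,
                     context={"age_seconds": 15, "distance_from_route": 25})
scores = assess(observation, requirements)
print(scores)
```

Keeping each requirement as a separate function reflects the point that quality is multi-dimensional: dimensions can be added or removed without changing `assess`.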
Representing Sensor Observations
Linked Data: “recommended best practice for exposing, sharing, and connecting pieces of data using URIs and RDF”
Performing Quality Assessment
$R_{\text{relevance}} = 1 - \frac{E_{\text{distanceFromRoute}}}{100}$
CONSTRUCT {
  _:b0 a QualityScore .
  _:b0 score ?qs .
  _:b0 dqm:ruleViolation _:b1 .
  _:b1 a DataRequirementViolation .
  _:b1 dqm:affectedInstance ?instance .
}
WHERE {
  ?instance a Observation .
  ?instance distanceFromRoute ?distance .
  LET (?qs := (1 - (?distance / 100))) .
}
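To make the effect of this rule concrete, here is a small Python sketch that mimics what the CONSTRUCT query produces: for every observation with a distanceFromRoute value, it emits a quality score record linked to the affected instance. The dictionary layout and the `id` key are assumptions made for illustration; an actual deployment would evaluate the SPARQL rule against an RDF store rather than Python dictionaries.

```python
from typing import Dict, List


def relevance_rule(observations: List[Dict]) -> List[Dict]:
    """Mimic the CONSTRUCT rule: one score record per matching observation."""
    results = []
    for obs in observations:
        # Observations without a distance would not match the WHERE pattern.
        if "distanceFromRoute" not in obs:
            continue
        qs = 1 - (obs["distanceFromRoute"] / 100)
        results.append({
            "type": "QualityScore",
            "score": qs,
            "ruleViolation": {
                "type": "DataRequirementViolation",
                "affectedInstance": obs["id"],
            },
        })
    return results


# Hypothetical observations; distances use the same unit as the rule.
observations = [
    {"id": "obs1", "distanceFromRoute": 25},
    {"id": "obs2", "distanceFromRoute": 50},
    {"id": "obs3"},  # no distance recorded, so no score is generated
]

records = relevance_rule(observations)
for record in records:
    print(record["ruleViolation"]["affectedInstance"], record["score"])
```

As in the query, the score is not bounded below: a distance greater than 100 yields a negative value unless the rule is extended to handle that case.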
Quality Assessment Results
Observation Provenance
Provenance is a critical part of observation context
Describes the entities, agents, and activities involved in data creation:
- How was the observation value measured?
- Who controlled the sensing process?
- How has the observation been transformed since it was created?
The W3C PROV-O model provides a Linked Data representation of provenance
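As an illustration of how PROV-O terms can answer the three questions above, the sketch below stores provenance as plain subject-predicate-object tuples in Python. The `prov:` terms used (Entity, Activity, Agent, wasGeneratedBy, wasAssociatedWith, wasDerivedFrom) are part of PROV-O; the `http://example.org/` resources and the helper functions are hypothetical, and a real system would likely use an RDF library and store instead of a list.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for this sketch

triples = [
    (EX + "observation1", RDF_TYPE, PROV + "Entity"),
    (EX + "sensing1", RDF_TYPE, PROV + "Activity"),
    (EX + "sensor1", RDF_TYPE, PROV + "Agent"),
    # How was the observation value measured?
    (EX + "observation1", PROV + "wasGeneratedBy", EX + "sensing1"),
    # Who controlled the sensing process?
    (EX + "sensing1", PROV + "wasAssociatedWith", EX + "sensor1"),
    # How has the observation been transformed since it was created?
    (EX + "observation2", RDF_TYPE, PROV + "Entity"),
    (EX + "observation2", PROV + "wasDerivedFrom", EX + "observation1"),
]


def objects(subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def derivation_chain(entity):
    """Follow prov:wasDerivedFrom links from an entity back to its origin."""
    chain = [entity]
    seen = {entity}
    while True:
        sources = objects(chain[-1], PROV + "wasDerivedFrom")
        if not sources or sources[0] in seen:
            break  # reached the original entity, or detected a cycle
        chain.append(sources[0])
        seen.add(sources[0])
    return chain


chain = derivation_chain(EX + "observation2")
origin_activity = objects(chain[-1], PROV + "wasGeneratedBy")
controller = objects(origin_activity[0], PROV + "wasAssociatedWith")
print(chain, origin_activity, controller)
```

A quality rule that examines provenance could start from such a chain, for example by checking which agent controlled the activity that generated the original observation.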
Observation Provenance
Quality Score Provenance
Future Work
- Implementation of quality rules that examine provenance
- Investigate quality score re-use
Work To Date
Developed Quality Assessment Framework that enables:
- Linked data representation of sensor observations
- Definition of quality requirements using SPARQL rules
- Generation of quality scores via reasoning
Any questions?
Come and see the IRP demo (D9) to see quality assessment in action.
Implementation