Representing Contextual Aspects of Data WP2 - Andreas Harth 1st year review Luxembourg, December 2011

Representing Contextual Aspects of Data

WP2 - Andreas Harth

1st year review
Luxembourg, December 2011

Work Plan View

Timeline axis: 0, 6, 12, 18, 24, 30, 36, 42, 48

Tasks:
• Task 2.1: Data quality assessment and repair
• Task 2.2: Temporal, spatial and social aspects of data
• Task 2.3: Recommendations for enhancing best practices for data publishing

Deliverables:
• D2.1 Conceptual model and best practices for high-quality data publishing
• D2.2 Methods for quality repair
• D2.3 Modelling and processing contextual aspects of data
• D2.4 Update of D2.1
• D2.5 Proof-of-concept evaluation for modelling space and time
• D2.6 Methods for assessing the quality of sensor data
• D2.7 Recommendations for contextual data publishing

Partners: FUB, KIT

Outline

• Motivation
• NeoGeo Vocabularies
• Mappings and Community Activities
• Demo
• Conclusion and Future Work

Motivation

Motivation

• Geospatial data is becoming increasingly relevant
  • Location-based services, mobile applications
  • Ever-increasing amount of sensor data (phones, satellites…)

• Applications require integrated access to geospatial data
  • Spatial querying
  • Spatial reasoning

Geospatial Data Scenario

[Diagram labels: Source 1, Source 2, ..., Source n; Wrapper 1; Mapping 1, Mapping 2, ..., Mapping n; Integration]

Challenges

• Disparate data formats
  • Integrated data format (syntax) and access (data transfer protocol): Linked Data (RDF, HTTP)

• Most data sources provide just points (geo:lat, geo:long)
  • Create vocabulary/method for publishing regions

• Each data source uses its own way to encode geospatial data
  • Allow for syntax differences, provide mappings where possible (KML, GML, WKT, RDF…)

• Instance data not interlinked across sources
  • Create mappings between instances

• Geospatial data is political

„Unpolitical“ Datasets

GADM-RDF – http://gadm.geovocab.org/
RDF representation of the administrative regions of the GADM project (http://gadm.org/)

NUTS-RDF – http://nuts.geovocab.org/
RDF representation of Eurostat's NUTS nomenclature.

GADM and NUTS serve as:
• New geospatial information on the Semantic Web.
• Bridges between already published spatial datasets.
• Experimentation and evaluation datasets.

NeoGeo Vocabularies

Prototypical Geo-Ontology

[Diagram labels: Feature, geometry, Geometry; Spatial relations; Spatial functions; example: „Luxembourg“ in „Europe“]

Features vs. Geometries

• NeoGeo vocabularies are based on the General Feature Model

• General Feature Model makes a distinction between the feature (resource to which the region belongs), and the actual geometry.

• Semantics of the feature are more important than the representation of the geometry.

• Instances of the feature are related to the type of the feature.

• A feature can be related to multiple distinct geometries; also allows for modelling different geometric properties for one single feature (e.g. different scales).
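The feature/geometry split can be illustrated with a minimal Python sketch. All names here (`Feature`, `Geometry`, the `scale` label, the example.org URIs) and the coordinates are hypothetical and not taken from the NeoGeo vocabularies; the point is only that one feature can carry several geometries.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Geometry:
    """A geo-referenced shape; `scale` is a hypothetical level-of-detail label."""
    uri: str
    scale: str
    ring: tuple  # sequence of (longitude, latitude) pairs, made-up values below


@dataclass
class Feature:
    """A real-world thing, identified independently of any particular shape."""
    uri: str
    feature_type: str
    geometries: list = field(default_factory=list)

    def geometry_at(self, scale: str) -> Geometry:
        """Return the geometry registered for the requested scale."""
        for geometry in self.geometries:
            if geometry.scale == scale:
                return geometry
        raise KeyError(f"no geometry at scale {scale!r} for {self.uri}")


# One feature, two distinct geometries at different scales.
luxembourg = Feature(uri="http://example.org/feature/luxembourg",
                     feature_type="Country")
luxembourg.geometries.append(Geometry(
    uri="http://example.org/geometry/luxembourg-coarse",
    scale="coarse",
    ring=((5.7, 49.4), (6.5, 49.4), (6.5, 50.2), (5.7, 50.2), (5.7, 49.4)),
))
luxembourg.geometries.append(Geometry(
    uri="http://example.org/geometry/luxembourg-fine",
    scale="fine",
    ring=((5.73, 49.45), (6.53, 49.45), (6.53, 50.18), (5.73, 50.18), (5.73, 49.45)),
))
```

Statements about the country (its type, its relations to other regions) attach to the feature, so swapping or adding a geometry leaves them untouched.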

NeoGeo Vocabularies

• Spatial Vocabulary
  • Representation and reasoning on topological relations based on the Region Connection Calculus (RCC).

• Geometry Vocabulary
  • Representation of geo-referenced geometric shapes.

[Diagram labels: spatial:Feature, ngeo:geometry, ngeo:Geometry; spatial:*; RESTful services]

RCC: Randell, D. A., Cui, Z. and Cohn, A. G.: A spatial logic based on regions and connection, 3rd Int. Conf. on Knowledge Representation and Reasoning, 1992.

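Modelling Spatial Relations

Properties per dataset, by relation type (Disjoint, Touches, Overlaps, Within, Contains, Equals, Nearby):

• UN FAO: Touches: hasBorderWith; Within: isInGroup
• Ordnance Survey: Disjoint: disjoint; Touches: touches; Overlaps: partiallyOverlaps; Within: within; Contains: contains; Equals: Equals
• geo.linkeddata.es: Within: formaParteDe; Contains: formadoPor
• LinkedGeoData: (none)
• GeoNames.org: Touches: neighbour / neighbouringFeatures; Within: parentFeature; Contains: childrenFeatures; Nearby: nearby / nearbyFeatures
• Uberblic.org: Touches: adjoining_location; Within: containing_location
• DBpedia: Within: locatedInArea
• NUTS: Within: Part of
• GADM: Within: Part of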

RCC-8 Properties

• partially overlapping (PO)
• tangential proper part (TPP)
• non-tangential proper part (NTPP)
• equal (EQ)
• tangential proper part inverse (TPPi)
• non-tangential proper part inverse (NTPPi)
• externally connected (EC)
• disconnected (DC)

• proper part (PP)
• proper part inverse (PPi)
• part (P)
• part inverse (Pi)
• overlaps (O)
• connects with (C)
• discrete from (DR)
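To make the eight base relations concrete, the following sketch classifies the RCC-8 relation between two axis-aligned rectangles. Restricting regions to rectangles is a simplifying assumption made here for brevity, and the names `Box` and `rcc8` are illustrative only; real regions are arbitrary polygons.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle with min_x < max_x and min_y < max_y."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def _inside(a: Box, b: Box) -> bool:
    """True if a lies within b (edges may coincide)."""
    return (b.min_x <= a.min_x and a.max_x <= b.max_x
            and b.min_y <= a.min_y and a.max_y <= b.max_y)


def _strictly_inside(a: Box, b: Box) -> bool:
    """True if a lies within b and no edge of a touches an edge of b."""
    return (b.min_x < a.min_x and a.max_x < b.max_x
            and b.min_y < a.min_y and a.max_y < b.max_y)


def rcc8(a: Box, b: Box) -> str:
    """Return the RCC-8 relation that holds from a to b."""
    overlap_w = min(a.max_x, b.max_x) - max(a.min_x, b.min_x)
    overlap_h = min(a.max_y, b.max_y) - max(a.min_y, b.min_y)
    if overlap_w < 0 or overlap_h < 0:
        return "DC"     # no shared point at all
    if overlap_w == 0 or overlap_h == 0:
        return "EC"     # only boundary points are shared
    if a == b:
        return "EQ"
    if _inside(a, b):
        return "NTPP" if _strictly_inside(a, b) else "TPP"
    if _inside(b, a):
        return "NTPPi" if _strictly_inside(b, a) else "TPPi"
    return "PO"         # interiors intersect, neither contains the other
```

For example, `rcc8(Box(4, 4, 5, 5), Box(0, 0, 10, 10))` yields `"NTPP"`, while moving the inner box so that it shares an edge with the outer one yields `"TPP"`.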

Spatial Vocabulary

• http://geovocab.org/spatial#

• Uses RCC for the representation of topological relations between regions

• Supports RCC5 and RCC8 relations

• Inference available for most RCC relations
  • However, some rules require „negation as failure“, which requires the closed world assumption
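The difference between the two kinds of rules can be sketched as follows. Positive rules (for instance, TPP implies PP, and P is transitive) only add facts. A relation such as DR (discrete from), which is the complement of O (overlaps), can only be concluded from the absence of an O fact, which is sound only if the fact base is assumed complete. The rule set below is a small illustrative subset chosen for this sketch, not the rules shipped with the vocabulary, and the region names are made up.

```python
def infer(facts):
    """Materialise coarser relations from asserted (subject, relation, object) facts."""
    derived = set(facts)
    changed = True
    while changed:
        new = set()
        for s, rel, o in derived:
            if rel in ("TPP", "NTPP"):
                new.add((s, "PP", o))        # every (non-)tangential proper part is a proper part
            if rel in ("PP", "EQ"):
                new.add((s, "P", o))         # proper parts and equal regions are parts
            if rel == "EQ":
                new.add((o, "EQ", s))        # EQ is symmetric
            if rel in ("P", "PO"):
                new.add((s, "O", o))         # parts and partial overlaps overlap
            if rel == "O":
                new.add((o, "O", s))         # O is symmetric
            if rel == "P":
                for s2, rel2, o2 in derived:
                    if rel2 == "P" and s2 == o:
                        new.add((s, "P", o2))    # P is transitive
                    if rel2 == "P" and s2 == s:
                        new.add((o, "O", o2))    # regions sharing a part overlap
        changed = not new <= derived
        derived |= new
    return derived


def discrete_from_closed_world(derived, regions):
    """DR(x, y) by negation as failure: holds when no O(x, y) could be derived."""
    return {(x, "DR", y) for x in regions for y in regions
            if x != y and (x, "O", y) not in derived}


# Hypothetical regions; "island" has no asserted relation to anything.
facts = {("city", "NTPP", "county"), ("county", "TPP", "country")}
derived = infer(facts)
discrete = discrete_from_closed_world(derived, {"city", "county", "country", "island"})
```

Here `("city", "P", "country")` follows from the positive rules alone, whereas `("island", "DR", "city")` is obtained only because nothing is known about the island. Under the open world assumption that second conclusion would not be justified.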

Modelling Regions

Columns: Dataset, Point, Bounding Box, Points in Lists, Single predicate, Literal

• UN FAO: Own
• Ordnance Survey: W3C Geo / GeoRSS; Own / GML
• geo.linkeddata.es: W3C Geo; Own; Own / GML
• LinkedGeoData: W3C Geo; Own
• GeoNames.org: W3C Geo
• Uberblic.org: Own
• DBpedia.org: W3C Geo
• NUTS: (none)
• GADM: (none)

Modelling Regions - Alternatives

• RDF List of W3C Geo Points
• RDF List of latitude/longitude pairs
• RDF Literal of latitude/longitude
• RDF Literal in GML/WKT format (GeoSPARQL)
• …
• Geometries have own URI, content format negotiated (KML, GML, WKT…)
• Geometry vocabulary describes high-level features of geospatial regions
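The last alternative, one geometry URI served in several negotiated formats, can be sketched roughly as below. The function names are hypothetical, the KML output is only the text of a `<coordinates>` element rather than a full KML document, and serving WKT as `text/plain` is an assumption made for this sketch; quality values in the Accept header are ignored.

```python
def to_wkt(ring):
    """Serialise a closed ring of (longitude, latitude) pairs as a WKT polygon."""
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON(({coords}))"


def to_kml_coordinates(ring):
    """Serialise the same ring as the text content of a KML <coordinates> element."""
    return " ".join(f"{lon},{lat}" for lon, lat in ring)


# Media type to serialiser. text/plain for WKT is an assumption of this sketch.
SERIALISERS = {
    "application/vnd.google-earth.kml+xml": to_kml_coordinates,
    "text/plain": to_wkt,
}


def negotiate(accept_header, ring, default="text/plain"):
    """Serve the first media type in the Accept header that is supported."""
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in SERIALISERS:
            return media_type, SERIALISERS[media_type](ring)
    return default, SERIALISERS[default](ring)


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
```

A client sending `Accept: application/vnd.google-earth.kml+xml` would receive the KML coordinates, while any unsupported type falls back to WKT. The RDF description of the geometry stays the same in both cases.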

Mappings and Community Activities

Vocabulary Mappings

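FROM (rows) / TO (columns): NeoGeo, DBpedia, LinkedGeoData, geo.linkeddata.es, GeoNames

• From NeoGeo: SC, SP to each of DBpedia, LinkedGeoData, geo.linkeddata.es, GeoNames
• From DBpedia: (none listed)
• From LinkedGeoData: wip
• From geo.linkeddata.es: wip
• From GeoNames: (none listed)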

SC: rdfs:subClassOf, SP: rdfs:subPropertyOf, SA: owl:sameAs
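What an SP mapping buys at query time can be shown with the standard RDFS entailment rule for rdfs:subPropertyOf: if (s p o) holds and p is a sub-property of q, then (s q o) holds as well. The sketch below applies that rule over a small in-memory triple set. The prefixed property names (`ex1:partOf`, `ex2:within`, `ex3:locatedIn`) are invented for illustration and do not correspond to the actual mappings.

```python
def sub_property_closure(sub_property_of):
    """Return {property: set of all its direct and transitive super-properties}."""
    closure = {}
    for prop in sub_property_of:
        seen, stack = set(), [prop]
        while stack:
            for parent in sub_property_of.get(stack.pop(), ()):
                if parent not in seen:      # also guards against cycles
                    seen.add(parent)
                    stack.append(parent)
        closure[prop] = seen
    return closure


def apply_mappings(triples, sub_property_of):
    """Add (s q o) for every (s p o) where p is a sub-property of q."""
    closure = sub_property_closure(sub_property_of)
    entailed = set(triples)
    for s, p, o in triples:
        for q in closure.get(p, ()):
            entailed.add((s, q, o))
    return entailed


# Hypothetical mapping chain: ex1:partOf < ex2:within < ex3:locatedIn
mappings = {"ex1:partOf": {"ex2:within"}, "ex2:within": {"ex3:locatedIn"}}
data = {("regionA", "ex1:partOf", "regionB")}
entailed = apply_mappings(data, mappings)
```

After applying the mappings, a query phrased in terms of `ex3:locatedIn` also finds data that was published using `ex1:partOf`, without the publisher changing anything.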

Instance Mappings

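FROM (rows) / TO (columns): NeoGeo NUTS, NeoGeo GADM, DBpedia, LinkedGeoData, geo.linkeddata.es, GeoNames

• From NeoGeo NUTS: EQ; PPi
• From NeoGeo GADM: EQ (to NeoGeo NUTS); PPi (to DBpedia, LinkedGeoData, geo.linkeddata.es); tbd (to GeoNames)
• From DBpedia: EQ, PP (wip)
• From LinkedGeoData: PP (wip)
• From geo.linkeddata.es: PP (wip)
• From GeoNames: (none listed)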

VoCamp

Planning underway for VoCamp at UPM (jointly organised with KIT), weekend of Feb 4th and 5th

Co-located with VoCamp Santa Barbara.

http://vocamp.org/

Demo!

Summary

• NeoGeo vocabularies and experiences on publishing geo-spatial data as Linked Data

• NUTS and GADM datasets online

• Integration vocabulary online, including vocabulary mappings

• GADM mappings to NUTS, DBpedia, LinkedGeoData…

• Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox)

• Work on similarity metrics (with optimisations and evaluation) for geospatial regions
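The slides name the withinRegion and boundingBox services without describing how they work. As a rough illustration of the kind of computation such services could expose, the sketch below derives a bounding box from a ring and runs a ray-casting point-in-polygon test, using the box as a cheap pre-filter. The function names are hypothetical, and longitude/latitude are treated as planar coordinates, which ignores the antimeridian and the poles.

```python
def bounding_box(ring):
    """Return (min_lon, min_lat, max_lon, max_lat) of a ring of (lon, lat) pairs."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return min(lons), min(lats), max(lons), max(lats)


def within_region(point, ring):
    """Ray-casting test; points exactly on the boundary get no special handling."""
    x, y = point
    min_x, min_y, max_x, max_y = bounding_box(ring)
    if not (min_x <= x <= max_x and min_y <= y <= max_y):
        return False    # cheap rejection before the exact test
    inside = False
    # Walk all edges, including the closing edge from the last vertex to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):                 # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


triangle = [(0, 0), (4, 0), (0, 4)]
```

The point `(3, 3)` lies inside the triangle's bounding box but outside the triangle itself, which is why a bounding-box lookup is useful as a filter but not sufficient as a containment test.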

Future Work

• Finalise NeoGeo vocabularies (VoCamp)
• Improvement of precision of spatial similarity; publish similarity service online
• More instance mappings to GADM
• Earth and space science data
• More experiments: querying of integrated data
• Reasoning

• Temporal context

• First deliverable due in March 2012