30
A RESTful Proxy and Data Model for Linked Sensor Data Krzysztof Janowicz * 1,3 , Arne Bröring 2,3,4 , Christoph Stasch 4 , Sven Schade 5 , Thomas Everding 4 , and Alejandro Llaves 4 1 University of California, Santa Barbara, USA 2 ITC Faculty, University of Twente, The Netherlands 3 52 North Initiative for Geospatial Open Source Software, Münster, Germany 4 Institute for Geoinformatics, University of Münster, Germany 5 Institute for Environment and Sustainability, Joint Research Centre, Ispra, Italy Abstract The vision of a Digital Earth calls for more dynamic information sys- tems, new sources of information, and stronger capabilities for their integra- tion. Sensor networks have been identified as a major information source for the Digital Earth, while Semantic Web technologies have been proposed to facilitate integration.. So far, sensor data is stored and published using the Observations & Measurements standard of the Open Geospatial Consortium (OGC) as data model. With the advent of Volunteered Geographic Informa- tion and the Semantic Sensor Web, work on an ontological model gained im- portance within Sensor Web Enablement (SWE). In contrast to data models, an ontological approach abstracts from implementation details by focusing on modeling the physical world from the perspective of a particular domain. Ontologies restrict the interpretation of vocabularies towards their intended meaning. The ongoing paradigm shift to Linked Sensor Data complements this attempt. Two questions have to be addressed: (i) how to refer to chang- ing and frequently updated data sets using Uniform Resource Identifiers, and (ii) how to establish meaningful links between those data sets, i.e., observa- tions, sensors, features of interest, and observed properties? In this paper, we present a Linked Data model and a RESTful proxy for OGC’s Sensor Obser- vation Service to improve integration and inter-linkage of observation data for the Digital Earth. Keywords: Digital Earth, Geospatial data integration, Spatial Data Infras- tructure, Earth Observation, Data exchange models, Linked Data, Semantic Enablement * Corresponding author: Krzysztof Janowicz; email: [email protected] 1

A RESTful Proxy and Data Model for Linked Sensor Datageog.ucsb.edu/~jano/RESTfulSOS.pdf · A RESTful Proxy and Data Model for Linked Sensor Data Krzysztof Janowicz 1,3, Arne Bröring2,3,4,

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

  • A RESTful Proxy and Data Modelfor Linked Sensor Data

    Krzysztof Janowicz∗1,3, Arne Bröring2,3,4, Christoph Stasch4, SvenSchade5, Thomas Everding4, and Alejandro Llaves4

    1University of California, Santa Barbara, USA2ITC Faculty, University of Twente, The Netherlands

    352◦ North Initiative for Geospatial Open Source Software, Münster, Germany4Institute for Geoinformatics, University of Münster, Germany

    5Institute for Environment and Sustainability, Joint Research Centre, Ispra, Italy

    Abstract

    The vision of a Digital Earth calls for more dynamic information sys-tems, new sources of information, and stronger capabilities for their integra-tion. Sensor networks have been identified as a major information source forthe Digital Earth, while Semantic Web technologies have been proposed tofacilitate integration.. So far, sensor data is stored and published using theObservations & Measurements standard of the Open Geospatial Consortium(OGC) as data model. With the advent of Volunteered Geographic Informa-tion and the Semantic Sensor Web, work on an ontological model gained im-portance within Sensor Web Enablement (SWE). In contrast to data models,an ontological approach abstracts from implementation details by focusingon modeling the physical world from the perspective of a particular domain.Ontologies restrict the interpretation of vocabularies towards their intendedmeaning. The ongoing paradigm shift to Linked Sensor Data complementsthis attempt. Two questions have to be addressed: (i) how to refer to chang-ing and frequently updated data sets using Uniform Resource Identifiers, and(ii) how to establish meaningful links between those data sets, i.e., observa-tions, sensors, features of interest, and observed properties? In this paper, wepresent a Linked Data model and a RESTful proxy for OGC’s Sensor Obser-vation Service to improve integration and inter-linkage of observation datafor the Digital Earth.

    Keywords: Digital Earth, Geospatial data integration, Spatial Data Infras-tructure, Earth Observation, Data exchange models, Linked Data, SemanticEnablement

    ∗Corresponding author: Krzysztof Janowicz; email: [email protected]

    1

  • 1 Motivation

    The initial vision of a Digital Earth was first formulated by former US Vice Presi-dent Al Gore as a multi-resolution, three-dimensional representation of the planet,into which we can embed vast quantities of geo-referenced data (Gore, 1998). Tenyears after this speech, Craglia et al. published a position paper to argue that thisvision has not yet been achieved (Craglia et al., 2008). In parallel to the growingavailability of information, the need to better understand the interplay of environ-mental and social phenomena has also increased, thus requiring more dynamicsystems, new sources of information, and stronger capacities for their integration.The Sensor Web has been identified as a central building block to address thesechallenges (De Longueville et al., 2010). A digital nervous system for the globehas been suggested as a vibrant approach for the Digital Earth. An implemen-tation based on Spatial Data Infrastructures (SDI), especially on the Sensor WebEnablement (SWE) standards of the Open Geospatial Consortium (OGC), has beenproposed. These infrastructures do not only deliver data but also offer geospatialprocessing capabilities and the final rendering on a virtual globe. Grounded inspatial and temporal reference systems, the outcomes of a variety of services canbe combined into a multi-layered representation of the Earth’s surface and help toanswer scientific questions or assist in emergency situations. A rigid standardiza-tion process as well as conformance tests ensure that a multitude of services can beintegrated into SDIs to realize advanced tasks such as predictions.

    Yet, in contrast to other scientific domains, we cannot define a context-free andcanonical representation of geographic features or even their corresponding types(Mark, 1993; Brodaric and Gahegan, 2007). For instance, there is no pre-given andcommon definition of Forest, Mountain, or Lake (Lund, 03/22/2010; Smith andMark, 2003; Montello and Sutton, 2006). Nevertheless, a meaningful layering ofgeo-referenced data for the Digital Earth requires the integration of the thematicaspects as well, e.g., by grounding them using semantic reference systems (Kuhn,2003; Scheider et al., 2009). This challenge, known as semantic integration, be-comes even more urgent when taking Volunteered Geographic Information and theidea of citizens as sensors into account (Goodchild, 2007; Goodchild and Glennon,2010). Every community has its own requirements and motivations for contributingdata. This is reflected by differences in the local conceptualizations of geographicspace – leading to semantic heterogeneity (Janowicz, 2010). Van Zyl et al., forinstance, pointed out that the Sensor Web requires well defined semantics to makeobservation data discoverable and reusable (van Zyl et al., 2009).

    The Semantic Web explicitly addresses the integration problem by (i) provid-ing formal and machine-readable specifications for the conceptualizations usedwithin different information communities, i.e., by creating ontologies, and by (ii)

    2

  • using reasoning engines to discover implicit facts, relations, and contradictions.Up to now, SDIs and the Semantic Web are not connected and, therefore, cannotexchange data or combine their services. To address this shortcoming, we havespecified and partially implemented a Semantic Enablement Layer (SEL) for SDIs(Janowicz et al., 2010b). It encapsulates Semantic Web reasoners and repositorieswithin OGC services and, thereby, enables a transparent and seamless integrationof Semantic Web technologies with SDI. However, our work also focuses on the re-verse direction – namely how to make spatial information available on the Seman-tic Web without changing existing SDI standards and implementations. Mazzettiet al. suggested that Representational State Transfer (REST) may provide an archi-tectural solution and identified the connection to the Semantic Web as an item forfuture Digital Earth research (Mazzetti et al., 2009).

    In this paper, we follow this proposal and introduce a RESTful proxy for theOGC Sensor Observation Service (SOS) to assign meaningful identifiers to sensordata and to directly publish this raw data on the Web. The free and open sourcecode as well as a demonstration are available at http://52north.org/RESTful_SOS. We present our research on developing a Linked Data model and argue whyit is required in addition to classical data models and ontologies. Our software canbe installed as a facade to a SOS without any modifications to the service interfaceor database, and hence follows the SEL methodology. The proxy provides an RDFrepresentation of observation data, links between data sets, as well as URIs – all ofthese are fundamental building blocks of Linked Data. Our work goes beyond theinitial proposal of Page et al. (Page et al., 2009) and moreover builds up on top ofthe sensor ontologies developed by the W3C Semantic Sensor Network IncubatorGroup (W3C SSN-XG).

    The remainder of this article is structured as follows. Based on our previ-ous work (Janowicz et al., 2010a; Schade and Cox, 2010; Janowicz and Compton,2010; Keßler and Janowicz, 2010), we discuss the need for a Linked Data modeland introduce all relevant parts in detail. Next, we describe the transparent andRESTful SOS proxy, offer conceptual as well as technical insights, and discuss po-tential applications. Finally, we conclude our paper by summarizing the achievedresults and lessons learned, pointing out directions for further work, and discussingthe future of SDI in relation to the Linked Data principles and the Digital Earth.

    2 Background

    This section introduces research on Spatial Data Infrastructure, Linked Data, andResource-Oriented Architectures relevant for the presented work and also clarifiestheir interrelation.

    3

    http://52north.org/RESTful_SOShttp://52north.org/RESTful_SOS

  • 2.1 Spatial Data Infrastructures and Sensor Web Enablement

    Spatial Data Infrastructure refers to the specialization of information infrastruc-tures for the geospatial sciences (Nebert, 2004). Dozens of SDIs have beendeveloped across the globe, both on national as well as on international-level(Crompvoets and Bregt, 2007). Nowadays, SDIs define the service-oriented man-agement, access, and processing of geospatial data and are implemented using WebServices. The demand for interoperability has boosted the development of stan-dards and tools to facilitate data transformation and integration, mostly in terms ofinterfaces specified by the OGC.

    An abstract structure for data modeling and encoding is provided in form ofthe Geographic Markup Language (GML). Specific data models and services forsensor networks are developed within OGC’s Sensor Web Enablement (SWE) ini-tiative (Botts et al., 2008). SWE is responsible for the development of standardsto make sensors and their observations accessible on the Web. The Observations& Measurements (O&M) and Sensor Model Language (SensorML) specificationsdefine how to exchange data, while a number of additional Web Services are re-sponsible for the storing, retrieving, and tasking of sensor-related data. One ex-ample is the Sensor Observation Service (SOS) which stores and gives pull-basedaccess to observation data, while the upcoming Sensor Event Service (SES) aimsat push-based notifications and also handles complex event processing.

    2.2 The Semantic Web and Linked Data

    While the OGC has been successful in addressing syntactic and structural het-erogeneity by standardization, semantic heterogeneity remains a major challenge(Kuhn, 2009; Janowicz et al., 2010b; Schade, 2010). For example, the missingcapabilities for semantic matching are among the main obstacles towards a plug &play infrastructure for the Sensor Web (Bröring et al., 2009a). The Semantic Webpromises to address exactly these shortcomings. It reduces semantic interoperabil-ity problems by providing a family of formal and machine-readable knowledgerepresentation languages such as the Web Ontology Language (OWL). SemanticWeb technologies such as inference engines discover implicit relations and helpto establish new facts. Web ontologies make semantic heterogeneity explicit byrestricting domain vocabularies towards their intended interpretations (Guarino,1998; Kuhn, 2009).

    Nevertheless, SDI and the Semantic Web can only interact to a very limiteddegree. A common layer to integrate ontologies and reasoning services into SDIis missing. We have proposed such a Semantic Enablement Layer (SEL) in ourprevious work (Janowicz et al., 2010b). Instead of defining novel services and

    4

  • protocols, it encapsulates Semantic Web repositories and reasoners using estab-lished OGC service types such as the Web Processing Service (WPS). We havealso introduced a similar approach for ontology repositories by encapsulating themby OGC catalog services. This layer provides a transparent and seamless integra-tion of Semantic Web technologies into Spatial Data Infrastructures and does notrequire any modifications to existing services. So far, the suggested solution hasbeen partially implemented and is available as free and open source software. Afirst release of the Web Reasoning Service (WRS) as part of the SEL is availableat https://svn.52north.org/svn/semantics/WRS.

    There is reason to assume that the development discussed previously for ser-vices will also lead away from top-down data models developed by authoritiestowards users becoming active knowledge engineers (Janowicz, 2010). This raisesquestions and problems that reach beyond the goals and capabilities of SDIs aswell as the current state of the art of Semantic Web technologies. Flexible, min-imalistic, and local vocabularies are required to interlink single, context-specificdata fragments on the Web. Linked Data has been introduced to address someof these new requirements (Bizer et al., 2009). In a nutshell, Linked Data de-scribes a paradigm shift from a Web of linked documents towards a Web of linkeddata. In conjunction with ontologies, such raw data can be combined and reusedon-the-fly. From a methodological point of view, Linked Data proposes uniqueidentifiers for data, links between them, and relies on the Resource DescriptionFramework (RDF) (Manola and Miller, 2004). The most common query languagefor RDF is SPARQL (Prud’hommeaux and Seaborne, 2008). SPARQL has similarcapabilities as querly languages for relational databases, but works by matchinggraph patterns and is optimized for RDF triple stores, such as Sesame or Virtu-oso. In comparison to SDIs, the Linked Data paradigm is relatively simple and,therefore, can help to open up SDIs to casual users. Within the last years, LinkedData has become the most promising vision for the Future Internet and has beenwidely adopted by academia and industry. The Linking Open Data cloud diagramprovides a good and up-to-date overview of the available data and the degree ofinterlinkage: http://lod-cloud.net/.

    The combination of classical SWE approaches and work on the Semantic Sen-sor Web (Sheth et al., 2008) with Linked Data has been recently proposed by var-ious research groups (Phuoc and Hauswirth, 2009; Sequeda and Corcho, 2009;Page et al., 2009; Janowicz et al., 2010a; Patni et al., 2010a,b; Schade and Cox,2010). While they differ in application areas and the used methods, they all arguethat sensors, features of interest, and observations should be identified using URIs,looked up by dereferencing these URIs over HTTP, encoded in machine-readableknowledge representation languages such as RDF, and interlinked with other re-sources. Linked Sensor Data can provide the resources and technologies required

    5

    https://svn.52north.org/svn/semantics/WRShttp://lod-cloud.net/

  • to semantically integrate observation-centric data on the Digital Earth.

    2.3 The RESTful Approach

    SDI and Linked Data have been developed with fundamentally different andpartially contradicting motivations in mind. While the OGC has standardizedWeb Services which follow a state-full request-response pattern and are designedbased on the publish-find-bind paradigm of Service-Oriented Architectures (SOA)(Mazzetti et al., 2009), Linked Data explicitly aims at breaking up such data si-los. Instead, it proposes a Resource-Oriented Architecture (ROA) which is focusedon distributed capabilities and services. ROA is an architectural style devoted tomanage distributed, heterogeneous resources, e.g., features, in which client appli-cations interact directly with the exposed resources. The main constraints behindROA-based applications are the principles known as REpresentational State Trans-fer (REST) (Fielding, 2000):

    1. Resources should be identified properly using URIs, i.e., each resource mustbe uniquely addressable;

    2. Uniform interfaces should be provided through the use of HTTP as theunique application-level protocol;

    3. Resources are manipulated through their representations, since clients andservers exchange self-descriptive messages with each another;

    4. Interaction is stateless since servers only record and manage the state of theresources they expose, i.e., client sessions are not maintained on the server;

    5. Hypermedia are the engine of application state, i.e., the application state isbuild following hyperlinks according to the navigation paradigm.

    Providers that wish to publish their sensor data on the Linked Data cloud (andthereby follow a ROA style) can do so by converting their data to RDF, store it in atriple store, and make it accessible via SPARQL endpoints. Consequently, they donot need any OGC Web Service. This implies moving data out of the SDI. Instead,we propose an integrated approach following our notion of semantic enablement(Janowicz et al., 2010b). Data governance and management remains with the SDI,while resources can still be exposed as Linked Data.

    3 A Linked Data Model for Sensor Data

    Exposing observations provided by a SOS as raw data to the Web, requires globalidentifiers for the different components of the O&M model (Sequeda and Corcho,

    6

  • 2009; Page et al., 2009; Janowicz et al., 2010a). In contrast to existing approaches,our work aims at introducing three of the core characteristics of Linked Data toOGC Web Services – namely, global instead of local identifiers, i.e, URIs, linksbetween data, and an RDF serialization of O&M data.

    While transforming GML-encoded geographic information to RDF is a majorstep towards making it accessible on the Linked Data Web, a purely automatedmapping is of questionable value and does not add any semantics (Jain et al., 2010;Schade and Cox, 2010). The problems of assigning meaningful URIs to data sets,the semantic annotation of data using ontologies, and how to establish links toexternal resources are still open research questions.

    This section addresses each of these requirements. We will especially focus onURIs as sources of reference. Linked Data detaches information from its originalcreation context, such as documents, applications, or databases. While this easesaccessibility and re-usability, it makes the interpretation of these data chunks morechallenging (Janowicz, 2010).

    3.1 A Linked Data Model for Sensor Observations

    The O&M specification describes a conceptual data model for handling observa-tions. It follows a classical object-oriented approach and provides an encoding ofthe model as XML schema. In contrast, work at the W3C Semantic Sensor Net-works Incubator Group aims at developing an ontology for sensors and their ob-servations that describes the physical processes of transforming stimuli to numericvalues (Neuhaus and Compton, 2009; Compton et al., 2009; Janowicz and Comp-ton, 2010). Both approaches are not sufficient to address the needs of Linked Data.While O&M supports unique identifiers, it currently does neither prescribe the useof HTTP URI’s, the persistence of identifiers, nor clear and flexible linking strate-gies between resources. Ontologies are an abstraction layer above data models andaim at describing the physical world. For instance, they introduce the notion of astimulus that triggers the sensor and leads to the observation. The stimulus as such,however, cannot be stored in data repositories.

    Therefore, we introduce an intermediate Linked Data model for sensor obser-vations. It is derived from an ontology developed by the W3C SSN-XG, namelythe Stimulus-Sensor-Observation (SSO) design pattern (Janowicz and Compton,2010). The SSO pattern was developed as a flexible and extensible starting pointfor sensor ontologies and Linked Data vocabularies. The OWL ontologies and a de-tailed documentation are available at http://www.w3.org/2005/Incubator/ssn/wiki/Main_Page. Figure 1 shows the classes and relations adopted fromthe pattern together with new elements such as the ObservationCollection and theSamplingTime. In a nutshell, we use the following definitions:

    7

    http://www.w3.org/2005/Incubator/ssn/wiki/Main_Pagehttp://www.w3.org/2005/Incubator/ssn/wiki/Main_Page

  • • FeatureOfInterest: the entity that comprises observable properties; for ex-ample, a 3-dimensional body of air or a sampling point where measurementsare taken.

    • ObservedProperty: the property that inheres in a feature of interest; for ex-ample, the temperature of a 3-dimensional body of air.

    • ObservationCollection: a set of observations that is grouped by a distinctcriteria; for example, all observations performed by a particular sensor, orall observations of a particular observed property that have been performedwithin a particular time frame (sampling time).

    • Observation: a (social) construct that connects observed properties with sen-sors, sensing results, and sampling times; for example, the connection be-tween air temperature, a particular temperature sensor, 11.00am (as samplingtime) and 23 (as result) degree Celsius.

    • SamplingTime: The time instant or interval at which an observation wasmade; for example, 11.00am in the observation above.

    • Result: a symbol representing an observed value; for example, 23 in theobservation above.

    • Sensor: an entity that performs observations and produces results in form ofvalues; for example, a device that measures air temperature. Humans canalso act as sensors.

    [Figure 1 about here.]

    The relations between the presented classes act as links in our model and de-fine the multiple navigation paths and external references; see section 3.4. A linkto additional sensor metadata modeled in SensorML can be established using therelation from an observation to the sensor and then link to additional sensor proper-ties. Based on this model we define an URI schema that provides unique identifiersand at the same time acts as filter for querying.

    3.2 A URI Schema for Linked Sensor Data

    In order to provide observations to the Linked Data Web, URIs for the differentcomponents of the O&M model are required. The main O&M components as-sociated with an observation are features of interest, sensors (procedures), sam-pling times, observed properties and results. The URIs are assigned to thesecomponents by appending the component type to the URI which identifies theauthority. For example, http://my.authority.org/sensors returns links to

    8

    http://my.authority.org/sensors

  • all sensor descriptions. Consequently, http://my.authority.org/sensors/thermometer1 provides the description of a certain thermometer and links to theobservation collection containing the produced observations. An according schemeis defined for the other components of observations as described before.

    To enable the RESTful access to sensor observations, the base URI schemehas the form: http://my.authority.org/observations. By following thisbase link an observation collection which contains all observations of a RESTfulservice are returned. Observations measured by specific sensors, gathered forparticular observed properties, or from specific features can be retrieved byappending the component type in one segment followed by the identifiers ofthose resources to the base URI in the next segment. For example, the refer-ence http://my.authority.org/observations/sensors/thermometer1/featuresofinterest/measurementpoint23/observedproperties/temperature, points to all observations gathered by the sensor thermometer1 atthe feature of interest measurementpoint23 for the observed property temperature.The order of the component type segments can be chosen as desired by the client.This makes the new scheme more robust compared to our previous approach(Janowicz et al., 2010a).

    Another example is a reference to all temperature and windspeed observationswhich appends the segment /observedproperties/temperature;windspeedto the base URI. In general, several identifiers can be appended for certainresources. By adhering to the proposal of Richardson and Ruby (2007) fora sound URI scheme, these multiple identifiers are separated by semicolons,as their order does not matter. To refer to observations from a particular timeinstant or period, the samplingtimes token can be appended to the URI followedby two comma-separated time strings encoded according to the ISO 8601specification (ISO, 2004). The temporal relationship between the time stringshas to be specified, e.g. ont:time:relation:between defines that all observa-tions between the first and second date should be returned. For instance, theURI http://my.authority.org/observations/samplingtimes/ont:time:relation:between,2008-01-10T14:00,2008-01-12T16:00/sensors/thermometer1/observedproperties/temperature pointsto the observation collection with all temperature observations from January10th 2008 at 2pm until January 12th at 4pm made by thermometer1. In thiscase, the time strings are comma-separated as the order of the time stringsis not arbitrary. The first time string represents the start date of the time pe-riod for which observation should be returned, the second indicates the endof the time period. The second time string can also be omitted when a linkto observations of a particular time instant has to be specified, e.g. all tem-perature observations gathered at January the 9th are identified by the URI

    9

    http://my.authority.org/sensors/thermometer1http://my.authority.org/sensors/thermometer1http://my.authority.org/observationshttp://my.authority.org/observations/sensors/thermometer1/featuresofinterest/measurementpoint23/observedproperties/temperaturehttp://my.authority.org/observations/sensors/thermometer1/featuresofinterest/measurementpoint23/observedproperties/temperaturehttp://my.authority.org/observations/sensors/thermometer1/featuresofinterest/measurementpoint23/observedproperties/temperature/observedproperties/temperature;windspeedhttp://my.authority. org/observations/samplingtimes/ont:time:relation:between,2008-01-10T14:00,2008-01-12T16: 00/sensors/thermometer1/observedproperties/temperaturehttp://my.authority. org/observations/samplingtimes/ont:time:relation:between,2008-01-10T14:00,2008-01-12T16: 00/sensors/thermometer1/observedproperties/temperaturehttp://my.authority. org/observations/samplingtimes/ont:time:relation:between,2008-01-10T14:00,2008-01-12T16: 00/sensors/thermometer1/observedproperties/temperature

  • http://my.authority.org/observations/samplingTimes/ont:time:relation:equals,2008-01-09/observedProperties/temperature.

    The URIs for observations described before are used to provide links from sen-sor and feature descriptions to their related observations. For example, the sensordescription at http://my.authority.org/sensors/thermometer1 containslinks to the observations produced by the sensor: http://my.authority.org/observations/sensors/thermometer1. This ability to collect observations byfollowing links offered by the RESTful SOS proxy replaces the extensive filteringcapabilities of the original SOS interface and reflects the interlinking paradigm ofLinked Data.

    As an additional spatial criteria, a bounding box can be appended to theURI. We use commas to separate the ordered parameters forming a boundingbox. The first four values are the coordinates defining a two dimensional rect-angle, while the fifth value is the identifier of their reference system: ,,,,. Since obser-vations do not necessarily have a location property, the spatial filter is applied tothe position of the feature of interest associated with an observation. The SOSspecification uses the ’propertyName’ parameter to identify the property to whichthe spatial filter is applied. For the sake of simplicity we restrict this parame-ter to the position (shape) of the feature of interest. In contrast to the temporalfilter, the coordinates are not in different URI segments, as leaving one of themaside would be worthless. An URI using a bounding box filter may be specifiedas follows: http://my.authority.org/observations/boundingBox/3,6,23,36,urn:ogc:def:crs:EPSG:6.5:4326.

    Finally, using geographic coordinates in the URIs raises the question of how toencode geometries within the Linked Data model. The geometry of features canbe encoded as plain-text lists of coordinates such as defined in GML or translatedto an RDF representation of single points connected by RDF predicates. The lattersolution leads to a substantial overhead as receiving the geometry of a feature ofinterest involves traversing all points defined within the RDF encoding. This isonly feasible if introducing a labeled and directed graph adds further retrieval orreasoning capabilities. Existing solutions do not require nor benefit from such anRDF serializations, e.g., for computing topological relations. Consequently, we donot advocate to transform GML geometries from well-known text to RDF.

    It is important to keep in mind that the proposed URI schema fulfills two re-quirements at the same time. First, it provides unique references for all resourcesstored in a SOS. Second, it defines a filtering language for querying the SOS basedon a RESTful paradigm. Consequently, many URIs may be used to request thesame data but a particular URI will always uniquely identify a data set. The vagueand uncertain nature of features of interest as well as the dynamic nature of sensor

    10

    http://my.authority.org/observations/samplingTimes/ ont:time:relation:equals,2008-01-09/observedProperties/temperaturehttp://my.authority.org/observations/samplingTimes/ ont:time:relation:equals,2008-01-09/observedProperties/temperaturehttp://my.authority.org/sensors/thermometer1http://my.authority.org/observations/sensors/thermometer1http://my.authority.org/observations/sensors/thermometer1http://my.authority.org/observations/boundingBox/3,6,23,36,urn:ogc:def:crs:EPSG:6.5:4326http://my.authority.org/observations/boundingBox/3,6,23,36,urn:ogc:def:crs:EPSG:6.5:4326

  • data in general make the definition of URIs and the creation of links between datasets challenging.

    3.3 Establishing Meaningful URIs

    One claim of the Linked Data initiative is to make raw data available on the Weband assign Uniform Resource Identifiers to each chunk of data. The notion of (andneed for) raw data was introduced by Berners-Lee during a TED talk in 2009 call-ing for a direct and unfiltered access to data; see http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html. Taken seriously, this postulationleads to numerous problems ranging from object identity (Janowicz, 2010), overgranularity, up to the question which processing steps are allowed before the datacannot be called raw anymore. In fact, most sensor data is pre-processed. e.g., bydeleting or replacing measurement errors. In this work, we use OGC’s notion of afeature of interest to demonstrate some of these challenges.

    Measurement is the process in which a sensors receives a stimulus and trans-lates it into another, often digital, representation. However, we are not interestedin such stimuli but in what they reveal about the properties of specific entities inthe physical world – the features of interest. Such features and their correspondingtypes, however, do not exist a priori but are an artifact of human cognition and so-cial convention (Brodaric and Gahegan, 2007; Mark, 1993; Lehar, 2003). The ex-traction of features from sensor observations requires several processing steps thatare arbitrary to a certain degree. For example, the extraction of land-cover featuressuch as forests from raster data varies among algorithms and applications. The clas-sical where is downtown problem can be used for illustration (Montello et al., 2003)– for instance, measuring the air temperature at the downtown of Santa Barbara,CA. One can determine the boundaries of such a (vague) region from human partic-ipants tests, by studying user assigned tags in Web 2.0 photo communities (Keßleret al., 2009), and various other approaches. Each of these methods depends on ad-ditional factors, such as confidence values in case of the human participants tests.Hence, a URI for Santa Barbara’s downtown extracted using a yolk-egg modelbased on a 75% confidence may be constructed as http://my.authority.org/featuresofinterest/SantaBarbara_downtown/yolk-egg/C75/ and linkedas a feature of interest to an air quality sensor system.

    A meaningful URI scheme also requires a careful sequencing of the usedsegments. For example, omitting the /C75/ segment should still identify a re-source – in this case, it should return all polygons based on the yolk-eggmodel available for downtown Santa Barbara. Further reducing the URI byremoving the /yolk-egg/ segment, returns all potential downtown representa-tions extracted by different approaches (and their parameters) from the data

    11

    http://www.ted.com/talks/tim_berners_lee_on_the_next_web.htmlhttp://www.ted.com/talks/tim_berners_lee_on_the_next_web.htmlhttp://my.authority.org/featuresofinterest/SantaBarbara_downtown/yolk-egg/C75/http://my.authority.org/featuresofinterest/SantaBarbara_downtown/yolk-egg/C75/

  • stored at my.authority.org. One could argue that http://my.authority.org/featuresofinterest/SantaBarbara_downtown/ is the base URI identifyingSanta Barbara’s downtown and providing all RDF-encoded information about thisfeature as well as the links to different geometries. While this is a feasible ap-proach, it does not solve the problem of identity. This would require a context-free,i.e., global, notion of Santa Barbara or complex semantic mappings. It is worthmentioning that using owl:sameAs to link between different versions of the down-town URI would rather add to the confusion and not resolve the problem (Halpinand Hayes, 2010).

    Finally, taking the call for raw data seriously, one could assign URIs to all un-processed outputs of sensors. However, this turns out to be of questionable valuefor two reasons: (i) What is the appropriate granularity for such chunks of data? Forexample, remote sensing data from satellites is recorded based on the swath widththat clearly has no reference to geographic features. Therefore, assigning URIsto such huge chunks of data would render them meaningless and moreover wouldfragment geographic features randomly over different data sets. Alternatively (andalso pointless), one could create URIs for each single pixel representing a sensedvalue. (ii) A major reason for the limited re-usability of sensor observations is thatby deploying a sensor we already make numerous assumptions about the studiedphenomena. Summing up, there is hardly any context-free sensor data. In mostcases raw data will be of limited value. The selection, processing, and publica-tion of data using URIs already involves making certain decisions, e.g., about thefeatures of interest.

    3.4 Establishing Links to External Sources

    While links are one of the crucial building blocks for Linked Data, Guéret et al.have shown that about 80% of all triples within the Linked Data cloud point eitherto URIs in the same namespace, blank nodes, or literals (Gueret et al., 2010). Inprevious work, we described some of the difficulties involved in linking highlydynamic resources such as sensor data and argued for a curated approach (Keßlerand Janowicz, 2010). For instance, links between sensors and features of interestsmay be of a temporal nature. An observes relation between a particular air qualitymonitoring station and a gas plume will only hold as long as the feature existsor until the station has been re-deployed at another location. For this reason, ourLinked Data model does not offer direct links between sensors and features; seealso figure 1.

    So far, links expressed as RDF predicates do not support a temporal scoping.A partial solution are blank nodes or reifications in which the observes relation isrepresented as an RDF subject to which timestamps are linked. Such approaches,

    12

    http://my.authority.org/featuresofinterest/SantaBarbara_downtown/http://my.authority.org/featuresofinterest/SantaBarbara_downtown/

  • however, complicate querying and require additional knowledge about the usedpredicates. To easy interlinking with external recourses, the proposed Linked Datamodel tries to keep the data identified by a URI as stable as possible. An ob-servation, for instance, will only change due to manual modifications by the dataprovider and, hence, can be considered as stable. Therefore, Linked Data appli-cations can use these URIs in combinations with owl:SameAs services such assameas.org. In contrast, if we would provide a direct mapping from an ontologysuch as the Semantic Sensor Network Ontology (Neuhaus and Compton, 2009;Janowicz and Compton, 2010), data about sensors or features of interest identifiedby a particular URI would change between requests. This would result in mislead-ing links and hamper meaningful information retrieval.

    While this is essentially due to the nature of dynamic information, we reducethe resulting problems by introducing the ObservationCollection known from theoriginal O&M specifications. As depicted in figure 1, sensors, features of interest,sampling times, results, and observed properties have a unidirectional link to theObservationCollection. This collection contains a list of observations related tothe subject of the query. Consequently, while the RDF triples describing a specificsensor will be relatively stable and may be used as basis for interlinking, the ob-servation collection encapsulates the dynamic parts. This again, points out that aLinked Data model is required in addition to ontologies.

    4 A Transparent RESTful SOS Proxy

    This section introduces the RESTful SOS proxy, which can be installed as a soft-ware facade in front of any OGC conform SOS and offers the core functionality tomake sensor data available to the Linked Data cloud. Based on the scheme intro-duced in section 3.2, the RESTful proxy extracts the user’s query from the URI,encodes it into valid SOS queries, fetches the results from the underlying SOS,and converts them (after according content negotiation) to an RDF representationaligned with the Linked Data model described in section 3.1. Hence, the URI iden-tifies a particular data set and at the same time encodes a query to the underlyingSensor Observation Service.

    4.1 Design of the RESTful SOS Proxy

    The RESTful SOS proxy (available online at http://52north.org/RESTful_SOS) is developed based on the OGC Web Service Access Framework (OX-Framework)(see http://52north.org/oxf), a software framework whose ar-chitecture can be used for an easy utilization of OGC Web Services, such as the

    13

    sameas.orghttp://52north.org/RESTful_SOShttp://52north.org/RESTful_SOShttp://52north.org/oxf

  • SOS. The OX-Framework (Bröring et al., 2009b) offers developers a customiz-able and extendable system of cooperating classes which supply a reusable designapplicable for the implementation of software clients. However, the frameworkdoes not only support access to data through the SOS. The OX-Framework hasbeen developed by the 52◦ North Sensor Web community (see http://52north.org/sensorWeb) to provide an integrative view to access all kinds of OGC WebServices and enable the processing of the queried data.

    The OX-Framework supports the access of various service interfaces by pro-viding a generic architecture which includes a plug-in mechanism for serviceadapters as extension points of the framework. The architecture is structured intoan adapters subsystem and a core subsystem as shown in figure 2.

    [Figure 2 about here.]

    The core subsystem incorporates a two-folded data model: the common ca-pabilities model implements the OGC Web Service Common model (Whiteside,2007) and introduces thereby the integrative view on service access to the frame-work. The internal feature model provides a basis for accessing and processingof features, including observations. It is based on OGC’s feature model (Kottmanand Reed, 2009), and, since the O&M design follows OGC’s feature model, re-trieved observations can be directly mapped. These two data models enable thecommunication between the various framework components.

    The adapters subsystem contains realizations of three kinds of service adapters.The service connectors are used to trigger service operations and to instantiate thecommon capabilities model. Here, a specific service connector for the SOS is used,first, to call the GetCapabilities operation for retrieving metadata about the serviceand its contents, and second, to execute the GetObservation, DescribeSensor orGetFeatureOfInterest operations to retrieve data requested via the RESTful SOSinterface. Once data has been retrieved from the service, feature stores provide thefunctionality to unmarshal received feature data into the core’s feature model. Incase of the RESTful SOS proxy, a specific feature store for O&M has been de-veloped to instantiate the feature model from received observations. Finally, dataprocessors can run on the instantiated feature model and are used to transform thefeature data into other representations. Within this work, we developed a proces-sor which converts observations into RDF-encoded Linked Data but we also sup-port other representations such as KML or JPEG charts. The RESTful SOS proxychooses the right data processor based on content negotiation and the HTTP acceptheader. The described interplay of different adapter components is illustrated infigure 3.

    [Figure 3 about here.]

    14

    http://52north.org/sensorWebhttp://52north.org/sensorWeb

  • Service adapters are implemented for specific types of OGC Web Services,e.g., for the SOS. Implementations of service adapters can be incorporated into theframework in a dynamic way by means of a plugin mechanism which enables aflexible extension of the framework. Once plugged into the OX-Framework, theapplications build on top of it can reuse the service adapter plugins. This flexibilityalso allows extending the RESTful proxy to work with other OGC Web Servicesfor which service adapters already exist.

    4.2 Discussion of the Proxy Approach

    During the implementation of the proxy, several differences between the OGC ap-proach for data provision services and a well-designed Linked Data model havebecome apparent. On the one hand, these differences increase the complexity ofthe implementation, on the other hand, these have implications on the performanceof the proxy. Originally, we planned to directly map a SOS URI identifying a cer-tain observation collection to a GetObservation request and translate the result toRDF using XSLT. However, this has not been possible for several reasons, whichwe discuss in the following as they are relevant for further work on semantic en-ablement of SDIs as well.

    According to the current SOS specification (Na and Priest, 2007), every SOSprovider has to group its observations in arbitrary groups called observation offer-ings. The specification does not further restrict this grouping which leads to verydifferent interpretations of such offerings. Some providers group their observationsby observed properties, others are using certain time periods, while observationsare sometimes also grouped by spatial extent. Conforming to the current SOS spec-ification, one GetObservation request can only query observations from one par-ticular offering.Thus, metadata about all observation offerings need to be cachedby the SOS proxy and kept updated for cases when new observation offerings areadded to the SOS at the backend. The OGC is aware of these problems and theupcoming SOS specification version 2.0 (Bröring et al., 2010) will only containan optional observation offering parameter in an GetObservation request. Also, theobservation offering is defined as a collection of observations produced by a certainsensor system.

    In consequence, when a certain URI identifying an observation collection hasto be resolved by the SOS proxy, the segments of the URI have to be mapped toone GetObservation request for each observation offering. Therefore, steps 2 and3 in figure 3 have to be repeatedly executed if, for example, the feature of interestspecified in the URI is registered at multiple observation offerings at the SOS. Incase of large datasets, which are divided in several observation offerings at the SOS,this might lead to numerous requests that have to be parsed and then transformed to

    15

  • RDF. To make the RESTful SOS proxy more performant and scalable, intelligentcaching mechanism as well as parallel querying and response merging becomenecessary. Once the RESTful SOS proxy is deployed for a certain SOS, it shouldquery all information required from the SOS and then only update the information,if the metadata of a particular offering (and thus its observations) have changed.

    5 Application

    This section illustrates how the RESTful SOS proxy can be used to ac-cess air quality observations made by a particular monitoring station. TheRESTful SOS proxy is deployed at http://v-swe.uni-muenster.de:8080/52nRESTfulSOS/RESTful/sos/AirBase_SOS/. For the sake of readability, thisendpoint URL is replaced with the abbreviation http://myRESTfulSOS/ in thefollowing. The deployed SOS offers data from the AirBase database provided bythe European Environmental Agency. As an entry point to Linked Sensor Data,the user retrieves the RDF representation of a particular monitoring station by fol-lowing the URI http://myRESTfulSOS/sensors/HR:0002A. The RDF serial-ization is shown below and contains links to the related observations. Followingthese links results in new query to the RESTful SOS.

    < r d f : D e s c r i p t i o n r d f : a b o u t =" h t t p : / / myRESTfulSOS / s e n s o r s / HR:0002A ">< r d f s : l a b e l >HR:0002A< / r d f s : l a b e l >< r d f : t y p e r d f : r e s o u r c e =" h t t p : / / v−swe . uni−m u e n s t e r . de :8080

    �� /5 2 nRESTfulSOS / RESTful / minionmld / Se ns o r " / >< m i n i o a m l d : r e l a t e d O b s e r v a t i o n sr d f : r e s o u r c e =" h t t p : / / myRESTfulSOS / o b s e r v a t i o n s / s e n s o r s

    �� / HR:0002A " / >< / r d f : D e s c r i p t i o n >

    Following the link http://myRESTfulSOS/observations/sensors/HR:0002A to the related observations, the user receives an RDF representation of theobservation collection listing all observations produced by the particular monitor-ing station. In case only observations from 2008 for nitrogen dioxide are required,the observation collection can be further restricted by calling the URI http://myRESTfulSOS/observations/sensors/HR:0002A/samplingtimes/2008-01-01,2008-12-31/observedproperties/concentration[NO2] accordingto the scheme defined in section 3.2. The resulting RDF serialization is shownbelow.

    < r d f : D e s c r i p t i o n r d f : a b o u t =" h t t p : / / myRESTfulSOS / o b s e r v a t i o n s/ s e n s o r s / HR:0002A / sampl ingTimes /2008−01−01 ,2008−12−31/ o b s e r v e d P r o p e r t i e s / C o n c e n t r a t i o n [NO2] ">

    < r d f s : l a b e l >OC_NO2_HR:0002A_2008−01−01 ,2008−12−31< / r d f s : l a b e l >< r d f : t y p e r d f : r e s o u r c e =" h t t p : / / v−swe . uni−m u e n s t e r . de :8080

    16

    http://v-swe.uni-muenster.de:8080/52nRESTfulSOS/RESTful/sos/AirBase_SOS/http://v-swe.uni-muenster.de:8080/52nRESTfulSOS/RESTful/sos/AirBase_SOS/http://myRESTfulSOS/http://myRESTfulSOS/sensors/HR:0002Ahttp://myRESTfulSOS/observations/sensors/HR:0002Ahttp://myRESTfulSOS/observations/sensors/HR:0002Ahttp://myRESTfulSOS/observations/sensors/HR:0002A/samplingtimes/2008-01-01,2008-12-31/observedproperties/concentration[NO2]http://myRESTfulSOS/observations/sensors/HR:0002A/samplingtimes/2008-01-01,2008-12-31/observedproperties/concentration[NO2]http://myRESTfulSOS/observations/sensors/HR:0002A/samplingtimes/2008-01-01,2008-12-31/observedproperties/concentration[NO2]

  • �� /5 2 nRESTfulSOS / RESTful / minionmld / O b s e r v a t i o n C o l l e c t i o n " / > [... shortened output]< / r d f : D e s c r i p t i o n >

    Implementing the browsing paradigm of Linked Data, users or applications canretrieve particular observations by following the hasObservation relationship; e.g.by following the URI http://myRESTfulSOS/observations/ids/o_5633 asshown below. Note that the URI to the aboutProperty is not pointing to a URI ofthe SOS, but to a Sensor Observable Registry which has been proposed as a registrymechanism for observable properties at OGC (Jirka and Bröring, 2009) and alsoserves a RESTful interface to the contained resources.

    < r d f : D e s c r i p t i o nr d f : a b o u t =" h t t p : / / myRESTfulSOS / o b s e r v a t i o n s / i d s / o_5633 ">

    < r d f s : l a b e l >o_5633< / r d f s : l a b e l >< r d f : t y p e r d f : r e s o u r c e =" h t t p : / / v−swe . uni−m u e n s t e r . de :8080

    �� /5 2 nRESTfulSOS / RESTful / minionmld / O b s e r v a t i o n " / >< m i n i o a m l d : a b o u t P r o p e r t y r d f : r e s o u r c e =" h t t p : / / giv−g e n e s i s

    �� . uni −m u e n s t e r . de :8080 /SOR / REST / phenomenon /OGC / C o n c e n t r a t i o n [NO2] " / >

    < m i n i o a m l d : h a s R e s u l t r d f : r e s o u r c e =" h t t p : / / myRESTfulSOS

    �� / r e s u l t s / 5 . 4 8 " / >< / r d f : D e s c r i p t i o n >

    With the RESTful SOS in place, unique identifiers can be used to persistentlyand globally refer to chunks of dynamically growing data sets. These identifiersare generated automatically by deploying the proxy on top of a classic SOS instal-lation. Other resources become able to seamlessly connect to the data in a trans-parent and consistent manner. For example, reports on environmental conditionscan directly refer to the earth observation data, which was used in the monitoringphase. In this way, we bridge the gap between SDI and the Semantic Web whilekeeping data management at the source. This proposed solution does not imposeany change of technology which would result in changes in connected geospatialdecision support systems. For SDI users and service providers, the proposed ap-proach provides a major achievement opposed to the mirroring of geospatial datawithin triple stores. Others may still prefer to step outside of the SDI frameworkand publish their data directly to the Linked Data cloud.

    Furthermore, spatial processing may be implemented inside a SOS, such thatspatial reasoning of SDIs could be combined with logics-based reasoning on top ofthe RDF serialization. Following this endeavor, the relatively complex silos of cur-rent SDIs can be partially embedded into mainstream applications, which opens a

    17

    http://myRESTfulSOS/observations/ids/o_5633

  • new dimension of geospatial data sharing for Digital Earth research. Raw measure-ment results may be equally integrated as value added information, such as spatialinterpolations or the results of forecasting simulations. In the mid-term, the en-abled flexible plug-and-play of observation data may lead to micro-SDIs (Janowiczet al., 2010b), in which any SDI managed resource can be easily augmented withcommon Web pages and services. Such a step would not only allow for better anal-ysis of the complex interplay between society and the environment, it would at thesame time open up a new market place for innovative integrated applications.

    6 Conclusions and Outlook

    In this section, we summarize the presented work, discuss lessons learned, andpoint out directions for further work.

    6.1 Summary

    In this work, we introduced a Linked Data model to combine OGC’s Sensor Ob-servation Service with the Linked Data cloud. We developed a URI scheme thatoffers unique global identifiers and build-in query filtering, and also discussed howto introduce links to external sources. We implemented a transparent and REST-ful SOS proxy that can serve Linked Sensor Data without any modifications toexisting services. Moreover, we gave insights into the problem of identity for theassignment of URIs. We argued that it stems from an entity-centric view takenby most Semantic Web and Linked Data research. It assumes that distinct entitiescan be identified and while their attributes may be different or vague, the exis-tence of mind-independent entities is not questioned. In contrast, work on sensorsand earth observations is focused on continuous fields and entities are introducedduring analysis (if at all).

    We decided to use a RESTful approach as it combines three key advantages.First, URIs are a fundamental building block of Linked Data. REST allows us toidentify data and at the same time encode the query using our scheme. Second, amajor requirement of our vision of Semantic Enablement is transparency. UsingREST, Linked Data users and applications are not even aware that they are query-ing an underlying OGC service. Third, a RESTful approach is very flexible withrespect to content negotiation. Our proxy does not only serve RDF but KML andother formats as well.

    Summing up, the proposed approach provides an important step towards the se-mantic enablement of existing information systems and infrastructures, and therebyeases the integration of dynamic information sources such as sensor networks. De-

    18

  • livering observations as Linked Data, connecting them with other data sources, andusing ontologies and Semantic Web reasoners to improve retrieval, alignment, andmatching are major building blocks for the implementation of a Digital Earth. Webelieve that this work can help to broaden the view of a Digital Earth as knowledgearchive towards a knowledge engine vision similar to IBM’s Watson.

    6.2 Lessons Learned

    Linked Sensor Data is a radically new approach and differs from our previous workon Sensor Web Enablement and geospatial semantics in many ways. The workpresented in this article points out several lessons to be learned.

    First, just transforming data to RDF does not add any semantics. As pointed outby Jain et al. (2010), ontologies are a crucial part of the Linked Data vision. Theseontologies may be lightweight or heavyweight, but they have to restrict the mean-ing of the used terms and relations towards their intended interpretation. WhileDBPedia is a great showcase for the idea of Linked Data, it also demonstrates theneed for more expressive ontologies and reasoning. For instance, using the SantaBarbara example introduced before, there are at least six different relations in theDBPedia ontologies describing the fact that a company is located within a city:headquarters of, headquarter of, foundationPlace of, location of, city of, and loca-tionCity. While the differences between some of them can be understood from theirnames, this is not possible for others. Moreover, semantics should not be encodedin literals that are not machine-understandable. Consequently, it is not clear whichrelations should be used or queried and users will get different results.

    Developing ontologies for Linked Data is difficult. If the ontologies are too spe-cific they will restrict the usage of the data and the possibility to interlink them withexternal sources; if they are not specific enough, data may be misinterpreted. Withrespect to the Digital Earth, data spanning different sources, topics, and perspec-tives should be integrated on-the-fly to answer complex scientific questions. Asdemonstrated by Probst and Lutz (2004), some mismatches cannot be discoveredon the syntactic level and combining data from sources that seem to be compatiblemay lead to wrong results, e.g., when simulating the dispersion of a toxic gas plumeusing weather data from different Web services. To address this challenge, we use aslightly modified version of the Stimulus-Sensor-Observation ontology design pat-tern developed by the W3C SSN-XG (Janowicz and Compton, 2010). This patternwas developed with Linked Sensor Data in mind and represents the minimal on-tological commitments for our RESTful SOS. The pattern comes with an optimalalignment to the DOLCE Ultra Light top-level ontology as well as the SemanticSensor Network ontology. In this work, we present the Linked Data Model, URIschemes, and the implementation of the RESTful SOS – aligning more specific

    19

  • ontologies to the pattern is up to the SOS providers and depends on their data andapplication areas.

    Second, our URI scheme does not only identify resources but acts as a queryfilter at the same time. Each URI is decomposed into its segments and translatedinto calls for the Sensor Observation Service. While this approach is elegant torealize the transparent Semantic Enablement Layer, it does not cover the full func-tionality of a SOS. In the future, the URI scheme may be extended to mirror morefilters. However, this makes a meaningful segmentation of the URI (followingREST principles) difficult. It is important to notice that Linked Data allows tobrowse and navigate data sets by following relations to other internal or externalsources. In our work, this browsing paradigm (known from the Document Web)replaces the extensive SOS filters. Finding the right balance between URI-basedfilters and following links needs to be determined based on feedback from usersand SOS providers.

    Third, as pointed out by Schade and Cox (2010) not all data stored in SDIsneeds to be transformed to RDF. Essentially, Linked Data forms a global graph ofinterconnected resources. Some of these resources may be leaf nodes of the graph.It is not clear how to connect these data sets, e.g., binary files, and where to stoptriplifying data. For instance, do we need an RDF serialization of shape files orgeometries in general? Based on our work, we believe that these decisions needto be taken on a case-by-case basis depending on the added value. Before tripli-fying data, one should clarify which ontologies can be used, whether the data willbenefit from internal and external links, and whether machine-readability and Se-mantic Web reasoning techniques are desired. For instance, we are skeptical aboutapproaches that try to develop RDF representations for well-known text (WTK)encoded geometries. While this (and especially separating latitude and longitude)creates a substantial overcharge, the added value remains unclear.

    6.3 Outlook

    Further work will target the extension of the Linked Data model as introduced inSection 3. This particularly includes links between the sensor observations and ex-ternal Linked Data sources. For reasons of simplicity, we have also not discussedthe distinction between URIs referring to real world entities as proposed by theWeb of Things (Guinard and Trifa, 2009) and URIs referring to data about theseentities. Moreover, further OGC specifications will be included in the Linked Datamodel – especially SensorML. It is used to encode sensor descriptions and is there-fore tightly coupled with the Sensor Observation Service and other sensor-relatedservices. Extending the introduced model to resource types, such as coverages andmaps, and RESTful proxies to other OGC services, e.g. the Web Coverage Service

    20

  • and Web Mapping Service are also on the agenda of the 52◦ North Semantics Com-munity. This is possible, as the presented implementation is based on the genericOX-Framework that can be extended through the development of according serviceadapters (see section 4). We also plan to implement semantic enablement strategiesfor push-based services such as the Sensor Event Service (Echterhoff and Everding,2008).

    Another important direction of further work is the ongoing development of asemantics-enabled Sensor Plug&Play infrastructure (Bröring et al., 2011). Sen-sors can by automatically registered at a Sensor Observation Service and matchedagainst the service profile. A sensor bus registers the sensors, while a mediatorperforms an ontology-based matching. Sensor Plug&Play is a pre-requisite for thevision of smart dust sensor networks and could extend the Digital Earth by large-scale, real-time observations in the future. Our current implementation is restrictedto subsumption-based matching, but we plan to implement a rule-based system andintegrate our SIM-DL similarity server in the near future.

    Finally, we hope that realizing a micro-SDI (Janowicz et al., 2010b) based onLinked Data and JavaScript will enable a ubiquitous Geospatial Web.

    Acknowledgments

    The presented work is developed within the 52◦ North semantics community,and partly funded by the European projects UncertWeb (FP7-248488), ENVISION(FP7-249170), as well as the GENESIS project (an Integrated Project, contractnumber 223996). We are thankful for discussions with members of the MünsterSemantic Interoperability Lab (MUSIL), the Spatial Data Infrastructures Unit ofthe Joint Research Centre (JRC) of the European Commission, and many of ourcolleagues from the W3C Semantic Sensor Network Incubator Group (SSN-XG).

    References

    Bizer, C., Heath, T., and Berners-Lee, T., 2009. Linked Data – The Story So Far,International Journal on Semantic Web and Information Systems, 5 (3), 1–22.

    Botts, M., Percivall, G., Reed, C., and Davidson, J., 2008. OGC Sensor Web En-ablement: Overview and High Level Architecture, Lecture Notes In ComputerScience, 4540, 175–190.

    Brodaric, B. and Gahegan, M., 2007. Experiments to Examine the Situated Natureof Geoscientific Concepts, Spatial Cognition and Computation, 7 (1), 61–95.

    21

  • Bröring, A., Janowicz, K., Stasch, C., and Kuhn, W., 2009a. Semantic challengesfor sensor plug and play, in: J.D. Carswell, A.S. Fotheringham, and G. McArdle,eds., Web and Wireless Geographical Information Systems, 9th InternationalSymposium, W2GIS 2009, Springer, Lecture Notes in Computer Science, vol.5886, 72–86.

    Bröring, A., Jürrens, E.H., Jirka, S., and Stasch, C., 2009b. Development of SensorWeb Applications with Open Source Software, in: First Open Source GIS UKConference (OSGIS 2009), 22 June 2009, Nottingham, UK.

    Bröring, A., Maué, P., Janowicz, K., Nüst, D., and Malewski, C., 2011.Semantically-enabled Sensor Plug & Play for the Sensor Web, Sensors; (ac-cepted for publication).

    Bröring, A., Stasch, C., and Echterhoff, J., 2010. OGC Interface Standard 10-037:SOS 2.0 Interface Standard [candidate standard], Open Geospatial Consortium.

    Compton, M., Henson, C., Neuhaus, H., Lefort, L., and Sheth, A., 2009. A sur-vey of the semantic specification of sensors, in: T. K., A. Ayyagari, and D. DeRoure, eds., Proceedings of the 2nd International Workshop on Semantic SensorNetworks (SSN09), CEUR, vol. Vol-522, 17–32.

    Craglia, M., Goodchild, M., Annoni, A., Camara, G., Gould, M., Kuhn, W., Mark,D., Masser, I., Maguire, D., Liang, S., and Parsons, E., 2008. Next-generationdigital earth: A position paper from the vespucci initiative for the advancementof geographic information science, International Journal of Spatial Data Infras-tructures Research, 3, 146–167.

    Crompvoets, J. and Bregt, A., 2007. Research and theory in advancing spatial datainfrastructure, ESRI Press, Redlands, USA, chap. National spatial data clearing-houses, 2000-2005, 133–146.

    De Longueville, B., Annoni, A., Schade, S., Ostländer, N., and Whitmore, C.,2010. Digital earth’s nervous system for crisis events: real-time sensor web en-ablement of volunteered geographic information, International Journal of Digi-tal Earth, 3 (3).

    Echterhoff, J. and Everding, T., 2008. OGC Discussion Paper 08-133: OpenGISSensor Event Service Interface Specification, Open Geospatial Consortium.

    Fielding, R., 2000. Architectural Styles and the Design of Network-based SoftwareArchitectures, Ph.D. thesis, University of California, Irvine.

    22

  • Goodchild, M., 2007. Citizens as sensors: the world of volunteered geography,GeoJournal, 69 (4), 211–221.

    Goodchild, M.F. and Glennon, J.A., 2010. Crowdsourcing geographic informationfor disaster response: a research frontier, International Journal of Digital Earth,3 (3), 231–241.

    Gore, A., 1998. The digital earth: Understanding our planet in the 21st century.

    Guarino, N., 1998. Formal Ontology and Information Systems, in: N. Guar-ino, ed., International Conference on Formal Ontology in Information Systems(FOIS1998), Trento, Italy: IOS Press, 3–15.

    Gueret, C., Groth, P., van Harmelen, F., and Schlobach, S., 2010. Find-ing the achilles heel of the web of data: using network analysis for link-recommendation, in: Proceedings of the Int. Semantic Web Conf 2010.

    Guinard, D. and Trifa, V., 2009. Towards the Web of Things: Web Mashups forEmbedded Devices, in: International World Wide Web Conference, Madrid,Spain.

    Halpin, H. and Hayes, P., 2010. When owl:sameas isn’t the same: An analysis ofidentity links on the semantic web, in: Linked Data on the Web (LDOW2010).

    ISO, 2004. 8601:2004(E) Data elements and interchange formats - Informationinterchange - Representation of dates and times.

    Jain, P., Hitzler, P., Yeh, P.Z., Verma, K., and Sheth, A.P., 2010. Linked Data isMerely More Data, in: AAAI Spring Symposium ’Linked Data Meets ArtificialIntelligence’, AAAI Press, 82–86.

    Janowicz, K., 2010. The role of space and time for knowledge organization on thesemantic web, Semantic Web, 1 (1-2), 25–32.

    Janowicz, K., Bröring, A., Stasch, C., and Everding, T., 2010a. Towards mean-ingful URIs for linked sensor data., Towards Digital Earth: Search, Discoverand Share Geospatial Data, Workshop at Future Internet Symposium, Septem-ber 20th, 2010, Berlin, Germany., 640.

    Janowicz, K. and Compton, M., 2010. The stimulus-sensor-observation ontologydesign pattern and its integration into the semantic sensor network ontology, in:A.A.D.D.R. Kerry Taylor, ed., Proceedings of the 3rd International workshopon Semantic Sensor Networks 2010 (SSN10) in conjunction with the 9th Inter-national Semantic Web Conference (ISWC 2010), CEUR-WS, vol. 668.

    23

  • Janowicz, K., Schade, S., Bröring, A., Keßler, C., Maue, P., and Stasch, C., 2010b.Semantic enablement for spatial data infrastructures, Transactions in GIS, 14 (2),111–129.

    Jirka, S. and Bröring, A., 2009. OGC Sensor Observable Registry Discussion Pa-per, OGC Discussion Paper 09-112, Open Geospatial Consortium.

    Keßler, C. and Janowicz, K., 2010. Linking sensor data – why, to what, and how?,in: A.A.D.D.R. Kerry Taylor, ed., Proceedings of the 3rd International work-shop on Semantic Sensor Networks 2010 (SSN10) in conjunction with the 9thInternational Semantic Web Conference (ISWC 2010), CEUR-WS, vol. 668.

    Keßler, C., Maué, P., Heuer, J.T., and Bartoschek, T., 2009. Bottom-up gazetteers:Learning from the implicit semantics of geotags, in: K. Janowicz, M. Raubal,and S. Levashkin, eds., GeoSpatial Semantics, Third International Conference,GeoS 2009, Springer, LNCS, vol. 5892, 83–102.

    Kottman, C. and Reed, C., 2009. The OpenGIS Abstract Specification - Topic 5:Features, Open Geospatial Consortium, Inc.

    Kuhn, W., 2003. Semantic reference systems (guest editorial), International Jour-nal of Geographical Information Science, 17 (5), 405–409.

    Kuhn, W., 2009. Semantic Engineering, in: G. Navratil, ed., Research Trends inGeographic Information Science, Springer, Lecture Notes in Geoinformationand Cartography, 63–76.

    Lehar, S., 2003. The World in Your Head: A Gestalt View of the Mechanism ofConscious Experience, Lawrence Erlbaum.

    Lund, G., 03/22/2010. Definitions of forest, deforestation, afforestation, and refor-estation, Tech. rep., Forest Information Services.

    Manola, F. and Miller, E., 2004. Rdf primer, w3c recommendation 10 february2004., Tech. rep., W3C.

    Mark, D.M., 1993. Toward a theoretical framework for geographic entity types, in:Spatial Information Theory: A Theoretical Basis for GIS, International Confer-ence COSIT ’93, 270–283.

    Mazzetti, P., Nativi, S., and Caron, J., 2009. Restful implementation of geospatialservices, International Journal of Digital Earth, 2 (1), 40–61.

    24

  • Montello, D.R., Goodchild, M.F., Gottsegen, J., and Fohl, P., 2003. Where’s down-town?: Behavioral methods for determining referents of vague spatial queries,Spatial Cognition & Computation, 3 (2), 185–204.

    Montello, D.R. and Sutton, P.C., 2006. An introduction to scientific research meth-ods in geography., Sage Publications.

    Na, A. and Priest, M., 2007. OGC Implementation Specification 06-009r6:OpenGIS Sensor Observation Service (SOS), Open Geospatial Consortium.

    Nebert, D., 2004. SDI Cookbook, Version 2.0.

    Neuhaus, H. and Compton, M., 2009. The Semantic Sensor Network Ontology: AGeneric Language to Describe Sensor Assets, in: AGILE 2009 Pre-ConferenceWorkshop Challenges in Geospatial Data Harmonisation, 02 June 2009, Han-nover, Germany.

    Page, K., De Roure, D., Martinez, K., Sadler, J., and Kit, O., 2009. Linked sensordata: Restfully serving rdf and gml, in: T. K., A. Ayyagari, and D. De Roure,eds., Proceedings of the 2nd International Workshop on Semantic Sensor Net-works (SSN09), CEUR, vol. Vol-522, 49–63.

    Patni, H., Henson, C., and Sheth, A., 2010a. Linked sensor data, in: 2010 Interna-tional Symposium on Collaborative Technologies and Systems, IEEE, 362–370.

    Patni, P., Sahoo, S., Henson, C., and Sheth, A., 2010b. Provenance aware linkedsensor data, in: P. Kärger, D. Olmedilla, A. Passant, and A. Polleres, eds., Pro-ceedings of the Second Workshop on Trust and Privacy on the Social and Se-mantic Web.

    Phuoc, D.L. and Hauswirth, M., 2009. Linked open data in sensor data mashups,in: A.A.D.D.R. Kerry Taylor, ed., Proceedings of the 2nd International Work-shop on Semantic Sensor Networks (SSN09), CEUR, vol. Vol-522, 1–16.

    Probst, F. and Lutz, M., 2004. Giving Meaning to GI Web Service Descriptions,in: International Workshop on Web Services: Modeling, Architecture and In-frastructure (WSMAI 2004).

    Prud’hommeaux, E. and Seaborne, A., 2008. Sparql query language for rdf. w3crecommendation, http://www.w3.org/tr/rdf-sparql-query/., Tech. rep.

    Richardson, L. and Ruby, S., 2007. RESTful Web Services, O’Reilly Media, Inc.

    Schade, S., 2010. Ontology-Driven Translation of Geospatial Data, no. 001 inDissertations in Geographic Information Science (GISDISS), IOS Press.

    25

  • Schade, S. and Cox, S., 2010. Linked data in sdi or how gml is not about trees,in: Proceedings of the 13th AGILE International Conference on GeographicInformation Science - Geospatial Thinking.

    Scheider, S., Janowicz, K., and Kuhn, W., 2009. Grounding geographic categoriesin the meaningful environment., in: K. Hornsby, C. Claramunt, M. Denis, andG. Ligozat, eds., Conference on Spatial Information Theory (COSIT 2009),Springer, LNCS, vol. 5756, 69–87.

    Sequeda, J. and Corcho, O., 2009. Linked stream data: A position paper, in:Proceedings of the 2nd International Workshop on Semantic Sensor Networks(SSN09), CEUR-WS, vol. 522, 148–157.

    Sheth, A., Henson, C., and Sahoo, S., 2008. Semantic Sensor Web, IEEE InternetComputing, 78–83.

    Smith, B. and Mark, D., 2003. Do mountains exist? towards an ontology of land-forms, Environment and Planning B, 30 (3), 411 – 427.

    van Zyl, T., Simonis, I., and McFerren, G., 2009. The sensor web: systems ofsensor systems, International Journal of Digital Earth, 2 (1), 16–30.

    Whiteside, A., 2007. OGC Implementation Specification 06-121r3: OGC Web Ser-vices Common Specification, Open Geospatial Consortium.

    26

  • List of Figures

    1 Concept map with the classes and relations of the Linked SensorData model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

    2 Architecture of the RESTful SOS based on the OX-Framework . . 293 Resolving a URI by the RESTful SOS proxy . . . . . . . . . . . . 30

    27

  • Figure 1: Concept map with the classes and relations of the Linked Sensor Datamodel.

    28

  • Figure 2: Architecture of the RESTful SOS based on the OX-Framework

    29

  • Figure 3: Resolving a URI by the RESTful SOS proxy

    30

    MotivationBackgroundSpatial Data Infrastructures and Sensor Web EnablementThe Semantic Web and Linked DataThe RESTful Approach

    A Linked Data Model for Sensor DataA Linked Data Model for Sensor ObservationsA URI Schema for Linked Sensor DataEstablishing Meaningful URIsEstablishing Links to External Sources

    A Transparent RESTful SOS ProxyDesign of the RESTful SOS ProxyDiscussion of the Proxy Approach

    ApplicationConclusions and OutlookSummaryLessons LearnedOutlook