
Poster Semantic data integration proof of concept


Semantic Integration of Ecological Data Using the SERONTO Ontology

Introduction

Nicolas Bertrand (1), Herbert Schentz (2), Bert Van der Werf (3), Barbara Magagna (2), Johannes Peterseil (2), Sue Rennie (1)
(1) Centre for Ecology and Hydrology (UK), (2) Umweltbundesamt (Austria), (3) ALTERRA (Netherlands)

ALTER-Net is a network of excellence for Long-Term Biodiversity, Ecosystem and Awareness Research spanning 24 institutions in 7 European countries. The aim is to develop an integrative research framework in biodiversity research and monitoring to address biodiversity issues at a European scale.

A key objective is the development of a framework for distributed data, information and knowledge management. The major challenge in achieving this objective is the provision of consistent data access and querying across multiple institutions and diverse data types.

Semantic approaches to data integration are seen as an enabling mechanism for carrying out integrated socio-ecological science at a global scale. The Socio-Ecological Research and Observation oNTOlogy (SERONTO) builds on the experience of Umweltbundesamt (Federal Environment Agency, Austria) in developing a semantic database system for managing environmental data.

To validate the development of SERONTO and its uses for future data integration, a proof of concept study was conducted.

• SERONTO is a core ontology for ecological observations and measurements
• SERONTO allows annotation of WHAT is observed, WHERE, WHEN and HOW
• SERONTO allows annotation of how the investigation items were SELECTED from populations
• SERONTO is extended by DOMAIN ontologies
• SERONTO integrates reference lists
• SERONTO has a versioning system
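The WHAT / WHERE / WHEN / HOW / SELECTED annotations listed above can be pictured as fields attached to each observed value. The following is a minimal Python sketch of that idea only; the class name, field names and example values are hypothetical and are not taken from the SERONTO ontology itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """Hypothetical record illustrating the annotations named on the poster."""
    what: str       # WHAT is observed (the parameter)
    where: str      # WHERE it was observed (site or plot)
    when: date      # WHEN it was observed
    how: str        # HOW it was observed (the method)
    selected: str   # how the investigation item was SELECTED from the population
    value: float
    unit: str


def describe(obs: Observation) -> str:
    """Render the annotation as one readable sentence."""
    return (f"{obs.what} = {obs.value} {obs.unit} at {obs.where} on "
            f"{obs.when.isoformat()} ({obs.how}; {obs.selected})")


# Invented example values, for illustration only.
example = Observation(
    what="cover of Fagus sylvatica",
    where="plot A1",
    when=date(2008, 6, 12),
    how="visual estimate",
    selected="random sample of plots",
    value=35.0,
    unit="percent",
)
print(describe(example))
```

Keeping the selection procedure next to the value is what lets a later analysis decide whether observations from different sources are statistically comparable.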

Socio-Ecological Research and Observation Ontology

More information is available on the ALTER-Net I6 Wiki: http://www5.umweltbundesamt.at/ALTERNet/index.php?title=Main_Page

Proof of Concept

[Figure: proof-of-concept workflow in three steps: import SERONTO, connect databases, query. Connected databases: JOKL (cultural landscapes), JODI (vegetation), 2835 (floodplain), Pythia (vegetation), ECN Summary Database.]

What is SERONTO?

Key Concepts in SERONTO

Outstanding questions
• Can SWRL / F-Logic be used interchangeably?
• Mapping of an OWL ontology sub property
• Governance of global and local reference lists
• Mapping requires knowledge of the connected database, ontology and F-Logic. Effort involved is significant.
• Maintenance of mappings between reference lists is crucial
• Coupling of value sets and units as well as calculations must be further tested.
• Dealing with Globally Unique Identifiers
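The point about maintaining mappings between reference lists can be made concrete with a small consistency check. This is a sketch under stated assumptions: the species codes, the global list and the function name `check_mapping` are all invented for illustration, and the poster does not describe how such checks were actually carried out.

```python
# Hypothetical global reference list (e.g. accepted species names).
GLOBAL_SPECIES = {"Fagus sylvatica", "Quercus robur"}

# Hypothetical local codes used by one connected database.
LOCAL_CODES = {"FAGSYL", "QUEROB", "PICABI"}

# Hypothetical local-to-global mapping; the second target is misspelled
# on purpose to show what a stale or broken mapping looks like.
LOCAL_TO_GLOBAL = {
    "FAGSYL": "Fagus sylvatica",
    "QUEROB": "Quercus robor",
}


def check_mapping(local_codes, mapping, global_list):
    """Return (local codes with no mapping, codes mapped to unknown targets)."""
    missing = sorted(code for code in local_codes if code not in mapping)
    dangling = sorted(code for code, target in mapping.items()
                      if target not in global_list)
    return missing, dangling


missing, dangling = check_mapping(LOCAL_CODES, LOCAL_TO_GLOBAL, GLOBAL_SPECIES)
print("unmapped local codes:", missing)
print("codes mapped to unknown global entries:", dangling)
```

Running a check of this kind whenever either list changes is one way to notice that a mapping has drifted before it silently drops records from an integrated query.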

SERONTO Results

F-Logic / OntoStudio
• F-Logic is an object-oriented database language capable of expressing semantic queries.
• OntoStudio is an ontology management system that can use F-Logic to connect databases (Oracle, MS-SQL, DB2, MySQL), Excel tables and folder structures of the file system.
• OntoStudio can import OWL ontologies into F-Logic.
• F-Logic can then be used to query the connected systems.
• F-Logic differs significantly from OWL Description Logic (Closed World semantics vs Open World semantics).
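The closed-world versus open-world distinction in the last bullet can be illustrated with a toy fact base. This is a minimal Python sketch, not F-Logic or OWL; the facts and the function names are invented for illustration.

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical fact base: one recorded species occurrence.
FACTS = {("plot_A1", "hasSpecies", "Fagus sylvatica")}


def closed_world(query: Triple) -> bool:
    """Closed world: anything not recorded is taken to be false."""
    return query in FACTS


def open_world(query: Triple) -> Optional[bool]:
    """Open world: anything not recorded is unknown (None), not false."""
    return True if query in FACTS else None


unrecorded = ("plot_A1", "hasSpecies", "Quercus robur")
print(closed_world(unrecorded))  # absent fact treated as false
print(open_world(unrecorded))    # absent fact treated as unknown
```

The difference matters for ecological records: under closed-world semantics a species missing from a survey table reads as an absence, whereas under open-world semantics it reads only as "not stated".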

Scope
The scope of the proof of concept was to test:
• The feasibility of mapping relational databases to SERONTO
• Querying of the connected database(s) from the semantic concepts captured in SERONTO

The requirements for accepting the proof of concept were:

• The databases must have different structures and must have been developed independently of SERONTO;
• The databases must feature reference lists (e.g. species lists);
• The database structures must not be altered as a result of the integration work;
• New concepts may be imported into SERONTO as and when required;
• The databases must contain data relevant to Long Term Ecological Research (e.g. vegetation surveys, records of species occurrences, measurement of biotic and abiotic components).
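The requirement that database structures stay unaltered implies that the mapping lives outside the database, as a description of how existing tables correspond to ontology concepts. The proof of concept did this with OntoStudio and F-Logic; the sketch below only mimics the idea in Python with an in-memory SQLite database. The table names, column names, the `MAPPING` structure and the `instances` helper are all hypothetical.

```python
import sqlite3

# A stand-in for an existing, independently designed database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (id INTEGER PRIMARY KEY, latin_name TEXT);
CREATE TABLE releve (plot TEXT, species_id INTEGER, cover_pct REAL);
INSERT INTO species VALUES (1, 'Fagus sylvatica');
INSERT INTO releve VALUES ('A1', 1, 35.0);
""")

# The mapping is kept outside the database: a read-only query plus the
# concept and property names its result columns correspond to.
MAPPING = {
    "concept": "Observation",
    "sql": ("SELECT r.plot, s.latin_name, r.cover_pct "
            "FROM releve r JOIN species s ON s.id = r.species_id"),
    "properties": ["where", "what", "value"],
}


def instances(connection, mapping):
    """Yield one concept instance (as a dict) per row of the mapped query."""
    for row in connection.execute(mapping["sql"]):
        yield {"concept": mapping["concept"],
               **dict(zip(mapping["properties"], row))}


for instance in instances(conn, MAPPING):
    print(instance)
```

Because the mapping only issues SELECT statements, the source schema and its reference list (here the `species` table) are left exactly as their owners designed them.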

List of parameters

Results
• We could import SERONTO and Units Ontologies into OntoStudio
• We could import database schemata into OntoStudio
• We could do simple and complex database queries
• We could readily extend SERONTO classes from the contents of databases
• We could map databases to SERONTO graphically where relations between tables and concepts were appropriate
• We could create more complex mappings, including at instance level, using the F-Logic syntax
• We could query multiple connected databases from SERONTO
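The last result, querying several connected databases from one shared concept, can be sketched as follows. This is an illustration in Python with two in-memory SQLite databases, not the OntoStudio setup used in the study; the schemas, the species code `FAGSYL`, the `SOURCES` list and the `occurrences` function are assumptions made up for the example.

```python
import sqlite3


def make_db(script):
    """Create an in-memory SQLite database from a schema-and-data script."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(script)
    return connection


# Two hypothetical databases with different structures.
db_a = make_db("""
CREATE TABLE veg (plot TEXT, taxon TEXT, cover REAL);
INSERT INTO veg VALUES ('A1', 'Fagus sylvatica', 35.0);
""")
db_b = make_db("""
CREATE TABLE occurrence (site_code TEXT, species_code TEXT);
INSERT INTO occurrence VALUES ('S7', 'FAGSYL');
""")

# Database B uses local codes, so it needs a local-to-global reference list.
SPECIES_B = {"FAGSYL": "Fagus sylvatica"}

# Each source: label, connection, query returning (where, local species),
# and a function translating the local species value to the shared name.
SOURCES = [
    ("db_a", db_a, "SELECT plot, taxon FROM veg", lambda name: name),
    ("db_b", db_b, "SELECT site_code, species_code FROM occurrence", SPECIES_B.get),
]


def occurrences(species):
    """Return (source, location) pairs where the shared species name occurs."""
    hits = []
    for label, connection, sql, to_global in SOURCES:
        for where, local_value in connection.execute(sql):
            if to_global(local_value) == species:
                hits.append((label, where))
    return hits


print(occurrences("Fagus sylvatica"))
```

The caller asks only for the shared species name; which table, column and coding scheme each database uses is resolved by the per-source mapping.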

Outlook

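[Figure: outlook architecture. Labelled components: Distributed Socio-Ecological Data; SERONTO & Domain Ontologies (common concepts and domain knowledge); Portal (discover, retrieve and integrate data); Distributed Data Mining with local tools. Other labels recoverable from the poster graphics: a chart comparing Region 1 and Region 2 on an axis from 0 to 100, and the table names parameter_method, parameter, method, Value_sets, Unit and Scale.]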