A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations. Yolanda Gil (1), Daniel Garijo (1), Varun Ratnakar (1), Deborah Khider (2), Julien Emile-Geay (2) and Nicholas McKay (3). (1) Information Sciences Institute, University of Southern California; (2) Department of Earth Sciences, University of Southern California; (3) School of Earth Sciences and Environmental Sustainability, Northern Arizona University. @yolandagil, @dgarijov. {gil,dgarijo}@isi.edu. ISWC In-Use Track, Vienna, 2017.


Page 1: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Yolanda Gil (1), Daniel Garijo (1), Varun Ratnakar (1), Deborah Khider (2), Julien Emile-Geay (2) and Nicholas McKay (3)

(1) Information Sciences Institute, University of Southern California
(2) Department of Earth Sciences, University of Southern California
(3) School of Earth Sciences and Environmental Sustainability, Northern Arizona University

@yolandagil, @dgarijov

{gil,dgarijo}@isi.edu

ISWC In-Use Track, Vienna, 2017

Page 2: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Data reuse in paleoclimate and environmental sciences

A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations (Gil et al., ISWC In-Use Track, Vienna, 2017)

• Data is collected by independent scientists using idiosyncratic notation and protocols.

• Hundreds of types of observations
  • Physical samples may come from ice, trees, coral, marine sediment, etc.

• Hundreds of types of measures
  • Temperature, rainfall, pH, etc.

• Diversity is so great that no one dares to embark on standards.

• Typical situation for environmental sciences (water modeling, hydrology, etc.)

Page 3: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Challenges

• How can we leverage basic core agreements?

• How can scientists create the new properties they want to use to describe their data?

• How can we facilitate consensus on new extensions to core agreements?

• How can the scientific community benefit immediately from the continued expansion of core agreements?

• How can new extensions to core agreements be coordinated and maintained?


Page 4: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Approach: Controlled crowdsourcing

• A metadata crowdsourcing platform

• Controlled standardization process for new metadata properties

• Framework for updating metadata of previously annotated datasets


Page 5: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

A Framework for Controlled Crowdsourcing

[Figure: diagram of the framework. Components: Annotation Framework, Revision Framework, Update Framework. Actors: basic editor, advanced editor, editorial board. Artifacts: datasets, core ontology, crowd vocabulary, extended crowd vocabulary, dataset metadata, dataset metadata store, snapshot repository, ontology repository (Version 0, Version 1). Flows: data annotation; requests and issues (including for the core ontology); changes to crowd vocabulary; crowd vocabulary revision; core ontology revision; changes (monotonic and non-monotonic); snapshot; update; load/reload; reload datasets.]
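The distinction between monotonic and non-monotonic changes in the diagram can be illustrated with a minimal sketch. It assumes that a monotonic change only adds a term (so existing annotations stay valid), while a non-monotonic change renames or removes a term (so annotated datasets must be updated or reviewed). The `Change` class, the change kinds, the property names, and the `apply_changes` function are all hypothetical and are not taken from the Linked Earth Platform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One vocabulary change (hypothetical model, not the platform's own)."""
    kind: str                      # "add", "rename" or "remove"
    name: str                      # property the change applies to
    new_name: Optional[str] = None  # only used for "rename"


def is_monotonic(change):
    # Assumption: adding a term never invalidates existing annotations.
    return change.kind == "add"


def apply_changes(metadata, changes):
    """Return (updated_metadata, properties_needing_review).

    Renames are rewritten automatically; removed properties are kept
    but flagged so that an editor can decide what to do with them.
    """
    updated = dict(metadata)       # leave the caller's dict untouched
    needs_review = []
    for change in changes:
        if is_monotonic(change) or change.name not in updated:
            continue               # nothing to update for this dataset
        if change.kind == "rename":
            updated[change.new_name] = updated.pop(change.name)
        elif change.kind == "remove":
            needs_review.append(change.name)
    return updated, needs_review
```

For example, applying a rename of `archiveTypes` to `archiveType` rewrites that key in a dataset's metadata, while an added term leaves the dataset as it was. Flagging removed properties rather than deleting them is a design choice of this sketch: it keeps the update step non-destructive.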


Page 6: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Specifying metadata for a dataset

[Screenshot: dataset annotation page. Callouts: data download, completed properties, missing properties, crowd properties, category, category annotation.]


Page 7: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Fostering standardization

Suggestion of renames

Autocompletion
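A minimal sketch of how these two mechanisms might nudge contributors toward existing terms, assuming a flat list of crowd property names. The vocabulary entries, function names, and similarity cutoff are hypothetical; the platform's actual implementation is not described on the slide.

```python
from difflib import get_close_matches

# Hypothetical crowd vocabulary; these property names are illustrative only.
CROWD_PROPERTIES = ["archiveType", "collectedFrom", "hasUncertainty", "measuredOn"]


def autocomplete(prefix, vocabulary=CROWD_PROPERTIES, limit=5):
    """Return existing property names starting with the typed prefix."""
    lowered = prefix.lower()
    hits = [name for name in vocabulary if name.lower().startswith(lowered)]
    return hits[:limit]


def suggest_rename(new_name, vocabulary=CROWD_PROPERTIES, cutoff=0.8):
    """Return the most similar existing property, or None if none is close.

    Offering the existing name before a near-duplicate is created is one
    way to keep the crowd vocabulary from fragmenting.
    """
    matches = get_close_matches(new_name, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this vocabulary, `autocomplete("arch")` returns `["archiveType"]`, and `suggest_rename("archiveTypes")` returns `"archiveType"`, so a contributor about to create a near-duplicate is pointed to the existing property instead.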


Page 8: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Implementation: The Linked Earth Platform

• Dynamic map-based visualizations

• Dataset annotation interface

• Author credit

• Polls for decision making

• Community discussions


Page 9: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

The Linked Earth Ontology - Overview

• Modular design (Core modules + crowd extensions)

http://linked.earth/ontology#

[Figure: modules of the Linked Earth Ontology]

• Core ontology: Linked Paleo Data Ontology (LiPD)

• Classes shown: ProxyArchive, ProxyObservation, ProxySensor, Instrument, InferredVariable

• Extensions shown: (Coral, Wood, Lake Sediment…), (Spectral, Chemical…), (Rock, Snow, Tree…), (Spectrometer, Spectroscope…), (Precipitation, time…)

• Crowd Vocabulary Extension

• Reused vocabularies: Schema.org (Dataset), Wgs_84 (Position), Geosparql (Position), SSN (Observation), FOAF (Person), PROV (Derivation), DC (Publication)


Page 10: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

The Linked Earth Ontology - Versioning

• Working Groups discuss new changes to the ontology

• Once a new version is approved, the core vocabulary is released and versioned outside the wiki:
  • Naming scheme: http://linked.earth/ontology/module/version
  • Example: http://linked.earth/ontology/core/1.2.0

• The latest version preserves its URI (it aggregates all modules):
  • http://linked.earth/ontology#

• Each version is documented and published in both machine-readable and human-readable form
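The naming scheme can be sketched as a small helper that builds a versioned module URI. The base URI and the `core/1.2.0` example come from the slide; the function names and the check that a version consists of three dot-separated integers are assumptions made for illustration.

```python
import re

BASE_URI = "http://linked.earth/ontology"

# Assumption: versions are three dot-separated integers, as in the example 1.2.0.
_VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")


def versioned_uri(module, version):
    """Build the URI of one released module version."""
    if not _VERSION_PATTERN.match(version):
        raise ValueError(f"unexpected version string: {version!r}")
    return f"{BASE_URI}/{module}/{version}"


def latest_uri():
    """URI of the latest version, which aggregates all modules."""
    return f"{BASE_URI}#"
```

Here `versioned_uri("core", "1.2.0")` yields `http://linked.earth/ontology/core/1.2.0`, matching the example above, while `latest_uri()` always returns the stable `http://linked.earth/ontology#`.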


Page 11: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Organizing the community

• Basic editors

• Advanced editors

• Editorial board

• Working groups

• Periodic face-to-face events for community engagement

• Engagement through Twitter polls and online surveys
  • The editorial board requests votes on candidate standard properties


Page 12: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Current Situation

Page Distribution

Page type           Pages
Datasets              699
ProxyArchive          207
ProxyObservation       76
ProxySensor            63
Instrument             45
InferredVariable     1207
MeasuredVariable     3348
Working Group          12
Location              659
Person                524
Publication           875

• More than 14000 pages

• More than 150 registered users (50 active)

• One full iteration and revision of the ontology

• Identified leaders for working groups


Page 13: A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations

Conclusions and Future Work

An approach for on-the-fly ontology extensions for scientific metadata annotations

• Foster standardization through renaming, autocompletion and voting

• Editorial process to review core standard with new crowd terms

• Framework for updating dataset properties when a new standard is released

Ongoing work:

• Support editorial process for core ontology revisions

• Automating updates to the ontology documentation

• Further automation of the update framework

