Resource Discovery for Extreme Scale Collaboration
Benno Lee¹, Patrick West¹, William Smith², Sumit Purohit², Karen Schuchardt², Alan Chappell², Peter Fox¹, Jesse Weaver²
(¹Rensselaer Polytechnic Institute, ²Pacific Northwest National Laboratory)
The amount of data produced in the practice of science is growing rapidly. Despite the accumulation of, and demand for, scientific data, relatively little is actually made available to the broader scientific community. We surmise that the root of the problem is the perceived difficulty of electronically publishing scientific data and associated metadata in a way that makes them discoverable. We propose to exploit Semantic Web technologies and practices to make (meta)data discoverable and easy to publish. We share our experiences in curating metadata to illustrate both the flexibility of our approach and the pain of discovering data in the current research environment. We also recommend, by concrete example, how data publishers can provide their (meta)data by adding a limited amount of markup to HTML pages on the Web. With little additional effort from data publishers, the difficulty of data discovery, access, and sharing can be greatly reduced and the impact of research data greatly enhanced.
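As a rough illustration of what "limited, additional markup" might look like, the sketch below renders a dataset description as an HTML fragment carrying RDFa attributes (`vocab`, `typeof`, `property`). The poster does not state which syntax or vocabulary was recommended, so the use of RDFa with the schema.org `Dataset` type is an assumption made here for illustration, and the function name, dataset name, and URL are hypothetical.

```python
from html import escape


def dataset_to_rdfa(name, description, landing_url):
    """Render one dataset description as an HTML fragment with RDFa attributes.

    Assumption: schema.org terms stand in for whatever shared vocabulary
    a publisher actually adopts (for example, the RDESC ontology).
    """
    return (
        '<div vocab="https://schema.org/" typeof="Dataset">\n'
        f'  <h2 property="name">{escape(name)}</h2>\n'
        f'  <p property="description">{escape(description)}</p>\n'
        f'  <a property="url" href="{escape(landing_url)}">Landing page</a>\n'
        '</div>'
    )


# Hypothetical dataset, used only to show the shape of the output.
snippet = dataset_to_rdfa(
    "Example cloud radar data stream",
    "Hourly reflectivity profiles <demo>",
    "http://example.org/datasets/cloud-radar",
)
print(snippet)
```

The page a human reads stays the same; the extra attributes are what let a crawler extract the name, description, and URL as structured statements.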
RDESC Architecture
TWC/RPI S2S Faceted Browser
Facets on the left allow users to constrain their search based on data resources, GCMD Keywords, Special Measured Parameters, and lat/lon coordinates. The facets changed over time based on the metadata extracted from ingesting the various data resources.
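The constraining behavior described above can be sketched in a few lines. This is not the S2S implementation; it is a minimal in-memory illustration, assuming each dataset is a flat record with a source, a keyword, and a single lat/lon point. All records and field names are hypothetical.

```python
from collections import Counter

# Hypothetical dataset records; real metadata would be far richer.
RECORDS = [
    {"id": "ds1", "source": "ARM", "keyword": "Clouds", "lat": 36.6, "lon": -97.5},
    {"id": "ds2", "source": "GCMD", "keyword": "Aerosols", "lat": 71.3, "lon": -156.6},
    {"id": "ds3", "source": "ARM", "keyword": "Aerosols", "lat": 36.6, "lon": -97.5},
]


def apply_facets(records, selections, bbox=None):
    """Keep records matching every selected facet and, optionally, a bounding box.

    selections maps a facet name to the set of allowed values.
    bbox is (south, west, north, east) in degrees.
    """
    kept = []
    for rec in records:
        if any(rec.get(facet) not in allowed for facet, allowed in selections.items()):
            continue
        if bbox is not None:
            south, west, north, east = bbox
            if not (south <= rec["lat"] <= north and west <= rec["lon"] <= east):
                continue
        kept.append(rec)
    return kept


def facet_counts(records, facet):
    """Count the remaining records per facet value, as a facet panel would display."""
    return Counter(rec[facet] for rec in records)


arm_only = apply_facets(RECORDS, {"source": {"ARM"}})
in_box = apply_facets(RECORDS, {"keyword": {"Aerosols"}}, bbox=(30.0, -100.0, 40.0, -90.0))
print([r["id"] for r in arm_only])
print([r["id"] for r in in_box])
print(facet_counts(arm_only, "keyword"))
```

Because the facets are just keys found in the records, new facets can appear as newly ingested resources contribute new metadata fields.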
RDESC RDF Graphs
An example description of a GCMD dataset as an RDF graph, using the initial ontology.
The current ontology. Ovals represent classes/concepts, and arrows indicate subClassOf relationships. Classes are shaded by age: darker classes were established in the ontology before lighter ones.
An example of an RDF description for an ARM data stream, showing how the ARM measured property hierarchy is used to link data streams to measured properties of interest.
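The role of the measured property hierarchy can be illustrated with a small sketch: a data stream tagged with a specific property should also be found when searching for any broader property above it. The property names, stream names, and the child-to-parent dictionary below are hypothetical and do not reproduce the actual ARM hierarchy or the RDESC ontology.

```python
# Hypothetical measured-property hierarchy, stored as child -> parent.
BROADER = {
    "LiquidWaterPath": "CloudProperty",
    "CloudBaseHeight": "CloudProperty",
    "CloudProperty": "AtmosphericProperty",
    "AerosolOpticalDepth": "AerosolProperty",
    "AerosolProperty": "AtmosphericProperty",
}

# Hypothetical data streams and the specific properties they measure.
STREAM_MEASURES = {
    "streamA": {"LiquidWaterPath"},
    "streamB": {"AerosolOpticalDepth"},
    "streamC": {"CloudBaseHeight", "AerosolOpticalDepth"},
}


def ancestors(prop):
    """Return prop together with every broader property above it."""
    seen = set()
    while prop is not None and prop not in seen:
        seen.add(prop)
        prop = BROADER.get(prop)
    return seen


def streams_for(property_of_interest):
    """Find streams measuring the property itself or anything narrower than it."""
    return sorted(
        stream
        for stream, props in STREAM_MEASURES.items()
        if any(property_of_interest in ancestors(p) for p in props)
    )


print(streams_for("CloudProperty"))
print(streams_for("AtmosphericProperty"))
```

In an RDF store the same effect would typically come from subclass or broader-than reasoning over the graph; the dictionary walk here only mimics that behavior.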
Conclusion
We have emphasized the importance of data publishers providing their (meta)data in a way that makes structural and semantic integration a natural process. This is accomplished by following a shared vocabulary of terms embodied as an ontology, and by expressing metadata as RDF triples that use the ontology. Although this can sound daunting, we showed that doing so is actually quite easy in practice. We demonstrated the flexibility of this approach by curating existing metadata into the recommended format. Publishing (meta)data in this (or a similar) way will ameliorate, at least in part, the poor data sharing practices that currently pervade the practice of science.
Whatever dataset we ingest, we can present its metadata in search and browse interfaces, like S2S above, and provide a splash page for each dataset with the information retrieved from the external system. The metadata retrieved from the various systems can differ considerably.
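To make the idea of curating differing metadata into a shared form more concrete, the sketch below maps records from two sources with different field names onto the same predicates and serializes the result as N-Triples. It is only an illustration: the `http://example.org/vocab#` namespace, the field names, and the field mappings are invented here and are not the RDESC ontology or the real export formats of ARM or GCMD.

```python
EX = "http://example.org/vocab#"  # hypothetical vocabulary namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical per-source field names mapped onto shared predicates.
FIELD_MAPS = {
    "arm": {"stream_name": EX + "title", "measurement": EX + "measuredProperty"},
    "gcmd": {"Entry_Title": EX + "title", "Parameter": EX + "measuredProperty"},
}


def to_triples(source, subject_iri, record):
    """Translate one source-specific record into (subject, predicate, object) triples."""
    triples = [(subject_iri, RDF_TYPE, ("iri", EX + "Dataset"))]
    for field, predicate in FIELD_MAPS[source].items():
        if field in record:
            triples.append((subject_iri, predicate, ("literal", record[field])))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines (only basic string escaping is handled)."""
    lines = []
    for s, p, (kind, value) in triples:
        if kind == "iri":
            obj = f"<{value}>"
        else:
            escaped = (
                value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


arm = to_triples(
    "arm",
    "http://example.org/data/arm-1",
    {"stream_name": "Radar stream", "measurement": "Reflectivity"},
)
gcmd = to_triples(
    "gcmd",
    "http://example.org/data/gcmd-1",
    {"Entry_Title": "Aerosol survey", "Parameter": "Aerosol optical depth"},
)
print(to_ntriples(arm + gcmd))
```

Once both records use the same predicates, a single query or facet definition can cover them, which is what allows one interface to present datasets from otherwise dissimilar systems.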
Acknowledgments: Eric Rozell, Master's student at Rensselaer Polytechnic Institute, now with Microsoft
Sponsors:
US Department of Energy
Glossary:
ARM – Atmospheric Radiation Measurement
OWL – Web Ontology Language
PNNL – Pacific Northwest National Laboratory
RDESC – Resource Discovery for Extreme Scale Collaboration
RDFS – Resource Description Framework Schema
RPI – Rensselaer Polytechnic Institute
SPARQL – an RDF query language
S2S – a faceted web browser
TWC – Tetherless World Constellation at Rensselaer Polytechnic Institute
Resources:
http://rdesc.org - site developed for the RDESC project
http://rdesc.org/2014/ - The RDESC ontology