
Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License

Infrastructure, Curation, and Metadata

Mark A. Parsons

NSF Workshop on Cyberinfrastructure for Polar Sciences, 10 September 2013

From Interregional Highways: Message from the President of the United States Transmitting a Report of the National Interregional Highway Committee, Outlining and Recommending a National System of Interregional Highways, 12 Jan. 1944. CC-BY Eric Fischer http://www.flickr.com/photos/walkingsf/8270270785/

Infrastructure is

Relationships, interactions, and connections between humans, technologies, and institutions

Interchange. CC-BY-SA Steven Vance http://www.flickr.com/photos/jamesbondsv/8475376363/

Ranch Exit. CC-BY-SA Ken Lund http://www.flickr.com/photos/kenlund/2381991900/

Understanding Infrastructure: Dynamics, Tensions, and Design

Paul Edwards, Steven Jackson, Geoffrey Bowker, and Cory Knobel, 2007

http://hdl.handle.net/2027.42/49353

Data wrangler

Data Management for CLPX—AGU 11 December 2003

Example of Snow Pit Data

We start with this:

And create this:

Summary Table

Density Table

Stratigraphy Table

Temperature Table

Curation is

continual and conscious improvement of data.

Curation matters

• Embedding “data wranglers” in the field significantly improved data quality and completeness: about 20% of the data sheets had issues corrected in the field that could not have been corrected later.

• Reveals and documents tacit knowledge. Can even uncover science questions.

• Integrated much more than the field data, using common grids and formats.

• “Standard” formats often do not exist and need to be created by the community. Sometimes you need multiple formats.

• A mediated relationship between user and collector.

• Need to include curators in data collection even if they are not in the field.

• Get to know your local curator.

Metadata are

Everything necessary to make data independently understandable by a designated community.

Infrastructure is composed of relationships.

Curators create and enhance relationships.

Metadata are often the information shared (often tacitly) in those relationships.

A standard sensor format?

• Temporally varying sensors (e.g. a borehole)

• Spatially varying sensors (e.g. a transect)

• Temporally and spatially varying sensors (e.g. a drifting buoy)
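One way to picture what a common format would have to cover is a single record type that tags every reading with both a time and a position, so the three cases above differ only in which coordinate changes. The sketch below is illustrative, not a proposed standard: the `Observation` class, the `variation` function, the (latitude, longitude) convention, and all values are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class Observation:
    """One reading: a value tagged with when and where it was taken."""
    value: float
    time: datetime
    position: Tuple[float, float]  # (latitude, longitude) in degrees; assumed convention


def variation(observations: List[Observation]) -> str:
    """Classify a series by which coordinates change across its readings."""
    varies_in_time = len({o.time for o in observations}) > 1
    varies_in_space = len({o.position for o in observations}) > 1
    if varies_in_time and varies_in_space:
        return "temporal and spatial"
    if varies_in_time:
        return "temporal"
    if varies_in_space:
        return "spatial"
    return "fixed"


t0 = datetime(2013, 9, 10, 12, 0)
t1 = datetime(2013, 9, 10, 13, 0)

# Illustrative values only, not real measurements.
borehole = [Observation(-3.1, t0, (70.0, -150.0)), Observation(-3.0, t1, (70.0, -150.0))]
transect = [Observation(0.42, t0, (70.0, -150.0)), Observation(0.40, t0, (70.0, -149.9))]
buoy = [Observation(-1.8, t0, (85.0, 10.0)), Observation(-1.7, t1, (85.1, 10.4))]

for name, series in [("borehole", borehole), ("transect", transect), ("buoy", buoy)]:
    print(name, "->", variation(series))
```

Carrying time and position on every reading is redundant for a fixed borehole, but it lets one structure hold all three cases. A real community format would also need units, depth or height, and sensor metadata, which this sketch leaves out.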


NOAA

Thank You

parsom3@rpi.edu
