The Best of Data The Worst of Data… Experiences with small project data collection. D. Gochis, NCAR/RAL


Tales of Joy:

~dozen field campaigns over the last 10 years all with different scopes, types of data, levels of complexity, degrees of multi-disciplinarity…

Tales of Joy:

Attributes of successful efforts…

– Data ‘engineering’ is at the table from day 1

– Methods for data collection, archival and access are clear from the beginning and use relatively mature technologies

– ‘Raw’ data is archived either in real time or immediately after collection

– Efficient methods to browse or ‘discover’ data within each project archive…critical for the revolving door of students and post-docs

– Downloading data is controlled so that the provider can help the user interpret the data
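As one illustration of archiving ‘raw’ data immediately after collection, here is a minimal Python sketch that copies a raw file into an archive directory and records its checksum in a manifest. It is not a tool from any of the projects described here: the function name `archive_raw_file`, the `MANIFEST.jsonl` file name, and the sample logger file are all hypothetical.

```python
import hashlib
import json
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def archive_raw_file(src: Path, archive_dir: Path) -> dict:
    """Copy one raw file into the archive and log its SHA-256 checksum."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / src.name
    if dest.exists():
        # Raw data should never be silently replaced.
        raise FileExistsError(f"refusing to overwrite archived raw file: {dest}")
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    record = {
        "file": dest.name,
        "sha256": hashlib.sha256(dest.read_bytes()).hexdigest(),
        "bytes": dest.stat().st_size,
        "archived_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
    # Append-only manifest (hypothetical name), one JSON record per line.
    with open(archive_dir / "MANIFEST.jsonl", "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


# Demonstration with a temporary directory and made-up file content.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "logger_download.dat"
    raw.write_text("2012-07-01 00:00,12.3\n")
    demo_record = archive_raw_file(raw, Path(tmp) / "archive" / "raw")
    print(demo_record["file"], demo_record["bytes"])
```

The checksum lets a later student confirm that a copy of the file still matches what came off the instrument.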

Tales of Joy: BEACHON-MEF through RAL Winter Weather site

• Easy navigability/browsing

• Real-time monitoring (…or soon after upload to database)

• Automated data availability reporting

• Password-controlled downloading

• Capitalized on an existing capability
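A browsable inventory does not need much machinery. The sketch below is a hypothetical example, not the RAL Winter Weather site's implementation: it assumes an archive laid out as `site/instrument/file` and builds a plain-text listing from the directory names. The functions `build_catalog` and `format_catalog` and the sample site names are invented for illustration.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def build_catalog(archive_root: Path) -> dict:
    """Group archived files by (site, instrument), taken from directory names."""
    catalog = defaultdict(list)
    for path in sorted(archive_root.rglob("*")):
        if not path.is_file():
            continue
        parts = path.relative_to(archive_root).parts
        if len(parts) < 3:
            continue  # skip files outside the assumed site/instrument/ layout
        site, instrument = parts[0], parts[1]
        catalog[(site, instrument)].append("/".join(parts[2:]))
    return dict(catalog)


def format_catalog(catalog: dict) -> str:
    """Render the catalog as an indented plain-text listing."""
    lines = []
    for (site, instrument), files in sorted(catalog.items()):
        lines.append(f"{site} / {instrument}: {len(files)} file(s)")
        lines.extend(f"    {name}" for name in files)
    return "\n".join(lines)


# Demonstration on a throwaway directory tree with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in (
        "siteA/raingauge/2012-07-01.csv",
        "siteA/raingauge/2012-07-02.csv",
        "siteB/soilmoisture/2012-07-01.csv",
    ):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("placeholder\n")
    demo_catalog = build_catalog(root)
    print(format_catalog(demo_catalog))
```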

Tales of Joy: EOL/NAME Field Data Catalog

• ‘Big data’ management but for a disparate group of small projects

• Minimal overhead for data providers

• ‘Perpetual’ access to individual measurements (not just large synthesis datasets) some 10 years after data collection

• Reasonable data browsing/discovery capabilities

Tales of Woe:

~dozen field campaigns over the last 10 years all with different scopes, types of data, levels of complexity, degrees of multi-disciplinarity…yet lacking consistent data plans…

Tales of Woe:

Attributes of ‘painful’ efforts…

– No clear pathway for data (result: to each their own and no one knows what to do)

– No clear archiving strategy (result: lost data)

– No browse-ability or discoverability component (result: lots of duplicative emails…lots of wasted student time)

– No data format plan (result: umpteen different formats to convert between)
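To make the format problem concrete, here is a small Python sketch of the kind of conversion tool that a format plan reduces the need for. It assumes two hypothetical logger dialects (`logger_a` and `logger_b`, with invented column names and timestamp layouts) and maps both onto one common layout with ISO 8601 timestamps. None of these names come from the projects discussed here.

```python
import csv
import io
from datetime import datetime

# Hypothetical source dialects: delimiter, column names, timestamp layout.
SOURCE_FORMATS = {
    "logger_a": {
        "delimiter": ",",
        "time_col": "timestamp",
        "time_fmt": "%m/%d/%Y %H:%M",  # e.g. 07/01/2012 01:30
        "value_col": "precip_mm",
    },
    "logger_b": {
        "delimiter": "\t",
        "time_col": "TIME",
        "time_fmt": "%Y-%j %H:%M",  # year and day of year, e.g. 2012-183 01:30
        "value_col": "RAIN",
    },
}


def convert_to_common(text: str, source_format: str) -> list:
    """Parse one delimited file and return rows in the common layout."""
    spec = SOURCE_FORMATS[source_format]
    reader = csv.DictReader(io.StringIO(text), delimiter=spec["delimiter"])
    rows = []
    for row in reader:
        when = datetime.strptime(row[spec["time_col"]], spec["time_fmt"])
        rows.append({
            "time_iso": when.isoformat(),
            "value": float(row[spec["value_col"]]),
        })
    return rows


# The same made-up observation written in both dialects.
sample_a = "timestamp,precip_mm\n07/01/2012 01:30,0.2\n"
sample_b = "TIME\tRAIN\n2012-183 01:30\t0.2\n"
converted_a = convert_to_common(sample_a, "logger_a")
converted_b = convert_to_common(sample_b, "logger_b")
print(converted_a == converted_b)
```

Every additional format in a project means another entry in a table like `SOURCE_FORMATS` that someone has to write and maintain.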

Synthesis of Key Attributes for Managing Small Data:

1. Low barriers to entry for inputting and updating data

2. Utilize data standards (formats, etc.) but provide conversion tools for those standards (getting easy with Python…)

3. Cross-referencing and browsing

4. Automated data availability diagnostics

5. Longevity for multiple student generations
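Item 4 can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not an existing tool: it assumes records arrive at a fixed interval (30 minutes here, so 48 per day) and reports the fraction of expected records present for each day. The function name `daily_availability` and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_availability(timestamps, expected_per_day):
    """Return {date: fraction of expected records present}, capped at 1.0."""
    # set() drops duplicate records so they cannot inflate the count.
    counts = Counter(ts.date() for ts in set(timestamps))
    if not counts:
        return {}
    day, last = min(counts), max(counts)
    report = {}
    while day <= last:
        # Days with no records at all are reported as 0.0 rather than omitted.
        report[day] = min(counts.get(day, 0) / expected_per_day, 1.0)
        day += timedelta(days=1)
    return report


# Made-up record times: one complete day, one missing day, one half day.
records = [datetime(2012, 7, 1) + timedelta(minutes=30 * i) for i in range(48)]
records += [datetime(2012, 7, 3) + timedelta(minutes=30 * i) for i in range(24)]

availability = daily_availability(records, expected_per_day=48)
for day, fraction in availability.items():
    print(f"{day}  {fraction:6.1%}")
```

Run on a schedule, a report like this flags a dead sensor within a day instead of at the end of the field season.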

Final thought:

• A lot of emphasis on ‘Big Data’ these days…but an overwhelming amount of research, particularly NSF-funded research, comes from small projects

• Most ‘Big Data’ efforts already have substantial engineering support (Rich Get Richer…)

• New White House directive is a game-changer for small projects

• Biggest impact on the community would be the development and availability of tools for small projects to archive and access their data