Streamlining the research data archival process at Johns Hopkins University (IASSIST 2014)

DESCRIPTION

Among other research data management services, Johns Hopkins University Data Management Services (JHUDMS, http://dmp.data.jhu.edu) provides its researchers the opportunity to preserve and share their data through the JHU Data Archive. This archive is a research data-specific repository that can host a wide variety of quantitative and qualitative data, and is both format- and discipline-agnostic. We in JHUDMS have begun archiving research data originating from two NSF-funded engineering research projects. This archiving process has begun with data associated with publications, which may be a typical model for library research data archives. These efforts have been an opportunity to understand the time and effort required for activities that can add value to a research data collection (e.g. discussions with the researcher, development of data flow diagrams, migration of data to non-proprietary formats). We will discuss these collections and the steps taken to create them. It is of benefit to both the researchers and JHUDMS to scope and streamline the archiving process with research data associated with publications in mind. We will discuss our current understanding of the most effective curation activities we can efficiently accomplish, parameters for those activities, and those elements perceived by the researcher to be most valuable.

1

Streamlining the research data archival process at Johns Hopkins University

Dr. Jonathan Petters

Johns Hopkins Data Management Services

5 June 2014

JHU Data Management Services 5 June 2014

2

Interest from Libraries in Providing Research Data Management Services (RDMS)

Offer data management services (54)

100%

68%

Planning to offer DMS (17)

23%

Offer research support services (broadly defined) (73) 100%

ARL SPEC Kit 334: Research Data Management Services, Fearon, David Jr.; Gunia, Betsy; Lake, Sherry; Pralle, Barbara E.; Sallans, Andrew L. July 2013.

3

Digital Collections Management is One Aspect of RDMS

4

Resources Required

http://commons.wikimedia.org/wiki/File:Hazelton_coal_miners.jpg

http://commons.wikimedia.org/wiki/File:Video_archive.jpg

5

A pilot project for the JHU Data Archive (Engineering research)

Archive is discipline and format agnostic

Size Dependent Thermodynamics and Kinetics in Electric Field Mediated Colloidal Crystal Assembly, Edwards, T.D.; Beltran-Villegas, D.J.; Bevan, M.A. Soft Matter, Vol. 9, 9208-9218, 2013

6

Pilots allow for experimentation!

Took several actions in preparing and depositing the data collection into the JHU Data Archive

~75 hours

7

Background Research

http://www.jhu.edu/bevan/index.html

~23 Hours

8

Interactions with the Researcher

~10 Hours

9

Data Workflow Diagram

~16 Hours

2. Use Perturbation Theory
6. Run Monte Carlo Simulations
8. Take Video Microscopy
10. Validate minimum voltage results
14. Take Video Microscopy for kinetic system size effects
16. Fit to Smoluchowski Model

10

Creating/Modifying Data Item Descriptors

Data Collection

Data Products

Descriptors

~10 Hours

11

Format Migration and Annotation

http://www.sigmaplot.com/products/sigmaplot/sigmaplot-details.php

Migrate

Annotate

~9 Hours

12

Uploading Collection

~7 Hours

13

Which actions to focus on?

~75 hours per collection is not scalable. How much can we reduce the hours for these actions?

~75 hours -> ~28 hours

14

Background Research

http://www.jhu.edu/bevan/index.html

~23 Hours -> ~6 Hours

Otherwise we’re fumbling in the dark

15

Interactions with the Researcher

~10 Hours -> ~8 Hours

Must be well-planned and focused

16

Data Workflow Diagram

~16 Hours -> ~5 Hours

2. Use Perturbation Theory
6. Run Monte Carlo Simulations
8. Take Video Microscopy
10. Validate minimum voltage results
14. Take Video Microscopy for kinetic system size effects
16. Fit to Smoluchowski Model

17

Creating/Modifying Data Item Descriptors

Data Collection

Data Products

Descriptors

~10 Hours -> ~3 Hours

18

Format Migration and Annotation

http://www.sigmaplot.com/products/sigmaplot/sigmaplot-details.php

Migrate

Annotate

~9 Hours -> ~0 Hours

19

Uploading Collection

~7 Hours -> ~6 Hours
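
As a quick check on the arithmetic, the per-activity estimates quoted on the slides can be tallied to confirm the overall ~75 hour pilot figure and the ~28 hour streamlined target. The short Python sketch below is only illustrative: the dictionary and function names are made up for this example, and the hour values are the approximate figures from the slides.

```python
# Approximate hours per curation activity, taken from the slides:
# (pilot estimate, streamlined target). Names here are illustrative only.
HOURS = {
    "Background research": (23, 6),
    "Interactions with the researcher": (10, 8),
    "Data workflow diagram": (16, 5),
    "Creating/modifying data item descriptors": (10, 3),
    "Format migration and annotation": (9, 0),
    "Uploading collection": (7, 6),
}


def totals(hours):
    """Return (pilot total, streamlined total) in hours."""
    pilot = sum(p for p, _ in hours.values())
    target = sum(t for _, t in hours.values())
    return pilot, target


def savings_by_activity(hours):
    """Return (activity, hours saved) pairs, largest saving first."""
    saved = [(name, p - t) for name, (p, t) in hours.items()]
    return sorted(saved, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    pilot, target = totals(HOURS)
    print(f"Pilot: ~{pilot} hours, streamlined: ~{target} hours")
    for name, saved in savings_by_activity(HOURS):
        print(f"  {name}: ~{saved} hours saved")
```

Ranking the activities by hours saved shows where the proposed reduction is concentrated: background research, the data workflow diagram, and format migration and annotation account for most of it.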

20

Thanks!

JHU Data Management Services - http://dmp.data.jhu.edu/ - datamanagement@jhu.edu

JHU Data Archive - http://archive.data.jhu.edu

Jonathan Petters - jpetter1@jhu.edu
