Open access day

Why would I ever let a librarian* anywhere near my research data?

C. Tobin Magle, PhD

Data Management Specialist

Morgan Library

Libraries = books

Libraries = digital information

Libraries = data?

Libraries are about organizing and describing things.

Why not data?

Libraries = expertise

Libraries = access

Short answer: open data

Librarians have the skills to help researchers plan, organize, describe and share research data

Adelaide State Library of South Australia

Questions

• Why open data?

• What is the role of libraries?

• How can CSU libraries help?

data management != data sharing

• but the same principles apply to both

Why should I care about data management?

Rinehart, AK. “Getting emotional about data” College & Research Libraries News September 2015 vol. 76 no. 8 437-440

What is data management?

The policies, practices and procedures needed to manage the storage, access and preservation of data produced from a research project

Everything* is digital

• Needs new skills

• Data are ephemeral

• Facilitates sharing

*ok not everything, but most things

More researchers

https://www.nsf.gov/statistics/2016/nsf16300/digest/nsf16300.pdf

See arXiv:1402.4578 for details

Figure panels: Working email; Response (if email working); Status of data (if response); Data are extant (if status known)

doi:10.1016/j.cub.2013.11.014

We are losing vast amounts of data


Research funding is tight

http://www.bu.edu/research/articles/funding-for-scientific-research/

Funders want to do more with less

http://figshare.com/blog/2015_The_year_of_open_data_mandates/143

White House’s 2013 OSTP memorandum

“The Obama Administration is committed to the proposition that citizens deserve easy access to the results of research their tax dollars have paid for. That’s why, in a policy memorandum released today, OSTP Director John Holdren has directed Federal agencies with more than $100M in R&D expenditures to develop plans to make the results of federally funded research freely available to the public—generally within one year of publication.”

http://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research

NSF post-award requirements

“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”

http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4

It’s good for science

• Improves research reproducibility

• Improves efficiency

• Spurs innovation

It’s good for you

• You are the future data user

• Your data get used (and cited)

• Exposure to collaborators

• More competitive grants

Where does data management fit into research?

Throughout the whole research cycle

The research cycle

Hypothesis → Experimental design → Raw data → Tidy data → Results → Article

• Data Management Plans

• Cleaning

• Analysis

• Sharing

• Open Data

• Code, Reproducible Research

• Reuse

How are libraries getting involved?

• We’re NOT the data police

• We want to help!

• We provide services

Data Management Services

https://lib.colostate.edu/services/data-management

Workshops

One on one meetings

• How do I write a DMP?

• How do I organize my data?

• How do I clean and format my data?

• How do I automate my analyses?

• How do I get my data ready to share?

Data archiving service

• CSU Digital Repository

• 78 Datasets

• Satisfy requirements for manuscripts and grants

• At no cost <1 TB

• $150/TB for 5 years

• $300/TB for >5 years
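
For budgeting in a grant proposal, the pricing above can be turned into a rough estimate. The sketch below is only an illustration: the function name is hypothetical, and it assumes that anything under 1 TB is free, that larger datasets are billed on their full size rounded up to whole terabytes, and that "for 5 years" covers any retention period up to and including 5 years. Confirm the actual billing rules with Data Management Services before relying on the numbers.

```python
import math

# Rates taken from the slide; how they combine is an assumption.
FREE_LIMIT_TB = 1.0
RATE_UP_TO_5_YEARS = 150  # USD per TB
RATE_OVER_5_YEARS = 300   # USD per TB


def estimate_archiving_cost(size_tb: float, years: float) -> int:
    """Return a rough archiving cost in USD (hypothetical helper)."""
    if size_tb < 0 or years <= 0:
        raise ValueError("size_tb must be >= 0 and years must be > 0")
    # Assumption: anything under 1 TB is archived at no cost.
    if size_tb < FREE_LIMIT_TB:
        return 0
    # Assumption: billing is per whole TB, rounded up, on the full size.
    billable_tb = math.ceil(size_tb)
    rate = RATE_UP_TO_5_YEARS if years <= 5 else RATE_OVER_5_YEARS
    return billable_tb * rate


if __name__ == "__main__":
    for size, years in [(0.5, 10), (2, 5), (2.5, 10)]:
        cost = estimate_archiving_cost(size, years)
        print(f"{size} TB for {years} years: ${cost}")
```

Under these assumptions, a 2 TB dataset kept for 5 years would cost $300, and a 2.5 TB dataset kept for 10 years would be billed as 3 TB.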

Large Projects

• Shortgrass Steppe – LTER (32) http://hdl.handle.net/10217/100254

• Yellowstone Willows LTREB (21) http://hdl.handle.net/10217/173646

• RAPID: Characterizing the Response of a Burned Landscape to an Unusual and Extreme Rain Event (4) http://hdl.handle.net/10217/100371

Departments and Schools

• Atmospheric science: 23%

• Clinical sciences: 18%

• Soil and crop science: 14%

• Bioagricultural sciences and pest management: 9%

• Biology: 9%

• Ecology: 9%

• Ecosystem science and sustainability: 5%

• Psychology: 5%

• Statistics: 5%

• The Energy Institute: 5%

Take home points

• Data management requires new skills

• Library staff have those skills

• CSU Libraries can help

  • Education

  • One on one meetings

  • Digital Repository

Thanks!

Data Management Services

http://lib.colostate.edu/services/data-management

Need help?

Email library_data@colostate.edu

Contact Tobin

Tobin.magle@colostate.edu

Twitter: @tobinmagle

http://orcid.org/0000-0003-3185-7034
