Update on Data Publishing With Dataverse

Eleni Castro, Research Coordinator
Institute for Quantitative Social Science (IQSS)
Harvard University

DataCite Annual 2014, Nancy, France
August 25, 2014

Introduction to Dataverse

Provides incentives for researchers to share:
• Recognition & credit via data citations
• Control over data & branding
• Fulfill Data Management Plan requirements

Software framework for publishing, citing and preserving research data (open source on GitHub for others to install)

Harvard Dataverse (open to all; repository instance at Harvard) currently has:
• 761 Dataverses
• 54,828 Datasets
• 748,554 Files
• > 1 Million Downloads

EZID DOI (2013)

Who’s Using Dataverse?

Worldwide Dataverse Installations

Institutions can set up & host their own Dataverse installation (e.g., Odum, OCUL, DANS, Fudan, etc.) and within it can support datasets from a variety of users (across all research domains): Researchers, Projects, Departments, Journals, etc.

Journals Publishing Data w/ Dataverse

Option A. Journals include Dataverse as a Recommended Repository

Option B. Authors Contribute Directly to a Journal Dataverse

Option C. Seamless Integration btw Journal + Dataverse (e.g., OJS)

OJS-Dataverse Integration

Diagram labels: OJS Journal, Journal Dataverse, Citation to Data, Citation to Article

Details/Updates: 2 Year Project 2012-2014
• Integrating w/ PKP’s Open Journal Systems (Data Deposit API).
• Pilot with ~50 journals + expanding outreach (100s).
• OJS’ Dataverse plugin now available with latest OJS release.
• Future: Embed Dataverse widgets into journal article.

http://projects.iq.harvard.edu/ojs-dvn

OJS Plugin: Journal Data Policies Boilerplate Templates

Read full Data Policies / Guidelines Template: http://bit.ly/1xkLjoZ

Including Guidelines for:
1) Authors (w/ data citation)
2) Reviewers

OJS Plugin: Author Manuscript + Data Submission

Option to: (A) deposit into Dataverse, and/or (B) if the data is already in a repository, include the data citation (w/ persistent URL/identifier).

OJS Plugin: Editor Reviews Article + Data

Data Published in Dataverse w/ OJS Plugin

2 Options in OJS:
1) Dataset Published (with DOI) at Article Approval.
2) Dataset Published when Journal Issue is Released.

[Screenshots: the dataset as shown in OJS and in Dataverse]

OJS Plugin: Article Published w/ Data Citation

Towards An Integrated Publishing Lifecycle

Image Credit: Mercè Crosas

See: Data Citation Principle #1 Importance

Publishing in 4.0 (Late Fall 2014)

Rigorous Data Publishing Workflows

Workflow: Draft Dataset (Upload), then Published Dataset v1, then Published Dataset v1.1, then Published Dataset v2

• Publish Version 1
• Publish Version 1.1: small metadata change; citation doesn’t change.
• Publish Version 2: File change (automatic); big metadata change (e.g., author, title).

Citation for v1: Authors, Title, Year, DOI, Repository, V1

Citation for v2: Authors, Title, Year, DOI, Repository, UNF, V2

See: Altman, M., & King, G. (2007) doi:10.1045/march2007-altman

See Data Citation Principle #7 Specificity & Verifiability
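
The citation layout on this slide (Authors, Title, Year, DOI, Repository, an optional UNF, then the version) can be sketched as a small formatter. This is only an illustration under stated assumptions: the function name `format_citation`, the separators, and every sample value are hypothetical, and the exact string Dataverse produces may differ.

```python
from typing import Optional, Sequence


def format_citation(authors: Sequence[str], title: str, year: int, doi: str,
                    repository: str, version: str,
                    unf: Optional[str] = None) -> str:
    """Join citation fields in the order shown on the slide.

    Hypothetical helper, not Dataverse code. The UNF (a fingerprint of the
    data) is appended only when one is supplied.
    """
    parts = ["; ".join(authors), title, str(year), f"doi:{doi}", repository]
    if unf is not None:
        parts.append(unf)
    parts.append(version)
    return ", ".join(parts)


# Placeholder values only; none of these identify a real dataset.
print(format_citation(["A. Author"], "Sample Title", 2014,
                      "10.1234/EXAMPLE", "Example Dataverse", "V1"))
print(format_citation(["A. Author"], "Sample Title", 2014,
                      "10.1234/EXAMPLE", "Example Dataverse", "V2",
                      unf="UNF:placeholder"))
```

Keeping the version (and, where present, the UNF) in the citation is what lets a reader check that they hold exactly the data that was cited, which is the point of Principle #7.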

Dataset Versioning (1)

Dataset Versioning (2)

Dataset Versioning (3)

Ex. Files were added to a Dataset, so it was bumped up to a new major version.

Dataset Versioning (4)

Ex. A small metadata change was made to a Dataset, so it was bumped up to a new minor version.
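
The versioning rule illustrated by these examples (file changes or big metadata changes produce a major version; small metadata changes produce a minor one) can be sketched as follows. The function name and its parameters are hypothetical, and deciding which metadata edits count as "big" beyond author and title is left to the caller, since the slides do not enumerate them.

```python
from typing import Tuple


def next_version(current: Tuple[int, int], files_changed: bool,
                 major_metadata_change: bool) -> Tuple[int, int]:
    """Return the (major, minor) number of the next published version.

    Hypothetical sketch of the rule on the slides, not Dataverse code:
    a file change or a big metadata change starts a new major version,
    anything else only increments the minor number.
    """
    major, minor = current
    if files_changed or major_metadata_change:
        return (major + 1, 0)
    return (major, minor + 1)


# Small metadata edit on v1.0 -> v1.1
print(next_version((1, 0), files_changed=False, major_metadata_change=False))  # (1, 1)
# Adding files to v1.1 -> v2.0
print(next_version((1, 1), files_changed=True, major_metadata_change=False))   # (2, 0)
```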

Deaccession Data in 4.0

Before a Dataset is published, the DOI is private (reserved). Only when the Dataset is published is the DOI made public & searchable.

You can Deaccession (in 4.0):

1. one or more versions of a Dataset, or

2. an entire Dataset.

In accordance w/ Data Citation Principle #6 Persistence: A Published Dataset cannot be deleted; only deaccessioned, with a reason.
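
The rules on this slide can be read as a small state model: a draft holds a reserved DOI, publishing makes the DOI public, and a published version can only be deaccessioned (with a reason), never deleted. The sketch below is a hypothetical illustration; the class, method, and state names are assumptions, as is the behavior that a draft may be discarded.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    DRAFT = "draft"                  # DOI reserved, not public
    PUBLISHED = "published"          # DOI public and searchable
    DEACCESSIONED = "deaccessioned"  # withdrawn, with a recorded reason
    DISCARDED = "discarded"          # assumption: only drafts can end up here


@dataclass
class DatasetVersion:
    """Hypothetical model of one dataset version; not Dataverse code."""
    doi: str
    state: State = State.DRAFT
    deaccession_reason: Optional[str] = None

    def publish(self) -> None:
        if self.state is not State.DRAFT:
            raise ValueError("only a draft can be published")
        self.state = State.PUBLISHED

    def delete(self) -> None:
        # Persistence: once published, a version can no longer be deleted.
        if self.state is not State.DRAFT:
            raise PermissionError("published data cannot be deleted; deaccession it")
        self.state = State.DISCARDED  # assumption: drafts may be discarded

    def deaccession(self, reason: str) -> None:
        if self.state is not State.PUBLISHED:
            raise ValueError("only a published version can be deaccessioned")
        if not reason.strip():
            raise ValueError("a reason is required to deaccession")
        self.state = State.DEACCESSIONED
        self.deaccession_reason = reason


v = DatasetVersion(doi="doi:10.1234/EXAMPLE")  # placeholder DOI
v.publish()
v.deaccession("File contains identifiable information")
print(v.state.value, "-", v.deaccession_reason)
```

Requiring a non-empty reason at the point of deaccession keeps a record that can be shown to anyone who later follows the citation.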

Deaccession Workflow (Step 1)

Ex. This file was added in v2 and has identifiable information.

Deaccession Workflow (Step 2)

Deaccession Workflow (Step 3)

Deaccession Workflow (Step 4)

Deaccession Landing Page

Data Citation Principle #6 Persistence

Data Publishing After 4.0 (2015)

Publishing Privacy Sensitive Data
• Secure Dataverse
• DataTags (demo) (based on Privacy Laws and DUAs)

Integration with ORCID (API): create ORCID account, connect all Dataverse datasets to ORCID account. (Note: 4.0 will already allow for authors to enter ID.)

Full interview

Thank you!

Contact: [email protected]

More information: http://datascience.iq.harvard.edu

Twitter: @thedataorg

