DATA MANAGEMENT PLANS & THE DATA CITATION INDEX NIGEL ROBINSON 19 MAY 2014

Page 1: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA MANAGEMENT PLANS & THE DATA CITATION INDEX NIGEL ROBINSON

19 MAY 2014

Page 2: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

©2010 Thomson Reuters

THE INCREASING VISIBILITY OF DATA
• Grant funding agencies
• Journal publishers
• Publisher website
• Electronic articles
• Data repositories & registration agencies

“Data is the new gold”
– Neelie Kroes, EU Digital Agenda Commissioner

Page 3: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DIGITAL SCHOLARSHIP

Digital Scholarship
• Very visible within the literature as a concept
• Articles, projects, university labs all devoted to digital scholarship in various ways

Interested Parties
• Authors/researchers
• Research administrators
• Librarians, data archivists
• Publishers
• Grant funding organizations

Content
• Discipline-specific and multidisciplinary content
• Needs and requirements vary by discipline
• Diverse content formats, with few standards

Page 4: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

OVERVIEW
• Emerging landscape
• Opportunities
• Citation and attribution

Page 5: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

THE EMERGENCE OF FUNDING MANDATES

NIH (2003): Under the Data Sharing Policy, all funding applications of $500,000 or more per year are expected to address data sharing in the application.

NSF (2011): All funding proposals submitted on or after January 18, 2011, must include a “Data Management Plan” describing how the proposal will conform to NSF policy on the dissemination and sharing of research results.

Page 6: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA MANAGEMENT REQUIREMENTS EXTEND ACROSS THE GLOBE

Aug 2011… “expectation that all our funded researchers should maximise access to their research data with as few restrictions as possible. …. submit a data management and sharing plan as part of the application process.”

2007… “Researchers are to retain research data and primary materials, manage storage of research data and primary materials, maintain confidentiality of research data and primary materials.”

Page 7: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA MANAGEMENT REQUIREMENTS EXTEND ACROSS THE GLOBE

• “A further new element in Horizon 2020 is the use of Data Management Plans (DMPs) detailing what data the project will generate, whether and how it will be exploited or made accessible for verification and re-use, and how it will be curated and preserved. The use of a Data Management Plan is required for projects participating in the Open Research Data Pilot. Other projects are invited to submit a Data Management Plan if relevant for their planned research.”

Page 8: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

IMPACT ON RESEARCH LIBRARIES

Page 9: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

FUNDING MANDATES BECOMING STRONGER

January 14, 2013… “failure to provide the requisite Data Management Plan will result in the application being rejected or terminated.”

Page 10: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

WHY SHARE DATA?
• Verification - Findings can be verified
• Extend original findings, address new questions
• Reduce costs
• Training – data reuse
• Increased primary publications

Data sharing leads to more science & more knowledge

A. Pienta, G. Alter, J. Lyle (2010). The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. http://hdl.handle.net/2027.42/78307

Page 11: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA SHARING

• The level and timing of data sharing varies

28% only share prior to publication
35% only share after publication
25% share before and after publication

Page 12: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

INCREASED CITATION WITH SHARED DATA

35% to 69% more citations

courtesy of Jon Sears (AGU)

Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308

Bibliometrics

Page 13: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DEPOSITION OF DATA BY RESEARCHERS

Chart values (in order of appearance): 24%, 36%, 47%, 51%, 17%

Chart categories (in order of appearance):
• Publisher website
• Repository managed by a third party (e.g., domain-…
• Department or institutional repository
• Personal website
• Other

Q16. Where do you place your non-traditional scholarly output to make it available to others? (n=471)

Page 14: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

RESEARCHERS NOT RECEIVING CREDIT

Barriers to creating and sharing data:

• Researchers are hesitant to spend time and effort to create and share data because they don’t feel the work is adequately exposed or accredited

• Researchers find it difficult to expose data they have produced because data repositories do not have clear standards or mechanisms in place for doing so

Page 15: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

RESEARCHER PROBLEMS
• Access & discovery
• Citation standards
• Lack of willingness to deposit and cite
• Lack of recognition / credit

Page 16: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA MANAGEMENT PLAN BENEFITS

Data deposit
• Repository must hold data
• Repository must provide access to data

Active
• Material added/updated
• Provide statistics on deposited data
• Actively curate data in the archive

Persistence
• Persistent IDs, DOIs or other permanent ID
• Contacts available for confirmation of interpretation
• Indication of intention to preserve data or provide access over the long term
• Contingency if repository was to cease to operate
• Make data accessible (or state licensing terms)
• Sustainable
• Funding information available for repository and deposited data

Data reuse
• Links to literature
• Citation in literature databases

Page 17: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

CHALLENGES

• Metadata
  – Resources
  – Expertise
• Citable data source
• Metadata quality
  – Unique & persistent identifiers
  – Consistency
• Data repositories are not static
  – How is version control handled?
• Partnerships

Page 18: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA CITATION

Current citation style (in full text of article as informal citations)

Desired/future citation style (as formally cited references)

U.S. Dept. of Justice, Bureau of Justice Statistics (1996): MURDER CASES IN 33 LARGE URBAN COUNTIES IN THE UNITED STATES, 1988. Version 1. Inter-university Consortium for Political and Social Research. http://dx.doi.org/10.3886/ICPSR09907.v1

Lee, Seung-Jae; Lee, He-Jin; Cho, Ji-Hoon; Rho, Sangchul; Hwang, Daehee (2008): GSE11574: The responses of astrocytes stimulated by extracellular a-synuclein. Gene Expression Omnibus. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE11574
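The two formal citations above share a visible pattern: creators, year, title, optional version, repository, and a persistent identifier or URL. A minimal sketch of that pattern follows. The `DataCitation` class and its field names are illustrative assumptions, not part of the Data Citation Index or any repository's API; only the citation text itself comes from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements seen on the slide."""
    authors: List[str]
    year: int
    title: str
    repository: str
    identifier: str                 # DOI link or other persistent URL
    version: Optional[str] = None   # e.g. "Version 1"; omitted when unknown

    def format(self) -> str:
        # Pattern: Authors (Year): Title. [Version.] Repository. Identifier
        parts = [f"{'; '.join(self.authors)} ({self.year}): {self.title}."]
        if self.version:
            parts.append(f"{self.version}.")
        parts.append(f"{self.repository}.")
        parts.append(self.identifier)
        return " ".join(parts)


icpsr = DataCitation(
    authors=["U.S. Dept. of Justice, Bureau of Justice Statistics"],
    year=1996,
    title="MURDER CASES IN 33 LARGE URBAN COUNTIES IN THE UNITED STATES, 1988",
    repository="Inter-university Consortium for Political and Social Research",
    identifier="http://dx.doi.org/10.3886/ICPSR09907.v1",
    version="Version 1",
)

geo = DataCitation(
    authors=["Lee, Seung-Jae", "Lee, He-Jin", "Cho, Ji-Hoon",
             "Rho, Sangchul", "Hwang, Daehee"],
    year=2008,
    title="GSE11574: The responses of astrocytes stimulated by extracellular a-synuclein",
    repository="Gene Expression Omnibus",
    identifier="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE11574",
)

print(icpsr.format())
print(geo.format())
```

Keeping the version and identifier as separate fields reflects the slide's point that a formal reference, unlike an informal mention in the article text, can be matched and counted.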

Page 19: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA CITATION

Lee, Seung-Jae; Lee, He-Jin; Cho, Ji-Hoon; Rho, Sangchul; Hwang, Daehee (2008): GSE11574: The responses of astrocytes stimulated by extracellular a-synuclein. Gene Expression Omnibus. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE11574

Data Citation Index

New data metrics

Scientific literature

Published data sets

Page 20: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA CITATION INDEX AIMS

Launched October 2012

4M data records

• Enable the discovery of data repositories, data studies and data sets in the context of traditional literature

• Link data to research publications

• Help researchers find data sets and studies and track the full impact of their research output

• Provide expanded measurement of researcher and institutional research output and assessment

• Facilitate more accurate and comprehensive bibliometric analyses

Page 21: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

INDEXING A DATA REPOSITORY ON WEB OF SCIENCE

• Repository/Source: Comprises data studies, data sets and/or microcitations. Stores and provides access to the raw data.

• Data Study: Descriptions of studies or experiments with associated data which have been used in the data study. Includes serial or longitudinal studies over time.

• Data Set: A single or coherent set of data or a data file provided by the repository, as part of a collection, data study or experiment.

• Microcitation (nanopublication): An assertion about concepts that have been found to be linked by scientific enquiry, and that can be uniquely identified and attributed to its author. Made up of three separate parts: a subject, a predicate and an object.

Record Types
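The four record types, and the three-part structure of a microcitation, can be sketched as a small data model. This is an illustration under stated assumptions: the class names, field names, and the example values (including the `example:` identifier) are hypothetical and do not reflect the actual Data Citation Index schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class RecordType(Enum):
    """The four record types named on the slide."""
    REPOSITORY = "Repository"
    DATA_STUDY = "Data study"
    DATA_SET = "Data set"
    MICROCITATION = "Microcitation"


@dataclass(frozen=True)
class Microcitation:
    """An attributed assertion made of a subject, a predicate and an object."""
    subject: str
    predicate: str
    obj: str          # the "object" part; named obj to avoid the Python builtin
    author: str       # the assertion is attributed to its author
    identifier: str   # the assertion is uniquely identifiable

    record_type = RecordType.MICROCITATION  # class-level constant, not a field

    def as_triple(self) -> Tuple[str, str, str]:
        return (self.subject, self.predicate, self.obj)


# All values below are made up purely for illustration.
example = Microcitation(
    subject="protein A",
    predicate="interacts with",
    obj="protein B",
    author="Example Author",
    identifier="example:assertion-0001",
)

print(example.as_triple())
print(example.record_type.value)
```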

Descriptive metadata feed from repository
Repository raw metadata is analysed
Metadata added

Repository
Data study
Data set
Microcitation

Page 22: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

Search Results within the Data Citation Index present the powerful Web of Knowledge options for exploring a body of information. Data becomes discoverable alongside literature

Page 23: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

Data deposition makes it possible to show related data from the repository

Page 24: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

Because data are accessible and able to be cited, they can be linked to publications describing research which uses them

Page 25: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

Link out directly to the original item, in this case a Data Study.

Page 26: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

Start to build citation maps associated with data through the association of data and literature

Page 27: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

Provide assistance in how to associate data and literature through citation

Page 28: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA CITATION INDEX & DATA MANAGEMENT PLANS
• Discovery of data most important to scholarly research
• Data linked to published research literature
• Measures of data citation, use and reuse with attribution assisted by identifiers
• New metrics for digital scholarship

Page 29: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

Thank you

Nigel Robinson

Page 30: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

REPOSITORY SELECTION & EVALUATION

As we evaluate repositories for inclusion, some of the things we consider are:

• Editorial Content - ensuring that material is desirable to the research community.

• Persistence and stability of the repository, with a steady flow of new information.

• Thoroughness and detail of descriptive information.

• Links from data to research literature.

Page 31: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

DATA REPOSITORIES
• Over 1000 repositories identified

Page 32: DATA MANAGEMENT PLANS & THE DATA CITATION INDEX

TYPES OF DATA BY DISCIPLINE

ART & HUMANITIES
• CULTURAL HERITAGE
• LANGUAGE CORPUS
• IMAGE COLLECTIONS
• RECORDINGS

SOCIAL SCIENCES
• POLL DATA
• ECONOMIC STATISTICS
• LONGITUDINAL DATA
• NATIONAL CENSUS
• PUBLIC OPINION SURVEYS

SCIENCE & TECHNOLOGY
• MAPS
• ALGORITHMS
• GENOMICS
• SKY SURVEYS
• ASTROPHYSICS
• REMOTE SENSING
• MUSEUM SPECIMENS