Emma Ganley Chief Editor, PLOS Biology & Data Lead for PLOS [email protected] @GanleyEmma August 10, 2016 Data, Metadata, and Data Citation: Insights from PLOS

Data Metadata and Data Citation - Emma Ganley (PLoS)


Page 1: Data Metadata and Data Citation - Emma Ganley (PLoS)

Emma Ganley

Chief Editor, PLOS Biology & Data Lead for PLOS

[email protected] @GanleyEmma

August 10, 2016

Data, Metadata, and Data Citation:

Insights from PLOS

Page 2: Data Metadata and Data Citation - Emma Ganley (PLoS)

PLOS is a non-profit Open Access publisher and advocacy organization with a mission to accelerate progress in science and medicine by leading a transformation in research communication.

Page 3: Data Metadata and Data Citation - Emma Ganley (PLoS)

ORCID

Page 4: Data Metadata and Data Citation - Emma Ganley (PLoS)

ORCID – Connecting Research & Researchers

Jan 2016: PLOS, as part of a coalition of publishers, signed an Open Letter committing to follow best practices when collecting, processing, & displaying ORCID iDs.

https://orcid.org/content/requiring-orcid-publication-workflows-open-letter
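
Collecting ORCID iDs reliably usually starts with checking that an iD is well-formed. The sketch below is one way to do that in Python, assuming the ISO 7064 MOD 11-2 check-character scheme that ORCID documents for its identifiers. The function names are illustrative and are not part of any PLOS or ORCID library.

```python
import re

# Four hyphen-separated groups of four; the last character may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD is well-formed and its check character matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]
```

For instance, `is_valid_orcid("0000-0002-1825-0097")` passes the check, while changing the final character to `8` makes it fail.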

Page 5: Data Metadata and Data Citation - Emma Ganley (PLoS)

ORCID API

NB: The ORCID API enables single sign-on via the ORCID button in correspondence

Page 6: Data Metadata and Data Citation - Emma Ganley (PLoS)

Auto-update

With ORCID integration, authors and their administrators no longer need to keep profiles updated manually

http://blogs.plos.org/plos/2016/01/author-credit-plos-orcid-update/

Page 7: Data Metadata and Data Citation - Emma Ganley (PLoS)

CRediT

Page 8: Data Metadata and Data Citation - Emma Ganley (PLoS)

CRediT Taxonomy

Makes contributions machine-readable and portable

http://casrai.org/CRediT

Conceptualization | Methodology | Software | Validation | Formal Analysis | Investigation | Resources | Data curation | Writing (original draft) | Writing (reviews) | Visualization | Supervision | Project administration | Funding acquisition
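
To illustrate what "machine-readable and portable" could look like in practice, here is a minimal Python sketch that stores the fourteen roles listed above as an enumeration and round-trips author contributions through JSON. The JSON layout and the author names are assumptions made for illustration; this is not an official CRediT or PLOS schema.

```python
import json
from enum import Enum
from typing import Dict, List


class CreditRole(Enum):
    """The fourteen roles, labelled as they appear on the slide."""
    CONCEPTUALIZATION = "Conceptualization"
    METHODOLOGY = "Methodology"
    SOFTWARE = "Software"
    VALIDATION = "Validation"
    FORMAL_ANALYSIS = "Formal Analysis"
    INVESTIGATION = "Investigation"
    RESOURCES = "Resources"
    DATA_CURATION = "Data curation"
    WRITING_ORIGINAL_DRAFT = "Writing (original draft)"
    WRITING_REVIEWS = "Writing (reviews)"
    VISUALIZATION = "Visualization"
    SUPERVISION = "Supervision"
    PROJECT_ADMINISTRATION = "Project administration"
    FUNDING_ACQUISITION = "Funding acquisition"


def contributions_to_json(contributions: Dict[str, List[CreditRole]]) -> str:
    """Serialize {author: [roles]} into a JSON list (hypothetical layout)."""
    payload = [
        {"author": author, "roles": [role.value for role in roles]}
        for author, roles in contributions.items()
    ]
    return json.dumps(payload, indent=2)


def contributions_from_json(text: str) -> Dict[str, List[CreditRole]]:
    """Parse the JSON back; an unknown role label raises ValueError."""
    return {
        entry["author"]: [CreditRole(label) for label in entry["roles"]]
        for entry in json.loads(text)
    }
```

Because the roles come from a fixed vocabulary, a record written by one system can be validated and reused by another, which is the portability the taxonomy aims for.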

Page 9: Data Metadata and Data Citation - Emma Ganley (PLoS)

CRediT Taxonomy

Page 10: Data Metadata and Data Citation - Emma Ganley (PLoS)

Contributions become portable to ORCID record, CV, faculty profile, etc.

Page 11: Data Metadata and Data Citation - Emma Ganley (PLoS)

Open Data

Page 12: Data Metadata and Data Citation - Emma Ganley (PLoS)

Challenges and Opportunities

Science 11 February 2011: vol. 331 no. 6018, 692-693

CREDIT: M. TWOMBLY/SCIENCE; SOURCE: SCIENCE ONLINE SURVEY

2011 Survey in Science

Page 13: Data Metadata and Data Citation - Emma Ganley (PLoS)

Data Availability Declines Over Time

ALMOST ALL DATA LOST 10-15 YRS AFTER PUBLICATION

Source: How Does the Availability of Research Data Change With Time Since Publication? Timothy H. Vines and colleagues, Abstract (podium), Peer Review Congress, 2013

Page 14: Data Metadata and Data Citation - Emma Ganley (PLoS)

Old to New Policy – March 2014

OLD: Authors should make all relevant data immediately available without restrictions (excl. patient confidentiality)

NEW: Require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception.

Authors must provide a Data Availability Statement (DAS) describing compliance with PLOS’ policy.

*No change to WHAT data needs to be shared, but the focus was placed on WHERE it is housed, WHEN it is shared, and HOW authors provide access for those who want it*

VALUE OF DATA

Replication/Validation;

New analysis;

Better interpretation;

Include in meta-studies;

Facilitate reproducibility;

Scrutiny post-publication;

*$$$* Better return on research investment.

Page 15: Data Metadata and Data Citation - Emma Ganley (PLoS)

Data Availability Statement

Page 16: Data Metadata and Data Citation - Emma Ganley (PLoS)

The DAS is openly available, and machine-readable as part of the PLOS search API

Data Availability Statement
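
Since the DAS is exposed through the PLOS search API, a query for one article's statement might be assembled as in the Python sketch below. It assumes the API is served at `http://api.plos.org/search` and accepts the usual Solr parameters (`q`, `fl`, `wt`, `rows`). The name of the field that holds the DAS is not given in these slides, so it is passed in as a parameter; the value `data_availability` used in the usage note is a hypothetical placeholder to be checked against the API's field list. The code only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumed endpoint of the PLOS search API (Solr-style query interface).
PLOS_SEARCH_URL = "http://api.plos.org/search"


def build_das_query(doi: str, das_field: str, rows: int = 1) -> str:
    """Build a search URL asking for an article's id, title, and DAS field.

    das_field is the caller-supplied name of the field holding the Data
    Availability Statement; this sketch does not know the real name.
    """
    params = {
        "q": f'id:"{doi}"',               # look the article up by its DOI
        "fl": f"id,title,{das_field}",    # fields to return
        "wt": "json",                     # response format
        "rows": rows,                     # maximum number of results
    }
    return f"{PLOS_SEARCH_URL}?{urlencode(params)}"
```

Calling `build_das_query("10.1371/journal.pbio.1001975", "data_availability")` yields a URL that could then be fetched with any HTTP client.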

Page 17: Data Metadata and Data Citation - Emma Ganley (PLoS)

Anecdotes & Interpretation

2011-12: 51 papers, 12% with data available

Since Mar 2014: 20 papers, 40% compliance

Source: ‘Confusion over publisher’s pioneering open-data rules’ Nature 515, 478 (27 November 2014) doi:10.1038/515478a

‘Mandated data archiving greatly improves access to research data’ T. H. Vines et al. FASEB J 27, 1304-1308; Jan 2013

50 fMRI studies in PLOS ONE

38 had shared the data

12 had not shared the data

76% compliance!

An increase in data sharing:

- From 12% to 40%

- Even up to as much as 76%

Not seeing full compliance, but we are seeing massive improvements.
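
The arithmetic behind these figures is simple enough to check directly. The Python sketch below recomputes the fMRI compliance rate from the counts on this slide (38 of 50) and the change between the two reported rates (12% and 40%); the function names are illustrative.

```python
def sharing_rate(shared: int, total: int) -> float:
    """Percentage of papers that shared their data."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * shared / total


def percentage_point_change(before: float, after: float) -> float:
    """Absolute change between two rates, in percentage points."""
    return after - before


fmri_rate = sharing_rate(38, 50)               # 76.0
increase = percentage_point_change(12.0, 40.0)  # 28.0
```

Note that the 12% and 40% figures come from small samples (51 and 20 papers), so the 28-point increase is an anecdote rather than a precise estimate.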

Page 18: Data Metadata and Data Citation - Emma Ganley (PLoS)

Where are the data (PLOS ONE)?

DAS = Data availability statement

• Most common repositories: figshare, Dryad, NCBI GenBank, NCBI GEO, NCBI SRA, Dataverse, EBI, GitHub, ArrayExpress, RCSB Protein Data Bank

• Persistent identifiers used.

• Increase seen both in repository deposits and in SI files.

Page 19: Data Metadata and Data Citation - Emma Ganley (PLoS)

Data Repositories & Institutional Repositories

http://journals.plos.org/plosbiology/s/data-availability#loc-recommended-repositories

Biosharing.org collection

Page 20: Data Metadata and Data Citation - Emma Ganley (PLoS)

Publishers in Data Access & Sharing

Lin J, Strasser C (2014) Recommendations for the Role of Publishers in Access to Data. PLoS Biol 12(10): e1001975. doi:10.1371/journal.pbio.1001975

NSF-funded project: Make Data Count.

To explore and test data-level metrics.

Meeting co-organized by PLOS & California Digital Library

Page 21: Data Metadata and Data Citation - Emma Ganley (PLoS)

Data Citation Implementation Pilot

Implementation of the Joint Declaration of Data Citation Principles (JDDCP), developed by the FORCE11 Data Citation Implementation Group.

Goal: provide basic coordination between publishers, repositories and identifier / metadata services for early adopters of data citation according to the JDDCP

Funded by NIH BD2K bioCADDIE to extend the work of the Data Citation Implementation Group
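
As a rough illustration of what a data citation with a persistent identifier can look like, the Python sketch below renders one in the common "Creators (Year). Title. Repository. Identifier" pattern. The class, its field names, and the placeholder values in the usage note are assumptions made for this example; the JDDCP itself states principles and does not prescribe this exact format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataCitation:
    """Minimal metadata for citing a dataset (illustrative field names)."""
    creators: List[str]
    year: int
    title: str
    repository: str
    doi: str  # persistent identifier, with or without a "doi:" prefix

    def doi_url(self) -> str:
        """Return the DOI as a resolvable https://doi.org/ link."""
        bare = self.doi.strip()
        if bare.lower().startswith("doi:"):
            bare = bare[4:]
        return f"https://doi.org/{bare}"

    def render(self) -> str:
        """Format the citation as one human-readable line."""
        names = "; ".join(self.creators)
        return (
            f"{names} ({self.year}). {self.title}. "
            f"{self.repository}. {self.doi_url()}"
        )
```

With placeholder values such as `DataCitation(["Author A", "Author B"], 2016, "Example dataset", "Example Repository", "doi:10.1234/example")`, `render()` produces a single line ending in the resolvable link `https://doi.org/10.1234/example`.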

Page 22: Data Metadata and Data Citation - Emma Ganley (PLoS)

THOR is funded by the European Commission under call H2020-EINFRA-2014-2, project number 654039

30-month project funded by the European Commission under the Horizon 2020 programme.

- Aim to establish seamless integration between articles, data, and researchers across the research lifecycle.

- Create a wealth of open resources and foster a sustainable international e-infrastructure.

- Reduced duplication, economies of scale, richer research services, and opportunities for innovation.

Page 23: Data Metadata and Data Citation - Emma Ganley (PLoS)

@GanleyEmma

[email protected]

Thank you!

Questions?