Data Metadata and Data Citation - Emma Ganley (PLoS)


Emma Ganley

Chief Editor, PLOS Biology & Data Lead for PLOS

eganley@plos.org @GanleyEmma

August 10, 2016

Data, Metadata, and Data Citation:

Insights from PLOS

PLOS is a non-profit Open Access publisher and advocacy organization with a mission to accelerate

progress in science and medicine by leading a transformation in research communication.

ORCID

ORCID – Connecting Research & Researchers

Jan 2016: As part of a coalition of publishers, PLOS signed an Open Letter

committing to follow best practices when collecting, processing, & displaying ORCID iDs.

https://orcid.org/content/requiring-orcid-publication-workflows-open-letter

ORCID API

NB ORCID API enables single sign-on via ORCID button in correspondence
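As a rough illustration of what an ORCID sign-in button does behind the scenes, the sketch below builds the OAuth authorization link such a button could point to. It assumes the standard ORCID authorization endpoint and the `/authenticate` scope; the client ID and redirect URI are placeholders, not values from the talk, and this is not a description of how PLOS implemented it.

```python
from urllib.parse import urlencode

# ORCID's public OAuth authorization endpoint.
ORCID_AUTHORIZE_URL = "https://orcid.org/oauth/authorize"


def build_orcid_signin_url(client_id: str, redirect_uri: str) -> str:
    """Return the URL a 'Sign in with ORCID' button could link to."""
    params = {
        "client_id": client_id,        # issued by ORCID when an app is registered
        "response_type": "code",       # authorization-code flow
        "scope": "/authenticate",      # only asks ORCID to confirm the user's iD
        "redirect_uri": redirect_uri,  # where ORCID sends the user afterwards
    }
    return f"{ORCID_AUTHORIZE_URL}?{urlencode(params)}"


# Placeholder values, for illustration only.
url = build_orcid_signin_url(
    "APP-XXXXXXXXXXXXXXXX",
    "https://journal.example.org/orcid/callback",
)
print(url)
```

After the author signs in, ORCID redirects back with a short-lived code that the submission system would exchange for the author's verified iD.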

Auto-update

With ORCID integration, authors and their administrators no longer need to update profiles manually

http://blogs.plos.org/plos/2016/01/author-credit-plos-orcid-update/

CRediT

CRediT Taxonomy

Makes contributions machine-readable and portable

http://casrai.org/CRediT

Conceptualization | Methodology | Software | Validation | Formal Analysis | Investigation | Resources | Data curation | Writing (original draft) | Writing (reviews) | Visualization | Supervision | Project administration | Funding acquisition
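To make "machine-readable and portable" concrete, here is a minimal sketch of how one author's contributions might be stored as structured data. The role names are the fourteen listed on the slide; the record layout, the function name, and the sample iD are illustrative assumptions, not a PLOS or CASRAI format.

```python
import json

# The fourteen CRediT roles, as named on the slide.
CREDIT_ROLES = [
    "Conceptualization", "Methodology", "Software", "Validation",
    "Formal Analysis", "Investigation", "Resources", "Data curation",
    "Writing (original draft)", "Writing (reviews)", "Visualization",
    "Supervision", "Project administration", "Funding acquisition",
]


def contribution_record(orcid_id: str, roles: list[str]) -> dict:
    """Build a hypothetical per-author record, rejecting non-CRediT roles."""
    unknown = [role for role in roles if role not in CREDIT_ROLES]
    if unknown:
        raise ValueError(f"Not CRediT roles: {unknown}")
    # Keep roles in taxonomy order so records compare consistently.
    ordered = sorted(roles, key=CREDIT_ROLES.index)
    return {"orcid": orcid_id, "roles": ordered}


# The iD below is a placeholder, not a real author of this talk.
record = contribution_record(
    "0000-0002-1825-0097", ["Software", "Conceptualization"]
)
print(json.dumps(record, indent=2))
```

Because the roles come from a fixed vocabulary, the same record could be reused by a CV tool or profile page without re-typing free text.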

CRediT Taxonomy

Contributions become portable to the ORCID record, CV, faculty profile, etc.

Open Data

Challenges and Opportunities

Science 11 February 2011: vol. 331 no. 6018 692-693

CREDIT: M. TWOMBLY/SCIENCE; SOURCE: SCIENCE ONLINE SURVEY

2011 Survey in Science

Data Availability Declines Over Time

ALMOST ALL DATA LOST 10-15 YRS AFTER PUBLICATION

Source: How Does the Availability of Research Data Change With Time Since Publication? Timothy H. Vines and colleagues, Abstract (podium), Peer Review Congress, 2013

Old to New Policy – March 2014

OLD: Authors should make all relevant data immediately available without restrictions (excl. patient confidentiality)

NEW: Require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception.

Authors must provide a Data Availability Statement (DAS) describing compliance with PLOS’ policy.

*No change to WHAT data needs to be shared, but the focus was placed on WHERE it is housed, WHEN it is shared, and HOW authors provide access for those who want it*

VALUE OF DATA

Replication/Validation;

New analysis;

Better interpretation;

Include in meta-studies;

Facilitate reproducibility;

Scrutiny post-publication;

*$$$* Better return on research investment.

Data Availability Statement

The DAS is openly available, and machine-readable as part of the PLOS search API
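A small sketch of what "machine-readable as part of the PLOS search API" could look like in practice: building a Solr-style query URL against the public PLOS search endpoint. The field name `data_availability` is an assumption made for illustration and should be checked against the API's field list; the function name is hypothetical too.

```python
from urllib.parse import urlencode

# Public PLOS search endpoint (Solr-style parameters: q, fl, rows, wt).
PLOS_SEARCH_URL = "http://api.plos.org/search"

# ASSUMPTION: hypothetical name for the field holding the DAS text;
# confirm the real field name in the PLOS API documentation.
DAS_FIELD = "data_availability"


def build_das_query(term: str, rows: int = 10) -> str:
    """Build a search URL for articles whose DAS mentions `term`."""
    params = {
        "q": f'{DAS_FIELD}:"{term}"',    # phrase search within the DAS field
        "fl": f"id,title,{DAS_FIELD}",   # fields to return for each article
        "rows": rows,                    # number of results
        "wt": "json",                    # response format
    }
    return f"{PLOS_SEARCH_URL}?{urlencode(params)}"


query_url = build_das_query("Dryad", rows=5)
print(query_url)
```

Fetching that URL with any HTTP client would, under the assumption above, return the matching articles' DOIs, titles, and statements as JSON.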

Data Availability Statement

Anecdotes & Interpretation

2011-12: 51 papers, 12% with data available

Since Mar 2014: 20 papers, 40% compliance

Source: ‘Confusion over publisher’s pioneering open-data rules’ Nature 515, 478 (27 November 2014) doi:10.1038/515478a

‘Mandated data archiving greatly improves access to research data’ T. H. Vines et al. FASEB J 27, 1304-1308; Jan 2013

50 fMRI studies in PLOS ONE

38 had shared the data

12 had not shared the data

76% compliance!

An increase in data sharing:

- From 12% to 40%

- Even up to as much as 76%
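The 76% figure follows directly from the fMRI counts on the slide (38 of 50 studies). The short sketch below reproduces that arithmetic and the size of the 12% to 40% jump; the function name is just for illustration.

```python
def sharing_rate(shared: int, total: int) -> int:
    """Percentage of papers that shared their data, rounded to a whole number."""
    return round(100 * shared / total)


# fMRI sample from the slide: 38 of 50 PLOS ONE studies shared data.
fmri_rate = sharing_rate(38, 50)
print(fmri_rate)  # 76

# Change between the two anecdotal samples, in percentage points.
increase_points = 40 - 12
print(increase_points)  # 28
```

These samples are small (51, 20, and 50 papers), so the percentages are best read as anecdotes rather than precise estimates.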

Not seeing full compliance but we are seeing massive improvements

Where are the data (PLOS ONE)?

DAS = Data availability statement

• Most common repositories: figshare, Dryad, NCBI GenBank, NCBI GEO, NCBI SRA, Dataverse, EBI, GitHub, ArrayExpress, RCSB Protein Data Bank

• Persistent Identifiers Used.

• Increase seen both in repository deposits and in SI files.

Data Repositories & Institutional Repositories

http://journals.plos.org/plosbiology/s/data-availability#loc-recommended-repositories

Biosharing.org collection

Publishers in Data Access & Sharing

Lin J, Strasser C (2014) Recommendations for the Role of Publishers in Access to Data. PLoS Biol 12(10): e1001975. doi:10.1371/journal.pbio.1001975

NSF-funded project: Make Data Count.

To explore and test data-level metrics.

Meeting co-organized by PLOS & California Digital Library

Data Citation Implementation Pilot

Implementation of the Joint Declaration of Data Citation Principles (JDDCP), developed by the FORCE11 Data Citation Implementation Group.

Goal: provide basic coordination between publishers, repositories and identifier / metadata services for early adopters of data citation according to the JDDCP

Funded by the NIH BD2K bioCADDIE to extend the work of the Data Citation Implementation Group

THOR is funded by the European Commission under call H2020-EINFRA-2014-2, project number 654039

30-month project funded by the European Commission under the Horizon 2020 programme.

- Aim to establish seamless integration between articles, data, and researchers across the research lifecycle.

- Create a wealth of open resources and foster a sustainable international e-infrastructure.

- Reduced duplication, economies of scale, richer research services, and opportunities for innovation.

@EmmaGanley

eganley@plos.org

Thank you!

Questions?