Lortie data citation ignite talk ESA2014

Preview:

DESCRIPTION

For better or worse, citations are here to stay. Citations have the capacity to serve as a proxy estimate of uptake or use by the community of one's products. Fortunately, the range of acceptable scientific products is rapidly expanding, datasets in many forms continue to serve as pivotal resources, and big data syntheses are reshaping the standards for acceptable derived evidence. Data citations are defined, general rules are provided, and the unique elements of datasets, such as versioning and persistent identifiers, are described. The cultural and scientific discovery implications of data citations are also described, focusing on emerging linked-data futures.

Citations to publications & select figures
1. H. A. Piwowar, T. J. Vision, M. C. Whitlock, Nature 473, 285 (2011).
2. H. A. Piwowar, J. D. Carlson, T. J. Vision, Proceedings of the American Society for Information Science and Technology 48, 1 (2011).
3. C. W. Belter, PLoS ONE 9, e92590 (2014).
4. Ş. Kafkas, J.-H. Kim, J. R. McEntyre, PLoS ONE 8, e63184 (2013).
5. H. A. Piwowar, R. S. Day, D. B. Fridsma, PLoS ONE 2, e308 (2007).
6. A. Kenall, S. Harold, C. Foote, BMC Ecology 14, 10 (2014).

Citation to dataset
http://bit.ly/datacitecontrast

Citation preview

data citations

building blocks = reproducible science

use datasets

we need those bricks in repositories but…

collect data

analyze write publish paper

many scientists do this.

there are many scientific products

collect data

analyze write publish paper

all can be fundamental to scientific inquiry

publish dataset

publish data

descriptor

even better

collect data

analyze write publish paper

share datasets first

publish dataset

publish data

descriptor

transparently, collaboratively work & communicate

why build alone in secret?

why would you do all this?

altruistic: discourages fraud, error identification

multiple perspectives training new researchers

avoiding duplicate data collection

individual: citations

reciprocity recognition

crowdsourcing more publishable units

Piwowar et al. 2007

evidence

sharing data may drive more citations to your papers

Piwowar et al. 2007

trials with data were cited about 70% more frequently

Citations have the capacity to serve as a proxy estimate of uptake or use by the community of one's products.

There are at least two issues with this assumption.

Datasets can be independent products. Citations to papers are not everything.

Datasets can be well cited

Belter 2014

At this point in time, however, citations to datasets likely underestimate use/reuse.

Belter 2014

a wide variety of methods is used to refer to datasets, from formal citation to mentions

Piwowar et al. 2011

Manual review was performed for each instance of potential data reuse.

Yes, there is a Data Citation Index, a ScienceDirect database linking tool,

a stable identifier such as a DOI for each dataset, and a host of curated repositories.

there still remain some unique data citation challenges

citation to repository dataset, data publication, or primary publication

the mechanics of citations

in addition to the ongoing development of data citation standards, curation, retrievability, versioning, and meta-data can introduce

challenges

solutions

use a data repository

cite your own datasets properly

cite databases used/reused in your papers in main literature lists

provide meta-data (use EML)
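
As a rough illustration of the solutions above, a dataset citation can be assembled from the same pieces the talk highlights: creators, year, title, version, repository, and a persistent identifier. The sketch below is only one possible layout, loosely following the commonly recommended creator (year), title, version, publisher, identifier ordering. The class name, function name, and every field value (including the DOI) are hypothetical placeholders, not taken from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    """Minimal description of a deposited dataset (all fields are illustrative)."""
    creators: List[str]
    year: int
    title: str
    version: str
    repository: str
    doi: str


def format_data_citation(rec: DatasetRecord) -> str:
    """Build a one-line citation string that keeps version and DOI visible."""
    authors = ", ".join(rec.creators)
    return (
        f"{authors} ({rec.year}). {rec.title}. Version {rec.version}. "
        f"{rec.repository}. https://doi.org/{rec.doi}"
    )


# Hypothetical example record; the DOI below is a made-up placeholder.
example = DatasetRecord(
    creators=["Doe, J.", "Roe, R."],
    year=2014,
    title="Example plant survey dataset",
    version="1.2",
    repository="Example Repository",
    doi="10.1234/example.5678",
)
citation = format_data_citation(example)
print(citation)
```

Keeping the version and the identifier in the string is the point of the sketch: it lets a reader retrieve the exact release of the dataset that was used.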

OA databases thus become a resource

aggregated datasets not only promote better citation practices but also provide cross-disciplinary insights

Kafkas et al. 2013

innovative solution: text mining for accession numbers

Kafkas et al. 2013
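
To make the text-mining idea concrete, here is a minimal sketch of scanning article text for database identifiers. It is not the method used by Kafkas et al.; the pattern is a deliberately simplified assumption (one or two capital letters followed by five or six digits, resembling nucleotide sequence accession identifiers), and the function name and sample sentence are hypothetical.

```python
import re
from collections import Counter

# Simplified, illustrative pattern: 1-2 capital letters followed by 5-6 digits.
# Real identifier formats vary by database and need database-specific rules.
ACCESSION_PATTERN = re.compile(r"\b[A-Z]{1,2}\d{5,6}\b")


def find_accessions(text: str) -> Counter:
    """Count every identifier-like token found in the given text."""
    return Counter(ACCESSION_PATTERN.findall(text))


# Hypothetical sentence standing in for the full text of an article.
sample = (
    "Sequences were deposited under accession numbers AB123456 and "
    "U12345; AB123456 was reused in the second analysis."
)
counts = find_accessions(sample)
print(counts)
```

Counting such mentions across full texts is one way to capture data reuse that never appears in a reference list, although a pattern this loose would also need filtering for false positives.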

however, citations are just one indicator of scientific discourse

altmetrics & collaboration through other channels are important

Recognize value of curation

Accelerate synthesis

Leverage work of others

Promote open science

Hone your workflow

data citations

cite & use datasets in your primary research papers too
