The purpose, practicalities, pitfalls and policies of managing and sharing data in the UK
AAMG-CICAG Measurement, Information and Innovation meeting
20 October 2015
Dr Danny Kingsley
Can we cover this in 15 minutes (allowing 5 min for questions)?
• UK policy landscape
• Places to share data
• What are we trying to achieve?
• Let’s start at the beginning
• Basics of Research Data Management
• Issues with sharing (or not) data
The data policy landscape
Lots of slightly different rules in the UK
Policies
• Funder
  – RCUK Common Principles on Data Policy
• Government
  – Draft Concordat on Open Research Data released by the RCUK for consultation, which ended on 28 September
    • http://www.rcuk.ac.uk/research/opendata/
  – Cambridge coordinated a joint response with other universities
    • https://unlockingresearch.blog.lib.cam.ac.uk/?p=285
• Publishers
• Institutional
  – Cambridge University Research Data Management Policy Framework
    • http://www.data.cam.ac.uk/university-policy
RCUK Common Principles on Data
– “Publicly funded research data are a public good (…), which should be made openly available with as few restrictions as possible”
– http://www.rcuk.ac.uk/research/datapolicy/
The principles might be common…
What the researcher hears
From Bill Hubbard, Getting the rights right: when policies collide
http://www.slideshare.net/UKSG/hubbard-uksg-may2015-public
Places to share data
There are lots of options
Open repositories
• (some are free, some charge)
Discipline-specific repositories
• Gene Expression Omnibus
  – Public functional genomics data repository
    • http://www.ncbi.nlm.nih.gov/geo/
• arXiv
  – e-prints in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics
    • http://arxiv.org/
• Oxford Text Archive
  – Literary and linguistic texts for higher education
    • http://ota.ox.ac.uk/
• UK Data Service
  – Social science data
    • http://ukdataservice.ac.uk/
• Natural Environment Research Council (NERC) run 7 repositories
  – http://www.nerc.ac.uk/research/sites/data/
Journals
• Either as supplementary data, or in data-only journals
  – PLOS data sharing policy (Dec 2013)
    • https://www.plos.org/plos-data-policy-faq/
  – Nature’s journal Scientific Data
    • http://www.nature.com/sdata/about
We are a long way from there
So what’s it all about then?
What are we actually trying to achieve with open data policies?
In conversation with Ben Ryan EPSRC
• Please share:
  – the data that underpins publications
  – the data that validates research findings
  – the data that is worth keeping
• The default position is ‘data should be open’
• Published research findings should be testable
• Maximise the impact of publicly funded research
• Maintain public trust in science and research
• They are trying to create a new research culture
• https://unlockingresearch.blog.lib.cam.ac.uk/?p=151
Responses to data sharing policies
• What’s the minimum we can get away with?
• This is crap
• ‘They’ are just doing this because ‘they’ can
• But it will take a huge effort to get the data in a useable form
• No-one will look at it
• What a waste of time
Data excuse bingo
We are trying to start at the end
We should begin at the beginning - a stitch in time and all that…
In conversation with Michael Ball BBSRC
• Disciplines themselves must establish ways of dealing with data
  – This is the beginning of an ongoing process
• Researchers need to consider how to deal with data from the beginning of a research project
• You can ask for money to manage data in the grant application
• https://unlockingresearch.blog.lib.cam.ac.uk/?p=337
Research data management
• The practice of sharing data requires the data to be:
  – Accessible
  – Intelligible
  – Assessable
  – Reusable
Some of it is really obvious
• How many of you:
  – Use a file naming protocol?
  – Ensure all your laptops are backed up?
  – Have written a data management plan for your current project?
  – Have determined who in the team owns the data?
    • PS: this last one REALLY matters
Skillsets required for managing and curating data
http://www.dcc.ac.uk/sites/default/files/documents/RDMF/RDMF2/coreSkillsDiagram.gif
Lots of jobs…
Issues with sharing data
Both with sharing and not sharing
Issues raised by researchers
• There is a very real concern that the UK will become unattractive for collaborations
• Researchers are discussing changing the type of research being done to reduce the amount of data being produced
• There is discussion in some circles about whether applying for EPSRC funding is worth the hassle
Consequences of not sharing data
• Medicine
  – Having the data publicly available in two trials of deworming pills demonstrated that a population-wide deworming program did not improve school performance
  – http://www.buzzfeed.com/bengoldacre/deworming-trials
• Economics
  – A study widely cited to justify budget cutting in the US had a mistake in the calculations which was only revealed when the Excel file was released
  – http://www.bloomberg.com/bw/articles/2013-04-18/faq-reinhart-rogoff-and-the-excel-error-that-changed-history
• Physics
  – It took 12.5 years to withdraw Jan Hendrik Schön’s work on ‘organic semiconductors’ because the reviewers were unable to replicate the results without access to the original data or lab books
  – http://www.science20.com/science_20/jan_hendrik_sch%C3%B6n_world_class_physics_fraud_gets_last_laugh_whole_book_about_himself
Questions?
Dr Danny Kingsley
Head of Scholarly Communication
University of Cambridge
Email: [email protected]
Blog: https://unlockingresearch.blog.lib.cam.ac.uk/
Website: http://osc.cam.ac.uk
Twitter: @dannykay68