Open data for open scholarship: where are we?
Kevin Ashley Digital Curation Centre
www.dcc.ac.uk @kevingashley
Reusable with attribution: CC-BY
The DCC is supported by Jisc
My home – the DCC
• Mission – to increase capability and capacity for research data services in UK institutions
• Not just a UK problem – an international one
• Training, shared services, guidance, policy, standards, futures
2014-10-07 Kevin Ashley –Confoa-2014 - CC-BY
Before where: WHY?
Data reuse stories
• The palaeontologist who saved years of work with archaeological data
What a palaeontologist looks at
[Figure: timeline running from now back to 100 million years ago, marked at 1m, 25m, 50m and 75m years]
What a palaeontologist looks at
[Figure: the same 100-million-year timeline, with the most recent 1 million years expanded and marked at 100,000, 500,000 and 750,000 years]
What an archaeologist looks at
[Figure: timeline of the last 1 million years, marked at 100,000, 500,000 and 750,000 years, with the most recent 100,000 years expanded and marked at 25,000, 50,000 and 75,000 years]
Data reuse stories
• The palaeontologist who saved years of work with archaeological data
• The 19th-century ships' logs that help us model climate change
The Old Weather project
Data for research, not from research
Data reuse stories
• The palaeontologist who saved years of work with archaeological data
• The 19th-century ships' logs that help us model climate change
• The ‘noise’ from research radar that mapped dust from Eyjafjallajökull
Data reuse - messages
Often your data tells stories that your publications do not
Not all data comes from other researchers
One person’s noise is another person’s signal
Discipline-bounded data discovery doesn’t give us all we need or want
Data reuse from Hubble
G8 UK - Endorses OA
Open Data Charter
Policy Paper, 18 June 2013
Why does this matter?
• Research quality – How close can we get to the truth?
• Research speed – How quickly can we get to the truth?
• Research finance – How much does the truth cost?
• Improving one or more of these is of interest to all actors:
• Researchers as data creators
• Researchers as data reusers
• Research institutions
• Funders – hence government and society
Open scholarship – the wider picture
• Not just about open papers, open data
• Software, methods, workflows are all important
• Data need not be open – but its existence must be
• Data: DISCOVERABLE & REUSABLE
FUNDER POLICY
UNIVERSITY RESPONSE
Funder requirements
• UK
• USA – NSF, NEH, NIH
• Europe
• Denmark – in development
• Most place burden on researcher – some on the institution
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/policyframework.aspx
RCUK policy - The 1-minute version
• Research data are a public good – make openly available in timely & responsible way
• Have policies & plans. Data with long-term value should be preserved & usable
• Metadata for discovery & reuse. Link publications & data
• Sometimes law, ethics get in the way. We understand.
• Limited embargoes OK. Recognition is important – always cite data sources
• OK to use public money to do this. Do it efficiently.
DCC Policy Summary
http://www.dcc.ac.uk/resources/policy-and-legal
Research data centres are good value!
• See Jisc reports on ADS, BADC, UKDA:
• Returns on investment between 400% and 1200%
http://www.jisc.ac.uk/whatwedo/programmes/di_directions/strategicdirections/badc.aspx
Research Data Centres – the solution!
MANY AREAS OF RESEARCH HAVE NO DATA CENTRE TO SERVE THEM
INSTITUTIONAL SERVICES
DCC guidance
Roles and Responsibilities
What data to keep
Some example services
• Storage – persistent, shareable
• Permanent, citable identifiers
• Data Management Planning (DMPonline, DMPTool)
• Database as a service (e.g. Oxford ORDS)
• Embed tools in Excel – DataUp, others
• Workflow management – Taverna
Make data creation easier
Make data citable
• Making data available increases citations
• Everyone – academic, funder, institution – loves citations
• Want evidence?
– Alter, Pienta, Lyle – 240%, social sciences *
– Piwowar, Vision – 9% (microarray data) †
– Henneken, Accomazzi – 20% (astronomy) #
* Amy Pienta, George Alter, Jared Lyle (2010) The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. http://hdl.handle.net/2027.42/78307
† Piwowar H, Vision TJ (2013) Data reuse & the open data citation advantage. PeerJ PrePrints 1:e1v1. http://dx.doi.org/10.7287/peerj.preprints.1v1
# Edwin Henneken, Alberto Accomazzi (2011) Linking to Data - Effect on Citation Rates in Astronomy. http://arxiv.org/abs/1111.3618
http://dataintelligence.3tu.nl/en/home/
http://www.sheffield.ac.uk/is/research/projects/rdmrose
Choice of RDM training materials for librarians
Up-skilling for data
http://datalib.edina.ac.uk/mantra/libtraining.html
Make data discoverable
• Data must be discoverable to be reused
• Alone, or in conjunction with publication
• Institutional catalogues, national data registries – Jisc is piloting through DCC
• We are copying the Australian approach
Pimp your data –
make it findable & reusable
gking.harvard.edu/data
Commercial services
SWEDEN
DENMARK?
CANADA
My message to you
• Help researchers understand the benefits to them of sharing their data
• Help them discover & reuse data
• Give them tools that help the process
• Work to ensure they get credit for data citation
My message to researchers
• The credit belongs to you
• The data belongs to all of us
• Share, and we all reap the benefits