23 Research Data Things Research Data Coordinator Katina Toufexis

Page 1: 20160523 23 Research Data Things

23 Research Data Things

Research Data Coordinator

Katina Toufexis

Page 2: 20160523 23 Research Data Things

Data Sharing

• Research data may be shared in many ways.

• Getting Started looks at sharing data via access methods: Open, Shared and Closed Data

Thing 5

Page 3: 20160523 23 Research Data Things

Data Sharing

Thing 5

Open / Shared / Closed: The world of data

https://vimeo.com/125783029

Page 4: 20160523 23 Research Data Things

Data Sharing

Thing 5

1. What is 'open data'?
2. Who benefits from open data?
3. Overcoming barriers to opening data
4. Making data open
5. Open data in Research Data Australia

Page 6: 20160523 23 Research Data Things

Thing 5

What is 'open data'?

1. freely available to download in a reusable form. Large or complex data may be accessible via a service or facility that enables access in-situ or the compilation of sub-sets
2. licensed with minimal restrictions to reuse
3. well described with provenance and reuse information provided
4. available in convenient, modifiable and open formats
5. managed by the provider on an ongoing basis.

The Open Data Handbook provides an introduction to the legal, social and technical aspects of open data. It discusses what open data is as well as why and how to make data open.

Page 8: 20160523 23 Research Data Things

Data Sharing

Thing 5

1. What is 'open data'?
2. Who benefits from open data?
3. Overcoming barriers to opening data
4. Making data open
5. Open data in Research Data Australia

Page 9: 20160523 23 Research Data Things

Thing 5

Who benefits from open data?

Everyone! According to the Royal Society, open data supports:

• new research and new types of research
• the application of automated knowledge discovery tools online
• the verification of previous results
• a broader base set of data than any one researcher can hope to collect
• the exploration of topics not envisioned by the initial investigators
• the creation of new data sets, information and knowledge when data from multiple sources are combined
• the transfer of factual information to promote development and capacity building in developing countries
• interdisciplinary, inter-sectoral, inter-institutional and international research.

Page 10: 20160523 23 Research Data Things

Thing 5

Who benefits from open data?

The many ways open data benefits researchers, research organisations, funders, policy makers and the broader community:

Page 11: 20160523 23 Research Data Things

Data Sharing

Thing 5

1. What is 'open data'?
2. Who benefits from open data?
3. Overcoming barriers to opening data
4. Making data open
5. Open data in Research Data Australia

Page 12: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data

Someone might use my data to 'scoop' me

Page 13: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data

Someone might use my data to 'scoop' me

1. Timing? You may choose to restrict access to your data until a key paper is published. You decide the appropriate time for making your data open.

Page 14: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data:

Someone might use my data to 'scoop' me

2. What is the real risk of ‘scooping’?

Little risk, according to Professor Isaac Kohane, Harvard Medical School, quoted in Nature: "[we] need to convince people that the likelihood of being scooped if they put their data out there [is] not going to be high ... we need to do away with a culture of sitting on data until we have mined every useful scientific grain out of it".

In a similar vein, some researchers report that any possible loss of future potential papers is well offset by the more immediate rewards of data citations and collaborative opportunities.

Page 15: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data:

Someone might use my data to 'scoop' me

3. What is the real risk of ‘scooping’?

In fact, many researchers find that opening up their data has greatly benefited their research.

Report - Professor Tim Gowers, Royal Society Research Professor, University of Cambridge

• opened up his data to crowd-source an unsolved mathematical problem.
• 27 people made 800 substantive contributions to solve the problem in a matter of days.
• Professor Gowers commented that this approach to research was "like driving a car whilst normal research is like pushing it".

Page 16: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data:

My data are sensitive due to cultural, ethical, ecological or security considerations

There are circumstances where it may not be appropriate to make data open, e.g.

• where individuals may be identified;
• threatened species located; or
• information affecting national security revealed.

Page 17: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data:

My data are sensitive due to cultural, ethical, ecological or security considerations

However, there may be ways to make sensitive data at least partially open.

This comprehensive 26-page Publishing and Sharing Sensitive Data - ANDS Guide (PDF, 0.73 MB) outlines best practice for the publication and sharing of sensitive research data in the Australian context. It should be read in conjunction with the ANDS Introduction to Sensitive Data.
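
One way to picture "partially open" is a release step that drops direct identifiers and coarsens precise locations before publication. The Python sketch below is illustrative only: the field names (`observer_name`, `observer_email`, `latitude`, `longitude`) and the one-decimal rounding are assumptions made for this example, not recommendations taken from the ANDS guide. Real de-identification should follow that guide and any ethics conditions attached to the data.

```python
# Hypothetical record layout; these field names are assumptions for illustration.
DIRECT_IDENTIFIERS = {"observer_name", "observer_email"}


def make_partially_open(record, decimals=1):
    """Return a copy of `record` without direct identifiers and with
    coordinates rounded so that exact locations are not revealed."""
    released = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # drop fields that could identify an individual
        if field in ("latitude", "longitude"):
            value = round(value, decimals)  # coarsen the location
        released[field] = value
    return released


sighting = {
    "observer_name": "A. Researcher",
    "observer_email": "a.researcher@example.org",
    "species": "Example threatened species",
    "latitude": -31.98012,
    "longitude": 115.81790,
}
open_version = make_partially_open(sighting)
```

The original record is left untouched, so the full version can stay under restricted access while the coarsened copy is shared.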

Page 18: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data:

My data are sensitive due to cultural, ethical, ecological or security considerations

ANDS Publishing and Sharing Sensitive Data Decision Tree

http://www.ands.org.au/__data/assets/pdf_file/0010/385309/sensitive-decision-tree.pdf

Page 19: 20160523 23 Research Data Things

Thing 5

Page 20: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data:

I won't get any recognition or reward for making my data open

Tools such as the Thomson Reuters Data Citation Index enable citation metrics to be captured for data.

Page 21: 20160523 23 Research Data Things

Thing 5

Overcoming barriers to opening data:

There are contractual or commercial interests associated with my data

• Research data may underpin a commercialisation opportunity such as a patent.

• Or it may be that, contractually, IP arising from a project is owned by a third party.

• In other cases, though, data is not shared because of the uncertainty arising from data not being explicitly addressed in contracts and project plans.

Page 22: 20160523 23 Research Data Things

Data Sharing

Thing 5

1. What is 'open data'?
2. Who benefits from open data?
3. Overcoming barriers to opening data
4. Making data open
5. Open data in Research Data Australia

Page 23: 20160523 23 Research Data Things

Thing 5

Making Data Open

Open data is ... / Which ideally means ... / So preferably not ...

Freely available to download
Which ideally means:
a) There is no cost to access the data;
b) Access is via an internet accessible download;
c) Data is in a form that can be readily downloaded. Large or complex data is located close to high performance computing or specialised services that enable access to the data in situ or the compilation of sub-sets.
So preferably not:
a) Costed at more than reproduction cost;
b) Burned to a DVD and posted via 'snail mail';
c) Only available in huge packages that are difficult to reuse and/or take days to download.

Licensed
Which ideally means: An open license such as CC-BY is applied.
So preferably not: A restrictive license, or worse, no license at all. If no license is applied, no reuse is permitted.

Well described
Which ideally means: Standards based metadata is used with details of data elements and inclusion of data dictionaries. Describe the purpose of the collection, the characteristics of the sample and the method of data collection.
So preferably not: Metadata descriptions that are very brief or will not be widely understood. Avoid jargon and abbreviations and don't assume prior knowledge of the data or subject domain.

Provided in an open format
Which ideally means: The data is in a convenient, modifiable and open format that can be readily retrieved, downloaded, indexed and searched. Where possible, formats should be machine-readable and non-proprietary formats are preferred. For example, prefer netCDF over .xls.
So preferably not: Obscure formats or formats that require proprietary software to open and reuse.

Well managed
Which ideally means: The data is managed on an ongoing basis with a point of contact designated to assist with data use.
So preferably not: Data that is loaded on to a server and forgotten.
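
The five criteria in this "Making Data Open" comparison can be read as a checklist. As a rough sketch, the Python function below reports which criteria a dataset description misses. Everything about the record is hypothetical: the dictionary keys, the short lists of open licences and formats, and the example values are assumptions chosen for illustration, and a real assessment would need more judgement than a few field checks.

```python
# Assumed, non-exhaustive examples of open licences and open formats.
OPEN_LICENSES = {"CC-BY", "CC0"}
OPEN_FORMATS = {"csv", "netcdf", "json"}


def openness_gaps(dataset):
    """Return the names of the open-data criteria that `dataset` misses."""
    checks = {
        "freely available to download": dataset.get("cost", 0) == 0
        and bool(dataset.get("download_url")),
        "licensed": dataset.get("license") in OPEN_LICENSES,
        "well described": bool(dataset.get("description"))
        and bool(dataset.get("data_dictionary")),
        "open format": (dataset.get("format") or "").lower() in OPEN_FORMATS,
        "well managed": bool(dataset.get("contact")),
    }
    return [name for name, passed in checks.items() if not passed]


example = {
    "cost": 0,
    "download_url": "https://example.org/data.xls",
    "license": None,  # no licence applied, so no reuse is permitted
    "description": "Household survey, 2015, random sample of 500 homes",
    "data_dictionary": "https://example.org/dictionary.txt",
    "format": "xls",  # proprietary spreadsheet format
    "contact": "data-manager@example.org",
}
gaps = openness_gaps(example)
```

For this example record, `gaps` lists the licence and the format as the two things to fix before the data could be called open.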

Page 24: 20160523 23 Research Data Things

Data Sharing

Thing 5

1. What is 'open data'?
2. Who benefits from open data?
3. Overcoming barriers to opening data
4. Making data open
5. Open data in Research Data Australia

Page 25: 20160523 23 Research Data Things

Data Sharing

Thing 5

Open data in Research Data Australia

1. New Interface highlights the openness of data

2. Licenses can be applied

Page 26: 20160523 23 Research Data Things

Data Sharing

Wiley Survey

Thing 5

http://www.acscinf.org/PDF/Giffi-%20Researcher%20Data%20Insights%20--%20Infographic%20FINAL%20REVISED.pdf

Page 27: 20160523 23 Research Data Things

Long-lived data: curation & preservation

https://youtu.be/qEmmeFFafUs US Library of Congress (LoC)

Thing 6

Page 28: 20160523 23 Research Data Things

Long-lived data: curation & preservation

Thing 6

Page 29: 20160523 23 Research Data Things

Long-lived data: curation & preservation

Thing 6

What key advice would you give someone (e.g. a family historian, a researcher, yourself) about preserving their born-digital objects?

Hint: look for ideas on the Library of Congress Digital Preservation website.

Page 30: 20160523 23 Research Data Things

Long-lived data: curation & preservation

Thing 6

Page 31: 20160523 23 Research Data Things

Long-lived data: curation & preservation

Video - http://www.clir.org/initiatives-partnerships/data-curation

Sayeed Choudhury, Associate Dean for Research Data Management at Johns Hopkins University (long video… to summarise)

Talks about the Stack Model for Data Management

Thing 6

• Storage: disk, tape, cloud etc.

• Archiving: identifiers for sharing and references

• Preservation: policy, metadata, long-term reuse

• Curation: adding value to data for reuse

Page 32: 20160523 23 Research Data Things

Data citation for access & attribution

• Data citation continues the tradition of acknowledging other people’s work and ideas.

• Along with books, journals and other scholarly works, it is now possible to formally cite research datasets and even the software that was used to create or analyse the data.
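
A dataset citation carries the same elements as any other reference, plus a persistent identifier. The Python sketch below assembles one in the pattern "Creator (Year): Title. Publisher. Identifier", which follows the general shape DataCite recommends. Treat the exact punctuation as an assumption and check the style your repository or publisher asks for. The function name and all example values are placeholders, not a real dataset.

```python
def format_data_citation(creators, year, title, publisher, identifier):
    """Build a citation string: Creator (Year): Title. Publisher. Identifier"""
    names = "; ".join(creators)
    return f"{names} ({year}): {title}. {publisher}. {identifier}"


citation = format_data_citation(
    creators=["Surname, A.", "Surname, B."],
    year=2016,
    title="Example survey dataset",
    publisher="Example University",
    identifier="https://doi.org/10.1234/example",  # placeholder identifier
)
```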

Thing 7

Page 33: 20160523 23 Research Data Things

Data citation for access & attribution

Thing 7

Page 36: 20160523 23 Research Data Things

Thing 7

Data citation for access & attribution

Force11 Joint Declaration of Data Citation Principles

• a set of principles for citing data.
• based on the premise that data citation, like the citation of other evidence and sources, is good research practice and is part of the scholarly ecosystem supporting data reuse.

Since they were published 2 years ago, the Principles have been endorsed by numerous individuals and more than 100 data centres, publishers and societies.

Page 37: 20160523 23 Research Data Things

Thing 7

Data citation for access & attribution

Page 38: 20160523 23 Research Data Things

Thing 7

Data citation for access & attribution

Force11 is endorsed by…

https://www.force11.org/datacitation/endorsements

Page 39: 20160523 23 Research Data Things

Thing 7

Data citation for access & attribution

Given such support and clear direction, why do you think data citation has not been uniformly adopted, so far, across all disciplines?

Page 40: 20160523 23 Research Data Things

Citation Metrics for Data

Thing 8

What are Digital Object Identifiers (DOIs) and how do they support data citation and metrics for data and related research objects?

Page 41: 20160523 23 Research Data Things

Citation Metrics for Data

Thing 8

DOIs:

• are unique identifiers
• provide persistent access to published articles, datasets, software versions and a range of other research inputs and outputs.
• over 120 million DOIs are in use
• last year DOIs were “resolved” (clicked on) over 5 billion times!
• a typical DOI looks like this: http://doi.org/10.4225/08/50F62E0D359D5
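
A DOI has two parts separated by the first slash: a prefix beginning with "10." that identifies the registrant, and a suffix chosen by that registrant. As a small sketch (the helper name `split_doi` is made up for this example), the DOI shown on this slide can be taken apart and turned back into a resolvable link:

```python
def split_doi(doi_or_url):
    """Split a DOI (bare, or given as a doi.org URL) into (prefix, suffix)."""
    doi = doi_or_url.strip()
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(lead):
            doi = doi[len(lead):]
            break
    # Only the first slash separates prefix and suffix; the suffix may
    # contain further slashes of its own.
    prefix, _, suffix = doi.partition("/")
    if not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi_or_url!r}")
    return prefix, suffix


prefix, suffix = split_doi("http://doi.org/10.4225/08/50F62E0D359D5")
resolver_url = f"https://doi.org/{prefix}/{suffix}"
```

Here `prefix` is `10.4225` and `suffix` is `08/50F62E0D359D5`, which shows that everything after the first slash belongs to the suffix.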

Page 42: 20160523 23 Research Data Things

Citation Metrics for Data

Thing 8

Google “The compendium of crop Proteins with Annotated Locations (cropPAL) version 1”

Page 44: 20160523 23 Research Data Things

Citation Metrics for Data

Here’s a controversial question to discuss: Should DOIs be routinely applied to all research outputs? Remember that DOIs carry an expectation of persistence (maintenance costs etc.) but can be used to collect metrics as well as link articles and data (evidence of impact).

Thing 8

Page 45: 20160523 23 Research Data Things

Thing 8

Citation Metrics for Data

• Alternative metrics, or altmetrics, count the number of views, number of downloads, social media "likes" and recommendations associated with a dataset.

• Because of their immediacy, altmetrics can be an early indicator of the impact or reach of a dataset, long before formal citation metrics can be assessed.
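
At its simplest, an altmetrics tally is a count of events by type. The Python sketch below sums an invented event log for a single dataset. The event names and numbers are assumptions for illustration only, not figures from any altmetrics provider.

```python
from collections import Counter

# Invented event log for one dataset: (event type, count).
events = [
    ("view", 120),
    ("download", 35),
    ("like", 4),
    ("view", 80),
    ("recommendation", 2),
]


def tally_altmetrics(event_log):
    """Sum event counts by event type."""
    totals = Counter()
    for kind, count in event_log:
        totals[kind] += count
    return dict(totals)


summary = tally_altmetrics(events)
```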

Page 47: 20160523 23 Research Data Things

Thing 8

Citation Metrics for Data

  

Page 48: 20160523 23 Research Data Things

Thing 8

Citation Metrics for Data

Look also at the associated data in Dryad noting that the data has been assigned a DOI.  

Page 49: 20160523 23 Research Data Things

Thing 8

Citation Metrics for Data

By way of comparison, as of early April 2016:

• the same dataset had been cited once in Thomson Reuters Data Citation Index
• the article had been cited 143 times in Web of Science

Share your thoughts

Do you think altmetrics for data have value in academic settings? Why?