Support for research data management Esther Dzalé Yeumo 21 September 2015

Support for Research Data Management, by Esther Dzalé Yeumo Kaboré

.02 E. Dzalé Yeumo/ IG Agriculture meeting 21 September 2015

Agenda

Introduction

What data do we produce at INRA

Current data management and sharing practices and pain points: the point of view of some researchers

The service offer under construction

Introduction: some key dates

2009: political awareness on the part of the Inra CEO

2012: Report* of the Inra scientific council on "data management and sharing": 9 recommendations, the first being to define an Inra policy

2013: Inra data sharing policy: 11 data management and sharing principles

Implementation:

- Domain-specific working groups: inventory, requirements
- Trans-disciplinary working groups: state of the art (legal/IP, technical, social issues), proposals

*Gaspin, C., Pontier, D., Colinet, L., Dardel, F., Franc, A., Hologne, O., Le Gall, O., Maurin, N., Perrière, G., Pichot, C., Rodolphe, F. (2012). Rapport du groupe de travail sur la gestion et le partage des données. http://prodinra.inra.fr/record/206746

What data?

Current data sharing practices: the point of view of some scientists

E. Dzalé Yeumo, O. Hologne, 16 July 2015

Genetic & genomic

- In general, data is released once the producers have published a paper
- Many data sharing platforms exist at the national and international level
- Many metadata and data format standards exist
- Lots of data is produced in collaboration with public and private partners

Experimentation, observation and simulation

- Raw data may be of as much interest as processed data
- Data sharing rules depend on the nature, granularity, and origin of the data
- Metadata is of paramount importance for the reusability of the released data
- Lots of data is unique and can be captured only once

Social sciences

- A few data sharing platforms exist at the national and international level
- Lots of data are bought and can't be freely released
- Experimental economics data can be freely released most of the time
- Data documentation and statistical disclosure control are of paramount importance
- The longitudinal aspect (range or historical aspect) of the data increases the need for long-term preservation

Practices

Current data management pain points: the point of view of some scientists

Genetic & genomic

- Data is increasingly massive, which raises the question of an economic model for long-term hosting
- Exchange, transfer and storage of large datasets
- Lack of human resources
- Gaps in semantic coverage
- Many existing public data repositories are congested, with long waiting times for the deposit of datasets
- Risk of conferring an economic advantage on our competitors or on our partners' competitors

Experimentation, observation and simulation

- Data standardization
- Exchange, transfer and storage of large datasets
- Capturing metadata automatically
- Gaps in semantic coverage
- Metadata may be strategic (e.g. protocols, methods)
- Some metadata or data are sensitive (geographic information about epidemiological data or GMO data)

Social sciences

- Data archiving: sustainability of the existing platforms
- Statistical disclosure control
- Lots of personal and sensitive data (social and economic survey data), with a risk of re-identification
- Legal and intellectual property issues are of paramount importance (e.g. data inferred from existing textual data through decision tools, data purchased from third parties)

Pain points

What service offer to support data management and sharing?

The scientists' wish list

Genetic & genomic

- Provide templates for consortium agreements and identify special cases (biological material, long tradition of partnership, etc.)
- A data portal with access to all the data shared by INRA
- DOIs
- Metadata training
- Volunteering for species-related data repositories at an international level

Experimentation, observation and simulation

- Recognition of all the contributors
- Recognition of data sharing as a first-class skill at the institutional level
- Data papers training
- A data portal with access to all the data shared by INRA
- Harmonization of metadata standards and vocabularies

Social sciences

- Guidance with regard to publishing derived data: when, how?
- Web scraping of data of interest that may be removed from the original sources
- DOIs
- A secure data platform that allows reviewers to access data and reproduce findings while respecting legal and IP requirements
- Thematic active data management platforms

Legal, IP, social

Techniques, methods, tools

Awareness, guidance, training

A digital repository

- Based on the IT environment provided by the IT department:
  - Two data centers
  - The NetApp Storage Virtual Machine technology
- Outsource the long-term preservation
- Leverage the many existing data repositories
- Upgrade the existing repositories to trusted repositories, in accordance with the Data Seal of Approval and the Defra assessment grids: http://www.datasealofapproval.org/en/assessment/ and https://defradigital.blog.gov.uk/2015/02/09/are-you-a-mature-open-data-publisher/
- Conform to the OAIS reference model
- Cover both active and historical data

Additional services

- DOI minting service
- Vocabularies publishing service

Training

- Technical school in data papers
- Technical school in Linked Data and how to publish data according to Linked Data principles

Thank you!

Appendix: 11 data sharing and management principles
