
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)


ISERD – Tel Aviv, Oct 31, 2016
@openaire_eu

Natalia Manola, OpenAIRE managing director
Athena Research & Innovation Centre

Credits
• The OpenAIRE team
• Sara Jones, Digital Curation Centre (DCC), UK
• Marjan Grootveld, DANS, NL


Outline
• OpenAIRE – who are we?
• H2020 policies – what’s involved?
• Research data management – what about?
• Data Management Plan (DMP) – how to?
• Lessons learnt – what to avoid?
• OpenAIRE – what do we offer?

RDM Seminar @ ISERD, Tel Aviv - Oct 31, 2016


Who we are
• EU project
• In 24x7 operation since Dec 2010
  – OpenAIRE
  – OpenAIREplus
  – OpenAIRE2020
• Consortium of 50 partners
• One of 5 key EU e-Infrastructures
• A legal entity in 2017

Open Access experts
• Institutional, national and international perspectives on OA policies & e-Infrastructures

Information & Computer Science experts
• Building efficient e-Infra technologies
• State-of-the-art technologies (big data, linked data)

Legal experts
• Legal & policy recommendations

Data communities
• Best practices for data
• Linking to data infrastructures


Human Network

50 partners from every EU country, and beyond
Data centers, universities, libraries, repositories, legal experts

Digital Network

… fosters the social and technical links that enable Open Science in Europe and beyond

National Open Access Desks (NOADs)
33 OA expert nodes in all Europe
• (OA) Policy aligning
• Technical assistance
• Training

Integrated Scientific Information System

• 17.2 million unique publications
• 760 validated data providers
• 370K publications linked to projects from 6 funders
• 28K datasets linked to publications or funders
• 3.5K links to software


H2020 Policies


The following European Commission branded slides come from the EC’s open access team and provide an overview of the key points. Content from Jean-Francois Dechamp and colleagues.

Mail: [email protected]
Web: http://ec.europa.eu/research/openscience/index.cfm
Twitter: @OpenAccessEC


H2020 Open Access policies
• Publications
  – Openly accessible and minable. Eligible costs for APCs.
• Research data
  – Openly accessible research data can typically be accessed, mined, exploited, reproduced and disseminated free of charge for the user.


Publications


Three top reasons to opt out
Whether a (proposed) project participates in the ORD or chooses to opt out does not affect the evaluation of that project. Proposals will not be penalised for opting out.


The EC Open Research Data pilot
Key sources of information
• Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020
  http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
• Guidelines on Data Management in Horizon 2020
  http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• Annotated model grant agreement, clause 29.3
  http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/amga/h2020-amga_en.pdf
• New infographic summarising key policy points
  http://ec.europa.eu/research/press/2016/pdf/opendata-infographic_072016.pdf


Good research data management practices


A FAIR approach to data
• Findable
  – Assign persistent IDs, provide metadata, register in a searchable resource...
• Accessible
  – Retrievable by their ID using a standard protocol, metadata remain accessible even if data aren’t...
• Interoperable
  – Use formal, broadly applicable languages, use standard vocabularies, qualified references...
• Reusable
  – Rich metadata, clear licences, provenance, use of community standards...

www.force11.org/group/fairgroup/fairprinciples

Findable
• Use metadata and specify standards for metadata creation (if any). If there are no standards in your discipline, describe what type of metadata will be created and how
• Search keywords
• Persistent and unique identifiers such as DOIs or other handles
• File and folder naming conventions
• Versioning of the datasets and clear version numbers
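A naming convention with clear version numbers can be enforced with a small helper. The sketch below is only one possible convention, not one prescribed by the slides: the pattern project_description_date_vMAJOR.MINOR.ext, the function name dataset_filename and the example values are all assumptions made for illustration.

```python
import re
from datetime import date
from typing import Tuple


def dataset_filename(project: str, description: str, version: Tuple[int, int],
                     created: date, extension: str) -> str:
    """Build a file name of the form project_description_YYYY-MM-DD_vX.Y.ext.

    The pattern itself is a hypothetical convention chosen for this sketch.
    """
    def slug(text: str) -> str:
        # Lowercase and collapse every run of non-alphanumerics into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    major, minor = version
    return (f"{slug(project)}_{slug(description)}_"
            f"{created.isoformat()}_v{major}.{minor}.{extension.lstrip('.')}")


# Hypothetical example values.
example_name = dataset_filename("MyProject", "Soil samples (raw)", (1, 0),
                                date(2016, 10, 31), ".csv")
```

Keeping the date in ISO form and the version at the end means that files sort chronologically and that a new version never overwrites an old one.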


Metadata and documentation
• Metadata and documentation are needed to find and understand research data
• Think about what others would need in order to find, evaluate, understand, and reuse your data
• Get others to check the metadata to improve quality
• Use standards to enable interoperability


Where to find metadata standards

Metadata Standards Directory
Broad, disciplinary listing of standards and tools
Maintained by RDA group
http://rd-alliance.github.io/metadata-directory

Biosharing
A portal of data standards, databases, and policies for life, environmental and biomedical sciences
https://biosharing.org


Accessible
• Explain which data can’t be shared openly, if any
• Specify how access will be provided in case of restrictions, e.g., through a data committee, a license, or arranged with the repository

• Will methods or software tools needed to access the data (if any) be included or documented?

• Deposit the data and associated metadata, documentation and code preferably in certified repositories which support Open Access


Where to find a repository?
More information: https://www.openaire.eu/opendatapilot-repository

What to deposit?
a. the data needed to validate the results presented in scientific publications, including the metadata;
b. any other data, including the metadata, as specified in the DMP;
c. plus for a-b the documentation and the tools that are needed to validate the results, e.g. specialised software or software code, algorithms and analysis protocols (when possible, these instruments themselves).

Shop around

http://databib.org

http://service.re3data.org

Re3data is a registry of data repositories

Searching with Re3data.org
www.fosteropenscience.eu/content/re3data-demo

How to select a repository?
• Look into your community, university, publisher, funder etc.
• Check they match your particular data needs: e.g. formats accepted; mixture of Open and Restricted Access
• See if they provide guidance on how to cite the deposited data
• Do they assign a persistent & globally unique identifier for sustainable citations and to link back to particular researchers and grants?
• Look for certification as a ‘Trustworthy Digital Repository’ with an explicit ambition to keep the data available in the long term.

www.openaire.eu/opendatapilot-repository


Zenodo
Multi-disciplinary repository used for the long tail of research data
• An OpenAIRE-CERN joint effort
• Multidisciplinary repository accepting
  – Multiple data types
  – Publications
  – Software – link to GitHub
• Assigns a Digital Object Identifier (DOI), up to 50GB per dataset

• Links funding, publications, data & software

www.zenodo.org


Choose appropriate file formats
If you want your data to be re-used and sustainable in the long-term, you typically want to opt for open, non-proprietary formats.

Type            | Recommended                                          | Avoid for data sharing
Tabular data    | CSV, TSV, SPSS portable                              | Excel
Text            | Plain text, HTML, RTF; PDF/A only if layout matters  | Word
Media           | Container: MP4, Ogg; Codec: Theora, Dirac, FLAC      | Quicktime, H264
Images          | TIFF, JPEG2000, PNG                                  | GIF, JPG
Structured data | XML, RDF                                             | RDBMS

Further examples: www.data-archive.ac.uk/create-manage/format/formats-table
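The table above can be turned into a quick pre-deposit check. The sketch below is an illustration only: the translation of format names into file extensions (for example .por for SPSS portable, .jp2 for JPEG2000) is an assumption, and an extension cannot reveal the codec inside a media container, so the result is a hint rather than a verdict. PDF is deliberately left unclassified because the table recommends PDF/A only when layout matters.

```python
from pathlib import Path

# Extensions assumed to correspond to the "Recommended" column of the table.
RECOMMENDED = {".csv", ".tsv", ".por", ".txt", ".html", ".rtf",
               ".mp4", ".ogg", ".flac", ".tif", ".tiff", ".jp2", ".png",
               ".xml", ".rdf"}
# Extensions assumed to correspond to the "Avoid for data sharing" column.
AVOID = {".xls", ".xlsx", ".doc", ".docx", ".mov", ".gif", ".jpg", ".jpeg"}


def classify(filename: str) -> str:
    """Return 'recommended', 'avoid' or 'unknown' based on the file extension."""
    ext = Path(filename).suffix.lower()
    if ext in RECOMMENDED:
        return "recommended"
    if ext in AVOID:
        return "avoid"
    return "unknown"


def files_to_review(filenames):
    """List the files that are not in a recommended format."""
    return [name for name in filenames if classify(name) != "recommended"]
```

Running files_to_review over the contents of a deposit folder gives a short list of files to convert or to justify in the DMP.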


File format considerations
• No clear-cut definitions of “sustainable file format”
• Each archive has its own expertise, relative to its designated community

Examples:

Type      | 4TU.ResearchData: Level 1 | 4TU.ResearchData: Level 2 or 3 | DANS: Preferred          | DANS: Accepted
audio     | .wav                      | .ra, .mp3, .wma                | .wav, .flac              | .aiff, .mp3, .aac
chemistry | NMR, ChemDoodle, …        | .pdb, .xyz                     |                          |
databases | delimited flat file w/DDL | .mdb, .dbf, .acdb              | .sql, .siard, .csv       | .mdb, .dbf, .hdf5 …
video     |                           | .mp1, .mp2, .mp4, .mov …       | .mpg2, .mpg4, .avi, .mov | .mkv

Interoperable
• Interoperability on data and metadata, on data exchange formats and protocols
• Specify what data and metadata vocabularies, standards or methodologies you will follow to facilitate interoperability
• Standard vocabulary to allow inter-disciplinary interoperability or a mapping from your vocabulary to more commonly used ontologies?

Aim for compliance with globally accepted practices


Reusable
• Clarify licences early on
• License the data to permit the widest reuse possible
• Specify a data embargo, if needed
• If data re-use by third parties is restricted, explain why
• How long will the data remain reusable?
• Describe data quality assurance processes


www.dcc.ac.uk/resources/how-guides/license-research-data

License research data openly
DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement your licence

CREATIVE COMMONS LIMITATIONS
• NC (Non-Commercial): What counts as commercial?
• ND (No Derivatives): Severely restricts use
These clauses are not open licenses

Horizon 2020 Open Access guidelines point to: [two licence logos, images not preserved]


What should be preserved and shared?
• The data needed to validate results in scientific publications (minimally!).
• The associated metadata: the dataset’s creator, title, year of publication, repository, identifier etc.
• Follow a metadata standard in your line of work, or a generic standard, e.g. OpenAIRE or DataCite, and be FAIR.
• Documentation: code books, lab journals, informed consent forms – domain-dependent, and important for understanding the data and combining them with other data sources.
• Software, hardware, tools, syntax queries, machine configurations – domain-dependent, and important for using the data. (Alternative: information about the software etc.)

Basically, everything that is needed to replicate a study should be available. Plus everything that is potentially useful for others.
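The minimal metadata listed on this slide (creator, title, year of publication, repository, identifier) can be checked before deposit. The sketch below is a hedged illustration: the field names are assumptions chosen to mirror the slide, not the element names of the OpenAIRE or DataCite schemas, and the example record is entirely hypothetical.

```python
from typing import Dict, List

# Field names are assumptions that mirror the bullet list on the slide.
REQUIRED_FIELDS = ("creator", "title", "publication_year", "repository", "identifier")


def missing_fields(record: Dict[str, object]) -> List[str]:
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical record: the identifier has not been assigned yet.
draft_record = {
    "creator": "Example, Researcher",
    "title": "Example survey dataset",
    "publication_year": 2016,
    "repository": "Zenodo",
    "identifier": "",
}
draft_gaps = missing_fields(draft_record)
```

A check of this kind only confirms that the fields are filled in; whether the values are meaningful still needs the human review recommended earlier.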


Rules of thumb for how long

RDNL Selection criteria: http://www.researchdata.nl/en/services/data-management/selecting-research-data/
DCC How-to guide: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data

• When regenerating data is cheaper than archiving, don’t archive. Select what data you’ll need and want to retain.
• 10 years is often stated in data policies and academic codes, but data can be valuable for much longer, in climatology, sociology, health sciences, astronomy, linguistics, … Look beyond minimal retention periods where relevant.
• “The lifetime of software is generally not as long as that of data” (Daniel Katz et al. http://bit.ly/2eScCKp)


How much does it cost? Who pays?
• What are the costs for making data FAIR in your project?
• Resources needed for long term preservation
• Check the UK Data Service Costing model
• The High Level Expert Group on the European Open Science Cloud recommends that “well budgeted data stewardship plans should be made mandatory and we expect that on average about 5% of research expenditure should be spent on properly managing and stewarding data”
• Who pays? How?

UKDS model http://www.data-archive.ac.uk/create-manage/planning-for-sharing/costing
HLEG report http://ec.europa.eu/research/openscience/pdf/realising_the_european_open_science_cloud_2016.pdf#view=fit&pagemode=none
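As a rough worked example of the 5% figure quoted above, the sketch below computes an indicative data stewardship budget. The function name and the example project size are assumptions; the 5% share is the average expectation stated by the High Level Expert Group, not a fixed rule, so the share is left as a parameter.

```python
def stewardship_budget(research_expenditure: float, share: float = 0.05) -> float:
    """Indicative amount to reserve for managing and stewarding data.

    share defaults to the "about 5%" average quoted from the HLEG report.
    """
    if research_expenditure < 0:
        raise ValueError("research_expenditure must not be negative")
    if not 0 <= share <= 1:
        raise ValueError("share must be between 0 and 1")
    return research_expenditure * share


# Hypothetical project with 2 million EUR of research expenditure:
# roughly 100,000 EUR would be earmarked for data stewardship.
example_budget = stewardship_budget(2_000_000)
```

The result is a planning figure to discuss at proposal stage, since costs cannot be added retrospectively.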


RDM in a nutshell
• Develop a DMP
• Select which data to make open
• License data openly for the widest reuse
• Use established community standards for interoperability
• Provide metadata for data discovery and reuse
• Deposit in a data repository
• Share details about the tools and instruments used to allow verification


Managing and sharing data – A best practice guide

http://data-archive.ac.uk/media/2894/managingsharing.pdf

Data Management Plan Tools


What is a data management plan?
A brief plan written at the start of a project to define:
• how the data will be created
• how it will be documented
• who will access it
• where it will be stored
• who will back it up
• whether (and how) it will be shared & preserved

DMPs are often submitted as part of grant applications, but are useful whenever researchers are creating data

The DMP is a living document. You are not required to provide detailed answers to all the questions in the first version of the DMP (due M6)

Explain any selection criteria in the DMP

DMPonline
A web-based tool to help researchers write DMPs
Includes a template for Horizon 2020 – FAIR principles

https://dmponline.dcc.ac.uk

How the tool works
• Click to write a generic DMP
• Or choose your funder to get their specific template
• Pick your uni to add local guidance and to get their template if no funder applies
• Choose any additional optional guidance


DMPonline for H2020

Focus on how you will ensure your data are “FAIR”

KEEP IT UP TO DATE

Example DMP plans


Example plans
• 108 DMPs from the National Endowment for the Humanities
  www.neh.gov/divisions/odh/grant-news/data-management-plans-successful-grant-applications-2011-2014-now-available
• 20+ scientific DMPs submitted to the NSF (USA) provided by UCSD
  http://libraries.ucsd.edu/services/data-curation/data-management/dmp-samples.html
• Example DMP collection from Leeds University
  https://library.leeds.ac.uk/research-data-tools
• Further examples:
  www.dcc.ac.uk/resources/data-management-plans/guidance-examples

Example H2020 DMPs in Zenodo
• Helix Nebula – High Energy Physics example
  https://zenodo.org/record/48171#.WATexnriF40
• Tweether – engineering (micro-electronics) example
  https://zenodo.org/record/55791#.WATei3riF40
• AutoPost – ICT example
  https://zenodo.org/record/56107#.WATefXriF40


Different DMP for different needs
• No strict rules – every project has different requirements
• Approach the DMP in whatever way best fits your project
• EC template is intended as a service, not an obligation
• Read the background information and the guidance, and use it as a checklist.


Example: OpenMinTeD
OpenMinTeD aims to create an infrastructure for Text and Data Mining (TDM) of scientific and scholarly content

The project has adopted its own structure to create a ‘Data and Software Management Plan’

http://openminted.eu

Example: OpenMinTeD – Data chapter
Six types of datasets identified
1. Scholarly publications
2. Language and knowledge resources
3. Services and workflows
4. Automatically and manually generated annotations
5. Consortium publications
6. Metadata

Described in a table per dataset

E-Infrastructure: data in transit


Example OpenMinTeD – Software

Lots of software components

Background vs. foreground


Example: CAPSELLA
CAPSELLA aims to develop ICT solutions for farmers and other actors engaged in agrobiodiversity

Devised a questionnaire to collate dataset information from project partners

Identified 13 datasets, 6 of which are imported as is, 3 aggregated, 3 transformed and 1 generated

Focus on data produced

www.capsella.eu

Example dataset record


Example: Autopost


Example: TWEETHER

Travelling wave tube based W-band wireless networks with high data rate distribution, spectrum & energy efficiency

The academic institutions participating in TWEETHER have available appropriate depositories which in fact are linked to OpenAIRE.

Apart from these repositories, TWEETHER will also use ZENODO to ensure the maximum dissemination of the information generated in the project (research publications and data)…

Example metadata description


Minimal metadata

More than one dataset? Describe generically what is possible and dataset-specific what is necessary.


Data description examples

The final dataset will include self-reported demographic and behavioural data from interviews with the subjects and laboratory data from urine specimens provided.

NIH data sharing statements

Every two days, we will subsample E. affinis populations growing under our treatment conditions. We will use a microscope to identify the life stage and sex of the subsampled individuals. We will document the information first in a laboratory notebook and then copy the data into an Excel spreadsheet. The Excel spreadsheet will be saved as a comma separated value (.csv) file.

DataOne – E. affinis DMP example

Metadata examples

Metadata will be tagged in XML using the Data Documentation Initiative (DDI) format. The codebook will contain information on study design, sampling methodology, fieldwork, variable-level detail, and all information necessary for a secondary analyst to use the data accurately and effectively.

ICPSR Framework for Creating a DMP

We will first document our metadata by taking careful notes in the laboratory notebook that refer to specific data files and describe all columns, units, abbreviations, and missing value identifiers. These notes will be transcribed into a .txt document that will be stored with the data file. After all of the data are collected, we will then use EML (Ecological Metadata Language) to digitize our metadata. EML is one of the accepted formats used in ecology, and works well for the types of data we will be producing. We will create these metadata using Morpho software, available through KNB. The metadata will fully describe the data files and the context of the measurements.

DataOne – E. affinis DMP example
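The practice described in the DataOne example, a plain-text description of columns, units and missing value identifiers stored next to the data file, can be sketched as follows. The file naming (a _codebook.txt suffix), the function name and the example column are assumptions for illustration; this does not produce DDI or EML, which need their own tools.

```python
from pathlib import Path
from typing import Dict


def write_codebook(data_file: Path, columns: Dict[str, Dict[str, str]],
                   missing_value: str) -> Path:
    """Write '<data file stem>_codebook.txt' in the same folder as the data file.

    columns maps each column name to a dict with a 'description' and an
    optional 'unit' (a hypothetical structure chosen for this sketch).
    """
    lines = [
        f"Codebook for {data_file.name}",
        f"Missing value identifier: {missing_value}",
        "",
    ]
    for name, info in columns.items():
        unit = info.get("unit", "none")
        lines.append(f"{name}: {info['description']} [unit: {unit}]")
    out_path = data_file.with_name(data_file.stem + "_codebook.txt")
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out_path
```

For example, write_codebook(Path("counts.csv"), {"stage": {"description": "life stage"}}, "NA") creates counts_codebook.txt beside the data file, so the description travels with the data when the folder is deposited.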


Data sharing examples

We will make the data and associated documentation available to users under a data-sharing agreement that provides for: (1) a commitment to using the data only for research purposes and not to identify any individual participant; (2) a commitment to securing the data using appropriate computer technology; and (3) a commitment to destroying or returning the data after analyses are completed. 

NIH data sharing statements

The videos will be made available via the bristol.ac.uk website (both as streaming media and downloads) HD and SD versions will be provided to accommodate those with lower bandwidth. Videos will also be made available via Vimeo, a platform that is already well used by research students at Bristol. Appropriate metadata will also be provided to the existing Vimeo standard.

All video will also be available for download and re-editing by third parties. To facilitate this Creative Commons licenses will be assigned to each item. In order to ensure this usage is possible, the required permissions will be gathered from participants (using a suitable release form) before recording commences.

University of Bristol Kitchen Cosmology DMP


Restriction on use examples

Because the STDs being studied are reportable diseases, we will be collecting identifying information. Even though the final dataset will be stripped of identifiers prior to release for sharing, we believe that there remains the possibility of deductive disclosure of subjects with unusual characteristics. Thus, we will make the data and associated documentation available to users only under a data-sharing agreement.

NIH data sharing statements

1. Share data privately within 1 year. Data will be held in Private Repository, but metadata will be public
2. Release data to public within 2 years. Encouraged after one year to release data for public access.
3. Request, in writing, data privacy up to 4 years. Extensions beyond 3 years will only be granted for compelling cases.
4. Consult with creators of private CZO datasets prior to use. PIs required to seek consent before using private data they can access

Boulder Creek Critical Zone Observatory DMP
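The staged release rules in the Boulder Creek example translate directly into dates. The sketch below assumes the clock starts on the date the data were collected, which the excerpt does not state, and the function and key names are hypothetical.

```python
from datetime import date
from typing import Dict


def add_years(start: date, years: int) -> date:
    """Add whole years to a date, mapping 29 February to 28 February if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start is 29 February and the target year is not a leap year.
        return start.replace(year=start.year + years, day=28)


def release_deadlines(collected: date) -> Dict[str, date]:
    """Deadlines implied by the 1-year, 2-year and 4-year rules quoted above."""
    return {
        "share_privately_by": add_years(collected, 1),
        "release_publicly_by": add_years(collected, 2),
        "latest_with_written_extension": add_years(collected, 4),
    }
```

Recording these dates in the DMP when a dataset is created makes it easier to keep the plan up to date and to see in advance when an extension request would be needed.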


Archiving examples

Data will be provided in file formats considered appropriate for long-term access, as recommended by the UK Data Service. For example, SPSS Portable format and tab-delimited text for qualitative tabular data and RTF and PDF/A for interview transcripts. Appropriate documentation necessary to understand the data will also be provided. Anonymised data will be held for a minimum of 10 years following project completion, in compliance with LSHTM’s Records Retention and Disposal Schedule. Biological samples (output 3) will be deposited with the UK BioBank for future use.

Writing a Wellcome Trust Data Management and Sharing Plan

The investigators will work with staff at the UKDA to determine what to archive and how long the deposited data should be retained. Future long-term use of the data will be ensured by placing a copy of the data into the repository.

ICPSR Framework for Creating a DMP


Lessons learnt and a few planning tricks


Start early
• Negotiation on licenses and consent agreement may preclude later sharing if not careful
• Costs cannot be included retrospectively
• Useful to consider data issues at the consortium negotiation stage to make sure potential issues are identified and sorted asap
• Involve all work packages and partners to get a coherent plan.
• Focus effort on datasets you’ll create rather than reuse

Decisions made early on affect what you can do later


Think backwards
What data organisation would a re-user like?

[Figure: data lifecycle: creating data, processing data, analysing data, preserving data, giving access to data, re-using data]

Think about the desired end result and plan for this

“Sharing” means “outside the consortium”

Find who is responsible for what
• Institution: RDM policy
• Facilities
• €$£
• Research funders
• Publishers: Data Availability policy
• Commercial partners

https://www.openaire.eu/briefpaper-rdm-infonoads

Note: It’s more than open data

CC-BY Andreas Neuhold https://commons.wikimedia.org/wiki/File:Open_Science_-_Prinzipien.png

• Access to research facilities

• Access to processing capabilities

• Communication at all levels of research life cycle


Limitations on current DMPs
• DMP evaluation is project based
• Too much text – Text and data mining to get aggregate knowledge
• Machine readability
• Linking to infrastructures
• Semi-automated evaluation and comparison
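To illustrate what machine readability and semi-automated comparison could look like, the sketch below models a DMP as structured data instead of free text. The schema (DatasetPlan, DataManagementPlan and their fields) is a hypothetical one invented for this example, not an existing DMP standard or the DMPonline export format, and the project and dataset values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DatasetPlan:
    """One dataset entry in a hypothetical machine-readable DMP."""
    title: str
    file_format: str
    licence: str
    repository: str
    open_access: bool = True
    embargo_months: int = 0


@dataclass
class DataManagementPlan:
    project: str
    version: str
    datasets: List[DatasetPlan] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialise the whole plan so that other tools can parse it.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def open_share(plans: List[DataManagementPlan]) -> float:
    """Fraction of planned datasets marked open across a set of DMPs."""
    datasets = [d for plan in plans for d in plan.datasets]
    if not datasets:
        return 0.0
    return sum(d.open_access for d in datasets) / len(datasets)


# Placeholder plan with one open and one restricted dataset.
plan = DataManagementPlan(
    project="EXAMPLE-PROJECT",
    version="1.0",
    datasets=[
        DatasetPlan("Survey responses", "CSV", "CC-BY-4.0", "Zenodo"),
        DatasetPlan("Interview audio", "WAV", "restricted", "institutional",
                    open_access=False, embargo_months=12),
    ],
)
```

With plans in this form, a funder or institution could aggregate simple indicators such as open_share across many projects without mining free text.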


How can OpenAIRE help?


OpenAIRE support materials
Briefing papers, factsheets, webinars, workshops, FAQs

Information on
• Open Research Data Pilot
• Creating a data management plan
• Selecting a data repository
• Personal data

Developing guidance to add to DMPonline

https://www.openaire.eu/opendatapilot
https://www.openaire.eu/support


OpenAIRE services
• Researchers
  – Zenodo for all types of publications, data and software
  – Claiming – linking research results
  – Amnesia, an anonymization tool for all
• Data providers – Interoperability Guidelines, validation,…
• Project coordinators – reporting
• Funders and institutions – monitoring
• Research communities – gathering, monitoring all research

DASHBOARDS

[Figure: OpenAIRE Platform diagram. Labels recovered from the image: Data Providers (Literature Repositories, OA Journals, Data Repositories, CRIS systems, Aggregators, Archives, Funding Info); Guidelines; Metadata, Full text, Usage data; Validation, Cleaning, De-duplicating, Inferring, Linking, Enriching, Classification, Clustering, Analysis; Organizations, Projects, Authors, Datasets, Publications, Data Providers; Services (Discovery, Crowdsourcing, APIs, Routing, Monitoring, Reporting, Evaluation, Impact, Trends); An EU-CRIS system]

Integrated Scientific Information System

• 17.3 million unique publications
• 760+ validated data providers
• 370K publications linked to projects from 6 funders
• 28K datasets linked to publications
• 3.5K links to software repositories
• 33K organizations

[Figure: linked entities: Organizations, Projects, Authors, Datasets, Publications, Data Providers, Software, Facilities, Methods, Research Communities]

OpenAIRE-Connect
From January 2017

World-wide alignment & synergies
Interoperability alignment, sharing technologies & services
• La Referencia: Latin America repository network
• JAIRO – Japanese Institutional Repositories Online
• REMERI – Mexican Network of Institutional Repositories
• …
• ICSU/World Data System – A network of 70 certified data centers


Monitoring & reporting
• Funders
• Institutions

+ NSF, NIH (US), NWO (NL), MSES, CSF (HR), SFI (IE), DFG (DE), …

Towards an EU-CRIS system

Project level – Automatically export results

Project’s web site

EC front-end system

…to the EC's participant portal


EC back-end reporting


www.openaire.eu
@openaire_eu
facebook.com/groups/openaire
linkedin.com/groups/OpenAIRE-3893548

Thank you!

[email protected]


Questions?
