Documentation & metadata Research Data Repository (RDR) FAIR in practice 23/07/2022 Marleen Marynissen


Documentation & metadata


What?

• Documentation
    • Contextual and descriptive features of data
    • “Everything” someone needs to understand the data and to be able to use it

• Metadata
    • Structured data documentation
    • Data provided through specific attributes: metadata elements / fields
    • Machine-readable
    • Standards


Considerations

• Study- and data-level documentation

• As early as possible

• Understandable in your field

• Kept close to the data: embedded / in the same folder

• Logical folder structure, meaningful file names


Examples


STORM Time Empirical Ionospheric Corrections data


Metadata that explain the variable labels and provenance (source)

Data

Metadata to understand data

Dataset Codebook
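
As a minimal sketch of what a codebook adds to a dataset, the example below pairs a small table with a dictionary that explains each variable label and its provenance. The column names, units, and source descriptions are hypothetical and purely illustrative; they are not taken from any dataset shown on these slides.

```python
import csv
import io

# Hypothetical data: the column names alone do not explain themselves.
DATA_CSV = "site,temp_c,obs_date\nA,12.5,2022-07-01\nB,14.0,2022-07-01\n"

# Hypothetical codebook: one entry per variable with label, unit and source.
CODEBOOK = {
    "site": {"label": "Measurement site code", "unit": None,
             "source": "assigned by the project"},
    "temp_c": {"label": "Air temperature", "unit": "degrees Celsius",
               "source": "hypothetical field sensor"},
    "obs_date": {"label": "Observation date (ISO 8601)", "unit": None,
                 "source": "recorded by the observer"},
}


def undocumented_variables(data_csv, codebook):
    """Return the column names in the data that have no codebook entry."""
    reader = csv.DictReader(io.StringIO(data_csv))
    return [name for name in reader.fieldnames if name not in codebook]


def describe(name, codebook):
    """Build a one-line, human-readable description of a variable."""
    entry = codebook[name]
    unit = f" [{entry['unit']}]" if entry["unit"] else ""
    return f"{name}: {entry['label']}{unit} (source: {entry['source']})"


if __name__ == "__main__":
    print(undocumented_variables(DATA_CSV, CODEBOOK))
    for column in CODEBOOK:
        print(describe(column, CODEBOOK))
```

Keeping such a codebook in the same folder as the data file follows the "kept close to the data" consideration mentioned earlier.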


In practice


Tools and software during your research

• Data documentation: e.g. Electronic Lab Notebooks, OSF

• Versioning system: e.g. GitLab

• Easily generate metadata
    • SPSS: features to include metadata in database
    • Tropy: organize and describe photographs of research material
    • CEDAR: metadata templates for biomedical experiments
    • MLflow: manage machine learning development


Repositories

• Metadata automatically applied

• Check required metadata scheme or mandatory fields (e.g. DataCite)

• Research Software Registries: employ metadata to describe each software package

https://github.com/NLeSC/awesome-research-software-registries
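
Checking mandatory fields before deposit can be sketched as below. The six names follow the mandatory properties of the DataCite Metadata Schema 4.x (Identifier, Creator, Title, Publisher, PublicationYear, ResourceType). The flat dictionary layout, the example record, and the function name are assumptions for illustration and not any repository's actual API; always confirm against the scheme your repository requires.

```python
# Mandatory DataCite properties (schema 4.x).
# Storing a record as a flat dict keyed by these names is an assumption.
DATACITE_MANDATORY = (
    "Identifier",
    "Creator",
    "Title",
    "Publisher",
    "PublicationYear",
    "ResourceType",
)


def missing_mandatory_fields(record):
    """Return the mandatory field names that are absent or empty."""
    return [name for name in DATACITE_MANDATORY if not record.get(name)]


# Hypothetical record with the publisher left out on purpose.
example_record = {
    "Identifier": "10.1234/example",  # placeholder DOI, not a real one
    "Creator": ["Doe, Jane"],
    "Title": "Example dataset",
    "PublicationYear": "2022",
    "ResourceType": "Dataset",
}

if __name__ == "__main__":
    print(missing_mandatory_fields(example_record))
```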


Metadata standards

• List of metadata standards:
    • http://www.dcc.ac.uk/resources/metadata-standards/list
    • https://fairsharing.org/
    • https://rd-alliance.github.io/metadata-directory/standards/

A standard is only useful when your research community uses it or when it fits with a system or infrastructure!


Reporting guidelines – minimum information checklists

MIFlowCyt: Minimum Information about a Flow Cytometry Experiment

Used in FlowRepository

1. Experiment Overview

2. Flow Sample/Specimen Details

3. Instrument Details

4. Data Analysis Details (if data analysis has been done)


Metadata standards: example

• OME-XML / OME-TIFF: storing microscopy information
    • Images encoded as pixels in XML

• Metadata block describing dataset is embedded in each TIFF file’s header
    • Dimensional parameters: scope of image pixels
    • Hardware config: microscope, detectors, filters, lenses
    • Hardware settings: laser gain/offset, channel config
    • Person performing experiment
    • Experiment details: FRET, time-lapse, …

• OMERO: server software & repository to manage, visualize & analyse images & metadata

https://docs.openmicroscopy.org/ome-model/6.2.2/ome-tiff/
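
To illustrate what such a metadata block looks like to a program, the sketch below parses a hand-written, heavily reduced OME-XML fragment with Python's standard library. The fragment is illustrative only (real OME-TIFF headers hold far more), and the namespace is assumed to be the 2016-06 OME schema. Reading the block out of an actual TIFF file needs an imaging library and is not shown here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: the 2016-06 OME schema.
OME_NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

# Hand-written, minimal fragment for illustration; not from a real file.
OME_XML = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0" Name="example">
    <Pixels ID="Pixels:0" DimensionOrder="XYZCT" Type="uint16"
            SizeX="512" SizeY="512" SizeZ="10" SizeC="2" SizeT="1"/>
  </Image>
</OME>"""


def read_dimensions(ome_xml):
    """Return the dimensional parameters of the first image as integers."""
    root = ET.fromstring(ome_xml)
    pixels = root.find("ome:Image/ome:Pixels", OME_NS)
    if pixels is None:
        raise ValueError("no Pixels element found")
    return {axis: int(pixels.get(f"Size{axis}")) for axis in "XYZCT"}


if __name__ == "__main__":
    print(read_dimensions(OME_XML))
```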

Controlled vocabularies / ontologies

• Integrated in repositories / tools

• Vocabularies / ontologies for your listed terms?
    • Linked Open Vocabularies (LOV), FAIRsharing, Ontobee

• Accessing content of an ontology
    • Tools: RightField: adding ontology term selection to Excel spreadsheets
    • Python package to load ontologies as Python objects: https://pypi.org/project/Owlready2/

• https://www.kuleuven.be/rdm/en/guidance/documentation-metadata/vocabularies
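
The benefit of a controlled vocabulary is easiest to see when free-text entries are checked against it. The sketch below uses a tiny hypothetical vocabulary held in a Python set; in practice the terms would come from a vocabulary or ontology found through the services listed on this slide. All names in the code are illustrative assumptions.

```python
# Hypothetical controlled vocabulary; real terms would come from an ontology.
ALLOWED_SPECIMEN_TYPES = {"blood", "plasma", "serum", "urine"}


def normalise(term):
    """Trim whitespace and lower-case a term before comparison."""
    return term.strip().lower()


def validate_terms(terms, vocabulary):
    """Split terms into accepted (normalised) and rejected (as entered)."""
    accepted, rejected = [], []
    for term in terms:
        cleaned = normalise(term)
        if cleaned in vocabulary:
            accepted.append(cleaned)
        else:
            rejected.append(term)
    return accepted, rejected


if __name__ == "__main__":
    entries = ["Blood", " serum ", "bloood"]
    print(validate_terms(entries, ALLOWED_SPECIMEN_TYPES))
```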


More information

• https://www.kuleuven.be/rdm/en/guidance/documentation-metadata

• https://ukdataservice.ac.uk/learning-hub/research-data-management/

• http://rdmkit.elixir-europe.org/

• https://rdm.elixir-belgium.org/

• https://dmeg.cessda.eu/


Research Data Repository (RDR)


What?

Research Data Repository - RDR - RaDaR

• FAIR repository for the publication of finished research data
    • Data supporting publications
    • Data with high relevance for research, society…

• Publication of datasets
    • Detailed metadata
    • Files: 1 dataset can contain 1 or more files
    • “as open as possible, as closed as necessary”
    • License

CC-BY-4.0 – Dieuwertje Bloemen

FAIRness


FINDABLE

• DOI
• Metadata: openly available & machine-readable
• File structuring possible

ACCESSIBLE

• Openly accessible download infrastructure with access information on dataset and file level
• Advice: open or commonly-used file formats

INTEROPERABLE

• Guidance on file types and data formats
• Detailed metadata
• Documentation README file mandatory

REUSABLE

• Licenses: for data, for software/code
• Documentation
• Versioning

Who?

• All KU Leuven personnel can publish data
    • ORCID in KU Loket

• 50 GB per researcher per year

• Usage of RDR is not mandatory, but
    • Clear affiliation with KU Leuven
    • Documentation & support
    • Automatic reporting to Lirias

• Users are expected to have read the “terms of use” and the “prepare your dataset checklist”



Support & guidelines


How

• Review:
    • Check metadata
    • Check access metadata vs. file restrictions
    • Check presence of README file
    • If OK: publish; if there are issues: return to author with feedback

• Published?
    • Metadata openly available
    • Files dependent on restrictions
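
The review steps on this slide can be sketched as a simple check. The dataset structure, field names, and functions below are hypothetical and only mirror the three checks listed here; they are not RDR's actual implementation.

```python
def review_dataset(dataset):
    """Return a list of issues; an empty list means nothing was flagged."""
    issues = []
    metadata = dataset.get("metadata", {})
    files = dataset.get("files", [])

    # Check metadata (here reduced to a single hypothetical required field).
    if not metadata.get("title"):
        issues.append("metadata incomplete: title missing")

    # Check access metadata vs. file restrictions.
    any_restricted = any(f.get("restricted", False) for f in files)
    if metadata.get("access") == "open" and any_restricted:
        issues.append("access metadata says open, but a file is restricted")

    # Check presence of a README file.
    if not any(f["name"].lower().startswith("readme") for f in files):
        issues.append("README file missing")

    return issues


def decide(dataset):
    """Publish if no issues were found, otherwise return to the author."""
    issues = review_dataset(dataset)
    if issues:
        return "return to author", issues
    return "publish", []


# Hypothetical submission with a missing README.
example = {
    "metadata": {"title": "Example dataset", "access": "open"},
    "files": [{"name": "data.csv", "restricted": False}],
}

if __name__ == "__main__":
    print(decide(example))
```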


Via https://rdr.kuleuven.be

Questions?
