Documentation & metadata
Research Data Repository (RDR)
FAIR in practice
23/07/2022
Marleen Marynissen
What?
• Documentation
  • Contextual and descriptive features of data
  • “Everything” someone needs to understand the data and to be able to use it
• Metadata
  • Structured data documentation
  • Data provided through specific attributes: metadata elements / fields
  • Machine-readable
  • Standards
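To make "structured" and "machine-readable" concrete, here is a minimal sketch of a metadata record held as named fields and serialised to JSON. The field names and values are hypothetical and do not follow any particular standard.

```python
import json

# Hypothetical metadata record: field names and values are illustrative only.
record = {
    "title": "Example dataset",
    "creator": "Doe, Jane",
    "publication_year": 2022,
    "keywords": ["ionosphere", "time series"],
    "license": "CC-BY-4.0",
}


def to_machine_readable(metadata: dict) -> str:
    """Serialise a metadata record to JSON with a stable key order."""
    return json.dumps(metadata, indent=2, sort_keys=True)


serialised = to_machine_readable(record)
# A program can read the same fields back without any manual interpretation.
restored = json.loads(serialised)
```

Because every value sits under a named field, software can search, validate, or exchange the record without a person having to read free text.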
Considerations
• Study- and data-level documentation
• As early as possible
• Understandable in your field
• Kept close to the data: embedded / in the same folder
• Logical folder structure, meaningful file names
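One way to keep file names meaningful is to build them from the same parts every time. The sketch below assumes a hypothetical convention of date, project, description, and version; the convention itself is an illustration, not a prescribed rule.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lower-case the text and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_file_name(project: str, description: str, day: date,
                    version: int, ext: str) -> str:
    """Assemble a name as YYYYMMDD_project_description_vNN.ext (assumed convention)."""
    return (f"{day:%Y%m%d}_{slug(project)}_{slug(description)}"
            f"_v{version:02d}.{ext.lstrip('.')}")


name = build_file_name("STORM", "Ionospheric corrections",
                       date(2022, 7, 23), 1, ".csv")
```

Names built this way sort chronologically and remain readable without opening the file.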
STORM Time Empirical Ionospheric Corrections data
[Figure: the example dataset, annotated with “Metadata that explain the variable labels and provenance (source)” and “Data”]
Tools and software during your research
• Data documentation: e.g. Electronic Lab Notebooks, OSF
• Versioning system: e.g. GitLab
• Easily generate metadata
  • SPSS: features to include metadata in database
  • Tropy: organize and describe photographs of research material
  • CEDAR: metadata templates for biomedical experiments
  • MLflow: manage machine learning development
Repositories
• Metadata automatically applied
• Check required metadata scheme or mandatory fields (e.g. DataCite)
• Research Software Registries: employ metadata to describe each software package
https://github.com/NLeSC/awesome-research-software-registries
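Checking mandatory fields before deposit can be automated. The sketch below lists the six mandatory properties of the DataCite Metadata Schema 4.x as I understand them (verify against the current schema version); the dictionary layout and key spelling are assumptions for illustration, not the DataCite JSON format.

```python
# Mandatory DataCite properties (schema 4.x, to be verified against the
# current specification). The flat dict layout below is an assumption.
DATACITE_MANDATORY = (
    "Identifier",
    "Creator",
    "Title",
    "Publisher",
    "PublicationYear",
    "ResourceType",
)


def missing_fields(record: dict, required=DATACITE_MANDATORY) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in required if not record.get(name)]


# Hypothetical draft record with two fields still to be filled in.
draft = {
    "Identifier": "10.1234/example",
    "Creator": "Doe, Jane",
    "Title": "Example dataset",
    "Publisher": "",
}

todo = missing_fields(draft)
```

An empty result means every mandatory field has a value; it says nothing about whether the values are correct.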
Metadata standards
• List of metadata standards:
  • http://www.dcc.ac.uk/resources/metadata-standards/list
  • https://fairsharing.org/
  • https://rd-alliance.github.io/metadata-directory/standards/
A standard is only useful when your research community uses it or when it fits with a system or infrastructure!
Reporting guidelines – minimum information checklists
MIFlowCyt: Minimum Information about a Flow Cytometry Experiment
Used in FlowRepository
1. Experiment Overview
2. Flow Sample/Specimen Details
3. Instrument Details
4. Data Analysis Details (if data analysis has been done)
Metadata standards: example
• OME-XML / OME-TIFF: storing microscopy information
  • Images encoded as pixels in XML
  • Metadata block describing dataset is embedded in each TIFF file’s header
    • Dimensional parameters: scope of image pixels
    • Hardware config: microscope, detectors, filters, lenses
    • Hardware settings: laser gain/offset, channel config
    • Person performing experiment
    • Details experiment: FRET, time-lapse, …
• OMERO: server software & repository to manage, visualize & analyse images & metadata
https://docs.openmicroscopy.org/ome-model/6.2.2/ome-tiff/
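Because the metadata block is plain XML, the dimensional parameters can be read with a standard XML parser. The sketch below uses only the Python standard library on a small hand-written OME-XML fragment; the fragment and its values are illustrative, and extracting the block from a real TIFF header (which needs a TIFF reader) is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace of the 2016-06 OME schema.
OME_NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

# Hand-written, illustrative fragment: not taken from a real acquisition.
SAMPLE = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0" Name="example">
    <Pixels ID="Pixels:0" DimensionOrder="XYZCT" Type="uint16"
            SizeX="512" SizeY="512" SizeZ="10" SizeC="2" SizeT="5"/>
  </Image>
</OME>"""


def pixel_dimensions(ome_xml: str) -> dict:
    """Read the five Size* attributes of the first Pixels element."""
    root = ET.fromstring(ome_xml)
    pixels = root.find("ome:Image/ome:Pixels", OME_NS)
    if pixels is None:
        raise ValueError("no Pixels element found")
    return {axis: int(pixels.get(f"Size{axis}")) for axis in "XYZCT"}


dims = pixel_dimensions(SAMPLE)
```

The same approach extends to other elements of the block, such as instrument or experimenter details, as long as the element names are taken from the OME schema documentation.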
Controlled vocabularies / ontologies
• Integrated in repositories / tools
• Vocabularies / ontologies for your listed terms?
  • Linked Open Vocabularies (LOV), FAIRsharing, Ontobee
• Accessing content of an ontology
  • Tools: RightField: adding ontology term selection to Excel spreadsheets
  • Python package to load ontologies as Python objects: https://pypi.org/project/Owlready2/
• https://www.kuleuven.be/rdm/en/guidance/documentation-metadata/vocabularies
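The idea behind a controlled vocabulary can be shown without any ontology tooling: values are accepted only if they appear in an agreed list. The sketch below is plain Python with a hypothetical term list; it does not use Owlready2 or a real ontology.

```python
# Hypothetical controlled vocabulary for an "experiment type" field.
ALLOWED_TECHNIQUES = {"fret", "time-lapse", "flow cytometry"}


def split_by_vocabulary(values, vocabulary):
    """Separate values into those found in the vocabulary and those not found.

    Matching ignores case and surrounding whitespace.
    """
    accepted, rejected = [], []
    for value in values:
        if value.strip().lower() in vocabulary:
            accepted.append(value)
        else:
            rejected.append(value)
    return accepted, rejected


accepted, rejected = split_by_vocabulary(
    ["FRET", " time-lapse ", "timelapse"], ALLOWED_TECHNIQUES
)
```

Rejected values such as spelling variants are exactly what a controlled vocabulary is meant to catch before they end up in the metadata.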
More information
• https://www.kuleuven.be/rdm/en/guidance/documentation-metadata
• https://ukdataservice.ac.uk/learning-hub/research-data-management/
• http://rdmkit.elixir-europe.org/
• https://rdm.elixir-belgium.org/
• https://dmeg.cessda.eu/
What?
Research Data Repository - RDR - RaDaR
• FAIR repository for the publication of finished research data
  • Data supporting publications
  • Data with high relevance for research, society…
• Publication of datasets
  • Detailed metadata
  • Files: 1 dataset can contain 1 or more files
• “as open as possible, as closed as necessary”
• License
CC-BY-4.0 – Dieuwertje Bloemen
FAIRness
FINDABLE
• DOI
• Metadata: openly available & machine-readable
• File structuring possible
ACCESSIBLE
• Openly accessible download infrastructure with access information on dataset and file level
• Advice: open or commonly-used file formats
INTEROPERABLE
• Guidance on file types and data formats
• Detailed metadata
• Documentation: README file mandatory
REUSABLE
• Licenses: for data, for software/code
• Documentation
• Versioning
Who?
• All KU Leuven personnel can publish data
  • ORCID in KU Loket
• 50 GB per researcher per year
• Usage of RDR is not mandatory, but it offers:
  • Clear affiliation with KU Leuven
  • Documentation & support
  • Automatic reporting to Lirias
• Users are expected to have read the “terms of use” and the “prepare your dataset checklist”
How
• Review:
  • Check metadata
  • Check access metadata vs. file restrictions
  • Check presence of README file
  • If OK: publish; if issues: return to author with feedback
• Published?
  • Metadata openly available
  • Files dependent on restrictions
Via https://rdr.kuleuven.be
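The review steps listed above can be sketched as a small pre-submission self-check. This is a hypothetical illustration, not how RDR implements its review: the `Dataset` class, the required field names, and the access labels are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    metadata: dict  # hypothetical layout, e.g. {"title": ..., "access": "open"}
    files: dict     # file name -> "open" or "restricted" (assumed labels)


# Assumed set of required fields, for illustration only.
REQUIRED_FIELDS = ("title", "author", "license", "access")


def review(dataset: Dataset) -> list:
    """Return a list of issues; an empty list means all checks passed."""
    issues = []
    # 1. Check metadata: every required field must have a value.
    missing = [f for f in REQUIRED_FIELDS if not dataset.metadata.get(f)]
    if missing:
        issues.append("missing metadata: " + ", ".join(missing))
    # 2. Check access metadata against the restrictions set on the files.
    restricted = [n for n, level in dataset.files.items() if level != "open"]
    if dataset.metadata.get("access") == "open" and restricted:
        issues.append("metadata says open, but restricted files: "
                      + ", ".join(restricted))
    # 3. Check that a README file is present.
    if not any(n.lower().startswith("readme") for n in dataset.files):
        issues.append("README file missing")
    return issues


def decide(dataset: Dataset) -> str:
    """Publish when no issues are found, otherwise return to the author."""
    return "publish" if not review(dataset) else "return to author"


good = Dataset(
    {"title": "T", "author": "A", "license": "CC-BY-4.0", "access": "open"},
    {"README.md": "open", "data.csv": "open"},
)
bad = Dataset({"title": "T", "access": "open"}, {"data.csv": "restricted"})
```

Running such a check before submitting reduces the chance that the dataset comes back from review with feedback.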