FAIR principles and data management planning
Hugo Besemer
AIMS Webinar
2017-05-25
FAIR principles and data management realities
Hugo Besemer
AIMS Webinar
2017-05-25
A FAIRLY short timeline
• January 2014: Workshop in Leiden (the Netherlands)
• 2014: Results on Force11 site
• 15 March 2016: Article in ‘Scientific Data’
• 26 July 2016: H2020 Programme Guidelines
• December 2016: Webinar FAIR / repositories
Guiding Principles for Findable, Accessible, Interoperable and Re-usable Data Publishing version b1.0
Discussion about indicators of ‘FAIRness’
A bit longer timeline
What ‘FAIR’ does NOT want to be and what it wants to achieve
• It is NOT a specification
• It is NOT a syntax (it aims to be syntax agnostic)
• It is meant to precede technology and other implementation choices
• In my own words: these guidelines aim to create a research data environment that is FAIR to machines and humans
F: to be findable
• F1. (meta)data are assigned a globally unique and persistent identifier
• F2. data are described with rich metadata (defined by R1 below)
• F3. metadata clearly and explicitly include the identifier of the data it describes
• F4. (meta)data are registered or indexed in a searchable resource
Proposed indicators F(indable)
1. No PID and no metadata/documentation
2. PID without or with insufficient* metadata
3. Sufficient* metadata without PID
4. PID with sufficient* metadata
   - Information on data provenance
5. PID, rich metadata and additional documentation
   - Additional explanation of how data can be used

* Sufficient = enough metadata to understand what the data is about
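The proposed scale can be read as a small decision procedure. The sketch below is one possible reading, not part of the proposal itself: the class `DatasetDescription`, its field names and the four metadata levels are hypothetical, and a dataset with insufficient metadata and no PID is assumed to score 1, since the scale does not cover that case explicitly.

```python
from dataclasses import dataclass


@dataclass
class DatasetDescription:
    # Hypothetical fields, chosen only to mirror the wording of the scale.
    has_pid: bool                    # a persistent identifier is assigned
    metadata: str                    # "none", "insufficient", "sufficient" or "rich"
    has_documentation: bool = False  # extra explanation of how the data can be used


def findable_score(d: DatasetDescription) -> int:
    """Map a dataset description onto the proposed 1-5 Findable scale."""
    enough = d.metadata in ("sufficient", "rich")
    if d.has_pid and d.metadata == "rich" and d.has_documentation:
        return 5  # PID, rich metadata and additional documentation
    if d.has_pid and enough:
        return 4  # PID with sufficient metadata
    if enough:
        return 3  # sufficient metadata without PID
    if d.has_pid:
        return 2  # PID without or with insufficient metadata
    return 1      # assumption: no PID and at most insufficient metadata


print(findable_score(DatasetDescription(has_pid=True, metadata="sufficient")))
```

Keeping the rules in descending order means each dataset gets the highest level whose conditions it meets.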
F(indable) @ Wageningen
• Presently departments decide what data is published
• At best, this is data underlying publications (pressure from journals helps a lot...)
• There are ongoing (series of) datasets that are only known to insiders
A: to be accessible
• A1. (meta)data are retrievable by their identifier using a standardized communications protocol
• A1.1 the protocol is open, free, and universally implementable
• A1.2 the protocol allows for an authentication and authorization procedure, where necessary
• A2. metadata are accessible, even when the data are no longer available
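As an illustration of A1, the sketch below prepares (but does not send) an HTTPS request that asks for the metadata behind an identifier. It assumes the identifier is a DOI resolved through https://doi.org with content negotiation for CSL JSON; the principles themselves do not prescribe DOIs or HTTP, and the DOI `10.1234/example` and the function name `build_metadata_request` are hypothetical.

```python
from urllib.request import Request

# Media type commonly offered by DOI content negotiation (an assumption here).
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str) -> Request:
    """Prepare a GET request for the metadata record of a DOI."""
    url = f"https://doi.org/{doi}"
    # The Accept header asks the resolver for metadata rather than a landing page.
    return Request(url, headers={"Accept": CSL_JSON})


req = build_metadata_request("10.1234/example")  # hypothetical DOI
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the retrieval. HTTPS is open and free (A1.1) and can carry authentication where access is restricted (A1.2).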
Proposed indicators A(ccessible)
1. No user license / unclear conditions of reuse / neither metadata nor data are accessible
2. Metadata are accessible (even when the data are not or no longer available)
3. User restrictions apply (of any kind, including privacy, commercial interests, embargo period, etc.)
4. Public Access (after registration)
5. Open Access (unrestricted, CC0, perhaps also CC BY?)
Accessible @ Wageningen
• Probably the most important problem: who decides who can get access (and who will grant the permission technically)
• We have been awaiting guidelines on ownership / usage rights for three years.
I: to be interoperable
• I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
• I2. (meta)data use vocabularies that follow FAIR principles
• I3. (meta)data include qualified references to other (meta)data
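To make I1 to I3 concrete, here is a minimal sketch of a metadata record serialized as JSON-LD with the schema.org vocabulary. Both are assumptions for illustration, since FAIR aims to be syntax agnostic and prescribes neither. The identifiers and the dataset name are hypothetical placeholders.

```python
import json

# Hypothetical record: identifiers and name are placeholders, not real datasets.
record = {
    "@context": "https://schema.org",   # shared vocabulary (I2)
    "@type": "Dataset",
    "identifier": "https://doi.org/10.1234/example-dataset",
    "name": "Example field trial measurements",
    # Qualified reference (I3): the property name states *how* the two
    # records are related, not merely that a link exists.
    "isBasedOn": "https://doi.org/10.1234/example-source",
}

# JSON-LD is plain JSON with agreed keys, so the standard library suffices (I1).
serialized = json.dumps(record, indent=2)
print(serialized)
```

The point of the typed `isBasedOn` property is that a machine can tell a derivation from, say, a citation without reading free text.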
Proposed indicators I(nteroperable)
1. Proprietary, non-open format data
2. Proprietary format, accepted by DSA Certified Trusted Data Repository
3. Non-proprietary, open format (= “preferred” or “archival” format)
4. Data is additionally harmonized / standardized, using standard vocabularies
5. Data is additionally linked to other data to provide context
I(nteroperable) @ Wageningen
• In response to a blog post about this, the people working with ontologies met for the first time
• Their main concerns:
  • How to find the relevant ontologies
  • Can we rely on them to justify investments (consistency, process of maintenance)?
• H2020 coordinators have no clue what all this is about
R: to be reusable
• R1. meta(data) are richly described with a plurality of accurate and relevant attributes
• R1.1. (meta)data are released with a clear and accessible data usage license
• R1.2. (meta)data are associated with detailed provenance
• R1.3. (meta)data meet domain-relevant community standards
Also in F4
Also in F2, I1
Also in I1
Proposed indicators R(e-usable)
“First we attempted to operationalise R – Re-usable as well ... but we changed our mind
Reusable – is it a separate dimension? Partly subjective: it depends on what you want to use the data for!”
References
Guiding principles for findable, accessible, interoperable and re-usable data publishing version B1.0
https://www.force11.org/fairprinciples
The FAIR Guiding Principles for scientific data management and stewardship
https://www.nature.com/articles/sdata201618
Guidelines on FAIR Data Management in Horizon 2020
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
FAIR Data in Trustworthy Data Repositories Webinar
https://eudat.eu/events/webinar/fair-data-in-trustworthy-data-repositories-webinar
Two blogs about FAIR @ Wageningen
• https://weblog.wur.eu/openscience/can-wageningen-fair/
• https://weblog.wur.eu/openscience/vocabularies-and-the-i-in-fair-data-principles/