
Data Foundations and Terminology IG https://www.rd-alliance.org/groups/data-foundations-and-terminology-ig.html

Background

Agreed-upon terminology, that is, the meaning of terms used in discussion, remains an important aspect of all RDA group work, especially as RDA groups mature and common interests emerge. The DFT IG mission remains to continue the discussion initiated by the RDA DFT WG and to support continuing RDA efforts to elaborate the basic data concepts within a useful framework that documents data vocabularies. A key aspect of the work is to support broader model and vocabulary agreements within and across RDA WGs and IGs, along with their representative communities and stakeholders.

See https://www.rd-alliance.org/group/data-foundations-and-terminology-ig/case-statement/case-statement.html for our Case Statement

What are the goals and objectives?

The goals of this IG remain:

Facilitate the data community's discussion on a basic core model, clarifying some foundational terminology and some basic principles that will harmonize & help standardize various data organization solutions.

Foster an RDA community culture by agreeing on basic, quality terminology arising from agreed-upon reference models.

Develop a group vocabulary method.

Support registration & documentation of vocabulary in the DFT Term Tool (Ted-T) so that everyone can easily refer to and comment on term concepts.

Discuss community based recommendations for RDA group vocabularies.

Help RDA group white papers & discussions with supporting vocabularies.

Explore ways to encourage vocabulary adoption and ensure its use.

Identify new topics/areas to keep DFT discussion focused on "hot" topics and new interests.

The vocabulary is openly available for everyone who wants to run a project including those with large data collections.

Current Status - what has been discussed/done?

Our approach is to build our own terminology stepwise rather than adopt one, although we leverage and adapt existing work. We use a home-grown but mature terminology tool to help and cooperate with RDA Groups and other efforts. Effective community discussion is used to avoid spending interminable time on terminology debates that can't close on definitions and terms for useful concepts.

Since P5 we have solicited feedback on our vocabularies, made some curated revisions to various definitions, and extended the initial synthesis model for wider use. We run focused virtual sessions in between plenaries. We have collaborated with a number of groups, including the MIG (to support metadata that has some formal syntax reflecting the structure of metadata & declared semantics) and Data Fabric, which had an explosion of new terms as it started up. We are working with the MIG on how to add more semantics to metadata, as discussed at recent Chairs meetings.

We have collaborated & shared definitions with Research Data Canada (RDC), in partnership with the international Consortia Advancing Standards in Research Administration Information (CASRAI), and with the new International Research Data Management (IRiDiuM) Glossary of terms and definitions to support work in the field of research data management. We have also collaborated & shared definitions with the Data Publication Workflow group and members, the Science Europe Working Group on Research Data (Peter Doorn), the ISO 5127 data management Vocabulary, and Paul Millar’s work, funded under the H2020 project INDIGO-DataCloud, to define the terms a user community uses when describing the expected Quality-of-Service and Data-Lifecycle of their storage infrastructure. We may have updates from the co-located 2018 IEEE meeting on Big Data. We have official status as a “Common Technical Specification” under EU public procurement legislation, for those who need to comply with Regulation No 1025/2012, Annex II (see https://datashare.mpcdf.mpg.de/s/0rq5kVmMlv0h41X for a briefing on this).

Our Ted-T tool allows users to view a list of all terms (alphabetical or hierarchical). Each Plenary we release a new version; for P13 it will be called “Philadelphia” and includes PIDs for each definition. Users can add their own term & populate fields such as name, definition, explanation, example, sources etc. We also include a group discussion page for each term so we can capture what people think about terms. The Processes group of terms includes many practical policy terms (see https://www.rd-alliance.org/practical-policy-wg-slides-dft.html), but most terms, like Digital Object or Metadata, are in the Data Organization area.

Definition Example: Data practice is the actual application/use of ideas & methods (as opposed to theories) about how data are collected, created, stored (maintained), curated, used, shared and released (disseminated).
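The term fields and the two list views described above can be pictured as a simple record type. The following is only an illustrative sketch: the class name `TermRecord`, the `area` field and the helper functions are hypothetical and are not taken from the term tool itself, and the definitions of Digital Object and Metadata are deliberately left as placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TermRecord:
    """Hypothetical record mirroring the fields listed in the text."""
    name: str
    definition: str
    explanation: str = ""
    example: str = ""
    sources: List[str] = field(default_factory=list)
    area: str = "(unassigned)"  # assumption: e.g. "Data Organization", "Processes"


def alphabetical(terms: List[TermRecord]) -> List[TermRecord]:
    """Return the terms sorted by name, ignoring case."""
    return sorted(terms, key=lambda t: t.name.lower())


def by_area(terms: List[TermRecord]) -> Dict[str, List[str]]:
    """Group term names by area: a flat stand-in for a hierarchical view."""
    groups: Dict[str, List[str]] = {}
    for term in alphabetical(terms):
        groups.setdefault(term.area, []).append(term.name)
    return groups


terms = [
    TermRecord("Metadata", "(definition omitted)", area="Data Organization"),
    TermRecord(
        "Data practice",
        "The actual application/use of ideas & methods (as opposed to "
        "theories) about how data are collected, created, stored "
        "(maintained), curated, used, shared and released (disseminated).",
    ),
    TermRecord("Digital Object", "(definition omitted)", area="Data Organization"),
]

for term in alphabetical(terms):
    print(term.name)
print(by_area(terms))
```

Keeping `sources` as a list rather than a single string makes it straightforward to record more than one origin for a definition that was adapted from existing work.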

As needed, we try to handle the issue of similar and overlapping terms. Sometimes definitions are very close to each other but still not the same. Usually we draw out the distinctions, document them, and work on how to link the definitions.

DFT IG has collaborated with the RDA Vocabulary Services IG. The DFT vocabulary provides several use cases, such as publishing the DFT vocabulary as Linked Data to the Semantic Web, automated vocabulary import, more structured relations for the vocabulary, and more formal taxonomies. This work included exploratory use of SKOS relations.
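As a rough illustration of what expressing a term with SKOS might look like, the sketch below writes one concept as Turtle text using only string formatting. The base namespace `http://example.org/dft/`, the `dft:` prefix, the slug scheme and the function names are assumptions made for the example; they are not the identifiers or PIDs used by the IG. The SKOS properties themselves (`skos:Concept`, `skos:prefLabel`, `skos:definition`, `skos:related`) are standard.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/dft/"  # assumption: placeholder namespace, not the real one


def slug(name):
    """Turn a term name into a simple local identifier (hypothetical scheme)."""
    return name.strip().lower().replace(" ", "-")


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(name, definition, related=()):
    """Serialize one term as a skos:Concept in Turtle syntax."""
    lines = [
        f"@prefix skos: <{SKOS}> .",
        f"@prefix dft: <{BASE}> .",
        "",
        f"dft:{slug(name)} a skos:Concept ;",
        f'    skos:prefLabel "{escape(name)}"@en ;',
    ]
    # Optional links to other terms; which terms are related is up to the curator.
    for other in related:
        lines.append(f"    skos:related dft:{slug(other)} ;")
    lines.append(f'    skos:definition "{escape(definition)}"@en .')
    return "\n".join(lines)


ttl = to_turtle(
    "Data practice",
    "The actual application/use of ideas & methods (as opposed to theories) "
    "about how data are collected, created, stored (maintained), curated, "
    "used, shared and released (disseminated).",
)
print(ttl)
```

A production export would more likely go through an RDF library and the vocabulary's real identifiers; plain strings are used here only to keep the sketch self-contained.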

Over the last few Plenaries DFT IG has sponsored and organized BoFs to discuss domain vocabularies and methods to improve them with such things as domain semantics. Each domain group provided a brief summary of their work, including relevant quality vocabularies, models & ontologies, standards, and an illustrative use case (such as harmonization and mappings between two or more vocabularies), along with the issues involved and best-practice solutions in the domain vocabulary space. Among the domains participating were Smart Cities, Chemical Safety, Global Water, Materials Science, Earth Science, Legal Interoperability and Health Science.

What are the plans?

At P13 we will continue the focus on facilitating community discussion on core concepts & leveraging existing work. We expect considerable discussion of requirements coming out of groups nearing completion, but also of support as part of adoption. RDA metadata work is always of interest & recent definitions include many on FAIR data: Data object type DO Interface Protocol; FAIR ecosystem, FAIR Digital Objects...

We will continue exploratory efforts to leverage Vocabulary Services and NLP extraction of term definitions to improve the quality of the work, which also leverages the experience of other IGs as to success factors.

Based on feedback, some curated revisions on definitions and extension of the current synthesis model can be expected to improve and stabilize the effort for subsequent use.

Potential adopters will be encouraged at P13 to provide information on additional use case scenarios to illustrate what areas of work they plan on using the models and vocabulary for and on support for a vocabulary registry.

DFT will be involved in RDA vocabulary best-practices work, where we may hear about various efforts to build robust, richer vocabularies using improved semantics & foundational ontologies, so that we “don't have to keep redefining terms” on a word level, as was discussed in previous Plenaries.

Links

1. The IG homepage: https://www.rd-alliance.org/groups/data-foundations-and-terminology-ig.html

2. The Case Statement: https://www.rd-alliance.org/group/data-foundations-and-terminology-ig/case-statement/case-statement.html

3. Our Term Tool: for more information on TeD-T see http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page

Come and join us!

The group welcomes membership & input from data communities. As has been indicated, more terms need to be defined in the coming phase, and everyone is invited to comment via the semantic wiki.

Co-Chairs: Gary Berg-Cross, [email protected] Ritz [email protected]

RDA Interest Group on Data Foundations and Terminology 2