A framework for managing vocabularies. Outcomes of the TDWG Vocabulary Management Task Group. TDWG 2013 Annual Conference, Florence, Italy. Dag Endresen*, Éamonn Ó Tuama, Gregor Hagedorn and Steve Baskauf. * GBIF-Norway, Natural History Museum of the University in Oslo. Global Biodiversity Information Facility (GBIF). 31 October 2013

TDWG VoMaG Vocabulary management workflow, 2013-10-31


DESCRIPTION

Presentation of the VoMaG Vocabulary management workflow at TDWG 2013 in Firenze, 2013-10-31.


Page 1: TDWG VoMaG Vocabulary management workflow, 2013-10-31

   

A framework for managing vocabularies
Outcomes of the TDWG Vocabulary Management Task Group
TDWG 2013 Annual Conference, Florence, Italy

Dag Endresen*, Éamonn Ó Tuama, Gregor Hagedorn and Steve Baskauf
* GBIF-Norway, Natural History Museum of the University in Oslo
Global Biodiversity Information Facility (GBIF)
31 October 2013

Page 2: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Three main sections

• Status of the TDWG ontology

• Semantic MediaWiki as a development platform

• Requirements for a framework for managing vocabularies

VoMaG report

Page 3: TDWG VoMaG Vocabulary management workflow, 2013-10-31

http://community.gbif.org/pg/pages/view/29101/

Discussion: framework for managing vocabularies

Page 4: TDWG VoMaG Vocabulary management workflow, 2013-10-31

• Vocabularies/ontologies  are  one  of  the  three  core  components  in  the  TDWG  technical  architecture.  

Vocabulary  management  

Page 5: TDWG VoMaG Vocabulary management workflow, 2013-10-31

• Provide a shared understanding of what we mean when describing biodiversity entities.

• What kind of thing or property.

• A list of things we as a community can agree upon the meaning of.

• “Concept repository” with terms identified by URIs.

Vocabularies/ontologies  

TDWG Technical Roadmap 2008. (TDWG TAG, convened by Roger Hyam).
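
The idea of a "concept repository" with terms identified by URIs can be illustrated with a minimal sketch. The class names, the `http://example.org/terms/` namespace and the sample term below are assumptions made up for illustration; they are not part of any TDWG or GBIF vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One agreed-upon unit of meaning, identified by a URI."""
    uri: str
    label: str
    definition: str


class ConceptRepository:
    """In-memory lookup of concepts keyed by URI (illustrative only)."""

    def __init__(self):
        self._concepts = {}

    def add(self, concept):
        # A URI should identify exactly one concept, so refuse redefinition.
        if concept.uri in self._concepts:
            raise ValueError(f"URI already in use: {concept.uri}")
        self._concepts[concept.uri] = concept

    def get(self, uri):
        return self._concepts[uri]


# Hypothetical namespace and term, not taken from any published vocabulary.
BASE = "http://example.org/terms/"
repo = ConceptRepository()
repo.add(Concept(BASE + "specimen", "Specimen", "A physical sample of an organism."))
print(repo.get(BASE + "specimen").label)
```

The point of the sketch is only that the URI, not the label, is the key: two communities can use different labels and still refer to the same concept.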

Page 6: TDWG VoMaG Vocabulary management workflow, 2013-10-31

• Development, management and governance remain a challenge.

• Maintenance requires effort.

• Challenge: lack of resources. No TDWG Ontology coordinator (since 2007).

• Experiment: possible as a collaborative community approach coordinated using a Semantic Wiki?

Vocabulary  governance  

Page 7: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Term  Wiki  

http://terms.gbif.org/wiki/

Page 8: TDWG VoMaG Vocabulary management workflow, 2013-10-31

• Collaborative development of terms at: http://terms.gbif.org/wiki/ [? terms.tdwg.org/wiki/ ?]

• Terms grouped into “concept vocabularies”, each controlled by a TDWG task group.

• REUSE terms (and term URIs) whenever possible!

Concept  vocabularies  

Page 9: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Data  standards  

Important principle: Re-use of terms from standardized terminologies wherever possible.

The cartoon is from XKCD: http://xkcd.com/927/ CC-BY-NC

Page 10: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Term  versus  Concept  

Dextre Clarke, S.G. and L. Zeng (2012). From ISO 2788 to ISO 25964: The evolution of thesaurus standards towards interoperability and data modeling. ISQ Information Standards Quarterly 24(1): 20-26.

“The SKOS (simple knowledge organization system) format is designed to present KOS data in a format that is suitable for machine inferencing and particularly for use in the Semantic Web (…) concepts – units of thought – and distinguishes these from the terms that are used to label these concepts.”

Will, L. (2012). The ISO 25964 Data Model for the Structure of an Information Retrieval Thesaurus. Bulletin of the American Society for Information Science and Technology 38(4): 48-51.

Page 11: TDWG VoMaG Vocabulary management workflow, 2013-10-31

• What kind of thing or property

• Concept (skos:Concept)

• Reused as:
  • Class (rdfs:Class) [?]
  • Property (rdf:Property)
  • Controlled element values

Vocabularies/ontologies
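
One possible reading of "reused as" is that the same term URI is declared as a skos:Concept and additionally typed as an RDF property (or class). The sketch below writes such statements as N-Triples using plain Python. The term URI is hypothetical, and typing one URI both ways is a modelling assumption made for illustration, not something the slides prescribe.

```python
# Standard namespace URIs for RDF and SKOS.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical term URI, used only for illustration.
TERM = "http://example.org/terms/recordedBy"


def typed_as(uri, *type_uris):
    """Return one (subject, predicate, object) triple per rdf:type statement."""
    return [(uri, RDF + "type", type_uri) for type_uri in type_uris]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# The concept is declared once and reused here as a property.
triples = typed_as(TERM, SKOS + "Concept", RDF + "Property")
print(to_ntriples(triples))
```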

Page 12: TDWG VoMaG Vocabulary management workflow, 2013-10-31

•  Maximize the reuse of terms, focus on the definition and labels for basic terms.

•  Low threshold for non-technical biologists and biodiversity domain experts to access terms and contribute (compared to richer ontologies).

•  Preferred technology: RDF (resource description framework) and SKOS (simple knowledge organization system).

•  Construction and maintenance of OWL ontologies are demanding with respect to expertise, effort and costs.

•  Maintaining SKOS vocabularies is less demanding.

•  RDF resources are designed to be easily extended.

•  Ontologies (OWL) can be based on (extend) terms declared by an RDF/SKOS vocabulary.

•  SKOS became a W3C recommendation in 2009.

Why use a flat vocabulary?

Page 13: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Darwin Core – a glossary of terms

Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, De Giovanni R, Robertson T, and Vieglais D (2012) Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7(1): e29715. doi:10.1371/journal.pone.0029715

Page 14: TDWG VoMaG Vocabulary management workflow, 2013-10-31

• Establish  a  repository  for  “concept  vocabularies”.  

• Example at: http://rs.gbif.org/terms/ [? rs.tdwg.org/terms/ ?]

Vocabulary  repository  

Photo CC-by-3.0 by Hannes Grobe/AWI. Palaeoclimate archives: Core repository of AWI, http://commons.wikimedia.org/wiki/File:Core-repository_hg.jpg

Page 15: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Concept Vocabulary (rdf, skos) Term Wiki

For vocabulary development

Resources Repository

1. Mint and maintain concepts and terms in domain-expert working groups, using e.g. http://terms.gbif.org/wiki/ as a collaboration platform.

2. Release the final version as a Concept Vocabulary, e.g. publish the vocabulary at the GBIF or a TDWG Resources Repository.

REUSE terms from published concept vocabularies and other ontologies when designing new application schemas such as DwC-A controlled term and value vocabularies.

Work-flow for Vocabulary management


http://terms.gbif.org/wiki/ http://rs.gbif.org/terms/
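
The workflow on this slide (draft terms in working groups, release a concept vocabulary, then reuse its terms) could be sketched roughly as follows. All names here (`DraftTerm`, `ConceptVocabulary`, `release`, `reuse_or_mint`) and the example URIs are hypothetical, and matching on labels is a deliberately crude stand-in for the human judgement involved in deciding whether a term can be reused.

```python
from dataclasses import dataclass


@dataclass
class DraftTerm:
    """A term under discussion on the collaboration platform."""
    uri: str
    label: str
    definition: str
    approved: bool = False


@dataclass(frozen=True)
class ConceptVocabulary:
    """An immutable, versioned release of approved terms."""
    name: str
    version: str
    terms: tuple  # tuple of (uri, label, definition)


def release(name, version, drafts):
    """Step 2: freeze the approved drafts into a published vocabulary."""
    approved = tuple((d.uri, d.label, d.definition) for d in drafts if d.approved)
    if not approved:
        raise ValueError("nothing approved to release")
    return ConceptVocabulary(name, version, approved)


def reuse_or_mint(label, released, new_base_uri):
    """Reuse a published term URI when the label matches, else mint a new one."""
    for uri, existing_label, _ in released.terms:
        if existing_label.lower() == label.lower():
            return uri
    return new_base_uri + label


# Step 1: drafts maintained by a working group (hypothetical content).
BASE = "http://example.org/terms/"
drafts = [
    DraftTerm(BASE + "habitat", "Habitat", "Where an organism lives.", approved=True),
    DraftTerm(BASE + "mood", "Mood", "Still under discussion."),
]
vocab = release("example-vocabulary", "1.0", drafts)
print(reuse_or_mint("habitat", vocab, "http://example.org/new/"))
```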

Page 16: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Example: master SKOS/RDF resource

http://rs.gbif.org/terms/dwc/dwc_translations.rdf

[Figure: labels in four languages: en, es, zh, ja]
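
A master SKOS/RDF resource with translations can be read with the Python standard library alone. The sketch below is a minimal illustration: the embedded RDF/XML sample, its term URI and its labels are assumptions written for this example and are not copied from `dwc_translations.rdf`, whose actual structure may differ.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative sample only: hypothetical URI and labels.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/terms/scientificName">
    <skos:prefLabel xml:lang="en">Scientific name</skos:prefLabel>
    <skos:prefLabel xml:lang="es">Nombre científico</skos:prefLabel>
    <skos:prefLabel xml:lang="zh">学名</skos:prefLabel>
    <skos:prefLabel xml:lang="ja">学名</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
"""


def labels_by_language(rdf_xml):
    """Map each concept URI to a {language tag: preferred label} dictionary."""
    root = ET.fromstring(rdf_xml)
    about = "{" + NS["rdf"] + "}about"
    result = {}
    for concept in root.findall("skos:Concept", NS):
        result[concept.get(about)] = {
            label.get(XML_LANG): label.text
            for label in concept.findall("skos:prefLabel", NS)
        }
    return result


labels = labels_by_language(SAMPLE)
print(labels["http://example.org/terms/scientificName"]["es"])
```

Keeping all translations attached to one concept URI means a new language only adds a label; the identifier that data publishers use does not change.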

Page 17: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Concept Vocabulary (rdf, skos)

Ontologies (rdf, owl)

Biodiversity ontology development

REUSE terms from concept vocabularies whenever possible.

Biodiversity ontology repository.

Page 18: TDWG VoMaG Vocabulary management workflow, 2013-10-31

BioPortal ontology repository

A biodiversity “slice” was established at the NCBO BioPortal.

•  Loading biodiversity ontologies into the NCBO BioPortal promotes mapping (and reuse of terms) between bio-medical and biodiversity ontologies.

http://bis.bioportal.bioontology.org/ontologies?filter=BIS

Page 19: TDWG VoMaG Vocabulary management workflow, 2013-10-31

Agenda

Time / Title / Presenter

09.00 – 09.05 / Introduction to VoMaG / Éamonn Ó Tuama

09.05 – 09.20 / The TDWG Ontologies / Steve Baskauf

09.20 – 09.35 / A framework for managing vocabularies / Dag Endresen

09.35 – 09.45 / Semantic MediaWiki - a community platform for vocabulary development / Gregor Hagedorn

09.45 – 09.55 / Developing a vocabulary: the SINP use case / Julie Chataigner & Laurent Poncet

09.55 – 10.05 / Modeling property values: Is SKOS part of the solution or part of the problem? / Robert A. Morris

10.05 – 10.30 / General discussion