Acronym Soup: GBIF, TDWG & GUIDs
Jerry Cooper
Global Biodiversity Information Facility (GBIF)
• Established in 2000 through non-binding MOU (25 countries + 31 organizations)
• Essentially a global information infrastructure for sharing primary biodiversity data (species occurrences)
• GBIF network currently provides access to 120 million records from 1000 ‘collections’
• Infrastructure evolved from existing exemplar networks – Species Analyst/DiGIR (Kansas), BioCISE/BioCASE (EU – Berlin)
• Taxonomic Databases Working Group (TDWG) provides a forum for development of GBIF technology
Taxonomic Databases Working Group (TDWG)
• Now re-badged as ‘TDWG – Biodiversity Information Standards’
• 20-year history with focus of activity at an annual meeting
• Initial focus on database design and data dictionaries
• Recently evolved towards data exchange standards, ontologies and data sharing protocols
• An appropriate forum for developing the Veg-X standard?
Taxonomic Databases Working Group (TDWG)
• TDWG Standards (existing or actively being developed):
– Taxon Concept Schema (TCS)
– Access to Biological Collection Data (ABCD)
– Darwin Core (DC)
– Structured Descriptive Data (SDD)
– Collection – Institutional Metadata
– Literature (citation and document structures for taxonomic literature)
– Images
– Geospatial
– Observation & Specimens
– Alien Invasive Species Profiles
– Globally Unique Identifiers (GUIDs)
– TAPIR (TDWG Access Protocol for Information Retrieval)
TDWG GUIDs
• Need for unique, persistent, resolvable identifiers to communicate about objects
• TDWG promotes Life Science Identifiers (LSIDs)
• LSIDs are of the form:
– urn:lsid:indexfungorum.org:names:417119
– They are resolvable
• LSIDs are not URLs – resolution requires extra software.
• Debate continues about merits of LSID versus HTTP mediated schemes
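To make the LSID form above concrete, here is a minimal sketch of splitting an LSID into its parts. It assumes the general layout urn:lsid:authority:namespace:object with an optional trailing revision; the `LSID` type and `parse_lsid` function are illustrative names, not part of any TDWG library, and the sketch does not perform resolution.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of an LSID (illustrative type, not from a TDWG library)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    # The "urn" and "lsid" prefixes are treated as case-insensitive.
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"empty LSID component in: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


lsid = parse_lsid("urn:lsid:indexfungorum.org:names:417119")
print(lsid.authority, lsid.namespace, lsid.object_id)
```

Parsing only identifies the authority; locating and calling that authority's resolution service is the step that needs the extra software mentioned above.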
LSIDs in action...
Server-side LSID resolver
Record Metadata returned as RDF XML Document
RDF subject-predicate-object triples for TCS name object
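As a rough illustration of how metadata returned as RDF/XML breaks down into subject-predicate-object triples, the sketch below reads a small document with Python's standard library. The `SAMPLE` document is hypothetical: it uses Dublin Core properties and a made-up publication identifier as stand-ins, not the actual TCS vocabulary returned by IndexFungorum, and the `triples` helper handles only the simple `rdf:Description` form.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical metadata document; property names and values are placeholders.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="urn:lsid:indexfungorum.org:names:417119">
    <dc:title>Placeholder name</dc:title>
    <dc:source rdf:resource="urn:lsid:example.org:pubs:1"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Return (subject, predicate, object) tuples from simple RDF/XML."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}local"; join them into one URI.
            predicate = prop.tag[1:].replace("}", "")
            # The object is either a linked resource or a literal text value.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            result.append((subject, predicate, obj))
    return result


for triple in triples(SAMPLE):
    print(triple)
```

A full RDF parser would be needed for typed nodes, blank nodes and nested descriptions; this sketch covers only the flat case.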
X-Standards, RDF & GUIDs
• TDWG reformulating existing standards expressed in XML-Schema as part of a generalized ‘TDWG Ontology’
• RDF Metadata are formalized as ‘vocabularies’ derived from the ontology
• E.g. Return from IndexFungorum conforms to TDWG TCS Vocabulary
• LCR working on DotNet LSID server/resolvers and linking TCS/ABCD (for IndexFungorum and Zoobank)
Conclusions
• GUIDs essential for cross-referencing in any X-Standard – even if present as generic URI ‘placeholders’ for such things as Taxon Concepts etc.
• Merits of LSID/RDF still debated
• TDWG Ontologies and LSID RDF vocabularies immature
• But ... point the way to components of a Veg-X standard that need to be harmonized across standards
• TDWG – appropriate umbrella organisation for development of a Veg-X standard.
Taxon Concepts VegBank Vocabulary
• Assertions – name/publication intersection
• Interpretation – something labelled with an assertion (observation, collection etc)
• Correlation – between interpretations as >, <, =
• Usage – 3rd party opinion, i.e. party 1 believes that party 2’s interpretation using name X should be labelled with name Y
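One way to picture the four terms above is as plain record types. This is a sketch only: the class and field names are assumptions made for illustration and are not taken from the VegBank schema, and all values in the usage lines are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Assertion:
    """A name as used in a particular publication."""
    name: str
    publication: str


@dataclass(frozen=True)
class Interpretation:
    """Something (an observation, a collection) labelled with an assertion."""
    assertion: Assertion
    labelled_object: str  # hypothetical identifier of the observation or collection


class Relation(Enum):
    GREATER = ">"
    LESS = "<"
    EQUAL = "="


@dataclass(frozen=True)
class Correlation:
    """A >, < or = relationship between two interpretations."""
    left: Interpretation
    relation: Relation
    right: Interpretation


@dataclass(frozen=True)
class Usage:
    """Party 1's opinion that party 2's interpretation should carry name Y."""
    opinion_holder: str
    interpretation: Interpretation
    preferred_name: str


# Placeholder data, purely illustrative.
name_x = Assertion("Name X", "Publication 1")
plot_record = Interpretation(name_x, "observation-001")
opinion = Usage("Party 1", plot_record, "Name Y")
print(opinion.interpretation.assertion.name, "->", opinion.preferred_name)
```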
Flavours of Taxon Concepts
1. Use of names in primary taxonomic literature
• Nomenclatural statement (name attached to types, isotypes & protologue description)
• Homotypic synonyms (names based on same type, ‘objective’)
• Heterotypic synonyms (names based on different types)
• Taxonomic opinion expressed as published list of synonyms, perhaps with emended description and lists of collections examined
2. Use of names in secondary taxonomic literature
• names within floras/faunas, guide books, keys etc
3. Use of names attached to ‘events’
• names within species lists, surveys, observation records etc
For LCR & NZOR:
• 1 & 2 stored in taxonomic database.
• 3 stored against ‘events’ database.
• Systems need to ‘expand’ names against combined concept stores.
• TCS is designed to accommodate 1 & 2, not necessarily 3
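The idea of ‘expanding’ a name against combined concept stores might look something like the sketch below. It is an assumption about what expansion could mean (collecting every name that shares a concept with the queried name in any store), not a description of the LCR or NZOR systems; the function names and the placeholder names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set


def build_index(concepts: Iterable[Set[str]]) -> Dict[str, FrozenSet[str]]:
    """Map each name to the names that share at least one concept with it."""
    index: Dict[str, FrozenSet[str]] = {}
    for names in concepts:
        group = frozenset(names)
        for name in group:
            # A name may occur in several concepts; keep the union for that name.
            index[name] = index.get(name, frozenset()) | group
    return index


def expand(name: str, *stores: Dict[str, FrozenSet[str]]) -> Set[str]:
    """Collect the name plus its one-step matches across all given stores."""
    result = {name}
    for store in stores:
        result |= store.get(name, frozenset())
    return result


# Placeholder concept stores, e.g. one from primary and one from secondary literature.
primary = build_index([{"Name A", "Name B"}])
secondary = build_index([{"Name B", "Name C"}])

print(sorted(expand("Name B", primary, secondary)))
```

The expansion here is deliberately one step deep: a name picks up only the names that directly share a concept with it, so a survey record labelled "Name A" would not automatically match "Name C".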