GBIF Publishing Platform May 2011


GBIF Publishing Platform

May 2011

Core publishing focus

Primary Biodiversity Data (Specimens & Observations, Ecological Data) - Core data type is an occurrence of a taxon

Taxonomic Catalogues*, and Annotated Species Checklists. - Core data type is a taxon

* To distinguish our efforts from COL – GBIF provides the means not the ends

Enriched resource metadata – primarily focused on occurrence and taxon datasets.

Core publishing targets

Sufficient coverage of fit-for-use primary biodiversity data to meet identified requirements. A Primary Biodiversity Data clearinghouse.

A comprehensive “catalogue of catalogues” that provides core taxonomic dictionaries and an organisational framework for biodiversity data

* To distinguish our efforts from COL – GBIF provides the means not the ends

A comprehensive inventory of Primary Biodiversity data collections (digital and un-digitised) and nationally or thematically relevant species checklists.

Data Publishing Platform for who?

• Institutional publishers in developed countries. A large proportion of current publishers are in this category.

• Smaller institutions with less technical capacity. Many are in highly biodiverse regions.

• Small (individual scientist) data holders

• “Disenfranchised” potential publishers who currently don’t recognise GBIF as a publishing option

• Consolidate

• Strengthen

• Simplify

• Accelerate

• Extend

Data publishing strategy

How

Data publishing today

• (Often) Unsupported software on dedicated servers
• Misused protocols
• Multiple data formats
• Complex, rigid and difficult to maintain even for those with capacity
• Requires system administrators and programmers
• Re-indexing the CURRENT set of resources = 1 MONTH!!

these are toothpicks

TAPIR

Consolidated data standards

Primary Biodiversity Data

Taxonomic Data

Metadata

Darwin Core

• 172 Terms
• Ratified in 2009
• Text files
• Extensible

Ecological Metadata Language (EML)

• Rich dataset descriptions
• GBIF Profile

Darwin Core Archive

Primary Biodiversity Data

Taxonomic Data

Metadata

http://www.someplace.org/data.zip

Accelerate network performance

• Darwin Core Archives – self-contained packages of data
• Published as a URL!
• Consolidated – one format for Primary and Taxonomic data
• From Months to Daily – for some users – faster
• Published data appears much faster!
• Simplified harvesting (Priority #3, Participants Report)
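To make "self-contained package" concrete, here is a minimal Python sketch that builds a tiny Darwin Core Archive in memory and reads its core data file back. It assumes the usual archive layout of a `meta.xml` descriptor plus a delimited text file. The file name `occurrence.txt`, the sample records and the helper names `build_archive` and `read_core` are illustrative assumptions, not part of the slides. In practice the bytes would come from downloading the published URL.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the Darwin Core text descriptor (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"

# Illustrative descriptor: one core file of occurrences, tab-delimited, one header line.
META_XML = r"""<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n"
        ignoreHeaderLines="1" rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/countryCode"/>
  </core>
</archive>
"""

# Made-up sample records, for illustration only.
OCCURRENCE_TXT = (
    "id\tscientificName\tcountryCode\n"
    "occ-1\tPuma concolor\tCR\n"
    "occ-2\tQuercus robur\tDK\n"
)


def build_archive() -> bytes:
    """Zip the descriptor and the data file into one package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("meta.xml", META_XML)
        zf.writestr("occurrence.txt", OCCURRENCE_TXT)
    return buf.getvalue()


def read_core(archive_bytes: bytes):
    """Return (rowType, records) for the core file described in meta.xml."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
        core = root.find(DWC_TEXT_NS + "core")
        location = core.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location").text
        # The descriptor stores the delimiter as an escape sequence such as "\t".
        delimiter = core.get("fieldsTerminatedBy", ",").encode().decode("unicode_escape")
        skip = int(core.get("ignoreHeaderLines", "0"))
        # Map column index -> short term name, e.g. 1 -> "scientificName".
        columns = {int(core.find(DWC_TEXT_NS + "id").get("index")): "id"}
        for fld in core.findall(DWC_TEXT_NS + "field"):
            columns[int(fld.get("index"))] = fld.get("term").rsplit("/", 1)[-1]
        text = zf.read(location).decode(core.get("encoding", "UTF-8"))
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]
    records = [{name: row[i] for i, name in columns.items()} for row in rows]
    return core.get("rowType"), records


if __name__ == "__main__":
    row_type, records = read_core(build_archive())
    print(row_type)
    for record in records:
        print(record)
```

Because the descriptor travels inside the zip, a harvester needs only the URL of the archive to interpret the data.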

Darwin Core Archives

http://data.si.edu/dwc/archive.zip

• Rebuilt following community consultation
• Provide a supported, evolving publishing tool
• Supports all three CORE data types (Primary/Taxon/Metadata)
• 2011 – Establish a Steering Committee (SC) to guide product direction
• Seek guidance from SC on directions too!
• A Platform for offering Community Services

Integrated Publishing Toolkit 2.0

Strengthened publishing tool

Integrated Publishing Toolkit

• Metadata Authoring
• Primary Biodiversity Data
• Species Checklists

Data Hosting Centers

Coming soon…
• Endangered Wildlife Trust
• SABIF EIA Data Center
• INBIF EIA Data Center

Publish with spreadsheets

• Metadata
• Primary Biodiversity data
• Species Checklists
• Publishing via email
• For biologists and database managers

No special software required

Suite of publishing options

Data Publishing documentation

• Full documentation for all aspects of data publishing
• Living documents

Improve Data Quality

New roles for engaging Participants

Enable Data Quality Assessment & Improvement as part of the Network

• Today done in Copenhagen

• Tomorrow – through the network

• Improvements made BEFORE data published

• New and increased roles for participating in GBIF
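As one illustration of what "improvements made BEFORE data published" could look like on the publisher's side, here is a minimal Python sketch of a pre-publication check on a single occurrence record. The three Darwin Core terms are real, but the rules, the function name `check_occurrence` and the choice to treat coordinates as optional are assumptions made for this example, not a GBIF-defined validation procedure.

```python
def check_occurrence(record: dict) -> list:
    """Return a list of human-readable problems found in one occurrence record.

    Values are expected as strings, as they would be read from a text file.
    """
    problems = []

    # A record without a name cannot be linked to a taxon.
    if not record.get("scientificName", "").strip():
        problems.append("scientificName is missing")

    # Coordinates are optional in this sketch, but must be valid when present.
    for term, limit in (("decimalLatitude", 90.0), ("decimalLongitude", 180.0)):
        raw = record.get(term, "")
        if raw == "":
            continue
        try:
            value = float(raw)
        except ValueError:
            problems.append(f"{term} is not a number: {raw!r}")
            continue
        if abs(value) > limit:
            problems.append(f"{term} out of range: {value}")

    return problems


if __name__ == "__main__":
    good = {"scientificName": "Puma concolor",
            "decimalLatitude": "9.9", "decimalLongitude": "-84.1"}
    bad = {"scientificName": " ",
           "decimalLatitude": "95", "decimalLongitude": "abc"}
    print(check_occurrence(good))
    print(check_occurrence(bad))
```

Running checks like these at the source lets problems be fixed by the people who know the data, rather than after harvesting.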

Extend the standard

• Darwin Core Archives are extensible
• Simple
• Extensible
• Internationalised
• Standards-based

Global Names Architecture

A Darwin Core-based profile using the GBIF network to share taxonomic information.

Evaluation underway – 16 reviewers / 39 checklists

GBIF Schema Repository

http://rs.gbif.org/

Darwin Core Terms

List of Extensions

Vocabularies

A schema repository for developers and trainers

Ensure local uptake of technology

Make data publishing worthwhile

• Improved attribution through provenance improvements in Registry

• Improved relevance through extensibility

• Darwin Core Archives support multiple uses of data.

Not all roads lead to Copenhagen

For Organisations For Individuals

• Data publishing = scholarly publishing

• Increased visibility due to simplified and consistent citation methods

• Data is easier to consume!

For Both

Make good data even better!

Persistent Identifiers

Journal System

Submission

Acceptance

Revision

Peer Review

Publication

Registry

GBRDS

DOI

Distributed Metadata Catalogues

Metadata Authors

Auto conversion to manuscript

GBIF Metadata Repository

ZooKeys PhytoKeys BioRisks

Data Paper: Recognising Data Discovery

Reward data publishing

Data Paper

Metadata document

Deep Data Citation Mechanism & Service

Deep data citation mechanism
• Recognise ALL with their roles
• Multilayer citation – Publisher determined and User driven
• Cascading citation: Citations within citations
• Assign GUIDs for citation text

Data Citation Service
• Register citations
• Resolve citation GUIDs

Status: Working with CODATA Data Citation Task Group & DataCite
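The mechanism and service above can be sketched as a small in-memory registry in Python: each citation text gets a GUID when it is registered, records contributors with their roles, and may reference the GUIDs of other citations, which gives the cascading "citations within citations". The class and method names (`Citation`, `CitationRegistry`, `register`, `resolve`, `expand`) and the sample texts are hypothetical. This is an illustration of the idea, not the design of any actual GBIF, CODATA or DataCite service.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Citation:
    text: str
    roles: dict                                # role -> contributor, e.g. {"publisher": "Museum X"}
    cites: list = field(default_factory=list)  # GUIDs of nested citations


class CitationRegistry:
    """Hypothetical in-memory stand-in for a data citation service."""

    def __init__(self):
        self._by_guid = {}

    def register(self, text, roles, cites=()):
        """Store a citation and return the GUID assigned to its text."""
        for guid in cites:
            if guid not in self._by_guid:
                raise KeyError(f"unknown citation GUID: {guid}")
        guid = str(uuid.uuid4())
        self._by_guid[guid] = Citation(text, dict(roles), list(cites))
        return guid

    def resolve(self, guid):
        """Look up the citation registered under a GUID."""
        return self._by_guid[guid]

    def expand(self, guid):
        """Return the citation text followed by all nested texts, depth-first."""
        citation = self.resolve(guid)
        texts = [citation.text]
        for nested in citation.cites:
            texts.extend(self.expand(nested))
        return texts


if __name__ == "__main__":
    registry = CitationRegistry()
    a = registry.register("Dataset A", {"publisher": "Museum X"})
    b = registry.register("Dataset B", {"publisher": "Herbarium Y"})
    download = registry.register("Aggregated download", {"compiler": "User Z"}, cites=[a, b])
    print(registry.expand(download))
```

Because a citation can only reference GUIDs that are already registered, the nesting cannot form a cycle, so `expand` always terminates.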

Anticipated Impact of Deep Data Citation

Best Practice Guide on Data Publishing & Use Declaration

• Survey of existing ‘Data Sharing’ & ‘Data Use’ agreements (35 agreements)

• Issues identified:
– Terminologies used: Provider vs. Publisher, Sharing vs. Publishing, Agreement vs. Declaration
– Lack of adequate information: e.g. data citation, fitness-for-use, license, etc.

• Best Practice Guide on Publishing & Use Declaration (Q3 2011)

Catalogue of tools & services for discovery, digitisation & publishing

• Tools and services for:
– data/metadata capture/collection
– data digitisation
– quality assessment
– quality enhancement
– discovery and publishing

• Community-driven
• Frequent updates
• Included in Welcome Box and Online Resource Centre

Questions: METADATA

Mobilising more metadata

In order to get more metadata catalogues connected, we need greater uptake by GBIF Participants and other organisations. How?

• act opportunistically – just accept what is offered?

• just target some of the key external networks (e.g., KNB)?

• undertake a focused campaign among Participants using the Participants Report feedback?

• is there a case for a second round of the incentivisation scheme for metadata catalogues but with a focus on “bigger catches” like national catalogues?

Questions: Checklists

Mobilising more sources

What strategies should be employed for mobilising species lists?

Example: Harmonising national and state-relevant species lists to ITIS in USA

Priority lists – National Species Inventories

Priority subject – National Clearinghouse Mechanisms

Questions: GEO BON / GEOSS

How can the GBIF community get itself better represented in GEOSS / GEO BON?

• leave it to the GBIF Secretariat?

• just need to be kept informed, e.g., via community site and GBIF news items?

• active participation – have resources to contribute infrastructure at national level?

• participate in project consortia to take development forward?