BioCASE – A Biological Collection Access Service for Europe. BioCASE programme – metadata and computing methods. The Irish National Node Workshop: October 13th, 2003. The Royal Irish Academy, Dublin. Chris Emblow, Ecological Consultancy Services Ltd. (EcoServe)


BioCASE – A Biological Collection Access Service for Europe

BioCASE programme – metadata and computing methods

The Irish National Node Workshop: October 13th, 2003. The Royal Irish Academy, Dublin.

Chris Emblow
Ecological Consultancy Services Ltd. (EcoServe)


BioCASE Methodology

This development of an advanced electronic access system is based on:

1. National nodes: standardised input through meta-information

2. Enhancement of key metadata using indexing tools and an extendible thesaurus … a WWW-based user interface

3. Applications implemented at the central access node to combine meta-information with specimen or observation-level data

4. A User Panel to channel user feedback throughout the project period

5. A business model to sustain the service after the project period, and IPR issues addressed in collaboration with other FP5 projects


BioCASE data methodology: all levels

[Diagram: Collection Level and Unit Level]


BioCASE data methodology: collection level

[Diagram: Collection Level]


Stage 1: Metadata collection


Stage 2: Data capture and storage


BioCASE data methodology: central node data harvesting

[Diagram: Collection Level]


BioCASE: National Node Network

[Diagram labels: BioCASE National Node, CORM]

• 31 National Nodes

• Core Meta Database is updated every night


Stage 3: Central node data harvesting


BioCASE data methodology: thesaurus and indexing

[Diagram: Collection Level]


Thesaurus and indexing

• Initially, the institution co-ordinating BioCASE will house the Central Node. However, it is envisaged that it will become a virtual facility that can be accessed transparently through a number of servers on the network.

• The Central Node will integrate the data provided by national nodes with the thesaurus module by means of the indexing module.
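
One way to picture how an indexing module could tie national-node metadata to a thesaurus is sketched below. It is an illustration only, not the BioCASE implementation: the thesaurus entries, collection identifiers and function names are all hypothetical, and the thesaurus is reduced to a simple mapping from variant keywords to a preferred term.

```python
from collections import defaultdict

# Hypothetical thesaurus: maps variant keywords to one preferred term.
THESAURUS = {
    "seaweeds": "algae",
    "algae": "algae",
    "mosses": "bryophyta",
    "bryophytes": "bryophyta",
}


def normalise(keyword, thesaurus):
    """Return the preferred term for a keyword, or the cleaned keyword itself."""
    key = keyword.strip().lower()
    return thesaurus.get(key, key)


def build_index(collections, thesaurus):
    """Map each preferred term to the ids of collections tagged with it."""
    index = defaultdict(set)
    for coll_id, keywords in collections.items():
        for keyword in keywords:
            index[normalise(keyword, thesaurus)].add(coll_id)
    return index


def search(index, thesaurus, query):
    """Look up a query term after normalising it through the thesaurus."""
    return sorted(index.get(normalise(query, thesaurus), set()))


# Hypothetical collection-level metadata as delivered by national nodes.
collections = {
    "IE-001": ["Seaweeds", "Mosses"],
    "IE-002": ["algae"],
    "DE-001": ["Bryophytes"],
}
index = build_index(collections, THESAURUS)
```

With this toy data, a search for "Algae" and a search for "seaweeds" reach the same collections, because both are normalised to the same preferred term before the index is consulted.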


BioCASE data methodology: user interface

[Diagram: Collection Level]


BioCASE: Preliminary User Interface

A preliminary user interface displaying information collected by EcoServe to date is available at:

www.biocase.org/biocasesimple/Default.cfm


BioCASE: Preliminary Access Interface



Advanced user interface – under development


Collection versus Unit Level

• Number of nodes
– National node network: 31 nodes + SINs
– Unit-level network: several hundreds to thousands of provider nodes in Europe

• Complexity of data profile (and query processes)
– BioCASE MetaProfile: about 60 elements
– ABCD unit-level schema: several hundreds of elements

• Control
– Collection level: national nodes exert some control
– Unit level: providers to retain full control


BioCASE: all data levels

[Diagram: BioCASE architecture across all data levels. Labels recoverable from the figure: National Nodes (NN), Special Interest Networks, Collection-level Meta-Data, Core Data Items (BioCASE Profile), BioCASE Core, Thesaurus (keywords, keyword relations), Enhanced Meta-Data, Metadata Index, WWW Interface, Unit access, Unit Information DB (four instances), Unit-Data (ABCD).]


Unit-level data standard: a history

• 1990 Accessions subgroup of the Taxonomic Databases Working Group

• Trying to provide field list for data exchange

• Botany: Types, ITF, Names, HISPID

• ASC model 1992, CDEFD 1995, BioCISE 1998

• CODATA/TDWG Working Group 2000: Access to biological collection data, aim: ABCD XML Schema


Unit-level data standard: ABCD

• Support: collection networking initiatives in the Americas, Australia, Europe, Japan, and New Zealand

• ENHSIN and BioCASE drove the unit-level data specification process in 2001/2002, investing substantial resources

• 2002: ABCD Schema accepted by GBIF

• 2003: GBIF declares that it will support and integrate the BioCASE Network


ABCD Schema design principles

• Common data definition for all biological collections (living, dead, and sets of observations)

• Not a minimal common denominator approach; looking beyond the obvious questions

• Full support for statements of provider rights, IPR, copyright etc.

• Variable atomisation to support wide range of provider database structures


ABCD Schema


The BioCASE unit-level network

• Dynamic network, data held by providers

• Unit-level data providers (collection holders) are connected directly to the BioCASE network

• Standard HTTP Internet protocol

• Wrapper technology to produce standard output from any SQL database

• Output and query standard: ABCD (XML)
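
The wrapper idea in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BioCASE provider software: it uses an in-memory SQLite database as a stand-in for a provider's SQL database, the table, column names and sample rows are invented, and the output element names are illustrative and have not been checked against the ABCD schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping from provider columns to output element names.
# The element names are illustrative, not verified against the ABCD schema.
COLUMN_TO_ELEMENT = {
    "field_no": "UnitID",
    "taxon": "ScientificName",
    "place": "Locality",
}


def wrap_units(conn, table, mapping):
    """Read rows from a provider table and return them as one XML string."""
    columns = list(mapping)
    # Table and column names come from trusted configuration, not user input.
    sql = "SELECT {} FROM {}".format(", ".join(columns), table)
    root = ET.Element("Units")
    for row in conn.execute(sql):
        unit = ET.SubElement(root, "Unit")
        for column, value in zip(columns, row):
            if value is not None:  # leave out fields the provider has not filled
                ET.SubElement(unit, mapping[column]).text = str(value)
    return ET.tostring(root, encoding="unicode")


def demo_database():
    """Create a throwaway database with two invented specimen records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE specimens (field_no TEXT, taxon TEXT, place TEXT)")
    conn.executemany(
        "INSERT INTO specimens VALUES (?, ?, ?)",
        [("X-1", "Fucus serratus", "Dublin Bay"), ("X-2", "Ulva lactuca", None)],
    )
    return conn


xml_output = wrap_units(demo_database(), "specimens", COLUMN_TO_ELEMENT)
```

The point of the design is that the provider keeps its own table layout; only the column-to-element mapping has to be configured for each database.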


Unit data

• Use concepts as defined in ABCD

• Very flexible because of variable atomisation

• Not all information ABCD defines is needed
– Geo-referenced taxa are desired, but even just an object identifier (e.g. collector's field number) and a name or the location is sufficient
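
The minimum stated above (an object identifier plus a name or a location) can be written as a small check. The sketch below is illustrative only: the dictionary keys "identifier", "name" and "location" are hypothetical and are not ABCD element names.

```python
def is_sufficient(unit):
    """Return True if a unit record has an identifier plus a name or a location."""

    def present(key):
        # A field counts as present only if it holds non-blank content.
        value = unit.get(key)
        return value is not None and str(value).strip() != ""

    return present("identifier") and (present("name") or present("location"))
```

For example, a record holding only a field number and a locality passes the check, while a record with a name and a locality but no identifier does not.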


The provider software

• Works on Windows and Unix derivatives (Linux, Mac OS X)

• Works with many relational SQL databases
– Currently: Access, FileMaker, MySQL, Oracle, PostgreSQL, SQL Server

• Generic software design, therefore supports any other concept definition, e.g. Darwin Core, if additionally configured for this


Provider software installation

• Requirements:
– SQL database with unit-level data
– Web server connected to the Internet

• Install Python scripting language as CGI
– Available as open source: http://www.python.org

• Installation & configuration user guide:
– http://www.biocase.org/dev/provider/index.shtml

• Report wrapper URL to BioCASE secretariat
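
For readers unfamiliar with running Python as CGI, the following sketch shows the general shape of such a script: the web server passes the request in environment variables, and the script writes HTTP headers, a blank line and a body to standard output. It is not the BioCASE wrapper and does not follow its request format; the "taxon" query parameter and the response layout are hypothetical.

```python
#!/usr/bin/env python3
import os
import sys
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def build_response(environ):
    """Build a complete CGI response (headers, blank line, body) as a string."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "taxon" is a hypothetical query parameter used only for illustration.
    taxon = params.get("taxon", [""])[0]
    body = "<response><query>{}</query></response>".format(escape(taxon))
    return "Content-Type: text/xml; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # The web server supplies QUERY_STRING through the process environment.
    sys.stdout.write(build_response(os.environ))
```

Keeping the response logic in a function that takes the environment as an argument makes it possible to try the script without a web server.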