BioCASE – A Biological Collection Access Service for Europe
BioCASE programme – metadata and computing methods
The Irish National Node Workshop: October 13th, 2003. The Royal Irish Academy, Dublin.
Chris Emblow, Ecological Consultancy Services Ltd. (EcoServe)
BioCASE Methodology
This development of an advanced electronic access system is based on:
1. National nodes - standardised input through meta-information
2. Enhancement of key metadata using indexing tools and an extendible thesaurus…a WWW-based user interface
3. Applications implemented at the central access node to combine meta-information with specimen or observation-level data
4. A User Panel to channel user feedback throughout the project period
5. A business model to sustain the service after the project period and IPR issues addressed in collaboration with other FP5 projects
BioCASE data methodology: all levels
[Diagram labels: Collection Level, Unit Level]
BioCASE data methodology: collection level
Stage 1: Metadata collection
Stage 2: Data capture and storage
BioCASE data methodology: central node data harvesting
BioCASE: National Node Network
BioCASE National Node
CORM
• 31 National Nodes
• Core Meta Database is updated every night
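The nightly update can be pictured as a merge of records from each national node into the core store. The following is only a minimal sketch under stated assumptions: the record shape (`collection_id`, `modified`) and the function name `harvest` are hypothetical and are not taken from the BioCASE software.

```python
from typing import Dict, Iterable, List

# Hypothetical record shape (an assumption, not a BioCASE format):
# each national node returns dicts with "collection_id" and "modified",
# where "modified" is an ISO date string such as "2003-10-12".
Record = Dict[str, str]


def harvest(core: Dict[str, Record], node_batches: Iterable[List[Record]]) -> int:
    """Merge records from every node into the core store.

    Returns the number of records that were inserted or overwritten.
    """
    changed = 0
    for batch in node_batches:
        for record in batch:
            key = record["collection_id"]
            current = core.get(key)
            # Insert new records, or overwrite when the node copy is newer.
            # ISO date strings compare correctly as plain strings.
            if current is None or record["modified"] > current["modified"]:
                core[key] = record
                changed += 1
    return changed
```

Run once per night, a routine of this kind leaves the core store holding the most recent copy of every collection record seen so far.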
Stage 3: central node data harvesting
BioCASE data methodology: thesaurus and indexing
Thesaurus and indexing
• Initially, the institution co-ordinating BioCASE will house the Central Node; however, it is envisaged that it will become a virtual facility that can be accessed transparently through a number of servers on the network.
• The Central Node will integrate the data provided by national nodes with the thesaurus module by means of the indexing module.
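One way to picture how an indexing module could tie node data to a thesaurus is sketched below. It is an illustration only: the data shapes, the function name `build_index`, and the sample terms are assumptions, not the BioCASE implementation.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(
    records: Dict[str, List[str]],
    thesaurus: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Map every keyword, plus its related thesaurus terms, to collection ids.

    records:   collection id -> keywords supplied by a national node
    thesaurus: lower-case keyword -> related terms (hypothetical structure)
    """
    index: Dict[str, Set[str]] = defaultdict(set)
    for collection_id, keywords in records.items():
        for keyword in keywords:
            term = keyword.lower()
            index[term].add(collection_id)
            # Also index the collection under each related thesaurus term,
            # so a search for a related term finds it as well.
            for related in thesaurus.get(term, set()):
                index[related].add(collection_id)
    return dict(index)
```

With a thesaurus entry relating "bryophyta" to "mosses", a collection keyworded only as "Bryophyta" would also be found by a search for "mosses".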
BioCASE data methodology: user interface
BioCASE: Preliminary User Interface
A preliminary user interface displaying information collected by EcoServe to date is available at:
www.biocase.org/biocasesimple/Default.cfm
BioCASE: Preliminary Access Interface
Advanced user interface – under development
Collection versus Unit Level
• Number of nodes
– National node network: 31 nodes + SINs
– Unit-level network: several hundreds to thousands of provider nodes in Europe
• Complexity of data profile (and query processes)
– BioCASE MetaProfile: about 60 elements
– ABCD unit-level schema: several hundreds of elements
• Control
– Collection level: national nodes exert some control
– Unit-level: providers to retain full control
BioCASE: all data levels
[Diagram labels: National Nodes (NN), Special Interest Networks, Collection-level Meta-Data, Core Data Items (BioCASE Profile), BioCASE Core, Thesaurus, keywords, keyword relations, Enhanced Meta-Data, Metadata Index, WWW Interface, Unit access, Unit Information DB (four instances), Unit-Data (ABCD)]
Unit level data standard: a history
• 1990 Accessions subgroup of the Taxonomic Databases Working Group
• Trying to provide field list for data exchange
• Botany: Types, ITF, Names, HISPID
• ASC model 1992, CDEFD 1995, BioCISE 1998
• CODATA/TDWG Working Group 2000: Access to biological collection data, aim: ABCD XML Schema
Unit-level data standard: ABCD
• Support: collection networking initiatives in the Americas, Australia, Europe, Japan, and New Zealand
• ENHSIN and BioCASE drove the unit-level data specification process in 2001/2002, investing substantial resources
• 2002 ABCD Schema accepted by GBIF
• 2003 GBIF declares that it will support and integrate the BioCASE Network
ABCD Schema design principles
• Common data definition for all biological collections (living, dead, and sets of observations)
• Not a minimal common denominator approach; looking beyond the obvious questions
• Full support for statements of provider rights, IPR, copyright etc.
• Variable atomisation to support wide range of provider database structures
ABCD Schema
The BioCASE unit-level network
• Dynamic network, data held by providers
• Unit-level data providers (collection holders) are connected directly to the BioCASE network
• Standard HTTP Internet protocol
• Wrapper technology to produce standard output from any SQL database
• Output and query standard: ABCD (XML)
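The wrapper idea, producing a standard XML output from whatever table layout a provider has, can be sketched roughly as follows. This is not the BioCASE provider software: the function `wrap_units`, the table and column names, and the element names (`Units`, `Unit`, `UnitID`, `ScientificName`, `Locality`) are simplified placeholders chosen for illustration and do not reproduce the actual ABCD schema.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Dict


def wrap_units(conn: sqlite3.Connection, table: str, mapping: Dict[str, str]) -> str:
    """Render every row of `table` as XML, renaming columns via `mapping`.

    mapping: local column name -> output element name (hypothetical names).
    The table and column names are assumed to come from trusted local
    configuration, never from a remote request.
    """
    columns = list(mapping)
    cursor = conn.execute(f"SELECT {', '.join(columns)} FROM {table}")
    root = ET.Element("Units")
    for row in cursor:
        unit = ET.SubElement(root, "Unit")
        for column, value in zip(columns, row):
            # Skip empty fields so that sparse records stay valid output.
            if value is not None:
                ET.SubElement(unit, mapping[column]).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Because only the mapping changes from one provider to the next, the same code can sit in front of differently structured databases.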
Unit data
• Use concepts as defined in ABCD
• Very flexible because of variable atomisation
• Not all information ABCD defines is needed
– Geo-referenced taxa desired, but even just an object identifier (e.g. collector's field number) and a name or the location is sufficient
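The minimum stated above (an object identifier plus a name or a location) can be expressed as a simple check. The sketch below assumes hypothetical field names `unit_id`, `name` and `location`; they are not ABCD element names.

```python
from typing import Mapping, Optional


def is_sufficient(unit: Mapping[str, Optional[str]]) -> bool:
    """True when a unit has an identifier plus a name or a location."""

    def filled(field: str) -> bool:
        # A field counts only if it is present and not blank.
        value = unit.get(field)
        return bool(value and value.strip())

    return filled("unit_id") and (filled("name") or filled("location"))
```

A record holding only a collector's field number and a locality passes; a record with a name but no identifier does not.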
The provider software
• Works on Windows and Unix derivatives (Linux, Mac OS X)
• Works with many relational SQL databases
– Currently: Access, FileMaker, MySQL, Oracle, PostgreSQL, SQL Server
• Generic software design, therefore supports any other concept definition, e.g. Darwin Core, if additionally configured for this
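A generic design of this kind usually means that the concept definition is data rather than code. As a rough sketch, with made-up configuration keys and simplified concept names that are assumptions rather than the provider software's real configuration format:

```python
from typing import Dict

# Hypothetical configuration: local column -> concept name, one mapping per
# output standard. The concept names are simplified placeholders.
CONCEPT_MAPPINGS: Dict[str, Dict[str, str]] = {
    "abcd": {"field_no": "UnitID", "taxon": "ScientificName"},
    "darwin_core": {"field_no": "CatalogNumber", "taxon": "ScientificName"},
}


def translate(row: Dict[str, str], standard: str) -> Dict[str, str]:
    """Rename the columns of one database row to the chosen standard's concepts.

    Columns without a configured concept are left out of the result.
    """
    mapping = CONCEPT_MAPPINGS[standard]
    return {
        concept: row[column]
        for column, concept in mapping.items()
        if column in row
    }
```

Supporting a further standard then amounts to adding one more entry to the configuration, with no change to the translation code.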
Provider software installation
• Requirements:
– SQL Database with unit-level data
– Webserver connected to the Internet
• Install Python scripting language as CGI
– available as open source: http://www.python.org
• Installation & configuration user guide:
– http://www.biocase.org/dev/provider/index.shtml
• Report wrapper URL to BioCASE secretariat
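For readers unfamiliar with running Python as CGI, the sketch below shows the general shape of such a script: the web server passes the request in environment variables and the script prints a header block followed by the body. It illustrates CGI in general, not the BioCASE wrapper; the `name` query parameter and the `<request>` element are invented for the example.

```python
import os
from typing import Mapping
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def respond(environ: Mapping[str, str]) -> str:
    """Build a CGI response (headers, blank line, XML body) for one request."""
    # The web server supplies the query string in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", [""])[0]  # hypothetical parameter
    # Escape the value so that characters such as "&" stay valid XML.
    body = f"<request><name>{escape(name)}</name></request>"
    return "Content-Type: text/xml\n\n" + body


if __name__ == "__main__":
    # When run by the web server as a CGI script, write the response to stdout.
    print(respond(os.environ), end="")
```

Keeping the request handling in a function that takes the environment as an argument makes it possible to try the script locally before placing it on the web server.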