MarLIN CSIRO Marine Laboratories Information Network update April 1999 Tony Rees Divisional Data Centre CSIRO Marine Research, Hobart acknowledgements: Miroslaw Ryba; Angela Way; Kim Finney; Pamela Brodie http://www.marine.csiro.au/dmr/database/marlin/


MarLIN

CSIRO Marine Laboratories Information Network

update April 1999

Tony Rees

Divisional Data Centre

CSIRO Marine Research, Hobart

acknowledgements: Miroslaw Ryba; Angela Way; Kim Finney; Pamela Brodie

http://www.marine.csiro.au/dmr/database/marlin/


Format for this presentation

• Look at the overall context - organisations, data, and metadata

• Explain the construction / operation of the MarLIN application

• Visit MarLIN on-line - look at some key / new features and content

• Consider “where to from here? …”

• Present “MarLIN contributor of the year” award


Common organisational problems regarding data

• Tendency to “knowledge loss”...

– No one person can be familiar with the entire data resource

– Individuals change projects, change jobs or organisations, or retire, taking their knowledge with them

– Existing resources become scattered and cannot be interpreted in isolation

– Much information is in people’s heads rather than in a written format

• Tendency towards information disparity

– No common architecture for accessing or describing data. Existing information may be stored in proprietary software using incompatible formats and standards

• Tendency to undervalue historic data

– Resources are most frequently directed at new data acquisition rather than curation or documentation of existing data

• Information tends to be spread across a number of areas of responsibility

– Projects, corporate group, library, records etc.: each owns “part” of the picture

– Limited resources or coordination are available for a common course of action

• The situation compounds unless steps are taken to address it


Metadata are a solution ...

• METADATA are structured information about data, e.g. …

– “Dataset” has broad interpretation - could be paper records, digital data, images, maps, specimens, bibliographies ...

• Metadata should comply with a national standard (e.g. AUS, US), an enhanced national standard (e.g. Blue Pages, ASDD), or an international standard (ISO TC211)

• Metadata directories can be standalone (e.g. list or database on a single PC) or more widely accessible (e.g. printed form, CD-ROM, intranet or internet)

• WWW versions can be a “static” catalogue, or provide on-line access to a live database (or across several databases)

Example metadata elements:

– Dataset name

– Geographic extent

– Description/subject matter

– Contributors

– Acknowledgements

– Storage and access details

– On-line links

– Data Quality information

– Documentation

– Time period covered

– Custodian / contact information

– etc.
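
The metadata elements listed on this slide can be pictured as the fields of a single structured record. The sketch below is illustrative only: the class name, field names, and the simple text search are assumptions made for the example, not MarLIN's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Hypothetical record; field names mirror the slide's list, not MarLIN's schema."""
    dataset_name: str
    description: str
    custodian: str
    geographic_extent: Optional[str] = None
    time_period: Optional[str] = None
    storage_and_access: Optional[str] = None
    data_quality: Optional[str] = None
    contributors: List[str] = field(default_factory=list)
    online_links: List[str] = field(default_factory=list)

    def matches(self, term: str) -> bool:
        """Case-insensitive substring search over the name and description."""
        needle = term.lower()
        return (needle in self.dataset_name.lower()
                or needle in self.description.lower())


# Made-up entry for illustration; not a real MarLIN record.
example = MetadataRecord(
    dataset_name="Example voyage temperature profiles",
    description="Illustrative entry only.",
    custodian="Divisional Data Centre",
    geographic_extent="Tasman Sea",
)
```

Because every record carries the same named fields, a directory can search or export them uniformly, which is the property the standards mentioned above are meant to guarantee.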


We are not alone … - example from another agency (British Antarctic Survey):

“BAS Metadata and the MDMS (Metadata Management System)”

“Metadata are "Data about Data", for example, where and when they were collected, who they were collected by and what equipment was used. Without effective supporting metadata data themselves are useless. Therefore, improving the quality and availability of metadata acts to improve the overall quality, availability, and therefore security, of BAS's data holdings. This is especially important for data sets of long term significance where the data may well be used many years after their collection, when those who collected the data are not available for reference.

In order to improve the quality, availability and security of BAS's data holdings, the AEDC has developed the BAS Metadata Management System (MDMS). The MDMS is a web based system which allows BAS scientists and data managers to record and maintain a standardised set of metadata for the data they are responsible for.

The MDMS also provides a tool for finding data, allowing scientists to see what other data within BAS might be relevant to their science area by viewing metadata entries created by others. All metadata entries contain information on what the data are, where and when they were collected, who is responsible for them and where they are located. You can search the MDMS to find out information on BAS datasets.”


The external environment - example metadata directories (all links accessible from “About MarLIN” page)


Benefits of metadata for CSIRO Marine Research

• For the organisation’s internal use - what data is where, who holds it, what form it is in …

– “An integrated data resource supported by a comprehensive metadata warehouse is the institutional memory for an organisation.”

• For providing information (metadata and/or data) to external enquirers on a self-serve basis

– “Metadata records provide information about data in a similar way that library catalogues provide information about books. A catalogue facilitates searching for particular topics or author(s), and metadata is searchable in a comparable way.”

• To enable the Division to participate in national metadata sharing activities e.g. “Blue Pages”, ASDD, etc.

– “The Australian Spatial Data Directory (ASDD) provides search interfaces to geospatial dataset descriptions (metadata) from all jurisdictions throughout Australia.”


MarLIN “value adding”

(Diagram; labels only)

– User enters into table “DataInfo”

– MarLIN entry (as visible to user)

– Organisation details

– Project details

– Voyage details

– Voyage tracks (gif images)

– Person details

– CAAB species details (CAAB database)

– Entry & update details (automatically generated)
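
One way to picture the “value adding” in this diagram is as a join: the user fills in a row of “DataInfo”, and the displayed entry pulls in details from supporting tables. The sketch below uses an in-memory SQLite database purely for illustration. Only the table name “DataInfo” comes from the slide; the other table names, all column names, and the sample rows are assumptions, and nothing here describes the database software MarLIN actually runs on.

```python
import sqlite3

# In-memory database with a hypothetical, much-reduced schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Project (project_id INTEGER PRIMARY KEY, project_name TEXT);
    CREATE TABLE Person  (person_id  INTEGER PRIMARY KEY, full_name TEXT);
    CREATE TABLE DataInfo (
        record_id    INTEGER PRIMARY KEY,
        dataset_name TEXT,
        project_id   INTEGER REFERENCES Project(project_id),
        custodian_id INTEGER REFERENCES Person(person_id)
    );
    INSERT INTO Project  VALUES (1, 'Example Project');
    INSERT INTO Person   VALUES (10, 'Example Custodian');
    INSERT INTO DataInfo VALUES (100, 'Example dataset', 1, 10);
""")


def displayed_entry(record_id):
    """Build the user-visible entry by joining DataInfo to its supporting tables."""
    row = conn.execute(
        """
        SELECT d.dataset_name, p.project_name, c.full_name
        FROM DataInfo d
        LEFT JOIN Project p ON p.project_id = d.project_id
        LEFT JOIN Person  c ON c.person_id  = d.custodian_id
        WHERE d.record_id = ?
        """,
        (record_id,),
    ).fetchone()
    if row is None:
        return None
    return {"dataset": row[0], "project": row[1], "custodian": row[2]}
```

The contributor only types identifiers for the project and person; the names shown in the entry are looked up, so a correction made once in a supporting table appears in every entry that refers to it.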


MarLIN linkages

(Diagram; labels only)

– MarLIN database (CMR’s records)

– Hyperlinks to documents, data, etc.

– “Blue Pages” HTML documents (many organisations’ records)

– Selected details exported to ...

– Online link back to ...

– Blue Pages search facility

– Internet search engines

– MarLIN search facility

– ASDD (= “virtual directory”) search facility
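
The “selected details exported to ...” and “online link back to ...” arrows in this diagram suggest an export step that writes a subset of each record to an HTML document carrying a link to the full MarLIN entry. A minimal sketch of that idea follows. The field selection, the HTML layout, the function name, and the example URL are all assumptions; the real Blue Pages document format is not described in these slides.

```python
from html import escape

# Assumed subset of fields to share; the real selection is not given in the slides.
EXPORT_FIELDS = ["dataset_name", "custodian", "description"]


def export_record_html(record, marlin_url):
    """Render the selected fields as an HTML table plus a link back to the full record."""
    rows = "\n".join(
        f"<tr><th>{escape(name)}</th><td>{escape(str(record.get(name, '')))}</td></tr>"
        for name in EXPORT_FIELDS
    )
    link = f'<a href="{escape(marlin_url, quote=True)}">Full MarLIN record</a>'
    return f"<html><body>\n<table>\n{rows}\n</table>\n<p>{link}</p>\n</body></html>"


# Made-up record; "internal_note" is not in EXPORT_FIELDS, so it is not exported.
page = export_record_html(
    {"dataset_name": "A & B survey", "custodian": "Example Custodian",
     "description": "Illustrative only.", "internal_note": "not for export"},
    "http://example.invalid/record?id=1",
)
```

Exporting only a named subset keeps internal fields out of the shared document, while the link back lets a reader who finds the record externally reach the complete entry.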


MarLIN progress since April 1998

• MarLIN content continuously being added to

• Divisional Projects table populated

• Lots of improvements to EDIT interface

• Lots of improvements to SEARCH interface - including making it accessible to external enquirers

• A few new fields created (contributors, related datasets, metadata owner, GIS formats) and some existing ones modified (e.g. allowing multiple choices)

• New interface built for locating and delivering research vessel data

• Automatic links to voyage tracks, for relevant datasets

• Supporting tables can be viewed directly by users, and used as jump-off point for searches

• Automated export facility for sharing MarLIN information with “Blue Pages”


(go live to MarLIN if accessible)


Why contribute metadata to MarLIN?

• Making metadata records should be considered part of “best practice” data management

• Writing metadata is done most effectively by the data collectors/data custodians, with assistance if needed, while the data are current (or near current)

• Contributing/updating metadata need not be an onerous task, and can assist in project-level data management and information exchange

• Other benefits of writing metadata records …

– Free and ongoing national/international exposure for the work, for relatively modest effort expended

– Recognition of the efforts of contributors and collaborators

– Allows potential users of the data to locate relevant information on a self-serve basis, and can be used to provide data delivery on demand to relevant clients

– Ensures that essential supporting information is listed in one place, to maximise potential for re-use of the data

– Internet-based display method allows unlimited facility to cross-link to other www resources (at CMR or elsewhere)


How the Data Centre can assist …

• Providing easy-to-use tools, customised to CMR activities, and maintaining a policy of “continuous improvement” and response to users’ suggestions

• Providing training, demonstrations/workshops, and ongoing assistance in use of the metadata system

• Populating the metadatabase with “core” records about Divisional data so that the system becomes useful from an early stage

• Adding value to information entered by providing automated links (e.g. to project and voyage information), facilitating on-line data access, etc.

• Adding exposure to metadata records (where appropriate) by porting them to external schemes such as the “Blue Pages” (gives a variety of access points to the information)

• Promoting “top-down”, “bottom up” and sideways recognition of the value of documenting datasets and freeing up metadata visibility


Topics for further consideration …

• How best to progress the present metadata initiative?

– via current program/project structure? (e.g., designated metadata persons, PPE system)

– via Data Centre visits/data audits?

– from published information?

– from unpublished “legacy” data holdings/archives?

– from Divisional “communication” activities - posters, etc?

– from new grant applications?

• How important is database comprehensiveness / completeness?

• What are people’s feelings about metadata? How valuable are this and similar systems, in users’ experience? Are there user needs not presently catered for?

• Are links possible / desirable to other databases around the Division?


MarLIN award 1998-9 for most metadata entries

Awarded to … Ken Suber - Remote Sensing Project


ASDD (Australian Spatial Data Directory) - example distributed database search system (as at April 1999)

Database                                            No. of records  Typical quoted search time
ACT Spatial Data Directory                                     118  7 seconds
AUSLIG Data Directory                                           44  1 second
BRS - Incorporating Other Commonwealth Data                    204  7 seconds
Bureau of Meteorology                                            8  1 second
EA Environmental Data Directory (Green Pages)                   64  3 seconds
IndexGeo Pty Ltd - Eco Companion catalogue                       4  4 seconds
NSW Natural Resources Data Directory                          3228  4 seconds
Northern Territory                                              50  1 second
Queensland Spatial Data Directory                             8382  303 seconds
South Australia Spatial Information Directory                  401  4 seconds
Victorian Spatial Data Directory                               466  2 seconds
WALIS Interrogator-Aerial Photography                         4747  64 seconds
WALIS Interrogator-Environmental Impact Statements             785  3 seconds
WALIS Interrogator-Spatial Data                               1293  4 seconds

Total ASDD record count = 19794 dataset descriptions
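
As a quick arithmetic check, the per-database figures quoted on this slide can be tallied to reproduce the stated total and to pick out the slowest node. The tuple layout and variable names below are illustrative only; the numbers are copied from the slide.

```python
# (database, number of records, typical quoted search time in seconds),
# as listed on the slide (April 1999).
asdd_nodes = [
    ("ACT Spatial Data Directory", 118, 7),
    ("AUSLIG Data Directory", 44, 1),
    ("BRS - Incorporating Other Commonwealth Data", 204, 7),
    ("Bureau of Meteorology", 8, 1),
    ("EA Environmental Data Directory (Green Pages)", 64, 3),
    ("IndexGeo Pty Ltd - Eco Companion catalogue", 4, 4),
    ("NSW Natural Resources Data Directory", 3228, 4),
    ("Northern Territory", 50, 1),
    ("Queensland Spatial Data Directory", 8382, 303),
    ("South Australia Spatial Information Directory", 401, 4),
    ("Victorian Spatial Data Directory", 466, 2),
    ("WALIS Interrogator-Aerial Photography", 4747, 64),
    ("WALIS Interrogator-Environmental Impact Statements", 785, 3),
    ("WALIS Interrogator-Spatial Data", 1293, 4),
]

# Sum the record counts across all participating databases.
total_records = sum(records for _, records, _ in asdd_nodes)

# The node with the longest quoted search time.
slowest_name, slowest_records, slowest_seconds = max(
    asdd_nodes, key=lambda node: node[2]
)
```

The tally matches the slide's total of 19794 dataset descriptions, and the longest quoted time (303 seconds) belongs to the Queensland Spatial Data Directory, which is also the largest node.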
