Taxonomic Publications: Past and Future

Donat Agosti (AMNH and NHMB)

Andrew Polaszek (ICZN)

Klemens Böhm and Guido Sautter (Uni Karlsruhe)

Taxonomists at work ……

T. E. Lawrence: Seven Pillars of Wisdom – a triumph. 1st published for general circulation, 1935: p. 535

The traditional flux of information

…a more or less closed system

The group that found the top Quark at Fermilab in Chicago in 1998

Successful scientists at work

The staff of The Natural History Museum, London, 1993

Aren't we doing big science too?

> 6,000 taxonomists worldwide, major institutions (Herbaria, Natural History Museums)

The staff of Entomology at The Natural History Museum, London, 1993

Hawkmoths

Curculionids

Ants

Psyllids

Chalcids

Bugs

Global Biodiversity

Aren't we doing big science?

• > 6,000 taxonomists worldwide, major institutions (Herbaria, Natural History Museums)

• 1.5 million known taxa and about 10 million to go

• > 2 billion specimens in our collections

• Increasing amounts of DNA sequences and whole genomes

• > 1,000 journals covering systematics

• If all of this were connected ...

Finland

“Structure of the World Wide Web in Finland. Circles denote sites and lines denote connecting links.” Courtesy of Bernardo Huberman (HP Labs, Palo Alto)

From B. Huberman, The Laws of the Web, Cambridge, MIT Press, 2001

Why aren't we recognized as "big science"?

It has a lot to do with the way we are currently organized: whose data in this room can be accessed right now over the Internet?

But our sheer numbers and knowledge offer the potential to change this situation.

What needs to be done?

What ought to be changed:

Culture – probably the most difficult to change:

- The way we collaborate (the social aspects)

- The way we exchange and provide access to data

- The way we look at the Internet (Semantic Web)

Scanning

PDF conversion

(WWW)

Electronic revolution? Not yet.

From text document to XML document, or the deconstruction of documents

[Figure: system architecture, based on the Taxon-x schema. Source: Guido Sautter, Information Retrieval for Biodiversity Information]

Storage (RDBMS): Index 1 ... Index n, Docs

Retrieval Engine (functionality in Query Executor & Plugins)

- Analyze queries

- Use indices for SE and result improvement

- Retrieve documents

Query Pipeline: Query Executor, Retrieval Plugin 1 ... n, Measure Plugin 1 ...

Document Analyzer (functionality in GATE & Plugins)

- Analyze documents (NLP)

- Store documents

- Create indices from analysis results

NLP Analysis Pipeline: GATE, Pre-Plugin 1 ..., Analyzer Plugin 1 ... n

Users: Query, Doc, Result, Questions / Feedback

Legend: Document / Query, Meta Data
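
The two plugin pipelines in the figure can be illustrated with a minimal Python sketch. This is not the actual system (which, according to the figure, builds on GATE and an RDBMS): all names below (`Document`, `run_pipeline`, `tokenize_plugin`, `taxon_name_plugin`, `build_index`, `retrieve`) are hypothetical, the index lives in memory, and the "NLP" is reduced to a crude regular expression for two-word names.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Document:
    doc_id: str
    text: str
    # Analysis results ("Meta Data" in the figure's legend), filled in by plugins.
    annotations: Dict[str, List[str]] = field(default_factory=dict)


Plugin = Callable[[Document], Document]


def tokenize_plugin(doc: Document) -> Document:
    """Pre-processing plugin: split the text into lowercase word tokens."""
    doc.annotations["tokens"] = re.findall(r"[a-z]+", doc.text.lower())
    return doc


def taxon_name_plugin(doc: Document) -> Document:
    """Analyzer plugin: crude stand-in for NLP, matching 'Genus species' pairs."""
    doc.annotations["taxa"] = re.findall(r"\b[A-Z][a-z]+ [a-z]{3,}\b", doc.text)
    return doc


def run_pipeline(doc: Document, plugins: List[Plugin]) -> Document:
    """Document Analyzer: pass the document through each plugin in order."""
    for plugin in plugins:
        doc = plugin(doc)
    return doc


def build_index(docs: List[Document], key: str) -> Dict[str, Set[str]]:
    """Create an index from analysis results: annotation value -> document ids."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc in docs:
        for value in doc.annotations.get(key, []):
            index[value].add(doc.doc_id)
    return index


def retrieve(query: str, index: Dict[str, Set[str]]) -> List[str]:
    """Retrieval Engine: use the index to find the matching document ids."""
    return sorted(index.get(query, set()))


docs = [
    Document("d1", "Formica rufa nests in forests."),
    Document("d2", "Lasius niger is common; Formica rufa is rarer."),
]
analyzed = [run_pipeline(d, [tokenize_plugin, taxon_name_plugin]) for d in docs]
taxon_index = build_index(analyzed, "taxa")
print(retrieve("Formica rufa", taxon_index))  # ['d1', 'd2']
```

Keeping each plugin a plain function from document to document means new analysis steps can be added without touching the executor, which is the point of the plugin boxes in the figure.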

What ought to be changed:

Culture – probably the most difficult to change:

- The way we collaborate (the social aspects)

- The way we exchange and provide access to data

- The way we look at the Internet (Semantic Web)

Access: We face an aggressive publishing industry (and a few colleagues) that, out of commercial interest, disrupts our still free(ish), if somewhat anarchic and fledgling, flow of information over the Internet.

(Fyffe, 2005)

Indirect effect due to huge increases in the cost of serials

Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages). Source: Agosti 2005 and antbase.org

Directly through enforcement of copyright

What can and should we do to enhance access to our data and knowledge?

A case needs to be made that non-systematists, too, would not want to miss our information:

Make your information accessible

at small scale: All ant literature is online (4,000 publications)

at larger scale: Biodiversity Heritage Literature Project (All systematics literature published in the English language), hopefully followed by a European initiative

Provide Name Servers

Registration of new names as a prerequisite for making them valid, in exchange for an up-to-date list of all (animal) names (i.e. ZooBank at ICZN), but mainly through a federation of taxon-specific name servers (e.g. Hymenoptera Name Server / antbase) linked together through global tools, such as GBIF, ITIS, Species 2000 or uBio.
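
A federation of taxon-specific name servers could look roughly like the following Python sketch. It only illustrates the idea of registering a name once and resolving it through a shared entry point; the classes `NameServer` and `Federation` are hypothetical, the servers are in-memory tables rather than network services, and the registered name is an invented placeholder, not a real taxon.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class NameRecord:
    name: str
    author: str
    year: int
    source: str  # label of the name server that supplied the record


class NameServer:
    """A taxon-specific name server, reduced to an in-memory lookup table."""

    def __init__(self, label: str) -> None:
        self.label = label
        self._names: Dict[str, NameRecord] = {}

    def register(self, name: str, author: str, year: int) -> NameRecord:
        # Registration as a prerequisite: a name can be registered only once.
        if name in self._names:
            raise ValueError(f"{name!r} is already registered at {self.label}")
        record = NameRecord(name, author, year, self.label)
        self._names[name] = record
        return record

    def lookup(self, name: str) -> Optional[NameRecord]:
        return self._names.get(name)


class Federation:
    """A global entry point that asks every member server in turn."""

    def __init__(self, servers: Iterable[NameServer]) -> None:
        self.servers = list(servers)

    def lookup(self, name: str) -> List[NameRecord]:
        hits = []
        for server in self.servers:
            record = server.lookup(name)
            if record is not None:
                hits.append(record)
        return hits


ants = NameServer("ant name server")
wasps = NameServer("wasp name server")
ants.register("Examplegenus alpha", "Example Author", 2005)  # placeholder data

federation = Federation([ants, wasps])
for hit in federation.lookup("Examplegenus alpha"):
    print(hit.name, hit.author, hit.year, "from", hit.source)
```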

Standard access

Apply and develop, if necessary, data standards and exchange protocols (e.g. Darwin Core or ABCD, or DiGIR as used at GBIF)
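
As an illustration of what a shared data standard buys, here is a minimal Python sketch that writes one specimen record using Darwin Core term names. Assumptions: the term names and namespace are those of the current Darwin Core standard, which may differ from the version in use when this talk was given; the wrapper element `record` and the function `specimen_to_xml` are hypothetical; all values are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace of the Darwin Core terms; "dwc" is the conventional prefix.
DWC_NS = "http://rs.tdwg.org/dwc/terms/"
ET.register_namespace("dwc", DWC_NS)


def specimen_to_xml(record: dict) -> str:
    """Serialize a flat {Darwin Core term: value} mapping as a small XML fragment."""
    root = ET.Element("record")  # hypothetical wrapper, not part of Darwin Core
    for term, value in record.items():
        child = ET.SubElement(root, f"{{{DWC_NS}}}{term}")
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Placeholder specimen; none of these values refer to a real object.
specimen = {
    "institutionCode": "XYZ",
    "catalogNumber": "0001",
    "basisOfRecord": "PreservedSpecimen",
    "scientificName": "Examplegenus alpha",
    "country": "Switzerland",
    "decimalLatitude": 47.0,
    "decimalLongitude": 8.0,
}

xml_text = specimen_to_xml(specimen)
print(xml_text)
```

Because every provider uses the same term names, a consumer can read `scientificName` or `decimalLatitude` from any collection without knowing its internal database layout.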

Open flow of information

- Support Open Access, and publish in journals allowing open access

- Adopt the principles of the Conservation Commons, that is, make data and information accessible for science, education and conservation use

- Urge your publishers and societies to warrant open access

BUT: Can descriptions and monographs be copyrighted anyway?

Descriptions are “factual knowledge”, that is, knowledge based on direct observation

Increasingly, descriptions are machine output from data matrices (e.g. DELTA, Lucid)

As an alternative, why not change the function of a publication from a terminal product to a version control instrument?
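
One way to picture a publication as a version control instrument is the following Python sketch, in which each published version of a treatment records a digest of its predecessor, so the history can be traced back. This is only an illustration under stated assumptions: the classes `Version` and `Treatment` are hypothetical, the taxon name and texts are placeholders, and nothing here describes an existing system.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    number: int
    text: str
    note: str
    parent_digest: Optional[str]  # digest of the previous version, None for the first

    @property
    def digest(self) -> str:
        """Fingerprint of this version, chained to its parent."""
        payload = f"{self.parent_digest}|{self.number}|{self.text}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class Treatment:
    taxon: str
    versions: List[Version] = field(default_factory=list)

    def publish(self, text: str, note: str) -> Version:
        """Publishing adds a new version instead of producing a terminal product."""
        parent = self.versions[-1].digest if self.versions else None
        version = Version(len(self.versions) + 1, text, note, parent)
        self.versions.append(version)
        return version

    def current(self) -> Version:
        if not self.versions:
            raise LookupError(f"no version of {self.taxon!r} published yet")
        return self.versions[-1]


treatment = Treatment("Examplegenus alpha")  # placeholder taxon
v1 = treatment.publish("Original description.", "first publication")
v2 = treatment.publish("Original description, with new specimen data.", "revision")

print(treatment.current().number)      # 2
print(v2.parent_digest == v1.digest)   # True
```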

From the traditional flux of information …

…in a more or less closed system ….

[Figure: workflow diagram; recoverable labels listed below]

- Manuscript: analysis & ms preparation, ms submission ("Taxon-x-version"), new ms alert, posting for review, edited ms, revised ms, accepted ms, new taxon alert, feedback

- Publication: PDF, hard copy, publication database ("taxon-x-version")

- Supporting resources: ontology, bibliography, ZooBank / NS

- Taxon DB: Character DB, Specimen DB, Description DB, Distribution DB, Char. Matrix DB, Phyl. Tree DB, Char-state Im., Specimen Im., Habitat Image, Leg. Publicat., New Data

….. to the Future of Publication
