SWETS – Be Open! Donat Agosti Plazi, Bern 23.6.2014, Universität Bern Open Access and the Future of (Biodiversity-) Research

20140623 swets agosti_final


DESCRIPTION

Open Access and the Future of (Biodiversity-) Research. "SWETS Be Open" event, Bern, Switzerland, June 23, 2014; Donat Agosti, Plazi


Page 1: 20140623 swets agosti_final

SWETS – Be Open!

Donat Agosti

Plazi, Bern

23.6.2014, Universität Bern

Open Access and the Future of (Biodiversity-) Research

Page 2: 20140623 swets agosti_final

Future of Biodiversity Research

Data Mining

Page 3: 20140623 swets agosti_final

Background

Rio Earth Summit 1992

Page 4: 20140623 swets agosti_final

Background

Biodiversity Crisis

Page 5: 20140623 swets agosti_final

Background

Indicators

Page 6: 20140623 swets agosti_final

Indicators as powerful widely understood tool

Reed Elsevier, Annual Reports and Financial Statements 2013

http://www.reedelsevier.com/investorcentre/reports%202007/Documents/2013/reed_elsevier_ar_2013.pdf

39% profit

Page 7: 20140623 swets agosti_final

Biodiversity research and conservation planning

Multi-Taxon Specimen Data for Setting Conservation Priorities

Source: Kremen C, et al. 2008. Science 320: 222-226.

Consensus conservation priority areas and actual and proposed protected areas

2003: Madagascar announces it will triple protected land to 10% coverage

Page 8: 20140623 swets agosti_final

Politics

IPBES: Intergovernmental Platform on Biodiversity & Ecosystem Services

Page 9: 20140623 swets agosti_final

EU-Political and Science Decision to support IPBES

EU-BON: European Biodiversity Observation Network

EU-FP7 funded

Page 10: 20140623 swets agosti_final

An EU decision in support of environmental policy making

EU-BON

• To build a European Biodiversity Observation Network

• Measure and predict change over space and time

• Combine remote sensing data and on-the-ground observation data in predictive modeling

• Tools to inform decision makers (EU-politicians)

Page 11: 20140623 swets agosti_final

The basic science question

Hardisty, Nature 502, 171 (2013)

BUT: predictive ecology has substantial data needs

Harfoot, BIH2013, Rome, 2013

What is the future of the biological world?

Imagine if we could:

…Predict community-level dynamics of ecosystems at scales from local to global, based on the ecology and biology of all individual organisms

Page 12: 20140623 swets agosti_final

Modeling life on earth

Can we do it?

A realistic goal?

Page 13: 20140623 swets agosti_final

Communication

EU-BON a child of GEOSS

Global Earth Observation System of Systems

Open Access to remote sensing data from all over the world

Page 14: 20140623 swets agosti_final

The impact of remote sensing data on understanding biodiversity

With sophisticated technologies we can identify different trees in the Amazon…

http://video.ted.com/talk/podcast/2013G/None/GregAsner_2013G-480p.mp4

Page 15: 20140623 swets agosti_final

Access to data

…we could create a link to the related data in our biodiversity literature

http://video.ted.com/talk/podcast/2013G/None/GregAsner_2013G-480p.mp4

Page 16: 20140623 swets agosti_final

Names as information tags in life sciences

Names

Characteristics

Publications

Genes

Collections

Specimens

Distribution

Page 17: 20140623 swets agosti_final

Treatments as bits of information

Treatment: sections of publications documenting the features or distribution of a related group of organisms (called a “taxon”, plural “taxa”) in ways adhering to highly formalized conventions. (Catapano, 2010)

Formica obsoleta, Linnaeus 1758: 580

Page 18: 20140623 swets agosti_final

Treatments as part of publications

DNA

Specimens Observations

Institution

Pharmacology/epidemiology

Publication

Treatment

Treatment

Treatment

Table

Appendix

Biology/ecology

Reference to other biota

Publication

Treatment

Publication

Page 19: 20140623 swets agosti_final

Text (e.g. PDF)

<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name> Bohn &amp; Verhaagh
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving to a sharp apical tooth, the apex parallel to the anterior clypeal margin. (Holotype with material in mandibles, so mandibles and anterior clypeus described below from paratypes.) Median clypeus
    ....
    </tax:p>
  </tax:div>
</tax:treatment>

Enhanced and linked text (XML: TaxonX / TaxPub JATS)

Plazi: Semantic enhanced treatments
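
Once a treatment carries this kind of markup, a program can read the name parts directly instead of guessing them from running text. The following is a minimal sketch, assuming a simplified copy of the excerpt above: the namespace URIs are placeholders (the slide does not declare them), the author string and the description are left out, and the function name summarize_treatment is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: the slide excerpt does not declare any,
# so these are assumptions made only so that the sample parses.
NS = {
    "tax": "http://example.org/taxonx",
    "dc": "http://example.org/dc",
}

# Simplified copy of the nomenclature part of the excerpt.
SAMPLE = """\
<tax:treatment xmlns:tax="http://example.org/taxonx"
               xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""


def summarize_treatment(xml_text):
    """Return the tagged name parts, status and external identifier."""
    root = ET.fromstring(xml_text)
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        "xid_source": xid.get("source"),
        "xid": xid.get("identifier"),
    }


if __name__ == "__main__":
    print(summarize_treatment(SAMPLE))
```

The same lookups would work on a full treatment file, provided the real namespace URIs are substituted for the placeholders.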

Page 20: 20140623 swets agosti_final

Automatic extraction and visualization of treatment content

Countries: Madagascar

Anochetus grandidieri Forel

Page 21: 20140623 swets agosti_final

Datamining of treatments

Pseudomyrmex ants and Vachellia ant-acacias are a classic example of mutualism in biology.

[Figure: link network. Node labels (species epithets): allenii, melanoceras, ruddiae, chiapensis, collinsii, cookii, cornigera, globulifera, hindsii, janzenii, mayana, sphaerocephala, boopis, flavicornis, hesperius, ita, janzeni, kuenckeli, mixtecus, nigrocinctus, nigropilosus, opaciceps, particeps, peperi, reconditus, satanicus, simulans, spinicola, subtilissimus, veneficus, ferrugineus, gentlei, gracilis]

Transbiotic link network

Associated species linked through references in taxonomic treatments

Acacia-ant species: Pseudomyrmex gracilis

Treatment: original description

Treatment: redescription

Associated ant-acacia: Acacia gentlei

Ants Plants

Photo credits: Alex Wild

Treatment

Treatments linked through citations
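
The idea of linking associated species through treatments can be sketched in a few lines. This is an illustration only, not Plazi's implementation: the record layout, the field names and the function build_link_network are hypothetical, and which of the two treatments mentions the acacia is an assumption. Only the pairing of Pseudomyrmex gracilis with Acacia gentlei comes from the slide.

```python
from collections import defaultdict

# Hand-made, hypothetical records. Each treatment has a subject taxon and a
# list of other taxa it refers to.
TREATMENTS = [
    {"taxon": "Pseudomyrmex gracilis", "kind": "original description",
     "mentions": []},
    {"taxon": "Pseudomyrmex gracilis", "kind": "redescription",
     "mentions": ["Acacia gentlei"]},
]


def build_link_network(treatments):
    """Build an undirected taxon-to-taxon network from treatment mentions."""
    links = defaultdict(set)
    for treatment in treatments:
        for other in treatment["mentions"]:
            links[treatment["taxon"]].add(other)
            links[other].add(treatment["taxon"])
    # Sorted lists make the result stable and easy to print.
    return {taxon: sorted(partners) for taxon, partners in links.items()}


if __name__ == "__main__":
    for taxon, partners in build_link_network(TREATMENTS).items():
        print(taxon, "<->", ", ".join(partners))
```

With marked-up treatments, the "mentions" lists would come from the tagged taxon names inside each treatment rather than from hand-made records.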

Page 22: 20140623 swets agosti_final

The Plazi approach

From treatment to treatment repository

The Plazi approach

Agosti, D., Egloff, W. 2009. Taxonomic information exchange and copyright: the Plazi approach. BMC Research Notes 2:53. doi:10.1186/1756-0500-2-53

Page 23: 20140623 swets agosti_final

The Plazi approach

Plazi workflow

Plazi SRS

find → scan → «OCR» → markup → store

Page 24: 20140623 swets agosti_final

Analyzing a large corpus of publications: Plazi repository

14,590 specimens

8900 plottable specimens from

1138 unique locations

Page 25: 20140623 swets agosti_final

Analyzing a journal: Journal of Hymenoptera Research

5170 specimens

4062 plottable specimens from

1138 unique locations

Page 26: 20140623 swets agosti_final

The biodiversity community

Plants

3,400 Herbaria worldwide

10,000 Associate curators and specialists

350,000,000 specimens in collections

180,000,000 specimens digitized

2,000,000,000 specimens including animals

Page 27: 20140623 swets agosti_final

The biodiversity community

200,000,000+ printed pages

1,900,000 species described

20,000,000+ species treatments

17,000 new species per year

Page 28: 20140623 swets agosti_final

The taxonomy publishing world

12,000 taxonomic papers on 42,000 spiders since 1757

Publications widely scattered

Source: Jeremy Miller

Page 29: 20140623 swets agosti_final

Why is the system broken?

WHY does it NOT work?

Page 30: 20140623 swets agosti_final

Access to data limited

…we cannot create a link to the related biodiversity data

http://video.ted.com/talk/podcast/2013G/None/GregAsner_2013G-480p.mp4

Page 31: 20140623 swets agosti_final

Communication

200,000,000+ printed pages

1,900,000 species described

20,000,000+ species treatments

17,000 new species per year

BUT: The data are hidden

Digitization is incomplete

Publications are not semantically enhanced

Collections are incomplete

Data are not linked

Most data are not open

Page 32: 20140623 swets agosti_final

Why is the system broken?

Access to a corpus

NOT

single PDF, data point

Page 33: 20140623 swets agosti_final

Why is the system broken?

Access to content

NOT

representations

Page 34: 20140623 swets agosti_final

Why is the system broken?

Legal issues

Technical issues

Social issues

Page 35: 20140623 swets agosti_final

Legal issues: Copyright

Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)

Page 36: 20140623 swets agosti_final

Legal issues: licences

Legal licences for 1000+ journals cannot be tracked by scientists

Page 37: 20140623 swets agosti_final

Technical issues: Digital Object Identifiers

DOI missing

CrossRef an exclusive club

Page 38: 20140623 swets agosti_final

Technical issues: Journal publishing workflow

Journal publishing workflows:

From structured data to unstructured text

Page 39: 20140623 swets agosti_final

Technical issues: Content extraction

Conversion of legacy literature prohibitively expensive

Markup costs, including materials citations

[Figure: pages (y-axis, 0 to 40) plotted against minutes (x-axis, 0 to 700)]

Source: Spider Pilot, Jeremy Miller

Plazi SRS

find → scan → «OCR» → markup → store

Average: 6 min/page complete

OCR: 0.80 EUR/page (vendor)
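
To see why these per-page figures add up to "prohibitively expensive", they can be scaled to corpus sizes mentioned elsewhere in this talk. This is a rough sketch that assumes both costs scale linearly with page count; the function name conversion_estimate is hypothetical, and the page counts are the 85,000 pages served through antbase.org and the 200,000,000+ printed pages (taken here as exactly 200,000,000).

```python
MINUTES_PER_PAGE = 6      # slide: average time per page for complete markup
OCR_EUR_PER_PAGE = 0.80   # slide: vendor OCR price per page


def conversion_estimate(pages):
    """Return (markup hours, OCR cost in EUR), assuming linear scaling."""
    markup_hours = pages * MINUTES_PER_PAGE / 60
    ocr_eur = pages * OCR_EUR_PER_PAGE
    return markup_hours, ocr_eur


if __name__ == "__main__":
    corpora = [
        ("antbase.org corpus", 85_000),
        ("all printed pages", 200_000_000),
    ]
    for label, pages in corpora:
        hours, eur = conversion_estimate(pages)
        print(f"{label}: {hours:,.0f} markup hours, {eur:,.0f} EUR for OCR")
```

Under these assumptions, 200,000,000 pages correspond to 20 million hours of markup and 160 million EUR for OCR alone.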

Page 40: 20140623 swets agosti_final

Social issues: data sharing

The misunderstood attribution

Page 41: 20140623 swets agosti_final

Why is the system broken?

WHY NOT

make it work?

Page 42: 20140623 swets agosti_final

European Open Biodiversity Knowledge Management System

European Open Biodiversity Knowledge Management System

European Union FP7 funded project

Page 43: 20140623 swets agosti_final

European Open Biodiversity Knowledge Management System

Prepare the ground for the creation of a system for intelligent management of biodiversity knowledge which will improve the present system of taxonomic literature.

Page 44: 20140623 swets agosti_final

Legal issues: Copyright: The Blue List

The Blue List

elements of taxonomic information that are not subject to copyright

Patterson, D. J., Egloff, W., Agosti, D., Eades, D., Franz, N., Hagedorn, G., Rees, J. A. and Remsen, D. P. 2014. Scientific names of organisms: attribution, rights, and licensing BMC Research Notes 7:79 doi:10.1186/1756-0500-7-79.

Page 45: 20140623 swets agosti_final

Legal issues: Copyright: Legal exceptions for research

Legal exceptions for research

Egloff W, Patterson D, Agosti D, Hagedorn G 2014. Open exchange of scientific knowledge and European copyright: The case of biodiversity information. ZooKeys 414, 109-135. DOI: 10.3897/zookeys.414.7717

Page 46: 20140623 swets agosti_final

Legal issues: Copyright: Open Access

Open Access

Page 47: 20140623 swets agosti_final

Legal issues: Copyright: Creative Commons Licence

Page 48: 20140623 swets agosti_final

Technical issues: DOI

Persistent identifiers for data objects and physical objects

Linking data using agreed vocabularies

http://wiki.pro-ibiosphere.eu/wiki/Best_practices_for_stable_URIs
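
A persistent identifier is only useful for linking if it is written in one predictable, resolvable form. As a small sketch, assuming the common doi.org resolver URL form and a deliberately loose syntax check, the DOIs cited in this talk can be normalized like this (the function name doi_to_url is hypothetical):

```python
import re

# Loose syntax check: "10.", a numeric registrant prefix, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Leading decorations that citations often carry in front of the DOI itself.
PREFIX_PATTERN = re.compile(
    r"^(?:doi:\s*|https?://(?:dx\.)?doi\.org/)", re.IGNORECASE
)


def doi_to_url(raw):
    """Turn a DOI written in a citation into a resolvable doi.org URL."""
    doi = PREFIX_PATTERN.sub("", raw.strip())
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {raw!r}")
    return "https://doi.org/" + doi


if __name__ == "__main__":
    for cited in ("doi:10.1186/1756-0500-2-53", "DOI: 10.3897/zookeys.414.7717"):
        print(doi_to_url(cited))
```

The check accepts or rejects strings by shape only; it does not confirm that a DOI is actually registered.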

Page 49: 20140623 swets agosti_final

Technical issues: DOI

Biodiversity Literature Repository @ Zenodo

Public repository for legacy literature using DataCite DOIs

CrossRef to cite (Zenodo) Data Cite DOI?!

Page 50: 20140623 swets agosti_final

Technical issues: semantic enhanced publishing

Semantic enhanced publishing

TaxPub JATS

Use DOI as widely as possible

Page 51: 20140623 swets agosti_final

Technical issues: machine access

(well documented) API

Page 52: 20140623 swets agosti_final

Technical issues: semantic publishing

Advanced publishing and dissemination

Form based

Semantically enhanced, TaxPub JATS based publishing

Page 53: 20140623 swets agosti_final

Social issues: Bouchout Declaration

http://bouchoutdeclaration.org/ launched June 12, 2014

10 Principles

Free and open use of digital resources

Use of persistent identifiers and linking of data

Policy developments

Developing sustainable business models

Page 54: 20140623 swets agosti_final

Social issues: Bouchout Declaration

Page 55: 20140623 swets agosti_final

Technical issues: business plan

Page 56: 20140623 swets agosti_final

Conclusions

If we want to conserve the world’s biodiversity, we need one-stop open shopping for biodiversity research results.

Page 57: 20140623 swets agosti_final

Conclusions

We scientists are getting our act together.

Page 58: 20140623 swets agosti_final

Conclusions

Will the publishers too?

Page 59: 20140623 swets agosti_final

Thank you!

Donat Agosti

[email protected]

plazi.org