Open Access and the Future of (Biodiversity-) Research. "SWETS Be Open" event, Bern, Switzerland, June 23, 2014; Donat Agosti, Plazi
SWETS – Be Open!
Donat Agosti
Plazi, Bern
23.6.2014, Universität Bern
Open Access and the Future of (Biodiversity-) Research
Future of Biodiversity Research
Data Mining
Background
Rio Earth Summit 1992
Background
Biodiversity Crisis
Background
Indicators
Indicators as powerful widely understood tool
Reed Elsevier, Annual Reports and Financial Statements 2013
http://www.reedelsevier.com/investorcentre/reports%202007/Documents/2013/reed_elsevier_ar_2013.pdf
39% profit
Biodiversity research and conservation planning
Multi-Taxon Specimen Data for Setting Conservation Priorities
Source: Kremen C, et al. 2008. Science 320: 222-226.
Consensus conservation priority areas and actual and proposed protected areas
2003: Madagascar announces it will triple protected land to 10% coverage
Politics
IPBES: Intergovernmental Platform on Biodiversity & Ecosystem Services
EU-Political and Science Decision to support IPBES
EU-BON: European Biodiversity Observation Network
EU-FP7 funded
An EU decision in support of environmental policy making
EU-BON
• To build a European Biodiversity Observation Network
• Measure and predict change over space and time
• Combine remote sensing data and on-the-ground observation data in predictive modeling
• Tools to inform decision makers (EU-politicians)
The basic science question
Hardisty, Nature 502, 171 (2013)
BUT: predictive ecology has substantial data needs
Harfoot, BIH2013, Rome, 2013
What is the future of the biological world?
Imagine if we could:
…Predict community-level dynamics of ecosystems at scales from local to global, based on the ecology and biology of all individual organisms
Modeling life on earth
Can we do it?
A realistic goal?
Communication
EU-BON a child of GEOSS
Global Earth Observation System of Systems
Open Access to remote sensing data from all over the world
The impact of remote sensing data on understanding biodiversity
With sophisticated technologies we can identify different trees in the Amazon…
http://video.ted.com/talk/podcast/2013G/None/GregAsner_2013G-480p.mp4
Access to data
…we could create a link to the related data in our biodiversity literature
http://video.ted.com/talk/podcast/2013G/None/GregAsner_2013G-480p.mp4
Names as information tags in life sciences
Names
Characteristics
Publications
Genes
Collections
Specimens
Distribution
Treatments as bits of information
Treatment: sections of publications documenting the features or distribution of a related group of organisms (called a “taxon”, plural “taxa”) in ways adhering to highly formalized conventions. (Catapano, 2010)
Formica obsoleta, Linnaeus 1758: 580
Treatments as part of publications
DNA
Specimens Observations
Institution
Pharmacology/epidemiology
Publication
Treatment
Treatment
Treatment
Table
Appendix
Biology/ecology
Reference to other biota
Publication
Treatment
Publication
Text (e.g. PDF)
<tax:treatment>
<tax:nomenclature>
<tax:name>
<tax:xid source="HNS" identifier="193329"/>
<tax:xmldata>
<dc:Genus>Mystrium</dc:Genus>
<dc:Species>leonie</dc:Species>
</tax:xmldata>
Mystrium leonie
</tax:name> Bohn &amp; Verhaagh
<tax:status>n. sp.</tax:status>
Fig 1 D - F
</tax:nomenclature>
<tax:div type="description">
<tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL
1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving
to a sharp apical tooth, the apex parallel to the anterior clypeal margin.
(Holotype with material in mandibles, so mandibles and anterior clypeus
described below from paratypes.) Median clypeus
....
</tax:p>
</tax:div>
</tax:treatment>
Enhanced and linked text (XML: TaxonX / TaxPub JATS)
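Once a treatment is marked up like this, its parts can be read by a program rather than by eye. The following is a minimal sketch, not Plazi's own tooling: it parses a shortened copy of the fragment with Python's standard library and pulls out the name parts and the external identifier. The two namespace URIs are placeholders assumed only so that the fragment parses; the slide does not show the real ones.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the treatment fragment from the slide.
# The namespace URIs are hypothetical placeholders, not the real TaxonX ones.
TREATMENT_XML = """
<tax:treatment xmlns:tax="http://example.org/taxonx" xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""

# Prefix-to-URI map used by ElementTree when resolving "tax:" and "dc:" in paths.
NS = {"tax": "http://example.org/taxonx", "dc": "http://example.org/dc"}


def read_name(xml_text):
    """Return the tagged name parts of one treatment as a plain dict."""
    root = ET.fromstring(xml_text)
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        "source": xid.get("source"),
        "identifier": xid.get("identifier"),
    }


record = read_name(TREATMENT_XML)
print(record)
```

The same lookup against a plain PDF would need text recognition and guesswork; here each value sits in a named element.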
Plazi: Semantically enhanced treatments
Automatic extraction and visualization of treatment content
Countries: Madagascar
Anochetus grandidieri Forel
Datamining of treatments
Pseudomyrmex ants and Vachellia ant-acacias are a classic example of mutualism in biology.
allenii
melanoceras
ruddiae
chiapensis
collinsii
cookii
cornigera
globulifera
hindsii
janzenii
mayana
sphaerocephala
boopis
flavicornis
hesperius
ita
janzeni
kuenckeli
mixtecus
nigrocinctus
nigropilosus
opaciceps
particeps
peperi
reconditus
satanicus
simulans
spinicola
subtilissimus
veneficus
ferrugineus
gentlei
gracilis
Transbiotic link network
Associated species linked through references in taxonomic treatments
Acacia-ant species: Pseudomyrmex gracilis
Treatment: original description
Treatment: redescription
Associated ant-acacia: Acacia gentlei
Ants Plants
Photo credits: Alex Wild
Treatment
Treatments linked through citations
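The link network on the preceding slides can be sketched as a small graph: each treatment names its own taxon and may mention other taxa, and every mention becomes a two-way link. This is an illustration only. The record layout, the treatment ids and the choice of which treatment mentions which species are assumptions; only the species names come from the slides.

```python
from collections import defaultdict

# Hypothetical treatment records. The ids and the "mentions" lists are
# illustrative; they are not taken from the Plazi repository.
treatments = [
    {"id": "t1", "taxon": "Pseudomyrmex gracilis",
     "kind": "original description", "mentions": []},
    {"id": "t2", "taxon": "Pseudomyrmex gracilis",
     "kind": "redescription", "mentions": ["Acacia gentlei"]},
    {"id": "t3", "taxon": "Acacia gentlei",
     "kind": "original description", "mentions": []},
]


def build_links(records):
    """Link each treated taxon to every taxon its treatments mention."""
    links = defaultdict(set)
    for rec in records:
        for other in rec["mentions"]:
            # Store the association in both directions (ant <-> plant).
            links[rec["taxon"]].add(other)
            links[other].add(rec["taxon"])
    return links


links = build_links(treatments)
for taxon in sorted(links):
    print(taxon, "->", sorted(links[taxon]))
```

Storing the links in both directions means the association can be found starting from either the ant or the plant.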
The Plazi approach
From treatment to treatment repository
The Plazi approach
Agosti, D., W. Egloff. 2009. Taxonomic information exchange and copyright: the Plazi approach. BMC Research Notes 2009, 2:53. doi:10.1186/1756-0500-2-53
The Plazi approach
Plazi workflow
Plazi SRS
find scan «OCR» markup store
Analyzing a large corpus of publications: Plazi repository
14,590 specimens
8,900 plottable specimens from
1,138 unique locations
Analyzing a journal: Journal of Hymenoptera Research
5,170 specimens
4,062 plottable specimens from
1,138 unique locations
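Counts like these can be produced from extracted specimen records with very little code. The sketch below assumes a hypothetical record layout in which a specimen is "plottable" when it has both a latitude and a longitude. The sample records are invented for illustration and are not repository data.

```python
# Hypothetical specimen records; taxa and coordinates are made up.
specimens = [
    {"taxon": "taxon A", "lat": -18.9, "lon": 47.5},
    {"taxon": "taxon A", "lat": -18.9, "lon": 47.5},   # same site as above
    {"taxon": "taxon B", "lat": None, "lon": None},    # no coordinates given
    {"taxon": "taxon B", "lat": -1.2, "lon": 116.9},
]


def summarize(records):
    """Count all specimens, the plottable ones, and their distinct sites."""
    plottable = [r for r in records
                 if r["lat"] is not None and r["lon"] is not None]
    locations = {(r["lat"], r["lon"]) for r in plottable}
    return {
        "specimens": len(records),
        "plottable": len(plottable),
        "unique_locations": len(locations),
    }


summary = summarize(specimens)
print(summary)
```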
The biodiversity community
Plants
3,400 Herbaria worldwide
10,000 Associate curators and specialists
350,000,000 specimens in collections
180,000,000 specimens digitized
2,000,000,000 specimens including animals
The biodiversity community
200,000,000+ printed pages
1,900,000 species described
20,000,000+ species treatments
17,000 new species per year
The taxonomy publishing world
12,000 taxonomic papers on 42,000 spiders since 1757
Publications widely scattered
Source: Jeremy Miller
Why is the system broken?
WHY does it NOT work?
Access to data limited
…we cannot create a link to the related biodiversity data
http://video.ted.com/talk/podcast/2013G/None/GregAsner_2013G-480p.mp4
Communication
200,000,000+ printed pages
1,900,000 species described
20,000,000+ species treatments
17,000 new species per year
BUT: The data are hidden
• Incomplete digitization
• Publications are not semantically enhanced
• Collections are incomplete
• Data are not linked
• Most data are not open
Why is the system broken?
Access to a corpus
NOT
single PDF, data point
Why is the system broken?
Access to content
NOT
representations
Why is the system broken?
Legal issues
Technical issues
Social issues
Legal issues: Copyright
Access to ant taxonomic publications through antbase.org /Smithsonian Institution, including currently the entire
body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
Legal issues: licences
Legal licences for 1,000+ journals cannot be tracked by scientists
Technical issues: Digital Object Identifiers
DOI missing
CrossRef an exclusive club
Technical issues: Journal publishing workflow
Journal publishing workflows:
From structured data to unstructured text
Technical issues: Content extraction
Conversion of legacy literature prohibitively expensive
Markup costs, including materials citations
[Chart: pages marked up (vertical axis, 0 to 40) against minutes spent (horizontal axis, 0 to 700)]
Source: Spider Pilot, Jeremy Miller
Plazi SRS
find scan «OCR» markup store
Average: 6 min / page complete
OCR: 0.80 EUR / page (vendor)
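To see why converting legacy literature is called prohibitively expensive, the two per-page figures can be scaled to the "200,000,000+ printed pages" quoted earlier in the deck. This is a back-of-the-envelope sketch. It assumes that every page needs both vendor OCR and the average markup time, which the slides do not claim, and it uses the page count as a lower bound.

```python
PAGES = 200_000_000        # "200,000,000+ printed pages" (lower bound from the slides)
MINUTES_PER_PAGE = 6       # average time for complete markup of one page
OCR_EUR_PER_PAGE = 0.80    # vendor OCR price per page


def legacy_conversion_estimate(pages, minutes_per_page, eur_per_page):
    """Scale the per-page markup time and OCR price to a whole corpus."""
    total_minutes = pages * minutes_per_page
    return {
        "markup_hours": total_minutes / 60,
        "ocr_eur": pages * eur_per_page,
    }


estimate = legacy_conversion_estimate(PAGES, MINUTES_PER_PAGE, OCR_EUR_PER_PAGE)
print(f"{estimate['markup_hours']:,.0f} hours of markup, "
      f"{estimate['ocr_eur']:,.0f} EUR for OCR")
```

Under these assumptions the lower bound is about 20 million hours of markup and 160 million EUR for OCR alone.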
Social issues: data sharing
The misunderstood attribution
Why is the system broken?
WHY NOT
make it work?
European Open Biodiversity Knowledge Management System
European Union FP7 funded project
European Open Biodiversity Knowledge Management System
Prepare the ground for the creation of a system for intelligent management of biodiversity knowledge which will improve the present system of taxonomic literature.
Legal issues: Copyright: The Blue List
The Blue List
elements of taxonomic information that are not subject to copyright
Patterson, D. J., Egloff, W., Agosti, D., Eades, D., Franz, N., Hagedorn, G., Rees, J. A. and Remsen, D. P. 2014. Scientific names of organisms: attribution, rights, and licensing. BMC Research Notes 7:79. doi:10.1186/1756-0500-7-79
Legal issues: Copyright: Legal exceptions for research
Legal exceptions for research
Egloff W, Patterson D, Agosti D, Hagedorn G 2014. Open exchange of scientific knowledge and European copyright: The case of biodiversity information. ZooKeys 414, 109-135. DOI: 10.3897/zookeys.414.7717
Legal issues: Copyright: Open Access
Open Access
Legal issues: Copyright: Creative Commons Licence
Technical issues: DOI
Persistent identifiers for data objects and physical objects
Linking data using agreed vocabularies
http://wiki.pro-ibiosphere.eu/wiki/Best_practices_for_stable_URIs
Technical issues: DOI
Biodiversity Literature Repository @ Zenodo
Public repository for legacy literature using DataCite DOIs
CrossRef to cite (Zenodo) DataCite DOIs?!
Technical issues: semantically enhanced publishing
Semantically enhanced publishing
TaxPub JATS
Use DOI as widely as possible
Technical issues: machine access
(well documented) API
Technical issues: semantic publishing
Advanced publishing and dissemination
Form based
Semantically enhanced, TaxPub JATS-based publishing
Social issues: Bouchout Declaration
http://bouchoutdeclaration.org/ launched June 12, 2014
10 Principles
• Free and open use of digital resources
• Use of persistent identifiers and linking of data
• Policy developments
• Developing sustainable business models
Social issues: Bouchout Declaration
Technical issues: business plan
Conclusions
If we want to conserve the world’s biodiversity, we need one-stop open shopping for biodiversity research results.
Conclusions
We scientists are getting our act together.
Conclusions
Will the publishers too?