Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher

Lyubomir Penev, Terry Catapano, Donat Agosti, Teodor Georgiev, Guido Sautter, Pavel Stoev

JATS-Con, 16-17 Oct 2012

Plazi

Page 2

This presentation will focus on:

Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing

Semantic tagging of and enhancements to published texts

Dissemination of published information to aggregators

Current and future development of TaxPub

Page 3

Quick facts about Plazi

Plazi founded in 2008: Swiss-based NGO with members in Switzerland, Germany, US and Iran

Plazi is a research-based think tank with the mission to promote the idea of open access to scientific content

Plazi has four pillars: legal advice, technical solutions (e.g. TaxPub), maintenance of a treatment repository, advocacy

Plazi GmbH founded in 2012 as a service SME owned by Plazi to provide document conversion services and consultation

Funding from public (e.g. EU) and private donors

Clients are global

Page 4

Context

Conservation: global biodiversity crisis. Increasing loss of species, but no tools to measure and document it

Science: ca. 1.8M species described, ca. 8M expected

Scientific publications:

ca. 17,000 species described per annum; ca. 100,000 redescriptions per annum -> rich content

highly fragmented, with over 2,500 journals and books involved -> difficult access

Solution: Open Access and semantically enhanced publications allow immediate registration of new taxa and dissemination of content -> TaxPub JATS/DTD

Page 5

This presentation will focus on:

Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing

Semantic tagging of and enhancements to published texts

Dissemination of published information to aggregators

Current and future development of TaxPub

Page 6

TaxPub

Lightweight extension of the Blue DTD

Described at JATS-Con 2010: “TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions” (http://www.ncbi.nlm.nih.gov/books/NBK47081/)

Treatments (i.e., species descriptions): <tp:taxon-treatment>, <tp:nomenclature>, <tp:treatment-sec>

Domain-specific content:

<taxon-name>: taxonomic names

<materials-citation>: references to specimens

<descriptive-statement>: descriptions of morphological features

Page 7

<tp:taxon-treatment>
  <tp:nomenclature>
    <tp:taxon-name>
      <tp:taxon-name-part taxon-name-part-type="genus">Platyscelio</tp:taxon-name-part>
      <tp:taxon-name-part taxon-name-part-type="species">mzantsi</tp:taxon-name-part>
      <object-id>urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B</object-id>
      <object-id>urn:lsid:biosci.ohio-state.edu:osuc_concepts:242617</object-id>
    </tp:taxon-name>
    <tp:taxon-authority>Taekul &amp; Johnson</tp:taxon-authority>
    <tp:taxon-status>sp. n.</tp:taxon-status>
  </tp:nomenclature>
  <tp:treatment-sec sec-type="materials_examined">...
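
A minimal sketch of how the nomenclature markup above could be read programmatically with Python's standard library. The fragment is closed and given a namespace declaration so that it parses. The namespace URI used for the `tp` prefix and the helper name `read_nomenclature` are assumptions for illustration and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the tp: prefix (not stated on the slides).
TP = "http://www.plazi.org/taxpub"
NS = {"tp": TP}

# The slide's fragment, closed so that it is well-formed.
SAMPLE = f"""
<tp:taxon-treatment xmlns:tp="{TP}">
  <tp:nomenclature>
    <tp:taxon-name>
      <tp:taxon-name-part taxon-name-part-type="genus">Platyscelio</tp:taxon-name-part>
      <tp:taxon-name-part taxon-name-part-type="species">mzantsi</tp:taxon-name-part>
      <object-id>urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B</object-id>
      <object-id>urn:lsid:biosci.ohio-state.edu:osuc_concepts:242617</object-id>
    </tp:taxon-name>
    <tp:taxon-authority>Taekul &amp; Johnson</tp:taxon-authority>
    <tp:taxon-status>sp. n.</tp:taxon-status>
  </tp:nomenclature>
</tp:taxon-treatment>
"""


def read_nomenclature(xml_text):
    """Return the name, identifiers, authority and status of one treatment."""
    root = ET.fromstring(xml_text)
    nomenclature = root.find("tp:nomenclature", NS)
    name = nomenclature.find("tp:taxon-name", NS)
    # Map each name part to its rank, e.g. {"genus": "Platyscelio", ...}.
    parts = {
        part.get("taxon-name-part-type"): part.text
        for part in name.findall("tp:taxon-name-part", NS)
    }
    return {
        "name": " ".join(parts[r] for r in ("genus", "species") if r in parts),
        # <object-id> is a plain JATS element, so it carries no tp: prefix.
        "lsids": [oid.text for oid in name.findall("object-id")],
        "authority": nomenclature.findtext("tp:taxon-authority", namespaces=NS),
        "status": nomenclature.findtext("tp:taxon-status", namespaces=NS),
    }


record = read_nomenclature(SAMPLE)
print(record["name"], record["authority"], record["status"])
```

Because each name part carries its rank as an attribute, the binomial can be assembled without guessing from word order, which is the kind of reuse the tagging is meant to enable.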

Page 8

<tp:treatment-sec sec-type="materials_examined">
  <p>
    <tp:material-citation>
      <tp:type-status>Holotype</tp:type-status> worker.
      <tp:taxon-type-location>King Saud Museum of Arthropods (KSMA), College of Food and Agriculture Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia.</tp:taxon-type-location>
      <tp:collecting-event>
        <tp:collecting-location>SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, </tp:collecting-location>
        <named-content content-type="dwc:verbatimCoordinates">20°12'N, 41°13'E</named-content>, 1881 m.a.s.l. 19.V.2010 (M. R. Sharaf &amp; A. S. Aldawood Leg.);
      </tp:collecting-event>
    </tp:material-citation>
  </p>
</tp:treatment-sec>
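
As a rough illustration of what the material citation markup makes possible, the sketch below pulls the verbatim coordinates out of an abbreviated copy of the citation and converts them to decimal degrees. The namespace URI, the helper names and the choice of Darwin Core style keys for the result are assumptions made for this example. The regular expression only handles the degrees-and-minutes form shown on the slide.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace URI for the tp: prefix (not stated on the slides).
TP = "http://www.plazi.org/taxpub"
NS = {"tp": TP}

# Abbreviated copy of the slide's material citation.
SAMPLE = f"""
<tp:material-citation xmlns:tp="{TP}">
  <tp:type-status>Holotype</tp:type-status> worker.
  <tp:collecting-event>
    <tp:collecting-location>SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, </tp:collecting-location>
    <named-content content-type="dwc:verbatimCoordinates">20°12'N, 41°13'E</named-content>, 1881 m.a.s.l. 19.V.2010
  </tp:collecting-event>
</tp:material-citation>
"""

# Matches one degrees-and-minutes value such as 20°12'N.
COORD = re.compile(r"(\d+)°(\d+)'([NSEW])")


def to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and minutes to signed decimal degrees."""
    value = int(degrees) + int(minutes) / 60
    return -value if hemisphere in "SW" else value


def read_occurrence(xml_text):
    """Extract type status and coordinates from one material citation."""
    root = ET.fromstring(xml_text)
    verbatim = None
    for element in root.iter("named-content"):
        if element.get("content-type") == "dwc:verbatimCoordinates":
            verbatim = element.text
    # The slide's value holds latitude first, then longitude.
    lat, lon = (to_decimal(*m.groups()) for m in COORD.finditer(verbatim))
    return {
        "typeStatus": root.findtext("tp:type-status", namespaces=NS),
        "verbatimCoordinates": verbatim,
        "decimalLatitude": round(lat, 4),
        "decimalLongitude": round(lon, 4),
    }


occurrence = read_occurrence(SAMPLE)
print(occurrence)
```

The verbatim string is kept alongside the derived values so that the published wording stays traceable.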

Page 9

TaxPub: Recent and Future Developments

Largely stable

<x>

Greenfication

Interest from journals:

European Journal of Taxonomy

Zootaxa (via EOL)

Markup of morphological descriptions

Page 10

<p>Spreading shrub; stems erect,
  <Categorical uri="http://ontology.org/plant/stem-color">
    <State uri="http://ontology.org/plant/greenish">greenish</State>
  </Categorical>.
  Leaves deciduous early in summer (particularly when infected with Diseasomyces), oblong, apex obtuse, glabrous or weakly hirsute; stipules sharply pointed,
  <Quantitative uri="http://ontology.org/plant/stipule-width">
    <value value="3.2">3,2mm</value>
  </Quantitative> wide,
  <Categorical uri="http://ontology.org/plant/stipule-color">
    <State uri="http://ontology.org/plant/black">black</State> or
    <State uri="http://ontology.org/plant/brown">darkish brown,</State>
  </Categorical>
  extremely rarely yellow, often shallowly joined around the node; spines stout.
</p>
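
The following sketch shows one way the character markup in this paragraph could be turned into a character-to-state table. It works on an abbreviated copy of the paragraph. The function name `read_characters` and the dictionary layout are assumptions for illustration. The element and attribute names are the ones shown on the slide.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the slide's paragraph; the markup itself is unchanged.
SAMPLE = (
    '<p>Spreading shrub; stems erect,'
    '<Categorical uri="http://ontology.org/plant/stem-color"> '
    '<State uri="http://ontology.org/plant/greenish">greenish</State>'
    '</Categorical>. Stipules sharply pointed, '
    '<Quantitative uri="http://ontology.org/plant/stipule-width">'
    '<value value="3.2">3,2mm</value></Quantitative> wide, '
    '<Categorical uri="http://ontology.org/plant/stipule-color">'
    '<State uri="http://ontology.org/plant/black">black</State> or '
    '<State uri="http://ontology.org/plant/brown">darkish brown,</State>'
    '</Categorical>extremely rarely yellow; spines stout.</p>'
)


def read_characters(xml_text):
    """Map each character URI to its state URIs or its numeric value."""
    root = ET.fromstring(xml_text)
    characters = {}
    for element in root.iter():
        if element.tag == "Categorical":
            # A categorical character may list several alternative states.
            characters[element.get("uri")] = [
                state.get("uri") for state in element.findall("State")
            ]
        elif element.tag == "Quantitative":
            # The value attribute holds the normalised number, while the
            # element text keeps the author's wording ("3,2mm").
            value = element.find("value")
            characters[element.get("uri")] = float(value.get("value"))
    return characters


characters = read_characters(SAMPLE)
for uri, value in characters.items():
    print(uri, value)
```

Keeping the normalised number in an attribute lets the printed text stay as the author wrote it while software reads an unambiguous value.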

Page 11

TaxPub: Challenges

Maintenance:

Sourceforge

Volunteer effort, little time, no funding…

Supported by Plazi

Documentation:

Comments with ad hoc markup in extension files

Converted to HTML by NCBI tool

Maintained at Species-ID wiki

Page 14

Quick facts about Pensoft & ZooKeys

Pensoft founded in 1992: more than 700 books published; two offices in Sofia and Moscow; 16 employees

ZooKeys launched in July 2008 as the first mandatory Open Access journal in taxonomy; 205 issues, 20,000 pages IN FOUR YEARS

All new taxa registered in ZooBank and supplied to EOL, Plazi and the wiki Species-ID

CrossRef member, ISI and Scopus covered, indexed in Zoological Record, DOAJ, CABI Abstracts, Google Scholar; archived in PubMedCentral and CLOCKSS

Pensoft Journal System – XML-based online editorial system; publishing services offered to society and institutional journals

Page 15

ZooKeys growth

Page 16

The XML landscape for legacy and prospective taxonomic literature

[Diagram; recoverable labels:]

PROSPECTIVE PUBLISHING: automated submission; peer-review; TaxPub XML schema; Pensoft mark up tool

HISTORICAL LITERATURE: TaxonX, taXMLit schemas; Plazi's GoldenGATE editor

Unified marked up final output: taxon treatments, keys, images, localities

Marked up publications: PDF, HTML and XML

Archiving: electronic archives; data centers

Wiki: Species-ID, Wikispecies, Wikipedia

Indexing: IPNI, ZooBank, MycoBank, GNA

Aggregators: EOL, GBIF

Content management systems & repositories (e.g., EOL, GBIF, Scratchpads)

END USERS

Page 17

Four stages of the XML-based editorial workflow

SUBMISSION: XML-tagged or non-tagged manuscripts?

PEER-REVIEW/EDITORIAL PROCESS: The technical challenges of the XML mark up

PUBLICATION: Different publishing formats and to whom they are addressed?

DISSEMINATION: How to provide a maximum distribution of published information

Page 18

But why mark up? Is it really needed? Who will be using it?

Descriptions

Images

Occurrences

Nomenclature

Literature

Plazi

Page 19

What does XML give readers that the usual PDF does not?

Page 20

Semantic enhancements to published texts

Page 21

Semantic enhancements to published texts

Page 23

Archiving in PubMedCentral

Page 24

Automated export of species descriptions to Encyclopedia of Life (EOL)

XML MARK UP

Page 25

Automated harvesting and deposition of taxon treatments in Plazi

Page 26

Export of content to the Wiki environment

Page 27

Species descriptions on Wikispecies and Wikimedia Commons

Page 29

The Future of TaxPub and its implementations

More semantic Web enhancements!

Pensoft Writing Tool (PWT) – a collaborative article writing platform

Community-based and open peer review process

Biodiversity Data Journal will publish any kind of “small data”: checklists, nomenclatural acts, taxon treatments

Page 30

The collaborative article authoring tool

Page 34

Why is the Biodiversity Data Journal needed?

Page 35

Publishing and sharing of primary data

Primary data

RE-USE of CONTENT

Drawings: Slavena Peneva

Page 36

Biodiversity Data Journal

All data matter: NO lower or upper limit of manuscript size!

ALL within a single online collaborative platform, including the writing of the manuscript!

Collaborative article authoring tool

Community peer review with “open” and “public” options, on top of conventional peer review

Online editorial process and version control

Standard-compliant (Darwin Core, Dublin Core, NLM JATS, etc.)

Pre-defined biological Code-compliant article templates

Page 37

Life cycle of data published in the BDJ

[Diagram; recoverable labels:]

BIODIVERSITY MANUSCRIPT: occurrence data, genome data, image galleries, morphometric data, environmental data, phylogenetic data, any other data

XML MARK UP

Structured text (data!)

Articles, occurrence data, taxon names, taxon treatments, bibliographies

Plazi, BHL, Wiki, COL

Page 38

The lessons learned

The main difficulties are caused by:

The specificity of the domain (e.g., taxon names, synonyms, instability of nomenclature, lack of global LSID infrastructure, etc.)

Mark up of occurrence data (certainly a great challenge)

Cost efficiency of the markup process

Sociological barriers: the majority of authors are not willing to change their writing habits; most are still not aware of the tremendous advantages of Web 2.0 technologies

Most small taxonomy publishers (and some bigger ones) have no experience in XML-based editorial workflows or they simply can’t afford them

Page 39

“Semi-automatically generated semantic, enhanced e-publications are the only way to describe the missing 10 M species, and to deal with an increasing flood of data.”

Donat Agosti

It is not easy, but... it is exciting... however possible only through Open Access!