Taxonomic Literature: Standards and Synergies
TDWG 2006
Anna L. Weitzman & Christopher H. C. Lyal
There is a consensus in the taxonomic, biodiversity informatics, natural history library, and publishing communities that standards are needed for taxonomic literature
• Various XML schemas are being created
• Various initiatives are digitizing literature in different ways
Taxonomic Literature
We will consider uses and users for such standards, and whether ‘one size fits all’
• For example, it emerged early on that simple citations needed standardizing for a number of purposes, but that other standards were also needed
• TDWG literature standards working group was informally established in 2004 and formally in 2005; it will update to meet the new process in late 2006 and expects to propose standards in 2007
Taxonomic Literature Standards
Level 1 standard for taxonomic literature
Recommendation from Group
• Author(s) of citation (required; could be author(s) of new taxon or new combination, or just that portion of work cited)
• Author(s) of work (optional; author(s) of book, chapter, or article, if different from citation author(s))
• Title of work (required; title of book, book series, journal, etc.)
• Volume indicator (optional, required if applicable)
• Part or issue number indicator (optional, required if applicable)
• Page(s) (required)
• Image indicator (optional)
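The Level 1 element list can be read as a small record type. The following is a minimal sketch, assuming Python field names such as `citation_authors` and `work_title` (these names, and the placeholder values, are illustrative and not part of the working group's recommendation). It only checks that the three required elements are present.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MicroCitation:
    """Hypothetical Level 1 record; field names are illustrative only."""
    citation_authors: List[str]                             # required
    work_title: str                                         # required
    pages: str                                              # required
    work_authors: List[str] = field(default_factory=list)   # optional
    volume: Optional[str] = None          # optional, required if applicable
    part_or_issue: Optional[str] = None   # optional, required if applicable
    image: Optional[str] = None           # optional

    def missing_required(self) -> List[str]:
        """Return the names of required elements that are empty."""
        missing = []
        if not self.citation_authors:
            missing.append("citation_authors")
        if not self.work_title.strip():
            missing.append("work_title")
        if not self.pages.strip():
            missing.append("pages")
        return missing


# Placeholder values, not a real citation.
example = MicroCitation(
    citation_authors=["A. Author"],
    work_title="Example Flora",
    volume="1",
    pages="12",
)
print(example.missing_required())  # prints []
```

Whether volume or part is "applicable" depends on the work cited, so the sketch leaves that check to the caller.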
Level 1 standard for taxonomic literature
Comments & questions
• Article and/or chapter title have been omitted - rarely if ever used in microcitations - to be included in level 2.
• Recommendation needed for reference to the permanent archive for an electronic publication?
• Need for LSID for the microcitation.
• Placeholder for LSIDs of Level 2 citation and level 3 document needed.
Recommendations from working group to include:
• All elements of level 1 (level 1 standard as subset)
• Article Title
• Chapter Title
• Additional metadata as required for interoperability with library standards (MODS?)
• Alternative abbreviations for journal names etc.
• LSID
Function: e.g. to enable central repository for references of taxonomic works so that users may obtain citations needed in form appropriate for publication, CV, etc. i.e.: ‘public version of EndNote’
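The "public version of EndNote" function amounts to storing one Level 2 record and rendering it in whatever form a user needs. A minimal sketch follows, assuming a hypothetical `Level2Reference` type and two invented output styles, `full` and `micro`; the punctuation rules are placeholders, not a proposed citation style, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Level2Reference:
    """Hypothetical Level 2 record; names are illustrative only."""
    # Level 1 elements (kept as a subset)
    authors: List[str]
    work_title: str
    pages: str
    volume: Optional[str] = None
    part_or_issue: Optional[str] = None
    # Level 2 additions named on the slide
    article_title: Optional[str] = None
    chapter_title: Optional[str] = None
    abbreviations: Dict[str, str] = field(default_factory=dict)
    lsid: Optional[str] = None


def locator(ref: Level2Reference) -> str:
    """Build 'volume(part): pages', omitting whatever is absent."""
    text = ref.volume or ""
    if ref.part_or_issue:
        text += f"({ref.part_or_issue})"
    return f"{text}: {ref.pages}" if text else ref.pages


def format_reference(ref: Level2Reference, style: str) -> str:
    """Render one stored record in the requested (assumed) style."""
    authors = ", ".join(ref.authors)
    if style == "full":
        title = f" {ref.article_title}." if ref.article_title else ""
        return f"{authors}.{title} {ref.work_title} {locator(ref)}."
    if style == "micro":
        # Fall back to the full title when no abbreviation is stored.
        short = ref.abbreviations.get("micro", ref.work_title)
        return f"{authors}, {short} {locator(ref)}"
    raise ValueError(f"unknown style: {style}")


ref = Level2Reference(
    authors=["A. Author", "B. Author"],
    work_title="Example Journal of Taxonomy",
    pages="10-20",
    volume="5",
    part_or_issue="2",
    article_title="An example revision",
    abbreviations={"micro": "Ex. J. Taxon."},
)
print(format_reference(ref, "full"))
print(format_reference(ref, "micro"))
```

Keeping the alternative abbreviations on the record, rather than in the formatter, is what lets a single repository serve several publication styles.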
Level 2 standard for taxonomic literature
• Main focus of future work of subgroup
• Activities:
– Consider use(s) of standard (schema)
– Identify set of criteria for agreement
– Review current developments
– Determine best fit (including components from different schemas if appropriate) and refine standard
Level 3 standard for taxonomic literature
Uses:
• To enable location of digitized published information.
• Provide standard for use with multiple initiatives.
• Enable search by unconstrained user-defined criteria, and return of that subset.
• Enable retrieval of specimen-level information related to taxa in literature.
• Enable simultaneous search across different literature items.
• Enable interoperability with taxonomic name data sources.
• Vital component for taxonomic workspace on the web.
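Two of these uses, search by user-defined criteria and simultaneous search across literature items, can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the element names `publication`, `treatment`, `name` and `locality` are invented for the example and do not come from any of the schemas discussed here, and "Aus bus" and similar are placeholder names.

```python
import xml.etree.ElementTree as ET

# Two hypothetical marked-up works; the markup is illustrative only.
DOC_A = """<publication id="A">
  <treatment><name>Aus bus</name><locality>Kenya</locality></treatment>
  <treatment><name>Aus cus</name><locality>Peru</locality></treatment>
</publication>"""

DOC_B = """<publication id="B">
  <treatment><name>Dus eus</name><locality>Kenya</locality></treatment>
</publication>"""


def search(documents, element, value):
    """Return (publication id, taxon name) for every treatment in any
    document that contains `element` with exactly the text `value`."""
    hits = []
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for treatment in root.iter("treatment"):
            if any(e.text == value for e in treatment.iter(element)):
                hits.append((root.get("id"), treatment.findtext("name")))
    return hits


print(search([DOC_A, DOC_B], "locality", "Kenya"))
# prints [('A', 'Aus bus'), ('B', 'Dus eus')]
```

The search works across both works only because they share element names, which is the practical argument for a common standard.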
Level 3 standard for taxonomic literature
Three major aspects of the standard
• Text formatting, pagination, etc.
– TEI-lite used by some initiatives
– NCBI has embedded at least some formatting elements in its journal publishing DTD
• ‘Standard’ publication elements.
– Data about the published item (journal, ISBN/ISSN, authorship, publisher, etc) from standard 2
– Other front and back matter (e.g. glossary, contents, abstract, foreword); with some taxonomic-related content
• Content
– Domain-specific (biologic content, including taxonomy, natural history, morphology, molecular, etc)
Content is main area for concentration in this discussion and for level 3.
Level 3 standard for taxonomic literature
Comparison of current schemas
• Compare properties of the schemas
• Considered only schemas designed for literature per se, not those designed for other purposes, even though they may contain some pertinent elements (e.g., ABCD, TCS, etc)
Note: AMNH/TaxonX has also done some comparison as part of an NSF grant
Level 3 standard for taxonomic literature: comparison of current candidates
Schemas, standards, formats compared:
• PDF
• TEI/TEI-lite
• Flora Zambeziaca
• Flora/Fauna of New Zealand
• Flora of Australia
• ‘TEI’ + (high level/intermediate markup from taXMLit)
• TaxonX
• taXMLit
Level 3 standard for taxonomic literature: comparison of current candidates
Criterion (1)
• Allows searching at some level
– PDF: depends on source (image or text)
– TEI/TEI-lite: +
– Flora Zambeziaca: +
– Flora/Fauna of New Zealand: +
– Flora of Australia: +
– TEI +: +
– TaxonX: +
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (2)
• Includes metadata about the publication
– PDF: –
– TEI/TEI-lite: +
– Flora Zambeziaca: +
– Flora/Fauna of New Zealand: +
– Flora of Australia: +
– TEI +: +
– TaxonX: +
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (3)
• Distinguishes main components of taxonomic literature
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: +
– Flora/Fauna of New Zealand: +
– Flora of Australia: +
– TEI +: +
– TaxonX: +
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (4)
• Potentially applicable to all taxonomic literature
– PDF: +
– TEI/TEI-lite: +
– Flora Zambeziaca: –
– Flora/Fauna of New Zealand: –
– Flora of Australia: –
– TEI +: +
– TaxonX: +
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (5)
• Allows (or structure could allow) multiple search criteria
Schema                      name  synonyms  geography  dates  authors  collector  other
PDF                         –     –         –          –      –        –          –
TEI/TEI-lite                –     –         –          –      –        –          –
Flora Zambeziaca            +*    –         +*         –      –        –          +
Flora/Fauna of New Zealand  +     –         –          –      –        –          –
Flora of Australia          +*    +         –          –      –        –          +
TEI +                       +     +*        +*         +*     +*       +*         +*
TaxonX                      +     –         +*         +*     +*       –          +*
taXMLit                     +     +         +          +      +        +          +
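The criterion (5) matrix can be queried directly once it is held as data. The sketch below transcribes the marks from the slide, using an ASCII "-" for the minus sign. The asterisk is not explained on the slides; treating "+*" as qualified support is an assumption made here, as is the helper name `supporting`.

```python
FIELDS = ["name", "synonyms", "geography", "dates",
          "authors", "collector", "other"]

# Marks transcribed from criterion (5); "-" stands for the slide's minus.
RAW = {
    "PDF":                        "-  -  -  -  -  -  -",
    "TEI/TEI-lite":               "-  -  -  -  -  -  -",
    "Flora Zambeziaca":           "+* -  +* -  -  -  +",
    "Flora/Fauna of New Zealand": "+  -  -  -  -  -  -",
    "Flora of Australia":         "+* +  -  -  -  -  +",
    "TEI +":                      "+  +* +* +* +* +* +*",
    "TaxonX":                     "+  -  +* +* +* -  +*",
    "taXMLit":                    "+  +  +  +  +  +  +",
}

# One dict per schema: search field -> mark.
MATRIX = {schema: dict(zip(FIELDS, row.split()))
          for schema, row in RAW.items()}


def supporting(search_field, include_qualified=True):
    """List schemas marked '+' (and optionally '+*') for one search field."""
    accepted = {"+", "+*"} if include_qualified else {"+"}
    return [schema for schema, row in MATRIX.items()
            if row[search_field] in accepted]


print(supporting("synonyms"))
# prints ['Flora of Australia', 'TEI +', 'taXMLit']
print(supporting("collector", include_qualified=False))
# prints ['taXMLit']
```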
Comparison of standards in use for taxonomic literature
Criterion (6)
• Allows selection of user-defined subsets
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: –
– Flora/Fauna of New Zealand: –
– Flora of Australia: +
– TEI +: –
– TaxonX: +* (depends on interface)
– taXMLit: + (depends on interface)
Comparison of standards in use for taxonomic literature
Criterion (7)
• Suitable for accessing multiple works simultaneously
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: + (of same series)
– Flora/Fauna of New Zealand: + (of same series)
– Flora of Australia: + (of same series)
– TEI +: +
– TaxonX: +
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (8)
• Synonymy extractable and combinable into catalogue format
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: –
– Flora/Fauna of New Zealand: –
– Flora of Australia: –
– TEI +: +*
– TaxonX: –
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (9)
• Display outputs from different treatments in uniform/comparable style
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: + (of same series)
– Flora/Fauna of New Zealand: + (of same series)
– Flora of Australia: + (of same series)
– TEI +: +
– TaxonX: +
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (10)
• Output formats contain data with contextual information or without (atomized data may be extracted for other uses)
– PDF: – (only with)
– TEI/TEI-lite: – (only with)
– Flora Zambeziaca: – (only with)
– Flora/Fauna of New Zealand: – (only with)
– Flora of Australia: – (only with)
– TEI +: – (only with)
– TaxonX: – (only with)
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (11)
• Produces usable keys
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: +
– Flora/Fauna of New Zealand: +
– Flora of Australia: +
– TEI +: –
– TaxonX: –
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (12)
• Interoperable with other standard schemas
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: –
– Flora/Fauna of New Zealand: –
– Flora of Australia: –
– TEI +: –
– TaxonX: +*
– taXMLit: +
Comparison of standards in use for taxonomic literature
Criterion (13)
• Uses other schemas
– PDF: –
– TEI/TEI-lite: –
– Flora Zambeziaca: –
– Flora/Fauna of New Zealand: –
– Flora of Australia: –
– TEI +: –
– TaxonX: +, Darwin Core, MODS
– taXMLit: +
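For a quick overview, the single-valued criteria (1 to 4 and 6 to 13) can be tallied per candidate. The slides do not compute such a total, so the unweighted count below is only an illustrative summary. It is a sketch that assumes "+*" counts as met unless excluded, drops qualifiers such as "of same series", and records the PDF result for criterion (1) ("depends on source") as "?", which is never counted.

```python
CRITERIA = [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13]

# Marks transcribed from the criterion slides, in CRITERIA order.
# "-" stands for the slide's minus; "?" is "depends on source".
RAW = {
    "PDF":                        "? - - + - - - - - - - -",
    "TEI/TEI-lite":               "+ + - + - - - - - - - -",
    "Flora Zambeziaca":           "+ + + - - + - + - + - -",
    "Flora/Fauna of New Zealand": "+ + + - - + - + - + - -",
    "Flora of Australia":         "+ + + - + + - + - + - -",
    "TEI +":                      "+ + + + - + +* + - - - -",
    "TaxonX":                     "+ + + + +* + - + - - +* +",
    "taXMLit":                    "+ + + + + + + + + + + +",
}

# One dict per schema: criterion number -> mark.
RESULTS = {schema: dict(zip(CRITERIA, row.split()))
           for schema, row in RAW.items()}


def criteria_met(schema, include_qualified=True):
    """Count criteria marked '+' (and optionally '+*') for one schema."""
    accepted = {"+", "+*"} if include_qualified else {"+"}
    return sum(1 for mark in RESULTS[schema].values() if mark in accepted)


for schema in RAW:
    print(f"{schema}: {criteria_met(schema)} of {len(CRITERIA)}")
```

A raw count treats every criterion as equally important, which the working group may not intend; weighting would be a separate decision.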
Comparison of standards in use for taxonomic literature
Suggestion that schema include entire other standards (e.g., ABCD, Darwin Core, MODS)
• Pro:
– congruence between different standards;
– limits development time by capitalizing on other working groups’ discussions;
– familiarity to some users
• Con:
– users need to learn several schemas not one;
– changes in constituent schemas require changes in content;
– changes in any constituent schema require changes in parsing and reviewing tools;
– literature schema can be tailored to vagaries of literature more than elements from schemas developed for other purposes
Stand-alone standard vs. import
• Should this be split into two levels?
– Basic structure standard
  • consensus of TaxonX, Flora of Australia, Flora Brasiliensis, Flora Zambeziaca, Flora/Fauna of New Zealand, high level elements from taXMLit?
  • Basic level might include name, citations, geography, hierarchical placement, description
– Detailed structure standard
  • taXMLit as the basis?
How to move forward for a level 3 standard for taxonomic literature
Considerations for detailed structure standard
• Experience with museum databasing shows:
The higher the level of data atomization the greater the retrievability of information and the better the database functionality
• User needs assessment: While taxonomists may find it relatively easy to find much content in a relatively simple markup, the other users who often criticize taxonomists for not making data readily available are much less likely to find what they need that way. By atomizing more detail, a wider variety of uses and users will find more of what they need.
• Even for taxonomists, additional time spent searching text rather than just seeing the data requested (without useless context) is not good when we are trying to speed the taxonomic process and reduce the ‘taxonomic impediment’.
• More complex to produce, but automated or semi-automated markup will assist.
• With a good interface it will be easier for end users to understand and use.
• Includes potential to model same data in different contexts, reflecting subtleties of organization / content – stand-alone schema.
Considerations for detailed structure standard
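The atomization argument can be made concrete with a toy comparison. In this minimal sketch the same two treatments are held once as plain paragraphs and once as atomized records. The field names and all values are invented placeholders, not taken from any schema discussed here.

```python
# The same information in two forms (placeholder data).
PARAGRAPHS = [
    "Aus bus Smith, 1900. Kenya, Nairobi, collected by J. Doe.",
    "Aus cus Jones, 1910. Peru, Lima, collected by A. Roe.",
]

RECORDS = [
    {"name": "Aus bus", "country": "Kenya",
     "locality": "Nairobi", "collector": "J. Doe"},
    {"name": "Aus cus", "country": "Peru",
     "locality": "Lima", "collector": "A. Roe"},
]


def text_search(term):
    """Low atomization: a hit returns the whole paragraph, context included."""
    return [p for p in PARAGRAPHS if term in p]


def field_query(where, value, want):
    """High atomization: return only the requested field of matching records."""
    return [r[want] for r in RECORDS if r[where] == value]


print(text_search("Kenya"))
# prints ['Aus bus Smith, 1900. Kenya, Nairobi, collected by J. Doe.']
print(field_query("country", "Kenya", "collector"))
# prints ['J. Doe']
```

With text search the reader still has to pick the collector out of the sentence; with atomized fields the answer is returned directly, at the cost of more detailed markup.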