American Chemical Society
Journals and Magazines and Books, Oh My!
A Look at ACS' Use of NLM Tagsets
Dan O'Brien, ACS Publications
Presented at JATS-Con, 1-Nov-2010
What We'll Cover
•Intro
– ACS, Products, Processes
– Framework & terminology for discussing customizations
•ACS Pubs' Use of NLM Tagsets
– Overall Approach
– Journals
– Books
– Magazine
• Successes & Lessons Learned
Character Introductions
• ACS & ACS Pubs
• Journals
• Books
• Magazine
• Processes
• Terminology
Introductions: ACS
•Professional membership organization
– Chartered by U.S. Congress in 1876
– Non-profit
– Over 160,000 members
•ACS Publications Division ("ACS Pubs")
– Journals
– Magazine
– Books
– On a quest
Introductions: ACS Journals
•40 peer-reviewed titles
•300,000 annual published pages
•~50% volume published weekly
•Among highest ISI impact factors
•"King" of publishing forest
Introductions: Books
•Symposium Series
•Around 30 titles published annually
•Around 25 chapters per book
•Hard covers, rigid content format
Introductions: C&EN Magazine
•Chemical & Engineering News
•Weekly Print & Web issues
•Daily Online News
•"BusinessWeek" for chemists
•Flexible format, loose content definitions
•More than meets the eye
Introductions, cont.
•Pressure for product innovation: Wicked Witch of the West
•NLM Tagsets have the answers: Wizard of Oz
Introductions: Processes
•Journals & Books:
– Standard scholarly publishing model
– XML-first article/chapter based production
• Automated Pre-Editing (Inera AutoRedact)
• Technical Editing
• Automated Post-Editing & Validations
– Article ASAP publication (Journals)
– Issue/Book publication (Journals & Books)
•Magazine:
– Staff writers vs. authors
– Feature articles, Thematic issues
– Story Online News? Issue?
– Edit-to-Fit
Introductions: Journal Process
Introductions: Books Process
Introductions: Magazine Process
Terminology
•Tag – a bit of XML markup: an element, attribute, etc.
•Tag Definition – the coding (in DTD or XSD syntax) that declares the tag name and what it's allowed to do.
•Module – a way of logically organizing tag definitions, allowing reuse for multiple schemas.
•Tagset – a collection of related tag definitions forming a complete vocabulary, usually stored within a set of interrelated modules
•Schema – an application of a tagset to form a specific content model
Terminology
[Diagram: a Tagset drawn as a set of Modules. Each Module holds tag definitions (A, B, C, D) linked by tag definition dependencies, and each Schema (DTD, XSD, etc.) is assembled from several Modules.]
Terminology – "Customization Levels"
Tagset is used "As-Is" without customizations
Tagset not directly used; just "informs" your approach
Terminology – "Customization Levels"
•As-Is – Public version is used without changes or modification
•Extended – Superset of public tagset is used
•Reduced – Subset of the public schema is used
•Customized – Combo of Extensions + Reductions
•Built From – Substantial changes: renamed tags, altered tag hierarchies, etc.
•Informed By – Only the design philosophy of public tagset is used
[Diagram: Public vs. Custom XML samples shown beneath the levels, with compatibility notes]
Samples: <xyz> <a/> <b/> </xyz>; <xyz> <a/> <b/> <c/> </xyz>; <abc> <a> <b/> </a> </abc>; <abc> <aa/> <bb/> <cc/> </abc>
Compatibility notes, in slide order: "XML is compatible", "XML not compatible?", "XML not compatible!", "XML not compatible!"
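The first four levels can be told apart by comparing the set of tags a public schema allows with the set a custom schema allows. The sketch below is only an illustration of that idea: the function name and the flat "set of tag names" model are assumptions, and real schemas also differ in attributes, ordering and nesting, which this comparison ignores.

```python
def classify_customization(public_tags, custom_tags):
    """Rough customization level from two collections of allowed tag names.

    Hypothetical helper: compares names only, not content models.
    """
    public, custom = set(public_tags), set(custom_tags)
    if custom == public:
        return "As-Is"          # nothing added, nothing removed
    if custom > public:
        return "Extended"       # strict superset of the public tags
    if custom < public:
        return "Reduced"        # strict subset of the public tags
    if custom & public:
        return "Customized"     # some tags added and some removed
    # No shared names at all: renamed tags or an independent design.
    return "Built From or Informed By"


level = classify_customization(["a", "b"], ["a", "b", "c"])
```

Here `level` is "Extended", matching the <xyz> sample that gains a <c/> child. A name-only comparison cannot distinguish "Built From" from "Informed By"; that distinction depends on how much of the public design survives.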
Terminology – "Customization Implementation Methods"
Overrides, leaving original public tag definitions versions intact
Modified original public tag definitions
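The difference between the two ends of this spectrum can be sketched with plain dictionaries. This is an analogy, not DTD code: the tag names, content models and function names below are hypothetical. In an actual DTD, an override is typically achieved by redeclaring a parameter entity before the public modules are included, since the first declaration wins.

```python
from collections import ChainMap

# Hypothetical public tag definitions: tag name -> allowed child tags.
PUBLIC_DEFS = {"fig": ["label", "caption", "graphic"], "sec": ["title", "p"]}


def customize_by_override(public_defs, overrides):
    """Layer custom definitions over the public ones without touching them."""
    # Lookups hit `overrides` first and fall back to `public_defs`.
    return ChainMap(overrides, public_defs)


def customize_by_modification(public_defs, changes):
    """Edit the public definitions themselves (the original version is lost)."""
    public_defs.update(changes)
    return public_defs


custom = customize_by_override(PUBLIC_DEFS, {"sec": ["title", "p", "caution"]})
modified = customize_by_modification(dict(PUBLIC_DEFS), {"sec": ["title", "p", "caution"]})
```

With overrides, `custom["sec"]` sees the extended model while `PUBLIC_DEFS["sec"]` is unchanged, so a new public release can be dropped in underneath. With modification, the changes have to be reapplied to each new public release.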
Terminology – "Customization Implementation Methods", cont.
Spectrum: Overrides, Mixed, Modified
[Diagram: a Custom Schema (DTD, XSD, etc.) and the Public Schema (DTD, XSD, etc.) both draw on the Tagset's Modules; each Module holds tag definitions (A, B, C, D) linked by tag definition dependencies.]
Terminology – "Customization Profile"
Customization Levels
Customization Implementation Methods
Overrides Mixed Modifications
As-is
Extended
Reduced
Customized
Built from
Informed by
The Journey: ACS Pubs' Use of NLM Tagsets
ACS Pubs' Use of NLM Tagsets – Overview & Approach
•Leverage a public schema, or develop one from scratch?
•If using a public schema, would customization be needed? (i.e., where on the "Customization Levels" spectrum?)
– Product drivers !!
– Process drivers !!
– ACS Terminology !?
•If customization would be needed:
– How much customization is needed? (scoping)
– What customizations are needed? (details)
– How to implement the customizations? (i.e., where on the "Implementation Methods" spectrum?)
ACS Journals' Use of NLM Tagsets
• Production vs. Delivery
• What we use and why
• Customization Profile
• Highlights of Customizations
ACS Journal Production: What we use
•Custom-built DTD based loosely on NLM Journal Archiving & Interchange v2.2
•~2005, as NLM tagset was beginning to increase in prominence for STM publishing
•Pre-2010: Monolithic tagset & schema used for editing, page composition, interchange with web delivery and 3rd parties
•Late 2010: New version of tagset supporting multiple schema flavors:
– "X" – External & Delivery Interchange
– "P" – Internal Production
– "L" – Page Layout
ACS Journal Production: What we use, cont.
[Diagram: ACS Journal v1.03 Tagset and ACS Journal v1.03 DTDs, built up in three layers]
•Core tagset modules: External/Interchange DTD
•Production-specific tagset features extend core modules: Production DTD
•Page layout specific tagset features extend production-specific modules: Layout DTD
•Overrides of tag definitions
ACS Journal Production: Why
•No public tagset met the minimum requirements for
– ACS Journal Product – without undesirable product limitations
– ACS Journal Process – without increasing costs
– Allowing ACS Pubs Terminology
• Without significant staff training & documentation updates
• Without risking rejection
•NLM's Journal tagset came closest
– Could have used massive extensions?
– ACS Pubs Terminology pushed us into "Built From"
ACS Journal Production: Customization Profile
Customization Levels
Customization Implementation Methods
Overrides Mixed Modifications
As-is
Extended
Reduced
Customized
Built from: ACS Journal Production
Informed by
ACS Journal Production: Customizations – Terminology
NLM | ACS
<fig> with @fig-type | <fig>, <chart>, <scheme>
<abstract> with @abstract-type | <abstract>, <synopsis>, <dek>
<graphic> with @content-type | <abstract-graphic>, <toc-graphic>, <title-page-graphic>, <bio-pic>
<media> | <weo>, <toc-weo>
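The terminology customizations trade one generic NLM element plus a type attribute for several named ACS elements. Going the other way is then a rename plus an attribute. The following is a minimal sketch of that idea; the mapping table, the attribute values and the function name are assumptions for illustration, not the actual ACS conversion rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: ACS tag -> (NLM tag, type attribute, attribute value).
ACS_TO_NLM = {
    "chart": ("fig", "fig-type", "chart"),
    "scheme": ("fig", "fig-type", "scheme"),
    "synopsis": ("abstract", "abstract-type", "synopsis"),
    "toc-graphic": ("graphic", "content-type", "toc-graphic"),
}


def acs_to_nlm(root):
    """Rename ACS-specific elements in place to typed NLM elements."""
    for el in root.iter():
        target = ACS_TO_NLM.get(el.tag)
        if target is not None:
            nlm_tag, attr_name, attr_value = target
            el.tag = nlm_tag
            el.set(attr_name, attr_value)
    return root


sample = ET.fromstring('<body><chart id="c1"/><fig id="f1"/><scheme id="s1"/></body>')
acs_to_nlm(sample)
converted = [(el.tag, el.get("fig-type")) for el in sample]
```

After the call, all three children are <fig> elements; the two that started as <chart> and <scheme> carry a fig-type attribute, and the original <fig> is untouched.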
ACS Journal Production: Customizations – Process
NLM | ACS
<article>, <front>, <journal-meta>, <article-meta>, <body>, <back> | <document>, <metadata>, <journal-meta>, <document-meta>, <processing-meta>, <body>, <back>
<sub-article>, <response> | <sec> beefed up to act as quasi "sub-article"
ACS Journal Production: Customizations – Product
NLM | ACS
<nlm-citation> (v2.3), <element-citation> (v3.0) | <acs-titles>, <acs-no-titles>, <acs-biochem>
n/a | <chemical-name>, <chemical-process>, <caution>; <live-change> and related tags; <tie-bar-start/>, <tie-bar-end/>
ACS Journal Production: Customizations – Product, cont.
NLM | ACS
n/a | MathML 2 extensions: <ACS:marker>, <object-group> (now available in MathML 3)
n/a | CALS Table extensions: @row-type = list of types to receive special handling; @indent-left = amount + unit; @indent-left-style = {"full", "first-line", "hanging"}; @spacing-before, @spacing-after
ACS Journal Delivery: What we use
• Online delivery system: based on Literatum from Atypon
• Literatum speaks "NLM Journal Archiving & Interchange"
• Common base tagset ≠ XML content compatibility
– Differing schemas
– Differing tagging expectations
...see Figure <xref rid="xfca3"/>.
vs.
...see Figure <xref rid="xfca3">4</xref>.
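The two <xref> forms above validate against closely related schemas but carry different tagging expectations: one leaves the link text to be generated, the other expects it in the XML. A conversion from the first form to the second might look like the sketch below. It assumes, hypothetically, that each link target has an @id and a <label> child holding the number; the function name and sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<document>
  <p>...see Figure <xref rid="xfca3"/>.</p>
  <fig id="xfca3"><label>4</label></fig>
</document>"""


def fill_empty_xrefs(root):
    """Copy each target's label text into empty <xref> elements; return the count."""
    # Index label text by the @id of the element that owns the label.
    labels = {}
    for el in root.iter():
        label = el.find("label")
        if el.get("id") and label is not None and label.text:
            labels[el.get("id")] = label.text.strip()

    filled = 0
    for xref in root.iter("xref"):
        is_empty = not (xref.text or "").strip() and len(xref) == 0
        if is_empty and xref.get("rid") in labels:
            xref.text = labels[xref.get("rid")]
            filled += 1
    return filled


root = ET.fromstring(SAMPLE)
count = fill_empty_xrefs(root)
xref_text = root.find("p/xref").text
```

Cross-references that already contain text, or whose target has no label, are left alone, so the function is safe to run on mixed content.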
ACS Journal Delivery: What we use
• Two-part content interface
1. Production system: "ACS-Delivery-Prep" (export)
2. Delivery system: "ACS2NLM" lexer (import)
Both advantages & disadvantages
+ Insulates Production developers from Delivery intricacies
+ Delivery system tagging can evolve without Production
- Occasional failure point
- New products, production tagging changes = ACS2NLM lexer changes
ACS Journal Delivery: Customization Profile
Customization Levels
Customization Implementation Methods
Overrides Mixed Modifications
As-is
Extended: ACS Journal Delivery System
Reduced
Customized
Built from: ACS Journal Production
Informed by
ACS Books' Use of NLM Tagsets
• What we use and why
• Customization Profile
• Highlights of Customizations
ACS Books: What we use and why - Drivers
•Delivery System: Leverage our new Literatum-based delivery platform.
•Composition: Leverage Arbortext Publishing Engine for highly-automated XML-based page composition.
•Like Journals: Don't re-invent the XML wheel.
•Unlike Journals: Books had unique product characteristics of their own; different type of wheel.
•Book + Chapter production:
– Individual Chapter level: production editing and some composition
– Whole Book level: final book composition, indexing
– Delivery: combination of both book and chapter XML & PDF deliverables.
ACS Books: What we use and why - Answers
•Delivery System:
– Literatum already supported an Extended version of NLM Book v2.3
– Production & Delivery could share a common tagset!
•Composition: Extended NLM Book v2.3 fit the bill
•Like Journals:
– Extended NLM Book v2.3 had CALS table model
– Many elements & structures were similar to ACS Journal tagset, easing adoption
•Unlike Journals: Extended NLM Book v2.3 addressed almost all book-specific metadata & processing needs
•Book + Chapter production: gap! Solution: XInclude
– Allows "link book to chapter" instead of "copy chapter into book"
ACS Books: Customization Profile
Customization Levels
Customization Implementation Methods
Overrides Mixed Modifications
As-is
Extended: ACS Journal & Book Delivery System
Reduced
Customized: ACS Book Production
Built from: ACS Journal Production
Informed by
ACS Books: Customization Highlights
•Addition of XInclude
– Allows a chapter XML to be processed both as stand-alone document AND within context of entire book
•Use of OASIS Table Model
(instead of default XHTML Table model)
•Addition of DocBook <index> Model
•Addition of <book-series-meta> section
(similar to <journal-meta>)
ACS Books: Customization Highlights - XInclude
Copy chapter into book:
Book XML (Book DTD)
<book> <book-series-meta>… <book-meta>… <body> <book-part>… <book-part>… <book-part>…
Link book to chapter:
Book XML (Book DTD)
<book> <book-series-meta>… <book-meta>… <body> <xi:include href="ch1.xml"/> <xi:include href="ch2.xml"/> <xi:include href="ch3.xml"/>
Chapter XMLs
<book-part>…
<book-part>…
<book-part>…
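The "link book to chapter" arrangement can be tried out with the XInclude support in Python's standard library. This is a minimal sketch, not the ACS production setup: the chapter content is invented, and the in-memory `CHAPTERS` dictionary with its `load_chapter` loader is a hypothetical stand-in for chapter files on disk.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical chapter files, kept in memory so the example is self-contained.
CHAPTERS = {
    "ch1.xml": "<book-part id='ch1'><title>Chapter 1</title></book-part>",
    "ch2.xml": "<book-part id='ch2'><title>Chapter 2</title></book-part>",
}

BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <book-meta/>
  <body>
    <xi:include href="ch1.xml"/>
    <xi:include href="ch2.xml"/>
  </body>
</book>"""


def load_chapter(href, parse, encoding=None):
    """Loader callback for ElementInclude: return the parsed chapter element."""
    if parse != "xml":
        raise ValueError("only XML includes are handled in this sketch")
    return ET.fromstring(CHAPTERS[href])


def assemble_book(book_xml):
    """Parse the book and replace each xi:include with the chapter it points to."""
    book = ET.fromstring(book_xml)
    ElementInclude.include(book, loader=load_chapter)
    return book


book = assemble_book(BOOK)
chapter_ids = [part.get("id") for part in book.iter("book-part")]

# The same chapter also parses on its own, outside the book context.
standalone = ET.fromstring(CHAPTERS["ch1.xml"])
```

Each chapter stays a single source file that is edited once and then seen both stand-alone and, after inclusion, within the whole book.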
ACS C&EN Magazine's Use of NLM Tagsets
• What we use and why
• Customization Profile
• Highlights of Customizations
ACS Magazine: What we use and why
•What: A customized version of the ACS Journal Tagset
– (Which was "informed by" NLM Journal Tagset)
•Drivers:
– Ability to archive a "content of record" that is format independent
– Ability to serve as technology-neutral "content interchange format"
• Automated web delivery
• External content syndication
•Other contenders: DITA for Publications, DocBook, EPUB, PRISM, NewsML,
ACS Magazine: Customization Profile
Customization Levels
Customization Implementation Methods
Overrides Mixed Modifications
As-is
Extended: ACS Journal & Book Delivery System
Reduced
Customized: ACS Book Production
Built from: ACS C&EN Magazine, ACS Journal Production
Informed by
ACS Magazine: Customization Highlights
•Amorphous, modular content structures: XInclude
– Same content produced as
• Single article in print
• Several distinct pages online
– Web-only articles & article components
– Blur between articles & subarticles
– Graphics, tables, media have separate production lifecycles, joined later
•Non-contiguous Pagination
•Ads
ACS Magazine: Customization Highlights
•Flexible, recursive categorization model
– Print/web name, internal code, source/type
• "CO2 Sequestration" vs. "Carbon Dioxide Sequestration"
– RSS feeds
– Alternate topic-oriented TOCs
•Special content constructs
– Dek
– Eyebrow
– Pull quotes
ACS Magazine: Customization Highlights
ACS Pubs' Use of NLM Tagsets – Summary
Tagset Lineage & Content Interchange Map
Successes & Lessons Learned
• Tagging & Technology
• People
Successes & Lessons - Technical
1. Monolithic vs. specialized schemas
2. Use of XInclude for Books & Magazine
Successes & Lessons - Technical
3. ACS Pubs' hosted "Validations service"
• Internal staff
• Internal systems
• External vendors
4. Use of XML for ACS Mobile
Successes & Lessons - People
1. Busting the NLM DTD "compatibility" myth
2. "XML as a product" mentality
Successes & Lessons - People
3. Specifying XML requirements via "Three-legged stool" or package:
a) XML DTD/Schema
b) Documentation: Tagging Conventions & Rendering Expectations
c) XML Samples
Q & A