XML Workflows and You Thad McIlroy The Future of Publishing San Francisco & Vancouver Presented to ACP CPDS Digital Publishing Workshop Thursday, December 10, 2009



DESCRIPTION

"XML Workflows & You" is a presentation given to the Association of Canadian Publishers in December 2009. It was designed to educate non-technical book publishing personnel on the intricacies of implementing XML in their production workflows. The presentation does not fully endorse the notion!

Citation preview

1. XML Workflows and You
  • Thad McIlroy, The Future of Publishing, San Francisco & Vancouver
  • Presented to ACP CPDS Digital Publishing Workshop
  • Thursday, December 10, 2009

2. Outline
  • My background
  • My XML Thesis
  • The Vision!
  • Coping with a digital world
  • Thinking (a lot) about XML
  • Complexity
  • eBooks
  • Implementing XML workflows

3. My Background
  • 8 years in bookselling & publishing in Canada; 4 in the U.S. (15 in SF)
  • 20+ years studying the intersection of technology and print publishing, working with publishers, printers & vendors
  • 5 years with Seybold Seminars
  • 14 books and 200+ articles

4. More Recent Background
  • 10 years studying the impact of the Internet on graphic communications
  • Major focus now: the future of publishing, workflow, eBooks and other media, publishing automation (XSL-FO)
  • Writing for PrintAction, Gilbane.com, TheFutureOfPublishing.com

5. XML Background
  • Worked to implement two multi-million XML workflows at large educational publishers
  • Authored "XSL-FO: Ready for Prime Time" (published by Gilbane)
  • Designed four automated print publishing systems

6. TheFutureofPublishing.com

7. My XML Thesis
  • These are all real problems!
  • Print production
  • Web publishing
  • Repurposing to multiple media
  • The semantic world (metadata)
  • Recombining (or subsetting) for new products
  • Archiving
  • Accessibility

8. XML Provides Real Solutions
  • But it is a big, ugly, unwieldy bear
  • Its conceptual metaphors bear little or no resemblance to those of book publishing
  • It's based on 1960s thinking about techdoc (GML, SGML, XML)
  • Yet its ubiquity makes it hard to shake... as does its mindshare
  • It's an open standard, but that don't make it free

9. The Vision: Smart Documents
  • Authors w. templates
  • Text and vector graphics in XML
  • Editors working electronically
  • Proofing done digitally
  • Knows where it's been and where it's going: print & bind, Web & PDA distribution
  • File contains all preflight info & revision history
  • Contains multiple language versions

10. Some ACP Survey Results
  • Most title production remains in-house
  • Few expect this to change soon
  • Few are "very aware" of XML workflows; many questions remain
  • Not certain where the ROI will come from
  • Keen on semantic tagging but without a strong concept of the value

11. ACP Survey Result Questions
  • 50% of the software used in-house is neither QE nor ID. What is it?
  • Why the major interest in semantic tagging?

12. Dealing with a Digital World

13. Workflow Can Be Controlled
  • It is a debugged system: there are no technological reasons for mistakes
  • PDF is a big part of the answer
  • Predictable (potentially) and independent
  • Major problem remains in the interface between publishers and printers

14. Color Can Be Controlled
  • CMSs work
  • Soft, remote color proofing works
  • Press manufacturers support color control
  • Closed-loop color control is here now

15. 20 Years Later

16. Managing the Mice

17. $10k for an Exclusive Look

18. What is to Be Done?

19. Workflow Must Be Charted

20. The Tenets of Automation
  • Full digitization: nothing on paper
  • Full commitment: from management to sales to all operating staff
  • All the software: the right applications (from creative through DAM/CMS and workflow enablers)
  • Standards: full support for the standards that enable automation

21. Stylesheets
  • Do you fully embrace stylesheets throughout your workflow, from word processor to page composition?
  • If changes are made in the print-ready PDF file, are those changes systematically being restored to earlier iterations of the work?

22. Essential Points
  • If you have not got your current workflows fully digital and fully debugged, forget XML entirely
  • There is only one workflow
  • Stop seeing the printer as separate from your workflow
  • Stop seeking needless competitive bids. Partner.

23. "6 projects that could change publishing for the better"
  • Michael Tamblyn, CEO, BookNet Canada
  • BookNet Canada TechForum 09

24. "an XML publishing workflow that doesn't suck"

25. OR a publishing workflow that offers all of the benefits of XML, yet doesn't suck
  • The Future of Publishing

26. The Importance of XML
  • eXtensible Markup Language
  • XML enables content management
  • Combining the power of style sheets with the power of databases
  • Style sheets with meaning

27. XML is the Answer
  • A new breed of data standard, a single standard able to represent:
  • 1. All manner of content
  • 2. The structure of content
  • 3. The meaning of content (through smart tag names and metadata)
  • 4. Production/workflow requirements
  • 5. Rights data
  • 6. Repurposing requirements (cross-media)
  • 2005

28. XML
  • Composition is the low-hanging fruit
  • XML stands for "Extremely Mixed-up Language"
  • Suited to reference, non-fiction, education, multipurposing
  • "XML is like violence. If you're not getting the result you want, you have to use more."

29. Format vs. Structure
  • Format describes how content is intended to look when it is displayed or printed
  • Structure describes the purpose or meaning of content

30. XML & Publishers' Workflows
  • Most smaller publishers are still in exploratory mode
  • This is NOT simple to implement, train and support
  • It can be hugely expensive
  • It's easy to make expensive mistakes
  • Demands a large offshore component

31. The Information Avalanche
  • Doubling the knowledge base:
  • 1750-1900: 150 years to double
  • 1900-1950: 50 years to double
  • 1950-1960: 10 years to double
  • 1960-1992: 5 years to double
  • By 2020, information is expected to double about every 73 days!
  • Paper can't provide data in a cost-effective and timely fashion

32. Growth in Electronic Documents
  • 1995: 12 trillion electronic and paper documents
  • 90% of all documents were printed (in 1998)
  • 2005: 20 trillion documents
  • 2005: About 50% will be printed
  • Ratio of offset to digital print: 40:60
  • Offset @ 40% of today's volume
  • Source: Gary Starkweather, Microsoft Research (and inventor of the laser printer)

33. The Next Section: IMPLEMENTING XML

34. Structured Tagging by Authors?
  • Typefi sample approach

35. XML Tagging
  • Semantic tagging requires human judgment

36. Semantic Tagging

37. Semantic Tagging
  • The concept has been dominated by the notion of a semantic Web
  • The benefits are easy to imagine
  • The implementation can be imagined also
  • But the infrastructure is not in place to deliver the benefits
  • Publishers run fiefdoms

38. RCO: The Semantic Challenge
  • Reusable Content Objects: What is your level of semantic granularity?
  • The word
  • Sentence
  • Paragraph
  • Story
  • Other content objects
  • Graphics

39. "If you show this to most editors... they're going to start drinking at their desks" (MT)

40. Digital Asset Management
  • XML's role in metadata and taxonomies

41. Content Management

42. Templated Designs
  • How much of XML-tagged content can be composed automatically?
  • Typefi sample approach

43. Metadata Enters the Process
  • Data that describes other data

44. The Bean Analogy
  • FROM: A Manager's Introduction to Adobe Extensible Metadata Platform

45. Bean Metadata
ELEMENT NUMBER | CATEGORY OF INFORMATION | VALUE OF CATEGORY IN THIS INSTANCE (What appears on the label) | DATA TYPE
1 | The maker: | Trader Joe's | String
2 | The contents: | Black Beans | String
3 | A notion of distinctive food value: | A low fat food | String
4 | A second notation of distinctive food value: | An excellent source of dietary fiber | String
5 | Directions for finding nutritional information: | See side panel for nutritional information | String
6 | A notation of weight, in English and metric units: | Net Wt. 15 oz (415g) | Formatted numbers
7 | A marketing narrative | Trader Joe's Black Beans have a rich, hearty taste and soft texture. They are wonderful in soups and stews, with rice, and in salads with colorful vegetables and Southwestern or Caribbean flavors. Black beans have gained in popularity due to their high dietary fiber and protein content. They are a cholesterol-free and low fat food. | Long string

46. More Bean Metadata
ELEMENT NUMBER | CATEGORY OF INFORMATION | VALUE OF CATEGORY IN THIS INSTANCE (What appears on the label) | DATA TYPE
8 | A declaration of wholesomeness: | No preservatives, no artificial colors, no artificial flavors | String
9 | A list of ingredients: | black beans, water, salt, calcium chloride | List separated by commas
10 | The ID of distributor and seller: | Dist. & Sold Exclusively by Trader Joe's, So. Pasadena, CA 91031 | String
11 | A tracking code, in Roman | 0009 6362 | Integer
12 | Same tracking code in bar-code-readable format | (bar code) | Bit map
13 | The nutritional facts, in standard order and format: | Nutritional Facts. Serving Size 1/2 cup (130g). Servings per container about 3. Amount per serving: Calories 130, Fat Cal 5. % Daily Value: Total Fat 0.5g 0%; Saturated Fat 0g 0%; Cholesterol 0mg 0%; Sodium 260mg 11%; Total Carbohydrates 22g 7%; Dietary Fiber 5g 22%; Sugars 0g; Protein 10g 20%. Vitamin A 0%; Vitamin C 0%; Calcium 4%; Iron 10%. Percent Daily Values are based on a 2,000 calorie diet | Structured table

47. The Cross-Media Challenge
  • Print
  • Web
  • Mobile

48. The Cross-Media Challenge

49. The Cross-Media Challenge

50. W3C XML Schema Definition Language (XSD) 1.1 (12-03-09)
  • "Part 1: Structures" specifies the XML Schema Definition Language, offering facilities for describing the structure and constraining the contents of XML documents, including those which exploit the XML Namespace facility. The schema language, which is itself represented in an XML vocabulary and uses namespaces, substantially reconstructs and extends the capabilities found in XML document type definitions (DTDs).
  • The second publication, "Datatypes," defines facilities for defining datatypes to be used in XML Schemas as well as other XML specifications.
  • Comments welcome through 31-12.

51. DocBook (docbook.org)
  • What is DocBook? "DocBook is a schema (available in several languages including RELAX NG, SGML and XML DTDs, and W3C XML Schema) maintained by the DocBook Committee of OASIS. It is particularly well suited to books and papers about computer hardware and software..."
  • 700 pages of dense documentation

52. DITA 1.1 (August 07)
  • The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing, and delivering information
  • Its main use is for technical publications
  • The documentation is 593 pages
  • Maintained by OASIS-open.org
  • And it won't work for all your titles

53. Do Not Be Tricked!

54. Rule of Least Power
  • A W3C TAG Finding states: "When designing computer systems, one is often faced with a choice between using a more or less powerful language for publishing information, for expressing constraints, or for solving some problem. [ . . . ] The Rule of Least Power suggests choosing the least powerful language suitable for a given purpose."

55. We're Thinking Small

56. We're Thinking Small

57. The Tipping Point: How Little Things Can Make a Big Difference
  • "...a book that presents a new way of understanding why change so often happens as quickly and as unexpectedly as it does... Ideas and behavior and messages and products sometimes behave just like outbreaks of infectious disease. They are social epidemics."
  • Malcolm Gladwell

58. Crossing the Chasm
  • Innovators (Techies)
  • Early Adopters (Visionaries)
  • Early Majority (Pragmatists)
  • Late Majority (Conservatives)
  • Laggards (Skeptics)
  • Source: www.chasmgroup.com

59. The Human Factor: New Internal Roles, Skills & Positions
  • The production skill set changes substantially
  • Much of the existing knowledge base changes or obsoletes
  • The move from design & composition & production management to content & product architecting and engineering
  • There is an enormous training challenge ahead
  • And a need for certification

60. Some Steps
  • Do your homework
  • Listen to the next preso and decide if you think that can work for you
  • Remember, there are alternatives
  • eBook construction is only going to get cheaper and easier
  • Stay tuned with Adobe, Quark and the key trade associations

61. Thank you
  • [email protected]
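
A note on slide 29 (Format vs. Structure) and slide 47 (the cross-media challenge): the distinction is easier to see in a small example. The Python sketch below is an illustration only and is not part of the presentation. The element names (product, maker, contents, ingredient) and the two render functions are hypothetical, and the sample values are borrowed from the bean label on slides 45 and 46. The content is tagged once by meaning, then formatted twice.

```python
import xml.etree.ElementTree as ET

# Structure: tags say what each piece of content IS, not how it looks.
# Element names are hypothetical; values come from the bean label example.
SOURCE = """
<product>
  <maker>Trader Joe's</maker>
  <contents>Black Beans</contents>
  <ingredients>
    <ingredient>black beans</ingredient>
    <ingredient>water</ingredient>
    <ingredient>salt</ingredient>
    <ingredient>calcium chloride</ingredient>
  </ingredients>
</product>
""".strip()


def render_web(root):
    """Format 1: an HTML fragment with a heading and a bulleted list."""
    items = "".join(f"<li>{i.text}</li>" for i in root.iter("ingredient"))
    return f"<h1>{root.findtext('contents')}</h1><ul>{items}</ul>"


def render_label(root):
    """Format 2: one plain-text line, as it might appear on a printed label."""
    names = ", ".join(i.text for i in root.iter("ingredient"))
    title = root.findtext("contents").upper()
    return f"{title} ({root.findtext('maker')}): {names}"


root = ET.fromstring(SOURCE)
web = render_web(root)
label = render_label(root)
print(web)
print(label)
```

Neither output required touching the source: the formatting decisions live in the render functions, which is roughly the role a stylesheet plays in an XML workflow.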