The OpenDocument format of OpenOffice
OpenOffice implements OpenDocument (ODF), an ISO/IEC international standard: ISO/IEC 26300:2006, Open Document Format for Office Applications (OpenDocument) v1.0.
The most common filename extensions used are:
.odt for word processing documents
.ods for spreadsheets
.odp for presentations
.odb for databases
.odg for graphics
.odf for formulae, mathematical equations
An OpenDocument file is a zip archive of XML files.
Summer School 2011 2/62
Inside a typical ODF file
mimetype                                         Gives mime type
Pictures/100000000000028000000168D2C0ED14.jpg    a graphics file
content.xml                                      Main body of document
manifest.rdf                                     List of files
styles.xml                                       Definition of styles
meta.xml                                         Document meta data
Thumbnails/thumbnail.png                         Document thumbnail
settings.xml                                     Settings for application
META-INF/manifest.xml                            List of files
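Because the package is an ordinary zip archive, its members can be listed with any zip library. The sketch below uses Python's standard zipfile module on a tiny in-memory stand-in for an .odt file; build_sample_odt, list_package and read_mimetype are names invented for this illustration, and the stub contents are placeholders rather than a valid document.

```python
import io
import zipfile


def build_sample_odt() -> bytes:
    """Build a tiny in-memory stand-in for an .odt package (placeholder contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype member is written first; ZipInfo defaults to "stored"
        # (no compression).
        package.writestr(
            zipfile.ZipInfo("mimetype"),
            "application/vnd.oasis.opendocument.text",
        )
        for name in ("content.xml", "styles.xml", "meta.xml", "META-INF/manifest.xml"):
            package.writestr(name, "<stub/>")
    return buffer.getvalue()


def list_package(data: bytes) -> list[str]:
    """Return the member names of a zip package in archive order."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


def read_mimetype(data: bytes) -> str:
    """Return the content of the mimetype member as text."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.read("mimetype").decode("ascii")


if __name__ == "__main__":
    sample = build_sample_odt()
    print(read_mimetype(sample))
    for member in list_package(sample):
        print(member)
```

For a real file, the same two functions could be fed the bytes read from disk.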

The metadata
<office:document-meta office:version="1.2"
    grddl:transformation="http://docs.oasis-open.org/office/1.2/xslt/odf2rdf.xsl">
  <office:meta>
    <meta:initial-creator>Sebastian Rahtz</meta:initial-creator>
    <dc:creator>Sebastian Rahtz</dc:creator>
    <meta:editing-cycles>1</meta:editing-cycles>
    <meta:creation-date>2011-05-23T21:41:00</meta:creation-date>
    <dc:date>2011-05-23T22:50:35</dc:date>
    <meta:editing-duration>PT3S</meta:editing-duration>
    <meta:generator>LibreOffice/3.3$Unix LibreOffice_project/330m19$Build-6</meta:generator>
    <meta:document-statistic meta:table-count="0" meta:image-count="1" meta:object-count="0"
        meta:page-count="1" meta:paragraph-count="11" meta:word-count="116"
        meta:character-count="655"/>
    <meta:user-defined meta:name="AppVersion">14.0000</meta:user-defined>
    <meta:user-defined meta:name="Company">University of Oxford</meta:user-defined>
    <meta:template xlink:type="simple" xlink:actuate="onRequest" xlink:title="Normal.dotm" xlink:href=""/>
  </office:meta>
</office:document-meta>
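As a rough illustration of how such metadata might be read programmatically, the sketch below parses a shortened copy of the fragment above with Python's xml.etree.ElementTree. The namespace URIs are the standard OpenDocument and Dublin Core ones, which the slide omits; summarise_meta and SAMPLE_META are names made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are not shown on the slide; these are the standard ones.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

SAMPLE_META = """\
<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    office:version="1.2">
  <office:meta>
    <meta:initial-creator>Sebastian Rahtz</meta:initial-creator>
    <dc:creator>Sebastian Rahtz</dc:creator>
    <dc:date>2011-05-23T22:50:35</dc:date>
    <meta:document-statistic meta:page-count="1" meta:word-count="116"/>
  </office:meta>
</office:document-meta>
"""


def summarise_meta(xml_text: str) -> dict[str, str]:
    """Pick the creator, date and word count out of a meta.xml document."""
    root = ET.fromstring(xml_text)
    meta = root.find("office:meta", NS)
    stats = meta.find("meta:document-statistic", NS)
    return {
        "creator": meta.findtext("dc:creator", default="", namespaces=NS),
        "date": meta.findtext("dc:date", default="", namespaces=NS),
        # Attributes are looked up by their namespace-qualified name.
        "words": stats.get("{%s}word-count" % NS["meta"], ""),
    }


if __name__ == "__main__":
    print(summarise_meta(SAMPLE_META))
```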

The document
<office:document-content office:version="1.2">
  <office:body>
    <office:text>
      <draw:frame text:anchor-type="page" text:anchor-page-number="0" draw:z-index="0"
          draw:name="Picture 1" draw:style-name="gr1" draw:text-style-name="P7"
          svg:width="387.98pt" svg:height="192.33pt" svg:x="0pt" svg:y="0pt">
        <draw:image xlink:href="Pictures/100000000000028000000168D2C0ED14.jpg"
            xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad">
          <text:p/>
        </draw:image>
      </draw:frame>
      <text:h text:style-name="P4" text:outline-level="1">Flights cancelled as ash cloud heads towards UK</text:h>
      <text:p text:style-name="P3">
        <text:p text:style-name="P4">The threat of further disruption led US President Barack Obama to fly out of the Republic of Ireland a day early to get to <text:span text:style-name="T2">London</text:span> for a state visit.</text:p>
      </text:p>
      <text:list xml:id="list830950205" text:style-name="L2">
        <text:list-item>
          <text:p text:style-name="P5">
            <text:a xlink:type="simple" xlink:href="http://www.bbc.co.uk/news/business-13507675">
              <text:span text:style-name="T5">Airline shares hit by ash fears</text:span>
            </text:a>
          </text:p>
        </text:list-item>
      </text:list>
    </office:text>
  </office:body>
</office:document-content>

Simple building blocks
<text:h>          heading (with @text:outline-level)
<text:p>          paragraph
<text:list>       list
<text:list-item>  list item
<text:span>       inline span
With all styling controlled by @text:style-name
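To make the building blocks concrete, here is a minimal sketch that walks a small content.xml fragment and reports each heading or paragraph with its style name. It assumes the standard OpenDocument text namespace; outline, qname and SAMPLE_CONTENT are hypothetical names used only for this illustration.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

SAMPLE_CONTENT = f"""\
<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <office:text>
      <text:h text:style-name="P4" text:outline-level="1">Flights cancelled</text:h>
      <text:p text:style-name="P3">Getting to <text:span text:style-name="T2">London</text:span> early.</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""


def qname(local: str) -> str:
    """Return the namespace-qualified form of a name in the text namespace."""
    return "{%s}%s" % (TEXT_NS, local)


def outline(xml_text: str) -> list[tuple[str, str, str]]:
    """List (kind, style name, text) for every heading and paragraph."""
    result = []
    for element in ET.fromstring(xml_text).iter():
        if element.tag == qname("h"):
            kind = "h" + element.get(qname("outline-level"), "1")
        elif element.tag == qname("p"):
            kind = "p"
        else:
            continue
        # itertext() flattens inline spans into the surrounding text.
        text = "".join(element.itertext())
        result.append((kind, element.get(qname("style-name"), ""), text))
    return result


if __name__ == "__main__":
    for row in outline(SAMPLE_CONTENT):
        print(row)
```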

The styles
<style:style style:name="Heading_20_1" style:display-name="Heading 1" style:family="paragraph"
    style:parent-style-name="Standard" style:next-style-name="Text_20_body"
    style:default-outline-level="1" style:list-style-name="" style:class="text">
  <style:paragraph-properties fo:margin-top="1.39pt" fo:margin-bottom="1.39pt"/>
  <style:text-properties style:font-name="Times" fo:font-size="24pt" fo:language="en"
      fo:country="GB" fo:font-weight="bold" style:letter-kerning="true"
      style:font-size-asian="24pt" style:font-weight-asian="bold"
      style:font-size-complex="24pt" style:font-weight-complex="bold"/>
</style:style>
<style:style style:name="P5" style:family="paragraph" style:parent-style-name="Heading_20_1"
    style:master-page-name="Standard">
  <style:paragraph-properties style:page-number="auto"/>
</style:style>
<style:style style:name="P3" style:family="paragraph" style:parent-style-name="Standard">
  <style:paragraph-properties fo:margin-top="0pt" fo:margin-bottom="13.49pt"
      style:line-height-at-least="13.49pt"/>
  <style:text-properties fo:color="#333333" style:font-name="Arial1" fo:font-size="10pt"
      fo:language="en" fo:country="GB" fo:font-weight="bold"
      style:font-size-asian="10pt" style:font-weight-asian="bold"
      style:font-name-complex="Arial2" style:font-size-complex="10pt"
      style:font-weight-complex="bold"/>
</style:style>

Implementation of TEI/ODT conversion
Some simple principles:
In ODT to TEI, use recursive <xsl:for-each-group> to interpolate structure from headings
ODT paragraphs, lists, items, spans all map more or less 1:1 to <p>, <list>, <item> and <hi>
ODT pictures more or less map to <figure> and <graphic>
As always, table mapping is complicated by simplicity of table model in TEI (no formatting)
When making ODT, unpack a template file (to avoid generating all the style info), and then overwrite the content.xml file

The flat headings problem
What we see is:
<text>
  <head level="1">Top-level heading 1</head>
  <p>Lorem ipsum</p>
  <p>Lorem ipsum</p>
  <head level="2">Second-level heading 1</head>
  <p>Lorem ipsum</p>
  <head level="2">Second-level heading 2</head>
  <p>Lorem ipsum</p>
  <p>Lorem ipsum</p>
  <head level="1">Top-level heading 2</head>
  <p>Lorem ipsum</p>
  <p>Lorem ipsum</p>
</text>

The flat headings problem (2)
What we want is:
<text>
  <div>
    <head>Top-level heading 1</head>
    <p>Lorem ipsum</p>
    <p>Lorem ipsum</p>
    <div>
      <head>Second-level heading 1</head>
      <p>Lorem ipsum</p>
    </div>
    <div>
      <head>Second-level heading 2</head>
      <p>Lorem ipsum</p>
      <p>Lorem ipsum</p>
    </div>
  </div>
  <div>
    <head>Top-level heading 2</head>
    <p>Lorem ipsum</p>
    <p>Lorem ipsum</p>
  </div>
</text>

How does that <xsl:for-each-group> work?
Assuming we have <head> elements with a @level attribute:
<xsl:template match="office:text">
  <body>
    <xsl:for-each-group select="*" group-starting-with="head[@level='1']">
      <xsl:choose>
        <xsl:when test="self::head[@level='1']">
          <xsl:call-template name="group-by-section"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:call-template name="inSection"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </body>
</xsl:template>

Case 1: this is a heading
<xsl:template name="group-by-section">
  <xsl:variable name="ThisHeader" select="number(@level)"/>
  <xsl:variable name="NextHeader" select="number(@level)+1"/>
  <div>
    <head>
      <xsl:apply-templates/>
    </head>
    <xsl:for-each-group select="current-group() except ."
        group-starting-with="head[number(@level)=$NextHeader]">
      <xsl:choose>
        <xsl:when test="self::head">
          <xsl:call-template name="group-by-section"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:call-template name="inSection"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </div>
</xsl:template>

Case 2: other elements
<xsl:template name="inSection">
  <xsl:for-each select="current-group()">
    <xsl:apply-templates select="."/>
  </xsl:for-each>
</xsl:template>
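For readers who do not think in XSLT, the same flat-to-nested idea can be sketched in Python. Note that this is a different technique from the recursive <xsl:for-each-group> shown above: it keeps a stack of currently open sections instead of regrouping recursively. Div and nest are names invented for this sketch, and the input is modelled as simple tuples rather than XML nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Div:
    """One section: its heading text, heading level and child nodes."""
    head: str
    level: int
    children: list = field(default_factory=list)


def nest(items: list[tuple]) -> list:
    """Turn a flat list of ("head", level, text) and ("p", text) tuples
    into nested Div objects, using a stack of currently open sections."""
    top: list = []
    stack: list[Div] = []  # open sections, outermost first
    for item in items:
        if item[0] == "head":
            _, level, text = item
            # A heading closes every open section at the same or a deeper level.
            while stack and stack[-1].level >= level:
                stack.pop()
            div = Div(text, level)
            (stack[-1].children if stack else top).append(div)
            stack.append(div)
        else:
            # Anything else belongs to the innermost open section.
            (stack[-1].children if stack else top).append(item)
    return top


if __name__ == "__main__":
    flat = [
        ("head", 1, "Top-level heading 1"), ("p", "Lorem ipsum"),
        ("head", 2, "Second-level heading 1"), ("p", "Lorem ipsum"),
        ("head", 1, "Top-level heading 2"), ("p", "Lorem ipsum"),
    ]
    for section in nest(flat):
        print(section)
```

Unlike the XSLT, which looks only for headings exactly one level down, the stack version also copes with skipped heading levels; whether that is desirable depends on the source documents.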

Moving to Word: the OOXML data format
Microsoft Office 2007 (Office 2008/2011 on a Mac) is more or less an implementation of ISO/IEC 29500 (OOXML); this defines
a family of interlinked XML schemas to describe office documents
a file hierarchy structure
a packaging format (zip)
There is a (smallish) difference between what is in Word and what should be there according to the spec.

The architecture of a Word docx (OOXML) file
(Useful picture from http://en.wikipedia.org/wiki/Office_Open_XML)

XML namespaces in Word
urn:schemas-microsoft-com:mac:vml                                         Drawing
http://schemas.microsoft.com/office/mac/office/2008/main
http://schemas.openxmlformats.org/markup-compatibility/2006
urn:schemas-microsoft-com:office:office
http://schemas.openxmlformats.org/officeDocument/2006/relationships       Links
http://schemas.openxmlformats.org/officeDocument/2006/math                Maths
urn:schemas-microsoft-com:vml                                             Another bit of drawing
urn:schemas-microsoft-com:office:word
http://schemas.openxmlformats.org/wordprocessingml/2006/main              Normal text
http://schemas.microsoft.com/office/word/2006/wordml
http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing    More drawing

What are the files for?
[Content_Types].xml             mime types of files
_rels/.rels                     links between names and objects
word/_rels/document.xml.rels    links between names and support files
word/document.xml               document body
word/media/image1.jpeg          picture
docProps/thumbnail.jpeg         document thumbnail
word/settings.xml               settings
word/webSettings.xml            settings for HTML export
word/styles.xml                 style definitions
word/numbering.xml              numbering schemes
docProps/core.xml               document properties
word/fontTable.xml              font details
docProps/app.xml                application details
All of these, except media files, are XML files (despite some weird names).

Simple text in Word
The main building blocks are
<p>    block-level object (‘paragraph’)
<r>    inline object (‘run’)
<t>    text
with corresponding style objects:
<pPr>  block-level object style rules
<rPr>  inline style rules
There is no hierarchy, just a flat set of block-level objects.
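A small sketch may help show how flat this model is: each paragraph is just a sequence of runs, and its only structural hint is a style name. The code below reads a hand-written fragment using the wordprocessingml namespace; paragraphs and SAMPLE_BODY are hypothetical names, and a real document.xml would carry many more attributes.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

SAMPLE_BODY = f"""\
<w:body xmlns:w="{W}">
  <w:p>
    <w:pPr><w:pStyle w:val="Heading1"/></w:pPr>
    <w:r><w:t>Normative references</w:t></w:r>
  </w:p>
  <w:p>
    <w:r><w:t>ISO </w:t></w:r>
    <w:r><w:rPr><w:i/></w:rPr><w:t>Hard coal and coke</w:t></w:r>
  </w:p>
</w:body>
"""


def paragraphs(xml_text: str) -> list[tuple]:
    """Return (style name or None, concatenated run text) for each paragraph."""
    result = []
    for para in ET.fromstring(xml_text).iter("{%s}p" % W):
        style = para.find("w:pPr/w:pStyle", NS)
        name = style.get("{%s}val" % W) if style is not None else None
        # The visible text is the concatenation of every <w:t> in every run.
        text = "".join(t.text or "" for t in para.findall("w:r/w:t", NS))
        result.append((name, text))
    return result


if __name__ == "__main__":
    for row in paragraphs(SAMPLE_BODY):
        print(row)
```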

Word/TEI conversion implementation
As for OpenDocument, with the following extra issues:
(from DOCX) There is no heading element, just paragraphs with a particular style name
(from DOCX) There is no list wrapper or list item. Groups of paragraphs marked as list items have to be wrapped in a <list>
(to DOCX) Graphics files have to be listed in a separate file, and linked up with ID/IDREF
(to DOCX) Graphics files have to be read to get their natural size, needed in the XML markup
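The last point, reading a graphics file to find its natural size, can be illustrated for PNG, whose width and height sit at fixed offsets in the IHDR chunk. This is a sketch only: JPEG needs more parsing, the names png_size and pixels_to_emu are invented here, and the 96 dpi default is an assumption (DrawingML measures in EMUs, 914400 to the inch).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
EMU_PER_INCH = 914400  # English Metric Units, the unit DrawingML uses


def png_size(data: bytes) -> tuple[int, int]:
    """Read (width, height) in pixels from the IHDR chunk of a PNG file."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian 32-bit integers after the chunk name.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def pixels_to_emu(pixels: int, dpi: int = 96) -> int:
    """Convert a pixel count to EMUs, assuming the given resolution."""
    return pixels * EMU_PER_INCH // dpi


if __name__ == "__main__":
    # A fake header is enough here: signature, chunk length, "IHDR", width, height.
    header = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 640, 360)
    w, h = png_size(header)
    print(w, h, pixels_to_emu(w), pixels_to_emu(h))
```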

Converting other OOXML formats
So can I convert between TEI and Powerpoint, or Excel?
Theoretically, yes. But the models of presentations and spreadsheets differ much more from what TEI does. Don’t expect it to be easy.

Sample of Powerpoint markup
<p:txBody>
  <a:bodyPr/>
  <a:lstStyle/>
  <a:p>
    <a:pPr marL="457200" indent="-457200" algn="l">
      <a:buFont typeface="Arial"/>
      <a:buChar char="•"/>
    </a:pPr>
    <a:r>
      <a:rPr lang="en-US" dirty="0" smtClean="0"/>
      <a:t>But the unexpected</a:t>
    </a:r>
  </a:p>
  <a:p>
    <a:pPr marL="457200" indent="-457200" algn="l">
      <a:buFont typeface="Arial"/>
      <a:buChar char="•"/>
    </a:pPr>
    <a:r>
      <a:rPr lang="en-US" dirty="0" smtClean="0"/>
      <a:t>And more besides </a:t>
    </a:r>
    <a:endParaRPr lang="en-US" dirty="0"/>
  </a:p>
</p:txBody>

Example: references in OOXML (Word) (1)
<w:p w:rsidR="008A0CE8" w:rsidRPr="00250571" w:rsidRDefault="008A0CE8" w:rsidP="008A0CE8">
  <w:pPr>
    <w:pStyle w:val="Heading1"/>
    <w:tabs>
      <w:tab w:val="clear" w:pos="400"/>
      <w:tab w:val="clear" w:pos="560"/>
      <w:tab w:val="left" w:pos="403"/>
      <w:tab w:val="left" w:pos="562"/>
    </w:tabs>
  </w:pPr>
  <w:bookmarkStart w:id="8" w:name="_Toc201542376"/>
  <w:r w:rsidRPr="00250571">
    <w:t>Normative references</w:t>
  </w:r>
  <w:bookmarkEnd w:id="8"/>
</w:p>
<w:p w:rsidR="008A0CE8" w:rsidRPr="00250571" w:rsidRDefault="008A0CE8" w:rsidP="008A0CE8">
  <w:r w:rsidRPr="00250571">
    <w:t>The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.</w:t>
  </w:r>
</w:p>

Example: references in OOXML (Word) (2)
<w:p w:rsidR="008A0CE8" w:rsidRPr="00250571" w:rsidRDefault="008A0CE8" w:rsidP="008A0CE8">
  <w:pPr>
    <w:pStyle w:val="RefNorm"/>
  </w:pPr>
  <w:r><w:rPr><w:sz w:val="19"/><w:szCs w:val="19"/></w:rPr><w:t>ISO </w:t></w:r>
  <w:r w:rsidRPr="00250571"><w:rPr><w:sz w:val="19"/><w:szCs w:val="19"/></w:rPr><w:t>13909-2:2001,</w:t></w:r>
  <w:r w:rsidRPr="00250571"><w:t xml:space="preserve"> </w:t></w:r>
  <w:r w:rsidRPr="00250571"><w:rPr><w:i/></w:rPr><w:t>Hard coal and coke</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t> —</w:t></w:r>
  <w:r w:rsidRPr="00250571"><w:rPr><w:i/></w:rPr><w:t xml:space="preserve"> Mechanical sampling</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t> —</w:t></w:r>
  <w:r w:rsidRPr="00250571"><w:rPr><w:i/></w:rPr><w:t xml:space="preserve"> </w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t>Part </w:t></w:r>
  <w:r w:rsidRPr="00250571"><w:rPr><w:i/></w:rPr><w:t>2: Coal</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t> —</w:t></w:r>
  <w:r w:rsidRPr="00250571"><w:rPr><w:i/></w:rPr><w:t xml:space="preserve"> Sampling from moving streams</w:t></w:r>
</w:p>

Example: references in XML (TEI)
<div type="normativeReferences">
  <head>Normative references</head>
  <p>The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.</p>
  <listBibl type="normativeReferences">
    <bibl type="dated">
      <publisher>ISO</publisher>
      <idno type="docNumber">13909</idno>
      <idno type="docPartNumber">1</idno>
      <edition>2001</edition>
      <title rend="italic">Hard coal and coke — Mechanical sampling —<seg/>Part 1: General introduction</title>
    </bibl>
  </listBibl>
</div>

Example: math in XML (MathML)
<p>The required overall precision on a lot should be agreed between the parties concerned. In the absence of such agreement, a value of one tenth of the ash content may be assumed.</p>
<p>The theory of precision is given in ISO 13909-7. The following equation is derived:</p>
<p>
  <formula>
    <mml:math>
      <mml:msub>
        <mml:mrow><mml:mi>P</mml:mi></mml:mrow>
        <mml:mrow><mml:mtext>L</mml:mtext></mml:mrow>
      </mml:msub>
      <mml:mo>=</mml:mo>
      <mml:mn>2</mml:mn>
      <mml:msqrt>
        <mml:mfrac>
          <mml:mrow>
            <mml:mfrac>
              <mml:mrow>
                <mml:msub>
                  <mml:mrow><mml:mi>V</mml:mi></mml:mrow>
                  <mml:mrow><mml:mtext>l</mml:mtext></mml:mrow>
                </mml:msub>
              </mml:mrow>
              <mml:mrow><mml:mi>n</mml:mi></mml:mrow>
            </mml:mfrac>
            <mml:mo>+</mml:mo>
            <mml:mfenced separators="|">
              <mml:mrow>
                <mml:mn>1</mml:mn>
                <mml:mo>-</mml:mo>
                <mml:mfrac>
                  <mml:mrow><mml:mi>u</mml:mi></mml:mrow>
                  <mml:mrow><mml:mi>m</mml:mi></mml:mrow>
                </mml:mfrac>
              </mml:mrow>
            </mml:mfenced>
            <mml:msub>
              <mml:mrow><mml:mi>V</mml:mi></mml:mrow>
              <mml:mrow><mml:mtext>m</mml:mtext></mml:mrow>
            </mml:msub>
            <mml:mo>+</mml:mo>
            <mml:msub>
              <mml:mrow><mml:mi>V</mml:mi></mml:mrow>
              <mml:mrow><mml:mtext>PT</mml:mtext></mml:mrow>
            </mml:msub>
          </mml:mrow>
          <mml:mrow><mml:mi>u</mml:mi></mml:mrow>
        </mml:mfrac>
      </mml:msqrt>
    </mml:math>
    <lb/><c rend="tab"/>(1)</formula>
</p>

Challenges in the XSLT conversion
interpolating hierarchy from flat section headings: we use XSLT 2.0 <xsl:for-each-group> heavily to create document structure
making decisions depend on generated structure: the conversion makes 3 passes over the data with the one XSLT transform, each time adding more structure or resolving anomalies
table management, depending on what table model we target: a Word table differs from a CALS table in how it models spanning cells and tables, which causes considerable problems in mapping

Putting together similar objects
Another technique with for-each-group is to put similar items together.
Input:
<text>
  <p>Lorem ipsum</p>
  <item>cats</item>
  <item>dogs</item>
  <item>horses</item>
  <p>Lorem ipsum</p>
</text>
Output:
<text>
  <p>Lorem ipsum</p>
  <list>
    <item>cats</item>
    <item>dogs</item>
    <item>horses</item>
  </list>
  <p>Lorem ipsum</p>
</text>

XSL to do grouping
The @group-adjacent expression is evaluated for each item; adjacent items that return the same value fall into one group, and current-grouping-key() gives that value.
<xsl:for-each-group select="*" group-adjacent="if (self::item) then 1 else 2">
  <xsl:choose>
    <xsl:when test="current-grouping-key()=1">
      <list>
        <xsl:copy-of select="current-group()"/>
      </list>
    </xsl:when>
    <xsl:otherwise>
      <xsl:copy-of select="current-group()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:for-each-group>
ex9.xsl
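The behaviour of group-adjacent is close to that of itertools.groupby in Python, which also groups consecutive items sharing a key. The sketch below wraps runs of adjacent items in a list in the same way; wrap_items is a name made up for this illustration, and nodes are modelled as (name, text) tuples rather than XML elements.

```python
from itertools import groupby


def wrap_items(nodes: list[tuple[str, str]]) -> list:
    """Wrap each run of adjacent ("item", text) nodes in one ("list", [...]) node."""
    output = []
    # The key plays the role of @group-adjacent: True for items, False otherwise.
    for is_item, group in groupby(nodes, key=lambda node: node[0] == "item"):
        members = list(group)
        if is_item:
            output.append(("list", members))
        else:
            output.extend(members)
    return output


if __name__ == "__main__":
    flat = [
        ("p", "Lorem ipsum"),
        ("item", "cats"),
        ("item", "dogs"),
        ("item", "horses"),
        ("p", "Lorem ipsum"),
    ]
    for node in wrap_items(flat):
        print(node)
```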

Working with group-adjacent to differentiate similar elements
<xsl:template match="w:p">
  <!-- We are looking for: - Lists -> 1 - Table of Contents -> 2 - Figures -> 3 -->
  <xsl:for-each-group select="."
      group-adjacent="if (teidocx:is-list(.)) then 1 else if (teidocx:is-toc(.)) then 2 else if (teidocx:is-figure(.)) then 3 else 4">
    <!-- For each defined grouping call a specific template. If there is no
         grouping defined, apply templates with mode paragraph -->
    <xsl:choose>
      <xsl:when test="current-grouping-key()=1">
        <xsl:call-template name="listSection"/>
      </xsl:when>
      <xsl:when test="current-grouping-key()=2">
        <xsl:call-template name="tocSection"/>
      </xsl:when>
      <xsl:when test="current-grouping-key()=3">
        <xsl:call-template name="figureSection"/>
      </xsl:when>
      <!-- it is not a defined grouping .. apply templates -->
      <xsl:otherwise>
        <xsl:apply-templates select="." mode="paragraph"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:template>

How do those functions work?
<xsl:function name="teidocx:is-toc" as="xs:boolean">
  <xsl:param name="p"/>
  <xsl:choose>
    <xsl:when test="$p[contains(w:pPr/w:pStyle/@w:val,'toc')]">true</xsl:when>
    <xsl:otherwise>false</xsl:otherwise>
  </xsl:choose>
</xsl:function>
<xsl:function name="teidocx:is-figure" as="xs:boolean">
  <xsl:param name="p"/>
  <xsl:choose>
    <xsl:when test="$p[contains(w:pPr/w:pStyle/@w:val,'Figure')]">true</xsl:when>
    <xsl:when test="$p[contains(w:pPr/w:pStyle/@w:val,'Caption')]">true</xsl:when>
    <xsl:otherwise>false</xsl:otherwise>
  </xsl:choose>
</xsl:function>

Handling incoming Word style
We use TEI @rend a lot to preserve style names.
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='Figure text']" mode="paragraph">
  <p>
    <xsl:if test="w:pPr/w:jc/@w:val">
      <xsl:attribute name="iso:align">
        <xsl:value-of select="w:pPr/w:jc/@w:val"/>
      </xsl:attribute>
    </xsl:if>
    <xsl:attribute name="rend">
      <xsl:text>Figure_text</xsl:text>
    </xsl:attribute>
    <xsl:apply-templates/>
  </p>
</xsl:template>

Handling incoming TEI element
<xsl:template match="tei:front/tei:div/tei:p[@type='foreword']">
  <xsl:call-template name="block-element">
    <xsl:with-param name="pPr">
      <w:pPr>
        <w:pStyle>
          <xsl:attribute name="w:val">
            <xsl:value-of select="concat(translate(substring(parent::tei:div/@type,1,1),$lowercase,$uppercase),substring(parent::tei:div/@type,2))"/>
          </xsl:attribute>
        </w:pStyle>
      </w:pPr>
    </xsl:with-param>
  </xsl:call-template>
</xsl:template>
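The dense concat/translate/substring expression simply upper-cases the first letter of the parent div's @type to form the Word style name. A Python equivalent might look like the sketch below, assuming $lowercase and $uppercase hold the plain ASCII alphabets (the slide does not show their definitions); capitalise_first is a name invented here.

```python
import string

# Assumed to match the stylesheet's $lowercase and $uppercase variables.
LOWER_TO_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def capitalise_first(value: str) -> str:
    """Upper-case the first character and leave the rest untouched,
    mirroring concat(translate(substring(x,1,1),...), substring(x,2))."""
    return value[:1].translate(LOWER_TO_UPPER) + value[1:]


if __name__ == "__main__":
    print(capitalise_first("foreword"))
```

Unlike str.capitalize(), this leaves the remaining characters alone, so a camel-case type keeps its internal capitals.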

Corrigenda and addenda (TEI XML)
<p>This fourth edition cancels and replaces the third edition (ISO 6579:<del when="2009-10-30T13:19:00Z" type="COR" n="1">1993</del><add when="2009-10-30T13:19:00Z" type="COR" n="1">1999</add>), which has been technically revised.</p>
<bibl>
  <add when="2009-10-30T09:27:00Z" type="AMD" n="1">ISO/TS 11133-1, <title rend="italic">Microbiology of food and animal feeding stuffs — Guidelines on preparation and production of culture media — Part 1: General guidelines on quality assurance for the preparation of culture media in the laboratory</title></add>
</bibl>

Supporting new styles in DOCX to TEI: a real story
Our target is a complex Word document, carefully prepared with maximum use of styles.

1. Map some styles to TEI elements
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP Caption']" mode="paragraph">
  <head><xsl:apply-templates/></head>
</xsl:template>
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP Table Heading']" mode="paragraph">
  <head><xsl:apply-templates/></head>
</xsl:template>
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP Ex Tasks Bulleted']" mode="paragraph">
  <item><xsl:apply-templates/></item>
</xsl:template>
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP BodyText Bulletted']" mode="paragraph">
  <item><xsl:apply-templates/></item>
</xsl:template>

2. Identify list structures
The conversion uses functions which check whether something is a list, and decide what sort of list.
<xsl:function name="teidocx:is-list" as="xs:boolean">
  <xsl:param name="p"/>
  <xsl:choose>
    <xsl:when test="$p[contains(w:pPr/w:pStyle/@w:val,'List')]">true</xsl:when>
    <xsl:when test="$p[contains(w:pPr/w:pStyle/@w:val,'Bulletted')]">true</xsl:when>
    <xsl:when test="$p[contains(w:pPr/w:pStyle/@w:val,'Bulleted')]">true</xsl:when>
    <xsl:otherwise>false</xsl:otherwise>
  </xsl:choose>
</xsl:function>
<xsl:function name="teidocx:get-listtype" as="xs:string">
  <xsl:param name="style"/>
  <xsl:choose>
    <xsl:when test="$style='ITLP BodyText Bulletted'">
      <xsl:text>unordered</xsl:text>
    </xsl:when>
    <xsl:otherwise>
      <xsl:text/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>

3. Identify headings
Similarly, we need to know if something is a section heading. Top-level headings are a bit different.
<xsl:function name="teidocx:is-firstlevel-heading" as="xs:boolean">
  <xsl:param name="p"/>
  <xsl:choose>
    <xsl:when test="$p[w:pPr/w:pStyle/@w:val='ITLP H1']">true</xsl:when>
    <xsl:when test="$p[w:pPr/w:pStyle/@w:val='ITLP Anonymous Heading1']">true</xsl:when>
    <xsl:otherwise>false</xsl:otherwise>
  </xsl:choose>
</xsl:function>
<xsl:function name="teidocx:is-heading" as="xs:boolean">
  <xsl:param name="p"/>
  <xsl:variable name="s" select="$p/w:pPr/w:pStyle/@w:val"/>
  <xsl:choose>
    <xsl:when test="$s=''">false</xsl:when>
    <xsl:when test="$s='ITLP Anonymous Heading 1'">true</xsl:when>
    <xsl:when test="$s='ITLP Anonymous Heading 2'">true</xsl:when>
    <xsl:when test="$s='ITLP H1'">true</xsl:when>
    <xsl:when test="$s='ITLP H2'">true</xsl:when>
    <xsl:when test="$s='ITLP H3'">true</xsl:when>
    <xsl:when test="$s='Heading1'">true</xsl:when>
    <xsl:when test="$s='Heading2'">true</xsl:when>
    <xsl:when test="$s='Heading3'">true</xsl:when>
    <xsl:when test="$s='Heading4'">true</xsl:when>
    <xsl:otherwise>false</xsl:otherwise>
  </xsl:choose>
</xsl:function>

4. Some cases where the TEI has no structural component, so use @rend
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP Ex Explanation']" mode="paragraph">
  <p rend="ExampleExplanation"><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP Task Text']" mode="paragraph">
  <p rend="ExampleTask"><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP Step Text']" mode="paragraph">
  <p rend="ExampleStep"><xsl:apply-templates/></p>
</xsl:template>
<xsl:template match="w:p[w:pPr/w:pStyle/@w:val='ITLP Ex Heading']" mode="paragraph">
  <p rend="ExampleHeading"><xsl:apply-templates/></p>
</xsl:template>

5. Now the inline styles
<xsl:template match="w:r[w:rPr/w:rStyle/@w:val='ITLP FileSpec']">
  <code rend="FileSpec"><xsl:apply-templates/></code>
</xsl:template>
<xsl:template match="w:r[w:rPr/w:rStyle/@w:val='ITLP Button']">
  <code rend="Button"><xsl:apply-templates/></code>
</xsl:template>
<xsl:template match="w:r[w:rPr/w:rStyle/@w:val='ITLP Input']">
  <code rend="Input"><xsl:apply-templates/></code>
</xsl:template>
<xsl:template match="w:r[w:rPr/w:rStyle/@w:val='ITLP Key']">
  <code rend="Key"><xsl:apply-templates/></code>
</xsl:template>
<xsl:template match="w:r[w:rPr/w:rStyle/@w:val='ITLP Label']">
  <code rend="Label"><xsl:apply-templates/></code>
</xsl:template>
<xsl:template match="w:r[w:rPr/w:rStyle/@w:val='ITLP Menu']">
  <code rend="Menu"><xsl:apply-templates/></code>
</xsl:template>
<xsl:template match="w:r[w:rPr/w:rStyle/@w:val='ITLP Software']">
  <code rend="Software"><xsl:apply-templates/></code>
</xsl:template>

eBooks
A long history of attempts to make replacements for books on a small tablet computer looking like a book
Most successful is Amazon Kindle
Followed by Apple iPad
And then the Sony Reader, Barnes and Noble Nook etc
Largely marketed for reading modern fiction

What are eBooks like?
Designers compare them to “1990s web”
Designers too used to painting pictures
Inconsistent support for ePub
Too many reader apps on iPad
Kindle format annoyingly different

Formats
Most ebook formats are based on HTML
Open ePub format has most support
Amazon Kindle format is a variant, but can be created by converting ePub
ePub is simply a zipped bundle of XML/HTML files, CSS, graphics etc

iBooks: http://www.apple.com/
Free app on iPhone and iPad
Renders ePub and PDF books
Managed using iTunes on host computer
Implements some extensions to ePub (video, fixed format)

iBooks issues
Pretty good support for ePub / CSS features
Still too slow with big books, 1000 pages or more (not yet tried with iPad 2)
Badly needs MathML support
Bookshelf layout still primitive
iTunes interface politically uncomfortable for some
We need an ePub previewer for the Mac desktop!

ePub specs and apps
ePub @ IDPF: http://idpf.org/epub
Adobe Digital Editions: http://www.adobe.com/products/digitaleditions/
Making ePub from Apple Pages: http://support.apple.com/kb/ht4168
Making ePub using InDesign: http://blogs.adobe.com/digitalpublishing/2010/03/create_epub_ebooks_with_adobe_indesign.html
Software in the cloud to convert any webpage into an e-book: http://dotepub.com/

Useful links
ePub syntax checker, an essential tool for checking whether a package is properly constructed: http://code.google.com/p/epubcheck/
Stanza ePub reader for iPhone and Mac is not bad, but fails on large files and does not do all the formatting: http://www.lexcycle.com/
Aldiko on Google phones is quite complete: http://www.aldiko.com/
FBReader ePub reader for Linux and Android, usable for some texts, but not a very complete renderer: http://www.fbreader.org/FBReaderJ/
EPUBReader Firefox extension allows you to view ePubs seamlessly in Firefox: https://addons.mozilla.org/en-US/firefox/addon/45281/
Calibre is a good package for ePub conversions and management: http://calibre-ebook.com/

Things to read
A nice ePub book called epub straight to the point by Liz Castro (http://www.elizabethcastro.com/epub/)
One of the guides on making ePub (there are many): http://www.lexcycle.com/faq/how_to_create_epub
ePub-related blogs which I find useful are http://www.pigsgourdsandwikis.com/ and http://blog.threepress.org

Files in a typical ePub
mimetype                     gives mime-type (uncompressed)
META-INF/container.xml       gives name of directory where files are (OEBPS)
OEBPS/content.opf            metadata, file manifest, order of chapters etc
OEBPS/media/image0.png       image for book
OEBPS/stylesheet.css         CSS stylesheet
OEBPS/s2.html                HTML chapter
OEBPS/s3.html                HTML chapter
OEBPS/page-template.xpgt     instructions for ADE
OEBPS/titlepage.html         HTML for front page
OEBPS/titlepageback.html     HTML for back page
OEBPS/toc.ncx                table of contents
OEBPS/index.html             HTML main part
OEBPS/s1.html                HTML part
OEBPS/cover.jpg              book cover image
OEBPS/print.css              CSS stylesheet for printing
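The "(uncompressed)" note on mimetype matters when packaging: that member has to be written first and stored without compression. A minimal sketch of the packaging step using Python's zipfile module follows; build_epub is a hypothetical helper, the file contents in the example are stubs, and the result is not a complete, valid ePub.

```python
import io
import zipfile

# container.xml points readers at the package file inside OEBPS.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files: dict[str, str]) -> bytes:
    """Zip the given OEBPS files into an ePub-shaped package held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as epub:
        # mimetype goes in first and is stored, not deflated.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        for name, text in files.items():
            epub.writestr("OEBPS/" + name, text)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_epub({"content.opf": "<package/>", "index.html": "<html/>"})
    with zipfile.ZipFile(io.BytesIO(data)) as epub:
        for info in epub.infolist():
            print(info.filename, info.compress_type)
```

A checker such as epubcheck would still be needed to confirm that a real package is properly constructed.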

First part of metadata
<metadata>
  <dc:title>Collected Poems</dc:title>
  <dc:language xsi:type="dcterms:RFC3066">en</dc:language>
  <dc:subject>Oxford Text Archive</dc:subject>
  <dc:subject>Poems -- Great Britain -- 20th century</dc:subject>
  <dc:identifier id="dcidid" opf:scheme="URI">http://ota.ox.ac.uk/id/3020</dc:identifier>
  <dc:description>Collected Poems / Owen, Wilfred, 1893-1918</dc:description>
  <dc:creator>Owen, Wilfred</dc:creator>
  <dc:publisher>Oxford Text Archive, Oxford University</dc:publisher>
  <dc:date opf:event="creation">1920</dc:date>
  <dc:date opf:event="epubpublication" xsi:type="dcterms:W3CDTF">2010-09-21</dc:date>
  <dc:rights>Creative Commons Attribution</dc:rights>
  <meta name="cover" content="cover-image"/>
</metadata>

Second part of metadata
<manifest>
  <item href="cover.jpg" id="cover-image" media-type="image/jpeg"/>
  <item href="stylesheet.css" id="css" media-type="text/css"/>
  <item href="titlepage.html" id="titlepage" media-type="application/xhtml+xml"/>
  <item href="titlepageback.html" id="titlepageback" media-type="application/xhtml+xml"/>
  <item id="print.css" href="print.css" media-type="text/css"/>
  <item id="apt" href="page-template.xpgt" media-type="application/adobe-page-template+xml"/>
  <item id="start" href="index.html" media-type="application/xhtml+xml"/>
  <item href="s1.html" media-type="application/xhtml+xml" id="section1"/>
  <item href="s2.html" media-type="application/xhtml+xml" id="section34"/>
  <item href="s3.html" media-type="application/xhtml+xml" id="section57"/>
  <item href="media/image0.png" id="image-1" media-type="image/png"/>
  <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
</manifest>

Third part of metadata
<spine toc="ncx">
  <itemref idref="titlepage" linear="yes"/>
  <itemref idref="start" linear="yes"/>
  <itemref linear="yes" idref="section1"/>
  <itemref linear="yes" idref="section34"/>
  <itemref linear="yes" idref="section57"/>
  <itemref idref="titlepageback" linear="no"/>
</spine>
<guide>
  <reference type="text" href="titlepage.html" title="Cover"/>
  <reference type="text" title="Start" href="index.html"/>
  <reference type="text" href="s1.html" title="War Poems"/>
  <reference type="text" href="s2.html" title="Other Poems, and Fragments"/>
  <reference type="text" href="s3.html" title="Minor Poems, and Juvenilia"/>
  <reference href="titlepageback.html" type="text" title="About this book"/>
</guide>
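The manifest and spine work together: the spine lists ids in reading order, and each id is resolved to a file through the manifest. The sketch below shows that lookup on a shortened package document, assuming the OPF 2.0 namespace http://www.idpf.org/2007/opf (not visible on the slides); reading_order and SAMPLE_OPF are names made up for this example.

```python
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"
NS = {"opf": OPF}

SAMPLE_OPF = f"""\
<package xmlns="{OPF}" version="2.0" unique-identifier="dcidid">
  <manifest>
    <item href="titlepage.html" id="titlepage" media-type="application/xhtml+xml"/>
    <item id="start" href="index.html" media-type="application/xhtml+xml"/>
    <item href="s1.html" media-type="application/xhtml+xml" id="section1"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="titlepage" linear="yes"/>
    <itemref idref="start" linear="yes"/>
    <itemref linear="yes" idref="section1"/>
  </spine>
</package>
"""


def reading_order(opf_text: str) -> list[str]:
    """Resolve each spine itemref to the href of its manifest item."""
    root = ET.fromstring(opf_text)
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }
    return [
        hrefs[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", NS)
    ]


if __name__ == "__main__":
    print(reading_order(SAMPLE_OPF))
```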

So let's make our own
Export to ePub from InDesign
Export to ePub from Apple Pages
Convert from other formats using Calibre
Convert web pages (dotepub)
Your Favourite System may have an export
Roll it yourself with an HTML editor

TEI ePub method
Use TEI XML as pivot format
Adapt HTML transforms and CSS
Generate extra components of ePub format as part of XSLT transform
Generate cover images automatically from metadata
Scripting to manage zip-packaging and graphic files
Initially command-line, then as web service