40
XSLT Brent P. Christie Major USMC

XSLT Brent P. Christie Major USMC. XSLT Overview What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Embed Size (px)

Citation preview

Page 1: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT

Brent P. Christie

Major USMC

Page 2: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT Overview What is XSLT?

– XSL is the Extensible Style Language.

– It has two parts: the transformation language and the formatting language.

– In this lecture we are concerned with XSL transformations, or XSLT.

• The formatting language will be discussed in a separate brief.

– XSLT provides a syntax for defining rules that transform an XML document to another document.

• For example, to an HTML document.

– An XSLT “style sheet” consists primarily of a set of template rules that are used to transform nodes matching some patterns.

Page 3: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT Overview Example of XML document<?xml version=”1.0”?><?xml-stylesheet type=”text/xml” href=”planets.xsl”?><PLANETS>  <PLANET> <NAME>Mercury</NAME> <MASS UNITS=”(Earth = 1)”>.0553</MASS> <DAY UNITS=”days”>58.65</DAY> <RADIUS UNITS=”miles”>1516</RADIUS> <DENSITY UNITS=”(Earth = 1)”>.983</DENSITY> <DISTANCE UNITS=”million miles”>43.4</DISTANCE><!--At

perihelion--> </PLANET>

 

Page 4: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT Overview XML document example con’t<PLANET> <NAME>Venus</NAME> <MASS UNITS=”(Earth = 1)”>.815</MASS> <DAY UNITS=”days”>116.75</DAY> <RADIUS UNITS=”miles”>3716</RADIUS> <DENSITY UNITS=”(Earth = 1)”>.943</DENSITY> <DISTANCE UNITS=”million miles”>66.8</DISTANCE><!--At perihelion--> </PLANET>  <PLANET> <NAME>Earth</NAME> <MASS UNITS=”(Earth = 1)”>1</MASS> <DAY UNITS=”days”>1</DAY> <RADIUS UNITS=”miles”>2107</RADIUS> <DENSITY UNITS=”(Earth = 1)”>1</DENSITY> <DISTANCE UNITS=”million miles”>128.4</DISTANCE><!--At perihelion--> </PLANET> </PLANETS>

Page 5: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT Overview Example of a style sheet planet.xsl<?xml version=”1.0”?><xsl:stylesheet version=”1.0”

xmlns:xsl=”http://www.w3.org/1999/XSL/Transform”>  <xsl:template match=”PLANETS”> <HTML> <xsl:apply-templates/> </HTML> </xsl:template>  <xsl:template match=”PLANET”> <P> <xsl:value-of select=”NAME”/> </P> </xsl:template> </xsl:stylesheet>

Page 6: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT Overview Result<HTML>  <P>Mercury</P>  <P>Venus</P>  <P>Earth</P> </HTML>

Page 7: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT Overview The xml-stylesheet element in the XML instance references an XSL

style sheet.

In general, children of the stylesheet element in a stylesheet are templates.

A template specifies a pattern; the template is applied to nodes in the XML source document that match this pattern.– Note: the pattern “/” matches the root node of the document, we will see

this later

In the transformed document, the body of the template element replaces the matched node in the source document.

In addition to text, the body may contain further XSL terms, e.g.:– xsl:value-of extracts data from selected sub-nodes.

Page 8: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XSLT Overview We have an XML document and the style sheet (or rules) to

transform it. So, how do you transform the document?. You can transform documents in three ways:

– In the server. A server program, such as a Java servlet, can use a style sheet to transform a document automatically and serve it to the client. Example, XML Enabler, which is a servlet that you’ll find at XML for Java Web site, www.alphaworks.ibm.com/tech/xml4j

– In the client. An XSL-enabled browser may convert XML downloaded from the server to HTML, prior to display. Currently Internet Explorer supports a subset of XSLT.

– In a standalone program. XML stored in or generated from a database, say, may be “manually” converted to HTML before placing it in the server’s document directory.

In any case, a suitable program takes an XML document as input, together with an XSLT “style-sheet”.

Page 9: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Format of Style Sheet You guessed it, XSLT style sheet is itself an XML document.

We will be using the XSLT elements from the namespace http://www.w3.org/1999/XSL/Transform for this brief– As a matter of convention we use the prefix xsl: for this namespace.

The document root in an XSLT style sheet is an xsl:stylesheet element, e.g.:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" > . . .</xsl:stylesheet>

– A synonym for xsl:stylesheet is xsl:transform.

Several kinds of element can be nested inside xsl:stylesheet, but by far the most important is the xsl:template element.

Page 10: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

The xsl:template element When you match or select nodes, a template tells the

XSLT processor how to transform the node for output So all our templates will have the form:

<xsl:template match=“pattern”> template body </xsl:template>

The pattern is an Xpath expression describing the nodes to which the template can be applied.

The processor scans the input document for nodes matching this pattern, and replaces them with the text included in the template body.

In a nutshell, this explains the whole operation of XSLT.

Page 11: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XPATH The XML Path Language, or XPath, is a language for addressing parts

of an XML document. The patterns and other node selections appearing in XSLT rules are

represented using XPath syntax.– Including the match element of xsl:template or the select element of

xsl:value-of.

We’ve seen that you can use the match attribute to find nodes by name, child elements(s), attributes. Can even find a descendant.

XPath does all this and more with the select attribute.– Finding nodes by parent or sibling elements, as well as much more

involved tests.

More of a true language than the expressions you can use with match attribute– Return not only lists of nodes, but also Boolean, string, and numeric

values

Page 12: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XPath It is an essential part of XSLT, and also XPointer

(as wel as being used in XML schema).

In simple cases an XPath expression looks like a UNIX path name, with nested directory names replaced by nested element names:– “/” is the root element of a document

– expressions may be absolute (relative to the root) or relative to some context node

Page 13: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Types of XPath expressions Xpath expressions evaluate to one of four possible types of

thing:– A node-set: a collection of nodes in the XML document. See

below for the description of “node”.

– A boolean value: true or false.

– A number: always represented internally as 64-bit IEEE 754 floating-point double-precision format, although they may be written and used as an integer.

– A string.

In the end we are interested in Xpath expressions that evaluate to a node-set, although other expression types will appear.

Page 14: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Xpath Node XSL transformations accept a document tree as input and produce

a tree as output. From the XSLT point of view, documents are trees built of nodes, and there are seven types of nodes that can appear in a node-set; here are those nodes, and how XSLT processors treat them:Node DescriptionDocument root Is the very start of the document, “/” Attribute Holds the value of an attribute after entity

references have been expanded and surrounding whitespace has been trimmedComment Holds the text of a comment, not including <!--

and -->Element Consists of all character data in the element,

which includes character data in any of the children of the element

Namespace Holds the namespace’s URIProcessing instruction Holds the text of the processing instruction,

which does not include <? and ?>Text Holds the text of the node

Page 15: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Location paths The most important kind of Xpath expression—the one that gives

XPath its name—is the location path.– absolute location path – path from the root node

– relative – starting with current node, called context node.

In general a location path consists of a series of location steps separated by the slash “/”. Made up of– axis, a node test, and zero or more predicates.

As noted earlier, the most common example of a location step—analogous to a UNIX directory name—is an XML element name.

Actually this common case is an example of what is called abbreviated syntax.

To be systematic, we will describe the general, unabbreviated syntax for location paths first

Page 16: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Parts of a location step An individual location step has three logical parts:

– The axis—a keyword which, loosely speaking, describes the “dimension” this location step takes us into.

• Simple examples are child and attribute which, respectively, say that this step enters the set of children or the set of attributes of an element.

– A node test—this is typically an element or attribute name, selecting within the chosen axis. It may also, less specifically, be a node type.

– Zero or more predicates, which use arbitrary XPath expressions to further refine the set of selected nodes.

The unabbreviated syntax for a location step is: axis :: node-test [predicate1] [predicate2] . . .

Page 17: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Axes Any location step starts from some context node; the axis

is relative to this node. The available axes are:

– child—contains the children of the context node.– descendent—contains children and all descendents of children.– parent—contains the parent of the context node.– ancestor—contains parent of the context node and ancestors of parent.

All the way back to and including root node.– attribute—contains the attributes of the context node.– following—all following element nodes, in document order.– preceding—all preceding element nodes, in document order.– following-sibling, preceding-sibling—elements at the same syntactic

level.– namespace—contains namespace nodes of context node.– self, descendent-or-self, ancestor-or-self

Page 18: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

The NodeTest After choosing the axis, we refine the selection with a

node test. The most common cases for a the node-test field are likely

to be: – an element or attribute name, selecting nodes in the axis with the

given name, or

– the wildcard, “*”, selecting all nodes of the “principal type” for this axis (typically, all element nodes, or all attribute nodes if axis is attribute).

The node-test field may also be a node type expression:– comment(), text(), processing-instruction(), node()

– Optionally, the processing-instruction() function may include a literal string specifying a particular type of instruction.

Page 19: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Predicates The node test is optionally followed by a series of

predicate expressions. Each expression appears in []s. Syntax of the expressions will be briefly discussed later.

Some examples appear in the next few slides. The predicates are computed successively to further filter

the set of selected nodes—after each predicate is applied, the selected node set is reduced to exclude those elements for which the expression evaluates to false.

Following examples are taken from the XML Path Language specification.

Page 20: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Example location pathschild :: para

– para element children of the context node

child :: *– All element children of the context node

child :: text()– All text node children of the context node

child :: node()– All children, regardless of node type

attribute :: name– The name attribute of the context node

attribute :: *– All attributes of the context node

decendent :: para– para element descendents of the context node

ancestor :: div– div element ancestors of the context node

Page 21: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

More complex exampleschild :: chapter/descendent :: para

– para element descendents of chapter element children of the context node.

child :: */child :: para– All para element grandchildren of the context node.

/descendent :: para– All para elements in this document.

child :: para [position() = 1]– First para element child of the context node.

child :: para [position() = last()]– Last para element child of the context node.

child :: para [position() > 1]– All para element children of the context node, except the first.

/child :: doc/child :: chapter [position() = 5]/child :: section– section elements of 5th chapter element of root doc element.

child :: para [attribute :: type = ‘warning’] [position() = 5]– 5th para child of context node having type attribute value “warning”.

Page 22: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Predicate expressionsThe axis in a location step defines a node-

set, which is then filtered by the node test to produce a reduced node set.

A predicate is evaluated for each element of the node set selected so far:– The context node for the predicate expression

is the element being filtered (not the context node for the location step as a whole!)

– The context set for the predicate expression is the node set currently being filtered.

Page 23: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Context Position Various functions are available in Xpath expressions, e.g.:

– last() returns the size of the context set for the expression.

– position(): the position of the context node in the context set.

If the Xpath expression that appears in the predicate of a path step evaluates to numeric type, it is converted to true if its value is equal to position().– Otherwise it is converted to false.

Thus, by definition, the location step: child :: para [5]

is an alternative to: child :: para [position() = 5]

i.e., the 5th para child.

Page 24: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Booleans In general an Xpath expression is converted to a boolean

—if context demands—by the following rules:– A non-zero number converts to true, zero converts to false.

– A non-empty string converts to true true, empty to false.

– A non-empty node converts to true, empty to false.

According to the third rule, in the location step: child :: section [child :: para]

the predicate is true if the context node for the predicate (i.e. the section node) has at least one para child.

Operators and, or, not() are available.

Page 25: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Comparisons Numeric and string comparisons in Xpath predicates

follows obvious rules. Comparisons involving node sets are defined to be true if

the comparison would hold true for the string-value of any elements of the sets involved.– Note, the string value of an element node is a concatenation of the

string values of its children.

For example, in child :: para [attribute :: type = ‘warning’]

the predicate is true iff the node set attribute :: type includes an element with string-value “warning”, i.e. if the para child has an attribute with value “warning”.

Page 26: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

UnionsThe operator “|” forms the union of two

node sets.e.g.

child :: chapter [child :: section | child :: para]

selects chapters that directly contain a section or a para.

Page 27: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Abbreviated syntax for paths Together, the following abbreviations allow the

UNIX-like path syntax seen earlier:– The axis selector child :: can always be omitted: a node

test alone implicitly refers to the child axis.

– The location step “.” is short for self :: node().

– The location step “..” is short for parent :: node().

Other useful abbreviations are:– The axis selector attribute :: can be abbreviated to @.

– // is short for /descendent-or-self :: node()/• e.g //para is short for any para element in the document.

Page 28: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

An input document

<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xml" href="eg.xsl"?><planets> <planet> <name>Mercury</name> <mass>0.0553</mass> <day units="days">58.65</day> <radius units="miles">1516</radius> <density>0.983</density> </planet> <planet> <name>Venus</name> <mass>0.815</mass> <day units="days">116.75</day> <radius units="miles">3716</radius> <density>0.943</density> </planet> <planet> <name>Earth</name> <mass>1</mass> <day units="days">1</day> <radius units="miles">2107</radius> <density>1</density> </planet></planets>

Simplified version ofexample from the“Inside XML” book(complete with astronomical errors).

Page 29: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Using an empty style sheet Consider the example where there are no templates

explicitly specified, eg.xsl has the form: <?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" > </xsl:stylesheet>

The transformation of the input document is:Mercury0.055358.6515160.983Venus0.815116.7537160.943Earth1121071

i.e. just the concatenated string values in all text nodes.

This happens because there is a default template rule: <xsl:template match=“text()”> <xsl:value-of select=“.”/> </xsl:template>

Page 30: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Templates without embedded XSLT Now consider a single template, with no embedded XSLT

commands: <?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="1.0” xmlns:xsl="http://www.w3.org/1999/XSL/Transform" > <xsl:template match="planet"> <p>planet discovered</p> </xsl:template> </xsl:stylesheet>

The transformation of the input document is:<?xml version="1.0" encoding="UTF-16"?><p>planet discovered</p><p>planet discovered</p><p>planet discovered</p>

This is valid HTML, but not very readable (as text).

We can add the command: <xsl:output indent="yes"/>

to the xsl:stylesheet element to get prettier output formatting.

Page 31: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

The xsl:apply-templates element Suppose a second template matching the planets element is

added: <xsl:template match="planet"> <p>planet discovered</p> </xsl:template>

<xsl:template match="planets"> <h1>All Known Planets</h1> </xsl:template>

The output now only contains the header: <h1>All Known Planets</h1>

not the “planet discovered” messages from processing the nested planet elements.

Once a match is found, nested elements are not processed unless there is an explicit <xsl:apply-templates> instruction: <xsl:template match="planets"> <h1>All Known Planets</h1> <xsl:apply-templates/> </xsl:template>

Page 32: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

The xsl:value-of element We can now match arbitrary nodes in the source

document, but we don’t yet have a way to extract data from those nodes.

To do this we need the xsl:value-of element, e.g.: <xsl:template match="planet"> <p>planet <xsl:value-of select="name"/>

discovered</p> </xsl:template>

<xsl:template match="planets"> <h1>All Known Planets</h1> <xsl:apply-templates/> </xsl:template>

We now get the more interesting output: <h1>All Known Planets</h1> <p>planet Mercury discovered</p> <p>planet Venus discovered</p> <p>planet Earth discovered</p>

Page 33: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Selections The select attribute of the xsl:value-of element is

a general Xpath expression. Its result—which may be a node set or other

allowed value—is converted to a string and included in the output.

For example, the selection can be an attribute node, a set of elements, or it could be the result of a numeric computation.

If the selection is a set of elements, the text contents of all the element bodies, including nested elements, are concatenated and returned as the value.

Page 34: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

Xpath expressions in attributes Suppose we want to generate an XML element in the output with

an attribute whose value is computed from source data. One might be tempted to try a template like:

<planet name = "<xsl:value-of select='name'/>" > Status: discovered </planet>

This is ill-formed XML: we cannot have an XML element as an attribute value.

Instead {}s can be used in an attribute value to surround an Xpath expression:

<planet name = "{name}" > Status: discovered </planet>

The Xpath expression name is evaluated exactly as for a select attribute, and interpolated into the attribute value string.

Page 35: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

The xsl:element element For similar reasons we cannot use <xsl:value-of> to

compute an expression that is used as the name of an element in the generated file.

Instead one can use instead the xsl:element element.– These can optionally include nested xsl:attribute elements (as their

first children): <xsl:template match="planet"> <xsl:element name="{name}"> <xsl:attribute name="distance"> <xsl:value-of select="distance"/> </xsl:attribute> Status: discovered </xsl:element> </xsl:template>

– When this template matches a planet, it generates an XML element whose name is the planet, with a distance attribute.

Page 36: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

A Table of Planets<xsl:template match="planets"> <html><body> <h1>All Known Planets</h1> <table width="100%" align="center" border="1"> <tr><th>name</th><th>mass</th><th>density</th> <th>radius</th></tr>

<xsl:apply-templates/> <!-- rows of table -->

<tr> <td>AVERAGES</td> <td> <xsl:value-of select="sum(planet/density) div count(planet)"/> </td> <td></td> <td></td> </tr> </table> </body></html> </xsl:template>

Page 37: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

A row of the table

<xsl:template match="planet"> <tr> <td><xsl:value-of select="name"/></td> <td><xsl:value-of select="mass"/></td> <td><xsl:value-of select="density"/></td> <td><xsl:value-of select="radius"/></td> </tr> </xsl:template>

Page 38: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

The display

Page 39: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

References

Inside XML, Chapter 13: “XSL Transformations”.

“XSL Transformations (XSLT)”, version 1.0: http://www.w3.org/TR/xslt

“XML Path Language (XPath)”, version 1.0: http://www.w3.org/TR/xpath

Nancy McCracken, Ozgur Balsoy– http://aspen.csit.fsu.edu/webtech/xml/

Page 40: XSLT Brent P. Christie Major USMC. XSLT Overview  What is XSLT? –XSL is the Extensible Style Language. –It has two parts: the transformation language

XML and Related Acronyms Document Type Definition (DTD), which defines the tags and their relationships Extensible Style Language (XSL) style sheets, which specify the presentation of

the document Cascading Style Sheets(CSS) less powerful presentation technology without tag

mapping capability XPATH which specifies location in document XLINK and XPOINTER which defines link-handling details Resource Description Framework (RDF), document metadata Document Object Model (DOM), API for converting the document to a tree

object in your program for processing and updating Simple API for XML (SAX), “serial access” protocol, fast-to-execute protocol for

processing document on the fly XML Namespaces, for an environment of multiple sets of XML tags XHTML, a definition of HTML tags for XML documents (which are then just

HTML documents) XML schema, offers a more flexible alternative to DTD