55
Semantic web course – Computer Engineering Department – Sharif Univ. of Technology – Fall 2005 1 Semantic web course – Computer Engineering Department – Sharif Univ. of Technology – Fall 2005 1 XML Transformation: XSLT Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology

XML Transformation: XSLT

  • Upload
    sally

  • View
    119

  • Download
    2

Embed Size (px)

DESCRIPTION

XML Transformation: XSLT. Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology. Outline. Fundamentals of XSLT XPath Cocoon. XSLT. XSLT stands for E xtensible S tylesheet L anguage T ransformations - PowerPoint PPT Presentation

Citation preview

Page 1: XML Transformation:  XSLT

Semantic web course – Computer Engineering Department – Sharif Univ. of Technology – Fall 2005 1Semantic web course – Computer Engineering Department – Sharif Univ. of Technology – Fall 2005 1

XML Transformation: XSLT

Semantic Web - Spring 2007Computer Engineering Department

Sharif University of Technology

Page 2: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 2

Outline

• Fundamentals of XSLT• XPath• Cocoon

Page 3: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 3

XSLT

• XSLT stands for Extensible Stylesheet Language Transformations

• It is used to transform XML documents into other kinds of documents, e.g. HTML, PDF, XML, …

• XSLT uses two input files:– The XML document containing the actual data– The XSL document containing both the “framework” in

which to insert the data, and XSLT commands to do so

Page 4: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 4

XSLT Architecture

Source XML doc

XSL stylesheet

XSL processor

Target Document

Page 5: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 5

Some special transforms

• XML to HTML— for old browsers

• XML to LaTeX—for TeX layout

• XML to SVG—graphs, charts, trees

• XML to tab-delimited—for db/stat packages

• XML to plain-text—occasionally useful

• XML to XSL-FO formatting objects

Page 6: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 6

XSLT Data Model

• XSLT reads an XML documents as a source tree• Transforms the documents into a result tree• Transformations are specified in a stylesheet• To navigate the tree XSLT uses XPath

Page 7: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 7

Introduction to XPath

• XPath is a syntax for addressing parts of an XML document by– describing paths through the document hierarchy– specifying constraints to match against the document's

structure• XSL uses XPath expressions to

– determine which elements match a template– select nodes upon which to perform operations

Page 8: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 8

XPath Basics

• XPath expressions superficially resemble UNIX pathnames, e.g.

poem/stanza/line

refers to "all line elements which are children of stanza elements which are children of poem elements"

• XPath expressions are evaluated relative to a "context node", which is analogous to the "current working directory" in UNIX or DOS. The XPath expression for this is "."

Page 9: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 9

XPath Basics: a Simple Example

• Consider the following XML document:<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 10: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 10

XPath Basics: a Simple Example (cont.)

• The XPath "poem/stanza/line" selects<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 11: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 11

XPath Basics: wildcards

• The XPath "poem/stanza/*" selects<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 12: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 12

XPath Basics: descendants

• The XPath "poem//punch" selects:<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 13: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 13

XPath Basics: sequencing

• "poem/stanza/line[1]" selects:<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 14: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 14

XPath Basics: sequencing (cont.)

• "poem/stanza/line[position() = last()]" selects:<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 15: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 15

XPath Basics: selecting text nodes

• "poem/author/text()" selects:<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 16: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 16

XPath Basics: conditionals

• "poem/stanza[punch]" selects:<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 17: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 17

XPath Basics: conditionals: equality

• “//line[text()="I'm a poet"]”<poem> <title>Roses</title> <author>Ima Poet</author> <stanza> <line>Roses are red</line> <line>violets are blue</line> </stanza> <stanza> <line>I'm a poet</line> <punch>and you're not!</punch> </stanza></poem>

Page 18: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 18

A simple XSL example• File data.xml:

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="render.xsl"?><message>Hello World!</message>

• File render.xsl: <?xml version="1.0"?>

<xsl:stylesheet version="1.0” xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <!-- one rule, to transform the input root (/) --> <xsl:template match="/">

<html><body> <h1><xsl:value-of select="message"/></h1>

</body></html> </xsl:template></xsl:stylesheet>

Page 19: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 19

Stylesheet (.xsl file)

• It is a well-formed XML document• It is a collection of template rules• A template rule consists of pattern and a

template• Pattern is specified in Xpath and locates the node

of the XML tree.• The located node is replaced by the template in

the result tree

Page 20: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 20

The .xsl file

• An XSLT document has the .xsl extension • The XSLT document begins with:

<?xml version="1.0"?> <xsl:stylesheet version="1.0"

xmlns:xsl="http://www.w3.org/1999/XSL/Transform">• Contains one or more templates, such as:

<xsl:template match="/"> ... </xsl:template>• And ends with:

</xsl:stylesheet>

Page 21: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 21

Finding the message text

• The template <xsl:template match="/"> says to select the entire file– You can think of this as selecting the root node of the

XML tree• Inside this template,

– <xsl:value-of select="message"/> selects the message child

– Alternative Xpath expressions that would also work:• ./message• /message/text()• ./message/text()

Page 22: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 22

Putting it together

• The XSL was: <xsl:template match="/">

<html><body> <h1><xsl:value-of select="message"/></h1>

</body></html> </xsl:template>

• The <xsl:template match="/"> chooses the root• The <html><body> <h1> is written to the output file• The contents of message is written to the output file• The </h1> </body></html> is written to the output file

• The resultant file looks like: <html><body> <h1>Hello World!</h1> </body></html>

Page 23: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 23

How XSLT works

• The XML text document is read in and stored as a tree of nodes

• The <xsl:template match="/"> template is used to select the entire tree

• The rules within the template are applied to the matching nodes, thus changing the structure of the XML tree– If there are other templates, they must be called explicitly from the

main template• Unmatched parts of the XML tree are not changed• After the template is applied, the tree is written out again

as a text document

Page 24: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 24

Where XSLT can be used

• A server can use XSLT to change XML files into HTML files before sending them to the client

• A modern browser can use XSLT to change XML into HTML on the client side– This is what we will mostly be doing in this class

• Most users seldom update their browsers– If you want “everyone” to see your pages, do any XSL

processing on the server side

Page 25: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 25

Modern browsers

• Internet Explorer 6 best supports XML• Netscape 6 supports some of XML• Internet Explorer 5.x supports an obsolete version

of XML– If you must use IE5, the initial PI is different (you can

look it up if you ever need it)

Page 26: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 26

xsl:value-of

• <xsl:value-of select="XPath expression"/> selects the contents of an element and adds it to the output stream– The select attribute is required– Notice that xsl:value-of is not a container, hence it

needs to end with a slash

• Example (from an earlier slide): <h1> <xsl:value-of select="message"/> </h1>

Page 27: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 27

xsl:for-each

• xsl:for-each is a kind of loop statement• The syntax is

<xsl:for-each select="XPath expression"> Text to insert and rules to apply </xsl:for-each>

• Example: to select every book (//book) and make an unordered list (<ul>) of their titles (title), use: <ul> <xsl:for-each select="//book"> <li> <xsl:value-of select="title"/> </li> </xsl:for-each> </ul>

Page 28: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 28

Filtering output

• You can filter (restrict) output by adding a criterion to the select attribute’s value: <ul> <xsl:for-each select="//book"> <li> <xsl:value-of select="title[../author='Terry Pratchett']"/> </li> </xsl:for-each> </ul>

• This will select book titles by Terry Pratchett

Page 29: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 29

Filter details

• Here is the filter we just used: <xsl:value-of select="title[../author='Terry Pratchett']"/>

• author is a sibling of title, so from title we have to go up to its parent, book, then back down to author

• This filter requires a quote within a quote, so we need both single quotes and double quotes

• Legal filter operators are: = != &lt; &gt;– Numbers should be quoted, but apparently don’t have to be

Page 30: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 30

But it doesn’t work right!

• Here’s what we did: <xsl:for-each select="//book"> <li> <xsl:value-of select="title[../author='Terry Pratchett']"/> </li> </xsl:for-each>

• This will output <li> and </li> for every book, so we will get empty bullets for authors other than Terry Pratchett

• There is no obvious way to solve this with just xsl:value-of

Page 31: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 31

xsl:if

• xsl:if allows us to include content if a given condition (in the test attribute) is true

• Example: <xsl:for-each select="//book"> <xsl:if test="author='Terry Pratchett'"> <li> <xsl:value-of select="title"/> </li> </xsl:if> </xsl:for-each>

• This does work correctly!

Page 32: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 32

xsl:choose

• The xsl:choose ... xsl:when ... xsl:otherwise construct is XML’s equivalent of Java’s switch ... case ... default statement

• The syntax is:<xsl:choose> <xsl:when test="some condition"> ... some code ... </xsl:when> <xsl:otherwise> ... some code ... </xsl:otherwise></xsl:choose>

• xsl:choose is often used within an xsl:for-each loop

Page 33: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 33

xsl:sort

• You can place an xsl:sort inside an xsl:for-each• The attribute of the sort tells what field to sort on• Example:

<ul> <xsl:for-each select="//book"> <xsl:sort select="author"/> <li> <xsl:value-of select="title"/> by <xsl:value-of select="author"/> </li> </xsl:for-each> </ul>– This example creates a list of titles and authors, sorted

by author

Page 34: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 34

xsl:text

• <xsl:text>...</xsl:text> helps deal with two common problems:– XSL isn’t very careful with whitespace in the document

• This doesn’t matter much for HTML, which collapses all whitespace anyway (though the HTML source may look ugly)

• <xsl:text> gives you much better control over whitespace; it acts like the <pre> element in HTML

– Since XML defines only five entities, you cannot readily put other entities (such as &nbsp;) in your XSL

• &amp;nbsp; almost works, but &nbsp; is visible on the page • Here’s the secret formula for entities:

<xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>

Page 35: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 35

Creating tags from XML data

• Suppose the XML contains<name>Dr. Abolhassani's Home Page</name><url>http://sharif.edu/~abolhassani</url>

• And you want to turn this into<a href="http://sharif.edu/~abolhassani">Dr. Abolhassani's Home Page</a>

• We need additional tools to do this!

Page 36: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 36

Creating tags--solution 1• Suppose the XML contains

<name>Dr. Abolhassani's Home Page</name><url>http://sharif.edu/~abolhassani</url>

• <xsl:attribute name="..."> adds the named attribute to the enclosing tag

• The value of the attribute is the content of this tag• Example:

<a> <xsl:attribute name="href"> <xsl:value-of select="url"/> </xsl:attribute> <xsl:value-of select="name"/> </a>

• Result: <a href="http://sharif.edu/~abolhassani">Dr. Abolhassani's Home Page</a>

Page 37: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 37

Creating tags--solution 2• Suppose the XML contains

<name>Dr. Abolhassani's Home Page</name><url>http://sharif.edu/~abolhassani</url>

• An attribute value template (AVT) consists of braces { } inside the attribute value

• The content of the braces is replaced by its value• Example:

<a href="{url}"> <xsl:value-of select="name"/> </a>

• Result: <a href="http://sharif.edu/~abolhassani">

Dr. Abolhassani's Home Page</a>

Page 38: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 38

Modularization

• Modularization--breaking up a complex program into simpler parts--is an important programming tool– In programming languages modularization is often done with

functions or methods– In XSL we can do something similar with

xsl:apply-templates• For example, suppose we have a DTD for book with parts

titlePage, tableOfContents, chapter, and index– We can create separate templates for each of these parts

Page 39: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 39

Book example

• <xsl:template match="/"> <html> <body> <xsl:apply-templates/> </body> </html></xsl:template>

• <xsl:template match="tableOfContents"> <h1>Table of Contents</h1> <xsl:apply-templates select="chapterNumber"/> <xsl:apply-templates select="chapterName"/> <xsl:apply-templates select="pageNumber"/></xsl:template>

• Etc.

Page 40: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 40

xsl:apply-templates

• The <xsl:apply-templates> element applies a template rule to the current element or to the current element’s child nodes

• If we add a select attribute, it applies the template rule only to the child that matches

• If we have multiple <xsl:apply-templates> elements with select attributes, the child nodes are processed in the same order as the <xsl:apply-templates> elements

Page 41: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 41

Applying templates to children

• <book> <title>XML</title> <author>Gregory Brill</author> </book>

• <xsl:template match="/"> <html> <head></head> <body> <b><xsl:value-of select="/book/title"/></b> <xsl:apply-templates select="/book/author"/> </body> </html></xsl:template><xsl:template match="/book/author"> by <i><xsl:value-of select="."/></i></xsl:template>

With this line:XML by Gregory Brill

Without this line:XML

Page 42: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 42

Tools for XSL Development

• There are a number of free and commercial XSL tools available– XSLT processors:

• MSXML, which currently supports the latest XSLT specification (native Win32)

• Xalan from Apache (C++, Java)– Editors and browsers

• Internet Explorer 6.0• XML Spy (commercial)

Page 43: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 43

Cocoon

• Cocoon is Apache’s dynamic XML Publishing Framework.

• Cocoon uses XSLT.• Cocoon allows separation of content, logic and

presentation. making sure people can interact and collaborate on a project, without stepping on each other toes, and component-based web development.

• Cocoon is a web-application that runs using Apache Tomcat (Cocoon.war).

Page 44: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 44

What Cocoon can do

X M Ld o c u m e nt(s ta tic o rd y na m ic )

N e wD e vic e

bro ws e r

re q u e s t

H TM L/PDF /... ...W M L

requ e s t

requ

est

?

Page 45: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 45

Cocoon Pipeline

Cocoon introduced the idea of a pipeline to handle a request. A pipeline is a series of steps for processing a particular kind of content.

File G e ne rato r X SL T Trans fo rm e r H TM L s e r ial ize r

SAX SAXX M L

D o c um e nt H TM L F ile

Page 46: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 46

Cocoon processing

Page 47: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 47

Separating content and layout

Page 48: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 48

Sitemap

Page 49: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 49

Sitemap (cont.)

Page 50: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 50

Defining a pipeline

Page 51: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 51

Defining a pipeline (cont.)

Page 52: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 52

Complex pipeline

Page 53: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 53

Matching a request

Page 54: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 54

Other features

• Selectors components• Multi-channel capabilities• Actions components• Readers• XSP• Content aggregation• Views• Form processing

Page 55: XML Transformation:  XSLT

Semantic web - Computer Engineering Dept. - Spring 2007 55

References

• Specifications:– http://www.w3.org/Style/XSL– http://www.w3.org/TR/xslt– http://www.w3.org/TR/xpath– http://www.w3.org/TR/xsl

• An excellent XSLT tutorial:– http://www.cafeconleche.org/books/bible2/chapters/ch17.html

• Another tutorial:– http://www.w3schools.com/xsl

• Microsoft (MSXML3):– http://msdn.microsoft.com/xml

• Saxon:– http://saxon.sourceforge.net/

• Xalan:– http://xml.apache.org./xalan/overview.html

• Cocoon: http://cocoon.apache.org