Extensible Markup Language (XML)
Hamid Zarrabi-Zadeh
Web Programming – Fall 2013
Outline
• Introduction
• XML Structure
• Document Type Definition (DTD)
• XHTML
• Formatting XML
CSS Formatting
XSLT Transformations
• JSON
What is XML?
• XML is a markup language for encoding
documents in a format that is both human-
readable and machine-readable
• Is designed to transport and store data
• Emphasizes simplicity, generality, and usability
over the Internet
• Has strong support via Unicode for the languages
of the world
XML History
• XML is based on SGML, the Standard Generalized
Markup Language (ISO 8879:1986)
• Most of XML comes from SGML unchanged
• First XML specification draft published in 1996
• XML 1.0 became a W3C recommendation in
1998 (fifth edition published in 2008)
• XML 1.1 published in 2004 (revised in 2006), but is
not widely implemented and is rarely used
XML Example
• A simple XML example
<?xml version="1.0"?>
<message>
<from>Hassan</from>
<to>Hossein</to>
<body>Please give me a call!</body>
</message>
XML Example
• Another example:
<?xml version="1.0"?>
<books>
<book>
<title>Maktub</title>
<author>Paulo Coelho</author>
</book>
<book>
<title>Never Crashed!</title>
<author>Microsoft</author>
</book>
</books>
XML versus HTML
• XML and HTML are both markup languages
• HTML is for displaying data, while XML is for
describing data
• XML syntax differences
New tags may be defined at will
Tags may be nested to arbitrary depth
A document may contain an optional description of its grammar
• XHTML is a version of HTML in XML
XML Markup Languages
• Lots of new markup languages have been
created with XML, including:
XHTML
RSS for news feeds
RDF for describing resources
SVG for scalable vector graphics
SMIL for describing multimedia for the web
MathML for describing mathematical notation
…
XML Pros and Cons
• Pros:
software- and hardware-independent
simplifies:
sharing data between applications
transporting data between different platforms
• Cons:
verbosity
rather complex parsing and mapping to type systems
XML Structure
XML Tree
• Each XML document forms a tree structure that
starts at the root and branches to the leaves
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
</bookstore>
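As a rough illustration of the tree view, the sketch below parses the bookstore document with Python's standard xml.etree.ElementTree module and walks it depth-first from the root. The helper name walk and the indented output format are our own choices, not part of the slides.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
</bookstore>"""


def walk(element, depth=0):
    """Return (depth, tag) pairs for element and all its descendants, in document order."""
    pairs = [(depth, element.tag)]
    for child in element:
        pairs.extend(walk(child, depth + 1))
    return pairs


root = ET.fromstring(BOOKSTORE)
tree_pairs = walk(root)

# Print one tag per line, indented by its depth in the tree.
for depth, tag in tree_pairs:
    print("  " * depth + tag)
```

The root (bookstore) sits at depth 0, its single book child at depth 1, and the four leaf elements at depth 2.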
XML Tree Example
XML Tags
• XML tags are similar to HTML tags but
They are case-sensitive
All tags must be closed
• Like HTML tags they must be properly nested
• All XML documents must have a single root
element that contains all other elements
This root element can have any name
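These rules can be checked mechanically: a conforming parser rejects any document that breaks them. The sketch below uses Python's standard xml.etree.ElementTree parser as a well-formedness check. The helper name is_well_formed and the sample strings are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    "matching case": is_well_formed("<note><to>Hossein</to></note>"),
    # <To> and </to> are different names because tags are case-sensitive.
    "mismatched case": is_well_formed("<note><To>Hossein</to></note>"),
    "unclosed tag": is_well_formed("<note><to>Hossein</note>"),
    "bad nesting": is_well_formed("<note><b><i>hi</b></i></note>"),
    # A document must have exactly one root element.
    "two roots": is_well_formed("<a/><b/>"),
}

for name, ok in checks.items():
    print(name, "->", "well-formed" if ok else "rejected")
```

Only the first sample is accepted. The other four each violate one of the rules listed above.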
XML Attributes
• XML elements can have attributes
• Attribute values must be quoted with either single
or double quotes
• Attributes have limitations (use with care)
– Child elements are more flexible alternatives
<book title="Let's party!"/>
<book>
<title>It's me</title>
<author>Me who</author>
</book>
<film name='The "Lost"'/>
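The quoting rules can be tried out with Python's standard xml.etree.ElementTree module. This is a small sketch: the variable names are ours, and the first fragment is written as a self-closing element so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Double-quoted value containing an apostrophe.
book = ET.fromstring('<book title="Let\'s party!"/>')

# Single-quoted value containing double quotes.
film = ET.fromstring("<film name='The \"Lost\"'/>")

# The same kind of information stored in child elements instead of attributes.
nested = ET.fromstring("<book><title>It's me</title><author>Me who</author></book>")

# An unquoted attribute value is not well-formed and is rejected by the parser.
try:
    ET.fromstring("<book title=party/>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(book.get("title"))
print(film.get("name"))
print(nested.findtext("title"), "/", nested.findtext("author"))
```

Choosing the quote character that does not occur in the value avoids having to escape it.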
Document Type Definitions (DTDs)
Document Type Definitions
• Most applications will not be able to deal with
general XML documents
• Instead, they expect documents that have a
specific structure
• This structure can be defined with an XML
Document Type Definition (DTD)
• A DTD specifies the root node's tag name and
what it contains
Valid XML
• A well-formed XML document that conforms to
the rules of a DTD is called valid
<?xml version="1.0"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
<from>Hassan</from>
<to>Hossein</to>
<body>Please give me a call!</body>
</message>
DTD Example
• A simple DTD for our message example would
look like this (subject is marked optional with ?
because the example message has no subject)
<!DOCTYPE message
[
<!ELEMENT message (from,to,subject?,body)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
DTD Building Blocks
• In a DTD we can specify
Elements – tags and the text between them
Attributes – information about elements
Entities – named references, e.g. for the special characters <, >, &
PCDATA – parsed character data
Parsed by the XML parser and examined for markup
CDATA – (unparsed) character data
Elements
• There are different ways to declare an element
Empty
Parsed character data
Anything
With a specific sequence of children
<!ELEMENT br EMPTY>
<!ELEMENT p (#PCDATA)>
<!ELEMENT x ANY>
<!ELEMENT message (from,to,subject,body)>
Elements with Children
• Child sequences can be specified using a syntax
similar to regular expressions
<!ELEMENT picture (polygon*)>
<!ELEMENT picture (polygon+)>
<!ELEMENT picture (polygon?)>
<!ELEMENT polygon (point,point,point+)>
<!ELEMENT picture (polygon|image)>
<!ELEMENT picture (polygon|image)*>
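To make the analogy with regular expressions concrete, the sketch below hand-translates a few content models into Python regular expressions over the sequence of child tag names. It is not a DTD validator. The table MODELS, the helpers child_string and matches, and the space-terminated encoding of tag names are all assumptions made for illustration.

```python
import re


def child_string(tags):
    """Encode a list of child tag names as one string, each name followed by a space."""
    return "".join(tag + " " for tag in tags)


# Hand-written regex analogues of DTD content models (our own translation).
MODELS = {
    "(polygon*)": r"(polygon )*",               # zero or more
    "(polygon+)": r"(polygon )+",               # one or more
    "(polygon?)": r"(polygon )?",               # zero or one
    "(point,point,point+)": r"point point (point )+",  # sequence, at least three points
    "(polygon|image)": r"(polygon |image )",    # exactly one of the two
    "(polygon|image)*": r"(polygon |image )*",  # any mix, any length
}


def matches(model, tags):
    """Return True if the child sequence tags satisfies the given content model."""
    return re.fullmatch(MODELS[model], child_string(tags)) is not None


print(matches("(point,point,point+)", ["point", "point", "point"]))
print(matches("(point,point,point+)", ["point", "point"]))
print(matches("(polygon|image)*", ["polygon", "image", "polygon"]))
```

In both notations, a comma-separated sequence becomes concatenation, | stays alternation, and *, + and ? keep their usual meaning.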
Element Attributes
• We can also specify which attributes an element
has
<!ATTLIST element-name attribute-name attribute-type
default-value>
<!ATTLIST polygon boundary CDATA "black">
<!ATTLIST polygon interior CDATA "white">
<!ATTLIST polygon fill (true|false) "true">
<!ATTLIST point x CDATA "0">
Attribute Value Types
• Attribute value types can be
CDATA - The value is character data
(en1|en2|..) - The value must be one from an enumerated list
ID - The value is a unique id
IDREF - The value is the id of another element
IDREFS - The value is a list of other ids
NMTOKEN - The value is a valid XML name token
NMTOKENS - The value is a list of valid XML name tokens
ENTITY - The value is an entity
ENTITIES - The value is a list of entities
NOTATION - The value is a name of a notation
xml: - The value is a predefined xml value
Default Attribute Values
• Default attribute values can be
Value - The default value of the attribute
#REQUIRED - The attribute value must be included in
the element (no default)
#IMPLIED - The attribute does not have to be included
#FIXED value - The attribute value is fixed
Entities
• Entities are variables used to define common text
<!ENTITY entity-name "entity-value">
<!ENTITY sut "Sharif University of Technology">
...
[in XML file:]
&sut;
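The sketch below shows an entity reference being expanded at parse time. It declares the entity in an internal DTD subset and parses the document with Python's standard xml.etree.ElementTree module. The surrounding note element and its text are assumptions added so the fragment forms a complete document.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY sut "Sharif University of Technology">
]>
<note>Welcome to &sut;!</note>"""

# The parser replaces the reference with the entity's declared value.
note = ET.fromstring(DOC)
print(note.text)
```

Because the expansion happens inside the parser, application code only ever sees the replacement text, never the reference itself.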
Example – Newspaper
<!DOCTYPE newspaper [
<!ELEMENT newspaper (article+)>
<!ELEMENT article (headline,byline,body,notes)>
<!ELEMENT headline (#PCDATA)>
<!ELEMENT byline (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ELEMENT notes (#PCDATA)>
<!ATTLIST article author CDATA #REQUIRED>
<!ATTLIST article editor CDATA #IMPLIED>
<!ATTLIST article date CDATA #IMPLIED>
<!ATTLIST article edition CDATA #IMPLIED>
<!ENTITY publisher "Sample Press">
<!ENTITY copy "Copyright 2013 Sample Press"> ]>
XML Schema
• XML Schema is an XML-based alternative to DTD
• Main differences from DTDs
XML schemas use XML syntax
XML schemas support data types
XML schemas are extensible
Schema Example
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="message">
<xs:complexType>
<xs:sequence>
<xs:element name="from" type="xs:string"/>
<xs:element name="to" type="xs:string"/>
<xs:element name="subject" type="xs:string" minOccurs="0"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XHTML
• XHTML is a version of HTML that is proper XML
• XHTML 1.0 released in 2000
• Because it is XML, it is defined using a DTD
• The html tag must have an xmlns attribute
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
</body>
</html>
XHTML versus HTML
• XHTML and HTML have mostly the same tags
• Main differences have to do with XML syntax
All tags must be closed
Empty tags must also be closed
Elements must be properly nested
Tag names must be lowercase
Attribute values must be quoted
Attributes must have values
<input type="checkbox" checked="checked" />
<input type="text" readonly="readonly" />
The id attribute replaces the name attribute
Formatting XML
CSS Formatting
• Formatting information can be added to XML
documents using CSS
• This works by adding a reference to a CSS
stylesheet in the XML document header
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="msg.css"?>
<message>
<from>Hassan</from>
<to>Hossein</to>
<body>Please give me a call!</body>
</message>
CSS Example
• The :before and :after CSS pseudo-elements can
be very useful here
from {
display: block;
padding: 10px;
}
from:before {
content: "From: ";
font-weight: bold;
}
XSLT Transformations
• Formatting XML with CSS is not the most common
method
• W3C recommends using XSLT instead
• XSLT (Extensible Stylesheet Language Transformations) is a language for transforming
XML documents into other XML documents
• To display XML on the web, we could use XSLT to
convert our XML document into an XHTML
document
XSLT Example
<?xml version="1.0"?>
<html xsl:version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/1999/xhtml">
<body>
<xsl:for-each select="messages/message">
<div style="padding:10px; margin:10px">
<div><b>From</b>:
<xsl:value-of select="from"/></div>
<div><b>To</b>:
<xsl:value-of select="to"/></div>
<div><xsl:value-of select="body"/></div>
</div>
</xsl:for-each>
</body>
</html>
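Python's standard library has no XSLT processor, so the sketch below is not XSLT. It rebuilds by hand, with xml.etree.ElementTree, roughly the XHTML that the stylesheet above is meant to produce, which shows what the for-each and value-of instructions do. The sample messages document and the function name to_xhtml are assumptions.

```python
import xml.etree.ElementTree as ET

MESSAGES = """<messages>
  <message>
    <from>Hassan</from>
    <to>Hossein</to>
    <body>Please give me a call!</body>
  </message>
</messages>"""


def to_xhtml(source):
    """Build an XHTML tree with one styled <div> per <message> element."""
    messages = ET.fromstring(source)
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    body = ET.SubElement(html, "body")
    # Plays the role of xsl:for-each select="messages/message".
    for message in messages.findall("message"):
        box = ET.SubElement(body, "div", {"style": "padding:10px; margin:10px"})
        for label, tag in (("From", "from"), ("To", "to")):
            row = ET.SubElement(box, "div")
            bold = ET.SubElement(row, "b")
            bold.text = label
            # Plays the role of xsl:value-of: copy the text of the selected child.
            bold.tail = ": " + message.findtext(tag, default="")
        text_row = ET.SubElement(box, "div")
        text_row.text = message.findtext("body", default="")
    return html


xhtml = to_xhtml(MESSAGES)
print(ET.tostring(xhtml, encoding="unicode"))
```

An XSLT processor does the same job declaratively: the stylesheet describes the output document, and the processor fills it in from the input tree.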
JSON
What is JSON?
• JSON stands for JavaScript Object Notation
• It is a lightweight text-data interchange format,
commonly used as an alternative to XML
• JSON is typically more compact than equivalent XML, and usually faster and easier to parse
• Although JSON uses JavaScript syntax, it is still
language and platform independent.
JSON Examples
{
"message": {
"from": "Hassan",
"to": "Hossein",
"body": "Please give me a call!"
}
}
{
"books": [
{"title": "Maktub", "author": "Paulo Coelho"},
{"title": "Never Crashed!", "author": "Microsoft"}
]
}
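To show that the JSON and XML versions of the message carry the same data, the sketch below loads both with Python's standard library and compares the results. It assumes a flat message whose children contain only text. The variable names are ours.

```python
import json
import xml.etree.ElementTree as ET

JSON_TEXT = '{"message": {"from": "Hassan", "to": "Hossein", "body": "Please give me a call!"}}'
XML_TEXT = (
    "<message><from>Hassan</from><to>Hossein</to>"
    "<body>Please give me a call!</body></message>"
)

# JSON maps directly onto Python dictionaries, lists, and strings.
from_json = json.loads(JSON_TEXT)["message"]

# XML yields a tree; here each child element is flattened into a dictionary entry.
root = ET.fromstring(XML_TEXT)
from_xml = {child.tag: child.text for child in root}

print(from_json == from_xml)
print(from_json["to"])
```

JSON parsing yields native data structures in one call, while the XML tree needs an explicit mapping step. This is one reason JSON is often considered easier to work with.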
Summary
• XML is used to describe data
• DTDs and Schemas can be used to define valid
documents
• XML can be formatted with CSS and XSLT
• XHTML is a version of HTML which is proper XML
• JSON is a good alternative to XML