XML
Vadim Parizher
CS 496-EBT
Fall 2003
Learning Objectives
Learn what XML is
Learn the various ways in which XML is used
Learn the key companion technologies
Agenda
Overview
Syntax and Structure
The XML Alphabet Soup
Overview: What is XML?
A tag-based meta language
Designed for structured data representation
Represents data hierarchically (in a tree)
Provides context to data (makes it meaningful)
  Self-describing data
Separates presentation (HTML) from data (XML)
An open W3C standard
A subset of SGML
  vs. HTML, which is an application of SGML
Overview: What is XML?
XML is a “use everywhere” data specification
[Diagram: XML exchanged among Documents, Configuration, Database, Application X, and Repository]
Overview: Documents vs. Data
XML is used to represent two main types of things:
Documents
  Lots of text with tags to identify and annotate portions of the document
Data
  Hierarchical data structures
Overview: XML and Structured Data
Pre-XML representation of data:
"PO-1234","CUST001","X9876","5","14.98"
XML representation of the same data:
<PURCHASE_ORDER>
  <PO_NUM> PO-1234 </PO_NUM>
  <CUST_ID> CUST001 </CUST_ID>
  <ITEM_NUM> X9876 </ITEM_NUM>
  <QUANTITY> 5 </QUANTITY>
  <PRICE> 14.98 </PRICE>
</PURCHASE_ORDER>
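The contrast between the two representations can be sketched in Python with the standard library. This is only an illustration: the variable names are made up for the example, and the XML text is the purchase order from the slide with the padding spaces stripped.

```python
import csv
import io
import xml.etree.ElementTree as ET

CSV_RECORD = '"PO-1234","CUST001","X9876","5","14.98"'
XML_RECORD = """<PURCHASE_ORDER>
  <PO_NUM>PO-1234</PO_NUM>
  <CUST_ID>CUST001</CUST_ID>
  <ITEM_NUM>X9876</ITEM_NUM>
  <QUANTITY>5</QUANTITY>
  <PRICE>14.98</PRICE>
</PURCHASE_ORDER>"""

# Positional record: the reader must already know what each column means.
csv_fields = next(csv.reader(io.StringIO(CSV_RECORD)))

# XML record: every value travels with its own label (self-describing data).
order = ET.fromstring(XML_RECORD)
xml_fields = {child.tag: child.text.strip() for child in order}

print(csv_fields)
print(xml_fields)
```

The values are identical in both cases; only the XML version says which one is the quantity and which one is the price.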
Overview: Benefits of XML
Open W3C standard
Representation of data across heterogeneous environments
  Cross platform
  Allows for high degree of interoperability
Strict rules
  Syntax
  Structure
  Case sensitive
Overview: Who Uses XML?
Submissions by
  Microsoft
  IBM
  Hewlett-Packard
  Fujitsu Laboratories
  Sun Microsystems
  Netscape (AOL), and others…
Technologies using XML
  SOAP, ebXML, BizTalk, WebSphere, many others…
Agenda
Overview
Syntax and Structure
The XML Alphabet Soup
Syntax and Structure: Components of an XML Document
Elements
  Each element has a beginning and ending tag
    <TAG_NAME>...</TAG_NAME>
  Elements can be empty (<TAG_NAME />)
Attributes
  Describes an element; e.g. data type, data range, etc.
  Can only appear on beginning tag
Processing instructions
  Encoding specification (Unicode by default)
  Namespace declaration
  Schema declaration
Syntax and Structure: Components of an XML Document
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="template.xsl"?>
<ROOT>
  <ELEMENT1><SUBELEMENT1 /><SUBELEMENT2 /></ELEMENT1>
  <ELEMENT2> </ELEMENT2>
  <ELEMENT3 type='string'> </ELEMENT3>
  <ELEMENT4 type='integer' value='9.3'> </ELEMENT4>
</ROOT>
Prologue (processing instructions): the two <?...?> lines
Elements: ROOT, ELEMENT1, ELEMENT2 and the sub-elements
Elements with attributes: ELEMENT3 and ELEMENT4
Syntax and Structure: Rules For Well-Formed XML
There must be one, and only one, root element
Sub-elements must be properly nested
  A tag must end within the tag in which it was started
Attributes are optional
  Defined by an optional schema
Attribute values must be enclosed in "" or ''
Processing instructions are optional
XML is case-sensitive
  <tag> and <TAG> are not the same type of element
Syntax and Structure: Well-Formed XML?
<?xml version="1.0" ?>
<PARENT>
  <CHILD1>This is element 1</CHILD1>
  <CHILD2><CHILD3>Number 3</CHILD2></CHILD3>
</PARENT>
No, CHILD2 and CHILD3 do not nest properly
Syntax and Structure: Well-Formed XML?
<?xml version="1.0" ?>
<PARENT>
  <CHILD1>This is element 1</CHILD1>
</PARENT>
<PARENT>
  <CHILD1>This is another element 1</CHILD1>
</PARENT>
No, there are two root elements
Syntax and Structure: Well-Formed XML?
<?xml version="1.0" ?>
<PARENT>
  <CHILD1>This is element 1</CHILD1>
  <CHILD2/>
  <CHILD3></CHILD3>
</PARENT>
Yes
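The three well-formedness questions can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the shortened test strings are assumptions made for this example. The parser only checks well-formedness, not validity against a schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# CHILD2 is closed before the CHILD3 element it contains.
BAD_NESTING = "<PARENT><CHILD2><CHILD3>Number 3</CHILD2></CHILD3></PARENT>"

# Two top-level PARENT elements, so there is no single root.
TWO_ROOTS = (
    "<PARENT><CHILD1>one</CHILD1></PARENT>"
    "<PARENT><CHILD1>two</CHILD1></PARENT>"
)

# Opening and closing tags differ only in case.
CASE_MISMATCH = "<tag>text</TAG>"

# One root, proper nesting, one empty element written two ways.
GOOD = "<PARENT><CHILD1>This is element 1</CHILD1><CHILD2/><CHILD3></CHILD3></PARENT>"

for name, text in [("bad nesting", BAD_NESTING), ("two roots", TWO_ROOTS),
                   ("case mismatch", CASE_MISMATCH), ("good", GOOD)]:
    print(name, is_well_formed(text))
```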
Syntax and Structure: An XML Document
<?xml version='1.0'?>
<bookstore>
  <book genre='autobiography' publicationdate='1981' ISBN='1-861003-11-0'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
      <first-name>Benjamin</first-name>
      <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
  </book>
  <book genre='novel' publicationdate='1967' ISBN='0-201-63361-2'>
    <title>The Confidence Man</title>
    <author>
      <first-name>Herman</first-name>
      <last-name>Melville</last-name>
    </author>
    <price>11.99</price>
  </book>
</bookstore>
Syntax and Structure Namespaces: Overview
Part of XML’s extensibility
Allow authors to differentiate between tags of the same name (using a prefix)
  Frees author to focus on the data and decide how to best describe it
  Allows multiple XML documents from multiple authors to be merged
Identified by a URI (Uniform Resource Identifier)
  When a URL is used, it does NOT have to represent a live server
Syntax and Structure Namespaces: Declaration
Namespace declaration examples:
xmlns:bk="http://www.example.com/bookinfo/"
xmlns:bk="urn:mybookstuff.org:bookinfo"
Parts of a declaration:
xmlns:bk="http://www.example.com/bookinfo/"
  xmlns: namespace declaration
  bk: prefix
  "http://www.example.com/bookinfo/": URI (URL)
Syntax and Structure Namespaces: Examples
<BOOK xmlns:bk="http://www.bookstuff.org/bookinfo">
  <bk:TITLE>All About XML</bk:TITLE>
  <bk:AUTHOR>Joe Developer</bk:AUTHOR>
  <bk:PRICE currency='US Dollar'>19.99</bk:PRICE>
</BOOK>

<bk:BOOK xmlns:bk="http://www.bookstuff.org/bookinfo"
         xmlns:money="urn:finance:money">
  <bk:TITLE>All About XML</bk:TITLE>
  <bk:AUTHOR>Joe Developer</bk:AUTHOR>
  <bk:PRICE money:currency='US Dollar'>19.99</bk:PRICE>
</bk:BOOK>
Syntax and Structure Namespaces: Default Namespace
An XML namespace declared without a prefix becomes the default namespace for all sub-elements
All elements without a prefix will belong to the default namespace:
<BOOK xmlns="http://www.bookstuff.org/bookinfo">
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
</BOOK>
Syntax and Structure Namespaces: Scope
Unqualified elements belong to the inner-most default namespace.
  BOOK, TITLE, and AUTHOR belong to the default book namespace
  PUBLISHER and NAME belong to the default publisher namespace
<BOOK xmlns="www.bookstuff.org/bookinfo">
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
  <PUBLISHER xmlns="urn:publishers:publinfo">
    <NAME>Microsoft Press</NAME>
  </PUBLISHER>
</BOOK>
Syntax and Structure Namespaces: Attributes
Unqualified attributes do NOT belong to any namespace
  Even if there is a default namespace
This differs from elements, which belong to the default namespace
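The scoping rules for default namespaces and unqualified attributes can be observed with Python's `xml.etree.ElementTree`, which reports each name as `{namespace-uri}local-name`. This is a sketch only: the `edition` attribute is a hypothetical addition used to show attribute behavior, and the book namespace URI is the one from the earlier namespace examples.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://www.bookstuff.org/bookinfo"
PUBLISHER_NS = "urn:publishers:publinfo"

# 'edition' is a hypothetical unqualified attribute added for illustration.
DOC = f"""<BOOK xmlns="{BOOK_NS}" edition="2">
  <TITLE>All About XML</TITLE>
  <PUBLISHER xmlns="{PUBLISHER_NS}">
    <NAME>Microsoft Press</NAME>
  </PUBLISHER>
</BOOK>"""

root = ET.fromstring(DOC)

# Each element is reported with the inner-most default namespace in scope.
tags = [element.tag for element in root.iter()]

# The unqualified attribute keeps a bare name: it is in no namespace,
# even though its element is in the default namespace.
attribute_names = list(root.attrib)

print(tags)
print(attribute_names)
```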
Syntax and Structure Entities
Entities provide a mechanism for textual substitution, e.g.
  Entity    Substitution
  &lt;      <
  &amp;     &
You can define your own entities
Parsed entities can contain text and markup
Unparsed entities can contain any data
  JPEG photos, GIF files, movies, etc.
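Entity substitution for the predefined entities can be seen in both directions with the Python standard library. This is a small sketch; the `EXPR` element name and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the parser replaces &lt; and &amp; with the characters they stand for.
element = ET.fromstring("<EXPR>a &lt; b &amp;&amp; c</EXPR>")
parsed_text = element.text

# Writing: escape() turns the special characters back into entity references.
escaped_text = escape(parsed_text)

print(parsed_text)
print(escaped_text)
```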
Agenda
Overview
Syntax and Structure
The XML Alphabet Soup
The XML ‘Alphabet Soup’
XML itself is fairly simple
Most of the learning curve is knowing about all of the related technologies
The XML ‘Alphabet Soup’
XML: Extensible Markup Language. Defines XML documents
Infoset: Information Set. Abstract model of XML data; definition of terms
DTD: Document Type Definition. Non-XML schema
XSD: XML Schema. XML-based schema language
CSS: Cascading Style Sheets. Allows you to specify styles
XSL: Extensible Stylesheet Language. Language for expressing stylesheets; consists of XSLT and XSL-FO
XSLT: XSL Transformations. Language for transforming XML documents
XSL-FO: XSL Formatting Objects. Language to describe precise layout of text on a page
The XML ‘Alphabet Soup’
XPath: XML Path Language. A language for addressing parts of an XML document, designed to be used by both XSLT and XPointer
XPointer: XML Pointer Language. Supports addressing into the internal structures of XML documents
XLink: XML Linking Language. Describes links between XML documents
XQuery: XML Query Language (draft). Flexible mechanism for querying XML data as if it were a database
DOM: Document Object Model. API to read, create and edit XML documents; creates in-memory object model
SAX: Simple API for XML. API to parse XML documents; event-driven
Data Island: XML data embedded in an HTML page
Data Binding: Automatic population of HTML elements from XML data
The XML ‘Alphabet Soup’ Schemas: Overview
DTD (Document Type Definitions)
  Not written in XML
  No support for data types or namespaces
XSD (XML Schema Definition)
  Written in XML
  Supports data types
  Current standard recommended by W3C
The XML ‘Alphabet Soup’ Schemas: Purpose
Define the “rules” (grammar) of the document
  Data types
  Value bounds
An XML document that conforms to a schema is said to be valid
  More restrictive than well-formed XML
Define which elements are present and in what order
Define the structural relationships of elements
The XML ‘Alphabet Soup’ Schemas: DTD Example
DTD schema:
<!DOCTYPE BOOK [
  <!ELEMENT BOOK (TITLE+, AUTHOR) >
  <!ELEMENT TITLE (#PCDATA) >
  <!ELEMENT AUTHOR (#PCDATA) >
]>
XML document:
<BOOK>
  <TITLE>All About XML</TITLE>
  <AUTHOR>Joe Developer</AUTHOR>
</BOOK>
The XML ‘Alphabet Soup’ Schemas: XSD Example
XML document:
<CATALOG>
  <BOOK>
    <TITLE>All About XML</TITLE>
    <AUTHOR>Joe Developer</AUTHOR>
  </BOOK>
  …
</CATALOG>
The XML ‘Alphabet Soup’ Schemas: XSD Example
<xsd:schema id="NewDataSet"
    targetNamespace="http://tempuri.org/schema1.xsd"
    xmlns="http://tempuri.org/schema1.xsd"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xsd:element name="book">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="title" minOccurs="0" type="xsd:string"/>
        <xsd:element name="author" minOccurs="0" type="xsd:string"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Catalog" msdata:IsDataSet="True">
    <xsd:complexType>
      <xsd:choice maxOccurs="unbounded">
        <xsd:element ref="book"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
The XML ‘Alphabet Soup’ Schemas: Why You Should Use XSD
Newest W3C Standard
Broad support for data types
Reusable “components”
  Simple data types
  Complex data types
Extensible
Inheritance support
Namespace support
Ability to map to relational database tables
XSD support in Visual Studio.NET
The XML ‘Alphabet Soup’ Transformations: XSL
Language for expressing document styles
Specifies the presentation of XML
  More powerful than CSS
Consists of:
  XSLT
  XPath
  XSL Formatting Objects (XSL-FO)
The XML ‘Alphabet Soup’ Transformations: Overview
XSLT: a language used to transform XML data into a different form (commonly XML or HTML)
[Diagram: XML input passes through XSLT to produce XML, HTML, …]
The XML ‘Alphabet Soup’ Transformations: XSLT
The language used for converting XML documents into other forms
Describes how the document is transformed
Expressed as an XML document (.xsl)
Template rules
  Patterns match nodes in source document
  Templates instantiated to form part of result document
Uses XPath for querying, sorting, etc.
The XML ‘Alphabet Soup’ Transformations: Example
<sales>
  <summary>
    <heading>Scootney Publishing</heading>
    <subhead>Regional Sales Report</subhead>
    <description>Sales Report</description>
  </summary>
  <data>
    <region>
      <name>West Coast</name>
      <quarter number="1" books_sold="24000" />
      <quarter number="2" books_sold="38600" />
      <quarter number="3" books_sold="44030" />
      <quarter number="4" books_sold="21000" />
    </region>
    ...
  </data>
</sales>
The XML ‘Alphabet Soup’ Transformations: Example
<xsl:param name="low_sales" select="21000"/>
<BODY>
  <h1><xsl:value-of select="//summary/heading"/></h1>
  ...
  <table><tr><th>Region\Quarter</th>
    <xsl:for-each select="//data/region[1]/quarter">
      <th>Q<xsl:value-of select="@number"/></th>
    </xsl:for-each>
    ...
    <xsl:for-each select="//data/region">
      <tr><th><xsl:value-of select="name"/></th>
        <xsl:for-each select="quarter">
          <td><xsl:choose>
              <xsl:when test="number(@books_sold) &lt;= $low_sales">color:red;</xsl:when>
              <xsl:otherwise>color:green;</xsl:otherwise></xsl:choose>
            <xsl:value-of select="format-number(@books_sold,'###,###')"/></td>
        ...
        <td><xsl:value-of select="format-number(sum(quarter/@books_sold),'###,###')"/>
The XML ‘Alphabet Soup’ Transformations: Example
The XML ‘Alphabet Soup’ XSL Formatting Objects (XSL-FO)
A set of formatting semantics
Denotes typographic elements (for example: page, paragraph, rule, etc.)
Allows finer control obtained via formatting elements
  Word, letter spacing
  Indentation
  Widow, orphan, hyphenation control
  Font style, etc.
The XML ‘Alphabet Soup’ XPath (XML Path Language)
General purpose query language for identifying nodes in an XML document
Declarative (vs. procedural)
Contextual: the results depend on current node
Supports standard comparison, Boolean and mathematical operators (=, <, and, or, *, +, etc.)
The XML ‘Alphabet Soup’ XPath Operators
Operator   Usage Description
/          Child operator: selects only immediate children (when at the beginning of the pattern, context is root)
//         Recursive descent: selects elements at any depth (when at the beginning of the pattern, context is root)
.          Indicates current context
*          Wildcard
@          Prefix to attribute name (when alone, it is an attribute wildcard)
[ ]        Applies filter pattern
The XML ‘Alphabet Soup’ XPath Query Examples
./author (find all author elements within current context)
/bookstore (find the bookstore element at the root)
/* (find the root element)
//author (find all author elements anywhere in document)
/bookstore[@specialty = "textbooks"] (find all bookstores where the specialty attribute = "textbooks")
/book[@style = /bookstore/@specialty] (find all books where the style attribute = the specialty attribute of the bookstore element at the root)
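A few of these query shapes can be tried in Python. Note the assumptions: `xml.etree.ElementTree` implements only a limited subset of XPath and expects paths relative to an element, so `//author` is written as `.//author` here, and the sample data is a trimmed copy of the bookstore document from the Syntax and Structure section, filtered on its `genre` attribute rather than `specialty`.

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book genre='autobiography'>
    <title>The Autobiography of Benjamin Franklin</title>
    <author><first-name>Benjamin</first-name><last-name>Franklin</last-name></author>
  </book>
  <book genre='novel'>
    <title>The Confidence Man</title>
    <author><first-name>Herman</first-name><last-name>Melville</last-name></author>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# Recursive descent: author elements at any depth below the root.
authors = root.findall(".//author")

# Child operator plus an attribute filter: only books whose genre is 'novel'.
novels = root.findall("./book[@genre='novel']")

# Child steps only: titles of books that are direct children of the root.
titles = [title.text for title in root.findall("./book/title")]

# Contextual evaluation: each path is resolved relative to one author node.
last_names = [author.findtext("last-name") for author in authors]

print(titles)
print(last_names)
```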
The XML ‘Alphabet Soup’ XPointer
Builds upon XPath to:
  Identify sub-node data
  Identify a range of data
  Identify data in local document or remote documents
New standard
The XML ‘Alphabet Soup’ XLink
XML Linking Language
Elements of XML documents
Describes links between resources
  Simple links (for example, HTML HREFs)
  Extended links
    Remote resources
    Local resources
    Rules for how a link is followed, etc.
The XML ‘Alphabet Soup’ The XML DOM
XML Document Object Model (DOM)
Provides a programming interface for manipulating XML documents in memory
Includes a set of objects and interfaces that represent the content and structure of an XML document
Enables a program to traverse an XML tree
Allows elements, attributes, etc., to be added/deleted in an XML tree
Allows new XML documents to be created programmatically
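The DOM operations listed above (traverse, add, delete) can be sketched with Python's `xml.dom.minidom`, one of several DOM implementations. The PRICE element and its `currency` attribute are borrowed from the namespace examples and are used here only for illustration.

```python
from xml.dom.minidom import parseString

# The whole document is loaded into an in-memory tree.
doc = parseString(
    "<BOOK><TITLE>All About XML</TITLE><AUTHOR>Joe Developer</AUTHOR></BOOK>"
)
root = doc.documentElement

# Traverse: collect the names of the root's child elements.
child_names = [
    node.tagName for node in root.childNodes
    if node.nodeType == node.ELEMENT_NODE
]

# Add: create a new element with an attribute and a text node.
price = doc.createElement("PRICE")
price.setAttribute("currency", "US Dollar")
price.appendChild(doc.createTextNode("19.99"))
root.appendChild(price)

# Delete: remove the AUTHOR element from the tree.
author = root.getElementsByTagName("AUTHOR")[0]
root.removeChild(author)

# Serialize the modified tree back to text.
result = root.toxml()
print(child_names)
print(result)
```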
The XML ‘Alphabet Soup’ SAX (Simple API for XML)
API to allow developers to read/write XML data
Event based
  Uses a “push” model
Sequential access only (data not cached)
Requires less memory to process XML data than the DOM
  SAX has less overhead (uses small input, work and output buffers) than the DOM
  DOM constructs the data structure in memory (work and output buffers equal to the size of the data)
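The event-driven "push" model can be sketched with Python's `xml.sax` module. The `TitleCollector` handler is a hypothetical class written for this example: the parser calls its methods as it reads the input sequentially, and nothing is kept in memory except what the handler chooses to store.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Records parse events and keeps only the text of TITLE elements."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))
        if name == "TITLE":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        self.events.append(("end", name))
        if name == "TITLE":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleCollector()
xml.sax.parseString(
    b"<CATALOG><BOOK><TITLE>All About XML</TITLE>"
    b"<AUTHOR>Joe Developer</AUTHOR></BOOK></CATALOG>",
    handler,
)
print(handler.titles)
print(handler.events)
```

Because the handler discards everything except the titles, memory use stays small regardless of document size, which is the trade-off against the DOM described above.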
The XML ‘Alphabet Soup’ Data Islands
XML embedded in an HTML document
Manipulated via client side script or data binding
<XML id="XMLID">
  <BOOK>
    <TITLE>All About XML</TITLE>
    <AUTHOR>Joe Developer</AUTHOR>
  </BOOK>
</XML>
<XML id="XMLID" src="mydocument.xml"></XML>
The XML ‘Alphabet Soup’ Data Islands
Can be embedded in an HTML SCRIPT element
XML is accessible via the DOM:
<SCRIPT language="xml" id="XMLID">
<SCRIPT type="text/xml" id="XMLID">
<SCRIPT language="xml" id="XMLID" src="mydocument.xml">
The XML ‘Alphabet Soup’ Data Islands
Access the XML via the HTML DOM:
function returnXMLData() {
  return document.all("XMLID").XMLDocument.nodeValue;
}
Or access the XML directly via the ID:
function returnXMLData() {
  return XMLID.documentElement.text;
}
The XML ‘Alphabet Soup’ Data Binding
Client-side data binding (in the browser)
The XML Data Source Object (DSO) binds HTML elements to an XML data set (or data island)
When the XML data set changes, the bound elements are updated dynamically
  DATASRC: the source of the data (e.g., the ID of the data island)
  DATAFLD: the field (XML element) to display
Can offload XSLT processing to the client
IE only
The XML ‘Alphabet Soup’ Data Binding
<HTML><BODY>
  <XML ID="xmlParts">
    <?xml version="1.0" ?>
    <parts>
      <part>
        <partnumber>A1000</partnumber>
        <description>Flat washer</description>
        <quantity>1000</quantity>
      </part>
      ...
  </XML>
  <table datasrc=#xmlParts border=1><tr>
    <td><div datafld="partnumber"></div></td>
    <td><div datafld="quantity"></div></td>
  </tr></table>
</BODY></HTML>
The XML ‘Alphabet Soup’ XLang (BizTalk)
An XML language for defining processes
Processes usually on multiple platforms
Support for:
  Concurrency
  Long-running transactions
Exposed messaging and orchestration APIs
  Connect to COM components
  Connect to MSMQ queues
  Connect to SQL components
The XML ‘Alphabet Soup’ XML-Based Applications
Microsoft BizTalk Server
  Enables uniform exchange of data among disparate systems (XLang)
  Backend interchange with partners/customers
Microsoft Commerce Server
  XML-based Product Catalog System
  Integrated with BizTalk Server for backend communication
The XML ‘Alphabet Soup’ XML-Based Applications
Microsoft SQL Server
  Retrieve relational data as XML
  Query XML data
  Join XML data with existing database tables
  Update the database via XML Updategrams
Microsoft Exchange Server
  XML is native representation of many types of data
  Used to enhance performance of UI scenarios (for example, Outlook Web Access (OWA))
XML in .NET Other XML Namespaces
System.Xml: Core XML namespace
System.Xml.XPath: Contains the XPathNavigator, XPath parser and evaluation engine
System.Xml.Xsl: Supports XSLT transformations
System.Xml.Serialization: Classes to serialize objects into XML documents or streams
System.Xml.Schema: Supports XSD schemas
Resources
http://msdn.microsoft.com/xml/
http://www.xml.com/
http://www.w3.org/xml/
Microsoft Press Books (http://mspress.microsoft.com/)
  XML Step By Step
  XML In Action
  Developing XML Solutions
O’Reilly Press
  XML In A Nutshell
Resources
XML in .NET
  http://msdn.microsoft.com/msdnmag/issues/01/01/xml/xml.asp
Working with XML in the .NET Platform
  http://www.xmlmag.com/upload/free/features/xml/2001/05may01/dw0102/dw0102.asp