Introduction to XML
Marek Podgorny and Lukasz Beca
EECS SU and CollabWorx, Inc.
Syracuse University
Fall 2002
Introduction to XML CPS606, Fall 2002, EECS SU & CollabWorx 2

Markup Languages
Marking up text is a methodology for encoding data with information about itself
  – A yellow highlighter is a valid markup methodology
  – You decide which parts of the document are important
  – It is portable: others can benefit from your markup
Two critical properties of a valid markup:
  – A standard must be in place to define what a valid markup is
      – Above, markup is defined as a bit of yellow ink atop text
      – In HTML, a markup is a tag such as <font color=yellow>tag</font>
  – A standard must be in place to define what markup means
      – Yellow highlight means the highlighted text represents an important point
      – In HTML, each tag carries a well-defined formatting instruction

What is XML?
Like HTML, XML (Extensible Markup Language) is a markup language that relies on the concept of rule-specifying tags and on a tag-processing application that knows how to deal with the tags
For HTML, the application is a browser
  – This is because HTML is a presentation markup
For XML, the application can be anything
  – XML may be processed by browsers, but its application domain is huge and not even completely understood today

eXtensibility of XML
The most important technical difference between XML and HTML is that while HTML is a closed set of tags, XML is a meta-language for defining other markup languages
  – XML specifies the standards with which you can define your own markup languages with their own sets of tags
  – This very statement makes people nervous…
  – We will discuss the methodology for defining a new language, but in practice very few people will ever write a DTD

Made-up Markup Language (MuML)
<CONTACT>
  <NAME>Kim Smith</NAME>
  <ID>027</ID>
  <COMPANY>WebtopSystems Inc.</COMPANY>
  <EMAIL>[email protected]</EMAIL>
  <PHONE>315 443-4868</PHONE>
  <STREET>111 College Pl</STREET>
  <CITY>Syracuse</CITY>
  <STATE>New York</STATE>
  <ZIP>13244</ZIP>
</CONTACT>
This is a chunk of well-formed XML. How is it useful?
The Netscape browser surely doesn’t know what to do with it…
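
A browser may not know what to do with MuML, but a generic XML parser can already read it. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module as the processing application; the EMAIL element is left out because its value is not legible in the slide, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The MuML contact from the slide (EMAIL omitted, see the note above).
MUML = """<CONTACT>
  <NAME>Kim Smith</NAME>
  <ID>027</ID>
  <COMPANY>WebtopSystems Inc.</COMPANY>
  <PHONE>315 443-4868</PHONE>
  <STREET>111 College Pl</STREET>
  <CITY>Syracuse</CITY>
  <STATE>New York</STATE>
  <ZIP>13244</ZIP>
</CONTACT>"""

# Parse the text into an element tree; this only requires well-formedness.
contact = ET.fromstring(MUML)

# Turn the child elements into a plain dictionary: tag name -> text.
record = {child.tag: child.text for child in contact}

print(contact.tag)
print(record["NAME"], "-", record["CITY"], record["ZIP"])
```

The parser recovers the structure without knowing anything about what a CONTACT is; the meaning still has to come from the application.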

How to make MuML useful?
There must be a set of rules that allows us (or a computer) to understand the syntax of the language
  – In XML, this information is provided to the processing application by the Document Type Definition (DTD)
  – The DTD specifies what it means to be a valid tag: the syntax for marking up
There must be a set of rules defining the meaning (semantics) of the markup
  – To specify what valid tags mean, XML documents are also associated with style sheets, which provide GUI instructions for a processing application such as a web browser
  – Note that other application domains of XML might do without a style sheet, e.g., an application using XML as an object serialization technique
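
To make the idea of syntax rules concrete, here is a rough sketch of the kind of check a DTD lets a validating processor perform. It is not a DTD and not a real validator: the parsers in Python's standard library do not validate against a DTD, so the rules are hand-rolled, and the ALLOWED and REQUIRED sets are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical MuML rules: which children a CONTACT may contain and
# which are mandatory. A real DTD would state this declaratively.
ALLOWED = {"NAME", "ID", "COMPANY", "EMAIL", "PHONE",
           "STREET", "CITY", "STATE", "ZIP"}
REQUIRED = {"NAME", "ID"}

def check_contact(xml_text):
    """Return a list of rule violations (empty if the document passes)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Not even well-formed, so no further checks are possible.
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != "CONTACT":
        problems.append(f"unexpected root element <{root.tag}>")
    seen = {child.tag for child in root}
    for tag in sorted(seen - ALLOWED):
        problems.append(f"unknown element <{tag}>")
    for tag in sorted(REQUIRED - seen):
        problems.append(f"missing required element <{tag}>")
    return problems

good = check_contact("<CONTACT><NAME>Kim Smith</NAME><ID>027</ID></CONTACT>")
bad = check_contact("<CONTACT><NAME>Kim Smith</NAME><FAX>none</FAX></CONTACT>")
print(good)
print(bad)
```

The first document passes; the second is well-formed but breaks both rules, which is the distinction between well-formed and valid.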

Style Sheet Pseudo-Code
Anytime you see a <CONTACT> tag, display it using a <UL> tag; </CONTACT> tags should be converted to </UL>
All <NAME> tags can be replaced with <LI> tags, and </NAME> tags should be replaced with </LI>
All <EMAIL> tags can be replaced with <LI> tags, and </EMAIL> tags should be ignored
The style sheet uses the functionality of HTML to define the formatting of MuML
For non-browser apps, the HTML translation is irrelevant
The processing application combines the logic of the style sheet, the DTD, and the data of the MuML document, and displays the result according to the rules and the data
So instead of simple HTML we get three different chunks. Why the pain?
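
The pseudo-code rules can be sketched as a small tag-mapping table. This is an illustration in Python, not XSL; the RULES table and the to_html helper are hypothetical names, and two choices are assumptions: elements without a rule are not displayed, and closing </LI> tags are emitted rather than ignored.

```python
import xml.etree.ElementTree as ET

MUML = """<CONTACT>
  <NAME>Kim Smith</NAME>
  <ID>027</ID>
  <EMAIL>[email protected]</EMAIL>
</CONTACT>"""

# Hypothetical rule table: MuML tag -> HTML tag used to display it.
RULES = {"CONTACT": "UL", "NAME": "LI", "EMAIL": "LI"}

def to_html(element):
    """Return an HTML element for `element`, or None if no rule applies."""
    html_tag = RULES.get(element.tag)
    if html_tag is None:
        return None  # assumption: tags without a rule are not displayed
    out = ET.Element(html_tag)
    # Keep real text, drop whitespace that only served as indentation.
    out.text = (element.text or "").strip() or None
    for child in element:
        converted = to_html(child)
        if converted is not None:
            out.append(converted)
    return out

html = ET.tostring(to_html(ET.fromstring(MUML)), encoding="unicode")
print(html)
```

Swapping the RULES table for a different one changes the presentation without touching the MuML data, which is the point of keeping the three chunks separate.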

Complex XML World
We need a processing agent that will put together the DTD, the style sheet, and the data
  – Note that Web browsers are barely up to the task yet
Formal definition:
  – "A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, called the application."
And this is not all yet…

Build your own ColdFusion?
XML allows each specific industry to develop its own tag sets to meet its unique needs
  – It doesn’t force everyone's browser to incorporate zillions of tag sets, or developers to settle for a tag set that is too generic to be useful
  – Compelling? Well…
The real power of XML:
  – Not only can you define your own set of tags, but the rules specified by those tags are not limited to formatting rules
  – XML allows you to define all sorts of tags with all sorts of rules: tags representing business rules, or tags representing data descriptions or data relationships
  – As these tags are reflected in the DOM, you can do computation on documents!
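
As a small illustration of computing on a document through the DOM, the sketch below uses Python's xml.dom.minidom. The data is the parts example that appears later in this deck (the store document plus the door row from the parts table); the text_of helper is a hypothetical convenience function.

```python
from xml.dom.minidom import parseString

STORE = """<store>
  <part id="p001"><part-name>window</part-name><price>40</price><instock>yes</instock></part>
  <part id="p002"><part-name>muffler</part-name><price>150</price><instock>yes</instock></part>
  <part id="p003"><part-name>door</part-name><price>30</price><instock>no</instock></part>
</store>"""

dom = parseString(STORE)

def text_of(node, tag):
    """Text content of the first <tag> element below `node`."""
    return node.getElementsByTagName(tag)[0].firstChild.data

# Business rule expressed over the tags: total value of parts in stock.
in_stock_total = sum(
    int(text_of(part, "price"))
    for part in dom.getElementsByTagName("part")
    if text_of(part, "instock") == "yes"
)
print(in_stock_total)
```

The computation relies only on what the tags say the data is (a price, a stock flag), not on how it would be displayed.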

Why are HTML's days numbered?
The GUI is embedded in the data
  – What happens if you decide that you like a table-based presentation better than a list-based presentation?
Searching for information in the data is tough
The data is tied to the logic and language of HTML, and hence to browsers
  – What if I want to use my data in a Java applet?
HTML: <LI>State: Ohio <LI>State: Oregon
XML: <state>Ohio</state> <state>Oregon</state>
How do I find all records for Ohio?
What is the relationship between Ohio and Oregon?
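
With the XML form, the Ohio question becomes a query on element names. A minimal sketch, assuming Python's xml.etree.ElementTree; only the <state> element comes from the slide, while the <records>, <record>, and <name> elements and the sample names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical records; only the <state> tag is taken from the slide.
DATA = """<records>
  <record><name>A. Jones</name><state>Ohio</state></record>
  <record><name>B. Lee</name><state>Oregon</state></record>
  <record><name>C. Diaz</name><state>Ohio</state></record>
</records>"""

root = ET.fromstring(DATA)

# Select records by the meaning of a field, not by its position in a list.
ohio = [
    record.findtext("name")
    for record in root.findall("record")
    if record.findtext("state") == "Ohio"
]
print(ohio)
```

The HTML form offers nothing comparable: "State: Ohio" inside an <LI> is just display text, so a search has to guess at the layout.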

HTML Search in Action

Long Live XML!
With XML, the GUI and the data are divorced
  – Thus, changes to the display do not require messing with the data: a separate style sheet will specify a table display or a list display
Searching the data is easy and efficient
  – Search engines can parse description-bearing tags rather than muddling through the data. Tags provide them with the intelligence they otherwise lack
Complex relationships (trees, inheritance, classes) can be communicated
The code is much more legible to a lay person
  – It is obvious that <ID>911</ID> represents an ID, whereas <LI>911 might not be. XML is self-describing

Why isn’t it there if it is so good?
No XML applications…
  – IE 5.0 provides some support for XSL and XML if the output is HTML
  – Netscape 5.0 (Mozilla) also implements support for XML but not for XSL
A quote:
  "XML isn't about display -- it's about structure. This has implications that make the browser question secondary. So the whole issue of what is to be displayed and by what means is intentionally left to other applications. You can target the same XML (with different XSL) for different devices (standard web browser, palm pilot, printer, etc.). You should not get the impression that XML is useless until browsers support it. This is definitely not true -- we are using it at NASA in ways where no browser plays any role." - Ken Sall, NASA IT Manager

XML Design Goals
Enable better search algorithms (metadata)
Enable presentation of various views of the same data
Integrate data from different sources
Provide easy use over the Internet
Create documents readable even by humans
Support data interchange
Enable easy development of document-processing applications

XML - Summary
Extensible Markup Language: a subset of Standard Generalized Markup Language (SGML)
Universal format for describing structured data on the Web
Specification developed by the World Wide Web Consortium (W3C), supervised by the XML Working Group

Applications of XML
XML languages
XML protocols
Support for XML
  – Client side
  – Server side
XML and databases
Data interchange

XML Deployment
XML is a basis for the development of industry language and protocol standards
Corporations and academic organizations form special organizations (consortia or forums) in order to develop standards for whole branches of industry. Examples: the World Wide Web Consortium and the WAP Forum

Extensible HyperText Markup Language (XHTML)
XML-based syntax
Extensibility through XHTML modules allows the combination of existing and new feature sets when developing content and when designing new user agents (web browsers, portable devices, etc.)
Examples of modules:
  – required modules: structure, basic text, hypertext, lists
  – optional modules: presentation, forms, tables, images, stylesheets, applets, frames, etc.
XHTML is designed with general user agent interoperability in mind; XHTML documents should be displayable on any type of XHTML-compliant device
Current version - XHTML™ 1.0, DTD specification available at the http://www.w3.org site

Synchronized Multimedia Integration Language (SMIL)
SMIL allows developers to combine media in a presentation so that they are presented and synchronized with each other
For example, a SMIL document can specify:
  – the position where the visual content appears in the player
  – when audio or video (or another type of stream) starts and stops playing
Users need a special player to view SMIL documents
Products supporting SMIL: Real Networks - RealPlayer, Apple - QuickTime
See http://www.empirenet.com/~joseram/smil_intro/smil_intro.html for a tutorial about SMIL written in SMIL
Current version - SMIL 1.0, specification available at the http://www.w3.org site

Wireless Application Protocol (WAP) and Wireless Markup Language (1)
Forecasted users of wireless services by 2001 - 530 million
Devices in use now and those coming in the future have multimedia capabilities: receiving/sending e-mail, accessing the Internet
Wireless Application Protocol - a standard for the presentation and delivery of wireless information and telephony on mobile phones and other wireless terminals
  – handset manufacturers that represent 90 percent of the world market support this standard
Wireless Markup Language (WML) - part of the standard, designed to describe information to be presented on small displays
WML documents can be accessed over the Internet using the standard HTTP protocol
  – traditional servers can be used for hosting WML documents

Simple Object Access Protocol (SOAP)
Support for Remote Procedure Call and messaging mechanisms over various protocols (for example, HTTP), implemented in XML
Describes conventions for the definition of:
  – method calls
  – method parameters
  – results of method calls
  – serialization mechanisms for encoding application-defined data types
Since SOAP messages can be transported over the HTTP protocol, the currently deployed Web infrastructure becomes one distributed computing platform (distributed objects can be placed on HTTP servers)
Current version - SOAP 1.1 (status: note), specification available at the http://www.w3.org site
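
To show what a method call expressed in XML can look like, here is a rough sketch that builds a SOAP 1.1 style envelope with Python's xml.etree.ElementTree. The envelope namespace is the one defined by SOAP 1.1; the GetPrice method, the partId parameter, the urn:example:parts namespace, and the build_call helper are all hypothetical. Nothing is sent over HTTP here.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:parts"  # hypothetical application namespace

# Ask the serializer to use the conventional "soap" prefix.
ET.register_namespace("soap", SOAP_ENV)

def build_call(method, **params):
    """Wrap a method call and its parameters in an Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        # Each parameter becomes a child element of the call element.
        ET.SubElement(call, name).text = str(value)
    return envelope

request = build_call("GetPrice", partId="p001")
print(ET.tostring(request, encoding="unicode"))
```

The resulting text could be carried as the body of an HTTP POST, which is what lets ordinary Web servers act as endpoints for remote calls.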

Support for XML in Web Browsers
Internet Explorer 5.0+
  – Extensible Markup Language
  – Extensible Stylesheet Language
  – Cascading Stylesheets
  – Document Object Model
  – Data Islands
Mozilla 5.0
  – Extensible Markup Language
  – Cascading Stylesheets
  – Document Object Model
  – Graphical user interface built using XUL (XML User Interface Language): users can provide their own user interface documents to customize the layout of the browser
Microbrowsers for portable devices
  – Wireless Markup Language

Support for XML on Server Side
Web servers can host XML documents
XML documents can be dynamically generated by servlets, JSP pages, and ASP pages
XML adapters allow translation from application-specific formats to XML
XML documents can be stored in databases for fast retrieval
Enterprise applications with XML processing functionality can be easily built using available XML parser components and XSL processors

XML Document and Database (1)
Part Name   Part ID   Price   InStock
window      001       $40     yes
muffler     002       $150    yes
door        003       $30     no
Information stored in a database

XML Document and Database (2)
<store>
  <part id="p001">
    <part-name>window</part-name>
    <price>40</price>
    <instock>yes</instock>
  </part>
  <part id="p002">
    <part-name>muffler</part-name>
    <price>150</price>
    <instock>yes</instock>
  </part>
</store>
The same information represented as an XML document
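
The step from database rows to the XML document can be sketched as follows, assuming Python's xml.etree.ElementTree. The ROWS list stands in for the result of a database query and uses the values from the parts table; the rows_to_xml function name and the tuple layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Stand-in for rows returned by a database query:
# (part name, part id, price in dollars, in stock?)
ROWS = [
    ("window", "001", 40, True),
    ("muffler", "002", 150, True),
    ("door", "003", 30, False),
]

def rows_to_xml(rows):
    """Build a <store> element with one <part> child per row."""
    store = ET.Element("store")
    for name, part_id, price, in_stock in rows:
        part = ET.SubElement(store, "part", id=f"p{part_id}")
        ET.SubElement(part, "part-name").text = name
        ET.SubElement(part, "price").text = str(price)
        ET.SubElement(part, "instock").text = "yes" if in_stock else "no"
    return store

store = rows_to_xml(ROWS)
print(ET.tostring(store, encoding="unicode"))
```

Each column becomes a child element and the key becomes an attribute; other mappings are possible, and the choice here simply follows the slide.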

Data Interchange
One of the most costly aspects of Enterprise Application Integration is the conversion of proprietary data formats to other data formats
XML - a new data interchange standard
Information handled by different applications and data sources can be converted into XML to provide a uniform data format
Using XML
  – applications can exchange data easily
  – application-specific data can be used on the Internet
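
As a rough illustration of XML as the uniform format, the sketch below converts two proprietary formats into the same XML form, which a receiving application then reads without knowing either source format. Both source formats (a pipe-delimited line and a key=value string) and all function names are invented for this example.

```python
import xml.etree.ElementTree as ET

def part_element(part_id, name, price):
    """Common XML form shared by both hypothetical source formats."""
    part = ET.Element("part", id=part_id)
    ET.SubElement(part, "part-name").text = name
    ET.SubElement(part, "price").text = str(price)
    return part

def from_pipe_format(line):
    # Hypothetical format A: "p001|window|40"
    part_id, name, price = line.split("|")
    return part_element(part_id, name, int(price))

def from_keyvalue_format(text):
    # Hypothetical format B: "id=p002;name=muffler;price=150"
    fields = dict(item.split("=", 1) for item in text.split(";"))
    return part_element(fields["id"], fields["name"], int(fields["price"]))

# Two applications contribute data in their own formats.
store = ET.Element("store")
store.append(from_pipe_format("p001|window|40"))
store.append(from_keyvalue_format("id=p002;name=muffler;price=150"))

# The receiving application only needs to understand the XML form.
wire = ET.tostring(store, encoding="unicode")
received = ET.fromstring(wire)
prices = {p.get("id"): int(p.findtext("price")) for p in received.findall("part")}
print(prices)
```

With N applications, each one needs a single converter to and from XML instead of a converter for every other application's format.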