ECA 228 Internet/Intranet Design I

Intro to XML

HTML

markup language
very loose standards
browsers adjust for non-standard HTML
disadvantages:
  – loose standards
  – fixed markup
  – maintenance

XML

markup language / meta language
consists only of data and markup

HTML was designed to display data and focus on how it looks.

XML was designed to describe data, and focus on what it is.

HTML is about displaying information. XML is about describing information.

What is XML?

XML
  – stands for EXtensible Markup Language
  – is a markup language much like HTML
  – was designed to describe data
  – tags are not predefined; you define your own tags
  – uses a DTD or XML Schema to describe the data
  – is designed to be self-descriptive

HTML Example

<body>
  <p>Bubba</p>
  <p>Jenna</p>
  <p>Reminder</p>
  <p>Remember, we're looking for mushrooms tomorrow.</p>
</body>

XML Example

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Bubba</to>
  <from>Jenna</from>
  <heading>Reminder</heading>
  <message>We're looking for mushrooms tomorrow.</message>
</note>

HTML is about displaying information. XML is about describing information.

XML uses

XML was designed to store, carry, and exchange data
XML was not designed to display data in web pages
  – HTML to structure the document
  – CSS to add presentational elements
  – XML to provide content
XML is used to exchange data between otherwise incompatible applications

XML Example

1st line is the XML declaration
2nd line describes the root element
next 4 lines are child elements of the root
last line defines the end of the root element

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Bubba</to>
  <from>Jenna</from>
  <heading>Reminder</heading>
  <message>We're looking for mushrooms tomorrow.</message>
</note>
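The structure described above can be checked with a parser. What follows is a minimal sketch, assuming Python and its standard `xml.etree.ElementTree` module (neither is part of the original slides); it parses the note and lists the root element and its children.

```python
import xml.etree.ElementTree as ET

# The note document from the slide, as bytes so the declared encoding applies.
NOTE_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Bubba</to>
  <from>Jenna</from>
  <heading>Reminder</heading>
  <message>We're looking for mushrooms tomorrow.</message>
</note>"""

root = ET.fromstring(NOTE_XML)               # the declaration is consumed by the parser
child_tags = [child.tag for child in root]   # direct children, in document order

print(root.tag)                   # name of the root element
print(child_tags)                 # names of the four child elements
print(root.find("message").text)  # data is looked up by what it is, not how it looks
```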

XML Syntax

All XML elements must have a closing tag
  – in HTML the following is legal

<p>This is a paragraph.

  – in XML it is illegal to omit the closing tag

<p>This is a paragraph.</p>

  – empty elements must use either an opening/closing tag pair or the self-closing form

<br />
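A quick way to see the closing-tag rule in action is to hand the snippets to an XML parser. This is only a sketch, assuming Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p>This is a paragraph."))      # closing tag omitted
print(is_well_formed("<p>This is a paragraph.</p>"))  # closing tag present
print(is_well_formed("<br />"))                       # self-closing empty element
print(is_well_formed("<br></br>"))                    # empty element as a tag pair
```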

XML Syntax cont …

XML tags are case sensitive

<Message>This is incorrect</message>

<message>This is correct</message>

XML Syntax cont …

XML tags must be nested properly

<b><i>This is incorrect</b></i>

<b><i>This is correct</i></b>
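The case-sensitivity and nesting rules can be tried out the same way. The sketch below assumes Python's standard `xml.etree.ElementTree`; `parse_error` is a hypothetical helper that returns the parser's complaint, or `None` when the snippet is well-formed.

```python
import xml.etree.ElementTree as ET

def parse_error(text):
    """Return the parser's error message, or None if `text` is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None

samples = [
    "<Message>This is incorrect</message>",  # start and end tags differ in case
    "<message>This is correct</message>",
    "<b><i>This is incorrect</b></i>",       # outer element closed before inner
    "<b><i>This is correct</i></b>",
]
for sample in samples:
    print(sample, "->", parse_error(sample) or "well-formed")
```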

XML Syntax cont …

XML documents must have a root element
  – root element contains all other elements
  – other elements may have child elements nested properly within them

<root>
  <child>
    <subchild> . . . </subchild>
  </child>
</root>
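As a rough illustration of the single-root rule, the sketch below (assuming Python's standard `xml.etree.ElementTree`; the `outline` helper and the `first`/`second` element names are invented for the example) walks the tree from the root and then shows that a document with two top-level elements is rejected.

```python
import xml.etree.ElementTree as ET

def outline(elem, depth=0):
    """Return one indented line per element, visiting children depth-first."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring("<root><child><subchild>...</subchild></child></root>"))
print("\n".join(tree_lines))

# Two sibling elements at the top level means there is no single root.
try:
    ET.fromstring("<first>...</first><second>...</second>")
    two_roots_ok = True
except ET.ParseError:
    two_roots_ok = False
print(two_roots_ok)
```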

XML Syntax cont …

Attribute values must always be quoted
  – elements may contain attributes in name=value pairs

<?xml version="1.0" encoding="ISO-8859-1"?>
<note date="4/15/2004">
  <to>Bubba</to>
  <from>Jenna</from>
  <heading>Reminder</heading>
  <message>We're looking for mushrooms tomorrow.</message>
</note>
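Once parsed, attributes are available as name/value pairs. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` and a shortened version of the note:

```python
import xml.etree.ElementTree as ET

note = ET.fromstring('<note date="4/15/2004"><to>Bubba</to></note>')
note_date = note.get("date")   # look up one attribute by name
all_attrs = dict(note.attrib)  # every attribute as a name -> value mapping
print(note_date, all_attrs)

# The same tag with the quotes left off is not well-formed.
try:
    ET.fromstring("<note date=4/15/2004><to>Bubba</to></note>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False
print(unquoted_ok)
```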

XML Syntax

With XML, white space is preserved
  – HTML collapses consecutive white space characters into a single space

Hi there,          my name is Bubba.

renders as

Hi there, my name is Bubba.

  – XML preserves all white space within element content
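To make the difference concrete, here is a small sketch assuming Python's standard `xml.etree.ElementTree`. The `greeting` element name is invented, and the `split`/`join` line only approximates what an HTML renderer does with runs of white space.

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring("<greeting>Hi there,     my name is Bubba.</greeting>")

preserved = elem.text                    # the XML parser keeps all five spaces
collapsed = " ".join(elem.text.split())  # roughly what an HTML renderer would show

print(repr(preserved))
print(repr(collapsed))
```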

XML Syntax cont …

XML converts CR / LF to LF
  – a new line is always stored as an LF ( line feed )
      Windows apps store new lines as CR LF
      Unix apps store new lines as LF
      classic Mac OS apps store new lines as CR ( Carriage Return )
  – XML converts all to LF
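The normalization happens inside the parser, before the application sees the text. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` (the `lines` element is made up for the example):

```python
import xml.etree.ElementTree as ET

# One document mixing all three line-ending conventions.
raw = b"<lines>one\r\ntwo\rthree\nfour</lines>"

text = ET.fromstring(raw).text
print(repr(text))  # every line break arrives as a single LF
```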

XML Syntax cont …

XML comments
  – XML comments are the same as in HTML

<!-- This is an XML comment -->

XML Syntax cont …

XML special symbols
  – to include an ampersand, a less than sign, or a greater than sign as data, use character entities

CODE      REPRESENTS
&amp;     ampersand
&lt;      less than sign
&gt;      greater than sign
&apos;    apostrophe
&quot;    quote
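In practice the substitution is usually done by a library rather than by hand. The sketch below assumes Python's standard `xml.sax.saxutils.escape` and `xml.etree.ElementTree`; the sample string and the `message` element are made up. By default `escape` handles only the ampersand and the two angle brackets, so the apostrophe and quote are passed in as extra entities.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'Fish & Chips < "Bubba\'s" >'

# Replace each special symbol with its entity code.
escaped = escape(raw, {"'": "&apos;", '"': "&quot;"})
print(escaped)

# Parsing the escaped text gives back the original data.
round_trip = ET.fromstring("<message>" + escaped + "</message>").text
print(round_trip)
```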

XML Elements

XML elements can contain different types of content
  – element: contains other XML elements
  – text: contains only text
  – mixed: contains both text and other elements
  – empty: contains no content, such as the HTML <br /> tag
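The four content types can be told apart programmatically. This is a sketch only, assuming Python's standard `xml.etree.ElementTree`; `content_type` is a hypothetical helper, and treating white-space-only text as "no text" is a simplifying assumption of the example.

```python
import xml.etree.ElementTree as ET

def content_type(elem):
    """Classify an element as 'element', 'text', 'mixed', or 'empty'."""
    has_children = len(elem) > 0
    # Text can sit before the first child (elem.text) or after any child (child.tail).
    has_text = bool((elem.text or "").strip()) or any(
        (child.tail or "").strip() for child in elem
    )
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "text"
    return "empty"

note = ET.fromstring(
    "<note><to>Bubba</to><message>Bring <b>two</b> baskets</message><br /></note>"
)
kinds = [content_type(note)] + [content_type(child) for child in note]
print(kinds)
```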

Naming XML Elements

Certain rules must be followed when naming elements
  – can contain letters, numbers, and some characters (underscore, hyphen, period, colon); however, caution should be used when using hyphen, period, or colon
  – must start with a letter, underscore, or colon (namespace), but not with a number, hyphen, or period
  – may not contain spaces
  – may not begin with the letters xml (in any combination of upper and lower case)
  – is case sensitive
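The rules above translate fairly directly into a check. The following is a simplified sketch in Python: it only considers ASCII letters, whereas real XML names may also contain many non-ASCII characters, and `is_valid_name` is a made-up helper, not a library function.

```python
import re

# First character: letter, underscore, or colon.
# Remaining characters: letters, digits, underscore, period, colon, hyphen.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:-]*")

def is_valid_name(name):
    """ASCII-only approximation of the element naming rules."""
    if name.lower().startswith("xml"):  # reserved in any mix of upper/lower case
        return False
    return NAME_RE.fullmatch(name) is not None

for candidate in ["first_name", "first-name", "_id", "1st_name", "first name", "XmlData"]:
    print(candidate, is_valid_name(candidate))
```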

Naming XML Elements cont …

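separate words are often separated by an underscore

first_name

use caution with a hyphen, period, or colon: some software may read a hyphen as subtraction or a period as a property accessor, and the colon is reserved for namespaces

first.name

first-name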

Naming XML Elements cont …

names can be as long as necessary
keep names descriptive but simple

the_title_of_the_book      ( longer than it needs to be )

book_title                 ( descriptive but simple )

XML document often has a corresponding database
  – element names may match field names

XML Attributes

XML elements may contain attributes
  – attributes provide information which is not necessarily part of the data
attributes are used in the opening tag of an element
attribute values must always be quoted, with either double or single quotes

<student gender="female">

<student gender='female'>

XML Attributes cont …

if an attribute value contains double quotes, place single quotes around the value

<lumberjack name='Jacques "Bubba" Renault'>

if an attribute uses single quotes or an apostrophe, place double quotes around the value

<movie title="Sophie's Choice">
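When attributes are generated by a program, the choice of quote style can be left to a library. The sketch below assumes Python's standard `xml.sax.saxutils.quoteattr`, which returns the value already wrapped in whichever quote character does not clash with its contents.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

with_apostrophe = quoteattr("Sophie's Choice")           # wrapped in double quotes
with_quotes = quoteattr('Jacques "Bubba" Renault')       # wrapped in single quotes
print(with_apostrophe)
print(with_quotes)

# The quoted value can be dropped straight into a tag.
movie = ET.fromstring("<movie title=" + with_apostrophe + " />")
lumberjack = ET.fromstring("<lumberjack name=" + with_quotes + " />")
print(movie.get("title"), "/", lumberjack.get("name"))
```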

XML Attributes cont …

in many instances, attributes can be written as a child element

<movie title="Sophie's Choice">

<movie>
  <title>Sophie's Choice</title>
</movie>

XML Attributes cont …

if the information being provided by the attribute seems like data, use it as an element

if the information clearly is not data, but can be used to classify or categorize the element, use an attribute

<meeting_minutes date="12/30/2003">

<meeting_minutes>
  <date>12/30/2003</date>
</meeting_minutes>

XML Attributes cont …

meta data ( data about data ) should be stored as an attribute

actual data should be stored as an element

<meeting_minutes id="mtg_12_2003">
  <date>
    <day>30</day>
    <month>12</month>
    <year>2003</year>
  </date>
</meeting_minutes>
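One practical payoff of storing the date as child elements is that a program can read each part without parsing a string like 12/30/2003. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` and `datetime`:

```python
import xml.etree.ElementTree as ET
from datetime import date

minutes = ET.fromstring(
    '<meeting_minutes id="mtg_12_2003">'
    "<date><day>30</day><month>12</month><year>2003</year></date>"
    "</meeting_minutes>"
)

meeting_id = minutes.get("id")  # meta data: read from the attribute
meeting_date = date(            # actual data: read from the child elements
    int(minutes.findtext("date/year")),
    int(minutes.findtext("date/month")),
    int(minutes.findtext("date/day")),
)
print(meeting_id, meeting_date.isoformat())
```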

Validation

well-formed
  – a document that adheres to correct XML syntax
valid
  – a document that is well-formed and conforms to a DTD
DTD – Document Type Definition
  – defines a structure with a list of legal elements
XML Schema – alternative to DTD

Namespace

combined XML documents risk name conflicts

<table>
  <coffee> … </coffee>
  <dining> … </dining>
  <patio> … </patio>
</table>

<table>
  <tr>
    <td> … </td>
    <td> … </td>
    <td> … </td>
  </tr>
</table>

Namespace cont …

create a prefix to distinguish similarly named elements

<michael:table>
  <michael:coffee> … </michael:coffee>
</michael:table>

<other:table>
  <other:tr>
    <other:td> … </other:td>
  </other:tr>
</other:table>

Namespace cont …

XML namespace
  – takes the form of a URL
  – based upon a unique domain name
  – namespace must have a unique & persistent name
      begin with domain name
      add descriptive information, as if it is a path in the URL

http://www.justustwo.com/ns/tables

Namespace cont …

the URL does not necessarily point to a particular document
a URL is used because it is unique
a namespace must be declared before it can be used
namespace declaration has a specific structure

xmlns:namespace_prefix="namespace_url"

Namespace cont …

prefix the XML elements with the namespace prefix

<michael:table xmlns:michael="http://www.justustwo.com/ns/tables">
  <michael:coffee> … </michael:coffee>
  <michael:dining> … </michael:dining>
</michael:table>

to declare a default namespace, omit the prefix

<table xmlns="http://www.justustwo.com/ns/tables">
  <coffee> … </coffee>
  <dining> … </dining>
</table>
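A namespace-aware parser treats the prefixed and default forms as the same thing, because what identifies an element is the namespace URL plus the local name, not the prefix. A minimal sketch, assuming Python's standard `xml.etree.ElementTree`, which writes the qualified name as `{url}name`:

```python
import xml.etree.ElementTree as ET

NS = "http://www.justustwo.com/ns/tables"

prefixed = ET.fromstring(
    '<michael:table xmlns:michael="' + NS + '"><michael:coffee>...</michael:coffee></michael:table>'
)
default = ET.fromstring(
    '<table xmlns="' + NS + '"><coffee>...</coffee></table>'
)

print(prefixed.tag)        # the prefix is gone; the URL is part of the name
print(default.tag)         # same qualified name as above
print(prefixed[0].tag == default[0].tag)
```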

Namespace cont …

override a default namespace by specifying a prefix

<michael:table xmlns:michael="http://www.justustwo.com/ns/tables">
  <T:table xmlns:T="http://www.the_other_url.com/ns/html_table">
    <T:tr>
      <T:td><michael:coffee> … </michael:coffee></T:td>
      <T:td><michael:dining> … </michael:dining></T:td>
      <T:td><michael:patio> … </michael:patio></T:td>
    </T:tr>
  </T:table>
</michael:table>
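With both namespaces in one document, a program can ask for one kind of table without picking up the other. The sketch below assumes Python's standard `xml.etree.ElementTree`; the lookup prefixes `furn` and `html` are arbitrary names chosen for the query and do not have to match the prefixes used in the document.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<michael:table xmlns:michael="http://www.justustwo.com/ns/tables">'
    '<T:table xmlns:T="http://www.the_other_url.com/ns/html_table">'
    "<T:tr>"
    "<T:td><michael:coffee>...</michael:coffee></T:td>"
    "<T:td><michael:dining>...</michael:dining></T:td>"
    "<T:td><michael:patio>...</michael:patio></T:td>"
    "</T:tr></T:table></michael:table>"
)

# Query-side prefixes, mapped to the same namespace URLs as in the document.
ns = {
    "furn": "http://www.justustwo.com/ns/tables",
    "html": "http://www.the_other_url.com/ns/html_table",
}

root = ET.fromstring(DOC)
html_tables = root.findall(".//html:table", ns)  # only the layout table
cells = root.findall(".//html:td", ns)
furniture = [cell[0].tag.split("}")[1] for cell in cells]  # local names of the furniture elements

print(len(html_tables), len(cells), furniture)
```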

Namespace cont …

3 reasons not to use the URL itself as a prefix
1. URLs make use of the colon
2. very difficult to type and read
3. URLs can contain special characters not permitted in XML names

the following is not legal XML:

<http://www.justustwo.com/ns/tables:table>
  <http://www.justustwo.com/ns/tables:coffee> … </http://www.justustwo.com/ns/tables:coffee>
  <http://www.justustwo.com/ns/tables:dining> … </http://www.justustwo.com/ns/tables:dining>
  <http://www.justustwo.com/ns/tables:patio> … </http://www.justustwo.com/ns/tables:patio>
</http://www.justustwo.com/ns/tables:table>

Formatting XML

although it is possible to format XML with CSS, it is not recommended

rather than CSS, the preferred way to format XML is with XSL ( EXtensible Stylesheet Language )

XSL can be used to transform XML into HTML before it is sent to the browser

XML Data Island

Microsoft proprietary technology not supported by the W3C specification
XML Data Islands use an unofficial <xml> tagset
Data Islands can embed XML directly into an HTML document

XML Data Island cont …

note that the <xml> tag is an HTML element, not an XML element

<xml id='note'>
  <note>
    <to>Bubba</to>
    <from>Jenna</from>
    <heading>Reminder</heading>
    <message>Remember, we're looking for mushrooms.</message>
  </note>
</xml>

XML Data Island cont …

if the Data Island is a separate document, it can be embedded with a reference to the document

<xml id='note' src="note.xml"></xml>

Data Binding

Microsoft proprietary technology not supported by the W3C
Data Islands can be bound to HTML elements, such as an HTML table
  – load a Data Island from an external XML file
  – bind the table to the Data Island with a data source attribute ( datasrc )
  – bind the table elements to the XML data with a data field attribute ( datafld ) inside a span tag, inside each table cell

Data Binding

note that the datafld matches the element name

<html>
<body>
  <xml id="cdcat" src="cd_catalog.xml"></xml>
  <table border="1" datasrc="#cdcat">
    <tr>
      <td><span datafld="ARTIST"></span></td>
      <td><span datafld="TITLE"></span></td>
    </tr>
  </table>
</body>
</html>
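Conceptually, the binding repeats the table row once per record in the XML file and fills each cell from the element whose name matches its datafld. The Python sketch below only imitates that idea outside the browser and is not how the browser implements it. The layout of cd_catalog.xml (a CATALOG root holding CD records with ARTIST and TITLE children) and the sample values are assumptions, since the slides do not show the file, and `bind_rows` is a made-up helper.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed shape of cd_catalog.xml; the records are placeholder data.
CATALOG_XML = """<CATALOG>
  <CD><TITLE>Title One</TITLE><ARTIST>Artist One</ARTIST></CD>
  <CD><TITLE>Title Two</TITLE><ARTIST>Artist Two</ARTIST></CD>
</CATALOG>"""

def bind_rows(xml_text, fields):
    """Build one HTML table row per record, one cell per field name."""
    rows = []
    for record in ET.fromstring(xml_text):
        cells = "".join(
            "<td>" + escape(record.findtext(field, default="")) + "</td>"
            for field in fields
        )
        rows.append("<tr>" + cells + "</tr>")
    return rows

rows = bind_rows(CATALOG_XML, ["ARTIST", "TITLE"])  # field names match element names
print("\n".join(rows))
```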