fhtfy

Embed Size (px)

Citation preview

  • 7/30/2019 fhtfy

    1/20

    0

    Computer Science E-75Building Dynamic Websites

    Harvard Extension Schoolhttp://www.cs75.net/

    Lecture 3: XML

    David J. Malan

    [email protected]

  • 7/30/2019 fhtfy

    2/20

    1

  • 7/30/2019 fhtfy

    3/20

    2

    XML

    MalanDavid

    20050621

    29.95

    Harry Potter and the

    Order of the Phoenix

    J.K. Rowling

    element

    textstart tag

    attribute

    end tag

    child element

    value

  • 7/30/2019 fhtfy

    4/20

    3

    Extensibility

    Malan

    David

    J

    Oxford Street

    33

    Cambridge

    MA

    20050621

    ...

    extend structurewithout breakingexisting data and

    applications

  • 7/30/2019 fhtfy

    5/20

    4

    Jim Bob

    graduate

    Computer Science & Music

    Jim Bob!

    Hi my name is jim. I look like

    ]]>

    ...

  • 7/30/2019 fhtfy

    6/20

    5

    XML Declaration Optional Must appear at the very top of an XML document Used to indicate the version of the specification to

    which the document conforms (and whether the

    document is standalone)

    Used to indicate the character encoding of thedocument

    UTF-8 UTF-16 iso-8859-1

  • 7/30/2019 fhtfy

    7/20

    6

    Elements Main structure in an XML document Only one root element allowed Start Tag

    Allows specification of zero or more attributes

    End Tag Must match name, case, and nesting level of start tag

    Name must start with letter or underscore and can containonly letters, numbers, hyphens, periods, and underscores

    Jim Bob

  • 7/30/2019 fhtfy

    8/20

    7

    Content Models Element Content

    ...

    Parsed Character Data (aka PCDATA, aka Text)Jim Bob

    Mixed ContentJim J Bob

    No Content

  • 7/30/2019 fhtfy

    9/20

    8

    Attributes Name

    Must start with letter or underscore and can contain only letters,numbers, hyphens, periods, and underscores

    Value Can be of several types, but is almost always a string Must be quoted

    title="Lecture 2"match='item="baseball bat"'

    Cannot contain< or&(by itself)

  • 7/30/2019 fhtfy

    10/20

    9

    PCDATA Text that appears as the content of an element Can reference entities Cannot contain< or&(by itself)

    Jim Bob

  • 7/30/2019 fhtfy

    11/20

    10

    Entities Used to escape content or include content that is hard to

    enter or repeated frequently

    Somewhat like macros Five pre-defined entities

    & < > ' " Character entities can refer to a single character by unicode

    number e.g., is

    Must be declared to be legal

    Cannot refer to themselves

    &

  • 7/30/2019 fhtfy

    12/20

    11

    CDATA Parsed in one chunk by the XML parser Data within is not checked for subelements, entities, etc. Allows you to include badly formed markup or character data

    that would cause a problem during parsing

    Example Including HTML tags in an XML document

    Jim Bob! ... ]]>

  • 7/30/2019 fhtfy

    13/20

    12

    Comments

    Can include any text inside a comment to make it easier forhuman readers to understand your document

    Generally not available to applications reading the document Always begin with Cannot contain --

  • 7/30/2019 fhtfy

    14/20

    13

    SimpleXMLhttp://us2.php.net/simplexml

  • 7/30/2019 fhtfy

    15/20

    14

    DOM

    Jim Bob

    graduate

    Document

    Comment Element: students

    Element: name Element: status

    Element: student

    Text: Jim Bob Text: graduate

    Attr: idText: \n\t Text: \n

    Text: \n\t\t Text: \n\tText: \n\t\t

  • 7/30/2019 fhtfy

    16/20

    15

    RSShttp://cyber.law.harvard.edu/rss/rss.html

    Image from wikipedia.org.

  • 7/30/2019 fhtfy

    17/20

    16

    RSS

    [...]

  • 7/30/2019 fhtfy

    18/20

    17

    XPath

    /child::lectures/child::lecture[@number='0']

    step axis nodetest

    predicate

    location path

  • 7/30/2019 fhtfy

    19/20

    18

    PizzaML

    Image from junkfoodnews.net.

  • 7/30/2019 fhtfy

    20/20

    19

    Computer Science E-75Building Dynamic Websites

    Harvard Extension Schoolhttp://www.cs75.net/

    Lecture 3: XML

    David J. Malan

    [email protected]