2 WHAT is XML

  • 8/7/2019 2 WHAT is XML

    WHAT is XML

    Rakesh Kumar Rai

    Lecturer, I.T. Dept.
    G.C.E.T., Gr. Noida

    What is XML

    XML is a text-based markup language that is fast becoming the standard for data interchange on the Web. As with HTML, you identify data using tags (identifiers enclosed in angle brackets, like this: <...>). Collectively, the tags are known as "markup".

    What is XML cont.

    But unlike HTML, XML tags identify the data, rather than specifying how to display it. Where an HTML tag says something like "display this data in bold font" (<b>...</b>), an XML tag acts like a field name in your program. It puts a label on a piece of data that identifies it (for example: <message>...</message>).

    Example Detail

    Throughout this tutorial, we use boldface text to highlight things we want to bring to your attention. XML does not require anything to be in bold! The tags in this example identify the message as a whole, the destination and sender addresses, the subject, and the text of the message. As in HTML, each start tag has a matching end tag. The data between a tag and its matching end tag defines an element of the XML data. Note, too, that the content of each inner tag is entirely contained within the scope of the <message>..</message> tag. It is this ability for one tag to contain others that gives XML its ability to represent hierarchical data structures.

    Example Detail cont.

    Once again, as with HTML, white space is essentially irrelevant, so you can format the data for readability and yet still process it easily with a program. Unlike HTML, however, in XML you could easily search a data set for messages containing "cool" in the subject, because the XML tags identify the content of the data, rather than specifying its representation.
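    The search described above can be sketched in a few lines of Java. This is only an illustration: the element names (messages, message, to, from, subject, text) and the sample data are assumptions made for the sketch, since the deck's own sample message is not reproduced here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SubjectSearch {
    // Hypothetical sample data; the element names are assumed for illustration.
    static final String XML =
          "<messages>"
        + "<message><to>you@example.com</to><from>me@example.com</from>"
        + "<subject>XML is really cool</subject><text>Hello</text></message>"
        + "<message><to>you@example.com</to><from>me@example.com</from>"
        + "<subject>Lunch</subject><text>Noon?</text></message>"
        + "</messages>";

    /** Returns the text of every subject element that contains the keyword. */
    static List<String> findSubjects(String xml, String keyword) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        List<String> hits = new ArrayList<>();
        // Because the tags label the data, we can ask for the subjects directly.
        NodeList subjects = doc.getElementsByTagName("subject");
        for (int i = 0; i < subjects.getLength(); i++) {
            String text = subjects.item(i).getTextContent();
            if (text.contains(keyword)) {
                hits.add(text);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(findSubjects(XML, "cool"));
    }
}
```

    The point of the sketch is that the program never inspects formatting: it selects data by tag name, which is what an HTML page marked up only for display would not allow.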

    The XML Prolog

    An XML file should start with a prolog. The minimal prolog declares that the given document is XML, for example <?xml version="1.0"?>.

    version specifies the version of XML used in the document. It is not optional.

    encoding identifies the character set used to encode the data. ISO-8859-1 is the Western European and English language character set.
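    As a rough illustration, the declaration attributes covered in this part of the deck (version, encoding, and standalone) can be read back from a parsed document through the DOM Document interface. The sample document and the class name PrologDemo are assumptions made for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class PrologDemo {
    // Hypothetical one-element document with a full XML declaration.
    static final String XML =
          "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" standalone=\"yes\"?>"
        + "<greeting>hello</greeting>";

    static Document parse(String xml) throws Exception {
        // Feed the parser bytes in the same encoding the declaration names.
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println("version    = " + doc.getXmlVersion());
        System.out.println("encoding   = " + doc.getXmlEncoding());
        System.out.println("standalone = " + doc.getXmlStandalone());
    }
}
```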

    The XML Prolog cont.

    standalone tells whether or not this document references an external entity or an external data type specification (see below). If there are no external references, then "yes" is appropriate.

    DTD (Document Type Definition)

    A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes.

    A DTD can be declared inline inside an XML document, or as an external reference.

    DTD cont.

    The DTD specification is actually part of the XML specification, rather than a separate entity. On the other hand, it is optional: you can write an XML document without it. A DTD specifies the kinds of tags that can be included in your XML document, and the valid arrangements of those tags. You can use the DTD to make sure you don't create an invalid XML structure. You can also use it to make sure that the XML structure you are reading (or that got sent over the net) is indeed valid.

    DTD cont.

    Unfortunately, it is difficult to specify a DTD for a complex document in such a way that it prevents all invalid combinations. The DTD can exist at the front of the document, as part of the prolog. It can also exist as a separate entity.

    DTD (Internal DTD Declaration)

    If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:

    <!DOCTYPE root-element [element-declarations]>

    DTD Example (Internal DTD declaration)
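    A minimal sketch of an internal DTD in use, assuming a hypothetical note document (the element names note, to, from, and body are invented for this example). The DTD sits inside the DOCTYPE declaration, and a validating parser reports documents that break it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class InternalDtdDemo {
    // Hypothetical internal DTD: a note holds exactly one to, from, and body.
    static final String DOCTYPE =
          "<!DOCTYPE note ["
        + "<!ELEMENT note (to,from,body)>"
        + "<!ELEMENT to (#PCDATA)>"
        + "<!ELEMENT from (#PCDATA)>"
        + "<!ELEMENT body (#PCDATA)>"
        + "]>";

    /** Parses with validation switched on and returns the validation errors found. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { problems.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        String valid = DOCTYPE + "<note><to>A</to><from>B</from><body>Hi</body></note>";
        String invalid = DOCTYPE + "<note><to>A</to></note>";
        System.out.println("valid document problems:   " + validate(valid).size());
        System.out.println("invalid document problems: " + validate(invalid).size());
    }
}
```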

    DTD Example (External DTD declaration)

    There will be two files:

    1. .dtd file (contains the DTD declarations)
    2. .xml file (contains the reference to the .dtd file)

    DTD file

    <!ELEMENT games (cricket,hockey,boxing,shooting)>

    XML file

    game of 11 players
    game of 11 players
    game of 11 players
    game of 11 players
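    The two-file arrangement can be sketched as follows. Everything beyond the games content model is an assumption: the file name games.dtd, the #PCDATA declarations for the four child elements, and the DOCTYPE line are made up for illustration. To keep the sketch self-contained, an EntityResolver hands the parser the DTD text from a string instead of reading a real file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ExternalDtdDemo {
    // Stand-in for the contents of a hypothetical games.dtd file.
    static final String GAMES_DTD =
          "<!ELEMENT games (cricket,hockey,boxing,shooting)>"
        + "<!ELEMENT cricket (#PCDATA)>"
        + "<!ELEMENT hockey (#PCDATA)>"
        + "<!ELEMENT boxing (#PCDATA)>"
        + "<!ELEMENT shooting (#PCDATA)>";

    // The XML file refers to the DTD by system identifier (assumed file name).
    static final String DOCTYPE = "<!DOCTYPE games SYSTEM \"games.dtd\">";

    /** Validates the document against the external DTD and counts the errors. */
    static int countErrors(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                if (systemId != null && systemId.endsWith("games.dtd")) {
                    return new InputSource(new StringReader(GAMES_DTD));
                }
                return null; // fall back to the parser's default lookup
            }
        });
        final int[] errors = {0};
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) { errors[0]++; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors[0];
    }

    public static void main(String[] args) throws Exception {
        String valid = DOCTYPE + "<games><cricket>game of 11 players</cricket>"
            + "<hockey>game of 11 players</hockey><boxing>game of 11 players</boxing>"
            + "<shooting>game of 11 players</shooting></games>";
        String invalid = DOCTYPE + "<games><cricket>game of 11 players</cricket></games>";
        System.out.println("valid:   " + countErrors(valid) + " error(s)");
        System.out.println("invalid: " + countErrors(invalid) + " error(s)");
    }
}
```

    With real files, the resolver is unnecessary: the parser loads games.dtd relative to the location of the XML file.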

    SAX

    You can also think of this standard as the "serial access" protocol for XML. This is the fast-to-execute mechanism you would use to read and write XML data in a server, for example. This is also called an event-driven protocol, because the technique is to register your handler with a SAX parser, after which the parser invokes your callback methods whenever it sees a new XML tag (or encounters an error, or wants to tell you anything else).

    The Simple API for XML (SAX) APIs

    The basic outline of the SAX parsing APIs is as follows. To start the process, an instance of the SAXParserFactory class is used to generate an instance of the parser.

    The parser wraps a SAX reader object (an org.xml.sax.XMLReader, called SAXReader in these slides). When the parser's parse() method is invoked, the reader invokes one of several callback methods implemented in the application.

    Those methods are defined by the interfaces ContentHandler, ErrorHandler, DTDHandler, and EntityResolver.
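    The flow outlined above (factory, then parser, then callbacks) might look like this in a small program. The handler simply logs each event it receives; the class name SaxEventsDemo and the sample document are assumptions made for the sketch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventsDemo {
    /** Parses the XML and returns the sequence of SAX events seen by the handler. */
    static List<String> events(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance(); // step 1: factory
        SAXParser parser = factory.newSAXParser();                 // step 2: parser
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {            // step 3: callbacks
            @Override
            public void startDocument() { log.add("startDocument"); }
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start:" + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
            @Override
            public void endDocument() { log.add("endDocument"); }
        };
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<item><id>42</id></item>")) {
            System.out.println(event);
        }
    }
}
```

    Nothing is kept in memory except what the handler chooses to store, which is why this style suits serial, one-pass processing.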

    What is parsing

    Parsing is the mechanism used to read an XML document. The parser is also responsible for checking well-formedness and correctness.

    JAXP: Java API for XML Parsing

    It provides a common interface for creating and using the standard SAX and DOM APIs in Java, regardless of which vendor's implementation is actually being used.

    The Simple API for XML (SAX) APIs cont.

    1. SAXParserFactory
    A SAXParserFactory object creates an instance of the parser determined by the system property javax.xml.parsers.SAXParserFactory.

    2. SAXParser
    The SAXParser class defines several kinds of parse() methods. In general, you pass an XML data source and a DefaultHandler object to the parser, which processes the XML and invokes the appropriate methods in the handler object.

    The Simple API for XML (SAX) APIs cont.

    3. SAXReader
    The SAXParser wraps a SAX reader (an org.xml.sax.XMLReader). Typically, you don't care about that, but every once in a while you need to get hold of it using SAXParser's getXMLReader(), so you can configure it. It is the reader that carries on the conversation with the SAX event handlers you define.

    The Simple API for XML (SAX) APIs cont.

    DefaultHandler
    DefaultHandler implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces (with null methods), so you can override only the ones you're interested in.

    ContentHandler
    The methods startDocument and endDocument are invoked at the beginning and end of the document, and startElement and endElement are invoked when an XML tag is recognized. This interface also defines the methods characters and processingInstruction, which are invoked when the parser encounters the text in an XML element or an inline processing instruction, respectively.

    The Simple API for XML (SAX) APIs cont.

    ErrorHandler
    Methods error, fatalError, and warning are invoked in response to various parsing errors. The default error handler throws an exception for fatal errors and ignores other errors (including validation errors). That's one reason you need to know something about the SAX parser, even if you are using the DOM. Sometimes, the application may be able to recover from a validation error. Other times, it may need to generate an exception. To ensure the correct handling, you'll need to supply your own error handler to the parser.
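    The difference between the default behavior and a custom error handler can be sketched like this. The invalid sample document and the class names are assumptions for illustration; the sketch relies only on DefaultHandler leaving error() empty, as described above.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class StrictErrorsDemo {
    // Hypothetical document: well-formed, but it violates its own DTD.
    static final String INVALID_XML =
          "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>"
        + "<note><from>B</from></note>";

    /** A handler that turns validation errors into exceptions. */
    static class StrictHandler extends DefaultHandler {
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Returns true if the validating parse finished without throwing. */
    static boolean parsesCleanly(String xml, DefaultHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // DefaultHandler.error() does nothing, so the validation error is ignored.
        System.out.println("default handler: " + parsesCleanly(INVALID_XML, new DefaultHandler()));
        // StrictHandler rethrows the same error, so parsing stops.
        System.out.println("strict handler:  " + parsesCleanly(INVALID_XML, new StrictHandler()));
    }
}
```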

    The Simple API for XML (SAX) APIs cont.

    DTDHandler
    Defines methods you will generally never be called upon to use. Used when processing a DTD to recognize and act on declarations for an unparsed entity.

    EntityResolver
    The resolveEntity method is invoked when the parser must identify data identified by a URI. In most cases, a URI is simply a URL, which specifies the location of a document, but in some cases the document may be identified by a URN: a public identifier, or name, that is unique in the web space. The public identifier may be specified in addition to the URL. The EntityResolver can then use the public identifier instead of the URL to find the document, for example to access a local copy of the document if one exists.

    Packages for SAX

    1. org.xml.sax
    Defines the SAX interfaces. The name "org.xml" is the package prefix that was settled on by the group that defined the SAX API.

    2. org.xml.sax.ext
    Defines SAX extensions that are used when doing more sophisticated SAX processing, for example, to process a document type definition (DTD) or to see the detailed syntax for a file.

    Packages for SAX cont.

    3. org.xml.sax.helpers
    Contains helper classes that make it easier to use SAX, for example, by defining a default handler that has null methods for all of the interfaces, so you only need to override the ones you actually want to implement.

    Packages for SAX cont.

    4. javax.xml.parsers
    Defines the SAXParserFactory class, which returns the SAXParser. Also defines exception classes for reporting errors.

    Example of SAX (XMLTest.java)

        import java.io.File;
        import java.util.Hashtable;
        import javax.xml.parsers.SAXParser;
        import javax.xml.parsers.SAXParserFactory;

        public class XMLTest {
            public static void main(String[] args) {
                try {
                    // Turn the file name given on the command line into a file: URI
                    String xmlResource = new File(args[0]).toURI().toString();

                    // Get an instance of SAXParserFactory
                    SAXParserFactory spf = SAXParserFactory.newInstance();

                    // Get a SAXParser instance from the factory
                    SAXParser sp = spf.newSAXParser();

                    // Create an instance of our handler
                    SAXHandler handler = new SAXHandler();

                    // Parse the resource; the parser calls our SAXHandler
                    // whenever a SAX event occurs
                    sp.parse(xmlResource, handler);

                    // After the resource is parsed, get the resulting table
                    Hashtable<String, String> cfgTable = handler.getTable();

                    System.out.println("ID==" + cfgTable.get("ID"));
                    System.out.println("DES==" + cfgTable.get("DESCRIPTION"));
                    System.out.println("PRICE==" + cfgTable.get("PRICE"));
                    System.out.println("QUANTITY==" + cfgTable.get("QUANTITY"));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }

    Example of SAX (SAXHandler.java)

        import java.util.Hashtable;
        import org.xml.sax.AttributeList;
        import org.xml.sax.HandlerBase;
        import org.xml.sax.SAXException;

        // HandlerBase is the original SAX1 base class. It still compiles, but it is
        // deprecated in favor of org.xml.sax.helpers.DefaultHandler.
        public class SAXHandler extends HandlerBase {
            private Hashtable<String, String> table = new Hashtable<String, String>();
            private String currentElement;
            private String currentValue;

            public void setTable(Hashtable<String, String> table) {
                this.table = table;
            }

            public Hashtable<String, String> getTable() {
                return table;
            }

            // Remember which element we are in and start collecting its text
            public void startElement(String tag, AttributeList atts) throws SAXException {
                currentElement = tag;
                currentValue = "";
            }

            // The parser may deliver the text of one element in several chunks
            public void characters(char[] ch, int start, int length) throws SAXException {
                currentValue += new String(ch, start, length);
            }

            // Store the text when the element that was opened last is closed
            public void endElement(String name) throws SAXException {
                if (currentElement.equals(name)) {
                    table.put(currentElement, currentValue);
                }
            }
        }

    DOM: Document Object Model

    The Document Object Model protocol converts an XML document into a collection of objects in your program. You can then manipulate the object model in any way that makes sense. This mechanism is also known as the "random access" protocol, because you can visit any part of the data at any time. You can then modify the data, remove it, or insert new data. For more information, see the DOM specification.

    DOM cont.

    The XML DOM (Document Object Model) defines a standard way for accessing and manipulating XML documents. The DOM presents an XML document as a tree structure, with elements, attributes, and text as nodes.

    DOM cont.

    You use the javax.xml.parsers.DocumentBuilderFactory class to get a DocumentBuilder instance, and use that to produce a Document (a DOM) that conforms to the DOM specification. The builder you get, in fact, is determined by the system property javax.xml.parsers.DocumentBuilderFactory, which selects the factory implementation that is used to produce the builder.
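    A short sketch of the random-access style, assuming a hypothetical order document (the element names order, id, price, and quantity are invented for this example). It obtains a builder from the factory, parses the document into a tree, and then modifies, inserts, and removes nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomEditDemo {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Modifies one node, inserts a new one, and removes another. */
    static void edit(Document doc) {
        Element root = doc.getDocumentElement();
        // Modify existing data
        root.getElementsByTagName("price").item(0).setTextContent("10.00");
        // Insert new data
        Element quantity = doc.createElement("quantity");
        quantity.setTextContent("3");
        root.appendChild(quantity);
        // Remove data
        Node id = root.getElementsByTagName("id").item(0);
        root.removeChild(id);
    }

    /** Lists the element children of the root, in document order. */
    static String childNames(Document doc) {
        StringBuilder names = new StringBuilder();
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                if (names.length() > 0) {
                    names.append(',');
                }
                names.append(n.getNodeName());
            }
        }
        return names.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<order><id>1</id><price>9.50</price></order>");
        edit(doc);
        System.out.println(childNames(doc));
    }
}
```

    Unlike the SAX style, the whole document is held in memory, so any node can be visited or changed at any time.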

    DOM Packages

    org.w3c.dom
    Defines the DOM programming interfaces for XML (and, optionally, HTML) documents, as specified by the W3C.

    DOM Packages cont.

    javax.xml.parsers
    Defines the DocumentBuilderFactory class and the DocumentBuilder class, which returns an object that implements the W3C Document interface. The factory that is used to create the builder is determined by the javax.xml.parsers.DocumentBuilderFactory system property, which can be set from the command line or overridden when invoking the newInstance method. This package also defines the ParserConfigurationException class for reporting errors.

    DOM Example

        import java.io.File;
        import java.io.IOException;
        import javax.xml.parsers.DocumentBuilder;
        import javax.xml.parsers.DocumentBuilderFactory;
        import javax.xml.parsers.ParserConfigurationException;
        import javax.xml.transform.Transformer;
        import javax.xml.transform.TransformerConfigurationException;
        import javax.xml.transform.TransformerException;
        import javax.xml.transform.TransformerFactory;
        import javax.xml.transform.dom.DOMSource;
        import javax.xml.transform.stream.StreamResult;
        import org.w3c.dom.Document;
        import org.w3c.dom.Node;
        import org.w3c.dom.NodeList;
        import org.xml.sax.SAXException;

        public class TransformationApp03 {
            static Document document;

            public static void main(String[] argv) {
                if (argv.length != 1) {
                    System.err.println("Usage: java TransformationApp03 filename");
                    System.exit(1);
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // factory.setValidating(true);
                try {
                    File f = new File(argv[0]);
                    DocumentBuilder builder = factory.newDocumentBuilder();
                    document = builder.parse(f);

                    // Get the first "slide" element in the DOM
                    NodeList list = document.getElementsByTagName("slide");
                    Node node = list.item(0);

                    // Write that subtree to standard output
                    TransformerFactory tFactory = TransformerFactory.newInstance();
                    Transformer transformer = tFactory.newTransformer();
                    DOMSource source = new DOMSource(node);
                    StreamResult result = new StreamResult(System.out);
                    transformer.transform(source, result);
                } catch (TransformerConfigurationException tce) {
                    // Error generated by the transformer factory
                    System.out.println("\n** Transformer Factory error");
                    System.out.println("   " + tce.getMessage());
                    // Use the contained exception, if any
                    Throwable x = tce;
                    if (tce.getException() != null) {
                        x = tce.getException();
                    }
                    x.printStackTrace();
                } catch (TransformerException te) {
                    // Error generated by the parser
                    System.out.println("\n** Transformation error");
                    System.out.println("   " + te.getMessage());
                    // Use the contained exception, if any
                    Throwable x = te;
                    if (te.getException() != null) {
                        x = te.getException();
                    }
                    x.printStackTrace();
                } catch (SAXException sxe) {
                    // Use the contained exception, if any
                    Exception x = sxe;
                    if (sxe.getException() != null) {
                        x = sxe.getException();
                    }
                    x.printStackTrace();
                } catch (ParserConfigurationException pce) {
                    // Parser with specified options can't be built
                    pce.printStackTrace();
                } catch (IOException ioe) {
                    // I/O error
                    ioe.printStackTrace();
                }
            }
        }

    DTD: Document Type Definition

    A DTD specifies the kinds of tags that

    can be included in your XML document,

    and the valid arrangements of those tags.

    You can use the DTD to make sure you

    don't create an invalid XML structure. You

    can also use it to make sure that the XML

    structure you are reading (or that got sent

    over the net) is indeed valid.

    DTD cont.

    A DTD makes it possible to validate the structure of relatively simple XML documents. However, a DTD can't restrict the content of elements, and it can't specify complex relationships. For example, it is impossible to specify with a DTD that an element must contain two particular child elements when it appears inside one parent, while the same element needs only one of them when it appears inside a different parent. In a DTD, you only get to specify the structure of an element one time. There is no context-sensitivity.

    DTD cont.

    This issue stems from the fact that a DTD specification is not hierarchical. For a mailing address that contained several "parsed character data" (PCDATA) elements, for example, the DTD might look something like this: