XML Concepts Overview

  • 8/14/2019 XML Concepts Overview

    By PenchalaRaju.Yanamala

    Extensible Markup Language (XML) is a flexible way to create common information formats and to share the formats and data between applications and on the internet.

    You can import XML definitions into PowerCenter from the following file types:

    XML file. An XML file contains data and metadata. An XML file can reference a Document Type Definition file (DTD) or an XML schema definition (XSD) for validation.

    DTD file. A DTD file defines the element types, attributes, and entities in an XML file. A DTD file provides some constraints on the XML file structure but does not contain any data.

    XML schema. An XML schema defines elements, attributes, and type definitions. Schemas contain simple and complex types. A simple type is an XML element or attribute that contains text. A complex type is an XML element that contains other elements and attributes.

    Schemas support element, attribute, and substitution groups that you can reference throughout a schema. Use substitution groups to substitute one element with another in an XML instance document. Schemas also support inheritance for elements, complex types, and element and attribute groups.

    XML Files

    XML files contain tags that identify data in the XML file, but not the format of the data. The basic component of an XML file is an element. An XML element includes an element start tag, element content, and element end tag. All XML files must have a root element, defined by a start tag at the top of the file and a matching end tag at the bottom. The root element encloses all the other elements in the file.

    An XML file models a hierarchical database. The position of an element in an XML hierarchy represents its relationships to other elements. An element can contain child elements, and elements can inherit characteristics from other elements.

    For example, the following XML file describes a book:

    Fun with XML

    Understanding XML
    Using XML

    Using DTD Files
    Fun with Schemas
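
    The markup of the book listing did not survive, only its text values. The following is a minimal sketch of how such a file might look and how its hierarchy can be walked. The tag names book, title, chapter, and heading are assumptions, as is the grouping of headings into two chapters.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names book, title, chapter, and heading
# are assumptions; only the text values come from the original listing.
BOOK_XML = """<book>
  <title>Fun with XML</title>
  <chapter>
    <heading>Understanding XML</heading>
    <heading>Using XML</heading>
  </chapter>
  <chapter>
    <heading>Using DTD Files</heading>
    <heading>Fun with Schemas</heading>
  </chapter>
</book>"""

root = ET.fromstring(BOOK_XML)


def outline(element, depth=0):
    """Return one indented line per element, showing the hierarchy."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


book_outline = outline(root)
print("\n".join(book_outline))
```

    The root element (book in this sketch) encloses every other element, and each level of indentation in the printed outline corresponds to one level of the XML hierarchy.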

    Validating XML Files with a DTD or Schema

    A valid XML file conforms to the structure of an associated DTD or schema file.

    To reference the location and name of a DTD file, use the DOCTYPE declaration in an XML file. The DOCTYPE declaration also names the root element for the XML file.

    For example, the following XML file references the location of the note.dtd file:

    XML Data
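
    As a rough illustration of what a DOCTYPE declaration carries, the sketch below reads the root element name and the DTD location with Python's expat bindings. The document itself is hypothetical: the note and body element names and the relative location "note.dtd" are assumptions, since the original listing's markup is not available.

```python
import xml.parsers.expat

# Hypothetical document: the note and body element names and the
# relative DTD location are assumptions for illustration.
NOTE_XML = """<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <body>XML Data</body>
</note>"""


def read_doctype(xml_text):
    """Return (root_name, system_id) from the DOCTYPE declaration, or None."""
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return found[0] if found else None


doctype = read_doctype(NOTE_XML)
print(doctype)
```

    The parser only reports the declaration here. It does not fetch the DTD or validate the file against it.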

    To reference a schema, use the schemaLocation attribute. The schemaLocation value contains the location and name of a schema.

    The following XML file references the note.xsd schema in an external location:

    XML Data
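
    A schema reference of this kind could look like the hypothetical document in the sketch below, which also shows one way to read the attribute back. The namespace URI and the schema URL are made-up values, and the element names are assumptions. The only fixed part is the XMLSchema-instance namespace that qualifies schemaLocation.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical document: the target namespace and schema URL are made up.
NOTE_XML = """<note xmlns="http://example.com/note"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://example.com/note http://example.com/schemas/note.xsd">
  <body>XML Data</body>
</note>"""


def schema_locations(xml_text):
    """Map each namespace URI to the schema location paired with it."""
    root = ET.fromstring(xml_text)
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    # schemaLocation holds whitespace-separated pairs: namespace, then location.
    return dict(zip(tokens[0::2], tokens[1::2]))


locations = schema_locations(NOTE_XML)
print(locations)
```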

    Unicode Encoding

    An XML file contains an encoding attribute that indicates the code page in the file. The most common encodings are UTF-8 and UTF-16. UTF-8 represents a character with one to four bytes, depending on the Unicode symbol. UTF-16 represents a character as one or two 16-bit words.

    The following example shows a UTF-8 attribute in an XML file:

    XML Data
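
    The byte counts mentioned above can be checked directly. This is a small sketch using four sample characters chosen for illustration. The XML snippet at the end is hypothetical and only shows where the encoding attribute sits in the declaration.

```python
import xml.etree.ElementTree as ET

# Sample characters: Latin A, e-acute, euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Bytes needed per character in each encoding.
utf8_sizes = [len(ch.encode("utf-8")) for ch in samples]
utf16_sizes = [len(ch.encode("utf-16-le")) for ch in samples]

# The encoding attribute in the XML declaration tells the parser
# how to decode the bytes that follow. The body element is an assumption.
doc = '<?xml version="1.0" encoding="UTF-8"?><body>XML Data: \u20ac</body>'.encode("utf-8")
body_text = ET.fromstring(doc).text

print(utf8_sizes, utf16_sizes, body_text)
```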

    DTD Attributes

    Attributes provide additional information about elements. In an XML file, an attribute occurs inside the start tag of an element. A DTD file declares the attributes that an element can have.

    The following syntax describes an attribute in a DTD file:

    <!ATTLIST element_name attribute_name attribute_type default_value>

    The following parameters identify an attribute in a DTD file:

    Element_name. The name of the element that has the attribute.

    Attribute_name. The name of the attribute.

    Attribute_type. The kind of attribute. The most common attribute type is CDATA. A CDATA attribute is character data.

    Default_value. The value of the attribute if no attribute value occurs in the XML file.

    Use the following options with a default value:

    - #REQUIRED. The XML file must contain the attribute value.

    - #IMPLIED. The attribute value is optional.

    - #FIXED. If the attribute occurs in the XML file, its value must match the default value from the DTD file. A valid XML file can contain the same attribute value as the DTD, or the XML file can have no attribute value. You must specify a default value with this option.

    The following example shows an attribute with a fixed value:

    The element name is product. The attribute is product_name. The attribute has a default value, vacuum.
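
    The fixed-value declaration described above can be sketched as an internal DTD subset and read back with Python's standard parser. The attribute type CDATA and the EMPTY content model are assumptions, because the source names only the element, the attribute, and the value. The parser applies the declared default but does not validate the document.

```python
import xml.etree.ElementTree as ET

# Assumptions: the attribute type is CDATA and product is an empty element.
# The source gives only the names product, product_name, and the value vacuum.
DOC_WITHOUT_ATTRIBUTE = """<!DOCTYPE product [
  <!ELEMENT product EMPTY>
  <!ATTLIST product product_name CDATA #FIXED "vacuum">
]>
<product/>"""

product = ET.fromstring(DOC_WITHOUT_ATTRIBUTE)

# The XML omits product_name, so the parser supplies the value from the DTD.
fixed_value = product.get("product_name")
print(fixed_value)
```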

    XML Schema Files

    An XML schema is a document that defines the valid content of XML files. An XML schema file, like a DTD file, contains only metadata. An XML schema defines the structure and type of elements and attributes for an associated XML file. When you use a schema to define an XML file, you can restrict data, define data formats, and convert data between datatypes. XML schemas support complex types and inheritance between types. They also provide a way to specify element and attribute groups, ANY content, and circular references.

    Cardinality

    Element cardinality in a DTD or schema file is the number of times an element occurs in an XML file. Element cardinality affects how you structure groups in an XML definition. Absolute cardinality and relative cardinality of elements affect the structure of an XML definition.

    Absolute Cardinality

    The absolute cardinality of an element is the number of times an element occurs within its parent element in an XML hierarchy. DTD and XML schema files describe the absolute cardinality of elements within the hierarchy. A DTD file uses symbols, and an XML schema file uses the minOccurs and maxOccurs attributes to describe the absolute cardinality of an element.

    For example, an element has an absolute cardinality of once (1) if the element occurs once within its parent element. However, the element might occur many times within an XML hierarchy if the parent element has a cardinality of one or more (+).

    The absolute cardinality of an element determines its null constraint. An element that has an absolute cardinality of one or more (+) cannot have null values, but an element with a cardinality of zero or more (*) can have null values. An attribute marked as fixed or required in an XML schema or DTD file cannot have null values, but an implied attribute can have null values.

    Table 1-2 describes how DTD and XML schema files represent cardinality:

    Table 1-2. Cardinality of Elements in XML

    Absolute Cardinality         DTD       Schema

    Zero or once                 ?         minOccurs=0 maxOccurs=1

    Zero or one or more times    *         minOccurs=0 maxOccurs=unbounded
                                           minOccurs=0 maxOccurs=n

    Once                         (none)    minOccurs=1 maxOccurs=1

    One or more times            +         minOccurs=1 maxOccurs=unbounded
                                           minOccurs=1 maxOccurs=n

    Note: You can declare a maximum number of occurrences or unlimited occurrences in a schema.
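
    The correspondence in Table 1-2 can be expressed as a small lookup. This is an illustrative sketch, not part of PowerCenter. The function names allows and nullable are hypothetical, and an empty string stands in for "no DTD symbol".

```python
# DTD occurrence symbols mapped to (minOccurs, maxOccurs), following Table 1-2.
# An empty string stands for "no symbol", which means exactly once.
DTD_TO_SCHEMA = {
    "?": (0, 1),
    "*": (0, "unbounded"),
    "": (1, 1),
    "+": (1, "unbounded"),
}


def allows(symbol, count):
    """Return True if `count` occurrences satisfy the symbol's cardinality."""
    min_occurs, max_occurs = DTD_TO_SCHEMA[symbol]
    if count < min_occurs:
        return False
    return max_occurs == "unbounded" or count <= max_occurs


def nullable(symbol):
    """An element can be null only when its minimum occurrence is zero."""
    return DTD_TO_SCHEMA[symbol][0] == 0


print(allows("?", 2), allows("+", 3), nullable("*"), nullable("+"))
```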

    Relative Cardinality

    Relative cardinality is the relationship of an element to another element in the XML hierarchy. An element can have a one-to-one, one-to-many, or many-to-many relationship to another element in the hierarchy.

    An element has a one-to-one relationship with another element if every occurrence of one element can have one occurrence of the other element. For example, an employee element can have one social security number element. Employee and social security number have a one-to-one relationship.

    An element has a one-to-many relationship with another element if every occurrence of one element can have multiple occurrences of another element. For example, an employee element can have multiple email addresses. Employee and email address have a one-to-many relationship.

    An element has a many-to-many relationship with another element if an XML file can have multiple occurrences of both elements. For example, an employee might have multiple email addresses and multiple street addresses. Email address and street address have a many-to-many relationship.
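
    To make the employee example concrete, the sketch below classifies a parent-to-child relationship from the occurrences observed in a sample instance. The tag names and values are hypothetical, and a DTD or schema remains the authoritative source for cardinality. Counting occurrences in one file is only an approximation.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the tag names and values are illustrative only.
EMPLOYEES_XML = """<Employees>
  <Employee>
    <SSN>000-00-0001</SSN>
    <Email>a@example.com</Email>
    <Email>b@example.com</Email>
  </Employee>
  <Employee>
    <SSN>000-00-0002</SSN>
    <Email>c@example.com</Email>
  </Employee>
</Employees>"""


def relationship(root, parent_tag, child_tag):
    """Classify parent-to-child as one-to-one or one-to-many from observed counts."""
    counts = [len(parent.findall(child_tag)) for parent in root.iter(parent_tag)]
    return "one-to-many" if any(n > 1 for n in counts) else "one-to-one"


root = ET.fromstring(EMPLOYEES_XML)
ssn_relationship = relationship(root, "Employee", "SSN")
email_relationship = relationship(root, "Employee", "Email")
print(ssn_relationship, email_relationship)
```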

    XML Path

    XML Path Language (XPath) is a language that describes a way to locate items in an XML file. XPath uses an addressing syntax based on the route through the hierarchy from the root to an element or attribute. An XML path can contain long schema component names.

    XPath uses a slash (/) to distinguish between elements in the hierarchy. XML attributes are preceded by @ in the XPath.
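
    The slash and @ conventions can be tried with the limited XPath subset in Python's standard library. This is a sketch on a hypothetical document. The element names, the id attribute, and the values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names, the id attribute, and values are assumptions.
root = ET.fromstring("""<Employees>
  <Employee id="1"><LastName>Smith</LastName><Email>s@example.com</Email></Employee>
  <Employee id="2"><LastName>Jones</LastName><Email>j@example.com</Email></Employee>
</Employees>""")

# ElementTree paths are relative to the element searched, so the path
# /Employees/Employee/LastName becomes ./Employee/LastName from the root.
last_names = [e.text for e in root.findall("./Employee/LastName")]

# The @ prefix addresses an attribute; here it filters on Employee/@id.
jones_email = root.find("./Employee[@id='2']/Email").text

# ElementTree cannot return attribute nodes from a path, so read them with get().
ids = [e.get("id") for e in root.findall("./Employee")]

print(last_names, jones_email, ids)
```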

    Using XML with PowerCenter Overview

    You can create an XML definition in PowerCenter from an XML file, DTD file, XML schema, flat file definition, or relational table definition. When you create an XML definition, the Designer extracts XML metadata and creates a schema in the repository. The schema provides the structure from which you edit and validate the XML definition.

    An XML definition can contain multiple groups. In an XML definition, groups are called views. The relationship between elements in the XML hierarchy defines the relationship between the views. When you create an XML definition, the Designer creates views for multiple-occurring elements and complex types in a schema by default. The relative cardinality of elements in an XML hierarchy affects how PowerCenter creates views in an XML definition. Relative cardinality determines if elements can be part of the same view.

    The Designer defines relationships between the views in an XML definition by keys. Source definitions do not require keys, but target views must have them. Each view has a primary key that is an XML element or a generated key.

    When you create an XML definition, you can create a hierarchical model or an entity relationship model of the XML data. When you create a hierarchical model, you create a normalized or denormalized hierarchy. A normalized hierarchy contains separate views for multiple-occurring elements. A denormalized hierarchy has one view with duplicate data for multiple-occurring elements.
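
    The difference between the two hierarchies can be illustrated outside PowerCenter with plain Python. This sketch only mimics the concept and is not how the Designer builds views. The data, the tag names, and the key column names PK_Employee and FK_Employee are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: tag names and values are illustrative only.
root = ET.fromstring("""<Employees>
  <Employee><LastName>Smith</LastName>
    <Email>s1@example.com</Email><Email>s2@example.com</Email></Employee>
  <Employee><LastName>Jones</LastName>
    <Email>j@example.com</Email></Employee>
</Employees>""")

# Normalized: one view per multiple-occurring element, linked by a generated key.
employee_view, email_view = [], []
# Denormalized: a single view that repeats parent data for every Email.
denormalized_view = []

for key, employee in enumerate(root.findall("Employee"), start=1):
    last_name = employee.findtext("LastName")
    employee_view.append({"PK_Employee": key, "LastName": last_name})
    for email in employee.findall("Email"):
        email_view.append({"FK_Employee": key, "Email": email.text})
        denormalized_view.append({"LastName": last_name, "Email": email.text})

print(employee_view)
print(email_view)
print(denormalized_view)
```

    In the normalized form each last name is stored once and the email rows point back to it through the key. In the denormalized form the last name is duplicated on every email row.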

    If you create an entity model, the Designer creates views for complex types and multiple-occurring elements. The Designer creates an XML definition that models the inheritance and circular relationships the schema provides.

    Importing XML Metadata

    When you import an XML definition, the Designer creates a schema in the repository for the definition. The repository schema provides the structure from which you edit and validate the XML definition.

    You can create metadata from the following file types:

    XML files

    DTD files

    XML schema files

    Relational tables

    Flat files

    Importing Metadata from an XML File

    In an XML file, a pair of tags marks the beginning and end of each data element. These tags are the basis for the metadata that PowerCenter extracts from the XML file. If you import an XML file without an associated DTD or XML schema, the Designer reads the XML tags to determine the elements, their possible occurrences, and their position in the hierarchy. The Designer checks the data within the element tags and assigns a datatype depending on the data representation. You can change the datatypes for these elements in the XML definition.

    Figure 2-1 shows a sample XML file. The root element is Employees. Employee is a multiple-occurring element. The Employee element contains the LastName, FirstName, and Address. The Employee element also contains the multiple-occurring elements: Phone and Email.
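
    The kind of inference described here, reading tags to find elements, their occurrences, and a likely datatype, can be approximated in a few lines. This is a sketch of the idea only and makes no claim about the Designer's actual algorithm. The instance below follows the description of Figure 2-1, but every value is made up, and the helper names multiple_occurring and guess_datatype are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical instance matching the description of Figure 2-1; values are made up.
root = ET.fromstring("""<Employees>
  <Employee>
    <LastName>Smith</LastName><FirstName>Ann</FirstName>
    <Address>1 Main St</Address>
    <Phone>555-0100</Phone><Phone>555-0101</Phone>
    <Email>ann@example.com</Email><Email>a.smith@example.com</Email>
  </Employee>
  <Employee>
    <LastName>Jones</LastName><FirstName>Bo</FirstName>
    <Address>2 Oak Ave</Address>
    <Phone>555-0102</Phone>
    <Email>bo@example.com</Email>
  </Employee>
</Employees>""")


def multiple_occurring(root):
    """Return the set of tags that occur more than once under some parent."""
    repeated = set()
    for parent in root.iter():
        counts = Counter(child.tag for child in parent)
        repeated.update(tag for tag, n in counts.items() if n > 1)
    return repeated


def guess_datatype(text):
    """Guess a datatype from the text representation, defaulting to string."""
    try:
        int(text)
        return "integer"
    except ValueError:
        return "string"


repeated_tags = multiple_occurring(root)
print(sorted(repeated_tags))
```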
