CSCI5333 DBMS
Outline
Structured, Semistructured, & Unstructured Data
XML Hierarchical Data Model
XML Document, DTD, & XML Schema
XML Documents & Databases
XML Querying
Structured vs Semistructured Data
Structured Data:
e.g., information stored in databases; all records
have the same format as defined in the
relational schema
Semistructured Data:
the data may have a certain structure, but not all of the information collected will have an identical structure.
FIGURE 26.2 Part of an HTML document representing unstructured data (cf. the company database schema)
XML Hierarchical (Tree) Data Model
Problems with HTML documents:
– Difficult for programs to interpret automatically, because the documents do not include schema information about the type of data they contain
– Inappropriate as intermediate Web documents to be exchanged among various computer sites
Solution: XML documents
Two main structuring concepts: elements and attributes
Note: in XML, tag names are defined to describe the meaning of the data elements, rather than to describe how the text is to be displayed (as in HTML).
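The two structuring concepts can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. This is only an illustration: the `id` attribute and the values `p1`, `ProductX`, and `1` are made-up assumptions, not taken from the figures.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the tag names describe what the data means,
# not how it should be displayed.
project = ET.Element("project", {"id": "p1"})  # "id" is an attribute (assumed name)

name = ET.SubElement(project, "Name")          # simple element holding text
name.text = "ProductX"

number = ET.SubElement(project, "Number")      # another simple element
number.text = "1"

# Serialize the small tree back to XML text.
xml_text = ET.tostring(project, encoding="unicode")
print(xml_text)
```

Here `project` plays the role of a complex element (it contains other elements), while `Name` and `Number` are simple elements that hold only text.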
FIGURE 26.3 A complex XML element called <projects>.
Correction: <project>
Complex elements: <projects>, <project>, <Worker>
Simple elements: <Name>, <Number>, <SSN>, …
standalone="yes": the document is schemaless
XML Documents, DTD, and XML Schema
A well-formed XML document is one that satisfies a few conditions:
– Start with an XML declaration (version, …)
– Tree model
– A single root element
– Matching start and end tags for an element must be within the tags of the parent element
– Syntactically correct
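One rough way to test these conditions is to hand the text to an XML parser and see whether it is rejected. The sketch below uses Python's standard `xml.etree.ElementTree`. The helper name `is_well_formed` and the sample strings are assumptions made for illustration. Note that this parser does not insist on the XML declaration, so that condition is not checked here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Single root, properly nested tags.
good = '<?xml version="1.0" standalone="yes"?><projects><project><Name>ProductX</Name></project></projects>'

# <Name> is closed outside its parent <project>: tags are not properly nested.
bad_nesting = "<projects><project><Name>ProductX</project></Name></projects>"

# Two top-level elements: no single root.
two_roots = "<project/><project/>"

for sample in (good, bad_nesting, two_roots):
    print(is_well_formed(sample))
```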
A valid XML document is well formed, and in addition the element names used in the start and end tag pairs must follow the structure specified in a separate XML DTD (Document Type Definition) file or XML schema file.
Figure 26.4: a sample XML DTD called projects
Occurrence indicators: * zero or more, + one or more, ? zero or one; otherwise, exactly once
(#PCDATA): the data type, parsed character data
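The occurrence indicators behave much like regular-expression quantifiers. The sketch below is not a DTD validator; it only illustrates the idea by translating one hypothetical content model, `(Name, Number, Location?, Worker*)`, by hand into a Python regex over the child tag names. The content model and the helper `children_match` are assumptions, not the contents of Figure 26.4.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD content model: (Name, Number, Location?, Worker*)
# No indicator = exactly once, ? = zero or one, * = zero or more.
# Each child tag is followed by a space so tag names cannot run together.
CONTENT_MODEL = re.compile(r"(Name )(Number )(Location )?(Worker )*")

def children_match(element: ET.Element) -> bool:
    """Check the order and count of direct children against CONTENT_MODEL."""
    tags = "".join(child.tag + " " for child in element)
    return CONTENT_MODEL.fullmatch(tags) is not None

ok = ET.fromstring(
    "<project><Name>X</Name><Number>1</Number><Worker/><Worker/></project>"
)
missing_name = ET.fromstring("<project><Number>1</Number></project>")
wrong_order = ET.fromstring("<project><Number>1</Number><Name>X</Name></project>")

for element in (ok, missing_name, wrong_order):
    print(children_match(element))
```

Real validation against a DTD needs a validating parser; the regex view is only meant to make the meaning of `*`, `+`, and `?` concrete.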
FIGURE 26.4 An XML DTD file called projects
To use the DTD file:
(1) Store the DTD file in the same file system as the XML document
(2) Begin the XML document with:
<?xml version="1.0" standalone="no"?>
<!DOCTYPE projects SYSTEM "proj.dtd">
DTD Limitations
1) Data types in DTD are not very general
2) Has its own special syntax and thus requires specialized processors
3) Elements in a document are always forced to follow the ordering specified in the DTD, so unordered elements are not permitted.
Solution: XML Schema
FIGURE 26.5 An XML schema file called company
Schema namespace
the root element company; also an unnamed complex element
• “Department”, “Employee”, etc. must be named types.
• The selector “employeeDependent” is an attribute of “Employee”, of type “Dependent”.
• The field “dependentName” in “Dependent” must be unique.
FIGURE 26.5 (continued) An XML schema file called company.
<xsd:unique …> specifies a uniqueness constraint on an element that is not the primary key.
<xsd:key> specifies a primary key.
<xsd:keyref> specifies a foreign key; <xsd:selector> refers to the referencing element type; <xsd:field> refers to the referencing attribute.
FIGURE 26.5 (continued) An XML schema file called company
Exercise: Define the element “projectWorker” in the type “Project” as an embedded sub-element.
Answer:
<xsd:element name="projectWorker" minOccurs="1" maxOccurs="unbounded">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="SSN" type="xsd:string" />
      <xsd:element name="hours" type="xsd:float" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
XML Documents and Databases
Approaches to Storing XML Documents
Extracting XML Documents from Relational Databases
Breaking Cycles to Convert Graphs into Trees
Other Steps for Extracting XML Documents from Databases
FIGURE 26.7 Subset of the UNIVERSITY database schema needed for XML document extraction.
XML Query
XPath: Specifying Path Expressions in XML
XQuery: Specifying Queries in XML
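As a small taste of path expressions, Python's standard `xml.etree.ElementTree` supports a limited subset of XPath. The sketch below runs a few path expressions on a made-up company document. The element names and values are assumptions for illustration and are not claimed to match Figure 26.5. XQuery is not part of the Python standard library, so it is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical company document; names and values are illustrative only.
doc = ET.fromstring("""
<company>
  <employee>
    <employeeName>Smith</employeeName>
    <employeeSalary>30000</employeeSalary>
  </employee>
  <employee>
    <employeeName>Wong</employeeName>
    <employeeSalary>40000</employeeSalary>
  </employee>
</company>
""")

# Child steps relative to the root element <company>.
names = [e.text for e in doc.findall("employee/employeeName")]

# ".//" searches descendants at any depth.
salaries = [int(e.text) for e in doc.findall(".//employeeSalary")]

# Predicate: employees that have an employeeName child with text 'Wong'.
wong = doc.findall("employee[employeeName='Wong']")

print(names, salaries, len(wong))
```

Numeric comparisons such as a salary threshold are outside the subset ElementTree supports, so in this sketch they would have to be done in Python on the extracted values.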
FIGURE 26.14 Some examples of XPath expressions on XML documents that follow the XML schema file COMPANY in Figure 26.5
FIGURE 26.15 Some examples of XQuery queries on XML documents that follow the XML schema file COMPANY in Figure 26.5.