
11.02.2009

XML Databases
13. Systems

Silke Eckstein
Andreas Kupfer
Institut für Informationssysteme
Technische Universität Braunschweig
http://www.ifis.cs.tu-bs.de

13.1 Introduction

13.2 Oracle

13.3 DB2

13.4 SQL Server

13.5 Tamino

13.6 Summary

13.X Overview and References


XML Databases – Silke Eckstein – Institut für Informationssysteme – TU Braunschweig

13.1 Introduction

• After discussing various aspects of XML and XML databases, we are now going to take a closer look at some of the database systems.

• RDBMS with XML support

• Native XML DBMS


13.2 Oracle 11g [Tür08]

• Architecture

Figure taken from Oracle® XML Developer's Kit Programmer's Guide 11g Release 1 (11.1), April 2008

13.2 Oracle 11g [Tür08]

• Architecture (2)

Figure taken from Oracle® XML DB Developer's Guide 11g Release 1 (11.1), October 2007

13.2 Oracle 11g [Tür08]

• Mapping variants from XML to databases
  – XML column approach: the column is based on the XML type
  – XML table approach: the table is based on the XML type

• Using the object-relational extensions of Oracle
  – XMLTYPE as a predefined object type with SQL/XML functions as methods
  – Intermedia Text package with full-text functions
  – DBMS_XMLDOM package with DOM methods
  – DBMS_XMLSCHEMA package with administration and generation methods
  – DBMS_XMLGEN package with methods to generate XML from SQL

13.2 Oracle 11g [Tür08]

• Storage options
  – text-based (unstructured, as CLOB)
  – binary (compact storage in a binary XML format)
  – schema-based (object-relational storage; requires an XML Schema)
  – hybrid (semistructured)

Figures taken from Oracle® XML DB Developer's Guide 11g Release 1 (11.1), October 2007

13.2 Oracle 11g [Tür08]

• XML column vs. XML table approach
  – Table with an XML column:

CREATE TABLE <table name> ( <column name> XMLTYPE )
  [XMLTYPE [COLUMN] <column name>
    [STORE AS { OBJECT RELATIONAL | CLOB ( <LOB parameter> ) | BINARY XML ( <LOB parameter> ) }]
    [XMLSCHEMA <url> ELEMENT [ <url> #] <element> ]]

  – XML table:

CREATE TABLE <table name> OF XMLTYPE
  [XMLTYPE
    [STORE AS { OBJECT RELATIONAL | CLOB ( <LOB parameter> ) | BINARY XML ( <LOB parameter> ) }]
    [XMLSCHEMA <url> ELEMENT [ <url> #] <element> ]]

  – STORE AS options: OBJECT RELATIONAL = schema-based, CLOB = text-based, BINARY XML = binary

  – Inserting documents in both cases:

INSERT INTO <table name> VALUES (XMLTYPE(getDocument('input1.xml')));

13.2 Oracle 11g [Tür08]

• User-defined function getDocument(file) to read XML documents

CREATE DIRECTORY xmldir AS 'c:\xmldir';
GRANT READ ON DIRECTORY xmldir TO PUBLIC WITH GRANT OPTION;

CREATE FUNCTION getDocument(filename VARCHAR2)
RETURN CLOB
AUTHID CURRENT_USER IS
  xbfile BFILE;
  xclob  CLOB;
BEGIN
  -- Directory names created without quotes are stored in upper case
  xbfile := BFILENAME('XMLDIR', filename);
  DBMS_LOB.open(xbfile);
  -- Temporary CLOB, cached, lives for the duration of the session
  DBMS_LOB.createTemporary(xclob, TRUE, DBMS_LOB.session);
  DBMS_LOB.loadFromFile(xclob, xbfile, DBMS_LOB.getLength(xbfile));
  DBMS_LOB.close(xbfile);
  RETURN xclob;
END;
/

13.2 Oracle 11g [Tür08]

• Package DBMS_XMLSCHEMA offers methods to register, compile, generate and delete XML Schemas

DBMS_XMLSCHEMA.registerSchema( 'schema-URL', 'schema-name' );
DBMS_XMLSCHEMA.registerSchema( 'text.xsd', getDocument('test.xsd') );
DBMS_XMLSCHEMA.compileSchema( 'schema-URL' );
DBMS_XMLSCHEMA.generateSchema( 'schema-URL', 'type-name' );
DBMS_XMLSCHEMA.deleteSchema( 'schema-URL', DeleteOption );

DeleteOption: DELETE_RESTRICT | DELETE_INVALIDATE | DELETE_CASCADE | DELETE_CASCADE_FORCE

13.2 Oracle 11g [Tür08]

• Some methods of XMLTYPE
  – XMLTYPE(<value-expr>) is the constructor; the expression can be a string or a user-defined type
  – getClobVal() / getStringVal() return the XML value as CLOB or string
  – getNumVal() is only applicable to text nodes containing a numeric string
  – isFragment() returns 1 if the instance has more than one root element
  – existsNode(<XPath-expr>) returns 1 if the expression returns a node
  – extract(<XPath-expr>) extracts a part of the XML value
  – transform(<XML-value-expr>) transforms according to a stylesheet
  – toObject() converts to an object
  – isSchemaBased() returns 1 if the XML value is based on a schema
  – getSchemaURL() returns the URL of the schema
  – getRootElement() returns the root element, or NULL for fragments

13.2 Oracle 11g [Tür08]

• Queries
  – Support of SQL/XML functions
    • XMLQUERY
    • XMLTABLE
    • XMLAGG
    • XMLELEMENT
    • XMLATTRIBUTES
    • XMLFOREST
    • …
  – Additional functions
    • EXTRACT
    • EXISTSNODE
    • ...
  – Full-text search with the Intermedia Text package

13.2 Oracle 11g [Tür08]

• EXTRACT
  – Extracts the part of the XML value that is selected by an XPath expression

EXTRACT( <XML-value-expression>, <XPath-expression> [, <Namespace>] )

SELECT EXTRACT( VALUE(b), '//@ISBN' ) AS ISBNumber,
       EXTRACT( VALUE(b), '//Title/text()' ) AS Title_content,
       EXTRACT( VALUE(b), '//Title' ) AS Title_element
FROM Book b;

ISBNumber      Title_content            Title_element
3-89864-148-1  XML &amp; Datenbanken    <Title>XML &amp; Datenbanken</Title>
3-89864-219-4  SQL-1999 &amp; SQL:2003  <Title>SQL-1999 &amp; SQL:2003</Title>

13.2 Oracle 11g [Tür08]

• EXISTSNODE
  – Returns 0 if the XPath expression yields the empty sequence, otherwise 1

EXISTSNODE( <XML-value-expression>, <XPath-expression> [, <Namespace>] )

Example:

SELECT EXTRACT( VALUE(b), '//@ISBN' ) AS ISBNumber,
       EXTRACT( VALUE(b), '//Title/text()' ) AS Title_content,
       EXTRACT( VALUE(b), '//Title' ) AS Title_element
FROM Book b
WHERE EXISTSNODE( VALUE(b), '//Book[@ISBN="3-89864-219-4"]' ) = 1;

ISBNumber      Title_content            Title_element
3-89864-219-4  SQL-1999 &amp; SQL:2003  <Title>SQL-1999 &amp; SQL:2003</Title>
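The interplay of EXISTSNODE as a filter and EXTRACT as a projection can be tried outside the database. The following is a minimal Python sketch, not Oracle code: the list BOOKS stands in for the Book table, the helper names are hypothetical, and Python's ElementTree only supports a subset of XPath, so each document is placed under a synthetic holder element to make the `//Book[...]` predicate applicable.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows standing in for the Book table of the slides.
BOOKS = [
    '<Book ISBN="3-89864-148-1"><Title>XML &amp; Datenbanken</Title></Book>',
    '<Book ISBN="3-89864-219-4"><Title>SQL-1999 &amp; SQL:2003</Title></Book>',
]

def exists_node(doc, path):
    """Return 1 if the path selects at least one node, else 0 (like EXISTSNODE)."""
    holder = ET.Element("holder")   # plays the role of the document node
    holder.append(doc)
    return 1 if holder.findall(path) else 0

def extract_title(doc):
    """Return the serialized first Title element (like EXTRACT with '//Title')."""
    return ET.tostring(doc.find(".//Title"), encoding="unicode")

def select_titles(isbn):
    """Emulate SELECT ... FROM Book b WHERE EXISTSNODE(...) = 1."""
    result = []
    for text in BOOKS:
        doc = ET.fromstring(text)
        if exists_node(doc, f".//Book[@ISBN='{isbn}']") == 1:
            result.append((doc.get("ISBN"), extract_title(doc)))
    return result

print(select_titles("3-89864-219-4"))
```

Only the second document passes the filter, which mirrors the single-row result of the SQL example.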

13.2 Oracle 11g [Tür08]

• Indexing
  – Full-text index

CREATE INDEX xmlfulltextidx ON Book b(VALUE(b)) INDEXTYPE IS CTXSYS.CONTEXT;

  – Path index

CREATE INDEX xmlpathidx ON Book b(VALUE(b)) INDEXTYPE IS CTXSYS.CTXXPATH;

  – Functional index (value index)

CREATE INDEX xmlfunctionalidx ON Book b(EXTRACTVALUE(VALUE(b),'//@year'));

  – XML index

CREATE INDEX xmlidx ON Book b(VALUE(b)) INDEXTYPE IS XDB.XMLIndex;

    • Creates a set of secondary indexes
      – Path index with all XML tags and fragments
      – Value index with the order of the document (node positions)
      – Value index on the values of the nodes

13.2 Oracle 11g [Tür08]

• Using indexes
  – Query using the path index:

SELECT EXTRACTVALUE(VALUE(b), '//Title') AS Title
FROM Book b
WHERE EXISTSNODE(VALUE(b), '/Book/Publisher[text()="dpunkt"]') = 1;

  – Query using the full-text index:

SELECT SCORE(0), EXTRACT(VALUE(b), '//@ISBN') AS ISBN
FROM Book b
WHERE CONTAINS(VALUE(b), 'Java', 0) > 0
ORDER BY SCORE(0) DESC;

  – Query using the functional index:

SELECT EXTRACTVALUE(VALUE(b), '//Title') AS Title
FROM Book b
WHERE EXTRACTVALUE(VALUE(b), '//Year') = 2009;

13.2 Oracle 11g [Tür08]

• Manipulation: UPDATEXML
  – Changes a part (selected by an XPath expression) of the XML value

UPDATEXML (<XML-value-expr>, <replacement-list> [, <namespace>])
<replacement-list> := <XPath-expr>, <value-expr>

  – Example to change the value of an attribute:

UPDATE Book b
SET VALUE(b) = UPDATEXML (VALUE(b), '//Publisher[text()="dpunkt"]/@City', 'Zürich');

• Manipulation: DELETEXML
  – Deletes a sequence of nodes (selected by an XPath expression) from the XML value

DELETEXML (<XML-value-expr>, <XPath-expr> [, <namespace>])

  – Example to delete a specific Author node:

UPDATE Book b
SET VALUE(b) = DELETEXML (VALUE(b), '//Book[@ISBN="3-89864-148-1"]/Author[text()="Holger Meyer"]');
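What the two functions do to a document can be illustrated with an in-memory tree. This is a rough Python analogue under stated assumptions, not Oracle's implementation: the sample document (including the first author and the original City value) is hypothetical, the helper names are invented for this sketch, and the `[.='text']` predicate requires Python 3.7 or later.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped like the Book values used in the slides.
DOC = (
    '<Book ISBN="3-89864-148-1">'
    '<Author>Meike Klettke</Author><Author>Holger Meyer</Author>'
    '<Publisher City="Heidelberg">dpunkt</Publisher>'
    '</Book>'
)

def update_attribute(book, path, name, value):
    """Set attribute `name` on every element selected by `path` (cf. UPDATEXML)."""
    for node in book.findall(path):
        node.set(name, value)

def delete_nodes(book, parent_path, child_path):
    """Remove the children selected by `child_path` below each parent (cf. DELETEXML)."""
    for parent in book.findall(parent_path):
        for child in parent.findall(child_path):   # findall returns a list copy
            parent.remove(child)

book = ET.fromstring(DOC)
update_attribute(book, ".//Publisher[.='dpunkt']", "City", "Zürich")
delete_nodes(book, ".", "./Author[.='Holger Meyer']")
print(ET.tostring(book, encoding="unicode"))
```

As in the SQL statements, the modified value replaces the old one as a whole; only the selected attribute and node differ from the original document.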

13.2 Oracle 11g [Tür08]

• XML views
  – Allow XML-based views on SQL and XML values
  – Are based on the principle of object views
    • The object type is XMLTYPE in this case
  – Example:

CREATE VIEW DpunktBooks OF XMLTYPE
WITH OBJECT ID DEFAULT
AS SELECT VALUE(b) FROM Book b
   WHERE EXISTSNODE (VALUE(b), '//Publisher[text()="dpunkt"]') = 1;

13.2 Oracle 11g [Tür08]

• Export of database contents with XML syntax
  – Standard mapping SQL → XML with DBMS_XMLGEN.getXML('query')
    • Top-level elements result from columns
    • Simple types (with scalar values) as elements with PCDATA
    • Structured types and their attributes as elements with subelements for attributes
    • Complex attributes as hierarchically nested elements
    • Collection types are mapped to lists of elements
    • Object references and referential integrity as ID/IDREF within the document
    • Table content is mapped to ROWSET elements:

<ROWSET>
  <ROW num="1"> … </ROW>
  …
  <ROW num="n"> … </ROW>
</ROWSET>

  – User-defined transformation from SQL to XML is possible with XSLT
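The ROWSET/ROW shape of the standard mapping can be reproduced with any relational engine. The following is a minimal Python sketch that uses SQLite in place of Oracle; the function name query_to_rowset, the table book and its contents are assumptions for illustration, and only scalar columns are handled (no structured types, collections or references).

```python
import sqlite3
import xml.etree.ElementTree as ET

def query_to_rowset(conn, query):
    """Wrap each result row in <ROW num="i"> with one child element per column."""
    cursor = conn.execute(query)
    columns = [d[0].upper() for d in cursor.description]
    rowset = ET.Element("ROWSET")
    for num, row in enumerate(cursor, start=1):
        row_el = ET.SubElement(rowset, "ROW", num=str(num))
        for column, value in zip(columns, row):
            if value is not None:      # this sketch simply omits NULL columns
                ET.SubElement(row_el, column).text = str(value)
    return ET.tostring(rowset, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (isbn TEXT, title TEXT)")
conn.execute("INSERT INTO book VALUES ('3-89864-148-1', 'XML & Datenbanken')")
xml_text = query_to_rowset(conn, "SELECT isbn, title FROM book")
print(xml_text)
```

Column names become element names and scalar values become PCDATA, which corresponds to the first two mapping rules listed above.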

13.2 Oracle 11g [Tür08]

• Summary Oracle XML support

XML storage model      Extensible, object-relational
Schema definition      Validation possible
Storage type           Text-based or schema-based
Mapping DB ↔ XML       By SQL/XML functions, schema generators, XML views
XML data type          Available
Value/function index   Available
Full-text index        Available
Path index             Available
Queries                SQL/XML with XQuery support
Full-text search       With the Intermedia Text package
Manipulation           SQL methods with XPath


13.3 DB2 V9 [Tür08]

• IBM DB2

[Figure: architecture with the components Application, XML documents, Database and file system]

13.3 DB2 V9 [Tür08]

• Mapping XML data to relational databases
  – Variants:
    • XML column approach: based on an XML data type
    • XML collection approach: based on the decomposition of XML documents into database tables and attributes
  – Table with XML column:
    • Several XML data types:
      – XML: model-based / hierarchical storage (pureXML)
      – XMLCLOB: XML documents stored as CLOBs (XML Extender)
      – XMLVARCHAR: XML documents stored as VARCHAR (XML Extender)
      – XMLFILE: XML documents stored in the file system (XML Extender)
    • XML schema validation for data type XML only
    • In addition: materialized views
      – Extract selected XML content from documents
      – Materialize this content into so-called side tables
      – Side tables are defined in the Document Access Definition (DAD)

13.3 DB2 V9 [IBM06a]

• "pureXML and relational hybrid database"

13.3 DB2 V9 [IBM06b]

• Ways to put XML data into the database (pureXML)

13.3 DB2 V9 [IBM06b]

• Ways to get XML data out of the database (pureXML)

13.3 DB2 V9 [Tür08]

• pureXML: queries and indexes
  – Application of SQL in XQuery:

XQUERY db2-fn:xmlcolumn('T1.XML1')

    • Delivers the value of column xml1 of table t1 as a node sequence (the column must be of type XML; the name argument is case-sensitive, and unquoted SQL identifiers are stored in upper case)

XQUERY db2-fn:sqlquery('SELECT xml1 FROM t1')

    • Delivers the XML values returned by the single-column query on t1 as a node sequence (the column must be of type XML)

  – Definition of a path index:

CREATE INDEX Idx_Author_Path ON Book (Content)
GENERATE KEY USING XMLPATTERN '//Author' AS SQL VARCHAR(50)
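Conceptually, an index over the pattern `//Author` collects the value of every matching node and remembers which rows contain it. The sketch below illustrates that idea in Python only; it says nothing about DB2's actual index structures. ROWS is a hypothetical stand-in for the column Content of table Book, and the function name is invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical contents of the XML column Content of table Book, keyed by row id.
ROWS = {
    1: "<Book><Author>Gunter Saake</Author><Author>Can Türker</Author></Book>",
    2: "<Book><Author>Gunter Saake</Author></Book>",
}

def build_pattern_index(rows, tag):
    """Map each string value found under //tag to the sorted row ids containing it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for node in ET.fromstring(text).iter(tag):   # all descendants named `tag`
            index["".join(node.itertext())].add(row_id)
    return {key: sorted(ids) for key, ids in index.items()}

author_index = build_pattern_index(ROWS, "Author")
print(author_index)
```

A lookup such as `author_index["Gunter Saake"]` then returns the candidate rows without parsing every document again, which is the benefit a query with a predicate on Author gets from such an index.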

13.3 DB2 V9 [Tür08]

• XML Extender: mapping between XML and SQL

13.3 DB2 V9 [Tür08]

• XML Extender: tables with XML types
  – XML extension setup with the XML Extender Admin Wizard or the command window:

> dxxadm enable_db XMLDB

  – Definition of tables accepting XML documents:
    • Variant 1: create with the XML Extender Admin Wizard
    • Variant 2: SQL

CREATE TABLE Buch (Inhalt DB2XML.XMLVARCHAR)

  – Insertion of an XML document:

INSERT INTO Buch (Inhalt)
VALUES (DB2XML.XMLVARCHARFromFile('C:\XMLDIR\buch01.xml'))

13.3 DB2 V9 [Tür08]

• XML Extender: queries
  – The XML Extender offers SQL functions for queries and updates
    • Extract functions:

DB2XML.EXTRACT<datatype>(<XML value expression>, <XPath expression>)

    • Example:

SELECT a.RETURNEDVARCHAR
FROM Buchlob, TABLE(DB2XML.EXTRACTVARCHARS(Inhalt, '//Autor')) a

  – Limited support of the SQL/XML standard
    • XMLAGG
    • XMLELEMENT
    • XMLATTRIBUTES

13.3 DB2 V9 [Tür08]

• ExtractXXX(<XML value expression>, <XPath expression>)

Source: "IBM DB2 Universal Database XML Extender Administration and Programming, Version 8", 2002

13.3 DB2 V9 [Tür08]

• XML Extender: updates
  – Updates are possible with special XML Extender methods
  – Syntax:

DB2XML.UPDATE(<XML value expression>, <XPath expression>, <new value>)

  – Restriction: predicates on elements are not supported
    • Example of an unsupported predicate:

UPDATE Buchlob
SET Inhalt = DB2XML.UPDATE(Inhalt, '//Verlag[text()="dpunkt"]/@Ort', 'Zürich')

    • Example of a supported predicate:

UPDATE Buchlob
SET Inhalt = DB2XML.UPDATE(Inhalt, '//Buch[@ISBN="3-89864-148-1"]/Verlag/@Ort', 'Köln')

  – With the XML column approach, updates are propagated to the side tables automatically
  – In pureXML, an XML value can only be replaced as a whole

13.3 DB2 V9 [Tür08]

• XML Extender: indexing
  – Index support
    • Value index (B-tree, bitmap, etc.) on side tables (XML Extender)
    • Full-text index (with the Text Extender) on XML types
  – Extension of the full-text index for information retrieval (IR) on XML
    • Path information is included in the index
    • Support for path expressions
    • Example (the MODEL clause names the retrieval model):

SELECT Inhalt
FROM Buchlob
WHERE contains(dscrHandel, 'MODEL order SECTION(//Buch/Beschreibung) "Datenbank"') = 1

13.3 DB2 V9 [Tür08]

• Summary IBM DB2 XML support

XML storage model      Extensible, object-relational
Schema definition      Validation possible
Storage type           Model-based (pureXML), text-based or user-defined schema-based (XML Extender)
Mapping DB ↔ XML       DAD (XML Extender)
XML data type          Available (pureXML)
Value/function index   Standard DBS indexes on side tables
Full-text index        With the Text Extender
Path index             Available
Queries                SQL/XML with XQuery support
Full-text search       With the Text Extender
Manipulation           SQL functions with XPath


13.4 SQL Server [Tür08]

• Microsoft SQL Server architecture

[Figure: architecture with the components Application, XML documents and Database]

13.4 SQL Server [Tür08]

• Mapping XML data to relational databases
  – Four storage variants:
    • Native (binary) storage
    • Text-based storage as CLOB
    • Model-based storage according to the EDGE approach
    • Schema-based storage via STORED queries
  – Data type XML with methods based on XQuery (method names are lower case and case-sensitive)
    • query() – evaluates an XQuery and returns a value of type XML
    • value() – evaluates an XQuery and returns a scalar SQL value
    • exist() – returns 1 if the XQuery result is not empty
    • modify() – updates a value of type XML
    • nodes() – returns the subtrees of an XML value that an XQuery selects, one per row
  – Integrated usage of SQL and XQuery
    • Access to SQL data in XQuery via sql:column() and sql:variable()
    • Evaluation of XQuery expressions in SQL via the XML methods above

13.4 SQL Server [Tür08]

• Native storage: table definition
  – Schema registration

CREATE XML SCHEMA COLLECTION BuchXSD AS '<?xml version="1.0"?>…'

  – Table definition

CREATE TABLE Buch (
  Id INT PRIMARY KEY,
  Inhalt XML(BuchXSD)
)

  – Insertion of an XML document from a file

INSERT INTO Buch
SELECT 1, xCol
FROM (SELECT *
      FROM OPENROWSET (BULK 'C:\XMLDIR\buch1.xml', SINGLE_BLOB) AS xCol)
     AS R(xCol)

13.4 SQL Server [Tür08]

• Native storage: SQL/XML queries and updates
  – Find all author elements of books whose first author is "Gunter Saake"

SELECT Inhalt.query('//Autor') AS Autoren
FROM Buch
WHERE Inhalt.exist('/Buch[Autor[1] = "Gunter Saake"]') = 1

Autoren
<Autor>Gunter Saake</Autor><Autor>Ingo Schmitt</Autor><Autor>Can Türker</Autor>
<Autor>Gunter Saake</Autor><Autor>Kai-Uwe Sattler</Autor>

  – Update the value of the attribute "Ort" (city) to "Zürich" for publisher elements whose publisher is "dpunkt"

UPDATE Buch
SET Inhalt.modify('replace value of (//Verlag[. = "dpunkt"]/@Ort)[1] with "Zürich"')

13.4 SQL Server [Tür08]

• Native storage: indexing
  – Definition of a primary XML index:

CREATE PRIMARY XML INDEX Idx_Inhalt ON Buch (Inhalt)

    • Creates a clustered index with entries of the form (ID, ORDPATH, TAG, NODETYPE, VALUE, PATH_ID, ...)
    • Necessary in order to create secondary indexes
  – Secondary XML index types:
    • Path index (path, value)
    • Property index (primary key, path, value)
    • Value index (value, path)
  – Definition of a secondary XML index:

CREATE XML INDEX Idx_Inhalt_Path ON Buch (Inhalt)
USING XML INDEX Idx_Inhalt FOR <index type>

<index type> := PATH | PROPERTY | VALUE

  – A full-text index is also supported:

CREATE FULLTEXT INDEX ON Buch (Inhalt) KEY INDEX b

13.4 SQL Server [Tür08]

• Model-based storage with EDGE
  – Invoking OPENXML without a WITH clause creates an EDGE table

Column        Datatype  Task
id            bigint    unique node id
parentid      bigint    parent node id
nodetype      int       distinguishes elements, attributes, comments
localname     nvarchar  tag
prefix        nvarchar  XML namespace prefix
namespaceuri  nvarchar  XML namespace URI
datatype      nvarchar  datatype (derived from DTD or XML schema)
prev          bigint    id of previous node (in document order)
text          ntext     node content

13.4 SQL Server [Tür08]

• Model-based storage with EDGE
  – EDGE table:

EXEC sp_xml_preparedocument @hdoc OUTPUT, @xmldoctext
INSERT INTO EDGE
SELECT *
FROM OpenXML (@hdoc, '', 0)
EXEC sp_xml_removedocument @hdoc

id  parentid  nodetype  localname  prefix  namespaceuri  datatype  prev  text
0   NULL      1         book       NULL    NULL          NULL      NULL  NULL
...
17  6         3         #text      NULL    NULL          NULL      NULL  'Vossen'
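How a document turns into edge rows can be shown with a small tree walk. This Python sketch is an illustration only and does not reproduce OPENXML: it emits just the columns id, parentid, nodetype, localname and text, numbers nodes consecutively (the ids SQL Server assigns may differ), ignores text that follows a child element, and uses a made-up input document. The node type codes 1, 2 and 3 for element, attribute and text match the example rows above.

```python
import xml.etree.ElementTree as ET

ELEMENT, ATTRIBUTE, TEXT = 1, 2, 3   # node type codes as in the example rows

def shred(xml_text):
    """Return edge rows (id, parentid, nodetype, localname, text) in document order."""
    rows = []

    def add(parent_id, nodetype, localname, text=None):
        node_id = len(rows)
        rows.append((node_id, parent_id, nodetype, localname, text))
        return node_id

    def visit(element, parent_id):
        element_id = add(parent_id, ELEMENT, element.tag)
        for name, value in element.attrib.items():
            attribute_id = add(element_id, ATTRIBUTE, name)
            add(attribute_id, TEXT, "#text", value)      # attribute value as text child
        if element.text and element.text.strip():
            add(element_id, TEXT, "#text", element.text.strip())
        for child in element:
            visit(child, element_id)

    visit(ET.fromstring(xml_text), None)
    return rows

edges = shred('<book isbn="1"><author>Vossen</author></book>')
for row in edges:
    print(row)
```

Because every row only points to its parent, the same generic table can hold documents of any structure, which is what makes the approach model-based rather than schema-based.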

13.4 SQL Server [Tür08]

• Schema-based storage with STORED queries
  – SQL extension with OPENXML
  – OPENXML transforms XML contents into database tables (shredding)
  – OPENXML therefore offers a way to implement STORED queries
  – Example for the realization of a STORED query:

EXEC sp_xml_preparedocument @hdoc OUTPUT, @xmldoctext
INSERT INTO book
SELECT *
FROM OpenXML (@hdoc, '//book', 0) WITH (
  title     NVARCHAR(3000) './title',
  publisher NVARCHAR(200)  './publisher',
  isbn      NVARCHAR(15)   './isbn'
)
EXEC sp_xml_removedocument @hdoc
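The shredding step can be mimicked with a few lines of Python and SQLite. This is a sketch under assumptions rather than a translation of OPENXML: the wrapper element library, the sample documents and the function name shred_books are invented for illustration, and each column is taken from the child element of the same name, as in the column patterns above.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input; the element names follow the column patterns of the slide.
XML_DOC = """
<library>
  <book><title>Objektdatenbanken</title><publisher>Thomson</publisher><isbn>3-8266-0258-7</isbn></book>
  <book><title>XML &amp; Datenbanken</title><publisher>dpunkt</publisher><isbn>3-89864-148-1</isbn></book>
</library>
"""

COLUMNS = ("title", "publisher", "isbn")   # one child-element pattern per column

def shred_books(conn, xml_text):
    """Insert one row per book element, taking each column from a child element."""
    root = ET.fromstring(xml_text)
    rows = [tuple(book.findtext(col) for col in COLUMNS) for book in root.iter("book")]
    conn.executemany("INSERT INTO book (title, publisher, isbn) VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (title TEXT, publisher TEXT, isbn TEXT)")
inserted = shred_books(conn, XML_DOC)
stored = conn.execute("SELECT isbn FROM book ORDER BY isbn").fetchall()
print(inserted, stored)
```

Unlike the generic edge table, the target table here is tailored to the document type, so the data can afterwards be queried with plain SQL.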

13.4 SQL Server [Tür08]

• Mapping of databases to XML
  – Variant 1: standard transformation with SQL SELECT and the FOR XML clause
    • FOR XML RAW: transformation into row XML elements and XML attributes
    • FOR XML AUTO:
      – Semantically rich XML element names
      – Foreign key relationships are transformed into hierarchies
    • FOR XML EXPLICIT: the user controls the XML assembly through metadata (EDGE)
  – Variant 2: user-defined XML view
    • Use of an (available) XML schema
    • Annotation of the schema with information about tables and columns
    • Access from the application to the XML view via:
      – IIS functionality
      – ADO (ActiveX Data Objects), middleware for DB access

13.4 SQL Server [Tür08]

• Updates
  – SQL Server does not offer functions to update XML documents stored as CLOBs
    • This heavily restricts the text-based approach
  – Updates for the schema-based approach are possible via so-called updategrams
    • Build on annotated XML schemas
    • Updates are specified as an XML document
    • New namespace: xmlns:updg="urn:schemas-microsoft-com:xml-updategram"
      – Element before: definition of the previous state (to be modified)
      – Element after: definition of the new state
    • Different update operations through varying element contents
      – Insert: the before element remains empty
      – Delete: the after element remains empty
      – Update: both elements have non-empty contents
    • Automatic execution of the necessary database operations

13.4 SQL Server [Tür08]

• Updates: updategram example
  – Update of publisher information

<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
  <updg:sync>
    <updg:before>
      <Buch>
        <Titel> Objektdatenbanken </Titel>
        <ISBN>3-8266-00258-7 </ISBN>
        <Verlag> Thomson </Verlag>
      </Buch>
    </updg:before>
    <updg:after>
      <Buch>
        <Titel> Objektdatenbanken </Titel>
        <ISBN>3-8266-00258-7 </ISBN>
        <Verlag> International Thomson Publishing </Verlag>
      </Buch>
    </updg:after>
  </updg:sync>
</ROOT>
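The rule that the contents of the before and after elements determine the operation can be checked mechanically. The following Python sketch only classifies each sync block as insert, delete or update; it does not execute anything against a database, and the function name classify and the shortened sample updategram are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

UPDG = "urn:schemas-microsoft-com:xml-updategram"

def classify(updategram_text):
    """Return 'insert', 'delete' or 'update' for each sync block, in order."""
    root = ET.fromstring(updategram_text)
    operations = []
    for sync in root.iter(f"{{{UPDG}}}sync"):
        before = sync.find(f"{{{UPDG}}}before")
        after = sync.find(f"{{{UPDG}}}after")
        has_before = before is not None and len(before) > 0   # any child elements?
        has_after = after is not None and len(after) > 0
        if has_before and has_after:
            operations.append("update")
        elif has_after:
            operations.append("insert")
        elif has_before:
            operations.append("delete")
    return operations

EXAMPLE = """
<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
  <updg:sync>
    <updg:before><Buch><Verlag>Thomson</Verlag></Buch></updg:before>
    <updg:after><Buch><Verlag>International Thomson Publishing</Verlag></Buch></updg:after>
  </updg:sync>
  <updg:sync>
    <updg:before/>
    <updg:after><Buch><Titel>Objektdatenbanken</Titel></Buch></updg:after>
  </updg:sync>
</ROOT>
"""

print(classify(EXAMPLE))
```

The first block has content on both sides and is therefore an update; the second has an empty before element and is an insert.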

13.4 SQL Server [Tür08]

• Summary SQL Server XML support

XML storage model   Relational
Schema definition   Inline DTD or XML schema
Storage type        Native: XML column
                    Text-based: CLOB column
                    Model-based: with OPENXML
                    User-defined schema-based: with OPENXML STORED queries
Mapping DB ↔ XML    Automatic: FOR XML clause
                    User-defined: XSD annotations
XML data type       Available
Value index         Available
Full-text index     No XML-specific functions
Path index          Available
Queries             SQL extensions (query and value not compatible with SQL/XML), XQuery
Manipulation        XML method modify(), updategrams


13.5 Tamino [Tür08]

• Architecture

13.5 Tamino [Tür08]

• Architecture (2)

[Figure labels: XML Output; Query (URL); XML Objects, DTDs; Data from external sources and/or internal data storage; Data to external sources and/or internal data storage]

13.5 Tamino [Tür08]

• Storage structures: mapping of XML
  – Tamino uses "native" storage structures for XML data
  – Native storage is supplemented with several classical index types
    • B-tree index
    • Full-text index
    • Path index
  – Storage alternatives:
    • Storage of well-formed XML documents without a schema
    • Storage of valid XML documents
      – Annotation of the schema definition with storage alternatives
  – Storage hierarchy:
    • Tier 1: Tamino
    • Tier 2: Collection
    • Tier 3: Document type (defined by a set of XML schema definitions)
    • Tier 4: Document instance

13.5 Tamino [Tür08]

• Storage: example schema with annotations for a text index

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tsd="http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
  <xs:annotation> <xs:appinfo>
    <tsd:schemaInfo name="book">
      <tsd:collection name="books"></tsd:collection>
      <tsd:doctype name="book">
        <tsd:logical> <tsd:content>open</tsd:content> </tsd:logical>
      </tsd:doctype>
    </tsd:schemaInfo>
  </xs:appinfo> </xs:annotation>
  <xs:element name="book">
    <xs:complexType> <xs:sequence>
      <xs:element name="title" type="xs:string"></xs:element>
      <xs:element name="summary" type="xs:string">
        <xs:annotation> <xs:appinfo>
          <tsd:elementInfo>
            <tsd:physical> <tsd:native>
              <tsd:index> <tsd:text></tsd:text> </tsd:index>
            </tsd:native> </tsd:physical>
          </tsd:elementInfo>
        </xs:appinfo> </xs:annotation>
      </xs:element>
    </xs:sequence> </xs:complexType>
  </xs:element>
</xs:schema>

13.5 Tamino [Tür08]

• Queries
  – Access possibilities
    • Program-controlled, e.g. via DCOM components
    • Ad-hoc queries with the X-Plorer query tool
    • "Interactive Interface"
  – Supported query languages
    • XPath 1.0 dialect with extensions for text search (also possible without an index)
      – Containment (~=)

/Buch[Titel ~= "Datenmodelle"]/Beschreibung

      – Wildcard character (*)

/*[. ~= "*XML*"]

      – Consideration of context (NEAR)

/*[/Autor ~= "Gunter" NEAR "Saake"]

    • XQuery dialect

13.5 Tamino [Tür08]

• Updates
  – Operations
    • Delete:

UPDATE DELETE $buch//Verlag[@Ort="Zürich"]/@Ort

    • Insert:

UPDATE INSERT <Preis Waehrung="EUR">35</Preis>
INTO $buch[@ISBN="3-8266-0258-7"]

    • Replace:

UPDATE REPLACE $buch//Verlag[@Ort="Zürich"]/@Ort
WITH ATTRIBUTE Ort {"Wiesbaden"}

13.5 Tamino [Tür08]

• Indexing
  – Classical indexes for data
    • Numbers and strings
  – Text indexes for document-centric parts
    • With wildcards
  – Structure index
    • Full
    • Condensed
  – Combined index
    • Multiple elements and attributes, even on different levels
  – Multi-path index
    • Different paths indexed together
  – Reference index
    • Hierarchy-aware index

13.5 Tamino [Tür08]

• Summary Tamino

XML storage model   Native
Schema definition   Validation possible
Storage type        Model-based
Mapping DB ↔ XML    Native
XML data type       Available
Value index         Available
Full-text index     Available
Path index          Available
Queries             Tamino X-Query (with extensions and small differences compared to W3C XQuery)
Full-text search    Supported
Manipulation        Supported

13.6 Summary

13.X Overview

1. Introduction
2. XML Basics
3. Schema definition
4. XML query languages I
5. Mapping relational data to XML
6. SQL/XML
7. XML processing
8. XML query languages II – XQuery Data Model
9. XML query languages III – XQuery
10. XML storage I – Overview
11. XML storage II
12. Updates
13. Systems

13.X References

• "XML und Datenbanken" [Tür08]
  – Can Türker
  – Lecture, University of Zurich, 2008

• "XML und Datenbanken" [KM03]
  – M. Klettke, H. Meyer
  – dpunkt.verlag, 2003

• "DB2 9 pureXML Guide" [IBM06a]
  – IBM
  – December 2006

• "DB2 Version 9. XML Guide" [IBM06b]

Questions, Ideas, Comments

• Now, or ...
• Room: IZ 232
• Office hour: Tuesday, 12:30 – 13:30, or by appointment
• Email: [email protected]