
DB2 9 for z/OS pureXML Performance and Best Practices

Page 1: DB2 9 for z/OS pureXML Performance and Best Practices

© 2010 IBM Corporation

Information Management

DB2 9 for z/OS pureXML Performance and Best Practices

Information Management

June, 2010

Page 2: DB2 9 for z/OS pureXML Performance and Best Practices


Agenda

XML performance

Performance monitoring and tuning

Best practices

Note: the following performance numbers are not the latest; performance has improved since these measurements were taken.

Page 3: DB2 9 for z/OS pureXML Performance and Best Practices


What You Can Do with pureXML

Create tables with XML columns or alter table add XML columns

Insert XML data, optionally validated against schemas

Create indexes on XML data

Efficiently search XML data

Extract XML data

Decompose XML data into relational data or create relational view

Construct XML documents from relational and XML data

Handle XML objects in all the utilities and tools

[Diagram: XML documents (XMLDOC) stored in an XML column, with an XML index.]

Managing XML data the same way as relational data

Page 4: DB2 9 for z/OS pureXML Performance and Best Practices


Storage for UNIFI Messages

[Bar chart: storage in KB (axis 0 to 180) for 96 sample documents, comparing Original Docs, DB2 XML Strip WS, and DB2 XML Pres WS, each uncompressed and compressed. Strip WS: strip whitespace. Pres WS: preserve whitespace. Data labels on the chart: 67%, 70%, 58%, 52%, 71%.]

Page 5: DB2 9 for z/OS pureXML Performance and Best Practices


Whitespace option and Compression

[Bar chart: size in KB (axis 0 to 180) for Original Docs, XML Strip WS, and XML Preserve WS, shown for two document sets (UNIFI messages and JPM), non-compressed versus compressed. Callouts: 70% saving, 80% saving.]

Note: UNIFI is the International Standard ISO 20022, UNIversal Financial Industry message scheme.

Page 6: DB2 9 for z/OS pureXML Performance and Best Practices


Table Compression Impact

[Bar chart: elapsed (EL) and CPU time in seconds (axis 0 to 500) for insert and Xscan, comparing a non-compressed XML table with a compressed XML table.]

Page 7: DB2 9 for z/OS pureXML Performance and Best Practices


Insert Performance (Batch)

Measurement in March 2007, z9, DS8300, single thread, documents in EBCDIC

[Bar chart: elapsed and CPU time in seconds (axis 0 to 300) by document size x number of documents: 1K x 1,000,000; 10K x 100,000; 100K x 10,000; 1M x 1,000; 10M x 100.]

3.9 million 10K docs per hour, or about 1100 docs/sec
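The hourly and per-second figures on this slide are the same rate in two units. A minimal sketch of the conversion follows; the function name is hypothetical and only the 3.9 million figure comes from the slide.

```python
def docs_per_second(docs_per_hour: float) -> float:
    """Convert an hourly document rate into a per-second rate."""
    return docs_per_hour / 3600.0


# 3.9 million 10K documents per hour, as quoted on the slide.
insert_rate = docs_per_second(3.9e6)

# Roughly 1083 docs/sec, which the slide rounds to 1100.
print(f"{insert_rate:.0f} docs/sec")
```

The same conversion applies to any of the hourly rates quoted in this deck.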

Page 8: DB2 9 for z/OS pureXML Performance and Best Practices


Insert XML – with indexes

[Bar chart: insert elapsed and CPU time as a percentage (axis 0% to 160%) for XML, XML with 1 index, and XML with 2 indexes. Data labels in extracted order: 100%, 100%, 111%, 116%, 125%, 138%.]

Page 9: DB2 9 for z/OS pureXML Performance and Best Practices


XML Index Create or Rebuild

[Bar chart: time in seconds (axis 0 to 140) to create or rebuild an XML index on each of three XML patterns: //e, /a/b/@c, /a/b/f/g. Series: Create Elapsed, Create CPU, Rebuild Elapsed, Rebuild CPU.]

Page 10: DB2 9 for z/OS pureXML Performance and Best Practices


Insert Performance – compare w/ CLOB

(average of 1K to 10M document insert performance)

[Bar chart: elapsed and CPU time as a percentage (axis 0% to 140%) for XML, XML with one index, and CLOB. Data labels in extracted order: 100%, 100%, 111%, 116%, 71.36%, 64%.]

Page 11: DB2 9 for z/OS pureXML Performance and Best Practices


Fetch Performance (Batch)

Measurement in March 2007, z9, DS8300, single thread, documents in EBCDIC

[Bar chart: elapsed and CPU time in seconds (axis 0 to 80) by document size x number of documents: 1K x 1,000,000; 10K x 100,000; 100K x 10,000; 1M x 1,000.]

9.3 million 10K docs per hour, or about 2580 docs/sec

Page 12: DB2 9 for z/OS pureXML Performance and Best Practices


XML Index Exploitation

Page 13: DB2 9 for z/OS pureXML Performance and Best Practices


XML Insert v.s. Validation v.s. Decomposition

[Bar chart: elapsed and CPU time in seconds (axis 0 to 12) for Insert, Insert with Validation, and Decomposition. 4K document, single thread, 3000 repetitions.]

Page 14: DB2 9 for z/OS pureXML Performance and Best Practices


Large Sample Tax Document Insert Performance

IRS doc size    Elapsed time (resp time)    CPU time
25M             2.09                        1.284
250M            17.663                      13.32
378M            26.089                      19.773
500M            35.0333                     26.822
722M            51.917                      40.064

z9-109, one LPAR with 3 dedicated CPs. Documents were stored remotely on an AIX box and inserted using a Java application. Time in seconds.

Page 15: DB2 9 for z/OS pureXML Performance and Best Practices


z/OS XML Specialty Engine Support

[Bar chart: XMLSS usage in insert and load. CPU time (ms for insert, seconds for load; axis 0 to 16) split into zAAP-eligible and CP portions for Order_insert (1-2K), CustAcc_insert (4-20KB), and LOAD (2.5MB). Data labels: 14%, 32%, 48%.]

Page 16: DB2 9 for z/OS pureXML Performance and Best Practices


LOAD Testing

Job      Number of rows   Size of XML documents (bytes)   Number of user XML indexes   CPU time in general CP (sec)   CPU time in zAAP   Redirection percentage
LOAD1    300,000          4K-20K                          4                            191                            40                 17%
LOAD2    300,000          4K-20K                          2                            152                            31                 17%
LOAD3    300,000          4K-20K                          1                            93                             38                 29%
LOAD4    2,000            2.5M                            2                            329                            82                 20%
LOAD5    2,000            2.5M                            2                            330                            82                 20%
LOAD6    200              25M                             1                            254                            64                 20%
LOAD7    200              25M                             1                            114                            64                 36%
Average                                                                                209                            57                 21%

Processor: IBM System z9 Enterprise Class (z9 EC)
LPAR configuration: 4 general purpose CPs, 1 zAAP, 1 zIIP, all dedicated
Memory: 24GB
Storage: IBM DS8300
Operating system: z/OS Version 1.9
DB2: DB2 9, Feb 2008 PTF level
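The slide does not state how the redirection percentage is derived. As an assumption, the sketch below computes it as zAAP time divided by total CPU time (general CP plus zAAP); that formula reproduces every row of the published column, including the average. The function and variable names are hypothetical.

```python
def redirection_pct(cp_seconds: float, zaap_seconds: float) -> float:
    """Share of total CPU time that ran on the zAAP (assumed formula)."""
    return 100.0 * zaap_seconds / (cp_seconds + zaap_seconds)


# (general CP seconds, zAAP seconds) per job, copied from the LOAD table.
load_jobs = {
    "LOAD1": (191, 40),
    "LOAD2": (152, 31),
    "LOAD3": (93, 38),
    "LOAD4": (329, 82),
    "LOAD5": (330, 82),
    "LOAD6": (254, 64),
    "LOAD7": (114, 64),
}

percentages = {
    job: round(redirection_pct(cp, zaap)) for job, (cp, zaap) in load_jobs.items()
}

for job, pct in percentages.items():
    print(f"{job}: {pct}%")
```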

Page 17: DB2 9 for z/OS pureXML Performance and Best Practices


TPoX Insert Test

TPoX mass insert                                       z/OS 1.8 (Case A)   z/OS 1.9 (Case B)
LPAR CPU utilization                                   76.45%              63.71%
Number of concurrent threads                           20                  20
XML inserts per second (average)                       2363                2269
DB2 class 1 average elapsed time (ms, per commit)      77.565              88.501
DB2 class 1 average CPU time (ms, per commit)          7.909               7.586
Number of transactions per second (10 inserts/tx)      236.3               226.9
Internal throughput rate                               309.09              356.15
XML System Services CPU usage in LPAR                  32.30%              18.23%

Processor: IBM System z9 Enterprise Class (z9 EC)
LPAR configuration: 3 dedicated general purpose CPs (no zIIP, no zAAP)
Memory: 24GB
Storage: IBM DS8300
Case A: operating system z/OS Version 1.8; DB2 9, June 2007 PTF level
Case B: operating system z/OS Version 1.9; DB2 9, Feb 2008 PTF level
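The internal throughput rate in this table appears to be the measured transaction rate normalized to 100% CPU utilization. The sketch below assumes that definition (it is not spelled out on the slide) and checks it against both cases; the function name is hypothetical.

```python
def internal_throughput_rate(tps: float, cpu_utilization_pct: float) -> float:
    """Transactions per second scaled to a fully busy LPAR (assumed definition)."""
    return tps / (cpu_utilization_pct / 100.0)


# Values from the TPoX mass insert table.
itr_case_a = internal_throughput_rate(236.3, 76.45)  # z/OS 1.8
itr_case_b = internal_throughput_rate(226.9, 63.71)  # z/OS 1.9

print(f"Case A ITR: {itr_case_a:.2f}")
print(f"Case B ITR: {itr_case_b:.2f}")
```

Read this way, Case B delivers slightly fewer transactions per second but uses noticeably less CPU to do so, which is why its ITR is higher.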

Page 18: DB2 9 for z/OS pureXML Performance and Best Practices


TPoX Mixed Transaction Test (1/2)

Processor: IBM System z9 Enterprise Class (z9 EC)
LPAR configuration: 3 general purpose CPs, dedicated (no zAAP or zIIP)
Memory: 24GB
Storage: IBM DS8300
Operating system: z/OS Version 1.9
DB2: DB2 9, Feb 2008 PTF level
Threads: 35

Transaction name          Type of transaction   Transaction weight
Get_order (1)             Query                 12
Get_security (2)          Query                 12
Customer_profile (3)      Query                 12
Account_summary (5)       Query                 12
Get_security_price (6)    Query                 12
Insert_custacc            Insert                20
Insert_order              Insert                20

Page 19: DB2 9 for z/OS pureXML Performance and Best Practices


TPoX Mixed Transaction Test (2/2)

Transaction Average Transaction Response time

Get _order 0.03 second

Get_security 0.03 second

Customer_profile 0.03 second

Account_summary 0.03 second

Get_security_price 0.02 second

Insert_custacc 0.03 second

Insert_order 0.02 second

Transactions per second 1207 tps

CPU utilization 58.3%

Internal Throughput Rate 2068 tps

z/OS XML System Services CPU consumption 3.5%

Page 20: DB2 9 for z/OS pureXML Performance and Best Practices


TPoX Benchmark

[Chart: TPoX benchmark, z10 with 5 CPs, October 2009. ETR in tps (axis 0 to 2500) and CPU busy in % (axis 0% to 100%). Categories: PK81260, PK80732 10, PK80732 20, PK80732 30, PK80732 40. X-axis label: Number of users.]

Page 21: DB2 9 for z/OS pureXML Performance and Best Practices


Agenda

XML performance

Performance monitoring and tuning

Best practices

Page 22: DB2 9 for z/OS pureXML Performance and Best Practices


Performance Monitoring and Tuning

Because XML native storage is built on top of the regular table space structure, DB2 Performance Expert needs no special changes to support XML apart from minor points, such as the new XML locks (type x'35').

XML performance problems can be analyzed through accounting traces and performance traces.

There is a new load module for XML: DSNNXML.

XML indexes have the same considerations as other indexes.

Use the REORG utility to maintain order and free space.

Run RUNSTATS to collect the statistics that help the optimizer pick XML indexes.

Page 23: DB2 9 for z/OS pureXML Performance and Best Practices


XML Query Performance Issues

■ 85% of the performance issues relate to:
– Query execution plans
– Index usage (indexing presentation)
– Proper coding of SQL/XML and XQuery expressions (Best Practices section)

Page 24: DB2 9 for z/OS pureXML Performance and Best Practices


How to obtain and analyze XML query plans

■ Create Explain tables
– Use member DSNTESC of the SDSNSAMP library
– Option E from menu of DB2 admin tool (DSN_STATEMNT_TABLE)
– Use Visual Explain
• Optim Development Studio
• IBM DB2 Optimization Service Center for DB2 for z/OS (OSC)

■ Gather explain information
– Use SPUFI: prefix query with EXPLAIN PLAN SET QUERYNO
– SELECT from PLAN_TABLE

Page 25: DB2 9 for z/OS pureXML Performance and Best Practices


Use RUNSTATS

Use RUNSTATS to collect statistics for XML data and indexes so the optimizer can pick the right access methods

LISTDEF DBACORDTSLIST INCLUDE TABLESPACES DATABASE DBACORD

RUNSTATS TABLESPACE LIST DBACORDTSLIST TABLE(ALL) INDEX(ALL)

Page 26: DB2 9 for z/OS pureXML Performance and Best Practices


Agenda

XML performance

Performance monitoring and tuning

Best practices

Page 27: DB2 9 for z/OS pureXML Performance and Best Practices


Best Practices

Tip 1: Choose the right table and storage design

Tip 2: Choose the right XML document granularity

Tip 3: Be aware of XML schema validation overhead

Tip 4: Avoid encoding conversion during XML insert and retrieval

Tip 5: In XPath expressions, use fully specified paths as much as possible

Tip 6: Define lean XML indexes

Tip 7: Put document filtering predicates in XMLEXISTS instead of XMLQUERY

Tip 8: Use square brackets [ ] to avoid Boolean predicates in XMLEXISTS

Tip 9: Use RUNSTATS to collect statistics for XML data and indexes

Tip 10: Use SQL/XML publishing views to expose relational data as XML

Tip 11: Use XMLTABLE views to expose XML as relational data

Tip 12: Use SQL/XML statements with parameter markers and host vars

Page 28: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 1: Decision making: XML input => storage

[Decision diagram; the flow between boxes and the Yes/No branch labels were lost in extraction. Recoverable labels:
Regulatory requirements: Intact digital signature (preserve whitespace); Significant data (strip whitespace); Flexible (Relational/XML)
Search in XML: Never; Yes, return XML always; Light reporting; Heavy analytics
Structures: Regular, fixed; Complex, flexible
Storage choices: LOB, VARCHAR, VARBIN; XML with XML indexes; XML with XMLTABLE(); Relational; Relational with XML (can be materialized)]

Page 29: DB2 9 for z/OS pureXML Performance and Best Practices


Some considerations

Tedious normalization and frustrating schema changes are indicators for using native XML.

Store hybrid or redundant data in relational/XML when:
– Fully normalized storage is overkill
– Referential integrity is needed: extract into relational columns
– Store in XML, but materialize frequently used fields in relational for heavy analytic applications
– Document size

Always use compression for XML data

Page 30: DB2 9 for z/OS pureXML Performance and Best Practices


Table Design

Mixed document types in one table
– Flexibility in exchange for overhead (such as index maintenance)

Separate tables for different document types
– To avoid overhead

[Diagram: one table DOCTAB (column labels as extracted: XMLDOCID, DOCTYPE) versus separate tables DOCTYPE1TAB and DOCTYPE2TAB (column label as extracted: XMLDOCID).]

Page 31: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 2: Choose the right XML document granularity

Small vs. large documents? (KBs vs. MBs)

XML indexes filter at the document level

Smaller documents tend to perform better

But, rule of thumb: document granularity should match the predominant granularity of access

Page 32: DB2 9 for z/OS pureXML Performance and Best Practices


Document Granularity: Example

<order date='2004-11-05'> <customer>Doe</customer> <part key='82'> <quantity>5</quantity> <price>5.00</price> </part> <part key='83'> <quantity>11</quantity> <price>19.95</price> </part> </order>

<order date='2004-11-06'> <customer>Doe</customer> <part key='19'> <quantity>23</quantity> <price>1.99</price> </part> <part key='48'> <quantity>1</quantity> <price>24.95</price> </part> </order>

<allorders> <order date='2004-11-05'> <customer>Doe</customer> <part key='82'> <quantity>5</quantity> <price>5.00</price> </part> <part key='83'> <quantity>11</quantity> <price>19.95</price> </part> </order> <order date='2004-11-06'> <customer>Doe</customer> <part key='19'> <quantity>23</quantity> <price>1.99</price> </part> <part key='48'> <quantity>1</quantity> <price>24.95</price> </part> </order> </allorders>

<part key='83'> <quantity>11</quantity> <price>19.95</price> </part>

<part key='48'> <quantity>1</quantity> <price>24.95</price> </part>

<part key='82'> <quantity>5</quantity> <price>5.00</price> </part>

<part key='19'> <quantity>23</quantity> <price>1.99</price> </part>

<order date='2004-11-05'> <customer>Doe</customer> <part key='82'/> <part key='83'/> </order>

<order date='2004-11-06'> <customer>Doe</customer> <part key='19'/> <part key='48'/> </order>
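If access is predominantly per order, the coarse `<allorders>` document can be split into one document per `<order>` before it is stored. The sketch below shows that split with Python's standard library; it is an illustration of the granularity choice, not a DB2 facility, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Compact form of the <allorders> document from the slide (prices omitted for brevity).
allorders = ET.fromstring(
    "<allorders>"
    "<order date='2004-11-05'><customer>Doe</customer>"
    "<part key='82'><quantity>5</quantity></part>"
    "<part key='83'><quantity>11</quantity></part></order>"
    "<order date='2004-11-06'><customer>Doe</customer>"
    "<part key='19'><quantity>23</quantity></part>"
    "<part key='48'><quantity>1</quantity></part></order>"
    "</allorders>"
)

# One serialized document per <order> element, ready to be inserted as separate rows.
order_docs = [ET.tostring(order, encoding="unicode") for order in allorders.findall("order")]

for doc in order_docs:
    print(doc)
```

Each resulting string is a complete document, so an XML index on the order date can then filter at the level of a single order rather than the whole collection.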

Page 33: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 3: Beware of Schema Validation Overhead

create table dept(deptID char(8), deptdoc xml);

Validation is optional, and per document (per row):

No validation: insert into dept values (?, ?)

With validation: insert into dept values (?, dsn_xmlvalidate(?, ?))

Validation increases CPU time for inserts and reduces throughput.

Use schema validation if needed.

Avoid schema validation for highest possible insert performance.

Page 34: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 4: Avoid encoding conversion

■ Internally encoded XML: encoding derived from the data, e.g. Unicode Byte-Order Mark or optional XML declaration: <?xml version="1.0" encoding="UTF-8" ?>

■ Externally encoded XML: application encoding determines XML encoding if character type variables are used

■ Internally encoded XML with UTF-8 is preferred

– CLI: use SQL_C_BINARY data buffers rather than SQL_C_CHAR, SQL_C_DBCHAR, SQL_C_WCHAR

– Java: use binary stream (setBinaryStream) rather than string (setString).

– COBOL: SQL BLOB
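"Internally encoded" means the parser works out the encoding from the bytes themselves, which is why binary buffers are preferred over character variables. The sketch below is a loose analogy using Python's standard XML parser, not a DB2 or JDBC API: the same document is passed as raw bytes in two encodings, and in both cases the XML declaration tells the parser how to decode it.

```python
import xml.etree.ElementTree as ET

# The same logical document, serialized as bytes in two different encodings.
latin1_bytes = '<?xml version="1.0" encoding="ISO-8859-1"?><name>José</name>'.encode("iso-8859-1")
utf8_bytes = '<?xml version="1.0" encoding="UTF-8"?><name>José</name>'.encode("utf-8")

# The parser reads the encoding from the XML declaration inside the data,
# so no external (application-level) encoding has to be supplied.
name_from_latin1 = ET.fromstring(latin1_bytes).text
name_from_utf8 = ET.fromstring(utf8_bytes).text

print(name_from_latin1, name_from_utf8)
```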

Page 35: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 5: Use fully specified paths if possible

■ As much as possible, use fully specified XPath expressions rather than wildcards, e.g.
– /customerinfo/phone instead of //phone
– /customerinfo/addr/state instead of /customerinfo/*/state

<customerinfo Cid="1004"> <name>Matt Foreman</name> <addr country="Canada"> <street>1596 Baseline</street> <city>Toronto</city> <state>Ontario</state> <pcode>M3Z-5H9</pcode> </addr> <phone type="work">905-555-4789</phone> <phone type="home">416-555-3376</phone> <assistant> <name>Peter Smith</name> <phone type="home">416-555-3426</phone> </assistant> </customerinfo>
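Besides the cost of searching the whole tree, a descendant path can match different nodes than the full path. The sketch below uses Python's standard library (not DB2) on the sample document to show that `//phone` also picks up the assistant's phone, while `/customerinfo/phone` does not.

```python
import xml.etree.ElementTree as ET

customerinfo = ET.fromstring(
    '<customerinfo Cid="1004">'
    "<name>Matt Foreman</name>"
    '<addr country="Canada"><street>1596 Baseline</street><city>Toronto</city>'
    "<state>Ontario</state><pcode>M3Z-5H9</pcode></addr>"
    '<phone type="work">905-555-4789</phone>'
    '<phone type="home">416-555-3376</phone>'
    '<assistant><name>Peter Smith</name><phone type="home">416-555-3426</phone></assistant>'
    "</customerinfo>"
)

# Equivalent of /customerinfo/phone: only direct children of the root element.
direct_phones = [p.text for p in customerinfo.findall("./phone")]

# Equivalent of //phone: every phone element at any depth.
all_phones = [p.text for p in customerinfo.findall(".//phone")]

print(direct_phones)
print(all_phones)
```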

Page 36: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 6: Lean XML Indexes

create table customer( info XML);

create unique index idx1 on customer(info) generate key using xmlpattern '/customerinfo/@Cid' as sql decfloat;

create index idx2 on customer(info) generate key using xmlpattern '/customerinfo/name' as sql varchar(40);

create index idx3 on customer(info) generate key using xmlpattern '//name' as sql varchar(40);

create index idx4 on customer(info) generate key using xmlpattern '/customerinfo/phone' as sql varchar(40);

<customerinfo Cid="1004"> <name>Matt Foreman</name> <addr country="Canada"> <street>1596 Baseline</street> <city>Toronto</city> <state>Ontario</state> <pcode>M3Z-5H9</pcode> </addr> <phone type="work">905-555-4789</phone> <phone type="home">416-555-3376</phone> <assistant> <name>Peter Smith</name> <phone type="home">416-555-3426</phone> </assistant> </customerinfo>

LUW: "as sql double"; z/OS: "as sql decfloat"

Page 37: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 6: Lean XML Indexes

create table customer( info XML);

create unique index idx1 on customer(info) generate key using xmlpattern '/customerinfo/@Cid' as sql decfloat;

create index idx2 on customer(info) generate key using xmlpattern '/customerinfo/name' as sql varchar(40);

create index idx3 on customer(info) generate key using xmlpattern '//name' as sql varchar(40);

create index idx4 on customer(info) generate key using xmlpattern '//text()' as sql varchar(40);

<customerinfo Cid="1004"> <name>Matt Foreman</name> <addr country="Canada"> <street>1596 Baseline</street> <city>Toronto</city> <state>Ontario</state> <pcode>M3Z-5H9</pcode> </addr> <phone type="work">905-555-4789</phone> <phone type="home">416-555-3376</phone> <assistant> <name>Peter Smith</name> <phone type="home">416-555-3426</phone> </assistant> </customerinfo>

Don't index everything! Very expensive for insert, update, delete!

Page 38: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 6: Lean XML Indexes and Indexing non-leaf Nodes

<customerinfo Cid="1004"> <name>Matt Foreman</name> <addr country="Canada"> <street>1596 Baseline</street> <city>Toronto</city> <state>Ontario</state> <pcode>M3Z-5H9</pcode> </addr> (…) </customerinfo>

Typically not useful!

…xmlpattern '/customerinfo/addr' as sql varchar(128);

Single index entry. Key value = concatenation of all text nodes under "addr":

(/customerinfo/addr, "1596 BaselineTorontoOntarioM3Z-5H9")

Better: 4 separate indexes!

…xmlpattern '/customerinfo/addr/street' as sql varchar(50);
…xmlpattern '/customerinfo/addr/city' as sql varchar(40);
…xmlpattern '/customerinfo/addr/state' as sql varchar(25);
…xmlpattern '/customerinfo/addr/pcode' as sql varchar(10);
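The key value quoted above is simply the string value of the `addr` element, that is, all of its descendant text nodes joined together. A small sketch with Python's standard library (an illustration of the XPath string value, not of DB2's index builder) reproduces it and shows why it is a poor search key.

```python
import xml.etree.ElementTree as ET

addr = ET.fromstring(
    '<addr country="Canada">'
    "<street>1596 Baseline</street><city>Toronto</city>"
    "<state>Ontario</state><pcode>M3Z-5H9</pcode>"
    "</addr>"
)

# String value of a non-leaf node: concatenation of every text node beneath it.
non_leaf_key = "".join(addr.itertext())

# Keys for four separate leaf-level indexes, one per child element.
leaf_keys = {child.tag: child.text for child in addr}

print(non_leaf_key)
print(leaf_keys)
```

A predicate such as city = "Toronto" can be matched against `leaf_keys["city"]`, but not against the concatenated value.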

Page 39: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 7 & 8: Put document filtering predicates in XMLEXISTS instead of XMLQUERY, and use square brackets [ ] to avoid Boolean predicates in XMLEXISTS

■ XMLQUERY function in a SELECT clause does not filter documents or rows, and does not use indexes

■ Document/row-filtering predicates must be in XMLEXISTS in the WHERE clause

■ Predicates in XMLEXISTS must be in square brackets

Page 40: DB2 9 for z/OS pureXML Performance and Best Practices


SQL/XML with XMLQUERY

create table customer( info XML);

customer table:

<customerinfo> <name>Matt Foreman</name> <phone>905-555-4789</phone> </customerinfo>
<customerinfo> <name>Peter Jones</name> <phone>905-123-9065</phone> </customerinfo>
<customerinfo> <name>Mary Poppins</name> <phone>905-890-0763</phone> </customerinfo>

select xmlquery('$i/customerinfo[phone = "905-555-4789"]/name' passing info as "i") from customer

Result: <name>Matt Foreman</name>, 3 record(s) selected. Cannot use an index!

select xmlquery('$i/customerinfo/name' passing info as "i") from customer where xmlexists('$i/customerinfo[phone = "905-555-4789"]' passing info as "i")

Result: <name>Matt Foreman</name>, 1 record(s) selected. Can use an index!
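The difference in row counts follows from where the predicate sits: an expression in the select list is evaluated once per row and may return an empty result, while a predicate in the WHERE clause removes rows. The sketch below mimics both shapes over the three sample documents with Python's standard library; it models the semantics only, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

TARGET_PHONE = "905-555-4789"

rows = [
    ET.fromstring(text)
    for text in (
        "<customerinfo><name>Matt Foreman</name><phone>905-555-4789</phone></customerinfo>",
        "<customerinfo><name>Peter Jones</name><phone>905-123-9065</phone></customerinfo>",
        "<customerinfo><name>Mary Poppins</name><phone>905-890-0763</phone></customerinfo>",
    )
]


def has_phone(doc: ET.Element, phone: str) -> bool:
    """True if any direct phone child of the document equals the given value."""
    return any(p.text == phone for p in doc.findall("phone"))


def names(doc: ET.Element) -> list:
    return [n.text for n in doc.findall("name")]


# Predicate inside the select-list expression: every row yields a result, possibly empty.
select_list_only = [names(doc) if has_phone(doc, TARGET_PHONE) else [] for doc in rows]

# Predicate in the WHERE clause: non-matching rows are filtered out.
where_clause = [names(doc) for doc in rows if has_phone(doc, TARGET_PHONE)]

print(len(select_list_only), select_list_only)
print(len(where_clause), where_clause)
```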

Page 41: DB2 9 for z/OS pureXML Performance and Best Practices


SQL/XML with XMLEXISTS

create table customer( info XML);

customer table:

<customerinfo> <name>Matt Foreman</name> <phone>905-555-4789</phone> </customerinfo>
<customerinfo> <name>Peter Jones</name> <phone>905-123-9065</phone> </customerinfo>
<customerinfo> <name>Mary Poppins</name> <phone>905-890-0763</phone> </customerinfo>

select xmlquery('$i/customerinfo/name' passing info as "i") from customer where xmlexists('$i/customerinfo[phone = "905-555-4789"]' passing info as "i")

Result: <name>Matt Foreman</name>, 1 record(s) selected. Can use an index!

select xmlquery('$i/customerinfo/name' passing info as "i") from customer where xmlexists('$i/customerinfo/phone = "905-555-4789"' passing info as "i")

Result: <name>Matt Foreman</name>, <name>Peter Jones</name>, <name>Mary Poppins</name>, 3 record(s) selected. Cannot use an index! The expression returns true or false, which is not an empty sequence!
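XMLEXISTS tests only whether the expression returns a non-empty sequence. A comparison without square brackets returns a single Boolean item, so the sequence is never empty, even when that item is false. The sketch below models this with plain Python lists standing in for XQuery sequences; all function names are hypothetical.

```python
import xml.etree.ElementTree as ET

TARGET_PHONE = "905-555-4789"

rows = [
    ET.fromstring(text)
    for text in (
        "<customerinfo><name>Matt Foreman</name><phone>905-555-4789</phone></customerinfo>",
        "<customerinfo><name>Peter Jones</name><phone>905-123-9065</phone></customerinfo>",
        "<customerinfo><name>Mary Poppins</name><phone>905-890-0763</phone></customerinfo>",
    )
]


def phone_matches(doc: ET.Element) -> bool:
    return any(p.text == TARGET_PHONE for p in doc.findall("phone"))


def xmlexists(sequence: list) -> bool:
    """Model of XMLEXISTS: true whenever the result sequence is non-empty."""
    return len(sequence) > 0


def bracket_form(doc: ET.Element) -> list:
    """customerinfo[phone = ...]: the matching nodes, or an empty sequence."""
    return [doc] if phone_matches(doc) else []


def boolean_form(doc: ET.Element) -> list:
    """customerinfo/phone = ...: always one item, either True or False."""
    return [phone_matches(doc)]


selected_with_brackets = [d.findtext("name") for d in rows if xmlexists(bracket_form(d))]
selected_with_boolean = [d.findtext("name") for d in rows if xmlexists(boolean_form(d))]

print(selected_with_brackets)
print(selected_with_boolean)
```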

Page 42: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 9: Use RUNSTATS on XML data!

■ RUNSTATS does collect statistics for XML data and XML indexes!

■ The optimizer does use these stats!

LISTDEF DBACORDTSLIST INCLUDE TABLESPACES DATABASE DBACORD

RUNSTATS TABLESPACE LIST DBACORDTSLIST TABLE(ALL) INDEX(ALL)

Page 43: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 10: Use SQL/XML publishing views to expose relational data as XML

■ SQL/XML publishing functions hidden in a view

■ create table unit( unitID char(8), name char(20), manager varchar(20));

■ create view UnitView(unitID, name, unitdoc) as select unitID, name, XMLELEMENT(NAME "Unit",

XMLELEMENT(NAME "ID", u.unitID),

XMLELEMENT(NAME "UnitName", u.name),

XMLELEMENT(NAME "Mgr", u.manager) )

from unit u;

Page 44: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 10: Use SQL/XML publishing views to expose relational data as XML

Queries that perform sub-optimally

select unitdoc from UnitView where xmlexists('$i/Unit[ID = "WWPR"]' passing unitdoc as "i");

Query that performs well: filter on relational

select unitdoc from UnitView where UnitID = 'WWPR';

In a nutshell, include relational columns in a SQL/XML publishing view, and when querying the view express any predicates on those columns rather than on the constructed XML.

Page 45: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 11: Use XMLTABLE views to expose XML data in relational format

■ Values returned from XML documents in tabular format

■ create table customer(info XML);

■ create view myview(CustomerID, Name, Zip, Info) as
SELECT T.*, info FROM customer, XMLTABLE ('$c/customerinfo' passing info as "c"
COLUMNS
"CID" INTEGER PATH './@Cid',
"Name" VARCHAR(30) PATH './name',
"Zip" CHAR(12) PATH './addr/pcode' ) as T;
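Conceptually, the XMLTABLE call maps each document to one row by evaluating a path per column. The sketch below is a rough Python analogy of that mapping for the three columns in the view; it uses the standard library rather than DB2, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

info = ET.fromstring(
    '<customerinfo Cid="1004">'
    "<name>Matt Foreman</name>"
    '<addr country="Canada"><city>Toronto</city><pcode>M3Z-5H9</pcode></addr>'
    "</customerinfo>"
)


def to_row(doc: ET.Element) -> tuple:
    """One relational row per document: (CustomerID, Name, Zip)."""
    return (
        int(doc.get("Cid")),        # PATH './@Cid'  as INTEGER
        doc.findtext("name"),       # PATH './name'
        doc.findtext("addr/pcode"), # PATH './addr/pcode'
    )


row = to_row(info)
print(row)
```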

Page 46: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 11: Use XMLTABLE views to expose XML data in relational format

■ May perform sub-optimally (predicate on an XMLTABLE column of the view):

select CustomerID, Name from myview where Zip = '95141';

■ Will perform well (query with an XML predicate on the underlying XML column):

select CustomerID, Name from myview where xmlexists('$i/customerinfo[addr/pcode = "95141"]' passing info as "i");

In a nutshell, be careful with XMLTABLE views which expose XML data in relational form. When possible, include additional columns in the view definition so that filtering predicates can be expressed on those columns instead of the XMLTABLE columns.

Page 47: DB2 9 for z/OS pureXML Performance and Best Practices


Extracting values from XML for Hybrid Store using Trigger

CUST(ID, NAME, CITY, ZIP, INFO): extract NAME, CITY, ZIP from INFO (XML)

CREATE TRIGGER ins_cust AFTER INSERT ON cust
  REFERENCING NEW AS newrow
  FOR EACH ROW MODE DB2SQL
  BEGIN ATOMIC
    update cust
      set (name, city, zip) =
        (select X.name, X.city, X.zip
           from cust,
                XMLTABLE('customerinfo' PASSING CUST.INFO
                  COLUMNS name varchar(30) PATH 'name',
                          city varchar(20) PATH 'addr/city',
                          zip  varchar(12) PATH 'addr/pcode-zip') as X
          where cust.id = newrow.id)
      where cust.id = newrow.id;
  END #

Page 48: DB2 9 for z/OS pureXML Performance and Best Practices


Tip 12: Use Parameter markers and host vars for fast XML queries

select info from customer where xmlexists('$i/customerinfo[phone = "905-555-4789"]' passing info as "i")

select info from customer where xmlexists('$i/customerinfo[phone = $p]' passing info as "i", cast(? as varchar(12)) as "p")

select info from customer where xmlexists('$i/customerinfo[phone = $p]' passing info as "i", :vchostvar as "p")

Page 49: DB2 9 for z/OS pureXML Performance and Best Practices


XML Queries – Things to do to improve Performance

■ Use XPath instead of FLWOR where possible
– Reason: simpler is better, for humans and for the DB2 optimizer.
– XML may not have to be reconstructed, and FLWOR can use more temp space.

■ Avoid parent steps in predicates
– E.g. /a/b//d[../c=fn:string("abc")]
– Reason: parent steps in the predicate prevent index usage
– See: http://www.ibm.com/developerworks/data/library/techarticle/dm-0611nicola/#cases

■ Use range predicates when appropriate
– E.g. [dateOfBirth[. >= xs:date("2000-01-01") and . <= xs:date("2000-12-31")]]
– What is important here is the notation with the self axis (the dots).
– Reason: this allows DB2 to use a single XISCAN instead of 2 XISCANs + IXAND.
– See: http://www.ibm.com/developerworks/data/library/techarticle/dm-0611nicola/#rangepredicates

LUW chart
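The self-axis notation matters because two separate general comparisons are not a true "between" when an element can occur more than once. The sketch below contrasts the two meanings in Python. The alternative form `[dateOfBirth >= ... and dateOfBirth <= ...]` is an assumption about what the slide is contrasting against, and both function names are hypothetical.

```python
from datetime import date

LOWER = date(2000, 1, 1)
UPPER = date(2000, 12, 31)


def two_general_comparisons(values: list) -> bool:
    """[dateOfBirth >= LOWER and dateOfBirth <= UPPER]:
    each comparison may be satisfied by a different dateOfBirth item."""
    return any(v >= LOWER for v in values) and any(v <= UPPER for v in values)


def self_axis_range(values: list) -> bool:
    """[dateOfBirth[. >= LOWER and . <= UPPER]]:
    one and the same item must satisfy both bounds."""
    return any(LOWER <= v <= UPPER for v in values)


# A document with two dateOfBirth values, neither of which falls in the year 2000.
dates_of_birth = [date(1999, 5, 1), date(2001, 3, 15)]

matches_general = two_general_comparisons(dates_of_birth)
matches_self_axis = self_axis_range(dates_of_birth)

print(matches_general, matches_self_axis)
```

Because only the self-axis form guarantees that both bounds apply to the same value, only that form corresponds to a single start/stop range over the index keys.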

Page 50: DB2 9 for z/OS pureXML Performance and Best Practices


Using SPUFI or JCL

SELECT Cid, Info
FROM DSN8910.CUSTOMER
WHERE XMLEXISTS ('declare default element namespace "http://posample.org"; //addr[city="Toronto"]' passing INFO)

XML and XPath are case-sensitive: CAPS off / mixed case
– SQLCODE = -16002, ERROR: AN XQUERY EXPRESSION HAS AN UNEXPECTED TOKEN DEFAULT FOLLOWING DECLARE.

Terminal session CCSID setting has to be consistent with the application encoding scheme, as "[" and "]" have different code points in different code pages.
– SQLCODE = -16002, ERROR: AN XQUERY EXPRESSION HAS AN UNEXPECTED TOKEN  FOLLOWING "Toronto".

Page 51: DB2 9 for z/OS pureXML Performance and Best Practices


XML as Front to Backend/Core Systems

[Architecture diagram. Recoverable labels: XML transported over MQ, FTP, and HTTP; DB2 pureXML (relational and XML); Interface Tables 1 to n (physical tables or logical views); Backend 1 to n.]

Need to handle XML data, but full normalization is overkill.

Great fit for pureXML

Page 52: DB2 9 for z/OS pureXML Performance and Best Practices


An End-to-end XML Paradigm

[Architecture diagram. Recoverable labels: Client presentation (HTML + XForms, Lotus Forms); Protocol (SOAP, HTTP, REST); SOA Gateway / WAS (XQuery, XSLT, XPath); Data storage (DB2 pureXML: relational, XML); XML exchanged between the tiers.]

End-to-end straight-through processing using XML

XML programming paradigm and architecture pattern
– XForms
– REST/SOAP web services
– XQuery suite: XQuery, XQuery update facility, XQuery scripting extension, etc.

Page 53: DB2 9 for z/OS pureXML Performance and Best Practices


Summary

XML performance

Performance monitoring and tuning

Best practices