79
MonetDB/XQuery: Using a Relational DBMS for XML Peter Boncz CWI The Netherlands

MonetDB/XQuery: Using a Relational DBMS for XML

Embed Size (px)

DESCRIPTION

Peter Boncz CWI The Netherlands. MonetDB/XQuery: Using a Relational DBMS for XML. Peter Boncz. Pathfinder - MonetDB/XQuery. TU Delft 10-5-2005. Outline. Basic XML / XQuery Introduction of Pathfinder and MonetDB projects Relational XQuery XPath steps in the pre/post plane - PowerPoint PPT Presentation

Citation preview

Page 1: MonetDB/XQuery:  Using a Relational DBMS for XML

MonetDB/XQuery:

Using a Relational DBMS for XML

Peter BonczCWI

The Netherlands

Page 2: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery• Introduction of Pathfinder and MonetDB projects• Relational XQuery

– XPath steps in the pre/post plane– Translating for-loops, and beyond

• Optimizations– Order prevention– Loop-Lifted Staircase join – Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 3: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery• Introduction of Pathfinder and MonetDB projects• Relational XQuery

– XPath steps in the pre/post plane– Translating for-loops, and beyond

• Optimizations– Order prevention– Loop-Lifted Staircase join – Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 4: MonetDB/XQuery:  Using a Relational DBMS for XML

XML

• Standard, flexible syntax for data exchange

– Regular, structured data

Database content of all kinds: Inventory, billing, orders, …

“Small” typed values

– Irregular, unstructured text

Documents of all kinds: Transcripts, books, legal briefs, …

“Large” untyped values

• Lingua franca of B2B Applications…

– Increase access to products & services

– Integrate disparate data sources

– Automate business processes

• … and numerous other application domains

– Bio-informatics, library science, …

Page 5: MonetDB/XQuery:  Using a Relational DBMS for XML

XML : A First Look

• XML document describing catalog of books

<?xml version="1.0" encoding="ISO-8859-1" ?><catalog> <book isbn="ISBN 1565114302"> <title>No Such Thing as a Bad Day</title> <author>Hamilton Jordan</author> <publisher>Longstreet Press, Inc.</publisher> <price currency="USD">17.60</price> <review> <reviewer>Publisher</reviewer>: This book is the moving

account of one man's successful battles against three cancers ... <title>No Such Thing as a Bad Day</title> is warmly recommended.

</review> </book>

<!-- more books and specifications -->

</catalog>

Page 6: MonetDB/XQuery:  Using a Relational DBMS for XML

XQuery 1.0

• Functional, strongly-typed query language• XQuery 1.0 =

XPath 2.0 for navigation, selection, extraction

+ A few more expressions For-Let-Where-Order By-Return (FLWOR)

XML construction

Operators on types

+ User-defined functions & modules

+ Strong typing

Page 7: MonetDB/XQuery:  Using a Relational DBMS for XML

XSLT vs. XQuery

• XSLT 1.0: XML XML, HTML, Text– Loosely-typed scripting language– Format XML in HTML for display in browser– Must be highly tolerant of variability/errors in data

• XQuery 1.0: XML XML– Strongly-typed query language– Large-scale database access– Must guarantee safety/correctness of operations on data

• Over time, XSLT & XQuery may both serve needs of many application domains

• XQuery will become a hidden, commodity language

Page 8: MonetDB/XQuery:  Using a Relational DBMS for XML

Navigation, Selection, Extraction

• Titles of all books published by Longstreet Press

$cat/catalog/book[publisher=“Longstreet Press”]/title <title>No Such Thing As A Bad Day</title>

• Publications with Jerome Simeon as author or editor • $cat//*[(author|editor) = “Jerome Simeon”]

<book><title>XQuery from the Experts</title>…</book>

<spec><title>XQuery Formal Semantics</title>…</spec>

Page 9: MonetDB/XQuery:  Using a Relational DBMS for XML

Transformation & Construction

• First author & title of books published by A/W

for $b in $cat//book[publisher = “Addison Wesley”] return <awbook> { $b/author[1], $b/title } </awbook> <awbook> <author>Don Chamberlin</author> <title>XQuery from the Experts</title>

</awbook>

Page 10: MonetDB/XQuery:  Using a Relational DBMS for XML

Sequences & Iteration

• Sequence constructorReturn all books followed by all W3C specifications($cat/catalog/book, $cat/catalog/W3Cspec)

• XPath ExpressionReturn all books & W3C specifications in doc order$cat/catalog/(book|W3Cspec)

• For Expression– Similar to map : apply function to each item in sequence

Return number of authors in each bookfor $b in $cat/catalog/book return fn:count($b/authors)

=> (3,1,2,…)

Page 11: MonetDB/XQuery:  Using a Relational DBMS for XML

Conditional & Quantified

• Conditionalif //show[year >= 2000] then “A-OK!” else “Error!”

• Existential quantification

– Implicit meaning of predicate expressions

//show[year >= 2000]

– Explicit expression:

//show[some $y in ./year satisfies $y >= 2000]

• Universal quantification //show[every $y in year satisfies $y >= 2000]

Page 12: MonetDB/XQuery:  Using a Relational DBMS for XML

Putting It Together

• For each author, return number of books and receipts books published in past 2 years, ordered by name

let $cat := fn:doc(“www.bn.com/catalog.xml“), Join $sales := fn:doc(“www.publishersweekly.com/sales.xml“)

for $author in distinct-values($cat//author) Groupinglet $books := $cat//book[@year >= 2000 and author = $a], S.J.

$receipts := $sales/book[@isbn = $books/@isbn]/receipts

order by $author Orderingreturn

<sales> XML Construction { $author }

<count> { fn:count($books) } </count> Aggregation <total> { fn:sum($receipts) } </total></sales>

Page 13: MonetDB/XQuery:  Using a Relational DBMS for XML

Recursive Processing

• Recursive functions support recursive data <part id=“001”> <partCt count=“2” id=“001”>

<part id=“002”> <partCt count=“1” id=“002”/>

<part id=“003”/> => <partCt count=“0” id=“003”/> </part> </partCt>

<part id=“004”/> <partCt count=“0” id=“004”/>

</part> </partCt>

declare function partCount($p as element(part))

as element(partCt) {

<partCt count=“{ count($p/part) }”>

{ $p1/@id, for $p2 in $p/part return partCount($p2) }

</partCt>

}

Page 14: MonetDB/XQuery:  Using a Relational DBMS for XML

XML Schema Languages

• Many variants…– DTDs, XML Schema, RELAX-N/G, XDuce

• … with similar goals to define– Types of literal (terminal) data– Names of elements & attribute

• XQuery designed to support (all of) XML Schema– Structural & name constraints over types– Regular tree expressions over elements, attributes, atomic types

Page 15: MonetDB/XQuery:  Using a Relational DBMS for XML

TeXQuery : Full-text extensions

• Text search & querying of structured content

• Limited support in XQuery 1.0

– String operators with collation sequences

$cat//book[contains(review/text(), “two thumbs up”)]

• Stop words, proximity searching, ranking

Ex: “Tony Blair” within two words of “George Bush”

• Phrases that span tags and annotations

Ex: Match “Mr. English sponsored the bill” in <sponsor> Mr. English </sponsor> <footnote> for himself and <co-

sponsor> Mr.Coyne </co-sponsor> </footnote> sponsored the bill in the <committee-name> Committee for Financial Services </committee-name>

Page 16: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery• Introduction of Pathfinder and MonetDB projects• Relational XQuery

– XPath steps in the pre/post plane– Translating for-loops, and beyond

• Optimizations– Order prevention– Loop-Lifted Staircase join – Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 17: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery• Introduction of Pathfinder and MonetDB projects• Relational XQuery

– XPath steps in the pre/post plane– Translating for-loops, and beyond

• Optimizations– Order prevention– Loop-Lifted Staircase join – Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 18: MonetDB/XQuery:  Using a Relational DBMS for XML

XQuery Systems: 2 Approaches

• Tree-based– Tree is basic data structure

• Also on disk (if an XQuery DBMS)– Navigational Approach

• Galax [Simeon..], Flux [Koch..], X-Hive– Tree Algebra Approach

• TIMBER [Jagadish..]

• Relational– Data shredded in relational tables– XQuery translated into database query (e.g. SQL)

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 19: MonetDB/XQuery:  Using a Relational DBMS for XML

The Pathfinder Project

• Challenge / Goal:– Turn RDBMSs into efficient XQuery engines

• People:– Maurice van Keulen

• University of Twente

– Torsten Grust, Jens Teubner• University of Konstanz

– Jan Rittinger• University of Konstanz & CWI

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 20: MonetDB/XQuery:  Using a Relational DBMS for XML

The Pathfinder Project

• Challenge / Goal:– Turn RDBMSs into efficient XQuery engines

• People:– Maurice van Keulen

• University of Twente

– Torsten Grust, Jens Teubner• University of Konstanz

– Jan Rittinger• University of Konstanz & CWI

• Task: generate code for MonetDB

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 21: MonetDB/XQuery:  Using a Relational DBMS for XML

MonetDB: Applied CS Research at CWI

• a decade of “query-intensive” application experience

• image retrieval: Peter Bosch ImageSpotter

• audio/video retrieval: Alex van Ballegooij RAM

• XML text retrieval: de Vries / Hiemstra TIJAH

• biological sequences: Arno Siebes BRICKS

• XML databases: Albrecht Schmidt XMark

Grust / vKeulen Pathfinder

• GIS: Wilco Quak MAGNUM

• data warehousing / OLAP / data mining

SPSS DataDistilleries

Univ. Massachussetts PROXIMITY

CWI research group successfully spun off DataDistilleries (now SPSS)

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 22: MonetDB/XQuery:  Using a Relational DBMS for XML

MIL (Query Algebra)

Pathfinder — MonetDB

Pathfinder

MonetDB

Parser

Sem. Analysis

Core Translation

Typechecking

Relational Algebra

Database

SQL

Core to MILTranslation

Parser

Sem. Analysis

Core Translation

Typechecking

Database

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 23: MonetDB/XQuery:  Using a Relational DBMS for XML

Open Source

• MonetDB + Pathfinder on Sourceforge– Mozilla License

• Project Homepage– http://monetdb.cwi.nl

• Developers website:– http://sf.net/projects/monetdb

RoadMap• 14-apr-04: initial Beta release MonetDB/SQL• 30-sep-04: first official release MonetDB/SQL• 30-may-05: beta release of MonetDB/XQuery (i.e. Pathfinder)

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 24: MonetDB/XQuery:  Using a Relational DBMS for XML

MonetDB

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 25: MonetDB/XQuery:  Using a Relational DBMS for XML

MonetDB Particulars

• Column wise fragmentation– BAT: Binary Association Tables [oid,X]– Don’t touch what you don’t need

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 26: MonetDB/XQuery:  Using a Relational DBMS for XML

Binary Association Tables (BATs)

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 27: MonetDB/XQuery:  Using a Relational DBMS for XML

BAT storage as thin arrays

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 28: MonetDB/XQuery:  Using a Relational DBMS for XML

MonetDB Particulars

• Column wise fragmentation– BAT: Binary Association Tables [oid,X]– Don’t touch what you don’t need

• Void (virtual-oid) columns– Contain dense sequence 0,1,2,3,4,…– Require no space– Positional access (nice for XPath skipping)

• pre = void

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 29: MonetDB/XQuery:  Using a Relational DBMS for XML

DBMS Architecture

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 30: MonetDB/XQuery:  Using a Relational DBMS for XML

Monet: DBMS Microkernel

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 31: MonetDB/XQuery:  Using a Relational DBMS for XML

MonetDB: extensible architecture

Front-end/back-end:

• support multiple data models

• support multiple end-user languages

• support diverse application domains

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 32: MonetDB/XQuery:  Using a Relational DBMS for XML

Front-end/back-end:

• support multiple data models

• support multiple end-user languages

• support diverse application domains

PathfinderXQuery Frontend

MonetDB: extensible architecture

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 33: MonetDB/XQuery:  Using a Relational DBMS for XML

Architecture

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 34: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery• Introduction of Pathfinder and MonetDB projects• Relational XQuery

– XPath steps in the pre/post plane– Translating for-loops, and beyond

• Optimizations– Order prevention– Loop-Lifted Staircase join – Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 35: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery

• Introduction of Pathfinder and MonetDB projects

• Relational XQuery– XPath steps in the pre/post plane

– Translating for-loops, and beyond

• MonetDB Implementation– Data structures

• Optimizations– Order prevention

– Loop-Lifted Staircase join

– Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 36: MonetDB/XQuery:  Using a Relational DBMS for XML

XPath on and RDBMS

Node-based relational encoding of XQuery's data model

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 37: MonetDB/XQuery:  Using a Relational DBMS for XML

Tree Knowledge 1: pruning

Page 38: MonetDB/XQuery:  Using a Relational DBMS for XML

Tree Knowledge 2: Partitioning

Page 39: MonetDB/XQuery:  Using a Relational DBMS for XML

Staircase Join Algorithm

Page 40: MonetDB/XQuery:  Using a Relational DBMS for XML

Tree Knowledge 3: Skipping

Page 41: MonetDB/XQuery:  Using a Relational DBMS for XML

Pre/Post Pre/Level/Size

done for better skipping and updates

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 42: MonetDB/XQuery:  Using a Relational DBMS for XML

Updates

• Dense pre-numbers are nice for XPath– Positional skipping in Staircase join!

• But how to handle updates?

Page 43: MonetDB/XQuery:  Using a Relational DBMS for XML

Updates

• Dense pre-numbers are nice for XPath– Positional skipping in Staircase join!

• But how to handle updates?

DenseNot Dense

Page 44: MonetDB/XQuery:  Using a Relational DBMS for XML

Planned Update Solution

Page 45: MonetDB/XQuery:  Using a Relational DBMS for XML

Planned Update Solution

Page 46: MonetDB/XQuery:  Using a Relational DBMS for XML

Planned Update Solution

Page 47: MonetDB/XQuery:  Using a Relational DBMS for XML

XPath XQuery

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 48: MonetDB/XQuery:  Using a Relational DBMS for XML

Sequence Representation

• sequence = table of items• add pos column for maintaining order• ignore polymorphism for the moment

(10, “x”, <a/>, 10) →Pos Item

1 102 “X”3 pre(a)4 10

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 49: MonetDB/XQuery:  Using a Relational DBMS for XML

For-loops: the iter column

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 50: MonetDB/XQuery:  Using a Relational DBMS for XML

For-loops: the iter column

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 51: MonetDB/XQuery:  Using a Relational DBMS for XML

Loop-lifting

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 52: MonetDB/XQuery:  Using a Relational DBMS for XML

Loop-lifting

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 53: MonetDB/XQuery:  Using a Relational DBMS for XML

Full Example

join calc project

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 54: MonetDB/XQuery:  Using a Relational DBMS for XML

Mapping Rules

XQuery construct relational algebraSee VLDB’04 / TDM’04

[Grust,Teubner]

– Sequence construction union– If-Then-[Else] select, [union]– For loop map with cartesian product (all combinations)– Calculations projection expressions– List-functions (e.g. fn:first) select(pos=1)– Element Construction updates using descendant– Path steps selections on the pre/post plane

• Staircase join [VLDB03]: – Single-pass for a *set* of context nodes

– elaborate skipping!

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 55: MonetDB/XQuery:  Using a Relational DBMS for XML

Xmark Query 2

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 56: MonetDB/XQuery:  Using a Relational DBMS for XML

Xmark Query 2 (common subexpr)

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 57: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery• Introduction of Pathfinder and MonetDB projects• Relational XQuery

– XPath steps in the pre/post plane– Translating for-loops, and beyond

• Optimizations– Order prevention– Loop-Lifted Staircase join – Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 58: MonetDB/XQuery:  Using a Relational DBMS for XML

Outline

• Basic XML / XQuery

• Introduction of Pathfinder and MonetDB projects

• Relational XQuery– XPath steps in the pre/post plane

– Translating for-loops, and beyond

• MonetDB Implementation– Data structures

• Optimizations– Order prevention

– Loop-Lifted Staircase join

– Join recognition

• Outlook– Conclusions

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 59: MonetDB/XQuery:  Using a Relational DBMS for XML

Order Prevention

To encode order, we use the pos column

New pos columns are created using DENSE RANK (sql) primitive

• Needs [pos] | [iter] order

• More commonly [iter,pos]

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 60: MonetDB/XQuery:  Using a Relational DBMS for XML

Order Prevention

To encode order, we use the pos column

New pos columns are created using DENSE RANK (SQL) primitive

• Needs [pos] | [iter] order

• More commonly [iter,pos]

This requires a lot of sorting! often not necessary

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 61: MonetDB/XQuery:  Using a Relational DBMS for XML

Order Prevention

[VLDB03 Wang&Cherniack]

• Order properties of relations

• Order propagation rules for relational operators

Decoration of physical plans with order properties eliminate sort

New ideas:

• RefineSort: pipelined algorithm that extends sort order

• Order property [C1] | [C2]

“for each equal value of [C2] in order of appearance, the values in [C1] are monotonically increasing”

Hash-based DENSE RANK only requires [pos] | [iter]

sorts on [iter,pos] avoided

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 62: MonetDB/XQuery:  Using a Relational DBMS for XML

Order Prevention

[VLDB03 Wang&Cherniack] define:

• Order properties of relations

• Order propagation rules for relational operators

Decoration of physical plans with order properties eliminate sort

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 63: MonetDB/XQuery:  Using a Relational DBMS for XML

Join Recognition (recap Mapping Rules)

XQuery construct relational algebraSee VLDB’04 / TDM’04

[Grust,Teubner]

– Sequence construction union– If-Then-[Else] select, [union]– For loop map with cartesian product (all combinations)– Calculations projection expressions– List-functions (e.g. fn:first) select(pos=1)– Element Construction updates using descendant– Path steps selections on the pre/post plane

• Staircase join [VLDB03]: – Single-pass for a *set* of context nodes

– elaborate skipping!

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 64: MonetDB/XQuery:  Using a Relational DBMS for XML

– For loop map with all combinations O(N*N)– If `simple’ condition exist on two loop variables join

– Only make a map with the matching combinations– E.g. with Hash-Table O(N)

Join Recognition

for $p in $auction/site/people/person for $t in $auction/site/closed_auctions/closed_auction where $t/buyer/@person = $p/@id return $t

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 65: MonetDB/XQuery:  Using a Relational DBMS for XML

– For loop map with all combinations O(N*N)– If `simple’ condition exist on two loop variables join

– Only make a map with the matching combinations– E.g. with Hash-Table O(N)

Performed on the XCore tree

Recognize if-then expressions

Open question:

where to optimize best??

Join Recognition

for $p in $auction/site/people/person for $t in $auction/site/closed_auctions/closed_auction where $t/buyer/@person = $p/@id return $t

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 66: MonetDB/XQuery:  Using a Relational DBMS for XML

Join Optimization for $x in $foo for $y in $bar where $x/p1/@a < $y/p2/@a return $x

p1p1 p2theta-join

project

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 67: MonetDB/XQuery:  Using a Relational DBMS for XML

Join Optimization for $x in $foo for $y in $bar where $x/p1/@a < $y/p2/@a return $x

p1/p1 /p2theta-join

project

p1/p1 /p2

theta-join

Aggr(min) Aggr(max)

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 68: MonetDB/XQuery:  Using a Relational DBMS for XML

Loop-Lifted StaircaseJoin (recap rules)

XQuery construct relational algebraSee VLDB’04 / TDM’04

[Grust,Teubner]

– Sequence construction union– If-Then-[Else] select, [union]– For loop map with cartesian product (all combinations)– Calculations projection expressions– List-functions (e.g. fn:first) select(pos=1)– Element Construction updates using descendant– Path steps selections on the pre/post plane

• Staircase join [VLDB03]: – Single-pass for a *set* of context nodes

– elaborate skipping!

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 69: MonetDB/XQuery:  Using a Relational DBMS for XML

Loop-lifted staircase join

• Staircase join [VLDB03]: – Single-pass for a *set* of context nodes

Loop-lifting multiple iters multiple sets of context nodes

– elaborate skipping!

– Loop-Lifted Staircase Join

In a single pass: process multiple input context node lists

– Use a stack

– Exploit axis properties for pruning

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 70: MonetDB/XQuery:  Using a Relational DBMS for XML

Staircase join

document

List of context nodes

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 71: MonetDB/XQuery:  Using a Relational DBMS for XML

Loop-lifted staircase join

document document

List of context nodes Active stack

Multiple lists of context nodes

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 72: MonetDB/XQuery:  Using a Relational DBMS for XML

Loop-lifted staircase join

• Staircase join [VLDB03]: – Single-pass for a *set* of context nodes

Loop-lifting multiple iters multiple sets of context nodes

– elaborate skipping!

– Loop-Lifted Staircase Join

In a single pass: process multiple input context node lists

– Use a stack

– Exploit axis properties for pruning

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 73: MonetDB/XQuery:  Using a Relational DBMS for XML

Scalability

Test platform• Opteron 1.6GHz, 8GB RAM, Red Hat Linux 64-bit

• Can process 11GB document!

Mostly linear scaling with document size

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 74: MonetDB/XQuery:  Using a Relational DBMS for XML

Scalability

Test platform• Opteron 1.6GHz, 8GB RAM, Red Hat Linux 64-bit

• Can process 11GB document!

Mostly linear scaling with document size

• Some swapping in the join queries

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 75: MonetDB/XQuery:  Using a Relational DBMS for XML

Scalability

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Test platform• Opteron 1.6GHz, 8GB RAM, Red Hat Linux 64-bit

• Can process 11GB document!

Mostly linear scaling with document size

• Some swapping in the join-queries

• Q11 + Q12 generate quadratic result

Page 76: MonetDB/XQuery:  Using a Relational DBMS for XML

XMark 10MB : Pathfinder vs XHive & Galax

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 77: MonetDB/XQuery:  Using a Relational DBMS for XML

XMark 1GB: Pathfinder vs X-Hive

did not finish

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 78: MonetDB/XQuery:  Using a Relational DBMS for XML

Conclusions

• Relational approach can be scalable & fast• Crucial Optimizations

– Join recognition– Loop-lifted XPath steps– Order awareness

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery

Page 79: MonetDB/XQuery:  Using a Relational DBMS for XML

Conclusions

• Relational approach can be scalable & fast• Crucial Optimizations

– Join recognition– Loop-lifted XPath steps– Order awareness

Future Roadmap (beta: May 30, Holland Open)• Alegebraic Query Optimization• Updates (not in release)

Peter Boncz TU Delft 10-5-2005Pathfinder - MonetDB/XQuery