MonetDB/XQuery
Technology Preview 1
Stefan Manegold, CWI
Amsterdam
http://monetdb.cwi.nl/ - http://pathfinder-xquery.org/
European Pathfinder Team
• University of Konstanz (Germany)
– Torsten Grust, Jens Teubner, Jan Rittinger
• University of Twente (Netherlands)
– Maurice van Keulen, Jan Flokstra
• CWI, Amsterdam (Netherlands)
– Peter Boncz, Stefan Manegold, Sjoerd Mullender
Stefan Manegold HollandOpen, Amsterdam 31-5-2005MonetDB/XQuery
Results: Performance (2)
XMark                11 MB |                  110 MB |          1.1 GB |   11 GB
 Q   Galax  X-Hive  MDB/XQ |  Galax    X-Hive  MDB/XQ |  X-Hive  MDB/XQ |  MDB/XQ
 1    0.06    0.37    0.05 |   0.72      1.29    0.41 |     9.9     1.2 |      13
 2    0.03    0.45    0.07 |   0.31      1.75    0.30 |    33.0     2.4 |      25
 3    0.14    0.65    0.28 |   1.76      5.66    1.51 |    25.1    12.5 |     126
 4    0.22    0.10    0.08 |   2.91      1.00    0.45 |    18.1     3.8 |      36
 5    0.05    0.13    0.05 |   0.63      0.90    0.16 |    20.7     1.2 |      11
 6    1.30    1.07    0.02 |  13.29     10.17    0.05 |   178.1     0.3 |       3
 7    2.68    1.57    0.03 |  30.01     24.84    0.07 |   278.4     0.4 |       4
 8    0.16    0.85    0.14 |   2.12      3.51    0.75 |    49.1    10.4 |     208
 9  113.23   32.25    0.20 |    DNF  12280.66    0.87 |     DNF    12.9 |     289
10    1.74    5.28    0.80 |  18.61    442.37    5.31 |     DNF    55.0 |    1882
11    2.62   98.91    0.18 |    DNF  19927.29    3.48 |     DNF   872.5 |     DNF
12    1.44   23.39    0.14 |    DNF   5100.19    1.66 |     DNF   150.7 |     DNF
13    0.03    0.10    0.07 |   0.66      1.03    0.22 |    12.9     1.3 |      13
14    1.92    0.72    0.17 |  99.53     11.16    1.40 |   110.2    13.7 |     959
15    0.02    0.03    0.09 |   0.20      0.49    0.28 |    10.6     1.7 |      16
16    0.03    0.03    0.11 |   0.46      0.52    0.26 |    10.9     1.8 |      18
17    0.06    0.09    0.07 |   0.82      0.85    0.30 |    11.8     2.8 |      26
18    0.07    0.08    0.04 |   0.73      0.64    0.13 |    14.8     0.9 |       9
19    1.17    0.67    0.11 |  14.73     12.15    0.55 |   254.5     5.3 |      88
20    0.28    0.11    0.24 |   2.98      1.40    0.62 |    24.6     4.9 |      50
DNF = did not finish
Story
• XQuery Example
• Relational XQuery
– System Architecture
– XML Encoding
• Science & Research
• Scalability
• Outlook
– Conclusions
– Roadmaps
– Release & References
XQuery Example
• For each author, return the number of books and the receipts for books published in the past 2 years, ordered by name
let $cat   := fn:doc("www.bn.com/catalog.xml"),                    (: Documents :)
    $sales := fn:doc("www.publishersweekly.com/sales.xml")
for $author in distinct-values($cat//author)                       (: Grouping :)
let $books    := $cat//book[@year >= 2003 and author = $author],   (: Selection :)
    $receipts := $sales/book[@isbn = $books/@isbn]/receipts        (: Join :)
order by $author                                                   (: Ordering :)
return                                                             (: XML Construction :)
  <sales>
    { $author }
    <count> { fn:count($books) (: Aggregation :) } </count>
    <total> { fn:sum($receipts) } </total>
  </sales>
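The labels on the query (grouping, selection, join, ordering, aggregation) are ordinary relational operations. As a rough illustration only, the same steps can be sketched in Python over in-memory lists. The `catalog` and `sales` records below are invented stand-ins for the two XML documents, and `sales_report` is a hypothetical helper, not part of MonetDB/XQuery.

```python
from collections import defaultdict

# Hypothetical stand-ins for catalog.xml and sales.xml (all values invented).
catalog = [
    {"isbn": "1", "year": 2004, "author": "Ann"},
    {"isbn": "2", "year": 2001, "author": "Ann"},
    {"isbn": "3", "year": 2003, "author": "Bob"},
]
sales = [
    {"isbn": "1", "receipts": 100.0},
    {"isbn": "3", "receipts": 40.0},
    {"isbn": "3", "receipts": 2.5},
]


def sales_report(catalog, sales, min_year=2003):
    # Index the sales records by ISBN once, so the join is a lookup.
    receipts_by_isbn = defaultdict(list)
    for sale in sales:
        receipts_by_isbn[sale["isbn"]].append(sale["receipts"])

    report = []
    # Grouping and ordering: one result per distinct author, sorted by name.
    for author in sorted({book["author"] for book in catalog}):
        # Selection: recent books by this author.
        books = [b for b in catalog
                 if b["year"] >= min_year and b["author"] == author]
        # Join: sales records whose ISBN matches one of the selected books.
        isbns = {b["isbn"] for b in books}
        receipts = [r for isbn in isbns for r in receipts_by_isbn[isbn]]
        # Aggregation: count of books and sum of receipts.
        report.append({"author": author,
                       "count": len(books),
                       "total": sum(receipts)})
    return report


report = sales_report(catalog, sales)
```

With the sample data, `report` holds one entry per author in name order, each carrying the book count and the receipt total.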
XQuery Systems: 2 Approaches
• Existing “native” XML/XQuery systems are built from scratch
– Galax, Saxon, …
– X-Hive, Tamino, …
– (Still have to) re-invent optimization technology
• Our approach:
– Build XQuery system on top of an RDBMS
– Leverage mature relational technology to achieve efficient XQuery processing
XML in an RDBMS: XPath Accelerator
Node-based relational encoding of XQuery's data model
Each node is stored with its preorder (pre) and postorder (post) rank, so the XPath axes of a context node f become range conditions:
1. f/descendant: SELECT * FROM pre_post WHERE pre > f.pre AND post < f.post
2. f/ancestor:   SELECT * FROM pre_post WHERE pre < f.pre AND post > f.post
3. f/preceding:  SELECT * FROM pre_post WHERE pre < f.pre AND post < f.post
4. f/following:  SELECT * FROM pre_post WHERE pre > f.pre AND post > f.post
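The four axis conditions can be tried out on a toy document. The sketch below is only an illustration: it assigns pre/post ranks to a small hand-made tree and runs the range conditions in SQLite. The tree, the `name` column, and the helper names `encode` and `axis` are assumptions made for this example, not MonetDB/XQuery code.

```python
import sqlite3

# Toy document as nested (name, [children]) tuples: a(b(c, d), e(f)).
tree = ("a", [("b", [("c", []), ("d", [])]), ("e", [("f", [])])])


def encode(tree):
    """Return (pre, post, name) rows for every node of the tree."""
    rows, pre, post = [], 0, 0

    def visit(node):
        nonlocal pre, post
        name, children = node
        my_pre = pre          # preorder rank: assigned on entering the node
        pre += 1
        for child in children:
            visit(child)
        rows.append((my_pre, post, name))  # postorder rank: on leaving
        post += 1

    visit(tree)
    return rows


# The axis windows from the slide, with f's ranks as named parameters.
AXES = {
    "descendant": "pre > :pre AND post < :post",
    "ancestor":   "pre < :pre AND post > :post",
    "preceding":  "pre < :pre AND post < :post",
    "following":  "pre > :pre AND post > :post",
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pre_post (pre INTEGER PRIMARY KEY, post INTEGER, name TEXT)")
conn.executemany("INSERT INTO pre_post VALUES (?, ?, ?)", encode(tree))


def axis(conn, axis_name, context_name):
    """Names of the nodes on the given axis of the named context node."""
    pre, post = conn.execute(
        "SELECT pre, post FROM pre_post WHERE name = ?",
        (context_name,)).fetchone()
    sql = f"SELECT name FROM pre_post WHERE {AXES[axis_name]} ORDER BY pre"
    return [row[0] for row in conn.execute(sql, {"pre": pre, "post": post})]
```

For example, `axis(conn, "descendant", "b")` returns the two children of `b`, and `axis(conn, "ancestor", "f")` returns the path from the root down to the parent of `f`.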
Science & Research
• More research led to more optimizations
– Join recognition
– Embedded XPath processing
– Order awareness
• Various scientific publications
Results: Scalability (3)
Unsurpassed scalability
• Standard Opteron PC, 8GB RAM, 64-bit Linux
• Can process 11GB documents!
Mostly linear scaling with document size
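One way to read "mostly linear" is to compare the MDB/XQ entries for the 1.1 GB and 11 GB documents: with a document 10 times larger, linear scaling would give a ratio near 10. The small calculation below uses only values copied from the results table for a handful of queries; the choice of queries is arbitrary and the helper name `growth` is made up for this example.

```python
# MDB/XQ entries for selected XMark queries, copied from the results table.
mdb_xq = {
    1:  {"1.1 GB": 1.2,  "11 GB": 13},
    5:  {"1.1 GB": 1.2,  "11 GB": 11},
    6:  {"1.1 GB": 0.3,  "11 GB": 3},
    10: {"1.1 GB": 55.0, "11 GB": 1882},
    14: {"1.1 GB": 13.7, "11 GB": 959},
}


def growth(times):
    """Ratio of the 11 GB entry to the 1.1 GB entry (10 means linear)."""
    return times["11 GB"] / times["1.1 GB"]


ratios = {query: growth(times) for query, times in mdb_xq.items()}

for query, ratio in sorted(ratios.items()):
    print(f"Q{query}: x{ratio:.1f} for a 10x larger document")
```

Queries 1, 5 and 6 stay close to a factor of 10, while queries 10 and 14 grow noticeably faster, which matches the "mostly" in the claim.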
Conclusions
• Relational approach
– Works
– Is fast
– Is scalable
• Crucial optimizations
– Join recognition
– Embedded XPath processing
– Order awareness
Roadmap
• 30-05-05: MonetDB/XQuery 4.8/0.8 “Mercurius”
– Developers Release / Technology Preview 1
• 30-09-05: MonetDB/XQuery 4.10/0.10 “Venus”
– Student Release / Technology Preview 2
– XUpdate, Algebraic Query Optimization
• 30-12-05: MonetDB/XQuery 4.12/1.12 “Mars”
– Final Release
– Application Programming Interfaces
– End-User Front-Ends
Open Source Release
• MonetDB + Pathfinder on SourceForge
– Mozilla-like License
• MonetDB homepage
– http://monetdb.cwi.nl/
• Pathfinder homepage
– http://pathfinder-xquery.org/
• Developers website
– http://sf.net/projects/monetdb/
Outline
• Basic XML / XQuery
• Introduction of Pathfinder and MonetDB projects
• Relational XQuery
– XPath steps in the pre/post plane
– Translating for-loops, and beyond
• Optimizations
– Order prevention
– Loop-lifted staircase join
– Join recognition
• Outlook
– Conclusions
– Roadmaps
XML
• Standard, flexible syntax for data exchange
– Regular, structured data
Database content of all kinds: Inventory, billing, orders, …
“Small” typed values
– Irregular, unstructured text
Documents of all kinds: Transcripts, books, legal briefs, …
“Large” untyped values
• Lingua franca of B2B Applications…
– Increase access to products & services
– Integrate disparate data sources
– Automate business processes
• … and numerous other application domains
– Bio-informatics, library science, …
XML : A First Look
• XML document describing catalog of books
<?xml version="1.0" encoding="ISO-8859-1" ?>
<catalog>
  <book isbn="ISBN 1565114302">
    <title>No Such Thing as a Bad Day</title>
    <author>Hamilton Jordan</author>
    <publisher>Longstreet Press, Inc.</publisher>
    <price currency="USD">17.60</price>
    <review>
      <reviewer>Publisher</reviewer>: This book is the moving
      account of one man's successful battles against three cancers ...
      <title>No Such Thing as a Bad Day</title> is warmly recommended.
    </review>
  </book>
  <!-- more books and specifications -->
</catalog>
XQuery 1.0
• Functional, strongly-typed query language
• XQuery 1.0 =
XPath 2.0 for navigation, selection, extraction
+ A few more expressions
– For-Let-Where-Order By-Return (FLWOR)
– XML construction
– Operators on types
+ User-defined functions & modules
+ Strong typing
XSLT vs. XQuery
• XSLT 1.0: XML → XML, HTML, Text
– Loosely-typed scripting language
– Format XML in HTML for display in browser
– Must be highly tolerant of variability/errors in data
• XQuery 1.0: XML → XML
– Strongly-typed query language
– Large-scale database access
– Must guarantee safety/correctness of operations on data
• Over time, XSLT & XQuery may both serve needs of many application domains
• XQuery will become a hidden, commodity language
XQuery Example
• For each author, return the number of books and the receipts for books published in the past 2 years, ordered by name
let $cat   := fn:doc("www.bn.com/catalog.xml"),                    (: Join :)
    $sales := fn:doc("www.publishersweekly.com/sales.xml")
for $author in distinct-values($cat//author)                       (: Grouping :)
let $books    := $cat//book[@year >= 2000 and author = $author],   (: S.J. :)
    $receipts := $sales/book[@isbn = $books/@isbn]/receipts
order by $author                                                   (: Ordering :)
return                                                             (: XML Construction :)
  <sales>
    { $author }
    <count> { fn:count($books) (: Aggregation :) } </count>
    <total> { fn:sum($receipts) } </total>
  </sales>
XQuery Systems: 2 Approaches
• Tree-based
– Tree is basic data structure
  • Also on disk (if an XQuery DBMS)
– Navigational Approach
  • Galax [Simeon..], Flux [Koch..], X-Hive
– Tree Algebra Approach
  • TIMBER [Jagadish..]
• Relational
– Data shredded in relational tables
– XQuery translated into database query (e.g. SQL)
The Pathfinder Project
• Challenge / Goal:
– Turn RDBMSs into efficient XQuery engines
• People:
– Torsten Grust, Jens Teubner, ...
  • University of Konstanz (June 2005: Technical University of Munich)
– Maurice van Keulen, Jan Flokstra, ...
  • University of Twente
– Jan Rittinger
  • University of Konstanz & CWI
  • Task: generate code for MonetDB
– Peter Boncz, Stefan Manegold, Sjoerd Mullender, ...
  • CWI, Amsterdam
MonetDB: Applied CS Research at CWI
• A decade of “query-intensive” application experience
• image retrieval: Peter Bosch (ImageSpotter)
• audio/video retrieval: Alex van Ballegooij (RAM)
• XML text retrieval: de Vries / Hiemstra (TIJAH)
• biological sequences: Arno Siebes (BRICKS)
• XML databases: Albrecht Schmidt (XMark); Grust / van Keulen (Pathfinder)
• GIS: Wilco Quak (MAGNUM)
• data warehousing / OLAP / data mining: SPSS (DataDistilleries); Univ. Massachusetts (PROXIMITY)
CWI research group successfully spun off DataDistilleries (now SPSS)
Pathfinder — MonetDB
[Figure: architecture diagram. Pathfinder stages: Parser, Sem. Analysis, Core Translation, Typechecking, Relational Algebra, Core to MIL Translation. Generated languages: MIL, SQL. Back end: MonetDB Database.]
Open Source
• MonetDB + Pathfinder on SourceForge
– Mozilla License
• MonetDB homepage
– http://monetdb.cwi.nl/
• Pathfinder homepage
– http://pathfinder-xquery.org/
• Developers website
– http://sf.net/projects/monetdb/
Roadmap
• 14-apr-04: initial Beta release MonetDB/SQL
• 30-sep-04: first official release MonetDB/SQL
• 30-may-05: Developer release of MonetDB/XQuery (i.e. Pathfinder)
• 30-sep-05: Student release of MonetDB/XQuery (incl. XUpdate)
• 30-dec-05: Users release of MonetDB/XQuery (?)
MonetDB: extensible architecture
Front-end/back-end:
• support multiple data models
• support multiple end-user languages
• support diverse application domains
Pathfinder XQuery Frontend
XPath on an RDBMS
Node-based relational encoding of XQuery's data model
Pre/Post → Pre/Level/Size
done for better skipping and updates
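A rough sketch of why a pre/level/size encoding helps with skipping, under the assumption that `size` is the number of descendants of a node and `level` is its depth: the subtree of a node is then the contiguous pre range right after it, so descendants need no comparison at all and siblings can be reached by jumping over whole subtrees. The toy tree and the helper names are invented for this example.

```python
# Toy document as nested (name, [children]) tuples: a(b(c, d), e(f)).
tree = ("a", [("b", [("c", []), ("d", [])]), ("e", [("f", [])])])


def encode_pre_level_size(tree):
    """Return one record per node; the list index equals the pre rank."""
    rows = []

    def visit(node, level):
        name, children = node
        pre = len(rows)
        rows.append(None)                 # reserve the slot at this pre rank
        for child in children:
            visit(child, level + 1)
        size = len(rows) - pre - 1        # number of descendants (assumed meaning)
        rows[pre] = {"pre": pre, "level": level, "size": size, "name": name}

    visit(tree, 0)
    return rows


def descendants(rows, pre):
    """Descendants are the `size` nodes directly following the context node."""
    size = rows[pre]["size"]
    return [r["name"] for r in rows[pre + 1: pre + 1 + size]]


def children(rows, pre):
    """Visit children only, skipping each child's subtree in one jump."""
    result, cur, end = [], pre + 1, pre + rows[pre]["size"]
    while cur <= end:
        result.append(rows[cur]["name"])
        cur += rows[cur]["size"] + 1      # jump to the next sibling
    return result


rows = encode_pre_level_size(tree)
```

Here `descendants(rows, 1)` slices out the subtree of `b`, and `children(rows, 0)` touches only `b` and `e`, never their descendants.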
Order Prevention
[VLDB03 Wang&Cherniack] define:
• Order properties of relations
• Order propagation rules for relational operators
Decorating physical plans with order properties → eliminates sorts
Join Recognition
for $p in $auction/site/people/person
for $t in $auction/site/closed_auctions/closed_auction
where $t/buyer/@person = $p/@id
return $t
– Nested for-loops → map with all combinations: O(N*N)
– If a “simple” condition exists on two loop variables → join
– Only make a map with the matching combinations
– E.g. with a hash table: O(N)
Performed on the XCore tree
Recognize if-then expressions
Open question: where is the best place to optimize?
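The difference between the two evaluation strategies can be sketched in Python. This is only an illustration of the idea, not Pathfinder's implementation: the `persons` and `closed_auctions` records are invented, and both functions are hypothetical helpers that return the same auctions in the same order.

```python
from collections import defaultdict

# Invented sample data standing in for the XMark auction document.
persons = [{"id": "p1"}, {"id": "p2"}, {"id": "p3"}]
closed_auctions = [
    {"buyer": "p2", "item": "i1"},
    {"buyer": "p1", "item": "i2"},
    {"buyer": "p2", "item": "i3"},
]


def nested_loop(persons, auctions):
    """Naive evaluation: test every (person, auction) pair, O(N*N)."""
    return [t for p in persons for t in auctions if t["buyer"] == p["id"]]


def hash_join(persons, auctions):
    """Recognized join: build a hash table once, then probe it, O(N)."""
    by_buyer = defaultdict(list)
    for t in auctions:
        by_buyer[t["buyer"]].append(t)
    return [t for p in persons for t in by_buyer.get(p["id"], [])]


naive = nested_loop(persons, closed_auctions)
joined = hash_join(persons, closed_auctions)
```

Both variants iterate over the persons in the outer loop, so the result keeps the order the nested for-loops would produce.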
Loop-lifted staircase join
• Staircase join [VLDB03]:
– Single pass for a *set* of context nodes
– Elaborate skipping!
• Loop-lifting → multiple iterations → multiple sets of context nodes
• Loop-lifted staircase join
– In a single pass: process multiple input context node lists
– Use a stack
– Exploit axis properties for pruning
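As a minimal sketch of the basic idea only (a single set of context nodes, descendant axis, no loop-lifting and no stack), the pruning and skipping steps could look as follows. The document is assumed to be a list of post ranks indexed by pre rank; the function names and the toy document are made up for this example and do not reflect the MonetDB implementation.

```python
def prune(context):
    """Drop context nodes that lie inside another context node's subtree.

    context: iterable of (pre, post) pairs. A node whose post rank is below
    the largest post rank kept so far is a descendant of an earlier node.
    """
    kept, max_post = [], -1
    for pre, post in sorted(context):
        if post > max_post:
            kept.append((pre, post))
            max_post = post
    return kept


def staircase_descendant(doc, context):
    """Pre ranks of all descendants of the context set, in document order.

    doc: list where doc[pre] is the post rank of the node with that pre rank.
    """
    result = []
    steps = prune(context)
    for i, (c_pre, c_post) in enumerate(steps):
        # Each step only scans up to the start of the next step.
        stop = steps[i + 1][0] if i + 1 < len(steps) else len(doc)
        for pre in range(c_pre + 1, stop):
            if doc[pre] < c_post:
                result.append(pre)
            else:
                break  # skipping: the first non-descendant ends this subtree
    return result


# Toy document a(b(c, d), e(f)): pre ranks 0..5, post ranks as listed.
doc = [5, 2, 0, 1, 4, 3]
hits = staircase_descendant(doc, [(1, 2), (2, 0), (4, 4)])
```

Because pruned context nodes have disjoint subtrees, the result needs neither sorting nor duplicate elimination, which is the property the single-pass evaluation relies on.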
Scalability
Test platform
• Opteron 1.6GHz, 8GB RAM, 64-bit Linux (Fedora Core 3)
• Can process an 11GB document!
Mostly linear scaling with document size
• Some swapping in the join queries
• Q11 + Q12 generate quadratic results
[Chart: XMark 10MB, Pathfinder vs X-Hive & Galax]
[Chart: XMark 1GB, Pathfinder vs X-Hive; chart label: did not finish]
Conclusions
• Relational approach can be scalable & fast
• Crucial optimizations
– Join recognition
– Loop-lifted XPath steps
– Order awareness
Future Roadmap (Scientific/Research):
• Algebraic Query Optimization
• Updates
Loop-lifted staircase join
[Figure labels: document; list of context nodes; document; multiple lists of context nodes; active stack]