CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 1
Database Systems I
Query Languages for XML

Query Languages for XML
XPath is a simple query language based on describing similar paths in XML documents.
XQuery extends XPath in a style similar to SQL, introducing iterations, subqueries, etc.
XPath and XQuery expressions are applied to an XML document and return a sequence of qualifying items.
Items can be primitive values or nodes (elements, attributes, documents).
The items returned do not need to be of the same type.

XPath
A path expression returns the sequence of all qualifying items that are reachable from the input item following the specified path.
A path expression is a sequence consisting of tags or attributes and special characters such as slashes ("/").
An absolute path expression is applied to an XML document and returns all elements that are reachable from the document's root following the specified path.
A relative path expression is applied to an arbitrary node.

XPath
    <?xml version="1.0" standalone="yes"?>
    <bibliography>
      <book bookID="b100">
        <title> Foundations… </title>
        <author> Abiteboul </author>
        <author> Hull </author>
        <author> Vianu </author>
        <publisher> Addison Wesley </publisher>
        <year> 1995 </year>
      </book>
      …
    </bibliography>
Applied to the above document, the XPath expression /bibliography/book/author returns the sequence
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    . . .
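
The path expression above can be tried out with a minimal sketch in Python, assuming the standard-library module xml.etree.ElementTree, which supports only a subset of XPath. The document below is a shortened stand-in for the slide's example, and the title text is a placeholder.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the bibliography document (placeholder title).
XML = """
<bibliography>
  <book bookID="b100">
    <title>Foundations</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""

root = ET.fromstring(XML)  # root is the <bibliography> element

# ElementTree paths are relative to the element they are called on,
# so "book/author" plays the role of /bibliography/book/author.
authors = [a.text for a in root.findall("book/author")]
print(authors)
```

The result is a list of elements in document order, which mirrors the sequence semantics of XPath.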

Attributes
If we do not want to return the qualifying elements, but the value of one of their attributes, we end the path expression with @attribute.
Applied to the above document, the XPath expression
    /bibliography/book/@bookID
returns the sequence
    "b100" . . .
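
As a rough Python analogue, again assuming xml.etree.ElementTree: its findall method cannot select attribute nodes directly, so the sketch selects the elements and then reads the attribute. The second book (b200) is a hypothetical addition for illustration.

```python
import xml.etree.ElementTree as ET

# b200 is a made-up second entry so that the result has more than one item.
root = ET.fromstring(
    "<bibliography>"
    '<book bookID="b100"><title>Foundations</title></book>'
    '<book bookID="b200"><title>Another title</title></book>'
    "</bibliography>"
)

# Analogue of /bibliography/book/@bookID:
# select the book elements, then read their bookID attribute.
book_ids = [b.get("bookID") for b in root.findall("book")]
print(book_ids)
```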

Axes
XPath provides a variety of axes, i.e. modes of navigation through semistructured data.
At each step of a path expression, we can prefix a tag or attribute name by an axis name and a colon.
For example, the path expression
/child::bibliography/child::book/attribute::bookID
is equivalent to /bibliography/book/@bookID.
Descendants are all direct and indirect children of a node.

Axes
Axes include parent, ancestor, descendant, following-sibling, preceding-sibling, self, and descendant-or-self.
XPath has the following shorthands for axes:
    /    child
    //   descendant-or-self
    @    attribute
    .    self
    ..   parent

Axes
    <bibliography>
      <book bookID="b100">
        <title> Foundations… </title>
        <author affiliation="IBM"> Abiteboul </author>
        <author> Hull </author>
        . . .
      </book>
      <article articleID="a245">
        <header>
          <author authorID="a739"> Codd </author>
          <title> A relational database model </title>
        </header>
        <body> . . . </body>
      </article>
    </bibliography>
Applied to the above document, the path expression /bibliography//author returns the sequence
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Codd </author>
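
The difference between the child and the descendant-or-self shorthand can be sketched in Python, assuming xml.etree.ElementTree. The document is a trimmed copy of the example above, with the elided parts left out.

```python
import xml.etree.ElementTree as ET

XML = """
<bibliography>
  <book bookID="b100">
    <title>Foundations</title>
    <author affiliation="IBM">Abiteboul</author>
    <author>Hull</author>
  </book>
  <article articleID="a245">
    <header>
      <author authorID="a739">Codd</author>
      <title>A relational database model</title>
    </header>
    <body/>
  </article>
</bibliography>
"""
root = ET.fromstring(XML)

# Child steps only: analogue of /bibliography/book/author.
child_only = [a.text for a in root.findall("book/author")]

# Descendant step: analogue of /bibliography//author.
# It also reaches the author nested inside article/header.
descendants = [a.text for a in root.findall(".//author")]

print(child_only)
print(descendants)
```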

Wildcards
We can use wildcards instead of actual tags and attributes: * means any tag, and @* means any attribute.
Examples
    /bibliography/*/author
returns the sequence
    <author> Abiteboul </author>
    <author> Hull </author>
and
    /bibliography//author/@*
returns the sequence
    "IBM" "a739"
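
Both wildcard examples can be approximated in Python, assuming xml.etree.ElementTree. The module has no @* step, so the sketch reads every attribute of the selected elements instead. The document is a trimmed copy of the Axes example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<bibliography>"
    '<book bookID="b100">'
    '<author affiliation="IBM">Abiteboul</author><author>Hull</author>'
    "</book>"
    '<article articleID="a245"><header>'
    '<author authorID="a739">Codd</author>'
    "</header></article>"
    "</bibliography>"
)

# Analogue of /bibliography/*/author: any child tag, then author.
# Codd is not matched because it sits one level deeper, under header.
via_wildcard = [a.text for a in root.findall("*/author")]

# Analogue of /bibliography//author/@*: all attribute values of all authors.
attr_values = [v for a in root.findall(".//author") for v in a.attrib.values()]

print(via_wildcard)
print(attr_values)
```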

Conditions
We can restrict the qualifying paths to those that satisfy a given condition, surrounded by square brackets.
Conditions can be anything returning a boolean value.
In particular, conditions can be:
    [<subpath>=<value>]   there exists a subpath with the specified value
    [i]                   the element is the i-th element of the specified type
Example
    /bibliography/book[title="Foundations…"]/author[2]
returns <author> Hull </author>.
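
Both kinds of condition are within the XPath subset that Python's xml.etree.ElementTree understands, so the example can be sketched as follows. The title "Foundations" is a placeholder for the elided title on the slide, and the second book is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<bibliography>"
    "<book><title>Foundations</title>"
    "<author>Abiteboul</author><author>Hull</author><author>Vianu</author>"
    "</book>"
    "<book><title>Other</title>"
    "<author>Smith</author><author>Jones</author>"
    "</book>"
    "</bibliography>"
)

# [title='...'] keeps books having a title child with that text;
# author[2] keeps the second author child of each remaining book.
second_authors = [
    a.text for a in root.findall("book[title='Foundations']/author[2]")
]
print(second_authors)
```

Note that the condition uses the relative path title. A leading slash inside the brackets would restart the path at the document root.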

XQuery
XQuery extends XPath, i.e. every XPath expression is an XQuery expression.
Beyond XPath expressions, XQuery introduces FLWOR expressions.
Format: for let where order by return
[Diagram: for/let clauses, where clause, order by/return clause, with sequences of items passed between the stages]

XQuery
FLWOR expressions are similar to SQL select . . . from . . . where . . . queries.
A FLWOR expression has one or more for and let clauses.
The where clause is optional.
There is one optional order by clause.
Finally, there is exactly one return clause.
XQuery is case-sensitive.
XQuery and XPath are W3C standards.

XQuery
XQuery is a functional language.
Any XQuery expression can be used in any place that an expression is expected.
SQL also allows subqueries in many places. However, SQL does not, for example, allow an arbitrary subquery as an operand of a comparison in a WHERE clause.
This implies that every XQuery operator must be defined for operands that are sequences of items, not just for individual items.

XQuery Clauses
for $x in expr
    Defines node variable $x.
    The expression expr evaluates to a sequence of items.
    The variable $x is assigned to each item, in turn, and the body of the for clause is executed once for each assignment.
let $x := expr
    Defines collection variable $x.
    The expression expr evaluates to a sequence of items.
    The variable is bound to the entire sequence of items.
    Useful for common subexpressions and for aggregations.

XQuery Clauses
where condition
    The condition is a boolean expression.
    The clause is applied to some item.
    If and only if the condition evaluates to true, the following return clause is executed for that item.
return expression
    The result of a FLWOR expression is a sequence of items.
    The expression defines the result format for the current (qualifying) item.
    The sequence of items produced by the expression is appended to the sequence of items produced so far.

Document Nodes
The context for a for or let clause is often provided by a document node.
Typically, the document comes from a file.
The doc function constructs a document node from a file with a given name.
Examples
    doc("bib.xml")
    doc("infolab.stanford.edu/~hector/movies.xml")

Interpretation as XQuery Expression
XQuery expressions can be used wherever an XML expression of any kind is permitted.
Any text string is acceptable as content of a tag or value of an attribute.
If a string contains an XQuery expression that should be evaluated, this substring must be surrounded by curly brackets {}.
Example
    for $b in doc("bib.xml")/bibliography/book
    return <result id="{$b/@bookID}">{$b/title}</result>

XQuery Examples
Find all books. for vs. let
    for $x in doc("bib.xml")/bibliography/book
    return <result> {$x} </result>
Returns:
    <result> <book>...</book> </result>
    <result> <book>...</book> </result>
    <result> <book>...</book> </result>
    ...
    let $x := doc("bib.xml")/bibliography/book
    return <result> {$x} </result>
Returns:
    <result> <book>...</book> <book>...</book> <book>...</book> ... </result>
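
The contrast between for and let can be imitated in Python, assuming xml.etree.ElementTree. This is only an analogy: a comprehension stands in for the for clause and a plain assignment for the let clause. The helper wrap and the three book titles are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<bibliography>"
    "<book><title>abc</title></book>"
    "<book><title>def</title></book>"
    "<book><title>ghi</title></book>"
    "</bibliography>"
)
books = root.findall("book")


def wrap(items):
    """Build a <result> element containing the given items (hypothetical helper)."""
    result = ET.Element("result")
    result.extend(items)
    return result


# "for": the return part runs once per book, giving one <result> per book.
for_results = [wrap([b]) for b in books]

# "let": the variable holds the whole sequence, giving a single <result>.
let_result = wrap(books)

print(len(for_results), "results from for;", len(let_result), "books in the let result")
```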

XQuery Examples
Find all titles of books published after 1995.
    for $x in doc("bib.xml")/bibliography/book
    where $x/year > 1995
    return $x/title
Result:
    <title> abc </title>
    <title> def </title>
    <title> ghi </title>
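
A rough Python counterpart of this query, assuming xml.etree.ElementTree and a made-up three-book document: the if part of the comprehension plays the role of the where clause.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: only two of the three books are newer than 1995.
root = ET.fromstring(
    "<bibliography>"
    "<book><title>abc</title><year>1995</year></book>"
    "<book><title>def</title><year>1998</year></book>"
    "<book><title>ghi</title><year>2001</year></book>"
    "</bibliography>"
)

# for $x in .../book  where $x/year > 1995  return $x/title
titles = [
    b.findtext("title")
    for b in root.findall("book")
    if int(b.findtext("year")) > 1995
]
print(titles)
```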

Ordering the Query Result
The order by clause allows you to order the results of an XQuery expression.
    order by list of expressions
The sort order is based on the value of the first expression. Ties are broken based on the value of the second (if necessary third etc.) expression.
By default, the order is ascending.
A descending sort order can be specified using descending.

Elimination of Duplicates
The built-in function distinct-values eliminates duplicates from a sequence of result items.
In principle, it applies only to primitive (atomic) types.
It can also be applied to elements, but then it will remove their tags and return their content as atomic values.
Example
If $b/title produces
    <title> aaa </title> <title> bbb </title> <title> aaa </title>
then distinct-values($b/title) produces
    "aaa" "bbb"

XQuery Examples
Find all books published by Morgan Kaufmann and list them in descending order of their prices.
Uses order by with option descending.
    for $b in doc("bib.xml")/bibliography/book[publisher = "Morgan Kaufmann"]
    order by $b/price descending
    return $b
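
The same selection and ordering can be sketched in Python, assuming xml.etree.ElementTree. The three books and their prices are invented for illustration, and the prices are converted to numbers explicitly before sorting.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two Morgan Kaufmann books and one from another publisher.
root = ET.fromstring(
    "<bibliography>"
    "<book><title>p1</title><publisher>Morgan Kaufmann</publisher><price>30</price></book>"
    "<book><title>p2</title><publisher>Morgan Kaufmann</publisher><price>55</price></book>"
    "<book><title>p3</title><publisher>Addison Wesley</publisher><price>80</price></book>"
    "</bibliography>"
)

# The predicate mirrors book[publisher = "Morgan Kaufmann"].
mk_books = root.findall("book[publisher='Morgan Kaufmann']")

# order by price descending, comparing prices numerically.
ordered = sorted(mk_books, key=lambda b: float(b.findtext("price")), reverse=True)
ordered_titles = [b.findtext("title") for b in ordered]
print(ordered_titles)
```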

XQuery Examples
For each author of a book published by Morgan Kaufmann, list the author and the titles of all books she published.
Uses nested subquery and function distinct-values.
    for $a in distinct-values(doc("bib.xml")/bibliography/book[publisher = "Morgan Kaufmann"]/author)
    return <result>
             <author>{$a}</author>
             {for $t in doc("bib.xml")/bibliography/book[author = $a]/title
              return $t}
           </result>

XQuery Examples
Result:
    <result>
      <author> Jones </author>
      <title> abc </title>
      <title> def </title>
    </result>
    <result>
      <author> Smith </author>
      <title> ghi </title>
    </result>
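
The nested query can be mimicked in Python, assuming xml.etree.ElementTree. The document below is invented so that it yields the authors and titles shown in the result. dict.fromkeys serves as an order-preserving stand-in for distinct-values.

```python
import xml.etree.ElementTree as ET

# Hypothetical data chosen to match the result above.
root = ET.fromstring(
    "<bibliography>"
    "<book><title>abc</title><author>Jones</author><publisher>Morgan Kaufmann</publisher></book>"
    "<book><title>def</title><author>Jones</author><publisher>Addison Wesley</publisher></book>"
    "<book><title>ghi</title><author>Smith</author><publisher>Morgan Kaufmann</publisher></book>"
    "</bibliography>"
)

# Outer query: distinct authors of Morgan Kaufmann books.
mk_authors = root.findall("book[publisher='Morgan Kaufmann']/author")
authors = list(dict.fromkeys(a.text for a in mk_authors))

# Inner query: for each author, the titles of all of her books (any publisher).
titles_by_author = {
    name: [
        b.findtext("title")
        for b in root.findall("book")
        if name in (a.text for a in b.findall("author"))
    ]
    for name in authors
}
print(titles_by_author)
```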

Joins
We can join two or more documents by using one variable for each of the documents.
We let a variable range over the elements of the corresponding document, within a for clause.
We need to be careful when comparing elements for equality: element identity (tested with the is operator) is not the same as equality of element content.
Typically, we want to compare the element content.
The built-in function data(E) returns the content of an element E.

Example
Find all pairs of titles of books from the same year.
Uses two variables ranging over books and the data function applied to their year elements.
    let $books := doc("bib.xml")
    for $b1 in $books/bibliography/book,
        $b2 in $books/bibliography/book
    where data($b1/year) = data($b2/year)
    return <result> {$b1/title} {$b2/title} </result>
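
In Python the self-join corresponds to a comprehension with two loops, sketched here with xml.etree.ElementTree and invented data. Like the query, it pairs each book with itself and returns every pair in both orders.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two books from 1995 and one from 2000.
root = ET.fromstring(
    "<bibliography>"
    "<book><title>abc</title><year>1995</year></book>"
    "<book><title>def</title><year>1995</year></book>"
    "<book><title>ghi</title><year>2000</year></book>"
    "</bibliography>"
)
books = root.findall("book")

# Two variables range over the same sequence; the condition compares
# the text content of the year elements, as data() does in the query.
pairs = [
    (b1.findtext("title"), b2.findtext("title"))
    for b1 in books
    for b2 in books
    if b1.findtext("year") == b2.findtext("year")
]
print(pairs)
```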

Comparison Operators
XQuery supports the standard comparison operators such as <, >, =.
Comparison operators are applied to a sequence of items.
Comparisons have an existential nature, i.e. they return true if and only if at least one of the items satisfies the condition of the comparison.
    for $b in doc("bib.xml")/bibliography/book
    where $b/author/firstname = "A"
      and $b/author/lastname = "B"
    return $b
Books returned can have one author with firstname A and another author with lastname B.
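
The pitfall can be made concrete with a small Python sketch, assuming xml.etree.ElementTree. The helper general_eq is a hypothetical stand-in for the existential = operator, and the book is invented so that no single author is called A B.

```python
import xml.etree.ElementTree as ET

# Hypothetical book: one author has firstname A, a different one has lastname B.
book = ET.fromstring(
    "<book>"
    "<author><firstname>A</firstname><lastname>X</lastname></author>"
    "<author><firstname>Y</firstname><lastname>B</lastname></author>"
    "</book>"
)


def general_eq(nodes, value):
    """True if at least one node has the given text (existential comparison)."""
    return any(n.text == value for n in nodes)


# Analogue of: $b/author/firstname = "A" and $b/author/lastname = "B"
loose = general_eq(book.findall("author/firstname"), "A") and general_eq(
    book.findall("author/lastname"), "B"
)

# Stricter variant: one and the same author must satisfy both conditions.
strict = any(
    a.findtext("firstname") == "A" and a.findtext("lastname") == "B"
    for a in book.findall("author")
)
print(loose, strict)
```

The book qualifies under the existential reading but not under the stricter one, which is the behaviour the slide warns about.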

Comparison Operators
XQuery also supports special comparison operators that only compare sequences consisting of a single item: eq, ne, lt, le, gt, ge.
These comparisons fail if one of the operands contains more than one item.
XQuery also provides built-in functions for string matching, in particular substring matching such as contains($p, "windsurfing").

Quantification
XQuery supports the existential and the universal quantifier.
Universal quantifier
    every $v in expression1 satisfies expression2
Existential quantifier
    some $v in expression1 satisfies expression2
expression1 evaluates to a sequence of items, expression2 is a boolean expression.
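
Python's built-in any and all behave like some and every, so the two quantifiers can be sketched as follows, assuming xml.etree.ElementTree and an invented book with two paragraphs. The in operator stands in for the contains function.

```python
import xml.etree.ElementTree as ET

# Hypothetical book with two paragraphs.
book = ET.fromstring(
    "<book><title>Water sports</title>"
    "<para>sailing and windsurfing</para>"
    "<para>sailing only</para>"
    "</book>"
)
paras = [p.text for p in book.findall(".//para")]

# some $p in $b//para satisfies contains($p, "windsurfing")
some_windsurfing = any("windsurfing" in p for p in paras)

# every $p in $b//para satisfies contains($p, "sailing")
every_sailing = all("sailing" in p for p in paras)

# every $p in $b//para satisfies contains($p, "windsurfing")
every_windsurfing = all("windsurfing" in p for p in paras)

print(some_windsurfing, every_sailing, every_windsurfing)
```

As with every in XQuery, all returns true for an empty sequence, so a book without any paragraphs satisfies a universal condition trivially.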

Aggregation
XQuery provides built-in functions for the standard aggregations such as sum, min, count and avg.
They can be applied to any XQuery expression, i.e. to any sequence of items.
Example
    avg(doc("bib.xml")/bibliography/book/price)
    count(doc("bib.xml")/bibliography/book/price)
Computes the average book price and the number of book prices, respectively.

XQuery Examples
Find books whose price is larger than the average price.
Uses aggregate operator (avg), applied to the result of a path expression.
    let $a := avg(doc("bib.xml")/bibliography/book/price)
    for $b in doc("bib.xml")/bibliography/book
    where $b/price > $a
    return $b
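
A Python analogue of this query, assuming xml.etree.ElementTree and the statistics module, with invented prices: the average is computed once, as with let, and then used as a filter inside the loop.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical prices 10, 20 and 60, so the average is 30.
root = ET.fromstring(
    "<bibliography>"
    "<book><title>abc</title><price>10</price></book>"
    "<book><title>def</title><price>20</price></book>"
    "<book><title>ghi</title><price>60</price></book>"
    "</bibliography>"
)

# let $a := avg(.../book/price)
average = mean(float(p.text) for p in root.findall("book/price"))

# for $b in .../book  where $b/price > $a  return $b
expensive = [
    b.findtext("title")
    for b in root.findall("book")
    if float(b.findtext("price")) > average
]
print(average, expensive)
```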

XQuery Examples
Find the titles of books with a paragraph containing the terms "sailing" and "windsurfing".
Uses existential quantifier (some) and string matching (contains).
    for $b in doc("bib.xml")//book
    where some $p in $b//para satisfies
          contains($p, "sailing") and contains($p, "windsurfing")
    return $b/title

XQuery Examples
Find the titles of books where every paragraph contains the term "sailing".
Uses universal quantifier (every) and string matching (contains).
    for $b in doc("bib.xml")//book
    where every $p in $b//para satisfies
          contains($p, "sailing")
    return $b/title

Summary
XQuery is the standard XML query language.
It is a functional language, i.e. any XQuery expression can be used in any place where an expression is expected.
A FLWOR expression consists of for, let, where, order by and return clauses, of which some are optional.
The main new concept compared to SQL is the path expression, which returns the sequence of items reachable via the given path.
Path expressions are defined in XPath, a sublanguage of XQuery.
In addition, XQuery has equivalent constructs for most of the main SQL constructs, in particular quantifiers and aggregate functions.