Typing XQuery
WANG Zhen (Selina)
2006.4.6
Something about the Internship
Group Name: PROTHEO, Inria, France
Research: Rewriting and strategies, Constraints, Automated Deduction
A member in REWERSE (Reasoning on the Web with Rules and Semantics), a research network within EU
Aim: develop reasoning languages for Web applications.
In progress: Xcerpt
• A deductive, rule-based query language for graph-structured data, including XML data.
• More suitable for reasoning, compared to XQuery.
Work on Xcerpt and its type system is still in progress.
Question: How to build a type system for Xcerpt?
• Refer to the typing systems of other query languages.
• My internship: analyze the typing system of XQuery
Outline
Background
Related Work
XQuery Typing System
Conclusion and Future Work
Background
XQuery
An XML query language
E.g.: A simple path expression
doc("Catalogue.xml")/catalogue/cd
Path expressions with predicates
doc("Catalogue.xml")/catalogue/cd[1]/title
doc("Catalogue.xml")/catalogue/cd[price>=30]/title
doc("Catalogue.xml")/catalogue/cd[keyword]/title
Predicate [pre]
serves to filter a sequence, retaining some items and discarding others.
For …/x[pre]…
Compute the predicate truth value of pre for each item x.
If it is true, the item x is retained; otherwise, the item x is discarded.
Three Typical Predicates [pre]:
pre is numeric → predicate truth value = true if the position of the item equals pre
doc("Catalogue.xml")/catalogue/cd[1]/title
pre is boolean → predicate truth value = pre
doc("Catalogue.xml")/catalogue/cd[price>30]/title
pre is a typed path → predicate truth value = true if pre exists (selects at least one node)
doc("Catalogue.xml")/catalogue/cd[keyword]/title
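The three cases above can be sketched in a few lines of Python. This is only an illustration, not how an XQuery engine is implemented: the helper names `predicate_truth_value` and `filter_items` and the dictionary encoding of the CDs are assumptions made for the example.

```python
from typing import Sequence, Union


# Hypothetical helper (not part of any XQuery engine): decides whether the
# item at 1-based `position` is retained, given the evaluated predicate value.
def predicate_truth_value(pre: Union[bool, int, float, Sequence], position: int) -> bool:
    if isinstance(pre, bool):          # boolean predicate: the value itself
        return pre
    if isinstance(pre, (int, float)):  # numeric predicate: compare with the position
        return pre == position
    return len(pre) > 0                # path predicate: true if something was selected


def filter_items(items, evaluate_pre):
    """Keep the items whose predicate truth value is true."""
    return [
        item
        for position, item in enumerate(items, start=1)
        if predicate_truth_value(evaluate_pre(item), position)
    ]


# Assumed in-memory encoding of the three CDs in Catalogue.xml.
cds = [
    {"title": "Empire Burlesque", "price": 29, "keyword": ["Empire", "Bob"]},
    {"title": "Hide your heart", "price": 30, "keyword": []},
    {"title": "Stop", "price": 39, "keyword": []},
]

first = filter_items(cds, lambda cd: 1)                     # cd[1]
expensive = filter_items(cds, lambda cd: cd["price"] > 30)  # cd[price>30]
tagged = filter_items(cds, lambda cd: cd["keyword"])        # cd[keyword]
```

The boolean test comes before the numeric one because Python treats `bool` as a subclass of `int`.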
Background
Typing XQuery
An important aspect of XQuery formal semantics
E.g.: Given:
Catalogue.xml
A query: extract the titles of the CDs whose price is greater than or equal to 30
XQuery expression:
doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Result
Background
Problem, if there is no type information for the XML data
The queries and their different results
Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title
Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Result of price >= "30.0", depending on the price:
• price = 30: incorrect
• price > 30 or price < 30: correct
Possible reason: the comparison does not only compare the numbers; in some cases it also compares the lengths of the values.
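The effect can be reproduced with a small sketch. It assumes that an untyped price compared with a string literal is compared as text, code point by code point; Python's own string comparison is used here to mimic that behaviour. The function names are hypothetical.

```python
def compare_as_strings(price_text: str, literal: str) -> bool:
    # Python compares str values code point by code point; a proper prefix
    # such as "30" sorts before the longer string "30.0".
    return price_text >= literal


def compare_as_numbers(price_text: str, literal: str) -> bool:
    # What the query author presumably intended: a numeric comparison.
    return float(price_text) >= float(literal)


prices = ["29", "30", "39"]  # the three prices in Catalogue.xml

string_results = [p for p in prices if compare_as_strings(p, "30.0")]
number_results = [p for p in prices if compare_as_numbers(p, "30.0")]
```

Under this assumption only the price 30 is affected: it is dropped by the string comparison but kept by the numeric one, which matches the table above.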
Background
However, there is no error message or warning. The mistake is too subtle to be located easily.
If we provide type information (e.g., define price as a float) and type checking, we may find the mistake during compilation:
>= expects two numeric operands: numeric >= numeric
(numeric: decimal, float, double, etc.)
price>="30.0": typing error!
Related Work
XQuery 1.0 and XPath 2.0 Formal Semantics
http://www.w3.org/TR/xquery-semantics/
A W3C Candidate Recommendation that:
• Describes the formal semantics, including some details of the static analysis phase and the dynamic evaluation phase
• Provides some generic typing rules
Too general to guide the implementation of the detailed typing procedure
• E.g.: only a single rule for typing path expressions
Some inconsistency between the summarized formal semantics and the rules
• E.g.:
  Formal Semantics: three kinds of predicate (numeric, boolean, typed path)
  Rules: ---
Related Work
For predicates [pre] where pre is not numeric, boolean, or a typed path:
If pre is a string or a sequence, the predicate truth value is true if pre is not empty, and is false otherwise.
In all other cases, a typing error is raised.
Problem: any expression can be used in a predicate. Some of them pass compilation but do not give reasonable results:
doc("Catalogue.xml")/catalogue/cd["1"]
doc("Catalogue.xml")/catalogue/cd["price>=30"]
doc("Catalogue.xml")/catalogue/cd["keyword"]
XQuery Typing System
This system includes the typing rules, which describe the detailed typing procedure for XQuery.
Extension of the W3C work
• Adopt and modify some basic notations to focus on typing
• Try to solve the inconsistency problem
• Up to now, we have mainly extended the typing rules for path expressions, including predicates.
Definitions and Notations
Typing Rules
Example
Implementation
Definitions and Notations
A Basic Type
• The built-in datatypes defined in XML Schema, including the primitive and the derived datatypes. E.g.: string, integer, etc.
• A user-defined simple type. E.g.: "myInteger" defines the integers with values between 1000 and 2000:
<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="1000"/>
    <xsd:maxInclusive value="2000"/>
  </xsd:restriction>
</xsd:simpleType>
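As a rough illustration of what the restriction means, the following sketch checks whether a piece of text belongs to "myInteger". The function name `is_my_integer` is hypothetical, and the lexical check is a simplification of the xsd:integer rules.

```python
import re

# Simplified lexical form of xsd:integer: optional sign followed by ASCII digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def is_my_integer(text: str) -> bool:
    """Return True if `text` is an integer within the inclusive range 1000..2000."""
    candidate = text.strip()  # surrounding whitespace is ignored
    if not _INTEGER.fullmatch(candidate):
        return False
    # minInclusive = 1000 and maxInclusive = 2000 are both inclusive bounds.
    return 1000 <= int(candidate) <= 2000
```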
Definitions and Notations
A type is:
1. A type constant, e.g.: DocumentType, predicate
2. A basic type, or
3. A type symbol (e.g., a type called "CD"), or
4. A functional type of the form (formula not shown), with n ≥ 0, where {…} are types for attributes and τi are types for children
5. A disjunction type of the form (formula not shown), where τi are types and n ≥ 0
6. A type with an occurrence indicator, of the form (formula not shown)
Definitions and Notations
A typing judgement exp : τ
exp is a typed expression, τ is a type.
If exp's type is τ, the typing judgement is true.
A typing rule
• All the premises and the conclusion are typing judgements.
• The conclusion is true, given that all the premises are true.
• If there is no premise, the conclusion is always true.
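The general shape of such a rule can be written as an inference rule with the premises above the line and the conclusion below it. The names exp_i, τ_i, exp and τ are placeholders chosen for this sketch; the slide's own figure is not reproduced here.

```latex
\[
\frac{exp_1 : \tau_1 \qquad exp_2 : \tau_2 \qquad \cdots \qquad exp_n : \tau_n}
     {exp : \tau}
\qquad (n \ge 0)
\]
```

With n = 0 there is nothing above the line, so the conclusion holds unconditionally.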
Definitions and Notations
Notations
Typing Rules
Typing rules used for
Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title
Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Typing doc(f)
doc(f) is a document function, which is used to extract data from XML file f.
The typing rule supposes that the type of the root element of XML file f is τ.
Typing Paths
Typing Predicate ( numeric, boolean, typed path )
τ1 <: τ2 means that type τ1 is a subtype of type τ2
If exp: τ1, and τ1<: τ2, then exp: τ2
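Written as an inference rule, the subtyping statement above reads as follows. This is only a restatement in rule form, using the same symbols.

```latex
\[
\frac{exp : \tau_1 \qquad \tau_1 <: \tau_2}
     {exp : \tau_2}
\]
```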
Use a type called “numeric” where: (W3C)
Typing rules
Example
Typing Query 1 with a schema: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Typing Query 1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Example
Typing Query 2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title
Predicate (numeric, boolean, typed path)
Check whether price>="30.0" : boolean
Typing rule (from W3C) for the operator ">=", where τ is numeric
Example
A typing error is generated
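The check for ">=" can be sketched as follows. This is not the authors' TOM implementation. The `SCHEMA` table, the operand encoding and the function names are assumptions for illustration, and the rule is restricted to numeric operands as on the slides.

```python
NUMERIC_TYPES = {"integer", "decimal", "float", "double"}

# Assumed schema information: element name -> declared basic type.
SCHEMA = {"price": "float"}


class TypingError(Exception):
    """Raised when an expression cannot be given a type."""


def type_of_operand(operand):
    """Operands are ("path", element_name) or ("literal", python_value)."""
    kind, value = operand
    if kind == "path":
        return SCHEMA[value]
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    return "string"


def type_of_ge(left, right) -> str:
    """Type `left >= right`: both operands must be numeric, the result is boolean."""
    for operand in (left, right):
        operand_type = type_of_operand(operand)
        if operand_type not in NUMERIC_TYPES:
            raise TypingError(f"operand of type {operand_type} is not numeric")
    return "boolean"


query1_type = type_of_ge(("path", "price"), ("literal", 30))  # price>=30
```

Calling `type_of_ge(("path", "price"), ("literal", "30.0"))`, which corresponds to Query 2, raises `TypingError` because the second operand is a string.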
Implementation
In order to apply those typing rules, we need to:
• parse an XQuery expression into an abstract syntax tree
• apply the rules by navigating through the tree, adding type information to the nodes
Our implementation:
• XQueryX: an XML representation of XQuery syntax
• TOM: an extension of Java designed to manipulate tree structures and XML documents by using pattern matching facilities
Framework
Example:
Query 1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
XQueryX expression
<?xml version="1.0"?>
<module xmlns:xqx="http://www.w3.org/2005/XQueryX"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.w3.org/2005/XQueryX http://www.w3.org/2005/XQueryX/xqueryx.xsd">
  <mainModule>
    <queryBody>
      <pathExpr>
        <argExpr>
          <functionCallExpr>
            <functionName>doc</functionName>
            <arguments>
              <stringConstantExpr>
                <value>Catalogue.xml</value>
              </stringConstantExpr>
            </arguments>
          </functionCallExpr>
        </argExpr>
        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>catalogue</nameTest>
        </stepExpr>
        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>cd</nameTest>
          <predicates>
            <greaterThanOrEqualOp>
              <firstOperand>
                <pathExpr>
                  <stepExpr>
                    <nameTest>price</nameTest>
                  </stepExpr>
                </pathExpr>
              </firstOperand>
              <secondOperand>
                <integerConstantExpr>
                  <value>30</value>
                </integerConstantExpr>
              </secondOperand>
            </greaterThanOrEqualOp>
          </predicates>
        </stepExpr>
        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>title</nameTest>
        </stepExpr>
      </pathExpr>
    </queryBody>
  </mainModule>
</module>
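To illustrate how such a tree can be navigated, here is a minimal Python sketch. The talk's implementation uses TOM on top of Java; Python's standard `xml.etree.ElementTree` is swapped in here only to show the traversal. The trimmed `XQUERYX` string and the helper `describe_steps` are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the path expression from the XQueryX document for Query 1.
XQUERYX = """
<pathExpr>
  <argExpr><functionCallExpr>
    <functionName>doc</functionName>
    <arguments><stringConstantExpr><value>Catalogue.xml</value></stringConstantExpr></arguments>
  </functionCallExpr></argExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>catalogue</nameTest></stepExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>cd</nameTest>
    <predicates><greaterThanOrEqualOp>
      <firstOperand><pathExpr><stepExpr><nameTest>price</nameTest></stepExpr></pathExpr></firstOperand>
      <secondOperand><integerConstantExpr><value>30</value></integerConstantExpr></secondOperand>
    </greaterThanOrEqualOp></predicates>
  </stepExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>title</nameTest></stepExpr>
</pathExpr>
"""


def describe_steps(path_expr):
    """Return (element name, predicate operator or None) for each top-level step."""
    steps = []
    for step in path_expr.findall("stepExpr"):  # direct children only
        name = step.findtext("nameTest")
        predicates = step.find("predicates")
        has_predicate = predicates is not None and len(predicates) > 0
        operator = predicates[0].tag if has_predicate else None
        steps.append((name, operator))
    return steps


root = ET.fromstring(XQUERYX)
function_name = root.findtext("argExpr/functionCallExpr/functionName")
steps = describe_steps(root)
```

A type checker would visit the same nodes in the same order, attaching a type to each step instead of collecting names.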
Apply the Rules by Using TOM
Rules for typing doc(f)
Rules for typing each step in a path expression
Conclusion and Future Work
Conclusion
• We analyzed the related work on typing XQuery and resolved some inconsistencies by extending the typing rules.
• A prototype of the XQuery Typing System has been implemented, including the detailed typing rules for path expressions in XQuery.
Future Work
• Implement all the typing rules in the W3C work; find and resolve potential inconsistencies
• Design a typing system for Xcerpt
• Find a polymorphic typing system for Web query languages
Thank You
Catalogue.xml
<?xml version="1.0" encoding="UTF-8"?>
<catalogue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="Catalogue.xsd">
  <cd>
    <title>"Empire Burlesque"</title>
    <artist>Bob Dylan</artist>
    <year>1985</year>
    <price>29</price>
    <keyword>Empire</keyword>
    <keyword>Bob</keyword>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <year>1988</year>
    <price>30</price>
  </cd>
  <cd>
    <title>Stop</title>
    <artist>Sam Brown</artist>
    <year>1988</year>
    <price>39</price>
  </cd>
</catalogue>
Result
<title>Hide your heart</title> <title>Stop</title>
Query:
doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Source:
Catalogue.xml
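The result above can be reproduced outside an XQuery engine with a short Python sketch. It uses the standard `xml.etree.ElementTree` module and filters the cd elements by hand with an explicit numeric comparison. The function name and the trimmed `CATALOGUE` string are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Catalogue.xml from the slides, without the schema-location attributes.
CATALOGUE = """
<catalogue>
  <cd><title>"Empire Burlesque"</title><artist>Bob Dylan</artist>
      <year>1985</year><price>29</price>
      <keyword>Empire</keyword><keyword>Bob</keyword></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist>
      <year>1988</year><price>30</price></cd>
  <cd><title>Stop</title><artist>Sam Brown</artist>
      <year>1988</year><price>39</price></cd>
</catalogue>
"""


def titles_with_min_price(xml_text: str, minimum: float) -> list:
    """Mimic /catalogue/cd[price>=minimum]/title with a numeric comparison."""
    root = ET.fromstring(xml_text)
    return [
        cd.findtext("title")
        for cd in root.findall("cd")
        if float(cd.findtext("price")) >= minimum
    ]


titles = titles_with_min_price(CATALOGUE, 30)
```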
Results of the three queries for different ways of writing the price
Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title
Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title
Source: Catalogue.xml, with the price of "Hide your heart" written as 30, 30.0, or 30.00
Correct result: <title>Hide your heart</title> <title>Stop</title>
Incorrect result: <title>Stop</title>

Price in Catalogue.xml    Query 1    Query 2    Query 3
<price>30</price>         Correct    Incorrect  Incorrect
<price>30.0</price>       Correct    Correct    Incorrect
<price>30.00</price>      Correct    Correct    Correct
Example: Schema file “Catalogue.xsd”