1 Typing XQuery WANG Zhen (Selina) 2006.4.6


Typing XQuery

WANG Zhen (Selina)

2006.4.6


Something about the Internship

Group Name: PROTHEO, Inria, France

Research: Rewriting and strategies, Constraints, Automated Deduction

A member of REWERSE (Reasoning on the Web with Rules and Semantics), a research network within the EU

Aim: develop reasoning languages for Web applications.

In progress: Xcerpt

• A deductive, rule-based query language for graph-structured data, including XML data.

• More suitable for reasoning, compared to XQuery.

Still working on Xcerpt and its typing system.

Question: How to build a type system for Xcerpt?

• Refer to the typing systems of other query languages.

• My internship: analyze the typing system of XQuery


Outline

Background

Related Work

XQuery Typing System

Conclusion and Future Work


Background

XQuery

An XML query language

E.g.: A simple path expression

doc("Catalogue.xml")/catalogue/cd

Path expressions with predicates

doc("Catalogue.xml")/catalogue/cd[1]/title

doc("Catalogue.xml")/catalogue/cd[price>=30]/title

doc("Catalogue.xml")/catalogue/cd[keyword]/title


Background

Predicate [pre]

Serves to filter a sequence, retaining some items and discarding others.

For …/x[pre]…

Compute the predicate truth value of pre for each item x.

If it is true, the item x is retained; otherwise, the item x is discarded.

Three typical predicates [pre]:

pre is numeric → predicate truth value = true if the position of the item equals pre

doc("Catalogue.xml")/catalogue/cd[1]/title

pre is boolean → predicate truth value = pre

doc("Catalogue.xml")/catalogue/cd[price>30]/title

pre is a typed path → predicate truth value = true if pre exists (the path selects at least one node)

doc("Catalogue.xml")/catalogue/cd[keyword]/title
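The filtering behaviour of the three typical predicates can be sketched in Python. This is only an illustration of the idea, not an XQuery engine: the items are plain dictionaries standing in for cd elements, and the names predicate_truth_value, filter_items and CDS are hypothetical.

```python
def predicate_truth_value(pre, item, position):
    """Truth value of a predicate for one item (positions start at 1)."""
    value = pre(item) if callable(pre) else pre
    if isinstance(value, bool):          # boolean: the value itself
        return value
    if isinstance(value, (int, float)):  # numeric: compare with the position
        return position == value
    if isinstance(value, list):          # path result: true if something was selected
        return len(value) > 0
    raise TypeError("unsupported predicate value: %r" % (value,))


def filter_items(items, pre):
    """Keep the items whose predicate truth value is true."""
    return [item for pos, item in enumerate(items, start=1)
            if predicate_truth_value(pre, item, pos)]


# Hypothetical in-memory stand-in for the cd elements of Catalogue.xml.
CDS = [
    {"title": "Empire Burlesque", "price": 29, "keyword": ["Empire", "Bob"]},
    {"title": "Hide your heart", "price": 30, "keyword": []},
    {"title": "Stop", "price": 39, "keyword": []},
]

first = filter_items(CDS, 1)                                # like cd[1]
expensive = filter_items(CDS, lambda cd: cd["price"] > 30)  # like cd[price>30]
tagged = filter_items(CDS, lambda cd: cd["keyword"])        # like cd[keyword]
```

The boolean case is tested before the numeric one because Python treats True and False as integers; the order of the checks keeps the two cases apart.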


Background

Typing XQuery

An important aspect of XQuery formal semantics

E.g.: Given:

Catalogue.xml

A query: extract the titles of the CDs whose price is 30 or more

XQuery expression:

doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Result:

<title>Hide your heart</title> <title>Stop</title>
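For readers without an XQuery processor at hand, the intent of this query can be reproduced with a minimal Python sketch. It assumes an abridged copy of Catalogue.xml that keeps only the title and price elements, and it treats every price as a number; the names CATALOGUE and titles_with_min_price are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged stand-in for Catalogue.xml (only title and price are kept).
CATALOGUE = """<catalogue>
  <cd><title>Empire Burlesque</title><price>29</price></cd>
  <cd><title>Hide your heart</title><price>30</price></cd>
  <cd><title>Stop</title><price>39</price></cd>
</catalogue>"""


def titles_with_min_price(xml_text, minimum):
    """Titles of the cd elements whose price, read as a number, is >= minimum."""
    root = ET.fromstring(xml_text)  # root is the <catalogue> element
    return [cd.findtext("title")
            for cd in root.findall("cd")
            if float(cd.findtext("price")) >= minimum]


TITLES = titles_with_min_price(CATALOGUE, 30)
```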


Background

Problem if there is no type information for the XML data

The queries and their different results:

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Results for the comparison price >= "30.0":

Price             Result
= 30              incorrect
> 30 or < 30      correct

Possible reason: in some cases the comparison takes into account not only the number but also the length of the value.
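One plausible reading of this behaviour is that an untyped price is compared with a quoted literal as a string, character by character. The sketch below uses Python's own string ordering as a stand-in for that comparison, which is an assumption about the processor and not something the slides state; the function names are hypothetical.

```python
def compare_as_strings(price_text, literal):
    """String comparison: character by character, a proper prefix sorts first."""
    return price_text >= literal


def compare_as_numbers(price_text, literal):
    """Numeric comparison after converting both sides to numbers."""
    return float(price_text) >= float(literal)


PRICES = ["29", "30", "39"]
AS_STRINGS = [p for p in PRICES if compare_as_strings(p, "30.0")]
AS_NUMBERS = [p for p in PRICES if compare_as_numbers(p, "30.0")]
```

Under string comparison "30" is a proper prefix of "30.0" and therefore sorts before it, so the CD priced 30 is dropped, while prices above or below 30 are decided by their first differing character and happen to give the expected answer.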


Background

However, there is no error message or warning. The mistake is too subtle to be located easily.

If we provide type information (E.g.: define price as a float) and type checking, we may find the mistake during compilation:

The operator >= expects numeric operands: numeric >= numeric

(numeric: decimal, float, double, etc.)

price >= "30.0" → typing error!


Related Work

XQuery 1.0 and XPath 2.0 Formal Semantics

http://www.w3.org/TR/xquery-semantics/

A W3C Candidate Recommendation, which:

Describes the formal semantics, including some details of the static analysis phase and the dynamic evaluation phase

Provides some generic typing rules

Too general to guide the implementation of a detailed typing procedure

• E.g.: only a single rule for typing path expressions

Some inconsistencies between the summarized formal semantics and the rules

• E.g.:

Formal Semantics                                           Rules
Three kinds of predicate (numeric, boolean, typed path)    ---


Related Work

Besides numeric, boolean, and typed path, any other expression pre may appear in a predicate [pre]:

If pre is a string or a sequence, the predicate truth value is true if pre is not empty, and false otherwise.

In all other cases, a typing error is raised.

Problem: any expression can be used in a predicate. Some of them pass compilation but do not give reasonable results:

doc("Catalogue.xml")/catalogue/cd["1"]

doc("Catalogue.xml")/catalogue/cd["price>=30"]

doc("Catalogue.xml")/catalogue/cd["keyword"]
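Why these quoted predicates are unhelpful can be seen from a small Python sketch of the string rule stated on this slide. The item list and the function names are hypothetical; only the "true if not empty" rule comes from the text.

```python
def string_predicate_truth_value(pre):
    """Rule for a string predicate: true if the string is not empty."""
    return pre != ""


def filter_with_string(items, pre):
    """Every item sees the same constant truth value, whatever the string says."""
    return [item for item in items if string_predicate_truth_value(pre)]


TITLES = ["Empire Burlesque", "Hide your heart", "Stop"]

KEPT_POSITION = filter_with_string(TITLES, "1")       # like cd["1"]
KEPT_PRICE = filter_with_string(TITLES, "price>=30")  # like cd["price>=30"]
KEPT_EMPTY = filter_with_string(TITLES, "")           # like cd[""]
```

Because the string is never interpreted as a position or as a comparison, any non-empty string keeps every item, which is rarely what the author of the query meant.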


XQuery Typing System

This system includes the typing rules, which describe the detailed typing procedure for XQuery.

Extension of the W3C work

Adopt and modify some basic notations to focus on typing

Try to solve the inconsistency problem

• Up to now, we have mainly extended the typing rules for path expressions, including predicates.

Definitions and Notations

Typing Rules

Example

Implementation


Definitions and Notations

A Basic Type

The built-in datatypes defined in XML Schema, including the primitive and the derived datatypes. E.g.: string, integer, etc.

A user-defined simple type. E.g.: "myInteger" defines the integers with a value between 1000 and 2000:

<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="1000"/>
    <xsd:maxInclusive value="2000"/>
  </xsd:restriction>
</xsd:simpleType>


Definitions and Notations

A type is:

1. A type constant, e.g.: DocumentType, predicate

2. A basic type, or

3. A type symbol (e.g., a type called "CD"), or

4. A functional type with the form (n ≥ 0) where {…} are types for attributes and τi are types for children

5. A disjunction type, of the form (where τi are types and n ≥ 0):

6. A type with an occurrence indicator, in the form of


Definitions and Notations

A typing judgement exp : τ

exp is a typed expression, τ is a type.

If the type of exp is τ, the typing judgement is true.

A typing rule

• The conclusion is true, given that all the premises are true.

• All the premises and the conclusion are typing judgements.

• If there is no premise, the conclusion is always true.
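A typing rule of this kind is commonly written as an inference rule, with the premises above a horizontal line and the conclusion below it. The general shape below follows that common convention; it is an assumption that the slide used exactly this layout, and the symbols exp_i and τ_i are placeholders.

```latex
\[
\frac{exp_1 : \tau_1 \qquad \cdots \qquad exp_n : \tau_n}{exp : \tau}
\qquad (n \ge 0)
\]
```

With n = 0 there is nothing above the line, which matches the case where the conclusion always holds.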


Definitions and Notations

Notations


Typing Rules

Typing rules used for

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Typing doc(f)

doc(f) is a document function, which is used to extract data from the XML file f.

Typing rule: suppose that the type of the root element of the XML file f is τ


Typing Paths


Typing Predicate (numeric, boolean, typed path)

τ1 <: τ2 means that type τ1 is a subtype of type τ2

If exp : τ1 and τ1 <: τ2, then exp : τ2
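The subtype relation and the rule that uses it can be sketched as a small Python lookup. The SUPERTYPE table is a hypothetical, deliberately incomplete excerpt: it lists integer under decimal as in XML Schema and places decimal, float and double under the "numeric" type mentioned in these slides.

```python
# Hypothetical direct-supertype table; only a few type names are listed.
SUPERTYPE = {
    "integer": "decimal",
    "decimal": "numeric",
    "float": "numeric",
    "double": "numeric",
}


def is_subtype(t1, t2):
    """True if t1 <: t2, following SUPERTYPE links (reflexive and transitive)."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = SUPERTYPE.get(t1)
    return False


def has_type(inferred, expected):
    """Subsumption: from exp : t1 and t1 <: t2, conclude exp : t2."""
    return is_subtype(inferred, expected)
```

For example, has_type("integer", "numeric") holds through the chain integer <: decimal <: numeric, while has_type("string", "numeric") does not.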

Use a type called “numeric” where: (W3C)

Typing rules


Example

Typing Query 1 with a schema: doc("Catalogue.xml")/catalogue/cd[price>=30]/title


Example

Typing Query 1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title


Example

Typing Query 2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Predicate (numeric, boolean, typed path)

Check whether price>="30.0" : boolean

Typing rule (from W3C) for the operator ">=", where τ is numeric

A typing error is generated
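The check that rejects Query 2 can be sketched in Python. This is a simplified illustration and not the W3C rule itself: it assumes a hypothetical SCHEMA table that declares price as a float, treats text in double quotes as a string literal, and only accepts numeric operands for ">=". All names are hypothetical.

```python
NUMERIC = {"integer", "decimal", "float", "double"}

# Hypothetical schema information: the declared type of each element name.
SCHEMA = {"price": "float", "title": "string"}


class TypingError(Exception):
    """Raised when an expression cannot be given a type."""


def type_of_operand(operand):
    """Type of a literal, or of an element name looked up in SCHEMA."""
    if isinstance(operand, bool):
        return "boolean"
    if isinstance(operand, int):
        return "integer"
    if isinstance(operand, float):
        return "double"
    if operand.startswith('"'):   # quoted text stands for a string literal
        return "string"
    return SCHEMA[operand]        # otherwise an element name such as price


def type_greater_or_equal(left, right):
    """Type of `left >= right`: boolean if both operands are numeric."""
    for operand in (left, right):
        operand_type = type_of_operand(operand)
        if operand_type not in NUMERIC:
            raise TypingError("%s has type %s, expected numeric"
                              % (operand, operand_type))
    return "boolean"
```

Calling type_greater_or_equal("price", 30) yields "boolean", as in Query 1, whereas type_greater_or_equal("price", '"30.0"') raises TypingError, as in Query 2.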


Implementation

In order to apply those typing rules, we need to:

parse an XQuery expression into an abstract syntax tree

apply those rules by navigating through the tree, adding type information to the nodes
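The second step, walking the tree and recording a type on each node, can be sketched in Python. This is not the TOM implementation described in these slides: the tree is built by hand instead of being parsed, predicates are left out, and the Node class and the CHILD_TYPES table are hypothetical stand-ins for an abstract syntax tree and a schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str                          # "path", "doc" or "step"
    name: str = ""
    children: List["Node"] = field(default_factory=list)
    type: Optional[str] = None         # filled in by annotate()


# Hypothetical schema excerpt: type of a named child for each parent type.
CHILD_TYPES = {
    ("Document", "catalogue"): "Catalogue",
    ("Catalogue", "cd"): "CD",
    ("CD", "title"): "string",
}


def annotate(path: Node) -> Node:
    """Visit the steps of a path left to right and record a type on each node."""
    current = None
    for child in path.children:
        if child.kind == "doc":
            child.type = "Document"
        else:
            child.type = CHILD_TYPES[(current, child.name)]
        current = child.type
    path.type = current                # the path has the type of its last step
    return path


QUERY = Node("path", children=[
    Node("doc", "Catalogue.xml"),
    Node("step", "catalogue"),
    Node("step", "cd"),
    Node("step", "title"),
])
annotate(QUERY)
```

A step whose name is not declared for the current type raises KeyError here; a fuller checker would report that case as a typing error.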

Our implementation:

XQueryX: an XML representation of the XQuery syntax

TOM: an extension of Java designed to manipulate tree structures and XML documents, using pattern matching facilities

Framework

Example:

Query 1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title


XQueryX expression

<?xml version="1.0"?>
<module xmlns:xqx="http://www.w3.org/2005/XQueryX"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.w3.org/2005/XQueryX http://www.w3.org/2005/XQueryX/xqueryx.xsd">
  <mainModule>
    <queryBody>
      <pathExpr>
        <argExpr>
          <functionCallExpr>
            <functionName>doc</functionName>
            <arguments>
              <stringConstantExpr>
                <value>Catalogue.xml</value>
              </stringConstantExpr>
            </arguments>
          </functionCallExpr>
        </argExpr>


XQueryX expression (continued)

        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>catalogue</nameTest>
        </stepExpr>
        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>cd</nameTest>
          <predicates>
            <greaterThanOrEqualOp>
              <firstOperand>
                <pathExpr>
                  <stepExpr>
                    <nameTest>price</nameTest>
                  </stepExpr>
                </pathExpr>
              </firstOperand>
              <secondOperand>
                <integerConstantExpr>
                  <value>30</value>
                </integerConstantExpr>
              </secondOperand>
            </greaterThanOrEqualOp>
          </predicates>
        </stepExpr>


XQueryX expression (continued)

        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>title</nameTest>
        </stepExpr>
      </pathExpr>
    </queryBody>
  </mainModule>
</module>
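Since an XQueryX expression is ordinary XML, any XML library can navigate it. The Python sketch below reads a compacted copy of the expression for Query 1 and lists the top-level steps of the path. It is an illustration only: the namespace declarations are dropped for brevity, and the name top_level_steps is hypothetical.

```python
import xml.etree.ElementTree as ET

# Compacted copy of the XQueryX expression for Query 1, without namespaces.
XQUERYX = """<module><mainModule><queryBody><pathExpr>
  <argExpr><functionCallExpr>
    <functionName>doc</functionName>
    <arguments><stringConstantExpr><value>Catalogue.xml</value></stringConstantExpr></arguments>
  </functionCallExpr></argExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>catalogue</nameTest></stepExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>cd</nameTest>
    <predicates><greaterThanOrEqualOp>
      <firstOperand><pathExpr><stepExpr><nameTest>price</nameTest></stepExpr></pathExpr></firstOperand>
      <secondOperand><integerConstantExpr><value>30</value></integerConstantExpr></secondOperand>
    </greaterThanOrEqualOp></predicates>
  </stepExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>title</nameTest></stepExpr>
</pathExpr></queryBody></mainModule></module>"""


def top_level_steps(xqueryx_text):
    """Return (name, has_predicate) for each direct step of the main path."""
    root = ET.fromstring(xqueryx_text)
    path = root.find("./mainModule/queryBody/pathExpr")
    steps = []
    for step in path.findall("stepExpr"):   # direct children only
        name = step.findtext("nameTest")
        has_predicate = step.find("predicates") is not None
        steps.append((name, has_predicate))
    return steps


STEPS = top_level_steps(XQUERYX)
```

The nested price step inside the predicate is not listed, because findall("stepExpr") only looks at the direct children of the main pathExpr.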


Apply the Rules by Using TOM

Rules for typing doc(f)

Rules for typing each step in a path expression


Conclusion and Future Work

Conclusion

We analyzed the related work on typing XQuery and resolved some inconsistencies by extending the typing rules.

A prototype of the XQuery Typing System has been implemented, including the detailed typing rules for the path expressions in XQuery.

Future Work

Implement all the typing rules of the W3C work; find and solve the potential inconsistency problems

Design a typing system for Xcerpt

Find a polymorphic typing system for Web query languages.


Thank You


Source: Catalogue.xml

<?xml version="1.0" encoding="UTF-8"?>
<catalogue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:noNamespaceSchemaLocation="Catalogue.xsd">
  <cd>
    <title>"Empire Burlesque"</title>
    <artist>Bob Dylan</artist>
    <year>1985</year>
    <price>29</price>
    <keyword>Empire</keyword>
    <keyword>Bob</keyword>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <year>1988</year>
    <price>30</price>
  </cd>
  <cd>
    <title>Stop</title>
    <artist>Sam Brown</artist>
    <year>1988</year>
    <price>39</price>
  </cd>
</catalogue>

Query:

doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Result:

<title>Hide your heart</title> <title>Stop</title>


Source: Catalogue.xml, with <price>30</price>

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Query 1: correct. Result: <title>Hide your heart</title> <title>Stop</title>

Query 2, Query 3: incorrect. Result: <title>Stop</title>


Catalogue.xml, with the price changed from <price>30</price> to <price>30.0</price>


Source: Catalogue.xml, with <price>30.0</price>

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Query 1, Query 2: correct. Result: <title>Hide your heart</title> <title>Stop</title>

Query 3: incorrect. Result: <title>Stop</title>


Catalogue.xml, with the price changed from <price>30</price> to <price>30.00</price>


Source: Catalogue.xml, with <price>30.00</price>

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Query 1, Query 2, Query 3: correct. Result: <title>Hide your heart</title> <title>Stop</title>


Example: Schema file “Catalogue.xsd”
