
Typing XQuery

WANG Zhen (Selina)

2006.4.6

Something about the Internship

Group Name: PROTHEO, Inria, France

Research: Rewriting and strategies, Constraints, Automated Deduction

A member of REWERSE (Reasoning on the Web with Rules and Semantics), a research network within the EU

Aim: develop reasoning languages for Web applications.

In progress: Xcerpt

• A deductive, rule-based query language for graph-structured data, including XML data.

• More suitable for reasoning, compared to XQuery.

Still working on Xcerpt and its typing system.

Question: How to build a type system for Xcerpt?

• Refer to the typing system of other query languages.

• My Internship: analyze the typing system of XQuery


Outline

Background

Related Work

XQuery Typing System

Conclusion and Future Work


Background

XQuery

An XML query language

E.g.: A simple path expression

doc("Catalogue.xml")/catalogue/cd

Path expressions with predicate

doc("Catalogue.xml")/catalogue/cd[ 1 ]/title

doc("Catalogue.xml")/catalogue/cd[ price>=30 ] /title

doc("Catalogue.xml")/catalogue/cd[ keyword ] /title

Predicate [pre]

serves to filter a sequence, retaining some items and discarding others.

For …/x[pre]…

Compute the predicate truth value of pre for each item x.

If true, the item x is retained, else, the item x is discarded

Three Typical Predicates [pre] :

pre is numeric → predicate truth value = true if the position of the item equals pre

doc("Catalogue.xml")/catalogue/cd[1]/title

pre is boolean → predicate truth value = pre

doc("Catalogue.xml")/catalogue/cd[price>30]/title

pre is a typed path → predicate truth value = true if pre exists (selects at least one node)

doc("Catalogue.xml")/catalogue/cd[keyword]/title
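The three typical predicates can be illustrated with a minimal sketch in Python. It only mimics the filtering described above and is not an XQuery processor; the helper name `filter_items`, the lambda-based predicates, and the shortened catalogue are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of Catalogue.xml
CATALOGUE = """<catalogue>
  <cd><title>Empire Burlesque</title><price>29</price><keyword>Empire</keyword></cd>
  <cd><title>Hide your heart</title><price>30</price></cd>
  <cd><title>Stop</title><price>39</price></cd>
</catalogue>"""


def filter_items(items, pre):
    """Keep the items whose predicate truth value is true.

    `pre` is called with (item, position) and returns a number, a boolean,
    or a list of nodes, mirroring the three typical predicate kinds.
    """
    kept = []
    for position, item in enumerate(items, start=1):
        value = pre(item, position)
        if isinstance(value, bool):            # boolean: the value itself
            truth = value
        elif isinstance(value, (int, float)):  # numeric: position must equal pre
            truth = position == value
        else:                                  # path: at least one node selected
            truth = len(value) > 0
        if truth:
            kept.append(item)
    return kept


def titles(items):
    return [cd.findtext("title") for cd in items]


cds = ET.fromstring(CATALOGUE).findall("cd")

first = titles(filter_items(cds, lambda cd, pos: 1))                # cd[1]
expensive = titles(
    filter_items(cds, lambda cd, pos: float(cd.findtext("price")) >= 30)
)                                                                   # cd[price>=30]
tagged = titles(filter_items(cds, lambda cd, pos: cd.findall("keyword")))  # cd[keyword]
```

The boolean test comes before the numeric test because Python treats `True` and `False` as integers.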


Background

Typing XQuery

An important aspect of XQuery formal semantics

E.g.: Given:

Catalogue.xml

A query: extract the titles of the CDs whose price is equal to or more than 30

XQuery expression:

doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Result


Background

Problem, if no type information for the XML data

The queries and different results

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Result of the comparison price >= "30.0", depending on the price:

• price = 30: incorrect

• price > 30 or price < 30: correct

Possible reason: in some cases the comparison not only compares the numbers, but also compares the length.


Background

However, there is no error message or warning. The mistake is too subtle to be located easily.
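A minimal sketch of the possible reason, assuming that an untyped price is compared as text when the literal is a string and as a number when the literal is numeric. Python's string comparison stands in for the query processor's string comparison; the names `untyped_ge`, `QUERIES`, and `outcome` are hypothetical.

```python
def untyped_ge(price_text, literal):
    """Compare an untyped price with a literal.

    A string literal leads to a character-by-character comparison,
    a numeric literal leads to a numeric comparison.
    """
    if isinstance(literal, str):
        return price_text >= literal
    return float(price_text) >= literal


# The literal used on the right-hand side of >= in each query
QUERIES = {"Query1": 30, "Query2": "30.0", "Query3": "30.00"}


def outcome(price_text):
    """Tell, for each query, whether a CD with this price is retained."""
    return {name: untyped_ge(price_text, lit) for name, lit in QUERIES.items()}


# The same price written in three different ways
results = {price: outcome(price) for price in ("30", "30.0", "30.00")}
```

Under these assumptions a CD priced `30` is retained only by Query1, because the text "30" is a proper prefix of "30.0" and therefore sorts before it.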

If we provide type information (E.g.: define price as a float) and type checking, we may find the mistake during compilation:

The operator >= expects numeric operands: numeric >= numeric

(numeric: decimal, float, double, etc.)

price >= "30.0" compares a numeric with a string: typing error!

Related Work

XQuery 1.0 and XPath 2.0 Formal Semantics

http://www.w3.org/TR/xquery-semantics/

A W3C Candidate Recommendation, which:

• Describes the formal semantics, including some details of the static analysis phase and the dynamic evaluation phase

• Provides some generic typing rules

Too general to guide the implementation of the detailed typing procedure

• E.g.: only a single rule for typing path expressions

Some inconsistency between the summarized formal semantics and the rules

• E.g.:

Formal Semantics: three kinds of predicate (numeric, boolean, typed path)

Rules: ---


Related Work

Besides numeric, boolean, and typed path, other expressions pre are possible in a predicate [pre]:

If pre is a string or a sequence, the predicate truth value is true if pre is not empty, and is false otherwise.

In all other cases, a typing error is raised.

Problem: any expression can be used in a predicate. Some of them pass compilation but do not give reasonable results:

doc("Catalogue.xml")/catalogue/cd[ "1" ]

doc("Catalogue.xml")/catalogue/cd[ "price>=30" ]

doc("Catalogue.xml")/catalogue/cd[ "keyword" ]
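The problem can be reproduced with a small Python sketch of the predicate truth value as summarized on this slide. The function names and the plain list of titles are assumptions for illustration; the point is only that a non-empty string is always true, whatever its content.

```python
def predicate_truth_value(value, position):
    """Truth value of a predicate [pre] for the item at `position`."""
    if isinstance(value, bool):             # boolean: the value itself
        return value
    if isinstance(value, (int, float)):     # numeric: position must equal pre
        return position == value
    if isinstance(value, (str, list, tuple)):
        return len(value) > 0               # string or sequence: true if not empty
    raise TypeError(f"cannot use {type(value).__name__} as a predicate")


def select(items, pre):
    """Retain the items for which the predicate truth value is true."""
    return [item for position, item in enumerate(items, start=1)
            if predicate_truth_value(pre, position)]


titles = ["Empire Burlesque", "Hide your heart", "Stop"]

by_position = select(titles, 1)    # the number 1 selects the first item only
by_string = select(titles, "1")    # the string "1" retains every item
```

In this sketch `select(titles, "price>=30")` also retains every item, which is the unreasonable result the slide points at.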


XQuery Typing System

This system includes the typing rules that describe the detailed typing procedure for XQuery.

Extension of the W3C work

• Adopt and modify some basic notations to focus on typing

• Try to solve the inconsistency problem

• Up to now, we have mainly extended the typing rules for path expressions, including predicates.

Definitions and Notations

Typing Rules

Example

Implementation


Definitions and Notations

A Basic Type

• The built-in datatypes defined in XML Schema, including the primitive and the derived datatypes. E.g.: string, integer, etc.

• A user-defined simple type. E.g.: “myInteger” defines the integers with values between 1000 and 2000:

<xsd:simpleType name="myInteger"><xsd:restriction base="xsd:integer">

<xsd:minInclusive value="1000"/><xsd:maxInclusive value="2000"/>

</xsd:restriction></xsd:simpleType>
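As an illustration of what such a restriction means, here is a minimal Python sketch that checks a value against the two facets of “myInteger”. The class `RestrictedInteger` and its method `accepts` are hypothetical names, and the parsing is simplified compared with the lexical rules of XML Schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RestrictedInteger:
    """An integer type restricted by minInclusive and maxInclusive facets."""
    name: str
    min_inclusive: int
    max_inclusive: int

    def accepts(self, lexical: str) -> bool:
        """Return True if the text is an integer within the allowed range."""
        try:
            value = int(lexical)  # simplified parsing, not the full xsd:integer rules
        except ValueError:
            return False
        return self.min_inclusive <= value <= self.max_inclusive


# Mirrors the myInteger definition above
my_integer = RestrictedInteger("myInteger", 1000, 2000)
```

For example, `my_integer.accepts("1500")` holds, while `my_integer.accepts("2001")` and `my_integer.accepts("abc")` do not.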


Definitions and Notations

A type is:

1. A type constant, e.g.: DocumentType, predicate

2. A basic type, or

3. A type symbol (e.g., a type called “CD”), or

4. A functional type with the form (n ≥ 0) where {…} are types for attributes, τi are types for children

5. A disjunction type is of the form (where τi are types, and n ≥ 0):

6. A type with occurrence indicator, in the form of


Definitions and Notations

A typing judgement exp : τ

exp is a typed expression, τ is a type.

If exp’s type is τ, the typing judgement is true.

A typing rule

• The conclusion is true, given that all the premises are true.

• All the premises and the conclusion are typing judgements.

• If there is no premise, the conclusion is always true.


Definitions and Notations

Notations


Typing Rules

Typing rules used for

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Typing doc(f)

doc(f) is a document function, which is used to extract data from XML file f.

Typing rule: suppose that the type of the root element of XML file f is τ

Typing Paths

Typing Predicate ( numeric, boolean, typed path )

τ1 <: τ2 means that type τ1 is a subtype of type τ2

If exp : τ1 and τ1 <: τ2, then exp : τ2

Use a type called “numeric” (from W3C), where numeric covers decimal, float, double, etc.

Typing rules

Example

Typing Query 1 with a schema: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Typing Query 1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title
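The step-by-step typing of the path in Query 1 can be sketched in Python as follows. The schema table is a hypothetical stand-in for Catalogue.xsd (with price declared as a float), occurrence indicators are ignored, and `type_of_path` is an illustrative name, not part of the typing system itself.

```python
# Hypothetical schema: each type maps child element names to child types.
SCHEMA = {
    "DocumentType": {"catalogue": "Catalogue"},
    "Catalogue": {"cd": "CD"},
    "CD": {
        "title": "string",
        "artist": "string",
        "year": "integer",
        "price": "float",
        "keyword": "string",
    },
}


def type_of_path(steps, start="DocumentType"):
    """Type a path by typing each child step against the current type."""
    current = start
    for step in steps:
        children = SCHEMA.get(current, {})
        if step not in children:
            raise TypeError(f"no child '{step}' in type '{current}'")
        current = children[step]
    return current


# doc("Catalogue.xml")/catalogue/cd/title, without the predicate
title_type = type_of_path(["catalogue", "cd", "title"])

# The relative path used inside the predicate, typed from the type of cd
price_type = type_of_path(["price"], start="CD")
```

A step that the schema does not allow, such as `type_of_path(["catalogue", "dvd"])`, raises a `TypeError` in this sketch.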

Example

Typing Query 2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Predicate (numeric, boolean, typed path)

See whether price>="30.0" : boolean

Typing rule (from W3C) for the operator “>=”, where τ is numeric

Example

A typing error is generated
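A minimal Python sketch of this check, restricted to the numeric case shown on the slides. The subtype table and the names `is_subtype` and `type_of_ge` are assumptions for illustration; the W3C rules cover more operand types than this sketch does.

```python
# Hypothetical subtype table: each type points to its direct supertype.
SUPERTYPE = {
    "integer": "decimal",
    "decimal": "numeric",
    "float": "numeric",
    "double": "numeric",
}


def is_subtype(t1, t2):
    """Return True if t1 <: t2, following the supertype chain upwards."""
    while t1 is not None:
        if t1 == t2:
            return True
        t1 = SUPERTYPE.get(t1)
    return False


def type_of_ge(left, right):
    """Type `left >= right`: both operands must be numeric, the result is boolean."""
    for operand in (left, right):
        if not is_subtype(operand, "numeric"):
            raise TypeError(f"operand of type '{operand}' is not numeric")
    return "boolean"


# Query 1: price (float) >= 30 (integer) is well typed
query1_type = type_of_ge("float", "integer")
```

Calling `type_of_ge("float", "string")`, which corresponds to price>="30.0" in Query 2, raises a `TypeError`.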

Implementation

In order to apply those typing rules, we need to:

• parse an XQuery expression into an abstract syntax tree

• apply those rules by navigating through the tree, adding type information to the nodes

Our implementation: XQueryX – XML expression of XQuery syntax

TOM -- An extension of Java designed to manipulate tree structures and XML documents, by using pattern matching facilities.

Framework

Example:

Query 1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

<?xml version="1.0"?>
<module xmlns:xqx="http://www.w3.org/2005/XQueryX"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.w3.org/2005/XQueryX http://www.w3.org/2005/XQueryX/xqueryx.xsd">
  <mainModule>
    <queryBody>
      <pathExpr>
        <argExpr>
          <functionCallExpr>
            <functionName>doc</functionName>
            <arguments>
              <stringConstantExpr>
                <value>Catalogue.xml</value>
              </stringConstantExpr>
            </arguments>
          </functionCallExpr>
        </argExpr>
        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>catalogue</nameTest>
        </stepExpr>
        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>cd</nameTest>
          <predicates>
            <greaterThanOrEqualOp>
              <firstOperand>
                <pathExpr>
                  <stepExpr>
                    <nameTest>price</nameTest>
                  </stepExpr>
                </pathExpr>
              </firstOperand>
              <secondOperand>
                <integerConstantExpr>
                  <value>30</value>
                </integerConstantExpr>
              </secondOperand>
            </greaterThanOrEqualOp>
          </predicates>
        </stepExpr>
        <stepExpr>
          <xpathAxis>child</xpathAxis>
          <nameTest>title</nameTest>
        </stepExpr>
      </pathExpr>
    </queryBody>
  </mainModule>
</module>

XQueryX expression
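To give an idea of what navigating such a tree and adding type information can look like, here is a minimal sketch. It uses Python's standard `xml.etree.ElementTree` rather than TOM, works on a shortened copy of the path expression above, and only types constants; the names `CONSTANT_TYPES` and `annotate` and the `type` attribute are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the path expression of Query 1 (doc(...) call omitted)
XQUERYX = """<pathExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>catalogue</nameTest></stepExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>cd</nameTest>
    <predicates>
      <greaterThanOrEqualOp>
        <firstOperand>
          <pathExpr><stepExpr><nameTest>price</nameTest></stepExpr></pathExpr>
        </firstOperand>
        <secondOperand>
          <integerConstantExpr><value>30</value></integerConstantExpr>
        </secondOperand>
      </greaterThanOrEqualOp>
    </predicates>
  </stepExpr>
  <stepExpr><xpathAxis>child</xpathAxis><nameTest>title</nameTest></stepExpr>
</pathExpr>"""

# Types of the constant expressions handled by this sketch
CONSTANT_TYPES = {"integerConstantExpr": "integer", "stringConstantExpr": "string"}


def annotate(node):
    """Visit the tree bottom-up and store a 'type' attribute on constants."""
    for child in node:
        annotate(child)
    if node.tag in CONSTANT_TYPES:
        node.set("type", CONSTANT_TYPES[node.tag])


root = ET.fromstring(XQUERYX)
annotate(root)

# Names tested by the top-level steps of the path, in order
step_names = [step.findtext("nameTest") for step in root.findall("stepExpr")]

# The constant in the predicate now carries its type
constant_type = root.find(".//integerConstantExpr").get("type")
```

A fuller version would apply one rule per kind of node (function call, step, predicate, operator) in the same traversal.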

Apply the Rules by Using TOM

Rules for typing doc(f)

Rules for typing each step in a path expression

Conclusion and Future Work

Conclusion

• We analyze the related work on typing XQuery, and solve some inconsistencies by extending the typing rules.

• A prototype of the XQuery Typing System is implemented, including the detailed typing rules for the path expressions in XQuery.

Future Work

• Implement all the typing rules in the W3C work; find and solve the potential inconsistency problems

• Design a typing system for Xcerpt

• Find a polymorphic typing system for Web query languages.


Thank You

Catalogue.xml

<?xml version="1.0" encoding="UTF-8"?>
<catalogue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Catalogue.xsd">
  <cd>
    <title>"Empire Burlesque"</title>
    <artist>Bob Dylan</artist>
    <year>1985</year>
    <price>29</price>
    <keyword>Empire</keyword>
    <keyword>Bob</keyword>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <year>1988</year>
    <price>30</price>
  </cd>
  <cd>
    <title>Stop</title>
    <artist>Sam Brown</artist>
    <year>1988</year>
    <price>39</price>
  </cd>
</catalogue>

Result

<title>Hide your heart</title> <title>Stop</title>

Query:

doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Source: Catalogue.xml, with the price written as 30, 30.0, or 30.00

Query1: doc("Catalogue.xml")/catalogue/cd[price>=30]/title

Query2: doc("Catalogue.xml")/catalogue/cd[price>="30.0"]/title

Query3: doc("Catalogue.xml")/catalogue/cd[price>="30.00"]/title

Correct result: <title>Hide your heart</title> <title>Stop</title>

Incorrect result: <title>Stop</title>

<price>30</price>

• Correct: Query 1

• Incorrect: Query 2, Query 3

<price>30.0</price>

• Correct: Query 1, Query 2

• Incorrect: Query 3

<price>30.00</price>

• Correct: Query 1, Query 2, Query 3

Example: Schema file “Catalogue.xsd”