47

pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Containment and Optimization of Object-PreservingConjunctive QueriesEdward P.F. ChanDepartment of Computer ScienceUniversity of WaterlooWaterloo, Ontario, Canada N2L [email protected] van der MeydenSchool of Computing SciencesUniversity of Technology, SydneyAustralia,[email protected] 12, 1999AbstractIn the optimization of queries in an object-oriented database system (OODB), a natural�rst step is to use the typing constraints imposed by the schema to transform a queryinto an equivalent one that logically accesses a minimal set of objects. We study a classof queries for OODB's called conjunctive queries. Variables in a conjunctive query rangeover heterogeneous sets of objects. Consequently, a conjunctive query is equivalent to aunion of conjunctive queries of a special kind, called terminal conjunctive queries. Testingcontainment is a necessary step in solving the equivalence and minimization problems. We�rst characterize the containment and minimization conditions for the class of terminal con-junctive queries. We then characterize containment for the class of all conjunctive queries,and derive an optimization algorithm for this class. The equivalent optimal query producedis expressed as a union of terminal conjunctive queries which has the property that thenumber of variables as well as their search spaces are minimal among all unions of terminalconjunctive queries. Finally, we investigate the complexity of the containment problem. Weshow that it is complete in �p2.1 IntroductionThe initial attempts at constructing object-oriented databases (OODB's) provided only nav-igational programming languages for manipulating data [25, 6]. The lack of query languageslike those available in relational systems has been criticized as a major drawback of the object-oriented approach [34, 5]. Consequently, most, if not all, commercial OODB's now provide,1

Page 2: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

or will provide, some form of high-level declarative query language (e.g., [27, 15, 26, 21, 22]).These query languages, like those of the relational model, transfer the burden of choosing ane�cient execution plan for a query to the database system. This has lead to a resurrection ofthe study of query optimization in the object-oriented setting (e.g., [30, 8, 7, 31, 18, 23, 13]).Most of these papers develop transformations that reduce the cost of evaluating a given querybut do not necessarily produce an optimal equivalent query.In the setting of relational databases, a well accepted notion of query optimality exists forthe class of conjunctive queries [12], and the classical theory is based on the notion of querycontainment. A query Q1 is said to be contained in a query Q2 if in every database instance,the set of answers to Q1 is a subset of the set of answers to Q2. In this paper, we study thecontainment and optimization problems for a class of conjunctive queries in an object-orientedsetting. The closely related equivalence problem has previously been addressed for object-oriented queries by Hull and Yoshikawa [18]. Our results are complementary to their work,in that the language in [18] is object-generating, while our language is object-preserving . Ourlanguage enables a user to retrieve objects from a database, but not to create new complexobjects. Moreover, our language, like the one in [8], is de�ned on an inheritance hierarchy,whereas most languages studied in the literature are basically languages for complex objectswithout inheritance. The need to deal with inheritance introduces an extra level of complexityinto the containment and optimization problems.In an OODB, classes are named collections of similar objects. A class C may be re�ned intosubclasses. Conversely, the class C is said to be a superclass of its subclasses. Subclasses arespecializations of their superclasses. Consequently, objects in a class are also contained in itssuperclasses. Specialization of a class is often achieved by re�ning and/or adding properties toits superclasses. Since properties of a superclass are also properties of its subclasses, a subclassis said to inherit the properties of its superclasses. Class-subclass relationships form an acyclicdirected graph called an inheritance or generalization hierarchy.Inheritance is a powerful modeling tool, because it allows for a better structured and moreconcise description of the schema, and helps in factoring out shared implementations in appli-2

Page 3: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

cations [4]. Objects belonging to the same class share some common properties. Properties areattributes or methods de�ned on types; they are applicable only to instances of the types. Ine�ect, therefore, types are constraints imposed on objects in the classes. Properties are formallydenoted as attribute-type pairs in this paper. A natural �rst step in query optimization is touse the typing constraints implied by the schema to minimize the search space for variablesinvolved in the query. The following example illustrates how this idea may be applied to thekind of object-oriented conjunctive query we consider.Example 1.1 The following is a schema for a vehicle rental database. It keeps track of allrental transactions for vehicles in the company. In this application, Auto, Trailer and Truckare subclasses of the superclass Vehicle. There are clients, called discount customers, who areknown to the company and receive special treatment. Discount customers receive a special rateand are not required to pay a deposit on the vehicles rented. However, discount customers areonly allowed to rent automobiles, and not other types of vehicles. Note that this constraint iscaptured by the more restrictive typing of the attribute VehRented in the subclass Discount ofthe class Client. Let us assume further that all superclasses are partitioned by their respectivesubclasses.

Normal Discount

VehRented:{Auto}DoscountRate: Int.

Client

RentalPeriod: DateVehRented:{Vehicle}

#OfSeat: Int. Facilities:{Str.} CargoCap: Int.

TrailerAuto Truck

Id: Int.Model: Str.

Vehicle

Deposit: Money

Customer:Renter

Maker:Str.

3

Page 4: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Suppose we want to �nd all those vehicles that have been rented to a discount client. Ex-pressed in a calculus-like language, the query looks like:Q1: f x j 9y (x2Vehicle & y2Discount & x2y.VehRented)g.Since discount clients are allowed to rent automobiles only, the above query is equivalent to thefollowing query: Q2: f x j 9y (x2Auto & y2Discount & x2y.VehRented)g.Q2 is considered to be more optimal since the number of variables as well as their search spacesare minimal, given the typing constraints implied by the schema. Let us consider another query.Assume that we want to �nd those clients who rented a truck. It can be expressed as follows:Q3: f x j 9y (x2Client & y2Truck & y2x.VehRented)g.Since discount clients are allowed to rent automobiles only (but not other kind of vehicles), Q3is the same as the following query.Q4: f x j 9y (x2Normal & y2Truck & y2x.VehRented)g.2 Relational conjunctive queries have been studied extensively in the literature. A variablein a relational query ranges over a homogeneous relation. On the other hand, as illustratedby the example, variables in an object-oriented query range over classes which could consist ofheterogeneous sets of objects. This is because a class may be re�ned to various subclasses, inwhich shared attribute names may correspond to di�erent types or classes. For example, thevariable x in Q3 in Example 1.1 ranges over a heterogeneous set Client = Normal[Discount.All the members of this set have the attribute V ehRented, but only for members x of Normalcan x:V ehRented contain an element of the class Truck. This implies clients who rent a truckare normal clients. This constitutes a signi�cant divergence from the relational case. Forinstance, syntactically correct relational conjunctive queries are always satis�able but this isnot true for object-oriented conjunctive queries [10]. The additional complexity is also re ectedin the containment problem, as is illustrated by the following example.4

Page 5: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Example 1.2 The following schema records the employer-employee relationships among a groupof people. The Employee attribute indicates the set of employees hired by a person.Person

Employee:{Person}

FemaleMale

Consider the following two queries de�ned on the above inheritance hierarchy. Q1 retrievesall people x who hire a person u and a male v such that u is also an employee of v and u hiresa female employee w. Q2 �nds all those people x who hire a male employee y who in turn hiresa female employee z. Expressed in our language, they are as follows.Q1: f x j 9u 9v 9w (x2Person & u2Person & v2Male & w2Female &u2x.Employee & v2x.Employee & u2v.Employee & w2u.Employee)g.Q2: f x j 9y 9z (x2Person & y2Male & z2Female & y2x.Employee & z2y.Employee)g.The above two queries are best visualized as the following two graphs.Q

x

v

u

w

x

y

z

M

F

M

F

1 2Q

x: x hires y

yWe claim that Q2 contains Q1, meaning that whenever there is an answer for Q1, it willalso be an answer for Q2. The person u is either a male or a female. If u is a male, then y and5

Page 6: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

z in Q2 can be mapped to u and w, respectively. Similarly if u is a female, variables y and z inQ2 can be mapped to v and u, respectively. Thus, whenever there is an answer for Q1, it willalso be an answer for Q2. 2The above examples illustrate the kind of conjunctive queries we are interested in. Exam-ples 1.1 and 1.2 demonstrate that the analysis of containment of conjunctive queries is moredi�cult than its counterpart in the relational case. This is due to the fact that the domains ofattributes impose certain constraints on a query, and the analysis of the containment problemalso involves analysis of disjunctive information.The following is an overview of the problem and the approach we took in solving it. Givena conjunctive query Q(S), where S is an object-oriented database schema denoted by an in-heritance hierarchy, we want to �nd an equivalent query Q0(S) that is, in some sense, optimal.Moreover, we are interested in determining when a conjunctive query is contained in anotherone. Both problems require an understanding of what a conjunctive query represents. We �rstobserve that a conjunctive query, like the ones in Examples 1.1 and 1.2, can be decomposed as aunion of special kind of conjunctive queries called terminal conjunctive queries. As typing con-straints in an inheritance hierarchy are restrictions on objects in a database, not every terminalconjunctive query is satis�able. With typing constraints implied by an inheritance hierarchy,unsatis�able terminal conjunctive queries can be determined and are eliminated from a union.Having removed unsatis�able terminal conjunctive queries, characterization of containment andoptimization are then derived. The technique employed and the result obtained are similar tothose in SPJU expressions in the relational theory [28].Most work on query optimization in OODB's concentrates on complex object optimiza-tion without considering the typing constraints imposed by the inheritance hierarchy (e.g.,[30, 7, 31, 18, 23, 13]). Type checking of queries in the presence of non-strict inheritancehierarchy was studied in [8]. Our work is di�erent from all previous approaches in several im-portant respects. Firstly, we use the typing constraints imposed by an inheritance hierarchyto study the containment, equivalence and optimization of queries. Secondly, our optimizationis an exact minimization while most of the previous work deals with algebraic transformations6

Page 7: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

and/or heuristics (e.g., [30, 31, 7, 23, 13]). Thirdly, with reasons similar to those noted in [28],characterizing equivalence does not su�ce to solve the optimization problem. Instead, we needto understand the containment problem as well. This work, to our best knowledge, is the �rstwork that provides a characterization for containment of queries in an object-oriented setting.This result could also �nd applications in view de�nition and classi�cation in an OODB. Forinstance, to correctly integrate a virtual class or view into an inheritance hierarchy, it is im-perative to resolve the containment problem for the view de�nition language [29]. Lastly, wedemonstrate that the idea of containment mappings of relational conjunctive queries [12] canbe extended to its object-oriented counterpart. Our proposed language, on the other hand, isperhaps more restrictive than some other query languages studied in the literature.The next section de�nes the class of conjunctive queries and the basic notation neededthroughout the discussion. In characterizing the containment and equivalence of terminal con-junctive queries, it is assumed that the query involved is satis�able. We present an e�cientalgorithm for solving the satis�ability problem for terminal conjunctive queries in Section 3.The results in that section were proven in [10] and are needed in the subsequent discussions.Sections 4 and 5 characterize the containment, equivalence and minimization conditions forterminal conjunctive queries. In Section 6, we solve the containment problem and derive analgorithm for optimizing the class of all conjunctive queries. The notion of optimization cap-tures the intuition of minimization of the number of variables as well as their search spaces. InSection 7, we analyse the complexity of testing containment of conjunctive queries. The mainresult shows that the problem is �p2-complete. Finally, we give our conclusions in Section 8.2 De�nitions and NotationIn this section, we introduce notation that is necessary for the rest of the discussion.2.1 Types, Classes and SchemasWe suppose given the following pairwise disjoint sets:7

Page 8: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

1. A set T of atomic type domains, where each atomic type domain is an in�nite set ofatomic values. Examples of atomic type domains are the set of integers, and the set ofstrings over some alphabet. We asssume distinct atomic type domains are disjoint.2. A countably in�nite set O of symbols which are called object identi�ers.3. A set B of atomic types, containing for each atomic type domain T 2 T , a symbol Tnaming that type domain. For brevity, we abuse notation by using the same symbol Tfor a type domain and its name. Thus B=T .4. A countably in�nite set A of symbols, the attributes.5. A countably in�nite set C of symbols which are called classes.The set ST2T T is said to be the set of atomic values. The elements of A will be used asattribute names in tuple types, and the elements of C serve as names for user-de�ned classes.A type expression over a set C�C of class names is an expression de�ned as follows:1. Every element of B is a type expression, called an atomic type.2. Every element of C is a type expression, called a class.3. If t is an atomic type or a class, then ftg is type expression, called a set type.4. If a1 , : : : , an are distinct attributes in A and t1 , : : : , tn are atomic types, set types orclasses, where n�0, then [a1:t1 , : : : , an:tn] is a type expression, called a tuple type. Asin a relation scheme, the order of attributes is immaterial. The empty tuple [] is also atuple type. The type ti is said to be the type of the attribute ai, for each i = 1 : : : n.We write type-expr(C) for the set of all type expressions over C.Following [24, 9], we introduce the notion of schema. A schema S is a triple (C, �, �),where C is a �nite subset of C, � is a function from C to tuple types, and � is a partialorder on C. The mapping � associates to each class in C a tuple type in type-expr(C) whichdescribes its structure. As noted in [16], there is no loss of representation power in restricting8

Page 9: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

the structures of classes to be tuple types. The relationship � among classes represents theuser-de�ned inheritance hierarchy. We assume that the hierarchy has no cycle of length greaterthan 1. A class A2C is said to be terminal if there is no class B6=A such that B�A. OtherwiseA is non-terminal. A class B is a descendant (or an ancestor) of a class A if B�A (or A�B,respectively).Following [2, 24], we derive from this hierarchy a subtyping relation � among expressionsin type-expr(C). Let S = (C, �, �) be a schema. The subtyping relation among expressions intype-expr(C) is the smallest ordering � which satis�es the following axioms:1. A�A if A2B.2. B�C if B�C, for all classes B and C in C.3. ftg�fsg, for all types s, t such that t�s.4. [a1:t1 , : : : , an:tn , : : : , an+p:tn+p] �[a1:s1 , : : : , an:sn], for all atomic types, set types orclasses t1 , : : : , tn, s1 , : : : , sn such that ti�si, for all i = 1 : : : n.The order of arguments in a tuple type is immaterial, so we have [a3 : C3; a2 : C2; a1 : C1] �[a1 : C1; a2 : C2].For any expressions E1 and E2 in type-expr(C), E1 is a subtype of E2 if E1�E2. It isworth noting that the subtyping relation is a re exive and transitive relation. As inheritancehierarchies are given by users, some schemas may not be meaningful. Let S = (C, �, �)be a schema. S is consistent if for all classes B and C such that B�C, we have �(B)��(C).We only consider consistent schemas throughout this paper. The schemas we have de�nedare essentially the same as those de�ned in ODMG-93 [9]. Thus our results are applicable tosystems conforming to this standard. Let C 2 C. Attributes in �(C) are called the attributesof C. The type of C.A, denoted type(C.A), is the type t of A in �(C). In consistent schemas,subclasses are specializations of their superclasses. Specialization of subclasses is representedformally by re�ning types of inherited attributes and/or adding new attribute-type pairs.9

Page 10: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

2.2 States, Domains and ObjectsLet S = (C, �, �) be a schema and � the subtyping relation on type-expr(C). Let O be a�nite subset of O and Ic be a function from O to C. Given O and Ic, each type expressionT in type-expr(C) is interpreted as a set of possible values, called the domain of T, denotedas dom(T). In order to represent inapplicable attributes, we introduce a new symbol `�'. Thedomain of a type with respect to (w.r.t.) O and Ic is de�ned as follows:1. If T 2 B is an atomic type naming the type domain T , then dom(T) = T .2. For each class D2C, we de�ne dom(D) = fo j o 2 O and Ic(o) = E, where E�Dg.3. For each set type ftg, we de�ne dom(ftg) = fv j v � dom(t)g.4. For each tuple type [a1:t1 , : : : , an:tn], we de�ne dom([ a1:t1 , : : : , an:tn]) = f[a1:v1 , : : :, an:vn] j vi2dom(ti)[f�g for all i = 1 : : : ng.Note that the value of an attribute of a tuple may be the null value �. This is to be interpretedas the attribute being inapplicable.A state s on a schema S = (C, �, �) is a triple (O, Ic, Iv), where O is a �nite subset ofO, Ic is a function from O to C, and Iv is a function from O to tuple values in domains oftypes with respect to O and Ic. The function Iv maps each element in O to a tuple value whichsatis�es the following:8o2O, Iv(o)2dom(�(Ic(o))).That is, Iv de�nes the data value of an object and the (tuple) value of an object de�ned on aclass must satisfy the type speci�cation associated with the class. The set f<o, Iv(o)> j o2Ogis the set of objects in the state s. Two objects in a state are identical if and only if they havethe same identi�er, so we may sometimes abuse terminology by referring to the identi�er o asan object of s.Let [a1:v1 , : : : , an:vn] be a tuple value. Then [a1:v1 , : : : , an:vn].ai is vi. We call vi thevalue of attribute ai in the tuple. If the attribute a does not occur in a tuple then the value ofattribute a in the tuple is �. 10

Page 11: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

In many, if not most, existing object-oriented database systems (e.g., [26, 21, 27]), an objectis de�ned on exactly one most specialized class in an inheritance hierarchy. Consequently, asin [2, 21], we assume the following throughout the discussion.Terminal Class Partitioning Assumption: Given any state s= (O, Ic, Iv) on a schemaS, the class Ic(o) is a terminal class in S for every object o in O. That is, every non-terminalclass is partitioned by its terminal descendants.2.3 A Class of Object-Preserving Conjunctive QueriesIn this subsection, we de�ne a calculus-like query language for an object-oriented database.Queries are constructed from a set of variables, symbols from the set of atomic values, theequality operator `=', the membership operator `2', the OR operator `t', the logical operator`&', as well as the existential quanti�er `9'. The set of variables is assumed to be disjoint fromother sets of symbols.First we de�ne the concept of term. Terms enable us to refer to an object or a componentof an object. A term is an expression of one of the following forms: c or x or x.A, where c isan atomic value, i.e., c is in some atomic type domain, x is a variable and A is an attribute. Aterm of the form x or x.A is called a variable term. An attribute term is of the form x.A.An atom or an atomic formula is de�ned to be one of the following:1. x2C1 t � � � t Cn, where the Ci's are classes or atomic types, and x is a variable. An atomx2C1 t � � � t Cn is called a range atom and it asserts that the variable x denotes an objectin the class Ci or a value in the atomic type Ci, for some 1�i�n.2. t1 = t2, where t1 and t2 are terms. Such an atom is called an equality atom. An equalityatom asserts that the operands denote identical objects or are of the same atomic value.3. x2y.A, where x and y are variables. The atom x2y.A is called a membership atom. Amembership atom asserts that the object or atomic value denoted by x is a member ofthe set object denoted by y.A.It is worth noting that path expressions of the form x.A1. . .An (as used in [37]) and of theform x.A1[y1]. . . .An[yn] (as used in [19]); where x and yi's are variables or atomic values, can11

Page 12: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

all be represented indirectly in our language. Likewise, atoms of the forms c2x.A and y.A2C1t � � � t Cn and of the form x.A2y.B, where x and y are variables and c is an atomic value, canagain be expressed indirectly in our language.A formula is constructed from atomic formulas, the logical operator `&', as well as existentialquanti�ers. Bound and free variables are de�ned in the usual manner. A query is an expressionof the form f t j �(t)g, where t is either a variable or an atomic value and �(t) is a formula. Theterm t is called the distinguished term of the query. A query f s0 j �(t)g is called conjunctiveif �(t) is of the form 9s1...9sm(M), where M is a formula containing no quanti�er that is aconjunction of atomic formulas.1 9s1...9sm is called the pre�x and M is called the matrix of theformula or of the query. We also make use of union queries, which are expressions of the formQ1 [ : : : [Qn, where each Qi is a conjunctive query.2.4 Semantics of QueriesWe now give the semantics of queries, and de�ne the notion of query containment. It is conve-nient for technical reasons that will become apparent below to state the semantics in terms ofa mapping to a language that uses the objects and atomic values of a state as basic syntacticentities. We remark that the semantics is slightly non-standard in that it requires that theterms occurring in an atom must have a non-null value in order for the atom to be true: thiswas handled in [10] using a three valued logic, but since only the truth of atoms is relevant tothe containment question for conjunctive queries we simplify this here.De�ne a term over a state s = (O; Ic; Iv) to be an expression of one of the following forms:an atomic value c, an object o 2 O, or an expression o:A, where o 2 O is an object and A is anattribute. The value Val(t) of a term t over s is de�ned as follows.1. If t is an atomic value c then Val(t) = c.2. If t is an object o 2 O then Val(t) = o.3. If t is the expression o:A, where o 2 O is an object and A is an attribute then Val(t) isthe value of attribute A in Iv(o).1The results in this paper can be extended to conjunctive queries with more than one free variable.12

Page 13: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Note that Val(t) may be an atomic value, object, set, or the null value �.An atom over s is an expression of one of the following forms:1. t 2C1 t � � � tCn, where t is a term over s and the Ci are classes or atomic types;2. t1 = t2, where t1 and t2 are terms over s, or3. t 2 o:A where t is a term over s, where o is an object of s, and where A is an attribute.We de�ne certain atoms A over a state s, to be satis�ed in s, written s j= A, as follows:1. s j= t 2 C1 t � � � tCn, where t is a term over s and each Ci is a class or atomic type, ifVal(t) 2 dom(Ci) for some i = 1 : : : n;2. s j= t1 = t2, where t1 and t2 are terms over s, if Val(t1) = Val(t2) and neither value isequal to �;3. s j= t 2 o:A where t is a term over s, the o is an object of s, and A is an attribute, ifVal(t) is not equal to � and Val(o:A) is a set that contains Val(t).Note that in order for an atom to be satis�ed, all of its terms must have non-null values.An assignment for a query Q in a state s = (O; Ic; Iv) is a function � mapping each variableof Q either to an atomic value or to an object in O. Assignments may be extended to mappingsfrom the terms and atoms of Q to terms and atoms over s, respectively, as follows:1. For terms which are atomic values c we de�ne �(c) = c.2. For terms of the form x:A, where x is a variable, we de�ne �(x:A) to be the expressiono:A, where o = �(x).3. For atoms A of Q we de�ne �(A) to be the atom over s obtained by substituting for eachterm t in A the term �(t).Using the notion of satisfaction of atoms over a state in that state, we now de�ne a formula� to be satis�ed in a state s with respect to an assignment �, written s; � j= �, in the usualway. For atomic formulae A we have s; � j= A if s j= �(A). The cases of Boolean operators13

Page 14: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

and quanti�ers are as in the standard semantics of �rst order logic, where the universe consistsof the union of the sets dom(T ), where T ranges over all type expressions.A query Q = f t j �(t)g is said to be satis�ed in a state s with respect to an assignment �,written s; � j= Q, if s; � j= �(t). The assignment � is called a satisfying assignment for Q inthis case. We say that the object or value a is an answer of Q with respect to s if there existsa satisfying assignment � for Q such that a = �(t), where t is the distinguished term of Q. IfQ is a query and s is a state, we write Q(s) for the set of all answers of Q with respect to s.For union queries Q of the form Q1 [ : : :[Qn we de�ne Q(s) to be the set Q1(s)[ : : :[Qn(s).A query Q is said to be satis�able if there is a state s such that Q(s) is non-empty. Giventwo queries Q1 and Q2 (on a schema S), Q1 is said to contain Q2 with respect to S, denotedQ1�Q2, if Q1(s)�Q2(s), for all states s on S. Two queries Q1 and Q2 are said to be equivalentwith respect to schema S, denoted Q1�Q2, if they contain each other with respect to S.We note that for conjunctive queries, the existential quanti�ers are not strictly essential:the query ft j 9x1 : : : 9xn(�)g is equivalent to the query ft j �g. Consequently, we assumehenceforth for purposes of analysis that queries do not contain existential quati�ers. Thisyields the following simple characterization of satisfaction: � is a satisfying assignment for aconjunctive query Q in a state s if and only if s j= �(A) for all atoms A of Q. (It is stillsometimes convenient to write formulae with quanti�ers in order to scope variables and avoidnaming con icts.)2.5 Well-formed Conjunctive QueriesWe consider only those queries in which each term either denotes an object or a value, or aset of objects or values, but not both. We call such queries well-formed. The following de�neswhen a query is well-formed. First we note that, given a conjunctive query, additional equalitiesamong terms could be inferred with the following algorithm. It is easy to see that the inferencesperformed in the algorithm are correct.Algorithm EqualityGraph: Given a conjunctive query, generate additional implied equalityedges.Input: A conjunctive query Q. 14

Page 15: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Output: An undirected graph E(Q), called the complete equality relationship graph for Q.Method:The edges fx; yg in the graph E(Q) are called equality edges, and are also denoted by `x = y'.(1) Generate a graph with terms in Q as nodes. (If x:A is a term of Q then so is x.) Generateadditional nodes and equality edges by applying the following three steps exhaustively tothe graph until no more edges can be derived.(i) For each node t, derive the equality edge t = t. For each equality atom `s = t' of Q,generate an equality edge between the node s and the node t.(ii) If s = t and t = u are equality edges, then derive the equality edge s = u.(iii) If x and y are variable nodes, x = y is an equality edge, and x.A is a node in thegraph, then add the node y.A, if it does not already exist, and derive the equality edgex.A = y.A.(2) Output the graph constructed.By steps (i) and (ii), the complete equality relationship graph E(Q) for a conjunctive queryQ, yields an equivalence relation R, de�ned by tRt0 if there exists an equality edge between tand t0. For each term t in E(Q), the equivalence class [t] of R containing t is the set ft0 j t0 isa node in E(Q) and there is an equality edge between t and t0g. These sets are said to be theequivalence classes of E(Q).Let Q be a query. An occurrence of a term x:A in the matrix of Q is a set occurrence ifthe occurrence appears on the right-hand side of a membership atom. All other occurrencesof terms in the matrix of Q are object occurrences. A term s is an object term if some termt in the equivalence class [s] has an object occurrence in the query. A term s is a set term ifthere is a set occurrence in the query of some term t 2 [s]. Intuitively, a set term is one thatmust denote a set, and an object term is one that must denote an object or atomic value. Aconjunctive query Q is well-formed if(i) every term in Q is either an object term or a set term, but not both, and(ii) each object term of the form x.A is equated, directly or indirectly, to some variable oratomic value; that is, there is a variable or an atomic value in the equivalence class [x.A],and 15

Page 16: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

(iii) every variable in Q ranges over exactly one disjunction of classes or atomic types; that is,there is exactly one range atom associated with each variable.Condition (i) is necessary for the satis�ability of the query, and arises from the obviousconstraint that no term can simultaneously denote both an object and a set. It is worthremarking that this condition implies that a set term cannot occur within an equality atom inthe query. For, such an occurrence would be an object occurrence, which would make the termsimultaneously a set term and and object term. Note, moreover, that no object term can everdenote a set. For, by conditions (ii) and (iii), an object term must denote an element of someunion of classes and atomic types.For terms denoting objects or atomic values, condition (ii) is not a real restriction, sincesuch a term can always be equated to some new existentially quanti�ed variable ranging over allclasses and atomic types. This condition is needed to simplify the discussion in the subsequentsections. In the case of condition (iii), note that because of the Terminal Class PartitioningAssumption, a query is unsatis�able if it contains both x 2 C and x 2 D, where C and D aredistinct terminal classes. Such a query may be satis�able if C and D are nonterminal classes,but in this case the two range atoms can be replaced (given a schema) with the single atom x 2C1t � � � tCn, where C1; : : : ; Cn are the common terminal descendants of C and D. Moreover,if the variable x occurs in no range atom, then we may clearly add the atom x 2 C1t � � � tCn,where C1; : : : ; Cn are all terminal classes and atomic types, without changing the meaning ofthe query.For the rest of this paper, we use the term conjunctive queries to denote well-formed con-junctive queries. Well-formed queries include safe, as well as unsafe queries that produce in�niteanswers [35].2.6 Terminal Conjunctive QueriesA terminal conjunctive query is a conjunctive query in which every range atom is of the form`x2C', where C is a terminal class or an atomic type.Every conjunctive query can be expressed as a union of terminal conjunctive queries, as16

Page 17: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

follows. First, de�ne an expansion of a query Q to be a query obtained by replacing each rangeatom of the form x 2 C1 t � � � tCn by one of the atoms x 2 Ci. For example, the queryfx j (x 2 C tD) ^ (y 2 E t F ) ^ (x:A = y)ghas four expansions, one of which is the queryfx j x 2 D ^ y 2 E ^ x:A = yg:Note that every expansion of a conjunctive query is a terminal conjunctive query.Proposition 2.1 Let Q be a conjunctive query and let Q1; : : : ; Qn be all the expansions of Q.Then Q is equivalent to Q1 [ : : : [Qn.[Proof]: See [10]. 2This result states that, semantically, a conjunctive query corresponds to a union of terminalconjunctive queries. Each terminal conjunctive query in a union could have variables de�nedon di�erent domains. To solve the containment and equivalence problems, it is necessary tosolve the satis�ability problem and to identify exactly the set of objects or values over whicha variable is ranging. In Section 3, we present an algorithm for solving these problems for theterminal conjunctive queries. This algorithm employs typing constraints to determine satis�a-bility of a terminal conjunctive query. Queries that are not terminal could, by Proposition 2.1,�rst be decomposed into a union of terminal conjunctive queries. We can then apply the al-gorithm in Section 3 to each subquery in the union to determine its satis�ability, and deletethe unsatis�able subqueries from the union. In Sections 4 and 5, we shall derive algorithms fortesting containment and for minimizing terminal conjunctive queries. Section 6.1 deals withcontainment of unions of conjunctive queries, which can be used to determine containment ofarbitrary conjunctive queries.3 An E�cient Algorithm for Testing Satis�ability of TerminalConjunctive QueriesTesting satis�ability of restricted classes of conjunctive queries is an NP-complete problem [10].However, determining if a terminal conjunctive query is satis�able is tractable. We present in17

Page 18: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

this section an algorithm that solves this problem in polynomial time, from [10], along witha sketch of its correctness proof. The aspect of this proof that is germane to our purposes inthe present paper is that if the input query is satis�able, it is possible to construct a `minimal'state with respect to which the query returns a non-empty result. Various properties of thestate constructed are needed in the proof of the characterization of containment of terminalconjunctive queries.One of the reasons for unsatis�ability of a query is the incompatibility of the typing ofits terms implied by the schema. Note that a query Q and schema S = (C, �, �) togetherdetermine a type type(t) for every term t of the query. (In every case, this type is in fact aclass or atomic type.) For every atomic value c in an atomic type T , we de�ne type(c) = T .For every variable x in Q, de�ne type(x) to be the unique class or atomic type C such that Qcontains an atom of the form x2C. For every term of the form x.A in Q, de�ne type(x.A) as thetype of attribute A in �(type(x)), if type(x) is a class and A is an attribute of �(type(x)), andunde�ned otherwise.In order for the query to be satis�able, the type assigned to its terms must be consistent.By the Terminal Class Partitioning Assumption, terms denoting the same object must belongto the same terminal class or atomic type, and an object belonging to a set must be of a typeadmissible for that set. The following de�nition helps to check these conditions. If T is anatomic type or a class, we say D is a terminal subtype of T if either T is an atomic type and Dis T, or D is a terminal descendant of T. Let t be an object term in Q. De�ne SatType(t) to bethe set of all D such that1. for all terms u 2 [t], D is a terminal subtype of type(u), and2. for every `u2z.A' in Q, where u2[t], D is a terminal subtype of C, where type(z.A) is fCg.Intuitively, SatType(t) is the set of terminal types that are consistent with all the typing in-formation on t derivable from the query. Since every object term must be equated to somevariable, which must range over a terminal type, or to an atomic value, SatType(t) contains atmost a single element. Note that the de�nition depends only on the equivalence class of t, so wemay also write SatType([t]) for SatType(t). Note also that if Q is satis�able, then SatType(t)18

Page 19: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

is a singleton set, for every variable object term t in Q. For, if SatType(t) were empty, then itwould be impossible to construct a satisfying assignment.Algorithm SatTestUT: Verify if a terminal conjunctive query Q is satis�able.Input: A terminal conjunctive query Q on S.Output: yes if Q is satis�able, and no otherwise.Method:Compute the complete equality relationship graph E(Q) for Q.(1) If there is an object term of the form x.A for which type(x.A) is unde�ned or equal to aset type, or there is a set term of the form x.A, for which type(x.A) is unde�ned or is notequal to a set type, then output no and exit.(2) If there is an object term t in Q such that SatType(t) is empty, then output no and exit.(3) If there is an object term t with two distinct atomic values c1 and c2 both in [t], thenoutput no and exit.(4) Output yes.Lemma 3.1 If the algorithm SatTestUT outputs yes, then Q is satis�able.Suppose the algorithm outputs yes. We construct a state sQ and a satisfying assignment �for Q in sQ. This assignment will be called the canonical asssignment for Q in sQ. The detailsof the construction will be applied in later results.First, to each equivalence class [t] of the complete equality relationship graph of Q, weassociate a distinct value [t]� and a terminal class or an atomic type type�([t]). The typetype�([t]) is de�ned to be type(s) for any term s of the query in [t]. This is well-de�ned in thecase of object terms by statement (2). For set terms x:A, note that if t 2 [x:A] then t must beof the form y:A for some variable y 2 [x], by the fact that set terms do not occur in equationsin Q and construction of E(Q). Thus type�([x:A]) may be de�ned to be type(x:A) in this case.The values [t]� are assigned as follows.Val1: If type�([t]) is an atomic type, and there is an atomic value c in [t], then [t]� = c. Bystatement (3), c is the unique atomic value in [t].19

Page 20: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Val2: If type�([t]) is an atomic type and there is no atomic value in [t], then take [t]� to be anyvalue of type type�([t]), in such a way that no two distinct equivalence classes are assignedthe same value. (This is possible because the atomic types are in�nite.)Val3: If type�([t]) is a terminal class then the value [t]� is de�ned to be just the equivalenceclass [t] itself. In this case these values will be interpreted below as objects in the stateto be constructed.Val4: If none of the above cases apply, then type�([t]) is a set type. In this case [t]� is de�nedto be the set f[v]� j `v2z.A' is an atom in Q for some variable z in [t]g. (Note thatthe variables v must be object variables, so the values [v]� are already de�ned by casesVal1-Val3.)Observe that it follows from this de�nition that distinct equivalence classes are assigned distinctvalues.We now construct a state sQ = (O, Ic, Iv). We take the set of objects O of this state tobe f [t] j [t] is an equivalence class of Q such that type�([t]) is a terminal classg. These objectsare assigned to terminal classes by putting Ic([t]) = type�([t]), for every [t]2O. The mappingIv, de�ned below, maps each [t]2O to a tuple value on type�([t]). First, for each [t]2O, de�neattr([t]) to be the set of attributes A such that x:A is a term of Q for some variable x2[t]. Wenow de�ne attribute values for the object [t] as follows. Note that by conditions (1) and (2), theset attr([t]) is a subset of the set of attributes in Ic([t]). For each attribute A2attr([t]), the valueassigned to the attribute A for the object [t] is [x:A]�, where x is any variable in [t] such thatx:A is a term of Q. (Note that the de�nition of the complete equality relationship graph ensuresthat if x and y are variables in [t] such that x:A and y:A are terms of Q then [x:A] = [y:A],so this de�nition is independent of the choice of x.) If A is an attribute in type�([t]) but notin attr([t]), a null value is assigned to A for the object [t]. This completes the de�nition of thestate s.Finally, we de�ne a mapping � from the variables of Q to this state. For each variable x,we let �(x) be the value [x]�. 20

Page 21: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Lemma 3.2 The query Q is satis�ed in the state sQ under the assignment �.See [10] for more details (on a slightly di�erent approach to the proof).Theorem 3.3 A terminal conjunctive query Q on S is satis�able if and only if the algorithmSatTestUT outputs yes.[Proof]: See [10]. 2Unless otherwise stated, we consider only satis�able terminal conjunctive queries for therest of this paper.4 Containment of Terminal Conjunctive QueriesWe now set about developing a condition that characterizes containment of terminal conjunctivequeries. We remark that some of the results of this section depend crucially on the notion ofatoms over a state de�ned in section 2.4. For the rest of Sections 4 and 5, a query Q refers toa terminal conjunctive query Q.We begin by de�ning a relation that is intended to capture the equations that must besatis�ed under any satisfying assignment for a query. Recall that the terms in an equationmust have non-null interpretations for the equation to hold. Given a query Q, de�ne therelation � on the set of terms by s � t if either1. s and t are both terms in the complete equality relationship graph of Q and [s] = [t], or2. s and t are the same term, which is an atomic value.Note that the relation � is an equivalence relation when restricted to the set of terms ofthe complete equality relationship graph of Q. However, � is not an equivalence relation ingeneral, since we do not have t � t for terms t that are not an atomic value or in the completeequality relationship graph of Q. Intuitively, this re ects the fact that such terms may haveinterpretation � under a satisfying assignment, so that the equation t = t does not hold. (Wedo have c � c for atomic values c because the equation c = c will hold under every satisfyingassignment.) 21

Page 22: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

We can extend the relation � on terms to a relation on atoms by de�ning A � A0 if A andA0 are atoms of the same syntactic form and the terms in corresponding positions are �-related.(For example, if [x] = [y] and z:A is a term of the complete equality relationship graph thenwe have x 2 z:A � y 2 z:Z, but not x = z:A � y 2 z:Z, because these atoms are of di�erentsyntactic forms.)Lemma 4.1 If � is a satisfying assignment for Q in a state s then(i) If s and t are terms with s � t then both Val(�(s)) and Val(�(t)) are non-null andVal(�(s)) = Val(�(t)).(ii) If A and A0 are atoms with A � A0 then s j= �(A) if and only if s j= �(A0).[Proof]: The claim of part (i) is trivial for constants c, since we always have Val(�(c)) = cis non-null. The case where s and t are terms of the complete equality relationship graph with[s] = [t] is by induction on the construction of the complete equality relationship graph, provingalso the additional property that Val(�(t)) is non-null for any term in the complete equalityrelationship graph.Note that for all terms t of Q, we must have that Val(�(t)) is non-null, since this is requiredfor the atom in which t occurs to be satis�ed under �. (There is one exception to this observa-tion, the case in which t is the distinguished term and not equal to a variable. But then t mustbe an atomic value c, for which Val(�(t)) = c is non-null.) For equations s = t in Q we musthave Val(�(s)) = Val(�(t)) in order for � to be satisfying. It is trivial that for an equality edget = t we have Val(�(t)) = Val(�(t)). This establishes the base case of the induction. (Thecase of equations t = t, for terms introduced later in the construction, is similar, but uses theadditional property.)Consider next the case of edges t1 = t2 and t2 = t3 inducing an edge t1 = t3. Since we have[t1] = [t2] and [t2] = [t3] it follows from the inductive hypothesis that Val(�(ti)) is non-nullfor i = 1 : : : 3 and Val(�(t1)) = Val(�(t2)) and Val(�(t2)) = Val(�(t3)). It is immediate thatVal(�(t1)) = Val(�(t3)). 22

Page 23: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Finally, suppose that x and y are variables with x = y an edge of the complete equalityrelationship graph, and that x:A is a node of the complete equality relationship graph. By theinduction hypothesis Val(�(x)) and Val(�(y)) are non-null and equal, and Val(�(x):A) is non-null. It follows that Val(�(y:A)) = Val(�(y):A) is equal to Val(�(x):A) and hence non-null,and equal to Val(�(x:A)).Part (ii) follows directly from part (i) using the fact that the de�nition of satisfactiondepends only on the values of the terms in the atoms A and A0. 2In addition to the atoms in a query, certain other atoms will always be satis�ed with respectto any satisfying assignment for the query. For example, if a query contains atoms x = y andy 2 C then every satisfying assignment also makes the atom x 2 C true. The following notionis intended to characterize such atoms. A query Q is said to derive an atom A if1. A is of the form t 2 T where T is a basic type and t � c for some atomic value c of typeT , or2. A is of the form t1 = t2, where t1 and t2 are terms satisfying t1 � t2, or3. Q contains an atom A0 such that A0 � A.We write Q ` A if Q derives the atom A. The following result shows that every atom derivedby a query in fact holds under any satisfying assignment.Lemma 4.2 Let � be any satisfying assignment for the query Q in the state s. If A is an atomsuch that Q ` A then s j= �(A).[Proof]: We consider each clause of the de�nition of derivation:1. Suppose A is of the form t 2 T where T is an atomic type and t � c, where c is anatomic value of type T . By Lemma 4.1, we have Val(�(t)) = Val(�(c)) = c. It followsthat s j= �(A).2. If A is of the form s = t with s � t then by Lemma 4.1(i), Val(�(s)) = Val(�(t)), withboth non-null, so s j= �(A) by de�nition.23

Page 24: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

3. Otherwise, Q contains an atom A0 such that A0 � A. Since � is a satisfying assignment,we have that s j= �(A0). By Lemma 4.1(ii), it follows that s j= �(A) also. 2We now set about showing that a converse to this result holds for the state sQ constructed forma query Q. We begin by relating atoms over this state to atoms of the query. De�ne an inverseof the canonical mapping � for the query Q to be a function ! mapping each object of sQ andeach atomic value to either a variables of Q or an atomic value, such that1. for all atomic values c such that there exists a variable y of Q with �(y) = c, we havethat !(c) is a variable with this property, and2. for all atomic values c such that there exists no variable y of Q with �(y) = c, we have!(c) = c, and3. for all objects [t] of sQ we have that !([t]) is a variable in [t]. (Note that each object ofsQ is an equivalence class of terms that must contain a variable by condition (ii) of thede�nition of well-formed queries.)We may extend ! to a mapping from terms over the state sQ to terms formed using the variablesof Q by de�ning !([t]:A) = !([t]):A for all terms of the form [t]:A. The following result explainswhy we call such a mapping an inverse of �. (To understand the condition on (ii), observe that! is not de�ned on sets occurring in sQ.)Lemma 4.3 If ! is an inverse of the canonical mapping � from Q to sQ then:(i) For all terms t of Q we have !(�(t)) � t.(ii) For all terms t over sQ for which V al(t) is non-null and not equal to a set, we have!(Val(t)) � !(t).(iii) For all terms t over sQ for which V al(t) is non-null, !(t) is either an atomic value or aterm in the complete equality relationship graph of Q.[Proof]: For (i) we consider three cases, according as whether �(t) is an atomic value,object or attribute term. 24

Page 25: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Consider �rst the case where �(t) is the atomic value c. Note that t cannot be an attributeterm since these must be mapped to attribute terms. If t is an atomic value, t must be equalto c, so !(�(t)) = c � c = t. If t is a variable, there exists a variable y with �(y) = c and!(c) = y. Since the construction guarantees that distinct equivalence classes are mapped by �to distinct values, we must have t 2 [y]. Thus !(�(t)) = y � t.Next, consider the case in which �(t) is the object [x]. As this case can only arise from caseVal3 of the construction, we have t 2 [x], so !(�(t)) = !([x]) � t.Finally, if �(t) is of the form o:A, where o is an object of sQ, then t is a term of the formx:A, where x is a variable, and o = [x]. Thus !(�(t)) = !([x]:A) = !([x]):A. Let !([x]) bethe variable y. Because y must be in [x] and x:A is a term of the complete equality graph,y:A is also a term of the complete equality graph, with [x:A] = [y:A]. Hence we have that!(�(t)) = y:A � x:A = t.We prove (ii) and (iii) together. Note that if t is an atomic value or an object of sQ then!(t) is a variable of Q or an atomic value, so the claim of (iii) holds. In this case we also haveVal(t) = t so the claim of (ii) follows directly from the fact that !(t) is an atomic value or aterm of the complete equality relationship graph, so that !(t) � !(t).Suppose next that t is a term over sQ of the form [x]:A (where, without loss of generality,x is a variable) for which V al(t) is de�ned and not equal to a set. By de�nition, !([x]:A) =!([x]):A = y:A for some variable y 2 [x]. Moreover, by construction of sQ, there exists avariable z 2 [x] such that z:A is a term in the complete equality relationship graph. It followsthat !([x]:A) are in the complete equality relationship graph, establishing the claim of (iii) inthis case. (Note also that x:A is in the complete equality relationship graph.)Moreover, we have either Val(t) = [x:A] or Val(t) = c for some atomic value c. To completethe proof of (ii) we consider each of these cases individually.Suppose �rst that Val(t) = [x:A]. Let !([x]) be the variable y 2 [x]. Then x � y. Hence!(Val(t)) = !([x:A]) � x:A � y:A = !(t). Next, if Val(t) is the atomic value c then, byconstruction of sQ, one of the following cases applies:1. In case (Val1) of the construction, we have c 2 [x:A]. Thus !(Val(t)) = !(c). If there25

Page 26: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

does not exist a variable v with !(v) = c, then !(c) = c � x:A. If there does exist such avariable, and !(c) = v, then we must have v 2 [x:A]. Hence here !(c) = v � x:A also. Ineither case, it follows that !(Val(t)) � !(t).2. In case (Val2) of the construction, x:A is of atomic type, but [x:A] contains no atomicvalue. In this case, there exists a variable y in [x:A], for which we must have �(y) = c.Without loss of generality, take y to be the variable such that !(c) = y. Then !([x]:A) =!([x]):A � x:A � y = !(Val(t)). 2We may extend an inverse ! of � to a mapping from atoms A over sQ to atoms over thevariables of Q by de�ning !(A) to be the atom obtained by substituting for each term t oversQ in A the term !(t).We now show that the atoms derived by a query Q completely capture the set of atoms ofa certain form holding in the state sQ. De�ne an atom over sQ to be terminal if it is of one ofthe following forms:1. t 2 T , where t is a term over sQ and T is an atomic type or terminal class,2. s = t, where s and t are terms over sQ such that neither Val(s) nor Val(s) is a set,3. t 2 o:A, where t is a term over sQ and o is an object of sQ.Intuitively, the terminal atoms are those that may occur as images under satisfying assigmentsof a well-formed query.Lemma 4.4 Let ! be any inverse of �. Then for all terminal atoms A over sQ such thatsQ j= A we have Q ` !(A).[Proof]: We consider each of the possible cases for the atom A.1. If A is of the form c 2 T where c is an atomic value of type T then Q ` !(A) by the �rstclause of the de�nition of derivation.2. SupposeA is of the form [t] 2 C where C is a terminal class and [t] is an object of sQ, withIc([t]) = C. By construction of sQ, there exists a variable x 2 [t] such that x 2 C is an26

Page 27: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

atom of Q. Since !([t]) is also an element of the equivalence class [t] we have x � !([t]),so the atom !(A) is �-related to the atom x 2 C. Hence Q ` !(A) by the third clauseof the de�nition of derivation.3. Suppose that A is of the form s = t, where s and t are terms over sQ. For this equation tohold in sQ, we must have Val(s) = Val(t), with both non-null. Since the atom is terminal,neither value is a set. Hence, by Lemma 4.3(ii), we have !(s) � !(Val(s)) = !(Val(s)) �!(t). It follows that Q ` !(s) = !(t) by the second clause of the de�nition of derivation.4. Suppose that A is an atom of the form a 2 b:A, where a is a value or object of sQ and bis an object of sQ. By construction of sQ, for this atom to hold in sQ there must exist anobject term t in Q and a set term x:A such that t 2 x:A is an atom of Q and �(t) = a and�(x) = b. Since !(a) = !(�(t)) � t and !(b) = !(�(x)) � x by Lemma 4.3 (i), it followsthe third clause of the de�nition of derivation that Q ` !(t) 2 !(b):A, i.e., Q ` !(A). 2We comment that the corresponding property could not be established for atoms over sQexpressing equations between sets. For example, for the query Q = fx j 1 2 x:A ^ 1 2 y:Bgwe �nd that in sQ we have sQ j= [x]:A = [y]:B, since Val([x]:A) = f1g = Val([y]:B), but notQ ` x:A = y:B, since [x:A] 6= [y:B].We are now ready to state the characterization of containment of queries. First, de�ne avariable mapping from a query Q2 to a query Q1 to be a function � mapping each variable ofQ2 to either a variable of Q1 or to an atomic value. Such a mapping can be extended to amapping from terms in the complete equality graph of Q2 to expressions formed from atomicvalues and variables of Q1 by taking �(c) = c for all atomic values c, and �(x:B) = �(x):B forall variables x. (Note that the expression �(x):B need not be a term of the complete equalitygraph of Q1, or even a term. For example, if �(x) is the atomic value c then this expressionis c:B, which is not a term, and is uninterpretable in our language, since atomic values do nothave attributes.)We will deal with the composition of various such mappings. Recall that if f : X ! Y andg : Y ! Z are functions then the composition g � f is the function from X to Z de�ned by27

Page 28: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

g � f(x) = g(f(x)).De�ne a containment mapping � from a query Q2 to a query Q1 to be a variable mappingfrom Q2 to Q1 such that1. if t1 is the distinguished term of Q1 and t2 is the distinguished term of Q2 then �(t2) �1 t1(where �1 is the relation derived from Q1), and2. for every atom A of Q2, we have Q1 ` �(A).The following result shows that containment mappings characterize containment between ter-minal conjunctive queries.Theorem 4.5 If Q1 and Q2 are terminal conjunctive queries then Q1 � Q2 if and only if thereexists a containment mapping from Q2 to Q1.[Proof]: Suppose �rst that � is a containment mapping from Q2 to Q1. We need to showthat Q1 � Q2. For this, let s be any state and suppose that � is a satisfying assignment for Q1in the state s, so that �(t1) 2 Q1(s), where t1 is the distinguished term of Q1. We show that�(t1) is also in Q2(s). De�ne the assignment � for Q2 in s by � = � � �. If A is an atom of Q2then since � is a containment mapping we have by de�nition of containment that Q1 ` �(A).By Lemma 4.2 it follows that s j= �(�(A)), i.e. that s j= �(A). Since this holds for every atomof Q2 it follows that � is a satisfying assignment of Q2 in s. Thus Q2(s) contains �(t2), wheret2 is the distinguished term of Q2. Because � is a containment mapping, we have �(t2) � t1.Thus, by Lemma 4.1 we have that �(t2) = �(�(t2)) = �(t1) is in Q2(s), as promised. Thiscompletes the proof that Q1 � Q2, establishing the implication from right to left in the lemma.To prove the converse, assume that there exists no containment mapping from Q2 to Q1.We show that Q1 is not contained in Q2 by establishing that Q1(sQ1) is not a subset of Q2(sQ1).In particular, we argue that if � is the canonical assignment of Q1 in sQ1 and and t1 is thedistinguished term of Q1, then �(t1) is not in Q2(sQ1). Note that, on the other hand, �(t1) isan element of Q1(sQ1) by Lemma 3.2.To show that �(t1) is not inQ2(sQ1), assume to the contrary that � is a satisfying assignmentfor Q2 in sQ1 with �(t2) = �(t1), where t2 is the distinguished term of Q2. We derive a28

Page 29: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

contradiction to the assumption that there exists no containment mapping from Q2 to Q1. Inparticular, let ! be any inverse of � and consider the mapping � = ! � �. Note that this mustbe a variable mapping from Q2 to Q1. We claim that � is a containment mapping from Q2 toQ1.To see this, note �rst that �(t2) = !(�(t2)) = !(�(t1)) � t1. Thus � satis�es the �rstclause of the de�nition of containment mapping. Next, note that if A is an atom of Q2 thensince � is a satisfying assignment we have that sQ1 j= �(A). Since �(A) is a terminal atomover sQ1 we have by Lemma 4.4 that Q1 ` !(�(A)), i.e., that Q1 ` �(A). Thus, � satis�esthe second condition of the de�nition of containment mapping, completing the proof that � isa containment mapping from Q2 to Q1, and yielding the desired contradiction. 25 Minimization of Terminal Conjunctive QueriesIn this section, we de�ne a notion of minimality of terminal conjunctive queries, and derive analgorithm that, given a terminal conjunctive query as input, �nds a minimal equivalent queryamong all terminal conjunctive queries.Let Q be a terminal conjunctive query. A minimal terminal conjunctive query of Q is aterminal conjunctive query equivalent to Q with the number of variables minimal among suchterminal conjunctive queries. We now show how to �nd minimal queries.We begin with a number of lemmas concerning containment mappings. In the rest of thediscussion, we use subscripting to indicate the query with respect to which we compute theequivalence classes.Lemma 5.1 Let � be a containment mapping from the satis�able terminal conjunctive queryQ2 to the satis�able terminal conjunctive query Q1.(i) If t is a term of the complete equality relationship graph of Q2, then �(t) is either anatomic value or a term of the complete equality relationship graph of Q1.(ii) Let �1 and �2 be the relations on terms corresponding to the queries Q1; Q2, respectively.For all terms s and t, if s �2 t then �(s) �1 �(t).29

Page 30: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

[Proof]: We establish (i) and (ii) simultaneously. If s �2 t because both s and t are theatomic value c, then we have �(s) = �(t) = c, so �(s) �1 �(t). It therefore remains to show thatif [s]2 = [t]2, where s and t are terms of the complete equality relationship graph of Q2, then�(s) and �(t) are terms of the complete equality relationship graph of Q1 and [�(s)]1 = [�(t)]1.We prove this by induction on the construction of the complete equality relationship graph ofquery Q2. Note that it is immediate from the fact that � is a containment mapping that forall terms s of Q such that �(s) is not an atomic value, we have �(s) equal to a term in thecomplete equality relationship graph of Q1. This is because s occurs in an atom A of Q2 andQ1 derives the atom �(A).In the case of edges s = s, we clearly have �(s) �1 �(s). For edges s = t corresponding toatoms of Q2, we have �(s) �1 �(t) because � is a containment mapping. Suppose that an edges = u is derived from edges s = t and t = u of the complete equality relationship graph of Q2for which we have �(s) �1 �(t) and �(t) �1 �(u). Since all the latter terms are in Q1, it followsfrom the fact that �1 is an equivalence relation on the terms of Q1 that �(s) �1 �(u).Finally, suppose that an edge x:A = y:A is derived from an edge x = y and a term x:A of thecomplete equality relationship graph of Q2. We assume by way of induction that �(x) �1 �(y)and that the term �(x:A) occurs in the complete equality relationship graph of Q1. Now�(x:A) = �(x):A, so �(x) must be a variable, else Q1 would not be satis�able. Since �(x) �1�(y), it follows similarly that �(y) must be a variable. Thus, �(y:A) = �(y):A is a term of thecomplete equality relationship graph of Q1 and �(x):A �1 �(y):A, by the inductive hypothesisand the construction of the complete equality relationship graph of Q1. 2Lemma 5.2 Let � be a containment mapping from the satis�able terminal conjunctive queryQ2 to the satis�able terminal conjunctive query Q1. If A is an atom such that Q1 ` A thenQ2 ` �(A).[Proof]: We consider the three cases of the de�nition of derivation. First, suppose A isof the form t 2 T where T is an atomic type and t is �-related to the atomic value c of typeT . Then by Lemma 5.1 (ii) we have �(t) �1 c, so Q1 ` �(t) 2 T . Second, if A is the atoms = t and s �2 t, then by Lemma 5.1 (ii) we have �(s) �1 �(t), so Q1 ` �(s) = �(t). Finally,30

Page 31: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

if Q2 contains the atom A0 �2 A then by Lemma 5.1 (ii) we have �(A0) �1 �(A). Since �is a containment mapping it is also the case that Q1 ` �(A0). It follows using the de�nitionof derivation and the fact that �1 is an equivalence relation on terms of the complete equalityrelationship graph that Q1 ` �(A). 2Lemma 5.3 Let Q1; Q2 and Q3 be satis�able terminal conjunctive queries. If �1 is a contain-ment mapping from Q1 to Q2 and �1 is a containment mapping from Q2 to Q3 then �2 � �1 isa containment mapping from Q1 to Q3.[Proof]: Let t1; t2 and t3 be the distinguished terms of Q1; Q2 and Q3, respectively, andlet �1;�2 and �3 be the relations on terms derived form these queries. Since �1 and �2 arecontainment mappings, we have �1(t1) �2 t2 and �2(t2) �3 t3. It follows using Lemma 5.1(ii)that �2 � �1(t1) �3 �2(t2) �3 t3. This establishes that �2 � �1 satis�es the �rst condition of thede�nition of containment mapping.It remains to show that if A is an atom of Q1 then Q3 ` �2 � �1(A). Since �1 is acontainment mapping we have that Q2 ` �1(A). It follows using Lemma 5.2 and the fact that�2 is a containment mapping that Q3 ` �2 � �1(A). 2Let Q = ft j Mg be a conjunctive query and suppose � is a variable mapping on Q. De�ne�(Q) to be the conjunctive query with the distinguished term �(t) obtained by replacing eachatom A of Q by the atom �(A).Proposition 5.4 Let Q be a terminal conjunctive query. Suppose there is a containment map-ping � from Q to itself. Then �(Q) is equivalent to Q.[Proof]: It is easy to check that � is a containment mapping from Q to �(Q), so we have byTheorem 4.5 that �(Q)�Q. To show Q��(Q), we show that there is a containment mappingfrom �(Q) to Q. We claim that the identity mapping i is such a mapping. Note �rst thatthe distinguished term of �(Q) is �(t), where t is the distinguished term of Q, and we havei(�(t)) = �(t) � t because � is a containment mapping. It remains to show that for every atomA of Q, we have for the corresponding atom �(A) of �(Q) that Q ` i(�(A)). That is, we needQ ` �(A). This is immediate from the fact that � is a containment mapping. 231

Page 32: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

The following describes how to obtain a minimal terminal conjunctive query. Say that avariable mapping � from a query Q1 to a query Q2 is bijective if �(x) is a variable of Q2 for everyvariable x of Q1, and the restriction of � to the set of variables of Q1 is a bijective mapping tothe set of variables of Q2.Theorem 5.5 A terminal conjunctive query Q is minimal if all containment mappings fromQ to itself are bijective.[Proof]: Suppose that all containment mappings from Q to itself are bijective, but that Q isnot minimal. We establish a contradiction. Since Q is not minimal, there is a minimal terminalconjunctive query Q0 equivalent to Q which has fewer variables. Now by Theorem 4.5, the factthat Q0�Q implies that there is a containment mapping � from Q to Q0 and also that there is acontainment mapping ! from Q0 to Q. It follows from Lemma 5.3 that the composite mapping! � � is a variable mapping from Q to itself. By the assumption, ! � � is bijective. But thismeans that the number of variables of Q0 is at least as large as the number of variables of Q,contradicting the choice of Q0. 2The converse to Theorem 5.5 follows from the following.Theorem 5.6 Let Q1 and Q2 be minimal terminal conjunctive queries. Suppose Q1 � Q2.Then every containment mapping from one query to the other is bijective.[Proof]: Let � be a containment mapping from Q1 to Q2. Since these queries are equivalent,there also exists a containment mapping ! from Q2 to Q1. By Lemma 5.3, the compositemapping � � ! is a containment mapping from Q2 to itself. Suppose that the image of theset of variables of Q2 under � � ! does not contain all the variables of Q2. Then the query(� � !)(Q2), which is equivalent to Q2 by Proposition 5.4, has fewer variables than Q2. Thiscontradicts the minimality of Q2. This shows that the image of the the set of variables of Q2under � � ! contains all the variables of Q2. It follows that Q1 has at least as many variablesas Q2. A similar argument using the containment mapping ! � � from Q1 to Q1 shows that Q2has at least as many variables as Q1. Since � �! covers the variables of Q2, it now follows that� must in fact be a bijection between the variables of Q1 and Q2. 232

Page 33: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Given a satis�able terminal conjunctive query Q the algorithm to �nd a minimal terminalconjunctive query Q0 equivalent to Q is as follows. Consider all containment mappings � fromQ to itself. Choose Q0 to be one of the queries �(Q) that has the fewest variables among suchqueries. The minimality of Q0 follows from Theorem 5.6.Note that there may be some further optimizations possible for the query Q0 so obtained,since this may contain atoms of the form c 2 T or c = c, where c is an atomic value of typeT . Such atoms, since they are derivable even from an empty query, can be deleted, yielding anequivalent query.6 Containment and Optimization of Conjunctive QueriesIn this section, we study the containment of conjunctive queries and show how to obtain opti-mal conjunctive queries. The optimal conjunctive queries obtained are expressed as unions ofterminal conjunctive queries and are optimal among all unions of terminal conjunctive queries.In Section 6.1, we characterize containment of conjunctive queries by solving the containmentproblem for unions of terminal conjunctive queries. In Section 6.2, we give our notion of op-timality. In Section 6.3, we derive an algorithm, given a conjunctive query, for �nding anoptimal union of terminal conjunctive queries. We �rst use an example to illustrate our notionof optimality and the approach taken in obtaining an optimal query.Example 6.1 Let us consider the following query de�ned on the schema in Example 1.1.Q1: f x j 9y 9z (x2Vehicle & y2Discount & z2Client & x2y.VehRented & x2z.VehRented)g.This query retrieves all those vehicles that have been rented to a discount client. By Propo-sition 2.1, Q1 is equivalent to the union of the following terminal conjunctive queries:S1: f x j 9y 9z (x2Auto & y2Discount & z2Normal & x2y.VehRented & x2z.VehRented)g.S2: f x j 9y 9z (x2Auto & y2Discount & z2Discount & x2y.VehRented & x2z.VehRented)g.S3: f x j 9y 9z (x2Trailer & y2Discount & z2Normal & x2y.VehRented & x2z.VehRented)g.S4: f x j 9y 9z (x2Trailer & y2Discount & z2Discount & x2y.VehRented & x2z.VehRented)g.S5: f x j 9y 9z (x2Truck & y2Discount & z2Normal & x2y.VehRented & x2z.VehRented)g.S6: f x j 9y 9z (x2Truck & y2Discount & z2Discount & x2y.VehRented & x2z.VehRented)g.33

Page 34: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

With algorithm SatTestUT, it can be shown that S3, S4, S5 and S6 are unsatis�able. Thereason for this is that discount clients are only allowed to rent automobiles and not other typesof vehicles. Hence Q1 is equivalent to S1[S2. There is a containment mapping from S2 to S1;the mapping is to map x to x and y and z to y. By Theorem 4.5, S1 is redundant and is removedfrom the union. S2 can further be minimized by mapping x to x and y and z to y. The resultingoptimal query obtained is:S02: f x j 9y (x2Auto & y2Discount & x2y.VehRented)g. 26.1 A Characterization of Containment of Unions of Terminal ConjunctiveQueriesBy Proposition 2.1, understanding containment of unions of terminal conjunctive queries su�cesto solve the containment problem of conjunctive queries. We have found a characterization ofcontainment for terminal conjunctive queries. We are now ready to state the containmentcondition for two unions of terminal conjunctive queries.Theorem 6.1 Let M = Q1 [� � � [Qs and N = P1 [� � � [Pt be two unions of terminal conjunc-tive queries. M�N if and only if for each Qi in M, there is a Pj in N such that Qi�Pj.[Proof]: \If" Trivial.\Only if" Let Qi be a subquery in M. Let sQi be the state constructed for Qi. Suppose �is a satisfying assignment from Qi to sQi and let ! be an inverse of �. Since M�N, there issome Pj such that there is a satisfying assignment of object identi�ers and atomic values inthe state sQi to variables in Pj that gives rise to the answer �(ti), where ti is the distinguishedterm of Qi. Then !o is a variable mapping from Pj to Qi. We show that the mapping !o isa containment mapping.Let !o (tj) = v, where tj is the distinguished term of Pj. Then (tj) = �(ti). Hence,!( (tj)) = !(�(ti)). By Lemma 4.3(i), !(�(ti))� tj . Let A be an atom of Pj. Then (A) isa terminal atom over sQi and sQi j= (A). By Lemma 4.4, Qi ` ! ( (A)). Hence !o is acontainment mapping. By Theorem 4.5, Qi�Pj . 234

Page 35: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

As a corollary, we solve the problem of determining when one conjunctive query containsthe other one.6.2 Search-Space-Optimal QueriesWe now introduce our notion of optimality which intends to capture the intuition that thenumber of variables as well as their search spaces are minimal among all equivalent queries. Ina conjunctive query, each variable is associated with a set of terminal classes or atomic typeswhich denotes the search space of the variable. Without knowing the physical data organizationfor various classes, a good criterion of evaluating various equivalent queries is by comparing theset of variables in a query and their associated search spaces.Example 6.2 Let us consider again the Vehicle Rental Schema. The following three queriescan be shown to be equivalent.Q1: f x j 9y 9z (x2Auto & y2Discount & z2Vehicle & x2y.VehRented & z2y.VehRented)g.Q2: f x j 9y (x2Vehicle & y2Discount & x2y.VehRented)g.Q3: f x j 9y (x2Auto & y2Discount & x2y.VehRented)g.If we consider the domain of the type of a variable as its search space, then Q1 has morevariables and has a larger search space than Q2. Although Q2 and Q3 have the same numberof variables, the search space associated with variables in Q2 is greater than that in Q3. Q3is considered to be more optimal since the number of variables as well as the search space areminimal. 2Existing work on exact minimization tries to minimize the number of joins in an expression[3]. The notion of optimality we shall propose attempts to generalize that idea.A multiset is a set or bag of elements in which duplicate elements are allowed. Let S and Tbe two multisets. The bag union of S and T, denotes S ] T, is a multiset obtained from mergingelements in the operands such that for every element x in S or T, the number of occurrencesof x in the bag union is the sum of the numbers of occurrences of x in S and in T. Clearly thebag union operator is commutative and associative. We say S is a bag subset of T, denotes S35

Page 36: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

vT, if for every element x in S, if there are n occurrences of x in S, then there are at least noccurrences of x in T.Let Q be a conjunctive query and x a variable in Q. De�ne term-class(Q, x) = fEj x2C1t � � � tCn is the range atom in Q associated with the variable x, and E is a terminal subtypeof Ci, for some 1 �i �ng. Informally, term-class(Q, x) gives the terminal descendent classesor atomic types over which the variable x is ranging in the query. Let x1 , : : : , xn be theset of variables in Q. Then term-class(Q) is a multiset de�ned as term-class(Q, x1) ] . . .]term-class(Q, xn).We are now ready to de�ne our notion of optimality. LetQ=Q1[ � � � [Qn and P=P1[ � � � [Pmbe two unions of conjunctive queries. Q is said to be at least as optimal as P, denotes Q�P, ifterm-class(Q1) ] . . .] term-class(Qn) v term-class(P1) ] . . .] term-class(Pm).A query Q is search-space-optimal among a set of queries S if for all P in S such thatP is equivalent to Q, P�Q implies Q�P. For search-space-optimal queries, the object searchspaces are minimal among all equivalent queries in the set S. The query Q3 in Example 6.2 isa search-space-optimal query.Corollary 6.2 If Q is a minimal terminal conjunctive query, then Q is a search-space-optimalquery among all terminal conjunctive queries.[Proof]: Follows from Theorem 5.6 and the derivability of range atoms. 26.3 Optimization of Unions of Terminal Conjunctive QueriesIn this subsection, we study the optimization of unions of terminal conjunctive queries. Weshow how to obtain a search-space-optimal query among all unions of terminal conjunctivequeries.A union of terminal conjunctive queries Q1(s, t) [� � � [Qn(s, t) is nonredundant if there areno Qi and Qj, i6=j, such that Qi�Qj. We can transform a union of terminal conjunctive queriesto an equivalent nonredundant union by �nding Qi and Qj, i6=j, such that Qi�Qj and deletingQi from the union until no more subquery can be removed.36

Page 37: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

The following is an important property about nonredundant unions of terminal conjunctivequeries.Theorem 6.3 Let M = Q1 [� � � [Qs and N = P1 [� � � [Pt be two nonredundant unions ofterminal conjunctive queries. Then M�N if and only if for each Qi in M, there is a unique Pjin N such that Qi�Pj and vice versa. Moreover, s=t.[Proof]: \If" Follows from Theorem 6.1.\Only if" Suppose M�N. Let Qi be a subquery in M. By Theorem 6.1, there is a Pj inN such that Qi�Pj . By the assumption on equivalence and by Theorem 6.1, there is Qp inM such that Pj�Qp. If i6=p, then Qi�Pj and Pj�Qp. This implies Qi�Qp and contradictsthe nonredundancy of M. Hence i=p and Qi�Pj. If s6=t, say, s<t, then there is a Pj which isredundant. This is a contradiction. It follows that s=t. 2The following is an algorithm for �nding an optimal union of terminal conjunctive queriesfor a conjunctive query.Algorithm Optimization: Given a conjunctive query Q, �nd an equivalent union of terminalconjunctive queries which is search-space-optimal among all unions of terminal conjunctivequeries.Input: A conjunctive query Q.Output: An equivalent union of terminal conjunctive queries.Method:(1) Convert Q into an equivalent union of terminal conjunctive queries.(2) Remove unsatis�able subqueries from the union using the algorithm SatTestUT.(3) Remove any redundant subqueries from the union.(4) Minimize each of the remaining subqueries.(5) Output the union of resulting terminal conjunctive queries.Theorem 6.4 The union of terminal conjunctive queries output by Algorithm Optimization isequivalent to Q and is a search-space-optimal query among all unions of terminal conjunctivequeries. 37

Page 38: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

[Proof]: By Proposition 2.1, Theorem 3.3 and by de�nitions of redundancy and minimization,the union, say Q', output is equivalent to the input Q. Let P= P1 [ � � � [Pn be a union ofterminal conjunctive queries that is equivalent to Q. Without loss of generality, let us assumethat P is nonredundant and each subquery is minimal. By Theorem 6.3, there is a one-onecorrespondence between the two unions. By Theorem 5.6 and the fact that both unions areunions of terminal conjunctive queries, Q' �P. 2Let us look at the following example.Example 6.3 Let us consider a query de�ned on the following schema.N

A:{H}B:H

T

B:I

T T3

A:{I} B:I

A:{J}B:J

21

H

I J

Q1: f x j 9y 9s (x2N & y2H & s2J & y=x.B & y2x.A & s2x.A)g.By Proposition 2.1, Q1 is equivalent to the union of the following terminal conjunctivequeries:S1: f x j 9y 9s (x2T1 & y2I & s2J & y=x.B & y2x.A & s2x.A)g.S2: f x j 9y 9s (x2T2 & y2I & s2J & y=x.B & y2x.A & s2x.A)g.S3: f x j 9y 9s (x2T3 & y2I & s2J & y=x.B & y2x.A & s2x.A)g.S4: f x j 9y 9s (x2T1 & y2J & s2J & y=x.B & y2x.A & s2x.A)g.S5: f x j 9y 9s (x2T2 & y2J & s2J & y=x.B & y2x.A & s2x.A)g.S6: f x j 9y 9s (x2T3 & y2J & s2J & y=x.B & y2x.A & s2x.A)g.With algorithm SatTestUT, S2, S3, S4 and S5 are unsatis�able. Hence Q1 is equivalent toS1[S6. Neither subquery in the union contains the other and hence the union is nonredundant.Since variables in S1 range over di�erent terminal classes, S1 is minimal. It can easily be shownthat S6 can be minimized further. A minimal form is as follows:S06: f x j 9y (x2T3 & y2J & y=x.B & y2x.A)g.38

Page 39: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

The union of terminal conjunctive queries output by the algorithm is S1[S06 which is asearch-space-optimal query for Q1 among all unions of terminal conjunctive queries. 2Algorithm Optimization only produces an optimal query expressed as a union of terminalconjunctive queries. This form needs not be the most desirable form to be executed. Forinstance, Q1 in Example 6.3 is equivalent to the following query.Q2: f xj 9y9s(x2T1 t T3 & y2H & s2J & y=x.B & y2x.A & s2x.A)g.Throughout the discussion, we made no assumption on how data are being physically or-ganized. It could be the case that, given certain information on data organization, Q2 is abetter form to be evaluated than the union produced by the algorithm. However, the union ofterminal conjunctive queries produced could be used as a basis to generate equivalent query ina more desirable form. It would be interesting to see how other information could be used tosynthesize a more optimal query for the union.7 Complexity of the Containment ProblemIn this section, we investigate the time complexity for determining containment of conjunctivequeries. We begin with the simple case of terminal conjunctive queries. The containmentproblem for terminal conjunctive queries is clearly in NP. A relational query is called a SPJ-query if only selection with constant, projection and natural join are used in the query. It iswell-known that the class of relational SPJ-expressions can be expressed as a tagged tableau[3]. Every such tagged tableau can be translated into a conjunctive query without the setmembership construct. In [3], it was shown that the problem of determining containmentof SPJ-expressions is an NP-complete problem. Consequently, the containment problem ofterminal conjunctive queries is also NP-complete. In fact, it can be shown that containmentproblem of terminal conjunctive queries involving only range and set membership atoms isNP-complete.Corollary 7.1 The problem of determining containment for terminal conjunctive queries is anNP-complete problem. 39

Page 40: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

[Proof]: Follows from the argument above. 2Corollary 7.1 implies that testing containment of conjunctive queries is NP-hard. We shallshow that the problem is in �P2 of the polynomial hierarchy [32]. The proof of this result issimilar to a proof in [28]. A language L is in �P2 if its complement is in �P2 , the class of languageswhich can be recognized by nondeterministic polynomial-time algorithm with an oracle fromNP. An oracle for a class of decision problems C enables one to decide any problem in C inunit time. The classes �P2 and �P2 contain NP and are contained in PSPACE.Theorem 7.2 The problem of determining containment for conjunctive queries is in �P2 .[Proof]: Let E1 and E2 be two conjunctive queries. We describe a nondeterministic polynomial-time algorithm with an oracle from NP that answers \yes" if and only if E1 6�E2. By Theo-rem 6.1, E1 6�E2 if and only if there is a terminal conjunctive query in the union for E1, sayE3, such that E3 6�E2. E3 can be guessed nondeterministically from E1 by assigning terminalclasses or atomic types to variables involved. After guessing the query, we test if E3 is sat-is�able. Testing satis�ability of terminal conjunctive queries can be performed in polynomialtime [10]. If E3 is satis�able, then we ask the oracle if E3�E2. If the oracle answers \no",then the algorithm answers \yes". It remains to show that determining E3 �E2 is in NP. ByTheorem 6.1, E3�E2 if and only if there is a terminal conjunctive query in the union for E2,say E4, such that E3 �E4. Thus we can guess E4 as before and then check in nondeteministicpolynomial time that E3 �E4. Hence our claim is proved. 2We are now ready to show that the containment problem is �P2 - hard.A �2 formula of quanti�ed propositional logic is an expression' = 8p1 : : : pn9pn+1 : : : pn+m[�]where � is a formula of propositional logic containing only the propositional variables p1;. . . ; pn+m.Such an expression is true if for every assignment of boolean truth value to the variablesp1;. . . ; pn, there exists an assignment of truth values to the variables pn+1;. . . ; pn+m underwhich the formula � is true. The set �2-SAT is the set of all true �2 formulae. This is a40

Page 41: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

generalization of the problem of satis�ability to the polynomial hierarchy. It is known that theset �2-SAT is complete for the level �p2 of this hierarchy [33, 36].Theorem 7.3 There exists a �xed schema S such that problem of deciding, given queries Q1and Q2 on S, whether Q1 � Q2 is �p2-hard.[Proof]: By reduction from �2-SAT . We show that for every �2 formula ', there is a pairof conjunctive queries Q1; Q2 such that ' is true if and only if Q1 is contained in Q2.De�ne the schema S to contain classes C;R;G; V;AND and NOT . The subclass relation-ships between these classes are given by R � C and G � C. Thus, the terminal classes areR;G; V;AND and NOT . The tuple type for both R and G is [a : C; b : INT; c : V ], for V isthe empty tuple type [], for AND is [in1 : V; in2 : V; out : V ], and for NOT is [in : V; out : V ].We �rst describe the queryQ1. Part of this query will encode the truth tables for conjunctionand negation. Intuitively, there are (four) variables ranging over AND which represent the linesof the truth table for `^', and there are (two) variables ranging over NOT which represent thelines of the truth table of `:', and there are variables t and f in V which denote the truthvalues true and false, respectively. We write TT (t; f) for the following formula:9u1 2 AND [u1:in1 = t & u1:in2 = t & u1:out = t] &9u2 2 AND [u2:in1 = t & u2:in2 = f & u2:out = f ] &9u3 2 AND [u3:in1 = f & u3:in2 = t & u3:out = f ] &9u4 2 AND [u4:in1 = f & u4:in2 = f & u4:out = f ] &9v1 2 NOT [v1:in = t & v1:out = f ] &9v2 2 NOT [v2:in = f & v2:out = t]:Next, suppose ' has n universally quanti�ed variables. Part of the query Q1 will have thefunction of assigning a truth value to each of these variables. We construct for each i = 1;. . . ; nthe query ASGNi(t; f) given by9wi1 2 R 9wi2 2 C 9wi3 2 G [wi1:a = wi2 & wi2:a = wi3 &wi2:b = i & wi3:b = i &wi2:c = t & wi3:c = f ]:We now de�ne the query Q1 to beft j t 2 V & 9f 2 V [TT (t; f) & ASGN1(t; f) & : : :& ASGNn(t; f)]g:41

Page 42: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

After moving the quanti�ers to the front, it is clear that Q1 is a conjunctive query.Let � be a formula of propositional logic in the propositional constants p1;. . . ; pn+m. Wede�ne inductively the formula �� with free variables amongst x = xpn ;. . . ; xpn+m and x�. If �is the propositional constant pi, where i = 1;. . . ; n, then�� = 9zi1 2 R9zi2 2 G[zi1:a = zi2 & zi2:b = i & zi2:c = xpi ]:If � is the propositional constant pi, where i = n + 1;. . . ; n +m, then �� is the null formulatrue. (Note that the variable x� is xpi in these cases.) If � = �& then ��(x; x�) is9y� 2 AND 9x�x 2 V " y�:in1 = x� & y�:in2 = x & y�:out = x� &��(x; x�) & � (x; x ) #If � = :� then ��(x; x�) is9y� 2 NOT 9x� 2 V [x:in = x�&y�:out = x�&��(x; x�)]:We de�ne Q2 to be the queryfx� j x� 2 V & 9xp1 : : : xpn 2 V [��(x; x�)]gObserve that the class C does not occur in Q2. Thus, after moving the quanti�ers to the front,this query is a terminal conjunctive query.Note that expansions of Q1 are obtained by replacing each of the n range atoms wi2 2 Cby either wi2 2 R or wi2 2 G. There are therefore 2n such expansions. We �rst show thateach of these expansions uniquely determines an assignment of truth values to the propositionalconstants p1;. . . ; pn.Suppose 1 � i � n. Because the constant i has only two occurrences in Q1, if � is a mappingfrom the variables zi1; zi2; xpi to the variables of an expansion E of Q1 that preserves the atomzi2:b = i of the formula �pi , then we must have �(zi2) = wi2 or �(zi2) = wi3. In case theatom wi2 2 C of ASGNi is expanded as wi2 2 G, the mapping zi1 7! wi1; zi2 7! wi2; xpi 7! t,is the only mapping that preserves all the equality and range atoms of �pi . This determinesthe assignment of true to pi. Similarly, in case wi2 2 C is expanded as wi2 2 R the mappingzi1 7! wi2; zi2 7! wi3; xpi 7! f is the only such mapping. This determines the assignment offalse to pi. 42

Page 43: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

Next, suppose that � is a mapping from the variables of the formulae �pi for i = 1;. . . ; nand the variables xpn+1 ;. . . ; xpn+m to the variables of an expansion E of Q1, such that the atomsof Q2 containing these variables are preserved. As noted above, this implies that the variablesxp1 ;. . . ; xpn are mapped to either t or f . The same holds for the variables xpn+1 ;. . . ; xpn+m ,because of the range atoms xpi 2 V in Q2. Let � be the truth value assignment that assignsthe constant pi to be true if and only if �(xpi) = t. Under these conditions, a straightforwardinduction on the complexity of � shows that there exists a unique extension of the mapping� to a mapping from the variables of Q2 to the variables of Q1 that preserves all the atomsof Q2. Furthermore, we have �(x�) = t if and only if the formula � is true with respect tothe assignment �. Note also that this mapping � is a containment mapping from Q2 to theexpansion E if and only if �(x�) = t.It now follows from the observations above that there exists a containment mapping fromQ2 to E for each expansion E of Q1 if and only if the quanti�ed formula ' is true. 2Theorem 7.4 The problem of determining containment for conjunctive queries is complete in�P2 .[Proof]: By Theorems 7.2 and 7.3. 28 ConclusionQuery optimization is an important and yet di�cult problem in an OODB. The types of at-tributes in an inheritance hierarchy can be considered as constraints imposed on objects in astate. In this paper, we studied the containment, equivalence and optimization problems for aclass of natural queries called conjunctive queries. A conjunctive query can be expressed as aunion of terminal conjunctive queries. We �rst characterized containment and minimization forterminal conjunctive queries. We then solved the problems of containment and optimizationfor the class of object-preserving conjunctive queries. The optimal queries are expressed asunions of terminal conjunctive queries. The notion of optimality captures the intuition thatan optimal equivalent query logically accesses, in certain sense, the least number of objectsin a database. It was shown that testing containment of terminal conjunctive queries is an43

Page 44: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

NP-complete problem. Moreover, the containment problem of conjunctive query in general is�p2-complete.AcknowledgementThe �rst author wishes to acknowledge the �nancial support of the Natural Sciences and Engi-neering Research Council of Canada. Work done in part while the �rst author was a visitor andthe second author an employee of NTT Basic Research Labs, Tokyo. The authors also thankanonymous referees for their constructive comments which greatly enhanced the readability ofthis paper.

44

Page 45: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

References[1] Abiteboul, S. and Beeri, C., \On the Power of Languages for the Manipulation of ComplexObjects," Technical Report 846, INRIA, May 1988.[2] Abiteboul, S. and Kanellakis, P., \Object Identity as a Query Language Primitive," Pro-ceedings of 1989 ACM SIGMOD, Portland, Oregon, pp. 159-173. Also appears as TechnicalReport 1022, INRIA, April 1989.[3] Aho, A.V., Sagiv, Y. and Ullman, J.D., \Equivalence of Relational Expressions," SIAM J.of Computing 8(2), 1979, pp. 218-246.[4] Atkinson, M., Bancilhon, F., DeWitt, D., Dittrich, K., Maier, D. and Zdonik, S, \TheObject-Oriented Database System Mani�sto," Proceedings of the 1st International Con-ference on Deductive and Object-Oriented Databases (Kim. W., Nicolas, J-M and Nishio,S., Eds), Kyoto, Japan, December 1989, pp. 40-57.[5] Bancilhon, F., \Object-Oriented Database Systems," Proceedings of the 7th ACM Sympo-sium on Principles of Database Systems, Austin, Texas, 1988, pp. 152-162.[6] Banerjee, J., Kim, W., and Kim, K.C., \Queries in Object-Oriented Databases," Proceed-ings of the 4th International Conference on Data Engineering , Feb. 1988, pp. 31-38.[7] Beeri, C. and Kornatzky, Y, \Algebraic Optimization of Object-Oriented Query Lan-guages," Proceedings of the 3rd International Conference on Database Theory , Paris,France, Lecture Notes in Computer Science 470, December 1990, Springer-Verlag, pp.72-88.[8] Borgida, A. \Type Systems for Querying Class Hierarchies with Non Strict Inheritance,"Proceedings of the 8th ACM Symposium on Principles of Database Systems, Philadelphia,Pennsylvania, 1989, pp. 394-401.[9] The Object Database Standard: ODMG-93, Release 1.1, Cattell, R.G.G. (Editor), MorganKaufmann, San Mateo, CA, 1994.[10] Chan, E.P.F., \Testing Satis�ability of a Class of Object-Oriented Conjunctive Queries,"Theoretical Computer Science (134), 1994, pp. 287-309.[11] Chan, E.P.F., \Containment and Minimization of Positive Conjunctive Queries inOODB's," Proceedings of the 11th ACM Symposium on Principles of Database Systems,San Diego, CA, 1992, pp. 202-211.[12] Chandra, A. and Merlin, P.M., \Optimal Implementation of Conjunctive Queries in Re-lational Databases," Proceedings of the 9th ACM Symposium on Theory of Computing ,1977, pp. 77-90.[13] Cluet, S. and Delobel, C., \A General Framework for the Optimization of Object-orientedQueries," Proceedings of the ACM SIGMOD , San Diego, CA, 1992, pp. 383-392.[14] Codd, E.F., \Extending the Data Base Relational Model to Capture More Meaning," ACMTransactions on Database Systems, 4(4), December 1979, pp. 397-434.[15] Fisher, D.H., Beech, D., Cate, H.P., Chow, E.C., Connors, T., Davis, J.W., Derrette, N.,Hoch, C.G., Kent, W., Lyngbaek, P., Mahbod, B., Neimat, M.A., Ryan, T.A., and Shan,M.C., \IRIS: An Object-Oriented Database Management System," ACM Transactions onO�ce Information Systems, 1(5), January 1987, pp. 48-69.45

Page 46: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

[16] Hull, R., \A Survey of Theoretical Research on Typed Complex Database Objects,"Databases (J. Paredaens, Ed.), Academic Press, London, UK., 1987, pp. 193-256.[17] Hull, R. and Yoshikawa, M., \ILOG: Declarative Creation and Manipulation of ObjectIdenti�ers (Extended Abstract)," Proceedings of 16th International Conference on VeryLarge Data Bases, Morgan Kaufmann, Los Altos, CA, 1990, pp. 455-468.[18] Hull, R. and Yoshikawa, M., \On the Equivalence of Database Restructurings Involv-ing Object Identi�ers (Extended Abstract)," Proceedings of the 10th ACM Symposium onPrinciples of Database Systems, Denver, Colorado, 1991, pp. 328-340.[19] Kifer, M., Kim, W. and Sagiv, Y., \Querying Object Oriented Databases," Proceedings ofthe 1992 ACM SIGMOD , San Diego, CA, 1992, pp. 393-402.[20] Kifer, M., Lausen, G. and Wu, J., \Logical Foundations of Object-Oriented and Frame-based Languages," Technical Report 90/14 (revised), SUNY at Stony Brook, June 1990.[21] Kim, W., Introduction to Object-Oriented Databases, MIT Press, Cambridge, Mas-sachusetts, 1990.[22] Lamb, C., Landis, G., Orenstein, J. and Weinreb, D., \The ObjectStore Database System,"CACM 34(10), October 1991, pp. 50-63.[23] Lanzelotte, R., Valduriez, P., Ziane, M. and Cheiney, J-P., \Optimization of NonrecursiveQueries in OODBs," Proceedings of 2nd International Conference on Deductive and Object-Oriented Databases, Munich, Germany, Lecture Notes in Computer Science 566, December1991, pp. 1-21.[24] Lecluse, C. and Richard, P., \Manipulation of Structured Values in Object-OrientedDatabase System," Proceedings of the 2th International Workshop on Database Program-ming Languages (Hull, R., Morrison, R. and Stemple, D., Eds.), Morgan Kaufmann Pub-lishers Inc., San Mateo, CA, 1990, pp. 113-121.[25] Maier, D., Stein, J.C., Otis, A. and Purdy, A., \Development of an Object-OrientedDBMS," Proceedings of OOPSLA '86 , Portland, Oregan, Sept. 1986, pp. 472-482, ACM,New York.[26] Ontos Object Database Programmer's Guide, Ontologic Inc., Burlington, MA, 1989.[27] The O2 Programmer Manual, O2 Technology, 1991.[28] Sagiv, Y. and Yannakakis, M., \Equivalences Among Relational Expressions with theUnion and Di�erence Operators," Journal of ACM 27(4), October 1980, pp. 633-655.[29] Scholl, M.H., Laasch, C., and Tresch, M., \Updatable Views in Object-OrientedDatabases," Proceedings of 2nd International Conference on Deductive and Object-OrientedDatabases, Munich, Germany, Lecture Notes in Computer Science 566, December 1991,pp. 189-207.[30] Shaw, G. and Zdonik, S., \OO queries: Equivalence and Optimization," Proceedings ofthe 1st International Conference on Deductive and Object-Oriented Databases, (Kim. W.,Nicolas, J-M and Nishio, S., Eds), Kyoto, Japan, December 1989, pp. 264-278.[31] Straube, D.D. and Ozsu, T., \Queries and Query Processing in Object-Oriented DatabaseSystems," ACM Transactions on Information Systems 8(4), October 1990, pp. 387-430.46

Page 47: pdfs.semanticscholar.orgpdfs.semanticscholar.org/c6e9/39986b0121d7f939deef80270e6a5cda8a07.pdf · Con tainmen t and Optimization of Ob ject-Preserving Conjunctiv e Queries Edw a rd

[32] StockMeyer, L.J., \The Polynomial-Time Hierarchy," Theoretical Computer Science (3:1),1976, pp.1-22.[33] Stockmeyer, L.J. and Meyer, A.R., \Word Problems requiring exponential time," Proc. ofthe 5th STOC, pp.1-9, 1973.[34] Ullman, J.D., \Database Theory-Past and Future," Proceedings of the 6th ACM Symposiumon Principles of Database Systems, San Diego, CA, 1987, pp.1-10.[35] Ullman, J.D., Principles of Database and Knowledge-base Systems, Vol. I , Computer Sci-ence Press, Rockville, Maryland, 1988.[36] Wrathall, C, \Complete Sets and the Polynomial-time Hierarchy," Theoretical ComputerScience (3), pp. 23-33, 1976.[37] Zaniolo, C., \The Database Language GEM," Proceedings of the 1983 ACM SIGMOD ,San Jose, CA, 1983, pp. 207-218.

47