Database Systems I
Foundations of Databases
Summer term 2010
Melanie Herschel
Database Systems Group, University of Tübingen
Foundations of Databases | Summer term 2010 | Melanie Herschel | University of Tübingen
Chapter 5: Relational Algebra
1. Introduction
2. ER-Modeling
3. Relational model(ing)
4. Relational algebra
5. SQL
6. Programming
7. Advanced topics
• After completing this chapter, you should be able to
‣ enumerate and explain the operations of relational algebra (there is a core of 5 relational algebra operators),
‣ write relational algebra queries of the type join–select–project,
‣ discuss correctness and equivalence of given relational algebra queries.
Chapter 5: Relational algebra
• Introduction
• Unary Operators: Selection, Projection
• Binary Operators: Cartesian Product, Join, Outer Join
• Set Operations
• Combining Operators
• Formal Definitions, A Bit of Theory
Example Database
Sample Movie Database Tables

Actor
AID  Name   DOB
1    Jolie  4.6.75
2    Pitt   18.12.63

Movie
MID  Title                Year  Rating
1    Babel                2006  7
2    Inglorious Bastards  2009  8
3    Wanted               2008  3

Role
AID  MID  Name
1    3    Fox
2    1    Richard Jones
2    2    Lt. Aldo Raine

Relational Algebra
•Relational algebra (RA) is a query language for the relational model with a solid theoretical foundation.
•Relational algebra is not visible at the user interface level (not in any commercial RDBMS, at least).
•However, almost any RDBMS uses RA to represent queries internally (for query optimization and execution).
•Knowledge of relational algebra will help in understanding SQL and relational database systems in general.
• In mathematics, an algebra is a
‣ set (the carrier), and
‣ operations that are closed with respect to the set.
• Example: (ℕ, {∗, +}) forms an algebra.
• In case of RA,
‣ the carrier is the set of all finite relations.
• We will get to know the operations of RA in the sequel (one such operation is, for example, ∪).
•Another operation of relational algebra is selection.
• In contrast to operations like + in ℕ, the selection σ is parameterized by a simple predicate.
•For example, the operation σAID=2 selects all tuples in the input relation that have the value 2 in column AID.
Selection
Role
AID  MID  Name
1    3    Fox
2    1    Richard Jones
2    2    Lt. Aldo Raine

σAID=2(Role) =
AID  MID  Name
2    1    Richard Jones
2    2    Lt. Aldo Raine
•Since the output of any RA operation is some relation R again, R may be the input for another RA operation.
• The operations of RA can be nested to arbitrary depth, so complex queries can be expressed. The final result is always a relation.
•A query is a term (or expression) in this relational algebra.
A query
πTitle,Name(Role ⋈ σRating>5(Movie)) =
Title                Name
Babel                Richard Jones
Inglorious Bastards  Lt. Aldo Raine
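The nested query above can be traced step by step. The following is a minimal sketch, not how an RDBMS evaluates it: relations are modelled as Python sets of tuples, and the fixed column order noted in the comments is an assumption of this sketch.

```python
# Relations as sets of tuples; the column order is fixed by convention here.
movie = {  # (MID, Title, Year, Rating)
    (1, "Babel", 2006, 7),
    (2, "Inglorious Bastards", 2009, 8),
    (3, "Wanted", 2008, 3),
}
role = {  # (AID, MID, Name)
    (1, 3, "Fox"),
    (2, 1, "Richard Jones"),
    (2, 2, "Lt. Aldo Raine"),
}

# Innermost operation first: selection on Rating > 5.
good_movies = {m for m in movie if m[3] > 5}

# Natural join on the shared column MID, then projection on (Title, Name).
result = {(m[1], r[2]) for r in role for m in good_movies if r[1] == m[0]}
```

Each intermediate value is again a relation (a set of tuples), which is what allows the operations to nest.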
• There are some differences between the two query languages RA and SQL:
•Null values are usually excluded in the definition of relational algebra, except when operations like outer join are defined.
•Relational algebra treats relations as sets, i.e., duplicate tuples will never occur in the input/output relations of an RA operator.
Remember: In SQL, relations are multisets (bags) and may contain duplicates. Duplicate elimination is explicit in SQL (SELECT DISTINCT).
•Relational algebra is the query language when it comes to the study of relational query languages (DB Theory):
•The semantics of RA is much simpler than that of SQL. RA features five basic operations (and can be completely defined on a single page, if you will).
•RA is also a yardstick for measuring the expressiveness of query languages. If a query language QL can express all possible RA queries, then QL is said to be relationally complete.
SQL is relationally complete. Vice versa, every SQL query (without null values, aggregation, and duplicates) can also be written in RA.
Selection
The selection σφ selects a subset of the tuples of a relation, namely those which satisfy predicate φ. Selection acts like a filter on a set.
Selection
Movie
MID  Title                Year  Rating
1    Babel                2006  7
2    Inglorious Bastards  2009  8
3    Wanted               2008  3

σRating>5(Movie) =
MID  Title                Year  Rating
1    Babel                2006  7
2    Inglorious Bastards  2009  8
•A simple selection predicate φ has the form
⟨Term⟩ ⟨ComparisonOperator⟩ ⟨Term⟩
•⟨Term⟩ is an expression that can be evaluated to a data value for a given tuple:
‣ an attribute name,
‣ a constant value,
‣ an expression built from attributes, constants, and data type operations like +, −, ∗, /.
• ⟨ComparisonOperator⟩ is
‣ = (equals), ≠ (not equals),
‣ < (less than), > (greater than), ≤, ≥,
‣ or other data type-dependent predicates (e.g., LIKE).
•Examples for simple selection predicates:
‣Name = ‘Fox’
‣Rating > 5
‣Movie.MID = Role.MID
• σφ(R) may be implemented as follows.
“Naive” selection
create a new temporary relation T;
foreach t ∈ R do
    p ← φ(t);
    if p then
        insert t into T;
    fi
od
return T;
• If index structures are present (e.g., a B-tree index), it is possible to evaluate σφ(R) without reading every tuple of R.
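The naive selection procedure translates almost line by line into Python. This is a sketch under the assumption that a relation is a set of tuples and that the predicate φ is passed as a function; the name `select` is chosen for illustration only.

```python
def select(relation, predicate):
    """Naive selection: scan every tuple, keep those that satisfy the predicate."""
    result = set()            # the temporary relation T
    for t in relation:        # foreach t in R
        if predicate(t):      # p <- phi(t)
            result.add(t)     # insert t into T
    return result

role = {  # (AID, MID, Name)
    (1, 3, "Fox"),
    (2, 1, "Richard Jones"),
    (2, 2, "Lt. Aldo Raine"),
}

# sigma_{AID=2}(Role): the predicate inspects one tuple in isolation.
picked = select(role, lambda t: t[0] == 2)
```

The scan touches every tuple of the input, which is the cost an index can help avoid.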
A few corner cases
R
A  B
1  3
1  4
2  5

σC=1(R) = error (schema error)
σA=A(R) = R (all three tuples qualify)
σ1=2(R) = the empty relation with schema (A, B)
•σφ(R) corresponds to the following SQL query:
SELECT * FROM R WHERE φ
• A different relational algebra operation called projection corresponds to the SELECT clause. This is a common source of confusion: the SQL keyword SELECT does not denote the RA selection σ.
•More complex selection predicates may be performed using the Boolean connectives:
‣ φ1 ∧ φ2 (“and”), φ1 ∨ φ2 (“or”), ¬φ1 (“not”).
‣Note: σφ1 ∧ φ2 (R) = σφ1 (σφ2 (R)).
•The selection predicate must permit evaluation for each input tuple in isolation.
Thus, exists (∃) and for all (∀) or nested relational algebra queries are not permitted in selection predicates. Actually, such predicates do not add to the expressiveness of RA.
∨ and ¬
Are the Boolean connectives ∨, ¬ strictly needed?
Projection
The projection πL eliminates all attributes (columns) of the input relation but those mentioned in the projection list L.
Projection
Movie
MID  Title                Year  Rating
1    Babel                2006  7
2    Inglorious Bastards  2009  8
3    Wanted               2008  3

πTitle,Year(Movie) =
Title                Year
Babel                2006
Inglorious Bastards  2009
Wanted               2008
• The projection πAi1,...,Aik(R) produces for each input tuple (A1 : d1, ..., An : dn) an output tuple (Ai1 : di1, ..., Aik : dik).
• π may be used to reorder columns.
• “σ discards rows, π discards columns.”
• DB slang: “All attributes not in L are projected away.”
• In general, the cardinalities of the input and output relations are not equal.
Projection eliminates duplicates
Movie
MID  Title           Year  Rating
1    Babel           2006  7
2    Inglorious ...  2009  8
3    Wall-E          2009  3

πYear(Movie) =
Year
2006
2009
• πAi1,...,Aik(R) may be implemented as follows.
“Naive” projection
create a new temporary relation T;
foreach t = (A1 : d1, ..., An : dn) ∈ R do
    u ← (Ai1 : di1, ..., Aik : dik);
    insert u into T;
od
eliminate duplicate tuples in T;
return T;
• The necessary duplicate elimination makes πL one of the more costly operations in RDBMSs. Thus, query optimizers try hard to “prove” that the duplicate elimination step is not necessary.
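A short sketch of the naive projection, assuming relations are sets of tuples and the projection list is given as column positions (an illustrative simplification; the slides use attribute names). Collecting the output in a Python set performs the duplicate elimination step implicitly.

```python
def project(relation, positions):
    """Naive projection: keep only the listed columns; the set removes duplicates."""
    return {tuple(t[i] for i in positions) for t in relation}

movie = {  # (MID, Title, Year, Rating)
    (1, "Babel", 2006, 7),
    (2, "Inglorious ...", 2009, 8),
    (3, "Wall-E", 2009, 3),
}

years = project(movie, [2])             # pi_Year(Movie)
title_first = project(movie, [1, 0])    # reorders columns: (Title, MID)
```

Three input tuples yield only two output tuples for `years`, since 2009 occurs twice.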
• If RA is used to formalize the semantics of SQL, the format of the projection list is often generalized:
‣ Attribute renaming:
  πB1←Ai1,...,Bk←Aik(R)
‣ Computations (e.g., string concatenation via ||) to derive the value in new columns, e.g.,
  πSID,Name←First || ' ' || Last(Producer)
• Such generalized π operators are also referred to as map operators (as in functional programming languages).
• πA1,...,Ak(R) corresponds to the SQL query:
SELECT DISTINCT A1, ..., Ak FROM R
• πB1←Ai1,...,Bk←Aik(R) is equivalent to the SQL query:
SELECT DISTINCT Ai1 [AS] B1, ..., Aik [AS] Bk FROM R
Selection vs. Projection
Selection σ: filters some rows, preserves all columns.
Projection π: preserves all rows (up to duplicate elimination), filters some columns.
Cartesian Product
• In general, queries need to combine information from several tables.
• In RA, such queries are formulated using ×, the Cartesian product.
Cartesian Product
The Cartesian product R × S of two relations R, S is computed by concatenating each tuple t ∈ R with each tuple u ∈ S (◦ denotes tuple concatenation):
R × S = { t ◦ u | t ∈ R, u ∈ S }
Cartesian Product
Actor
AID1  AName  DOB
1     Jolie  4.6.75
2     Pitt   18.12.63

Role
AID2  MID  RName
1     3    Fox
2     1    Jones
2     2    Raine

Actor × Role =
AID1  AName  DOB       AID2  MID  RName
1     Jolie  4.6.75    1     3    Fox
1     Jolie  4.6.75    2     1    Jones
1     Jolie  4.6.75    2     2    Raine
2     Pitt   18.12.63  1     3    Fox
2     Pitt   18.12.63  2     1    Jones
2     Pitt   18.12.63  2     2    Raine
• If t = (A1 : a1, ..., An : an) and u = (B1 : b1, ..., Bm : bm), then t ◦ u = (A1 : a1, ..., An : an, B1 : b1, ..., Bm : bm).
• The Cartesian product can be implemented as follows:
Cartesian Product: Nested Loops
create a new temporary relation T;
foreach t ∈ R do
    foreach u ∈ S do
        insert t ◦ u into T;
    od
od
return T;
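The nested-loops procedure can be sketched in Python as follows, assuming relations are sets of positional tuples so that tuple concatenation t ◦ u becomes `t + u`. The function name `product` is illustrative.

```python
def product(r, s):
    """Cartesian product by nested loops: concatenate every t in R with every u in S."""
    result = set()
    for t in r:
        for u in s:
            result.add(t + u)   # t ◦ u
    return result

actor = {(1, "Jolie", "4.6.75"), (2, "Pitt", "18.12.63")}   # (AID1, AName, DOB)
role = {(1, 3, "Fox"), (2, 1, "Jones"), (2, 2, "Raine")}    # (AID2, MID, RName)

pairs = product(actor, role)
```

With |Actor| = 2 and |Role| = 3 the result holds 2 ∗ 3 = 6 tuples.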
Cartesian Product and Renaming
• Since attribute names must be unique within a tuple, the Cartesian product may only be applied if R, S do not share any attribute names. (This is no real restriction because we have renaming in π.)
• R × S may be computed by the equivalent SQL query (SQL does not impose the unique column name restriction; a column A of relation R may be uniquely identified by R.A):
SELECT * FROM R, S
• In RA, renaming is often formalized by means of a renaming operator ρX(R). If sch(R) = (A1 : D1, ..., An : Dn), then
ρX(R) ≡ πX.A1←A1,...,X.An←An(R)
Join
• The intermediate result generated by a Cartesian product may be quite large in general (|R| = n, |S| = m ⇒ |R × S| = n ∗ m).
• Since the combination of Cartesian product and selection in queries is common, a special operator join has been introduced.
Join
The (theta-)join R ⋈θ S between relations R, S is defined as
R ⋈θ S ≡ σθ(R × S)
The join predicate θ may refer to attribute names of R and S.
ρA(Actor) ⋈A.AID=R.AID ρR(Role) (assuming no keys and foreign keys are defined)
Actor
AID  AName  DOB
1    Jolie  4.6.75
2    Pitt   18.12.63
3    Ford   13.7.42

Role
AID  MID  RName
1    3    Fox
2    1    Jones
2    2    Raine

ρA(Actor) ⋈A.AID=R.AID ρR(Role) =
A.AID  A.AName  A.DOB     R.AID  R.MID  R.RName
1      Jolie    4.6.75    1      3      Fox
2      Pitt     18.12.63  2      1      Jones
2      Pitt     18.12.63  2      2      Raine

Note: actor Ford does not appear in the join result.
• R ⋈θ S can be evaluated by “folding” the procedures for σ and ×:
Nested Loops Join
create a new temporary relation T;
foreach t ∈ R do
    foreach u ∈ S do
        if θ(t ◦ u) then
            insert t ◦ u into T;
        fi
    od
od
return T;
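A minimal Python sketch of the nested loops join, again with relations as sets of positional tuples. As a simplification of this sketch, the join predicate θ receives the two tuples separately instead of the concatenated tuple t ◦ u.

```python
def theta_join(r, s, theta):
    """Nested loops join: emit t ◦ u only for pairs that satisfy theta."""
    result = set()
    for t in r:
        for u in s:
            if theta(t, u):
                result.add(t + u)
    return result

actor = {(1, "Jolie", "4.6.75"), (2, "Pitt", "18.12.63"), (3, "Ford", "13.7.42")}
role = {(1, 3, "Fox"), (2, 1, "Jones"), (2, 2, "Raine")}

# Join predicate A.AID = R.AID.
joined = theta_join(actor, role, lambda a, r: a[0] == r[0])
```

The filter is applied inside the loops, so the full Cartesian product is never materialized, although every pair is still inspected.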
• Join combines tuples from two relations and acts like a filter: tuples without join partner are removed.
• Note: if the join is used to follow a foreign key relationship (and the foreign key is not null), then no tuples of the referencing relation are filtered.
• There are join variants which act like filters only: left and right semijoin (⋉, ⋊):
R ⋉θ S ≡ πsch(R)(R ⋈θ S),
or do not filter at all: outer join (see below).
Natural Join
• The natural join provides another useful abbreviation (“RA macro”).
• In the natural join R ⋈ S, the join predicate θ is defined to be a conjunctive equality comparison of attributes sharing the same name in R, S.
• Natural join handles the necessary attribute renaming and projection.
Natural Join
Assume R(A, B, C) and S(B, C, D). Then:
R ⋈ S = πA,B,C,D(σB=B′ ∧ C=C′(R × πB′←B,C′←C,D(S)))
(Note: shared columns occur once in the result.)
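To make the role of attribute names visible, the following sketch represents each tuple as a Python dict from attribute name to value (an assumption of this sketch; the sample values are made up for illustration). The shared attributes are found by intersecting the key sets, and merging the dicts keeps each shared column once.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts (attribute -> value)."""
    result = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()              # attributes with the same name
            if all(t[a] == u[a] for a in shared):     # conjunctive equality predicate
                joined = {**t, **u}                   # shared columns occur once
                if joined not in result:              # keep set semantics
                    result.append(joined)
    return result

# Hypothetical sample data for R(A, B, C) and S(B, C, D).
R = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 5, "C": 6}]
S = [{"B": 2, "C": 3, "D": 7}, {"B": 5, "C": 0, "D": 8}]

joined = natural_join(R, S)
```

The second tuple of R agrees with the second tuple of S on B but not on C, so it finds no partner.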
Joins in SQL
• In SQL, R ⋈θ S is normally written using one of the following notations:
‣ Classic notation: SELECT * FROM R, S WHERE θ
‣ SQL-92 notation: SELECT * FROM R JOIN S ON θ
• Note: the classic notation is exactly the SQL equivalent of σθ(R × S) we have seen before.
SQL is a declarative language: it is the task of the SQL optimizer to infer that this query may be evaluated using a join instead of a Cartesian product.
Algebraic Laws
• The join satisfies the associativity condition
(R ⋈ S) ⋈ T ≡ R ⋈ (S ⋈ T).
In “join chains”, parentheses are thus superfluous:
R ⋈ S ⋈ T.
• Join is not commutative unless it is followed by a projection, i.e., a column reordering:
πL(R ⋈ S) ≡ πL(S ⋈ R).
• A significant number of further algebraic laws hold, which are heavily utilized by the query optimizer.
• Example: selection push-down. If predicate φ refers to attributes in S only, then
σφ(R ⋈ S) ≡ R ⋈ σφ(S).
• (Such efficiency considerations are the subject of “Datenbanken II.”)
Selection push-down
Why is selection push-down considered one of the most significant algebraic optimizations?
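The push-down law can be checked on the sample data with a small sketch. It assumes relations are sets of positional tuples; the predicate φ (Role.MID = 3) refers to Role only. The helper `join` and the pair counter are illustrative additions, not part of the slides.

```python
actor = {(1, "Jolie", "4.6.75"), (2, "Pitt", "18.12.63")}             # (AID, Name, DOB)
role = {(1, 3, "Fox"), (2, 1, "Richard Jones"), (2, 2, "Lt. Aldo Raine")}  # (AID, MID, Name)

def join(r, s):
    """Equi-join on AID; also reports how many tuple pairs were inspected."""
    out = {t + u for t in r for u in s if t[0] == u[0]}
    return out, len(r) * len(s)

# sigma_phi(Actor ⋈ Role): filter after the join (MID is column 4 of the joined tuple).
late_join, late_pairs = join(actor, role)
late = {x for x in late_join if x[4] == 3}

# Actor ⋈ sigma_phi(Role): filter Role first, then join.
early, early_pairs = join(actor, {u for u in role if u[1] == 3})
```

Both plans return the same relation, but the pushed-down plan inspects fewer tuple pairs because the join input is smaller, which hints at the answer to the question above.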
A Common Query Pattern
• The following operator tree structure is very common:
[Figure: operator tree with πA1,...,Ak at the root, σφ below it, and below that a chain of joins ⋈θ1, ..., ⋈θn−1 over the relations R1, ..., Rn]
1. Join all tables needed to answer the query,
2. Select the relevant tuples,
3. Project away all irrelevant columns.
• The select-project-join query
πA1,...,Ak(σφ(R1 ⋈θ1 R2 ⋈θ2 ··· ⋈θn−1 Rn))
has the obvious SQL equivalent
SELECT DISTINCT A1, ..., Ak
FROM R1, ..., Rn
WHERE φ AND θ1 AND ··· AND θn−1
• It is a common source of errors to forget a join condition: think of the scenario R(A, B), S(B, C), T(C, D) when attributes A, D are relevant for the query output.
Relational Algebra Quiz (Level: Novice)
Formulate equivalent queries in RA
Print all Movie titles produced after 2000 that have a high rating (equal or above 8).
Print all Movie titles and Role names where Ford plays a role.
Print all Movie titles and years of movies that appeared in 2010 featuring young actors (born after 1990).
Self Joins
Sometimes it is necessary to refer to more than one tuple of the same relation at the same time.
•Example: “Which movies are remakes of movie A? These movies are identified by equal title but with a higher year”.
•To answer this query, we need to compare two tuples t, u of the relation Movie:
1.tuple t corresponding to a movie with title A,
2. tuple u corresponding to another movie, for which u.Year > t.Year ∧ t.Title = u.Title.
• This requires a generalization of the select-project-join query pattern, in which two instances of the same relation are joined (the attributes in at least one instance must be renamed first):
πY.MID(ρX(Movie) ⋈X.Title = Y.Title ∧ X.Year < Y.Year ρY(Movie))
(Here X ranges over the original movies and Y over their remakes.)
• Such joins are commonly referred to as self joins.
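A self join can be sketched by iterating over the same relation twice, with the loop variables `x` and `y` playing the role of the renamed instances X and Y. The sample Movie table contains no remakes, so the last row below is hypothetical data added only for this illustration.

```python
movie = {  # (MID, Title, Year, Rating)
    (1, "Babel", 2006, 7),
    (3, "Wanted", 2008, 3),
    (4, "Wanted", 2015, 5),   # hypothetical remake, not part of the sample database
}

# Pair every movie x with every later movie y of the same title.
remake_pairs = {
    (x[0], y[0])                                  # (X.MID, Y.MID)
    for x in movie
    for y in movie
    if x[1] == y[1] and x[2] < y[2]               # X.Title = Y.Title ∧ X.Year < Y.Year
}

remakes = {y_mid for (_, y_mid) in remake_pairs}  # projection on Y.MID
```

Because of the strict comparison on Year, no movie is paired with itself.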
Outer Join
• Join (⋈) eliminates tuples without a partner:
R
A   B
a1  b1
a2  b2

S
B   C
b2  c2
b3  c3

R ⋈ S =
A   B   C
a2  b2  c2
• The left outer join (⟕) preserves all tuples in its left argument, even if a tuple does not team up with a partner in the join:
R ⟕ S =
A   B   C
a1  b1  (null)
a2  b2  c2
• The right outer join (⟖) preserves all tuples in its right argument:
R ⟖ S =
A       B   C
a2      b2  c2
(null)  b3  c3
• The full outer join (⟗) preserves all tuples in both arguments:
R ⟗ S =
A       B   C
a1      b1  (null)
a2      b2  c2
(null)  b3  c3
R ⟕θ S
create a new temporary relation T;
foreach t ∈ R do
    haspartner ← false;
    foreach u ∈ S do
        if θ(t ◦ u) then
            insert t ◦ u into T;
            haspartner ← true;
        fi
    od
    if ¬haspartner then
        insert t ◦ (null, ..., null) into T;
    fi
od
return T;
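The left outer join procedure can be sketched in Python as follows. Assumptions of this sketch: relations are sets of positional tuples, `None` stands in for null, θ receives the two tuples separately, and the width of S is passed explicitly so that the padding has the right length.

```python
def left_outer_join(r, s, theta, s_width):
    """Nested loops left outer join; unmatched tuples of R are padded with None."""
    result = set()
    for t in r:
        has_partner = False
        for u in s:
            if theta(t, u):
                result.add(t + u)
                has_partner = True
        if not has_partner:
            result.add(t + (None,) * s_width)   # t ◦ (null, ..., null)
    return result

R = {("a1", "b1"), ("a2", "b2")}   # (A, B)
S = {("b2", "c2"), ("b3", "c3")}   # (B, C)

out = left_outer_join(R, S, lambda t, u: t[1] == u[0], s_width=2)
```

Since this is a theta-join, both B columns are kept in the output; the natural outer join shown in the tables would merge them into one.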
Outer Join Example
Prepare a full list of movies (ids and titles suffice) and their associated role names, including those movies that have no associated roles (e.g., the documentary (‘Planet Earth’, 2000, 5)).
πMID,Title,Name(Movie ⟕MID = MID′ πMID′←MID,Name(Role))
Movie
MID  Title         Year  Rating
3    Wanted        2008  3
4    Planet Earth  2000  5

Role
AID  MID  Name
1    3    Fox
10   3    Sloan

Query result
MID  Title         Name
3    Wanted        Fox
3    Wanted        Sloan
4    Planet Earth  null
Set Operations
• Relations are sets (of tuples). The “usual” family of binary set operations can also be applied to relations.
• Both input relations must have the same schema.
Set operations
The set operations of relational algebra are R ∪ S, R ∩ S, and R − S (union, intersection, difference).
A minimal set of operations
Which of these set operations is redundant (i.e., may be derived using an alternative RA expression, just like join)?
[Figure: Venn diagrams of R, S, R ∪ S, R ∩ S, R − S, and S − R]
• R ∪ S may be implemented as follows:
Union
create a new temporary relation T;
foreach t ∈ R do
    insert t into T;
od
foreach t ∈ S do
    insert t into T;
od
remove duplicates in T;
return T;
• R − S may be implemented as follows:
Difference
create a new temporary relation T;
foreach t ∈ R do
    remove ← false;
    foreach u ∈ S do
        remove ← remove ∨ (t = u);
    od
    if ¬remove then
        insert t into T;
    fi
od
return T;
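A direct Python transcription of the difference procedure, assuming relations are sets of tuples with identical schemas. The sample relations are made up for illustration; Python's built-in set difference is used only as a cross-check.

```python
def difference(r, s):
    """Naive set difference R − S by nested loops."""
    result = set()
    for t in r:
        remove = False
        for u in s:
            remove = remove or (t == u)   # remove ← remove ∨ (t = u)
        if not remove:
            result.add(t)
    return result

# Hypothetical single-column relations.
R = {(1,), (2,), (3,)}
S = {(2,), (4,)}

only_in_r = difference(R, S)
builtin = R - S   # Python's own set difference, for comparison
```

The inner loop always scans all of S; stopping at the first match would be an easy refinement.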
Union
• In RA queries, a typical application for union is case analysis.
Case analysis using union
The following query assigns movie categories based on ratings.
  πMID,Category←‘Favorite movies’(σRating >= 9(Movie))
∪ πMID,Category←‘Good movies’(σRating >= 7 ∧ Rating < 9(Movie))
∪ πMID,Category←‘Average movies’(σRating >= 5 ∧ Rating < 7(Movie))
• In SQL, ∪ is directly supported: keyword UNION. UNION may be placed between two SELECT-FROM-WHERE blocks:
SQL’s UNION
SELECT MID, 'Favorite Movies' AS Category
FROM Movie
WHERE Rating >= 9
UNION
SELECT MID, 'Good Movies' AS Category
FROM Movie
WHERE Rating >= 7 AND Rating < 9
UNION
SELECT MID, 'Average Movies' AS Category
FROM Movie
WHERE Rating >= 5 AND Rating < 7
Set Difference
• Note: the RA operators σ, π, ×, ⋈, ∪ are monotonic by definition, e.g.:
R ⊆ S ⟹ σφ(R) ⊆ σφ(S).
• Then it follows that every query Q that exclusively uses the above operators behaves monotonically:
‣ Let I1 be a database state, and let I2 = I1 ∪ {t} (database state after insertion of tuple t).
‣ Then every tuple u contained in the answer to Q in state I1 is also contained in the answer to Q in state I2.
Database insertion never invalidates a correct answer.
• If we pose non-monotonic queries, e.g.,
‣ “Which actor has not played any role?”
‣ “What movies featuring Actor1 have the highest rating?”
then it is obvious that σ, π, ×, ⋈, ∪ are not sufficient to formulate the query. Such queries require set difference (−).
A non-monotonic query
“Which actor has not played any role? (Print name and date of birth)”
(Example database tables are repeated on the next slide.)
Example Database
Sample Movie Database Tables

Actor
AID  Name   DOB
1    Jolie  4.6.75
2    Pitt   18.12.63
3    Depp   9.6.1963

Movie
MID  Title                Year  Rating
1    Babel                2006  7
2    Inglorious Bastards  2009  8
3    Wanted               2008  3

Role
AID  MID  Name
1    3    Fox
2    1    Richard Jones
2    2    Lt. Aldo Raine
A correct solution?
πName,DOB(Actor ⋈AID ≠ AID2 πAID2←AID(Role))
A correct solution?
πAID,Name,DOB(Actor − πAID(Role))
Correct solution!
πName,DOB(Actor ⋈ (πAID(Actor) − πAID(Role)))
Foundations of Databases | Summer term 2010 | Melanie Herschel | University of Tübingen
Set Difference
57
•A typical RA query pattern involving set difference is the anti-join.
•Given R(A, B) and S(B, C), retrieve the tuples of R that do not have a (natural) join partner in S (note: sch(R) ∩ sch(S) = {B}):
R ⋈ (πB(R) − πB(S))
(The following is equivalent: R − πsch(R)(R ⋈ S).)
•There is no common symbol for this anti-join, but the complemented semi-join symbol (⋉ with an overline) seems appropriate.
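The two anti-join formulations can be compared on a small instance. The following is a minimal sketch in Python, assuming relations are modeled as sets of tuples; the function names and the sample data are made up for this illustration and are not part of the lecture.

```python
# Relations as Python sets of tuples: R(A, B) and S(B, C).
# The sample data is hypothetical and only serves the illustration.
R = {("a1", 1), ("a2", 2), ("a3", 3)}
S = {(1, "c1"), (3, "c3")}


def anti_join_via_projection(r, s):
    """R ⋈ (π_B(R) − π_B(S)): keep R-tuples whose B value never occurs in S."""
    b_r = {b for (_, b) in r}          # π_B(R)
    b_s = {b for (b, _) in s}          # π_B(S)
    unmatched = b_r - b_s              # π_B(R) − π_B(S)
    return {(a, b) for (a, b) in r if b in unmatched}  # natural join with R


def anti_join_via_difference(r, s):
    """R − π_sch(R)(R ⋈ S): remove every R-tuple that has a join partner."""
    joined = {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}  # R ⋈ S
    with_partner = {(a, b) for (a, b, _) in joined}                    # π_{A,B}
    return r - with_partner


result = anti_join_via_projection(R, S)
assert result == anti_join_via_difference(R, S)
print(result)  # {('a2', 2)}
```

Only the tuple with B = 2 survives, because 2 is the only B value of R that does not occur in S.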
Set Difference
•Suppose that the relations R, S have been computed as:
‣R := SELECT A1,...,An FROM R1,...,Rm WHERE φ1
‣S := SELECT B1,...,Bn FROM S1,...,Sk WHERE φ2
Set difference R − S in SQL
SELECT A1,...,An
FROM R1,...,Rm
WHERE φ1 AND NOT EXISTS
  (SELECT *
   FROM S1,...,Sk
   WHERE φ2 AND B1=A1 AND ··· AND Bn=An)
NOT EXISTS evaluates to TRUE if the subquery in parentheses returns 0 tuples.
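The NOT EXISTS pattern can be tried out with Python's built-in sqlite3 module. This is only a sketch under assumptions: the tables follow the sample movie database (DOB omitted for brevity), the third actor 'Doe' is invented so that the difference is non-empty, and the query computes the actors that have no role.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Actor (AID INTEGER, Name TEXT);
    CREATE TABLE Role  (AID INTEGER, MID INTEGER, Name TEXT);
    -- 'Doe' is a made-up actor without any role
    INSERT INTO Actor VALUES (1, 'Jolie'), (2, 'Pitt'), (3, 'Doe');
    INSERT INTO Role  VALUES (1, 3, 'Fox'), (2, 1, 'Richard Jones'),
                             (2, 2, 'Lt. Aldo Raine');
""")

# First relation:  SELECT AID FROM Actor
# Second relation: SELECT AID FROM Role
# Set difference written with NOT EXISTS, as in the pattern above.
not_exists = con.execute("""
    SELECT A.AID FROM Actor A
    WHERE NOT EXISTS (SELECT * FROM Role Ro WHERE Ro.AID = A.AID)
""").fetchall()

# SQL's built-in set difference operator, for comparison.
except_rows = con.execute(
    "SELECT AID FROM Actor EXCEPT SELECT AID FROM Role"
).fetchall()

assert sorted(not_exists) == sorted(except_rows)
print(not_exists)  # [(3,)]
```

One difference to keep in mind: EXCEPT eliminates duplicates, whereas the NOT EXISTS query keeps duplicates produced by the outer query unless SELECT DISTINCT is used.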
Set Difference
•Note that the availability of ∪, − (and ∩) renders complex selection predicates superfluous:
Predicate simplification rules
σφ1∧φ2(Q) = σφ1(Q) ∩ σφ2(Q)
σφ1∨φ2(Q) = σφ1(Q) ∪ σφ2(Q)
σ¬φ(Q) = Q − σφ(Q)
RDBMS implement complex selection predicates anyway. Why?
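The three simplification rules can be checked on a concrete relation. The sketch below is illustrative only: Q reuses the Movie rows without the title, and the predicates phi1 and phi2 as well as the helper select are assumptions made for this example.

```python
# Q is a small relation over (MID, Year, Rating), taken from the Movie table.
Q = {(1, 2006, 7), (2, 2009, 8), (3, 2008, 3)}


def select(pred, rel):
    """σ_pred(rel): keep the tuples that satisfy the predicate."""
    return {t for t in rel if pred(t)}


def phi1(t):
    return t[2] > 5          # Rating > 5 (hypothetical predicate)


def phi2(t):
    return t[1] >= 2008      # Year >= 2008 (hypothetical predicate)


# σ_{φ1 ∧ φ2}(Q) = σ_{φ1}(Q) ∩ σ_{φ2}(Q)
assert select(lambda t: phi1(t) and phi2(t), Q) == select(phi1, Q) & select(phi2, Q)
# σ_{φ1 ∨ φ2}(Q) = σ_{φ1}(Q) ∪ σ_{φ2}(Q)
assert select(lambda t: phi1(t) or phi2(t), Q) == select(phi1, Q) | select(phi2, Q)
# σ_{¬φ1}(Q) = Q − σ_{φ1}(Q)
assert select(lambda t: not phi1(t), Q) == Q - select(phi1, Q)
```

Note that each right-hand side scans Q twice, while the complex predicate on the left needs only one pass over Q.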
Relational Algebra Quiz (Level: Intermediate)
Formulate equivalent queries in RA
Among the movies starring the actor with AID 1, which have the highest rating (print MID)?
Relational Algebra Quiz (Level: Intermediate)
Table pivot
Below are two alternatives to represent award recipients per category. Find RA expressions to transform between the two representations.
Award  Year  BestMovie      BestActor
Oscar  2010  Great Movie    John Doe
Oscar  2009  Fabulous Film  Mr. Smith
Oscar  2008  Go see it!     Invisible Man
Award_1

Award  Year  Category    Winner
Oscar  2010  Best Movie  Great Movie
Oscar  2009  Best Movie  Fabulous Film
Oscar  2008  Best Movie  Go see it!
Oscar  2010  Best Actor  John Doe
Oscar  2009  Best Actor  Mr. Smith
Oscar  2008  Best Actor  Invisible Man
Award_2
Summary
The five basic operations of relational algebra are:
1. Selection
2. Projection
3. Cartesian Product
4. Union
5. Difference
•Derived (and thus redundant) operations: Theta-Join, Natural Join, Semi-Join, Renaming, and Intersection.
•Extensions to the basic relational algebra: left outer join, right outer join, full outer join.
Combining Operations
•Since the result of any relational algebra operation is again a relation, this intermediate result may be the input of a subsequent RA operation.
Example: πTitle,Name(Role ⋈ σRating>5(Movie))
•We can think of the intermediate result as being stored in a named temporary relation (or as a macro definition):
HighRatings := σRating>5(Movie);
MR := Role ⋈ HighRatings;
πTitle,Name(MR);
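The closure property (every operator returns a relation) is what makes both the nested and the step-by-step formulation possible. The following Python sketch illustrates this on the sample database; the helpers dedupe, select, project, and natural_join are hypothetical names chosen for this example, and rows are modeled as dictionaries.

```python
def dedupe(rows):
    """Relations are sets: drop duplicate rows (rows are dicts)."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def select(pred, rel):
    """σ_pred(rel)"""
    return [row for row in rel if pred(row)]


def project(attrs, rel):
    """π_attrs(rel), with duplicate elimination."""
    return dedupe([{a: row[a] for a in attrs} for row in rel])


def natural_join(left, right):
    """left ⋈ right: combine rows that agree on all common attributes."""
    out = []
    for lrow in left:
        for rrow in right:
            common = lrow.keys() & rrow.keys()
            if all(lrow[a] == rrow[a] for a in common):
                out.append({**lrow, **rrow})
    return dedupe(out)


Movie = [
    {"MID": 1, "Title": "Babel", "Year": 2006, "Rating": 7},
    {"MID": 2, "Title": "Inglorious Bastards", "Year": 2009, "Rating": 8},
    {"MID": 3, "Title": "Wanted", "Year": 2008, "Rating": 3},
]
Role = [
    {"AID": 1, "MID": 3, "Name": "Fox"},
    {"AID": 2, "MID": 1, "Name": "Richard Jones"},
    {"AID": 2, "MID": 2, "Name": "Lt. Aldo Raine"},
]

# Nested form: π_{Title,Name}(Role ⋈ σ_{Rating>5}(Movie))
nested = project(["Title", "Name"],
                 natural_join(Role, select(lambda m: m["Rating"] > 5, Movie)))

# Step-by-step form with named temporary relations.
HighRatings = select(lambda m: m["Rating"] > 5, Movie)
MR = natural_join(Role, HighRatings)
result = project(["Title", "Name"], MR)

assert nested == result
```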
Combining Operations
•Composite RA expressions are typically depicted as operator trees:
      πTitle,Name
           |
           ⋈
          / \
      Role   σRating>5
                 |
               Movie
•In these trees, computation proceeds bottom-up. The evaluation order of sibling branches is not predetermined.
Combining Operations
•SQL-92 permits the nesting of queries (the result of a SQL query may be used in place of a relation name):
Nested SQL Query
SELECT DISTINCT Title, Name
FROM (SELECT *
      FROM Role,
           (SELECT *
            FROM Movie
            WHERE Rating > 5) AS HighRatings
      WHERE Role.MID = HighRatings.MID) AS MR
•Note that this is not the typical style of SQL querying!
Combining Operations
•Instead, a single SQL query is equivalent to an RA operator tree containing σ, π, and (multiple) × (see below):
SELECT-FROM-WHERE Block
SELECT DISTINCT Title, Name
FROM Role, Movie
WHERE Movie.MID = Role.MID AND Rating > 5
•Really complex queries may be constructed step by step (using SQL's view mechanism); a view may then be used like a relation:
SQL View Definition
CREATE VIEW MR AS
  SELECT *
  FROM Role, Movie
  WHERE Movie.MID = Role.MID AND Rating > 5
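The flat query, the nested query, and the view-based query can be compared with Python's built-in sqlite3 module. This is a sketch under assumptions: the data is the sample movie database, the inner alias MR2 is made up, and explicit column lists replace SELECT * in the view and the inner query, because SELECT * over Role and Movie yields two columns named MID, which some systems reject in a view definition.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Movie (MID INTEGER, Title TEXT, Year INTEGER, Rating INTEGER);
    CREATE TABLE Role  (AID INTEGER, MID INTEGER, Name TEXT);
    INSERT INTO Movie VALUES (1, 'Babel', 2006, 7),
                             (2, 'Inglorious Bastards', 2009, 8),
                             (3, 'Wanted', 2008, 3);
    INSERT INTO Role VALUES (1, 3, 'Fox'), (2, 1, 'Richard Jones'),
                            (2, 2, 'Lt. Aldo Raine');
    CREATE VIEW MR AS
        SELECT Role.AID AS AID, Role.MID AS MID, Role.Name AS Name,
               Movie.Title AS Title, Movie.Year AS Year, Movie.Rating AS Rating
        FROM Role, Movie
        WHERE Movie.MID = Role.MID AND Rating > 5;
""")

# Single SELECT-FROM-WHERE block.
flat = con.execute("""
    SELECT DISTINCT Title, Name
    FROM Role, Movie
    WHERE Movie.MID = Role.MID AND Rating > 5
""").fetchall()

# Nested formulation that mirrors the RA operator tree.
nested = con.execute("""
    SELECT DISTINCT Title, Name
    FROM (SELECT Role.Name AS Name, HighRatings.Title AS Title
          FROM Role,
               (SELECT * FROM Movie WHERE Rating > 5) AS HighRatings
          WHERE Role.MID = HighRatings.MID) AS MR2
""").fetchall()

# Query on top of the view, which is used like a relation.
via_view = con.execute("SELECT DISTINCT Title, Name FROM MR").fetchall()

assert sorted(flat) == sorted(nested) == sorted(via_view)
```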
Syntax
•Let the following be given:
‣A set 𝒟 of data type names and, for each D ∈ 𝒟, a set val(D) of values.
‣A set 𝒜 of valid attribute names (identifiers).
Relational database schema
A relational database schema S consists of
•a finite set of relation names ℛ, and
•for every R ∈ ℛ a relation schema sch(R).
(We will ignore constraints here.)
Syntax
•The set of syntactically correct RA expressions or queries is defined recursively, together with the resulting schema of each expression.
Syntax of RA (base cases)
1. R (relation name): For every R ∈ ℛ, R is an RA expression with schema sch(R).
2. {(A1:d1,...,An:dn)} (relation constant): A relation constant is an RA expression if A1,...,An ∈ 𝒜 and di ∈ val(Di) for 1 ≤ i ≤ n with D1,...,Dn ∈ 𝒟. The schema of this expression is (A1:D1,...,An:Dn).
Syntax
•Let Q be an RA expression with schema s = (A1:D1, ..., An:Dn).
Syntax of RA (recursive cases)
3. σAi = Aj (Q) for i, j ∈ {1,...,n} is an RA expression with schema s.
4. σAi = d (Q) for i ∈ {1,...,n} and d ∈ val(Di) is an RA expression with schema s.
5. πB1←Ai1,...,Bm←Aim (Q) for i1,...,im ∈ {1,...,n} and B1,...,Bm ∈ 𝒜 such that Bj ≠ Bk for j ≠ k is an RA expression with schema (B1:Di1,...,Bm:Dim).
Syntax
•Let Q1, Q2 be RA expressions with the same schema s.
Syntax of RA (recursive cases)
6. Q1 ∪ Q2 and
7. Q1 − Q2
are RA expressions with schema s.
•Let Q1, Q2 be RA expressions with schemas (A1:D1,...,An:Dn) and (B1:E1,...,Bm:Em), respectively.
Syntax of RA (recursive cases)
8. Q1 × Q2 is an RA expression with schema (A1:D1,...,An:Dn, B1:E1,...,Bm:Em) if {A1,...,An} ∩ {B1,...,Bm} = ∅.
Semantics
The result of a query Q, i.e., an RA expression, in a database state I is a relation. This relation is denoted by I[Q] and defined recursively corresponding to the syntactic structure of Q.
Database state
A database state I (instance) defines a relation I(R) for every relation name R in the database schema S.
Semantics
I[Q]
•If Q is a relation name R, then I[Q] := I(R).
•If Q is a constant relation {(A1:d1,...,An:dn)}, then I[Q] := {(d1,...,dn)}.
•If Q has the form σAi=Aj(Q1), then I[Q] := {(d1,...,dn) ∈ I[Q1] | di = dj}.
•If Q has the form σAi=d(Q1), then I[Q] := {(d1,...,dn) ∈ I[Q1] | di = d}.
•If Q has the form πB1←Ai1,...,Bm←Aim(Q1), then I[Q] := {(di1,...,dim) | (d1,...,dn) ∈ I[Q1]}.
Semantics
I[Q] (continued)
•If Q has the form Q1 ∪ Q2, then I[Q] := I[Q1] ∪ I[Q2].
•If Q has the form Q1 − Q2, then I[Q] := I[Q1] − I[Q2].
•If Q has the form Q1 × Q2, then I[Q] := {(d1,...,dn, e1,...,em) | (d1,...,dn) ∈ I[Q1], (e1,...,em) ∈ I[Q2]}.
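The recursive definition of I[Q] translates almost literally into a recursive evaluator. The following Python sketch is one possible encoding and not part of the lecture: expressions are nested tuples with made-up tags ("rel", "const", "sel_attr", "sel_const", "proj", "union", "diff", "prod"), tuples are positional as in the definition, and data types are ignored (only attribute names are tracked).

```python
def schema(q, sch):
    """Attribute names of expression q; sch maps relation names to schemas."""
    op = q[0]
    if op == "rel":
        return sch[q[1]]
    if op == "const":
        return [a for (a, _) in q[1]]
    if op in ("sel_attr", "sel_const"):
        return schema(q[3], sch)
    if op == "proj":
        return [new for (new, _) in q[1]]
    if op in ("union", "diff"):
        return schema(q[1], sch)
    if op == "prod":
        return schema(q[1], sch) + schema(q[2], sch)
    raise ValueError(op)


def evaluate(q, I, sch):
    """I[Q]: evaluate q in state I (dict: relation name -> set of tuples)."""
    op = q[0]
    if op == "rel":                      # relation name R
        return set(I[q[1]])
    if op == "const":                    # {(A1:d1, ..., An:dn)}
        return {tuple(d for (_, d) in q[1])}
    if op == "sel_attr":                 # σ_{Ai=Aj}(Q1)
        s = schema(q[3], sch)
        i, j = s.index(q[1]), s.index(q[2])
        return {t for t in evaluate(q[3], I, sch) if t[i] == t[j]}
    if op == "sel_const":                # σ_{Ai=d}(Q1)
        i = schema(q[3], sch).index(q[1])
        return {t for t in evaluate(q[3], I, sch) if t[i] == q[2]}
    if op == "proj":                     # π_{B1←Ai1,...,Bm←Aim}(Q1)
        s = schema(q[2], sch)
        idx = [s.index(old) for (_, old) in q[1]]
        return {tuple(t[k] for k in idx) for t in evaluate(q[2], I, sch)}
    if op == "union":                    # Q1 ∪ Q2
        return evaluate(q[1], I, sch) | evaluate(q[2], I, sch)
    if op == "diff":                     # Q1 − Q2
        return evaluate(q[1], I, sch) - evaluate(q[2], I, sch)
    if op == "prod":                     # Q1 × Q2
        return {t + u
                for t in evaluate(q[1], I, sch)
                for u in evaluate(q[2], I, sch)}
    raise ValueError(op)


sch = {"Role": ["AID", "MID", "Name"],
       "Movie": ["MID", "Title", "Year", "Rating"]}
I = {"Role": {(1, 3, "Fox"), (2, 1, "Richard Jones"), (2, 2, "Lt. Aldo Raine")},
     "Movie": {(1, "Babel", 2006, 7), (2, "Inglorious Bastards", 2009, 8),
               (3, "Wanted", 2008, 3)}}

# π_{Title,Name}(σ_{MID=MID2}(Role × π_{MID2←MID,Title←Title}(Movie)))
q = ("proj", [("Title", "Title"), ("Name", "Name")],
     ("sel_attr", "MID", "MID2",
      ("prod", ("rel", "Role"),
       ("proj", [("MID2", "MID"), ("Title", "Title")], ("rel", "Movie")))))

answer = evaluate(q, I, sch)
```

The renaming projection on Movie is needed because × requires disjoint attribute names; the join itself is expressed with the core operators only.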
Monotonicity
Smaller database state
A database state I1 is smaller than (or equal to) a database state I2, written I1 ⊆ I2, iff I1(R) ⊆ I2(R) for all relation names R ∈ ℛ of schema S.
Theorem: RA \ {−} is monotonic
If an RA expression Q does not contain the − (set difference) operator, then the following holds for all database states I1, I2:
I1 ⊆ I2 ⇒ I1[Q] ⊆ I2[Q].
Formulate a proof by induction on the syntactic structure of Q (“structural induction”).
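A two-state example shows why the theorem has to exclude set difference. The sketch below uses hypothetical states in which each relation is reduced to a set of AID values; the function names are made up for this illustration.

```python
# Two database states with I1 ⊆ I2 (hypothetical data; AID values only).
I1 = {"Actor": {1, 2, 3}, "Role": {1, 2}}
I2 = {"Actor": {1, 2, 3}, "Role": {1, 2, 3}}


def union_query(I):
    return I["Actor"] | I["Role"]      # Actor ∪ Role, no difference operator


def difference_query(I):
    return I["Actor"] - I["Role"]      # Actor − Role


assert all(I1[r] <= I2[r] for r in I1)                    # I1 ⊆ I2
assert union_query(I1) <= union_query(I2)                  # monotonic
assert not difference_query(I1) <= difference_query(I2)    # {3} ⊄ {}
```

Adding the tuple 3 to Role makes the database state larger, yet it removes 3 from the result of the difference query.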
Equivalence
Equivalence of RA expressions
Two RA expressions Q1 and Q2 are equivalent iff they have the same (result) schema and for all database states I, the following holds:
I[Q1] = I[Q2]
•Examples:
‣σφ1(σφ2(Q)) = σφ2(σφ1(Q))
‣(Q1 × Q2) × Q3 = Q1 × (Q2 × Q3)
‣If A is an attribute in the result schema of Q1, then σA=d(Q1 × Q2) = (σA=d(Q1)) × Q2.
•Theorem: The equivalence of (arbitrary) relational algebra expressions is undecidable.
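Because equivalence is undecidable in general, running both sides on sample instances can refute an alleged equivalence but never prove it. The following Python sketch checks two of the example equivalences on one instance, assuming that the binary operator in the examples is the Cartesian product ×; the relations and helper names are made up.

```python
from itertools import product

Q1 = {(1, "x"), (2, "y")}        # schema (A, B), hypothetical data
Q2 = {("p",), ("q",)}            # schema (C)
Q3 = {(True,), (False,)}         # schema (D)


def cross(r, s):
    """Cartesian product r × s of two sets of tuples."""
    return {t + u for t, u in product(r, s)}


def select_a(d, rel):
    """σ_{A=d}(rel), where A is the first column."""
    return {t for t in rel if t[0] == d}


# Selection pushdown: σ_{A=d}(Q1 × Q2) = σ_{A=d}(Q1) × Q2
lhs = select_a(1, cross(Q1, Q2))
rhs = cross(select_a(1, Q1), Q2)
assert lhs == rhs

# Associativity: (Q1 × Q2) × Q3 = Q1 × (Q2 × Q3)
assert cross(cross(Q1, Q2), Q3) == cross(Q1, cross(Q2, Q3))
```

The pushed-down version builds a product of 1 × 2 tuples instead of filtering a product of 2 × 2 tuples, which is the reason query optimizers apply this rewrite.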
Limitations of RA
• Let R be a relation name and assume sch(R) = (A:D, B:D), i.e., both columns share the same data type D. Let val(D) be infinite.
•The transitive closure of I(R) is the set of all (d, e) ∈ val(D) × val(D) such that there are n ∈ ℕ, n ≥ 1, and d0, d1,...,dn ∈ val(D) with d = d0, e = dn, and (di−1, di) ∈ I(R) for i = 1,...,n.
Example of transitive closure
from  to
a     b
b     c
c     d
R
[Figure: directed graph with nodes a, b, c, d]
Limitations of RA
•Theorem: There is no RA expression Q such that I[Q] is the transitive closure of I(R) for all database states I.
•An n-fold self-join will find all paths in the graph of length n + 1. To compute the transitive closure for arbitrary graphs, i.e., for all database states I, is impossible in RA.
Example of transitive closure
In the directed graph example, one self-join (of R with itself) is needed to follow the edges in the graph:
πS.from,T.to(ρS(R) ⋈S.to=T.from ρT(R))
from  to
a     b
b     c
c     d
R
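The contrast between a single self-join and the full transitive closure can be made concrete in a general-purpose language. The Python sketch below is an illustration with made-up function names: self_join mirrors the RA expression above, and transitive_closure repeats it in a loop whose number of iterations depends on the data.

```python
R = {("a", "b"), ("b", "c"), ("c", "d")}


def self_join(paths, edges):
    """π_{S.from,T.to}(ρ_S(paths) ⋈_{S.to=T.from} ρ_T(edges))."""
    return {(s_from, t_to)
            for (s_from, s_to) in paths
            for (t_from, t_to) in edges
            if s_to == t_from}


def transitive_closure(edges):
    """Repeat the self-join until no new pairs appear (a fixpoint loop).

    The number of iterations depends on the data, which is what a single
    fixed RA expression cannot express.
    """
    closure = set(edges)
    while True:
        new_pairs = self_join(closure, edges) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


print(sorted(self_join(R, R)))
# [('a', 'c'), ('b', 'd')]
print(sorted(transitive_closure(R)))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```

The loop always terminates because the closure can contain at most one pair per combination of values occurring in the finite input relation.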
Limitations of RA
•This of course implies that relational algebra is not computationally complete.
•There are functions from database states to relations (query results), for which we could write a program using our favorite programming language, but we will not be able to find an equivalent RA expression to do the same.
•However, computational completeness would have been truly unexpected and actually unwanted, because we want a guarantee that query evaluation always terminates. This is guaranteed for RA.
Otherwise, we would have solved the halting problem.
Limitations of RA
•All RA queries can be evaluated in time that is polynomial in the size of the database state.
•This implies that certain “complex problems” cannot be formulated in relational algebra.
For example, if you find a way to formulate the Traveling Salesman problem in RA, you have solved the famous P = NP problem. (With a solution that nobody expects; contact me to collect your PhD.)
•As the transitive closure example shows, not even all problems of polynomial complexity can be formulated in “classical RA.”
Expressive Power
Relational completeness
A query language L for the relational model is called strong relationally complete if, for every DB schema S and for every RA expression Q1 with respect to S, there is a query Q2 ∈ L such that for all database states I with respect to S the two queries produce the same results:
I[Q1] = I[Q2]
Read as: “It is possible to write an RA-to-L query compiler.”
Expressive Power
•SQL is strong relationally complete.
•If we can write both RA-to-L and L-to-RA compilers, the two query languages are equivalent.
•SQL and RA are not equivalent. SQL contains concepts, e.g., the aggregate COUNT, which cannot be simulated in RA.
Equivalent query languages
•Relational algebra,
•SQL without aggregations and with mandatory duplicate elimination,
•Tuple relational calculus,
•Datalog (a Prolog variant) without recursion.
Summary
• Relational algebra is a query language over relational data.
• Five basic operators: selection, projection, Cartesian product, union, set difference.
• Derived operators: intersection, join, semijoin, outer joins
• Operators can be nested and form tree-shaped query plans
• Theoretical background: syntax, semantics, monotonicity, equivalence, limitations