90
Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

Embed Size (px)

Citation preview

Page 1: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

Lecture 3: Relational Algebra and SQL

Tuesday, March 25, 2008

Page 2: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

2

Outline

• Relational Algebra: 4.2 (except 4.2.5)

• SQL: 5.2, 5.3, 5.4

Page 3: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

3

Querying the Database

• Goal: specify what we want from our database – Find all the employees who earn more than $50,000

and pay taxes in New Jersey.

• Could write in C++/Java, but bad idea• Instead use high-level query languages:

– Theoretical: Relational Algebra, Datalog– Practical: SQL

• Relational algebra: a basic set of operations on relations that provide the basic principles.

Page 4: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

4

Instances of Branch and Staff (part) Relations

Page 5: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

5

Relational Algebra• Operators: relations as input, new relation as output • Five basic RA operators:

– Set Operators• union, difference • Selection:

– Projection: – Cartesian Product: X

• Derived operators:– Intersection, complement– Joins (natural,equi-join, theta join, semi-join)

• When our relations have attribute names:– Renaming:

Page 6: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

6

Set Operations: Union

• Union: all tuples in R1 or R2

• Notation: R1 U R2

• R1, R2 must have the same schema

• R1 U R2 has the same schema as R1, R2

• Example: – ActiveEmployees U RetiredEmployees

Page 7: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

7

Union• R S

– Union of two relations R and S defines a relation that contains all the tuples of R, or S, or both R and S, duplicate tuples being eliminated.

– R and S must be union-compatible.

• If R and S have I and J tuples, respectively, union is obtained by concatenating them into one relation with a maximum of (I + J) tuples.

Example: List all cities where there is either a branch office or a property for rent.

• Pcity(Branch) union Pcity(PropertyForRent) or R = Branch[city] union PropertyForRent[city]

Page 8: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

8

Intersection• R S

– Defines a relation consisting of the set of all tuples that are in both R and S.

– R and S must be union-compatible.

• Expressed using basic operations:

R S = R – (R – S)

Example: List all cities where there is both a branch office and at least one property for rent.

city(Branch) city(PropertyForRent)

Page 9: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

9

Difference (Minus)

• R – S– Defines a relation consisting of the tuples that are in relation R, but not in S.

– R and S must be union-compatible.

• List all cities where there is a branch office but no properties for rent.

city(Branch) – city(PropertyForRent)

Or R = Branch [city] - PropertyForRent [city]

Page 10: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

10

Set Operations: Difference

• Difference: all tuples in R1 and not in R2

• Notation: R1 – R2

• R1, R2 must have the same schema

• R1 - R2 has the same schema as R1, R2

• Example– AllEmployees - RetiredEmployees

Page 11: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

11

Relational Algebra Operations

Page 12: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

12

Set Operations: Selection

• Returns all tuples which satisfy a condition

• Notation: c(R)

• c is a condition: =, <, >, and, or, not

• Output schema: same as input schema

• Find all employees with salary > $40,000:– Salary > 40000 (Employee)

Page 13: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

13

Selection Example

EmployeeSSN Name DepartmentID Salary999999999 John 1 30,000777777777 Tony 1 32,000888888888 Alice 2 45,000

SSN Name DepartmentID Salary888888888 Alice 2 45,000

Find all employees with salary more than $40,000.Salary > 40000 (Employee)

Page 14: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

14

Restriction (or Selection)– Works on a single relation R and defines a relation that contains

only those tuples (rows) of R that satisfy the specified condition (predicate).

Example: List all staff with a salary greater than US$10,000.

salary > 10000 (Staff) or R = STAFF Where Salary > 10000

Page 15: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

15

Projection• Unary operation: returns certain columns• Eliminates duplicate tuples !• Notation: A1,…,An (R)• Input schema R(B1,…,Bm)• Condition: {A1, …, An} {B1, …, Bm}• Output schema S(A1,…,An)• Example: project social-security number and

names:– SSN, Name (Employee)

Page 16: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

16

Projection col1, . . . , coln(R)

– Works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates.

• Produce a list of salaries for all staff, showing only staffNo, fName, lName, and salary details.

staffNo, fName, lName, salary(Staff) or Staff [staffNo, fName, lName, salary]

Page 17: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

17

Projection Example

EmployeeSSN Name DepartmentID Salary999999999 John 1 30,000777777777 Tony 1 32,000888888888 Alice 2 45,000

SSN Name999999999 John777777777 Tony888888888 Alice

SSN, Name (Employee)

Page 18: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

18

Cartesian Product

• Each tuple in R1 with each tuple in R2• Notation: R1 x R2• Input schemas R1(A1,…,An), R2(B1,…,Bm)• Condition: {A1,…,An} ∩ {B1,…Bm} = • Output schema is S(A1, …, An, B1, …, Bm)• Notation: R1 x R2• Example: Employee x Dependents• Very rare in practice; but joins are very often

Page 19: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

19

Cartesian Product Example Employee Name SSN John 999999999 Tony 777777777 Dependents EmployeeSSN Dname 999999999 Emily 777777777 Joe Employee x Dependents Name SSN EmployeeSSN Dname John 999999999 999999999 Emily John 999999999 777777777 Joe Tony 777777777 999999999 Emily Tony 777777777 777777777 Joe

Page 20: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

20

Cartesian Product (Multiplication)• R X S

– Defines a relation that is the concatenation of every tuple of relation R with

every tuple of relation S.• List the names and comments of all clients who have viewed a property for

rent.

(clientNo, fName, lName(Client)) X (clientNo, propertyNo, comment (Viewing)) or

Client [clientNo, fName, lName] x Viewing [clientNo, propertyNo, comment ]

Page 21: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

21

Example - Cartesian product and Selection

• Use selection operation to extract those tuples where Client.clientNo = Viewing.clientNo.Client.clientNo = Viewing.clientNo((clientNo, fName, lName(Client)) (clientNo, propertyNo,

comment(Viewing)))

Cartesian product and Selection can be reduced to a single operation called a Join.

Page 22: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

22

Renaming

• Does not change the relational instance

• Changes the relational schema only

• Notation: B1,…,Bn (R)

• Input schema: R(A1, …, An)

• Output schema: S(B1, …, Bn)

• Example:

LastName, SocSocNo (Employee)

Page 23: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

23

Renaming Example

EmployeeName SSNJohn 999999999Tony 777777777

LastName SocSocNoJohn 999999999Tony 777777777

LastName, SocSocNo (Employee)

Page 24: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

24

Derived Operations

• Intersection can be derived:– R1 ∩ R2 = R1 – (R1 – R2) – There is another way to express it (later)

• Most importantly: joins, in many variants

Page 25: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

25

Join• Join is a derivative of Cartesian product.

• Equivalent to performing a Selection, using join predicate as selection formula, over Cartesian product of the two operand relations.

• One of the most difficult operations to implement efficiently in an RDBMS and one reason why RDBMSs have intrinsic performance problems.

Page 26: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

26

Join• Join is a derivative of Cartesian product.

• Equivalent to performing a Selection, using join predicate as selection formula, over Cartesian product of the two operand relations.

• One of the most difficult operations to implement efficiently in an RDBMS and one reason why RDBMSs have intrinsic performance problems.

• Various forms of join operation– Natural join (defined by Codd)– Outer join – Theta join– Equijoin (a particular type of Theta join)– Semijoin

Page 27: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

27

Theta join (-join)

• R FS– Defines a relation that contains tuples

satisfying the predicate F from the Cartesian product of R and S.

– The predicate F is of the form R.ai S.bi where may be one of the comparison operators (<, , >, , =, ).

Page 28: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

28

Theta join (-join)• Can rewrite Theta join using basic Selection

and Cartesian product operations.

R FS = F(R S)

Degree of a Theta join is sum of degrees of the operand relations R and S. If predicate F contains only equality (=), the term Equijoin is used.

Page 29: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

29

Example - Equijoin • List the names and comments of all clients

who have viewed a property for rent.(clientNo, fName, lName(Client)) Client.clientNo = Viewing.clientNo

(clientNo, propertyNo, comment(Viewing))

Page 30: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

30

EQUIJOIN

•  

• SELECT table.column, table.column FROM table1, table2 WHERE table1.column1 = table2.column2;

Page 31: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

31

Natural Join• Notation: R1 R2• Input Schema: R1(A1, …, An), R2(B1, …, Bm)• Output Schema: S(C1,…,Cp)

– Where {C1, …, Cp} = {A1, …, An} U {B1, …, Bm}

• Meaning: combine all pairs of tuples in R1 and R2 that agree on the attributes:– {A1,…,An} ∩ {B1,…, Bm} (called the join attributes)

• Equivalent to a cross product followed by selection• Example Employee Dependents

Page 32: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

32

Natural Join Example

EmployeeName SSNJohn 999999999Tony 777777777

DependentsSSN Dname999999999 Emily777777777 Joe

Name SSN DnameJohn 999999999 EmilyTony 777777777 Joe

Employee Dependents = Name, SSN, Dname( SSN=SSN2(Employee x SSN2, Dname(Dependents))

Page 33: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

33

Natural Join

• List the names and comments of all clients who have viewed a property for rent.(clientNo, fName, lName(Client)) Join (clientNo, propertyNo, comment(Viewing))

Or Client [clientNo, fName, lName] Join Viewing [clientNo, propertyNo, comment ]

Page 34: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

34

Another Natural Join Example

• R= S=

• R S=

A B

X Y

X Z

Y Z

Z V

B C

Z U

V W

Z V

A B C

X Z U

X Z V

Y Z U

Y Z V

Z V W

Page 35: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

35

Natural Join

• Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R S ?

• Given R(A, B, C), S(D, E), what is R S ?

• Given R(A, B), S(A, B), what is R S ?

Page 36: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

36

Theta Join

• A join that involves a predicate

• Notation: R1 R2 where is a condition

• Input schemas: R1(A1,…,An), R2(B1,…,Bm)

• {A1,…An} ∩ {B1,…,Bm} =

• Output schema: S(A1,…,An,B1,…,Bm)

• Derived operator:

R1 R2 = (R1 x R2)

Page 37: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

37

Semijoin

• R S = A1,…,An (R S)

• Where the schemas are:– Input: R(A1,…An), S(B1,…,Bm)– Output: T(A1,…,An)

Page 38: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

38

Outer join• To display rows in the result that do not

have matching values in the join column, use Outer join.

• R S– (Left) outer join is join in which tuples from

R that do not have matching values in common columns of S are also included in result relation.

Page 39: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

39

Outer Join• To display rows in the result that do not have matching values in the join

column, use Outer join.

• R Left Outer Join S– (Left) outer join is join in which tuples from R that do not have matching values in

common columns of S are also included in result relation.

Example:

• Produce a status report on property viewings.propertyNo, street, city(PropertyForRent)

Left Outer Join Viewing

Page 40: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

40

Types of Joins

• Left Outer Join – keep all of the tuples from the “left” relation

– join with the right relation

– pad the non-matching tuples with nulls

• Right Outer Join– same as the left, but keep tuples from the “right”

relation

• Full Outer Join– same as left, but keep all tuples from both relations

Page 41: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

41

Left Outer Join

• If we do a left outer join on R and S, and we match on the first column, the result is:

A B

C D

A F

G H

R= S=

A B F

C D -

name phone name email

name phone email

Page 42: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

42

Example - Left Outer join

• Produce a status report on property viewings.propertyNo, street, city(PropertyForRent)

Viewing

Page 43: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

43

Right Outer Join

• If we do a right outer join on R and S, and we match on the first column, the result is:

A B

C D

A F

G H

R= S=

A B F

G - H

name phone name email

name phone email

Page 44: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

44

Full Outer Join

• If we do a full outer join on R and S, and we match on the first column, the result is:

A B

C D

A F

G H

R= S=

A B F

C D -

G - H

name phone name email

name phone email

Page 45: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

45

OUTER JOIN Example 1

E 5F 4D 3B 2A 1

A 1C 2D 3E 4

R.ColA = S.SColA SR LEFT OUTER JOIN

A 1 A 1D 3E 4

- -- -

D 3E 5

F 4B 2

A 1 A 1D 3E 4

D 3E 5- - C 2

ColA ColB

SColBSColAS

R

R.ColA = S.SColA SR RIGHT OUTER JOIN

Page 46: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

46

OUTER JOIN Example 2

E 5F 4D 3B 2A 1

A 1C 2D 3E 4

R.ColA = S.SColA SR FULL OUTER JOIN

A 1 A 1D 3E 4

- -- -

D 3E 5

F 4B 2

- - C 2

ColA ColB

SColBSColAS

R

Page 47: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

47

Division

• Identify all clients who have viewed all properties with three rooms.

(clientNo, propertyNo(Viewing)) (propertyNo(rooms = 3 (PropertyForRent)))

Page 48: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

48

Natural join• R S

– An Equijoin of the two relations R and S over all common attributes x. One occurrence of each common attribute is eliminated from the result.

Page 49: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

49

Example - Natural join• List the names and comments of all clients

who have viewed a property for rent.(clientNo, fName, lName(Client))

(clientNo, propertyNo, comment(Viewing))

Page 50: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

50

Semijoin• R F S

– Defines a relation that contains the tuples of R that participate in the join of R with S.

Can rewrite Semijoin using Projection and Join:

R F S = A(R F S)

Page 51: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

51

Example - Semijoin• List complete details of all staff who work at

the branch in Glasgow.

Staff Staff.branchNo = Branch.branchNo and Branch.city = ‘Glasgow’ Branch

Page 52: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

52

Division• R S

– Defines a relation over the attributes C that consists of set of tuples from R that match combination of every tuple in S.

• Expressed using basic operations:T1 C(R)

T2 C((S X T1) – R)

T T1 – T2

Page 53: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

53

Example - Division• Identify all clients who have viewed all

properties with three rooms.

(clientNo, propertyNo(Viewing)) (propertyNo(rooms = 3 (PropertyForRent)))

Page 54: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

54

Page 55: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

55

Page 56: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

56

Page 57: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

57

Page 58: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

58

Page 59: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

59

Summary of Relational Algebra

• Five basic operators, many derived

• Combine operators in order to construct queries: relational algebra expressions, usually shown as trees

Page 60: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

60

RA has Limitations !

• Cannot compute “transitive closure”

• Find all direct and indirect relatives of Fred• Cannot express in RA !!! Need to write C program

Name1 Name2 Relationship

Fred Mary Father

Mary Joe Cousin

Mary Bill Spouse

Nancy Lou Sister

Page 61: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

61

Equivalences

The same relational algebraic expression can be written in many different ways. The order in which tuples appear in relations is never significant. • A B <=> B A • A B <=> B A • A B <=> B A • (A - B) is not the same as (B - A) • c1 ( c2 (A)) <=> c2 ( c1 (A)) <=> c1 ^ c2 (A) • a1(A) <=> a1( a1,etc(A)) , where etc is any attributes of A. • ...

Page 62: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

62

Operations on Bags (and why we care)

• Union: {a,b,b,c} U {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}– add the number of occurrences

• Difference: {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b,d}– subtract the number of occurrences

• Intersection: {a,b,b,b,c,c}∩{b,b,c,c,c,c,d} = {b,b,c,c}– minimum of the two numbers of occurrences

• Selection: preserve the number of occurrences

• Projection: preserve the number of occurrences (no duplicate elimination)

• Cartesian product, join: no duplicate elimination

Page 63: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

63

SQL Introduction

• Standard language for querying and manipulating data

Structured Query Language

• Many standards out there: SQL92, SQL2, SQL3, SQL99• Vendors support various subsets of these, but all of what we’ll be

talking about.• Works on bags, rather than sets

• Basic construct:SELECT …FROM …WHERE …

Page 64: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

64

Selections

Company(sticker, name, country, stockPrice)

Find all US companies whose stock is > 50:

SELECT * FROM Company WHERE country=“USA” AND stockPrice > 50

Output schema: R(sticker, name, country, stockPrice)

Page 65: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

65

Selections

What you can use in WHERE:

• attribute names of the relation(s) used in the FROM.• comparison operators: =, <>, <, >, <=, >=• apply arithmetic operations: stockprice*2• operations on strings (e.g., “||” for concatenation).• Lexicographic order on strings.• Pattern matching: s LIKE p• Special stuff for comparing dates and times.

Page 66: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

66

The LIKE operator

• s LIKE p: pattern matching on strings• p may contain two special symbols:

– % = any sequence of characters– _ = any single character

Company(sticker, name, address, country, stockPrice)Find all US companies whose address contains “Mountain”:

SELECT * FROM Company WHERE country=“USA” AND address LIKE “%Mountain%”

• Needed in the 1st assignment !

Page 67: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

67

Select only a subset of the attributes

SELECT name, stockPrice FROM Company WHERE country=“USA” AND stockPrice > 50

Input schema: Company(sticker, name, country, stockPrice)Output schema: R(name, stock price)

Projections

Page 68: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

68

Rename the attributes in the resulting table

SELECT name AS company, stockprice AS price FROM Company WHERE country=“USA” AND stockPrice > 50

Input schema: Company(sticker, name, country, stockPrice)Output schema: R(company, price)

Projections with Renamings

Page 69: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

69

Eliminating Duplicates

SELECT DISTINCT country FROM Company WHERE stockPrice > 50

Without DISTINCT the result is a bag

Page 70: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

70

Ordering the Results

SELECT name, stockPrice FROM Company WHERE country=“USA” AND stockPrice > 50 ORDERBY country, name

Ordering is ascending, unless you specify the DESC keyword.

Ties are broken by the second attribute on the ORDERBY list, etc.

Page 71: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

71

Joins

Product ( pname, price, category, maker)Purchase (buyer, seller, store, product)Company (cname, stockPrice, country)Person( per-name, phoneNumber, city)

Find names of people living in Seattle that bought gizmo products, and the names of the stores they bought from

SELECT per-name, store FROM Person, Purchase WHERE per-name=buyer AND city=“Seattle” AND product=“gizmo”

Page 72: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

72

Disambiguating Attributes

SELECT Person.name FROM Person, Purchase, Product WHERE Person.name=buyer AND product=Product.name AND Product.category=“telephony”

Product (name, price, category, maker)Purchase (buyer, seller, store, product)Person(name, phoneNumber, city)

Find names of people buying telephony products:

Page 73: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

73

Tuple Variables

SELECT product1.maker, product2.maker FROM Product AS product1, Product AS product2 WHERE product1.category=product2.category AND product1.maker <> product2.maker

Product ( name, price, category, maker)

Find pairs of companies making products in the same category

Page 74: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

74

Tuple Variables•Tuple variables introduced automatically by the system: Product ( name, price, category, maker)

SELECT name FROM Product WHERE price > 100Becomes: SELECT Product.name FROM Product AS Product WHERE Product.price > 100

Doesn’t work when Product occurs more than once.

Page 75: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

75

Meaning (Semantics) of SQL Queries

SELECT a1, a2, …, akFROM R1 AS x1, R2 AS x2, …, Rn AS xnWHERE Conditions

1. Nested loops: Answer = {} for x1 in R1 do for x2 in R2 do ….. for xn in Rn do if Conditions then Answer = Answer U {(a1,…,ak)} return Answer

Page 76: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

76

Meaning (Semantics) of SQL Queries

SELECT a1, a2, …, akFROM R1 AS x1, R2 AS x2, …, Rn AS xnWHERE Conditions

2. Parallel assignment

Answer = {} for all assignments x1 in R1, …, xn in Rn do if Conditions then Answer = Answer U {(a1,…,ak)} return Answer

Doesn’t impose any order !

Page 77: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

77

Meaning (Semantics) of SQL Queries

SELECT a1, a2, …, akFROM R1 AS x1, R2 AS x2, …, Rn AS xnWHERE Conditions

3. Translation to Relational algebra:

a1,,…,ak ( Conditions (R1 x R2 x … x Rn))

Select-From-Where queries are precisely Select-Project-Join

Page 78: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

78

First Unintuitive SQLismSELECT R.AFROM R, S, TWHERE R.A=S.A OR R.A=T.A

Looking for R ∩ (S U T)

But what happens if T is empty?

Page 79: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

79

Union, Intersection, Difference(SELECT name FROM Person WHERE City=“Seattle”) UNION(SELECT name FROM Person, Purchase WHERE buyer=name AND store=“The Bon”)

Similarly, you can use INTERSECT and EXCEPT.

You must have the same attribute names (otherwise: rename).

Page 80: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

80

Conserving Duplicates

(SELECT name FROM Person WHERE City=“Seattle”) UNION ALL(SELECT name FROM Person, Purchase WHERE buyer=name AND store=“The Bon”)

The UNION, INTERSECTION and EXCEPT operators operate as sets, not bags.

Page 81: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

81

Subqueries

A subquery producing a single tuple:

SELECT Purchase.productFROM PurchaseWHERE buyer = (SELECT name FROM Person WHERE ssn = “123456789”);

In this case, the subquery returns one value.

If it returns more, it’s a run-time error.

Page 82: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

82

Can say the same thing without a subquery:

SELECT Purchase.productFROM Purchase, PersonWHERE buyer = name AND ssn = “123456789”

Is this query equivalent to the previous one ?

Page 83: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

83

Subqueries Returning Relations

SELECT Company.name FROM Company, Product WHERE Company.name=maker AND Product.name IN (SELECT product FROM Purchase WHERE buyer = “Joe Blow”);

Here the subquery returns a set of values

Find companies who manufacture products bought by Joe Blow.

Page 84: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

84

Subqueries Returning Relations

SELECT Company.name FROM Company, Product, Purchase WHERE Company.name=maker AND Product.name = product AND buyer = “Joe Blow”

Equivalent to:

Is this query equivalent to the previous one ?

Page 85: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

85

Subqueries Returning Relations

SELECT name FROM Product WHERE price > ALL (SELECT price FROM Purchase WHERE maker=“Gizmo-Works”)

Product ( pname, price, category, maker)Find products that are more expensive than all those producedBy “Gizmo-Works”

You can also use: s > ALL R s > ANY R EXISTS R

Page 86: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

86

Question for Database Fans

• Can we express this query as a single SELECT-FROM-WHERE query, without subqueries ?

• Hint: show that all SFW queries are monotone (figure out what this means). A query with ALL is not monotone

Page 87: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

87

Conditions on Tuples

SELECT Company.name FROM Company, Product WHERE Company.name=maker AND (Product.name,price) IN (SELECT product, price) FROM Purchase WHERE buyer = “Joe Blow”);

Page 88: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

88

Correlated Queries

SELECT title FROM Movie AS x WHERE year < ANY (SELECT year FROM Movie WHERE title = x.title);

Movie (title, year, director, length) Find movies whose title appears more than once.

Note (1) scope of variables (2) this can still be expressed as single SFW

correlation

Page 89: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

89

Complex Correlated Query

Product ( pname, price, category, maker, year)

• Find products (and their manufacturers) that are more expensive than all products made by the same manufacturer before 1972

SELECT pname, makerFROM Product AS xWHERE price > ALL (SELECT price FROM Product AS y WHERE x.maker = y.maker AND y.year < 1972);

Powerful, but much harder to optimize !

Page 90: Lecture 3: Relational Algebra and SQL Tuesday, March 25, 2008

90

Exercises: write RA and SQL expressions

Product ( pname, price, category, maker)Purchase (buyer, seller, store, product)Company (cname, stock price, country)Person( per-name, phone number, city)

Ex #1: Find people who bought telephony products.Ex #2: Find names of people who bought American productsEx #3: Find names of people who bought American products and did not buy French productsEx #4: Find names of people who bought American products and they live in Seattle.Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock prices is more than $50.