Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
2012-11-14 1 Silvia Stefanova, UDBL - IT - UU
DATABASE DESIGN I - 1DL300
Autumn 2012
An Introductory Course on
Database Systems
http://www.it.uu.se/edu/course/homepage/dbastekn/ht12/
Uppsala Database Laboratory
Department of Information Technology, Uppsala University, Uppsala, Sweden
2012-11-14 2 Silvia Stefanova, UDBL - IT - UU
Introduction to Relational Algebra
Elmasri/Navathe ch 6
Padron-McCarthy/Risch ch 10
Silvia Stefanova
Department of Information Technology
Uppsala University, Uppsala, Sweden
2012-11-14 3 Silvia Stefanova, UDBL - IT - UU
Outline 1. Why and what about the Relational Algebra
2. Unary operations • Select operation
• Project operation
3. Set operations • Union
• Intersection
• Set difference (Minus)
4. Cartesian product
5. Join operator • Join
• Natural join
• Outer join
6. Exercises
2012-11-14 4 Manivasakan Sabesan- UDBL - IT - UU
Relational algebra
• Relational algebra is a procedural language
• Operations in relational algebra takes two or more relations as
arguments and return a new relation.
• Relational algebraic operations:
– Operations from set theory:
• Union, Intersection, Difference, Cartesian product
– Operations specifically introduced for the relational data model:
• Select, Project, Join
• It have been shown that the select, project, union, difference, and cartesian product operations form a complete set. That is any other relational algebra operation can be expressed in these.
2012-11-14 5 Silvia Stefanova, UDBL - IT - UU
Outline 1. Why and what about the Relational Algebra
2. Unary operations • Select operation
• Project operation
3. Set operations • Union
• Intersection
• Set difference (Minus)
4. Cartesian product
5. Join operator • Join
• Natural join
• Outer join
6. Exercises
2012-11-14 6 Manivasakan Sabesan- UDBL - IT - UU
Select operation
• The select operator, σ, selects a specific set of tuples from a relation
according to a selection condition (or selection predicate P).
• Notation: σ<select condition> (R)
• The result relation has the same number of attributes as the initial
relation
• Example: σ pnumber=9012034455(Person)
2012-11-14 7 Silvia Stefanova, UDBL - IT - UU
Relations to work on
id name phone
1 Susan 100
2 Barbara 200
3 Ricardo 200
Teachers
id sname sphone
2 Barbara 200
4 Jimmy NULL
7 Anna 800
Students
cname code tid
Database Design 1 DB1 1
Database Design 2 DB2 1
Distributed Systems DS 3
Courses
student course
4 DB1
4 DS
7 DS
7 DB2
Attending
FK, course_taught_by
FK, student_attends
FK, course_to_attend
2012-11-14 8 Manivasakan Sabesan- UDBL - IT - UU
SELECT operation - examples
? Retrieve (select) teachers whose phone number is > = 200
? Retrieve (select) students with the name ‘Jimmy’
? Retrieve (select) teachers with name ‘Barbara’ and phone
number is > = 200
2012-11-14 9 Manivasakan Sabesan- UDBL - IT - UU
Project operation
• The project operator, Π, picks out (or projects) listed columns from a
relation and creates a new relation consisting of these columns.
• Notation: Π <attribute list> (R)
• The result relation is a new relation consisting of the projected
columns
• It doesn’t return the duplicates
• Example: Π Lname,Fname,Salary (Employee)
2012-11-14 10 Manivasakan Sabesan- UDBL - IT - UU
Relational algebra expressions- examples
? Retrieve (select) the names of the teachers whose phone
number is > = 200
? Is it possible in the example above to do the projection before
the selection ? Why or why not ?
2012-11-14 11 Silvia Stefanova, UDBL - IT - UU
Outline 1. Why and what about the Relational Algebra
2. Unary operations • Select operation
• Project operation
3. Set operations • Union
• Intersection
• Set difference (Minus)
4. Cartesian product
5. Join operator • Join
• Natural join
• Outer join
6. Exercises
2012-11-14 12 Silvia Stefanova, UDBL - IT - UU
Set operations
• Relations are required to be union compatible to be able to take
part in the union, intersection and difference operations.
• Two relations R1 and R2 is said to be union-compatible if:
R1 D1 × D2 ×... × Dn and
R2 D1 × D2 ×... × Dn
i.e. if they have the same degree and the same domains.
2012-11-14 13 Silvia Stefanova, UDBL - IT - UU
Union operation
• The union of two union-compatible relations R and S is the set
of all tuples that either occur in R, S, or in both.
• Notation: R S
• The duplicates don’t exist in the union.
• Example:
id sname sphone
2 Barbara 200
4 Jimmy NULL
7 Anna 800
Students id name phone
1 Susan 100
2 Barbara 200
3 Ricardo 200
Teachers = id sname sphone
2 Barbara 200
4 Jimmy NULL
7 Anna 800
1 Susan 100
3 Ricardo 200
2012-11-14 14 Silvia Stefanova, UDBL - IT - UU
Intersect operation
• The intersection of two union-compatible sets R and S, is the set
of all tuples that occur in both R and S.
• Notation: R ∩ S
• Example:
id sname sphone
2 Barbara 200
4 Jimmy NULL
7 Anna 800
Students ∩ id name phone
1 Susan 100
2 Barbara 200
3 Ricardo 200
Teachers = id sname sphone
2012-11-14 15 Silvia Stefanova, UDBL - IT - UU
Difference operation
• The difference of two union-compatible relations R and S is the
set of all tuples that occur in R but not in S.
• Notation: R - S
• Example:
id sname sphone
2 Barbara 200
4 Jimmy NULL
7 Anna 800
Students - id name phone
1 Susan 100
2 Barbara 200
3 Ricardo 200
Teachers = id sname sphone
2012-11-14 16 Silvia Stefanova, UDBL - IT - UU
Outline 1. Why and what about the Relational Algebra
2. Unary operations • Select operation
• Project operation
3. Set operations • Union
• Intersection
• Set difference (Minus)
4. Cartesian product
5. Join operator • Join
• Natural join
• Outer join
6. Exercises
2012-11-14 17 Silvia Stefanova, UDBL - IT - UU
Cartesian product
• The Cartesian product of R and S combines all tuples from R
with all tuples from S.
• Notation: R x S
• Example:
id name phone
1 Susan 100
2 Barbara 200
3 Ricardo 200
Teachers cname code tid
Database Design 1 DB1 1
Database Design 2 DB2 1
Distributed Systems DS 3
Courses x
2012-11-14 18 Silvia Stefanova, UDBL - IT - UU
Cartesian product
• The result is a new relation with (3+3) attributes and (3*5) tuples
cname code tid id name phone
Database Design 1 DB1 1 1 Susan 100
Database Design 1 DB1 1 2 Barbara 200
Database Design 1 DB1 1 3 Ricardo 200
Database Design 2 DB2 1 1 Susan 100
Database Design 2 DB2 1 2 Barbara 200
Database Design 2 DB2 1 3 Ricardo 200
Distributed Systems DS 3 1 Susan 100
Distributed Systems DS 3 2 Barbara 200
Distributed Systems DS 3 3 Ricardo 200
Courses x Teachers
2012-11-14 19 Silvia Stefanova, UDBL - IT - UU
Outline 1. Why and what about the Relational Algebra
2. Unary operations • Select operation
• Project operation
3. Set operations • Union
• Intersection
• Set difference (Minus)
4. Cartesian product
5. Join operator • Join
• Natural join
• Outer join
6. Exercises
2012-11-14 20 Silvia Stefanova, UDBL - IT - UU
Join
• The join operator creates a new relation by combining tuples from
two relations that satisfy a condition
• Notation: R <join condition> S
• Example:
id name phone
1 Susan 100
2 Barbara 200
3 Ricardo 200
tid=id Teachers
cname code tid
Database Design 1 DB1 1
Database Design 2 DB2 1
Distributed Systems DS 3
Courses
2012-11-14 21 Silvia Stefanova, UDBL - IT - UU
Join
The result is a relation with (3+3) attributes
Courses tid=id Teachers
cname code tid id name phone
Database Design 1 DB1 1 1 Susan 100
Database Design 1 DB1 1 2 Barbara 200
Database Design 1 DB1 1 3 Ricardo 200
Database Design 2 DB2 1 1 Susan 100
Database Design 2 DB2 1 2 Barbara 200
Database Design 2 DB2 1 3 Ricardo 200
Distributed Systems DS 3 1 Susan 100
Distributed Systems DS 3 2 Barbara 200
Distributed Systems DS 3 3 Ricardo 200
2012-11-14 22 Manivasakan Sabesan- UDBL - IT - UU
Join
? Receive the same result by using x and σ !
? Why do we need if this can be done by x and σ ?
2012-11-14 23 Silvia Stefanova, UDBL - IT - UU
Theta join
• Notation: R C S
C is the join condition which has the form Ar As , where is
one of {=, <, >, ≤, ≥, ≠}. Several terms can be connected as
C1 AND C2 AND … Ck
• A join operation with this kind of general join condition is called
Theta join.
A≤F = A B
1
1
1
6
9
2
2
2
7
7
C D
3
3
3
8
8
2
7
7
7
7
A B
1
6
9
2
7
7
C
3
8
8
D E
2
7
7
3
3
8
F
4
5
9
R A≤F S
E
3
3
8
8
8
F
4
5
9
9
9
R S
2012-11-14 24 Silvia Stefanova, UDBL - IT - UU
Equijoin
• Notation: R C S
• In C the only comparison operator is =
B=C = A B
a
a
2
4
C D
2
4
d
d
A B
a
a
2
4
C D
2
4
9
d
d
d
E
e
e
e
R B=C S
E
e
e
R S
2012-11-14 25 Silvia Stefanova, UDBL - IT - UU
Natural join
• Natural join is equivalent with the application of join to R and S
with the equality condition Ar = As (i.e. an equijoin) and then
removing the redundant column As in the result.
• Notation: R *Ar,As S
Ar,As are attribute pairs that should fulfill the join condition
which has the form Ar = As.
B=C = A B
a
a
2
4
D
d
d
A B
a
a
2
4
C D
2
4
9
d
d
d
E
e
e
e
R B=C S
E
e
e
R S
2012-11-14 26 Silvia Stefanova, UDBL - IT - UU
Natural join
cname code tid id name phone
Database Design 1 DB1 1 1 Susan 100
Database Design 2 DB2 1 1 Susan 100
Distributed Systems DS 3 3 Ricardo 200
Courses * tid=id Teachers
2012-11-14 27 Silvia Stefanova, UDBL - IT - UU
Outer join
• Extensions of the join operations that avoid loss of information.
• Computes the join and then adds tuples from one relation that do
not match tuples in the other relation to the result of the join.
• Fills out with null values:
null signifies that the value is unknown or does not exist.
All comparisons involving null are false by definition.
2012-11-14 28 Silvia Stefanova, UDBL - IT - UU
Outer join
• Left Outer join
loan
borrower
loan borrower (left outer join)
branch-name
Downtown
Redwood
Perryridge
loan-number
1-170
L-230
L-260
amount
3000
4000
1700
customer-name
Jones
Smith
Hayes
loan-number
1-170
L-230
L-155
customer-name
Jones
Smith
null
loan-number
1-170
L-230
null
branch-name
Downtown
Redwood
Perryridge
loan-number
1-170
L-230
L-260
amount
3000
4000
1700
2012-11-14 29 Manivasakan Sabesan- UDBL - IT - UU
A task
? Retrieve the names of the teachers teaching DB1!
id name phone
1 Susan 100
2 Barbara 200
3 Ricardo 200
Teachers
id sname sphone
2 Barbara 200
4 Jimmy NULL
7 Anna 800
Students
cname code tid
Database Design 1 DB1 1
Database Design 2 DB2 1
Distributed Systems DS 3
Courses
student course
4 DB1
4 DS
7 DS
7 DB2
Attending
FK, course_taught_by
FK, student_attends
FK, course_to_attend
2012-11-14 30 Silvia Stefanova, UDBL - IT - UU
Relational Algebra
Exercises on RA expressions:
Find the codes of the courses taken by the student having the name “Jimmy”.
Retrieve the names and phones of the students who are taking a database course
(“Database Design 1” or “Database Design 2” ).
List the students, names and phones of the teacher “Susan”.
Retrieve the phones of the teachers who are students as well.
T1
T2
T3
T4
2012-11-14 31 Silvia Stefanova, UDBL - IT - UU
Boyce-Codd Normal Form - BCNF
BCNF: A relation is in BCNF if:
– It is in 1NF
– Every determinant is a candidate key.
Normalization:
Decompose the relation so that after joining the new relations spurious
tuples will not be generated ( lossless join decomposition )
Teach
department course teacher
Computing Science Database 1 Sara S
Computer Systems Database 1 Sara S
Engineering Signals and Systems Peter E
Computing Science Database 2 Sven P
2012-11-14 32 Silvia Stefanova, UDBL - IT - UU
Street
street city length zipcode
Rydsvägen Linköping 19 58248
Mårdtorpsgatan Linköping 0.7 58248
Storgatan Linköping 1.5 58223
Storgatan Gnesta 0.014 64631
Normalization to BCNF
? Is Street in BCNF? Why or why not ?
How would you normalize it?