32
2012-11-14 1 Silvia Stefanova, UDBL - IT - UU DATABASE DESIGN I - 1DL300 Autumn 2012 An Introductory Course on Database Systems http://www.it.uu.se/edu/course/homepage/dbastekn/ht12/ Uppsala Database Laboratory Department of Information Technology, Uppsala University, Uppsala, Sweden

DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 1 Silvia Stefanova, UDBL - IT - UU

DATABASE DESIGN I - 1DL300

Autumn 2012

An Introductory Course on

Database Systems

http://www.it.uu.se/edu/course/homepage/dbastekn/ht12/

Uppsala Database Laboratory

Department of Information Technology, Uppsala University, Uppsala, Sweden

Page 2: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 2 Silvia Stefanova, UDBL - IT - UU

Introduction to Relational Algebra

Elmasri/Navathe ch 6

Padron-McCarthy/Risch ch 10

Silvia Stefanova

Department of Information Technology

Uppsala University, Uppsala, Sweden

Page 3: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 3 Silvia Stefanova, UDBL - IT - UU

Outline 1. Why and what about the Relational Algebra

2. Unary operations • Select operation

• Project operation

3. Set operations • Union

• Intersection

• Set difference (Minus)

4. Cartesian product

5. Join operator • Join

• Natural join

• Outer join

6. Exercises

Page 4: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 4 Manivasakan Sabesan- UDBL - IT - UU

Relational algebra

• Relational algebra is a procedural language

• Operations in relational algebra takes two or more relations as

arguments and return a new relation.

• Relational algebraic operations:

– Operations from set theory:

• Union, Intersection, Difference, Cartesian product

– Operations specifically introduced for the relational data model:

• Select, Project, Join

• It have been shown that the select, project, union, difference, and cartesian product operations form a complete set. That is any other relational algebra operation can be expressed in these.

Page 5: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 5 Silvia Stefanova, UDBL - IT - UU

Outline 1. Why and what about the Relational Algebra

2. Unary operations • Select operation

• Project operation

3. Set operations • Union

• Intersection

• Set difference (Minus)

4. Cartesian product

5. Join operator • Join

• Natural join

• Outer join

6. Exercises

Page 6: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 6 Manivasakan Sabesan- UDBL - IT - UU

Select operation

• The select operator, σ, selects a specific set of tuples from a relation

according to a selection condition (or selection predicate P).

• Notation: σ<select condition> (R)

• The result relation has the same number of attributes as the initial

relation

• Example: σ pnumber=9012034455(Person)

Page 7: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 7 Silvia Stefanova, UDBL - IT - UU

Relations to work on

id name phone

1 Susan 100

2 Barbara 200

3 Ricardo 200

Teachers

id sname sphone

2 Barbara 200

4 Jimmy NULL

7 Anna 800

Students

cname code tid

Database Design 1 DB1 1

Database Design 2 DB2 1

Distributed Systems DS 3

Courses

student course

4 DB1

4 DS

7 DS

7 DB2

Attending

FK, course_taught_by

FK, student_attends

FK, course_to_attend

Page 8: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 8 Manivasakan Sabesan- UDBL - IT - UU

SELECT operation - examples

? Retrieve (select) teachers whose phone number is > = 200

? Retrieve (select) students with the name ‘Jimmy’

? Retrieve (select) teachers with name ‘Barbara’ and phone

number is > = 200

Page 9: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 9 Manivasakan Sabesan- UDBL - IT - UU

Project operation

• The project operator, Π, picks out (or projects) listed columns from a

relation and creates a new relation consisting of these columns.

• Notation: Π <attribute list> (R)

• The result relation is a new relation consisting of the projected

columns

• It doesn’t return the duplicates

• Example: Π Lname,Fname,Salary (Employee)

Page 10: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 10 Manivasakan Sabesan- UDBL - IT - UU

Relational algebra expressions- examples

? Retrieve (select) the names of the teachers whose phone

number is > = 200

? Is it possible in the example above to do the projection before

the selection ? Why or why not ?

Page 11: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 11 Silvia Stefanova, UDBL - IT - UU

Outline 1. Why and what about the Relational Algebra

2. Unary operations • Select operation

• Project operation

3. Set operations • Union

• Intersection

• Set difference (Minus)

4. Cartesian product

5. Join operator • Join

• Natural join

• Outer join

6. Exercises

Page 12: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 12 Silvia Stefanova, UDBL - IT - UU

Set operations

• Relations are required to be union compatible to be able to take

part in the union, intersection and difference operations.

• Two relations R1 and R2 is said to be union-compatible if:

R1 D1 × D2 ×... × Dn and

R2 D1 × D2 ×... × Dn

i.e. if they have the same degree and the same domains.

Page 13: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 13 Silvia Stefanova, UDBL - IT - UU

Union operation

• The union of two union-compatible relations R and S is the set

of all tuples that either occur in R, S, or in both.

• Notation: R S

• The duplicates don’t exist in the union.

• Example:

id sname sphone

2 Barbara 200

4 Jimmy NULL

7 Anna 800

Students id name phone

1 Susan 100

2 Barbara 200

3 Ricardo 200

Teachers = id sname sphone

2 Barbara 200

4 Jimmy NULL

7 Anna 800

1 Susan 100

3 Ricardo 200

Page 14: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 14 Silvia Stefanova, UDBL - IT - UU

Intersect operation

• The intersection of two union-compatible sets R and S, is the set

of all tuples that occur in both R and S.

• Notation: R ∩ S

• Example:

id sname sphone

2 Barbara 200

4 Jimmy NULL

7 Anna 800

Students ∩ id name phone

1 Susan 100

2 Barbara 200

3 Ricardo 200

Teachers = id sname sphone

Page 15: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 15 Silvia Stefanova, UDBL - IT - UU

Difference operation

• The difference of two union-compatible relations R and S is the

set of all tuples that occur in R but not in S.

• Notation: R - S

• Example:

id sname sphone

2 Barbara 200

4 Jimmy NULL

7 Anna 800

Students - id name phone

1 Susan 100

2 Barbara 200

3 Ricardo 200

Teachers = id sname sphone

Page 16: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 16 Silvia Stefanova, UDBL - IT - UU

Outline 1. Why and what about the Relational Algebra

2. Unary operations • Select operation

• Project operation

3. Set operations • Union

• Intersection

• Set difference (Minus)

4. Cartesian product

5. Join operator • Join

• Natural join

• Outer join

6. Exercises

Page 17: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 17 Silvia Stefanova, UDBL - IT - UU

Cartesian product

• The Cartesian product of R and S combines all tuples from R

with all tuples from S.

• Notation: R x S

• Example:

id name phone

1 Susan 100

2 Barbara 200

3 Ricardo 200

Teachers cname code tid

Database Design 1 DB1 1

Database Design 2 DB2 1

Distributed Systems DS 3

Courses x

Page 18: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 18 Silvia Stefanova, UDBL - IT - UU

Cartesian product

• The result is a new relation with (3+3) attributes and (3*5) tuples

cname code tid id name phone

Database Design 1 DB1 1 1 Susan 100

Database Design 1 DB1 1 2 Barbara 200

Database Design 1 DB1 1 3 Ricardo 200

Database Design 2 DB2 1 1 Susan 100

Database Design 2 DB2 1 2 Barbara 200

Database Design 2 DB2 1 3 Ricardo 200

Distributed Systems DS 3 1 Susan 100

Distributed Systems DS 3 2 Barbara 200

Distributed Systems DS 3 3 Ricardo 200

Courses x Teachers

Page 19: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 19 Silvia Stefanova, UDBL - IT - UU

Outline 1. Why and what about the Relational Algebra

2. Unary operations • Select operation

• Project operation

3. Set operations • Union

• Intersection

• Set difference (Minus)

4. Cartesian product

5. Join operator • Join

• Natural join

• Outer join

6. Exercises

Page 20: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 20 Silvia Stefanova, UDBL - IT - UU

Join

• The join operator creates a new relation by combining tuples from

two relations that satisfy a condition

• Notation: R <join condition> S

• Example:

id name phone

1 Susan 100

2 Barbara 200

3 Ricardo 200

tid=id Teachers

cname code tid

Database Design 1 DB1 1

Database Design 2 DB2 1

Distributed Systems DS 3

Courses

Page 21: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 21 Silvia Stefanova, UDBL - IT - UU

Join

The result is a relation with (3+3) attributes

Courses tid=id Teachers

cname code tid id name phone

Database Design 1 DB1 1 1 Susan 100

Database Design 1 DB1 1 2 Barbara 200

Database Design 1 DB1 1 3 Ricardo 200

Database Design 2 DB2 1 1 Susan 100

Database Design 2 DB2 1 2 Barbara 200

Database Design 2 DB2 1 3 Ricardo 200

Distributed Systems DS 3 1 Susan 100

Distributed Systems DS 3 2 Barbara 200

Distributed Systems DS 3 3 Ricardo 200

Page 22: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 22 Manivasakan Sabesan- UDBL - IT - UU

Join

? Receive the same result by using x and σ !

? Why do we need if this can be done by x and σ ?

Page 23: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 23 Silvia Stefanova, UDBL - IT - UU

Theta join

• Notation: R C S

C is the join condition which has the form Ar As , where is

one of {=, <, >, ≤, ≥, ≠}. Several terms can be connected as

C1 AND C2 AND … Ck

• A join operation with this kind of general join condition is called

Theta join.

A≤F = A B

1

1

1

6

9

2

2

2

7

7

C D

3

3

3

8

8

2

7

7

7

7

A B

1

6

9

2

7

7

C

3

8

8

D E

2

7

7

3

3

8

F

4

5

9

R A≤F S

E

3

3

8

8

8

F

4

5

9

9

9

R S

Page 24: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 24 Silvia Stefanova, UDBL - IT - UU

Equijoin

• Notation: R C S

• In C the only comparison operator is =

B=C = A B

a

a

2

4

C D

2

4

d

d

A B

a

a

2

4

C D

2

4

9

d

d

d

E

e

e

e

R B=C S

E

e

e

R S

Page 25: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 25 Silvia Stefanova, UDBL - IT - UU

Natural join

• Natural join is equivalent with the application of join to R and S

with the equality condition Ar = As (i.e. an equijoin) and then

removing the redundant column As in the result.

• Notation: R *Ar,As S

Ar,As are attribute pairs that should fulfill the join condition

which has the form Ar = As.

B=C = A B

a

a

2

4

D

d

d

A B

a

a

2

4

C D

2

4

9

d

d

d

E

e

e

e

R B=C S

E

e

e

R S

Page 26: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 26 Silvia Stefanova, UDBL - IT - UU

Natural join

cname code tid id name phone

Database Design 1 DB1 1 1 Susan 100

Database Design 2 DB2 1 1 Susan 100

Distributed Systems DS 3 3 Ricardo 200

Courses * tid=id Teachers

Page 27: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 27 Silvia Stefanova, UDBL - IT - UU

Outer join

• Extensions of the join operations that avoid loss of information.

• Computes the join and then adds tuples from one relation that do

not match tuples in the other relation to the result of the join.

• Fills out with null values:

null signifies that the value is unknown or does not exist.

All comparisons involving null are false by definition.

Page 28: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 28 Silvia Stefanova, UDBL - IT - UU

Outer join

• Left Outer join

loan

borrower

loan borrower (left outer join)

branch-name

Downtown

Redwood

Perryridge

loan-number

1-170

L-230

L-260

amount

3000

4000

1700

customer-name

Jones

Smith

Hayes

loan-number

1-170

L-230

L-155

customer-name

Jones

Smith

null

loan-number

1-170

L-230

null

branch-name

Downtown

Redwood

Perryridge

loan-number

1-170

L-230

L-260

amount

3000

4000

1700

Page 29: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 29 Manivasakan Sabesan- UDBL - IT - UU

A task

? Retrieve the names of the teachers teaching DB1!

id name phone

1 Susan 100

2 Barbara 200

3 Ricardo 200

Teachers

id sname sphone

2 Barbara 200

4 Jimmy NULL

7 Anna 800

Students

cname code tid

Database Design 1 DB1 1

Database Design 2 DB2 1

Distributed Systems DS 3

Courses

student course

4 DB1

4 DS

7 DS

7 DB2

Attending

FK, course_taught_by

FK, student_attends

FK, course_to_attend

Page 30: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 30 Silvia Stefanova, UDBL - IT - UU

Relational Algebra

Exercises on RA expressions:

Find the codes of the courses taken by the student having the name “Jimmy”.

Retrieve the names and phones of the students who are taking a database course

(“Database Design 1” or “Database Design 2” ).

List the students, names and phones of the teacher “Susan”.

Retrieve the phones of the teachers who are students as well.

T1

T2

T3

T4

Page 31: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 31 Silvia Stefanova, UDBL - IT - UU

Boyce-Codd Normal Form - BCNF

BCNF: A relation is in BCNF if:

– It is in 1NF

– Every determinant is a candidate key.

Normalization:

Decompose the relation so that after joining the new relations spurious

tuples will not be generated ( lossless join decomposition )

Teach

department course teacher

Computing Science Database 1 Sara S

Computer Systems Database 1 Sara S

Engineering Signals and Systems Peter E

Computing Science Database 2 Sven P

Page 32: DATABASE DESIGN I - 1DL300 Autumn 2012...Cartesian product •The result is a new relation with (3+3) attributes and (3*5) tuples cname code tid id name phone Database Design 1 DB1

2012-11-14 32 Silvia Stefanova, UDBL - IT - UU

Street

street city length zipcode

Rydsvägen Linköping 19 58248

Mårdtorpsgatan Linköping 0.7 58248

Storgatan Linköping 1.5 58223

Storgatan Gnesta 0.014 64631

Normalization to BCNF

? Is Street in BCNF? Why or why not ?

How would you normalize it?