CS 464/564 Introduction to Database Management System
Instructor: Abdullah Mueen
LECTURE 7: MULTI-VALUED DEPENDENCY AND 4NF; RELATIONAL OPERATIONS ON BAGS


Definition of MVD

A multivalued dependency (MVD) on R, X ↠ Y, says that if two tuples of R agree on all the attributes of X, then their components in Y may be swapped, and the result will be two tuples that are also in the relation.

That is, for each value of X, the values of Y are independent of the values of R − X − Y.
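The swap condition can be tested mechanically on a small relation instance. The following is a minimal Python sketch, not part of the lecture; the function name `satisfies_mvd` and the sample tuples are assumptions made for illustration.

```python
from itertools import product

def satisfies_mvd(rows, attrs, x, y):
    """Return True if the relation `rows` (tuples over `attrs`) satisfies X ->> Y."""
    idx = {a: i for i, a in enumerate(attrs)}
    relation = set(rows)
    for t, u in product(relation, repeat=2):
        if all(t[idx[a]] == u[idx[a]] for a in x):
            # Take the X and Y components from t and everything else from u.
            swapped = tuple(t[idx[a]] if a in x or a in y else u[idx[a]] for a in attrs)
            if swapped not in relation:
                return False
    return True

attrs = ("X", "Y", "Z")
r = [("x1", "y1", "z1"), ("x1", "y2", "z2")]
print(satisfies_mvd(r, attrs, {"X"}, {"Y"}))  # False: the swapped tuples are missing

r += [("x1", "y1", "z2"), ("x1", "y2", "z1")]
print(satisfies_mvd(r, attrs, {"X"}, {"Y"}))  # True
```

Checking every ordered pair (t, u) covers both directions of the swap, so the function does not need to build the second swapped tuple explicitly.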


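Picture of MVD X ↠ Y

X    Y    Z = R − X − Y
x1   y1   z1
x1   y2   z2
x1   y1   z2
x1   y2   z1

If the first two tuples are in R, then the last two (the same tuples with their Y components swapped) must be in R as well.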


Example

CandyLover(name, addr, phones, candiesLiked)

A candy lover’s phones are independent of the candies they like.
◦ name ↠ phones and name ↠ candiesLiked.

Thus, each of a candy lover’s phones appears with each of the candies they like, in all combinations.

This repetition is unlike FD redundancy.
◦ name → addr is the only FD.
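To see how quickly the combinations multiply, here is a small Python sketch. The person, address, phone numbers, and candy names are hypothetical values chosen only for illustration.

```python
from itertools import product

# Hypothetical data for one candy lover (not from the lecture).
name, addr = "Ann", "1 Elm St."
phones = ["555-0101", "555-0102"]
candies = ["Twix", "Mars", "Kit Kat"]

# name ->> phones and name ->> candiesLiked force every phone/candy combination.
rows = [(name, addr, p, c) for p, c in product(phones, candies)]

for row in rows:
    print(row)
print(len(rows))  # 2 phones x 3 candies = 6 rows for a single person
```

The address is repeated in all six rows because of the FD, while the phone and candy repetition comes purely from the two MVDs.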


Another Example: MVD

Name       Street          City       Title                Year
C. Fisher  123 Maple St.   Hollywood  Star Wars            1977
C. Fisher  5 Locust Ln.    Malibu     Star Wars            1977
C. Fisher  123 Maple St.   Hollywood  Empire Strikes Back  1980
C. Fisher  5 Locust Ln.    Malibu     Empire Strikes Back  1980
C. Fisher  123 Maple St.   Hollywood  Return of the Jedi   1983
C. Fisher  5 Locust Ln.    Malibu     Return of the Jedi   1983

name ↠ street city
name ↠ title year


MVD Rules

Every FD is an MVD (promotion).

◦ If X → Y, then swapping Y’s between two tuples that agree on X doesn’t change the tuples.

◦ Therefore, the “new” tuples are surely in the relation, and we know X ↠ Y.

Complementation: If X ↠ Y, and Z is all the other attributes, then X ↠ Z.

Transitivity: If X ↠ Y and Y ↠ Z, then X ↠ Z (strictly, X ↠ Z − Y; the two coincide when Y and Z are disjoint).

Trivial:
1. If R = X ∪ Y, then X ↠ Y and Y ↠ X.
2. If Y ⊆ X, then X ↠ Y.


Splitting Doesn’t Hold

Like FD’s, we cannot generally split the left side of an MVD.

But unlike FD’s, we cannot split the right side either; sometimes you have to leave several attributes on the right side.
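The C. Fisher relation from the earlier example gives a concrete counterexample: name ↠ street city holds, but the split form name ↠ street does not, because a street cannot be paired with the other address's city. The Python sketch below checks this; the helper `mvd_holds` is an illustrative assumption, and it uses the equivalent "full cross product within each X-group" formulation of an MVD.

```python
from collections import defaultdict
from itertools import product

ATTRS = ("name", "street", "city", "title", "year")
ROWS = [
    ("C. Fisher", "123 Maple St.", "Hollywood", "Star Wars", 1977),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Star Wars", 1977),
    ("C. Fisher", "123 Maple St.", "Hollywood", "Empire Strikes Back", 1980),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Empire Strikes Back", 1980),
    ("C. Fisher", "123 Maple St.", "Hollywood", "Return of the Jedi", 1983),
    ("C. Fisher", "5 Locust Ln.", "Malibu", "Return of the Jedi", 1983),
]

def mvd_holds(rows, attrs, x, y):
    """Check X ->> Y: in each X-group, the (Y, rest) pairs must form a full cross product."""
    x_idx = [i for i, a in enumerate(attrs) if a in x]
    y_idx = [i for i, a in enumerate(attrs) if a in y and a not in x]
    z_idx = [i for i, a in enumerate(attrs) if a not in x and a not in y]
    groups = defaultdict(set)
    for t in rows:
        key = tuple(t[i] for i in x_idx)
        groups[key].add((tuple(t[i] for i in y_idx), tuple(t[i] for i in z_idx)))
    for pairs in groups.values():
        ys = {y_part for y_part, _ in pairs}
        zs = {z_part for _, z_part in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True

print(mvd_holds(ROWS, ATTRS, {"name"}, {"street", "city"}))  # True
print(mvd_holds(ROWS, ATTRS, {"name"}, {"street"}))          # False
```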


Fourth Normal Form

The redundancy that comes from MVD’s is not removable by putting the database schema in BCNF.

There is a stronger normal form, called 4NF, that (intuitively) treats MVD’s as FD’s when it comes to decomposition, but not when determining keys of the relation.


4NF Definition

A relation R is in 4NF if: whenever X ↠ Y is a nontrivial MVD, then X is a superkey.

◦ Nontrivial MVD means that:
1. Y is not a subset of X, and
2. X and Y are not, together, all the attributes.

◦ Note that the definition of “superkey” still depends on FD’s only.
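The definition can be turned into a simple check over a list of declared dependencies. The sketch below is an illustration under stated assumptions: the function names are hypothetical, dependencies are written as pairs of attribute tuples, and only the dependencies that are explicitly listed are tested (a full test would also have to consider every MVD implied by them). Superkeys are decided with the attribute closure under FD's only, as the definition requires.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the FDs (pairs of attribute tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_trivial(schema, x, y):
    """An MVD X ->> Y is trivial if Y is inside X or X and Y cover the schema."""
    return set(y) <= set(x) or set(x) | set(y) == set(schema)

def violations_4nf(schema, fds, mvds):
    """Listed dependencies that are nontrivial and whose left side is not a superkey."""
    found = []
    # Every FD is also an MVD, so both lists are tested the same way.
    for x, y in list(fds) + list(mvds):
        if not is_trivial(schema, x, y) and closure(x, fds) != set(schema):
            found.append((x, y))
    return found

schema = ("name", "addr", "phones", "candiesLiked")
fds = [(("name",), ("addr",))]
mvds = [(("name",), ("phones",)), (("name",), ("candiesLiked",))]

for x, y in violations_4nf(schema, fds, mvds):
    print(x, "->>", y)
```

For the CandyLover schema, the closure of {name} is only {name, addr}, so all three listed dependencies are reported.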


BCNF Versus 4NF

Remember that every FD X → Y is also an MVD, X ↠ Y.

Thus, if R is in 4NF, it is certainly in BCNF.
◦ Because any BCNF violation is a 4NF violation (after conversion to an MVD).

But R could be in BCNF and not 4NF, because MVD’s are “invisible” to BCNF.


Decomposition and 4NF

If X ↠ Y is a 4NF violation for relation R, we can decompose R using the same technique as for BCNF.

1. XY is one of the decomposed relations.

2. All attributes of R except Y − X form the other.
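As a minimal sketch of this step, the Python function below computes the two schemas from a violating MVD. The function name `decompose` is an assumption made for illustration; the CandyLover attributes are taken from the lecture's example.

```python
def decompose(schema, x, y):
    """Split `schema` on the MVD X ->> Y into (X union Y) and (schema minus (Y - X))."""
    x, y = set(x), set(y)
    r1 = [a for a in schema if a in (x | y)]
    r2 = [a for a in schema if a not in (y - x)]
    return r1, r2

schema = ["name", "addr", "phones", "candiesLiked"]

r1, r2 = decompose(schema, {"name"}, {"addr"})
print(r1)  # ['name', 'addr']
print(r2)  # ['name', 'phones', 'candiesLiked']

r3, r4 = decompose(r2, {"name"}, {"phones"})
print(r3)  # ['name', 'phones']
print(r4)  # ['name', 'candiesLiked']
```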


Example

CandyLover(name, addr, phones, candiesLiked)

FD: name → addr

MVD’s: name ↠ phones
       name ↠ candiesLiked

Key is {name, phones, candiesLiked}.

All dependencies violate 4NF.


Example, Continued

Decompose using name → addr:

1. CandyLover1(name, addr)
   In 4NF; the only dependency is name → addr.

2. CandyLover2(name, phones, candiesLiked)
   Not in 4NF. MVD’s name ↠ phones and name ↠ candiesLiked apply. No FD’s, so all three attributes form the key.


Example: Decompose CandyLover2

Either MVD name ↠ phones or name ↠ candiesLiked tells us to decompose to:

◦ CandyLover3(name, phones)

◦ CandyLover4(name, candiesLiked)
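A quick way to convince yourself that this decomposition loses nothing is to project a sample instance and join the pieces back. The Python sketch below does so on hypothetical data (names, phone numbers, and candies are invented); the instance is chosen so that both MVDs hold.

```python
# Hypothetical instance of CandyLover2(name, phones, candiesLiked).
candy_lover2 = {
    ("Ann", "555-0101", "Twix"),
    ("Ann", "555-0101", "Mars"),
    ("Ann", "555-0102", "Twix"),
    ("Ann", "555-0102", "Mars"),
    ("Bob", "555-0199", "Twix"),
}

# Project onto the two decomposed schemas (sets remove duplicate tuples).
candy_lover3 = {(n, p) for n, p, _ in candy_lover2}
candy_lover4 = {(n, c) for n, _, c in candy_lover2}

# Natural join on name.
rejoined = {(n, p, c) for n, p in candy_lover3 for n2, c in candy_lover4 if n == n2}

print(len(candy_lover2), len(candy_lover3), len(candy_lover4))  # 5 3 3
print(rejoined == candy_lover2)  # True
```

The join reproduces the original relation exactly because the MVDs hold in the instance; if a phone/candy combination were missing from `candy_lover2`, the join would produce a tuple that was not in the original.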


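Chase process

R(A, B, C, D), F = {A → B, B ↠ C}, and we wish to prove A ↠ C.

Start with two tuples that agree on A:

A  B   C   D
a  b1  c   d1
a  b   c2  d

a, b, c, d are known values; the subscripted symbols (b1, c2, d1) are unknown.

Use A → B (the tuples agree on A, so b1 = b):

A  B  C   D
a  b  c   d1
a  b  c2  d

Use B ↠ C (the tuples agree on B, so their C components may be swapped):

A  B  C   D
a  b  c   d1
a  b  c2  d
a  b  c2  d1
a  b  c   d

The tuple (a, b, c, d) appears, so A ↠ C holds.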
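The two chase steps for this example can be replayed in code. This is a rough Python sketch for this specific example only: the helper names are assumptions, symbols are plain strings, and each rule is applied once in the order used on the slide, whereas a general chase would keep applying all dependencies until nothing changes.

```python
from itertools import product

ATTRS = ("A", "B", "C", "D")

def agree(r, s, attrs):
    """True if rows r and s have the same symbol in every attribute of `attrs`."""
    return all(r[ATTRS.index(a)] == s[ATTRS.index(a)] for a in attrs)

def apply_fd(rows, lhs, rhs):
    """Equate the rhs symbols of any two rows that agree on lhs."""
    rows = set(rows)
    changed = True
    while changed:
        changed = False
        for r, s in product(sorted(rows), repeat=2):
            diff = [a for a in rhs if r[ATTRS.index(a)] != s[ATTRS.index(a)]]
            if agree(r, s, lhs) and diff:
                i = ATTRS.index(diff[0])
                # Keep the unsubscripted (shorter) symbol, e.g. replace b1 by b.
                keep, drop = sorted((r[i], s[i]), key=len)
                rows = {tuple(keep if v == drop else v for v in t) for t in rows}
                changed = True
                break
    return rows

def apply_mvd(rows, lhs, rhs):
    """For rows agreeing on lhs, add the row taking lhs and rhs from one and the rest from the other."""
    rows = set(rows)
    added = {
        tuple(r[i] if a in lhs or a in rhs else s[i] for i, a in enumerate(ATTRS))
        for r, s in product(rows, repeat=2)
        if agree(r, s, lhs)
    }
    return rows | added

tableau = {("a", "b1", "c", "d1"), ("a", "b", "c2", "d")}
tableau = apply_fd(tableau, ("A",), ("B",))   # use A -> B
tableau = apply_mvd(tableau, ("B",), ("C",))  # use B ->> C

for row in sorted(tableau):
    print(row)
print(("a", "b", "c", "d") in tableau)  # True: the goal tuple was derived
```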


Comparison

Property                             3NF   BCNF   4NF
Eliminates redundancy due to FD’s    No    Yes    Yes
Eliminates redundancy due to MVD’s   No    No     Yes
Preserves FD’s                       Yes   No     No
Preserves MVD’s                      No    No     No

[Figure: nested sets, with Relations in 4NF inside Relations in BCNF inside Relations in 3NF]


[Figure: E/R diagram. Labels: Customers, Bookings, Flights; SSN, Name, Address; Flight Number, Day, Aircraft; Row, Seat; relationships BY and IN.]

For exam


RELATIONAL DATABASE PROGRAMMING


Relational Algebra on Bags

A bag (or multiset) is like a set, but an element may appear more than once.

Example: {1,2,1,3} is a bag.

Example: {1,2,3} is also a bag that happens to be a set.


Why Bags?

SQL, the most important query language for relational databases, is actually a bag language.

Some operations, like projection, are much more efficient on bags than on sets, because no duplicate elimination is needed.


Operations on Bags

Selection applies to each tuple, so its effect on bags is like its effect on sets.

Projection also applies to each tuple, but as a bag operator, we do not eliminate duplicates.

Products and joins are done on each pair of tuples, so duplicates in bags have no effect on how we operate.


Example: Bag Selection

R(A, B)
A  B
1  2
5  6
1  2

σ A+B<5 (R) =
A  B
1  2
1  2


Example: Bag Projection

R(A, B)
A  B
1  2
5  6
1  2

π A (R) =
A
1
5
1
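Bag selection and bag projection map directly onto list comprehensions. The sketch below is illustrative only; modeling a bag as a Python list (so that duplicates survive) is an assumption of this example, not something the lecture prescribes.

```python
# Represent a bag of tuples as a list, so duplicates are kept.
R = [(1, 2), (5, 6), (1, 2)]  # R(A, B)

selected = [(a, b) for a, b in R if a + b < 5]  # selection on A + B < 5
projected = [a for a, b in R]                   # projection onto A

print(selected)   # [(1, 2), (1, 2)]
print(projected)  # [1, 5, 1]

# Set semantics would require an extra duplicate-elimination pass:
print(sorted(set(projected)))  # [1, 5]
```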


Example: Bag Product

R(A, B)      S(B, C)
A  B         B  C
1  2         3  4
5  6         7  8
1  2

R × S =
A  R.B  S.B  C
1  2    3    4
1  2    7    8
5  6    3    4
5  6    7    8
1  2    3    4
1  2    7    8


Example: Bag Theta-Join

R(A, B)      S(B, C)
A  B         B  C
1  2         3  4
5  6         7  8
1  2

R ⋈ R.B<S.B S =
A  R.B  S.B  C
1  2    3    4
1  2    7    8
5  6    7    8
1  2    3    4
1  2    7    8
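The product and theta-join examples can be reproduced with nested comprehensions. As before, this is a small illustrative sketch that models bags as Python lists; the variable names are assumptions.

```python
R = [(1, 2), (5, 6), (1, 2)]  # R(A, B)
S = [(3, 4), (7, 8)]          # S(B, C)

# Bag product: pair every tuple of R with every tuple of S, keeping duplicates.
bag_product = [r + s for r in R for s in S]

# Theta-join: the product filtered by the condition R.B < S.B.
theta_join = [t for t in bag_product if t[1] < t[2]]

print(len(bag_product))  # 6
for t in theta_join:
    print(t)
```

Each result tuple has the columns (A, R.B, S.B, C), so `t[1]` is R.B and `t[2]` is S.B.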


Bag Union

An element appears in the union of two bags as many times as the sum of the number of times it appears in each bag.

Example: {1,2,1} ∪ {1,1,2,3,1} = {1,1,1,1,1,2,2,3}


Bag Intersection

An element appears in the intersection of two bags as many times as the minimum of the number of times it appears in either.

Example: {1,2,1,1} ∩ {1,2,1,3} = {1,1,2}.


Bag Difference

An element appears in the difference A − B of bags as many times as it appears in A, minus the number of times it appears in B.

◦ But never less than 0 times.

Example: {1,2,1,1} − {1,2,3} = {1,1}.
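Python's `collections.Counter` happens to implement exactly these three multiset operations: `+` adds counts, `&` takes the minimum, and `-` subtracts while dropping counts that fall to zero or below. The sketch below replays the three examples; the helpers `bag` and `show` are hypothetical conveniences for this illustration.

```python
from collections import Counter

def bag(*items):
    """Build a bag (multiset) from the given elements."""
    return Counter(items)

def show(c):
    """List the elements of a bag in sorted order, with repetitions."""
    return sorted(c.elements())

print(show(bag(1, 2, 1) + bag(1, 1, 2, 3, 1)))  # union:        [1, 1, 1, 1, 1, 2, 2, 3]
print(show(bag(1, 2, 1, 1) & bag(1, 2, 1, 3)))  # intersection: [1, 1, 2]
print(show(bag(1, 2, 1, 1) - bag(1, 2, 3)))     # difference:   [1, 1]
```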


Beware: Bag Laws != Set Laws

Some, but not all, algebraic laws that hold for sets also hold for bags.

Example: the commutative law for union (R ∪ S = S ∪ R) does hold for bags.
◦ Since addition is commutative, adding the number of times x appears in R and S doesn’t depend on the order of R and S.


Example of the Difference

Set union is idempotent, meaning that S ∪ S = S.

However, for bags, if x appears n times in S, then it appears 2n times in S ∪ S.

Thus S ∪ S != S in general.
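The contrast is easy to observe directly. In this short Python sketch, `set` stands in for set semantics and `collections.Counter` for bag semantics; the sample values are arbitrary.

```python
from collections import Counter

items = [1, 2, 1]
set_s = set(items)
bag_s = Counter(items)
bag_r = Counter([2, 3])

# Idempotence of union: holds for sets, fails for bags.
print((set_s | set_s) == set_s)  # True
print((bag_s + bag_s) == bag_s)  # False
print(sorted((bag_s + bag_s).elements()))  # [1, 1, 1, 1, 2, 2]

# Commutativity of union holds under both semantics.
print((bag_r + bag_s) == (bag_s + bag_r))  # True
```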
