Chapter Four: Database Design (Relational)

Objectives

Summary
Keys (Constraints)
Relational DBMS
Normal Forms

Summary

DB Lifecycle: Business Requirements, Design (ER), Build DB, Production
Architecture of DBMS
Definitions
Data Models
Database Design (ER Model): Strong Entity, Weak Entity, Relationship
Functionality
Functional Dependency

Keys (Constraints)

A key is a set of attributes whose values uniquely identify each entity in an entity set or each relationship in a relationship set.

How do we identify keys? Consider a relation R with attributes a1, a2, …, an.

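Keys (Constraints)

1. Super key: Any set of attributes that uniquely identifies each tuple in a table.

Student (Name, ID, GPA, Major, Minor, Address, Phone)

For example, if ID is unique for every student, then {ID} and {ID, Name} are both super keys.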

Keys (Constraints)

2. Candidate key: A minimal super key, that is, a super key from which no attribute can be removed

3. Primary key: The candidate key selected by the DBA

Keys (Constraints)

Characteristics of a primary key:

a. Uniqueness: At any given time, no two tuples can have the same value for a given primary key.

b. Minimality: None of the attributes in the primary key can be discarded without destroying the uniqueness property.
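The uniqueness and minimality properties can be checked mechanically on sample data. The sketch below is illustrative only: the rows are invented, and the helper names `is_superkey` and `candidate_keys` are hypothetical. A real key is a statement about every legal state of the table, so a sample can refute a proposed key but never prove one.

```python
from itertools import combinations

# Hypothetical sample rows for Student(Name, ID, GPA, Major); values are invented.
ROWS = [
    {"Name": "Ann", "ID": 1, "GPA": 3.5, "Major": "CS"},
    {"Name": "Bob", "ID": 2, "GPA": 3.9, "Major": "Math"},
    {"Name": "Ann", "ID": 3, "GPA": 3.5, "Major": "CS"},
]

def is_superkey(rows, attrs):
    """Uniqueness: no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows):
    """Minimality: super keys of this sample that contain no smaller super key."""
    attrs = sorted(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys

print(is_superkey(ROWS, ["ID", "Name"]))  # a super key, but not minimal
print(candidate_keys(ROWS))
```

In this sample only ID survives as a candidate key, because the first and third rows agree on every other attribute.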

Keys (Constraints)

4. Foreign key: An attribute (or set of attributes) in one relation that is the primary key of another relation.

R1 (a, b, c, d, e)
R2 (x, y, z, a, w)

Faculty (ID, Name, Salary, D_name, age, Hiring_date)
Department (D_name, No_Faculty, D_head)

Here D_name in Faculty is a foreign key that refers to the primary key of Department.
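A foreign key constraint says that every D_name value stored in Faculty must also appear in Department. The following is a minimal sketch of that check, assuming invented sample rows; the function name `dangling_references` is hypothetical, and a real DBMS enforces this rule itself.

```python
# Hypothetical sample data; names and values are invented for illustration.
department = [
    {"D_name": "CS", "No_Faculty": 12, "D_head": "F1"},
    {"D_name": "Math", "No_Faculty": 8, "D_head": "F7"},
]
faculty = [
    {"ID": "F1", "Name": "Lee", "D_name": "CS"},
    {"ID": "F7", "Name": "Kim", "D_name": "Math"},
    {"ID": "F9", "Name": "Roy", "D_name": "Physics"},  # no such department
]

def dangling_references(child, fk, parent, pk):
    """Rows of child whose fk value matches no pk value in parent."""
    known = {row[pk] for row in parent}
    return [row for row in child if row[fk] not in known]

bad = dangling_references(faculty, "D_name", department, "D_name")
print([row["ID"] for row in bad])
```

The third faculty row is reported because "Physics" does not appear as a primary key value in the department rows.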

Relational DBMS

RDBMS: Data are represented as a set of tables (relation is the mathematical term for a table)
Originated by E. F. Codd (1970)
Based on set theory
Record-based data model

Structure:

A set of relations (tables)
Each relation has a unique name
Each relation has a set of attributes (columns)
Each relation has a set of tuples (rows)

Restrictions on an RDB:

No two tuples are the same
No two attributes are the same
The order of tuples is immaterial
The order of attributes is immaterial
There is an attribute or collection of attributes, called the primary key, that identifies tuples uniquely
The value of each attribute must be atomic

R      a1   a2   …   an
T1
T2

R: relation name
an: attribute
Tm: tuple
T[an]: value of attribute an for tuple T

Intension vs. Extension

Converting an E-R Diagram to Relational

1. Strong entity sets:

Let E be a strong entity set with attributes a1, a2, a3, …, an.

Create a relation R with n distinct columns, each of which corresponds to one of the attributes in E.

Converting an E-R Diagram to Relational

2. Weak entity sets:

Let W be a weak entity set with attributes a1, a2, a3, …, ak.

Let E be the strong entity set on which W is dependent.

Let the primary key of E be e1, e2, e3, …, ex.

Create a relation R with k + x columns: (a1, a2, a3, …, ak) and (e1, e2, e3, …, ex).

Converting an E-R Diagram to Relational

3. Relationship:

Let R be a relationship among entity sets E1, E2, …, En, where each Ei has a primary key PrimaryKey(Ei), and let R have descriptive attributes a1, …, am.

Create a relation called R with the attributes
PrimaryKey(E1) ∪ PrimaryKey(E2) ∪ … ∪ PrimaryKey(En) ∪ {a1, …, am}
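The three conversion rules can be sketched as small functions that return a relation name and its column list. This is only an illustration: the function names and the school-like schema below are hypothetical and are not the ER diagram used in the course.

```python
def strong_entity(name, attributes):
    """Rule 1: one column per attribute of the strong entity set."""
    return name, list(attributes)

def weak_entity(name, attributes, owner_key):
    """Rule 2: the k own attributes plus the x key attributes of the owner."""
    return name, list(attributes) + list(owner_key)

def relationship(name, participant_keys, attributes=()):
    """Rule 3: union of the participants' primary keys plus descriptive attributes."""
    columns = []
    for key in participant_keys:
        for attr in key:
            if attr not in columns:
                columns.append(attr)
    return name, columns + list(attributes)

# Hypothetical schema, invented for illustration.
student = strong_entity("Student", ["StudentID", "Name", "GPA"])
course = strong_entity("Course", ["CourseNo", "Title"])
section = weak_entity("Section", ["SectionNo", "Room"], owner_key=["CourseNo"])
enrolls = relationship("Enrolls", [["StudentID"], ["CourseNo"]], ["Grade"])

print(section)
print(enrolls)
```

Section ends up with k + x = 2 + 1 columns, and Enrolls carries the keys of both participants plus its own attribute.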

Example

Convert the school ER diagram into a relational database.

Normal Forms (Guidelines for RD design)

How do we know this design is good?

If it is not a good design, what should we do?

Modify our design.

Normal Forms (Guidelines for RD design)

First Normal Form (1NF)

Deals with the shape of the records.

A relation is in 1NF if the domain of every attribute contains only atomic values.

First Normal Form: 1NF

Example: R (A, B, C, …)

R ( A    B      )      =>      R ( A    B  )
    a1   b1, b2                    a1   b1
                                   a1   b2

First Normal Form: 1NF

Example:

Person ( Name    Age   Children         )
         Smith   42    John, Lori, Mark

=>

Person ( Name    Age   Child )
         Smith   42    John
         Smith   42    Lori
         Smith   42    Mark
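The Person example can be reproduced in a few lines. This is a minimal sketch, assuming the multi-valued Children attribute is held as a list; the function name `to_1nf` is hypothetical.

```python
# Un-normalized row: Children holds a list, so the value is not atomic.
person = [("Smith", 42, ["John", "Lori", "Mark"])]

def to_1nf(rows):
    """Emit one row per (name, age, child) so that every value is atomic."""
    return [(name, age, child) for name, age, children in rows for child in children]

for row in to_1nf(person):
    print(row)
```

Name and Age are repeated once per child, which is the price of making every value atomic.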

First Normal Form: 1NF

Example:

Student ( Name   Birthday    )
          S1     Feb 2, 91
          S2     March 8, 88

=>

Student (Name, D_Birth, M_Birth, Y_Birth)

Note: 2NF and 3NF deal with the relationship between non-key and key attributes.

Second Normal Form: 2NF

A relation R is in 2NF with respect to a set of FDs if it is in 1NF and every non-prime attribute is fully dependent on the entire key of R.

Fact: 2NF is violated when a non-key attribute is a fact about a proper subset of the primary key.

Second Normal Form: 2NF

Non-prime vs. prime:

Given a relation R and a set of FDs, an attribute A of R is prime if A is contained in some key of R; otherwise A is non-prime.

Second Normal Form: 2NF

Example: R(A,B,C,D) with FDs
A, B ---> C, D
A ---> D

D partially depends on A,B
C fully depends on A,B
A and B are prime (part of the key)

If A is the primary key, is this in 2NF?
If A,B is the primary key, is this in 2NF?
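The partial dependency in this example can be found by computing attribute closures. The sketch below assumes FDs are written as (left side, right side) pairs of sets; the helper names `closure`, `candidate_keys` and `partial_dependencies` are hypothetical, and the brute-force key search is only practical for small schemas.

```python
from itertools import combinations

ATTRS = {"A", "B", "C", "D"}
FDS = [({"A", "B"}, {"C", "D"}), ({"A"}, {"D"})]

def closure(attrs, fds):
    """All attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = set(combo)
            if any(k <= subset for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) == attrs:
                keys.append(subset)
    return keys

def partial_dependencies(attrs, fds):
    """(part of a key, non-prime attribute) pairs that violate 2NF."""
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                for attr in sorted(closure(part, fds) - set(part) - prime):
                    found.append((part, attr))
    return found

print(candidate_keys(ATTRS, FDS))
print(partial_dependencies(ATTRS, FDS))
```

With these FDs the only candidate key is {A, B}, and D is reported as depending on A alone, so R is not in 2NF when A,B is the key.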

Second Normal Form: 2NF

What should we do with a relation which is not in 2NF?

Example: R(A,B,C,D)
A, B ---> C, D
A ---> D

R1 (A,B,C)
R2 (A,D)

Second Normal Form: 2NF

Example:

R ( Part   Warehouse   Address      Quantity )
    P1     W1          Frostburg    25
    P2     W1          Frostburg    30
    P3     W2          Cumberland   32
    P4     W4          Frostburg    25
    P4     W1

What is the primary key?

Part, Warehouse ---> Quantity
Warehouse ---> Address

Second Normal Form: 2NF

Problems:

1. Repetition of information: changing the address of W1 means updating every row that mentions W1
2. Unable to represent information: a warehouse with no parts
3. Inconsistency

So …
R1 (Warehouse, Address)
R2 (Part, Warehouse, Quantity)
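The decomposition loses nothing: joining R1 and R2 on Warehouse gives back the original rows. A small sketch, using only the four complete rows of the example table (the incomplete fifth row is left out) and plain Python sets as relations:

```python
# The four complete rows of R(Part, Warehouse, Address, Quantity).
R = {
    ("P1", "W1", "Frostburg", 25),
    ("P2", "W1", "Frostburg", 30),
    ("P3", "W2", "Cumberland", 32),
    ("P4", "W4", "Frostburg", 25),
}

# Project onto the two smaller schemas.
R1 = {(w, addr) for _, w, addr, _ in R}    # (Warehouse, Address)
R2 = {(p, w, qty) for p, w, _, qty in R}   # (Part, Warehouse, Quantity)

# Natural join of R2 and R1 on Warehouse.
rejoined = {(p, w, addr, qty) for p, w, qty in R2 for w1, addr in R1 if w == w1}

print(sorted(R1))
print(rejoined == R)
```

Each warehouse address is now stored once in R1, and a warehouse with no parts can be recorded there without inventing a part.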

Second Normal Form: 2NF

Example:

R ( Professor   Student   Course   Degree )
    P1          S1        C1       Ph.D.
    P2          S2        C2       M.S.
    P3          S2        C4       M.S.
    P3          S3        C4       Pg.D.

Professor ---> Course
Student ---> Degree
Professor ---> Student

Key? Not in 2NF

R1(Student, Degree)
R2(Professor, Course, Student)

Third Normal Form (3NF):

A relation R is in 3NF with respect to a set of FDs if it is in 2NF and, whenever A ---> B holds, at least one of the following is true:

1. A ---> B is a trivial FD
2. A is a superkey for R
3. B is contained in a candidate key for R

In other words, every non-key attribute depends non-transitively on the primary key.

Third Normal Form (3NF):

Example: R(A,B,C,D)
A, B ---> D
D ---> C

R1(A,B,D)
R2(D,C)

Fact: 3NF is violated when a non-key attribute is a fact about another non-key attribute.

Employee ---> Dept ---> Location

Third Normal Form (3NF):

Example: R(Employee, Dept, Location)
Employee ---> Dept
Dept ---> Location

Employee   Dept   Location
E1         D1     Frostburg
E2         D1     Frostburg
E3         D1     Frostburg

Problems?

R1(Employee, Dept)
R2(Dept, Location)
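The 3NF test can be applied to the Employee example by checking each FD against the three conditions of the definition. The sketch below makes the same assumptions as any closure-based check: FDs are (left side, right side) pairs of sets, the helper names are hypothetical, and the key search is brute force.

```python
from itertools import combinations

ATTRS = {"Employee", "Dept", "Location"}
FDS = [({"Employee"}, {"Dept"}), ({"Dept"}, {"Location"})]

def closure(attrs, fds):
    """All attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = set(combo)
            if not any(k <= subset for k in keys) and closure(subset, fds) == attrs:
                keys.append(subset)
    return keys

def violations_3nf(attrs, fds):
    """FDs whose left side is not a superkey and whose right side has non-prime attributes."""
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == attrs:
            continue  # condition 2: left side is a superkey
        offending = rhs - lhs - prime  # drops trivial parts and prime attributes
        if offending:
            bad.append((lhs, offending))
    return bad

print(candidate_keys(ATTRS, FDS))
print(violations_3nf(ATTRS, FDS))
```

Only Dept ---> Location is reported: Dept is not a superkey and Location is not part of any key, which is the transitive dependency that the split into R1 and R2 removes.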

Third Normal Form (3NF):

ItemInfo (item, price, discount)
Item ---> price
Price ---> discount

Item   price   discount
I1     .99     2%
I2     .80     2%
I3     .10     2%
I4     5       10%

Third Normal Form (3NF):

Employee (ID, Name, Expertise, Age, Dept)
ID --> Name
ID --> Expertise
ID --> Age
ID --> Dept
Dept --> Expertise

Third Normal Form (3NF):

Example: R(A,B,C,D)
A,B ---> C
A,C ---> D

So A,B is the primary key
Not in 3NF

R1(A,B,C)
R2(A,C,D)

Boyce Codd Normal Form:

Def: A relation schema R is in BCNF with respect to a set of FDs if it is in 3NF and, whenever X ---> A holds with A ∉ X, X is a superkey.

Boyce Codd Normal Form:

Most 3NF relations are also in BCNF.

A 3NF relation can fail to be in BCNF only when:

The candidate keys in the relation are composite keys (not single attributes)
There is more than one candidate key in the relation, and
The keys are not disjoint (some attributes in the keys are common)

Boyce Codd Normal Form:

A relation is in BCNF if every determinant is a candidate key.

R(A,B,C)
FD: A,B -> C
    C -> A

A is prime, so it is in 3NF
C is not a candidate key (not in BCNF)

Not BCNF, so decompose:
R1(B,C)
R2(A,C)
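The BCNF condition reduces to one closure test per FD: the left side of every nontrivial FD must determine the whole relation. A minimal sketch for R(A,B,C), assuming FDs are (left side, right side) pairs of sets; `closure` and `bcnf_violations` are hypothetical helper names.

```python
ATTRS = {"A", "B", "C"}
FDS = [({"A", "B"}, {"C"}), ({"C"}, {"A"})]

def closure(attrs, fds):
    """All attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Nontrivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attrs
    ]

print(closure({"C"}, FDS))        # C alone does not reach B
print(bcnf_violations(ATTRS, FDS))
```

Only C -> A is reported, since the closure of C is {A, C}; A,B -> C passes because A,B determines every attribute.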

Boyce Codd Normal Form:

S(SupplierNo, sname, status, city)

FD:
SupplierNo ---> status
SupplierNo ---> city
SupplierNo ---> sname
sname ---> status
sname ---> city
sname ---> SupplierNo

It is in BCNF; every determinant is a candidate key.

Boyce Codd Normal Form:

S ( SupplierNo   sname     Status   City       )
    S1           Smith     H        Frostburg
    S2           Johnson   L        LaVale
    S3           Marker    M        Cumberland

Boyce Codd Normal Form:

S(SupplierNo, sname, PartNo, Qty)

FD:
SupplierNo <---> sname (each determines the other)
SupplierNo, PartNo ---> Qty
sname, PartNo ---> Qty

Boyce Codd Normal Form:

S ( SupplierNo   sname   PartNo   Qty )
    S1           Smith   P1       100
    S1           Smith   P2       200
    S1           Smith   P3       300
    S1           Smith   P4       400

It is in 3NF; not in BCNF.
Problem: sname and SupplierNo are determinants, but neither is a candidate key for this relation.

R1(SupplierNo, sname)
R2(sname, PartNo, Qty)

Boyce Codd Normal Form:

ClientInterview (ClientNo, InterviewDate, InterviewTime, StaffID, RoomNo)

ClientNo, InterviewDate -> InterviewTime
ClientNo, InterviewDate -> StaffID
ClientNo, InterviewDate -> RoomNo
StaffID, InterviewDate, InterviewTime -> ClientNo
RoomNo, InterviewDate, InterviewTime -> StaffID
RoomNo, InterviewDate, InterviewTime -> ClientNo
StaffID, InterviewDate -> RoomNo

Boyce Codd Normal Form:

ClientNo   InterviewDate   InterviewTime   StaffID   RoomNo
C25        March 2, 02     10:00           S10       GC104
C28        March 2, 02     11:30           S10       GC104
C72        March 2, 02     1:30            S8        GC103
C28        April 2, 02     10:00           S24       GC103

It is in 3NF
Not in BCNF: (StaffID, InterviewDate) is a determinant but not a candidate key

Boyce Codd Normal Form:

R1(ClientNo, InterviewDate, InterviewTime, StaffID)
R2(StaffID, InterviewDate, RoomNo)

Normal Forms:

Cars(Model, NoCylinders, MadeIn, Tax, Fee)
Model, NoCylinders ---> MadeIn
Model, NoCylinders ---> Tax
Model, NoCylinders ---> Fee
NoCylinders ---> Fee
MadeIn ---> Tax

Normal Forms:

Cars ( Model    NoCylinders   MadeIn   Tax   Fee )
       GM       6             U.S.     $20   $30
       Toyota   4             Japan    $40   $5
       Honda    4             Japan    $40   $5
       VW       5             German   $50   $10

Primary key? Model, NoCylinders

Is it in 1NF?
Is it in 2NF?

Normal Forms:

Cars(Model, NoCylinders, MadeIn, Tax)
Licensing(NoCylinders, Fee)

Normal Forms:

Is it in 3NF?

Cars(Model, NoCylinders, MadeIn)
Taxation(MadeIn, Tax)
Licensing(NoCylinders, Fee)

Assume we have the FD MadeIn ---> NoCylinders

It is not in BCNF

Cars(Model, NoCylinders)
EngineSize(NoCylinders, MadeIn)

Practice:

A: PropertyNo
B: PropertyAddress
C: InspectionDate
D: InspectionTime
E: Comments
F: StaffID
G: StaffName
H: CarRegistrationNo

FD:
A,C -> D,E,F,G,H
A -> B
F -> G
F,C -> H
H,C,D -> A,B,E,F,G
F,C,D -> A,B,E

Multivalued Dependency (MVD)

Multivalued dependencies are a generalization of FDs.

For a relation R with x and y subsets of the attributes of R, we write x --->-> y to say that there is a multivalued dependency of y on x.

Given a value for x, there is a set of values for y.

Multivalued Dependency (MVD)

Example: Name --->-> St, City

Name   St   City
S      S1   C1
S      S2   C2
M      S1   C1
M      S2   C2

Multivalued Dependency (MVD)

R      x     y     R-x-y
t
s
u
v

x --->-> y holds if, whenever t and s are two tuples in R with t[x] = s[x], there are also tuples u and v in R where

1. u[x] = v[x] = t[x] = s[x]
2. u[y] = t[y] and u[R-x-y] = s[R-x-y]
3. v[y] = s[y] and v[R-x-y] = t[R-x-y]

[The relationship between x and y is independent of the relationship between x and R-x-y]

Multivalued Dependency (MVD)

Example:

     Name   St   City   Car
t    S      S1   C1     Ford
s    S      S2   C2     Chev
u    S      S1   C1     Chev
v    S      S2   C2     Ford

1. u[Name] = v[Name] = s[Name] = t[Name]
2. u[St,City] = t[St,City] and u[Car] = s[Car]
3. v[St,City] = s[St,City] and v[Car] = t[Car]
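The three conditions can be tested directly on a relation instance. The sketch below is one way to do it, assuming rows are tuples in the attribute order given by `ATTRS`; the function name `holds_mvd` is hypothetical. As with keys, an instance can show that an MVD fails, but it cannot prove that the MVD holds for every legal state.

```python
from itertools import product

def holds_mvd(rows, attrs, x, y):
    """Check x --->-> y on one relation instance (a list of tuples)."""
    index = {a: i for i, a in enumerate(attrs)}
    rest = [a for a in attrs if a not in x and a not in y]  # R - x - y

    def pick(row, names):
        return tuple(row[index[a]] for a in names)

    present = {(pick(r, x), pick(r, y), pick(r, rest)) for r in rows}
    # For every ordered pair (t, s) that agrees on x, the tuple u with
    # u[y] = t[y] and u[rest] = s[rest] must exist. The tuple v is covered
    # when the same pair is visited in the opposite order.
    for (tx, ty, _), (sx, _, sz) in product(present, repeat=2):
        if tx == sx and (tx, ty, sz) not in present:
            return False
    return True

ATTRS = ("Name", "St", "City", "Car")
ROWS = [
    ("S", "S1", "C1", "Ford"),  # t
    ("S", "S2", "C2", "Chev"),  # s
    ("S", "S1", "C1", "Chev"),  # u
    ("S", "S2", "C2", "Ford"),  # v
]

print(holds_mvd(ROWS, ATTRS, ["Name"], ["St", "City"]))      # t, s, u, v
print(holds_mvd(ROWS[:2], ATTRS, ["Name"], ["St", "City"]))  # only t and s
```

With all four tuples the check succeeds; with only t and s it fails, because the required tuples u and v are missing.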

Fourth Normal Form (4NF):

A relation is in 4NF with respect to a set of MVDs if it is in 3NF and, whenever a nontrivial x --->-> y holds, x is a superkey. (x --->-> y is trivial when y ⊆ x or x ∪ y = R.)

4NF is violated when a record type contains two or more independent multivalued facts about an entity.

4NF and 5NF are, in a sense, also about composite keys.

Fourth Normal Form (4NF):

Example: R(Employee, Skill, Language)

Employee --->-> Skill
Employee --->-> Language

Fourth Normal Form (4NF):

Example: R(Employee, Skill, Language)

Employee   Skill     Language
E1         Cook
E1         Cashier
E1         Manager
E1                   English
E1                   German
E1                   Italian
E2         Cook      German

Fourth Normal Form (4NF):

We have two many-to-many relationships:
Employee and Skill
Employee and Language

Employee --->-> Skill
R1(Employee, Skill)
   <----- key ----->

Employee --->-> Language
R2(Employee, Language)
   <----- key ----->

Fourth Normal Form (4NF):

Employee   Skill
E1         Cook
E1         Cashier
E1         Manager
E2         Cook

Employee   Language
E1         English
E1         German
E1         Italian
E2         German

Fourth Normal Form (4NF):

In fourth normal form, a record should not contain two or more independent multivalued facts about an entity.

Join Dependency (5NF)

R ( SupplierNo   PartNo   ProjectNo )
    S1           P1       N2
    S1           P2       N1
    S2           P1       N1
    S1           P1       N1

R1 ( SupplierNo   PartNo )
     S1           P1
     S1           P2
     S2           P1

R2 ( PartNo   ProjectNo )
     P1       N2
     P2       N1
     P1       N1

Join Dependency

R3 ( SupplierNo   ProjectNo )
     S1           N2
     S1           N1
     S2           N1

Join Dependency

Join R1 and R2 over PartNo

SupplierNo   PartNo   ProjectNo
S1           P1       N2
S1           P2       N1
S2           P1       N1
S2           P1       N2
S1           P1       N1

Join Dependency

Join the result with R3

SupplierNo   PartNo   ProjectNo
S1           P1       N2
S1           P2       N1
S2           P1       N1
S1           P1       N1

Join Dependency

IF (S1,P1) appears in R1
AND (P1,N1) appears in R2
AND (N1,S1) appears in R3
THEN (S1,P1,N1) appears in R

Rewrite:
IF (S1,P1,N2), (S2,P1,N1), (S1,P2,N1) appear in R
THEN (S1,P1,N1) appears in R
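The two-step join can be replayed with plain Python sets. This sketch takes R to be the three tuples named in the rewritten rule plus the implied tuple (S1,P1,N1); the variable names `step1` and `step2` are only for illustration.

```python
# R(SupplierNo, PartNo, ProjectNo): the three tuples from the rule plus (S1,P1,N1).
R = {
    ("S1", "P1", "N2"),
    ("S1", "P2", "N1"),
    ("S2", "P1", "N1"),
    ("S1", "P1", "N1"),
}

# The three binary projections.
R1 = {(s, p) for s, p, _ in R}   # (SupplierNo, PartNo)
R2 = {(p, n) for _, p, n in R}   # (PartNo, ProjectNo)
R3 = {(s, n) for s, _, n in R}   # (SupplierNo, ProjectNo)

# Step 1: join R1 and R2 over PartNo.
step1 = {(s, p, n) for s, p in R1 for p2, n in R2 if p == p2}

# Step 2: join the result with R3 over (SupplierNo, ProjectNo).
step2 = {(s, p, n) for s, p, n in step1 if (s, n) in R3}

print(sorted(step1 - R))   # the spurious tuple produced by the first join
print(step2 == R)
```

The first join produces one tuple that is not in R, namely (S2,P1,N2), and the join with R3 removes it, so R is recovered only from all three projections together.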

Join Dependency

Example:

IF Nelson supplies Screw Drivers
AND Screw Drivers are used in the Pullen project
AND Nelson supplies the Pullen project
THEN Nelson supplies Screw Drivers for the Pullen project

Fifth Normal Form (5NF):

5th normal form deals with cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy.

Join Dependency

If an agent represents a company, the company makes a product, and the agent sells the product, we have:

R ( Agent   Company   Product )
    A1      Ford      Car
    A1      GM        Truck

Fifth Normal Form (5NF):

Let's assume there is a rule:

“If an agent sells a product and s/he represents the company making that product, then s/he sells that product for that company.”

Agent   Company   Product
S1      Ford      Car
S1      Ford      Truck
S1      GM        Car
S1      GM        Truck
S2      Ford      Car

Fifth Normal Form (5NF):

Agent   Company
S1      Ford
S1      GM
S2      Ford

Company   Product
Ford      Car
Ford      Truck
GM        Car
GM        Truck

Agent   Product
S1      Car
S1      Truck
S2      Car
