Lec 08: Normalization BCA 20: DATABASE MANAGEMENT SYSTEM AND PROGRAMMING Department of Information Systems College of Computer Studies Xavier University – Ateneo de Cagayan

DBMS Lecture 8 - Normalization

Page 1: DBMS Lecture 8 - Normalization

Lec 08: Normalization

BCA 20: DATABASE MANAGEMENT SYSTEM AND PROGRAMMING

Department of Information Systems
College of Computer Studies

Xavier University – Ateneo de Cagayan

Page 2: DBMS Lecture 8 - Normalization

Review

1. What is a Table?
2. What is a Column?
3. What is a Row?

Page 3: DBMS Lecture 8 - Normalization

The Apparel Store Case Study

In preparation for next year’s sale event, a certain apparel shop is coming up with ideas for the Item database.

Analyze the table on the succeeding slide, and see how you can improve on it.

Page 4: DBMS Lecture 8 - Normalization

tbl_Items

Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
Polo        Red     12.00  0.60
Sweatshirt  Blue    12.00  0.60

Page 5: DBMS Lecture 8 - Normalization

Anomalies

The above table might be sufficient for a simple database, but later on, errors (or anomalies) can occur when using it.

There are three general types of anomalies: Update, Insertion, and Deletion.

Page 6: DBMS Lecture 8 - Normalization

tbl_Items

Item        Colors       Price  Tax
T-shirt     Red, Blue    12.00  0.60
Polo        Red, Yellow  12.00  0.60
T-shirt     Red, Black   12.00  0.60
Sweatshirt  Blue, Black  12.00  0.60

Page 7: DBMS Lecture 8 - Normalization

Update Anomaly

For example, to update the colors of an item that appears in two or more rows, we have to update the column in every one of those rows, or else the data will become inconsistent.
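The inconsistency can be reproduced with a small sketch using Python's built-in `sqlite3` module. It loads the four rows of tbl_Items into an in-memory database and then changes only one of the two T-shirt rows. The slide talks about colors; the sketch uses the price instead because it is easier to compare, and the new price of 15.00 and the helper name `distinct_prices` are assumptions made for illustration.

```python
import sqlite3

# In-memory copy of tbl_Items as shown on the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)")
conn.executemany(
    "INSERT INTO tbl_Items VALUES (?, ?, ?, ?)",
    [
        ("T-shirt", "Red, Blue", 12.00, 0.60),
        ("Polo", "Red, Yellow", 12.00, 0.60),
        ("T-shirt", "Red, Black", 12.00, 0.60),
        ("Sweatshirt", "Blue, Black", 12.00, 0.60),
    ],
)

# The mistake: change the price in only one of the two T-shirt rows.
# The new price 15.00 is a made-up value for illustration.
conn.execute(
    "UPDATE tbl_Items SET Price = 15.00 "
    "WHERE Item = 'T-shirt' AND Colors = 'Red, Blue'"
)


def distinct_prices(item):
    """Return the different prices currently stored for one item, lowest first."""
    rows = conn.execute(
        "SELECT DISTINCT Price FROM tbl_Items WHERE Item = ? ORDER BY Price",
        (item,),
    ).fetchall()
    return [price for (price,) in rows]


# The T-shirt now has two prices, so the table contradicts itself.
print(distinct_prices("T-shirt"))
```

After the partial update, the same item is stored with two different prices, and nothing in the table says which one is correct.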

Page 9: DBMS Lecture 8 - Normalization

Insertion Anomaly

Suppose we have a new item and know its colors, but its price has not been set yet. We would have to insert NULL for the price, leading to an Insertion Anomaly.
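A minimal sketch of this situation, again with Python's `sqlite3` module. The item "Hoodie" and its color are hypothetical values, not taken from the slides. The second table, which declares Price and Tax as NOT NULL, is an added assumption that shows the stricter case where the new item cannot be stored at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)")

# A new item (hypothetical "Hoodie") whose price has not been decided yet:
# the only way to store it is to leave Price and Tax as NULL.
conn.execute(
    "INSERT INTO tbl_Items (Item, Colors, Price, Tax) VALUES (?, ?, NULL, NULL)",
    ("Hoodie", "Gray"),
)
row = conn.execute(
    "SELECT Item, Colors, Price, Tax FROM tbl_Items WHERE Item = 'Hoodie'"
).fetchone()
print(row)  # NULL comes back as None in Python

# If the table required a price, the same item could not be inserted at all.
strict = sqlite3.connect(":memory:")
strict.execute(
    "CREATE TABLE tbl_Items "
    "(Item TEXT, Colors TEXT, Price REAL NOT NULL, Tax REAL NOT NULL)"
)
try:
    strict.execute("INSERT INTO tbl_Items (Item, Colors) VALUES ('Hoodie', 'Gray')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```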

Page 11: DBMS Lecture 8 - Normalization

Deletion Anomaly

Likewise, if an item is to be dropped and we delete its row, everything recorded in that row (colors, price, and tax) is deleted along with it.
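The effect can be sketched in a few lines with Python's `sqlite3` module. The helper `price_of` is a hypothetical name introduced for this example. Once the Polo row is deleted, the price that was recorded for the Polo is gone as well, because it was stored nowhere else.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)")
conn.executemany(
    "INSERT INTO tbl_Items VALUES (?, ?, ?, ?)",
    [
        ("Polo", "Red, Yellow", 12.00, 0.60),
        ("Sweatshirt", "Blue, Black", 12.00, 0.60),
    ],
)


def price_of(item):
    """Return the stored price of an item, or None if no row mentions it."""
    row = conn.execute(
        "SELECT Price FROM tbl_Items WHERE Item = ?", (item,)
    ).fetchone()
    return None if row is None else row[0]


price_before = price_of("Polo")

# Dropping the Polo removes its only row, and its price and tax with it.
conn.execute("DELETE FROM tbl_Items WHERE Item = 'Polo'")
price_after = price_of("Polo")

print(price_before, price_after)
```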

Page 13: DBMS Lecture 8 - Normalization

The Solution

Through Normalization, we can make sure that the data are logically arranged. There are generally 5 levels of normal forms, but the 3rd normal form is sufficient for most typical database applications.

There are three steps in the Normalization process:

1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)

Page 14: DBMS Lecture 8 - Normalization

Normalization

Normalization is a technique of organizing data in a database through the systematic decomposition of tables in order to eliminate data redundancies and anomalies.

Page 15: DBMS Lecture 8 - Normalization

Normalization

These anomalies refer to Insertion, Update, and Deletion Anomalies.

Page 16: DBMS Lecture 8 - Normalization

Normalization

Normalization ensures that redundant data is eliminated and that data is logically stored (i.e., data dependencies make sense).

Page 17: DBMS Lecture 8 - Normalization

The Apparel Store Case Study

Let’s see how we can apply normalization to the apparel shop’s Item database.

Page 18: DBMS Lecture 8 - Normalization

First Normal Form

In First Normal Form, no row may contain a repeating group of information (i.e., each column must hold a single value in each row). Each table should be organized into rows, and each row should have a primary key.

1NF

Page 19: DBMS Lecture 8 - Normalization

The Primary Key

The Primary Key is a single column (or a combination of two or more columns) that uniquely identifies each row.

We will use primary keys to help us in the Normalization process.
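As a small illustration of what "uniquely identifies each row" means in practice, the sketch below declares a composite primary key with Python's `sqlite3` module. Using the pair (Item, Color) as the key is an assumption made for this example. The database accepts a second T-shirt row with a different color but rejects an exact repeat of an existing key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite primary key: the pair (Item, Color) identifies a row.
conn.execute(
    "CREATE TABLE tbl_Items ("
    "Item TEXT, Color TEXT, Price REAL, Tax REAL, "
    "PRIMARY KEY (Item, Color))"
)
conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Red', 12.00, 0.60)")
# Same item with a different color is a different key, so it is accepted.
conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Blue', 12.00, 0.60)")

try:
    # Same (Item, Color) pair again: the primary key forbids it.
    conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Red', 12.00, 0.60)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM tbl_Items").fetchone()[0]
print(duplicate_rejected, row_count)
```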

Page 20: DBMS Lecture 8 - Normalization

First Normal Form

Remember, in First Normal Form, each row must not have a column in which more than one value is saved (like values separated with commas). Also, each row must be unique and distinguished by a primary key.

Let’s check tbl_Items against these rules:

1NF

Page 21: DBMS Lecture 8 - Normalization

tbl_Items

Item        Colors       Price  Tax
T-shirt     Red, Blue    12.00  0.60
Polo        Red, Yellow  12.00  0.60
T-shirt     Red, Black   12.00  0.60
Sweatshirt  Blue, Black  12.00  0.60

Page 22: DBMS Lecture 8 - Normalization

First Normal Form

Table is not in 1st normal form because:
- Multiple items in color field
- Duplicate records / no primary key

SOLUTION: BREAK IT DOWN

1NF

Page 23: DBMS Lecture 8 - Normalization

tbl_Items

Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
T-shirt     Blue    12.00  0.60
Polo        Red     12.00  0.60
Polo        Yellow  12.00  0.60
Sweatshirt  Blue    12.00  0.60
Sweatshirt  Black   12.00  0.60
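The "break it down" step can be sketched as a short Python function that splits each comma-separated color list into one row per color and drops rows that would repeat. The function name `to_first_normal_form` and the choice to capitalize color names are assumptions made for this illustration.

```python
def to_first_normal_form(rows):
    """Split comma-separated colors into one row per color, skipping repeats."""
    seen = set()
    result = []
    for item, colors, price, tax in rows:
        for color in colors.split(","):
            # Trim spaces and normalize case so "blue" and "Blue" match.
            row = (item, color.strip().capitalize(), price, tax)
            if row not in seen:
                seen.add(row)
                result.append(row)
    return result


# The four rows of the table that is not yet in 1NF.
unnormalized = [
    ("T-shirt", "Red,blue", 12.00, 0.60),
    ("Polo", "Red, Yellow", 12.00, 0.60),
    ("T-shirt", "Red, Black", 12.00, 0.60),
    ("Sweatshirt", "Blue, Black", 12.00, 0.60),
]

items_1nf = to_first_normal_form(unnormalized)
for row in items_1nf:
    print(row)
```

Applied to the four original rows, this sketch produces seven rows: the six listed in the table, plus a T-shirt / Black row that comes from the "Red, Black" entry.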

Page 24: DBMS Lecture 8 - Normalization

Second Normal Form

A table in Second Normal Form must first be in First Normal Form, and it must not have any partial dependencies.

All non-key fields must depend on all components of the primary key. This is guaranteed when the primary key is a single field.

2NF

Page 25: DBMS Lecture 8 - Normalization

Partial Dependency

A Partial Dependency refers to non-key attributes which depend on only part of a composite primary key (a primary key made up of two or more columns).

Let’s take a look at the table and see how this applies.

Page 26: DBMS Lecture 8 - Normalization

tbl_Items

Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
T-shirt     Blue    12.00  0.60
Polo        Red     12.00  0.60
Polo        Yellow  12.00  0.60
Sweatshirt  Blue    12.00  0.60
Sweatshirt  Black   12.00  0.60

Page 27: DBMS Lecture 8 - Normalization

Second Normal Form

Table is not in second normal form because:
- PRICE and TAX depend on ITEM, but not COLOR

2NF
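One way to test a suspected dependency against sample rows is sketched below. The function `determines` is a hypothetical helper, and treating (Item, Color) as the composite primary key is an assumption. The check reports whether rows that agree on the left-hand columns always agree on the right-hand columns.

```python
def determines(rows, lhs, rhs):
    """Return True if rows with equal lhs values always have equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in lhs)
        value = tuple(row[column] for column in rhs)
        # Remember the first value seen for this key; any later mismatch
        # means lhs does not determine rhs in this data.
        if seen.setdefault(key, value) != value:
            return False
    return True


items = [
    {"Item": "T-shirt", "Color": "Red", "Price": 12.00, "Tax": 0.60},
    {"Item": "T-shirt", "Color": "Blue", "Price": 12.00, "Tax": 0.60},
    {"Item": "Polo", "Color": "Red", "Price": 12.00, "Tax": 0.60},
    {"Item": "Polo", "Color": "Yellow", "Price": 12.00, "Tax": 0.60},
    {"Item": "Sweatshirt", "Color": "Blue", "Price": 12.00, "Tax": 0.60},
    {"Item": "Sweatshirt", "Color": "Black", "Price": 12.00, "Tax": 0.60},
]

# Item alone, only part of the key (Item, Color), already fixes Price and Tax.
item_fixes_price_and_tax = determines(items, ["Item"], ["Price", "Tax"])
# Item alone does not fix Color, so Color has to stay in the key.
item_fixes_color = determines(items, ["Item"], ["Color"])
print(item_fixes_price_and_tax, item_fixes_color)
```

A check like this can only rule a dependency out. Because every sample row happens to have the same price, the data alone cannot confirm which column the price really depends on; that comes from the business rule that each item has one price.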

Page 28: DBMS Lecture 8 - Normalization

tbl_ColorItem

Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black

tbl_PriceItem

Item        Price  Tax
T-shirt     12.00  0.60
Polo        12.00  0.60
Sweatshirt  12.00  0.60
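The split into tbl_ColorItem and tbl_PriceItem loses no information, which can be checked by joining the two tables back together. The sketch below uses Python's `sqlite3` module; the key declarations and the example price of 15.00 are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_PriceItem (
        Item  TEXT PRIMARY KEY,
        Price REAL,
        Tax   REAL
    );
    CREATE TABLE tbl_ColorItem (
        Item  TEXT REFERENCES tbl_PriceItem(Item),
        Color TEXT,
        PRIMARY KEY (Item, Color)
    );
    """
)
conn.executemany(
    "INSERT INTO tbl_PriceItem VALUES (?, ?, ?)",
    [("T-shirt", 12.00, 0.60), ("Polo", 12.00, 0.60), ("Sweatshirt", 12.00, 0.60)],
)
conn.executemany(
    "INSERT INTO tbl_ColorItem VALUES (?, ?)",
    [
        ("T-shirt", "Red"), ("T-shirt", "Blue"),
        ("Polo", "Red"), ("Polo", "Yellow"),
        ("Sweatshirt", "Blue"), ("Sweatshirt", "Black"),
    ],
)

# Joining on Item rebuilds the six rows of the single 1NF table.
joined = conn.execute(
    """
    SELECT c.Item, c.Color, p.Price, p.Tax
    FROM tbl_ColorItem AS c
    JOIN tbl_PriceItem AS p ON p.Item = c.Item
    ORDER BY c.Item, c.Color
    """
).fetchall()

# A price change now touches exactly one row, however many colors exist.
cursor = conn.execute("UPDATE tbl_PriceItem SET Price = 15.00 WHERE Item = 'T-shirt'")
rows_changed = cursor.rowcount
print(len(joined), rows_changed)
```

Because each item's price is stored once, the update anomaly described earlier can no longer occur for Price and Tax.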

Page 29: DBMS Lecture 8 - Normalization

Third Normal Form

Tables in Third Normal Form must first be in Second Normal Form, and all non-prime attributes of each table must depend directly on the primary key, with no transitive dependencies.

3NF

Page 30: DBMS Lecture 8 - Normalization

Transitive Dependency

A Transitive Dependency refers to non-key attributes which depend on another non-key attribute.

Page 31: DBMS Lecture 8 - Normalization

Third Normal Form

Tables in Third Normal Form must first be in Second Normal Form, and all non-prime attributes of each table must depend directly on the primary key, with no transitive dependencies.

Let’s look at the tables again:

3NF

Page 32: DBMS Lecture 8 - Normalization

tbl_ColorItem

Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black

tbl_PriceItem

Item        Price  Tax
T-shirt     12.00  0.60
Polo        12.00  0.60
Sweatshirt  12.00  0.60

Page 33: DBMS Lecture 8 - Normalization

Third Normal Form

Tables are not in third normal form because:
- TAX depends on PRICE, not ITEM

3NF

Page 34: DBMS Lecture 8 - Normalization

tbl_ColorItem

Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black

tbl_PriceItem

Item        Price
T-shirt     12.00
Polo        12.00
Sweatshirt  12.00

tbl_Tax

Price  Tax
12.00  0.60
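With the tax moved into tbl_Tax, an item's tax is found by following Item to Price and then Price to Tax. The sketch below shows that lookup with Python's `sqlite3` module. It leaves out tbl_ColorItem, which is unchanged, and the helper `tax_for` and the new tax value 0.75 are assumptions made for illustration. Using a REAL price as a key follows the slide and is not necessarily how a production schema would do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Tax (
        Price REAL PRIMARY KEY,
        Tax   REAL
    );
    CREATE TABLE tbl_PriceItem (
        Item  TEXT PRIMARY KEY,
        Price REAL REFERENCES tbl_Tax(Price)
    );
    """
)
conn.execute("INSERT INTO tbl_Tax VALUES (12.00, 0.60)")
conn.executemany(
    "INSERT INTO tbl_PriceItem VALUES (?, ?)",
    [("T-shirt", 12.00), ("Polo", 12.00), ("Sweatshirt", 12.00)],
)


def tax_for(item):
    """Follow Item -> Price -> Tax through the two tables."""
    row = conn.execute(
        """
        SELECT t.Tax
        FROM tbl_PriceItem AS p
        JOIN tbl_Tax AS t ON t.Price = p.Price
        WHERE p.Item = ?
        """,
        (item,),
    ).fetchone()
    return None if row is None else row[0]


tax_before = tax_for("Polo")

# The tax for a 12.00 price is stored once, so one update covers every item.
conn.execute("UPDATE tbl_Tax SET Tax = 0.75 WHERE Price = 12.00")
taxes_after = [tax_for(item) for item in ("T-shirt", "Polo", "Sweatshirt")]
print(tax_before, taxes_after)
```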

Page 35: DBMS Lecture 8 - Normalization

Another Example

Page 36: DBMS Lecture 8 - Normalization

Table_Assignment

Name         Assignment A     Assignment B
Jeff Smith   Article Summary  Poetry Analysis
Nancy Jones  Article Summary  Reaction Paper
Jane Scott   Article Summary  Poetry Analysis

Page 37: DBMS Lecture 8 - Normalization

Problem:

Table is not in first normal form because:
- Assignment field repeating
- First and last name in one field
- No (guaranteed unique) primary key field

Page 38: DBMS Lecture 8 - Normalization

Solution:

Break down the field NAME into First Name and Last Name.

1NF

Page 39: DBMS Lecture 8 - Normalization

tbl_Assignment

First Name  Last Name  Assignment 1     Assignment 2
Jeff        Smith      Article Summary  Poetry Analysis
Nancy       Jones      Article Summary  Reaction Paper
Jane        Scott      Article Summary  Poetry Analysis

Page 40: DBMS Lecture 8 - Normalization

No Primary Key?

Answer: Create another field. In this case, name it Student ID.

1NF
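In many database systems the new ID field can be filled in automatically. The sketch below uses Python's `sqlite3` module, where a column declared `INTEGER PRIMARY KEY` receives the next whole number when no value is supplied. The column names with underscores are an assumed spelling of the slide's field names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tbl_Assignment (
        Student_ID   INTEGER PRIMARY KEY,  -- filled in by SQLite when omitted
        First_Name   TEXT,
        Last_Name    TEXT,
        Assignment_1 TEXT,
        Assignment_2 TEXT
    )
    """
)

# No Student_ID is given; the database assigns 1, 2, 3 in insertion order.
conn.executemany(
    "INSERT INTO tbl_Assignment "
    "(First_Name, Last_Name, Assignment_1, Assignment_2) VALUES (?, ?, ?, ?)",
    [
        ("Jeff", "Smith", "Article Summary", "Poetry Analysis"),
        ("Nancy", "Jones", "Article Summary", "Reaction Paper"),
        ("Jane", "Scott", "Article Summary", "Poetry Analysis"),
    ],
)

students = conn.execute(
    "SELECT Student_ID, First_Name, Last_Name FROM tbl_Assignment ORDER BY Student_ID"
).fetchall()
print(students)
```

A generated number stays unique even if two students share the same first and last name, which is why it is safer than using the name as the key.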

Page 41: DBMS Lecture 8 - Normalization

tbl_Assignment

Student ID  First Name  Last Name  Assignment 1     Assignment 2
1           Jeff        Smith      Article Summary  Poetry Analysis
2           Nancy       Jones      Article Summary  Reaction Paper
3           Jane        Scott      Article Summary  Poetry Analysis

Page 42: DBMS Lecture 8 - Normalization

Seems okay, right?

Look again at the table.

1NF

Page 44: DBMS Lecture 8 - Normalization

Problem: Assignment field repeating

Solution: Create new fields (Assignment ID and Description)

1NF

Page 45: DBMS Lecture 8 - Normalization

tbl_Assignment

Student ID  First Name  Last Name  Assignment ID  Description
1           Jeff        Smith      A              Article Summary
1           Jeff        Smith      B              Poetry Analysis
2           Nancy       Jones      A              Article Summary
2           Nancy       Jones      C              Reaction Paper
3           Jane        Scott      A              Article Summary
3           Jane        Scott      B              Poetry Analysis
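Turning the repeating Assignment 1 and Assignment 2 columns into one row per assignment can be sketched in Python as follows. The function name `unpivot_assignments` is hypothetical, and assigning the letters A, B, C in order of first appearance is an assumption that happens to reproduce the IDs in the table above.

```python
from string import ascii_uppercase


def unpivot_assignments(rows):
    """Emit one row per (student, assignment) instead of one column per assignment."""
    assignment_ids = {}  # description -> letter, in order of first appearance
    result = []
    for student_id, first_name, last_name, *descriptions in rows:
        for description in descriptions:
            if description not in assignment_ids:
                assignment_ids[description] = ascii_uppercase[len(assignment_ids)]
            result.append(
                (student_id, first_name, last_name,
                 assignment_ids[description], description)
            )
    return result


wide_rows = [
    (1, "Jeff", "Smith", "Article Summary", "Poetry Analysis"),
    (2, "Nancy", "Jones", "Article Summary", "Reaction Paper"),
    (3, "Jane", "Scott", "Article Summary", "Poetry Analysis"),
]

long_rows = unpivot_assignments(wide_rows)
for row in long_rows:
    print(row)
```

Because each assignment is now its own row, a third or fourth assignment for a student needs a new row rather than a new column.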

Page 46: DBMS Lecture 8 - Normalization

Table is not in 2NF since:
- Description does not depend on Student ID

2NF

Page 48: DBMS Lecture 8 - Normalization

tbl_Student

Student ID  First Name  Last Name
1           Jeff        Smith
2           Nancy       Jones
3           Jane        Scott

tbl_Assignment

Student ID  Assignment ID  Description
1           A              Article Summary
1           B              Poetry Analysis
2           A              Article Summary
2           C              Reaction Paper
3           A              Article Summary
3           B              Poetry Analysis

Page 49: DBMS Lecture 8 - Normalization

Table is not in 3NF since:
- Description still does not depend on Student ID
- Data repetition

3NF

Page 50: DBMS Lecture 8 - Normalization

tbl_Student

Student ID  First Name  Last Name
1           Jeff        Smith
2           Nancy       Jones
3           Jane        Scott

tbl_Assignment

Student ID  Assignment ID
1           A
2           A
3           A
1           B
3           B
2           C

tbl_Descript

Assignment ID  Description
A              Article Summary
B              Poetry Analysis
C              Reaction Paper
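To confirm that the three tables still hold everything the single table did, they can be joined back together. The sketch below uses Python's `sqlite3` module. The underscore column names and the key declarations are assumptions, and the assignment IDs use the letters A to C so that tbl_Descript matches the IDs stored in tbl_Assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Student (
        Student_ID INTEGER PRIMARY KEY,
        First_Name TEXT,
        Last_Name  TEXT
    );
    CREATE TABLE tbl_Descript (
        Assignment_ID TEXT PRIMARY KEY,
        Description   TEXT
    );
    CREATE TABLE tbl_Assignment (
        Student_ID    INTEGER REFERENCES tbl_Student(Student_ID),
        Assignment_ID TEXT REFERENCES tbl_Descript(Assignment_ID),
        PRIMARY KEY (Student_ID, Assignment_ID)
    );
    """
)
conn.executemany(
    "INSERT INTO tbl_Student VALUES (?, ?, ?)",
    [(1, "Jeff", "Smith"), (2, "Nancy", "Jones"), (3, "Jane", "Scott")],
)
conn.executemany(
    "INSERT INTO tbl_Descript VALUES (?, ?)",
    [("A", "Article Summary"), ("B", "Poetry Analysis"), ("C", "Reaction Paper")],
)
conn.executemany(
    "INSERT INTO tbl_Assignment VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "A"), (1, "B"), (3, "B"), (2, "C")],
)

# Rebuild the single wide table from the three normalized ones.
report = conn.execute(
    """
    SELECT s.Student_ID, s.First_Name, s.Last_Name, d.Assignment_ID, d.Description
    FROM tbl_Assignment AS a
    JOIN tbl_Student  AS s ON s.Student_ID = a.Student_ID
    JOIN tbl_Descript AS d ON d.Assignment_ID = a.Assignment_ID
    ORDER BY s.Student_ID, d.Assignment_ID
    """
).fetchall()
for row in report:
    print(row)
```

Each description is now stored once in tbl_Descript, so renaming an assignment means changing a single row.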

Page 51: DBMS Lecture 8 - Normalization

Normalization for Non-IT Professionals

While the process of Normalization can be tricky for non-IT students and professionals, everyone should still be able to create logically-sound databases in the Third Normal Form.

Page 52: DBMS Lecture 8 - Normalization

Summary

Normalization is the systematic decomposition of tables in order to eliminate data redundancies and anomalies. This lecture covered three normal forms: 1NF, 2NF, and 3NF.

A Primary Key is a single column (or a combination of two or more columns) that uniquely identifies each row.

A Partial Dependency refers to non-key attributes which depend on only part of the primary key.

A Transitive Dependency refers to non-key attributes which depend on another non-key attribute.

Page 53: DBMS Lecture 8 - Normalization

Exercise 1

Normalize the following “Pet_Health” table to 3NF:

Pet_ID  Pet_Name      Pet_Type  Pet_Age  Owner
771     Rover         Dog       12       Sam Villa
204     Spot          Dog       2        Anna Dy
348     Mrs Whiskers  Cat       4        Sam Villa

Page 54: DBMS Lecture 8 - Normalization

Exercise 2

Item_ID  Item_Name  Item_Desc                     Supplier_Name           Address              PO_num  PO_date
A101     BckBP      One box of black ballpens     De Oro Office Supplies  Cagayan de Oro City  20986   12-11-2014
A102     BluBP      One box of blue ballpens      De Oro Office Supplies  Cagayan de Oro City  20986   12-11-2014
P100     SBP        One ream of short bond paper  King Papers             Cagayan de Oro City  1217    02-10-2011
P100     SBP        One ream of short bond paper  Office Depot            Iligan City          21044   01-05-2015

Page 55: DBMS Lecture 8 - Normalization

End

References:
www.lib.ku.edu/instruction