DBMS Lecture 8 - Normalization


Lec 08: Normalization

BCA 20: DATABASE MANAGEMENT SYSTEM AND PROGRAMMING

Department of Information Systems, College of Computer Studies

Xavier University – Ateneo de Cagayan

Review

1. What is a Table?
2. What is a Column?
3. What is a Row?

The Apparel Store Case Study

In preparation for next year’s sale event, a certain apparel shop is coming up with ideas for the Item database. Analyze the table below, and see how you can improve on it.

tbl_Items

Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
Polo        Red     12.00  0.60
Sweatshirt  Blue    12.00  0.60

Anomalies

The table above might be sufficient for a simple database, but errors (or anomalies) can occur later when using it.

There are three general types of anomalies: Update, Insertion, and Deletion.

tbl_Items

Item        Colors       Price  Tax
T-shirt     Red, Blue    12.00  0.60
Polo        Red, Yellow  12.00  0.60
T-shirt     Red, Black   12.00  0.60
Sweatshirt  Blue, Black  12.00  0.60

Update Anomaly

For example, the T-shirt occurs in more than one row of the table. To update its colors (or its price or tax), we have to update the column in every one of those rows, or else the data will become inconsistent.
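The update anomaly can be reproduced in a few lines. This is only a sketch: the rows are copied from tbl_Items, while the function names and the new price of 15.00 are made up for illustration. It uses the price column rather than the colors because the inconsistency is easier to see there.

```python
# Rows copied from tbl_Items as (item, colors, price, tax).
rows = [
    ("T-shirt", "Red, Blue", 12.00, 0.60),
    ("Polo", "Red, Yellow", 12.00, 0.60),
    ("T-shirt", "Red, Black", 12.00, 0.60),
    ("Sweatshirt", "Blue, Black", 12.00, 0.60),
]


def update_first_match(rows, item, new_price):
    """Careless update: changes only the first row found for the item."""
    updated = list(rows)
    for i, (name, colors, price, tax) in enumerate(updated):
        if name == item:
            updated[i] = (name, colors, new_price, tax)
            break
    return updated


def prices_for(rows, item):
    """Collect the distinct prices stored for one item."""
    return {price for name, _, price, _ in rows if name == item}


# 15.00 is a hypothetical new price, not a value from the slides.
after = update_first_match(rows, "T-shirt", 15.00)
print(prices_for(after, "T-shirt"))  # the T-shirt now has two different prices
```

Because the T-shirt is stored in two rows, missing one of them leaves the table disagreeing with itself about the price.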


Insertion Anomaly

Suppose that for a new item we have the item name and its colors, but no price has been decided yet. We then have to insert NULL in the price and tax columns, leading to an Insertion Anomaly.
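A minimal sketch of the insertion anomaly, using Python's built-in sqlite3 module with an in-memory database. The 'Hoodie' item and its color are hypothetical values added only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)"
)
conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Red, Blue', 12.00, 0.60)")

# Hypothetical new item: name and color are known, price and tax are not,
# so those two columns are left as NULL.
conn.execute("INSERT INTO tbl_Items (Item, Colors) VALUES ('Hoodie', 'Green')")

missing = conn.execute(
    "SELECT Item FROM tbl_Items WHERE Price IS NULL"
).fetchall()
print(missing)  # [('Hoodie',)]
```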


Deletion Anomaly

Likewise, if an item is dropped and we delete its row, everything else stored in that row (its colors, price, and tax) is deleted along with it.
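The deletion anomaly can be sketched the same way with sqlite3. The rows come from tbl_Items; the choice of the Polo as the dropped item is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)"
)
conn.executemany(
    "INSERT INTO tbl_Items VALUES (?, ?, ?, ?)",
    [
        ("T-shirt", "Red, Blue", 12.00, 0.60),
        ("Polo", "Red, Yellow", 12.00, 0.60),
        ("T-shirt", "Red, Black", 12.00, 0.60),
        ("Sweatshirt", "Blue, Black", 12.00, 0.60),
    ],
)

# Dropping the Polo removes its only row, and with it the Polo's price and tax.
conn.execute("DELETE FROM tbl_Items WHERE Item = 'Polo'")
polo_price = conn.execute(
    "SELECT Price FROM tbl_Items WHERE Item = 'Polo'"
).fetchone()
print(polo_price)  # None
```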


The Solution

Through Normalization, we can make sure that the data are logically arranged. Five levels of normal forms are commonly listed, but the third normal form is sufficient for most typical database applications.

We will therefore work through three steps in the Normalization process:

- First Normal Form (1NF);
- Second Normal Form (2NF); and
- Third Normal Form (3NF)

Normalization

Normalization is a technique of organizing data in a database through the systematic decomposition of tables in order to eliminate data redundancies and anomalies.

These anomalies are the Insertion, Update, and Deletion Anomalies.

Normalization ensures that redundant data is eliminated and that data is logically stored (i.e. data dependencies make sense).

The Apparel Store Case Study

Let’s see how we can apply normalization to the apparel store’s Item database.

First Normal Form (1NF)

In First Normal Form, no row may contain a repeating group of information (i.e. each column of a row must hold a single value). Each table should be organized into rows, and each row should have a primary key.

The Primary Key

The Primary Key is a single column (or a combination of two or more columns) that uniquely identifies each row. We will use primary keys to help us in the Normalization process.
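As a small illustration of "uniquely identifies each row", the sketch below declares a composite primary key in SQLite and tries to insert the same row twice. Choosing (Item, Color) as the key is an assumption made for this example; the slides do not name the key columns at this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tbl_Items (
        Item  TEXT,
        Color TEXT,
        Price REAL,
        Tax   REAL,
        PRIMARY KEY (Item, Color)  -- assumed composite key
    )
    """
)
conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Red', 12.00, 0.60)")

try:
    # Same (Item, Color) pair again: the primary key forbids it.
    conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Red', 12.00, 0.60)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```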

First Normal Form (1NF)

Remember, in First Normal Form, no row may have a column in which more than one value is saved (for example, values separated by commas). Also, each row must be unique and distinguished by a primary key. Let’s look at tbl_Items again:

tbl_Items

Item        Colors       Price  Tax
T-shirt     Red, Blue    12.00  0.60
Polo        Red, Yellow  12.00  0.60
T-shirt     Red, Black   12.00  0.60
Sweatshirt  Blue, Black  12.00  0.60

First Normal Form (1NF)

The table is not in first normal form because:

- Multiple items in the color field
- Duplicate records / no primary key

Solution: break it down, so that each row holds a single color.

tbl_Items

Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
T-shirt     Blue    12.00  0.60
Polo        Red     12.00  0.60
Polo        Yellow  12.00  0.60
Sweatshirt  Blue    12.00  0.60
Sweatshirt  Black   12.00  0.60
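The "break it down" step can be sketched in Python: split each comma-separated color list into one row per color and drop exact duplicates. The function name to_first_normal_form is made up for this sketch.

```python
# The unnormalized rows, as (item, colors, price, tax).
unnormalized = [
    ("T-shirt", "Red,blue", 12.00, 0.60),
    ("Polo", "Red, Yellow", 12.00, 0.60),
    ("T-shirt", "Red, Black", 12.00, 0.60),
    ("Sweatshirt", "Blue, Black", 12.00, 0.60),
]


def to_first_normal_form(rows):
    """Emit one row per (item, color) and skip rows already seen."""
    seen = set()
    result = []
    for item, colors, price, tax in rows:
        for color in colors.split(","):
            row = (item, color.strip().capitalize(), price, tax)
            if row not in seen:
                seen.add(row)
                result.append(row)
    return result


for row in to_first_normal_form(unnormalized):
    print(row)
```

Run on the four rows as printed in the unnormalized table, the sketch also yields a (T-shirt, Black) row, which the 1NF table on the slide does not list.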

Second Normal Form (2NF)

A table in Second Normal Form must first be in First Normal Form, and it must not have any partial dependencies. That is, every non-key field must depend on all components of the primary key. This is guaranteed when the primary key is a single field.

Partial Dependency

A Partial Dependency exists when a non-key attribute depends on only part of a composite primary key (a primary key made up of two or more columns). Let’s take a look at the table and see how this applies.

tbl_Items

Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
T-shirt     Blue    12.00  0.60
Polo        Red     12.00  0.60
Polo        Yellow  12.00  0.60
Sweatshirt  Blue    12.00  0.60
Sweatshirt  Black   12.00  0.60

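Second Normal Form (2NF)

The table is not in second normal form because:

- PRICE and TAX depend on ITEM, but not on COLOR

Here the primary key is the combination of ITEM and COLOR, so PRICE and TAX are only partially dependent on it.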
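One way to spot a partial dependency in sample data is to test whether part of the key already determines a non-key column. The helper below, determines, is a hypothetical name for this sketch. It can only show that a dependency holds in the rows given, not that it holds as a business rule.

```python
# The 1NF rows of tbl_Items as dictionaries.
rows = [
    {"Item": "T-shirt", "Color": "Red", "Price": 12.00, "Tax": 0.60},
    {"Item": "T-shirt", "Color": "Blue", "Price": 12.00, "Tax": 0.60},
    {"Item": "Polo", "Color": "Red", "Price": 12.00, "Tax": 0.60},
    {"Item": "Polo", "Color": "Yellow", "Price": 12.00, "Tax": 0.60},
    {"Item": "Sweatshirt", "Color": "Blue", "Price": 12.00, "Tax": 0.60},
    {"Item": "Sweatshirt", "Color": "Black", "Price": 12.00, "Tax": 0.60},
]


def determines(rows, lhs, rhs):
    """True if equal values in the lhs columns always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Item alone (only part of the key) already fixes Price and Tax.
print(determines(rows, ["Item"], ["Price", "Tax"]))  # True
# Item alone does not fix Color, so Color must stay in the key.
print(determines(rows, ["Item"], ["Color"]))  # False
```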

tbl_ColorItem

Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black

tbl_PriceItem

Item        Price  Tax
T-shirt     12.00  0.60
Polo        12.00  0.60
Sweatshirt  12.00  0.60
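The split into tbl_ColorItem and tbl_PriceItem can be sketched as two projections of the 1NF rows. Joining them back on Item gives the original rows again, which is a quick check that no information was lost. The variable names are chosen for this sketch only.

```python
# The 1NF rows of tbl_Items as (item, color, price, tax).
rows = [
    ("T-shirt", "Red", 12.00, 0.60),
    ("T-shirt", "Blue", 12.00, 0.60),
    ("Polo", "Red", 12.00, 0.60),
    ("Polo", "Yellow", 12.00, 0.60),
    ("Sweatshirt", "Blue", 12.00, 0.60),
    ("Sweatshirt", "Black", 12.00, 0.60),
]

# Projection 1: which colors each item comes in.
color_item = sorted({(item, color) for item, color, _, _ in rows})
# Projection 2: one price and tax per item (duplicates collapse in the set).
price_item = sorted({(item, price, tax) for item, _, price, tax in rows})

# Join the two tables back together on Item.
rejoined = sorted(
    (item, color, price, tax)
    for item, color in color_item
    for other, price, tax in price_item
    if item == other
)

print(len(color_item), len(price_item))  # 6 3
print(rejoined == sorted(rows))  # True
```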

Third Normal Form (3NF)

Tables in Third Normal Form must first be in Second Normal Form, and all non-prime attributes of each table must depend only on the primary key, with no transitive dependencies.

Transitive Dependency

A Transitive Dependency exists when a non-key attribute depends on another non-key attribute.

Let’s look at the tables again:

tbl_ColorItem

Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black

tbl_PriceItem

Item        Price  Tax
T-shirt     12.00  0.60
Polo        12.00  0.60
Sweatshirt  12.00  0.60

Third Normal Form (3NF)

The tables are not in third normal form because:

- TAX depends on PRICE, not on ITEM
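A quick way to see that TAX follows PRICE in tbl_PriceItem is to build a price-to-tax lookup and check that no price ever maps to two different taxes. This is a sketch on the sample rows only; the variable names are made up.

```python
# tbl_PriceItem rows as (item, price, tax).
price_item = [
    ("T-shirt", 12.00, 0.60),
    ("Polo", 12.00, 0.60),
    ("Sweatshirt", 12.00, 0.60),
]

tax_by_price = {}
consistent = True
for item, price, tax in price_item:
    # setdefault stores the first tax seen for a price and returns it afterwards.
    if tax_by_price.setdefault(price, tax) != tax:
        consistent = False

print(consistent, tax_by_price)  # True {12.0: 0.6}
```

Since one lookup entry covers all three items, the tax does not need to be stored per item.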

tbl_ColorItem

Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black

tbl_PriceItem

Item        Price
T-shirt     12.00
Polo        12.00
Sweatshirt  12.00

tbl_Tax

Price  Tax
12.00  0.60
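The three 3NF tables can be tried out with sqlite3. The sketch below assumes the key choices shown in the CREATE TABLE statements, and the new tax value of 0.72 is hypothetical. It shows that a tax change is now a single-row update, and that a JOIN still rebuilds the full item list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Tax (Price REAL PRIMARY KEY, Tax REAL);
    CREATE TABLE tbl_PriceItem (
        Item  TEXT PRIMARY KEY,
        Price REAL REFERENCES tbl_Tax(Price)
    );
    CREATE TABLE tbl_ColorItem (
        Item  TEXT REFERENCES tbl_PriceItem(Item),
        Color TEXT,
        PRIMARY KEY (Item, Color)
    );
    INSERT INTO tbl_Tax VALUES (12.00, 0.60);
    INSERT INTO tbl_PriceItem VALUES
        ('T-shirt', 12.00), ('Polo', 12.00), ('Sweatshirt', 12.00);
    INSERT INTO tbl_ColorItem VALUES
        ('T-shirt', 'Red'), ('T-shirt', 'Blue'),
        ('Polo', 'Red'), ('Polo', 'Yellow'),
        ('Sweatshirt', 'Blue'), ('Sweatshirt', 'Black');
    """
)

# One UPDATE on one row changes the tax for every item at that price.
# 0.72 is a hypothetical new tax amount.
conn.execute("UPDATE tbl_Tax SET Tax = 0.72 WHERE Price = 12.00")

rows = conn.execute(
    """
    SELECT c.Item, c.Color, p.Price, t.Tax
    FROM tbl_ColorItem AS c
    JOIN tbl_PriceItem AS p ON p.Item = c.Item
    JOIN tbl_Tax AS t ON t.Price = p.Price
    ORDER BY c.Item, c.Color
    """
).fetchall()

for row in rows:
    print(row)
```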

Another Example

Table_Assignment

Name         Assignment A     Assignment B
Jeff Smith   Article Summary  Poetry Analysis
Nancy Jones  Article Summary  Reaction Paper
Jane Scott   Article Summary  Poetry Analysis

Problem: The table is not in first normal form because:

- The Assignment field is repeating
- First and last name are in one field
- There is no (guaranteed unique) primary key field

Solution: Break down the field NAME into First Name and Last Name.

tbl_Assignment

First Name  Last Name  Assignment 1     Assignment 2
Jeff        Smith      Article Summary  Poetry Analysis
Nancy       Jones      Article Summary  Reaction Paper
Jane        Scott      Article Summary  Poetry Analysis

No primary key? Answer: create another field. In this case, name it Student ID.

tbl_Assignment

Student ID  First Name  Last Name  Assignment 1     Assignment 2
1           Jeff        Smith      Article Summary  Poetry Analysis
2           Nancy       Jones      Article Summary  Reaction Paper
3           Jane        Scott      Article Summary  Poetry Analysis

Seems okay, right?

Look at the table again:

tbl_Assignment

Student ID  First Name  Last Name  Assignment 1     Assignment 2
1           Jeff        Smith      Article Summary  Poetry Analysis
2           Nancy       Jones      Article Summary  Reaction Paper
3           Jane        Scott      Article Summary  Poetry Analysis

Problem: The Assignment field is repeating.

Solution: Create new fields (Assignment ID and Description).

tbl_Assignment

Student ID  First Name  Last Name  Assignment ID  Description
1           Jeff        Smith      A              Article Summary
1           Jeff        Smith      B              Poetry Analysis
2           Nancy       Jones      A              Article Summary
2           Nancy       Jones      C              Reaction Paper
3           Jane        Scott      A              Article Summary
3           Jane        Scott      B              Poetry Analysis
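Turning the repeating Assignment 1 / Assignment 2 columns into one row per assignment can be sketched as follows. The function name unpivot is made up, and handing out letter IDs in order of first appearance is an assumption that happens to reproduce the A, B, C labels used in the table.

```python
from string import ascii_uppercase

# Wide rows: (student_id, first_name, last_name, assignment_1, assignment_2).
wide = [
    (1, "Jeff", "Smith", "Article Summary", "Poetry Analysis"),
    (2, "Nancy", "Jones", "Article Summary", "Reaction Paper"),
    (3, "Jane", "Scott", "Article Summary", "Poetry Analysis"),
]


def unpivot(rows):
    """Return one row per (student, assignment) with a letter ID per description."""
    ids = {}
    long_rows = []
    for student_id, first, last, *assignments in rows:
        for description in assignments:
            if description not in ids:
                # Next unused letter: A, then B, then C, ...
                ids[description] = ascii_uppercase[len(ids)]
            long_rows.append(
                (student_id, first, last, ids[description], description)
            )
    return long_rows


for row in unpivot(wide):
    print(row)
```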

The table is not in 2NF since:

- Description does not depend on Student ID


tbl_Student

Student ID  First Name  Last Name
1           Jeff        Smith
2           Nancy       Jones
3           Jane        Scott

tbl_Assignment

Student ID  Assignment ID  Description
1           A              Article Summary
1           B              Poetry Analysis
2           A              Article Summary
2           C              Reaction Paper
3           A              Article Summary
3           B              Poetry Analysis

The table is not in 3NF since:

- Description still does not depend on Student ID
- Data repetition

tbl_Student

Student ID  First Name  Last Name
1           Jeff        Smith
2           Nancy       Jones
3           Jane        Scott

tbl_Assignment

Student ID  Assignment ID
1           A
2           A
3           A
1           B
3           B
2           C

tbl_Descript

Assignment ID  Description
A              Article Summary
B              Poetry Analysis
C              Reaction Paper
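To check that the three tables still hold everything the original table did, they can be loaded into sqlite3 and joined back together. The sketch assumes the letter IDs A, B, and C from the earlier tables for the assignments, and the column names without spaces are made up for the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Student (
        StudentID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT
    );
    CREATE TABLE tbl_Descript (
        AssignmentID TEXT PRIMARY KEY, Description TEXT
    );
    CREATE TABLE tbl_Assignment (
        StudentID    INTEGER REFERENCES tbl_Student(StudentID),
        AssignmentID TEXT REFERENCES tbl_Descript(AssignmentID),
        PRIMARY KEY (StudentID, AssignmentID)
    );
    INSERT INTO tbl_Student VALUES
        (1, 'Jeff', 'Smith'), (2, 'Nancy', 'Jones'), (3, 'Jane', 'Scott');
    INSERT INTO tbl_Descript VALUES
        ('A', 'Article Summary'), ('B', 'Poetry Analysis'), ('C', 'Reaction Paper');
    INSERT INTO tbl_Assignment VALUES
        (1, 'A'), (2, 'A'), (3, 'A'), (1, 'B'), (3, 'B'), (2, 'C');
    """
)

# Rebuild the original "who has which assignment" view from the three tables.
rows = conn.execute(
    """
    SELECT s.FirstName, s.LastName, d.Description
    FROM tbl_Assignment AS a
    JOIN tbl_Student AS s ON s.StudentID = a.StudentID
    JOIN tbl_Descript AS d ON d.AssignmentID = a.AssignmentID
    ORDER BY s.StudentID, a.AssignmentID
    """
).fetchall()

for row in rows:
    print(row)
```

Each description is now stored once in tbl_Descript, so renaming an assignment is a single-row change.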

Normalization for Non-IT Professionals

While the process of Normalization can be tricky for non-IT students and professionals, everyone should still be able to create logically-sound databases in the Third Normal Form.

Summary

Normalization is the systematic decomposition of tables in order to eliminate data redundancies and anomalies. This lecture covered three normal forms: 1NF, 2NF, and 3NF.

A Primary Key is a single column (or a combination of two or more columns) that uniquely identifies each row.

A Partial Dependency exists when a non-key attribute depends on only part of a composite primary key.

A Transitive Dependency exists when a non-key attribute depends on another non-key attribute.

Exercise 1

Normalize the following “Pet_Health” table to 3NF:

Pet_ID  Pet_Name      Pet_Type  Pet_Age  Owner
771     Rover         Dog       12       Sam Villa
204     Spot          Dog       2        Anna Dy
348     Mrs Whiskers  Cat       4        Sam Villa

Exercise 2

Item_ID  Item_Name  Item_Desc                     Supplier_Name           Address              PO_num  PO_date
A101     BckBP      One box of black ballpens     De Oro Office Supplies  Cagayan de Oro City  20986   12-11-2014
A102     BluBP      One box of blue ballpens      De Oro Office Supplies  Cagayan de Oro City  20986   12-11-2014
P100     SBP        One ream of short bond paper  King Papers             Cagayan de Oro City  1217    02-10-2011
P100     SBP        One ream of short bond paper  Office Depot            Iligan City          21044   01-05-2015

End

References: www.lib.ku.edu/instruction
