20
by Abdul Rashid Ahmad

By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Embed Size (px)

Citation preview

Page 1: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

by

Abdul Rashid Ahmad

Page 2: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

E.F. Codd proposed three normal forms:The first, second, and third normal forms

1NF, 2NF and 3NF are based on the functional dependencies among the attributes of a relation

Boyce-Codd normal form was proposed by Boyce and Codd and is known as a stronger definition of 3NF

Page 3: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

In the normalization process, we start with a single universal relation schema (URS) which is a table with collection of attributes present in the set of functional dependencies

Then apply the normal form restrictions to develop the complete relational schema, (decompose the tables as they specify).

Page 4: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

based on set of functional dependencies (FD) and Prime Key(PK) you analyze the relation schemas to achieve two desirable properties:

1) Minimizing data redundancy 2) Minimizing insertion, deletion and update anomalies

Therefore the ideal goal of DB design will be developing a fully normalized relational schema, preferably in BCNF.

Page 5: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Decomposed the relation when a NF test fails starting from 1NF.

Disclaimers to use of normal forms:

1) Do not guarantee good database design.

2) Sometimes highest normal form may not be the best design for performance reasons.

Page 6: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Normalizing the relation into 1NF

Consider this schema for departments:

The FD of EMP_DEPT is dnumber -> {dname, dmgrssn}

Problems with this?department location is multivalued.Possible solution are:Make composite primary key {dnumber,

dlocation}

dlocations dmgrssn dnumber dname

Page 7: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Problem solved ?Make attributes for the locations if you

know the possible number of locations.Better solution?Make a new table with dname and

dlocationThis is the preferred solution

Page 8: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Based on the concept of a full functional dependency.

A FD X -> Y is a full functional dependency if removal of any attribute from X means that the dependency does not hold any more.

A partial dependency can occur only if the determinate is composite.

Partial dependency, if there is some attribute A in X that can be removed from X and the dependency still holds.

Page 9: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

A relation is in 2NF if: every non-key attribute is fully functionally dependent on the prime attribute.

If a relation is not in 2NF, it can be normalized into a number of 2NF relations

This involves decomposing the table

Page 10: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Normalize the EMP_PROJ tableSteps:

1) Figure out the dependencies, noting the determinates that is any attribute(s) on which some other attribute(s) are dependent)2) Put each determinate in a table by itself.3) Include in each table the attributes that are dependent on that determinate

SSN PNUMBER hours ename pname plocationEMP_PROJ

Page 11: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Analyze the EMP_DEPT relation:1) In 1NF ? 2) In 2NF ?3) Still a problem ?4)Identify the problem.

The problem with the EMP_DEPT table DNAME depends on DNUM, not the prime key.

(transitive dependency) X-> Y is a transitive dependency if there is a set of

attributes Z of the relation and both X -> Z and Z ->Y hold.

A relation is in 3NF if every non-key attribute is: In 2NF). Non transitively dependent on the prime key.

ename SSN bdate address dnumber

dname

dmgrssn

Page 12: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

We can normalize EMP_DEPT by decomposing it into two 3NF relations.

D {R1, R2} where:R1 :EMP (ename, SSN, bdate, address)R2: DEPT(dnumber, dname, dmgrssn)

Intuitively, the two results represent independent entity facts.

To recover the original relation use natural join

ename SSN bdate address dnumber

dname

dmgrssn

Page 13: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

To assure we do not have update anomalies we need to extend this definition to include all candidate key rather than just PK.

Disallowed partial dependencies on any key in 2NF.

Disallowed transitive dependencies on any key in 3NF

Simpler definition for 3NF: Every nonprime attribute must be:

Fully functionally dependent on every keyNon transitively dependent on every key

 

Page 14: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Simpler, yet stricter form of 3NF.A relation is in BCNF if and only if every

determinate is a candidate key.Decomposed into an equivalent set of

BCNF if relation is not in BCNF.If relations have a composite candidate

key, one of whose members are determined by a non prime attribute are in 3NF but not in BCNF.

Page 15: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Few cases where a database is in 3NF and not in BCNF.

Example:Define another FD on the Lots DBAll the lots in the DB come from only two

countiesIn one county, all lots are >= one acreIn the other county, all lots are < one acreThe FD is area -> county_name

property_ID county_name Lot# area price tax_rate

Page 16: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

In general, it is best to have a relational schema in BCNF.

If that is not possible, 3NF will do.2NF and 1NF are not considered good

relation schema designs.They allow too much data redundancy

which leads to update anomalies. 

Page 17: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

Jovial Remark, “Normalize until it hurts, and denormalize until it works”.

Easier to query but reintroduce data redundancies.

Always improves data retrieval performance.Example: R: EMPLOYEE

Employee of a larger corporation have access to a handful of mutual fund companies for their retirement investment

Emp#

Name

Age Salary M_fund# Fund_nm Fund_mgr

Page 18: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

The FD M_fund# -> {Fund_nm, Fund_mgr} causing a violation of 3NF in R.

Decompose R to eliminate the 3ND violationD {R1, R2} where:

R1: FUND (M_fund#, Fund_nm, Fund_mgr);R2: EMPLOYEE (Emp#, Name, Age, Salary, M_fund#)

Query to the normalized relation which is EMPLOYEE requiring retirement financial data lead to joining R1 and R2.

The (denormalized) R is a better option in such situation

Page 19: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

However, if there are thousands of employees and just a handful of these retirement mutual funds.

Query requiring only mutual fund data will execute rather inefficiently in denormalized design.

In contrary, denormalization might not improve data retrieval performance in certain situation.

Page 20: By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies

http://www.cs.sjsu.edu/faculty/lee/cs157/cs157a.html

http://www.cs.montana.edu/~defrance/classes/spring-2003/cs435/lectures/FD.html