Database Systems Lec7

Embed Size (px)

Citation preview

  • 8/14/2019 Database Systems Lec7

    1/42

    Chapter 7Normalization of Database

    Tables

  • 8/14/2019 Database Systems Lec7

    2/42

    Database Tables and

    Normalization Normalization is a process for assigning

    attributes to entities. It reduces dataredundancies and helps eliminate the dataanomalies.

    Normalization works through a series ofstages called normal forms:

    First normal form (1NF)

    Second normal form (2NF) Third normal form (3NF)

    Fourth normal form (4NF)

    The highest level of normalization is notalways desirable.

  • 8/14/2019 Database Systems Lec7

    3/42

    Database Tables and

    Normalization The Need for Normalization

    Case of a Construction Company

    Building project -- Project number, Name,Employees assigned to the project.

    Employee -- Employee number, Name, Jobclassification

    The company charges its clients by billing thehours spent on each project. The hourly billing

    rate is dependent on the employees position. Periodically, a report is generated.

    The table whose contents correspond to thereporting requirements is shown in Table 5.1.

  • 8/14/2019 Database Systems Lec7

    4/42

    Scenari

    oA few employees worksfor one project.

    Project Num :15Project Name :Evergreen

    Employee Num :101, 102, 103,105

  • 8/14/2019 Database Systems Lec7

    5/42

    Sample Form

    Project Num :15Project Name : Evergreen

    105

    103

    102

    101

    TotalHrs

    Billed

    Chr

    Hours

    Job

    Class

    Emp

    Name

    Emp

    Num

  • 8/14/2019 Database Systems Lec7

    6/42

  • 8/14/2019 Database Systems Lec7

    7/42

    Table Structure Matches

    the Report Format

  • 8/14/2019 Database Systems Lec7

    8/42

    Problems with the Figure 5.1

    The project number is intended to be a primarykey, but it contains nulls.

    The table displays data redundancies. The table entries invite data inconsistencies.

    The data redundancies yield the followinganomalies:

    Update anomalies.

    Addition anomalies.

    Deletion anomalies.

    Database Tables andNormalization

  • 8/14/2019 Database Systems Lec7

    9/42

    Conversion to First Normal Form

    A relational table must not containrepeating groups.

    Repeating groups can be eliminated byadding the appropriate entry in at leastthe primary key column(s).

    Database Tables andNormalization

  • 8/14/2019 Database Systems Lec7

    10/42

    Data Organization: First Normal Form

    Before

    After

  • 8/14/2019 Database Systems Lec7

    11/42

    1NF Definition

    The term first normal form (1NF)describes the tabular format in which:

    All the key attributes are defined.

    There are no repeating groups in thetable.

    All attributes are dependent on theprimary key.

    First Normal Form (1 NF)

  • 8/14/2019 Database Systems Lec7

    12/42

    Dependency Diagram

    The primary key components are bold, underlined,and shaded in a different color.

    The arrows above entities indicate all desirabledependencies, i.e., dependencies that are based on

    PK. The arrows below the dependency diagram indicate

    less desirable dependencies -- partial dependenciesand transitive dependencies.

    Dependency Diagram

  • 8/14/2019 Database Systems Lec7

    13/42

    Conversion to Second Normal Form

    Starting with the 1NF format, the database can beconverted into the 2NF format by

    Writing each key component on a separate line,

    and then writing the original key on the lastline and

    Writing the dependent attributes after eachnew key.

    PROJECT (PROJ_NUM, PROJ_NAME)EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS,

    CHG_HOUR)

    ASSIGN (PROJ_NUM, EMP_NUM, HOURS)

    Second Normal Form (2NF)

  • 8/14/2019 Database Systems Lec7

    14/42

    Dependency Diagram

  • 8/14/2019 Database Systems Lec7

    15/42

    A table is in 2NF if:

    It is in 1NF and

    It includes no partial dependencies;that is, no attribute is dependent ononly a portion of the primary key.

    (It is still possible for a table in 2NF

    to exhibit transitive dependency; thatis, one or more attributes may befunctionally dependent on nonkeyattributes.)

    Second Normal Form (2NF)

  • 8/14/2019 Database Systems Lec7

    16/42

    Conversion to Third Normal Form

    Create a separate table with attributes ina transitive functional dependence

    relationship.

    PROJECT (PROJ_NUM, PROJ_NAME)

    ASSIGN (PROJ_NUM, EMP_NUM, HOURS)

    EMPLOYEE (EMP_NUM, EMP_NAME,JOB_CLASS)

    JOB (JOB_CLASS, CHG_HOUR)

    Third Normal Form (3 NF)

  • 8/14/2019 Database Systems Lec7

    17/42

    3NF Definition

    A table is in 3NF if:

    It is in 2NF and It contains no transitive

    dependencies.

    Third Normal Form (3 NF)

  • 8/14/2019 Database Systems Lec7

    18/42

    The

    CompletedDatabase

  • 8/14/2019 Database Systems Lec7

    19/42

  • 8/14/2019 Database Systems Lec7

    20/42

    A Table That Is In 3NF

    But Not In BCNF

  • 8/14/2019 Database Systems Lec7

    21/42

  • 8/14/2019 Database Systems Lec7

    22/42

    Sample Data for a BCNF Conversion

  • 8/14/2019 Database Systems Lec7

    23/42

    Decomposition into BCNF

  • 8/14/2019 Database Systems Lec7

    24/42

    BCNF Definition

    A table is in BCNF if every determinantin that table is a candidate key. If atable contains only one candidate key,3NF and BCNF are equivalent.

    BCNF Definition

  • 8/14/2019 Database Systems Lec7

    25/42

    Normalization will help us identify

    correct and appropriate TABLES. Until Now we have 4 tables

    Normalization

    PROJECT (PROJ_NUM, PROJ_NAME)

    ASSIGN (PROJ_NUM, EMP_NUM, HOURS)

    EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)

    JOB (JOB_CLASS, CHG_HOUR)

  • 8/14/2019 Database Systems Lec7

    26/42

  • 8/14/2019 Database Systems Lec7

    27/42

    Business Rules The company manages many projects.

    Each project requires the services of manyemployees.

    An employee may be assigned to severaldifferent projects.

    Some employees are not assigned to aproject and perform duties not specificallyrelated to a project. Some employees arepart of a labor pool, to be shared by allproject teams.

    Each employee has a (single) primary jobclassification. This job classificationdetermines the hourly billing rate.

    Many employees can have the same jobclassification.

  • 8/14/2019 Database Systems Lec7

    28/42

    Two Initial Entities:

    PROJECT (PROJ_NUM, PROJ_NAME)

    EMPLOYEE (EMP_NUM, EMP_LNAME,

    EMP_FNAME, EMP_INITIAL,JOB_DESCRIPTION, JOB_CHG_HOUR)

    Normalization and DatabaseDesign

  • 8/14/2019 Database Systems Lec7

    29/42

    Three Entities After Transitive Dependency

    Removed

    PROJECT (PROJ_NUM, PROJ_NAME)

    EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME,EMP_INITIAL, JOB_CODE)

    JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)

    Normalization and DatabaseDesign

  • 8/14/2019 Database Systems Lec7

    30/42

    The Modified ERD

  • 8/14/2019 Database Systems Lec7

    31/42

    Creation of Composite Entity ASSIGN

  • 8/14/2019 Database Systems Lec7

    32/42

    Attribute ASSIGN_HOUR is assigned to the compositeentity ASSIGN.

    Manages relationship is created betweenEMPLOYEE and PROJECT.

    PROJECT (PROJ_NUM, PROJ_NAME, EMP_NUM)

    EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME,EMP_INITIAL,

    EMP_HIREDATE, JOB_CODE)

    JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)ASSIGN (ASSIGN_NUM, ASSIGN_DATE, PROJ_NUM,

    EMP_NUM,

    ASSIGN_HOURS)

    Normalization and DatabaseDesign

  • 8/14/2019 Database Systems Lec7

    33/42

  • 8/14/2019 Database Systems Lec7

    34/42

    Higher-Level Normal Forms

    4NF Definition

    A table is in 4NF if it is in 3NF and has nomultiple sets of multivalueddependencies.

  • 8/14/2019 Database Systems Lec7

    35/42

    A Set of Tables in 4NF

  • 8/14/2019 Database Systems Lec7

    36/42

    Denormalization Normalization is only one of many

    database design goals.

    Normalized (decomposed) tablesrequire additional processing, reducing

    system speed. Normalization purity is often difficult to

    sustain in the modern databaseenvironment. The conflict between

    design efficiency, informationrequirements, and processing speedare often resolved throughcompromises that includedenormalization.

  • 8/14/2019 Database Systems Lec7

    37/42

    SUMMAR

    Y

  • 8/14/2019 Database Systems Lec7

    38/42

    Summary.

    Normalization is a technique used to

    design tables in which data redundancies

    are minimized

    The first three normal forms (1NF, 2NF

    and 3NF) are most commonly

    encountered

    Normalization is an important part-but only

    a part-of the design process

  • 8/14/2019 Database Systems Lec7

    39/42

    The Initial 1NF Structure

  • 8/14/2019 Database Systems Lec7

    40/42

  • 8/14/2019 Database Systems Lec7

    41/42

    Table Structures Based On

    The Selected PKs

  • 8/14/2019 Database Systems Lec7

    42/42

    References

    ROB, P. AND CORONEL, C., 2004, Database

    Systems. 6th Ed., Thomson Course Technology