INFO 340 Lecture 7 Functional Dependency, Normalization


INFO 340

Lecture 7: Functional Dependency, Normalization


DeMorgan’s Theorem

• NOT (A AND B) = (NOT A) OR (NOT B)

• NOT (A OR B) = (NOT A) AND (NOT B)
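
With the negations written out, the two identities are NOT (A AND B) = (NOT A) OR (NOT B) and NOT (A OR B) = (NOT A) AND (NOT B). A minimal sketch that checks both over every combination of truth values (the function name `de_morgan_holds` is only illustrative):

```python
from itertools import product


def de_morgan_holds() -> bool:
    """Check both De Morgan identities for all four combinations of A and B."""
    for a, b in product([False, True], repeat=2):
        # NOT (A AND B) must equal (NOT A) OR (NOT B)
        if (not (a and b)) != ((not a) or (not b)):
            return False
        # NOT (A OR B) must equal (NOT A) AND (NOT B)
        if (not (a or b)) != ((not a) and (not b)):
            return False
    return True


print(de_morgan_holds())  # True
```

The same rewrite applies to a WHERE clause: negating a compound condition flips AND to OR (or OR to AND) and negates each part.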


“Spreadsheet Syndrome”

• When you use a spreadsheet program, you only really have one table.

• This leads to duplication of data.


Normalization

• Goal: Every non-key column is directly dependent on the key, the whole key, and nothing but the key

• Goal: Reduce redundancy, produce fewer anomalies, and improve efficiency.


Data Redundancy & Update Anomalies

Example relation: Staff# | sName | position | salary | branch# | bAddress

• Insertion Anomaly
– Adding a new staff member means bAddress must be entered again as well, creating an opportunity for error
– Adding a new branch with no staff means we have to enter nulls for the staff columns

• Deletion Anomaly
– Deleting the last staff member of a branch also deletes the details of the branch

• Modification Anomaly
– Updating the details of a particular branch must be done for all rows of that branch, creating an opportunity for error
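
The modification and deletion anomalies can be reproduced in a small in-memory SQLite table. This is a sketch under assumptions: the table name `StaffBranch`, the column spellings `staffNo` and `branchNo`, and all row values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE StaffBranch (
        staffNo TEXT PRIMARY KEY,
        sName TEXT, position TEXT, salary INTEGER,
        branchNo TEXT, bAddress TEXT
    )
""")
# Hypothetical rows: the address of branch B1 is stored twice.
rows = [
    ("S1", "Ann", "Manager", 30000, "B1", "22 Deer Rd"),
    ("S2", "Bob", "Assistant", 12000, "B1", "22 Deer Rd"),
    ("S3", "Cal", "Manager", 24000, "B2", "16 Argyll St"),
]
conn.executemany("INSERT INTO StaffBranch VALUES (?, ?, ?, ?, ?, ?)", rows)

# Modification anomaly: updating only one B1 row leaves two "truths".
conn.execute("UPDATE StaffBranch SET bAddress = '99 New Rd' WHERE staffNo = 'S1'")
b1_addresses = {r[0] for r in conn.execute(
    "SELECT bAddress FROM StaffBranch WHERE branchNo = 'B1'")}

# Deletion anomaly: removing the last staff member of B2 loses B2's address.
conn.execute("DELETE FROM StaffBranch WHERE staffNo = 'S3'")
b2_rows = conn.execute(
    "SELECT COUNT(*) FROM StaffBranch WHERE branchNo = 'B2'").fetchone()[0]

print(sorted(b1_addresses))  # two different addresses for one branch
print(b2_rows)               # 0: nothing about branch B2 remains
```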


Functional Dependency & Normalization

• How to identify the most commonly used normal forms, namely First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).


What happens if normalization hasn’t occurred?

• Data duplication

• Multiple truths

• Difficult to query


Full functional dependency

• A full functional dependency A → B is one where you cannot remove any attribute from the first set (the A in A → B) and still maintain the functional dependency.
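
A rough way to test this on sample data is sketched below, using the Student/Class/Location rows from the 2NF example in this lecture. The helper names `holds` and `is_full` are illustrative. Note the limit of the approach: a table instance can only show that a dependency is not violated by the current rows, not that it holds for all time.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def is_full(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no proper, non-empty subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True


rows = [
    {"Student": "John", "Class": "CSE 143", "Location": "EEB 103"},
    {"Student": "John", "Class": "EE 131", "Location": "EEB 103"},
    {"Student": "Susie", "Class": "INFO 340", "Location": "MGH 238"},
    {"Student": "Susie", "Class": "MATH 124", "Location": "PAR 104"},
    {"Student": "Susie", "Class": "EE 131", "Location": "EEB 103"},
]

print(is_full(rows, ["Student", "Class"], ["Location"]))  # False: Class alone is enough
print(is_full(rows, ["Class"], ["Location"]))             # True
```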


Transitive Dependency

• Transitive dependency describes a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
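
The definition can be turned into a small lookup over a list of known dependencies. The sketch below assumes single-attribute dependencies written as (determinant, dependent) pairs and borrows the PublisherID/ZIP/City/State attributes from the 3NF example in this lecture. The function name `transitive_via` is illustrative, and it only inspects the dependencies listed explicitly, not ones that could be inferred from them.

```python
# Functional dependencies as (determinant, dependent) pairs.
fds = {("PublisherID", "ZIP"), ("ZIP", "City"), ("ZIP", "State")}


def transitive_via(fds, a, c):
    """Return every B with a -> B and B -> c, where a does not depend on B or c."""
    return sorted(
        b for (x, b) in fds
        if x == a
        and b != c
        and (b, c) in fds
        and (b, a) not in fds   # proviso: a is not functionally dependent on B
        and (c, a) not in fds   # proviso: a is not functionally dependent on c
    )


print(transitive_via(fds, "PublisherID", "City"))  # ['ZIP']
print(transitive_via(fds, "ZIP", "City"))          # []
```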


Functional Dependency & Normalization

• Main characteristics of functional dependencies used in normalization:
– There is a one-to-one relationship between the attribute(s) on the left-hand side (determinant) and those on the right-hand side of a functional dependency.
– Holds for all time.
– The determinant has the minimal number of attributes necessary to maintain the dependency with the attribute(s) on the right-hand side.


Normalization

• Formal technique for analyzing a relation based on its primary key and the functional dependencies between the attributes of that relation.

• Formal method to cross-check your work – “sanity check”

• Often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.

• As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.


1st Normal Form

• A relation in which the intersection of each row and column contains one and only one value.

• Atomicity. Based upon your requirements, a column holds only one value.
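
As a sketch of what atomicity means in practice, the hypothetical rows below pack several classes into one column, which violates 1NF. Splitting the packed value gives one value per row-and-column intersection. The variable names and sample data are assumptions for illustration.

```python
# Hypothetical unnormalized rows: the second column holds a comma-separated list.
unnormalized = [
    ("John", "CSE 143, EE 131"),
    ("Susie", "INFO 340, MATH 124, EE 131"),
]

# One (student, class) row per class, so every cell holds exactly one value.
first_nf = [
    (student, cls.strip())
    for student, classes in unnormalized
    for cls in classes.split(",")
]

for row in first_nf:
    print(row)
```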


2nd Normal Form

• Based on the concept of full functional dependency.

• A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.


2NF examples

Student | Class    | Location
John    | CSE 143  | EEB 103
John    | EE 131   | EEB 103
Susie   | INFO 340 | MGH 238
Susie   | MATH 124 | PAR 104
Susie   | EE 131   | EEB 103

• While this relation is in 1NF, it is not in 2NF. Candidate key: {Student, Class}.

• Location is not fully functionally dependent on the key, since it depends only on Class.

Student | Class
John    | CSE 143
John    | EE 131
Susie   | INFO 340
Susie   | MATH 124
Susie   | EE 131

Class    | Location
CSE 143  | EEB 103
EE 131   | EEB 103
INFO 340 | MGH 238
MATH 124 | PAR 104
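
The decomposition can be checked with an in-memory SQLite database: joining the two smaller tables on Class should give back exactly the original rows, while each location is stored only once. The table names `Enrollment` and `ClassLocation` are assumptions, since the slides do not name the relations.

```python
import sqlite3

original = [
    ("John", "CSE 143", "EEB 103"),
    ("John", "EE 131", "EEB 103"),
    ("Susie", "INFO 340", "MGH 238"),
    ("Susie", "MATH 124", "PAR 104"),
    ("Susie", "EE 131", "EEB 103"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Enrollment (Student TEXT, Class TEXT, PRIMARY KEY (Student, Class))")
conn.execute("CREATE TABLE ClassLocation (Class TEXT PRIMARY KEY, Location TEXT)")

conn.executemany(
    "INSERT INTO Enrollment VALUES (?, ?)", [(s, c) for s, c, _ in original])
# OR IGNORE skips the repeated EE 131 row, so each class location is stored once.
conn.executemany(
    "INSERT OR IGNORE INTO ClassLocation VALUES (?, ?)", [(c, l) for _, c, l in original])

rejoined = conn.execute("""
    SELECT e.Student, e.Class, cl.Location
    FROM Enrollment e
    JOIN ClassLocation cl ON cl.Class = e.Class
""").fetchall()
location_rows = conn.execute("SELECT COUNT(*) FROM ClassLocation").fetchone()[0]

print(sorted(rejoined) == sorted(original))  # True: no information was lost
print(location_rows)                         # 4
```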


3rd Normal Form

• Based on the concept of transitive dependency.

• A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.


3NF example

PublisherID | Name   | Address                        | City     | State | ZIP
1           | Apress | 2560 Ninth Street, Station 219 | Berkeley | CA    | 94710

• Looks good, but notice that City and State are really dependent on ZIP, not PublisherID.

• A good way to find transitive functional dependencies is to ask yourself: “If I update this column, do I need to update others?”

• In this case, updating the City column would require you to update the ZIP and possibly the State column.

• This example, though, hints at one of the dangers of normalization: you can sometimes go too far.

PublisherID | Name   | Address                        | ZIP
1           | Apress | 2560 Ninth Street, Station 219 | 94710

ZIP   | City     | State
94710 | Berkeley | CA
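
A minimal SQLite sketch of the 3NF version follows. The table names `Publisher` and `ZipCode` are assumptions, and ZIP is stored as text so that leading zeros would survive. Reading a full address now takes a join, which is the practical cost behind the "go too far" warning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ZipCode (ZIP TEXT PRIMARY KEY, City TEXT, State TEXT)")
conn.execute("""
    CREATE TABLE Publisher (
        PublisherID INTEGER PRIMARY KEY,
        Name TEXT,
        Address TEXT,
        ZIP TEXT REFERENCES ZipCode(ZIP)
    )
""")
conn.execute("INSERT INTO ZipCode VALUES ('94710', 'Berkeley', 'CA')")
conn.execute(
    "INSERT INTO Publisher VALUES (1, 'Apress', '2560 Ninth Street, Station 219', '94710')")

# City and State are stored once per ZIP and recovered with a join.
row = conn.execute("""
    SELECT p.PublisherID, p.Name, p.Address, z.City, z.State, p.ZIP
    FROM Publisher p
    JOIN ZipCode z ON z.ZIP = p.ZIP
""").fetchone()

print(row)
```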


MidTerm Overview

• Limitations of file-based systems
• Difference between a DDL & a DML
• Advantages/disadvantages of DBMSs
• Differences between the External, Conceptual, & Internal levels of DBMSs
• Data independence
• Functions of a DBMS
• Relation, attribute, domain, cardinality, degree
• Attribute domains
• Cartesian product
• Properties of a relation
• Keys – super, candidate, primary, foreign
• Null
• Entity integrity, referential integrity
• Sets – union, intersection, difference
• Joins – inner, right outer, left outer
• SQL – selects, updates, inserts, aggregates, group by, order by
• Wild cards
• Nested query
• DeMorgan’s Theorem in an SQL query
• Relational algebra – difference between a selection & a projection
• Entity relationship diagrams
• Multiplicity
• Functional dependency
• Definitions of First, Second, & Third Normal Form
• Be able to identify if a relation is in 1NF, 2NF, or 3NF
• Difference between Integer types in MySQL


Homework

Complete Mini-Project work

Prepare for mid-term