21
Copyright © 2005 Ed Lance Copyright © 2005 Ed Lance Fundamentals of Fundamentals of Relational Database Relational Database Design Design By By Ed Lance Ed Lance

Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Embed Size (px)

Citation preview

Page 1: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Fundamentals of Relational Fundamentals of Relational Database DesignDatabase Design

ByBy

Ed LanceEd Lance

Page 2: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

About MeAbout Me

Independent Database ConsultantIndependent Database Consultant

Worked with Access since 1.0Worked with Access since 1.0

Many years designing, developing, tuning, Many years designing, developing, tuning, and developing with databases.and developing with databases.

Code with .Net platformCode with .Net platform

http://SanDiegoDataPro.comhttp://SanDiegoDataPro.com

Page 3: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

What’s this about relationships?What’s this about relationships?

Relational Database is a database system that Relational Database is a database system that organizes data according to (more or less) the organizes data according to (more or less) the rules of relational set theory. (Codd)rules of relational set theory. (Codd)

In a relational database, we break up data into In a relational database, we break up data into multiple tables.multiple tables.

These tables have ‘pointers’ (keys) in the data These tables have ‘pointers’ (keys) in the data that relate rows in one table to rows in another.that relate rows in one table to rows in another.

This allows us to have multiple dimensions in This allows us to have multiple dimensions in our data without going insane.our data without going insane.

Page 4: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Why do I care?Why do I care?

Relational Database Design is important Relational Database Design is important because…because…– It allows us to organize our data in a way that doesn’t It allows us to organize our data in a way that doesn’t

limit the user.limit the user.– It allows us to model very complex data with relative It allows us to model very complex data with relative

ease.ease.– It eliminates duplicate data that is hard to maintain.It eliminates duplicate data that is hard to maintain.– It provides us a way to search and retrieve data with It provides us a way to search and retrieve data with

an easy language and gets us out of having to write an easy language and gets us out of having to write all our own data handling code.all our own data handling code.

Page 5: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

The Penalties for Not Doing ItThe Penalties for Not Doing It

Poor database performancePoor database performanceCan’t write queries to get data the way you want Can’t write queries to get data the way you want it.it.Lots of extra work (code) to get reports they way Lots of extra work (code) to get reports they way you want.you want.Always doing kludges to get around the design.Always doing kludges to get around the design.When the customer wants to add one more of When the customer wants to add one more of something, it’s a design change (“I need six something, it’s a design change (“I need six phone numbers instead of five.”)phone numbers instead of five.”)Updating data is a nightmareUpdating data is a nightmare

Page 6: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Basic ConceptsBasic Concepts

KeysKeys

IndexesIndexes

RelationshipsRelationships

Normal FormsNormal Forms

Page 7: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Primary KeysPrimary Keys

We need to be able to uniquely identify We need to be able to uniquely identify rows in a table.rows in a table.Natural key – Use one or more columns Natural key – Use one or more columns together that aren’t repeated.together that aren’t repeated.Identity – or Autonumber, a unique Identity – or Autonumber, a unique number for each row, managed by the number for each row, managed by the system.system.PK’s help us connect data in other tables PK’s help us connect data in other tables to rows in the current table.to rows in the current table.

Page 8: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

IndexesIndexes

An index is a data structure used by a An index is a data structure used by a database engine to make searches faster.database engine to make searches faster.Data is Data is generallygenerally stored in the order it was stored in the order it was input. We want to be able to filter and input. We want to be able to filter and order by any column (or set of columns).order by any column (or set of columns).Indexes are defined by you, but Indexes are defined by you, but maintained by the engine.maintained by the engine.You can index one column or a set of You can index one column or a set of columns.columns.

Page 9: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

IndexesIndexes

Deciding which columns to create indexes Deciding which columns to create indexes for is a matter of judgment and skill. for is a matter of judgment and skill.

An index can have a huge impact (usually An index can have a huge impact (usually good) on performance of queries.good) on performance of queries.

Too many indexes can bog down Too many indexes can bog down database when modifying data.database when modifying data.

Indexes do take space on disk.Indexes do take space on disk.

Page 10: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

RelationshipsRelationships

We use columns in tables to contain We use columns in tables to contain pointers to rows in other tables.pointers to rows in other tables.By using primary key columns, we can By using primary key columns, we can uniquely identify rows in a table.uniquely identify rows in a table.Then by using the value of the primary key Then by using the value of the primary key in other tables, we can tie the tables in other tables, we can tie the tables together.together.Columns that contain key values from Columns that contain key values from another table are called foreign keys.another table are called foreign keys.

Page 11: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

RelationshipsRelationships

In order for this to work, we must be In order for this to work, we must be careful that we don’t end up with foreign careful that we don’t end up with foreign keys that point to rows that don’t exist.keys that point to rows that don’t exist.

This is called Referential IntegrityThis is called Referential Integrity

Most database engines (inc. Access, SQL Most database engines (inc. Access, SQL Server) can enforce this for you.Server) can enforce this for you.

Page 12: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Relationship TypesRelationship Types

1 – 11 – 1– If two tables really represent the same set of items. If two tables really represent the same set of items.

In other words they could be merged into one table.In other words they could be merged into one table.

1 – Many1 – Many– Items in one table are related to one or more items in Items in one table are related to one or more items in

another table. (One library has many books.)another table. (One library has many books.)

Many – ManyMany – Many– Items in one table are related to one or more items in Items in one table are related to one or more items in

another table, but the relationship exists the other way another table, but the relationship exists the other way as well. (One doctor has many patients, but each as well. (One doctor has many patients, but each patient may have more than one doctor)patient may have more than one doctor)

Page 13: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

RelationshipsRelationships

Finding these relationships in your Finding these relationships in your customer’s data and designing tables to customer’s data and designing tables to correctly model them is the art of relational correctly model them is the art of relational database design.database design.

Look for repetition in existing data.Look for repetition in existing data.

Look for ‘add-on’ fields that are used to Look for ‘add-on’ fields that are used to add similar items (PhoneNumber1, add similar items (PhoneNumber1, PhoneNumber2)PhoneNumber2)

Page 14: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Tips for DesigningTips for Designing

Think of tables as entities (nouns).Think of tables as entities (nouns).– Books, products, people, events, ordersBooks, products, people, events, orders

Think of relationships as verbs.Think of relationships as verbs.– Books have authors.Books have authors.– Products have suppliersProducts have suppliers– Orders have order details (the set of items being Orders have order details (the set of items being

ordered)ordered)

Test phrases for relationship typesTest phrases for relationship types– Suppliers have many products, but each product has Suppliers have many products, but each product has

only one supplier. (one-to-many)only one supplier. (one-to-many)

Page 15: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

What is Normal?What is Normal?

The process of organizing relational data The process of organizing relational data can be called Normalization.can be called Normalization.

Codd (and follow on people) defined Codd (and follow on people) defined criteria for increasing levels of criteria for increasing levels of normalization in data. normalization in data.

In the practical world we care about the In the practical world we care about the first three.first three.

Page 16: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Normal FormsNormal Forms

1NF – First Normal Form1NF – First Normal Form– No duplicate data in columns. This means we have to No duplicate data in columns. This means we have to

break up our data into relational tables, and establish break up our data into relational tables, and establish key values.key values.

NameName AddressAddress Phone1Phone1 Phone2Phone2

TitleTitle AuthorAuthor ISBNISBN ....

Codd, LanceCodd, Lance

Page 17: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Normal FormsNormal Forms

NameName AddressAddress Phone1Phone1 Phone2Phone2

EdEd 12341234 555-1212555-1212 444-1234444-1234

NamesIDNamesID NameName AddressAddress

11 EdEd 12341234

NamesIDNamesID PhoneNumberPhoneNumber

11 555-1212555-1212

11 444-1234444-1234

Page 18: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Normal FormsNormal Forms

TitleTitle AuthorAuthor ISBNISBN ....

RDBMSRDBMS Codd, LanceCodd, Lance

PubsIDPubsID NameName

11 CoddCodd

11 LanceLance

PubIDPubID TitleTitle ISBNISBN

11 RDBMSRDBMS

Page 19: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Normal FormsNormal Forms

2NF – Second Normal Form2NF – Second Normal Form– Be in 1NF.Be in 1NF.– All columns must be dependent on the whole primary All columns must be dependent on the whole primary

key, not just one part of it. Not an issue if you use key, not just one part of it. Not an issue if you use Autonumber keys.Autonumber keys.

Part_No (PK)Part_No (PK) Supplier_Name (PK)Supplier_Name (PK) PricePrice Supplier Supplier AddressAddress

Page 20: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Normal FormsNormal Forms

3NF – Third Normal Form3NF – Third Normal Form– Be in 2NFBe in 2NF– Non-key columns should not have dependencies on Non-key columns should not have dependencies on

each other.each other.

Part_No (PK)Part_No (PK) Mfgr. NameMfgr. Name Mfgr. AddressMfgr. Address ……

Page 21: Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance

Copyright © 2005 Ed LanceCopyright © 2005 Ed Lance

Questions?Questions?

The EndThe End