Relational Database Systems Higher Information Systems

Relational Database Systems

Higher Information Systems

The Relational Model

data is grouped into entities which are related, in order to minimise data duplication and achieve data integrity

many-to-many relationships between entities are removed and replaced with one-to-many relationships

Entity-Occurrence Modelling

Lines indicate how the instances of each entity are linked

E.g. Member 1034 has rented DVDs 002 and 015

DVD 003 has been rented by members 1012 and 1056

Each DVD can be rented by many Members

Each Member can rent many DVDs

So there is a many-to-many relationship between Member and DVD
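As a rough sketch of how a many-to-many relationship is replaced with two one-to-many relationships, the Python below uses the standard sqlite3 module and a link table. The table name rental and all column names are assumptions made for illustration. The rental data comes from the example occurrences above.

```python
import sqlite3

# Hypothetical table and column names; data taken from the occurrences above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (member_number INTEGER PRIMARY KEY);
    CREATE TABLE dvd (dvd_code TEXT PRIMARY KEY NOT NULL);
    -- Link table: each row joins exactly one member to exactly one DVD.
    CREATE TABLE rental (
        member_number INTEGER NOT NULL REFERENCES member(member_number),
        dvd_code      TEXT    NOT NULL REFERENCES dvd(dvd_code),
        PRIMARY KEY (member_number, dvd_code)
    );
""")

rentals = [(1034, "002"), (1034, "015"), (1012, "003"), (1056, "003")]
conn.executemany("INSERT OR IGNORE INTO member VALUES (?)", [(m,) for m, _ in rentals])
conn.executemany("INSERT OR IGNORE INTO dvd VALUES (?)", [(d,) for _, d in rentals])
conn.executemany("INSERT INTO rental VALUES (?, ?)", rentals)


def dvds_rented_by(member_number):
    """One member, many rental rows."""
    rows = conn.execute(
        "SELECT dvd_code FROM rental WHERE member_number = ? ORDER BY dvd_code",
        (member_number,),
    )
    return [row[0] for row in rows]


def members_who_rented(dvd_code):
    """One DVD, many rental rows."""
    rows = conn.execute(
        "SELECT member_number FROM rental WHERE dvd_code = ? ORDER BY member_number",
        (dvd_code,),
    )
    return [row[0] for row in rows]
```

Each member and each DVD is stored once, and only the link table grows as rentals are made.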

This method is only as good as the available data

Make up “dummy” data if necessary to fill in the gaps

More about keys

An atomic key consists of one attribute

MEMBER(Member Number, Name, Telephone Number)

A compound key consists of two or more attributes

MEMBER(Member Number, Name, Telephone Number)

A surrogate key is a made-up attribute designed to identify a record

Member Number is a surrogate key

Choosing a key

An atomic key is better than a compound key

A numeric attribute is better than a text attribute

KISS = Keep It Short and Simple

A key must have a value: it cannot be blank (or “null”)

A key should not change over time

The flat file revisited…

| DVD Code | Title        | Cost  | Date Out | Date Due | Member Number | Name            | Telephone Number |
|----------|--------------|-------|----------|----------|---------------|-----------------|------------------|
| 002      | Finding Nemo | £2.50 | 03/09/04 | 04/09/04 | 1034          | John Silver     | 142536           |
| 003      | American Pie | £2.50 | 27/08/04 | 28/08/04 | 1056          | Fred Flintstone | 817263           |
| 003      | American Pie | £2.50 | 01/09/04 | 02/09/04 | 1012          | Isobel Ringer   | 293847           |
| 008      | The Pianist  | £2.50 | 04/09/04 | 06/09/04 | 1097          | Annette Kirton  | 384756           |

What is a suitable key?

DVD Code?

Member Number?

(DVD Code, Member Number)?
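One way to explore the question is to test each candidate against the sample rows. The sketch below is illustrative only: the helper is_unique is a made-up name, and only the columns needed for the check are copied from the flat file.

```python
# Selected columns from the four flat-file rows above.
flat_file = [
    {"DVD Code": "002", "Member Number": 1034},
    {"DVD Code": "003", "Member Number": 1056},
    {"DVD Code": "003", "Member Number": 1012},
    {"DVD Code": "008", "Member Number": 1097},
]


def is_unique(rows, attributes):
    """Return True if no two rows share the same values for `attributes`."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True


dvd_code_ok = is_unique(flat_file, ["DVD Code"])                   # False: 003 appears twice
member_ok = is_unique(flat_file, ["Member Number"])                # True, but only in this sample
compound_ok = is_unique(flat_file, ["DVD Code", "Member Number"])  # True
```

Member Number passes only because no member happens to appear twice in these four rows. It would fail as soon as one member rented a second DVD. Uniqueness in sample data is necessary for a key but does not prove that an attribute is a suitable key.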

Update Anomalies

There is no way of storing the details of a member who hasn’t rented any DVDs

A value must be provided for both DVD Code and Member Number for the key

This is called an “insertion anomaly”

If a member’s details have to be amended, this must be done in each record with those details

This can lead to data inconsistency if there is an error or omission in making the change

This is called a “modification anomaly”
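A small sketch of a modification anomaly follows. The second row (DVD 015 for member 1034) and the new telephone number are assumptions added for illustration, so that the member's details appear in more than one record.

```python
# Member 1034's details repeated in two flat-file records.
# The DVD 015 row is hypothetical; it is not in the table above.
flat_file = [
    {"DVD Code": "002", "Member Number": 1034, "Telephone Number": "142536"},
    {"DVD Code": "015", "Member Number": 1034, "Telephone Number": "142536"},
]


def phone_numbers_for(rows, member_number):
    """Collect every telephone number stored for one member."""
    return {r["Telephone Number"] for r in rows if r["Member Number"] == member_number}


# A careless update: only the first matching record is changed.
for row in flat_file:
    if row["Member Number"] == 1034:
        row["Telephone Number"] = "999999"  # made-up new number
        break

inconsistent = len(phone_numbers_for(flat_file, 1034)) > 1
```

After the partial update the file holds two different telephone numbers for the same member, which is the data inconsistency described above.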

If a DVD is removed from the database, then it may also remove the only record of a member’s details

This is called a “deletion anomaly”
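The deletion anomaly can be sketched in the same way, using selected columns from the flat file. The helper names remove_dvd and known_members are illustrative.

```python
flat_file = [
    {"DVD Code": "002", "Member Number": 1034, "Name": "John Silver"},
    {"DVD Code": "003", "Member Number": 1056, "Name": "Fred Flintstone"},
    {"DVD Code": "003", "Member Number": 1012, "Name": "Isobel Ringer"},
    {"DVD Code": "008", "Member Number": 1097, "Name": "Annette Kirton"},
]


def remove_dvd(rows, dvd_code):
    """Delete every record that mentions the given DVD."""
    return [r for r in rows if r["DVD Code"] != dvd_code]


def known_members(rows):
    return {r["Member Number"] for r in rows}


after = remove_dvd(flat_file, "008")
lost_members = known_members(flat_file) - known_members(after)
```

Removing DVD 008 also removes the only record of member 1097.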

Insertion anomalies

Modification anomalies

Deletion anomalies

These are characteristics of poorly designed databases

The solution is to use a relational database

We use normalisation to help work out what tables are required and which data items should be stored in each table

Normalisation

Un-normalised Form (UNF)

Identify an entity

List all the attributes

Identify a key

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number, Item Code, Description, Unit Cost, Quantity)

Identify repeating data items

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number, Item Code, Description, Unit Cost, Quantity)

Repeating items: Item Code, Description, Unit Cost, Quantity

First Normal Form (1NF)

Remove repeating data items to form a new entity

Take the key with you!

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number)

ORDER_ITEM (Order Number, Item Code, Description, Unit Cost, Quantity)

Identify a key for the new entity

It will be a compound key

Use the original key and add to it

Label the foreign key (marked with *)

Order Number is both part of the compound primary key and also a foreign key

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number)

ORDER_ITEM (*Order Number, Item Code, Description, Unit Cost, Quantity)

A data model is in 1NF if it has no multi-valued attributes
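The move from UNF to 1NF can be sketched in Python. The order below is invented sample data, apart from the “large deluxe red widget” description, and the function name to_1nf is illustrative.

```python
# Hypothetical un-normalised order: the item details repeat inside the order.
unf_order = {
    "Order Number": 1,
    "Order Date": "03/09/2004",
    "Customer Number": 42,
    "items": [
        {"Item Code": "A1", "Description": "large deluxe red widget",
         "Unit Cost": "5.00", "Quantity": 2},
        {"Item Code": "B2", "Description": "small blue widget",
         "Unit Cost": "1.50", "Quantity": 10},
    ],
}


def to_1nf(order):
    """Split the repeating items into their own rows, taking the key with them."""
    order_row = {k: v for k, v in order.items() if k != "items"}
    order_item_rows = [
        {"Order Number": order["Order Number"], **item} for item in order["items"]
    ]
    return order_row, order_item_rows


order_row, order_item_rows = to_1nf(unf_order)
```

Each ORDER_ITEM row now holds single values only, and (Order Number, Item Code) identifies it.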

But what if there were lots of orders for large deluxe red widgets…?

There are still update anomalies

Second Normal Form (2NF)

Examine any entity with a compound key (in this case ORDER_ITEM)

See if any attributes are dependent on just one part of the compound key

These are called partial dependencies

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number)

ORDER_ITEM (*Order Number, Item Code, Description, Unit Cost, Quantity)

Order Number is part of the key

Item Code is part of the key

Description is dependent on the Item Code

Unit Cost is dependent on the Item Code

Quantity is dependent on both Order Number and Item Code

Description and Unit Cost are partial dependencies

They are dependent on Item Code

Remove these attributes to a new entity

Take a copy of the attribute they are dependent on

Item Code becomes the key of the new entity

And becomes a foreign key in ORDER_ITEM

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number)

ORDER_ITEM (*Order Number, *Item Code, Quantity)

ITEM (Item Code, Description, Unit Cost)

A data model is in 2NF if it is in 1NF and there are no partial dependencies
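Partial dependencies can be looked for in sample data, as in the sketch below. The rows are invented, the helper determines is an illustrative name, and a check like this can only suggest a dependency. Whether it really holds depends on the meaning of the data, not on one sample.

```python
def determines(rows, determinant, dependent):
    """True if each combination of determinant values maps to one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True


# Hypothetical ORDER_ITEM rows in 1NF.
order_item = [
    {"Order Number": 1, "Item Code": "A1", "Description": "large deluxe red widget",
     "Unit Cost": "5.00", "Quantity": 2},
    {"Order Number": 1, "Item Code": "B2", "Description": "small blue widget",
     "Unit Cost": "1.50", "Quantity": 10},
    {"Order Number": 2, "Item Code": "A1", "Description": "large deluxe red widget",
     "Unit Cost": "5.00", "Quantity": 7},
]

compound_key = ["Order Number", "Item Code"]

# An attribute is partially dependent if one part of the key alone determines it.
partial = [
    attribute
    for attribute in ("Description", "Unit Cost", "Quantity")
    if any(determines(order_item, [part], attribute) for part in compound_key)
]
```

With these rows, partial contains Description and Unit Cost but not Quantity, which matches the analysis above.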

We can add an item to the Item table without it having to be on an order

We can delete an order in the Order table without deleting details of the items on the order

We can update item details once in the Item table without affecting the orders for that item in the Order-Item table

But there are still update anomalies with the ORDER entity

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number)

Third Normal Form (3NF)

Examine all the entities produced so far

See if there are any non-key attributes which are dependent on any other non-key attributes

These are called non-key dependencies

ORDER (Order Number, Order Date, Customer Number, Customer Name, Address, Post Code, Telephone Number)

ORDER_ITEM (*Order Number, *Item Code, Quantity)

ITEM (Item Code, Description, Unit Cost)

In the ORDER entity, Customer Name, Address, Post Code and Telephone Number are all dependent on Customer Number

Remove these attributes to a new entity

Customer Number is the key of the new entity

Leave Customer Number behind as a foreign key

ORDER (Order Number, Order Date, *Customer Number)

CUSTOMER (Customer Number, Customer Name, Address, Post Code, Telephone Number)

ORDER_ITEM (*Order Number, *Item Code, Quantity)

ITEM (Item Code, Description, Unit Cost)

A data model is in 3NF if it is in 2NF and there are no non-key dependencies
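The four 3NF entities could be created as tables along the following lines. This is a sketch using Python's sqlite3 module. The column names and types are assumptions, and the ORDER entity is stored as a table called orders because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE customer (
        customer_number  INTEGER PRIMARY KEY,
        customer_name    TEXT NOT NULL,
        address          TEXT,
        post_code        TEXT,
        telephone_number TEXT
    );
    CREATE TABLE orders (
        order_number    INTEGER PRIMARY KEY,
        order_date      TEXT NOT NULL,
        customer_number INTEGER NOT NULL REFERENCES customer(customer_number)
    );
    CREATE TABLE item (
        item_code   TEXT PRIMARY KEY NOT NULL,
        description TEXT NOT NULL,
        unit_cost   NUMERIC NOT NULL
    );
    CREATE TABLE order_item (
        order_number INTEGER NOT NULL REFERENCES orders(order_number),
        item_code    TEXT    NOT NULL REFERENCES item(item_code),
        quantity     INTEGER NOT NULL,
        PRIMARY KEY (order_number, item_code)
    );
""")

# A customer can exist without an order, and survives the deletion of an order.
conn.execute("INSERT INTO customer VALUES (1, 'A. Customer', NULL, NULL, NULL)")
conn.execute("INSERT INTO orders VALUES (100, '2004-09-03', 1)")
conn.execute("DELETE FROM orders WHERE order_number = 100")
customers_left = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

# An order for a customer that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO orders VALUES (101, '2004-09-04', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Each foreign key sits in the table that can hold many rows for one row of the table it refers to.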

We can add a customer to the Customer table without the customer having to place an order

We can delete an order in the Order table without deleting details of the customer who placed the order

We can update a customer’s details once in the Customer table without affecting the orders placed by that customer in the Order table

Memory Aid

In 3NF, each attribute is dependent on

the key

the whole key

and nothing but the key

Entity-Relationship Diagram

ORDER (Order Number, Order Date, *Customer Number)

CUSTOMER (Customer Number, Customer Name, Address, Post Code, Telephone Number)

ORDER_ITEM (*Order Number, *Item Code, Quantity)

ITEM (Item Code, Description, Unit Cost)

[Entity-relationship diagram: CUSTOMER, ORDER, ORDER_ITEM, ITEM]

The foreign key is always at the “many” end of the relationship

Source documents

List all the attributes which must be stored in the database

Identify a key

DVD_RENTAL (Member Number, Title, Forename, Surname, Telephone No, DVD Code, Title, Cost, Date Hired, Date Due, Member)

There are two attributes called Title

Member Number is the same as Member

Number or No?

Tidy up UNF

Carry on as before to 3NF

DVD_RENTAL (Member Number, Title, Forename, Surname, Telephone Number, DVD Code, DVD Title, Cost, Date Hired, Date Due)

Database Design

For each attribute you must decide

its name

its data type

its properties

Its name:

Choose sensible and meaningful field names

Be consistent! e.g. Number/Num/No/#

Its data type:

text (alphanumeric, string)

numeric (integer, real, currency)

date or time

Boolean (yes or no)

link

object (e.g. picture, sound, file)

Data Types

Text: “Smith”, “John Smith”

Alphanumeric: “IV99 9ZZ”, “01234 567890”, “10 Downing Street”, “10”

Free text: “The cat sat on the mat, etc…”

Numeric

Integer: 3, 1246, 0, -5

Real/floating point: 3.14, 1246.0, 0, -5.2

Currency: 3.14, 1246.00, 0.00, -5.20

Note that the currency symbol is not stored
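The difference between a real number and a currency value can be sketched with Python's decimal module. The function name format_gbp is illustrative, and the £2.50 cost comes from the flat file earlier.

```python
from decimal import Decimal

# Binary floating point cannot represent most decimal fractions exactly.
float_sum_is_exact = (0.1 + 0.2 == 0.3)

# Decimal keeps exact pence, which is what a currency type needs.
decimal_sum_is_exact = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))

cost = Decimal("2.50")  # the stored value has no currency symbol
total = cost * 3


def format_gbp(amount):
    """Add the currency symbol only when the value is displayed."""
    return f"£{amount:.2f}"
```

The symbol belongs to the display format, so the same stored value could be shown in another style without changing the data.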

Date

“Short” date: 1/1/2006

“Long” date: 29 February 2004

“Medium” date: 29 Feb 2004

dd/mm/yyyy indicates format

Watch out for US dates: mm/dd/yyyy
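The same characters can mean two different dates depending on the format. The sketch below uses Python's datetime module, and the sample date is chosen for illustration.

```python
from datetime import datetime

raw = "03/09/2004"

# Read as dd/mm/yyyy (UK): 3 September 2004.
uk_date = datetime.strptime(raw, "%d/%m/%Y").date()

# Read as mm/dd/yyyy (US): 9 March 2004.
us_date = datetime.strptime(raw, "%m/%d/%Y").date()

uk_iso = uk_date.isoformat()
us_iso = us_date.isoformat()
```

Storing the value as a date type and applying the format only for input and display avoids this ambiguity.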

Database Design

Names are usually stored as 3 or 4 fields

Title (Mr/Mrs/Miss/Ms)

Forename

Initials/Other Names

Surname

Addresses are usually stored as 3 or 4 fields

Address1 (Street Address)

Address2 (Town)

Address3 (District)

Post Code

Sometimes the house number is stored separately from the Street Name

Telephone Numbers are always text

Numbers are usually text if they are not used in calculations, e.g. House Number

Other “numbers” are also stored as text

ISBNs

Vehicle Registration “numbers”

Use integers for whole numbers
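A short sketch of why telephone numbers are stored as text, using the example number from the Data Types slide:

```python
phone_as_text = "01234 567890"

# Converting to an integer discards the space and the leading zero.
phone_as_int = int(phone_as_text.replace(" ", ""))
round_trip = str(phone_as_int)

leading_zero_lost = not round_trip.startswith("0")
```

The integer form can no longer be dialled as written, and nothing useful is gained because telephone numbers are never used in calculations.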

Its properties:

Primary key/foreign key (PK/FK)

Validation (presence, range, restricted choice)

Default value

Format

Store this information in a Data Dictionary

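Data Dictionary

| Entity | Attribute        | Key | Data Type | Required | Unique | Format         | Validation                     |
|--------|------------------|-----|-----------|----------|--------|----------------|--------------------------------|
| DVD    | DVD Code         | PK  | Integer   | Y        | Y      |                |                                |
| DVD    | Film Code        | FK  | Integer   | Y        | N      |                | Lookup value from FILM table   |
| DVD    | Cost             |     | Currency  | Y        | N      |                | >=1 and <=3                    |
| MEMBER | Member Number    | PK  | Integer   | Y        | Y      |                |                                |
| MEMBER | Title            |     | Text      | Y        | N      |                | Choice of Mr/Mrs/Miss/Ms/Dr    |
| MEMBER | Forename         |     | Text (15) | Y        | N      |                |                                |
| MEMBER | Surname          |     | Text (20) | Y        | N      |                |                                |
| MEMBER | Address 1        |     | Text (20) | Y        | N      |                |                                |
| MEMBER | Address 2        |     | Text (20) | N        | N      |                |                                |
| MEMBER | Address 3        |     | Text (20) | N        | N      |                |                                |
| MEMBER | Post Code        |     | Text (7)  | Y        | N      | A?09 0AA       |                                |
| MEMBER | Telephone Number |     | Text (11) | N        | N      | (99999) 000000 |                                |
| FILM   | Film Code        | PK  | Integer   | Y        | Y      |                |                                |
| FILM   | Title            |     | Text (30) | Y        | N      |                |                                |
| LOAN   | Member Number    | PK  | Integer   | Y        | N      |                | Lookup value from MEMBER table |
| LOAN   | DVD Code         | PK  | Integer   | Y        | N      |                | Lookup value from DVD table    |
| LOAN   | Date Hired       | PK  | Date      | Y        | N      | dd/mm/yy       |                                |
| LOAN   | Date Due         |     | Date      | Y        | N      | dd/mm/yy       |                                |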
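The presence, range and restricted-choice checks recorded in the data dictionary could be expressed roughly as follows. The function names validate_member and cost_in_range are illustrative, the length limits are read from the Data Type column, and a real database package would normally apply such rules itself.

```python
from decimal import Decimal

# Rules taken from the MEMBER and DVD rows of the data dictionary.
REQUIRED = ("Member Number", "Title", "Forename", "Surname", "Address 1", "Post Code")
ALLOWED_TITLES = {"Mr", "Mrs", "Miss", "Ms", "Dr"}
MAX_LENGTH = {"Forename": 15, "Surname": 20, "Address 1": 20}


def validate_member(record):
    """Return a list of validation errors (an empty list means the record is valid)."""
    errors = []
    # Presence check: required fields must not be blank.
    for field in REQUIRED:
        if record.get(field) in (None, ""):
            errors.append(f"{field} is required")
    # Restricted choice: Title must come from the allowed list.
    title = record.get("Title")
    if title and title not in ALLOWED_TITLES:
        errors.append("Title is not an allowed choice")
    # Length check from the Text (n) data types.
    for field, limit in MAX_LENGTH.items():
        if len(str(record.get(field) or "")) > limit:
            errors.append(f"{field} is longer than {limit} characters")
    return errors


def cost_in_range(cost):
    """Range check for DVD Cost: >=1 and <=3."""
    return Decimal("1") <= cost <= Decimal("3")


# Hypothetical member record; the name comes from the flat file example.
member = {
    "Member Number": 1034, "Title": "Mr", "Forename": "John", "Surname": "Silver",
    "Address 1": "1 Example Street", "Post Code": "IV99 9ZZ",
}
member_errors = validate_member(member)
```

Keeping the rules in one place mirrors the purpose of the data dictionary: each attribute's properties are written down once and applied consistently.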
