arch 3: Database Design, a Practical Guide
Gus Björklund
Progress Exchange 2007
10-13 June, Phoenix, AZ, USA
© 2007 Progress Software Corporation
Database Design, a Practical Guide
Gus Björklund, Wizard, Progress Software Corporation
Brandon Gibbs, Manager, US-East, Solution Engineer
Progress Software Corporation
Rules are made to be broken
To every rule, there is an exception!
If you thought this talk was going to be about indexing
It isn't. Nor is it about performance.
Topics
Theory:
What is Database Design
Basic Elements
Representing the Model as Tables
Practice
An Example
Some Other Topics
First, a little theory
What do we mean by database design?
A process for defining a model of a subset of the real [1] world, then representing it as data in tables in a relational database
At least, that's the definition we will use for the purposes of this talk.
[1] Well, for small values of "real", anyway.
Basic Elements
Just 3 Things:
Entities
Attributes
Relationships
What do we put in our model?
The entity-relationship model was described by Peter Chen in 1976.
See http://bit.csc.lsu.edu/~chen/chen.html
Basic Elements: Entities
Can be thought of as nouns
People
author, composer, performer, seller, buyer
Places
home, IP address, URL, destination, factory, store
Things
song, recording, instrument, car, invoice
Is telephone number a place or a thing?
Basic Elements: Attributes
Can be thought of as adjectives (but only loosely):
Length
Color
Horsepower
Part number
Song Title
Publication Date
Size
Fabric
Owner
Is telephone number an attribute or an entity?
Entities have attributes
Basic Elements: Relationships
Can be thought of as verbs:
has a
owns
contains
supervises
performs
called
sold
purchased
proved
Entities are connected by relationships
Is telephone number a relationship?
Relationships have attributes too
In May, 1995, Andrew Wiles published a proof of Fermat's Last Theorem
Andrew Wiles: entity
a proof of Fermat's Last Theorem: entity
published: relationship
In May, 1995: attribute
What goes in an entity
Identifying attributes
Must be able to uniquely identify the entity
Can have more than one way to id
Id can be composite
Descriptive attributes
the values you need to keep track of
generally should be simple, not complex
What to include in your model
The things your application has to keep track of
Telephones, wires, switches
The actions your application or its users perform
Make calls, send telephone bills, collect payments
Some attributes of the things and actions
Originating number, date and time of call, duration, called number
Keep it simple
Be accurate
Keep it up to date
What to include in your model
Consider the goals of the system
Everything you include should be there for a reason you can state in no more than two sentences
Everything should have a clear name
if you can't name it, it doesn't belong
Talk to the stakeholders!!!
What to leave out of your model
The real world has properties that don't matter (to your application)
The real world has relationships that don't matter
Things happen in the real world that don't matter
Keep it simple
If you can't say why you need it, leave it out
Logical vs Physical Data Models
Logical entities often require multiple tables to represent them
Tables can be thought of as logical or physical
It depends on your point of view
There is also the physical storage database layout
storage areas
data extents
disks
etc.
We aren't going to talk about the physical database layout
We will talk about tables
Mapping Your Model to a Database
Simply put,
Entities become tables
Identifiers become indexes
Attributes become columns
Data types: pick appropriate ones
Relationships become tables or foreign keys
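The mapping rules can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module rather than a Progress database; the table and column names (disc, track, disc_id, track_num) are assumptions made up for the example and do not come from the talk.

```python
import sqlite3

# Hypothetical fragment of the music store model; all names are illustrative.
SCHEMA = """
CREATE TABLE disc (
    disc_id INTEGER PRIMARY KEY,   -- identifier becomes a primary-key index
    upc     TEXT NOT NULL,         -- attributes become columns
    title   TEXT NOT NULL
);
CREATE TABLE track (
    track_id  INTEGER PRIMARY KEY,
    disc_id   INTEGER NOT NULL REFERENCES disc(disc_id),  -- relationship as a foreign key
    track_num INTEGER NOT NULL,
    UNIQUE (disc_id, track_num)    -- a composite identifier, also indexed
);
"""

def build_schema() -> sqlite3.Connection:
    """Create an in-memory database holding the two example tables."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
    db.executescript(SCHEMA)
    return db

conn = build_schema()
conn.execute(
    "INSERT INTO disc (disc_id, upc, title) VALUES (1, '8697-07416-2', 'The Essential Joshua Bell')"
)
conn.execute("INSERT INTO track (disc_id, track_num) VALUES (1, 1)")

# Entities became tables:
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]

def add_orphan_track(db: sqlite3.Connection) -> bool:
    """Return True if a track pointing at a non-existent disc is rejected."""
    try:
        db.execute("INSERT INTO track (disc_id, track_num) VALUES (99, 1)")
        return False
    except sqlite3.IntegrityError:
        return True
```

A many-to-many relationship would need a table of its own instead of a foreign key column.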
In theory, there is no difference between theory and practice, but in practice there is.
Jan van de Snepscheut
Now for some practice.
An example
Music store
Buys compact disc recordings from distributors
Has inventory
Allows customers to search for what they want
Maybe in an in-store kiosk or on the web
Sells compact discs to customers
What should we do first?
Activities
We buy discs from a distributor
Orders are sent to a distributor
Orders are delivered to the store
Orders may be cancelled
We sell discs to customers in sales transactions
Customers buy discs in sales transactions
Customers search for what they want to buy
Which of these must be remembered by the system?
What do we need to keep track of?
Discs we have
Discs we sold
Discs we know about and can get
Discs we have ordered
Information needed to do our income tax
what we paid for stock
when we bought it
what we sold it for
when we sold it
Disc entities
UPC Code: 8697-07416-2
Manufacturer: Sony BMG
Cost to us: $ 2.00
Price charged: $ 17.95
Tax charged: $ 0.80
Date purchased: March 19, 2007
Date sold: June 9, 2007
Disc table might look like this
upc           manuf           cost  price  tax   datePurch   dateSold
8697-07416-2  Sony BMG        2.00  17.95  0.90  2007-03-19  2007-06-09
8697-07416-2  Sony BMG        2.00  ?      ?     2007-06-09  ?
314-510347-2  Island Records  2.21  15.95  0.80  2006-01-12  2007-02-14
314-510347-2  Island Records  2.21  ?      ?     2006-01-12  ?
What's wrong?
Is upc a unique identifier?
Might have bought from a distributor
Have no information about what is on the disc
How do customers search?
Don't know when disc was made
Could be more than one tax jurisdiction
provincial tax, city tax
Don't know if disc is on order
Don't know who bought it
Duplicated data
Etc., etc.
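The first question can be checked directly. The sketch below assumes, for the sake of argument, that upc is declared as the unique identifier of the flat disc table, and then tries to record a second copy of the same title. It uses Python's sqlite3 module; the column names are shortened, hypothetical versions of the ones on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Flat table from the slide, with upc (wrongly) declared as the unique identifier.
conn.execute(
    """
    CREATE TABLE disc (
        upc        TEXT PRIMARY KEY,
        manuf      TEXT,
        cost       REAL,
        date_purch TEXT
    )
    """
)
conn.execute("INSERT INTO disc VALUES ('8697-07416-2', 'Sony BMG', 2.00, '2007-03-19')")

def add_second_copy(db: sqlite3.Connection) -> str:
    """Try to record a second physical copy of the same title."""
    try:
        db.execute("INSERT INTO disc VALUES ('8697-07416-2', 'Sony BMG', 2.00, '2007-06-09')")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

result = add_second_copy(conn)
```

The second copy is rejected, so upc identifies a title, not an individual disc in stock. Dropping the key would let both rows in, at the price of the duplicated manufacturer and cost data the slide complains about.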
Disc entities, take 2
UPC Code: 8697-07416-2
Manufacturer: Sony BMG
Distributor: Bob's Wholesale CDs
Cost to us: $ 2.00
Price charged: $ 17.95
Tax charged: $ 0.80
Date ordered: March 19, 2007
Date received: March 20, 2007
Date sold: June 9, 2007
Disc Title: The Essential Joshua Bell
Artist: Joshua Bell
Track 1: Danse Russe
Track 2: Violin Concerto in E Minor
Track 3: Nocturne in C-sharp Minor
etc.
Example: Now what's wrong?
This is getting messy
Activities combined with disc attributes
Have duplicated information
How many tracks can there be?
What if there is more than one artist?
Don't have all the information a customer might want to use to search
Discs revisited
Discs have titles
Discs have pictures on the cover
Discs contain tracks
Discs are made by manufacturers
Discs are purchased from distributors
Discs are ordered from distributors
Discs are delivered to the store
Discs are sold to customers
Discs contain tracks
Tracks contain songs
Tracks occur in order
Tracks have a duration
Songs are performed in performances
Songs have performers (usually)
Songs have composers
Songs have names (titles)
Songs have a key (but not always)
Performances are done by performers
Performers can be groups (bands, orchestras, etc.)
Performances are performed in a location or venue
We seem to need these entities
Discs
Manufacturers
Distributors
Orders
Customers
Inventory
Tracks
Songs
Performers
Groups ?
Songs have names (titles).
Are names properties of songs?
Or are they entities related to songs?
Or are they something else?
Song data (track 1)
Title: Danse Russe from Swan Lake, Op. 20
Time: 4:30
Composer: Peter Tchaikovsky
Category: Classical, violin, orchestra
Performers: Joshua Bell, Michael Tilson Thomas, Berlin Philharmonic Orchestra
Track number: 1
Disc upc: 8697-07416-2
Song data (track 2)
Title: Violin Concerto in E Minor, Op. 64
Time: 6:27
Composer: Felix Mendelssohn
Category: Classical, violin, orchestra
Performers: Joshua Bell, Sir Roger Norrington, Camerata Salzburg
Track number: 2
Disc upc: 8697-07416-2
Performance data
Title: Violin Concerto in E Minor, Op. 64
Time: 6:27
Composer: Felix Mendelssohn
Category: Classical, violin, orchestra
Performers: Joshua Bell, Sir Roger Norrington, Camerata Salzburg
Performance data, take 2
Title: Violin Concerto in E Minor, Op. 64
Time: 6:27
Composer: Felix Mendelssohn
Category: Classical, violin, orchestra
Performers: Joshua Bell, Sir Roger Norrington, Camerata Salzburg
Performance Date: ?
Performance Location: ?
Performer data
id name
1 Joshua Bell
2 Sir Roger Norrington
3 Camerata Salzburg
4 Michael Tilson Thomas
5 Berlin Philharmonic
6 Bono
7 The Edge
8 Adam Clayton
9 Larry Mullen
Performance to Performer Relationship
performance id  performer id
1               1
1               2
1               3
2               1
2               4
2               5
325             6
325             7
325             8
325             9
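A relationship table like this one can be queried from either side. The sketch below loads some of the rows shown above into an in-memory SQLite database through Python's sqlite3 module; the table name performance_performer and the two helper functions are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE performer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE performance_performer (
    performance_id INTEGER NOT NULL,
    performer_id   INTEGER NOT NULL REFERENCES performer(id),
    PRIMARY KEY (performance_id, performer_id)  -- each pairing is stored once
);
""")
conn.executemany("INSERT INTO performer VALUES (?, ?)", [
    (1, "Joshua Bell"), (6, "Bono"), (7, "The Edge"),
    (8, "Adam Clayton"), (9, "Larry Mullen"),
])
conn.executemany("INSERT INTO performance_performer VALUES (?, ?)", [
    (1, 1), (2, 1), (325, 6), (325, 7), (325, 8), (325, 9),
])

def performers_of(performance_id: int) -> list[str]:
    """Names of everyone who took part in one performance."""
    rows = conn.execute(
        """SELECT p.name
           FROM performance_performer pp
           JOIN performer p ON p.id = pp.performer_id
           WHERE pp.performance_id = ?
           ORDER BY p.id""",
        (performance_id,),
    )
    return [name for (name,) in rows]

def performances_of(performer_id: int) -> list[int]:
    """Ids of every performance one performer took part in."""
    rows = conn.execute(
        "SELECT performance_id FROM performance_performer "
        "WHERE performer_id = ? ORDER BY performance_id",
        (performer_id,),
    )
    return [pid for (pid,) in rows]
```

Because each pairing is its own row, a performance can have any number of performers and a performer any number of performances, with no repeated name columns.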
Performance data, take 3
Performance id: 2
Title: Violin Concerto in E Minor, Op. 64
Time: 6:27
Composer: Felix Mendelssohn
Category: Classical, violin, orchestra
Track to Performance Relationship
Disc upc      Track Num  Performance id
8697-07416-2  1          1
8697-07416-2  2          2
314-510347-2  1          325
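With tracks pointing at performances, a disc's track listing comes from a join rather than from Track 1, Track 2, ... columns. A minimal sketch with Python's sqlite3 module follows; the table layout and the track_listing helper are assumptions for illustration, and only the two performances whose titles appear in the slides are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE performance (id INTEGER PRIMARY KEY, title TEXT NOT NULL, time TEXT);
CREATE TABLE track (
    disc_upc       TEXT    NOT NULL,
    track_num      INTEGER NOT NULL,
    performance_id INTEGER NOT NULL REFERENCES performance(id),
    PRIMARY KEY (disc_upc, track_num)   -- composite identifier of a track
);
""")
conn.executemany("INSERT INTO performance VALUES (?, ?, ?)", [
    (1, "Danse Russe from Swan Lake, Op. 20", "4:30"),
    (2, "Violin Concerto in E Minor, Op. 64", "6:27"),
])
conn.executemany("INSERT INTO track VALUES (?, ?, ?)", [
    ("8697-07416-2", 1, 1),
    ("8697-07416-2", 2, 2),
])

def track_listing(upc: str) -> list[tuple[int, str, str]]:
    """Return (track number, title, time) for every track on one disc."""
    rows = conn.execute(
        """SELECT t.track_num, p.title, p.time
           FROM track t
           JOIN performance p ON p.id = t.performance_id
           WHERE t.disc_upc = ?
           ORDER BY t.track_num""",
        (upc,),
    )
    return rows.fetchall()
```

The number of tracks per disc is no longer fixed by the table layout, which answers the earlier question of how many tracks there can be.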
Relationships (so far):
disc to track: one to many
track to performance: one to one
performance to performer: many to many
What happened to Songs?
Relationships (take 2):
disc to track: one to many
track to song: one to one
performance to performer: many to many
song to performance: one to many
Relationships (take 3):
[Diagram: one disc, three tracks, three performances, two performers, and three songs, with the relationships between them]
What about business entities?
Where are they?
Business entities
[Diagram: the disc, track, performance, performer, and song entities from the previous slide]
Should you use arrays?
Indexes
Enforce uniqueness
Make searches faster
Enable fast retrieval of entities by their identities
Enable finding entities with certain attributes
What indexes do we need for the music store database?
Tables
0) Discs
1) Tracks
2) Songs
3) Performers
4) Performances
5) Tracks of discs
6) Performances of songs
7) Performers of performances
What indexes do we need?
0) Indexes for identifying attributes
1) A unique row identifier
2) Indexes for the queries you will do
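The three kinds of index can be sketched on a single table. The example uses Python's sqlite3 module, not a Progress database, and the table, column, and index names are hypothetical; the exact wording of the query plan text differs between SQLite versions, so the code only looks for the index name in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE track (
    track_id       INTEGER PRIMARY KEY,   -- 1) a unique row identifier
    disc_upc       TEXT    NOT NULL,
    track_num      INTEGER NOT NULL,
    performance_id INTEGER NOT NULL
);
-- 0) index for the identifying attributes; it also enforces uniqueness
CREATE UNIQUE INDEX track_ident_idx ON track (disc_upc, track_num);
-- 2) index for a query we expect: which tracks carry this performance?
CREATE INDEX track_performance_idx ON track (performance_id);
""")
conn.executemany(
    "INSERT INTO track (disc_upc, track_num, performance_id) VALUES (?, ?, ?)",
    [("8697-07416-2", 1, 1), ("8697-07416-2", 2, 2), ("314-510347-2", 1, 325)],
)

def query_plan(sql: str, params: tuple = ()) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[3] for row in rows)

plan = query_plan("SELECT disc_upc FROM track WHERE performance_id = ?", (325,))

def insert_duplicate_track() -> bool:
    """Return True if a second row with the same identity is rejected."""
    try:
        conn.execute(
            "INSERT INTO track (disc_upc, track_num, performance_id) VALUES (?, ?, ?)",
            ("8697-07416-2", 1, 2),
        )
        return False
    except sqlite3.IntegrityError:
        return True
```

Checking the plan for each query the application will actually run is a cheap way to find out whether an index is missing or unused.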
2007 Progress Software Corporation56
What should we do next ?
2007 Progress Software Corporation57
Other Topics
Normalization
Unique keys
Word indexes
Naming
Customisation
Normalization
Oversimplified, it means:
Don't duplicate data
Attributes should:
be simple
have only one value
be necessary
not derived data
don't repeat
Complicated attributes are often entities in their own right
For example, addresses might be
Unique keys
EVERY table must have a unique key
EVERY row needs a unique identifier
that never changes
even if moved to another database (i.e. if you replicate)
Often, users don't need to see it
Use a UUID or sequence or maybe datetime
Unique key is the ONLY way to identify rows unambiguously
ROWIDs are temporary and can change
Use the same method throughout
You'll be glad you did
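A minimal sketch of the UUID option, using Python's uuid and sqlite3 modules. The customer table, the row_key column, and the copy loop standing in for replication are all assumptions made for the example.

```python
import sqlite3
import uuid

def new_key() -> str:
    """Generate a key that does not depend on any one database."""
    return str(uuid.uuid4())

def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customer (row_key TEXT PRIMARY KEY, name TEXT NOT NULL)")
    return db

source = make_db()
replica = make_db()

key = new_key()
source.execute("INSERT INTO customer VALUES (?, ?)", (key, "Bob"))

# "Replicate" by copying rows; the key travels with the row unchanged.
for row in source.execute("SELECT row_key, name FROM customer").fetchall():
    replica.execute("INSERT INTO customer VALUES (?, ?)", row)

found = replica.execute(
    "SELECT name FROM customer WHERE row_key = ?", (key,)
).fetchone()
```

A sequence would also give unique values, but two databases that each run their own sequence can hand out the same number, which is why the sketch uses a UUID for the replicated case.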
Word indexes
Can be used to hold multiple status or attribute values
Conflicts with normalisation
Flexible
Easy to add new ones
Queries are fast
Example:
Category: classical, violin, orchestral, concerto
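The idea behind a word index can be sketched in plain Python: every word in the field points back to the rows that contain it. This is only an illustration of the concept, not how the Progress database implements word indexes; the function names and the assignment of category strings to performance ids are hypothetical.

```python
from collections import defaultdict

def build_word_index(rows: dict[int, str]) -> dict[str, set[int]]:
    """Map each word in a multi-valued text field to the ids of the rows containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for row_id, text in rows.items():
        for word in text.replace(",", " ").lower().split():
            index[word].add(row_id)
    return index

def find(index: dict[str, set[int]], *words: str) -> set[int]:
    """Ids of the rows that contain every one of the given words."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()

# Category values keyed by performance id (illustrative assignment).
categories = {
    1: "classical, violin, orchestra",
    2: "classical, violin, orchestral, concerto",
}
index = build_word_index(categories)
```

Adding a new category value needs no schema change, which is the flexibility the slide mentions; the cost is that one column now holds several values.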
Naming
What is in the column GL01262?
Good names are crucial to understanding
Naming
Table and column names should have clear meanings everyone can understand
GL01262 vs dateEntered
Names with dashes cause inconvenience with SQL
order-date
Booleans should be named for truth value
backOrdered
No double negations
notOutOfStock
Good names are crucial to understanding
Making tables customizable
We will look at 3 ways:
Spare columns
Separate table with spare columns
Separate table with name/value pairs
Spare columns in table
custnum  name   city     extra1  extra2  extra3
001      Bob    Phoenix  frozen  ?       0.0
002      Alice  Boston   ?       125.46  0.12
003      Eve    Denver   ?       ?       ?
What data types should you use?
How many spare columns?
Wasted columns when not used
How do you know what each spare got used for?
How do you know how many unused spares you have?
Separate table for spare columns
custnum  name   city
001      Bob    Phoenix
002      Alice  Boston
003      Eve    Denver
custnum  extra1  extra2  extra3
001      frozen  ?       0.0
002      ?       125.46  0.12
Separate table for spare columns
custnum  name   city
001      Bob    Phoenix
002      Alice  Boston
003      Eve    Denver
custnum  status  owed    discount
001      frozen  ?       0.0
002      ?       125.46  0.12
Separate table with name/value pairs
custnum  name   city
001      Bob    Phoenix
002      Alice  Boston
003      Eve    Denver
custnum  name      value
001      status    frozen
002      owed      125.46
002      discount  0.12
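The name/value layout can be sketched with Python's sqlite3 module. The table name customer_extra and the extras helper are hypothetical; the rows are the ones shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (custnum TEXT PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE customer_extra (
    custnum TEXT NOT NULL REFERENCES customer(custnum),
    name    TEXT NOT NULL,
    value   TEXT,                     -- every value is stored as text
    PRIMARY KEY (custnum, name)       -- one value per customer and name
);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    ("001", "Bob", "Phoenix"), ("002", "Alice", "Boston"), ("003", "Eve", "Denver"),
])
conn.executemany("INSERT INTO customer_extra VALUES (?, ?, ?)", [
    ("001", "status", "frozen"), ("002", "owed", "125.46"), ("002", "discount", "0.12"),
])

def extras(custnum: str) -> dict[str, str]:
    """Collect one customer's custom attributes into a dictionary."""
    rows = conn.execute(
        "SELECT name, value FROM customer_extra WHERE custnum = ?", (custnum,)
    )
    return dict(rows.fetchall())
```

No rows are wasted on unused attributes and new names need no schema change, but the values lose their data types, so the caller has to know that owed is a number and status is not.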
Modeling Tools
PCase
Enterprise Architect
Power Designer
ConceptDraw
Erwin
Rational
Pencil and paper!
Blackboard!
Summary
Understand the requirements
Leave out what is not needed
Review the design with stakeholders
Evolve the design as changes come up
Test to make sure it works
Can it do everything that is needed?
Does it perform adequately?
Expect changes to come
Homework
Papers:
Wiles, A.: "Modular elliptic curves and Fermat's Last Theorem", Annals of Mathematics 141 (3): 443-551
Chen, P.: "The Entity-Relationship Model -- Toward a Unified View of Data", ACM TODS Vol 1, No 1, 1976
Wikipedia articles to start from:
entity-relationship model
data model
Books:
Teorey, Lightstone, Nadeau: Database Modeling and Design, Morgan Kaufmann.
Questions