Upload
hunter-mcintyre
View
229
Download
0
Tags:
Embed Size (px)
Citation preview
CM20145CM20145Database DesignDatabase Design
Dr Alwyn BarryDr Joanna Bryson
Lecture PlanLecture Plan
1. Basic Concepts
2. Data, Information & Knowledge
3. Data Models (The E-R Model)
4. The Relational Algebra
5. Introduction to SQL
6. Further SQL (Joins, RA Equivalences)
7. Database Design
8. Further DB Design – Normalisation
9. Architectures and Implementations
10. Integrity and Security
Lecture PlanLecture Plan
11. Ethics and Professional Conduct
12. Legal Issues
13. Transactions
14. Recovery
15. Concurrency Control
16. Storage and File Structure
17. Indexing and Hashing
18. Query Processing & Optimisation
January… Review Session
`Reading’ (PROJECT) Week
OverviewOverview
Database Design and Software Engineering.
Example from real life. Views and Users. Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
What Is Relational Algebra For?What Is Relational Algebra For?
Fundamental Operators: Restriction Projection Cartesian Product Union Difference
Useful Operators: Join Intersection Division
Unary
Binary
Fun
dam
en
tal
Not
Prim
itive
File Problems DBs Should SolveFile Problems DBs Should Solve
Data redundancy & inconsistency Duplication of information; Different formats
Data isolation Different formats; Different locations
Limited access Need new program for each query
Cannot support “Business Rules” Consistency Validity
Atomicity Failures leave data in inconsistent state; Two users
cannot update the database at once
Security Needs bespoke security in each application
Database DesignDatabase Design
DB design usually covers formal methods for trying to ensure databases meet all these goals.
No formal system is perfect: Formal methods take a long time,
may be computationally intractable. Human error: mistakes in proofs. Formally correct databases can still
have errors (will be examples.) Design is an art; but science can
help.
E-R Model – System DesignE-R Model – System Design
Problemdomain
Problemdomain
Data RequirementsFunctional Requirements
RequirementsCollection & Analysis
Conceptual DesignFunctional Analysis
Conceptual SchemaHigh-level Transaction Spec.
Logical Schema
Logical Design(Data model mapping)
Physical Design
Application ProgramDesign
Internal Schema
TransactionImplementation
Application Programs
DB
MS
In
de
pe
nd
en
tD
BM
S D
ep
en
de
ntt
From Elmasri & Navathe, 2003, pg 51
Software Development IteratesSoftware Development Iterates
http://www.extremeprogramming.org/
Waterfall Feigning IterationWaterfall Feigning Iteration
This is not agile SE. People don’t want to know, but
iteration happens. The question is how well you cope.
Software Development IteratesSoftware Development Iterates
Models change as programmers understand the problem better.
Requirements change as users understand possibilities better.
More resources become available. So:
Save and maintain all modelling and planning tools.
Interact with users frequently. Learn rules to recognize common
failures.
OverviewOverview
Database Design and Software Engineering.
Example from real life. Views and Users. Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
Where Do Models Come From?Where Do Models Come From?
Demand – What do the users want to see?
Data – Don’t throw anything away.
Design – Never store anything twice. Efficiency. Integrity. Clarity. Security.
Automating Trading OperationsAutomating Trading Operations
Automating Trading OperationsAutomating Trading Operations
Data from: trading cards, market prices, underlying
values.
Demand from: Knowing what is
owned for determining risk.
Knowing what individuals did for determining pay.
Entities and attributesEntities and attributes
Trades Instrument, trade,
price, 2 traders, 2 clearing firms.
Traders Positions, trades.
Trading cards Trades, trader,
number, time.
Instruments Daily values,
volatilities, expiration dates.
(draw on board)
New Requirement: ReconcilingNew Requirement: Reconciling
Business process – dual entry: Operations: primary concerns are
risk, execution on trading floor, relations with individual traders.
Clearing: primary concerns are accounting, banks, law, relations with other clearing firms.
The same trading cards resulted in different trade quantities.
The same trade has different dates!
Models will change!Models will change!
Design – Find out models are inefficient or clumsy to maintain.
Data – Discover new categories, salient values.
Demand – Users see new potential.
OverviewOverview
Database Design and Software Engineering.
Example from real life. Views and Users. Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
Views: Why they’re important.Views: Why they’re important.
The same data may be seen by different users in different ways.
Shorthand for frequent joins, formulas – may be more efficient.
Automate / enforce security – make access to tables and views depend on user’s function.
Keep reports logically independent from underlying representation – protect the users!
Why Limit / Protect Users?Why Limit / Protect Users?
Databases contain all data for a company. Limits on access eliminate
suspects for errors, crime. Information overload.
Smart, authoritative users still need to find things quickly.
Some users really are naïve.
Programmers are users.
OverviewOverview
Database Design and Software Engineering.
Example from real life. Views and Users. Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
Software Development IteratesSoftware Development Iterates
Models change as programmers understand the problem better.
Requirements change as users understand possibilities better.
More resources become available. So:
Save and maintain all modelling and planning tools.
Interact with users frequently. Learn rules to recognize common
failures.
Software Development IteratesSoftware Development Iterates
Models change as programmers understand the problem better.
Requirements change as users understand possibilities better.
More resources become available. So:
Save and maintain all modelling and planning tools.
Interact with users frequently. Learn rules to recognize common
failures.
Pitfalls in Relational DB DesignPitfalls in Relational DB Design
A bad design may lead to: redundant information, difficulty in representing certain
information, or difficulty in checking integrity
constraints.
Design goals: Avoid redundant data. Ensure that relationships among
attributes are represented. Facilitate the checking of updates for
violation of integrity constraints.
©Silberschatz, Korth and Sudarshan
Modifications & additions by S Bird, Melbourne
Example of Bad DesignExample of Bad Design Consider the relation schema:
Lending-schema = (branch-name,branch-city,assets, customer-name,loan-
number,amount)
Redundant Information: Data for branch-name, branch-city, assets are
repeated for each loan that a branch makes. Wastes space and complicates updates, introducing
possibility of inconsistency of assets value. Difficulty representing certain information:
Cannot store info. about a branch if no loans exist. Can use null values, but they are difficult to handle.
Solution: DecompositionSolution: Decomposition
Break up redundant tables into multiple tables - this operation is called decomposition. E.g. Lending-schema = (branch-
name, branch-city, assets, customer-name, loan-number, amount)
Branch-schema = (branch-name, branch-city,assets)
Loan-info-schema = (customer-name, loan-number, branch-name, amount)
Lossless-Join Decomposition:Lossless-Join Decomposition: Want to ensure that the original
data is recoverable.1.All attributes of the original schema
(R) must appear in the decomposition (R1, R2), i.e. R = R1 R2
2.Decomposition must be a lossless-join decomposition.
Definition: R1,R2 is a lossless-join decomposition of Rif, for all possible relations r(R) r = R1 (r) ⋈ R2 (r)
Bad Decomposition ExampleBad Decomposition Example
A Non Lossless-Join Decomposition R = (A, B) R1 = (A), R2 = (B)
A B
121
A
B
12
rA(r)
B(r)
A (r) ⋈ B (r)
A B
1212
Thus, r is different to A (r) ⋈ B (r)
So A,B is not a lossless-join decomposition of R.
OverviewOverview
Database Design and Software Engineering.
Example from real life. Views and Users. Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
SummarySummary
Database design is an ongoing, iterative process. Requirements come from data, user
demands, design issues. Change occurs:
Corporations & technologies grow. Programmers & users learn.
Views / security. Lossless-join decomposition
Next: Science for improving design.
Reading & ExercisesReading & Exercises
Reading Connolly & Begg Chapter 9, (13, 14) Silberschatz Chapter 7.
Much of 7 will be in the next lecture!
Exercises: C&B: 9.2, 9.3/11, 9.9, 9.10 Silberschatz:7.1, 7.2, 7.16
These need functional dependencies, which are covered next lecture.