Upload
emory-emery-powers
View
264
Download
0
Tags:
Embed Size (px)
Citation preview
Bottom Up Analysis
Bottom-up Data Analysis
An Example
Bottom Up Analysis
Rigorous, bottom-up, data analysis
• Bottom-up data analysis is used as a method of database design when an enterprise already has an information system, but one which needs to be refined before it can be implemented as a relational database
• It uses the techniques we have met
Bottom Up Analysis
Outline
• Systems analyst obtains a copy of every• document• screen display• report• record card• ...
• … or any other tangible information source relevant to the information system that is being designed
• These sources are processed according to a well defined procedure, to produce a suitable collection of relation schemas
Bottom Up Analysis
The Procedure
(to be followed for every information source)
• represent the source as an un-normalised structure
• normalise that structure• splitting as necessary
• check that all relations are in normal form• repeating as necessary
• optimise the whole set of normalised relations
Bottom Up Analysis
Normalisation
• Remember ...• normalisation means a separation of
concerns• often eliminates redundancy in the relations
• there are various normal forms• choose an acceptable one• 3NF• (BCNF)
Bottom Up Analysis
Student no:Name:Course Title:Year of Course:
Staff Code:Name:Extension:
Unit Code Unit Title MarkC66 Computing Fundamentals 80C22 Problem Solving 57C89 Practical Business Computing 65
Josephine M. Bloggs1989074
Llandwp College
Units7724
Amanda DolittleCsasd
Personal Tutor:2
Computer Science
Analysing a Document
• Suppose we are working with a registration card for Llandwp College
Student_Registration
Bottom Up Analysis
Student no.Student's NameCourse TitleYear of CourseTutor's Staff CodeTutor's NameTutor's Extension
Unit Code Unit TitleMark
repeats for each unit
An Un-normalised Structure
• As an initial step, the document is represented as an un-normalised structure:
Student
Enrolment
Bottom Up Analysis
Functional Dependencies and Enterprise Rules
• Next, functional dependencies between the data fields are recorded
• Functional dependencies are determined by enterprise rules for the data in question
Bottom Up Analysis
Student no. Student's Name, Course title, Year of Course, Tutor's Staff Code, Tutor's Name, Tutor's Extension
Tutor's Staff Code Tutor's Name, Tutor's Extension
Student no., Unit Code Unit Title, Mark
Unit Code Unit Title
Functional Dependencies
Bottom Up Analysis
Student no. Student's Name, Course title, Year of Course, Tutor's Staff Code, Tutor's Name, Tutor's Extension
Tutor's Staff Code Tutor's Name, Tutor's Extension
Student no., Unit Code Unit Title, Mark
Unit Code Unit Title
Functional Dependencies
Bottom Up Analysis
First Normal Form
• The un-normalised structure is brought to first normal form by transferring repeating groups of items to separate relation schemes
• Each new scheme includes the key of the original un-normalised structure as part of its own key
Bottom Up Analysis
Student no.Student's NameCourse TitleYear of CourseTutor's Staff CodeTutor's NameTutor's Extension
Student no.Unit CodeUnit TitleMark
First Normal Form
Student
Enrolment
Bottom Up Analysis
STUDENT(Student no., Student’s Name, Year of Course, Tutor's Staff Code, Tutor’s Name, Tutor’s Extension)
ENROLMENT(Student no*, Unit Code, Unit Title, Mark) FK Student no references STUDENT
First Normal Form
Student no.Student's NameCourse TitleYear of CourseTutor's Staff CodeTutor's NameTutor's Extension
Student no*Unit CodeUnit TitleMark
Student
Enrolment
Bottom Up Analysis
Second Normal Form
• The relation schemes are brought to second normal form by eliminating partial dependencies on keys
• Unit Title is only partially dependent on the key (Student no., Unit Code), since Unit Code alone is sufficient to determine its value for any unit
Bottom Up Analysis
Student no.Student's NameCourse TitleYear of CourseTutor's Staff CodeTutor's NameTutor's Extension
Student no.*Unit Code*Mark
Unit CodeUnit Title
Second Normal Form
Student
Enrolment
Unit
Bottom Up Analysis
STUDENT(Student no., Student’s Name.Course Title, Year of Course, Tutor's Staff Code, Tutor’s Name, Tutor’s Extension)
UNIT(Unit Code, Unit Title)
ENROL(Student no.*, Unit Code*, Mark) FK Student no references STUDENT FK Unit Code references UNIT
Second Normal Form
Student no.Student's NameCourse TitleYear of CourseTutor's Staff CodeTutor's NameTutor's Extension
Student no.*Unit Code*Mark
Unit CodeUnit Title
Student
Enrolment Unit
Bottom Up Analysis
Bringing Relation Schemes to Third Normal Form
• The relation schemes are brought to third normal form by eliminating transitive dependencies on keys
• Tutor's Name and Tutor's Extension are transitively dependent on the key, Student no
Bottom Up Analysis
Student no.Student's NameCourse TitleYear of CourseTutor's Staff Code *
Tutor's Staff CodeTutor's NameTutor's Extension
Student no.*Unit Code*Mark
Unit CodeUnit Title
Third Normal Form
Student
Enrolment
UnitStaff
Bottom Up Analysis
STUDENT(Student no. {PK}, Student’s Name.Course Title, Year of Course, Tutor's Staff Code*) FK Tutor’s Staff Code references TUTOR
UNIT(Unit Code {PK}, Unit Title)
TUTOR(Tutor’s Staff Code {PK}, Tutor’s Name, Tutor’s Extension)
ENROL(Student no.*, Unit Code* {PK}, Mark) FK Student no. references STUDENT, FK Unit Code references UNIT
Third Normal Form
Student no.Student's NameCourse TitleYear of CourseTutor's Staff Code *
Tutor's Staff CodeTutor's NameTutor's Extension
Student no.*Unit Code*Mark
Unit CodeUnit Title
Student
Enrolment
UnitStaff
Bottom Up Analysis
Summary of the normalisation procedure
• Represent the source document (as an un-normalised structure)
• Identify functional dependencies between data items
• Construct first normal form schemes by eliminating repeating groups of data items
• Bring the schemes to second normal form by eliminating partial dependencies on key fields
• Bring the schemes to third normal form by eliminating transitive dependencies on key fields
Bottom Up Analysis
More documents from the Llandwp College
• Also included are copies of• a list of Course Units• a list of Staff Teaching Duties• a Timetable
• As a systems analyst, it is your job to analyse these documents• like the Registration Card was analysed
above
Bottom Up Analysis
The List of Course Units
Example data (document)
C22 Problem SolvingC31 Basic Compiler DesignC66 Computing FundamentalsC70 Java codingC89 Practical Business Computing…
Bottom Up Analysis
Unit CodeUnit Title
The List of Course Units
• Representing the Course Units list as an "un-normalised" structure gives:
• Enterprise rule:• Each unit has a
unique unit code which determines its title
• The only functional dependency is:
Unit Code Unit Title
Bottom Up Analysis
Unit CodeUnit Title
Analysing the List of Course Units
• In this case, the “un-normalised” structure is already in third normal form• It contains
• no repeating groups of data items,• no partial dependencies on key fields, and• no transitive dependencies on keys.
• It needs no further analysis, and the resulting relation scheme is, as before:
Bottom Up Analysis
Analysing the List of Course Units
• UNIT( Unit Code, Unit Title)
Unit CodeUnit Title
Unit
Bottom Up Analysis
The List of Teaching Duties
Example data
csano Tony Otter C31, C70csasd Amanda Dolittle C89cswas Bill Smith C22, C66
Bottom Up Analysis
The List of Teaching Duties
• Un-normalised Structure:
• Enterprise Rule:• Each member of staff has
a unique staff code.
• Functional dependency:Staff Code Name
• Enterprise Rule:• Each member of staff may
teach many Units:
{Staff Code, Unit Code} { }
Staff CodeName
Unit Code Repeats for each unit
Staff
Bottom Up Analysis
• These schemes are also in third normal form (and BCNF) .
Analysing the list of Teaching Duties
• First Normal Form• Eliminate repeating
groups of data items by moving the items in question to a new relation scheme, and including the key of the original scheme in that of the new scheme
Staff CodeName
Staff Code*Unit Code
Staff
Teach
Bottom Up Analysis
Analysing the list of Teaching Duties
STAFF(Staff Code, Name )
TEACH(Staff Code*, Unit Code)FK Staff Code references STAFF
Staff CodeName
Staff Code*Unit Code
Staff
Teach
Bottom Up Analysis
Analysing the Timetable
Example Datathe timetable is written as a list
Monday10:00 C22 32112:00 C66 318
Tuesday11:00 C89 31711:00 C70 320
...
Bottom Up Analysis
Day of WeekTime
Unit CodeRoom No.
repeatsrepeats
Analysing the Timetable
• Un-normalised Structure:
Bottom Up Analysis
Analysing the Timetable
• Enterprise Rules:• At a given time and day of the week, a unit
(module) can only take place in one room.• Functional Dependency:Day of Week, Time, Unit Code Room No.
• At a given time and day of the week, a room can only hold one unit (module).
• Functional Dependency:Day of Week, Time, Room No. Unit Code
(Captures that different units cannot co-exist at same time in same room)
Bottom Up Analysis
Day of Week
Day of WeekTime
Analysing the Timetable
• First Normal Form:
• These schemes are also in third normal form.
Day of Week*Time*UnitRoom
CKs:
{Day, Time, Unit}
{Day, Time, Room}
Bottom Up Analysis
Analysing the Timetable
WEEK_DAY(Day of Week )
TIME_SLOT(Day of Week*, Time) FK Day of Week references WEEK_DAY
T_TABLE(Day of Week, Time*, Unit Code, Room No) ALTERNATE KEY: (Day of Week, Time, Room No.)FK {Day of Week, Time} references TIME_SLOT
Day of Week
Day of WeekTime
Day of Week*Time*UnitRoom
CK:
{Day, Time, Room}
Bottom Up Analysis
The Complete Set of Normalised Relation Schemes
• From the Student Registration Card:• 4 relations
• From the Course Units List:• 1 relation
• From the Teaching Duties List:• 2 relations
• From the Timetable:• 3 relations
Bottom Up Analysis
Staff CodeName
Staff Code*Unit Code*
The Normalised Relation Schema
Day of Week*Time*UnitRoom No.
Day of Week*Time
Day of Week
Student no.*Unit Code*Mark
Unit CodeUnit Title
Student no.Student's NameCourse TitleYear of CourseTutor's Staff Code *
Tutor's Staff CodeTutor's NameTutor's Extension
Unit CodeUnit Title
Bottom Up Analysis
Merging the Normalised Relation Schemes
• Check attribute names for consistency, renaming attributes where necessary
• Merge relations that have the same key if they represent the same real world entity
• Do not merge relations that have the same key if one represents a subset domain of the other
• Eliminate key-only relations unless they represent data that you want to store in tables
Bottom Up Analysis
Staff Code*Unit Code*
The Optimised Normalised Relation Schema
Day of Week*Time*UnitRoom No.
Day of WeekTime
Student no.*Unit Code*Mark
Unit CodeUnit Title
Student no.Student's NameCourse TitleYear of CourseTutor's Staff Code *
Staff CodeTutor's NameTutor's Extension
Day of WeekUnit CodeUnit Title
Staff CodeName
Duplication?
Sameconcept?
Rename?
Key only?
Bottom Up Analysis
Staff Code*Unit Code*
The Optimised Normalised Relation Schema
Day of Week*Time*UnitRoom No.
Day of WeekTime
Student no.*Unit Code*Mark
Unit CodeUnit Title
Student no.Student's NameCourse TitleYear of CourseTutor's Staff Code *
Staff CodeTutor's NameTutor's Extension
AddedSlide
Summary of Relation Schema
STAFF(Staff Code, Name, Extension)
STUDENT(Student no., Course Title, Year of Course, Tutor's Staff Code*)FK Tutor's Staff Code references
STAFF
UNIT(Unit Code, Unit Title)
ENROL(Student no.*, Unit Code*, Mark)FK Student no. references
STUDENTFK Unit Code references UNIT
TEACH(Staff Code*, Unit Code*)FK Staff Code references STAFF
FK Unit Code references UNIT
TIME_SLOT(Day of Week, Time)
T_TABLE(Day of Week*, Time*, Unit
Code*, Room No.) FK Time (Day of Week, Time)
references TIME_SLOT
FK Unit Code Code references UNIT
AddedSlide
Bottom Up Analysis
Staff Code*Unit Code*
Alternative Optimised Normalised Relation Schema
Slot*UnitRoom No.
Student no.*Unit Code*Mark
Unit CodeUnit Title
Student no.Student's NameCourse TitleYear of CourseTutor's Staff Code *
Staff CodeStaff NameStaff Extension
Slot*Day of WeekTime
Summary of Relation Schema
STAFF(Staff Code, Name, Extension)
STUDENT(Student no., Course Title, Year of Course, Tutor's Staff Code*)FK Tutor's Staff Code references
STAFF
UNIT(Unit Code, Unit Title)
ENROL(Student no.*, Unit Code*, Mark)FK Student no. references
STUDENTFK Unit Code references UNIT
TEACH(Staff Code*, Unit Code*)FK Staff Code references STAFF
FK Unit Code references UNIT
TIME(Time Slot, Day of Week, Time)ALTERNATE KEY: (Day of Week,
Time)
T_TABLE(Time Slot*, Unit Code*, Room No.) ALTERNATE KEY: (Time Slot, Room No.)FK Time Slot references TIMEFK Unit Code references UNIT
Bottom Up Analysis
Summary
• Rigorous, bottom-up data analysis entails• collecting source documents• drawing up un-normalised structures• normalising these - 3NF, BCNF, 4NF ...• optimising the resulting set of schemes
• The resulting set defines the structure of the database that is to be constructed
Bottom Up Analysis
Review - conciseFirst bring relation to 1NF:
StuDetails( StuNo, StuName, Ctitle, Cyear, TSCode, Tname, TExt, UCode, UTitle, Mark)
Document the FDs:
StuNo {StuName, CTitle, CYear, TSCode, TName, TExt}TSCode {TName, TExt}{StuNo,UCode} {UTitle, Mark}UCode Utitle
Bottom Up Analysis
Review - concise{StuNo,UCode}
these attributes determine all others, and are minimal.
So, the 1NF relation
StuDetails( StuNo, StuName, Ctitle, Cyear, TSCode, Tname, TExt, UCode, UTitle, Mark)
has key {StuNo, Ucode}
Bottom Up Analysis
Bring to 2NF
The attributes of the key {StuNo,UCode }
satisfy FDsStuNo {StuName, CTitle, CYear, TSCode,
TName, TExt}
UCode Utitle
These present partial dependencies on the key {StuNo,UCode} . Eliminate these for 2NF:
Bottom Up Analysis
Bring to 2NF
StuDetails( StuNo, StuName, CTitle, CYear, TSCode, Tname, TExt,)
UnitDetails(UCode, UTitle)
StuReg( StuNo*, Ucode*, Mark)
Bottom Up Analysis
Bring to 3NFIn StuDetails, we still have:
StuNo {StuName, CTitle, CYear,TSCode, TName, TExt}
TSCode {TName, TExt}
So, StuNo TSCode {TName, TExt}Transitive dependency!
Bottom Up Analysis
Bring to 3NF
Eliminate the transitive dependency:
StuDetails( StuNo, StuName, CTitle, CYear, TSCode*}
TutorDetails(TScode, Tname, TExt,)
Bottom Up Analysis
Complete Normalisation of Registration Table
StuDetails( StuNo(PK), StuName, CTitle, CYear, TSCode*}
TutorDetails(TScode (PK), Tname, TExt,)
UnitDetails(UCode (PK), UTitle)
Enrol( StuNo*, Ucode*, Mark)FK StudNo references StudDetailsFK Ucode references UnitDetails
Bottom Up Analysis
Other Sources
Course Units• UNIT( Unit Code, Unit Title)
Teaching Duties• STAFF(Staff Code,
Name )• TEACH(Staff Code*, Unit
Code)• FK Staff Code references
STAFF
TimetableWEEK_DAY(Day of Week )
TIME_SLOT(Day of Week*, Time)
FK Day of Week references WEEK_DAY
T_TABLE(Day of Week, Time*, Unit Code, Room No) ALTERNATE KEY: (Day of Week, Time, Room No.)FK {Day of Week, Time}
references TIME_SLOT
AddedSlide
Bottom Up Analysis
Complete Normalisation of Llandwp Tables
Carry out the normalisation of each source in turn
Collate the different results, making mergers and renamings where appropriate
Consider redundant relations for eliminationEnsure every relation has a PK (and possible
CKs) and indicate each FK with its referenced relation.
Summary of Relation Schema
STAFF(Staff Code, Name, Extension)
STUDENT(Student no., Course Title, Year of Course, Tutor's Staff Code*)FK Tutor's Staff Code references
STAFF
UNIT(Unit Code, Unit Title)
ENROL(Student no.*, Unit Code*, Mark)FK Student no. references
STUDENTFK Unit Code references UNIT
TEACH(Staff Code*, Unit Code*)FK Staff Code references STAFF
FK Unit Code references UNIT
TIME_SLOT(Day of Week, Time)
T_TABLE(Day of Week*, Time*, Unit
Code*, Room No.) FK Time (Day of Week, Time)
references TIME_SLOT
FK Unit Code Code references UNIT
AddedSlide