CSE3180 Semester 1,2005. Lect 1 / 1
Welcome to CSE3180 ‘Principles of Database Systems’ First Semester 2005
Introduction to CSE3180
This unit covers many aspects of database; this series of lectures concentrates on those related to Relational Data Base (but not exclusively)
I’m : Rod Simpson
My Office is : Room C 4.46
My Phone contact number is : (03)990 32352
My email is [email protected]
My contact times are very limited as I’m running another unit concurrently.
Which stands for
School of Computer Science and Software Engineering
Faculty of Information Technology
The Time Table
Laboratory Sessions will take place from
9.30am to midday
• Monday, November 22nd to Wednesday November 24th
• Monday, November 29th to Wednesday December 1st
• Monday, December 6th to Wednesday December 8th
• The laboratories are
• The examination will be held on Friday 10th December from 1.30pm to 3.45pm
Assessment
A 2 hour written examination (Friday 10th December) Contribution 50%
A 2 part assignment Contribution 50%
- 5 short questions on database related terms and application (15%)
- The development of a database and queries relative to Monash University Laboratory facilities (35%)
The Time Table
Lectures will take place from
1.00pm to 4.00pm
• Monday, November 22nd to Wednesday November 24th
• Monday, November 29th to Wednesday December 1st
• Monday, December 6th to Wednesday December 8th
• The examination will be held on Friday 10th December from 1.30pm to 3.45pm
PRINCIPLES OF DATABASE
Introduction Part 2
Special and Important Notice
Today is the LAST DAY for withdrawing from this unit without penalty
Introduction Part 3
The notes for this unit can be found on the Monash Web Page at this address:
http://www.csse.monash.edu.au/courseware/cse3180s
The overheads are in PowerPoint (Office 97) format and can be viewed on the File Server software.
There are other notes and papers in Microsoft Word
Lecture Objectives
• This lecture will cover:
– some thoughts on data storage and retrieval constraints
– what is the form of the ‘data’
– some definitions of ‘data’, ‘information’, ‘audit trail’
– what is a data base - who would want one
– some of the functions of the database management software
– different models (commercial)
– the relational model
– advantages and disadvantages of database
– some practical aspects of your assignment
Introduction Part 4
Course Outline
As you will see from the notes, the recommended text is Hoffer, Prescott and McFadden’s ‘Modern Database Management’, Edition 7.
The examples and the exercises at the end of most chapters are well worth a read, and they form the basis of the examination questions (the exam is scheduled for Friday 10th December).
Another recommended text is Thomas Connolly and Carolyn Begg’s ‘Database Systems’, 3rd Edition.
There is one assignment, with 2 Parts, and it is expected that you will work in groups - your tutors should have arranged that this morning.
The assignment support software is either Oracle or MS Access. If you wish to use some other DBMS (such as SQLServer or MySQL), you will need to come to some amicable arrangement with your tutor.
Introduction
Features of database such as
– Recovery
– Security
– Consistency
– Concurrency
– Database Management architecture
– Background processes
will be based on Oracle’s version 8i. You will be using a 9i client in the labs.
Database
This morning’s exercise would have alerted you to the need to understand what requirements are made of data analyses, who needs these analyses, when and in what form.
In all this there is an expectation that the results of ‘queries’, which is the same as saying the data supported by the database, are accurate, timely and complete.
In these lectures, you will see how these requirements can be built into a database - as you will do with your database model.
And that is what it is - a model which accurately reflects data as it occurs and is processed in the ‘real world’.
Some Thoughts on Data Storage
• A major benefit of Computing is the ability to STORE and RETRIEVE large amounts of data
• However, there are a number of processes and other considerations which need to be worked together to maximise this benefit
• Some very early items are
– What data ?
– What are its sources ?
– What are the volumes / frequency ?
– How long is it to be stored and Why this period ?
– In what FORM is it to be stored ?
Some Thoughts on Data Retrieval
• Who is going to ‘access’ (retrieve or query) this data ?
• How often ?
• From where ?
• Why is data to be accessed - for what purpose ?
• How is it to be accessed ? Voice inquiry, remote, by formal request, normal processing schedule, randomly, whenever the ‘need’ arises ?
• Is the data to be freely available ?
– Are there some limitations on access ?
– How are these access limitations managed ?
• What value is inherent in the data ?
Some General Thoughts
• What time base or volume spread is to be represented by the data ?
• What levels of accuracy are to be expected ?
• Is data to be available 7 days a week, 24 hours/day ? (24x7)
• What response time is expected ? Minimum ? / Tolerable ?
And just what does that mean in real value terms ?
• How is new or altered data to be directed to existing data ?
• How is input access to be controlled ?
• When and why is data deleted - who authorises such deletions ?
• What does the ‘data’ consist of - characters, objects, audio visual, TV, audio, animation ?
• What is the optimum method of storage (organisation) ?
Some Definitions
A General Definition:
DATA - raw (unprocessed or partly processed) facts which represent the state of entities (things) which have occurred
INFORMATION - data which has been processed into a form USEFUL TO THE USER /STAKEHOLDER
What is Information to one user may be Data to another user.
Audit Trail
General Definition:
‘The presence of data processing media and procedures which allow any and / or all transaction(s) to be traced through ALL STAGES of processing’
This implies that the following devices / techniques are in place:
1. A logging device which ‘traps’ all transactions
2. Some way of tagging each transaction so that it can be identified
3. Some way of retrieving the required transaction(s)
4. Some way of archiving - what is the required period ?
5. Control procedures and processes to ensure integrity
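Items 1 to 3 above can be sketched in a few lines, assuming Python's built-in sqlite3 module. A trigger plays the part of the logging device, each log row carries its own identifier and timestamp, and a query retrieves the trail. The `account` and `audit_log` tables and their columns are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (acc_id INTEGER PRIMARY KEY, balance REAL NOT NULL);
-- Hypothetical audit table: one tagged, timestamped row per change.
CREATE TABLE audit_log (
    log_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    acc_id    INTEGER NOT NULL,
    old_value REAL,
    new_value REAL,
    logged_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- The trigger 'traps' every update of account.balance.
CREATE TRIGGER trg_account_update AFTER UPDATE OF balance ON account
BEGIN
    INSERT INTO audit_log (acc_id, old_value, new_value)
    VALUES (OLD.acc_id, OLD.balance, NEW.balance);
END;
""")
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("UPDATE account SET balance = 75.0 WHERE acc_id = 1")
conn.execute("UPDATE account SET balance = 50.0 WHERE acc_id = 1")
conn.commit()

# Retrieve the trail for one account, in the order the changes were made.
trail = conn.execute(
    "SELECT old_value, new_value FROM audit_log WHERE acc_id = 1 ORDER BY log_id"
).fetchall()
print(trail)
```

Archiving (item 4) and the surrounding control procedures (item 5) are organisational as much as technical, so they are not shown here.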
Database
A Database is a shared collection of Inter-Related data designed to meet the needs of multiple types of users and applications.
This implies that multiple user VIEWS can be defined
Data stored is independent of the programs which use it
Data is structured to provide a basis for future applications
DATABASE = Stored Collection of Related Data (may be physically distributed)
Database Management Software
A DBMS is SOFTWARE which provides access to the database in an integrated and controlled manner
A DBMS must contain :
1. Data Definition and Structure capabilities
2. Data Manipulation capabilities
Data Definition and Manipulation
Data Definition Language (DDL)
used to describe data at the database level
Schema level - complete database description
Sub-Schema level - user views (restricted)
Data Manipulation Language (DML)
Provides for these capabilities:
Create, Insert, Update, Retrieve (extract), Delete, Drop, Modify, Calculation, Report
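A minimal sketch of the difference between describing data and manipulating it, assuming Python's built-in sqlite3 module and a hypothetical `dept` table. Note that in SQL the Create and Drop operations on tables are normally classed as DDL statements, while Insert, Update, Retrieve and Delete are DML.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describe the structure of the data.
conn.execute("CREATE TABLE dept (deptnum INTEGER PRIMARY KEY, deptname TEXT NOT NULL)")

# DML: insert, update, delete and retrieve rows.
conn.executemany(
    "INSERT INTO dept VALUES (?, ?)",
    [(201, "Production"), (314, "Finance"), (432, "Info Systems")],
)
conn.execute("UPDATE dept SET deptname = 'Information Systems' WHERE deptnum = 432")
conn.execute("DELETE FROM dept WHERE deptnum = 314")
rows = conn.execute("SELECT deptnum, deptname FROM dept ORDER BY deptnum").fetchall()
print(rows)

# DDL again: DROP removes the table definition itself, not just its rows.
conn.execute("DROP TABLE dept")
```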
The Many Faces of Database
Databases can be:
1. Transaction Intensive - ATMs, Checkouts
2. Decision Support - Browsing for trends
3. Mixed-Load - Combination of both
4. Small databases - Few thousand records
5. Very Large Database (VLDB) - Many millions or trillions of records (Banks)
6. Non Traditional - Weather bureau, flight plans, Computer Aided Design data
7. Mobile - Able to ‘move around’
DBMS Requirements
Querying Capabilities
Data Displays (Presentation)
Data entry
Data Validation
Data Deletion
Committing Procedures (of changes)
AND Data Integrity, Security, Consistency and Concurrency Capabilities
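‘Committing’ can be illustrated with a small sketch, assuming Python's built-in sqlite3 module and a hypothetical `stock` table: a change becomes permanent only when it is committed, and an uncommitted change can still be rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 10)")
conn.commit()    # the insert is now permanent

conn.execute("UPDATE stock SET qty = 0 WHERE item = 'widget'")
conn.rollback()  # abandon the uncommitted update
qty_after_rollback = conn.execute("SELECT qty FROM stock").fetchone()[0]

conn.execute("UPDATE stock SET qty = 7 WHERE item = 'widget'")
conn.commit()    # this update is kept
qty_after_commit = conn.execute("SELECT qty FROM stock").fetchone()[0]

print(qty_after_rollback, qty_after_commit)
```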
The Many Faces of Database
• They can be:
Data Warehouses
Data Marts (or martlets)
• How is a database size measured ?
There are a number of ‘measurements’
Raw data size
Total database size
Total usable disk space size (which includes media protection such as mirroring)
Hardware       Database    Raw Data    Total Disk
HP9000         Oracle      100GB       643GB
Digital 8400   Oracle      100GB       361GB
IBM SP2        DB2/6000    100GB       377GB
NCR5100        Teradata    100GB       880GB
NCR5100        Teradata    1,000GB     3,280GB
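The gap between raw data size and total disk space in these figures can be expressed as a simple ratio. The sketch below only re-uses the five rows listed above; the variable names are illustrative.

```python
# (hardware, DBMS, raw data in GB, total disk in GB), as listed above.
configs = [
    ("HP9000", "Oracle", 100, 643),
    ("Digital 8400", "Oracle", 100, 361),
    ("IBM SP2", "DB2/6000", 100, 377),
    ("NCR5100", "Teradata", 100, 880),
    ("NCR5100", "Teradata", 1000, 3280),
]

# Total disk space divided by raw data size, rounded to two decimals.
ratios = [(hw, dbms, raw, round(total / raw, 2)) for hw, dbms, raw, total in configs]

for hw, dbms, raw, ratio in ratios:
    print(f"{hw:<13}{dbms:<10}{raw:>6} GB raw  x{ratio}")
```

In every listed configuration the disk space is several times the raw data size, which is why the three ‘measurements’ need to be kept apart.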
Important Database Features
• Data Integrity
• Data Independence
• Referential Integrity - Relational Database Model
• Concurrency Control - Multiple Users
• Consistency
- multi users
- distributed database
- replicated database
- partitioned database
- mobile database
• Recovery from failure (Transaction and Media)
• Security
Data Base Models - Hierarchical
[Diagram: hierarchy (tree) of records, with nodes labelled owner / parent, child / parent and child, and links labelled owner and member]
Data Base Models - Network
[Diagram: sets of data connected by owner and member links]
Note: Only linked sets can be accessed
Data Base Models - Relational
[Diagram: five separate tables labelled A, B, C, D and E]
Any table(s) can be joined to any other table(s), provided there is a means of effecting the join
Primary key / Foreign key concept. Data redundancy
No fixed linkages
A quick introduction to the developer of the Relational Data Base
The late Dr. E. Codd
A Primary Key - What’s that ?
• Hoffer, Prescott and McFadden define a Primary Key as :
An attribute (or combination of attributes) which uniquely identifies each row in a relation. (table)
• Richard T. Watson has this to say:
The primary key definition block specifies a set of column values comprising the primary key. Once a Primary Key is defined, the system enforces its uniqueness by checking that the Primary Key of any new row does not already exist in the table.
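The uniqueness check described above can be seen in a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS and a cut-down, hypothetical `emp` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empnum INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO emp VALUES (3, 'JONES')")

try:
    # Same primary key value again: the DBMS refuses the new row.
    conn.execute("INSERT INTO emp VALUES (3, 'SMITH')")
    duplicate_rejected = False
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("Rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(duplicate_rejected, row_count)
```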
And - A Foreign Key ??
• Hoffer, Prescott and McFadden’s definition:
An attribute (or attributes) in a relation (table) of a database which serves as the Primary Key of another relation (table) in the same database.
• Richard T. Watson says:
An attribute (or attributes) that is a Primary Key in the same table, or another table. It is the method of recording relations in a relational database.
And, both the Primary and Foreign Key(s) should be drawn from the same Domain.
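A minimal sketch of a foreign key at work, assuming Python's built-in sqlite3 module and hypothetical `dept` and `emp` tables. SQLite is used only as a convenient stand-in; it needs foreign key checking switched on per connection, which other DBMSs may not require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE dept (deptnum INTEGER PRIMARY KEY, deptname TEXT NOT NULL);
CREATE TABLE emp (
    empnum  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    deptnum INTEGER NOT NULL REFERENCES dept(deptnum)  -- same domain as dept.deptnum
);
INSERT INTO dept VALUES (201, 'Production');
INSERT INTO emp VALUES (11, 'ADAMS', 201);
""")

try:
    # Department 999 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO emp VALUES (99, 'NOBODY', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(orphan_rejected, emp_count)
```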
2 relations
Relation (Table) Name : EMP (Referencing Table)
Relation Schema: EMP(empnum, name, date of birth, deptnum)
EMPNUM  NAME    Date of Birth  DEPTNUM
3       JONES   16-05-1956     605
7       SMITH   23-09-1965     432
11      ADAMS   11-08-1972     201
15      NGUYEN  23-10-1964     314
18      PHAN    16-11-1976     201
23      SMITH   19-09-1974     314
Relation (Table) Name : DEPT (Referenced Table)
Relation Schema: DEPT(deptnum, deptname)
DEPTNUM  DEPTNAME
201      Production
314      Finance
432      Information Systems
605      Administration
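The two relations can be built and joined on DEPTNUM as a quick sketch, assuming Python's built-in sqlite3 module. The column name `dob` and the ISO date format are choices made here for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (deptnum INTEGER PRIMARY KEY, deptname TEXT);
CREATE TABLE emp (empnum INTEGER PRIMARY KEY, name TEXT, dob TEXT,
                  deptnum INTEGER REFERENCES dept(deptnum));
""")
conn.executemany("INSERT INTO dept VALUES (?, ?)", [
    (201, "Production"), (314, "Finance"),
    (432, "Information Systems"), (605, "Administration"),
])
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", [
    (3, "JONES", "1956-05-16", 605), (7, "SMITH", "1965-09-23", 432),
    (11, "ADAMS", "1972-08-11", 201), (15, "NGUYEN", "1964-10-23", 314),
    (18, "PHAN", "1976-11-16", 201), (23, "SMITH", "1974-09-19", 314),
])

# The foreign key emp.deptnum is the 'means of effecting the join'.
joined = conn.execute("""
    SELECT e.empnum, e.name, d.deptname
    FROM emp AS e JOIN dept AS d ON e.deptnum = d.deptnum
    ORDER BY e.empnum
""").fetchall()

for row in joined:
    print(row)
```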
What are these ?
EMP DEPT
An Employee MUST be associated with one Department
Must a Department have AT LEAST one Employee ?
Or can there be More Than 1 Employee ?
Can there be a Department with NO employees ?
Employee Details are : Employee Number, Employee Name, Date of Birth and the Department Code the Employee works in.
Department Details are : Department Number and Department Name
They are the Entities
Relational Database
Data is represented in ROW (tuple) and COLUMN (attribute) form (matrix)
Collections of related data ---> TABLES (relations)
1 or more tables ----> DATA BASE
ATTRIBUTES are generally static
ROWS are DYNAMIC and Time-Varying
The number of Attributes = DEGREE of a table
The number of Rows = CARDINALITY of a table
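Degree and cardinality can be read off any table. A minimal sketch, assuming Python's built-in sqlite3 module and the four DEPT rows used in this lecture:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptnum INTEGER PRIMARY KEY, deptname TEXT)")
conn.executemany("INSERT INTO dept VALUES (?, ?)", [
    (201, "Production"), (314, "Finance"),
    (432, "Information Systems"), (605, "Administration"),
])

cursor = conn.execute("SELECT * FROM dept")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of rows (tuples)
print(degree, cardinality)
```

Adding a department changes the cardinality but not the degree, which is the sense in which rows are dynamic and attributes generally static.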
Some RDB Considerations
• Data is held in tables
• No order of data in tables - row or attribute
• Concept of Foreign Key - Primary Key relationship
• Data Typing (number, character ..) - including nulls
• Query Access - insert, update, delete, retrieval
• Indexing on candidate (and Primary) keys
• Integrity Constraints
– Attribute value ranges
– Referential Integrity (Foreign Key - Primary Key)
– Entity Integrity
– User Defined Integrity
• ‘Sets of Data’ retention constraints
• Domain constraints
• User defined ‘Rules’ e.g. quantities and values must not be negative; pricing rate must not be zero
• Recovery procedures
• No explicit linkages between tables
• Linking or embedding database operations in a procedural language (Cobol, C ..)
• Databases may be distributed across similar or different DBMS’s
• Security features
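The user defined ‘Rules’ mentioned in this list (quantities must not be negative, pricing rate must not be zero) can be expressed as CHECK constraints. The sketch below assumes Python's built-in sqlite3 module; the `order_line` table and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE order_line (
    line_id  INTEGER PRIMARY KEY,
    quantity INTEGER NOT NULL CHECK (quantity >= 0),  -- must not be negative
    rate     REAL    NOT NULL CHECK (rate <> 0)       -- must not be zero
)""")

def try_insert(quantity, rate):
    """Return True if the row is accepted, False if a rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO order_line (quantity, rate) VALUES (?, ?)",
            (quantity, rate),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [try_insert(5, 2.5), try_insert(-1, 2.5), try_insert(5, 0)]
print(results)
```

Placing the rules in the table definition means every program and user is subject to them, rather than relying on each application to repeat the checks.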
Database Components
1. Back End Engine
Used for Disk Input/Output processes (Read/Write/Find)
2. Front End Processor
Data manipulation
String/Arithmetic/Statistical operations
3. DBMS Interface
Data Definition Language (DDL)
Data Manipulation Language (DML)
4. Programmer Interface
Applications Environment (4GL’s, Embedded capability)
Data Description Language
Used to describe data at the Database level
Structure: Attributes
Schema : Complete description of the database using DDL
SubSchema : Describes data in the database as it is ‘known’ to individual programs (processes) or users
The segment of logical data record(s) required is commonly known as a VIEW
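A view can be sketched as follows, assuming Python's built-in sqlite3 module. The `production_staff` view and the cut-down `emp` table are hypothetical; the view exposes only some rows and columns of the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empnum INTEGER PRIMARY KEY, name TEXT, dob TEXT, deptnum INTEGER);
INSERT INTO emp VALUES (3, 'JONES', '1956-05-16', 605);
INSERT INTO emp VALUES (11, 'ADAMS', '1972-08-11', 201);
INSERT INTO emp VALUES (18, 'PHAN', '1976-11-16', 201);
-- Hypothetical sub-schema: Production staff only, date of birth hidden.
CREATE VIEW production_staff AS
    SELECT empnum, name FROM emp WHERE deptnum = 201;
""")

view_rows = conn.execute(
    "SELECT * FROM production_staff ORDER BY empnum"
).fetchall()
print(view_rows)
```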
Data Manipulation Language
Language (commands and syntax) used to cause transfers of data from the Database to the Operating Environment and vice versa
Variety of Languages - Cobol, C, Java, and SQL as in Access, DB2, dBASEV, Informix, Oracle, VisualDataBase, SQLServer, MySQL
Windows versions provide Icons and Menu options which are translated by the DBMS software to Database manipulation commands
Typical functions: get, put, replace, seek, update, delete, insert, drop, find, modify
A Typical Database Model
[Diagram labels: Users (keyboard direct); Users (Menu Options); Programs written in Cobol, C, C++, Pascal, Java etc. (Program Interface); Database Query Access Language; DBMS; Database]
Advantages of Database
• Reduced Data Redundancy
• Data Integrity
• Data Independence
• Data Security
• Data Consistency
• Easier use of Data via DBMS Tools (Query languages, 4GL's)
Disadvantages of Database
• Complexity
• Expense
• Vulnerability
• Size - disk storage, processor memory
• Training Costs
• Compatibility
• Technology Lock In
More on DBMS
Capabilities Required:
– the DBMS must provide a natural interface of user data
– the interface must be independent of any physical storage structures
– different users with different views must be able to access the same database
– database changes must be possible without affecting programs which do not use the changes
(Physical and Logical Independence)
Other Requirements : Provision of Operational Facilities to ensure:
– multi user access control
– remote terminal access
– restrictions on user access
– recovery from system faults
– database distribution
Overview of DataBase
Data Creation: Reliability, Content, Completeness
Good Data Input to Database Tables
DataBase subject to :
– Audit
– Time Relevance
– Advanced Business Rules
– New DBMS
– New Applications
– Altered Business Rules
– Associated with other Business Applications
Users: People, Other Systems, Web based Access
SCM, BPM, CRM …..
[Diagram: Legacy Data from Legacy Systems and Current DataBases; DataWarehousing Processes; Data Mining; Business Based Performance Analyses]
Your Assignment
• Intended to bring together many of the aspects required in DATABASE DESIGN
• It is NOT a complete systems design
• There is no transaction processing required
• The database MUST contain accurate existing data
• It must exhibit constraints and other forms of data integrity
The Assignment
Part 1 consists of 5 questions. The intention is to give you familiarity with some of the database terms and environment
Part 2 is directed at your developing a database of the Laboratory Devices of Monash University.
It probably sounds relatively easy, but don’t be misled.
You will need to develop ‘Business Rules’ and these will lead to ‘constraints’ which will effectively be part of the database contents.
And you will need to curb your energies in designing the database and its outputs (which should be your first consideration).
You have only 3 weeks to plan, design, construct and have your database operational.
Your Assignment
• The 2 Parts are designed to give you some understanding of the design requirements of a database - and this means of course, understanding the specific details which the database must address
• In Part 2 of the assignment (the model), YOU are the user
– You set the ground rules, also known as Business Rules
– You need to document these Rules
– Don’t make the project too onerous i.e. make the design simple.
– That way you will complete both Part 1 and Part 2 in the short period we have
• Part 1 - Analysis and Solution
Due Tuesday 30th November
(Tuesday week )
• Part 2 - The Details of the University’s Laboratory Inventory
Due 8th December - final tutorial session
• One of the group members will need to be the Co-ordinator - do this co-operatively.
• Also, some person in the group will need to address the question of documentation !!!
• Use of Microsoft Word would be an advantage.
• Part 1 - :
– This sets the tempo for the assignment
– Make sure you understand the MAJOR aspects
• By Wednesday, 24th November you should have some workable ideas about the rest of the assignment - and you should have done some reading or consulting with your tutor regarding the 5 questions
Part 1
For Part 1, there is no code required. The questions are there to provide you with a kick-start mechanism to come to grips with database and its world
For Part 2, you will need to:
1. Understand the nature of the problem
2. Develop scenarios - successful and unsuccessful
3. Verify the scenarios
4. Modify - if or when necessary
5. Develop the ‘Business Rules’
- what combinations can exist / cannot exist
- how to control sequences
- what checks must be embedded to ensure accuracy
Your Assignment
One member will need to act as the Reviewer for
– Completeness of both Part 1 and Part 2
– Validity of the Comments made
– Validity of the Entity Relationships
– Normalisation / Primary key(s) / Foreign key(s)
Your Assignment - Part 2
For Part 2 you will need to develop :
– Data structures
– Constraints
– Test strategy
– Test data
– Testing results
– Operational model
• Make rough outlines of the model of your database
Entities plus their relationships (connectivities)
• Make rough outlines of any changes to the Data Structures - these expand and show the attributes of the Entities
• Make notes of Rules, and the Constraints, you consider will help the Rules to control data in your database for both existing and future users
Your Assignment
• Review the E-R diagram critically - it’s best if one person of the group becomes the ‘Quality Control’.
• Each member of the group MUST know what the assignment is about and how it is being developed.
• Prepare the ‘final’ technical paper for Part 1 by Wednesday 1st December
• Complete the database, and produce the analyses, by Wednesday 8th December
• You will (probably) be using MS-Access to develop the physical database. Or, you could use Oracle. Or SQLServer (but check with your Tutor)
• Become familiar with the Table Design feature
– Include Primary Keys, Required Fields, Value Ranges or Exclusions
– ‘Numbers’ are ??
– Integers (no decimals)
– Numerics with decimals
– Text references only
• Use data types to constrain data
• Use the ‘required’ property in Access or the ‘not null’ constraint in Oracle
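The effect of a required (‘not null’) field can be tried out before building the real tables. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Access or Oracle, and the `device` table with its columns is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE device (
    device_id INTEGER PRIMARY KEY,
    name      TEXT    NOT NULL,  -- a 'required' field
    quantity  INTEGER NOT NULL   -- whole numbers, no decimals intended
)""")
conn.execute("INSERT INTO device (name, quantity) VALUES ('Oscilloscope', 4)")

try:
    # The name is missing, so the row should be refused.
    conn.execute("INSERT INTO device (name, quantity) VALUES (NULL, 2)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

device_count = conn.execute("SELECT COUNT(*) FROM device").fetchone()[0]
print(null_rejected, device_count)
```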
• There should be no need to delve into the Applications Development features of the DBMS. However, if you have some expertise, and time, do so by all means
• Make sure you keep a backup of your database
Building Evacuation
If the Building Evacuation Alarm System Activates
– Collect your belongings
– Move out of the room using the Exits
– Use the Stairs - NOT the Lifts or Escalators
– Follow the directions of FLOOR WARDENS - if present
– Move to the Lawn outside K Block (common)
– Wait for further instructions (if during the evening use your discretion)
– TREAT EVERY EVACUATION ALERT as REAL
That should be enough for today
The website : http://www.csse.monash.edu.au/courseware/cse3180s
and make sure you reference cse3180s