IS220: Database Fundamentals
LECTURE 2:
DATABASE ENVIRONMENT
Ref. Main: “Chapter 2” + parts from “Chapter 15”
from
“Database Systems: A Practical Approach to Design, Implementation and Management.”
Thomas Connolly, Carolyn Begg.
Chapter Objectives
In this chapter you will learn:
The purpose of the three-level database architecture.
The contents of the external, conceptual, and internal levels.
The purpose of the external/conceptual and the conceptual/internal mappings.
The meaning of logical and physical data independence.
The distinction between a Data Definition Language (DDL) and a Data
Manipulation Language (DML).
A classification of DBMSs by data model.
The purpose and importance of conceptual modeling.
Main Terms
• Data abstraction
• Schemas and Instances
• Three-level Schema Architecture
• Mapping
• Data Independence
• Data Models
• Database system development lifecycle
• Classification of DBMSs.
Data abstraction
• One fundamental characteristic of the database approach is
that it provides some level of data abstraction.
• Data abstraction generally refers to the suppression of details
of data organization and storage, and the highlighting of the
essential features for an improved understanding of data.
• Data abstraction enables different users to perceive data at
their preferred level of detail.
THE THREE LEVEL SCHEMA ARCHITECTURE
Three-level Architecture
The goal of the three-schema architecture is to separate the
user applications from the physical database.
Three-level Architecture
1. The external or view level
• Includes a number of external schemas or user views (the ways users perceive the data).
• Each external schema describes the part of the database that is relevant to a particular user.
Three-level Architecture
2. Conceptual Level
• It has a conceptual schema (the logical structure of the entire database), which describes the structure of the whole database for a community of users.
• Describes what data is stored in database and relationships among the data.
• It concentrates on describing entities, data types, relationships, user operations, and constraints.
Three-level Architecture
3. Internal Level
• It has an internal schema (the way the DBMS and OS perceive the data).
• It is the physical representation of the database on the computer.
• It describes how the data is stored in the database. It contains the definitions of stored records, the methods of representation, the data fields, and the indexes and storage structures used.
Schemas and Instances
• In any data model, it is important to distinguish between the description of the database and the database itself:
• Schema (intension)
• The description of the database. It rarely changes.
• When we define a new database, we specify its schema: “the structure, data types, and constraints that describe the database”.
• A displayed schema is called a schema diagram
• We call each object in the schema a schema construct.
• Instance (database state / extension)
• The actual data in the database at a particular point in time.
• Changes rapidly.
• When we initially load data into the database, it is said to move into the initial state of the database.
• Each write operation (insert, delete, modify) changes the current state of the database to a new state.
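The schema/instance distinction can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the student table, its columns, and the sample rows are hypothetical and do not come from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema (intension): defined once, with a hypothetical STUDENT table.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema_of(connection):
    # SQLite keeps the CREATE statement as its record of the schema.
    return connection.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]


def state_of(connection):
    # The instance (extension) is whatever rows exist right now.
    return connection.execute(
        "SELECT id, name FROM student ORDER BY id"
    ).fetchall()


schema_before = schema_of(conn)

# Loading data moves the database into its initial state.
conn.execute("INSERT INTO student VALUES (1, 'Amal')")
conn.execute("INSERT INTO student VALUES (2, 'Sara')")
initial_state = state_of(conn)

# Every write operation produces a new state.
conn.execute("DELETE FROM student WHERE id = 1")
new_state = state_of(conn)

# The schema is the same text before and after the writes.
schema_after = schema_of(conn)
```

Here the state changes with each write, while the schema text read back from the catalog stays the same.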
Example
[Figure: an example schema shown alongside a corresponding instance]
Mapping
In a DBMS based on the three-schema architecture, the DBMS must
transform a request specified on an external schema into a request against
the conceptual schema, and then into a request on the internal schema for
processing over the stored database.
The processes of transforming requests and results between levels are
called mappings.
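As a rough sketch of the external/conceptual mapping, a SQL view can play the role of an external schema defined over a base table. The example assumes Python's built-in sqlite3 module; the staff table, the staff_directory view, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: a hypothetical base table.
conn.execute(
    "CREATE TABLE staff ("
    "staff_no INTEGER PRIMARY KEY, fname TEXT, lname TEXT, salary REAL)"
)
conn.execute("INSERT INTO staff VALUES (1, 'Mona', 'Ali', 5000.0)")

# External level: a view that hides salary and renames/combines columns.
# The view definition itself is the external/conceptual mapping.
conn.execute(
    "CREATE VIEW staff_directory AS "
    "SELECT staff_no AS id, fname || ' ' || lname AS full_name FROM staff"
)

# A request written against the external schema...
external_result = conn.execute(
    "SELECT id, full_name FROM staff_directory"
).fetchall()

# ...is equivalent to this request against the conceptual schema.
conceptual_result = conn.execute(
    "SELECT staff_no, fname || ' ' || lname FROM staff"
).fetchall()
```

The DBMS rewrites the first query into the second using the view definition. The further mapping to stored records and access paths (conceptual/internal) happens inside the engine and is not visible in the SQL.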
Illustrating Example
Reasons for Separations?
The objective of the three-level architecture is to separate each user’s view of
the database from the way the database is physically represented. There are
several reasons why this separation is desirable:
Each user should be able to access the data, but have a different customized
view of the data.
The DBA should be able to change the DB storage structure without
affecting the user’s view.
The internal structure of the database should be unaffected by changes to the
physical aspects of storage, such as a change to a new storage device.
DATA INDEPENDENCE
Data Independence
The three-level architecture provides Data Independence,
which means that upper levels are unaffected by changes to
lower levels.
Data Independence is the ability to modify a schema
definition in one level without affecting a schema definition
in the next higher level.
There are two kinds of data independence:
Logical Data Independence
Physical Data Independence
Data Independence
Logical Data Independence
Refers to the immunity of external schemas to changes in the
conceptual schema.
Conceptual schema changes (e.g. addition/removal of
entities) should not require changes to external schemas or
rewrites of application programs.
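A minimal sketch of logical data independence, assuming Python's built-in sqlite3 module. The staff table, the staff_names view, the added branch entity, and the application query are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema, version 1 (hypothetical).
conn.execute(
    "CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.execute("INSERT INTO staff VALUES (1, 'Mona', 5000.0)")

# External schema used by an application.
conn.execute("CREATE VIEW staff_names AS SELECT staff_no, name FROM staff")

# The application's query is written against the view only.
APP_QUERY = "SELECT staff_no, name FROM staff_names ORDER BY staff_no"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual schema change: a new entity and a new attribute are added.
conn.execute("CREATE TABLE branch (branch_no INTEGER PRIMARY KEY, city TEXT)")
conn.execute("ALTER TABLE staff ADD COLUMN branch_no INTEGER")

# The same application query runs unchanged and returns the same rows.
after = conn.execute(APP_QUERY).fetchall()
```

Additions such as these leave the external schema intact. Removing an attribute that a view depends on would not, which is one reason logical data independence is harder to achieve in full.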
Data Independence
Physical Data Independence
Refers to the immunity of the conceptual schema to changes in the
internal schema.
Internal schema changes (e.g. using different file
organizations, storage structures/devices) should not require
changes to the conceptual or external schemas.
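Physical data independence can be sketched in the same way, assuming Python's built-in sqlite3 module. Adding an index changes the internal schema only. The staff table, the index name idx_staff_salary, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Mona", 5000.0), (2, "Huda", 7000.0), (3, "Noor", 6000.0)],
)

QUERY = "SELECT name FROM staff WHERE salary > 5500 ORDER BY name"


def plan(connection):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


rows_before = conn.execute(QUERY).fetchall()
plan_before = plan(conn)

# Internal schema change: a new storage structure (an index) is added.
conn.execute("CREATE INDEX idx_staff_salary ON staff (salary)")

rows_after = conn.execute(QUERY).fetchall()
plan_after = plan(conn)
```

The query text and its result are the same before and after. Printing plan_before and plan_after shows how the engine intends to access the data, and the second plan may now mention the index, depending on the SQLite version and its planner.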
Data Independence and the Three-Level
Architecture
Database Language
A database language consists of two parts: a Data Definition Language (DDL)
and a Data Manipulation Language (DML). The DDL is used to specify the
database schema and the DML is used to both read and update the database.
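In SQL, the two roles look roughly like this. The sketch assumes Python's built-in sqlite3 module; the course table, its columns, and the credit values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: specifies the schema.
conn.execute(
    "CREATE TABLE course ("
    "code TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# DML (update side): insert, modify, and delete data.
conn.execute("INSERT INTO course VALUES ('IS220', 'Database Fundamentals', 3)")
conn.execute("INSERT INTO course VALUES ('IS999', 'Placeholder', 1)")
conn.execute("UPDATE course SET credits = 4 WHERE code = 'IS220'")
conn.execute("DELETE FROM course WHERE code = 'IS999'")

# DML (read side): retrieve data.
courses = conn.execute("SELECT code, title, credits FROM course").fetchall()
```

CREATE TABLE changes the description of the database, while INSERT, UPDATE, DELETE, and SELECT work only with its contents.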
DATA MODELS
Data Model
• A data model is a collection of concepts that can be used to describe the
structure of a database.
• By structure of a database we mean the data types, relationships, and
constraints that apply to the data.
• Purpose
• To represent data in an understandable way.
Database system development lifecycle
As a database system is a fundamental component of the larger
organization-wide information system, the database system
development lifecycle is inherently associated with the lifecycle of the
information system.
The stages of the database system development lifecycle are shown
in the following Figure:
The Stages of the Database System Development Lifecycle
[Figure: Analysis Phase, Design Phase, Implementation Phase, Maintenance]
Database Design
Database design has three main phases: conceptual, logical,
and physical design.
• Conceptual database design – to build the conceptual representation of
the database, which includes identification of the important entities,
relationships, and attributes.
• Logical database design – to translate the conceptual representation to
the logical structure of the database, which includes designing the
relations.
• Physical database design – to decide how the logical structure is to be
physically implemented (as base relations) in the target Database
Management System (DBMS).
Conceptual Data Model
Conceptual Database Design: The process of constructing a model of the data used in an enterprise, independent of all physical considerations.
The conceptual data model includes an Entity-Relationship (ER) model and a data dictionary.
To build conceptual data model:
Step 1.1 Identify entity types
Step 1.2 Identify relationship types
Step 1.3 Identify and associate attributes with entity or relationship types
Step 1.4 Determine attribute domains
Step 1.5 Determine candidate, primary, and alternate key attributes
Step 1.6 Check model for redundancy
Step 1.7 Validate conceptual model against user transactions
Step 1.8 Review conceptual data model with user
Logical Data Model
• Logical Database Design: The process of constructing a model of the
data used in an enterprise based on a specific data model (e.g.
relational), but independent of a particular DBMS and other physical
considerations.
• To build and validate logical data model (for the relational model):
• Step 2.1 Derive relations for logical data model
• Step 2.2 Validate relations using normalization: the process of organizing
data to minimize redundancy, for example by dividing large tables into smaller
(and less redundant) tables and defining relationships between them
• Step 2.3 Validate relations against user transactions
• Step 2.4 Check integrity constraints
• Step 2.5 Review logical data model with user
• Step 2.6 Check for future growth
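The normalization idea in Step 2.2 can be sketched with plain Python tuples standing in for relations. The enrolment data and column layout below are hypothetical, and the split shown is only one simple decomposition, not the full normalization procedure.

```python
# A single wide relation: (student_id, student_name, course_code).
# The student's name is repeated for every course the student takes.
flat = [
    (1, "Amal", "IS220"),
    (1, "Amal", "IS230"),
    (2, "Sara", "IS220"),
]

# Split into two smaller relations linked by student_id.
students = sorted({(sid, name) for sid, name, _ in flat})
enrolments = sorted({(sid, course) for sid, _, course in flat})

# Joining the two relations on student_id recovers the original rows,
# so no information was lost by the split.
rejoined = sorted(
    (sid, name, course)
    for sid, name in students
    for enrolled_sid, course in enrolments
    if sid == enrolled_sid
)
```

After the split each student name is stored once, so changing a name means updating one row instead of one row per enrolment.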
Physical Data Model
• Physical Database Design: The process of producing a description of the implementation of the database on secondary storage.
• The physical database design phase allows the designer to make decisions on how the database is to be implemented.
• Therefore, physical design is tailored to a specific DBMS.
• To build physical data model:
• Step 3.1 Translate logical data model for target DBMS
• Step 3.2 Design file organizations and indexes
• Step 3.3 Design user views
• Step 3.4 Design security mechanisms
• Step 3.5 Denormalization and controlled redundancy: the process of attempting to optimise the read performance of a database, for example by adding attributes to a relation from another relation with which it will be joined.
• Step 3.6 Monitor and tune the operational system
CLASSIFICATION OF DBMS’S
Classification or models of DBMSs
1. First generation
• Network, Hierarchical
2. Second generation
• Relational
3. Third generation
• Object-oriented, Object-relational
Network Data Model
• A model that allows a record to participate in multiple
parent/child relationships.
• Child records can have multiple parents (M:N
relationships).
Hierarchical Data Model
• Each parent record can have many children, but each child
record has only one parent (1:M relationships).
• Tree-like structure.
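The structural difference between the two first-generation models can be sketched with plain Python dictionaries. The record names are hypothetical, and real hierarchical and network DBMSs store these links as physical pointers rather than dictionaries.

```python
# Hierarchical model: every child record has exactly one parent (a tree).
hierarchical_parent = {
    "Sales": "Company",
    "IT": "Company",
    "Mona": "Sales",
    "Huda": "IT",
}

# Network model: a child record may have several parents (a graph).
network_parents = {
    "Mona": {"Sales", "Project X"},
    "Huda": {"IT", "Project X"},
}


def children_in_tree(parent):
    # 1:M relationship: one parent, many children.
    return sorted(c for c, p in hierarchical_parent.items() if p == parent)


def children_in_network(parent):
    # A record is listed under every parent it participates with.
    return sorted(c for c, ps in network_parents.items() if parent in ps)


company_children = children_in_tree("Company")
project_members = children_in_network("Project X")
mona_parent_count = len(network_parents["Mona"])
```

In the tree, "Mona" can only be reached through "Sales". In the network structure the same record is reachable through either of its parents.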
First Generation
Disadvantages of hierarchical and network DBMSs:
1. Required complex programs for even simple queries.
2. Minimal data independence.
3. No widely accepted theoretical foundation.
Second Generation
Relational Data Model:
A database in which all data is stored in relations,
which are tables with rows and columns.
Each table is composed of records (called tuples), and
each record is identified by a key: an attribute (or set of
attributes) whose value is unique.
Advantages of Relational model
The benefits of a database that has been designed according to
the relational model are numerous. Some of them are:
1. Data entry, updates and deletions will be efficient.
2. Data retrieval, summarization and reporting will also be
efficient.
3. Since much of the information is stored in the database
rather than in the application, the database is somewhat self-
documenting.
4. Changes to the database schema are easy to make.
Third Generation
Object-oriented Data Model
• Response to increasing complexity of DB applications