View
222
Download
3
Category
Preview:
Citation preview
Database Management
Foundations of Information Systems
© The McGraw-Hill Companies, Inc.., 1998
MS PowerPoint 4.0 Presentation
MS PowerPoint 4.0 Presentation
MS PowerPoint 4.0 Presentation
MS PowerPoint 4.0 Presentation
Chapter Objectives
• Specify the elements of the data hierarchy.• Understand what can be accomplished with files and
what the limitations of a file-based environment are.• Understand the advantages of a database
environment and the role of a database management system (DBMS).
• Specify the three levels at which data are defined in databases.
• Compare the three data models.• Understand the operation of relational databases and
theirquery language SQL.
MS PowerPoint 4.0 Presentation
MS PowerPoint 4.0 Presentation
MS PowerPoint 4.0 Presentation
Chapter Objectives, cont.
• Understand the role of the data dictionary.• Explain the components of information
resource management.• Describe the developmental trends in
database management, including distributed databases, data warehouses, and object-oriented and other rich database.
Hierarchy of Data
• Data are the Principal Resources of an Organization
• Data Stored in Computer Systems form a Hierarchy– Extending from a Single Bit to a Database
– The Major Record Keeping entity of a firm
– Each higher rung of this hierarchy is organized from the components below it
Hierarchy of Data
• Data are logically organized into:
• 1. Bits (Characters)
• 2. Fields
• 3. Records
• 4. Files
• 5. Databases
Hierarchy of Data
• Bit (Character)– The smallest unit of data representation– Value of a Bit may be 0 or 1– 8 Bits make a Byte which can represent a
character or a special symbol in a character code
Hierarchy of Data
• Field– A grouping of characters– A data field represents an attribute,
characteristic, or quality of some entity, object, person, place, or event.
Hierarchy of Data
• Record– A collection of attributes that describe a
real world entity– A Record consists of fields with each field
describing an attribute of the entity
Hierarchy of Data
• File– A group of related records– Classified by the application for which
they are used• Example: Employee File
– Primary Key in a File is the Field (or Fields) whose value identifies a records among others in a data file
Hierarchy of Data
• Database– An integrated collection of logically related records or
files– Consolidates records previously stored in separate files
into a common pool of data records that provides data for may applications
– The Data is managed by systems software called Database Management Systems (DBMS)
– The Data stored in a Database is independent of the application programs using it and of the types of secondary storage devices
Hierarchy of Data Organization in Computer Storage
Hierarchy of Data Organization in Computer Storage
Component of DataOrganization
LogicalComponents
Physical(Storage)Components
Database
File
Record
Field(attribute)
Byte
Bit
Example
SUPPLIERS PARTS
SHIPMENTS
SUPPLIERS
NO. NAMESTREETADDRESS CITY ST ZIP
13 Gasket Co. 50 Oak Tifflin OH 44883
3251 Reliable Supp. 11 Cedar Teaneck NJ 07666
13 Gasket Co. 50 Oak Tifflin OH 44883
Reliable Suppliers
01000001 (represents “A” in theASCII-8 character code)
0
.
.
.
File Environment and Its Limitations
• File Organization– Data files are organized so as to facilitate
access to records and to ensure efficient storage
• A Tradeoff between these two requirements generally exists
– If rapid access is required, then more storage is required.
File Environment and Its Limitations
• Access to a record for reading it is the essential operation on data
• 1. Sequential Access– Records are accessed in the order they are stored– The main access mode only in Batch Systems where files
are used and updated at regular intervals
• 2. Direct Access– Required by On-Line Processing– A Record can be accessed without including the records
between it and the beginning of the file– The Primary Key serves to identity the needed record
File Environment and Its Limitations
• File Organization Methods
• 1. Sequential Organization
• 2. Indexed Sequential Organization
• 3. Direct Organization
File Environment and Its Limitations
• Sequential Organization– Records are physically stored in a specified order according to a
key field in each record– Advantages
• Fast• Efficient for dealing with large volumes data that need to be processed
periodically (batch system)
– Disadvantages• Requires that all new transactions be sorted into the proper sequence for
sequential access processing• Locating, Storing, Modifying, Deleting, or Adding Records in the file
requires rearranging the file• The method is too slow to handle applications requiring immediate
updating or responses
File Environment and Its Limitations
• Indexed Sequential Organization– Records are physically stored in
sequential order on a magnetic disk or other direct access storage device based on the key field of each record
– Each file contains an index that references one or more key field of each data record to its storage location address
File Environment and Its Limitations
• Direct Organization– Provides the fastest direct access to records
– Records do not have to be arranged in any particular sequence on storage media
– Computers must keep track of the storage location of each record using a variety of direct organization methods to retrieve data
– New transactions’ data does not have to be sorted
– Processing that requires immediate responses or updating is easily performed
File Environment and Its Limitations
• Limitations of a File Oriented Environment– Fair amount of data redundancy– Example of Three Systems
• Supplier
• Shipment
• Inventory
• Note: These Databases may have inconsistencies
Database Environment
• A Database is an organized collection of interrelated data that serves a number of applications in an enterprise.– Stores both the values of various entities
and their relationships– Managed by a Database Management
System DBMS• A system software that provides assistance in
managing databases shared by multiple users
Database Environment
• DBMS• Helps organize data for effective access
by a variety of users with different access needs and for efficient storage
• Makes it possible to create, access, maintain, and control databases
• Data can be integrated and presented on demand
Database Environment
• Advantages of a Database Management Approach
• Avoiding uncontrolled data redundancy and preventing inconsistency
• Program-Data independence• Flexible access to shared data• Benefits of centralized control of data
A Database EnvironmentA Database EnvironmentReports
Reports
DBMS
ApplicationProgram A
ApplicationProgram X
End Users
Query
Update
Database
.
.
.
Levels of Data Definition in Databases
• User view of a DBMS is the basis for modeling steps
• Data models define the logical relationships among the elements to support the basic process
• Logical framework to base the physical design of databases and the development of application programs
Levels of Data Definition in Databases
• DBMS defines a database
• Schema– Overall logical view of the relationships between data in a
database
• Subschema– Logical view of data relationships needed to support
specific end user applications to access the database
• Physical– How data is physically arranged, stored, and accessed on
the magnetic disks and other secondary storage devices of a computer system
Levels of Data Definition in Databases
• DBMS provides the language, DDL Data Definition Language to define the databases on the three levels (Schema, Subschema, and Physical), and the DML Data Manipulation Language to access records, change values of attributes, and delete or insert records
Data Models or How to Represent Relationships
Between Data• Data Model is a method for organizing databases
on the logical, schema, and subschema levels.
• The main concern is how to represent relationships among database records
• Relationships are based on logical data structures or models
• DBMS are designed to provide end users with quick, easy access to information in the databases
Data Models or How to Represent Relationships Between Data
• Data Model Structures
• Hierarchical
• Network
• Relational
Data Models or How to Represent Relationships
Between Data: Hierarchical• Used by early mainframe DBMS
• Relationships between records form a hierarchy or tree-like structure.
• Records are dependent and arranged in multilevel structures of one root record and any number of subordinate levels.
• Relationships among the records are one-to-many as each data element is related only to one element above it.
• The data element or record at the highest level is the root element. Any data element can be accessed by moving progressively downward from the root and along the branches of the tree to locate the record.
Data Models or How to Represent Relationships
Between Data: Hierarchical• Advantages
– Ease with which data can be stored and retrieved in structured, routine types of transactions
– Ease with which data can be extracted for reporting purposes
– Routine types of transaction processing are fast and efficient.
• Disadvantages– Hierarchical one-to-many relationships must be specified in
advance and are not flexible.
– Cannot easily handle ad hoc requests for information
– Modifying a hierarchical database structure is complex
– Redundancy
A Hierarchical DatabaseA Hierarchical DatabaseCLASSCLASS
MIS 101 1 525 MWF 10-11 a.m.
PROFESSORPROFESSOR
PROFESSORNUMBER
PROFESSORNUMBER
221 Turkel 02579 Young
02311 Volkonsky
01171 Johnson
STUDENTNUMBER
STUDENTNAME
COURSENUMBER SECTION
ROOMNUMBER
MEETINGDAYS
MEETINGHOURS
STUDENTSTUDENT
STUDENTSTUDENT
STUDENTSTUDENT
Data Models or How to Represent Relationships
Between Data: Network• Can represent more complex logical
relationships and is used by mainframe DBMS packages
• Allows many-to-many relationships among records meaning that the network model can access a data element by following several paths due to multiple relationships
Data Models or How to Represent Relationships
Between Data: Network• Advantages
– More flexible than the hierarchical model
– Ability to provide sophisticated logical relationships among the records
• Disadvantages– Network many-to-many relationships must be specified in
advance
– User is limited to retrieving data that can be access using the established links between records
– Cannot easily handle ad hoc requests for information
– Requires knowledge of a programming language
PROFESSORPROFESSOR
221
PROFESSOR NUMBER
PROFESSORNAME
MIS 101 1 525 MWF 10-11 a.m.
COURSENUMBER SECTION
ROOMNUMBER
MEETINGDAYS
MEETINGHOURS
01171
STUDENT NUMBER
STUDENTNAME
Turkel Johnson
STUDENTSTUDENT
Links to Records ofOther Classes Taughtby This Professor
Links to Records ofOther Classes Takenby This Student
Links to Records ofOther Students Taking This Class
A Network DatabaseA Network Database
Data Models or How to Represent Relationships
Between Data: Relational• Most popular Database Structure
• Used by most microcomputer DBMS packages as well as many minicomputer and mainframe systems
• Data elements within the database are stored in the form of simple tables which are related if they contain common fields
• DBMS packages based on the relational model can link data elements from various tables to provide information to users
Data Models or How to Represent Relationships
Between Data: Relational• Advantages
– Flexible in that ad hoc information request can be handled
– Easy for programmers
– End users can use Relational Structures with little effort or training.
– Easier to maintain than the hierarchical and network models
• Disadvantages– Cannot process large amounts of business transactions as
quickly and efficiently as the hierarchical and network models
Relational Databases
• A collection of tables that is relatively easy to use and understand offering flexibility for the data and ability to modify
• All records in a relational database must have a unique primary key
• Relational Systems support three principal operations on tables without any predefined access paths
• Select- from a specified row that satisfies a given condition
• Project- selects the attribute values
• Join- builds a new table from two specified tables
Relational Databases
• The power of the relational model derives from the join operation.– Records are related through a join
operation rather than links so that a predefined access path is not needed
– The join operation is time consuming requiring access to many records stored (on disk) to find the needed records
SQL- A Relational Query Language
• International standard access language fro defining and manipulating data in databases
• Data-Definition-and-Management Language for DBMS and some nonrelational ones
• SQL is used an independent query language to define objects in a database, enter, and access the data
• Embedded SQL is for programming in procedural languages (host) such as C, COBOL, PL/1, PL/SQL to access from an application program
• For the end user environment, SQL is “hidden” by more friendly user interfaces
SQL- A Relational Query Language
• Principal Facilities of SQL
• Data Definition
• Data Manipulation
Designing a Relational Database
• Database design progresses from the design of the logical of the schema and the subschema to the physical level.
• Logical Design or Data Modeling is to design the schema and the subschema.
• A relational database consists of tables (relations) describing the attributes of a particular class of entities.
• Logical Design begins with identifying the entity classes to be represented in the database and establishing relationships or interactions between the entities represented to access the data.
Designing a Relational Database
• Entity-Relationship (E-R) Diagrams are used to perform data modeling.– Diagrams include rectangles for entities
and diamonds for relationships including attributes
Designing a Relational Database
• Normalization is the simplification of the logical view of data in relational databases meaning that all fields contain single data elements, distinct records, and each table describes only a single class of entities.
• The objective of normalization is to prevent replication of data with all of its negative consequences.
Designing a Relational Database
• Physical Design is after the Logical Design of the database.
• Fields are specified as to their length and the nature of the data (numeric, characters).
• A principal objective is to minimize the number of time consuming disk accesses that are necessary in order to answer typical database queries.
• Frequently, indexes are provided to ensure fast access for queries.
The Data Dictionary
• A software module and database containing descriptions and definitions of the structure, data elements, interrelationships, and other characteristics of an organization’s database
The Data Dictionary
• Schema, Subschemas, and Physical Schema
• Which applications and users may retrieve and/or modify the specific data
• Cross reference information such as which programs use what data and the users receive particular reports
• Where individual data elements originate, and who is responsible for maintaining the data
• Standard naming conventions for database entities
• Integrity rules for the data
• Where the data are stored in geographically distributed databases
The Data Dictionary
• Contains all the data definitions and information necessary to identify data ownership
• Ensures security and privacy of the data and the information used during the development and maintenance of applications which rely on the database
Managing the Data Resource of an Organization
• Database technology enables organizations to control their data as a resource, but it does not automatically produce organizational control of data
• Components of Information Resources Management: Organizational and Technical Means
• Ensure that a firm systematically accumulates data in its databases
• Maintains the data over time
• Provides the appropriate access
Managing the Data Resource of an Organization
• Principal Components of Information Resource Management
• Organizational Processes– Information planning and data modeling
• Enabling Technologies– DBMS and a Data Dictionary
• Organizational Functions– Data and database administration
Managing the Data Resource of an Organization
• Database Administration Functional Units to Manage the Data
• Data Administrator (DA)
• Database Administrator (DBA)
Managing the Data Resource of an Organization:
Data Administrator• Person who has the central responsibility for an organization’s
data
• Establish policies and specific procedures for collecting, validating, sharing, and inventorying data to be stored and databases and for making information accessible to members of the organization and possible persons outside of it
• Data administration is a policy making function and the DA should have access to senior corporate management.
• Key person involved in the strategic planning of the data resource
• Often defines the principal data entities, their attributes, and the relationship among them
Managing the Data Resource of an Organization:
Database Administrator• A specialist responsible for maintaining standards for the
development, maintenance and security of an organization’s databases
• Creating the databases and carrying out the policies of the data administrator
• In large organizations, the DBA function is performed by a group of professionals; in a small firm, a programmer/ analyst may perform the DBA function, while one of the managers acts as the DA.
• Schema and subschemas of the database are most often defined by the DBA, who has the requisite technical knowledge to detail the physical layout of the databases with a view toward optimizing system performance for the expected pattern of database usage
Managing the Data Resource of an Organization
• Joint Responsibilities of the DA and DBA• Maintaining the Data Dictionary• Standardizing names and other aspects of data
definition• Providing Backup• Provide security and privacy for the data
stored in the database• Establish a disaster recovery plan for the
databases
Developmental Trends in Database Management
• Distributed Databases
• Data Warehousing
• Rich Databases (includes object-oriented databases)
Developmental Trends in Database Management
• Distributed Databases– Spread across several physical locations
– Data are placed where they are used most often
– Entire database is available to each authorized user
– Local work groups (LAN)
– Departments at regional offices (WAN)• Branch offices, manufacturing plants, and other work sites
– Databases includes segments of both common operational and user as well as data generated only at the user’s own site
Developmental Trends in Database Management
• Data Warehouse Databases– Stores data from current and previous years that has been extracted
from operational and management databases
– Central source of data
– Standardized and integrated for use by managers and other end user professionals
– Objective of a corporate data warehouse• To continually select data from the operational databases
• Transform the data into a uniform format
• Open the warehouse to the ends through a friendly and consistent interface
– Data mining• Automated discovery of potentially significant relationships among
categories of data
Developmental Trends in Database Management
• Systems supporting a data warehouse
• Extract and Prepare Data– The first subsystem extracts the data from the operational systems
– Many are older legacy systems that get “scrubbed” by removing errors and inconsistencies
• Store Data in the Warehouse– The second support component is the DBMS that will manage the
warehouse data.
• Provide Access and Analysis Capabilities– The third subsystem is made up of the query tools to access the
data
– Includes OLAP (OnLine Analytical Processing) and other DSS tools supporting data analysis
Developmental Trends in Database Management
• Object-Oriented and Other Rich Databases– With expanded capabilities of information technology, the
content of databases is becoming richer
– Traditional databases include largely numerical data or short fragments of text organized into well structured records
– As processing and storage capabilities expand with growing telecommunications, the knowledge is supported with rich data
– Geographic Information Systems
– Object-Oriented Databases
– Hypertext and Hypermedia Databases
– Image Databases and Other Text Databases
Tele-Tele-communicationscommunications
NetworkNetwork
Site 1 Site 3
Site 2Users haveaccess to theentire databaseover the network
DatabaseFragment 3
DatabaseFragment 1
DatabaseFragment 2
.
.
.
.
.
.
A System with a Distributed DatabaseA System with a Distributed Database
BitByteFieldRecordFilePrimary KeyDatabaseAccess (to a record)Sequential AccessDirect AccessSequential FileIndexed-Sequential FileDirect FileDatabase Management System (DBMS)Program-Data IndependenceSchemaSubschema
BitByteFieldRecordFilePrimary KeyDatabaseAccess (to a record)Sequential AccessDirect AccessSequential FileIndexed-Sequential FileDirect FileDatabase Management System (DBMS)Program-Data IndependenceSchemaSubschema
Physical SchemaData Definition Language (DDL)Data Manipulation Language (DML)Data ModelHierarchical ModelNetwork ModelRelational ModelStructured Query Language (SQL)Logical Design (data modeling)Entity-Relationship (E-R) DiagramNormalizationPhysical DesignData DictionaryData Administrator (DA)
Physical SchemaData Definition Language (DDL)Data Manipulation Language (DML)Data ModelHierarchical ModelNetwork ModelRelational ModelStructured Query Language (SQL)Logical Design (data modeling)Entity-Relationship (E-R) DiagramNormalizationPhysical DesignData DictionaryData Administrator (DA)
Database Administrator (DBA)Distributed DatabaseObject-Oriented DatabaseData WarehouseOLAP SoftwareData MiningGeographical Information System (GIS)
Database Administrator (DBA)Distributed DatabaseObject-Oriented DatabaseData WarehouseOLAP SoftwareData MiningGeographical Information System (GIS)
Key TermsKey Terms
Recommended