
DB notes



7/30/2019 DB notes

http://slidepdf.com/reader/full/db-notes 1/13

Types of DBMS 

There are four main types of database management systems (DBMS), distinguished by how they manage database structures. In other words, the type of a DBMS depends on how that particular DBMS structures the database.

Contents

1 Hierarchical DBMS  

2 Network DBMS  

3 Relational DBMS  

4 Object-Oriented DBMS

Hierarchical DBMS 

A DBMS is said to be hierarchical if the relationships among data in the database are established in such a way that one data item is present as the subordinate of another. Here "subordinate" means that items have parent-child relationships among them. Direct relationships exist between any two records that are stored consecutively. The DBMS uses the tree data structure to organize the database. No backward movement is allowed in a hierarchical database: records are reached by moving down from parent to child.

The hierarchical data model was developed by IBM in 1968 and introduced in its Information Management System (IMS). The model resembles a tree, with the records forming the nodes and the fields forming the branches. In the hierarchical model, records are linked in the form of an organization chart. A tree structure can represent one-to-many relationships.
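The parent-child structure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real hierarchical DBMS stores data; the `Record` class, the `find_path` helper, and the sample names are all assumptions made for the example. Navigation only ever moves downward from the root.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    """A node in the tree; one parent may own many children (one-to-many)."""
    name: str
    children: List["Record"] = field(default_factory=list)

    def add_child(self, name: str) -> "Record":
        child = Record(name)
        self.children.append(child)
        return child


def find_path(node: Record, target: str) -> Optional[List[str]]:
    """Search downward from `node`; return the names from `node` to `target`."""
    if node.name == target:
        return [node.name]
    for child in node.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [node.name] + sub_path
    return None  # target is not in this subtree


# Hypothetical organization chart.
root = Record("Company")
sales = root.add_child("Sales")
sales.add_child("Clerk")
root.add_child("Support")

clerk_path = find_path(root, "Clerk")
missing_path = find_path(root, "Warehouse")
```

Because each record hangs under exactly one parent in this sketch, there is a single path from the root to any record, which is what makes the structure a tree rather than a general graph.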

Network DBMS 

A DBMS is said to be a network DBMS if the relationships among data in the database are of the many-to-many type. These many-to-many relationships take the form of a network. The structure of a network database is therefore extremely complicated, because any one record can be used as a key into the entire database. A network database is structured as a graph, which is also a data structure. Although the structure of such a DBMS is highly complicated, it has two basic elements, records and sets, which are used to designate many-to-many relationships. High-level languages such as Pascal, COBOL, and FORTRAN were mainly used to implement the record and set structures.

Relational DBMS 

A DBMS is said to be a relational DBMS, or RDBMS, if the database relationships are treated in the form of a table. There are three key terms in a relational DBMS: 1) relation, 2) domain, and 3) attribute. (A network DBMS, by contrast, is built on two fundamental constructs, sets and records: sets contain one-to-many relationships, and records contain fields.) A table composed of rows and columns is used to organize the database and its structure, and can be thought of as a two-dimensional array. A number of RDBMSs are available; some popular examples are Oracle, Sybase, Ingres, Informix, Microsoft SQL Server, and Microsoft Access.

Object-Oriented DBMS 


Able to handle many new data types, including graphics, photographs, audio, and video, object-oriented databases represent a significant advance over their other database cousins. Hierarchical and network databases are all designed to handle structured data; that is, data that fits nicely into fields, rows, and columns. They are useful for handling small snippets of information such as names, addresses, zip codes, product numbers, and any kind of statistic or number you can think of. On the other hand, an object-oriented database can be used to store data from a variety of media sources, such as photographs and text, and produce work, as output, in a multimedia format.

Data normalization is a process in which data attributes within a data model are organized to increase the cohesion of entity types. In other words, the goal of data normalization is to reduce and even eliminate data redundancy, an important consideration for application developers because it is incredibly difficult to store objects in a relational database that maintains the same information in several places. Table 1 summarizes the three most common forms of normalization (First normal form (1NF), Second normal form (2NF), and Third normal form (3NF)) describing how to put entity types into a series of increasing levels of normalization. Higher levels of data normalization are beyond the scope of this article. With respect to terminology, a data schema is considered to be at the level of normalization of its least normalized entity type. For example, if all of your entity types are at second normal form (2NF) or higher then we say that your data schema is at 2NF.

5. Why Data Normalization?

The advantage of having a highly normalized data schema is that information is stored in one

 place and one place only, reducing the possibility of inconsistent data. Furthermore, highly-

normalized data schemas in general are closer conceptually to object-oriented schemas

 because the object-oriented goals of promoting high cohesion and loose coupling between

classes results in similar solutions (at least from a data point of view). This generally makes it

easier to map your objects to your data schema.

Benefits of Normalization

 Normalization provides numerous benefits to a database. Some of the major benefits include

the following :

  Greater overall database organization

  Reduction of redundant data

  Data consistency within the database

  A much more flexible database design

  A better handle on database security

Organization is brought about by the normalization process, making everyone's job easier,

from the user who accesses tables to the database administrator (DBA) who is responsible for 

the overall management of every object in the database. Data redundancy is reduced, which

simplifies data structures and conserves disk space. Because duplicate data is minimized, the

 possibility of inconsistent data is greatly reduced. For example, in one table an individual's

name could read STEVE SMITH, whereas the name of the same individual reads STEPHEN R.

SMITH in another table. Because the database has been normalized and broken into smaller 

tables, you are provided with more flexibility as far as modifying existing structures. It is

much easier to modify a small table with little data than to modify one big table that holds all the vital data in the database. Lastly, security is also provided in the sense that the DBA can


grant access to limited tables to certain users. Security is easier to control when normalization

has occurred.

Data integrity is the assurance of consistent and accurate data within a database.

Referential Integrity

Referential integrity simply means that the values of one column in a table depend on the

values of a column in another table. For instance, in order for a customer to have a record in

the ORDERS_TBL table, there must first be a record for that customer in the CUSTOMER_TBL 

table. Integrity constraints can also control values by restricting a range of values for a column. The integrity constraint should be created at the table's creation. Referential integrity

is typically controlled through the use of primary and foreign keys.

In a table, a foreign key, normally a single field, directly references a primary key in another 

table to enforce referential integrity. In the preceding paragraph, the CUST_ID in ORDERS_TBL 

is a foreign key that references CUST_ID in CUSTOMER_TBL.
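As a rough illustration of the primary key and foreign key relationship described above, the following sketch uses Python's built-in sqlite3 module. The table names and CUST_ID come from the text; the other columns (CUST_NAME, ORD_NUM, QTY), the sample rows, and the 1 to 100 range are assumptions made up for the example. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other database systems behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Constraints are declared when the tables are created.
conn.execute(
    "CREATE TABLE CUSTOMER_TBL (CUST_ID INTEGER PRIMARY KEY, CUST_NAME TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE ORDERS_TBL (
        ORD_NUM INTEGER PRIMARY KEY,
        CUST_ID INTEGER NOT NULL REFERENCES CUSTOMER_TBL (CUST_ID),  -- foreign key
        QTY     INTEGER NOT NULL CHECK (QTY BETWEEN 1 AND 100)       -- range restriction
    )
""")
conn.execute("INSERT INTO CUSTOMER_TBL VALUES (1, 'STEVE SMITH')")
conn.execute("INSERT INTO ORDERS_TBL VALUES (10, 1, 5)")  # customer 1 exists, so this is accepted


def try_insert(row):
    """Attempt to insert an order row and report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO ORDERS_TBL VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


orphan_result = try_insert((11, 99, 5))   # no customer 99 in CUSTOMER_TBL
range_result = try_insert((12, 1, 500))   # quantity outside the allowed range
order_count = conn.execute("SELECT COUNT(*) FROM ORDERS_TBL").fetchone()[0]
```

Both rejected inserts raise `sqlite3.IntegrityError`, so only the one valid order remains in ORDERS_TBL.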

Drawbacks of Normalization

Although most successful databases are normalized to some degree, there is one substantial

drawback of a normalized database: reduced database performance. The acceptance of 

reduced performance requires the knowledge that when a query or transaction request is sent

to the database, there are factors involved, such as CPU usage, memory usage, and

input/output (I/O). To make a long story short, a normalized database requires much more

CPU, memory, and I/O to process transactions and database queries than does a denormalized

database. A normalized database must locate the requested tables and then join the data from the tables to either get the requested information or to process the desired data. A more in-depth discussion concerning database performance occurs in Hour 18, "Managing Database

Users."

Denormalizing a Database

Denormalization is the process of taking a normalized database and modifying table structures

to allow controlled redundancy for increased database performance. Attempting to improve

 performance is the only reason to ever denormalize a database. A denormalized database is

not the same as a database that has not been normalized. Denormalizing a database is the process of taking the level of normalization within the database down a notch or two. Remember, normalization can actually slow performance with its frequently occurring table

 join operations. (Table joins are discussed during Hour 13, "Joining Tables in Queries.")

Denormalization may involve recombining separate tables or creating duplicate data within

tables to reduce the number of tables that need to be joined to retrieve the requested data,

which results in less I/O and CPU time.

There are costs to denormalization, however. Data redundancy is increased in a denormalized

database, which can improve performance but requires more extraneous efforts to keep track 

of related data. Application coding renders more complications, because the data has been

spread across various tables and may be more difficult to locate. In addition, referential

integrity is more of a chore; related data has been divided among a number of tables. There is


a happy medium in both normalization and denormalization, but both require a thorough

knowledge of the actual data and the specific business requirements of the pertinent company.

6. Denormalization

From a purist point of view you want to normalize your data structures as much as possible, but from a practical point of view you will find that you need to "back out" of some of your

normalizations for performance reasons. This is called "denormalization". For example, with

the data schema of  Figure 1 all the data for a single order is stored in one row (assuming

orders of up to nine order items), making it very easy to access. With the data schema of 

Figure 1 you could quickly determine the total amount of an order by reading the single row

from the Order0NF table. To do so with the data schema of  Figure 5 you would need to read

data from a row in the Order table, data from all the rows from the OrderItem table for that

order and data from the corresponding rows in the Item table for each order item. For this

query, the data schema of  Figure 1 very likely provides better performance.
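The figures are not reproduced in these notes, so the following Python sqlite3 sketch only approximates the comparison. The table names Order0NF, Order, OrderItem, and Item come from the text; every column name, the stored TotalAmount value, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized: everything about the order sits on one row (hypothetical columns).
    CREATE TABLE Order0NF (OrderID INTEGER PRIMARY KEY, TotalAmount REAL);
    INSERT INTO Order0NF VALUES (1, 25.0);

    -- Normalized: the same order spread over three tables.
    CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY);
    CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, Price REAL);
    CREATE TABLE OrderItem (
        OrderID  INTEGER REFERENCES "Order" (OrderID),
        ItemID   INTEGER REFERENCES Item (ItemID),
        Quantity INTEGER
    );
    INSERT INTO "Order" VALUES (1);
    INSERT INTO Item VALUES (100, 10.0), (200, 2.5);
    INSERT INTO OrderItem VALUES (1, 100, 2), (1, 200, 2);
""")

# One row read from one table.
single_row_total = conn.execute(
    "SELECT TotalAmount FROM Order0NF WHERE OrderID = 1"
).fetchone()[0]

# Rows read from three tables and joined before the total can be computed.
joined_total = conn.execute("""
    SELECT SUM(oi.Quantity * i.Price)
    FROM "Order" AS o
    JOIN OrderItem AS oi ON oi.OrderID = o.OrderID
    JOIN Item AS i ON i.ItemID = oi.ItemID
    WHERE o.OrderID = 1
""").fetchone()[0]
```

Both queries produce the same total, but the second has to touch three tables to get there, which is the extra work the paragraph above refers to. The cost of the denormalized version is that the stored total must be kept in step with the item data by hand.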

DBMS Functions

There are several functions that a DBMS performs to ensure data integrity and consistency

of data in the database. The ten functions in the DBMS are: data dictionary management, data storage

management, data transformation and presentation, security management, multiuser access control,

backup and recovery management, data integrity management, database access languages and

application programming interfaces, database communication interfaces, and transaction

management.

1. Data Dictionary Management

Data Dictionary is where the DBMS stores definitions of the data elements and their

relationships (metadata). The DBMS uses this function to look up the required data component

structures and relationships. When programs access data in a database they are basically

going through the DBMS. This function removes structural and data dependency and provides

the user with data abstraction. In turn, this makes things a lot easier on the end user. The Data

Dictionary is often hidden from the user and is used by Database Administrators and

Programmers. 

2. Data Storage Management

This particular function is used for the storage of data and any related data entry forms or

screen definitions, report definitions, data validation rules, procedural code, and structures that can

handle video and picture formats. Users do not need to know how data is stored or manipulated. Also

involved with this structure is a term called performance tuning that relates to a database’s efficiency

in relation to storage and access speed.


 

3. Data Transformation and Presentation

This function exists to transform any data entered into required data structures. By using the

data transformation and presentation function the DBMS can determine the difference between logical

and physical data formats.

4. Security Management

This is one of the most important functions in the DBMS. Security management sets rules that determine which users are allowed to access the database. Users authenticate with a username and password, or sometimes through biometric authentication (such as a fingerprint or retina scan), although biometric methods tend to be more costly. This function also sets restraints on what specific data any user can see or manage.

5. Multiuser Access Control

Data integrity and data consistency are the basis of this function. Multiuser access control is

a very useful tool in a DBMS; it enables multiple users to access the database simultaneously without

affecting the integrity of the database.

6. Backup and Recovery Management

Backup and recovery come to mind whenever there are potential outside threats to a database. For example, if there is a power outage, recovery management concerns how, and how quickly, the database is recovered after the outage. Backup management refers to data safety and integrity; for example, backing up all your mp3 files on a disk.

7. Data Integrity Management

The DBMS enforces integrity rules to reduce problems such as data redundancy, which is when data is stored in more than one place unnecessarily, and to maximize data consistency, which means the database returns the same correct answer each time the same question is asked.

8. Database Access Languages and Application Programming Interfaces

A query language is a nonprocedural language. An example of this is SQL (Structured Query Language). SQL is the most common query language and is supported by the majority of DBMS vendors. The use of this language makes it easy for users to specify what they want done without the headache of explaining how to specifically do it.
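The difference between saying what you want and spelling out how to get it can be sketched as follows, using Python's sqlite3 module. PRODUCTS_TBL and PROD_ID appear later in these notes; the PROD_DESC and COST columns and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCTS_TBL (PROD_ID INTEGER PRIMARY KEY, PROD_DESC TEXT, COST REAL)"
)
conn.executemany(
    "INSERT INTO PRODUCTS_TBL VALUES (?, ?, ?)",
    [(1, "widget", 4.5), (2, "gadget", 12.0), (3, "gizmo", 2.25)],
)

# Nonprocedural: state WHAT is wanted; the DBMS decides how to find and order it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT PROD_DESC FROM PRODUCTS_TBL WHERE COST < 10 ORDER BY COST"
    )
]

# Procedural equivalent: fetch everything, then filter and sort step by step.
rows = conn.execute("SELECT PROD_DESC, COST FROM PRODUCTS_TBL").fetchall()
cheap = [(desc, cost) for desc, cost in rows if cost < 10]
cheap.sort(key=lambda pair: pair[1])
procedural = [desc for desc, _ in cheap]
```

Both lists end up identical; the SQL version simply leaves the scanning, filtering, and sorting strategy to the database engine.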


 

9. Database Communication Interfaces

This refers to how a DBMS can accept different end user requests through different network

environments. An example of this can be easily related to the internet. A DBMS can provide access to

the database using the Internet through Web Browsers (Mozilla Firefox, Internet Explorer, Netscape).

10. Transaction Management

This refers to how a DBMS must supply a method that will guarantee that all the updates in a given

transaction are made or not made. All transactions must follow what is called the ACID properties.

A  – Atomicity: states a transaction is an indivisible unit that is either performed as a whole and not by

its parts, or not performed at all. It is the responsibility of recovery management to make sure this

takes place.

C  – Consistency: A transaction must alter the database from one consistent state to another consistent state.

I  – Isolation: Transactions must be executed independently of one another. Part of a transaction in

progress should not be able to be seen by another transaction.

D  – Durability: A successfully completed transaction is recorded permanently in the database and

must not be lost due to failures. 
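Atomicity in particular is easy to see in a small experiment. The sketch below uses Python's sqlite3 module; the account table, the names, and the amounts are hypothetical. A transfer consists of two updates, and if the second one violates a constraint the first one is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(amount):
    """Move `amount` from alice to bob as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'bob'", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'alice'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


failed = transfer(500)      # alice cannot cover this, so nothing is applied
after_failed = balances()
succeeded = transfer(30)    # both updates are applied together
after_success = balances()
```

After the failed transfer, bob's balance is unchanged even though his update ran first; the rollback undid it. The CHECK constraint here stands in for a consistency rule that no transaction may break.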

Database Normalization Basics

By Mike Chapple, About.com Guide


If you've been working with databases for a while, chances are you've heard the term normalization. Perhaps someone's asked you "Is that database normalized?" or "Is that in BCNF?" All too often, the reply is "Uh, yeah." Normalization is often brushed aside as a luxury that only academics have time for. However, knowing the principles of normalization and applying them to your daily database design tasks really isn't all that complicated and it could drastically improve the performance of your DBMS.


In this article, we'll introduce the concept of normalization and take a brief look at the most common normal forms. Future articles will provide in-depth explorations of the normalization process.

What is Normalization?

Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications, you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen and won't be discussed in this article.

Before we begin our discussion of the normal forms, it's important to point out that they are guidelines and guidelines only. Occasionally, it becomes necessary to stray from them to meet practical business requirements. However, when variations take place, it's extremely important to evaluate any possible ramifications they could have on your system and account for possible inconsistencies. That said, let's explore the normal forms.

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

  Eliminate duplicative columns from the same table.

  Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

For more details, read Putting your Database in First Normal Form
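The two 1NF rules above can be illustrated with a small Python sqlite3 sketch. The contact tables, their columns, and the sample rows are all invented for the example; the point is only to show duplicative columns (phone1, phone2) being replaced by one row per value, with every row identified by a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 1NF: phone1 and phone2 are duplicative columns.
    CREATE TABLE contact_raw (
        contact_id INTEGER PRIMARY KEY, name TEXT, phone1 TEXT, phone2 TEXT
    );
    INSERT INTO contact_raw VALUES
        (1, 'Ann', '555-0100', '555-0101'),
        (2, 'Raj', '555-0199', NULL);

    -- 1NF: a separate table for the repeating group, each row uniquely identified.
    CREATE TABLE contact (contact_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contact_phone (
        contact_id INTEGER REFERENCES contact (contact_id),
        phone      TEXT,
        PRIMARY KEY (contact_id, phone)
    );

    INSERT INTO contact SELECT contact_id, name FROM contact_raw;
    INSERT INTO contact_phone
        SELECT contact_id, phone1 FROM contact_raw WHERE phone1 IS NOT NULL
        UNION ALL
        SELECT contact_id, phone2 FROM contact_raw WHERE phone2 IS NOT NULL;
""")

phones = conn.execute(
    "SELECT contact_id, phone FROM contact_phone ORDER BY contact_id, phone"
).fetchall()
```

With this layout a contact can have any number of phone numbers without adding a phone3 column, and no row carries an empty placeholder.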

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

  Meet all the requirements of the first normal form.

  Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

  Create relationships between these new tables and their predecessors through the use of foreign keys.

For more details, read Putting your Database in Second Normal Form

Third Normal Form (3NF)


Third normal form (3NF) goes one large step further:

  Meet all the requirements of the second normal form. 

  Remove columns that are not dependent upon the primary key. 

For more details, read Putting your Database in Third Normal Form

Boyce-Codd Normal Form (BCNF or 3.5NF)

The Boyce-Codd Normal Form, also referred to as the "third and half (3.5) normal form", adds one more requirement:

  Meet all the requirements of the third normal form.

  Every determinant must be a candidate key.

For more details, read Putting your Database in Boyce Codd Normal Form

Fourth Normal Form (4NF)

Finally, fourth normal form (4NF) has one additional requirement:

  Meet all the requirements of the third normal form.

  A relation is in 4NF if it has no multi-valued dependencies.

Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.

Should I Normalize?

While database normalization is often a good idea, it's not an absolute requirement. In fact, there are some cases where deliberately violating the rules of normalization is a good practice. For more on this topic, read Should I Normalize My Database?.


Find out what normalization is and how your database can benefit from it (or suffer from it). Learn the advantages, disadvantages, and some techniques and guidelines to doing it yourself.

In this hour, you learn the process of taking a raw database and breaking it into logical units

called tables. This process is referred to as normalization. The normalization process is used by database developers to design databases in which it is easy to organize and manage data while ensuring the accuracy of data throughout the database.

The advantages and disadvantages of both normalization and denormalization of a database

are discussed, as well as data integrity versus performance issues that pertain to

normalization.

The highlights of this hour include

  What normalization is

  Benefits of normalization

  Advantages of denormalization

  Normalization techniques

  Guidelines of normalization

  The three normal forms

  Database design

Normalizing a Database

 Normalization is a process of reducing redundancies of data in a database. Normalization is a

technique that is used when designing and redesigning a database. Normalization is a process

or set of guidelines used to optimally design a database to reduce redundant data. The actual guidelines of normalization, called normal forms, will be discussed later in this hour. It was a

difficult decision to decide whether to cover normalization in this book because of the

complexity involved in understanding the rules of the normal forms this early on in your SQL

 journey. However, normalization is an important process that, if understood, will increase

your understanding of SQL. We have attempted to simplify the process of normalization as

much as possible in this hour. At this point, don't be overly concerned with all the specifics of 

normalization; it is most important to understand the basic concepts.

The Raw Database

A database that is not normalized may include data that is contained in one or more different tables for no apparent reason. This could be bad for security reasons, disk space usage, speed

of queries, efficiency of database updates, and, maybe most importantly, data integrity. A

database before normalization is one that has not been broken down logically into smaller,

more manageable tables. Figure 4.1 illustrates the database used for this book before it was

normalized.

Figure 4.1 The raw database. 

Logical Database Design


Any database should be designed with the end user in mind. Logical database design, also

referred to as the logical model, is the process of arranging data into logical, organized groups

of objects that can easily be maintained. The logical design of a database should reduce data

repetition or go so far as to completely eliminate it. After all, why store the same data twice?

 Naming conventions used in a database should also be standard and logical.

What Are the End User's Needs?

The needs of the end user should be one of the top considerations when designing a database.

Remember that the end user is the person who ultimately uses the database. There should be

ease of use through the user's front-end tool (a client program that allows a user access to a

database), but this, along with optimal performance, cannot be achieved if the user's needs are

not taken into consideration.

Some user-related design considerations include the following:

  What data should be stored in the database?

  How will the user access the database?

  What privileges does the user require?

  How should the data be grouped in the database?

  What data is the most commonly accessed?

  How is all data related in the database?

  What measures should be taken to ensure accurate data?

Data Redundancy

Data should not be redundant, which means that the duplication of data should be kept to a

minimum for several reasons. For example, it is unnecessary to store an employee's home

address in more than one table. With duplicate data, unnecessary space is used. Confusion is

always a threat when, for instance, an address for an employee in one table does not match the

address of the same employee in another table. Which table is correct? Do you have

documentation to verify the employee's current address? As if data management were not

difficult enough, redundancy of data could prove to be a disaster.

The Normal Forms

The next sections discuss the normal forms, an integral concept involved in the process of 

database normalization.

 Normal form is a way of measuring the levels, or depth, to which a database has been

normalized. A database's level of normalization is determined by the normal form.

The following are the three most common normal forms in the normalization process:

  The first normal form

  The second normal form

  The third normal form


Of the three normal forms, each subsequent normal form depends on normalization steps

taken in the previous normal form. For example, to normalize a database using the second

normal form, the database must first be in the first normal form.

The First Normal Form

The objective of the first normal form is to divide the base data into logical units called tables.

When each table has been designed, a primary key is assigned to most or all tables. Examine

Figure 4.2, which illustrates how the raw database shown in the previous figure has been

redeveloped using the first normal form.

Figure 4.2 The first normal form. 

You can see that to achieve the first normal form, data had to be broken into logical units of 

related information, each having a primary key and ensuring that there are no repeated groups

in any of the tables. Instead of one large table, there are now smaller, more manageable tables:

EMPLOYEE_TBL, CUSTOMER_TBL, and PRODUCTS_TBL. The primary keys are normally the first columns listed in a table, in this case: EMP_ID, CUST_ID, and PROD_ID.

The Second Normal Form

The objective of the second normal form is to take data that is only partly dependent on the

 primary key and enter that data into another table. Figure 4.3 illustrates the second normal

form.

Figure 4.3 The second normal form. 

According to the figure, the second normal form is derived from the first normal form by

further breaking two tables down into more specific units.

EMPLOYEE_TBL split into two tables called EMPLOYEE_TBL and EMPLOYEE_PAY_TBL . Personal

employee information is dependent on the primary key (EMP_ID), so that information

remained in the EMPLOYEE_TBL (EMP_ID, LAST_NAME, FIRST_NAME, MIDDLE_NAME, ADDRESS,

CITY, STATE, ZIP, PHONE, and PAGER). On the other hand, the information that is only partly

dependent on the EMP_ID (each individual employee) is used to populate EMPLOYEE_PAY_TBL  

(EMP_ID, POSITION, POSITION_DESC, DATE_HIRE, PAY_RATE, and DATE_LAST_RAISE). Notice

that both tables contain the column EMP_ID. This is the primary key of each table and is used

to match corresponding data between the two tables.

CUSTOMER_TBL split into two tables called CUSTOMER_TBL and ORDERS_TBL. What took place

is similar to what occurred in the EMPLOYEE_TBL. Columns that were partly dependent on the

 primary key were directed to another table. The order information for a customer is dependent

on each CUST_ID, but does not directly depend on the general customer information in the original table.
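Figure 4.3 is not reproduced in these notes, so here is a rough Python sqlite3 sketch of the EMPLOYEE_TBL split described above. The table and column names are taken from the text, but only a few columns are shown, and the data types and sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Personal information stays in EMPLOYEE_TBL.
    CREATE TABLE EMPLOYEE_TBL (
        EMP_ID INTEGER PRIMARY KEY, LAST_NAME TEXT, FIRST_NAME TEXT
    );
    -- Pay information moves to its own table, keyed by the same EMP_ID.
    CREATE TABLE EMPLOYEE_PAY_TBL (
        EMP_ID   INTEGER PRIMARY KEY REFERENCES EMPLOYEE_TBL (EMP_ID),
        POSITION TEXT,
        PAY_RATE REAL
    );
    INSERT INTO EMPLOYEE_TBL VALUES (1, 'SMITH', 'STEVE');
    INSERT INTO EMPLOYEE_PAY_TBL VALUES (1, 'CLERK', 12.5);
""")

# EMP_ID matches corresponding rows between the two tables.
employee_with_pay = conn.execute("""
    SELECT e.LAST_NAME, p.POSITION, p.PAY_RATE
    FROM EMPLOYEE_TBL AS e
    JOIN EMPLOYEE_PAY_TBL AS p ON p.EMP_ID = e.EMP_ID
    WHERE e.EMP_ID = 1
""").fetchone()
```

The join on EMP_ID puts the two halves back together whenever a query needs both personal and pay details.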

The Third Normal Form

The third normal form's objective is to remove data in a table that is not dependent on the

 primary key. Figure 4.4 illustrates the third normal form.


Figure 4.4 The third normal form. 

Another table was created to display the use of the third normal form. EMPLOYEE_PAY_TBL is

split into two tables, one table containing the actual employee pay information and the other 

containing the position descriptions, which really do not need to reside in

EMPLOYEE_PAY_TBL. The POSITION_DESC column is totally independent of the primary key, EMP_ID.
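A possible shape for this split is sketched below with Python's sqlite3 module. EMPLOYEE_PAY_TBL, EMP_ID, POSITION, POSITION_DESC, and PAY_RATE are named in the text; the name POSITIONS_TBL for the new table is an assumption (the text does not name it), as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Position descriptions live in one place (hypothetical table name).
    CREATE TABLE POSITIONS_TBL (POSITION TEXT PRIMARY KEY, POSITION_DESC TEXT);
    -- Pay rows keep only the position code, not its description.
    CREATE TABLE EMPLOYEE_PAY_TBL (
        EMP_ID   INTEGER PRIMARY KEY,
        POSITION TEXT REFERENCES POSITIONS_TBL (POSITION),
        PAY_RATE REAL
    );
    INSERT INTO POSITIONS_TBL VALUES ('CLERK', 'Front desk clerk');
    INSERT INTO EMPLOYEE_PAY_TBL VALUES (1, 'CLERK', 12.5), (2, 'CLERK', 13.0);
""")

# Changing the description once updates it for every employee in that position.
conn.execute(
    "UPDATE POSITIONS_TBL SET POSITION_DESC = ? WHERE POSITION = ?",
    ("Records clerk", "CLERK"),
)

descriptions = conn.execute("""
    SELECT p.EMP_ID, d.POSITION_DESC
    FROM EMPLOYEE_PAY_TBL AS p
    JOIN POSITIONS_TBL AS d ON d.POSITION = p.POSITION
    ORDER BY p.EMP_ID
""").fetchall()
```

Because the description depends on the position rather than on EMP_ID, storing it once removes the risk of two employees in the same position carrying different descriptions.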

Naming Conventions

 Naming conventions are one of the foremost considerations when you're normalizing a

database. Names are how you will refer to objects in the database. You want to give your 

tables names that are descriptive of the type of information they contain so that the data you

are looking for is easy to find. Descriptive table names are especially important for users

querying the database that had no part in the database design. A company-wide naming

convention should be set, providing guidance in the naming of not only tables within the

database, but users, filenames, and other related objects. Designing and enforcing naming

conventions is one of a company's first steps toward a successful database implementation.

Benefits of Normalization

 Normalization provides numerous benefits to a database. Some of the major benefits include

the following :

  Greater overall database organization

  Reduction of redundant data

  Data consistency within the database

  A much more flexible database design

  A better handle on database security

Organization is brought about by the normalization process, making everyone's job easier,

from the user who accesses tables to the database administrator (DBA) who is responsible for 

the overall management of every object in the database. Data redundancy is reduced, which

simplifies data structures and conserves disk space. Because duplicate data is minimized, the

 possibility of inconsistent data is greatly reduced. For example, in one table an individual's

name could read STEVE SMITH, whereas the name of the same individual reads STEPHEN R.

SMITH in another table. Because the database has been normalized and broken into smaller 

tables, you are provided with more flexibility as far as modifying existing structures. It is much easier to modify a small table with little data than to modify one big table that holds all

the vital data in the database. Lastly, security is also provided in the sense that the DBA can

grant access to limited tables to certain users. Security is easier to control when normalization

has occurred.

Data integrity is the assurance of consistent and accurate data within a database.

Referential Integrity

Referential integrity simply means that the values of one column in a table depend on the values of a column in another table. For instance, in order for a customer to have a record in

the ORDERS_TBL table, there must first be a record for that customer in the CUSTOMER_TBL 

table. Integrity constraints can also control values by restricting a range of values for a

column. The integrity constraint should be created at the table's creation. Referential integrity

is typically controlled through the use of primary and foreign keys.

In a table, a foreign key, normally a single field, directly references a primary key in another 

table to enforce referential integrity. In the preceding paragraph, the CUST_ID in ORDERS_TBL is a foreign key that references CUST_ID in CUSTOMER_TBL.
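
The foreign key and range restriction described above can be sketched as follows, assuming SQLite through Python's sqlite3 module. CUST_NAME, ORD_NUM, QTY, the 1 to 1000 range, and the helper try_insert are all hypothetical choices made for the example. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection; other database systems typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite-specific: enable enforcement

# Both constraints are declared when the tables are created.
conn.executescript("""
    CREATE TABLE CUSTOMER_TBL (
        CUST_ID   TEXT PRIMARY KEY,
        CUST_NAME TEXT NOT NULL                          -- hypothetical column
    );
    CREATE TABLE ORDERS_TBL (
        ORD_NUM TEXT PRIMARY KEY,                        -- hypothetical column
        CUST_ID TEXT NOT NULL
                REFERENCES CUSTOMER_TBL (CUST_ID),       -- foreign key
        QTY     INTEGER NOT NULL
                CHECK (QTY BETWEEN 1 AND 1000)           -- hypothetical range
    );
""")

def try_insert(sql, params):
    """Hypothetical helper: True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:                         # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

ok_customer = try_insert("INSERT INTO CUSTOMER_TBL VALUES (?, ?)", ("C1", "ACME CORP"))
ok_order = try_insert("INSERT INTO ORDERS_TBL VALUES (?, ?, ?)", ("O1", "C1", 5))
# No customer C9 exists, so the foreign key rejects this order.
orphan_order = try_insert("INSERT INTO ORDERS_TBL VALUES (?, ?, ?)", ("O2", "C9", 5))
# A quantity of 0 falls outside the allowed range, so the CHECK rejects it.
bad_quantity = try_insert("INSERT INTO ORDERS_TBL VALUES (?, ?, ?)", ("O3", "C1", 0))
print(ok_customer, ok_order, orphan_order, bad_quantity)
```

Only the first two inserts succeed; the order for an unknown customer and the out-of-range quantity are both refused by the database itself rather than by application code.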

Drawbacks of Normalization

Although most successful databases are normalized to some degree, there is one substantial

drawback of a normalized database: reduced database performance. To see why, consider the

resources involved whenever a query or transaction request is sent to the database: CPU

usage, memory usage, and input/output (I/O). In short, a normalized database generally

requires more CPU, memory, and I/O to process transactions and database queries than a

denormalized database does, because its data is spread across more tables.

A normalized database must locate the requested tables and then join the data from

the tables to either get the requested information or to process the desired data. A more in-

depth discussion concerning database performance occurs in Hour 18, "Managing Database

Users."

Denormalizing a Database

Denormalization is the process of taking a normalized database and modifying table structures

to allow controlled redundancy for increased database performance. Attempting to improve

 performance is the only reason to ever denormalize a database. A denormalized database is

not the same as a database that has not been normalized. Denormalizing a database is the process of taking the level of normalization within the database down a notch or two.

Remember, normalization can actually slow performance with its frequently occurring table

 join operations. (Table joins are discussed during Hour 13, "Joining Tables in Queries.")

Denormalization may involve recombining separate tables or creating duplicate data within

tables to reduce the number of tables that need to be joined to retrieve the requested data,

which results in less I/O and CPU time.
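
As a rough sketch of this trade, the example below answers the same question twice: once by joining two normalized tables and once by reading a single denormalized table that carries a duplicate of the customer name. It uses Python's sqlite3 module; ORDERS_DENORM_TBL, CUST_NAME, ORD_NUM, and QTY are hypothetical names assumed for the example, and no timing is measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER_TBL (
        CUST_ID   TEXT PRIMARY KEY,
        CUST_NAME TEXT NOT NULL
    );
    CREATE TABLE ORDERS_TBL (
        ORD_NUM TEXT PRIMARY KEY,
        CUST_ID TEXT NOT NULL,
        QTY     INTEGER NOT NULL
    );
    -- Denormalized variant: CUST_NAME is duplicated into every order row.
    CREATE TABLE ORDERS_DENORM_TBL (
        ORD_NUM   TEXT PRIMARY KEY,
        CUST_ID   TEXT NOT NULL,
        CUST_NAME TEXT NOT NULL,
        QTY       INTEGER NOT NULL
    );
""")
conn.execute("INSERT INTO CUSTOMER_TBL VALUES ('C1', 'ACME CORP')")
conn.executemany(
    "INSERT INTO ORDERS_TBL VALUES (?, ?, ?)",
    [("O1", "C1", 5), ("O2", "C1", 2)],
)
# Build the denormalized table from the normalized ones.
conn.execute("""
    INSERT INTO ORDERS_DENORM_TBL
    SELECT o.ORD_NUM, o.CUST_ID, c.CUST_NAME, o.QTY
    FROM ORDERS_TBL o
    JOIN CUSTOMER_TBL c ON c.CUST_ID = o.CUST_ID
""")

# Normalized: two tables must be located and joined.
normalized = conn.execute("""
    SELECT o.ORD_NUM, c.CUST_NAME, o.QTY
    FROM ORDERS_TBL o
    JOIN CUSTOMER_TBL c ON c.CUST_ID = o.CUST_ID
    ORDER BY o.ORD_NUM
""").fetchall()

# Denormalized: the same answer comes from a single table, with no join.
denormalized = conn.execute("""
    SELECT ORD_NUM, CUST_NAME, QTY
    FROM ORDERS_DENORM_TBL
    ORDER BY ORD_NUM
""").fetchall()

print(normalized == denormalized)
```

Both queries return the same rows; the denormalized one simply avoids the join at the price of storing the customer name once per order.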

There are costs to denormalization, however. Data redundancy is increased in a denormalized

database, which can improve performance but requires extra effort to keep track

of related data. Application coding becomes more complicated, because the data has been

spread across various tables and may be more difficult to locate. In addition, referential

integrity is more of a chore; related data has been divided among a number of tables. There is

a happy medium between normalization and denormalization, but finding it requires a thorough

knowledge of the actual data and the specific business requirements of the pertinent company.
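
The extra bookkeeping can be illustrated with a small sketch, again assuming Python's sqlite3 module and a hypothetical denormalized table ORDERS_DENORM_TBL in which CUST_NAME is repeated on every order row. An update that misses one copy leaves the same customer with two spellings, so every change has to reach all duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ORDERS_DENORM_TBL (
        ORD_NUM   TEXT PRIMARY KEY,
        CUST_ID   TEXT NOT NULL,
        CUST_NAME TEXT NOT NULL            -- duplicated on every order row
    )
""")
conn.executemany(
    "INSERT INTO ORDERS_DENORM_TBL VALUES (?, ?, ?)",
    [("O1", "C1", "STEVE SMITH"),
     ("O2", "C1", "STEVE SMITH"),
     ("O3", "C2", "ACME CORP")],
)

def inconsistent_customers():
    """Return the CUST_IDs that appear with more than one spelling of the name."""
    rows = conn.execute("""
        SELECT CUST_ID
        FROM ORDERS_DENORM_TBL
        GROUP BY CUST_ID
        HAVING COUNT(DISTINCT CUST_NAME) > 1
        ORDER BY CUST_ID
    """).fetchall()
    return [row[0] for row in rows]

# A careless update fixes only one of the duplicated copies.
conn.execute(
    "UPDATE ORDERS_DENORM_TBL SET CUST_NAME = 'STEPHEN R. SMITH' WHERE ORD_NUM = 'O1'"
)
after_partial_update = inconsistent_customers()

# The correct update has to touch every row that carries the duplicate.
cursor = conn.execute(
    "UPDATE ORDERS_DENORM_TBL SET CUST_NAME = 'STEPHEN R. SMITH' WHERE CUST_ID = 'C1'"
)
rows_touched = cursor.rowcount
after_full_update = inconsistent_customers()

print(after_partial_update, rows_touched, after_full_update)
```

In a normalized design the name would be stored once and this check would be unnecessary; in the denormalized table it becomes part of the application's responsibility.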