51
An Introduction to Database and Database Designing R C Goyal Principal Scientist IASRI, New Delhi

An Introduction to Database and Database Designing

Embed Size (px)

DESCRIPTION

An Introduction to Database and Database Designing. R C Goyal Principal Scientist IASRI, New Delhi. Databases Before the Use of Computers. Data kept in books, ledgers, card files, folders, and file cabinets Long response time Labor-intensive Often incomplete or inaccurate. - PowerPoint PPT Presentation

Citation preview

An Introduction to Database and Database Designing

R C GoyalPrincipal ScientistIASRI, New Delhi

Databases Before the Use of Computers

Data kept in books, ledgers, card files, folders, and file cabinets

Long response time Labor-intensive Often incomplete or inaccurate

Data & InformationData known facts that can be

recorded and that has implicit meaning

Information processed data

Meaningless data becomes information when it is processed and presented to the decision maker in a meaningful way

Database

A set of data that is required for a specific purpose or is fundamental to a system, project, enterprise, or business. A formally structured collection of data.

A database may consist of one or more data banks and be geographically distributed among several repositories.

In automated information systems, the database is manipulated using a database management system.

File Mgmt vs. Database Mgmt File Management

Each data entity is in a separate file system for creating, retrieving and

manipulating files. Database Management

Same file but data elements are integrated and shared among different files

Program controls the structure of a database and access to data

Database Management for Strategic Advantage We live in the Information Age Information used to make

organizations more productive and competitive

Databases used to support business operations

Databases used across a range of applictions Personal, department, enterprise, web

Technical Advantages of Database Management

Reduced data redundancy Improved data integrity More program independence Increased user productivity Increased security

Disadvantages of DBMS

Cost issues Implementation & maintenance

issues Security issues Privacy issues

Features of DBMS Database Engine

heart of DBMS stores, retrieves and updates data enforces business rules

Data dictionary holds definitions of all of the data tables Describes the data type, allows DBMS to

keep track of data, helps user find data they need

Features of DBMS…Cont.

Query processor Enables users to store and retrieve

data Use English command that use such

words as SELECT, DELETE, MODIFY Report generator

Formats and prints reports after user uses query processor

Features of DBMS…Cont.

Forms generator Application generator Access security

Setup access privileges to protect data from unauthorized access and sabotage

System recovery

Database Development Cycle

Database planning System Definition Requirements collection and analysis Database design DBMS selection Application design Prototyping Implementation Data Conversion and loading Testing Operational Maintenance

Database Planning

Systems Definition

Requirements Collectionand analysis

Database DesignDBMSSelection

ApplicationDesign

Implementation

Data Conversion and loading

Testing

Evaluation & Maintenance

Prototyping

Database Life Cycle

Source: http://www.cs/ucf.edu/courses/cgs2545/CH02/index.htm

Database Planning

Current systems evaluation

Development of standards

Technological feasibility

Operational feasibility

Economical feasibility

Requirements Collection and Analysis identifying management information

requirements, determining information requirements

by functional area, and establishing hardware and

software requirements

Systems definitionData dictionary Metadata

Database Design

Conceptual design

Logical design

Physical design

Database design methodology

A structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design.* Conceptual database design * Logical database design* Physical database design

Critical Success Factors in Database Design

Work interactively with the users as much as possible.

Follow a structured methodology throughout the data modelling process.

Incorporate structural and integrity considerations into the data models.

Combine conceptualisation, normalisation, and transaction validation techniques into the data modelling methodology.

Design Considerations - Tips

Planning Tables Queries Forms Reports Macros Modules General

Tips - Planning Before you start up the computer to build a

database, PLAN ON PAPER!!! Gather all of the paper forms you use to

collect your data Interview the people who will be using the

database, from those who conduct the field work to those who analyze the data

Think about what you want to get back out of your database, not just about how to get data in

Steal - see if it’s been done before

Tips - Tables

Normalize, normalize, normalize Use field properties to help maintain data

integrity When possible limit the possible entries in

a field to a lookup list (domain) Choose your key fields carefully Index fields that you will search often Many-to-many relationships in Access Steal - link to existing tables

Normalization

The process of breaking down large

tables into smaller ones by removing

all unnecessary or duplicate fields,

eliminating redundant data, and

making sure that each table

represents only one thing.

Why Normalize? If you don’t normalize, you may run into

anomalies (unexpected results) Deletion anomaly - deleting a record

unexpectedly removes a value we wanted to keep

Insertion anomaly - we can’t add a record because we don’t know the value of all of the required fields

Change anomaly - one change must be applied to many records in a single table

Non-normalized Table

Student# StName AdvName AdvRoom Class1 Class2 Class31022 Rajat Goyal 412 101-07 143-01 159-024123 Patel Gupta 216 202-01 211-02 214-01

Typical flat-file table

Contains all data in one record.

Contains repeating fields.

Contains information on more than one thing.

First Normal Form

Eliminate repeating fields within

tables.

Create a separate table for each set of

related data.

Identify each set of related data with a

primary key.

First Normal Form

No Repeating Fields: Class# is now a single field instead of repeating for each class a

student takes.

Now multiple records for each student because >1 record is needed to accommodate all

classes for each student

Student# StName AdvName AdvRoom Class#1022 Rajat Goyal 412 101-071022 Rajat Goyal 412 143-011022 Rajat Goyal 412 159-024123 Patel Gupta 216 210-014123 Patel Gupta 216 211-02

Student# StName AdvName AdvRoom Class1 Class2 Class31022 Rajat Goyal 412 101-07 143-01 159-024123 Patel Gupta 216 202-01 211-02 214-01

Non-normalized Table

Second Normal Form

Create separate tables for values

that apply to multiple records.

Relate these tables with a foreign

key.

Second Normal Form

Eliminate Redundant Data: no longer repeating student name, advisor, and adv-room for each class

Student# StName AdvName AdvRoom1022 Rajat Goyal 4124123 Patel Gupta 216

Students:

Student# Class#1022 101-071022 143-011022 159-024123 210-014123 211-024123 214-01

Classes:

Third Normal Form

Eliminate fields that do not depend on the key.

AdvRoom is independent of the key field (Student#). It does not depend on the student, it depends on

the advisor.

Student# StName AdvName AdvRoom1022 Rajat Goyal 4124123 Patel Gupta 216

Students:

Third Normal Form

Eliminate Data Not Dependent on Key: remove

AdvName from the Students table and create a

separate table describing the advisors with its own

primary key (AdvID).

Relate Advisors to Students using AdvID.

Student# StName AdvID1022 Rajat 14123 Patel 2

Students:

AdvID AdvName AdvRoom 1 Goyal 4122 Gupta 216

Advisors:

Deletion Anomalies

If we delete student Rajat, we lose the only records for advisor Goyal

Student# StName AdvName AdvRoom Class#1022 Rajat Goyal 412 101-071022 Rajat Goyal 412 143-011022 Rajat Goyal 412 159-024123 Patel Gupta 216 210-014123 Patel Gupta 216 211-02

Insertion Anomalies

If we want to add a new student we have to know what class they are in (it’s part of the key)

Student# StName AdvName AdvRoom Class#1022 Rajat Goyal 412 101-071022 Rajat Goyal 412 143-011022 Rajat Goyal 412 159-024123 Patel Gupta 216 210-014123 Patel Gupta 216 211-02

Change Anomalies

If student Rajat switches to advisor Gupta, we have to change three records instead of just one

Student# StName AdvName AdvRoom Class#1022 Rajat Goyal 412 101-071022 Rajat Goyal 412 143-011022 Rajat Goyal 412 159-024123 Patel Gupta 216 210-014123 Patel Gupta 216 211-02

Field Properties

Data Type - text, integer, double, boolean, etc.

Field Size - for text fields

Input Mask - a pattern for entering data

Default Values - auto-entered for new records

Validation Rule - limits values entered

Required? - force entry of data

Relationships

One to One One to Many Many to One Many to Many

Many-to-Many Relationship

Exist between two tables when: for one record in the first table,

there can be many corresponding records in the second table and…

for one record in the second table, there can be many corresponding records in the first table

Many-to-Many Relationship

One student can take many classes, and one class can be taken by many students.

Students

StudentIDNameAddressCityStateZipWorkPhoneHomePhone

Classes

ClassNumberStudentIDSubjectInstructorIDDaysTimeComments

Tips - Queries Take advantage of action queries to handle

batch record operations Use queries to present calculated values

rather than storing the calculated values in your tables

Remember that null never equals another null Joining two tables on a field that may contain a null

value may not give you the results you expect Searching for duplicate values will not return two

records that have a null Realize that you can link a table to itself

DBMS Selection

* Costs* Features and Tools* Underlying model * Portability* DBMS hardware requirements* Organisational requirements

Implementation

The physical realisation of the database and application designs

the detailed model is converted to the appropriate implementation model, the data dictionary is built, the database is populated, application programs are developed and users are trained

Data Conversion and Loading & Testing

Transferring any existing or new data into the new database and converting any existing applications to run on the new database

Finding errors

Database Evaluation

Interviewing and polling users to determine whether any data needs are unmet.

Operational maintenance

preventive maintenance (backup) corrective maintenance

(recovery)1 adaptive maintenance regular monitoring & periodical

check up

Recent Developments Affecting Database Design and Use

Data Mining (On-Line Analytical Processing) Drill down from summary data to detailed data Data Warehouses/Data Marts

Integrates many large databases into one repository

Linking Web Site Applications to Organizational Databases Users have Web view to organizational database Improves customer contact and service Adds security as a concern

Data Warehouses

Data warehouses are used to organize information for quick and effective queries

Data Warehouses and Database

In the data warehouse, data are organized around major subjects

Data in the warehouse are stored as summarized rather than detailed raw data

Data in the data warehouse cover a much longer time frame than in a traditional transaction-oriented database

Data warehouses are organized for fast queries

Data warehouses are usually optimized for answering complex queries, known as OLAP

Data Warehouses and Database

Data warehouses allow for easy access via data-mining software called software

Data warehouses include multiple databases that have been processed so that data are uniformly defined, containing what is referred to as “clean” data

Data warehouses usually contain data from outside sources

Data Mining

Statistical analysis Decision trees Neural networks Fuzzy logic Clustering

Data Mining Patterns

Data mining patterns that decision makers try to identify include Associations, patterns that occur

together Sequences, patterns of actions that

take place over a period of time Clustering, patterns that develop

among groups of people Trends, the patterns that are noticed

over a period of time

Web Based Databases and XML

Web-based databases are used for sharing data

Extensible markup language (XML) is used to define data used primarily for business data exchange over the Web An XML document contains only data and

the nature of the data Continue………………………………..