LBSC 690 Session 8 Files and Mass Storage

LBSC 690

Session 8

Files and Mass Storage

Mass Storage

Handling large amounts of data.
Deal with files: Organized collections of data.
Three Basic Types:
  Sequential.
  Index-sequential.
  Direct access.
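
The three file types can be contrasted with a minimal Python sketch. It assumes fixed-length 16-byte records and uses an in-memory byte stream in place of a disk file; the record layout, the helper names and the sample names are illustrative assumptions, and the index here is a simplified stand-in for a real index-sequential file.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes


def make_file(names):
    """Write each name as one fixed-length, space-padded record."""
    f = io.BytesIO()
    for name in names:
        f.write(name.encode("ascii").ljust(RECORD_SIZE))
    return f


def sequential_find(f, name):
    """Sequential access: read records in order from the start until a match."""
    f.seek(0)
    position = 0
    while True:
        record = f.read(RECORD_SIZE)
        if not record:
            return None  # reached end of file without a match
        if record.decode("ascii").rstrip() == name:
            return position
        position += 1


def direct_read(f, position):
    """Direct access: jump straight to record number `position`."""
    f.seek(position * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").rstrip()


def build_index(f):
    """Index-sequential access: one pass builds a key -> position index."""
    index = {}
    f.seek(0)
    position = 0
    while True:
        record = f.read(RECORD_SIZE)
        if not record:
            return index
        index[record.decode("ascii").rstrip()] = position
        position += 1


data_file = make_file(["Brown", "Lee", "Schmidt", "Wiley"])
index = build_index(data_file)
print(sequential_find(data_file, "Schmidt"))   # scans records from the start
print(direct_read(data_file, 2))               # one seek, one read
print(direct_read(data_file, index["Wiley"]))  # index lookup, then direct read
```

Sequential search cost grows with the position of the record, while direct access costs one seek whatever the position; the index trades extra storage for the ability to find a record by key.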

Types of Mass Storage

Magnetic:
  Disk.
    Floppy.
    Hard.
    Disk array.
    Demountable.
  Tape.
    Cartridge.
    Cassettes.
  Card.
“Flash” memories.

Disks

Organization:
  Tracks.
  Sectors.
  Cylinders.
  Access time.
Floppy Disk Organization:
  Reserved area.
  File allocation table (FAT).
  Root directory.
  Files area.
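
As a rough illustration of how tracks, sectors and cylinders address data, the sketch below maps a (cylinder, head, sector) address to a linear sector number. The geometry is that of a 1.44 MB 3.5-inch floppy disk and is an assumption chosen for the example; the function name is hypothetical.

```python
# Geometry of a 1.44 MB 3.5-inch floppy disk (assumed for illustration).
CYLINDERS = 80
HEADS = 2               # one track per side on each cylinder
SECTORS_PER_TRACK = 18
BYTES_PER_SECTOR = 512


def chs_to_sector_number(cylinder, head, sector):
    """Map (cylinder, head, sector) to a zero-based linear sector number.

    Sectors are conventionally numbered from 1 within a track;
    cylinders and heads are numbered from 0.
    """
    if not (0 <= cylinder < CYLINDERS and 0 <= head < HEADS
            and 1 <= sector <= SECTORS_PER_TRACK):
        raise ValueError("address outside the disk geometry")
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)


total_sectors = CYLINDERS * HEADS * SECTORS_PER_TRACK
capacity_bytes = total_sectors * BYTES_PER_SECTOR
print(total_sectors, capacity_bytes)
print(chs_to_sector_number(0, 0, 1), chs_to_sector_number(79, 1, 18))
```

All tracks of one cylinder are numbered before moving to the next cylinder, so consecutive sectors can be read without moving the heads.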

Mass Storage (Cont’d.)

Optical:
  CD-ROM.
  WORM.
  Erasable.
  DVD.

Storage

Hierarchical Memory Storage (HMS)
Storage Area Network Systems (SANS)
RAID

Data Base Approach

Basic concepts:
  Data is an asset. It represents an investment.
  Managing data: Managing facts.

Meanings

Entity: A specific object, either real or abstract. Has properties (attributes).

Entity type (class): A collection of entities with similar properties.

Instance: An individual occurrence of an entity. It represents facts.

Entity Example: Employee

Attribute                 Fact
Employee Name             Mary Alice Smith
Employee Number           123709
Social Security Number    123-45-6789
Employee Address          1201 South Avenue, Silver Spring, MD 20867
Birth Date                9-15-35
Hire Date                 8-12-91
Employee Telephone        301-254-7623
Job Classification        Librarian 1
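
The distinction between an entity type and an instance can be sketched in Python. The class below is one possible rendering of the Employee example; the field names and the choice of a dataclass are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Employee:
    """Entity type: every Employee has the same set of attributes."""
    name: str
    number: int
    social_security_number: str
    address: str
    birth_date: str
    hire_date: str
    telephone: str
    job_classification: str


# Instance: one occurrence of the entity type, carrying the facts from the slide.
mary = Employee(
    name="Mary Alice Smith",
    number=123709,
    social_security_number="123-45-6789",
    address="1201 South Avenue, Silver Spring, MD 20867",
    birth_date="9-15-35",
    hire_date="8-12-91",
    telephone="301-254-7623",
    job_classification="Librarian 1",
)

attribute_names = [f.name for f in fields(Employee)]
print(attribute_names)
print(mary.job_classification)
```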

Databases

Reference: Figure 1.1. Data as an integral part of the decision-making process.

Data is collected and stored in automated form: a database.

Basic Terminology

Byte: Smallest addressable group of bits.
Data item: Smallest unit of named data.
Data aggregate: Collection of data items.
  Vectors and repeating groups.
Record: A named collection of data items.
File: Named collection of all occurrences of a type of record.
Database: A collection of the occurrences of multiple record types, containing the relationships between records.

Two Descriptions of Databases:

1. Logical: Treats representations of entities, attributes and relationships.
2. Physical: Deals with physical placement of data on storage devices or in memory.

Data

Views: Users, Programmers, & Computer Operators:
  Users view data as high level entities.
  Programmers & operators view data as fields, records, files, index structures, addresses, etc.
Schemas: Models of logical and physical data structures employed.
  Formalized according to a set of rules.
  Uses diagrams and text to describe.

Schema Types

1. External. Views of data as seen by individual users.
2. Conceptual. A neutral integrated view of data between user and internal data base.
3. Internal. Implementation views of data.
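
One loose way to picture the three schema types is with Python's built-in sqlite3 module: a table definition plays the conceptual role, an index is an implementation (internal) choice, and a view is what one user sees (external). This mapping is a common analogy rather than a formal definition, and the table, index and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the integrated, neutral definition of the data.
    CREATE TABLE employee (
        employee_number INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        job             TEXT NOT NULL,
        birth_date      TEXT
    );
    -- Internal schema: an implementation choice (here, an index) that
    -- changes how data is found, not what it means.
    CREATE INDEX employee_by_name ON employee (name);
    -- External schema: one user's view, exposing only what that user needs.
    CREATE VIEW staff_directory AS
        SELECT name, job FROM employee;
""")
conn.execute(
    "INSERT INTO employee VALUES (123709, 'Mary Alice Smith', 'Librarian 1', '9-15-35')"
)
directory_rows = conn.execute("SELECT * FROM staff_directory").fetchall()
print(directory_rows)
```

Dropping or adding the index would not change what the view returns, which is the separation of logical and physical concerns in miniature.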

DBMS Functions

Data definitions and relationships
Record data definitions and structures
Organize and store data for effective and efficient access
Provide a standard and meaningful user interface
Protect data resources
Separate logical and physical concerns
Provide for data sharing

Costs of Database Approach

DBMS technology costs. Database operation costs. Data and logic conversion costs. Planning costs. Risk costs.

Positive Factors

Databases benefit an organization with:
  Need for interactive inquiry and update capability.
  An expected growth in data volumes.
  An expected expansion in the range of decision making.
  Commitment to data resource management.

Negative Factors

A database environment is not needed if there are:
  Relatively small data volumes.
  Little expected growth.
  Little or no redundancy in the applications’ data requirements.
  A relatively fixed data-processing environment.
  Limited need for interactive processes.

Data Modeling

Role of models:
  Aid in communicating one’s understanding of the meaning of the data (its semantics).
  Aid in discovering the organization’s data semantics, whether recorded or not.
Basic data modeling concepts.

Entity-Relationship Model

Entity:
  Basic concept of the data model.
  Can be a class of real or abstract objects.
Entities representing “real” objects:
  Employee, Customer, Department, Teacher, Part, Student, Building, Product, Vendor.
Entities representing “abstract” objects:
  Degree Program, Salary History, Investment, Work Experience, Course Offering, Sale.

Entity-Relationship Model (Cont’d).

Entities have properties: Attributes.
  One or more attributes are called “keys.”
    Uniquely identify entity instances.
  Can have “nulls.”
Entities have noun or noun phrase names.
  E.g., Student or Unclassified Student.
Entities have relationships (associations).
  Two types:
    1. Connection.
    2. Category.

Relationships

Connection:
  Associates different entities: e.g., Student and Class Section, Professor and Class Section.
  Has a verb name: “is enrolled in”, “teaches.”
  Has a “cardinality”: How many instances of one entity are related to how many of the other entity?
    One to one.
    One to many.
    One to exactly n.
    One to between n and m.
    Many to many.
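
Cardinality can be illustrated by inspecting sample relationship instances. The Python sketch below classifies a list of (left, right) pairs; the function name and the sample professors, students and class sections are hypothetical. Observed data can only suggest a cardinality, since the real rule comes from how the organization works.

```python
from collections import defaultdict


def classify_cardinality(pairs):
    """Classify observed (left, right) relationship instances.

    Returns "one to one", "one to many", "many to one" or "many to many".
    """
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    # The left side is "many" if some right-hand instance has several lefts.
    left_side = "many" if any(len(s) > 1 for s in lefts_per_right.values()) else "one"
    # The right side is "many" if some left-hand instance has several rights.
    right_side = "many" if any(len(s) > 1 for s in rights_per_left.values()) else "one"
    return f"{left_side} to {right_side}"


# Professor "teaches" Class Section: each section has exactly one professor.
teaches = [("Wiley", "MIS 101-A"), ("Wiley", "MIS 101-B"), ("Lee", "EE 200-A")]
# Student "is enrolled in" Class Section: both sides can repeat.
enrolled = [("Ann", "MIS 101-A"), ("Ann", "EE 200-A"), ("Bob", "MIS 101-A")]

print(classify_cardinality(teaches))
print(classify_cardinality(enrolled))
```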

Relationships (Cont’d)

Category relationship:
  Associates similar entities: e.g., Student: Graduate, Undergraduate, Advanced Special Student, Unclassified.
  Is named: “Must Be A” or “Can Be A”.
This model is the “Entity-Relationship Model.” Introduced by Peter Chen in 1975.

Relational Model

Proposed by E.F. Codd, IBM Research, in 1970.
Based on set theory and relational calculus.
Objectives: Simplicity, data independence, rigorous treatment of derivability, redundancy and consistency.
Simpler than other data-modeling techniques.
Views data as if formatted into tables.
  Table’s columns represent properties.
  Table’s rows represent values of these properties.
  Connections between tables are formed by columns with the same name or comparable values.

Relational Model (Cont’d).

Data Independence:
  Users need not know anything about the database’s physical characteristics.
  Viewing data logically as tables does not mean they are physically stored as tables.
A relation is a named table with columns and rows.
  Degree of a relation is the number of columns.
  A column is an attribute and has a name. Called a field in Access.
A relation can represent an entity.
  When it does, each row represents an entity instance.
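
A relation and its degree can be sketched with plain Python values. The variable names are assumptions; the sample rows reuse three teachers from the Teacher table later in the session.

```python
# A relation modeled as a named table: column names plus a set of rows (tuples).
teacher_columns = ("name", "faculty_id", "rank")
teacher_rows = {
    ("Wiley", 62186, "Ass't."),
    ("Schmidt", 13462, "Full"),
    ("Lee", 36789, "Full"),
}

degree = len(teacher_columns)   # number of columns (attributes)
row_count = len(teacher_rows)   # number of rows (entity instances)

# Every row must supply exactly one value per column.
rows_are_well_formed = all(len(row) == degree for row in teacher_rows)
print(degree, row_count, rows_are_well_formed)
```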

Relational Model (Cont’d).

Tables can be related to one another.
These relationships occur through linking fields in each table.
The field (or fields) in a table that uniquely identifies each record in the table is called the key field.
When a “one to many” relationship exists between tables, the field that uniquely identifies each record on the “one” side of the relationship is called the primary key. The corresponding field in the table on the “many” side of the relationship is called the foreign key.
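
A minimal sketch of primary and foreign keys, using Python's built-in sqlite3 module rather than Access. The department and teacher tables and their column names are assumptions made for the example; SQLite checks foreign keys only when the pragma shown is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (          -- the "one" side
        dept_name    TEXT PRIMARY KEY,
        college_name TEXT NOT NULL
    );
    CREATE TABLE teacher (             -- the "many" side
        faculty_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        dept_name  TEXT NOT NULL REFERENCES department (dept_name)
    );
""")
conn.execute("INSERT INTO department VALUES ('History', 'Humanities')")
conn.execute("INSERT INTO teacher VALUES (13462, 'Schmidt', 'History')")
conn.execute("INSERT INTO teacher VALUES (56345, 'Brown', 'History')")

try:
    # No department named 'Astronomy' exists, so this foreign key is rejected.
    conn.execute("INSERT INTO teacher VALUES (99999, 'Nobody', 'Astronomy')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

teacher_count = conn.execute("SELECT COUNT(*) FROM teacher").fetchone()[0]
print(orphan_rejected, teacher_count)
```

One History row on the "one" side is referenced by two teacher rows on the "many" side, and the database refuses a teacher whose department does not exist.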

Relational Operators

There are three basic operations:
  Project (Extract): Selects specific columns and creates a new table.
  Select (Restrict): Selects specific rows (“tuples”) and creates a new table.
  Join: Combines some or all records from multiple tables to create a new table. This provides the relational capability in the database.
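
The three operations can be sketched in a few lines of Python over tables held as lists of dictionaries. This is an illustration of the ideas only, not how a DBMS implements them; the function names and the small sample tables are assumptions.

```python
def project(table, columns):
    """Keep only the named columns; duplicate rows collapse into one."""
    result = []
    for row in table:
        reduced = {c: row[c] for c in columns}
        if reduced not in result:
            result.append(reduced)
    return result


def select(table, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in table if predicate(row)]


def join(left, right, column):
    """Equi-join: pair up rows whose values in `column` are equal."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


teachers = [
    {"name": "Wiley", "rank": "Ass't.", "dept": "MIS"},
    {"name": "Schmidt", "rank": "Full", "dept": "History"},
    {"name": "Brown", "rank": "Ass't.", "dept": "History"},
]
colleges = [
    {"dept": "History", "college": "Humanities"},
    {"dept": "MIS", "college": "Business"},
]

print(project(teachers, ["dept"]))
print(select(teachers, lambda row: row["rank"] == "Full"))
print(join(teachers, colleges, "dept"))
```

Each function returns a new table and leaves its inputs unchanged, so the operations can be chained, for example a project applied to the result of a join.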

Why Use Join?

Advantages:
  Minimizes data entry effort.
  Helps achieve consistency.
  Can save a lot of storage space.
Disadvantage:
  Joins are expensive to compute, both in time and space.

Teacher Table

Teacher Name   Faculty ID No.   Rank     Dept. Name   Building
Wiley          62186            Ass’t.   MIS          Hoyle
Schmidt        13462            Full     History      Bascomb
Baskin         42136            Assoc.   Geology      Phy. Sci.
Lee            36789            Full     Elec. En.    Eng’g.
Brown          56345            Ass’t.   History      Bascomb

College Table

Department Name   College Name
History           Humanities
MIS               Business
Geol.             Natural Sciences
Elec. En.         Engineering

Example: Join Teacher on College

Teacher Name   Faculty ID No.   Rank     Dept. Name   Building   College Name
Wiley          62186            Ass’t.   MIS          Hoyle      Business
Schmidt        13462            Full     History      Bascomb    Humanities
Etc.           Etc.             Etc.     Etc.         Etc.       Etc.
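
The join of the Teacher and College tables can be reproduced with Python's built-in sqlite3 module. This is a sketch under two assumptions: the column names are invented for the example, and the geology department is spelled "Geology" in both tables so that the join values match exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teacher (name TEXT, faculty_id INTEGER, teacher_rank TEXT,
                          dept_name TEXT, building TEXT);
    CREATE TABLE college (dept_name TEXT, college_name TEXT);
""")
conn.executemany("INSERT INTO teacher VALUES (?, ?, ?, ?, ?)", [
    ("Wiley", 62186, "Ass't.", "MIS", "Hoyle"),
    ("Schmidt", 13462, "Full", "History", "Bascomb"),
    ("Baskin", 42136, "Assoc.", "Geology", "Phy. Sci."),
    ("Lee", 36789, "Full", "Elec. En.", "Eng'g."),
    ("Brown", 56345, "Ass't.", "History", "Bascomb"),
])
conn.executemany("INSERT INTO college VALUES (?, ?)", [
    ("History", "Humanities"),
    ("MIS", "Business"),
    ("Geology", "Natural Sciences"),  # assumed spelling, to match teacher.dept_name
    ("Elec. En.", "Engineering"),
])

# Join: match each teacher row to the college row with the same department name.
joined = conn.execute("""
    SELECT t.name, t.faculty_id, t.teacher_rank, t.dept_name, t.building,
           c.college_name
    FROM teacher AS t
    JOIN college AS c ON c.dept_name = t.dept_name
    ORDER BY t.name
""").fetchall()
for row in joined:
    print(row)
```

"Humanities" is stored once in the College table yet appears on both History rows of the result, which is the storage and consistency advantage of keeping the tables separate.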

Example: Project

Department Name   College Name
MIS               Business
History           Humanities
Geol.             Natural Sciences
Elec. En.         Engineering

Example: Select

Teacher Name   Etc.   Rank     Etc.   Etc.   Etc.
Schmidt               Full
Baskin                Assoc.
Lee                   Full
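
Select and project correspond to the WHERE clause and the column list of an SQL query. The sketch below uses Python's built-in sqlite3 module; the column names are assumptions, and the selection condition (rank is Full or Assoc.) is inferred from the rows shown above rather than stated in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher (name TEXT, teacher_rank TEXT, dept_name TEXT)")
conn.executemany("INSERT INTO teacher VALUES (?, ?, ?)", [
    ("Wiley", "Ass't.", "MIS"),
    ("Schmidt", "Full", "History"),
    ("Baskin", "Assoc.", "Geology"),
    ("Lee", "Full", "Elec. En."),
    ("Brown", "Ass't.", "History"),
])

# Select (restrict): keep whole rows that satisfy a condition.
senior = conn.execute(
    "SELECT * FROM teacher WHERE teacher_rank IN ('Full', 'Assoc.') ORDER BY name"
).fetchall()

# Project: keep chosen columns; DISTINCT drops the duplicate rows that result.
departments = conn.execute(
    "SELECT DISTINCT dept_name FROM teacher ORDER BY dept_name"
).fetchall()

print(senior)
print(departments)
```

Five teachers yield only four department rows, because projecting away the other columns leaves two identical History rows that collapse into one.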

The Design Process

First design the tables.
  Who are the users?
  What questions will be asked?
  What data are needed to answer those questions?
  Build an entity-relationship model.
  Develop the tables.
Then design the queries.
  Using join, select and project.

Designing Tables

First build an entity-relationship model.

Make one table for each entity. Make one column for each attribute. Add a primary key to uniquely identify rows.

Add foreign keys to represent relationships.
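
The steps above can be sketched as a small Python routine that turns a description of an entity-relationship model into table definitions, one table per entity. The MODEL dictionary, its layout and the helper name are hypothetical; the generated SQL is run in an in-memory SQLite database only to confirm that it is valid.

```python
import sqlite3

# Hypothetical entity-relationship model: entity -> attributes, key, references.
MODEL = {
    "college": {
        "attributes": {"college_name": "TEXT"},
        "primary_key": "college_name",
        "foreign_keys": {},
    },
    "department": {
        "attributes": {"dept_name": "TEXT", "college_name": "TEXT"},
        "primary_key": "dept_name",
        "foreign_keys": {"college_name": "college"},  # column -> referenced entity
    },
}


def create_table_sql(entity, spec, model):
    """One table per entity, one column per attribute, plus key clauses."""
    parts = [f"{col} {col_type}" for col, col_type in spec["attributes"].items()]
    parts.append(f"PRIMARY KEY ({spec['primary_key']})")
    for col, target in spec["foreign_keys"].items():
        target_key = model[target]["primary_key"]
        parts.append(f"FOREIGN KEY ({col}) REFERENCES {target} ({target_key})")
    return f"CREATE TABLE {entity} (" + ", ".join(parts) + ")"


conn = sqlite3.connect(":memory:")
statements = [create_table_sql(name, spec, MODEL) for name, spec in MODEL.items()]
for statement in statements:
    print(statement)
    conn.execute(statement)

table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(table_names)
```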