[email protected] Relational Database Management System

Relational Database Management System

DESCRIPTION

Presentation is about the Basic RDBMS Concepts

Page 1: Relational Database Management System

Relational Database Management System

Page 2: Relational Database Management System

Surender Singh

Sr. Programmer

[email protected]

Page 3: Relational Database Management System

Relational Database Management System

DATABASE

Information

DBMS/RDBMS

DATA

Page 4: Relational Database Management System

File Processing System

Page 5: Relational Database Management System

File Processing System

Database(Information in Files Format)

Application Programs

(Programs written in C, Pascal, etc.)

File System

(Data Structure, File Handling)

Page 6: Relational Database Management System

File System

Database

Page 7: Relational Database Management System

Disadvantages of FPS

• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation
• Integrity problems
• Atomicity problems
• Concurrent-access anomalies
• Security problems

Page 8: Relational Database Management System

Data Redundancy and Inconsistency

Customer Information

Name  Address
ABC   Bhiwani
DEF   Delhi

Saving Account

AccNo  Name  Address
1002   ABC   Bhiwani
1005   DEF   Jaipur
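The inconsistency on this slide (customer DEF lives in Delhi in one file and in Jaipur in the other) can be detected mechanically. The following is a minimal sketch, assuming the two files are loaded into Python lists of dictionaries; the variable and function names are hypothetical and only the data values come from the slide.

```python
# Hypothetical in-memory copies of the two files shown on the slide.
customer_info = [
    {"name": "ABC", "address": "Bhiwani"},
    {"name": "DEF", "address": "Delhi"},
]
saving_account = [
    {"acc_no": 1002, "name": "ABC", "address": "Bhiwani"},
    {"acc_no": 1005, "name": "DEF", "address": "Jaipur"},
]


def find_inconsistent_addresses(customers, accounts):
    """Return (name, customer-file address, account-file address) per mismatch."""
    address_by_name = {c["name"]: c["address"] for c in customers}
    mismatches = []
    for acc in accounts:
        known = address_by_name.get(acc["name"])
        # The same fact is stored twice, so the two copies can disagree.
        if known is not None and known != acc["address"]:
            mismatches.append((acc["name"], known, acc["address"]))
    return mismatches


print(find_inconsistent_addresses(customer_info, saving_account))
```

Storing the address once and referring to it from the account record removes the need for such a check, which is the motivation for the DBMS approach discussed next.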

Page 9: Relational Database Management System

Difficulty in accessing data

Database (Information Storage in Files Format)

Application Programs (Programs written in C, Pascal, etc.)

File System (Data Structure, File Handling)

Requirement

Manager

Page 10: Relational Database Management System

Data Isolation and Integrity Problems

Program in C:

#include <stdio.h>
main()
{
-----
}

Program in COBOL:

01 Reserve-rec.
   03 saving
      05 accno PIC A(2)
--------

New Document

Page 11: Relational Database Management System

Atomicity Problems

USER USER

BankData Transmission

Page 12: Relational Database Management System

Concurrent-access anomalies

Page 13: Relational Database Management System

Security Problems

Employee Information

Page 14: Relational Database Management System

Database: The Peace of Mind

Page 15: Relational Database Management System

Requirements of a DBMS

• A mechanism for specifying data and its dependencies (integrity constraints) in an integrated fashion.
• Prevention of redundancy and inconsistency.
• Provision of adequate security and access rights.
• A mechanism for concurrency control.
• A mechanism for recovery from failure.

Additionally, any DBMS must provide:

• Schemes for specifying processing rules or application programs.
• Efficient techniques for storing and retrieving data on secondary storage (disk).

Page 16: Relational Database Management System

A DBMS has two major components:

1. The database itself. Its structure is called the database schema; an instance is a state of the database with the actual data loaded.

2. A set of software tools/programs which access, update and process the database, called the query and update mechanism.

DBMS

FileManager

SecondaryStorage

Page 17: Relational Database Management System

View of DATA

View Level (External Level): View 1, View 2, ..., View n

Logical Level (Conceptual View)

Physical Level (Internal View)

Page 18: Relational Database Management System

Data Independence

The ability to modify a schema definition in one level without affecting a schema definition in the next higher level is called data independence.

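• Physical data independence: the physical schema can change without changing the logical schema.
• Logical data independence: the logical schema can change without changing the views or the application programs built on them.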

Create table emp(empno number(10),

--------------);

Page 19: Relational Database Management System

Data Models

A data model is a mechanism for describing data, their interrelationships and their constraints.

Record-based models:
• Relational Model
• Network Model
• Hierarchical Model

Physical data models.

Object-based conceptual models:
• Entity-Relationship model

Page 20: Relational Database Management System

The E-R Model

Entities : An entity is a distinct, clearly identifiable object of the database, e.g. Book.

Relationships : A relationship is a mapping between entity sets.

Attribute : Each Entity is characterized by a set of attributes e.g. Acc.No.

Entity Set : Set of all entities having attributes of the same type.

[E-R diagram: entity set BOOK (Acc_No, Title, Author, YearofPub) is linked to entity set USERS (Card_No, Name, Address) by the relationship Borrowed_By (Acc_No, Card_No, DOI).]

Page 21: Relational Database Management System

The Relational Model

AccNo Title Author YearofPub

The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple attributes (columns) and holds tuples (rows) of the same kind.

Tuple

Attribute

Book Table/Relation

Page 22: Relational Database Management System

Network Model

Data in the network model are represented by collections of records, and relationships among data are represented by links, which can be viewed as pointers.

User

Card_No Name Address Link

Pointer Next

Acc_No Author ----- LinkBook

Page 23: Relational Database Management System

Hierarchical Model

This is a special kind of network model in which the relationships form a tree-like structure.

Hospital

Wards Units

Cardiology SkinPatient Doctors Nurses

Page 24: Relational Database Management System

Physical Data Models

Physical data models are used to describe data at the lowest level. In contrast to logical data models, there are few physical data models in use. Two of the widely known ones are the unifying model and the frame-memory model.

Page 25: Relational Database Management System

Database Languages

Data-Definition:

Create Table Test(
Title Varchar2(20),
--------
);

Data-Manipulation:

• Update
• Insert
• Delete
• Query

Data-Control:

GRANT Connect, Resource TO xUser

Page 26: Relational Database Management System

Database Management System Structure

[Figure: overall structure of a database management system.
Users: naïve users (tellers, agents, etc.), application programmers, sophisticated users, database administrators.
Interfaces: application interfaces, application programs, query, database scheme.
Query processor: embedded DML precompiler, DML compiler, DDL interpreter, query evaluation engine, application programs object code.
Storage manager: transaction manager, buffer manager, file manager.
Disk storage: data files, indices, statistical data, data dictionary.]

Page 28: Relational Database Management System

Oracle Storage System Structure

Page 29: Relational Database Management System

Database Administrator

Roles of DBA

• Schema definition
• Storage structure and access-method definition
• Schema and physical-organization modification
• Granting of authorization for data access
• Integrity-constraint specification

Page 30: Relational Database Management System

Terms

• Simple and Composite Attributes
• Single-valued and Multivalued Attributes
• Null Attributes
• Derived Attributes
• Existence Dependencies
• Weak Entity Set and Strong Entity Set

Page 31: Relational Database Management System

Weak Entity Set

Page 32: Relational Database Management System

Attributes

Page 33: Relational Database Management System

Keys

Keys

Candidate Key Secondary Key Foreign Key

Primary Key Alternate Key

Composite Key

Page 34: Relational Database Management System

Roll_No Name Branch City

01 Deepak Computers Bhiwani

02 Mukesh Electronics Rohtak

03 Teena Mechanical Bhiwani

04 Deepti Chemical Rohtak

05 Monika Civil Delhi

Candidate Keys

Primary key

Alternate Keys
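A candidate key must identify every row uniquely and must be minimal. The sketch below, which is an illustration and not part of the original slides, searches the sample table for minimal column sets with no duplicate values. The function names are hypothetical. Note the caveat: a key is a property of every legal instance of the relation, so uniqueness in one sample is necessary but not sufficient. Here Branch happens to be unique in the five sample rows even though it would not normally be chosen as a key.

```python
from itertools import combinations

# The student table from the slide.
students = [
    {"Roll_No": "01", "Name": "Deepak", "Branch": "Computers", "City": "Bhiwani"},
    {"Roll_No": "02", "Name": "Mukesh", "Branch": "Electronics", "City": "Rohtak"},
    {"Roll_No": "03", "Name": "Teena", "Branch": "Mechanical", "City": "Bhiwani"},
    {"Roll_No": "04", "Name": "Deepti", "Branch": "Chemical", "City": "Rohtak"},
    {"Roll_No": "05", "Name": "Monika", "Branch": "Civil", "City": "Delhi"},
]


def is_unique(rows, attrs):
    """True if no two rows agree on all the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def minimal_unique_sets(rows):
    """Minimal attribute sets that are unique in this sample (key candidates)."""
    attrs = list(rows[0].keys())
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # Skip supersets of an already-found set: they are not minimal.
            if any(set(f) <= set(combo) for f in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


print(minimal_unique_sets(students))
```

Whichever candidate is chosen as the primary key, the remaining candidates become alternate keys.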

Page 35: Relational Database Management System

Roll_No Name Branch City

01 Deepak Computers Bhiwani

02 Mukesh Electronics Rohtak

03 Teena Computers Bhiwani

04 Deepak Electronics Rohtak

05 Monika Computers Delhi

Primary Key

Secondary Key

Page 36: Relational Database Management System

Name Branch City

Deepak Computers Bhiwani

Mukesh Electronics Rohtak

Teena Computers Bhiwani

Deepak Electronics Rohtak

Monika Computers Delhi

Composite Primary Key

Page 37: Relational Database Management System

Part P_Name Colour Quantity

P1 Nut Red 200

P2 Bolt Green 250

P3 Screw Blue 300

P#

Page 38: Relational Database Management System

Supplier S_Name City Quantity

S1 John Delhi 200

S2 Smith Kolkata 250

S3 James Delhi 300

S4 David Chennai 400

S5 John Chennai 300

S#

Page 39: Relational Database Management System

P# S# Quantity

P1 S1 200

P2 S1 300

P3 S1 400

P1 S2 250

P2 S3 250

P3 S4 200

P2 S4 300

P3 S5 400

SP#

Page 40: Relational Database Management System

Mapping Cardinalities

Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via a relationship set.

For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:

[Figures: One to One and One to Many mappings between A and B]

Page 41: Relational Database Management System

[Figures: Many to One and Many to Many mappings between A and B]

Page 42: Relational Database Management System

Company

Vehicle

Owns Leased

More on E-R Diagrams

Multiple Relationship between Same entity set

Staff Reports toManager

Subordinate

Circular Relationship

Page 43: Relational Database Management System

Instructors Students

Courses

Teaches

Ternary E-R Diagram

Book UserBorrowed_ByN 1

Constraints

Page 44: Relational Database Management System

E-R Diagram Components

Entity Sets

Attributes

Relationship Sets

Connectors/Constraints

Multivalued Attributes

Derived Attributes

Total Participation of an entity in a relationship set

Page 45: Relational Database Management System

Existence Dependencies

Page 46: Relational Database Management System

Generalization and Specialization

Page 47: Relational Database Management System

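Generalization and Specialization

The abstraction mechanisms

[IS_A hierarchy: Employee (Emp_No, Name, Date_of_hire) is specialized into Full_time Employee (Salary) and Part_time Employee (Type). Full_time Employee is specialized into Faculty and Staff; Part_time Employee into Teaching and Casual. The remaining attribute labels in the figure are Interest, Hour_Rate, Stipend and Degree. Read bottom-up the hierarchy is generalization; read top-down it is specialization.]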

Page 48: Relational Database Management System

Aggregation

The Process of compiling information on an object

Teacher

Teaches

Course

Book

Uses

Teacher Teaches Course

Uses

Book

Teacher-Teaches

Page 49: Relational Database Management System

Represent ER model using tables

Page 50: Relational Database Management System

Query Languages

A query language is a language in which a user requests information from a database. Query languages are typically higher-level than programming languages. A query language may be one of:

Procedural, where the user instructs the system to perform a sequence of operations on the database to compute the desired information.

Nonprocedural, where the user specifies the information desired without giving a procedure for obtaining it.

A complete query language also contains facilities to insert and delete tuples as well as to modify parts of existing tuples.

Page 51: Relational Database Management System

The Relational Algebra

The Borrow and Branch relations

The relational algebra is a procedural query language.

Page 52: Relational Database Management System

Fundamental Operations

• select (unary)
• project (unary)
• rename (unary)
• cartesian product (binary)
• union (binary)
• set-difference (binary)

Several other operations are defined in terms of the fundamental operations:

• set-intersection
• natural join
• division
• assignment

Each operation produces a new relation as its result.

Page 53: Relational Database Management System

Formal Definition of Relational Algebra

Page 54: Relational Database Management System

The Select Operation

Page 55: Relational Database Management System

The Project Operation

Page 56: Relational Database Management System

The Cartesian Product Operation

Page 57: Relational Database Management System

Output of Cartesian Product

Relation A

A
1
2
3

Relation B

B
X
Y

A × B

A  B
1  X
1  Y
2  X
2  Y
3  X
3  Y
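The table above can be reproduced with a few lines of code. This is a minimal sketch that models each single-column relation as a Python list; the names `relation_a`, `relation_b` and `cartesian_product` are illustrative and not from the slides.

```python
from itertools import product

relation_a = [1, 2, 3]      # values of column A
relation_b = ["X", "Y"]     # values of column B


def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s."""
    return [(a, b) for a, b in product(r, s)]


for row in cartesian_product(relation_a, relation_b):
    print(row)
```

The result always has |A| × |B| tuples, here 3 × 2 = 6.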

Page 58: Relational Database Management System

The Rename Operation

Page 59: Relational Database Management System

The Union Operation

Page 60: Relational Database Management System

The Set Difference Operation

Page 61: Relational Database Management System

Additional Operations

The Set Intersection Operation

Page 62: Relational Database Management System

The Natural Join Operation

Page 63: Relational Database Management System

The Division Operation

Page 64: Relational Database Management System

Example of Division Operation

Relation R

A  B
P  A
Q  A
P  B
Q  T
M  A
Q  B

Relation S

B
A
B

R ÷ S

A
P
Q
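Division returns the A-values of R that are paired with every B-value in S. The sketch below is one simple way to compute it, assuming R is a set of (A, B) pairs and S a set of B-values; the function name `divide` is hypothetical.

```python
# Relation R as (A, B) pairs and relation S as a set of B-values,
# taken from the example slide.
r = {("P", "A"), ("Q", "A"), ("P", "B"), ("Q", "T"), ("M", "A"), ("Q", "B")}
s = {"A", "B"}


def divide(r, s):
    """Return the A-values that appear in r together with every value of s."""
    candidates = {a for a, _ in r}
    return {a for a in candidates if all((a, b) in r for b in s)}


print(sorted(divide(r, s)))
```

M is excluded because it is paired only with A, while P and Q are each paired with both A and B. The extra pair (Q, T) does not affect the result.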

Page 65: Relational Database Management System

The Assignment Operation

Page 66: Relational Database Management System

Relational Calculus

Relational calculus is a nonprocedural query language.

Tuple Relational Calculus: uses tuple variables, which take an entire tuple as their value.

Domain Relational Calculus: uses domain variables, which take values from the domain of an attribute.

Page 67: Relational Database Management System

Tuple Relational Calculus

Page 68: Relational Database Management System

Example Queries

Page 69: Relational Database Management System

Some More Examples

Page 70: Relational Database Management System

Domain Relational Calculus

Page 71: Relational Database Management System

SQL

Page 72: Relational Database Management System

Integrity Constraints

Types of Constraints

• Domain Constraints
• Referential Integrity Constraints
• Functional Dependencies

Integrity and consistency are of primary concern in any database design. At any instant a database must be correct according to a set of rules. The rules are checked during every database operation:

• Insertion
• Deletion
• Update
• Recovery from failure
• Concurrent operations

Page 73: Relational Database Management System

Domain Constraints

Includes

• Type
• Width
• Null or Not Null
• Checks/Conditions

Specified at design time; checked at the time of insertion, deletion or modification.

e.g.
Bname char(20)
Amount number(7,2)
DOL date check (date >= 29/09/2004)
City char(10) not null
TotalAmt = amount + interest

Page 74: Relational Database Management System

Referential Integrity

Foreign Key

Referential integrity states that all values of the foreign key of one relation must be present in another relation where the same attribute is declared as the primary key.

Checks during database modification:

• Insert
• Delete
• Update
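The rule can be checked directly: every foreign-key value in the referencing relation must occur as a primary-key value in the referenced relation. The sketch below reuses the part numbers and a few shipment rows from the earlier key examples; the function name and the extra row with part P4 are hypothetical and added only to show a violation.

```python
# Referenced relation: Part, with primary key "P#".
parts = [{"P#": "P1"}, {"P#": "P2"}, {"P#": "P3"}]

# Referencing relation: shipments, where "P#" is a foreign key to Part.
shipments = [
    {"P#": "P1", "S#": "S1", "Quantity": 200},
    {"P#": "P2", "S#": "S1", "Quantity": 300},
    {"P#": "P4", "S#": "S2", "Quantity": 100},  # hypothetical row: P4 is not a part
]


def dangling_foreign_keys(child_rows, fk, parent_rows, pk):
    """Return the child rows whose foreign key has no matching primary key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


print(dangling_foreign_keys(shipments, "P#", parts, "P#"))
```

A DBMS performs the equivalent check on insert and update of the referencing relation, and on delete and update of the referenced one.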

Page 75: Relational Database Management System

Assertions and Triggers

An assertion is a general predicate, expressed in relational algebra, calculus or a language like SQL, which must always hold in a database.

Assert salary-constraint on emp salary >= 1000

A trigger is a statement or a block of statements executed automatically by the system when an event (i.e., insertion, update or deletion) takes place on a table.

Define trigger insert_record
on delete of emp e
(insert into emp_history
values e.empno, e.name, e.deptno)

Page 76: Relational Database Management System

Functional Dependencies

Functional dependencies provide a formal mechanism to express constraints between attributes.

They are a means of identifying how the values of certain attributes are determined by the values of other attributes.

A functional dependency (FD) generalizes the concept of a key.

Book (acc_no, yr_pub, title)

acc_no is the primary key.

Formal representation of the constraints:

acc_no → yr_pub
acc_no → title

Page 77: Relational Database Management System

Formal Notation of FD

In general, if there are two attributes A and B and the FD

A → B

holds, then there can be no two tuples that have the same value for attribute A and different values for attribute B.

If α and β are two sets of attributes, then the FD α → β holds on a relation r(R) if:

1. α, β ⊆ R, i.e. α and β are subsets of R
2. for all tuples t1 and t2 in r, if t1[α] = t2[α] then t1[β] = t2[β]
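Condition 2 translates directly into code. The sketch below checks whether an FD holds on one instance of the Book relation; the book rows are made-up sample data and the function name `fd_holds` is hypothetical. An FD is a constraint on all legal instances, so a check like this can refute an FD but cannot prove it.

```python
# Made-up sample instance of Book(acc_no, yr_pub, title).
books = [
    {"acc_no": 1, "yr_pub": 1999, "title": "Databases"},
    {"acc_no": 2, "yr_pub": 1999, "title": "Networks"},
    {"acc_no": 3, "yr_pub": 2004, "title": "Databases"},
]


def fd_holds(rows, lhs, rhs):
    """True if rows that agree on lhs also agree on rhs (lhs -> rhs)."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[b] for b in rhs)
        # Remember the first rhs seen for this lhs; any later mismatch violates the FD.
        if seen.setdefault(left, right) != right:
            return False
    return True


print(fd_holds(books, ["acc_no"], ["title"]))   # acc_no -> title
print(fd_holds(books, ["yr_pub"], ["title"]))   # yr_pub -> title
```

The second check fails because two books published in 1999 have different titles.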

Page 78: Relational Database Management System

Closure of a Set of Functional Dependencies

Page 79: Relational Database Management System

Armstrong’s Axioms

Page 80: Relational Database Management System

Closure of a Set of F+

Page 81: Relational Database Management System

Closure of Attribute Sets

Page 82: Relational Database Management System

Canonical Cover

To minimize the number of functional dependencies that need to be tested on an update, we may restrict F to a canonical cover Fc.

A canonical cover for F is a set of dependencies Fc such that F logically implies all dependencies in Fc and Fc logically implies all dependencies in F.

A canonical cover Fc of a set of FDs F is a minimal cover of F in the sense that no proper subset of Fc also covers F.

Page 83: Relational Database Management System

Example of Canonical Cover

Consider a relation r(X, Y, Z) with the FDs F:

1. X → YZ
2. Y → Z
3. X → Y
4. XY → Z

Here (4) is redundant because (1) states that X → Y and X → Z hold, so (4) can be derived from (1). Also, (3) is redundant because (1) contains (3). Deleting these two we get

1. X → YZ
2. Y → Z

which is a cover of F. Here again, since X → Y and Y → Z hold, X → Z holds by transitivity, so Z on the right-hand side of (1) is redundant. Deleting it we get the FDs

X → Y
Y → Z

which is a canonical cover of F.
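That the reduced set really covers F can be verified with the attribute-closure algorithm: an FD α → β follows from a set of FDs exactly when β is contained in the closure of α under that set. The sketch below is an illustration under the assumption that attribute names are single letters, so a string such as "YZ" doubles as the attribute set {Y, Z}; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)


def equivalent(f, g):
    """True if f and g imply each other, i.e. each is a cover of the other."""
    return (all(implies(g, l, r) for l, r in f)
            and all(implies(f, l, r) for l, r in g))


original = [("X", "YZ"), ("Y", "Z"), ("X", "Y"), ("XY", "Z")]
canonical = [("X", "Y"), ("Y", "Z")]

print(sorted(closure("X", canonical)))
print(equivalent(original, canonical))
```

Dropping either remaining FD breaks the equivalence, which is what makes the two-FD set minimal.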

Page 84: Relational Database Management System

Relational Database Design

Page 85: Relational Database Management System

Database Decomposition – 1

Representation of Information

Page 86: Relational Database Management System

Database Decomposition – 2

Page 87: Relational Database Management System

Database Decomposition – 3

Page 88: Relational Database Management System

Database Decomposition – 4

Page 89: Relational Database Management System

Lossless-join Decomposition

Page 90: Relational Database Management System

Example of lossy decomposition

S_by

S_name  S_addr  Item  Price
A1      B1      C1    D1
A1      B1      C2    D1
A2      B2      C1    D2
A2      B2      C3    D3
A3      B1      C2    D2

p1

S_name  Item
A1      C1
A1      C2
A2      C1
A2      C3
A3      C2

p2

S_addr  Item  Price
B1      C1    D1
B1      C2    D1
B2      C1    D2
B2      C3    D3
B1      C2    D2

Natural join of p1 and p2

S_name  S_addr  Item  Price
A1      B1      C1    D1
A1      B2      C1    D2
A1      B1      C2    D1
A1      B1      C2    D2
A2      B1      C1    D1
A2      B2      C1    D2
A2      B2      C3    D3
A3      B1      C2    D1
A3      B1      C2    D2

The join has nine tuples against five in S_by. The four extra tuples are spurious, so the original relation cannot be recovered and the decomposition is lossy.
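The loss of information can be reproduced in a few lines. The sketch below projects S_by onto the two schemas, joins the projections on the shared attribute Item, and lists the tuples that were not in the original relation. Relations are modelled as Python sets of tuples; the variable names are illustrative.

```python
# Original relation S_by(S_name, S_addr, Item, Price) from the slide.
s_by = {
    ("A1", "B1", "C1", "D1"),
    ("A1", "B1", "C2", "D1"),
    ("A2", "B2", "C1", "D2"),
    ("A2", "B2", "C3", "D3"),
    ("A3", "B1", "C2", "D2"),
}

# Projections p1(S_name, Item) and p2(S_addr, Item, Price).
p1 = {(name, item) for name, addr, item, price in s_by}
p2 = {(addr, item, price) for name, addr, item, price in s_by}

# Natural join of p1 and p2 on the common attribute Item.
joined = {
    (name, addr, item, price)
    for name, item in p1
    for addr, item2, price in p2
    if item == item2
}

spurious = joined - s_by
print(len(joined), "tuples in the join,", len(spurious), "of them spurious")
for row in sorted(spurious):
    print(row)
```

The join contains every original tuple plus extra ones because Item alone does not determine the other attributes on either side.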

Page 91: Relational Database Management System

Dependency Preservation

Page 92: Relational Database Management System

Normalization

Normalization is the process of removing redundancy using functional dependencies.

To reduce redundancy it is necessary to decompose a relation into a number of smaller relations.

There are several normal forms:

• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)

Page 93: Relational Database Management System

First Normal Form (1NF)

This normal form says that all attributes are simple.

An attribute is said to be simple if it does not contain any subparts. An attribute that contains subparts is called a complex attribute.

Examples of complex attributes:

Name: F_name, L_name
C_addr: City, State, Zip

Page 94: Relational Database Management System

Second Normal Form (2NF)

Consider a relation savings_deposit having the following structure:

savings_deposit (name, addr, acc_no, amt)

with the following FDs:

name → addr
name, acc_no → amt

A relation is said to be in 2NF if it is in 1NF and all non-prime attributes are fully functionally dependent on every candidate key.

Here [name, acc_no] is the candidate key, and addr and amt are the non-prime attributes. Among the non-prime attributes, amt depends on [name, acc_no] whereas addr depends on name only.

Note that, due to the FD name → addr, every tuple with the same name will contain the same address, causing redundancy.

This redundancy arises because a non-prime attribute like addr depends on an attribute that is only part of the candidate key.

Page 95: Relational Database Management System

Solution

We can remove this redundancy by splitting the original relation into the following two relations:

Sav_sch1 (name, addr)
Sav_sch2 (name, acc_no, amt)

Both relations are now in 2NF. In the first relation, name is the primary key and the only non-prime attribute is addr, which depends on name.

In the second relation, the only non-prime attribute, amt, depends on both name and acc_no. Note that this decomposition is also lossless-join and dependency preserving.

As a second example, consider

Courses (Course_no, title, loc, time)

with the FDs

Course_no → title
Course_no, time → loc
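A 2NF violation is an FD whose left side is a proper subset of the candidate key and whose right side contains a non-prime attribute. The sketch below scans a list of FDs for such partial dependencies. It is a simplification that assumes a single candidate key and looks only at the FDs as written, not at their closure; taking [Course_no, time] as the key of Courses is an assumption derived from the two FDs, since the slide does not name the key.

```python
def partial_dependencies(candidate_key, fds):
    """Return FDs where part of the key determines a non-prime attribute."""
    key = set(candidate_key)
    partial = []
    for lhs, rhs in fds:
        non_prime = set(rhs) - key
        # A proper subset of the key on the left means a partial dependency.
        if set(lhs) < key and non_prime:
            partial.append((tuple(lhs), tuple(sorted(non_prime))))
    return partial


savings_fds = [(("name",), ("addr",)), (("name", "acc_no"), ("amt",))]
courses_fds = [(("Course_no",), ("title",)), (("Course_no", "time"), ("loc",))]

print(partial_dependencies(("name", "acc_no"), savings_fds))
print(partial_dependencies(("Course_no", "time"), courses_fds))
```

Each reported FD suggests a relation to split off: (name, addr) in the first case and (Course_no, title) in the second.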

Page 96: Relational Database Management System

Third Normal Form (3NF)

A relation is said to be in 3NF if it is in 2NF and its non-prime attributes are not dependent on each other.

Consider the relation

s_by (s_name, item, price, gift_item)

with the FDs

s_name, item → price
price → gift_item

Here all non-prime attributes are fully functionally dependent on the candidate key [s_name, item], but the non-prime attribute gift_item is also functionally dependent on the non-prime attribute price. This creates redundancy because for every price value there is a fixed gift item.

We therefore impose the additional restriction that no non-prime attribute can be functionally dependent on another non-prime attribute.

Page 97: Relational Database Management System

Solution

We decompose the relation

s_by (s_name, item, price, gift_item)

into

s_by_1 (s_name, item, price)
s_by_2 (price, gift_item)

Now we have a lossless-join and dependency-preserving decomposition.

An alternative yet equivalent definition of 3NF is:

For every FD α → β on R, at least one of the following conditions holds:

• β ⊆ α (trivial dependency)
• α → R (α is a superkey)
• every attribute in β − α is contained in some candidate key (is a prime attribute)

Page 98: Relational Database Management System

Boyce-Codd Normal Form (BCNF)

Page 99: Relational Database Management System

More on BCNF

Page 100: Relational Database Management System

Comparison of BCNF and 3NF

Page 101: Relational Database Management System

Comparison of BCNF and 3NF - 2

Page 102: Relational Database Management System

Normalization using Multivalued Dependencies

Page 103: Relational Database Management System

Multivalued Dependencies -2

Page 104: Relational Database Management System

Rules

Page 105: Relational Database Management System

More Rules

Page 106: Relational Database Management System

Fourth Normal Form (4NF)

Page 107: Relational Database Management System

Example

Page 108: Relational Database Management System

Normalization using Join Dependencies

Let R be a relation schema and R1, R2, …, Rn be a decomposition of R. The join dependency *(R1, R2, …, Rn) is used to restrict the set of legal relations to those for which R1, R2, …, Rn is a lossless-join decomposition of R.

Formally, if R = R1 ∪ R2 ∪ … ∪ Rn, we say that a relation r(R) satisfies the join dependency *(R1, R2, …, Rn) if r = Π_R1(r) ⋈ Π_R2(r) ⋈ … ⋈ Π_Rn(r).

Page 109: Relational Database Management System

Fifth Normal Form (5NF): Project-Join Normal Form

Project-join normal form (PJNF) is defined in a manner similar to BCNF and 4NF, except that join dependencies are used.

A relation schema R is in PJNF with respect to a set D of functional, multivalued and join dependencies if, for all join dependencies in D+ of the form *(R1, R2, …, Rn), where each Ri ⊆ R and R = R1 ∪ R2 ∪ … ∪ Rn, at least one of the following holds:

• *(R1, R2, …, Rn) is a trivial join dependency.
• Every Ri is a superkey for R.

Every schema in PJNF is also in 4NF. Thus, in general, we may not be able to find a dependency-preserving decomposition into PJNF for a given schema.

Page 110: Relational Database Management System

Storage and File Structure Hierarchy of Storage

Page 111: Relational Database Management System

Description

Page 112: Relational Database Management System

Description - 2

Page 113: Relational Database Management System

File Organization

Page 114: Relational Database Management System

Fixed Length Record -1

Page 115: Relational Database Management System

Fixed Length Record -2

Page 116: Relational Database Management System

Variable-length Records

Page 117: Relational Database Management System

Fixed-length representation

Page 118: Relational Database Management System

Organization of Records in files

Page 119: Relational Database Management System

Concurrency Control and Recovery

Page 120: Relational Database Management System

Transactions

Concurrent execution of user programs is essential for good DBMS performance. Because disk accesses are frequent and relatively slow, it is important to keep the CPU busy by working on several user programs concurrently.

A user's program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned with what data is read from or written to the database.

A transaction is the DBMS's abstract view of a user program: a sequence of reads and writes.

A transaction is a unit of program execution that accesses and possibly updates various data items.

A collection of operations that forms a single logical unit of work is called a transaction. A database system must ensure proper execution of transactions despite failures.

To ensure the integrity of the data, the database system must maintain the following properties of transactions: atomicity, consistency, isolation and durability.

Page 121: Relational Database Management System

States of Transactions

Active

Partially Committed

Failed

Aborted

Page 122: Relational Database Management System

Concurrency in a DBMS

❖ Users submit transactions, and can think of each transaction as executing by itself.

❖ Concurrency is achieved by the DBMS, which interleaves actions (reads/writes of DB objects) of various transactions.

❖ Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.

❖ The DBMS will enforce some integrity constraints (ICs), depending on the ICs declared in CREATE TABLE statements.

❖ Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).

❖ Issues: the effect of interleaving transactions, and crashes.

Page 123: Relational Database Management System

Example

Consider two transactions (Xacts):

T1: BEGIN A=A+100, B=B-100 END
T2: BEGIN A=1.06*A, B=1.06*B END

❖ Intuitively, the first transaction is transferring $100 from B’s account to A’s account. The second is crediting both accounts with a 6% interest payment.

❖ There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.

Page 124: Relational Database Management System

Example (Contd.)

Consider a possible interleaving (schedule):

T1: A=A+100,            B=B-100
T2:           A=1.06*A,           B=1.06*B

❖ This is OK. But what about:

T1: A=A+100,                      B=B-100
T2:           A=1.06*A, B=1.06*B

❖ The DBMS's view of the second schedule:

T1: R(A), W(A),                         R(B), W(B)
T2:             R(A), W(A), R(B), W(B)

Page 125: Relational Database Management System

Example (Contd.)

The DBMS must not allow schedules like this!

T1: R(A), W(A),                         R(B), W(B)
T2:             R(A), W(A), R(B), W(B)

[Dependency graph: an edge from T1 to T2 labelled A and an edge from T2 to T1 labelled B.]

❖ Dependency graph: One node per Xact; edge from Ti to Tj if Tj reads or writes an object last written by Ti.

❖ The cycle in the graph reveals the problem. The output of T1 depends on T2, and vice-versa.
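The effect of the bad interleaving can be seen by running the two transactions step by step. The sketch below is an illustration under stated assumptions: both accounts start at 1000, the 6% interest is computed in whole units with integer arithmetic to avoid floating-point noise, and the two interleavings are the ones in which T2 runs between T1's two updates either partly (acceptable) or entirely (not acceptable). All function and variable names are hypothetical.

```python
def t1_credit_a(state):
    state["A"] += 100


def t1_debit_b(state):
    state["B"] -= 100


def t2_interest_a(state):
    state["A"] = state["A"] * 106 // 100


def t2_interest_b(state):
    state["B"] = state["B"] * 106 // 100


def run(schedule, a=1000, b=1000):
    """Apply the steps in order and return the final (A, B)."""
    state = {"A": a, "B": b}
    for step in schedule:
        step(state)
    return state["A"], state["B"]


serial_t1_t2 = [t1_credit_a, t1_debit_b, t2_interest_a, t2_interest_b]
serial_t2_t1 = [t2_interest_a, t2_interest_b, t1_credit_a, t1_debit_b]
ok_interleaving = [t1_credit_a, t2_interest_a, t1_debit_b, t2_interest_b]
bad_interleaving = [t1_credit_a, t2_interest_a, t2_interest_b, t1_debit_b]

serial_results = {run(serial_t1_t2), run(serial_t2_t1)}
print("serial outcomes:", sorted(serial_results))
print("ok interleaving:", run(ok_interleaving))
print("bad interleaving:", run(bad_interleaving))
```

The acceptable interleaving ends in the same state as the serial order T1 then T2. The bad one ends in a state that neither serial order can produce: interest is paid on the 100 units while they are counted in both accounts.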

Page 126: Relational Database Management System

Scheduling Transactions

Equivalent schedules: For any database state, the effect (on the set of objects in the database) of executing the first schedule is identical to the effect of executing the second schedule.

Serializable schedule: A schedule that is equivalent to some serial execution of the transactions.

If the dependency graph of a schedule is acyclic, the schedule is called conflict serializable. Such a schedule is equivalent to a serial schedule.

This is the condition that is typically enforced in a DBMS (although it is not necessary for serializability).

Page 127: Relational Database Management System

Detection of Serializability

One of the techniques of concurrency control is to detect, prior to execution, whether a schedule is valid or not.

The task of understanding a schedule is simplified by considering only the sequence of read and write operations in a transaction.

T1 T2

Read(X)Read(X)Write(X)

Write(X)Read(Y)Write(Y)

Read(Y)Write(Y)

Read-Write sequence of a non-serializable schedule

Page 128: Relational Database Management System

Serializable Concurrency

T1 T2

Read(X)Write(X)

Read(X)Write(X)

Read(Y)Write(Y)

Read(Y)Write(Y)

A serializable concurrent schedule

Generalize the idea of conflict. Consider the four possibilities which can arise between two consecutive instructions T1 and T2 in a schedule (T1 and T2 belong to two different transactions):

1. T1 : Read(X) followed by T2 : Write(X)
2. T1 : Read(X) followed by T2 : Read(X)
3. T1 : Write(X) followed by T2 : Read(X)
4. T1 : Write(X) followed by T2 : Write(X)

T1 and T2 are said to be in conflict if they cannot be swapped without risking loss of consistency. Of the four cases above, all except case 2 are in conflict.
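The conflict rule gives a mechanical test for conflict serializability: draw an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and check the resulting graph for cycles. The sketch below is one way to write that test. The schedule encoding as (transaction, operation, item) triples, the function names and the two sample schedules are assumptions for illustration; the slide's own column layout for its schedules did not survive extraction.

```python
def precedence_edges(schedule):
    """Edges Ti -> Tj for every pair of conflicting operations, Ti first."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        if any(visit(nxt) for nxt in graph[node]):
            return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(visit(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# T1 finishes with each item before T2 touches it.
good = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X"),
        ("T1", "R", "Y"), ("T1", "W", "Y"), ("T2", "R", "Y"), ("T2", "W", "Y")]
# Both transactions read X before either writes it.
bad = [("T1", "R", "X"), ("T2", "R", "X"), ("T2", "W", "X"), ("T1", "W", "X")]

print(is_conflict_serializable(good))
print(is_conflict_serializable(bad))
```

In the second schedule T1's read precedes T2's write and T2's read precedes T1's write, so the graph has edges in both directions and no serial order is equivalent.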

Page 129: Relational Database Management System

Deadlock Condition

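T1:

UPDATE account
SET balance = balance * 0.1
WHERE acc_no = 'FC821'

UPDATE account
SET age = 30
WHERE acc_no = 'FC523'

T2:

UPDATE account
SET balance = balance * 0.1
WHERE acc_no = 'FC523'

UPDATE account
SET age = 38
WHERE acc_no = 'FC821'

If both transactions complete their first update before either starts its second, T1 holds the lock on FC821 and waits for FC523, while T2 holds the lock on FC523 and waits for FC821. Neither can proceed.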

Page 130: Relational Database Management System

Lock-Based Techniques

In this technique the system neither participates in detecting inconsistency nor takes any corrective action.

The DBMS, however, provides the user with a set of operations which, when used properly, can ensure that concurrent execution will not violate consistency.

In this technique, functions are provided for transactions to lock and unlock data items.

In the simplest case a data item X can be locked by a transaction T1 in two modes:

Shared Mode : If T1 locks X in shared mode then, before T1 unlocks X, no other transaction T2 can write into X. But a transaction T2 can read the value of X even while T1 holds X in shared mode.

Exclusive Mode : If T1 locks X in exclusive mode then, before T1 unlocks X, no other transaction T2 can read or write X.

Page 131: Relational Database Management System

Example

T1              T2

Lock-X(P)
Read(P,p)
p = p - 1
Write(P,p)
Unlock(P)
                Lock-S(Q)
                Read(Q,q)
                Unlock(Q)
                Lock-S(P)
                Read(P,p)
                Unlock(P)
                display(p)
                display(p)
Lock-X(Q)
Read(Q,q)
q = q + 1
Write(Q,q)
Unlock(Q)

Page 132: Relational Database Management System

Two-Phase locking

Phase I – Acquiring Phase : During this phase a transaction may lock a data item but not unlock any data item.

Phase II – Releasing Phase : During this phase a transaction may unlock data items locked earlier but no new locks may be acquired.

In two-phase locking, Phase I must always precede Phase II. This ensures that all schedules are automatically conflict serializable.
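Whether a single transaction obeys the two phases can be checked mechanically: once any unlock has been seen, no further lock may appear. The sketch below assumes a hypothetical representation in which a transaction is a list of `(action, item)` pairs; the function name `is_two_phase` is illustrative only.

```python
def is_two_phase(actions):
    """Return True if no lock is acquired after the first unlock.

    actions: list of (action, item) pairs, where action is "lock" or "unlock";
    any other action (read, write, ...) is ignored.
    """
    releasing = False  # becomes True once Phase II has started
    for action, _item in actions:
        if action == "unlock":
            releasing = True
        elif action == "lock" and releasing:
            return False  # a new lock in the releasing phase breaks 2PL
    return True


# T2 from the earlier locking example: it unlocks Q before locking P.
t2_example = [("lock", "Q"), ("read", "Q"), ("unlock", "Q"),
              ("lock", "P"), ("read", "P"), ("unlock", "P")]

# The same work rearranged so that all locks come first.
t2_two_phase = [("lock", "Q"), ("read", "Q"), ("lock", "P"),
                ("read", "P"), ("unlock", "Q"), ("unlock", "P")]

print(is_two_phase(t2_example))    # False
print(is_two_phase(t2_two_phase))  # True
```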

Page 133: Relational Database Management System

Enforcing (Conflict) Serializability

Two-phase Locking (2PL) Protocol:
- Each Xact must obtain a S (shared) lock on object before reading, and an X (exclusive) lock on object before writing.
- Once an Xact releases any lock, it cannot obtain new locks.
- If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object.

2PL allows only conflict-serializable schedules.

Potential problem of deadlocks: we could have a cycle of Xacts, T1, T2, ... , Tn, with each Ti waiting for its predecessor to release some lock that it needs. Dealt with by killing one of them and releasing its locks.
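A cycle of waiting Xacts can be found with a depth-first search over a "waits-for" graph. The sketch below is one possible way to do it, not how any particular DBMS does it; the `find_deadlock` function and the dictionary layout are assumptions made for illustration.

```python
def find_deadlock(waits_for):
    """Return one cycle of transactions as a list, or None if there is none.

    waits_for: dict mapping a transaction to the set of transactions
    whose locks it is waiting for.
    """
    visiting = []  # current depth-first path
    done = set()   # transactions already proven cycle-free

    def visit(txn):
        if txn in done:
            return None
        if txn in visiting:
            # Reached a transaction already on the path: the tail is a cycle.
            return visiting[visiting.index(txn):]
        visiting.append(txn)
        for other in sorted(waits_for.get(txn, ())):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(waits_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


# The Deadlock Condition slide: T1 holds FC821 and waits for FC523,
# while T2 holds FC523 and waits for FC821.
deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
print(deadlocked)   # ['T1', 'T2']
print(no_deadlock)  # None
```

Any transaction on the returned cycle is a candidate victim: aborting it and releasing its locks breaks the cycle.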

Page 134: Relational Database Management System

Atomicity of Transactions

A transaction might commit after completing all its actions, or it could abort (or be aborted by the DBMS) after executing some actions.

A very important property guaranteed by the DBMS for all transactions is that they are atomic. That is, a user can think of a Xact as always executing all its actions in one step, or not executing any actions at all. DBMS logs all actions so that it can undo the actions of aborted transactions.

This ensures that if each Xact preserves consistency, every serializable schedule preserves consistency.

Page 135: Relational Database Management System

Aborting a Transaction

If a transaction Ti is aborted, all its actions have to be undone. Not only that, if Tj reads an object last written by Ti, Tj must be aborted as well!

Most systems avoid such cascading aborts by releasing a transaction’s locks only at commit time. If Ti writes an object, Tj can read this only after Ti commits.

In order to undo the actions of an aborted transaction, the DBMS maintains a log in which every write is recorded. This mechanism is also used to recover from system crashes: all active Xacts at the time of the crash are aborted when the system comes back up.

Page 136: Relational Database Management System

The Log

The following actions are recorded in the log:
- Ti writes an object: the old value and the new value. Log record must go to disk before the changed page!
- Ti commits/aborts: a log record indicating this action.

Log records are chained together by Xact id, so it’s easy to undo a specific Xact.

Log is often duplexed and archived on stable storage.

All log related activities (and in fact, all activities such as lock/unlock, dealing with deadlocks etc.) are handled transparently by the DBMS.

Page 137: Relational Database Management System

The Log - 2

T: Read(X, xi)
   xi = xi - 500
   Write(X, xi)
   Read(Y, yi)
   yi = yi + 500
   Write(Y, yi)

Log file, e.g. X = 1000, Y = 2000:

<T starts>
<T, X, 1000, 500>
<T, Y, 2000, 2500>
<T, commits>

Record format: <Transaction Name, Data item Name, Old Value, New Value>
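Because each write record carries both the old and the new value, the same log supports undo (restore old values, newest record first) and redo (reapply new values, oldest record first). The sketch below uses a plain dictionary as a stand-in for the database and tuples as a stand-in for log records; both representations and the function names are assumptions for illustration.

```python
# Log for transaction T from the slide, with X = 1000 and Y = 2000 initially.
log = [
    ("start", "T"),
    ("write", "T", "X", 1000, 500),   # (kind, txn, item, old value, new value)
    ("write", "T", "Y", 2000, 2500),
    ("commit", "T"),
]


def undo(db, records, txn):
    """Restore old values of txn's writes, scanning the log backwards."""
    for rec in reversed(records):
        if rec[0] == "write" and rec[1] == txn:
            _, _, item, old, _new = rec
            db[item] = old


def redo(db, records, txn):
    """Reapply new values of txn's writes, scanning the log forwards."""
    for rec in records:
        if rec[0] == "write" and rec[1] == txn:
            _, _, item, _old, new = rec
            db[item] = new


# Failure before commit: X was already written, Y was not. Undo T.
crashed_db = {"X": 500, "Y": 2000}
undo(crashed_db, log[:3], "T")
print(crashed_db)  # {'X': 1000, 'Y': 2000}

# Failure after commit, before the changes reached disk. Redo T.
stale_db = {"X": 1000, "Y": 2000}
redo(stale_db, log, "T")
print(stale_db)  # {'X': 500, 'Y': 2500}
```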

Page 138: Relational Database Management System

Checkpoints

At the time of recovery the entire log needs to be searched to know which transactions need to be redone and which transactions need to be undone. The problems with this approach are:

1. Searching the entire log takes a long time.
2. Most of the transactions that need to be redone have already written their changes to the database, so redoing them only lengthens recovery.

To solve this problem, checkpoints are written to the log at different points. A checkpoint indicates that the data before this point has already been updated in the database. Before writing a checkpoint the following sequence of actions should take place:

- Output all log records currently residing in the main store to stable storage.
- Output all modified buffer blocks to secondary storage.
- Output a log record <checkpoint>.
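The effect of a checkpoint on recovery can be sketched as follows. Transactions that committed before the most recent `<checkpoint>` are skipped; of the rest, those that committed afterwards are redone and those that never committed are undone. The tuple-based log and the `classify` function are hypothetical. The sketch still reads the whole log to find start records; real systems typically avoid that by storing the list of active transactions in the checkpoint record.

```python
def classify(log):
    """Return (redo, undo): transaction names to redo and to undo."""
    # Index of the most recent checkpoint record, or -1 if there is none.
    cp = max((i for i, rec in enumerate(log) if rec[0] == "checkpoint"),
             default=-1)
    # Committed before the checkpoint: their changes are already on disk.
    finished_before = {rec[1] for rec in log[:cp + 1] if rec[0] == "commit"}
    # Committed after the checkpoint: their changes may not be on disk.
    committed_after = {rec[1] for rec in log[cp + 1:] if rec[0] == "commit"}

    redo, undo = [], []
    for rec in log:
        if rec[0] == "start" and rec[1] not in finished_before:
            (redo if rec[1] in committed_after else undo).append(rec[1])
    return redo, undo


example_log = [
    ("start", "T1"), ("commit", "T1"),  # finished before the checkpoint
    ("start", "T2"),
    ("checkpoint",),
    ("commit", "T2"),                   # committed after the checkpoint
    ("start", "T3"),                    # still active at the crash
]

redo_list, undo_list = classify(example_log)
print(redo_list)  # ['T2']
print(undo_list)  # ['T3']
```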

Page 139: Relational Database Management System

Recovering From a Crash

There are 3 phases in the ARIES recovery algorithm:
- Analysis: Scan the log forward (from the most recent checkpoint) to identify all Xacts that were active, and all dirty pages in the buffer pool at the time of the crash.
- Redo: Redoes all updates to dirty pages in the buffer pool, as needed, to ensure that all logged updates are in fact carried out and written to disk.
- Undo: The writes of all Xacts that were active at the crash are undone (by restoring the before value of the update, which is in the log record for the update), working backwards in the log. (Some care must be taken to handle the case of a crash occurring during the recovery process!)

Data can be lost due to the failure of nonvolatile storage such as the disk. The scheme available to protect the data from disk failure is to periodically dump the entire contents of the database to backup (or even stable) storage such as magnetic tape. When a failure occurs, the most recent dump is used to restore the database to a previous consistent state. Then the log is used to redo all the transactions that have committed since the last dump occurred. The following steps are performed to take a dump:

• Output all log records currently residing in the main memory onto stable store.
• Output all buffer blocks onto the disk.
• Copy the contents of the database to stable store.
• Output a log record <dump>.

Page 140: Relational Database Management System

Summary

Concurrency control and recovery are among the most important functions provided by a DBMS. Users need not worry about concurrency.

System automatically inserts lock/unlock requests and schedules actions of different Xacts in such a way as to ensure that the resulting execution is equivalent to executing the Xacts one after the other in some order.

Write-ahead logging (WAL) is used to undo the actions of aborted transactions and to restore the system to a consistent state after a crash.

Consistent state: Only the effects of committed Xacts are seen.

Page 141: Relational Database Management System

Query Processing/Optimization

Page 142: Relational Database Management System

Optimization using Algebraic Manipulation

Any algebraic manipulation approach to query optimization uses a set of rules, which may be enumerated as follows.

Rules:
- Perform selection as early as possible, in order to reduce the number of tuples to be processed subsequently.
- Projections of projections should be combined, if possible, in order to avoid repeated scanning of tuples.
- Projection over indexed attributes should be done earlier and that over non-indexed attributes should be done later.
- Intermediate relations produced in separate processing sequences must be shared as and when possible.
- If possible, attributes which are controlling a join operation should be sorted earlier.
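The first rule, performing selection as early as possible, can be illustrated with a small experiment. The sketch below joins two tiny in-memory relations with a nested-loop join and counts the tuple pairs compared, once with the selection applied after the join and once before it. The relation contents, the `balance > 1000` condition and the `join` helper are made-up illustration data, not taken from the slides' own example.

```python
# Hypothetical relations: account(acc_no, name, balance), customer(name, city).
accounts = [("FC821", "ABC", 500), ("FC523", "DEF", 7000), ("FC900", "ABC", 9000)]
customers = [("ABC", "Bhiwani"), ("DEF", "Delhi")]


def join(left, right, work):
    """Nested-loop join on customer name; work[0] counts pairs compared."""
    out = []
    for acc_no, name, balance in left:
        for cname, city in right:
            work[0] += 1
            if name == cname:
                out.append((acc_no, name, balance, city))
    return out


# Selection after the join: every account is joined first.
late_work = [0]
late = [row for row in join(accounts, customers, late_work) if row[2] > 1000]

# Selection before the join: only qualifying accounts are joined.
early_work = [0]
early = join([a for a in accounts if a[2] > 1000], customers, early_work)

print(late == early)                 # True: same result either way
print(late_work[0], early_work[0])   # 6 4
```

Both plans return the same tuples, but the early selection compares fewer pairs; the gap grows with the size of the relations and the selectivity of the condition.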

Page 143: Relational Database Management System

Example

Page 144: Relational Database Management System

Example contd.

Page 145: Relational Database Management System

Projection Operation

Page 146: Relational Database Management System

Natural Join Operation

Page 147: Relational Database Management System

Natural Join Operation - 2