46
Transaction Processing

05-Transaction Processing

  • Upload
    ui

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Transaction Processing

Outline Introduction to Transaction Processing

Transaction & System Concept Concurrency/Synchronization Schedule Transaction in SQL

Content Development GDLN Batch 2

2

Introduction to Transaction Processing Single-User System: At most one user at a time can use the system.

Multiuser System: Many users can access the system concurrently.

Concurrency/Synchronization Interleaved processing: concurrent execution of processes is interleaved in a single CPU

Parallel processing: processes are concurrently executed in multiple CPUs.

Content Development GDLN Batch 2

3

FIGURE 17.1Interleaved processing versus parallel processing of concurrent transactions.

Content Development GDLN Batch 2

4

Introduction to Transaction Processing (2) A Transaction: logical unit of database processing that includes one or more access operations (read -retrieval, write - insert or update, delete).

A transaction (set of operations) may be stand-alone specified in a high level language like SQL submitted interactively, or may be embedded within a program.

Transaction boundaries: Begin and End transaction.

An application program may contain several transactions separated by the Begin and End transaction boundaries.

Content Development GDLN Batch 2

5

Introduction to Transaction Processing (3) A transaction must see a consistent database. During transaction execution the database may be temporarily inconsistent.

When the transaction completes successfully (is committed), the database must be consistent.

After a transaction commits, the changes it has made to the database persist, even if there are system failures.

Multiple transactions can execute in parallel. Two main issues to deal with:

Failures of various kinds, such as hardware failures and system crashes

Concurrent execution of multiple transactions

Content Development GDLN Batch 2

6

Introduction to Transaction Processing (4) Basic operations are read and write

Basic unit of data transfer from the disk to the computer main memory is one block. In general, a data item (what is read or written) will be the field of some record in the database, although it may be a larger unit such as a record or even a whole block.

read_item(X): Reads a database item named X into a program variable. To simplify our notation, we assume that the program variable is also named X.

write_item(X): Writes the value of program variable X into the database item named X.

Content Development GDLN Batch 2

7

Read_Item(X) read_item(X) command includes the following steps:

1.Find the address of the disk block that contains item X.

2.Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).

3.Copy item X from the buffer to the program variable named X.

Content Development GDLN Batch 2

8

Write_item(X) write_item(X) command includes the

following steps:1. Find the address of the disk block

that contains item X.2. Copy that disk block into a buffer in

main memory (if that disk block is not already in some main memory buffer).

3. Copy item X from the program variable named X into its correct location in the buffer.

4. Store the updated block from the buffer back to disk (either immediately or at some later point in time).

Content Development GDLN Batch 2

9

FIGURE 17.2Two sample transactions. (a) Transaction T1. (b)

Transaction T2.

Content Development GDLN Batch 2

10

What is Synchronization? Ability of Two or More Serial Processes to Interact During Their Execution to Achieve Common Goal

Recognition that “Today’s” Applications Require Multiple Interacting Processes Client/Server and Multi-Tiered Architectures

Inter-Process Communication via TCP/IP Fundamental Concern: Address Concurrency Control Access to Shared Information Historically Supported in Database Systems Currently Available in Many Programming Languages

Content Development GDLN Batch 2

11

Thread Synchronization Suppose X and Y are Concurrently Executing in Same Address Space

What are Possibilities?

Content Development GDLN Batch 2

12

What Does Behavior at Left Represent?

Synchronous Execution!

X Does First Part of Task

Y Next Part Depends on X

X Third Part Depends on Y

Threads Must Coordinate Execution of Their Effort

X Y

1

2

3

Thread Synchronization Now, What Does Behavior at

Left Represent? Asynchronous Execution! X Does First Part of Task Y Does Second Part Concurrent

with X Doing Third Part What are Issues?

Content Development GDLN Batch 2

13

X Y

1

23

Will Second Part Still Finish After Third Part?Will Second Part Now Finish Before Third Part?What Happens if Variables are Shared?

This is the Database Concern - Concurrent Transactions Against Shared Tables!

Why Do we Need to Synchronize? Promote Sharing of Resources, Data, etc.

Cooperating Processes Norm Not Exception Difficult to Program Solutions

Handle Concurrent Behavior of Processes Multiple Processes Interacting via OS Resources Under Control of Process Manager/Scheduler

Performance, Parallel Algorithms & Computations Multi-Processor Architectures (CSE228) Different OS to Handle Multiple Processors

Client/Server and Multi-Tier Architectures

Underlying Database Support Concurrent Transactions Shared Databases

Content Development GDLN Batch 2

14

Potential Problems without Synchronization? Data Inconsistency

Lost-Update Problem Impact on Correctness of Executing Software

Deadlock Two Processes (Transactions) Each Hold Unique Resource (Data Item) and Want Resource (Date Item) of Other Process (Trans.)

Processes Wait Forever Non-Determinacy of Computations

Behavior of Computation Different for Different Executions

Two Processes (Transactions) Produce Different Results When Executed More than Once on Same Data

Content Development GDLN Batch 2

15

What is Concurrency Control? Single User vs. Multi-user Environment

Programs May Executed in an Interleaved Fashion

Why Concurrency Control ? Concurrent Execution of Transactions May Interfere with Each Other

May Produce an Incorrect Overall Result Even If Each Transaction is Correct When Executed in Isolation

Concurrency Problem The Lost Update Problem The Dirty Read Problem The Incorrect Summary Problem The Unrepeatable Read Problem

Content Development GDLN Batch 2

16

FIGURE 17.3Some problems that occur when concurrent execution

is uncontrolled. (a) The lost update problem.

Content Development GDLN Batch 2

17

FIGURE 17.3 (continued)Some problems that occur when concurrent execution is uncontrolled. (b) The temporary update problem.

Content Development GDLN Batch 2

18

FIGURE 17.3 (continued) Some problems that occur when concurrent execution is uncontrolled. (c) The incorrect summary problem.

Content Development GDLN Batch 2

19

The Unrepeatable Read Problem Consider at Transaction T1 T1 Reads Data Item X at Time t Another Transaction T2 Modifies X at Time t+1 T1 then Read X again at Time t+2 T1 has Read Two Different values of X!

Content Development GDLN Batch 2

20

TI T2read-item(X);

read-item(X);X = X + 1000;write-item(X);

read-item(X);

Desirable Properties of Transactions (A.C.I.D.)A transaction is a unit of program execution that accesses and possibly updates various data items.To preserve the integrity of data the database system must ensure:

Atomicity: All or nothing Consistency: Database state consistent upon entry and exit

Isolated: Executed without interference from other transactions

Durable: Changes to the database are permanent (or persist) after the transaction completes, even if there are system failures.

Content Development GDLN Batch 2

21

Methods for Concurrency Control and Recovery The two main methods for concurrency control include: Locking, Re-ordering/serializing transactions.

Recovery support is needed due to a number of failure types: System crash, software exceptions, transaction exceptions, concurrency enforcement, disk failure, power failure etc.

To be able to recover from failures that affect transactions, the system maintains a disk-based “system log”.

Content Development GDLN Batch 2

22

Transaction Operations BEGIN_TRANSACTION; READ or WRITE; END_TRANSACTION; COMMIT_TRANSACTION; ROLLBACK (to savepoint) or ABORT.

Content Development GDLN Batch 2

23

Example of Transaction in Programming Language

Content Development GDLN Batch 2

24

import java.sql.Connection;import java.sql.DriverManager;import java.sql.SQLException;import java.sql.Statement;

public class Main { /** * Updates tables using a transaction */ public void updateDatabaseWithTransaction() { Connection connection = null; Statement statement = null; try { Class.forName("[nameOfDriver]"); connection =

DriverManager.getConnection("[databaseURL]", "[userid]", "[password]"); //Here we set auto commit to false so no changes will

take //effect immediately. connection.setAutoCommit(false); statement = connection.createStatement(); //Execute the queries statement.executeUpdate("UPDATE Table1 SET Value = 1

WHERE Name = 'foo'"); statement.executeUpdate("UPDATE Table2 SET Value = 2

WHERE Name = 'bar'"); //No changes has been made in the database yet, so now

we will commit //the changes. connection.commit(); }

BEGIN;UPDATE accounts SET balance = balance -

100.00 WHERE name = 'Alice';SAVEPOINT my_savepoint;UPDATE accounts SET balance = balance +

100.00 WHERE name = 'Bob';-- oops ... forget that and use Wally's

accountROLLBACK TO my_savepoint;UPDATE accounts SET balance = balance +

100.00 WHERE name = 'Wally';COMMIT;END;

Transaction States (1)

Content Development GDLN Batch 2

25

Transaction State (2) Active – the initial state; the transaction stays in this state while it is executing

Partially committed – after the final statement has been executed.

Failed -- after the discovery that normal execution can no longer proceed.

Aborted – after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. Two options after it has been aborted: restart the transaction; can be done only if no internal logical error

kill the transaction Committed – after successful completion.

Content Development GDLN Batch 2

26

System Log Keeps track of all transaction operations that affect values of database items

Log is kept on disk and periodically backed up to guard against catastrophy: Transaction ID [start, TID] [write_item, TID, X, old_value, new_value]

[read_item, TID, X] [commit, TID] [abort, TID]

Content Development GDLN Batch 2

27

Schedules (Histories) of Transactions

Characterizing Schedules Based on Recoverability The schedule is called recoverable schedule if

• Once transaction T is committed, it should never be necessary to roll back T

Non-recoverable schedule• Once transaction T is committed, it should necessary to roll back T or T should be aborted

Content Development GDLN Batch 2

28

T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit; Read(X)

;X:=X;Write(X);commit;

S1: R1(X),W1(X), R1(Y), W1(Y), c1, R2(X), W2(X), c2;

Content Development GDLN Batch 2

29

Recoverable schedule One where no transaction needs to be rolled back. Is guaranteed if No transaction T in S commits until all transactions T’ that have written an item that T reads has committed

Start T1Start T2R(x) T1W(y) T2R(y) T1Commit T1R(x) T2……(Recoverable?)

Start T1Start T2R(x) T1R(y) T1W(y) T2Commit T1R(x) T2……(Recoverable?)

If T2 aborts here then T1 would have to be aborted after commit violating Durability of ACID

NO YES

Content Development GDLN Batch 2

30

Cascadeless Schedules Those where every transaction reads only the items that are written by committed transactions

Cascaded Rollback Schedules A schedule in which uncommitted transactions that read an item from a failed transaction must be rolled back

Start T1Start T2R(x) T1W(x) T1R(x) T2R(y) T1W(x) T2W(y) T1……

Start T1Start T2R(x) T1W(x) T1R(y) T1W(y) T1Commit T1R(x) T2W(x) T2

If T1 were to abort here then T2 would have to abort in a cascading fashion. This is a cascaded rollback schedule

Cascadeless Schedule

Slide 5-31Elmasri and Navathe, Fundamentals of Database Systems, Fourth EditionRevised by IB & SAM, Fasilkom UI, 2005

Strict Schedules A transaction can neither read or write an item X until the last transaction that wrote X has committed.

(say x = 9)Start T1Start T2R(y) T2R(x) T1W(x) T1 (say x = 5)R(y) T1W(y) T1W(x) T2 (say x = 8)

For this exampleSay T1 aborts hereThen the recovery process will restore the value of x to 9 Disregarding (x= 8). Although this is cascadeless it is notStrict and the problem needs to be resolved: use REDO

Strict schedule Cascadeless schedule

Cascadeless schedule recoverable schedule

Recoverable schedule cascadeless schedule ?

Recoverable schedule strict schedule ?

Strict schedule recoverable schedule ?

Content Development GDLN Batch 2

32

Transactions and a Schedule Below are Transactions T1 and T2 Note that the Their Interleaved Execution Shown

Below is an Example of One Possible Schedule There are Many Different Interleaves of T1 and T2

Content Development GDLN Batch 2

33

T1 T2

Read(X);X:=X;Write(X);

Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S: R1(X), W1(X), R2(X), W2(X), c2, R1(Y), W1(Y), c1;

Equivalent Schedules Are the Two Schedules below Equivalent? S1 and S4 are Equivalent, since They have the Same

Set of Transactions and Produce the Same Results

Content Development GDLN Batch 2

34

T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S1

T1 T2

Read(X);X:=X;Write(X);

Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S4

S1: R1(X),W1(X), R1(Y), W1(Y), c1, R2(X), W2(X), c2;S4: R1(X), W1(X), R2(X), W2(X), c2, R1(Y), W1(Y), c1;

Serializability of Schedules A Serial Execution of Transactions Runs One Transaction at a Time (e.g., T1 and T2 or T2 and T1) All R/W Operations in Each Transaction Occur Consecutively in S, No Interleaving

Consistency: a Serial Schedule takes a Consistent Initial DB State to a Consistent Final State

A Schedule S is Called Serializable If there Exists an Equivalent Serial Schedule A Serializable Schedule also takes a Consistent Initial DB State to Another Consistent DB State

An Interleaved Execution of a Set of Transactions is Considered Correct if it Produces the Same Final Result as Some Serial Execution of the Same Set of Transactions

We Call such an Execution to be Serializable

Content Development GDLN Batch 2

35

FIGURE 17.5Examples of serial and nonserial schedules involving

transactions T1 and T2. (a) Serial schedule A: T1 followed by T2. (b) Serial schedules B: T2 followed by

T1.

Content Development GDLN Batch 2

36

Example of Serializability Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20

After S1 or S2 X = 7 and Y = 40

Content Development GDLN Batch 2

37

T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S1 Schedule S2T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Example of Serializability Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20

After S1 or S2 X = 7 and Y = 40 Is S3 a Serializable Schedule?

Content Development GDLN Batch 2

38

T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S1 Schedule S2T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

T1 T2

Read(X);X:=X;

Write(X);Read(Y);

Y = Y + 20;Write(Y);commit;

Read(X);X:=X;

Write(X);commit;

Schedule S3

Example of Serializability Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20

After S1 or S2 X = 7 and Y = 40 Is S4 a Serializable Schedule?

Content Development GDLN Batch 2

39

T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S1 Schedule S2T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

T1 T2

Schedule S4

Read(X);X:=X;Write(X);

Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Two Serial Schedules with Different Results Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20

After S1 X = 7 and Y = 28 After S2 X = 7 and Y = 27

Content Development GDLN Batch 2

40

T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = X + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S1 Schedule S2T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = X + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

A Schedule is Serializableif it Matches Either S1 or S2 ,Even if S1 and S2 Produce Different Results!

The Serializability Theorem A Dependency Exists Between Two Transactions If: They Access the Same Data Item Consecutively in the Schedule and One of the Accesses is a Write

Three Cases: T2 Depends on T1 , Denoted by T1 T2 T2 Executes a Read(x) after a Write(x) by T1 T2 Executes a Write(x) after a Read(x) by T1

T2 Executes a Write(x) after a Write(x) by T1 Transaction T1 Precedes Transaction T2 If:

There is a Dependency Between T1 and T2, and The R/W Operation in T1 Precedes the Dependent T2 Operation in the Schedule

Content Development GDLN Batch 2

41

The Serializability Theorem A Precedence Graph of a Schedule is a Graph G = <TN, DE>, where Each Node is a Single Transaction; i.e.,TN = {T1, ..., Tn} (n>1)

and Each Arc (Edge) Represents a Dependency Going from the Preceding Transaction to the Other i.e., DE = {eij | eij = (Ti, Tj), Ti, Tj TN}

Use Dependency Cases on Prior Slide The Serializability Theorem

A Schedule is Serializable if and only of its Precedence Graph is Acyclic

Content Development GDLN Batch 2

42

Serializability Theorem Example Consider S1 and S2 for Transactions T1 and T2 Consider the Two Precedence Graphs for S1 and S2 No Cycles in Either Graph!

Content Development GDLN Batch 2

43

T1 T2

X

Schedule S1

T1 T2

X

Schedule S2

T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

Schedule S1 Schedule S2T1 T2

Read(X);X:=X;Write(X);Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

What are Precedence Graphs for S3 and S4? For S3

T1 T2 (T2 Write(X) After T1 Write(X)) T2 T1 (T1 Write(X) After T2 Read (X))

For S4 T1 T2 (T2 Read/Write(X) After T1 Write(X))

Content Development GDLN Batch 2

44

T1 T2

X

Schedule S4

T1 T2

X

Schedule S3

X

T1 T2

Read(X);X:=X;

Write(X);Read(Y);

Y = Y + 20;Write(Y);commit;

Read(X);X:=X;

Write(X);commit;

Schedule S3

T1 T2

Schedule S4

Read(X);X:=X;Write(X);

Read(Y);Y = Y + 20;Write(Y);commit;

Read(X);X:=X;Write(X);commit;

FIGURE 17.5 (continued)Examples of serial and nonserial schedules involving transactions T1 and T2. (c) Two nonserial schedules C

and D with interleaving of operations.

Content Development GDLN Batch 2

45

Sample SQL TransactionEXEC SQL WHENEVER SQLERROR GOTO UNDO;EXEC SQL SET TRANSACTION

READ WRITEDIAGNOSTIC SIZE 5ISOLATION LEVEL SERIALIZABLE;

EXEC SQL INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)VALUES (‘ROBERT’, ‘SMITH’, ‘991004321’,’2’,35000);

EXEC SQL UPDATE EMPLOYEESET SALARY = SALARY * 1.1 WHERE DNO = 2;

EXEC SQL COMMIT;GOTO THE_ENDUNDO: EXEC SQL ROLLBACK;THE_END:…;

Content Development GDLN Batch 2

46