1

Ch. 10. Transaction Manager Concepts

Dr. Hua

COP 6730

2

Transaction Manager Concepts

• The transaction manager (TM) furnishes the A, C, and D of ACID.

– It provides the all-or-nothing property (atomicity) by undoing aborted transactions, redoing committed ones, and coordinating commitment with other TMs if a transaction happens to be distributed.

– It provides consistency by aborting any transaction that fails to pass the RM consistency tests at commit.

– It provides durability by forcing all log records of committed transactions to durable memory as part of commit processing, and by redoing any recently committed work at restart.

• The TM, together with the log manager and the lock manager, supplies the mechanisms to build RMs and computations with the ACID properties.

3

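Normal Execution

[Figure: normal execution. The application calls Begin_Work( ) to start a new transaction and receives its TRID. Work requests go to the RMs' normal functions; the RMs send lock requests to the lock manager and log records to the log manager. The RMs also provide callback functions (UNDO, REDO, COMMIT) to the TM. When the application calls Commit_Work( ), the figure numbers the steps as follows.]

1. Want to Commit

2. Commit Phase 1?

3. YES to Phase 1

4. Write commit log record and ?

5. Commit Phase 2

6. Acknowledge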

4

Transaction Abort

[Figure: transaction abort. The application calls Rollback_Work( ); the TM, the RMs (normal and callback functions), and the log manager carry out the numbered steps.]

1. Rollback transaction

2. Read transaction’s log records

3. UNDO (log record)

4. Aborted (TRID)

5. Write abort records

Note: Rollback to a savepoint has similar logic.

5

DO-UNDO-REDO Protocol

• The DO-UNDO-REDO protocol is a programming style for RMs implementing transactional objects.

DO program: Old State → New State + Log Record

UNDO program: New State + Log Record → Old State

REDO program: Old State + Log Record → New State

• An RM has the following structure:

Normal function: DO program

Callback functions: UNDO and REDO programs

6

Restart

1. The TM regularly invokes checkpoints during normal processing: it informs each RM to checkpoint its state to persistent memory.

2. At restart, the transaction mgr. scans the log table forward from the most recent checkpoint to the end.

3. For each transaction that has not committed (e.g., T ? ) the TM calls the UNDO( ) callback of the RMs to undo it to the most recent persistent savepoint.

[Figure: timeline from Checkpoint to Crash showing transactions T1, T2, and T3.]

7

Value Logging

Each log record contains the old and the new states of the object.

UNDO Program: set the object to the old state.

REDO Program: set the object to the new state.

Example:

struct value_log_record_for_page_update
{
    int opcode; /* opcode will say page update */
    filename fname; /* name of file that was updated */
    long pageno; /* page that was updated */
    char old_value[PAGESIZE]; /* old value of page */
    char new_value[PAGESIZE]; /* new value of page */
};
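The UNDO and REDO programs for value logging can be sketched in a few lines of C. This is only an illustration under stated assumptions: the record is simplified (no opcode or file name), PAGESIZE is set to a small hypothetical value, and the function names value_do, value_undo, and value_redo are invented for the sketch.

```c
#include <assert.h>
#include <string.h>

#define PAGESIZE 64   /* hypothetical small page size for this sketch */

/* Simplified value log record: opcode and file name are omitted. */
struct value_log_record {
    long pageno;                 /* page that was updated */
    char old_value[PAGESIZE];    /* old value of page */
    char new_value[PAGESIZE];    /* new value of page */
};

/* DO: apply the update and fill in the log record that describes it. */
void value_do(char *page, long pageno, const char *new_image,
              struct value_log_record *rec)
{
    assert(page != NULL && new_image != NULL && rec != NULL);
    rec->pageno = pageno;
    memcpy(rec->old_value, page, PAGESIZE);
    memcpy(rec->new_value, new_image, PAGESIZE);
    memcpy(page, new_image, PAGESIZE);
}

/* UNDO: set the object to the old state. */
void value_undo(char *page, const struct value_log_record *rec)
{
    assert(page != NULL && rec != NULL);
    memcpy(page, rec->old_value, PAGESIZE);
}

/* REDO: set the object to the new state. */
void value_redo(char *page, const struct value_log_record *rec)
{
    assert(page != NULL && rec != NULL);
    memcpy(page, rec->new_value, PAGESIZE);
}
```

Because both programs simply overwrite the page with a stored image, repeating either one leaves the page in the same state.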

8

Logical Logging

• Value logging is often called physical logging because it records the physical addresses and values of objects.

• Logical (or operation) logging records the name of an UNDO-REDO function and its parameters.

It assumes that each action is atomic and that in each failure situation the system state will be action consistent: each logical action will have been completely done or completely undone.

9

Logical Logging (Cont’d)

Problem: Partially complete actions can fail, and the UNDO of these partial actions will not be presented with an action-consistent state.

[Figure: a transaction made of logical action 1 and logical action 2, each consisting of step1, step2, and step3.]

10

Physiological Logging: Motivation

• Physiological logging is a compromise between logical and physical logging. It uses logical logging where possible.

• These are the ideas that motivate physiological logging:

Page actions: Complex actions can be structured as a sequence of page actions.

Mini-transaction: Page actions can be structured as mini-transactions that use logical logging. When the action completes, the object is updated. An UNDO-REDO log record is created to cover that action. These actions are atomic, consistent, and isolated.

11

Physiological Logging: Motivation (Cont’d)

Log-object consistency: It is possible to structure the system so that at restart, the persistent state is page-action consistent. The log can then be used to transform this action-consistent state into a transaction-consistent state at restart.

Note: Physiological log records are physical to a page, and logical within a page.

12

Physiological Logging: An Example

Consider the insert that has the following logical log record:

<insert op, tablename = A, record value = r>

[Figure: Table T (File A) with two indexes: index 1 on Key B (File B) and index 2 on Key C (File C).]

13

Physiological Logging: An Example (Cont’d)

This insert operation involves three page actions (we assume that B-tree splits do not happen). The corresponding physiological record bodies are:

<insert op, base filename = A, page number = 508, record value = r>

<insert op, index filename = B, page number = 72, index record value = s>

<insert op, index filename = C, page number = 94, index record value = t>

Fundamental idea: Log records are generated on a per-page basis. Log records are designed to make logical transformations of pages.

14

Physiological Logging: During Online Operation

• We call normal operations without failures online operations.

• To allow updates, all page changes must be structured as mini-transactions of this form:

Mini_trans()

lock the object in exclusive mode

transform the object

generate an UNDO-REDO log record

unlock the object.

15

Physiological Logging: During Online Operation (Cont’d)

• The mini-transaction approach ensures online consistency

Page-action consistency: volatile and persistent memory are in a page-consistent state, and each page reflects the most recent updates to it.

Log consistency: The log contains a history of all updates to pages.

16

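One-Bit Resource Mgr: Requirements

This RM manages an array of bits stored in a single page. Each bit is either free (TRUE) or busy (FALSE).

[Figure: the One-Bit RM and its single page, which holds bits 1 to 5 and the page lsn. Client 1 calls get_bit( ) and receives "3" (bit 3 changes from T to F); Client 2's call on bit 5 changes it from F to T. Bits are marked locked or unlocked.]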

17

One-Bit Resource Mgr: Requirements (Cont’d)

Requirements:

1. Page Consistency

i. No clean free bit has been given to any transaction.

ii. Every clean busy bit has been given to exactly one transaction.

iii. Dirty bits are locked in exclusive mode by the transaction that modified them.

iv. The log sequence number (page lsn) reflects the most recent log record for this page.

2. Log Consistency

i. The log contains a log record for every completed mini-transaction update to the page.

18

One-Bit Resource Mgr: give_bit( )

give_bit (int i) /* free a bit */

get XLOCK on the bit;

if the XLOCK is granted

then{

get the page semaphore;

free the bit;

generate log record saying bit is free;

write log record and update lsn; /* page is now consistent */

free page semaphore;

}

else

abort caller’s transaction; /* caller does not own the bit */

19

One-Bit Resource Mgr: give_bit( ) (Cont’d)

Note:

• This code has all the elements of a mini-transaction.

• It is well formed and two-phased with respect to the page semaphore.

• It provides a page action-consistent transformation of the page.

20

One-Bit Resource Mgr: get_bit( )

get_bit (void) /* allocate a free bit and return its index */
{
    get the page semaphore;
    repeat_until end of bit array
    {
        find the next free bit and XLOCK it;
        if lock is granted /* the bit is free */
        then
        {
            mark the bit busy;
            generate log record describing update;
            write log record and update lsn; /* page is now consistent */
            give up semaphore;
            return the bit index to caller;
        }
    }
    /* no free bits were found during the repeat loop */
    give up semaphore;
    abort transaction;
    return “-1” to caller;
}
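The two routines can be sketched as single-threaded C. Everything here is a stand-in chosen for illustration, not the deck's implementation: the page semaphore is a flag, the lock manager is an owner array with a non-waiting XLOCK, the log manager is a counter, and the trid parameter is added so the sketch knows which transaction is calling. Aborting the caller is left to the caller, signalled by the return value.

```c
#include <assert.h>
#include <stdbool.h>

#define NBITS 8        /* hypothetical size of the bit array */
#define NO_OWNER 0     /* TRIDs are assumed to be nonzero */

static bool free_bit[NBITS];    /* true = free, false = busy */
static long page_lsn;           /* LSN of the latest update to the page */
static bool page_fixed;         /* stand-in for the page semaphore */
static int  lock_owner[NBITS];  /* stand-in lock manager: TRID holding the XLOCK */
static long log_end_lsn;        /* stand-in log manager: LSN of the last record */

void init_page(void)
{
    for (int i = 0; i < NBITS; i++) {
        free_bit[i] = true;
        lock_owner[i] = NO_OWNER;
    }
    page_lsn = 0;
    log_end_lsn = 0;
    page_fixed = false;
}

/* Non-waiting XLOCK: granted if the bit is unlocked or already held by trid. */
static bool xlock(int trid, int i)
{
    if (lock_owner[i] != NO_OWNER && lock_owner[i] != trid)
        return false;
    lock_owner[i] = trid;
    return true;
}

static long write_log_record(void) { return ++log_end_lsn; }

/* Allocate a free bit; returns its index, or -1 if the caller should abort. */
int get_bit(int trid)
{
    assert(!page_fixed);
    page_fixed = true;                        /* get the page semaphore */
    for (int i = 0; i < NBITS; i++) {
        if (free_bit[i] && xlock(trid, i)) {  /* free and now locked by us */
            free_bit[i] = false;              /* mark the bit busy */
            page_lsn = write_log_record();    /* page is now consistent */
            page_fixed = false;               /* give up semaphore */
            return i;
        }
    }
    page_fixed = false;
    return -1;
}

/* Free bit i; returns false if the caller does not own it and should abort. */
bool give_bit(int trid, int i)
{
    assert(i >= 0 && i < NBITS);
    if (!xlock(trid, i))
        return false;
    assert(!page_fixed);
    page_fixed = true;                        /* get the page semaphore */
    free_bit[i] = true;                       /* free the bit */
    page_lsn = write_log_record();            /* page is now consistent */
    page_fixed = false;                       /* free page semaphore */
    return true;
}
```

A freed bit stays locked by the transaction that freed it, so another transaction's get_bit skips it. That is the "dirty bits are locked in exclusive mode" requirement in miniature.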

21

The FIX Rule

While the semaphore is set, the page is said to be fixed, and releasing the page is called unfixing it.

Fix Rule:

1. Get the page semaphore in exclusive mode prior to altering the page.

2. Get the semaphore in shared or exclusive mode prior to reading the page.

3. Hold the semaphore until the page and log are again consistent, and the read or update is complete.

22

The FIX Rule (Cont’d)

Note:

• This is just two-phase locking at the page-semaphore level.

• The Isolation Theorem tells us that all read and write actions on the page will be isolated.

• Page updates are actually mini-transactions.

• When the page is unfixed, the page should be consistent and the log record should allow UNDO or REDO of the page transformation.

23

Multi-Page Actions

• Some actions modify several pages at once.

Examples: Inserting a multi-page record. Splitting a B-tree node.

• These actions are structured as follows:

1. Fix all the relevant pages.

2. Do all the modifications and generate the log records.

3. Unfix the pages.

24

Dealing with Failures

• Page actions provide page consistency even if they fault.

– Copy the page at the beginning of the page action; then

– if anything goes wrong with the page action prior to writing the log record, the page action just returns the page to its original values by copying it back.

• Complex operations depend on transaction UNDO to roll back.

– Each complex action should start by declaring a savepoint.

– If anything goes wrong during a page action, the operation first makes that page consistent.

– The action can then call Roll_work ( ) to return to the savepoint.

Note: The savepoint wraps the complex action within a subtransaction so that the complex action can be undone if it fails.

25

Online Consistency & Restart Consistency

• Online log consistency requires that the volatile log contain all log records up to and including VVlsn:

VVlsn ≤ VLlsn

[Figure: four columns on a common lsn time-stamp axis: Volatile Page Versions (VVlsn), Volatile Log Records (VLlsn), Durable Log Records (DLlsn), and Persistent Page Versions (PVlsn).]

26

Online Consistency & Restart Consistency (Cont’d)

• Restart consistency ensures that if a transaction has committed with commit_lsn, then that commit record is in the durable log:

commit_lsn ≤ DLlsn

In addition, restart consistency guarantees that if version X of the volatile copy overwrites the persistent copy, then the log records for version X are already present in the durable log:

VVlsn ≤ DLlsn

Note: At restart, all volatile memory is reset and must be reconstructed from persistent memory. We must have:

PVlsn ≤ DLlsn

commit_lsn ≤ DLlsn

27

Write Ahead Log (WAL) Protocol

Protocol:

1. Each volatile page has an LSN field naming the log record of the most recent update to the page.

2. Each update must maintain the page LSN field.

3. When a page is about to be copied to persistent memory, the copier must first use the log manager to copy all log records up to and including the page’s LSN to durable memory (force them).

4. Once the force completes, the volatile version of the page can overwrite the persistent version of the page.

5. The page must be fixed during the writes and during the copies, to guarantee page action consistency.

Effect: The log record of a page must be moved to durable memory prior to overwriting the page in persistent memory.
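A minimal C sketch of the protocol follows. It assumes a log reduced to two counters (the volatile and durable log LSNs) and a page reduced to an LSN plus a small data array; update_page, log_force, and write_page are hypothetical names, and fixing the page (step 5) is omitted because the sketch is single-threaded.

```c
#include <assert.h>

#define PAGESIZE 64   /* hypothetical small page size for this sketch */

struct page {
    long lsn;              /* LSN of the most recent update to the page */
    char data[PAGESIZE];
};

static long volatile_log_lsn = 0;   /* last record in the volatile log */
static long durable_log_lsn = 0;    /* last record forced to the durable log */

/* Append one log record to the volatile log and return its LSN. */
long log_append(void) { return ++volatile_log_lsn; }

/* Force all log records up to and including lsn to durable memory. */
void log_force(long lsn)
{
    assert(lsn <= volatile_log_lsn);
    if (lsn > durable_log_lsn)
        durable_log_lsn = lsn;
}

/* Steps 1 and 2: every update logs first and maintains the page LSN. */
void update_page(struct page *p, int offset, char value)
{
    assert(offset >= 0 && offset < PAGESIZE);
    p->lsn = log_append();
    p->data[offset] = value;
}

/* Steps 3 and 4: force the log up to the page LSN, then overwrite. */
void write_page(const struct page *volatile_copy, struct page *persistent_copy)
{
    log_force(volatile_copy->lsn);
    assert(volatile_copy->lsn <= durable_log_lsn);   /* the WAL invariant */
    *persistent_copy = *volatile_copy;
}
```

After write_page returns, the persistent page LSN never exceeds the durable log LSN, which is what restart depends on.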

28

Force-Log-at-Commit

Question: What if no pages were copied to persistent memory, and the transaction committed?

If the system were to restart immediately, there would be no record of the transaction’s updates, and the transaction could not be redone.

Solution: Force-Log-at-Commit rule.

Rule: The transaction’s log records must be moved to durable memory as part of commit.

Implementation: When a transaction commits, the TM writes a commit log record and requests the log manager to flush the log.

As a consequence, all the log records prior to the commit record are flushed as well.

29

Physiological Logging: Summary

The RM must observe the following three rules:

Fix rule: Cover all page reads and page writes with the page semaphore.

Write-ahead log (WAL): Force the page’s log records prior to overwriting its persistent copy.

Force-log-at-commit: Force the transaction’s log records as part of commit.

Note: many systems use the physiological design.

30

UNDO: Compensation Log Records

Question: What should the page LSN become when an action is undone?

If subsequent updates to the page by other transactions have advanced the log sequence number, the LSN should not be set back to its original value.

Strategy: the UNDO looks just like a new action that generates a new log record called a compensation log record.

This approach makes page LSNs monotonic, an essential property for write-ahead log.

A transaction that produced n new log records during forward processing will produce n new log records when the transaction is aborted.
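The strategy can be illustrated with a small C sketch. It assumes a single page holding one counter, a logical "add delta" page action, and an in-memory log table; none of these names come from the slides. Undoing a record applies the inverse action and logs it as a new compensation record, so the page LSN only moves forward.

```c
#include <assert.h>
#include <stdbool.h>

#define LOG_CAPACITY 32   /* hypothetical log size for this sketch */

struct log_record {
    long lsn;
    int  trid;
    int  delta;           /* page action: add delta to the counter */
    bool compensation;    /* true for a compensation log record */
};

struct page { long lsn; int counter; };

static struct log_record log_table[LOG_CAPACITY];
static int log_count = 0;

static long log_append(int trid, int delta, bool compensation)
{
    assert(log_count < LOG_CAPACITY);
    struct log_record *r = &log_table[log_count];
    r->lsn = log_count + 1;
    r->trid = trid;
    r->delta = delta;
    r->compensation = compensation;
    log_count++;
    return r->lsn;
}

/* Forward processing: transform the page and log the action. */
void do_add(struct page *p, int trid, int delta)
{
    p->counter += delta;
    p->lsn = log_append(trid, delta, false);
}

/* Abort: read the log backwards and undo each forward record of trid.
   Each UNDO is itself a new logged action, so the page LSN never decreases.
   Returns the number of compensation records written. */
int abort_transaction(struct page *p, int trid)
{
    int undone = 0;
    for (int i = log_count - 1; i >= 0; i--) {
        const struct log_record r = log_table[i];
        if (r.trid != trid || r.compensation)
            continue;
        p->counter -= r.delta;
        p->lsn = log_append(trid, -r.delta, true);
        undone++;
    }
    return undone;
}
```

In this sketch a transaction with two forward records writes two compensation records on abort, and an interleaved update by another transaction is left untouched.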

31

Idempotence and Testable

• Idempotent operation: If the UNDO or REDO operation can be repeated an arbitrary number of times and still result in the correct state, the operation is idempotent.

Example: The operation “move the reactor rods to position 35” is idempotent.

The operation “move the reactor rods down 2 cm” is not idempotent.

Note: Repeated REDOs can arise from repeated failures.

32

Idempotence and Testable (Cont’d)

• Testable state: If the old and new states can be discriminated by the system, the state is testable.

If an operation is not idempotent and the state is not testable, the operation cannot be made atomic.

[Figure: a Test applied to an Unknown State determines whether it is the Old State or the New State.]

33

Idempotence of Physiological REDO

• Repeated REDOs can arise from repeated failures during restart.

Example: Suppose the following physiological log record were redone many times:

<insert op, base filename, page number, record value >

If no special care were taken, this repeated REDO would result in many inserts of the record into the page.

34

Idempotence of Physiological REDO (Cont’d)

• The following logic makes physiological REDOs idempotent:

idempotent_physiologic_redo (page, logrec)
{
    if (page_lsn < logrec_lsn)
        redo (page, logrec);
}

Note: The first successful REDO will advance the page LSN and cause all subsequent REDO of this log record to be null operations.
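The LSN test can be made concrete with a short C sketch. The page layout (a fixed array of integer records) and the redo_insert helper are assumptions made for the illustration; only the page_lsn < logrec_lsn comparison comes from the slide.

```c
#include <assert.h>

#define SLOTS 8   /* hypothetical number of record slots per page */

struct page {
    long lsn;             /* LSN of the most recent update applied */
    int  nrecords;
    int  records[SLOTS];
};

struct log_record {
    long lsn;
    int  record_value;    /* value to insert into the page */
};

/* The raw REDO of an insert: not idempotent on its own. */
static void redo_insert(struct page *p, const struct log_record *r)
{
    assert(p->nrecords < SLOTS);
    p->records[p->nrecords++] = r->record_value;
    p->lsn = r->lsn;      /* advance the page LSN */
}

/* Apply the log record only if the page has not yet seen it. */
void idempotent_physiologic_redo(struct page *p, const struct log_record *r)
{
    if (p->lsn < r->lsn)
        redo_insert(p, r);
}
```

Calling idempotent_physiologic_redo three times with the same record inserts the value once; the second and third calls see that the page LSN has caught up with the record's LSN and do nothing.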

35

The Need for the 2-Phase Commit Protocol

Cancel key: The client may hit the cancel key at any time during the transaction.

Server Logic: A server may require that a certain set of steps be performed in order to make a complete transaction.

Example: At commit, many forms-processing systems check the completeness of the data.

Integrity check: SQL has the option to defer referential integrity checks to transaction commit. If any integrity checks are violated at commit, the transaction changes cannot be committed, and SQL wants to abort the transaction.

36

The Need for the 2-Phase Commit Protocol (Cont’d)

Field calls: It is possible that field calls cannot acquire the locks or that the predicates become false at the end of the transaction. In such cases, the RM wants to abort the transaction.

2-Phase Commit Protocol: When a transaction is about to commit, each participant in the transaction is given a chance to vote on whether the transaction is a consistent state transformation. If all the RMs vote yes, the transaction can commit. If any vote no, the transaction is aborted.

37

2-Phase Commit: Commit

Phase I:

• Prepare: Invoke each RM asking for its vote.

• Decide: If all vote yes, durably write the transaction commit log record.

Note: The commit record write is what makes a transaction atomic and durable. If the system fails prior to that instant, the transaction will be undone at restart; otherwise, phase 2 will be carried forward by the restart logic.

38

2-Phase Commit: Commit (Cont’d)

Phase II:

• Commit: Invoke each RM telling it the commit decision.

Note: The RM can now release locks, deliver real messages, and perform other clean-up tasks.

• Complete: When all RMs acknowledge the commit message, write a commit completion record to the log, indicating that phase 2 ended. When the completion record is durable, deallocate the live transaction state.

Note: The phase 2 completion record is used at restart to indicate that the RMs have all been informed about the transaction.
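The two phases can be sketched in C for a single TM with local RMs. This is an illustration only: each RM is reduced to a predetermined vote and a state, the log is reduced to two flags, and commit_work, struct rm, and struct tm_log are hypothetical names rather than an actual TM interface.

```c
#include <assert.h>
#include <stdbool.h>

enum rm_state { RM_ACTIVE, RM_PREPARED, RM_COMMITTED, RM_ABORTED };

struct rm {
    bool will_vote_yes;   /* stand-in for the RM's consistency tests */
    enum rm_state state;
};

struct tm_log {
    bool commit_record;       /* durable commit record written */
    bool completion_record;   /* phase 2 completion record written */
};

/* Phase 1 callback: the RM votes on the transaction. */
static bool rm_prepare(struct rm *r)
{
    if (!r->will_vote_yes)
        return false;
    r->state = RM_PREPARED;
    return true;
}

/* Returns true if the transaction committed, false if it was aborted. */
bool commit_work(struct rm *rms, int n, struct tm_log *log)
{
    /* Phase 1, prepare: ask each RM for its vote. */
    for (int i = 0; i < n; i++) {
        if (!rm_prepare(&rms[i])) {
            for (int j = 0; j < n; j++)   /* any "no" vote aborts everyone */
                rms[j].state = RM_ABORTED;
            return false;
        }
    }
    /* Phase 1, decide: the commit record makes the outcome durable. */
    log->commit_record = true;

    /* Phase 2, commit: tell each RM the decision. */
    for (int i = 0; i < n; i++)
        rms[i].state = RM_COMMITTED;

    /* Phase 2, complete: all RMs have acknowledged. */
    log->completion_record = true;
    return true;
}
```

With two RMs that both vote yes, commit_work returns true and both log flags are set; if either RM votes no, no commit record is written and both RMs end in the aborted state.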

39

Performance Advantage of Logging

Commit copies no objects, only log records, to durable memory.

Logging converts random write I/Os into sequential write I/Os.

40

2-Phase Commit: Abort

• If any RM votes no during the prepare step, or if it does not respond at all, then the transaction cannot commit.

• The simplest thing to do in this case is to roll back the transaction by calling Abort_work ( ).

41

2-Phase Commit: Abort (Cont’d)

• The logic for Abort_work ( ) is as follows:

Undo: Read the transaction’s log backwards, issuing UNDO of each record. The RM that wrote the record is invoked to undo the operation.

Broadcast: At each savepoint, invoke each RM telling it the transaction is at the savepoint.

Abort: Write the transaction abort record to the log (UNDO of begin_work( )).

Complete: Write a complete record to the log indicating that abort ended. Deallocate the live transaction state.

42

Transaction Trees

• How does a transaction manager first hear about a distributed transaction?

There are two cases:

– Outgoing case: a local transaction sends a request to another node.

– Incoming case: a new transaction request arrives from a remote transaction manager.

43

Transaction Trees (Cont’d)

• The TMs involved in a transaction form the transaction tree.

[Figure: a transaction tree of TMs connected by sessions, with local RMs attached to the TMs. Labels: a local RM, a participant, a session.]

• The root TM (coordinator) performs the original Begin_Work( ).

• A TM with one incoming session and two outgoing sessions is both a participant (on the incoming session) and a coordinator (on the outgoing sessions).

44

Distributed 2-Phase Commit: Commit Coordinator

The root commit coordinator executes the following logic when a successful commit_work ( ) is invoked on a distributed transaction.

Local prepare: Invoke each local RM to prepare for commit.

Distributed prepare: Send a prepare request on each of the transaction’s outgoing sessions.

Decide: If all RMs vote yes and all outgoing sessions respond yes, then durably write the transaction commit log record containing a list of participating RMs and TMs.

45

Distributed 2-Phase Commit: Commit Coordinator (Cont’d)

Commit: Invoke each participating RM, telling it the commit decision. Send “commit” message on each of the transaction’s outgoing sessions.

Complete: When all local RMs and all outgoing sessions acknowledge commit, write a completion record to the log indicating that phase 2 completed.

When the completion record is durable, deallocate the live transaction state.

46

Distributed 2-Phase Commit: Commit Participant

• When the prepare message arrives, the participant executes the following logic:

Prepare ( )

Local Prepare: Invoke each local RM to prepare for commit

Distributed prepare: Send prepare requests on the outgoing sessions.

Decide: If all RMs vote yes and all outgoing sessions respond yes, then the local node is almost prepared.

Prepared: Durably write the transaction prepare log record containing a list of participating RMs, participating TMs, and the parent TM.

Respond: Send yes as response (vote) to the prepare message on the incoming session.

Wait: Wait (forever) for a commit message from the coordinator.
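The participant's Prepare( ) logic propagates down the transaction tree, which a recursive C sketch can show. The assumptions are loud ones: sessions are plain pointers rather than network messages, the local RMs' votes are collapsed into one flag, and struct tm_node and prepare are hypothetical names. The Wait step and the abort notification to children that already prepared are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SESSIONS 4   /* hypothetical limit on outgoing sessions */

struct tm_node {
    bool local_rms_vote_yes;                  /* combined vote of the local RMs */
    struct tm_node *outgoing[MAX_SESSIONS];   /* TMs on the outgoing sessions */
    int  n_outgoing;
    bool prepare_record_written;              /* durable prepare log record */
};

/* Returns the vote this TM sends on its incoming session. */
bool prepare(struct tm_node *tm)
{
    assert(tm != NULL && tm->n_outgoing <= MAX_SESSIONS);

    /* Local prepare: invoke each local RM to prepare for commit. */
    if (!tm->local_rms_vote_yes)
        return false;

    /* Distributed prepare: send prepare on each outgoing session. */
    for (int i = 0; i < tm->n_outgoing; i++) {
        if (!prepare(tm->outgoing[i]))
            return false;
    }

    /* Prepared: durably write the prepare log record. */
    tm->prepare_record_written = true;

    /* Respond: vote yes on the incoming session. */
    return true;
}
```

A TM writes its prepare record only after its whole subtree has voted yes, so a single "no" anywhere below it turns its own vote into "no".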