CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 294 Database Systems II Coping With System Failures

Database Systems II

Coping With System Failures


Introduction

System failures are events that cause the state of a transaction to be lost.

Potential causes of system failures are power loss, software errors and media failures.

Power loss leads to the loss of main memory states, media failure to a loss of disk states, and software errors can lead to both.

Recovery from system failures is based on the concept of transactions.


Introduction

We distinguish two types of system failures: temporary / local system failures and permanent / global system failures.

In a local failure, main memory content or the content of a few disk blocks is lost. A log of database modifications is used to recover from such failures.

In a global failure, the entire database content is lost. Archiving is employed to recover from such failures.


Transactions

A user’s program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned with what data is read from or written to the database.

A transaction is the DBMS’s abstract view of a user program: a sequence of reads and writes.

Requirements for transactions:
• Atomicity: “all or nothing”,
• Consistency: transforms a consistent DB state into another consistent DB state,
• Independence: from all other transactions,
• Durability: survives any system failures.


Transactions

Users submit transactions, and can think of each transaction as executing by itself.

Concurrency is achieved by the DBMS, which interleaves actions (reads/writes of DB objects) of various transactions.

Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.


Transactions

The DBMS will enforce some integrity constraints (ICs), depending on the ICs declared in CREATE TABLE statements, in triggers etc.

Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).

Issues: the effect of interleaving (concurrent) transactions (next chapter), and system failures (this chapter).


Transactions

A transaction can end in two different ways:
- commit: successful end, all actions completed,
- abort: unsuccessful end, only some actions executed.

A transaction can also be aborted by the DBMS.

The DBMS guarantees that a transaction is atomic. That is, a user can think of a transaction as always executing all its actions, or not executing any actions at all.


Transactions

The DBMS logs all actions so that it can undo the actions of aborted transactions. This ensures the atomicity of transactions.

The log is also employed to redo actions of committed transactions if a system failure occurs. This ensures the durability of transactions.


Primitive Operations

Database modifications are initially performed in the (main memory) buffer.

In order to reduce the number of IO operations, the buffer manager writes buffer blocks back to disk only if necessary.

In order to study failure recovery operations, we need to consider four primitive operations to read and modify disk blocks and buffer blocks. In the following, we assume a database element X which is not larger than a single block.


Primitive Operations

Input (x): transfer block containing x from disk to memory (buffer)

Output (x): transfer block containing x from buffer to disk

Read (x,t): do Input(x) if necessary; assign value of x in block to local variable t (in buffer)

Write (x,t): do Input(x) if necessary; assign value of local variable t (in buffer) to x

Read and Write are issued by transactions, Input and Output are issued by the buffer manager.


Primitive Operations

The key problem is unfinished transactions.

Example

Constraint: A = B
T1: A ← A × 2; B ← B × 2

Initially, A = B = 8


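Primitive Operations

T1:
Read (A,t); t ← t × 2
Write (A,t);
Read (B,t); t ← t × 2
Write (B,t);
Output (A);
Output (B);

memory: A: 8 → 16, B: 8 → 16
disk: A: 8 → 16, B: 8

failure! A system failure after Output (A) but before Output (B) leaves A = 16 and B = 8 on disk, violating the constraint A = B.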
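The four primitives and the failure scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries `disk` and `buffer` and the function names are hypothetical, and each database element is taken to occupy its own block.

```python
# Hypothetical model: `disk` and `buffer` map element names to values,
# with one database element per block.
disk = {"A": 8, "B": 8}
buffer = {}

def input_(x):
    """Input(x): copy the block holding x from disk into the buffer."""
    buffer[x] = disk[x]

def output(x):
    """Output(x): copy the block holding x from the buffer back to disk."""
    disk[x] = buffer[x]

def read(x):
    """Read(x, t): Input(x) if necessary, then return x's buffered value."""
    if x not in buffer:
        input_(x)
    return buffer[x]

def write(x, t):
    """Write(x, t): Input(x) if necessary, then assign t to x in the buffer."""
    if x not in buffer:
        input_(x)
    buffer[x] = t

# T1 doubles A and B, but the system "fails" after Output(A).
t = read("A")
write("A", t * 2)
t = read("B")
write("B", t * 2)
output("A")
# Simulated failure here: Output(B) is never executed.
print(disk)  # the disk now violates the constraint A = B
```

After the simulated failure the buffer content is gone, and the disk alone no longer tells us what the old value of A was. That is the gap the log is meant to close.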


Primitive Operations

Undo logging

T1:
Read (A,t); t ← t × 2
Write (A,t);
Read (B,t); t ← t × 2
Write (B,t);
Output (A);
Output (B);

memory: A: 8 → 16, B: 8 → 16
disk: A: 8 → 16, B: 8 → 16
log: <T1, start> <T1, A, 8> <T1, B, 8> <T1, commit>


Logging

What content should log records have?

When to write log records back to disk?

How to deal with system failures during logging?

Different types of logging:
- undo logging,
- redo logging,
- undo/redo logging.

The log manager (a DBMS component) records (logs) relevant events and manages the corresponding log file.


Logging

Log records are first kept in the buffer.

Log blocks are written to disk as soon as feasible.

FLUSH LOG: copy to disk all log blocks that are new or have changed since the last flush.

Generic log records used in each logging type:
- <START T>: start of transaction T.
- <COMMIT T>: transaction T completed successfully.
- <ABORT T>: transaction T was terminated unsuccessfully.


Undo Logging

Undo logging supports the undo of transactions that were incomplete at the time of a system failure.

In addition to the generic log records, undo logging keeps update records <T, X, v>:
- T: transaction,
- X: database element (tuple, attribute),
- v: former value (before modification).


Undo Logging

The log is first written in memory. It is not written to disk on every action.

memory: A: 8 → 16, B: 8 → 16, Log: <T1, start> <T1, A, 8> <T1, B, 8>
DB (disk): A: 8 → 16, B: 8
Log (disk): no records yet

BAD STATE #1: the old value of A is lost if a system failure occurs.


Undo Logging

The log is first written in memory. It is not written to disk on every action.

memory: A: 8 → 16, B: 8 → 16, Log: <T1, start> <T1, A, 8> <T1, B, 8> <T1, commit>
DB (disk): A: 8 → 16, B: 8
Log (disk): ... <T1, B, 8> <T1, commit>

BAD STATE #2: the new value of B is lost if a system failure occurs.


Undo Logging

To avoid these bad states, the log manager and buffer manager need to obey the following rules:

U1: If transaction T modifies database element X, then the log record <T, X, v> must be written to disk before the new value of X is written to disk (write-ahead logging).

U2: If a transaction commits, then its COMMIT log record must be written to disk only after all database elements changed by the transaction have been written to disk.
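One way to see how U1 and U2 order the disk writes is the following sketch. It is a simplified illustration, not the course's implementation: the names `log_buffer`, `log_disk`, `flush_log`, `write`, `output` and `commit` are hypothetical, and the log is flushed before every Output, which is the bluntest way to satisfy U1.

```python
# Hypothetical model of undo logging for the running example T1.
disk = {"A": 8, "B": 8}
buffer = dict(disk)   # assume both blocks are already in the buffer
log_buffer = []       # log records still in main memory
log_disk = []         # log records that have reached the disk

def flush_log():
    """FLUSH LOG: move all buffered log records to disk."""
    log_disk.extend(log_buffer)
    log_buffer.clear()

def write(txn, x, new_value):
    """Write with an undo update record holding the OLD value of x."""
    log_buffer.append((txn, x, buffer[x]))
    buffer[x] = new_value

def output(x):
    """U1: the update record must be on disk before x itself is."""
    flush_log()
    disk[x] = buffer[x]

def commit(txn, modified):
    """U2: the COMMIT record goes to disk only after all changed elements."""
    for x in modified:
        output(x)
    log_buffer.append(("COMMIT", txn))
    flush_log()

log_buffer.append(("START", "T1"))
write("T1", "A", buffer["A"] * 2)
write("T1", "B", buffer["B"] * 2)
commit("T1", ["A", "B"])
print(log_disk)
```

With this ordering, neither bad state can arise: the old values reach the disk log before A or B is overwritten, and the COMMIT record appears only once both new values are on disk.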


Undo Logging

Upon a system failure, the recovery manager (a DBMS component) uses the log to restore a consistent database state.

Distinguish committed and uncommitted transactions, based on COMMIT log records.

Committed transactions cannot have created an inconsistent state, because all of their modifications have been written to disk (U2).


Undo Logging

Modifications by aborted transactions are also unproblematic, since they have already been undone.

An uncommitted transaction T may have created an inconsistent DB state.

For each modification of T written to disk, the corresponding log record must be on disk (U1).

To undo this action of T, restore database element X to its old value v as provided by the log record.


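Undo Logging

Consider all uncommitted transactions, starting with the most recent one and going backward.

Undo all actions of these transactions.

Why go backward rather than forward?

Example: T1, T2 and T3 all write A. T1 executed before T2, which executed before T3. T1 committed; T2 and T3 are incomplete.

time/log: T1 write A, T1 commit, T2 write A, T3 write A, system failure

Going backward first restores the value that T3 overwrote and then the value that T2 overwrote, so A ends with the value written by the committed T1. Going forward would leave A at the value that T3 overwrote, which is the uncommitted value written by T2.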


Undo Logging

Recovery algorithm

(1) Let S = set of transactions with <Ti, start> in log, but no <Ti, commit> or <Ti, abort> record in log.

(2) For each <Ti, X, v> in log, in reverse order (from latest to earliest) do:
if Ti ∈ S then
- Write (X, v)
- Output (X).

(3) For each Ti ∈ S do
- write <Ti, abort> to log.

(4) Flush log.
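The recovery algorithm above can be sketched as follows. The tuple encoding of log records (`"START"`, `"UPDATE"`, `"COMMIT"`, `"ABORT"`), the function name `undo_recover` and the concrete values of A are assumptions made for illustration; flushing the log in step (4) is not modelled.

```python
def undo_recover(log, disk):
    """Undo every transaction that started but neither committed nor aborted."""
    started = {rec[1] for rec in log if rec[0] == "START"}
    finished = {rec[1] for rec in log if rec[0] in ("COMMIT", "ABORT")}
    incomplete = started - finished              # step (1): the set S

    for rec in reversed(log):                    # step (2): latest to earliest
        if rec[0] == "UPDATE" and rec[1] in incomplete:
            _, _, x, old_value = rec
            disk[x] = old_value                  # Write(X, v); Output(X)

    for txn in sorted(incomplete):               # step (3)
        log.append(("ABORT", txn))
    return incomplete

# T1, T2 and T3 all write A; only T1 committed before the failure.
# The values 1..4 are made up for this example.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 1), ("COMMIT", "T1"),
    ("START", "T2"), ("UPDATE", "T2", "A", 2),
    ("START", "T3"), ("UPDATE", "T3", "A", 3),
]
disk = {"A": 4}            # T3's value reached the disk before the failure

undone = undo_recover(log, disk)
print(undone, disk)
```

Because the update records are processed from latest to earliest, A is first set back to 3 and then to 2, the value written by the committed T1. Running `undo_recover` again finds no incomplete transactions and leaves the disk unchanged.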


Undo Logging

What if a system failure happens during the recovery?

We just repeat the undo from scratch.

This is no problem, since multiple repetitions of the recovery algorithm are equivalent to a single execution.

In principle, we need to examine the entire log.

Checkpointing is a method to limit the part of the log that needs to be considered during recovery up to a certain point (checkpoint).


Undo Logging

To create a checkpoint:
- stop accepting new transactions,
- wait until all current transactions commit or abort and have written the corresponding log records,
- flush the log to disk,
- write a <CKPT> log record and flush the log,
- resume accepting new transactions.

When encountering a checkpoint record during the backward scan, we know that no transaction that started before it is incomplete.

Recovery therefore does not need to go backward beyond the checkpoint.
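A hedged sketch of how a checkpoint record shortens the backward scan in undo recovery is given below. The record encoding, the `("CKPT",)` tuple and the function name are hypothetical; the sketch relies on the checkpoint having been taken as described, so that every transaction older than the checkpoint is already complete.

```python
def undo_recover_from_checkpoint(log, disk):
    """Undo incomplete transactions, scanning back only to the last <CKPT>."""
    tail = []                          # records after the checkpoint, latest first
    for rec in reversed(log):
        if rec[0] == "CKPT":
            break                      # nothing before this point needs undoing
        tail.append(rec)

    finished = {r[1] for r in tail if r[0] in ("COMMIT", "ABORT")}
    incomplete = {r[1] for r in tail if r[0] == "START"} - finished

    for rec in tail:                   # already in latest-to-earliest order
        if rec[0] == "UPDATE" and rec[1] in incomplete:
            disk[rec[2]] = rec[3]      # restore the old value
    return len(tail)                   # number of log records examined

# Made-up example: T1 finished before the checkpoint, T2 is incomplete.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 5), ("COMMIT", "T1"),
    ("CKPT",),
    ("START", "T2"), ("UPDATE", "T2", "B", 10),
]
disk = {"A": 6, "B": 20}

examined = undo_recover_from_checkpoint(log, disk)
print(examined, disk)
```

Only the two records after the checkpoint are examined; the three records of T1 are never touched.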


Redo Logging

In undo logging, we need to write all modified data to disk before committing a transaction.

This may require an unnecessarily large number of block IOs.

With redo logging, DB modifications can be written to disk later than commit time.

No undo is necessary, since DB modifications are written to disk only after commit.

Update records <T, X, v> record the new (not old) value (after modification).


Redo Logging

Example

T1: Read(A,t); t = t × 2; Write (A,t); Read(B,t); t = t × 2; Write (B,t); Output(A); Output(B)

memory: A: 8 → 16, B: 8 → 16
DB: A: 8 → 16, B: 8 → 16 (after output)
LOG: <T1, start> <T1, A, 16> <T1, B, 16> <T1, commit> <T1, end>


Redo Logging

Before modifying any database element on disk, the corresponding log records (update and COMMIT) must be written to disk.

Redo logging works as follows:
(1) For every action, generate a redo log record.
(2) Before X is modified on disk, all log records for the transaction that modified X (including commit) must be on disk.
(3) Flush the log at commit.
(4) Write an END log record after the DB modifications have been flushed to disk.


Redo Logging

In recovery, we need to redo modifications by committed transactions that have not yet been flushed to disk.

Recovery algorithm

(1) Let S = set of transactions with <Ti, commit> and no <Ti, end> in log.

(2) For each <Ti, X, v> in log, in forward order (from earliest to latest) do:
- if Ti ∈ S then Write(X, v); Output(X).

(3) For each Ti ∈ S, write <Ti, end>.
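The redo recovery algorithm can be sketched in the same style. The tuple encoding of log records and the function name `redo_recover` are assumptions for illustration; update records carry the new value, as redo logging requires. T2 and its element C are made up to show an uncommitted transaction.

```python
def redo_recover(log, disk):
    """Redo every transaction that has a COMMIT record but no END record."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    ended = {rec[1] for rec in log if rec[0] == "END"}
    to_redo = committed - ended                  # step (1): the set S

    for rec in list(log):                        # step (2): earliest to latest
        if rec[0] == "UPDATE" and rec[1] in to_redo:
            _, _, x, new_value = rec
            disk[x] = new_value                  # Write(X, v); Output(X)

    for txn in sorted(to_redo):                  # step (3)
        log.append(("END", txn))
    return to_redo

# T1 committed, but its changes never reached the disk.
# T2 did not commit, so its update record is ignored.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 16), ("UPDATE", "T1", "B", 16),
    ("COMMIT", "T1"),
    ("START", "T2"), ("UPDATE", "T2", "C", 5),
]
disk = {"A": 8, "B": 8, "C": 1}

redone = redo_recover(log, disk)
print(redone, disk)
```

No undo step is needed for T2: under redo logging its modification of C cannot have reached the disk before its COMMIT record did.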


Redo Logging

The END log records allow us to limit the number of transactions that need to be considered in a recovery.

Alternatively, can set a checkpoint:
(1) Do not accept new transactions.
(2) Wait until all transactions finish.
(3) Flush all log records to disk.
(4) Flush all buffers to disk (do not discard buffers).
(5) Write “checkpoint” log record on disk.
(6) Resume transaction processing.


Redo Logging

Example

Redo log (disk): ... <T1, A, 16> ... <T1, commit> ... <Checkpoint> ... <T2, B, 17> ... <T2, commit> ... <T3, C, 21> ... system failure

Recovery does not need to go beyond the checkpoint.


Undo/Redo Logging

Undo logging requires all modifications to be written to disk before the transaction can commit, leading to an unnecessarily large number of IOs.

Redo logging requires all modified blocks to be kept in the buffer until the transaction commits and the log records have been flushed, increasing the buffer size requirement.

Undo/redo logging combines undo and redo logging. It provides more flexibility in flushing modified blocks at the expense of maintaining more information in the log.


Undo/Redo Logging

Update records <T, X, new, old> record the new and old value of X.

Undo/redo logging has only the constraint that undo logging and redo logging have in common.

The only undo/redo logging rule is as follows:
UR1: The log record must be flushed before the corresponding modified block (write-ahead logging).

Block of X can be flushed before or after T commits, i.e. before or after the COMMIT log record.

Flush the log at commit.


Undo/Redo Logging

Because of the flexibility of flushing X before or after the COMMIT record, we can have uncommitted transactions with modifications on disk and committed transactions with modifications not yet on disk.

The undo/redo recovery policy is as follows:
- Redo committed transactions.
- Undo uncommitted transactions.


Undo/Redo Logging

More details on the recovery procedure:

- Backward pass: from the end of the log back to the latest valid checkpoint, construct the set S of committed transactions. Undo actions of transactions not in S.
- Forward pass: from the latest checkpoint forward to the end of the log, redo actions of transactions in S.

Alternatively, can also perform the redos before the undos.
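The two passes can be sketched as follows. This is a simplified illustration under stated assumptions: the record encoding `("UPDATE", T, X, new, old)` mirrors <T, X, new, old>, the function name is hypothetical, checkpoints are ignored so both passes cover the whole log, and the set S is collected before the backward pass rather than during it.

```python
def undo_redo_recover(log, disk):
    """Undo uncommitted transactions, then redo committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}   # the set S

    # Backward pass: undo updates of transactions not in S.
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] not in committed:
            _, _, x, _new, old = rec
            disk[x] = old

    # Forward pass: redo updates of transactions in S.
    for rec in log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _, x, new, _old = rec
            disk[x] = new
    return committed

# Made-up example: T1 committed but A was not yet flushed;
# T2 did not commit but its new value of B already reached the disk.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 16, 8),
    ("START", "T2"), ("UPDATE", "T2", "B", 20, 8),
    ("COMMIT", "T1"),
]
disk = {"A": 8, "B": 20}

committed = undo_redo_recover(log, disk)
print(committed, disk)
```

The example shows both situations that the flexible flushing policy allows: a committed change that is missing from disk (A) and an uncommitted change that is present on disk (B).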


Undo/Redo Logging

In either case, the following can happen.

Transaction T1 has committed and is redone.

However, T1 has read X written by transaction T2, which has not committed and is undone.

This situation needs to be avoided, since the resulting database state is inconsistent (not serializable).

Concurrency control ensures that this situation is avoided.


Protecting Against Media Failures

Logging protects from local loss of main memory and disk content, but not against global loss of secondary storage content (media failure).

To protect against media failures, employ archiving: maintaining a copy of the database on a separate, secure storage device.

Log also needs to be archived in the same manner.

Two levels of archiving: full dump vs. incremental dump.



Typically, the database cannot be shut down for the period of time needed to make a backup copy (dump).

We need to perform nonquiescent archiving, i.e. create a dump while the DBMS continues to process transactions.

The goal is to make a copy of the database as of the time when the dump began, but transactions may change database content during the dumping.

Logging continues during the dumping, and discrepancies can be corrected from the log.



We assume undo/redo (or redo) logging.

The archiving procedure is as follows:
- Write a log record <START DUMP>.
- Perform a checkpoint for the log.
- Perform a (full / incremental) dump on the secure storage device.
- Make sure that enough of the log has been copied to the secure storage device so that at least the log up to the checkpoint will survive a media failure.
- Write a log record <END DUMP>.



After a media failure, we can restore the DB from the archived DB and archived log as follows:
- Copy the latest full dump (archive) back to the DB.
- Starting with the earliest ones, make the modifications recorded in the incremental dump(s) in increasing order of time.
- Further modify the DB using the archived log. Use the recovery method corresponding to the chosen type of logging.
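The restore steps can be sketched as below. Everything concrete here is an assumption for illustration: dumps are modelled as dictionaries, an incremental dump holds only the elements changed since the previous dump, the archived log uses redo-style records with new values, and the function name `restore` is hypothetical.

```python
def restore(full_dump, incremental_dumps, archived_log):
    """Rebuild the database from a full dump, incremental dumps and the log."""
    db = dict(full_dump)                         # copy the latest full dump

    for dump in incremental_dumps:               # earliest first
        db.update(dump)

    # Redo-style recovery from the archived log: replay committed updates.
    committed = {rec[1] for rec in archived_log if rec[0] == "COMMIT"}
    for rec in archived_log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _, x, new_value = rec
            db[x] = new_value
    return db

# Made-up archive contents.
full_dump = {"A": 1, "B": 2, "C": 3}
incremental_dumps = [{"A": 10}, {"A": 11, "B": 20}]
archived_log = [
    ("START", "T1"), ("UPDATE", "T1", "C", 30), ("COMMIT", "T1"),
    ("START", "T2"), ("UPDATE", "T2", "B", 99),   # T2 never committed
]

db = restore(full_dump, incremental_dumps, archived_log)
print(db)
```

With undo/redo logging, the last step would use the undo/redo recovery method instead of the redo-only replay shown here.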
