
Databases might get lost
We don’t like that, because …
Solutions are based on logging techniques
General term: write-ahead logging



Wrong user data: avoid by using constraints
System failure: loss of main memory (= volatile memory)
Media failures (disk errors)
Catastrophe (several levels possible)


Traffic between disk and main memory: I/O
Unit size: page or block
Block in memory is buffered
Typically 32 kbyte
Access time: 5 – 10 msec
Block is often the unit for concurrency (sometimes tuple)


Log file is a separate file, containing information to support database reconstruction
Entries have record structure
Entries are (in most cases) related to a specific transaction
Good practice: log file on separate disk
Recent development: log files on SSD


INPUT(X): copy block X from disk to memory
READ(X,t): assign value of X to variable t
WRITE(X,t): copy value of t into X in memory
OUTPUT(X): copy block X from memory to disk (also called flushing X)

Example (financial transaction):
INPUT(Account1); READ(Account1, v1); v1 := v1 - 200;
WRITE(Account1, v1); OUTPUT(Account1);
INPUT(Account2); READ(Account2, v2); v2 := v2 + 200;
WRITE(Account2, v2); OUTPUT(Account2);
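The four primitives can be mimicked with two dictionaries, which makes the example transaction easy to run. This is only a minimal sketch: the names `disk`, `buffer`, `local` and `transfer`, the starting balances, and the `crash_after_first_output` flag are assumptions made for illustration and do not come from the slides.

```python
# Hypothetical two-level storage: block name -> value.
disk = {"Account1": 1000, "Account2": 500}   # survives a system failure
buffer = {}                                  # main-memory copies (volatile)
local = {}                                   # transaction variables such as v1, v2

def INPUT(x):
    """Copy block x from disk to memory."""
    buffer[x] = disk[x]

def READ(x, t):
    """Assign the in-memory value of x to variable t."""
    if x not in buffer:
        INPUT(x)
    local[t] = buffer[x]

def WRITE(x, t):
    """Copy the value of variable t into x in memory."""
    buffer[x] = local[t]

def OUTPUT(x):
    """Copy block x from memory to disk (flush x)."""
    disk[x] = buffer[x]

def transfer(crash_after_first_output=False):
    """Move 200 from Account1 to Account2, optionally crashing halfway."""
    INPUT("Account1"); READ("Account1", "v1")
    local["v1"] -= 200
    WRITE("Account1", "v1"); OUTPUT("Account1")
    if crash_after_first_output:
        buffer.clear(); local.clear()        # main memory is lost
        return
    INPUT("Account2"); READ("Account2", "v2")
    local["v2"] += 200
    WRITE("Account2", "v2"); OUTPUT("Account2")
```

Calling `transfer()` leaves the total balance unchanged; calling `transfer(crash_after_first_output=True)` leaves Account1 debited on disk while Account2 is untouched.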


INPUT(Account1); READ(Account1, v1); v1 := v1 - 200;
WRITE(Account1, v1); OUTPUT(Account1);

>> CRASH <<

INPUT(Account2); READ(Account2, v2); v2 := v2 + 200;
WRITE(Account2, v2); OUTPUT(Account2);

Result: Account1 is debited on disk, Account2 is never credited


Old values of each data element X should be written to the log file: <T, X, oldvalue>
Such a record is often called the before image of X
Before doing an OUTPUT(X), the log record for this X should be flushed
The <T, commit> record is written to the log after all database elements have been updated on disk
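The rules above can be sketched as a few functions that enforce the ordering. This is a hedged illustration, not a definitive implementation: the elements `A` and `B`, their values, and the names `m_log`, `d_log`, `write`, `output` and `commit` are assumptions chosen for the example.

```python
disk = {"A": 8, "B": 8}      # hypothetical database elements on disk
buffer = dict(disk)          # their copies in main memory
m_log, d_log = [], []        # log tail in memory, log on disk

def flush_log():
    """Move all in-memory log records to the log on disk."""
    d_log.extend(m_log)
    m_log.clear()

def write(t, x, value):
    """Log the before image <T, X, oldvalue>, then change X in memory."""
    m_log.append((t, x, buffer[x]))
    buffer[x] = value

def output(x):
    """Flush the log first, so the before image reaches disk before X does."""
    flush_log()
    disk[x] = buffer[x]

def commit(t, written):
    """Output every element T changed, and only then log <T, commit>."""
    for x in written:
        output(x)
    m_log.append((t, "commit"))
    flush_log()

write("T", "A", 16)
write("T", "B", 16)
commit("T", ["A", "B"])
```

After this run the log on disk holds both before images followed by the commit record, in that order.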


Example transaction run + log entries
Distinguish values in M(emory) and on D(isk)
Note: Log means M-Log; after FLUSH LOG: D-Log = M-Log


Check the log file for uncommitted transactions
Roll back these transactions using the before images
The order of undoing transactions is essential
What about … a crash during the recovery process?
By the way: restart the transactions that are undone
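A minimal sketch of this recovery procedure, assuming log records are tuples `(T, X, oldvalue)` and `(T, "commit")`. The function name `undo_recover` and the sample log are hypothetical. The log is scanned from the newest record to the oldest, so that the oldest before image of an element is the one restored last.

```python
def undo_recover(d_log, disk):
    """Restore before images of all uncommitted transactions, newest first."""
    committed = {rec[0] for rec in d_log if len(rec) == 2 and rec[1] == "commit"}
    for rec in reversed(d_log):
        if len(rec) == 3 and rec[0] not in committed:
            t, x, old = rec
            disk[x] = old
    return disk

# Hypothetical log: T1 committed, T2 wrote A twice and B once, then the crash.
log = [("T1", "A", 8), ("T1", "commit"),
       ("T2", "A", 16), ("T2", "B", 5), ("T2", "A", 20)]
state = undo_recover(log, {"A": 30, "B": 7})
```

Here `state` ends up with `A = 16`, the value T1 committed; a forward scan would wrongly leave `A = 20`. Running `undo_recover` again on the result changes nothing, which is why repeating the procedure after a crash during recovery is harmless in this sketch.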


Dealing with all the transactions since the DB started operating (possibly a few years ago) may be less desirable

Checkpoint steps:
1. No new transactions accepted
2. Wait until all active transactions are finished (COMMIT or ABORT)
3. Flush the log records
4. Write a <CKPT> record into the log
5. Resume transaction processing
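Because every transaction has finished before the <CKPT> record is written, recovery only needs the part of the log that follows the most recent checkpoint. A small sketch of that idea, assuming the checkpoint record is represented by the tuple `("CKPT",)`; the helper name `records_to_examine` is hypothetical.

```python
CKPT = ("CKPT",)   # assumed representation of the <CKPT> log record

def records_to_examine(d_log):
    """Return the log records written after the most recent <CKPT>."""
    for i in range(len(d_log) - 1, -1, -1):
        if d_log[i] == CKPT:
            return d_log[i + 1:]
    return list(d_log)      # no checkpoint yet: the whole log is needed

log = [("T1", "A", 5), ("T1", "commit"), CKPT, ("T2", "B", 10)]
tail = records_to_examine(log)
```

In this example `tail` contains only T2's record; everything before the checkpoint can be ignored by recovery.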


INPUT(Account1); READ(Account1, v); v := v - 200;
WRITE(Account1, v);
INPUT(Account2); READ(Account2, v); v := v + 200;
WRITE(Account2, v);
COMMIT;

>> CRASH <<

OUTPUT(Account1);
OUTPUT(Account2);


Why would you delay OUTPUT?



Before commitment, the new value of X should be written to the log file: <T, X, newvalue>
Such a record is often called the after image of X
After the logging of all after images, the <T, commit> is written to the log


Example transaction run + log entries
Distinguish values in M(emory) and on D(isk)


Check the log file for committed transactions
Restore the effects of these transactions using the after images
The order of redoing transactions is essential
What about … a crash during the recovery process?
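Redo recovery mirrors the undo case, but scans the log from the oldest record to the newest and touches only committed transactions. The sketch below assumes records of the form `(T, X, newvalue)` and `(T, "commit")`; `redo_recover` and the sample log are hypothetical.

```python
def redo_recover(d_log, disk):
    """Reapply after images of all committed transactions, oldest first."""
    committed = {rec[0] for rec in d_log if len(rec) == 2 and rec[1] == "commit"}
    for rec in d_log:
        if len(rec) == 3 and rec[0] in committed:
            t, x, new = rec
            disk[x] = new
    return disk

# Hypothetical log: T1 and T2 committed, T3 did not; nothing was flushed yet.
log = [("T1", "A", 16), ("T1", "commit"),
       ("T2", "A", 20), ("T2", "commit"),
       ("T3", "B", 99)]
state = redo_recover(log, {"A": 8, "B": 8})
```

The forward scan leaves `A = 20`, the value of the last committed writer, and ignores T3 entirely. As with undo, applying the function a second time yields the same state.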


Combining before image and after image: <T, X, oldvalue, newvalue>
Optimal freedom for buffer manager
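With both images in one record, recovery can repair either situation: a committed change that never reached disk, or an uncommitted change that did. The following is a sketch under stated assumptions: records are `(T, X, oldvalue, newvalue)` and `(T, "commit")`, the name `undo_redo_recover` is hypothetical, and performing the undo pass before the redo pass is one possible ordering chosen here, not something the slides prescribe.

```python
def undo_redo_recover(d_log, disk):
    """Undo uncommitted work (newest first), then redo committed work (oldest first)."""
    committed = {rec[0] for rec in d_log if len(rec) == 2 and rec[1] == "commit"}
    for rec in reversed(d_log):
        if len(rec) == 4 and rec[0] not in committed:
            disk[rec[1]] = rec[2]       # before image
    for rec in d_log:
        if len(rec) == 4 and rec[0] in committed:
            disk[rec[1]] = rec[3]       # after image
    return disk

# Hypothetical crash state: T1 committed but A was not yet flushed;
# T2 is uncommitted but its change to B already reached disk.
log = [("T1", "A", 8, 16), ("T1", "commit"), ("T2", "B", 8, 5)]
state = undo_redo_recover(log, {"A": 8, "B": 5})
```

Both early and late OUTPUTs are tolerated in this example, which is the freedom the buffer manager gains from the combined record.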


Definition: Tj RF (reads from) Ti if there is an X such that Wi[X] is the last write on X before Rj[X]

Definition: if Tj RF Ti and Ti is not yet committed, then this read is called a dirty read

A schedule is recoverable if for each pair of transactions the following property holds:
Tj RF Ti => COMMITi < COMMITj (in log)

A schedule avoids cascading rollbacks if:
Tj RF Ti => COMMITi < Rj[X]
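The two properties above can be checked mechanically on a small schedule. This sketch assumes a schedule is a list of tuples `("r" | "w", transaction, item)` and `("c", transaction)`, and it ignores aborts for simplicity; the function names are hypothetical.

```python
def reads_from(schedule):
    """Yield (j, i, pos): Tj reads X at index pos, and Ti made the last write on X."""
    last_writer = {}
    for pos, op in enumerate(schedule):
        if op[0] == "w":
            last_writer[op[2]] = op[1]
        elif op[0] == "r":
            i = last_writer.get(op[2])
            if i is not None and i != op[1]:
                yield op[1], i, pos

def commit_pos(schedule):
    """Map each committed transaction to the index of its commit."""
    return {op[1]: pos for pos, op in enumerate(schedule) if op[0] == "c"}

def is_recoverable(schedule):
    """Tj RF Ti => COMMITi < COMMITj, for every committed reader Tj."""
    c = commit_pos(schedule)
    return all(i in c and c[i] < c[j]
               for j, i, _ in reads_from(schedule) if j in c)

def avoids_cascading_rollbacks(schedule):
    """Tj RF Ti => COMMITi < Rj[X], i.e. no dirty reads."""
    c = commit_pos(schedule)
    return all(i in c and c[i] < pos for _, i, pos in reads_from(schedule))

s1 = [("w", 1, "X"), ("r", 2, "X"), ("c", 2), ("c", 1)]   # reader commits first
s2 = [("w", 1, "X"), ("r", 2, "X"), ("c", 1), ("c", 2)]   # dirty read, safe commit order
s3 = [("w", 1, "X"), ("c", 1), ("r", 2, "X"), ("c", 2)]   # read after commit
```

Schedule `s1` is not recoverable, `s2` is recoverable but contains a dirty read, and `s3` satisfies both properties.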


A schedule is strict if:
Wi[X] < Oj[X] => COMMITi < Oj[X] or ABORTi < Oj[X]
for any read or write operation Oj[X]

Implementation of strictness by 2PL: hold your locks until ABORT/COMMIT

Relates to SQL Isolation Levels
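The strictness condition can be tested in the same style. The sketch below assumes a schedule is a list of tuples `("r" | "w", transaction, item)`, `("c", transaction)` and `("a", transaction)`, and that the condition applies to operations of a different transaction than the writer; `is_strict` is a hypothetical name.

```python
def is_strict(schedule):
    """No transaction reads or writes X while another writer of X is unfinished."""
    pending = {}                        # item -> transactions with an unfinished write
    for op in schedule:
        if op[0] in ("c", "a"):
            for writers in pending.values():
                writers.discard(op[1])  # the transaction has committed or aborted
        else:
            kind, j, x = op
            if pending.get(x, set()) - {j}:
                return False            # Oj[X] arrives before the writer finished
            if kind == "w":
                pending.setdefault(x, set()).add(j)
    return True

not_strict = [("w", 1, "X"), ("w", 2, "X"), ("c", 1), ("c", 2)]
strict = [("w", 1, "X"), ("c", 1), ("w", 2, "X"), ("c", 2)]
after_abort = [("w", 1, "X"), ("a", 1), ("r", 2, "X"), ("c", 2)]
```

Holding write locks until COMMIT or ABORT, as in the 2PL remark above, rules out schedules like `not_strict`.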


Up till now: system failures
  ◦ Memory lost; data on disk undamaged

What to do in case of disk failure?
Archiving: always keep a copy of your entire DB
Full dump or incremental dump
Keep logs since last dump
Archive copy + log = actual DB

… that’s why you should keep your log file on a separate disk


Checkpoint for REDO
Non-quiescent checkpoint

>> see the exercises