
ARIES Recovery Algorithm

• Dominant recovery scheme
– Microsoft SQL Server
– IBM DB2
– Oracle

• Based on Steal/No-Force
– We will look at the basic idea with immediate update

• But we won't cover it in detail
• Lots of details in the text
– Good topic for a presentation!

Immediate Update – Undo/Redo

• Steal/No-Force
– Most commonly used

• Steal: AFIMs (after images) of a transaction can be flushed to the database disk before it commits. Consequence?
• The recovery manager may need to undo the writes of some transactions during recovery.

• No-Force: when T commits, its modified blocks in RAM need not be flushed to disk. Consequence?
• The recovery manager may need to redo the writes of committed transactions
• Using WAL (write-ahead logging).

System Log Records

Types of system log record:
1. [start_transaction,T]: Records that transaction T has started execution.
2. [write_item,T,X,old_value,new_value]: Records that transaction T has changed the value of database item X from old_value to new_value.
3. [read_item,T,X]: Records that transaction T has read the value of database item X.
4. [commit,T]: Records that transaction T has completed successfully, and affirms that its effect can be committed (recorded permanently) to the database.
5. [abort,T]: Records that transaction T has been aborted.

Recovery using log records:
• If the system crashes, we can recover to a consistent database state by examining the log
– by using one of the techniques we will study later
• The log contains a record of every write operation that changes the value of some database item
– It is possible to undo the effect of the write operations of a transaction T. How?
• Trace backward through the log and reset all items changed by a write operation of T to their old_values.
• We can also redo the effect of the write operations of a transaction T. How?
• Trace forward through the log and set all items changed by a write operation of T (that did not get done permanently) to their new_values.
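The backward and forward traces described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout mirrors the [write_item,T,X,old_value,new_value] record, while the dictionary `db` standing in for the database and the function names `undo` and `redo` are hypothetical.

```python
# Log records mirror the slide's formats, e.g. [write_item, T, X, old_value, new_value].
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 2, 3),
    ("write_item", "T1", "X", 3, 4),
]

def undo(log, txn, db):
    """Trace backward and reset every item written by txn to its old_value."""
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] == txn:
            _, _, item, old_value, _ = rec
            db[item] = old_value

def redo(log, txn, db):
    """Trace forward and set every item written by txn to its new_value."""
    for rec in log:
        if rec[0] == "write_item" and rec[1] == txn:
            _, _, item, _, new_value = rec
            db[item] = new_value

db = {"X": 4}          # hypothetical database state after both writes
undo(log, "T1", db)    # backward pass: X goes 4 -> 3 -> 2
print(db)
redo(log, "T1", db)    # forward pass: X goes 2 -> 3 -> 4
print(db)
```

Tracing backward matters when a transaction writes the same item more than once: the last old_value applied is the oldest one, which is the value the item had before the transaction started.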

Immediate Update – Single User
• We first study the single-user case for simplicity
– No concurrency in a single-user system.
• For the moment, assume checkpoints are not being used.
• Undo writes done by the active (uncommitted) T, redo writes done by committed T.
• How many uncommitted T can there be?
• One: single user
• What do we do with it?
• Undo the writes of T using the BFIM (before image). In which order?
• In reverse order
– Eg: x = 2. w1(x): 3, BFIM = 2. w1(x): 4, BFIM = 3. Crash. Undoing in reverse order restores x to 3 and then to 2.

Immediate Update – Single User
• Redo the write ops of committed transactions. Order?
• In the order in which they were written to the log. Why?
• Eg: x = 2. w1(x): 3, BFIM = 2. c1. w2(x): 4, BFIM = 3. c2. Crash.
• [SKS] Eg: Log as it appears at 3 instances of time.
• What will the recovery actions be in each case if the system crashes, i.e. where the end of the log is shown in each case?

Immediate Update – Single User [SKS] Eg
The log as it appears at three instances of time.

Recovery actions in each case above are:
(a) undo (T0): B is restored to 2000 and A to 1000.
(b) undo (T1) and redo (T0): C is restored to 700, and then A and B are set to 950 and 2050 respectively.
(c) redo (T0) and redo (T1): A and B are set to 950 and 2050 respectively. Then C is set to 600.
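The three cases can be replayed with a small sketch. The old and new values of A, B and C are taken from the recovery actions listed above; the exact log contents, the disk state at each crash, and the function name `recover` are assumptions made for illustration only.

```python
# Assumed log; each case (a), (b), (c) is a prefix of it.
FULL_LOG = [
    ("start", "T0"),
    ("write", "T0", "A", 1000, 950),
    ("write", "T0", "B", 2000, 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("write", "T1", "C", 700, 600),
    ("commit", "T1"),
]

def recover(log, db):
    """Undo uncommitted writes (backward), then redo committed writes (forward)."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            db[rec[2]] = rec[3]      # restore old_value (BFIM)
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            db[rec[2]] = rec[4]      # reapply new_value (AFIM)
    return db

# Hypothetical disk states at the crash: some writes flushed, some not.
case_a = recover(FULL_LOG[:3], {"A": 950, "B": 2000, "C": 700})
case_b = recover(FULL_LOG[:6], {"A": 950, "B": 2000, "C": 600})
case_c = recover(FULL_LOG,     {"A": 950, "B": 2000, "C": 700})
print(case_a, case_b, case_c)
```

Whatever mixture of flushed and unflushed blocks the crash leaves behind, the result depends only on the log, which is the point of combining undo with redo.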

Immediate Update – Concurrent Users

• We assume strict 2-phase locking
– No T gives up its write locks till it commits
• No cascading rollbacks, but deadlock is possible
– Deadlock leads to rollback of transactions
• With checkpoints, which T do we need to redo?
• If T committed before the last checkpoint
– T's changes are already recorded in disk blocks
• We only need to redo those which committed after the last checkpoint. In which order?
• Redo in the order in which they committed
– Will this be OK? Look at the Eg on the next slide.

Immediate Update – Concurrent Users
• Consider the following (initially, x = 3)
T1: w1(x): 5 c1
T2: r2(x): 5, w2(x): 7 c2
• What if this happened with T2 committing before T1, and we redid in order of commit?
• T2's write is redone first and T1's write last, so the final value of x is 5
– which is wrong (it should be 7)
• How can we be sure that this doesn't happen?
• Strict 2PL
– strict schedules

Immediate Update – Concurrent Users

• Assume strict schedules (eg: strict 2PL).
• Deadlock possible: leads to rollbacks
• First Undo, then Redo of committed transactions
– Since last checkpoint
• For Undo of active transactions, do in reverse order as before. Is it enough to restore the BFIM?
– Eg: x = 2. w1(x): 3, BFIM = 2. w2(x): 4, BFIM = 3; c2; checkpoint, a1
– Will restoring the BFIM work here?
• No: restoring T1's BFIM (2) would wipe out the committed write of T2. But this can't happen. Why not?
• Strict schedule: T2 cannot write x until T1 has committed or aborted. So it is enough to restore the BFIM
• Redo in the order in which written into the log

Immediate Update – Concurrent Users: Modified Figure 23.3

(a) Operations of the four transactions:

T1              T2              T3              T4
read_item (A)   read_item (B)   read_item (A)   read_item (B)
read_item (D)   write_item (B)  write_item (A)  write_item (B)
write_item (D)  read_item (D)   read_item (C)   read_item (A)
                write_item (D)  write_item (C)  write_item (A)

(b) System log at the point of crash (initially: A = 5, B = 6, D = 7; in each write_item record the BFIM is written before the AFIM):

[start_transaction, T1]
[write_item, T1, D, 7, 20]
[commit, T1]
[checkpoint]
[start_transaction, T4]
[write_item, T4, B, 6, 15]
[write_item, T4, A, 5, 20]
[commit, T4]
[start_transaction, T2]
[write_item, T2, B, 15, 12]    (this value gets written out to disk)
[start_transaction, T3]
[write_item, T3, A, 20, 30]    (this value gets written out to disk)
[write_item, T2, D, 20, 25]
system crash

What should happen with the different transactions?

• T3 is rolled back because it did not reach its commit point.

• T2 is rolled back because it did not reach its commit point.

• How to show UNDO, REDO in the HW
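One way to write out the UNDO and REDO steps for the log in part (b) is the sketch below. The log records are copied from the figure; the helper names `classify` and `recover`, and the assumed disk state at the crash (A = 30 and B = 12 flushed as annotated, D = 20 from the checkpoint), are illustrative assumptions rather than part of the figure.

```python
LOG = [
    ("start", "T1"),
    ("write", "T1", "D", 7, 20),
    ("commit", "T1"),
    ("checkpoint",),
    ("start", "T4"),
    ("write", "T4", "B", 6, 15),
    ("write", "T4", "A", 5, 20),
    ("commit", "T4"),
    ("start", "T2"),
    ("write", "T2", "B", 15, 12),
    ("start", "T3"),
    ("write", "T3", "A", 20, 30),
    ("write", "T2", "D", 20, 25),
]

def classify(log):
    """Return (undo_set, redo_set): active transactions, and those committed after the last checkpoint."""
    last_cp = max((i for i, rec in enumerate(log) if rec[0] == "checkpoint"), default=-1)
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    redo_set = {rec[1] for i, rec in enumerate(log) if rec[0] == "commit" and i > last_cp}
    return started - committed, redo_set

def recover(log, db):
    undo_set, redo_set = classify(log)
    for rec in reversed(log):                 # UNDO: reverse log order, restore BFIM
        if rec[0] == "write" and rec[1] in undo_set:
            db[rec[2]] = rec[3]
    for rec in log:                           # REDO: log order, reapply AFIM
        if rec[0] == "write" and rec[1] in redo_set:
            db[rec[2]] = rec[4]
    return db

undo_set, redo_set = classify(LOG)
print(sorted(undo_set), sorted(redo_set))
print(recover(LOG, {"A": 30, "B": 12, "D": 20}))   # assumed disk state at crash
```

Under these assumptions T1 needs no action because it committed before the checkpoint, T4 is redone, and T2 and T3 are undone.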

Immediate Update – Concurrent Users: Modified [SKS] Eg (for review later if needed)

• Go over the steps of the recovery algorithms on the following log:

<T0 start>

<T0, A, 0, 10>

<T0 commit>

<T1 start>

<T1, B, 0, 10>

<T2 start>

<T2, C, 0, 10>

<T2, C, 10, 20>

<checkpoint>

<T3 start>

<T3, A, 10, 20>

<T3, D, 0, 10>

<T3 commit>

System crash

Recovery from disk crash [SKS]

• A technique similar to checkpointing is used
– Periodically dump the entire content of the database to stable storage (eg: tape).
– No transaction may be active during the dump
– Output all log records currently residing in main memory onto stable storage.
– Output all buffer blocks onto the disk.
• To recover from disk failure
– Restore the database from the most recent dump.
– Redo all transactions that committed after the dump

Schedule for HW 2 problem
• <T0, X, old, new> means that T0 has written the value new into X, replacing the value old
• Assume initial values are A = 2, B = 3
• This is not a strict schedule
• There are only writes in this schedule, i.e. no reads.

<T0 start>

<T0, A, 2, 6>

<T0 commit>

<checkpoint>

<T1 start>

<T1, A, 6, 20>

<T2 start>

<T2, A, 20, 27>

<T2, B, 3, 15>

<T3 start>

<T3, B, 15, 32>

<T3, A, 27, 49>

<T1 commit>

System crash

Relational Query Languages
• Relational Algebra and SQL are both query languages
• Relational Algebra vs SQL
– procedural (step-by-step details on how to get what we want) vs declarative (we say what we want rather than how to compute it)
– small # of operations vs large # of operations
– non-commercial vs commercial
• Relational Algebra: useful in optimization
– SQL is translated by the DBMS into relational algebra
– The DBMS tries to find a faster way to get the same result
• A number of the relational algebra operators are included in SQL

[RG] Sailor Database

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

• “Sailors” and “Reserves” table from [RG].

• There is information about sailors : sailor id, name, rating, age.

• Two tables for sailors: S1 and S2.

• Sailors reserve boats: sailor id, boat id, day. Primary key – all 3 ?

• Also B table for boats

Preliminaries
• A query is applied to relation instances
• Closure: the query result is also a relation, so we can apply one operation after another.
• Names of fields in query results are "inherited" from names of fields in query input relations.
• Operations can be broken up into two groups:
– Mathematical operations: union, intersection, set difference, cross product
– Relational operations: selection, projection, division, join

Projection: unary operator

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

$\pi_{sname,\,rating}(S2)$
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

$\pi_{age}(S2)$
age
35.0
55.5

• Deletes attributes that are not in the projection list.
• Schema of result contains exactly the fields in the projection list, with the same names that they had in the input relation.
• Projection operator eliminates duplicates.

Projection Properties
• The number of tuples in $\pi_{<list>}(R)$ is always less than or equal to the number of tuples in R. How could it be a smaller number?
• Because duplicates are eliminated.
• What do we need to guarantee that it is equal?
• If the list of attributes includes a key of R, then the number of tuples is equal to the number of tuples in R
– Because each row is guaranteed to be unique
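The tuple-count property can be checked on S2 with a short sketch. Relations are modelled here as lists of tuples and the result as a Python set, so duplicates disappear automatically; the `project` helper and the `SCHEMA` tuple are hypothetical names introduced only for this illustration.

```python
SCHEMA = ("sid", "sname", "rating", "age")
S2 = [
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
]

def project(rows, schema, attrs):
    """Keep only the listed attributes; building a set eliminates duplicates."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}

ages = project(S2, SCHEMA, ["age"])            # three rows share age 35.0
with_key = project(S2, SCHEMA, ["sid", "age"])  # sid is a key of S2
print(len(S2), len(ages), len(with_key))
```

Projecting on age alone collapses the three rows with age 35.0 into one, while any list that includes the key sid keeps all four rows.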

Selection: unary operator

• Selects rows that satisfy the selection condition.
• Schema of result is identical to the schema of the input relation.
• No duplicates in result, even if in the original table.
• Can combine operations as in the second example below

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

$\sigma_{rating>8}(S2)$
sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

$\pi_{sname,\,rating}(\sigma_{rating>8}(S2))$
sname  rating
yuppy  9
rusty  10

Projection, Selection Egs [EN] Fig 6.1

Selection Properties
• The SELECT operation $\sigma_{<selection\ condition>}(R)$ produces a relation S that has the same schema as R

• The SELECT operation is commutative; i.e.,
$\sigma_{<condition1>}(\sigma_{<condition2>}(R)) = \sigma_{<condition2>}(\sigma_{<condition1>}(R))$

• A cascaded SELECT operation may be applied in any order; i.e.,
$\sigma_{<condition1>}(\sigma_{<condition2>}(\sigma_{<condition3>}(R))) = \sigma_{<condition2>}(\sigma_{<condition3>}(\sigma_{<condition1>}(R)))$

• A cascaded SELECT operation may be replaced by a single selection with a conjunction of all the conditions; i.e.,
$\sigma_{<condition1>}(\sigma_{<condition2>}(\sigma_{<condition3>}(R))) = \sigma_{<condition1>\ AND\ <condition2>\ AND\ <condition3>}(R)$
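These identities can be spot-checked on S2. In this sketch a relation is a set of tuples and a condition is a Python predicate; the `select` helper and the second condition (sid > 30) are hypothetical choices made only for the example.

```python
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def select(rows, condition):
    """Keep the rows that satisfy the condition; the schema is unchanged."""
    return {row for row in rows if condition(row)}

def high_rating(row):      # rating > 8, as in the selection example
    return row[2] > 8

def large_sid(row):        # sid > 30, an assumed second condition
    return row[0] > 30

one_order = select(select(S2, large_sid), high_rating)
other_order = select(select(S2, high_rating), large_sid)
conjunction = select(S2, lambda row: high_rating(row) and large_sid(row))
print(one_order == other_order == conjunction, one_order)
```

Because each selection only filters rows, the order of the filters cannot change which rows survive, and a single pass with the conjunction keeps exactly the same rows.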

Combining Projection, Selection [EN] Fig 6.2

• Will see temporary tables later

Union, Intersection, Set-Difference: Binary Set Operators

• Union: for sets r, s
$r \cup s = \{t \mid t \in r \text{ or } t \in s\}$
($\in$ is the symbol for "belongs to")

• Intersection:
$r \cap s = \{t \mid t \in r \text{ and } t \in s\}$

• Set difference:
$r - s = \{t \mid t \in r \text{ and } t \notin s\}$

• These work in similar ways for tables

Union, Intersection, Set-Difference

• For these operations on tables, the two input tables have to be union-compatible:
– Same number of fields.
– Corresponding fields have the same type (they could have different names).
• What is the schema of the result?
• Schema is the same as that of the two input tables
– We use the convention that column names in the output are the same as the column names in the first table

Union – Example [SKS]
• Relations r, s:

r
A  B
α  1
α  2
β  1

s
A  B
α  2
β  3

$r \cup s$:
A  B
α  1
α  2
β  1
β  3

Union – Example [RG]

$S1 \cup S2 = ?$ (S1 and S2 as in the Sailor Database above)

$S1 \cup S2$
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

Intersection – Example [SKS]

• Relations r, s:

r
A  B
α  1
α  2
β  1

s
A  B
α  2
β  3

$r \cap s$:
A  B
α  2

Intersection – Example [RG]

$S1 \cap S2 = ?$ (S1 and S2 as in the Sailor Database above)

$S1 \cap S2$
sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0

Set Difference – Example [SKS]

• Relations r, s:

r
A  B
α  1
α  2
β  1

s
A  B
α  2
β  3

$r - s$:
A  B
α  1
β  1

Is this the same as s – r ?

Set Difference – Example [RG]

$S1 - S2 = ?$ (S1 and S2 as in the Sailor Database above)

$S1 - S2$
sid  sname   rating  age
22   dustin  7       45.0

Set Operators
• Both union and intersection are commutative
$R \cup S = S \cup R$
$R \cap S = S \cap R$

• Minus operation is not commutative; in general
$R - S \neq S - R$

• Union and intersection are associative
$R \cup (S \cup T) = (R \cup S) \cup T$
$(R \cap S) \cap T = R \cap (S \cap T)$

• Would minus be associative? i.e. is $R - (S - T) = (R - S) - T$ ?

• No. Counterexample ?
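One possible counterexample, checked with Python's built-in sets. The three small sets are hypothetical values chosen only so that the two sides differ; any element of R that lies in both S and T will do.

```python
# Hypothetical sets chosen so that the two groupings give different results.
R = {1, 2}
S = {2, 3}
T = {2}

left = R - (S - T)     # S - T = {3}, so nothing is removed from R
right = (R - S) - T    # R - S = {1}, and removing T changes nothing further

print(left, right, left == right)

# Union and intersection, by contrast, are commutative and associative.
print((R | S) == (S | R), (R & (S & T)) == ((R & S) & T))
```

The element 2 is what breaks associativity: on the left it is first taken out of S by T and therefore survives in R, while on the right it is removed from R by S.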

Set Operators Eg [EN] Figure 6.4

Renaming columns and creating temporary tables

• Suppose we want to apply several relational algebra operations one after the other. One way of doing this is by nesting the operations:

• Eg: Want to retrieve the first name, last name, and salary of all employees who work in department number 5. How to do?

• We apply a select followed by a project. As we saw earlier, we can write a single relational algebra expression:
$\pi_{FNAME,\,LNAME,\,SALARY}(\sigma_{DNO=5}(EMPLOYEE))$

Renaming columns and creating temporary tables

• Alternatively, we can apply one operation at a time and create intermediate result relations
– We must give names to the relations that hold the intermediate results.

• We explicitly show the sequence of operations by giving a name to each intermediate relation:

$TEMP \leftarrow \sigma_{DNO=5}(EMPLOYEE)$

$RESULT \leftarrow \pi_{FNAME,\,LNAME,\,SALARY}(TEMP)$

Using Temporary Tables [EN] Fig 6.2

• Saw the same Eg before, now with temporary tables

Temporary tables with $\rho$ notation
• Alternate notation: the rename operator is $\rho$
– we won't use the $\rho$ notation, but it is commonly used

• $\rho_{S}(R)$ is a renamed relation S based on R
– column names of S are the same as those of R

• $\rho_{S(B_1, B_2, \ldots, B_n)}(R)$ is a renamed relation S based on R with column names $B_1, B_2, \ldots, B_n$

• We will use the arrow notation:
$RESULT(FN, LN, Sal) \leftarrow \pi_{FNAME,\,LNAME,\,SALARY}(TEMP)$

Cross-Product – Example [SKS]

Relations r, s:

r
A  B
α  1
β  2

s
C  D   E
α  10  a
β  10  a
β  20  b
γ  10  b

r x s:
A  B  C  D   E
α  1  α  10  a
α  1  β  10  a
α  1  β  20  b
α  1  γ  10  b
β  2  α  10  a
β  2  β  10  a
β  2  β  20  b
β  2  γ  10  b

Cartesian-Product/Cross-Product
• Defined as:
$r \times s = \{t\,q \mid t \in r \text{ and } q \in s\}$

• All possible combinations of rows from r and s.

• If r has $n_r$ tuples and s has $n_s$ tuples, then how many tuples will r x s have?

• r x s will have $n_r \cdot n_s$ tuples.

• Do tables r, s have to be union compatible?

• No
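A minimal sketch of the definition, using the S1 and R1 instances from the Sailor Database. Each relation is a list of tuples and a result row is the concatenation of one row from each input; the `cross` helper is a hypothetical name, and `itertools.product` is the standard-library function that enumerates all pairs.

```python
from itertools import product

S1 = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
]
R1 = [
    (22, 101, "10/10/96"),
    (58, 103, "11/12/96"),
]

def cross(r, s):
    """Concatenate every row t of r with every row q of s."""
    return [t + q for t, q in product(r, s)]

rows = cross(S1, R1)
print(len(S1), len(R1), len(rows))   # n_r, n_s, and the size of the product
for row in rows:
    print(row)
```

S1 has four columns and R1 has three, so the two tables are not union compatible, yet the product is still well defined and has seven columns.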

Cross-Product Eg [RG]

• $S1 \times R1$ (S1 and R1 as in the Sailor Database above)

• S1 and R1 both have sid

(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96

Referring to attribute by position
• Positional notation: Eg: referring to column 3
• Named-field notation: more readable.
• Both are used in SQL
• Why allow positional notation in relational algebra?
– Sometimes we end up with unnamed columns. Eg: in $S1 \times R1$, S1 and R1 have an identically named column sid; how else to differentiate?
– We will only allow it for the purposes of renaming:
$R(FNAME, LNAME, SALARY) \leftarrow \pi_{FNAME,\,2,\,SALARY}(DEP5\_EMPS)$
– Not done in [EN]

Eg [RG] referring by position
• $S1 \times R1$: both S1 and R1 have a field called sid.

• With renaming:
$C(sid1, sname, rating, age, sid2, bid, day) \leftarrow \pi_{1,\,sname,\,rating,\,age,\,5,\,bid,\,day}(S1 \times R1)$

sid1  sname   rating  age   sid2  bid  day
22    dustin  7       45.0  22    101  10/10/96
22    dustin  7       45.0  58    103  11/12/96
31    lubber  8       55.5  22    101  10/10/96
31    lubber  8       55.5  58    103  11/12/96
58    rusty   10      35.0  22    101  10/10/96
58    rusty   10      35.0  58    103  11/12/96

[EN] example
• Example: Get the ssn of all employees who either work in department 5 or directly supervise an employee who works in department 5. How to do?

$DEP5\_EMPS \leftarrow \sigma_{DNO=5}(EMPLOYEE)$

$RESULT1 \leftarrow \pi_{SSN}(DEP5\_EMPS)$

$RESULT2(SSN) \leftarrow \pi_{SUPERSSN}(DEP5\_EMPS)$

$RESULT \leftarrow RESULT1 \cup RESULT2$

Cross-Product and Selection Eg [SKS]

Relations r, s and r x s: as in the Cross-Product example [SKS] above.

Want only those rows where A = C:

$\sigma_{A=C}(r \times s)$
A  B  C  D   E
α  1  α  10  a
β  2  β  10  a
β  2  β  20  b

[EN] example
• Example: Get the names of female employees and the names of their dependents.
– How to do?

• Try with cross product

$FEMALE\_EMPS \leftarrow \sigma_{SEX='F'}(EMPLOYEE)$

$EMPNAMES \leftarrow \pi_{FNAME,\,LNAME,\,SSN}(FEMALE\_EMPS)$

$EMP\_DEPENDENTS \leftarrow EMPNAMES \times DEPENDENT$

• Will this work?

[EN] company database Figure 5.6

[EN] example Figure 6.5

[EN] example
• What was the problem with what we had?

• The cross product paired every female employee with every dependent, including dependents who are not connected to that employee. How to fix?

• We need to pick only the rows where the SSN matches the ESSN

$ACTUAL\_DEPENDENT \leftarrow \sigma_{SSN=ESSN}(EMP\_DEPENDENTS)$

$RESULT \leftarrow \pi_{FNAME,\,LNAME,\,Dependent\_name}(ACTUAL\_DEPENDENT)$

[EN] example Figure 6.5

Joins
• Motivation: we frequently want to combine (via cross product) table1 and table2
– But only where the value in a column from table1 matches the value in a column from table2

• In the previous example, we would like to combine:

$EMP\_DEPENDENTS \leftarrow EMPNAMES \times DEPENDENT$
and
$ACTUAL\_DEPENDENT \leftarrow \sigma_{SSN=ESSN}(EMP\_DEPENDENTS)$

• We would like to do this via a single operation.

• This is what the join operator lets us do

Joins
• Join: $R \bowtie_{c} S = \sigma_{c}(R \times S)$, where c is a condition
• Cross product followed by selection.
• [RG] Eg:

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

$S1 \bowtie_{S1.sid < R1.sid} R1$
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  58     103  11/12/96
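The definition of a join as a cross product followed by a selection can be sketched directly. The data are the S1 and R1 instances from [RG]; the `theta_join` helper and the way the condition receives the two rows are hypothetical choices for this illustration, and a real DBMS would normally avoid materializing the full cross product.

```python
S1 = [
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
]
R1 = [
    (22, 101, "10/10/96"),
    (58, 103, "11/12/96"),
]

def theta_join(r, s, condition):
    """Cross product of r and s, keeping only the pairs that satisfy the condition."""
    return [t + q for t in r for q in s if condition(t, q)]

# Condition S1.sid < R1.sid: sid is column 0 in both relations.
result = theta_join(S1, R1, lambda t, q: t[0] < q[0])
for row in result:
    print(row)

# The equality condition S1.sid = R1.sid pairs each sailor with their own reservations.
print(theta_join(S1, R1, lambda t, q: t[0] == q[0]))
```

Of the six rows in the cross product, only the two in which the sailor's sid is smaller than the reservation's sid survive the first condition.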
