Optimized Transaction Time Versioning Inside a Database Engine
Intern: Feifei Li, Boston University
Mentor: David Lomet, MSR
Transaction Time Support
- Provide access to prior states of a database:
  - Auditing the database
  - Querying the historical data
  - Mining the pattern of changes to a database
- General approaches:
  - Build it outside the database engine
  - Build it inside the database engine
Overview of a Versioned Database
[Figure: a database page with a page header and a dynamic slot array; each slot points to a version chain of a record, e.g., A.1 -> A.0 and B.2 -> B.1 -> B.0.]
Key Challenges
- Timestamping
  - Eager timestamping vs. lazy timestamping: each record takes the transaction commit timestamp
  - Recovery of the timestamping information when the system crashes
- Indexing both current versions and historical versions simultaneously
  - Storage utilization
  - Query efficiency
Talk Outline
- Even "lazier" timestamping
- Deferred-key-split policy in the TSB tree
- Auditing the database
Lazy Timestamping
- When do we timestamp the records affected by a transaction?
  - Maintaining a list of updated records and timestamping them when the transaction commits may lead to additional I/Os
  - Instead, timestamp records lazily, when they are later accessed by other queries, updates, page reads, and page writes
- Where does the timestamping information come from?
  - The volatile timestamp table (VTT) and the persistent timestamp table (PTT)
[Figure: Transaction 23 begins and inserts records A and B; each record is stamped with the placeholder TID.23, and the in-memory VTT entry for TID 23 goes from (Ttime = NA, Refcnt = 0) to Refcnt = 2. When the transaction commits, the VTT entry's Ttime becomes 178432 and (23, 178432) is written to the on-disk PTT.]
This ensures that we can recover the timestamping information if the system crashes (the VTT is gone!).
Timestamping the Record
[Figure: Transaction 45 begins, inserts record C (stamped TID.45), and updates record A. The access to A finds the placeholder TID.23, looks up TID 23 in the VTT/PTT, rewrites A's timestamp to the commit time 178432, and decrements the Refcnt for TID 23. An update to record D likewise replaces its placeholder TID.88 with the commit time 342234. When transaction 45 commits, its VTT entry gets Ttime = 923121 and (45, 923121) is written to the PTT.]
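The walkthroughs above can be sketched in a few lines. This is a minimal illustration, not the engine's actual code: `vtt`, `ptt`, and the function names are invented for the sketch.

```python
# Sketch of lazy timestamping. Records written by an in-flight transaction
# carry a TID placeholder; the real commit time is patched in lazily, on a
# later access, using the VTT (in memory) or the PTT (on disk, after a crash).

class Record:
    def __init__(self, key, tid):
        self.key = key
        self.timestamp = ("TID", tid)    # placeholder until lazily resolved

vtt = {}   # volatile timestamp table: tid -> {"ttime": ..., "refcnt": ...}
ptt = {}   # persistent timestamp table: tid -> commit time

def begin(tid):
    vtt[tid] = {"ttime": None, "refcnt": 0}

def write(tid, key):
    vtt[tid]["refcnt"] += 1              # one more record carries this TID
    return Record(key, tid)

def commit(tid, commit_time):
    vtt[tid]["ttime"] = commit_time
    ptt[tid] = commit_time               # original scheme: persist at commit

def access(record):
    """Resolve the placeholder, if any, on a later access."""
    if isinstance(record.timestamp, tuple):
        _, tid = record.timestamp
        entry = vtt.get(tid)
        # After a crash the VTT is gone, so fall back to the PTT.
        record.timestamp = entry["ttime"] if entry else ptt[tid]
        if entry:
            entry["refcnt"] -= 1         # refcnt == 0 -> garbage-collectible
    return record.timestamp
```

Replaying the slides: after `begin(23)`, two writes, and `commit(23, 178432)`, accessing record A rewrites its placeholder to 178432 and drops the refcount for TID 23.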
The Checkpointing Process
[Figure: timeline of the (k-2)-th, (k-1)-th, k-th, and (k+1)-th checkpoints, with log positions LSN(P), LSN(U), and EOL.]
- All log records before LSN(P) have been removed from the log; it is impossible to recover information earlier than LSN(P).
- The dirty pages between LSN(P) and LSN(U) have all been flushed to disk prior to the current checkpoint.
- The current checkpoint may not have finished yet, and log records with LSNs between LSN(U) and EOL are not guaranteed to be stable.
Garbage Collection
[Figure: timeline with LSN(P), LSN(U), and EOL across checkpoints k-2, k-1, and k. Each entry is tagged with rcz_lsn, the LSN at which its Refcnt reached zero.]
- RefCnt != 0: do nothing in the PTT; keep the VTT entry.
- RefCnt == 0: once the entry's rcz_lsn is old enough to be stable, delete the entry from the PTT and drop it from the VTT; until then, do nothing and keep it.
- At the end of the current checkpoint interval (EOL), update LSN(U) = EOL and LSN(P) = the old LSN(U).
Let's Be Even Lazier
- Don't write an entry to the PTT when a transaction commits
  - Piggyback the timestamping information onto the commit log record, so we can still recover it if necessary
- Batch-update entries from the VTT to the PTT at the checkpoint
- Why is this better?
  - A batched update in one transaction is faster than writing to the PTT on a per-transaction basis
  - Many entries have their Refcnt drop to zero by checkpoint time, so fewer writes to the PTT are needed
The New Story
[Figure: Transaction 23 begins, inserts record A (stamped TID.23), and commits; its VTT entry becomes (23, 178432, Refcnt = 1) but nothing is written to the PTT. Transaction 76 then begins, inserts record B (stamped TID.76), commits with Ttime = 287544, and updates A, rewriting A's timestamp to 178432 and dropping the Refcnt for TID 23 to zero. Only at the next checkpoint are the surviving VTT entries batch-written to the PTT.]
Be Careful When Updating the VTT and PTT at the Checkpoint
[Figure: timeline with LSN(P), LSN(U), and EOL across checkpoints k-2, k-1, and k. Each entry is now tagged with cmt_lsn, the LSN of its commit log record. For RefCnt != 0: while the cmt_lsn is recent, do nothing in the PTT and keep the VTT entry; once the commit log record is about to be truncated from the log, insert the entry into the PTT and keep the VTT entry. At EOL, update LSN(U) = EOL and LSN(P) = the old LSN(U).]
Be Careful When Updating the VTT and PTT at the Checkpoint (continued)
[Figure: the RefCnt == 0 cases, with each entry tagged by both its cmt_lsn and its rcz_lsn. Depending on where the two LSNs fall relative to LSN(P) and LSN(U), the entry is either kept unchanged, inserted into the PTT, deleted from the PTT and dropped from the VTT, or dropped from the VTT only. At EOL, update LSN(U) = EOL and LSN(P) = the old LSN(U).]
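The checkpoint rules above can be sketched as a single batch pass over the VTT. This is a simplified illustration: the field names are invented, and "stable" is approximated here as "older than LSN(P)", whereas the actual scheme distinguishes the LSN(P)/LSN(U)/EOL intervals shown in the figures.

```python
# Sketch of the checkpoint-time VTT -> PTT maintenance in the "even lazier"
# scheme. vtt maps tid -> {"ttime", "refcnt", "cmt_lsn", "rcz_lsn"}, where
# cmt_lsn is the LSN of the commit log record and rcz_lsn is the LSN at
# which the refcount reached zero (None while refcnt > 0).

def checkpoint(vtt, ptt, lsn_p):
    """Batch-update the PTT from the VTT in one pass."""
    for tid in list(vtt):
        e = vtt[tid]
        if e["refcnt"] != 0:
            # Entry still needed: persist it before the commit log record
            # (which carries the timestamp) can be truncated from the log.
            if e["cmt_lsn"] < lsn_p and tid not in ptt:
                ptt[tid] = e["ttime"]
        else:
            # Refcount reached zero: no record still carries this TID
            # placeholder, so once that fact is stable the entry can go.
            if e["rcz_lsn"] < lsn_p:
                ptt.pop(tid, None)   # delete from the PTT (if it got there)...
                del vtt[tid]         # ...and drop the VTT entry
```

This captures why batching helps: entries whose refcount hits zero before the checkpoint (like TID 23 in the figure) never cost a PTT write at all.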
Improvement
- Each record is 200 bytes
- The database is initialized with 5,000 records
- Generated workloads contain up to 10,000 transactions
- Each transaction is an insert or an update (of a record newly inserted by another transaction)
- One checkpoint every 500 transactions
- Cost metrics: execution time, number of writes to the PTT, number of batched updates
Execution Time
[Chart omitted. Audit mode: always keep everything in the PTT.]
Number of Writes to PTT
[Chart omitted.]
Batched Update Analysis
[Chart omitted.]
Talk Outline
- Even "lazier" timestamping
- Deferred-key-split policy in the TSB tree
- Auditing the database
Time-Split B (TSB) Tree
- Indexes both current-version pages and historical-version pages simultaneously
- Time split: create a new page; the historical records in the current page are pushed into the new page
- Key split: proceeds as a normal B+-tree key split
- When should we do a time split, and when a key split?
What Happens Now
[Figure: Record C is inserted into a full current page. A time split pushes the historical versions (A.0, B.0, B.1, ...) into a new historical page, and C goes into the current page. If the current page then exceeds the key-split threshold, a key split follows immediately.]
Why Do We Need a Key-Split Threshold?
- If we wait until the page is full and then key split, we get too many time splits, and hence lots of replicas among the historical versions
- What is the best value for the key-split threshold?
  - Too high: overall utilization drops
  - Too low: current-version utilization is reduced
  - We must find a balance
Could We Do Better?
- A key split that immediately follows the time split leads to two pages with utilization 0.5 * thresh_ksplit
- If the new pages are not filled up quickly, storage utilization is wasted for no good reason
- A fix: defer the key split until the next time the page requires one
  - Behave as if the key split had been performed on the previous occasion, when the page first qualified
Deferring the Key Split
[Figure: Record C is inserted into a full page, triggering a time split into a historical page. Although the current page exceeds the key-split threshold, we still just insert C. When the page fills again (here, on an update to record D), we key split, because the page already satisfied the key-split requirement last time; we use the key-split value from that previous occasion.]
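The deferral policy can be sketched with a toy page model. This is illustrative only: real TSB-tree pages are slotted disk pages, the capacity and threshold values here are arbitrary, and the key split is merely recorded as an action rather than actually redistributing records by key.

```python
# Sketch of the deferred-key-split policy on a toy page.
# A "page" is {"records": [...], "deferred": bool}; each record has a
# "current" flag distinguishing current from historical versions.

def insert(page, record, capacity=6, threshold=4):
    """Insert a record, performing splits first if the page is full.
    Returns the list of split actions taken, in order."""
    actions = []
    if len(page["records"]) >= capacity:
        # Time split: move historical (non-current) versions to a new page.
        historical = [r for r in page["records"] if not r["current"]]
        page["records"] = [r for r in page["records"] if r["current"]]
        actions.append(("time_split", historical))
        if len(page["records"]) >= threshold:
            if page["deferred"]:
                actions.append("key_split")  # deferred from the last occasion
                page["deferred"] = False
            else:
                page["deferred"] = True      # defer: just remember it qualified
    page["records"].append(record)
    return actions
```

On the first overflow the page only time splits and remembers that it qualified for a key split; the key split actually happens when the page next fills up, matching the figure above.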
Analytical Result
We can show the following:

SVCU^max_defer = SVCU^max_no-defer + [in / (up * cr + in)] * (1 - SVCU^max_no-defer)

An analogous expression, with additional factors of ln 2 (the average-to-maximum utilization ratio of B-tree-style pages), relates SVCU^avg_defer to SVCU^avg_no-defer.
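As a worked example, the max-SVCU formula (as reconstructed above) can be evaluated for sample ratios. The numbers below are illustrative, not from the experiments in this talk.

```python
# Worked example of the maximum single-version current utilization (SVCU)
# gained by deferring key splits, per the formula above. All input values
# are illustrative.

def svcu_defer_max(svcu_no_defer_max, ins, up, cr):
    """SVCU^max_defer = SVCU^max_no-defer
                        + in/(up*cr + in) * (1 - SVCU^max_no-defer)."""
    gain = ins / (up * cr + ins) * (1.0 - svcu_no_defer_max)
    return svcu_no_defer_max + gain

# 50% inserts, 50% updates, compression ratio 0.9, baseline max SVCU 0.5:
print(round(svcu_defer_max(0.5, 0.5, 0.5, 0.9), 4))  # -> 0.7632
```

Note the shape of the formula: as the workload becomes more update-heavy (up grows relative to in), the utilization gain from deferring shrinks.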
The Goal of Our Design
- Ensure that, for any particular version, the version utilization is kept above a specified threshold value
Experiment
- 50,000 transactions; each transaction inserts or updates a record
- Varying the insert/update ratio in the workload
- Each record is 200 bytes
- Delta compression is used on the historical versions (they share many common bits with the newer versions)
Single-Version Current Utilization (SVCU)
[Chart: "SVCU: Deferred Key Split vs. No Deferred Key Split"; x-axis: percent of updates (0 to 1), y-axis: SVCU (0.35 to 0.75); series: uncompressed defer, uncompressed no defer, analytical uncompressed, CR=90% defer, CR=90% no defer, analytical CR=90%.]
Multi-Version Utilization (MVU)
[Chart: "MVU: Deferred Key Split vs. No Deferred Key Split"; x-axis: percent of updates (0 to 1), y-axis: MVU (0.3 to 3.3); series: uncompressed defer, uncompressed no defer, CR=50% defer, CR=50% no defer, CR=90% defer, CR=90% no defer.]
Talk Outline
- Even "lazier" timestamping
- Deferred-key-split policy in the TSB tree
- Auditing the database
Auditing a Database
- Transaction-time versioning support enables checking any prior state of the database
- Store the user id in the PTT for each transaction entry
  - Any change to the database becomes traceable
  - The user id is taken from the current session that the transaction belongs to
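The auditing idea can be sketched by extending each PTT entry with the user id of the committing session. The schema and lookup below are hypothetical illustrations, not the actual catalog layout.

```python
# Sketch of auditing via the PTT: each entry carries the commit time and
# the user id of the session that ran the transaction (hypothetical layout:
# tid -> (commit_time, user_id)).

ptt = {
    23: (178432, "alice"),
    45: (923121, "bob"),
}

def who_changed(record_timestamp):
    """Map a record version's commit timestamp back to the responsible
    transaction and user; returns None if no matching entry exists."""
    for tid, (ttime, user) in ptt.items():
        if ttime == record_timestamp:
            return tid, user
    return None
```

Since every record version carries a commit timestamp, this lookup makes any change to the database traceable to a user.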
Conclusion
- Transaction-time versioning support inside a database engine is now one step closer to being practical
- What other interesting applications become possible with transaction-time versioning support?
Thanks!