Split Snapshots and Skippy Indexing: Long Live the Past!

Ross Shaull <rshaull@cs.brandeis.edu>

Liuba Shrira <liuba@cs.brandeis.edu>

Brandeis University

Our Idea of a Snapshot

• A window to the past in a storage system
• Access data as it was at time snapshot was requested
• System-wide
• Snapshots may be kept forever
  – I.e., “long-lived” snapshots
• Snapshots are consistent
  – Whatever that means…
• High frequency (up to CDP)

Why Take Snapshots?

• Fix operator errors
• Auditing
  – When did Bob’s salary change, and who made the changes?
• Analysis
  – How much capital was tied up in blue shirts at the beginning of this fiscal year?

• We don’t necessarily know now what will be interesting in the future

• Give the storage system a new capability: Back-in-Time Execution

• Run read-only code against current state and any snapshot

• After issuing request for BITE, no special code required for accessing data in the snapshot

Other Approaches: Databases

• ImmortalDB, Time-Split BTree (Lomet)
  – Reorganizes current state
  – Complex
• Snapshot isolation (PostgreSQL, Oracle)
  – Extension to transactions
  – Only for recent past
• Oracle FlashBack
  – Page-level copy of recent past (not forever)
  – Interface seems similar to BITE

Other Approaches: FS

• WAFL (Hitz), ext3cow (Peterson)
  – Limited on-disk locality
  – Application-level consistency a challenge
• VSS (Sankaran)
  – Blocks disk requests
  – Suitable for backup-type frequency

A Different Approach

• Goals:
  – Avoid declustering current state
  – Don’t change how current state is accessed
  – Application requests snapshot
  – Snapshots are “on-line” (not in warehouse)
• Split Snapshots
  – Copy past out incrementally
  – Snapshots available through virtualized buffer manager

Our Storage System Model

• A “database”
  – Has transactions
  – Has recovery log
  – Organizes data in pages on disk

Our Consistency Model

• Crash consistency
  – Imagine that a snapshot is declared, but then before any modifications can be made, the system crashes
  – After restart, recovery kicks in and the current state is restored to *some* consistent point
  – All snapshots will have this same consistency guarantee after a crash

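Our Storage System Model

[Figure: the application says “I want record R” and may issue “Snapshot Now”. The access methods find the table, find the root, search for R, and return R. The database stores pages P1 … Pn, located through a page table (P1 at address X, P2 at address Y, …).]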

Retaining the Past

Versus

Copy-on-Write (COW)

[Figure: pages P1, P2, and a second copy of P1, with a page table and a snapshot page table “S”. Operations: Snapshot “S”, then Modify P1.]

• The old page table became the snapshot page table

Split-COW

[Figure: two copies of P1 and one of P2, with the current page table and snapshot page tables SPT(S) and SPT(S+1).]

• Expensive to update P2 in both page tables
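The cost noted on this slide can be made concrete with a small sketch. It assumes a naive Split-COW design that keeps one eagerly updated page table per snapshot; the class and method names (`NaiveSplitCow`, `declare_snapshot`, `write`, `read`) are hypothetical and not taken from the authors' system. The current state is updated in place, the pre-state of a page is copied out once, and every snapshot that still shares the page must have its table updated.

```python
class NaiveSplitCow:
    """Illustrative Split-COW store with one eager page table per snapshot."""

    def __init__(self, pages):
        self.current = dict(pages)  # page id -> contents, updated in place
        self.snapstore = []         # copied-out pre-states, append-only
        self.spts = []              # spts[s]: page id -> index into snapstore

    def declare_snapshot(self):
        self.spts.append({})
        return len(self.spts) - 1   # snapshot id

    def write(self, page, contents):
        # Snapshots that still share this page with the current state.
        needing = [spt for spt in self.spts if page not in spt]
        if needing:
            self.snapstore.append(self.current[page])  # copy the pre-state out once
            slot = len(self.snapstore) - 1
            for spt in needing:     # one table update per affected snapshot
                spt[page] = slot
        self.current[page] = contents
        return len(needing)         # number of snapshot page tables touched

    def read(self, snapshot, page):
        spt = self.spts[snapshot]
        if page in spt:
            return self.snapstore[spt[page]]
        return self.current[page]   # page unchanged since the snapshot


store = NaiveSplitCow({"P1": "a", "P2": "b"})
s0 = store.declare_snapshot()
store.write("P1", "a1")             # touches SPT(s0) only
s1 = store.declare_snapshot()
touched = store.write("P2", "b1")   # touches both SPT(s0) and SPT(s1)
```

In this sketch the number of tables touched by a write grows with the number of snapshots that have not yet seen the page modified, which is the expense the slide points at.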

What’s next

1. How to manage the metadata?
2. How will snapshot pages be accessed?
3. Can we be non-disruptive?

Metadata Solution

• Metadata (page tables) created incrementally

• Keeping many snapshot page tables (SPTs) is costly

• Instead, write “mappings” into log

• Materialize SPT on-demand

Maplog

• Mappings created incrementally
• Added to append-only log
• Start points to first mapping created after a snapshot is declared

[Figure: the Maplog as a sequence of mappings for pages P1, P1, P2, P1, P1, P1, P2, P1, P2.]

Maplog

• Materialize SPT with scan
• Scan for SPT(S) begins at Start(S)
• Notice that we read some mappings that we do not need
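The two Maplog slides can be sketched as follows. This is a minimal illustration, assuming that a mapping is a (page, snapstore address) pair and that the first mapping encountered for a page during the scan is the one that belongs in SPT(S); the `Maplog` class and its method names are hypothetical.

```python
class Maplog:
    """Illustrative append-only log of (page, snapstore address) mappings."""

    def __init__(self):
        self.entries = []  # mappings in creation order
        self.starts = []   # starts[s]: index of first mapping created after snapshot s

    def declare_snapshot(self):
        self.starts.append(len(self.entries))
        return len(self.starts) - 1

    def append(self, page, address):
        self.entries.append((page, address))

    def materialize_spt(self, snapshot):
        """Scan from Start(S); keep the first mapping seen for each page."""
        spt = {}
        scanned = 0
        for page, address in self.entries[self.starts[snapshot]:]:
            scanned += 1
            spt.setdefault(page, address)  # later mappings for the page are not needed
        return spt, scanned


log = Maplog()
s0 = log.declare_snapshot()
log.append("P1", 0)
s1 = log.declare_snapshot()
log.append("P2", 1)
log.append("P1", 2)
s2 = log.declare_snapshot()
log.append("P1", 3)

spt0, scanned0 = log.materialize_spt(s0)  # {"P1": 0, "P2": 1} after reading 4 mappings
```

Here the scan for SPT(s0) reads four mappings but keeps only two; the other two are the unneeded reads the slide mentions.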

Cost of Scanning Maplog

• Let overwrite cycle length L be the number of page updates required to overwrite entire database

• Maplog scan cannot be longer than overwrite cycle

• Let N be the number of pages in the database

• For a uniformly random workload, L ≈ N ln N (by the “coupon collector’s waiting time” problem)

• Skew in the update workload lengthens overwrite cycle

• Skew of 80/20 (80% of updates to 20% of pages) increases L by a factor of 4
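A short simulation can illustrate these estimates. It is a sketch under stated assumptions: updates are independent, a skew of h/c sends a share h of the updates to a "hot" fraction c of the pages, and the function names are hypothetical. Under 80/20 skew, the cold 80% of the pages receive 20% of the updates, so each cold page is hit at a rate of 0.2/(0.8N) = 0.25/N, a quarter of the uniform rate 1/N, which is where a factor of roughly 4 comes from.

```python
import random


def expected_uniform_cycle(n):
    """Exact coupon-collector expectation n * H_n, which is about n ln n."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def simulate_cycle(n, hot_fraction, hot_prob, rng):
    """Count updates until each of n pages has been updated at least once.

    A fraction hot_fraction of the pages receives a share hot_prob of the updates.
    """
    hot_pages = max(1, int(n * hot_fraction))
    seen = set()
    updates = 0
    while len(seen) < n:
        if rng.random() < hot_prob:
            page = rng.randrange(hot_pages)      # update a hot page
        else:
            page = rng.randrange(hot_pages, n)   # update a cold page
        seen.add(page)
        updates += 1
    return updates


def mean_cycle(n, hot_fraction, hot_prob, trials=20, seed=0):
    """Average overwrite cycle length over several simulated runs."""
    rng = random.Random(seed)
    total = sum(simulate_cycle(n, hot_fraction, hot_prob, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    n = 500
    uniform = mean_cycle(n, 0.5, 0.5)  # 50/50 is the uniform case
    skewed = mean_cycle(n, 0.2, 0.8)   # 80/20 skew
    print(f"analytic uniform estimate: {expected_uniform_cycle(n):.0f}")
    print(f"simulated uniform cycle:   {uniform:.0f}")
    print(f"simulated 80/20 cycle:     {skewed:.0f} (ratio {skewed / uniform:.2f})")
```

The simulated ratio varies from run to run, so it should be read as a rough check of the factor-of-4 claim rather than a measurement.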

Skew hurts

Skippy

• Copy first-encountered mapping (FEM) within node to next level

[Figure: the Maplog (P1 P1 P2 P1 P1 P1 P2 P1 P2) with Skippy Level 1 above it (P1 P2 P1 P1 P2); labels: Pointers, Copies.]
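The idea of copying first-encountered mappings upward can be sketched with a single Skippy level. This is a simplified illustration, not the authors' data structure: it assumes fixed-size Maplog nodes, replaces the pointers between levels with list indexing, and uses hypothetical function names. The scan reads the remainder of the node containing Start(S) from the Maplog, then switches to Level 1 for all later nodes.

```python
def build_level1(maplog, node_size):
    """For each Maplog node, copy the first mapping seen for each page (its FEMs)."""
    level1 = []
    for begin in range(0, len(maplog), node_size):
        seen = set()
        fems = []
        for page, address in maplog[begin:begin + node_size]:
            if page not in seen:
                seen.add(page)
                fems.append((page, address))
        level1.append(fems)
    return level1


def materialize_plain(maplog, start):
    """Baseline: scan the Maplog itself from Start(S) to the end."""
    spt, reads = {}, 0
    for page, address in maplog[start:]:
        reads += 1
        spt.setdefault(page, address)
    return spt, reads


def materialize_skippy(maplog, level1, node_size, start):
    """Scan the rest of the starting node, then read FEMs from Level 1."""
    spt, reads = {}, 0
    node = start // node_size
    node_end = min((node + 1) * node_size, len(maplog))
    for page, address in maplog[start:node_end]:
        reads += 1
        spt.setdefault(page, address)
    for fems in level1[node + 1:]:  # later nodes: redundant mappings already dropped
        for page, address in fems:
            reads += 1
            spt.setdefault(page, address)
    return spt, reads


pages = ["P1", "P1", "P2", "P1", "P1", "P1", "P2", "P1", "P2"]
maplog = [(page, address) for address, page in enumerate(pages)]
level1 = build_level1(maplog, node_size=3)

plain = materialize_plain(maplog, 0)                 # reads 9 mappings
skippy = materialize_skippy(maplog, level1, 3, 0)    # same SPT from 6 reads
```

Both scans return the same table because a node's FEM for a page is exactly the mapping a full scan of that node would have kept; the savings grow when many mappings within a node refer to the same hot pages.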

Skippy

[Figure: Skippy Level 1 built over the Maplog (P1 P1 P2 P1 P1 P1 P2 P1 P2); page labels P1, P2, and P3.]

Cut redundant mapping count in

K-Level Skippy

• Can eliminate effect of skew, or more
• Enables ad-hoc, on-line access to snapshots, whether they are old or young

Skew     # Skippy Levels    Time to Materialize SPT (s)
50/50    0                  13.8
80/20    0                  19.0
80/20    1                  15.8
80/20    2                  14.7
80/20    3                  13.9
99/1     0                  33.3
99/1     1                  6.69

Accessing Snapshots

• Transparent to layers above cache
• Indirection layer to redirect page requests from a BITE transaction into the snapstore

[Figure: a read of the current state and a BITE read, resolved to pages P1, P1, and P2.]
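The indirection layer can be sketched as a buffer manager whose page lookup consults a materialized SPT first. This is an illustration under assumptions: the `BufferManager` class, its `read_page` signature, and the fall-through to the current database for pages with no mapping are hypothetical choices, not the authors' interface.

```python
from functools import partial


class BufferManager:
    """Illustrative page lookup with an optional snapshot page table (SPT)."""

    def __init__(self, database, snapstore):
        self.database = database    # page id -> current contents
        self.snapstore = snapstore  # copied-out pre-states

    def read_page(self, page, spt=None):
        # A BITE transaction passes its SPT; mapped pages come from the snapstore.
        if spt is not None and page in spt:
            return self.snapstore[spt[page]]
        return self.database[page]  # assumed: unmapped pages are shared with current state


def total(read_page, pages):
    """Read-only code that is unaware of whether it runs on a snapshot."""
    return sum(read_page(page) for page in pages)


database = {"P1": 120, "P2": 80}
snapstore = [100]                   # pre-state of P1 when snapshot S was declared
spt_s = {"P1": 0}

bm = BufferManager(database, snapstore)
current_total = total(bm.read_page, ["P1", "P2"])                        # 200
snapshot_total = total(partial(bm.read_page, spt=spt_s), ["P1", "P2"])   # 180
```

The same `total` function runs against both states; only the page lookup it is handed differs, which is the sense in which access is transparent to the layers above the cache.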

Non-Disruptiveness

• Can we create Skippy and COW pre-states without disrupting the current state?

• Key idea:
  – Leverage recovery to defer all snapshot-related writes
  – Write snapshot data in background to secondary disk

Implementation

• BDB 4.6.21
• Page cache augmented
  – COWs write-locked pages
  – Trickle COW’d pages out over time
• Leverage recovery
  – Metadata created in-memory at transaction commit time, but only written at checkpoint time
  – After crash, snapshot pages and metadata can be recovered in one log pass
• Costs
  – Snapshot log record
  – Extra memory
  – Longer checkpoints

Early Disruptiveness Results

• Single-threaded updating workload of 100,000 transactions
• 66M database
• We can retain a snapshot after every transaction for a 6–8% penalty to writers
• Tests with readers show little impact on sequential scans (not depicted)

[Figure: bar chart of time (s) at 50/50, 80/20, and 99/1 skew; series: No Snapshots, Snapshots Every Other Transaction, Snapshots Every Transaction.]

Paper Trail

• Upcoming poster and short paper at ICDE08

• “Skippy: a New Snapshot Indexing Method for Time Travel in the Storage Manager” to appear in SIGMOD08

• Poster and workshop talks
  – NEDBDay08, SYSTOR08

Questions?

Backups…

Recovery Sketch 1

• Snapshots are crash consistent
• Must recover data and metadata for all snapshots since last checkpoint
• Pages might have been trickled, so must truncate snapstore back to last mapping before previous checkpoint
• We require only that a snapshot log record be forced into the log with a group commit; no other data or metadata needs to be logged until checkpoint

Recovery Sketch 2

• Walk backward through WAL, applying UNDOs

• When snapshot record is encountered, copy the “dirty” pages and create a mapping

• Trouble is that snapshots can be concurrent with transactions

• Cope with this by “COWing” a page when an UNDO for a different transaction is applied to that page

The Future

• Sometimes we want to scrub the past
  – Running out of space?
  – Retention windows for SOX-compliance
• Change past state representation
  – Deduplication
  – Compression
