83
Minimizing Faulty Executions of Distributed Systems Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula, Arvind Krishnamurthy, Scott Shenker

Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula

  • Upload
    vudung

  • View
    218

  • Download
    1

Embed Size (px)

Citation preview

Minimizing Faulty Executions of Distributed Systems

Colin Scott, Aurojit Panda, Vjekoslav Brajkovic, George Necula, Arvind Krishnamurthy, Scott Shenker

Software Developer

(GBs)Software Developer

Node1 ?

Node2

Node3

Node4

Node5

Node6

Node7

Node8

Node9

Node10

Node11

Node12

Software Developer

1 LaToza, Venolia, DeLine, ICSE’ 06

49% of developers’ time spent on debugging!1

1 LaToza, Venolia, DeLine, ICSE’ 06

49% of developers’ time spent on debugging!1

Understanding How Bug Is Triggered

Fixing Problematic Code

Our Goal

Allow Developers To Focus on Fixing the Underlying Bug

Problem Statement

Identify a minimal causal sequence of events that

triggers the bug

OutlineIntroductionBackground

Minimization

Evaluation Conclusion

Randomized Testing with

DEMi

Node 1 Node N

Test Coordinator

QA Testbed

Software Under Test

OutlineIntroductionBackground

Minimization

Evaluation Conclusion

Randomized Testing with

DEMi

Node 1 Node N

Test Coordinator

QA Testbed

Software Under Test

OutlineIntroductionBackground

Minimization

Evaluation Conclusion

Randomized Testing with

DEMi

Node 1 Node N

Test Coordinator

QA Testbed

Software Under Test

OutlineIntroductionBackground

Minimization

Evaluation Conclusion

Randomized Testing with

DEMi

Node 1 Node N

Test Coordinator

QA Testbed

Software Under Test

Randomized Testing with DEMiApp

RPC lib

OS

App

RPC lib

OS

App

RPC lib

OS

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

msg dst: b

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

msg dst: b

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

msg dst: b

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

msg dst: b

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

msg dst: b

message delivery

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

msg dst: b

message delivery

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

timer dst: bmsg

dst: a

message delivery

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

timer dst: b

msg dst: a

message delivery

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

timer dst: b

msg dst: a

message delivery

External events (events outside

system’s control): Crash-recovery Process creation External message

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

timer dst: b

msg dst: a

message delivery

crash recovery

External events (events outside

system’s control): Crash-recovery Process creation External message

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

timer dst: b

msg dst: a

message delivery

crash recovery

External events (events outside

system’s control): Crash-recovery Process creation External message

Randomized Testing with DEMiApp

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

App

RPC lib

OSAspectJ

timer dst: b

msg dst: a

message delivery

crash recovery

Invariant Checking

An invariant is a predicate P over the state of all processes.

a b c d e

{ ✔ ✗

Invariant Checking

An invariant is a predicate P over the state of all processes.

a b c d e

{ ✔ ✗

A faulty execution is one that ends in an invariant violation.

e1 i1 i2 i3 i4e2

OutlineIntroductionBackground

Node 1 Node N

Test Coordinator

QA Testbed

Software Under Test

Minimization

Evaluation Conclusion

Randomized Testing with

DEMi

Formal Problem Statement

Find: locally minimal reproducing sequence τ’:

τ’ violates P, |τ’| ≤ |τ|

τ’ contains a subsequence of the external events of τ

if we remove any external event e from τ’,

¬∃ τ’’ containing same external events - e, s.t. τ’’ violates P

Given: schedule τ that results in violation of P

Formal Problem Statement

After finding τ’, minimize internal events:

remove extraneous message deliveries from τ’

RequestVote

RequestVote

RequestVote

RequestVote

VoteGranted

VoteGranted

VoteGranted

VoteGranted

Minimization

τ :Given

… ✗e1 i1 i2 i4e2 enim

Minimization

τ :Given

Straightforward approach: Enumerate all schedules |τ’| ≤ |τ|, Pick shortest sequence that reproduces ✗

τ Schedule Space

… ✗e1 i1 i2 i4e2 enim

Minimization

τ :Given

Straightforward approach: Enumerate all schedules |τ’| ≤ |τ|, Pick shortest sequence that reproduces ✗

τ Schedule Space

… ✗e1 i1 i2 i4e2 enim

O(n!)

Observation #1: many schedules are commutative

Adopt DPOR: Dynamic Partial Order Reduction

C. Flanagan, P. Godefroid, “Dynamic Partial-Order Reduction for Model Checking Software”, POPL ‘05

O( !)nk

Approach: prioritize schedule space exploration

Approach: prioritize schedule space exploration

Assume: fixed time budget Objective: quickly find small failing schedules

… ✗e1 i1 i2 i4e2 enim

Observation #2: selectively mask original events

τ :

… ✗e1 i1 i2 i4e2 enim

Observation #2: selectively mask original events

τ :

e1 e2 ene3 e4ext: e5

τ :

ene3ext: e5e1 e2 e4

… ✗e1 i1 i2 i4e2 enim

Observation #2: selectively mask original events

x

τ :

ene3ext: e5e1 e2 e4

… ✗e1 i1 i2 i4e2 enim

Observation #2: selectively mask original events

x

τ :

ene3ext: e5e1 e2 e4

… ✗e1 i1 i2 i4e2 enim

(Apply Delta Debugging1)

1A Zeller, R. Hildebrandt, “Simplifying and Isolating Failure-Inducing Input”, IEEE ‘02

Observation #2: selectively mask original events

τ :

ene3ext: e5

sub1:

e1 e2 e4

… ✗e1 i1 i2 i4e2 enim

e4 e5 en…

(Apply Delta Debugging1)

1A Zeller, R. Hildebrandt, “Simplifying and Isolating Failure-Inducing Input”, IEEE ‘02

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4e1 e2 e3

foreach i in τ: if i is pending: deliver i # ignore unexpected

… e5e4 en

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4e1 e2 e3

foreach i in τ: if i is pending: deliver i # ignore unexpected

i1 … e5e4 en

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4e1 e2 e3

foreach i in τ: if i is pending: deliver i # ignore unexpected

i1 … e5e4 en

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4e1 e2 e3

foreach i in τ: if i is pending: deliver i # ignore unexpected

i1 i4 … e5e4 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4e1 e2 e3

foreach i in τ: if i is pending: deliver i # ignore unexpected

i1 i4 … e5e4 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4e1 e2 e3

foreach i in τ: if i is pending: deliver i # ignore unexpected

i1 i4 ✗… e5e4 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4e1 e2 e3

foreach i in τ: if i is pending: deliver i # ignore unexpected

i1 i4 ✗… e5e4 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4

i1 i4 ✗… e5e4 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

ene5e4

i1 i4 ✗… e5e4 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

sub2:

ene5e4

i1 i4 ✗… e5e4 enim

e5 en

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

sub2: i1 i4 …

ene5e4

i1 i4 ✗… e5e4 enim

e5 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

sub2: i1 i4 ✔…

ene5e4

i1 i4 ✗… e5e4 enim

e5 enim

Observation #2: selectively mask original events

τ :

ext:

sub1:

… ✗e1 i1 i2 i4e2 enim

sub2:

…. . .

i1 i4 ✔…Explore backtrack points until (i) ✗ or (ii) time budget for sub2 expired

ene5e4

i1 i4 ✗… e5e4 enim

e5 enim

Observation #2: selectively mask original events

Observation #4: shrink external message contents

Observation #1: many schedules are commutative

Approach: prioritize schedule space exploration

Goal: find minimal schedule that produces violation

Minimize internal events after externals minimized

Observation #2: selectively mask original events

Observation #3: some contents should be masked

OutlineIntroductionBackground

Node 1 Node N

Test Coordinator

QA Testbed

Software Under Test

Minimization

Evaluation Conclusion

Randomized Testing with

DEMi

Target Systems

How well does DEMi work?To

tal E

vent

s

0

300

600

900

1200

1500

1800

2100

2400

2700

3000

Case Studyraft-45 raft-46 raft-56 raft-58a raft-58b raft-42 raft-66 spark-2294 spark-3150 spark-9256

1114407718040226823523 300

600

1000

400

17101500

2850

2380

1250

2160

Initial ExecutionAfter Minimization

How well does DEMi work?To

tal E

vent

s

0

300

600

900

1200

1500

1800

2100

2400

2700

3000

Case Studyraft-45 raft-46 raft-56 raft-58a raft-58b raft-42 raft-66 spark-2294 spark-3150 spark-9256

1114407718040226823523 300

600

1000

400

17101500

2850

2380

1250

2160

Initial ExecutionAfter Minimization

80% - 97% Reduction!

How well does DEMi work?To

tal E

vent

s

0306090

120150180210240270300

Case Studyraft-45 raft-46 raft-56 raft-58a raft-58b raft-42 raft-66 spark-2294spark-3150spark-9256

11112529392851

212322 111440

77

180

40

226

82

3523

After MinimizationSmallest Manual Trace

How well does DEMi work?To

tal E

vent

s

0306090

120150180210240270300

Case Studyraft-45 raft-46 raft-56 raft-58a raft-58b raft-42 raft-66 spark-2294spark-3150spark-9256

11112529392851

212322 111440

77

180

40

226

82

3523

After MinimizationSmallest Manual Trace

Factor of 1x - 5x from hand-crafted

170 69

How quickly does DEMi work?Ru

ntim

e in

Sec

onds

0

400

800

1200

1600

2000

2400

2800

3200

3600

4000

Case Studyraft-45 raft-46 raft-56 raft-58a raft-58b raft-42 raft-66 spark-2294 spark-3150 spark-9256

210245427348

10676

69

43482

2132

282170

Overall Minimization(~12 hours) (~3 hours)

(~35 minutes)

170 69

How quickly does DEMi work?Ru

ntim

e in

Sec

onds

0

400

800

1200

1600

2000

2400

2800

3200

3600

4000

Case Studyraft-45 raft-46 raft-56 raft-58a raft-58b raft-42 raft-66 spark-2294 spark-3150 spark-9256

210245427348

10676

69

43482

2132

282170

Overall Minimization

<10 minutes except 3 cases

(~12 hours) (~3 hours)

(~35 minutes)

See the paper for… How we handle non-determinism Handling multithreaded processes Supporting other RPC libraries Sketch for minimizing production traces More in-depth evaluation Related work …

Conclusion

Open source tool: github.com/NetSys/demi

Read our paper! eecs.berkeley.edu/~rcs/research/nsdi16.pdf

Optimistic that these techniques can be successfully applied more broadly

Thanks for your time!Contact me! [email protected]

AttributionsInspiration for slide design: Jay Lorch’s IronFleet slides

Graphic Icons: thenounproject.org logfile: mantisshrimpdesign magnifying glass: Ricardo Moreira disk: Anton Outkine hook: Seb Cornelius bug report: Lemon Liu devil: Mourad Mokrane Putin: Remi Mercier

Production TracesModel: feed partially ordered log into

single machine DEMi

Require: - Partial ordering of all message deliveries - All crash-recoveries logged to disk

Instrumentation Complexity

Related WorkThread Schedule Minimization

•Isolating Failure-Inducing Thread Schedules. SIGSOFT ’02. •A Trace Simplification Technique for Effective Debugging of

Concurrent Programs. FSE ’10. Program Flow Analysis.

•Enabling Tracing of Long-Running Multithreaded Programs via Dynamic Execution Reduction. ISSTA ’07.

•Toward Generating Reducible Replay Logs. PLDI ’11. Best-Effort Replay of Field Failures

•A Technique for Enabling and Supporting Debugging of Field Failures. ICSE ’07.

•Triage: Diagnosing Production Run Failures at the User’s Site. SOSP ’07.

DDmin in more detail

DDmin assumptions

Local vs. global minima

Minimization Pace

Dealing With ThreadsIf you’re lucky: threads are largely independent (Spark)

If you’re unlucky: key insight: A write to shared memory is equivalent to a message delivery

Approach: •interpose on virtual memory, thread scheduler •pause a thread whenever it writes to shared memory / disk

Cf. “Enabling Tracing Of Long-Running Multithreaded Programs Via Dynamic Execution Reduction”, ISSTA ‘07

Dealing With Non-DeterminismInterpose on:

- Timers - Random number generators - Unordered hash values - ID allocation

Stop-gap: replay each schedule multiple times

Complete Results

Runtime Breakdown

Integrating with other RPC libsApp

RPC lib

OS

App

RPC lib

OS

App

RPC lib

OS

DEMiJVM