Implementation and Evaluation of a Protocol for Recording Process
Documentation in the Presence of Failures
Zheng Chen and Luc Moreau
[email protected]
University of Southampton
Outline
Motivation
Protocol Overview
Implementation
Experimental Setup
Experimental Results & Analysis
Conclusions & Future Work
The provenance of a data product refers to the process that led to that data product
Process documentation is a computer-based representation of a past process for determining provenance
Process documentation consists of a set of p-assertions
Process documentation is stored in provenance stores
Provenance is obtained by querying provenance stores
PReP (Groth 04-08)
A protocol to record process documentation
Multiple provenance stores are interlinked to enable retrievability of distributed process documentation
[Figure: Actor1-Actor4 exchange invocation and result messages; each actor records invocation and result p-assertions in its own provenance store (PS1-PS4), and link p-assertions connect the stores into a pointer chain.]
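To make the pointer chain concrete, here is a minimal sketch in Java (the implementation language used later in the talk) of the two kinds of record involved; the class and field names are illustrative assumptions, not PReP's actual p-assertion schema:

import java.net.URI;

// Hypothetical sketch of the records behind a pointer chain: an actor
// documents each interaction with invocation/result p-assertions, plus a
// link naming the store that documents the other side of the interaction.
final class PAssertion {
    final String interactionId;  // identifies one invocation/result exchange
    final String kind;           // "invocation" or "result"
    final String content;        // documentation of the message itself

    PAssertion(String interactionId, String kind, String content) {
        this.interactionId = interactionId;
        this.kind = kind;
        this.content = content;
    }
}

// Following links from store to store (PS1 -> PS2 -> ...) is what makes
// distributed process documentation retrievable.
final class Link {
    final String interactionId;
    final URI nextStore;  // e.g. PS2, the store used by the receiving actor

    Link(String interactionId, URI nextStore) {
        this.interactionId = interactionId;
        this.nextStore = nextStore;
    }
}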
Failures
Provenance store crashes and communication failures
We do not consider application failures, e.g., actor crashes
These failures yield poor-quality process documentation:
Incomplete
Disconnected
[Figure: the same scenario with PS2 failed; invocation and result p-assertions destined for PS2 cannot be recorded, and the broken link to PS2 breaks the pointer chain PS1 -> PS2 -> PS3 -> PS4.]
Requirements
Guaranteed Recording: after a process completes, the entire documentation of the process must eventually be recorded in provenance stores
Link Accuracy: all the links recorded during a process must eventually be accurate, to enable retrievability of distributed documentation
Efficient Recording: the protocol should be efficient and introduce minimal overhead
F-PReP
A protocol for recording process documentation in the presence of failures
Derives from PReP and inherits its generic nature
Introduces an Update Coordinator to facilitate updating links (we assume the coordinator does not crash)
Actor's side:
Uses timeout and retransmission to record p-assertions
Chooses alternative provenance stores in case of failures
Requests the coordinator to update links (see the code sketch after the figure below)
Provenance store's side:
Replies with an acknowledgement only after it has successfully recorded the p-assertions in its persistent storage
[Figure: F-PReP in the same scenario; with PS2 failed, Actor2 records its invocation and result p-assertions in an alternative store PS2' and sends a repair request to the Update Coordinator, which updates the stale links so the pointer chain runs PS1 -> PS2' -> PS3 -> PS4.]
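The actor-side behaviour above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the released client library: StoreStub, CoordinatorStub, sendAndAwaitAck, and requestRepair are invented names, and PAssertion refers to the sketch given earlier.

import java.io.IOException;
import java.net.URI;
import java.util.List;

// Illustrative sketch of F-PReP's actor-side remedial actions: record a
// batch with a timeout, retransmit on failure, fall back to an alternative
// provenance store, then ask the coordinator to repair the links that
// still point at the failed store. All names here are assumptions.
final class FPrepClient {
    private final StoreStub primary;            // intended provenance store
    private final List<StoreStub> alternatives; // fallback stores
    private final CoordinatorStub coordinator;
    private final int maxRetries;

    FPrepClient(StoreStub primary, List<StoreStub> alternatives,
                CoordinatorStub coordinator, int maxRetries) {
        this.primary = primary;
        this.alternatives = alternatives;
        this.coordinator = coordinator;
        this.maxRetries = maxRetries;
    }

    void record(List<PAssertion> batch, long timeoutMs) throws IOException {
        // 1. Retry the intended store first: cheap, and covers transient failures.
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (primary.sendAndAwaitAck(batch, timeoutMs)) return;
        }
        // 2. Choose an alternative store. An ack is returned only once the
        //    store has the batch in persistent storage, so an ack means durable.
        for (StoreStub alt : alternatives) {
            if (alt.sendAndAwaitAck(batch, timeoutMs)) {
                // 3. Links elsewhere still name the failed store; ask the
                //    coordinator to update them to point at the alternative.
                coordinator.requestRepair(primary.uri(), alt.uri(), batch);
                return;
            }
        }
        throw new IOException("no provenance store reachable");
    }
}

interface StoreStub {
    boolean sendAndAwaitAck(List<PAssertion> batch, long timeoutMs);
    URI uri();
}

interface CoordinatorStub {
    void requestRepair(URI failedStore, URI replacement, List<PAssertion> batch);
}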
Implementation
Provenance Store
Implemented as a Java Servlet
Backend store (Berkeley DB)
Disk cache: OS buffers are flushed to disk before an ack is provided to the actor (see the sketch below)
Update Plug-In
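The "flush before ack" point is what makes an acknowledgement meaningful. A minimal store-side sketch, assuming the Berkeley DB Java Edition API named on this slide; the DurableRecorder class itself is illustrative:

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.OperationStatus;

// Sketch: persist the batch and force it to stable storage before
// acknowledging, so an ack received by an actor implies durability.
final class DurableRecorder {
    private final Environment env;
    private final Database db;

    DurableRecorder(Environment env, Database db) {
        this.env = env;
        this.db = db;
    }

    boolean recordBatch(String key, byte[] batch) throws DatabaseException {
        DatabaseEntry k = new DatabaseEntry(key.getBytes());
        DatabaseEntry v = new DatabaseEntry(batch);
        if (db.put(null, k, v) != OperationStatus.SUCCESS) {
            return false;
        }
        env.sync();   // flush buffered writes to disk: the "disk cache" cost
        return true;  // the servlet sends the ack only after this returns
    }
}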
Client Side Library
Remedial actions that cope with failures
Multithreading for the creation and recording of p-assertions
A local file store (Berkeley DB) for temporarily maintaining p-assertions
Update Coordinator
Implemented as a Java Servlet
Berkeley DB is also employed to maintain request information (see the sketch below)
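For illustration, such a coordinator servlet might look roughly like this; the request parameters, the helper methods, and the asynchronous update step are all assumptions rather than the actual implementation:

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical coordinator sketch: accept a repair request, persist it
// (Berkeley DB, per this slide), acknowledge, then update the stale links
// so the pointer chain becomes accurate again.
public class UpdateCoordinatorServlet extends HttpServlet {
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String failedStore = req.getParameter("failedStore");      // assumed parameter
        String replacement = req.getParameter("replacementStore"); // assumed parameter
        String interaction = req.getParameter("interactionId");    // assumed parameter
        persistRequest(failedStore, replacement, interaction);
        resp.setStatus(HttpServletResponse.SC_ACCEPTED); // ack before updating
        scheduleLinkUpdate(failedStore, replacement, interaction);
    }

    private void persistRequest(String failed, String repl, String id) {
        // store request information durably so no update is lost
    }

    private void scheduleLinkUpdate(String failed, String repl, String id) {
        // asynchronously rewrite links in the affected stores (Update Plug-In)
    }
}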
Performance Study
Throughput of provenance store and coordinator
Scalability of update coordinator
Failure-free recording performance
Overhead of taking remedial actions
Performance impact on application
Experimental Setup
Iridis cluster (over 1,000 processor cores), Gigabit Ethernet
Tomcat 5.0 container, Berkeley DB Java Edition database, Java 1.5
A generator is used on an actor's side to inject random failure events:
Failure to submit a batch of p-assertions to a provenance store
Failure to receive an acknowledgement from a provenance store before a timeout
A failure event is generated based on a failure rate, i.e., the number of failure events occurring over a total number of recordings (see the sketch below)
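A minimal sketch of such a generator, assuming (as an illustration) that the two failure modes are injected with equal probability:

import java.util.Random;

// Sketch of the failure-injection generator: given a failure rate
// (failure events per total recordings), randomly decide whether a
// recording attempt should be treated as failed, and in which way.
final class FailureGenerator {
    enum Event { NONE, SEND_FAILURE, ACK_TIMEOUT }

    private final double failureRate;  // e.g. 0.5 for a 50% failure rate
    private final Random random = new Random();

    FailureGenerator(double failureRate) {
        this.failureRate = failureRate;
    }

    Event next() {
        if (random.nextDouble() >= failureRate) {
            return Event.NONE;
        }
        // The two injected failure modes listed above (equal split assumed):
        return random.nextBoolean()
                ? Event.SEND_FAILURE   // batch never reaches the store
                : Event.ACK_TIMEOUT;   // no ack before the timeout
    }
}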
1. Provenance Store (PS) Throughput
Setup: up to 512 clients sending 10k p-assertions to 1 PS in 10 min
Hypothesis: the disk cache may reduce a provenance store's throughput
Result: 20% decrease in throughput
2. Coordinator Throughput
Setup: up to 512 clients sending 100 requests to 1 coordinator in 10 min
Hypothesis: the coordinator's throughput is high
Result: 30,000*100 repair requests accepted in 10 min
3. Throughput Experiment with Failures (1 client)
Setup: 1 client sending 10k p-assertions to 1 PS; 1 alternative PS and 1 coordinator used in the case of failures
Hypotheses: (a) resending to the same PS is preferred over an alternative PS for transient failures; (b) the update coordinator is not a bottleneck
A client sends at most 200*100 repair requests (the maximum is seen when the failure rate is 50%)
Coordinator throughput: 30,000*100 requests/10 min
This implies that the coordinator can support a large number of clients (50-100?) without being a bottleneck
4. Throughput Experiment with Failures (128 clients)
Setup: 128 clients sending 10k p-assertions to 1 PS; 1 alternative PS and 1 coordinator used in the case of failures
Hypotheses: (a) resending to an alternative PS is preferred over the same PS; (b) the coordinator is not a bottleneck
128 clients send at most 750*100 repair requests (the maximum is seen when the failure rate is 50%)
Coordinator throughput: 30,000*100 requests/10 min
This implies that the coordinator can support a large number of clients without being a bottleneck
5. Failure-free Recording Performance
Setup: 1 client recording 10,000 p-assertions (10 kB each) to 1 PS; 100 p-assertions shipped in a single batch
Hypothesis: the disk cache causes overhead
Results: (a) with PReP, 900 p-assertions (10 kB each) may be lost if the PS's OS crashes
(b) F-PReP incurs 13.8% overhead compared to PReP
6. Overhead of Taking Remedial Actions
Setup: 1 client recording 100 p-assertions to 1 PS; 1 alternative PS and 1 coordinator used in the case of failures
Hypothesis: remedial actions have acceptable overhead
Result: <10% overhead, compared to the failure-free recording time
7. Performance Impact on Application
Amino Acid Compressibility Experiment (ACE)
High-performance and fine-grained, thus representative
One run of ACE: 20 parallel jobs; 54,000 interactions per job
Extremely detailed process documentation: 1.08 GB of p-assertions per job in 25 minutes
Recording Performance in ACE
Setup: 5 PSs and 1 coordinator; multithreading for the creation and recording of p-assertions
Hypothesis: F-PReP has acceptable recording overhead
Results: (a) similar overhead (12%) to PReP on application performance when no failure occurs
(b) timeout and queue management affect performance
Impact of Queue Management on Performance
Hypothesis: flow control on the queue affects performance
Conclusions: (a) the result supports our hypothesis
(b) we can monitor the queue and take actions, e.g., employing the local file store (see the sketch below)
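The queue-management remedy in conclusion (b) can be sketched as follows; the capacity, the spill-to-disk policy, and the interface names are illustrative assumptions:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of flow control on the recording queue: p-assertions awaiting
// recording sit in a bounded in-memory queue; when the recording threads
// fall behind and the queue fills, overflow is diverted to the local file
// store instead of blocking the application.
final class PAssertionQueue {
    private final BlockingQueue<byte[]> queue =
            new ArrayBlockingQueue<byte[]>(1024);  // assumed capacity
    private final LocalFileStore overflow;         // Berkeley DB-backed, per the
                                                   // Implementation slide

    PAssertionQueue(LocalFileStore overflow) {
        this.overflow = overflow;
    }

    void enqueue(byte[] pAssertion) {
        if (!queue.offer(pAssertion)) {   // queue full: recording is too slow
            overflow.append(pAssertion);  // spill to disk; do not block ACE
        }
    }

    byte[] takeNext() throws InterruptedException {
        byte[] next = queue.poll();
        return (next != null) ? next : overflow.takeOldest(); // drain spill
    }
}

interface LocalFileStore {
    void append(byte[] pAssertion);
    byte[] takeOldest() throws InterruptedException;
}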
8. Quality of Recorded Process Documentation
Setup: using F-PReP and PReP to record p-assertions; querying the PS to verify the recorded documentation
Results: (a) PReP: incomplete; F-PReP: complete
(b) PReP: irretrievable; F-PReP: retrievable
Conclusions & Future Work
The coordinator does not affect an actor's recording performance.
In an application, F-PReP has recording overhead similar to PReP's when there is no failure.
Although F-PReP introduces overhead in the presence of failures, we believe this overhead is acceptable, given that it records high-quality (i.e., complete and retrievable) process documentation.
We are currently investigating how to create process documentation when an application has its own schemes for tolerating application-level failures.
In future work, we plan to make use of the process documentation recorded in the presence of failures to diagnose failures.
Questions?
Thank you!