Dataflow Analysis for Concurrent Programs using Datarace Detection Ravi Chugh, Jan W. Voung, Ranjit Jhala, Sorin Lerner LBA Reading Group Michelle Goodstein 6/5/08


Dataflow Analysis for Concurrent Programs using Datarace Detection

Ravi Chugh, Jan W. Voung, Ranjit Jhala, Sorin Lerner

LBA Reading Group
Michelle Goodstein

6/5/08


Outline

- Motivation
- Overview of Radar
- Radar Formalization
- Radar Optimizations
- Radar(Relay)
- Evaluation & Results
- Conclusions


Motivation

Want to apply dataflow analysis to concurrent programs without:
- Requiring annotations
- Escape analysis (loss of precision)
- Custom concurrency analysis
- Model checking (combinatorial explosion)


Introducing Radar

Scheme for concurrent dataflow analysis:
- Starts with a sequential dataflow analysis
- Race detection turns it into a concurrent analysis
- Can use already-created race detectors

We’ll see it applied to Relay


Outline

- Motivation
- Overview of Radar
- Radar Formalization
- Radar Optimizations
- Radar(Relay)
- Evaluation & Results
- Conclusions


Assumptions

For each procedure, either:
- Have access to its code, or
- Have access to a sound summary

Shared memory is sequentially consistent


Radar’s Key Insights

Adjustability of sequential analysis:
- Concurrent dataflow facts are a subset of sequential dataflow facts
- “Missing facts”: facts that can be killed by other threads
- Suppose we have a fact about lvalue l
  - “At line y, l is not null”
- Enough to know if another thread can write to l concurrently
  - “At line z, another thread can write to l”


Radar’s Key Insights: Pseudo-Races

- Identify the “missing facts” and remove them from the sequential analysis
- Solution: insert a pseudo-read for location l
- Ask a race detector: “is there a race at this point for l?”
  - Yes: another thread can write. Remove the fact.
  - No: no other thread can write. Retain the fact.

Producer/Consumer examples follow:
- Non-null dataflow analysis
- Sequential analysis on left
- Facts “killed” by concurrency crossed out in red


First example: non-null facts

Producer–Consumer:
- Pseudo-read for px->data at line PA
- Consumer thread can execute line C5
- Race! px->data is crossed out at line PA


Second example: non-null facts

Modified producer/consumer:
- Still race-free, other than perf_ctr
- Now, producer acquires/releases lock twice


Second example: non-null facts

- Insert pseudo-read at P5 on px->data
- Races with C5 write to cx->data
- Kills the px->data fact at P5 and wherever it propagates
- At P8, not necessarily true that px->data is non-null
  - Null pointer dereference!
- Note: no data races (except on perf_ctr)

We can detect this!


Outline

- Motivation
- Overview of Radar
- Radar Formalization
- Radar Optimizations
- Radar(Relay)
- Evaluation & Results
- Conclusions


Sequential Dataflow Analysis

- Representation: nodes in CFG
- Flow function F(n,d,p): facts true after point p
  - n: node, d: incoming dataflow fact, p: program point
- lvals(f): lvalues that fact f depends on
- ThreadKill(p,l): computes whether a race can occur on l at program point p

Adjusted flow function (keeps only the facts none of whose lvalues can race at p):

$$F_{adj}(n,d,p) = \{\, f \in F(n,d,p) \mid \forall l \in lvals(f).\ \neg ThreadKill(p,l) \,\}$$
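The adjusted flow function can be sketched in a few lines of Python. This is only an illustration under stated assumptions: facts are plain strings, and `toy_flow`, `TOY_LVALS`, and `toy_thread_kill` are hypothetical stand-ins for a real sequential analysis and a real race detector. The rule applied is "keep a fact only if none of the lvalues it depends on can race at p".

```python
from typing import Callable, Iterable, Set

Fact = str  # e.g. "nonnull(px->data)"; a hypothetical encoding of a dataflow fact


def adjusted_flow(seq_flow: Callable[[str, Set[Fact], str], Set[Fact]],
                  lvals: Callable[[Fact], Iterable[str]],
                  thread_kill: Callable[[str, str], bool],
                  n: str, d: Set[Fact], p: str) -> Set[Fact]:
    """Facts of the sequential flow function that no other thread can kill at p."""
    return {f for f in seq_flow(n, d, p)
            if not any(thread_kill(p, l) for l in lvals(f))}


def toy_flow(n: str, d: Set[Fact], p: str) -> Set[Fact]:
    # Hypothetical sequential transfer function: this node changes nothing.
    return set(d)


# Hypothetical lvals(f): the lvalues each fact depends on.
TOY_LVALS = {
    "nonnull(px)": {"px"},
    "nonnull(px->data)": {"px", "px->data"},
}


def toy_thread_kill(p: str, l: str) -> bool:
    # Hypothetical detector answer: px->data may be written concurrently at P5 only.
    return (p, l) == ("P5", "px->data")


incoming = {"nonnull(px)", "nonnull(px->data)"}
result_p5 = adjusted_flow(toy_flow, TOY_LVALS.__getitem__, toy_thread_kill,
                          "skip", incoming, "P5")
result_p2 = adjusted_flow(toy_flow, TOY_LVALS.__getitem__, toy_thread_kill,
                          "skip", incoming, "P2")
```

At the hypothetical point P5 the fact about px->data is dropped; at P2 both facts survive because the toy detector reports no race there.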


Is Radar Sound?

Suppose there is an oracle function O:
- Given a program point p and a location l
- Returns whether a race is possible

Suppose Radar is given a race detector R:
- Radar is sound if O(p,l) implies R(p,l)
  - If a race is possible, R will report it
  - R can also return false positives
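The soundness condition "O(p,l) implies R(p,l)" can be checked mechanically on a toy example. A real oracle is not computable in general, so the `toy_oracle` below is a hypothetical hand-written table, and the two detectors are made-up examples of a conservative and an optimistic answer.

```python
from itertools import product


def is_sound(oracle, detector, points, lvalues) -> bool:
    """True if the detector reports every race the oracle says is possible."""
    return all(detector(p, l)
               for p, l in product(points, lvalues)
               if oracle(p, l))


points = ["P4", "P5"]
lvalues = ["px", "px->data"]


def toy_oracle(p, l):
    # Hypothetical ground truth: the only possible race is on px->data at P5.
    return (p, l) == ("P5", "px->data")


def conservative_detector(p, l):
    # Over-approximates: flags px->data everywhere (a false positive at P4).
    return l == "px->data"


def optimistic_detector(p, l):
    # Never reports a race, so it misses the real one.
    return False


conservative_ok = is_sound(toy_oracle, conservative_detector, points, lvalues)
optimistic_ok = is_sound(toy_oracle, optimistic_detector, points, lvalues)
```

The conservative detector is sound despite its false positive; the optimistic one is not.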


Outline

- Motivation
- Overview of Radar
- Radar Formalization
- Radar Optimizations
- Radar(Relay)
- Evaluation & Results
- Conclusions


Radar Optimizations

- Reduce the number of calls to ThreadKill
- Handle function calls


Reduce ThreadKill calls

Running the race detector on the cross product of program points and lvalues is expensive

Many program points have similar behavior. For each lvalue in a region, either:
- Racy for the entire region
- Not racy for the entire region

Compute once for the entire region
- Region Map: points → regions
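A minimal sketch of the region idea, assuming a region map from points to region labels and a per-region race query. The names `make_region_thread_kill`, `toy_regions`, and `toy_region_is_racy` are hypothetical; the point is only that ThreadKill answers are computed once per (region, lvalue) pair and then reused for every point in that region.

```python
def make_region_thread_kill(region_map, region_is_racy):
    """Build a ThreadKill(p, l) that asks the detector once per (region, lvalue)."""
    cache = {}

    def thread_kill(point, lvalue):
        key = (region_map(point), lvalue)
        if key not in cache:
            cache[key] = region_is_racy(*key)  # expensive detector query
        return cache[key]

    return thread_kill


# Hypothetical region map: four program points fall into two regions.
toy_regions = {"P1": "no-lock", "P2": "lock-held",
               "P3": "lock-held", "P4": "no-lock"}
detector_calls = []


def toy_region_is_racy(region, lvalue):
    # Hypothetical detector: records each query, racy only without the lock.
    detector_calls.append((region, lvalue))
    return region == "no-lock"


thread_kill = make_region_thread_kill(toy_regions.__getitem__, toy_region_is_racy)
answers = {p: thread_kill(p, "px->data") for p in toy_regions}
```

Four points are queried, but the detector runs only twice, once per region.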


Incorporating Function Calls

To handle function calls:
- Introduce a new kind of region: Interprocedural Summary Region (SumReg)
- At a particular call site, approximately summarizes the regions that execution can pass through during the call

To maintain soundness:
- Suppose there is a transitively reachable path from a callsite cs to a racy region
- The summary region must report that cs is racy


Radar’s Requirements

- Race Detection Engine: Region, Lvalue → raciness
- Region Map: Points → Race-equivalent Regions
- Summary Region Map: Callsites → Summary Regions


Outline

- Motivation
- Overview of Radar
- Radar Formalization
- Radar Optimizations
- Radar(Relay)
- Evaluation & Results
- Conclusions


Relay

Relay:
- Static race detection tool
- Lockset-based
- Works bottom up
- Scales to the Linux kernel


Relay

Uses relative lockset analysis (L+, L-):
- L+ : locks definitely acquired since function entry point
- L- : locks possibly released since function entry point

Relative lockset for the exit point of a function is stored as a summary of the function’s behavior
- Approximates the effect of a function call on the locks currently held
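A rough sketch of relative locksets on straight-line code. The update rules below (an acquire adds to L+ and removes from L-, a release does the opposite) and the way a callee summary is applied at a call are assumptions made for illustration, not necessarily Relay's exact definitions. Branches and joins are ignored, and `RelLockset`, `summarize`, and `apply_summary` are hypothetical names.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple


class RelLockset(NamedTuple):
    plus: FrozenSet[str]   # L+: locks definitely acquired since function entry
    minus: FrozenSet[str]  # L-: locks possibly released since function entry


EMPTY = RelLockset(frozenset(), frozenset())


def step(ls: RelLockset, op: str, lock: str) -> RelLockset:
    """Assumed update rule for a single lock or unlock statement."""
    if op == "lock":
        return RelLockset(ls.plus | {lock}, ls.minus - {lock})
    if op == "unlock":
        return RelLockset(ls.plus - {lock}, ls.minus | {lock})
    raise ValueError(f"unknown operation: {op}")


def summarize(body: Iterable[Tuple[str, str]]) -> RelLockset:
    """Relative lockset at the exit of a straight-line function body."""
    ls = EMPTY
    for op, lock in body:
        ls = step(ls, op, lock)
    return ls


def apply_summary(caller: RelLockset, callee: RelLockset) -> RelLockset:
    """Assumed effect of a call on the caller's relative lockset."""
    return RelLockset((caller.plus - callee.minus) | callee.plus,
                      (caller.minus - callee.plus) | callee.minus)


# Hypothetical callee g: releases m, re-acquires it, then releases it again.
g_summary = summarize([("unlock", "m"), ("lock", "m"), ("unlock", "m")])
# Caller h holds m before calling g.
after_call = apply_summary(RelLockset(frozenset({"m"}), frozenset()), g_summary)
```

After the call, m is no longer in L+ and appears in L-, which is the information a summary is meant to carry up to callers.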


Radar(Relay)

Race Detection Engine
- Relay

Region Map
- Maps program point → (g, (L+,L-))
  - g: function name
  - (L+,L-): relative lockset summary for function g

Summary Region Map
- Function g is called at the call site cs in function h
- Computes AllUnlocks(cs) = L- in g
- Suppose the region is (h, (L+,L-))
- Returns (h, (L+ - AllUnlocks(cs), L- ∪ AllUnlocks(cs)))
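The summary region computation can be sketched as a small set operation. This reads the second component of the returned region as L- ∪ AllUnlocks(cs), which is an interpretation of the slide; `summary_region` and the lock names are hypothetical.

```python
from typing import FrozenSet, Tuple

Region = Tuple[str, Tuple[FrozenSet[str], FrozenSet[str]]]  # (function, (L+, L-))


def summary_region(caller_region: Region, all_unlocks: FrozenSet[str]) -> Region:
    """Region summarizing everything that may happen during the call at cs.

    Locks the callee may release can no longer be counted as held (removed
    from L+) and must be counted as possibly released (added to L-).
    """
    h, (plus, minus) = caller_region
    return (h, (plus - all_unlocks, minus | all_unlocks))


# Hypothetical call site in h: h holds locks m and n, callee g may release m.
region_at_cs: Region = ("h", (frozenset({"m", "n"}), frozenset()))
all_unlocks_cs = frozenset({"m"})  # AllUnlocks(cs) = L- of g
sum_reg = summary_region(region_at_cs, all_unlocks_cs)
```

Only n remains definitely held across the call, so a pseudo-read protected only by m would be treated as unprotected at this call site.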


Pseudoreads

Suppose at some program point p fact f holds
- RegionMap(p): region (g, (L+,L-))
- For all lvalues l ∈ lvals(f):
  - Pretend to read l at p with relative lockset (L+,L-)
  - For any other lvalue m which might be aliased…
    - Intersection of positive locksets is empty → report race
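The lockset check for a pseudo-read can be sketched as follows. It assumes that lock names are already expressed in a common vocabulary (a real tool has to rebase relative locksets first) and that an alias oracle is available; `pseudo_read_races`, `toy_may_alias`, and the lock name `buf_lock` are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def pseudo_read_races(lvalue: str,
                      read_plus: FrozenSet[str],
                      writes: Iterable[Tuple[str, FrozenSet[str]]],
                      may_alias: Callable[[str, str], bool]) -> bool:
    """Report a race if some possibly aliased write shares no held lock with the read.

    writes: (written lvalue m, positive lockset held at that write) pairs
    collected from other threads.
    """
    return any(may_alias(lvalue, m) and not (read_plus & write_plus)
               for m, write_plus in writes)


def toy_may_alias(a: str, b: str) -> bool:
    # Hypothetical alias oracle: px and cx point to the same buffer.
    return {a, b} <= {"px->data", "cx->data"}


# The consumer writes cx->data while holding buf_lock.
consumer_writes = [("cx->data", frozenset({"buf_lock"}))]

race_without_lock = pseudo_read_races("px->data", frozenset(),
                                      consumer_writes, toy_may_alias)
race_with_lock = pseudo_read_races("px->data", frozenset({"buf_lock"}),
                                   consumer_writes, toy_may_alias)
```

A pseudo-read made without the lock races with the consumer's write; one made while holding the same lock does not.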


Relay with Radar: Implementation

First Pass: Run Relay
- Computes relative lockset associated with each function

Second Pass: Sequential Analysis
- Pretend no races exist
- Collect all the possible queries about races

Third Pass: Run Relay, Adding Pseudo-reads
- Insert a pseudo-access wherever a race query exists

Fourth Pass: Adjusted Sequential Analysis
- At each pseudo-access for l, query the race detector
- If a race could occur, kill facts depending on l
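The passes can be sketched end to end on a toy straight-line program. Everything here is a simplification: the program is a list of (point, generated facts) pairs, the first pass is folded into the hypothetical `detect_races` callback, and the labels P4, P5, P8 are only illustrative. The sketch does show why the final pass is a re-analysis rather than a filter: a fact killed at one point stays dead downstream.

```python
def analyze(program, lvals, racy=frozenset()):
    """Forward analysis over a straight-line program.

    program: list of (point, facts generated at that point).
    racy: set of (point, lvalue) pairs the race detector flagged.
    Returns the facts that hold after each point.
    """
    out, facts = {}, set()
    for point, gen in program:
        facts = facts | set(gen)
        # Adjusted step: drop facts whose lvalues may race at this point.
        facts = {f for f in facts
                 if not any((point, l) in racy for l in lvals(f))}
        out[point] = facts
    return out


def radar(program, lvals, detect_races):
    seq = analyze(program, lvals)               # second pass: pretend no races
    queries = {(p, l) for p, fs in seq.items()  # collect all race queries
               for f in fs for l in lvals(f)}
    racy = detect_races(queries)                # third pass: pseudo-reads
    return analyze(program, lvals, racy)        # fourth pass: adjusted analysis


def toy_lvals(fact):
    # "nonnull(px->data)" depends on the lvalue "px->data".
    return {fact[len("nonnull("):-1]}


def toy_detect_races(queries):
    # Hypothetical detector: only the pseudo-read of px->data at P5 races.
    return {q for q in queries if q == ("P5", "px->data")}


toy_program = [("P4", {"nonnull(px->data)"}), ("P5", set()), ("P8", set())]
sequential = analyze(toy_program, toy_lvals)
adjusted = radar(toy_program, toy_lvals, toy_detect_races)
```

The sequential result claims px->data is non-null at P8; the adjusted result does not, because the fact was killed at P5 and never regenerated.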


Outline

- Motivation
- Overview of Radar
- Radar Formalization
- Radar Optimizations
- Radar(Relay)
- Evaluation & Results
- Conclusions


Evaluation

Focus on non-null dataflow analysis

Used 4 black boxes to answer race queries:
- Steensgaard’s pointer analysis
  - If a value is reachable from a global → true
- Radaralias
  - Region map always returns empty lockset
  - Answers the question of whether any two values alias
- Radar
- Optimistic
  - Always returns false
  - Unsound, and overly precise


Results


Terminology

Blob nodes:
- Many lvalues on the heap are merged into one node by alias analysis
- Can lead to false positives when checking null-dereferences
- Other work shows it is hard to account for heap structures
- Next figure excludes “blob nodes” for pointer dereferences

Non-blob dereferences:
- Apache: 52%
- SSL: 76%
- Linux: 71%


Results


Results

- Consider the gap between Seq and Steensgaard
- Check how much is bridged by Radar
  - With and without locks


Outline

- Motivation
- Overview of Radar
- Radar Formalization
- Radar Optimizations
- Radar(Relay)
- Evaluation & Results
- Conclusions


Conclusions

Radar is:
- Scalable
- Not tied to particular concurrency models
- Tunable to desired precision

Radar(Relay):
- Good precision relative to sequential, Steensgaard

Future Work:
- More types of analysis
- Race detection for other concurrency constructs