Scalable Computing on Open Distributed Systems
Jon Weissman
University of Minnesota
National E-Science Center
CLADE 2008
What is the Problem?
• Open distributed systems
  – Tasks submitted to the “system” for execution
  – Workers do the computing: execute a task, return an answer
• The Challenge
  – Computations that are erroneous or late are less useful
  – Causes: failures, errors, hacked or misconfigured nodes
  – Unpredictable time to return answers
• Both local- and wide-area systems
  – Focus on volunteer wide-area systems
Shape of the Solution
• Replication
• Works for all sources of unreliability
  – computation and data
• How to do this intelligently and scalably?
Replication Challenges
• How many replicas?
  – too many: waste of resources
  – too few: application suffers
• Most approaches assume ad-hoc replication
  – under-replicate: task re-execution (higher latency)
  – over-replicate: wasted resources (lower throughput)
• Using information about the past behavior of a node, we can intelligently size the amount of redundancy
Problems with ad-hoc replication
[Figure: task x sent to group A, task y sent to group B; legend: unreliable node, reliable node]
System Model
[Figure: nodes labeled with reputation ratings 0.9, 0.4, 0.8, 0.8, 0.7, 0.8, 0.8, 0.7, 0.4, 0.3]
• Reputation rating r_i
  – degree of node reliability
• Dynamically size the redundancy based on r_i
• Note: variable-sized groups
• Assume no correlated errors (relaxed later)
Smart Replication
• Rating based on past interaction with clients
  – probability (r_i) over a window
    • correct/total or timely/total
  – extend to worker group (assuming no collusion) => likelihood of correctness (LOC)
• Smarter Redundancy
  – variable-sized worker groups
  – intuition: higher-reliability clients => smaller groups
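A minimal sketch of a windowed rating, assuming r_i is simply the fraction of correct (or timely) answers among a node's most recent interactions. The class name `WindowedRating`, the window length, and the prior used before any history exists are illustrative assumptions, not details from the slides.

```python
from collections import deque


class WindowedRating:
    """Reputation rating r_i: fraction of good outcomes over a sliding window."""

    def __init__(self, window=20, prior=0.5):
        # window and prior are assumed values chosen only for illustration.
        self.outcomes = deque(maxlen=window)
        self.prior = prior

    def record(self, ok):
        """Record one interaction: True if the answer was correct (or timely)."""
        self.outcomes.append(bool(ok))

    @property
    def rating(self):
        # With no history yet, fall back to the assumed prior.
        if not self.outcomes:
            return self.prior
        return sum(self.outcomes) / len(self.outcomes)


node = WindowedRating(window=4)
for ok in [True, True, False, True, True]:
    node.record(ok)
# The window keeps only the last 4 outcomes: True, False, True, True.
print(node.rating)  # 0.75
```

A bounded window lets the rating track nodes whose behavior changes over time, at the cost of a noisier estimate for short windows.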
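\[
\mathrm{LOC}(r_1, \ldots, r_{2k+1}) \;=\; \sum_{m=k+1}^{2k+1} \;\; \sum_{\{\varepsilon \,:\, \sum_{i=1}^{2k+1} \varepsilon_i = m\}} \;\; \prod_{i=1}^{2k+1} r_i^{\varepsilon_i} \, (1 - r_i)^{1 - \varepsilon_i}
\]
where \(\varepsilon_i \in \{0, 1\}\) indicates whether worker i returns a correct answer.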
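The likelihood of correctness of a group under majority voting can be computed without enumerating every outcome. The sketch below assumes independent workers (no collusion) and treats the group as correct when a strict majority of its members is correct; the function name `loc_majority` is hypothetical.

```python
def loc_majority(ratings):
    """Probability that a strict majority of independent workers is correct."""
    # dist[m] = probability that exactly m of the workers seen so far are correct.
    dist = [1.0]
    for r in ratings:
        nxt = [0.0] * (len(dist) + 1)
        for m, p in enumerate(dist):
            nxt[m] += p * (1.0 - r)   # this worker is wrong
            nxt[m + 1] += p * r       # this worker is correct
        dist = nxt
    need = len(ratings) // 2 + 1      # smallest strict majority
    return sum(dist[need:])


# Three highly reliable nodes beat five moderately reliable ones.
print(round(loc_majority([0.9, 0.9, 0.9]), 3))  # 0.972
print(round(loc_majority([0.7] * 5), 3))        # 0.837
```

The two printed values illustrate the intuition that higher-reliability workers allow smaller groups for the same likelihood of correctness.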
Terms
• LOC (Likelihood of Correctness) of a group g
  – the ‘actual’ probability of getting a correct or timely answer from a group g of clients
• Target LOC (target)
  – the success rate that the system tries to ensure while forming client groups
Scheduling Metrics
• Guiding metrics
  – throughput: the number of tasks successfully completed in an interval
  – success rate s: ratio of throughput to number of tasks attempted
Algorithm Space
• How many replicas?
  – algorithms compute how many replicas are needed to meet a success threshold
• How to reach consensus?
  – Majority (better for Byzantine threats)
  – M-1 (better for timeliness)
  – M-2 (2 matching)
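The three consensus rules can be sketched as a single decision function. This is an illustration under assumptions: M-1 is read here as "accept the first answer returned", M-2 as "accept an answer once two workers agree on it", and majority as "more than half of the group agrees". The function name and signature are hypothetical.

```python
from collections import Counter


def consensus(answers, group_size, rule="majority"):
    """Return the accepted answer, or None if the rule is not yet satisfied.

    answers: results received so far, in arrival order.
    group_size: number of workers the task was sent to.
    """
    if not answers:
        return None
    if rule == "M-1":
        # Assumed reading of M-1: the first answer to arrive is accepted.
        return answers[0]
    value, count = Counter(answers).most_common(1)[0]
    if rule == "M-2":
        # Accept as soon as any two answers match.
        return value if count >= 2 else None
    if rule == "majority":
        # Accept only when more than half of the whole group agrees.
        return value if count > group_size // 2 else None
    raise ValueError(f"unknown rule: {rule}")


print(consensus(["a", "b", "a"], group_size=3))              # a
print(consensus(["b", "a"], group_size=3, rule="M-1"))       # b
print(consensus(["a", "b", "b"], group_size=5, rule="M-2"))  # b
```

Calling the function after each arriving answer lets M-1 and M-2 finish early, which is why they suit timeliness, while majority waits for enough agreement to resist wrong answers.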
One Scheduling Algorithm
Evaluation
• Baselines
  – Fixed algorithm: statically sized, equal groups; uses no reliability information
  – Random algorithm: forms groups by randomly assigning nodes until target is reached
• Simulated a wide variety of node reliability distributions
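The two baselines can be sketched as follows. This is an illustration, not the evaluated implementation: it assumes majority voting with independent workers, only closes a randomly formed group at odd sizes, and leaves nodes unassigned if they never reach the target. All function names are hypothetical.

```python
import random


def loc_majority(ratings):
    """Probability that a strict majority of independent workers is correct."""
    dist = [1.0]  # dist[m] = P(exactly m workers correct so far)
    for r in ratings:
        nxt = [0.0] * (len(dist) + 1)
        for m, p in enumerate(dist):
            nxt[m] += p * (1.0 - r)
            nxt[m + 1] += p * r
        dist = nxt
    return sum(dist[len(ratings) // 2 + 1:])


def fixed_groups(nodes, size):
    """Fixed baseline: equal-sized groups, no reliability information used."""
    return [nodes[i:i + size] for i in range(0, len(nodes) - size + 1, size)]


def random_groups(ratings, target, rng):
    """Random baseline: add nodes in random order until the target LOC is met.

    ratings: dict mapping node id -> rating r_i.
    """
    nodes = list(ratings)
    rng.shuffle(nodes)
    groups, current = [], []
    for node in nodes:
        current.append(node)
        # Assumption: only odd-sized groups are closed, so a majority exists.
        if len(current) % 2 == 1:
            if loc_majority([ratings[n] for n in current]) >= target:
                groups.append(current)
                current = []
    return groups  # nodes left in `current` stay unassigned


ratings = {f"n{i}": 0.9 for i in range(9)}
groups = random_groups(ratings, target=0.95, rng=random.Random(0))
print([len(g) for g in groups])  # [3, 3, 3]
```

With every rating at 0.9, a single node falls short of a 0.95 target while three nodes reach 0.972, so the random baseline settles on groups of three regardless of the shuffle order.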
Experimental Results: correctness
Simulation: Byzantine behavior only … majority voting
Role of target
• Key parameter
  – hard to specify
• Too large
  – groups will be too large (low throughput)
• Too small
  – groups will be too small (low success rate)
• Instead, adaptively learn it
  – bias toward throughput, success rate s, or both
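The slides do not spell out how the target is learned, so the following is only one possible feedback rule, biased toward success rate: raise the target when the observed success rate falls below a goal, and lower it otherwise to recover throughput. The function name, step size, and bounds are all assumptions.

```python
def adapt_target(target, success_rate, goal, step=0.01, lo=0.5, hi=0.999):
    """Nudge the target LOC toward a desired observed success rate.

    target: current target LOC.
    success_rate: success rate s measured over the last interval.
    goal: success rate the system would like to sustain (assumed input).
    """
    if success_rate < goal:
        target += step   # groups were too small: ask for more redundancy
    else:
        target -= step   # goal met: shrink groups to regain throughput
    # Keep the target inside the assumed bounds.
    return min(hi, max(lo, target))


target = 0.90
for observed in [0.80, 0.85, 0.97]:
    target = adapt_target(target, observed, goal=0.95)
print(round(target, 2))  # 0.91
```

A rule biased toward throughput would instead compare measured throughput across intervals and move the target in whichever direction improved it.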
Adaptive Algorithm
What about time?
• Timeliness
• A result returned after time T is less (or not) useful
  – (1) soft deadlines
    • user interacting, visualization output from computation
  – (2) hard deadlines
    • need to get X results done before HPDC/NSDI/… deadline
• Live experimentation on PlanetLab
• Real application: BLAST
Some PL data
Computation
- both across and within nodes
Communication
- both across and within nodes
Temporal variability
PL Environment
Ridge is our live system that implements reputation
120 wide-area nodes, fully correct, M-1 consensus
3 timeliness environments based on deadlines: D=120s, D=180s, D=240s
Experimental Results: timeliness
Best BOINC (BOINC*), conservative (BOINC-) vs. RIDGE
Makespan Comparison
Collusion
• Suppose errors are correlated?
• How?
  – Widespread bug (hardware or software)
  – Misconfiguration
  – Virus
  – Sybil attack
  – Malicious group
• With Emmanuel Jeannot (Inria)
Key Ideas
• Execute a task => answer groups
  – A_1, A_2, … A_k
  – For each A_i there are associated workers W_i^1, W_i^2, … W_i^n
  – Pcollusion(workers in A_i)
• Learn probability of correlated errors
  – Pcollusion(W_1, W_2)
• Estimate probability of group correlated errors
  – Pcollusion(G), G=[W_1, W_2, W_3, …] via f {Pcollusion(W_i, W_j), for all i,j}
• Rank and select answer
  – Pcollusion(G) and |G|
  – Update matrix: Pcollusion(W_1, W_2)
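A rough sketch of the collusion bookkeeping. The slides leave the combining function f and the ranking rule unspecified, so this example assumes f is the mean of the pairwise estimates and ranks answer groups by |G| × (1 − Pcollusion(G)). The class and function names are hypothetical.

```python
from itertools import combinations


class CollusionMatrix:
    """Pairwise estimates of how often two workers give the same wrong answer."""

    def __init__(self):
        self.together = {}  # pair -> times both workers answered the same task
        self.colluded = {}  # pair -> times they agreed on an answer judged wrong

    def update(self, w1, w2, colluded):
        key = frozenset((w1, w2))
        self.together[key] = self.together.get(key, 0) + 1
        self.colluded[key] = self.colluded.get(key, 0) + int(colluded)

    def pair(self, w1, w2):
        key = frozenset((w1, w2))
        seen = self.together.get(key, 0)
        return self.colluded.get(key, 0) / seen if seen else 0.0

    def group(self, workers):
        # Assumed f: mean of the pairwise collusion estimates in the group.
        pairs = list(combinations(workers, 2))
        if not pairs:
            return 0.0
        return sum(self.pair(a, b) for a, b in pairs) / len(pairs)


def select_answer(answer_groups, matrix):
    """answer_groups: dict mapping answer -> list of workers that returned it."""
    # Assumed ranking: large groups score high unless they look collusive.
    def score(answer):
        workers = answer_groups[answer]
        return len(workers) * (1.0 - matrix.group(workers))
    return max(answer_groups, key=score)


matrix = CollusionMatrix()
for a, b in combinations(["w1", "w2", "w3"], 2):
    matrix.update(a, b, colluded=True)
groups = {"bad": ["w1", "w2", "w3"], "good": ["w4", "w5"]}
print(select_answer(groups, matrix))  # good
```

With an empty matrix the same call would pick the larger group, which is why colluders must first be exposed before the matrix becomes useful.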
Bootstrap Problem
• Building collusion matrix
• Must first “bait” colluders
  – Over-replicate such that majority group is still correct to expose colluders
  – : probability of worker collusion
  – : probability colluders fool the system
• Given group size k
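One way to size the over-replication, sketched under a simplifying assumption: each worker is a colluder independently with probability `p`, and the colluders fool majority voting only when they form a strict majority of the group. The symbols `p` and `eps` and both function names are placeholders, not notation from the slides.

```python
from math import comb


def fooled_probability(k, p):
    """P(colluders form a strict majority in a group of k workers)."""
    need = k // 2 + 1
    return sum(comb(k, m) * p**m * (1 - p) ** (k - m) for m in range(need, k + 1))


def bootstrap_group_size(p, eps, max_k=101):
    """Smallest odd group size whose chance of being fooled is at most eps."""
    for k in range(1, max_k + 1, 2):
        if fooled_probability(k, p) <= eps:
            return k
    return None  # no group size up to max_k is large enough


print(round(fooled_probability(3, 0.3), 3))  # 0.216
print(bootstrap_group_size(0.3, 0.1))        # 9
```

Under this model no group size helps once colluders are expected to be at least half of the workers, which the `None` return makes explicit.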
[Figure: correctness and throughput for three collusion scenarios]
4. 1 group, 30% colluders, always collude
5. Same group, colludes 30% of the time
7. 2 groups (40%, 30% colluders)
Summary
• Reliable, scalable computing
  – correctness and timeliness
• Future work
  – combined models and metrics
  – workflows: coupling data and computation reliability
Visit ridge.cs.umn.edu to learn more