Programming the cloud with Skywriting
Derek Murray

Outline
• Existing parallel programming models
• Introduction to Skywriting
• Ongoing work

SMP programming
• Symmetric multiprocessing
  – All cores share the same address space
  – Usually assumes cache coherency
• Multi-threaded programs
  – Threads created dynamically
  – Shared, writable data structures
  – Synchronisation
    • Atomic compare-and-swap
    • Mutexes, semaphores
    • {Software, Hardware} Transactional Memory
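
To make the shared-memory model concrete, here is a minimal Python sketch of dynamically created threads updating a shared, writable structure under a mutex. The names `increment_many` and `run_counter` are illustrative assumptions, not part of the talk.

```python
import threading


def increment_many(counter, lock, times):
    """Add 1 to the shared counter `times` times, holding the mutex for each update."""
    for _ in range(times):
        with lock:  # only one thread at a time may modify the shared dict
            counter["value"] += 1


def run_counter(num_threads=4, times=10_000):
    """Start threads dynamically, let them share one counter, and return its final value."""
    counter = {"value": 0}  # shared, writable data structure
    lock = threading.Lock()
    threads = [
        threading.Thread(target=increment_many, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

With the lock in place, every increment is applied, so the result should equal `num_threads * times`.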

Distributed programming
• Shared nothing*
  – Processors communicate by message passing
  – Standard assumption for large supercomputers
  – ...and data centres
  – ...and recent manycore machines (e.g. Intel SCC)
• Explicit messages
  – MPI, Pregel, Actor-based
• Implicit messages: task parallelism
  – MapReduce, Dryad
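
As a rough illustration of explicit message passing, the Python sketch below has a worker that shares no state with its caller and communicates only through an inbox and an outbox. A thread and two `queue.Queue` objects stand in for separate processors and a network, which is an assumption made to keep the example self-contained; `doubling_actor` and `run_doubling_actor` are hypothetical names.

```python
import queue
import threading


def doubling_actor(inbox, outbox):
    """Receive numbers from the inbox and reply with double each one; None means stop."""
    while True:
        message = inbox.get()
        if message is None:
            return
        outbox.put(message * 2)


def run_doubling_actor(values):
    """Send each value as a message and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    actor = threading.Thread(target=doubling_actor, args=(inbox, outbox))
    actor.start()
    for value in values:
        inbox.put(value)
    inbox.put(None)  # shutdown message
    actor.join()
    return [outbox.get() for _ in values]
```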

Task parallelism
• Master-worker architecture
  – Master maintains task queue
  – Workers execute independent tasks in parallel
• Fault tolerance
  – Re-execute tasks on failed workers
  – Speculatively re-execute “slow” tasks
• Load balancing
  – Workers consume tasks at their own rate
  – Task granularity must be optimised
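
The master-worker pattern above can be sketched in a few lines of Python. This is a single-machine approximation in which threads play the role of workers; fault tolerance and speculative re-execution are left out, and `run_master_worker` is a hypothetical name.

```python
import queue
import threading


def run_master_worker(tasks, num_workers=3):
    """tasks is a list of (task_id, function, argument) tuples; returns {task_id: result}."""
    task_queue = queue.Queue()  # the master's task queue
    for task in tasks:
        task_queue.put(task)

    results = {}
    results_lock = threading.Lock()

    def worker():
        # Each worker pulls tasks at its own rate, which balances the load.
        while True:
            try:
                task_id, func, arg = task_queue.get_nowait()
            except queue.Empty:
                return  # no tasks left
            value = func(arg)
            with results_lock:
                results[task_id] = value

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results
```

Because workers pull from the queue rather than being assigned work up front, a slow worker simply completes fewer tasks. Very small tasks make queue traffic dominate, which is one reason task granularity matters.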

Task graph
[Figure: two tasks, A and B]
• A runs before B
• B depends on A

MapReduce
• Two kinds of task: map and reduce
• User-specified map and reduce functions
  – map() a record to a list of key-value pairs
  – reduce() a key and the list of related values
• Tasks apply functions to data partitions
[Figure: map tasks (M) and reduce tasks (R)]
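
A word count is the usual way to illustrate the two user-specified functions. The Python sketch below runs everything sequentially in one process, so it shows only the data flow, not the parallel execution; `map_fn`, `reduce_fn` and `map_reduce` are illustrative names, not the API of any MapReduce implementation.

```python
from collections import defaultdict


def map_fn(record):
    """map(): turn one record (a line of text) into a list of key-value pairs."""
    return [(word, 1) for word in record.split()]


def reduce_fn(key, values):
    """reduce(): combine a key and the list of values related to it."""
    return key, sum(values)


def map_reduce(partitions, map_fn, reduce_fn):
    """Apply map_fn to every record of every partition, group by key, then reduce."""
    groups = defaultdict(list)
    for partition in partitions:  # conceptually, one map task per partition
        for record in partition:
            for key, value in map_fn(record):
                groups[key].append(value)
    # Conceptually, each key (or range of keys) is handled by one reduce task.
    return dict(reduce_fn(key, values) for key, values in groups.items())
```

For example, `map_reduce([["a b a"], ["b c"]], map_fn, reduce_fn)` counts the words across both partitions.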

Dryad
• Task graph is first class
  – Vertices run arbitrary code
  – Multiple inputs and outputs
  – Channels specify data transport
• Graph must be acyclic and finite
  – Permits topological sorting
  – Prevents unbounded iteration
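
The link between acyclicity and scheduling can be shown with Python's standard `graphlib` module (Python 3.9 or later). This is a sketch of the general idea, not of Dryad's scheduler; `schedule` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter


def schedule(dependencies):
    """Return a valid execution order, or None if the graph contains a cycle.

    `dependencies` maps each task to the set of tasks it depends on.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None  # a cyclic graph has no topological order


# B depends on A, so A must run before B.
simple_order = schedule({"B": {"A"}})

# A loop in the graph (the shape an iteration would need) cannot be scheduled.
cyclic_order = schedule({"A": {"B"}, "B": {"A"}})
```

The second call returns `None`, which is the sense in which a static acyclic graph prevents unbounded iteration.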

Architecture
[Figure labels: Driver program, Code, Results]
  while (!converged) do work in parallel;

Existing systems
[Figure labels: Driver program; Code and Results, repeated for each submitted job]
  while (…) submitJob();

Existing systems
[Figure labels: Driver program, Code, Results; Code, Driver program]
  while (…) submitJob();

Skywriting
[Figure labels: Code, Results]
  while (…) doStuff();
• Turing-complete language for job specification
• Whole job executed on the cluster

Spawning a Skywriting task
  function f(arg1, arg2) { … }
  result = spawn(f, [arg1, arg2]);
  // result is a “future”
  value_of_result = *result;
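
For readers more familiar with Python, a loose analogue of `spawn` and the `*` dereference is `submit` and `result()` from `concurrent.futures`. The body of `f` is a placeholder, and a local thread pool stands in for the cluster; both are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def f(arg1, arg2):
    # Placeholder body; the slide elides what f actually does.
    return arg1 + arg2


with ThreadPoolExecutor(max_workers=2) as pool:
    result = pool.submit(f, 1, 2)       # like spawn(f, [arg1, arg2]): returns a future
    value_of_result = result.result()   # like *result: blocks until the value exists
```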
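
Building a task graph
  function f(x, y) { … }
  function g(x, y) { … }
  function h(x, y) { … }

  a = spawn(f, [7, 8]);
  b = spawn(g, [a, 0]);
  c = spawn(g, [a, 1]);
  d = spawn(h, [b, c]);
  return d;
[Figure: task graph in which f produces a, two g tasks consume a and produce b and c, and h consumes b and c to produce d]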
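
The same graph can be approximated in Python by passing futures as arguments. The `spawn` helper below is a hypothetical stand-in that resolves any future arguments before calling the function, and the bodies of `f`, `g` and `h` are placeholders, since the slide elides them.

```python
from concurrent.futures import Future, ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)


def spawn(func, args):
    """Submit func; any argument that is a future is replaced by its value first."""
    def run():
        resolved = [a.result() if isinstance(a, Future) else a for a in args]
        return func(*resolved)
    # Tasks only ever wait on futures submitted earlier, so the pool cannot deadlock.
    return pool.submit(run)


def f(x, y):
    return x + y          # placeholder body


def g(x, y):
    return x * 10 + y     # placeholder body


def h(x, y):
    return (x, y)         # placeholder body


a = spawn(f, [7, 8])
b = spawn(g, [a, 0])   # depends on a
c = spawn(g, [a, 1])   # depends on a, independent of b
d = spawn(h, [b, c])   # depends on b and c
graph_result = d.result()
pool.shutdown()
```

The dependencies are never declared separately: they follow from which futures each task receives, which is how the graph gets built as a side effect of ordinary code.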

Iterative algorithm
  current = …;
  do {
    prev = current;
    a = spawn(f, [prev, 0]);
    b = spawn(f, [prev, 1]);
    c = spawn(f, [prev, 2]);
    current = spawn(g, [a, b, c]);
    done = spawn(h, [current]);
  } while (!*done);
[Figure: task graph for the iterative algorithm, in which three f tasks feed g, followed by h, repeated for each iteration]
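
A rough Python counterpart of the loop is sketched below. The numeric bodies of `f`, `g` and `h` are invented purely so that the loop converges; they are not from the talk. Unlike Skywriting, where the whole loop runs on the cluster, `g` and `h` here run in the calling thread.

```python
from concurrent.futures import ThreadPoolExecutor


def f(prev, i):
    return prev / 4        # hypothetical per-partition work


def g(a, b, c):
    return a + b + c       # hypothetical combine step


def h(value):
    return value < 1.0     # hypothetical convergence test


def iterate(initial, pool):
    """Repeat the fan-out/combine step until h reports convergence."""
    current = initial
    iterations = 0
    while True:
        prev = current
        parts = [pool.submit(f, prev, i) for i in range(3)]  # three parallel tasks
        current = g(*[p.result() for p in parts])
        iterations += 1
        if h(current):      # data-dependent exit, like while (!*done)
            return current, iterations


with ThreadPoolExecutor(max_workers=3) as pool:
    final_value, iterations = iterate(100.0, pool)
```

The number of iterations is not known until the data has been computed, which is what a static acyclic task graph cannot express.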

Aside: recursive algorithm
  function f(x) {
    if (/* x is small enough */) {
      return /* do something with x */;
    } else {
      x_lo = /* bottom half of x */;
      x_hi = /* top half of x */;
      return [spawn(f, [x_lo]), spawn(f, [x_hi])];
    }
  }
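
A possible Python rendering of the recursive pattern follows. The base case (summing lists of at most two elements) and the `resolve` helper are assumptions chosen for illustration; the slide leaves both the test and the work unspecified. As in the Skywriting version, a task returns futures for its children rather than waiting for them.

```python
from concurrent.futures import Future, ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)


def f(x):
    if len(x) <= 2:                     # hypothetical "small enough" test
        return sum(x)                   # hypothetical "do something with x"
    mid = len(x) // 2
    x_lo, x_hi = x[:mid], x[mid:]
    # Return futures without blocking, so a worker is never tied up waiting.
    return [pool.submit(f, x_lo), pool.submit(f, x_hi)]


def resolve(value):
    """Follow futures and nested lists of futures until only plain values remain."""
    if isinstance(value, Future):
        return resolve(value.result())
    if isinstance(value, list):
        return [resolve(v) for v in value]
    return value


recursive_result = resolve(pool.submit(f, list(range(8))))
pool.shutdown()
```

The result mirrors the recursion tree: each level of nesting corresponds to one split of the input.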

Performance case studies
• All experiments used Amazon EC2
  – m1.small instances, running Ubuntu 8.10
• Microbenchmark
• Smith-Waterman

Job creation overhead
[Figure: overhead (seconds, 0 to 60) against number of workers (0 to 100), for Hadoop and Skywriting]

Smith-Waterman data flow
[Figure: Smith-Waterman data flow]

Parallel Smith-Waterman
[Figure: time (seconds, 0 to 350) against number of tasks (1 to 10000, logarithmic axis)]

Future work: manycore
• Trade-offs are different
  – Centralised master may become a bottleneck
  – Switch to local work-queues and work-stealing
  – Distributed scoreboards for futures
  – Optimised interpreter/compilation?
• Multi-scale hybrid approaches
  – Multiple cores
  – Multiple machines
  – Multiple clouds...

Future work: message-passing
• Language designed for batch processing
  – However, batches may be infinitely long!
• Can we apply it to streaming data?
  – Log files
  – Financial reports
  – Sensor data
• Can we include data dependencies?
  – High- and low-frequency streams

Conclusions
• Turing-complete programming language for cloud computing
• Runs real jobs with low overhead
• Lots more still to do!

Questions?
• Email
  – [email protected]
• Project website
  – http://www.cl.cam.ac.uk/netos/skywriting/