71
Cluster Computing with DryadLINQ Mihai Budiu, MSR-SVC PARC, May 8 2008

Cluster Computing with DryadLINQ

Embed Size (px)

DESCRIPTION

Cluster Computing with DryadLINQ. Mihai Budiu, MSR-SVC PARC, May 8 2008. Aknowledgments. MSR SVC and ISRC SVC Michael Isard, Yuan Yu, Andrew Birrell , Dennis Fetterly Ulfar Erlingsson, Pradeep Kumar Gunda, Jon Currey. Computer Evolution. ?. 1961. 2008. 2040. Computer Evolution. ?. - PowerPoint PPT Presentation

Citation preview

Page 1: Cluster Computing with DryadLINQ

Cluster Computing with DryadLINQ

Mihai Budiu, MSR-SVCPARC, May 8 2008

Page 2: Cluster Computing with DryadLINQ

2

Aknowledgments

MSR SVC and ISRC SVC

Michael Isard, Yuan Yu, Andrew Birrell, Dennis Fetterly

Ulfar Erlingsson, Pradeep Kumar Gunda, Jon Currey

Page 3: Cluster Computing with DryadLINQ

3

Computer Evolution

1961 2008 2040

?

Page 4: Cluster Computing with DryadLINQ

4

Computer Evolution

ENIAC 1943

30 tons200kW

Datacenter 2008

500,000 ft2

40MW

?2040

Page 5: Cluster Computing with DryadLINQ

5

2040

Page 6: Cluster Computing with DryadLINQ

6

Layers

Networking

Storage

Distributed Execution

Scheduling

Resource Management

Applications

Identity & Security

Caching and Synchronization

Programming Languages and APIs

Ope

ratin

g Sy

stem

Page 7: Cluster Computing with DryadLINQ

7

Pieces of the Global Computer

Page 8: Cluster Computing with DryadLINQ

8

This Work

Page 9: Cluster Computing with DryadLINQ

9

The Rest of This Talk

Windows Server

Cluster Services

Distributed Filesystem

Dryad

DryadLINQ

Windows Server

Windows Server

Windows Server

CIFS/NTFS

Large Vectors

Machine Learning

Page 10: Cluster Computing with DryadLINQ

10

How fast can you sort 1010 100-byte records (1Tb)?

Sequential scan/disk = 4.6 hours

Current record: 435 seconds (7.2 min)cluster of 40 Itanium2, 2520 SAN disks

Code: 3300 lines of C

Our result: 349 seconds (5.8 min)cluster of 240 AMD64 (quad) machines, 920 disks

Code: 17 lines of LINQ

TeraSort

Page 11: Cluster Computing with DryadLINQ

11

• Introduction• Dryad • DryadLINQ• Building on DryadLINQ

Outline

Page 12: Cluster Computing with DryadLINQ

12

• Introduction• Dryad

– deployed since 2006– many thousands of machines– analyzes many petabytes of data/day

• DryadLINQ• Building on DryadLINQ

Outline

Page 13: Cluster Computing with DryadLINQ

13

Goal

Page 14: Cluster Computing with DryadLINQ

14

Design Space

ThroughputLatency

Internet

Privatedata

center

Data-parallel

Sharedmemory

DryadSearch

HPC

Grid

Transaction

Page 15: Cluster Computing with DryadLINQ

15

Data Partitioning

RAM

DATA

DATA

Page 16: Cluster Computing with DryadLINQ

16

2-D Piping• Unix Pipes: 1-D

grep | sed | sort | awk | perl

• Dryad: 2-D grep1000 | sed500 | sort1000 | awk500 | perl50

Page 17: Cluster Computing with DryadLINQ

17

Dryad = Execution Layer

Job (application)

Dryad

Cluster

Pipeline

Shell

Machine≈

Page 18: Cluster Computing with DryadLINQ

18

Virtualized 2-D Pipelines

Page 19: Cluster Computing with DryadLINQ

19

Virtualized 2-D Pipelines

Page 20: Cluster Computing with DryadLINQ

20

Virtualized 2-D Pipelines

Page 21: Cluster Computing with DryadLINQ

21

Virtualized 2-D Pipelines

Page 22: Cluster Computing with DryadLINQ

22

Virtualized 2-D Pipelines• 2D DAG• multi-machine• virtualized

Page 23: Cluster Computing with DryadLINQ

23

Dryad Job Structure

grep

sed

sortawk

perlgrep

grepsed

sort

sort

awk

Inputfiles

Vertices (processes)

Outputfiles

ChannelsStage

Page 24: Cluster Computing with DryadLINQ

24

Channels

X

M

Items

Finite Streams of items

• distributed filesystem files (persistent)• SMB/NTFS files (temporary)• TCP pipes (inter-machine)• memory FIFOs (intra-machine)

Page 25: Cluster Computing with DryadLINQ

25

Architecture

Files, TCP, FIFO, Networkjob schedule

data plane

control plane

NS PD PDPD

V V V

Job manager cluster

Page 26: Cluster Computing with DryadLINQ

Fault Tolerance

Page 27: Cluster Computing with DryadLINQ

X[0] X[1] X[3] X[2] X’[2]

Completed vertices Slow vertex

Duplicatevertex

Dynamic Graph Rewriting

Duplication Policy = f(running times, data volumes)

Page 28: Cluster Computing with DryadLINQ

28

S S S S

A A A

S S

T

S S S S S S

T

# 1 # 2 # 1 # 3 # 3 # 2

# 3# 2# 1

static

dynamic

rack #

Dynamic Aggregation

Page 29: Cluster Computing with DryadLINQ

29

Data-Parallel Computation

Storage

Execution

Application

Parallel Databases

Map-Reduce

GFSBigTable

Dryad

Page 30: Cluster Computing with DryadLINQ

30

• Introduction• Dryad • DryadLINQ• Building on Dryad

Outline

Page 31: Cluster Computing with DryadLINQ

31

DryadLINQ

Dryad

Page 32: Cluster Computing with DryadLINQ

32

LINQ

Collection<T> collection;bool IsLegal(Key);string Hash(Key);

var results = from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value};

Page 33: Cluster Computing with DryadLINQ

33

Collection<T> collection;bool IsLegal(Key k);string Hash(Key);

var results = from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value};

DryadLINQ = LINQ + Dryad

C#

collection

results

C# C# C#

Vertexcode

Queryplan(Dryad job)Data

Page 34: Cluster Computing with DryadLINQ

34

Data Model

Partition

Collection

C# objects

Page 35: Cluster Computing with DryadLINQ

35

Query Providers

DryadLINQ

Client machine

(11)

Distributed query plan

C#

Query Expr

Data center

Output TablesResults

Input TablesInvoke Query

Output DryadTable

Dryad Execution

C# Objects

JM

ToDryadTable

foreach

Page 36: Cluster Computing with DryadLINQ

36

Demo

Page 37: Cluster Computing with DryadLINQ

37

Example: Histogrampublic static IQueryable<Pair> Histogram( IQueryable<LineRecord> input, int k){ var words = input.SelectMany(x => x.line.Split(' ')); var groups = words.GroupBy(x => x); var counts = groups.Select(x => new Pair(x.Key, x.Count())); var ordered = counts.OrderByDescending(x => x.count); var top = ordered.Take(k); return top;}

“A line of words of wisdom”

[“A”, “line”, “of”, “words”, “of”, “wisdom”]

[[“A”], [“line”], [“of”, “of”], [“words”], [“wisdom”]]

[ {“A”, 1}, {“line”, 1}, {“of”, 2}, {“words”, 1}, {“wisdom”, 1}]

[{“of”, 2}, {“A”, 1}, {“line”, 1}, {“words”, 1}, {“wisdom”, 1}]

[{“of”, 2}, {“A”, 1}, {“line”, 1}]

Page 38: Cluster Computing with DryadLINQ

38

Histogram Plan

SelectManyHashDistribute

MergeGroupBy

Select

OrderByDescendingTake

MergeSortTake

Page 39: Cluster Computing with DryadLINQ

39

Map-Reduce in DryadLINQ

public static IQueryable<S> MapReduce<T,M,K,S>( this IQueryable<T> input, Expression<Func<T, IEnumerable<M>>> mapper, Expression<Func<M,K>> keySelector, Expression<Func<IGrouping<K,M>,S>> reducer) { var map = input.SelectMany(mapper); var group = map.GroupBy(keySelector); var result = group.Select(reducer); return result;}

Page 40: Cluster Computing with DryadLINQ

40

Map-Reduce Plan

M

D

R

G

M

Q

G1

R

D

MS

G2

R

(1) (2) (3)

X

X

M

Q

G1

R

D

MS

G2

R

X

M

Q

G1

R

D

MS

G2

R

X

M

Q

G1

R

D

M

Q

G1

R

D

MS

G2

R

X

M

Q

G1

R

D

MS

G2

R

X

M

Q

G1

R

D

MS

G2

R

MS

G2

R

map

sort

groupby

reduce

distribute

mergesort

groupby

reduce

mergesort

groupby

reduce

consumer

map

parti

al a

ggre

gatio

nre

duce

S S S S

A A A

S S

T

Page 41: Cluster Computing with DryadLINQ

41

Distributed Sorting in DryadLINQ

public static IQueryable<TSource>DSort<TSource, TKey>(this IQueryable<TSource> source,                                  Expression<Func<TSource, TKey>> keySelector,                                  int pcount){            var samples = source.Apply(x => Sampling(x));            var keys = samples.Apply(x => ComputeKeys(x, pcount));            var parts = source.RangePartition(keySelector, keys);            return parts.OrderBy(keySelector);}

Page 42: Cluster Computing with DryadLINQ

42

Distributed Sorting Plan

O

DS

H

D

M

S

DS

H

D

M

S

DS

D

DS

H

D

M

S

DS

D

M

S

M

S

(1) (2) (3)

Page 43: Cluster Computing with DryadLINQ

43

• Introduction• Dryad • DryadLINQ• Building on DryadLINQ

Outline

Page 44: Cluster Computing with DryadLINQ

44

Machine Learning in DryadLINQ

Dryad

DryadLINQ

Large Vector

Machine learningData analysis

Page 45: Cluster Computing with DryadLINQ

45

Operations on Large Vectors: Map 1

U

T

T Uf

f

f preserves partitioning

Page 46: Cluster Computing with DryadLINQ

46

V

Map 2 (Pairwise)

T Uf

V

U

T

f

Page 47: Cluster Computing with DryadLINQ

47

Map 3 (Vector-Scalar)T U

fV

V

47

U

T

f

Page 48: Cluster Computing with DryadLINQ

Reduce (Fold)

48

U UU

U

f

f f f

fU U U

U

Page 49: Cluster Computing with DryadLINQ

49

Linear Algebra

T U Vnmm ,,=, ,

T

Page 50: Cluster Computing with DryadLINQ

50

Linear Regression

• Data

• Find

• S.t.

mt

nt yx ,

mnA

tt yAx

},...,1{ nt

Page 51: Cluster Computing with DryadLINQ

51

Analytic Solution

X×XT X×XT X×XT Y×XT Y×XT Y×XT

Σ

X[0] X[1] X[2] Y[0] Y[1] Y[2]

Σ

[ ]-1

*

A

1))(( Ttt t

Ttt t xxxyA

Map

Reduce

Page 52: Cluster Computing with DryadLINQ

52

Linear Regression Code

Vectors x = input(0), y = input(1);Matrices xx = x.Map(x, (a,b) => a.OuterProd(b));OneMatrix xxs = xx.Sum();Matrices yx = y.Map(x, (a,b) => a.OuterProd(b));OneMatrix yxs = yx.Sum();OneMatrix xxinv = xxs.Map(a => a.Inverse());OneMatrix A = yxs.Map(xxinv, (a, b) => a.Mult(b));

1))(( Ttt t

Ttt t xxxyA

Page 53: Cluster Computing with DryadLINQ

Expectation Maximization (Gaussians)

53

• 160 lines • 3 iterations shown

Page 54: Cluster Computing with DryadLINQ

Conclusions

• Dryad = distributed execution environment• Application-independent (semantics oblivious)• Supports rich software ecosystem

– Relational algebra, Map-reduce, LINQ• DryadLINQ = Compiles LINQ to Dryad• C# objects and declarative programming• .Net and Visual Studio for parallel programming

54

Page 55: Cluster Computing with DryadLINQ

55

Backup Slides

Page 56: Cluster Computing with DryadLINQ

56

Software Stack

Windows Server

Cluster Services

Distributed Filesystem

Dryad

Distributed Shell

PSQL

DryadLINQ

PerlSQL

server

C++

Windows Server

Windows Server

Windows Server

C++

CIFS/NTFS

legacycode

sed, awk, grep, etc.

SSISScope

C#

Vectors

Machine Learning

C#

Job

queu

eing

, mon

itorin

g

Page 57: Cluster Computing with DryadLINQ

57

Very Large Vector LibraryPartitionedVector<T>

T

Scalar<T>

T T

T

Page 58: Cluster Computing with DryadLINQ

58

DryadLINQ

• Declarative programming • Integration with Visual Studio• Integration with .Net• Type safety• Automatic serialization• Job graph optimizations static dynamic

• Conciseness

Page 59: Cluster Computing with DryadLINQ

59

Sort & Map-Reduce in DryadLINQ

Page 60: Cluster Computing with DryadLINQ

60

• Many similarities• Exe + app. model• Map+sort+reduce• Few policies• Program=map+reduce• Simple• Mature (> 4 years)• Widely deployed• Hadoop

Dryad Map-Reduce

• Execution layer• Job = arbitrary DAG• Plug-in policies• Program=graph gen.• Complex ( features)• New (< 2 years)• Still growing• Internal

Page 61: Cluster Computing with DryadLINQ

61

PLINQ

public static IEnumerable<TSource> DryadSort<TSource, TKey>(IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IComparer<TKey> comparer, bool isDescending){

return source.AsParallel().OrderBy(keySelector, comparer);}

Page 62: Cluster Computing with DryadLINQ

Query histogram computation

• Input: log file (n partitions)• Extract queries from log partitions• Re-partition by hash of query (k buckets)• Compute histogram within each bucket

Page 63: Cluster Computing with DryadLINQ

Naïve histogram topology

Q Q

R

Q

R k

k

k

n

n

is:Each

R

is:

Each

MS

C

P

C

S

C

S

D

P parse linesD hash distributeS quicksortC count

occurrencesMS merge sort

Page 64: Cluster Computing with DryadLINQ

Efficient histogram topologyP parse linesD hash distributeS quicksortC count

occurrencesMS merge sortM non-deterministic

merge

Q' is:Each

R

is:

Each

MS

C

M

P

C

S

Q'

RR k

T

k

n

T

is:

Each

MS

D

C

Page 65: Cluster Computing with DryadLINQ

Final histogram refinement

Q' Q'

RR 450

TT 217

450

10,405

99,713

33.4 GB

118 GB

154 GB

10.2 TB

1,800 computers43,171 vertices11,072 processes11.5 minutes

Page 66: Cluster Computing with DryadLINQ

66

Data Distribution(Group By)

Dest

Source

Dest

Source

Dest

Source m

n

m x n

Page 67: Cluster Computing with DryadLINQ

TT[0-?) [?-100)

Range-Distribution Manager

S

D D D

S S

S S S

Tstatic

dynamic67

Hist

[0-30),[30-100)

[30-100)[0-30)

[0-100)

Page 68: Cluster Computing with DryadLINQ

68

Goal: Declarative Programming

X

T

S

X X

S S

T T T

X

static dynamic

Page 69: Cluster Computing with DryadLINQ

JM code

vertex code

Staging1. Build

2. Send .exe

3. Start JM

5. Generate graph

7. Serializevertices

8. MonitorVertex execution

4. Querycluster resources

Cluster services6. Initialize vertices

Page 70: Cluster Computing with DryadLINQ

70

SkyServer Query 18

D D

MM 4n

SS 4n

YY

H

n

n

X Xn

U UN N

L L

select distinct P.ObjIDinto results from photoPrimary U, neighbors N, photoPrimary Lwhere U.ObjID = N.ObjID and L.ObjID = N.NeighborObjID and P.ObjID < L.ObjID and abs((U.u-U.g)-(L.u-L.g))<0.05 and abs((U.g-U.r)-(L.g-L.r))<0.05 and abs((U.r-U.i)-(L.r-L.i))<0.05 and abs((U.i-U.z)-(L.i-L.z))<0.05

Page 71: Cluster Computing with DryadLINQ

71

0.0

2.0

4.0

6.0

8.0

10.0

12.0

14.0

16.0

0 2 4 6 8 10

Number of Computers

Speed-up (times)

Dryad In-Memory

Dryad Two-pass

SQLServer 2005

SkyServer Q18 Performance