A Case for Performance-Centric Network Allocation Gautam Kumar, Mosharaf Chowdhury, Sylvia Ratnasamy, Ion Stoica UC Berkeley


Datacenter Applications

Data Parallelism

• Applications execute in several computation stages and require transfer of data between these stages (communication).

• Computation in a stage is split across multiple nodes.

• The network plays an important role: it accounts for 33% of the running time in Facebook traces. (Orchestra, SIGCOMM 2011)

[Figure: example job graphs composed of Map (M), Reduce (R), and Join (J) stages. Join: *RoPE, NSDI 2012.]

HotCloud June 12, 2012


Data Parallelism

• Users often do not know what network support they require; the final execution graph is created by the framework.

• Frameworks know more and provide certain communication primitives, e.g., shuffle and broadcast.


Scope

Private clusters running data-parallel applications.

Little concern for adversarial behavior.

Application-level inefficiencies are dealt with extrinsically.


Current Proposals


Explicit Accounting

• Virtual-cluster-based network reservations. (Oktopus, SIGCOMM 2011)

• Time-varying network reservations. (SIGCOMM 2012)

DRAWBACK: Exact network requirements are often not known; not work-conserving.


Fairness-Centric

• Flow-level fairness, or per-flow. (TCP)

• Fairness with respect to the sources. (Seawall, NSDI 2011)

• Proportionality in terms of total number of VMs. (FairCloud, SIGCOMM 2012)

DRAWBACK: Gives little guidance to developers about the performance they can expect while scaling their applications.


In this work . . .

• A new perspective on sharing the network among data-parallel applications, performance-centric allocations:
  - enabling users to reason about the performance of their applications when they scale them up;
  - enabling applications to parallelize effectively, preserving the intuitive mapping between scale-up and speed-up.

• Contrast and relate performance-centric proposals with fairness-centric proposals.


Performance-Centric Allocations


Types of Transfers*

[Figure: two panels, Shuffle (flows labeled λ/2) and Broadcast (flows labeled λ).]

(*Orchestra, SIGCOMM 2011)


Scaling up the application

[Figure: two panels, Shuffle and Broadcast, each shown before and after a 2X scale-up, with flows labeled λ, λ/2, and 2λ and the total data transferred annotated for each case.]


Performance-Centric Allocations

• Understand the support that the application needs from the network to effectively parallelize.

• We are at a sweet spot: the framework knows the application's network requirements.

Shuffle-only Clusters

[Figure: job A (Am, Ar; flow labeled 2λ) and its 2X scale-up B (two Bm, two Br; flows labeled λ/2), with timelines of map, shuffle, and reduce times (tAm, tAs, tAr and tBm, tBs, tBr), under per-flow and proportional allocation.]

Per-Flow (rate α for A's flow and α for each of B's flows):

$t_{As} = 2\lambda/\alpha$, $t_{Bm} = t_{Am}/2$, $t_{Br} = t_{Ar}/2$, $t_{Bs} = \lambda/(2\alpha) = t_{As}/4$, so $t_B < t_A/2$.

Proportional (rate α for A's flow and α/2 for each of B's flows):

$t_{As} = 2\lambda/\alpha$, $t_{Bm} = t_{Am}/2$, $t_{Br} = t_{Ar}/2$, $t_{Bs} = \lambda/\alpha = t_{As}/2$, so $t_B = t_A/2$.

Broadcast-only Clusters

[Figure: job A (Am, Ar; flow labeled 2λ) and its 2X scale-up B (two Bm, two Br; flows labeled λ), with timelines of map, transfer, and reduce times (tAm, tAs, tAr and tBm, tBs, tBr), under proportional and per-flow allocation.]

Proportional (rate α for A's flow and α/2 for each of B's flows):

$t_{As} = 2\lambda/\alpha$, $t_{Bm} = t_{Am}/2$, $t_{Br} = t_{Ar}/2$, $t_{Bs} = 2\lambda/\alpha = t_{As}$, so $t_B > t_A/2$.

Per-Flow (rate α for A's flow and α for each of B's flows):

$t_{As} = 2\lambda/\alpha$, $t_{Bm} = t_{Am}/2$, $t_{Br} = t_{Ar}/2$, $t_{Bs} = \lambda/\alpha = t_{As}/2$, so $t_B = t_A/2$.
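The shuffle and broadcast comparisons can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the per-flow sizes (λ/2 for shuffle, λ for broadcast) and per-flow rates (α under per-flow sharing, α/2 under proportional sharing) as read off the slides, sets λ = α = 1 arbitrarily, and assumes all of job B's flows run in parallel. The names `transfer_speedup`, `FLOW_SIZE_B`, and `FLOW_RATE_B` are hypothetical.

```python
from fractions import Fraction

LAM = Fraction(1)    # λ: one unit of data (arbitrary, assumed)
ALPHA = Fraction(1)  # α: rate of job A's single flow (arbitrary, assumed)

# Job A moves 2λ in one flow at rate α.
T_AS = (2 * LAM) / ALPHA

# Size of each of job B's flows after the 2X scale-up, as labeled on the slides.
FLOW_SIZE_B = {"shuffle": LAM / 2, "broadcast": LAM}

# Rate of each of job B's flows under the two policies, as labeled on the slides.
FLOW_RATE_B = {"per-flow": ALPHA, "proportional": ALPHA / 2}

def transfer_speedup(kind, policy):
    """Return t_As / t_Bs; B's flows are assumed to run in parallel."""
    t_bs = FLOW_SIZE_B[kind] / FLOW_RATE_B[policy]
    return T_AS / t_bs

for kind in FLOW_SIZE_B:
    for policy in FLOW_RATE_B:
        print(f"{kind:9s} {policy:12s} t_As / t_Bs = {transfer_speedup(kind, policy)}")
```

A ratio of 2 matches the 2X scale-up, 4 is more than the requisite speed-up, and 1 means the transfer does not speed up at all.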


Recap

• TCP in shuffle gives more than the requisite speed-up and thus hurts the performance of small jobs. Proportionality achieves the right balance.

• Proportionality in broadcast limits parallelism. TCP achieves the right balance.

[Figure: speed-up versus degree of parallelism for TCP (Shuffle), Prop. (Shuffle), TCP (Broadcast), and Prop. (Broadcast).]


Complexity of a transfer

• A transfer is an $x_N$-transfer if x is the factor by which the amount of data transferred increases when a scale-up of N is done, x ∈ [1, N].

• Shuffle is a $1_N$-transfer and broadcast is an $N_N$-transfer.

• Performance-centric allocations encompass x.
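As an illustration of how x could enter an allocation, the following derivation (not taken from the slides) works out the per-flow rate $r$ that would preserve an N-fold speed-up of the transfer. It assumes that the setup of the 2X examples generalizes: the original job moves $D$ units of data in a single flow at rate $\alpha$, taking time $t$, and the scaled-up job moves $xD$ units spread evenly over $N^2$ parallel flows, taking time $t'$. The symbols $D$, $r$, $t$, and $t'$ are introduced here for the sketch.

```latex
\begin{align*}
t  &= \frac{D}{\alpha}, \\
t' &= \frac{xD/N^2}{r} = \frac{t}{N}
\;\Longrightarrow\;
r = \frac{x}{N}\,\alpha .
\end{align*}
```

With N = 2 this gives $r = \alpha/2$ for a shuffle (x = 1) and $r = \alpha$ for a broadcast (x = N), which agrees with the proportional and per-flow rates in the 2X examples.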


Heterogeneous Frameworks and Congested Resources


• Shares are given based on the complexity of the transfer.

• The completion times of both jobs degrade uniformly in the event of contention.

[Figure: jobs A (Am, Ar) and B (Bm, Br) each have 2Gb to send at 500Mbps, and both finish in 4s. After a 2X scale-up to A' and B', one job has 2Gb to send at ~333Mbps and the other has 4Gb to send at ~666Mbps, and both finish in 6s.]
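The numbers in this example are consistent with splitting a shared link in proportion to the amount of data each job has to send. The sketch below reproduces them under two assumptions that are inferred rather than stated on the slide: the contended link has a capacity of 1000 Mbps (the shares sum to roughly that), and the 2Gb job is the shuffle while the 4Gb job is the broadcast. The function names and job labels are hypothetical.

```python
from fractions import Fraction

def proportional_shares(capacity_mbps, data_gb):
    """Split link capacity among jobs in proportion to the data each must send."""
    total = sum(data_gb.values())
    return {job: Fraction(capacity_mbps * gb, total) for job, gb in data_gb.items()}

def completion_seconds(gb, share_mbps):
    """Transfer time in seconds for `gb` gigabits at `share_mbps` megabits/s."""
    return Fraction(gb * 1000) / share_mbps

CAPACITY_MBPS = 1000  # assumed capacity of the contended link

# Before the scale-up: both jobs have 2Gb to send.
before = {"A": 2, "B": 2}
# After the 2X scale-up: assumed shuffle job keeps 2Gb, assumed broadcast job has 4Gb.
after = {"A'": 2, "B'": 4}

for label, jobs in (("before", before), ("after", after)):
    shares = proportional_shares(CAPACITY_MBPS, jobs)
    for job, gb in jobs.items():
        secs = completion_seconds(gb, shares[job])
        print(f"{label:6s} {job:3s} share = {float(shares[job]):.0f} Mbps, finishes in {secs} s")
```

Because each share scales with the data to be sent, every job on the link finishes its transfer at the same time, which is one way to read "degrades uniformly".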


Network Parallelism

• Isolation between the speed-up due to the scale-up of the application and the performance degradation due to finite resources.

$y' = \frac{y}{N} \times \alpha$

α: degradation due to limited resources

y: old running time

y': new running time after a scale-up of N


Summary


• Understand performance-centric allocations and their relationship with fairness-centric proposals.
  - Proportionality is the performance-centric approach for shuffle-only clusters.
  - It breaks down for broadcasts; per-flow is the performance-centric approach for broadcast-only clusters.

• An attempt at a performance-centric proposal for heterogeneous transfers.
  - Understand what happens when resources get congested.


Future Work


• A more rigorous formulation.
  - Some questions remain to be answered, e.g., different N1 and N2 on the two sides of a stage.

• Analytical and experimental evaluation of the policies.
  - Whether they redistribute completion time or yield total savings.


Thank you