33
PrincetonUnive rsity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh *Princeton IBM Research

PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

Embed Size (px)

Citation preview

Page 1: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

PrincetonUniversity

Towards Predictable Multi-Tenant Shared Cloud Storage

David Shue*, Michael Freedman*, and Anees Shaikh✦

*Princeton ✦IBM Research

Page 2: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

2

Shared Services in the Cloud

Z

Y

T

FZ

Y

F

T

S3 EBS SQSDynamoDB

Page 3: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

4

DD DD DD DDShared Storage Service

Co-located Tenants Contend For Resources

Z Y T FZ Y FTY YZ F F F

2x demand 2x demand

Page 4: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

6

DD DD DD DDShared Storage Service

Co-located Tenants Contend For Resources

Y FY FY Y F F F

2x demand 2x demand Z Y T FZ Y FTY YZ F F F

Page 5: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

8

DD DD DD DDShared Storage Service

Co-located Tenants Contend For Resources

2x demand 2x demand Y FY FY Y F F FZ Y T FZ Y FTY YZ F F F

Resource contention = unpredictable performance

Page 6: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

16

Z Y T FZ Y FTY YZ F F FZ Y T FZ Y FTY YZ F F F

DD DD DD DDSS SS SS SS

Towards Predictable Shared Cloud Storage

Shared Storage Service

Per-tenant Resource Allocation and Performance

Isolation

Page 7: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

17

Zynga Yelp FoursquareTP

Shared Storage Service

Towards Predictable Shared Cloud Storage

Z Y T FZ Y FTY YZ F F F

SS SS SS SS

80 kreq/s120 kreq/s 160 kreq/s40 kreq/s

Hard limits are too restrictive, achieve lower utilization

Page 8: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

18

Zynga Yelp FoursquareTP

Shared Storage

Towards Predictable Shared Cloud Storage

wz = 20%wy = 30% wf = 40%wt = 10%

demandz = 40%ratez = 30%

demandf = 30%

Z Y T FZ Y FTY YZ F F F

SS SS SS SS

Goal: per-tenant max-min fair share of system resources

Page 9: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

23

PISCES: Predictable Shared Cloud Storage

•PISCES Goals- Allocate per-tenant fair share of total system

resources

- Isolate tenant request performance

- Minimize overhead and preserve system throughput

•PISCES Mechanisms

minutessecondsmicroseconds

Partition Placement (Migration)

Local Weight Allocation

Replica Selection

Fair Queuing

Page 10: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

24

Tenant A Tenant B Tenant C

PISCES Node 2

PISCES Node 3

PISCES Node 4

VM VM VM VM VM VM VM VM VM

PISCESNode 1

weightA weightB weightC Tenant D

VM VM VM

weightD

Place Partitions By Fairness Constraints

Akeyspace BkeyspaceCkeyspaceDkeyspace

keyspace partition

pop

ula

rity

Page 11: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

25

Controller

PP

Place Partitions By Fairness Constraints

Rate A < wA Rate B < wB Rate C < wC Compute feasible partition

placement

25

Over-loaded

Page 12: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

26

Controller

PP

Place Partitions By Fairness Constraints

Compute feasible partition placement

Page 13: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

31

PISCES Mechanisms

Partition Placement (Migration)

minutessecondsmicrosecondsTimescale

Mechanism

Controller

Page 14: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

32

VM VM VM VM VM VM VM VM VM

wA wB wC VM VM VM

wD

Allocate Local Weights By Tenant Demand

wA wB wC wD = = =

wa2wb2wc2wd2 wa3wb3wc3wd3 wa4wb4wc4wd4

32

wa1wb1wc1wd1

RA > wA RB < wB RC < wC RD > wD

Page 15: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

33

Swap weights to minimize tenant latency (demand)

Allocate Local Weights By Tenant Demand

WA

Controller

VM VM VM VM VM VM VM VM VM VM VM VM

wA wB wC wD

A→C C→AD→B C→D B→C

Page 16: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

41

Achieving System-wide Fairness

minutessecondsmicrosecondsTimescale

Mechanism Local Weight Allocation

Controller

Controller

Partition Placement (Migration)

Page 17: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

42

Select Replicas By Local Weight

VM VM VM VM VM VM VM VM VM

wA wB wD

42

RR

1/2 1/2

VM VM

wC VM

GET 1101100

C over-loaded

C under-utilized

RSSelect replicas based

on node latency

GET 1101100

Page 18: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

43

Select Replicas By Local Weight

VM VM VM VM VM VM VM VM VM

wA wB wD

RR

VM VM

wC VM

RSSelect replicas based

on node latency

1/3 2/3

Page 19: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

48

Achieving System-wide Fairness

minutessecondsmicrosecondsTimescale

Mechanism Local Weight Allocation

Replica Selection

Partition Placement (Migration)Controlle

r

Controller

RRRR ...

Page 20: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

49

Fair Queuing

Queue Tenants By Dominant Resource

50 12.5

6.350

62.5

7.3

40.5 1

outin req

40.5

3.2

outin reqBandwidth limitedRequest Limited

Out bytes fair sharing

VM VM VM VM VM VM

wA wB

Page 21: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

50

Queue Tenants By Dominant Resource

out

40.5 1in req

40.5

3.2

outin req

Fair Queuing

Bandwidth limitedRequest Limited

5511.3

5.645 55

6.9

Dominant resource fair sharing

Shared out bytesbottleneck

VM VM VM VM VM VM

wA wB

Page 22: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

57

Achieving System-wide Fairness

minutessecondsmicrosecondsTimescale

Mechanism Local Weight Allocation

Replica Selection

Fair Queuing

Partition Placement (Migration)Controlle

r

Controller

RRRR

N

...

N...

Page 23: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

61

•Does PISCES achieve system-wide fairness?

•Does PISCES provide performance isolation?

•Can PISCES adapt to shifting tenant distributions?

Evaluation

YCSB 1

ToRSwitch

YCSB 2

PISCES 6

PISCES 8

PISCES 5

PISCES 7

GigabitEthernet

YCSB 3

YCSB 8

YCSB 7

YCSB 4

YCSB 5

YCSB 6

PISCES 1

PISCES 2

PISCES 3

PISCES 4

Page 24: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

63

PISCES Achieves System-wide Fairness

Membase (Noqueue)

FQ Replica SelectionFQ + Replica

Selection

0.68 MMR 0.51 MMR 0.79 MMR 0.97 MMR

3.77-5.30 ms 3.31-5.77 ms 3.80-4.69 ms 4.05-4.23 ms

GET R

eq

uest

s (k

req

/s)

Ideal fair share: 110 kreq/s (1kB requests)

Time (s)

Page 25: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

66

PISCES Provides Strong Performance Isolation

Membase (Noqueue)

FQ Replica SelectionFQ + Replica

Selection

0.68 MMR 0.51 MMR 0.79 MMR 0.97 MMR

3.77-5.30 ms 3.31-5.77 ms 3.80-4.69 ms 4.05-4.23 ms

GET R

eq

uest

s (k

req

/s)

Time (s)

2x demand vs. 1x demand tenants (equal weights)

Membase (Noqueue)

FQ Replica SelectionFQ + Replica

Selection

0.42 MMR 0.50 MMR 0.50 MMR 0.97 MMR

3.94-4.99 ms 4.29-6.15 ms 4.16-4.27 ms 5.40-5.45 ms

3.27-4.14 ms 2.41-3.72 ms 3.57-4.04 ms 2.78-2.83 ms

Page 26: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

68

Equal Global Weight, Differing Resource Workloads

PISCES Achieves Dominant Resource Fairness

Time (s)

Band

wid

th (

Mb

/s)

Latency (ms)

GET R

eq

uest

s (k

req

/s)

76% of effective bandwidth

76% of effective throughput

Bottleneck Resource

Page 27: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

69

Differentiated Global Weights (4-3-2-1)

PISCES Achieves System-wide Weighted Fairness

GET R

eq

uest

s (k

req

/s)

Latency (ms)

Fract

ion

of

Req

uest

s

Time (s)

Page 28: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

70

Equal Global Weights, Staggered Local Weights

PISCES Achieves Local Weighted Fairness

Time (s)

GET R

eq

uest

s (k

req

/s)

Latency (ms)

Fract

ion

of

Req

uest

s

Page 29: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

71

Differentiated (Staggered) Local Weights

Server 1 Server 2 Server 3 Server 4

PISCES Achieves Local Weighted Fairness

Time (s)

wt = 4

wt = 3

wt = 2

wt = 1

GET R

eq

uest

s (k

req

/s)

Page 30: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

74

PISCES Adapts To Shifting Tenant Distributions

Tenant 3Tenant 2Tenant 1 Tenant 4Weight

1xWeight

2xWeight

1xWeight

2x

1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4

Page 31: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

75

Global Tenant Throughput Tenant Latency (1s average)

Server 1 Server 2 Server 3 Server 4

PISCES Adapts To Shifting Tenant Distributions

Tenant 3Tenant 2Tenant 1 Tenant 4

1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4

Ten

ant

Weig

ht

1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4

Page 32: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

78

Conclusion

•PISCES achieves:

- System-wide per-tenant fair sharing

- Strong performance isolation

- Low operational overhead (< 3%)

•PISCES combines:

- Partition Placement: find a feasible fair allocation (TBD)

- Weight Allocation: adapts to (shifting) per-tenant demand

- Replica Selection: distributes load according to local weights

- Fair Queuing: enforces per-tenant fairness and isolation

Page 33: PrincetonUniversity Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh ✦ *Princeton ✦ IBM Research

80

Future Work

•Implement partition placement (for T >> N)

•Generalize the fairness mechanisms to different services and resources (CPU, Memory, Disk)

•Scale evaluation to a larger test-bed (simulation)

Thanks!