Datacenter Computing Trends and Problems: A Survey
Partha Kundu, Sr. Distinguished Engineer, Corporate CTO Office
Special Session, May 3, NOCS 2011, Pittsburgh, PA, USA


A survey of the state of research questions in data center computing.




Data center computing is a new paradigm!


Outline of talk

Power & Energy in Data Centers

Network architecture

Protocol interactions

Conclusions


Power & Energy in the Data Center


[Figures: data center energy breakdown (source: ASHRAE); server peak power usage profile (source: Google, 2007)]

• Power delivery and cooling overheads are quantified by the PUE metric
• Cooling is the most significant source of energy inefficiency

CPU power contribution is less than 1/3 of server power
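The PUE metric mentioned above is just a ratio; a minimal sketch (the 10 MW / 6 MW figures are illustrative, not from the slide):

```python
def pue(total_facility_power_w, it_equipment_power_w):
    """Power Usage Effectiveness: total facility power divided by the
    power that reaches the IT equipment. 1.0 is ideal; power-delivery
    and cooling overheads push it higher."""
    return total_facility_power_w / it_equipment_power_w

# Illustrative: a facility drawing 10 MW, of which 6 MW reaches servers,
# storage, and network gear. A higher PUE means more overhead.
print(round(pue(10e6, 6e6), 2))
```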


Energy Efficiency

Most of the time, server load is around 30%.

But a server is least energy efficient in its most common operating region!

Source: Barroso & Hölzle, The Datacenter as a Computer, Morgan & Claypool, 2009

Servers are never completely idle


Dynamic Power Range

The CPU's power contribution (peak and idle) in servers has decreased over the years.

Dynamic power range (peak vs. idle):
• CPU: ~3x for servers
• DRAM: ~2x
• Disk and networking: < 1.2x

Disk drives and network switches need to learn from the CPU's power-proportionality gains.

Source: Barroso & Hölzle, The Datacenter as a Computer, Morgan & Claypool, 2009
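A rough way to see the proportionality gap is a linear power model per component. The sketch below is an illustration built from the dynamic ranges quoted above; the peak wattages and function names are assumptions, not figures from the source:

```python
def component_power(peak_w, dynamic_range, utilization):
    """Linear power model: idle power is peak/dynamic_range; the
    remainder (peak - idle) scales with utilization in [0, 1]."""
    idle = peak_w / dynamic_range
    return idle + (peak_w - idle) * utilization

def server_power(utilization):
    # Illustrative peak watts paired with the dynamic ranges quoted
    # on the slide: CPU ~3x, DRAM ~2x, disk and NIC < 1.2x.
    parts = [(100, 3.0), (60, 2.0), (30, 1.2), (20, 1.2)]
    return sum(component_power(p, r, utilization) for p, r in parts)

def energy_efficiency(utilization):
    """Useful work per watt, normalized to 1.0 at peak utilization."""
    return utilization * server_power(1.0) / server_power(utilization)

# At the common ~30% operating point, the server draws well over half
# its peak power while doing only 30% of its peak work.
print(round(energy_efficiency(0.3), 2))
```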


Energy Proportionality

Goal: achieve the best energy efficiency (~80%) in the common operating region (20-30% load).

Challenges to proportionality:
• Most proportionality tricks from embedded/mobile devices are not usable in the data center due to huge activation penalties
• The distributed structure of data and applications doesn't allow powering down during low use
• Disk drives spin >50% of the time even when there is no activity ([Sankar et al., ISCA '08] propose lower rotational speeds and multiple heads)


Application Behavior in Data Centers

Source: Kozyrakis et al., IEEE Micro, 2010

• Cosmos is similar to a data-mining workload
• Bing preloads the web index in memory
• But peak disk bandwidth can be high

Significant variation in disk, memory, and network capacity and bandwidth usage across applications


Dynamic Resource Requirements in the Data Center

[Figures: intra-server variation: server memory allocation per TPC-H query (Q1-Q12, log scale, 0.1 MB to 100 GB); inter-server variation: memory allocation over time across a rendering farm]

Huge variations even within a single application running in a large cluster


Motivating Disaggregated Memory*

[Figure: conventional blade systems: each blade couples CPUs with their own DIMMs, connected over a backplane]

*Lim et al., Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA 2009


Disaggregated Memory*

Break CPU-memory co-location

Leverage fast, shared communication fabrics

[Figure: blade systems with disaggregated memory: compute blades (CPUs plus local DIMMs) and a shared memory blade of DIMMs, reached over the backplane]

*Lim et al., Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA 2009


Disaggregated Memory*

[Figure: blade systems with disaggregated memory, as on the previous slide]

Authors claim:
• 8x improvement in memory-constrained environments
• 80+% improvement in performance per dollar
• 3x consolidation

*Lim et al., Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA 2009
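A back-of-envelope latency model suggests why reaching memory over the backplane can still pay off. The latencies below are illustrative assumptions, not numbers from the paper:

```python
def avg_access_ns(local_fraction, local_ns=100, remote_ns=4000):
    """Average memory access time when local_fraction of accesses hit
    local DIMMs and the rest cross the backplane to the memory blade."""
    return local_fraction * local_ns + (1 - local_fraction) * remote_ns

# If most hot pages stay local, the remote penalty is amortized while
# memory capacity is pooled instead of provisioned per blade.
print(avg_access_ns(0.95))
```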


Disaggregated Server

High-density, low-power SM10000 servers*:
• Designed to replace 40 1-RU servers in a single 10-RU system
• 512 1.66 GHz 64-bit x86 Intel Atom cores in 10 RU; 2,048 CPUs/rack
• 1.28 Tb/s interconnect fabric
• Up to 64 1 Gbps or 16 10 Gbps uplinks
• 0-64 SATA SSDs/hard disks
• Integrated load balancing, Ethernet switching, and server management
• Uses less than 2.5 kW of power

SeaMicro SM10000 server*

Claim: achieves 4x space & power consolidation

*Source: SeaMicro, http://www.seamicro.com/?q=node/102

[Figure: SeaMicro SM10000: servers with consolidated DRAM, disk drives, power supply, and fabric connectivity]


Network Architecture


Requirements of a Cloud-enabled Data Center

Economic & technical motivations:
• Economies of scale: use commodity hardware & components
• Capacity re-allocation: dynamically distribute compute resources


Status Quo: Conventional DC Network

Ref: "Data Center: Load Balancing Data Center Services", Cisco, 2004

[Figure: conventional DC network: Internet at the top, then Core Routers, Access Routers, and Ethernet switches above racks of application servers; Layer 3 above the access layer, Layer 2 below]

Key:
• CR = Core Router (L3)
• AR = Access Router (L3)
• S = Ethernet switch (L2)
• A = rack of application servers

~1,000 servers/pod == IP subnet


Conventional DC Network Problems

• Cost of network equipment is prohibitive

• Limited server-to-server capacity

[Figure: the same CR/AR/S hierarchy, with typical oversubscription ratios of ~5:1, ~40:1, and ~200:1 at successive layers]
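A quick sketch of what such ratios mean for worst-case server-to-server bandwidth; the 1 Gbps NIC speed and the function name are illustrative assumptions:

```python
def per_server_bandwidth(nic_gbps, oversubscription):
    """Worst-case bandwidth per server when every server transmits at
    once across a layer with the given cumulative oversubscription."""
    return nic_gbps / oversubscription

# Illustrative 1 Gbps NICs with the ratios from the figure: a flow
# forced through the core may see only a few Mbps.
for layer, ratio in [("rack", 5), ("aggregation", 40), ("core", 200)]:
    print(layer, per_server_bandwidth(1.0, ratio), "Gbps")
```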


And More Problems …

[Figure: the same hierarchy, partitioned into IP subnets (VLANs) #1 and #2, with ~200:1 oversubscription]

• Resource fragmentation significantly lowers cloud utilization (and cost-efficiency)


And More Problems …

[Figure: the same partitioned hierarchy]

• Server IP address assignments are topological
• Moving an IP address out of its containing VLAN is hard, requiring complicated manual L2/L3 re-configuration


What We Need Is…

1. L2 semantics
2. Uniform high capacity
3. Performance isolation


Achieving Uniform High Capacity: Clos Network Topology*

[Figure: Clos topology: intermediate switches at the top, K aggregation switches with D ports each, ToR switches with 20 servers each, supporting 20*(DK/4) servers in total]

• Large bisection bandwidth
• Multiple paths at modest cost
• Tolerates fabric failure

*Ref: Al-Fares et al., A Scalable, Commodity Data Center Network Architecture, SIGCOMM 2008
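The sizing formula on the slide can be checked directly; a minimal sketch (the 144-port switch count is an illustrative choice, not from the source):

```python
def clos_server_count(k_aggr_switches, d_ports, servers_per_tor=20):
    """Server count from the slide's formula, 20 * (D*K / 4): the
    topology provides D*K/4 ToR switches with 20 servers each."""
    return servers_per_tor * (d_ports * k_aggr_switches // 4)

# With commodity 144-port switches (an illustrative choice), the
# topology already reaches beyond 100,000 servers.
print(clos_server_count(144, 144))
```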


Addressing and Routing: Name-Location Separation

[Figure: servers x, y, z attached under ToR1-ToR4; packets carry the destination's ToR locator in the outer header (e.g., a packet to y is encapsulated with ToR3, a packet to z with ToR4); a directory service holds the mapping x -> ToR2, y -> ToR3, z -> ToR4 and answers lookups]

• Servers use flat names
• Switches run link-state routing and maintain only switch-level topology

*Greenberg et al., VL2: A Scalable and Flexible Data Center Network, SIGCOMM 2009
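The name-location split can be sketched as a directory keyed by flat server names, with senders encapsulating packets toward the destination's current ToR. The class and function names below are illustrative, not VL2's actual API:

```python
class DirectoryService:
    """Maps flat, location-independent server names to the ToR switch
    (locator) that currently hosts them."""
    def __init__(self):
        self._mapping = {}

    def register(self, server, tor):
        # Called when a server or VM boots or migrates.
        self._mapping[server] = tor

    def lookup(self, server):
        return self._mapping[server]

def encapsulate(directory, dst_server, payload):
    """Wrap a packet with the destination's current ToR locator; the
    fabric then routes only on switch-level addresses."""
    return {"outer_dst": directory.lookup(dst_server),
            "inner_dst": dst_server,
            "payload": payload}

d = DirectoryService()
d.register("x", "ToR2")
d.register("z", "ToR4")
pkt = encapsulate(d, "z", "hello")   # outer header carries ToR4
d.register("z", "ToR3")              # z migrates; only the directory changes
```

Because switches forward only on ToR-level locators, a migration updates one directory entry rather than any switch configuration.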


Addressing and Routing: Name-Location Separation (continued)

[Figure: the same setup after a server migrates: the directory updates z's mapping from ToR4 to ToR3, and packets to z are then encapsulated with ToR3; servers keep their flat names throughout]


VL2 Fabric: Objectives and Solutions

Objective | Approach | Solution
1. Layer-2 semantics | Employ flat addressing | Name-location separation & resolution service
2. Uniform high capacity between servers | Guarantee bandwidth for hose-model traffic | Clos-based network, Valiant LB flow routing
3. Performance isolation | Enforce hose model using existing mechanisms only | TCP


Protocol Interactions


TCP Incast Collapse: Problem

Affects key datacenter applications with barrier-synchronization boundaries, e.g. DFS, web search, MapReduce

Source: Nagle et al., The Panasas ActiveScale Storage Cluster: Delivering Scalable High Bandwidth Storage, SC2004


New Cluster-Based Storage System


Incast: application traffic overfills buffers


Solution: TCP with ms-RTO*

*Vasudevan et al., Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication, SIGCOMM 2009

• Little adverse effect on WAN traffic
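The fix amounts to letting TCP's retransmission timeout track datacenter RTTs instead of a WAN-scale floor. A sketch of the standard RTO rule (RFC 6298 style) under two different minimum clamps; the RTT numbers are illustrative:

```python
def rto(srtt_s, rttvar_s, min_rto_s):
    """RFC 6298-style retransmission timeout: smoothed RTT plus
    4x the RTT variance, clamped below by a minimum RTO."""
    return max(min_rto_s, srtt_s + 4 * rttvar_s)

dc_rtt, dc_var = 100e-6, 20e-6          # ~100 microsecond datacenter RTT
legacy = rto(dc_rtt, dc_var, 200e-3)    # conventional 200 ms floor
fine = rto(dc_rtt, dc_var, 1e-3)        # fine-grained timer, ~1 ms floor
# The 200 ms floor stalls a flow for ~2000 RTTs after each timeout;
# the fine-grained timer retransmits within a few RTTs.
print(legacy, fine)
```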


Incast Collapse: an Unsolved Problem at Scale*

*Griffith et al., Understanding TCP Incast Throughput Collapse in Datacenter Networks, WREN 2009

The solution space is complex:
• Network conditions can impact RTT
• Switch buffer-management strategies matter
• Goodput can be unstable with load / number of senders


Conclusions


• Opportunities to realize energy efficiency, particularly in I/O subsystems

• Data Center fabrics need to be re-architected for application scalability and cost

• WAN artifacts can create bottlenecks



NOCs in the Data Center

• Energy efficiency: local (distributed) energy-management decisions & coordination by the NOC
• Fabric communication: the NOC can reduce intra-chip/socket communication latencies between VMs
• Congestion management: the NOC can assist in traffic orchestration across VMs


Thank you!