Server-level Power Control

Charles Lefurgy, Xiaorui Wang, Malcolm Ware

IBM Research and U. Tennessee, Knoxville

The 4th IEEE International Conference on Autonomic Computing, June 12, 2007

© 2007 IBM Corporation



Server power supplies

■ Datacenter must wire to label power

■ However, real workloads do not use that much power

[Figure: server power (W), 0 to 350 W, for idle, each SPEC CPU benchmark in its -1 and -2 variants (gzip, vpr, gcc, mcf, crafty, parser, eon, perlbmk, gap, vortex, bzip2, twolf, wupwise, swim, mgrid, applu, mesa, galgel, art, equake, facerec, ammp, lucas, fma3d, sixtrack, apsi), SPECJBB, LINPACK, P4Max, and label power. Label power: 308 W. P4Max: 273 W. Label power is 50 W above real workloads.]


The problem

■ Server power consumption is not well controlled.

– System variance (workload, configuration, process, etc.)

– Design for worst-case power

■ Results:

– Power supplies are significantly over-provisioned

– Therefore, datacenters provision for power that cannot be used

– High cost, with no benefit in most environments


Our approach

■ Use “better-than-worst-case” design

– Example: Intel’s Thermal Design Power (TDP)

– Power, like temperature, can be controlled

■ Reduce design-time power requirements

– Run real workloads at full performance

– Use smaller, cost-effective power supplies

■ Enforce run-time power constraint with feedback control

– Slow system when running power virus


Our contributions

■ Control of peak server-level power (to within 0.5 W of the target in 1 second)

■ Derivation and analysis [see paper]

– Guaranteed accuracy and stability

■ Verified on real hardware

■ Better application performance than previous methods


Caveats

■ Our prototype is a blade server

– The results of the study also apply to rack-mount servers.

■ Power controller uses clock throttling, not dynamic voltage and frequency scaling (DVFS)

– At the time of the study, only clock throttling was available on our prototype system.

– DVFS is not available on all processors (lower speed grades)

– Recently, we have built a prototype using DVFS


Rest of the talk

■ Power measurement

■ Power control

– Open loop controller

– Ad-hoc controller

– Proportional controller

■ Experimental results

■ Conclusions


Power measurement

Controller firmware on service processor (Renesas H8 2168)

Measurement/calibration circuit

Sense resistors

HS20 8843 (Intel Xeon blade)

Measure 12V bulk power

0.1 W precision, 2% error


Options for power control

■ Open-loop

– No measurement of power

– Chooses fixed speed for a given power budget

– Based on most power-hungry workload

■ Ad-hoc

– Measures power and compares to power budget

– +1/-1 adjustments to processor clock throttle register

■ Proportional Controller (“P control”)

– Designed using control theory

– Guaranteed controller performance
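
The first two options can be sketched in a few lines. This is only an illustration, not the firmware used in the study: the function names, the level numbering (1 to 8, higher meaning faster), and the exact comparison rule in the ad-hoc step are assumptions. The open-loop lookup values are the settings listed in the talk's later comparison table.

```python
# Assumed numbering: throttle levels 1..8, each step worth 12.5% of full speed.
MIN_LEVEL, MAX_LEVEL = 1, 8

# Open-loop settings from the comparison table in this talk
# (power budget in W -> processor performance setting in %).
OPEN_LOOP_SETTING = {210: 37.5, 220: 50.0, 230: 62.5, 240: 62.5, 250: 75.0}


def open_loop_speed(budget_w: int) -> float:
    """Open-loop: a fixed speed per budget, chosen without measuring power."""
    return OPEN_LOOP_SETTING[budget_w]


def ad_hoc_step(level: int, measured_w: float, budget_w: float) -> int:
    """Ad-hoc: nudge the throttle level by one step toward the budget."""
    if measured_w > budget_w:
        level -= 1  # over budget: slow down one step
    elif measured_w < budget_w:
        level += 1  # under budget: speed up one step
    # Keep the result inside the range the hardware supports.
    return max(MIN_LEVEL, min(MAX_LEVEL, level))
```

Because the ad-hoc rule always moves by a whole 12.5% step, it cannot land on a speed between two levels, which is one way to read the steady-state error discussed later in the talk.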


Open loop design

■ P4MAX workload used as basis for open-loop controller

■ Graph shows maximum 1 second power for workload

[Figure: power (W), 130 to 270 W, versus processor performance setting (effective frequency), 12.5% to 100.0% in steps of 12.5%. Series: SPECCPU 2-threads, SPECCPU 1-thread, LINPACK, SPECJBB 2005, P4MAX, Idle.]


Proportional controller design

■ Settle to within 0.5 W of desired power in 1 second

– Based on BladeCenter power supply requirements

■ Every 64 ms

– Compare power to target power

– Use proportional controller to select desired processor speed

• 12.5% - 100% in units of 0.1%

■ Clock throttling

– Intel processor: 8 settings in units of 12.5% (12.5% - 100%)

– Use delta-sigma modulation to achieve finer resolution
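
The two pieces above can be sketched as follows. This is a minimal illustration under stated assumptions, not the controller from the paper: it assumes the speed is adjusted each period by an amount proportional to the power error, the gain value passed in is hypothetical (the real gain is derived in the paper), and the modulator shown is a generic first-order delta-sigma scheme. All function and variable names are made up for the example.

```python
STEP = 12.5                                # hardware throttle granularity (%)
LEVELS = [STEP * i for i in range(1, 9)]   # 12.5% .. 100%


def p_control_step(speed_pct, measured_w, target_w, gain_pct_per_w):
    """One control period: change speed in proportion to the power error."""
    error_w = target_w - measured_w        # positive means headroom is left
    new_speed = round(speed_pct + gain_pct_per_w * error_w, 1)  # 0.1% units
    return min(100.0, max(12.5, new_speed))


def delta_sigma(desired_pct, periods):
    """Emit hardware levels whose running average tracks desired_pct."""
    carried_error = 0.0
    outputs = []
    for _ in range(periods):
        wanted = desired_pct + carried_error
        level = round(wanted / STEP) * STEP        # nearest 12.5% multiple
        level = min(100.0, max(12.5, level))       # stay within hardware range
        carried_error = wanted - level             # carry the residue forward
        outputs.append(level)
    return outputs


# Example: approximate a 65.8% speed using only the eight hardware levels.
levels = delta_sigma(65.8, 80)
average = sum(levels) / len(levels)
```

In this sketch the carried error never exceeds half a step, so the average over N periods differs from the desired speed by at most 6.25/N percentage points.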


Why not use ad-hoc control?

[Figure: for each controller, server power (W), 190 to 250 W, and performance state, 40% to 100%, versus time (s), 38 to 42 s. Power traces: power 1 ms, power 64 ms, power 1 s. Annotations mark overshoot. Set point = 211.0 W.]

Ad-hoc: settles to 216.0 W, a 5 W violation. CPU speed: 68.8%

P Controller: settles to 211.0 W, no violation. CPU speed: 65.8%


Steady-state error

■ P controller has no steady-state error (x=y)

■ Ad-hoc controller has steady-state error

– Add safety margin of 6.1 W to ad-hoc

[Figure: average power measured over 66 seconds (W), 170 to 270 W, versus controller power setpoint (W), 180 to 260 W. Series: p-controller, ad-hoc, ad-hoc with safety margin. Points above y=x are violations of the power budget.]


Comparison of 3 controllers

■ Run each controller with 5 power budgets

■ Compare throughput of workloads

■ Table shows settings used for each controller

Power     Open-loop processor     Ad-hoc (with safety     P control
budget    performance setting     margin) set point       set point

210 W     37.5%                   199.7 W                 205.8 W
220 W     50%                     209.5 W                 215.6 W
230 W     62.5%                   219.3 W                 225.4 W
240 W     62.5%                   229.1 W                 235.2 W
250 W     75%                     238.9 W                 245.0 W
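
The set points in the table follow a simple pattern, which the sketch below reproduces. The relationship is inferred from the numbers rather than stated on the slide: each P control set point equals the budget reduced by 2% (the same figure as the measurement error quoted earlier), and each ad-hoc set point is a further 6.1 W lower (the safety margin from the steady-state error slide). The constant and function names are hypothetical.

```python
MEASUREMENT_ERROR = 0.02   # 2% error quoted on the power measurement slide
AD_HOC_MARGIN_W = 6.1      # safety margin quoted on the steady-state error slide


def set_points(budget_w):
    """Return (P control set point, ad-hoc set point) in watts for a budget."""
    # Assumed rule: back the budget off by the measurement error.
    p_set = round(budget_w * (1 - MEASUREMENT_ERROR), 1)
    # Assumed rule: ad-hoc additionally subtracts its safety margin.
    ad_hoc_set = round(p_set - AD_HOC_MARGIN_W, 1)
    return p_set, ad_hoc_set


for budget in (210, 220, 230, 240, 250):
    p_set, ad_hoc_set = set_points(budget)
    print(f"{budget} W budget: P control {p_set} W, ad-hoc {ad_hoc_set} W")
```

Under these assumptions the ad-hoc controller always starts 6.1 W below the P controller for the same budget, which may account for part of the throughput gap reported in the talk.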


Application performance summary

■ P controller

– 31-82% higher performance than open-loop

– 1-17% higher performance than ad-hoc

• Quicker settling time

• Zero steady state error

[Figure: application throughput (% of full performance), 0% to 100%, versus budget (W), 210 to 250 W. Series: P controller, Improved ad-hoc, Open-loop. LINPACK callouts: 92%, 99%, 70%. Annotation: 1 W = 1.1% performance!]


Power supply reduction

■ 308 W: Label power of HS20 blade

■ 260 W: Real workloads run at full performance

– A reduction of 15% in supply power.

■ Fit 15% more servers per circuit
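
As a back-of-the-envelope check of the figures above, the arithmetic can be written out. This is a sketch under an idealized assumption that circuit capacity is the only limit on server count and divides evenly; the variable names are made up for the example.

```python
LABEL_W = 308.0     # label power of the HS20 blade
REDUCED_W = 260.0   # supply power at which real workloads run at full speed

# Fraction of supply power saved per server.
reduction = (LABEL_W - REDUCED_W) / LABEL_W

# Relative increase in servers that fit in the same circuit capacity,
# assuming capacity is the only constraint (idealized).
extra_servers = LABEL_W / REDUCED_W - 1.0

print(f"supply power reduction: {reduction:.1%}")
print(f"additional servers per circuit: {extra_servers:.1%}")
```

The reduction works out to a little over 15%, in line with the slide. Under the idealized assumption, the gain in servers per circuit comes out somewhat higher than 15%, so the slide's figure can be read as a conservative round number.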


Conclusions

■ Power is a first-class resource that can be managed.

– Power is no longer the accidental result of component configuration, manufacturing variation, and workload.

■ Reduce power supply capacity, safely.

– Relax design-time constraints, enforce run-time constraints.

– Install more servers per rack.

■ Power control is a fundamental mechanism for power management in a power-constrained datacenter.

– Move power to critical workloads.