1 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
A S i f P M t
© 2008 Hewlett-Packard Development Company, L.P.The information contained herein is subject to change without notice
A Science of Power Management a systems perspectiveParthasarathy RanganathanDistinguished Technologist, Hewlett Packard Labs
What does a science of power pmanagement mean to me?
Prior body of workchips to datacentersmobile to enterprise
ApproachEnergy-aware user interfacesHeterogeneity-based architecturesPower-aware blade serversIntegrated IT/facilities resource mgmtProfit-aware schedulingUnified power management archMicroblades and megaserversPower-aware networkingT l t i b h k
pmetrology to solutions
Interesting topics for us to ponder…
Tools, metrics, benchmarks, … Joulesort, Zesti, WeathermanConsil, Splice, eprof, …
hpl.hp.com/personal/partha_ranganathan
Why do we need a science of power pmanagement?
Of course, power is important
Battery, electricity, environment, heat, compaction, reliability, …
But, havent we done a lot?[Ranganathan, EE282 class notes]
2 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
Yes!…but
Need much lower lower power
[Starner, IBM Systems Journal,1996]
Solar energy?
9 10 April 2009
energy?164W/sqm * 9% * 3cmx2cm * 50
deration for non-sunny periods = 10mW
Increased compaction…
not enough cooling?Customer Facility Examples
Datacenter average Watts per square footDatacenter average Watts per square foot
150
200
300 Genetic researchGenetic research
Palo AltoPalo Alto Web hosting service 2Web hosting service 2Web Hosting Service 1Web Hosting Service 1
HP Houston datacenterHP Houston datacenter 10kW-20kW/
Wat
ts/f
oot
Wat
ts/f
oot22
Size of Datacenter25
35
45
55
65
75
85
95
105 Nat. Lab 2Nat. Lab 2Nat. Lab 1Nat. Lab 1
Banking 1Banking 1
Financial ServicesFinancial Services
Bank4Bank4
Bank3Bank3
Bank2Bank2
Insurance 1Insurance 1
Corporate Corporate database sitedatabase siteTelecom siteTelecom site
Site1Site1Site2Site2
Site3Site3
Site4.1Site4.1
Site4.2Site4.2
Site6.1Site6.1
Site6.2Site6.2
Site7
Site4.2Site4.2
Site8.1Site8.1
Site8.2Site8.2
Site9
3.5kW-5kW/rack
$40 Billion
3 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
The cost of power and cooling is likely to exceed that of
h dhardware…Luiz Barroso, Google
More work needed on power mgmt…p g
So, how can a formal science help? p
Formulating th blthe problem
One billion pentiumsin one handheld!
Another quick experiment
! Laptop: 165X! Handheld: 15X! Cell phone: 6X! RIM pager: 92 mW
Radio wakeup:• 100ms (iPAQ); • 1.2 sec (cell) • 5 sec (RIM)
Data from [PACS2003]
4 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
!Can we identify minimum energy gyneeded for a task?
(optimality?)
minimum energy needed for a task…
How do we define energy & work done?
MIPS per Watt? Energy-Aware User Interfaces
MIPS per Watt?
! " dd PULKLK $12111 ###
Burdened costs of power
! "hardwareconsumedgrid PULKLK $,12111 ###
0.050 KW 1000+ KW10 - 15 KW
1000 KW1 KW0.005 KW
Heat Generated
Energy to Remove Heat
.250 KW
.025 KW
5 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
Supply vs demand side power efficiencies power efficiencies
Supply vs demand side power efficiencies
Energy in extraction, reclamation, transport, manufacturing, …
Energy in operation
MIPS per Watt?
"How do we measure and reason about energy efficiency?energy efficiency?
Thinking about th l tithe solution
6 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
ConsumptionProvision
(Compute nodes)
(Feeds, PDUs, UPS, etc.)
Extraction
(CRAC units, floor tiles, etc.)
Landscape of Power Optimizations
• Voltage scaling/islands• Clock gating/routingClock-tree distribution, half-swing clocks
• Redesigned latches/flip-flops
• Voltage/freq scaling• GatingPipeline, clock, functional units, branch prediction, data path • Split instrucn windows
• Switching controlRegister relabeling, operand swapping, instruction scheduling
• Memory access reduceLocality optimizations, register allocation
Average power, peak power, power density, energy-delay, …
CIRCUITS ARCHITECTURE COMPILER, OS, APP
pin-ordering, gate restructuring, topology restructuring, balanced delay paths, optimized bit transactions
• Redesigned memory cellsLow-power SRAM cells, reduced bit-line swing, multi-Vt, bit line/word line isolation/segmentation
• Other optimizationsTransistor resizing, GALS, low-power logic
• SMT thread throttling
• Bank partitioning• Cache redesignSequential, MRU, hash-rehash, column-associative, filter cache, sub-banking, divided word line, block buffers, multi-divided module, scratch
• Low-power states• DRAM refresh-control
• Switching controlGray, bus-invert, address-increment
• Code compression• Data packing/buffering
• Power-mode-control
• CPU/resource schedule• Memory/disk controlDisk spinning, page allocation, memory mapping, memory bank control
• NetworkingPower-aware routing, proximity-based routing, balancing hop count, …
• Distributed computingMobile agents placement, network-driven computation
• Fidelity control• Dynamic data types• Power API
Avoid waste!
How do we avoid waste?avoid waste?
A framework to ti i optimize power
[CACM09]
A taxonomy of inefficiencies(how do we identify waste?)
A taxonomy of techniques(how do we reduce waste?)
General-purpose designse.g., convergence, mission-critical
Design process structuree.g., modularity, aggregation
Bad design choices & business constraints
Design for futuree.g., unforeseen or worst-case
End-user experiencee.g., tethered system hangover
Inefficiencies
7 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
avoiding waste & improving power efficiency
!Energy-efficient technologiese.g., replace disk with flash, replace copper with optics, …
"Match power to worke.g., energy proportionality; turn-off/dial-down things, …
#Match work to power
38 10 April 2009
pe.g., asymmetric multicores, temp-aware scheduling, …
$Piggy back energy eventse.g., shared caches, coalesced request streams, …
%Special purpose solutionse.g., GPUs, ASICs, …
&Cross layers for efficiencye.g., ensemble power management, …
'Tradeoff some other metrice.g., fidelity-aware energy management
(Tradeoff the uncommon-case
39 10 April 2009
e.g., power-supply efficiency, …
)Spend somebody’s powere.g., remote server offload for mobile power
*Spend power to save powere.g., periodic cleanup to save energy, …
#How do we formalize these principles?
Building the l tisolution
Measure & modelPredict & analyzePredict & analyzeActuate & control
8 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
Measure & modelPredict & analyzeActuate & control
“Science” challenges in each category
Measure & modele.g., neural-network models for thermal maps (Consil/Weatherman)
Predict & analyzePredict & analyzee.g., models to predict power-performance-cooling (Zephyr, Zesti)
Actuate & controle.g., federated control theory, scheduling algorithms
One example[ASPLOS08][ASPLOS08]
Server iLO
Enclosure IAM
Rack
X
X
X
X
VM
heterogeneity
performance
performance
performance
X
X
X
X
X
X
Power Struggles!
Average power
Peak thermal power
Peak electrical power
CPU
iLO
X
XX
OS-wlm
OS-gwlm
SIM
Vmotion
VM-res.all
LSF
X
Local optima
global optima
performance
X
X
X
X
XCHAOS!!
(“Power” Struggle)
X
X
X
Feedback Controller @ Core
April 10, 200948
• Formal theoretical guarantees of stability and performance• Can account for inaccuracies in model, workload variations
9 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
Efficiency + capping at single server
Adapt power consumption to the local budget
Adapt the “capacity” allocation to the resource demand
Power efficiencyPower Capping
D1
EMEnclosure
budget -
rref1
P1SM1Local
budget Max
P
+Min
BLADEEC1
Dm
rrefm
PmSMmLocal
budget Max BLADEECm
Multi-level Power CappingServer Manager (SM)• Efficiency + capping
Group Manager (GM)
Enclosure Manager (EM)• Enclosure-level throttling
April 10, 2009 50
- Pgrp
+
Pencl
Groupbudget
GM Min
Min
!
DN-1
rrefN-1
PN-1SMN-1
!
ServerECN-1
DN
rrefN
PNSMN ServerECN
p g ( )• Policy-based allocation
EM
SM BLADE
SM
EC
BLADEEC
VMC
r1
rm
D1
Dm
Virtual Machine Workload Distributor
Virtual Machine Controller• Workload placement
• Power minimization
• Power budget constraints
• ‘01’ I t P i
Groupbudget GM
SERVEREC
SERVEREC
SM
SM
rN-1
rN
DN-1
DN
Workload consolidation & migrationCapping + efficiency
• ‘0-1’ Integer Programming
• Greedy heuristic
Unified and Extensible Architecture
It works!
P k P
Average Power
It works well!
65% savings (OpEx)
0 20 40 60 80 100
Performance
Peak Power
Unified Baseline
20% savings (CapEx)
Similar performance
10 October 2003 Copyright © 2006 HP corporate presentation. All rights reserved.
Average Power
Other interesting insights…
No VMCVMC onlyUnified
0 20 40 60 80 100
Performance
Peak Power
Unified VMCOnly NoVMC Baseline
InterfacesFormal rigor
FlexibleFlexibleExtensible
Federated
ClosingPower mgmt ready for a scienceimportant future challenges & prior body of work
Several interesting challengesformulating the problemformulating the probleme.g., optimality, metrics
thinking about the solutione.g., principles of power management
designing the solutione.g., models, analysis, control, algorithms, …
Thanks!!