Scalably Verifiable Dynamic Power Management
Opeoluwa (Luwa) Matthews, Meng Zhang,
and Daniel J. Sorin
Duke University
HPCA-20 Orlando, FL, February 19, 2014
Executive Summary
• Dynamic Power Management (DPM) used to improve power-efficiency at several levels of computing stack
- within multicore chip, across servers in datacenter, etc.
• Deploying DPM scheme risky if not fully verified
- difficult to verify scheme for large-scale systems
• Our contribution: Fractal DPM
- framework for designing scalably verifiable DPM
- implement Fractal DPM on 2-chip (16-core) system
- experimental evaluation on real system
Dynamic Power Management
DPM aims to:
- dynamically allocate power to computing resources (e.g. cores, chips, servers, etc.)
- attain best performance at given power budget
- achieve lowest power consumption for desired performance
[Figure: n cores in a CMP send "Request Power" messages to the DPM, which answers each with grant or deny]
[Figure: same arrangement at datacenter scale; n machines in a datacenter send "Request Power" messages to the DPM, which answers each with grant or deny]
Case for Dynamic Power Management
• Chips have hit power density ceiling [Hennessy and Patterson, Computer Architecture]
Case for Dynamic Power Management
• Datacenters consume increasing amounts of power
• Reducing cloud electricity consumption by half saves as much as UK consumes [hp.com]
[Figure: cloud map of UK]
Case for Verifiable DPM
• DPM can greatly improve energy efficiency
• Unverified DPM could
- overshoot power budget ⟹ system damage
- underutilize resources
- deadlock
• Want formal verification
- prove correctness for all possible DPM allocations
- guarantee safety of DPM scheme
Why Scalably Verifiable DPM is Hard
• CMPs and datacenters have many computing resources
• S power states per CR + n computing resources (CR) ⟹ S^n possible DPM states
• Checking S^n states is intractable for typical values of S and n
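To get a feel for how quickly S^n grows, the short Python sketch below tabulates the count for S = 5 (the number of CR power states this talk uses). The system sizes n are illustrative choices, not figures from the slides.

```python
def dpm_state_count(num_states: int, num_resources: int) -> int:
    """Number of joint power-state assignments for n CRs with S states each."""
    return num_states ** num_resources


if __name__ == "__main__":
    S = 5  # L, LM, M, MH, H
    for n in (2, 4, 16, 64):  # illustrative system sizes (assumption)
        print(f"n={n:3d}  S^n={dpm_state_count(S, n):.3e}")
```

Even at n = 16 an exhaustive check would have to visit on the order of 10^11 joint states, which is why the talk turns to induction instead of enumeration.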
Hypothesis and Assumptions
Problem: verification of existing DPM protocols is unscalable
Hypothesis: We can design DPM such that it is scalably verifiable
- key idea: design DPM amenable to inductive verification
- change architecture to match verification methodologies
Approach:
- abstract away details of computing resources
- abstract power states, e.g. Medium power
- focus on decision policy (not mechanism, e.g. DVFS)
Outline
• Background and Motivation
• Fractal DPM
• Experimental Evaluation
• Conclusions
Our Inductive Approach
• Induction key to scalable verification ⟹ can prove DPM correct for arbitrary number of computing resources
• Base case: small scale system with few CRs is correct
- small enough that it’s easy to verify with existing tools
• Inductive step: system behaves the same at every scale ⟹ fractal behavior
• Prove base case + prove inductive step ⟹ DPM scheme is correct for any number of CRs
• Approach more general than DPM, borrowed from prior work on coherence protocols [Zhang 2010]
Attaining Scalable Verification - base case of induction
• CRs request power from DPM controller
• DPM controller grants or denies each request
• Few states ⟹ easy to verify that DPM is correct
- note: over-simplified base case for now
[Figure: two CRs send "Request Power" to a DPM controller (DPM-C), which replies "Grant/Deny"]
Attaining Scalable Verification - base case of induction
• Base Case
- Refine our base case a little
- Need all types of structures: CR, DPM-C, Root DPM-C
[Figure: tree made of three CRs, one DPM-C, and a Root DPM-C]
Attaining Scalable Verification - inductive step
• behavior must be fractal
[Figure: two CRs exchange "Request Power" and "Grant/Deny" messages with a DPM-C]
Attaining Scalable Verification - inductive step
• can scale system by replacing CR with larger system
• {DPM-C + 2 CRs} “behaves just like” 1 CR ⟹ observational equivalence
[Figure: one CR of the small system replaced by a DPM-C with two CRs; every level exchanges "Request Power" and "Grant/Deny" messages]
Attaining Scalable Verification - observational equivalence
• Inductive Step: Two Observational Equivalences
1) “Looking-down” equivalence check
- Observed externally from P1, A and A’ behave same
[Figure: (a) Small System containing A, (b) Large System containing A’; observer O1 at point P1 in each]
Attaining Scalable Verification - observational equivalence
• Inductive Step: Two Observational Equivalences
2) “Looking-up” equivalence check
- Observed externally from P2, B and B’ behave same
• By induction, protocol correct for all scales
[Figure: (a) Small System containing B, (b) Large System containing B’; observer O2 at point P2 in each]
Fractal DPM Design
• CR can be in 1 of 5 power states: L(ow), LM, M(ed), MH and H(igh)
• DPM controller state is <Left Child State>:<Right Child State>
• Parent DPM controller “sees” child DPM controller in averaged state
- Avg(H:L) = M
[Figure: controller in state H:L with CR children in H and L; its parent, in state M:L, sees it as M]
Fractal DPM Design
• Second example of averaged state: Avg(MH:H) = H
[Figure: controller in state MH:H with CR children in MH and H; its parent, in state H:L, sees it as H]
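The averaging rule can be sketched in a few lines of Python. The slides give only two data points, Avg(H:L) = M and Avg(MH:H) = H; the sketch below assumes the rule is "average the state indices and round up", which reproduces both but is not confirmed by the slides. The names `STATES` and `avg_state` are hypothetical.

```python
STATES = ["L", "LM", "M", "MH", "H"]  # ordered from lowest to highest power


def avg_state(left: str, right: str) -> str:
    """Averaged state a parent sees for a controller in state left:right.

    Assumption: average the two state indices and round up, which matches
    Avg(H:L) = M and Avg(MH:H) = H as shown on the slides.
    """
    total = STATES.index(left) + STATES.index(right)
    return STATES[(total + 1) // 2]  # integer ceiling of total / 2


if __name__ == "__main__":
    for pair in [("H", "L"), ("MH", "H"), ("MH", "L")]:
        print(f"Avg({pair[0]}:{pair[1]}) = {avg_state(*pair)}")
```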
Fractal DPM Design - fractal invariant
• Fractal design + inductive proof ⟹ invariant must also be fractal
- Invariant must apply at every scale of system
- Not OK to specify, e.g., <75% of all CRs are in H state
• Our fractal invariant: children of DPM controller not both in H
[Figure: two ILLEGAL configurations. Left: controller in state H:H, both CR children in H. Right: parent in state H:H, because its child controller H:MH averages to H and the sibling CR is also in H]
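As a rough illustration of what "applies at every scale" means, the sketch below checks the invariant at every controller of a small tree. The tree encoding (a CR is a state string, a controller is a pair of children) and the round-up averaging rule are assumptions made for this example, not the authors' model.

```python
STATES = ["L", "LM", "M", "MH", "H"]


def avg_state(left: str, right: str) -> str:
    """Assumed averaging rule: mean of state indices, rounded up."""
    return STATES[(STATES.index(left) + STATES.index(right) + 1) // 2]


def seen_state(node) -> str:
    """State a parent sees: a CR's own state, or a controller's averaged state."""
    if isinstance(node, str):
        return node
    left, right = node
    return avg_state(seen_state(left), seen_state(right))


def invariant_holds(node) -> bool:
    """True if no controller in the tree has both children seen as H."""
    if isinstance(node, str):
        return True  # a CR has no children to constrain
    left, right = node
    if seen_state(left) == "H" and seen_state(right) == "H":
        return False
    return invariant_holds(left) and invariant_holds(right)


if __name__ == "__main__":
    print(invariant_holds((("H", "L"), "L")))   # legal tree from the design slide
    print(invariant_holds((("H", "H"), "L")))   # lower controller is H:H
    print(invariant_holds((("H", "MH"), "H")))  # root sees H:H
```

The same predicate is evaluated at the root and at every inner controller, which is the property the inductive proof relies on.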
Translating Fractal Invariant to System-Wide Cap
• We must have fractal invariant for fractal design
• But most people interested in system-wide invariants
• We prove (not shown) that our fractal invariant implies system-wide power cap
• Max power for n CRs is: (n-1)MH + H
- i.e., (n-1) CRs in state MH and one CR in state H
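The proof itself is not shown in the talk, but the bound can be sanity-checked by brute force on tiny systems. The sketch below enumerates every assignment for a complete binary tree of n CRs, keeps those that satisfy the fractal invariant, and compares the largest total against (n-1)MH + H. It assumes the round-up averaging rule and uses the state index (L=0 ... H=4) as a stand-in for power; both are assumptions for illustration only.

```python
from itertools import product

H, MH = 4, 3  # state indices: L=0, LM=1, M=2, MH=3, H=4 (index used as power proxy)


def seen_and_legal(leaves):
    """Return (averaged index, invariant holds) for a complete binary tree of CRs."""
    if len(leaves) == 1:
        return leaves[0], True
    mid = len(leaves) // 2
    left, left_ok = seen_and_legal(leaves[:mid])
    right, right_ok = seen_and_legal(leaves[mid:])
    legal = left_ok and right_ok and not (left == H and right == H)
    return (left + right + 1) // 2, legal  # assumed round-up average


def max_legal_total(n: int) -> int:
    """Largest sum of state indices over all invariant-respecting assignments."""
    best = -1
    for leaves in product(range(5), repeat=n):
        _, legal = seen_and_legal(leaves)
        if legal:
            best = max(best, sum(leaves))
    return best


if __name__ == "__main__":
    for n in (2, 4):  # n must be a power of two for this tree shape
        print(n, max_legal_total(n), (n - 1) * MH + H)
```

Exhaustive enumeration is only feasible here because n is tiny; the point of the fractal design is that no such enumeration is needed at scale.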
Fractal DPM Design - illustration
• CR requests MH
[Figure: CR in state H sends "Req. MH" to its controller (state H:L); sibling CR in L; root in state M:L]
Fractal DPM Design - illustration
• Granting request doesn’t change controller’s Avg state
- Avg(H:L) = Avg(MH:L) = M
• Request granted, doesn’t violate invariant
• Controller blocks waiting for ack
[Figure: controller (now MH:L) sends "Grant MH" to the CR and blocks; root unchanged at M:L]
Fractal DPM Design - illustration
• CR sends ack to Controller
• CR sets its state
[Figure: CR, now in MH, sends "ack" to its blocked controller (MH:L); root at M:L]
Fractal DPM Design - illustration
• Controller unblocks
[Figure: tree after the exchange completes; no controller is blocked]
Fractal DPM Design - illustration
• Computing Resource requests H
[Figure: all CRs in L; controller and root both in state L:L; one CR sends "Req. H"]
Fractal DPM Design - illustration
• CR requests H from its Controller
• Controller defers request to its parent
- new request is M (not H) because Avg(H:L)=M
[Figure: CR sends "Req. H" to its controller (L:L); controller sends "Req. M" to the root (L:L)]
Fractal DPM Design - illustration
• Root grants request to Controller, blocks
[Figure: root (now M:L) sends "Grant M" to the controller (L:L) and blocks]
Fractal DPM Design - illustration
• Controller grants request to CR, blocks
[Figure: controller (now H:L) sends "Grant H" to the CR and blocks; root (M:L) still blocked after its "Grant M"]
Fractal DPM Design - illustration
• acks percolate up tree from CR
• Controllers unblock upon receiving ack
[Figure: CR, now in H, sends "ack" to its controller (H:L), which forwards "ack" to the root (M:L); each controller unblocks as its ack arrives]
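The two walk-throughs above suggest a simple decision flow, sketched below in Python. This is a reading of the illustration, not the authors' protocol: the `Controller` class, its fields, and the `request` method are hypothetical, the averaging rule is assumed to round up, and blocking and acks are left out so that only the grant/deny/defer decision remains.

```python
from dataclasses import dataclass
from typing import Optional

STATES = ["L", "LM", "M", "MH", "H"]


def avg_state(left: str, right: str) -> str:
    """Assumed averaging rule: mean of state indices, rounded up."""
    return STATES[(STATES.index(left) + STATES.index(right) + 1) // 2]


@dataclass
class Controller:
    left: str                              # state recorded for the left child
    right: str                             # state recorded for the right child
    parent: Optional["Controller"] = None  # None for the root controller
    side_in_parent: str = "left"           # slot this controller fills in its parent

    def request(self, side: str, new_state: str) -> bool:
        """Decide a child's request; True means granted (hypothetical policy)."""
        left = new_state if side == "left" else self.left
        right = new_state if side == "right" else self.right
        if left == "H" and right == "H":
            return False  # granting would violate the fractal invariant
        new_avg = avg_state(left, right)
        if new_avg != avg_state(self.left, self.right) and self.parent is not None:
            # Averaged state would change, so defer to the parent with the
            # averaged request, as in the "Req. H" becomes "Req. M" example.
            if not self.parent.request(self.side_in_parent, new_avg):
                return False
        self.left, self.right = left, right  # record the granted state
        return True


if __name__ == "__main__":
    root = Controller("L", "L")
    ctrl = Controller("L", "L", parent=root)
    print(ctrl.request("left", "H"), f"{ctrl.left}:{ctrl.right}", f"{root.left}:{root.right}")
```

In the first walk-through the averaged state stays at M, so the controller decides locally; in the second the average would move from L to M, so the request travels up to the root before being granted.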
Verification Procedure
• Use model checker to verify base case
- we use well-known, automated Murphi model checker
• Use same model checker to verify observational equivalences
- use prior aggregation method for equivalence check (Park, TCAD 2000)
Outline
• Background and Motivation
• Fractal DPM
• Experimental Evaluation
• Conclusions
Experimental Evaluation - fractal inefficiency: cost of fractal behavior
• Our fractal invariant implies system-wide cap > n*MH
• Violating fractal invariant ⇏ overshooting system-wide power cap
• Situations are few and don’t significantly degrade performance
[Figure: Legal: four CRs in MH (controllers MH:MH, root MH:MH), total power = 4MH. Illegal: CRs in M, M, H, H (controllers M:M and H:H, root M:H), total power = 4MH but violates fractal invariant]
Experimental Evaluation - system model
• Implemented Fractal DPM on 16-core Linux system, 2 sockets
- 2 cores act as a CR
- controllers communicate through UDP across sockets
Experimental Evaluation - experimental setup
Power Mode DVFS Mappings
Power Mode     L     LM    M     MH    H
Freq. (GHz)    1.4   2.1   2.7   3.3   3.6
• Entire system plugged into power meter (Watts up?)
Experimental Evaluation - comparison schemes
• Static Scheme:
- no DPM, set all CRs to the same power state (e.g. MH)
- trivially correct, poor energy efficiency
• Oracle DPM:
- allocates for optimal energy efficiency (ED^2) under budget
- oracle doesn’t scale, unimplementable
• Optimized Fractal DPM (OptFractal):
- CRs re-request lower power state when denied
- no change to Fractal DPM decision algorithm
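ED^2 is the energy-delay-squared product, energy × delay², where lower is better. The sketch below shows how a CR might pick the state that minimizes it. The function names are hypothetical and the (energy, delay) numbers are made up purely to exercise the code; they are not measurements from this work.

```python
from typing import Mapping, Tuple


def ed2(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay-squared product: E * D^2 (lower is better)."""
    return energy_joules * delay_seconds ** 2


def best_state(measurements: Mapping[str, Tuple[float, float]]) -> str:
    """Return the power state whose (energy, delay) pair minimizes ED^2."""
    return min(measurements, key=lambda state: ed2(*measurements[state]))


if __name__ == "__main__":
    # Made-up (energy in J, delay in s) pairs for illustration only.
    sample = {"L": (10.0, 4.0), "M": (14.0, 3.0), "H": (25.0, 2.5)}
    for state, (energy, delay) in sample.items():
        print(state, ed2(energy, delay))
    print("chosen:", best_state(sample))
```

Squaring the delay weights performance more heavily than energy, so a slower, lower-power state wins only when its energy saving is large.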
Experimental Evaluation
• Benchmarks: Details in the paper.
Results - compared to static scheme
• FractalDPM within 8% of Oracle DPM ED^2 savings
• OptFractalDPM within 2% of Oracle DPM ED^2 savings
[Figure: bar chart of % savings from Static-MH for Delay and Energy; y-axis from -5 to 20]
Results - response latency
• Most power requests serviced within 1 ms
- UDP packet round trip ~0.6 ms
[Figure: CDF (%) of response time (ms); x-axis from 0 to 3.5 ms]
Conclusions
• We show how a scalably verifiable DPM can be built
• Fractal behavior enables one-time verification for all scales
• Entire verification is fully automated in a model checker
• Fractal DPM achieves energy efficiency close to that of an optimal allocator
Benchmarks
Important: experiments must stress all Fractal DPM power modes
• Each CR repeatedly launches bodytrack (from PARSEC benchmark suite), under a range of predetermined duty cycles
• Under given duty cycle, CRs request power state that minimizes ED^2
Why rely on duty cycle, not just different benchmarks or phases?
• Stressing all Fractal DPM power modes ⟹ stressing DVFS states
• Without varying duty cycle, optimal ED^2 always under highest frequency for all benchmarks tried [Dhiman 2008]
• Predetermined set of duty cycles for launching bodytrack that directly maps to set of power modes (or DVFS states)
• Experiment constitutes running sequence of bodytrack jobs, randomly selecting duty cycles from predetermined set
Results
• Millions of time steps simulated
• For each time step, system perf = Σ_CR perf(CR)
• % system perf loss = [expression not recoverable] * 100%
• On 72.6% of time steps, Fractal DPM ≡ Oracle DPM
• On 99.9% of time steps, Fractal DPM < 20% off from Oracle
• Worst case, Fractal DPM < 36.4% off from Oracle
[Figure: CDF (%) of % system performance loss]