59
µDPM: Dynamic Power Management for the Microsecond Era Chih-Hsun Chou [email protected] Laxmi N. Bhuyan [email protected] Daniel Wong [email protected]

µDPM: Dynamic Power Management for the ... - Daniel Wong

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: µDPM: Dynamic Power Management for the ... - Daniel Wong

µDPM: Dynamic Power Management for the Microsecond Era

Chih-Hsun [email protected]

Laxmi N. Bhuyan [email protected]

Daniel [email protected]

Page 2: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

ns msµs events

Killer

Microsecond

2

Computer systems efficiently support . . .

Masked by microarchitectural

techniques

Masked by OS-level

techniques

Page 3: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

The Killer Microseconds

3

“System designers can no longer ignore efficient support for microsecond-scale I/O ... Novel microsecond-optimized system stacks are needed”[1]

Killer Microseconds

[1] Luiz Barroso, Mike Marty, David Patterson, and Parthasarathy Ranganathan. Attack of the killer microseconds. Commun. ACM 60, 4 (March 2017), 48-54.

Page 4: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Computer systems cannot efficiently support Microsecond-scale service time

4

Request Response

~milliseconds

Traditional Monolithic Services

Page 5: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Computer systems cannot efficiently support Microsecond-scale service time

5

Request Response

Emerging Microservices

microseconds

Page 6: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Microservice Example

6

Source: Adrian Cockcroft, “Monitoring Microservices and Containers: A Challenge”

~700 microservices

Page 7: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019 7

Implications of Killer Microsecond service time on

Dynamic Power Management?

Page 8: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Opportunity for DPM – Latency Slack› Slow down

request processing(DVFS)

› Delay request processing(Sleep)

8

0%

20%

40%

60%

80%

100%

120%

10% 20% 30% 40% 50% 60% 70% 80% 90% 100%

Tail

Late

ncy

(% o

f SLA

)

Load (% of peak load)

tail latency SLA

Latency Slack

Page 9: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› DVFS (Rubik [MICRO’15], Pegasus [ISCA’14])

› Rubik adjusts f per request

› DVFS + Sleep › SleepScale [ISCA’14] finds optimal frequency & C-state depth for 60s epochs

9

R1R0 R2 R3 R4

R4R0 R2 R3R1Epoch 0 Epoch 1

t

t

f

f

Page 10: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

Page 11: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

R1 arrives

Page 12: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

R1 arrives

Target Tail Latency

Page 13: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

R1 arrives

Wake

Target Tail Latency

Page 14: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

R1 arrives

Wake

Target Tail Latency

Page 15: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

Wake

R2 arrives

Page 16: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

Wake

R2 arrives

Target Tail Latency

Page 17: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

Wake

R2 arrives

Target Tail Latency

Page 18: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

sleep

time

Wake

R3 arrives

Page 19: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

Target Tail Latency

sleep

time

Wake

R3 arrives

Page 20: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

Wake

Target Tail Latency

sleep

time

R3 arrives

Page 21: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

Wake

Target Tail Latency

sleep

time

R3 arrives

Page 22: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Dynamic Power Management Overview› Sleep states (PowerNap[ASPLOS’09], Dreamweaver[ASPLOS’12],

DynSleep[ISLPED’16], CARB[CAL’17])

10

Wakesleep

time

Page 23: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

DPM ineffective w/ microsecond service time

11P

ower

(W)

28

30.75

33.5

36.25

39

Avg service time(us)10 100 1000

Baseline RubikSleepScale DynSleep

DVFSDeep SleepDVFS+Sleep

Baseline

Page 24: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

DPM ineffective w/ microsecond service time

11P

ower

(W)

28

30.75

33.5

36.25

39

Avg service time(us)10 100 1000

Baseline RubikSleepScale DynSleep

DPM Ineffective

DVFSDeep SleepDVFS+Sleep

Baseline

Page 25: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

DPM ineffective w/ microsecond service time

› >250µs: DVFS effective at slowing down request processing

11P

ower

(W)

28

30.75

33.5

36.25

39

Avg service time(us)10 100 1000

Baseline RubikSleepScale DynSleep

DPM Ineffective

DVFSDeep SleepDVFS+Sleep

Baseline

Page 26: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

DPM ineffective w/ microsecond service time

› >250µs: DVFS effective at slowing down request processing

› <250µs: DPM becomes ineffective

11P

ower

(W)

28

30.75

33.5

36.25

39

Avg service time(us)10 100 1000

Baseline RubikSleepScale DynSleep

DPM Ineffective

DVFSDeep SleepDVFS+Sleep

Baseline

Page 27: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

DPM ineffective w/ microsecond service time

› >250µs: DVFS effective at slowing down request processing

› <250µs: DPM becomes ineffective

› Surprisingly, sleep-based policiesoutperform DVFS-based policies

11P

ower

(W)

28

30.75

33.5

36.25

39

Avg service time(us)10 100 1000

Baseline RubikSleepScale DynSleep

DPM Ineffective

DVFSDeep SleepDVFS+Sleep

Baseline

Page 28: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Fragmented idle periods à Lost Opportunities

12

50% utilization

Longer Service Time

Shorter Service Time

50% utilization

Page 29: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Fragmented idle periods à Lost Opportunities

› Short service times fragment idle periods

13

DVFS (500µs)Sleep (500µs)Sleep (200µs)

DVFS (200µs)

Page 30: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Fragmented idle periods à Lost Opportunities

› Short service times fragment idle periods

› Sleep states / request delaying canconsolidateidle periods

13

DVFS (500µs)Sleep (500µs)Sleep (200µs)

DVFS (200µs)

Page 31: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Significant transition overheads and idle power

14

Baseline

Rubik

Sleepscale

DynSleep

Optimal

Normalized Energy

0.6 0.68 0.76 0.84 0.92 1

Busy Idle C-state tran.VFS tran.

Idleness and transition overhead still account for up to ~25% of energy

Page 32: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

DPM inefficiencies

15

Tail Service Time = 78µs

C6 Residency Time = 300µs

R1C0C3C6

R1

Arr

ival

Tail Latency Target = 800µs

Wasted Energy

Wasted Energy

DVFS limited in closing Latency Gap tR0

* SPECjbb timing

Page 33: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

DPM inefficiencies

15

Tail Service Time = 78µs

C6 Residency Time = 300µs

R1C0C3C6

R1

Arr

ival

Tail Latency Target = 800µs

Wasted Energy

Wasted Energy

Solution:Aggressive Deep Sleep

Solution:Request Delaying

DVFS limited in closing Latency Gap tR0

Solution:Coordinate DVFS

* SPECjbb timing

Page 34: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Key Insight

16P

ower

(W)

28

30.75

33.5

36.25

39

Avg service time(us)10 100 1000

Baseline RubikSleepScale DynSleepµDPM

Careful coordination of DVFS, Sleep state, and request delaying is the key to effective DPM with microsecond service times

Deep SleepDVFS+SleepDVFSBaseline

Page 35: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

µDPM› Aggressively Deep Sleep› Delay and slow down request processing to finish just-in-time,

even under microsecond request service times› Carefully coordinating DVFS, Sleep, and request delaying

17

Tail Service Time = 78µs

R1 t

Tail Latency Target = 800µs

Solution:Aggressive Deep Sleep

Solution:Request Delaying

R1

Arr

ival

Residency Time = 300µs

C0C3C6

R0

Solution:Coordinate DVFS

Page 36: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Can latency-critical workloads utilize deep sleep states?

18

Memcached

SPECjbb

Xapian

Masstree

Start

Tail Service Time

(95th percentile)

33µs

78µs

250µs

1200µs

Page 37: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Can latency-critical workloads utilize deep sleep states?

18

Memcached

SPECjbb

Xapian

Masstree

Start

Target Tail Latency(95

th percentile)150µs

800µs

1100µs

2100µs

Tail Service Time

(95th percentile)

33µs

78µs

250µs

1200µs

Page 38: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Can latency-critical workloads utilize deep sleep states?

18

Memcached

SPECjbb

Xapian

Masstree

Start

Target Tail Latency(95

th percentile)150µs

800µs

1100µs

2100µs

Tail Service Time

(95th percentile)

33µs

78µs

250µs

1200µs

Opportunity

Page 39: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Aggressive deep sleep and request delaying

› Wakeup after residency time

› Wakeup before residency time if needed to meet tail latency

19

R0Req

R0 Arrival

t

C0C3C6

Residency Time = 300µs

Page 40: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Estimating Service time and Latency› Estimate tail service time

› Statistical performance model[2]

› Online periodic resampling (100ms)

20

R0ReqR0 Arrival

t

C0C3C6

Si

[2] Kasture, Harshad, Davide B. Bartolini, Nathan Beckmann, and Daniel Sanchez. "Rubik: Fast analytical power management for latency-critical systems." MICRO 2015.

Page 41: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Estimating Service time and Latency› Estimate tail service time

› Statistical performance model[2]

› Online periodic resampling (100ms)

› Estimate Request Tail Latency› L = W + Twake + Tdvfs + Si / f

20

R0ReqR0 Arrival

t

C0C3C6

Si

W Si/fTwake + Tdvfs

[2] Kasture, Harshad, Davide B. Bartolini, Nathan Beckmann, and Daniel Sanchez. "Rubik: Fast analytical power management for latency-critical systems." MICRO 2015.

Page 42: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Detecting critical request arrival

› Detecting critical request arrival › If inter-arrival time between 2 consecutive requests are shorter than the

tail service time

21

R0Req

R0

t

C0C3C6

Page 43: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Detecting critical request arrival

› Detecting critical request arrival › If inter-arrival time between 2 consecutive requests are shorter than the

tail service time

21

R0Req

R0

tR1*

R1 Arrival – Critical!

C0C3C6

Page 44: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Detecting critical request arrival

› Detecting critical request arrival › If inter-arrival time between 2 consecutive requests are shorter than the

tail service time

21

R0Req

R0

tR1*

R1 Arrival – Critical!QoS

Violation!

C0C3C6

Page 45: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Detecting critical request arrival

› Detecting critical request arrival › If inter-arrival time between 2 consecutive requests are shorter than the

tail service time

21

R0Req

R0

tR1*

R1 Arrival – Critical!QoS

Violation!

C0C3C6

R1 critical if tR1 – tR0 ≤ Sf

tail

Page 46: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Coordinate frequency, sleep, delay

22

R0Req

R0

tR1*

R1QoS

Violation!

C0C3C6

R1 critical if tR1 – tR0 ≤ Sf

tail

Page 47: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Coordinate frequency, sleep, delay

22

Req

R0

t

R1

C0C3C6

R1 critical if tR1 – tR0 ≤ Sf

tail

R1R0

f’ = Stail/(TRiTargetCompletion-TRi-1Completion )

Page 48: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Coordinate frequency, sleep, delay

22

Req

R0

t

R1

C0C3C6

R1 critical if tR1 – tR0 ≤ Sf

tail

R1R0Increase frequency on wakeup

Can sleep longerdue to higher freq.

f’ = Stail/(TRiTargetCompletion-TRi-1Completion )

Page 49: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Coordinate frequency, sleep, delay

› Reset to lowest frequency on wake up › Only increase frequency on reconfiguration

22

Req

R0

t

R1

C0C3C6

R1 critical if tR1 – tR0 ≤ Sf

tail

R1R0Increase frequency on wakeup

Can sleep longerdue to higher freq.

f’ = Stail/(TRiTargetCompletion-TRi-1Completion )

Page 50: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Minimize state transition overheads› Calculate criticality score

› Send to core that is least critical

› Minimize state transitions

23

See paper for details!

Page 51: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Evaluation› In-House Simulator (similar to BigHouse) › Empirical Power Model

› 10µs DVFS transition, 89µs sleep transition time (double to account for cache flushing)

› Add 25µs to first request service time after idle period for cold miss penalty › Baseline – Linux menu idle governor and intel_pstate driver › Workloads

24

Page 52: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Energy Savings

25

Ene

rgy

Sav

ing

(%)

0

10

20

30

40

Load0.1 0.2 0.3 0.4 0.5

Memcached

Ene

rgy

Sav

ing

(%)

0

10

20

30

40

Load

0.1 0.2 0.3 0.4 0.5 0.6 0.7

SPECjbb

Ene

rgy

Sav

ing

(%)

0

5

10

15

20

Load0.1 0.2 0.3 0.4 0.5 0.6

Masstree

Ene

rgy

Sav

ing

(%)

0

2.75

5.5

8.25

11

Load0.1 0.2 0.3 0.4 0.5 0.6

Xapian

Rubik DynSleep Sleepscale µDPM Optimal

Page 53: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Energy Savings

25

Ene

rgy

Sav

ing

(%)

0

10

20

30

40

Load0.1 0.2 0.3 0.4 0.5

Memcached

Ene

rgy

Sav

ing

(%)

0

10

20

30

40

Load

0.1 0.2 0.3 0.4 0.5 0.6 0.7

SPECjbb

Ene

rgy

Sav

ing

(%)

0

5

10

15

20

Load0.1 0.2 0.3 0.4 0.5 0.6

Masstree

Ene

rgy

Sav

ing

(%)

0

2.75

5.5

8.25

11

Load0.1 0.2 0.3 0.4 0.5 0.6

Xapian

Rubik DynSleep Sleepscale µDPM Optimal

µDPM typically within 2-3% of OptimalµDPM saves ~2X vs others

Page 54: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

State transition overhead reduction

26

Baseline

Rubik

Sleepscale

DynSleep

Optimal

µDPM

Normalized Energy

0.6 0.68 0.76 0.84 0.92 1

Busy IdleC-state tran. VFS tran.

Page 55: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Under Varying Load

27

Page 56: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Sensitivity to target tail latency

28

Ene

rgy

savi

ng (%

)

0

3

6

9

12

Latency contraint (µs)

600 1133 1667 2200

Ene

rgy

savi

ng (%

)0

7.5

15

22.5

30

Latency contraint (µs)80 135 190 245 300

Ene

rgy

savi

ng (%

)

0

2.25

4.5

6.75

9

Latency contraint (µs)

1200 1950 2700 3450 4200

Ene

rgy

savi

ng (%

)

0

7.5

15

22.5

30

Latency contraint (µs)400 700 1000 1300 1600

Memcached SPECjbb

Masstree Xapian

Page 57: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Sensitivity to Transition time

5

10

15

20

25

30

0 100 200 300

Ene

rgyr

Sav

ing

(%)

sleep transition time(µsec)

µDPM µDPM w/ criticality-awareness

29

0

10

20

30

0 20 40 60 80 100

Ene

rgy

Sav

ing

(%)

VFS transition time (µsec)

Rubik µDPM µDPM w/ criticality-awareness

Page 58: µDPM: Dynamic Power Management for the ... - Daniel Wong

HPCA 2019

Conclusion› Microsecond service times present challenges for Dynamic

Power Management

› Careful coordination of DVFS, Sleep and Request delaying can achieve savings with µs service times

› µDPM is able to save ~2x energy compared to state-of-the-art techniques

30

Page 59: µDPM: Dynamic Power Management for the ... - Daniel Wong

µDPM: Dynamic Power Management for the Microsecond Era

Thank you!

Chih-Hsun [email protected]

Laxmi N. Bhuyan [email protected]

Daniel [email protected]