Scheduling for Reduced CPU Energy
M. Weiser, B. Welch, A. Demers and S. Shenker
Appears in "Proceedings of the First Symposium on Operating Systems Design and Implementation," Usenix Association, November 1994
Presented by Ricardo Carrano for Sistemas de Tempo Real e Embarcados (Real-Time and Embedded Systems), IC / UFF



Abstract

• Introduces MIPJ (millions of instructions per joule), a metric for CPU energy performance
• Examines a class of methods to improve MIPJ based on dynamic control of the system clock speed by the OS scheduler
• Question: What are the right scheduling algorithms for taking advantage of reduced clock speed, especially in the presence of applications demanding ever more instructions per second?
• Result: Adjusting clock speed at a fine grain saves substantial CPU energy with little impact on performance


Outline

• An Energy Metric for CPUs
• Motivation
• The experiment
• Trace Data
• Assumptions and Simulations
• Three algorithms (OPT, FUTURE and PAST)
• Evaluating the Algorithms
• Conclusions


Motivation: components’ energy use

• Dominated by display and disk
• But CPU is significant
• Common approach (at the time):
  • Power down when idle
• Proposed (new) approach:
  • Minimize idle time


An Energy Metric for CPUs

• MIPJ: new metric for CPU energy performance
• MIPJ = MIPS / watts
  • MIPS stands for any workload-per-time benchmark
• Examples
  − 1984: 2-MIPS 68020 at 2 W → MIPJ: 1
  − 1994: 200-MIPS Alpha at 40 W → MIPJ: 5
  − Laptops: Motorola 68349, 6 MIPS at 300 mW → MIPJ: 20
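
The arithmetic behind these examples can be checked with a short Python sketch. Since a watt is one joule per second, MIPS divided by watts gives millions of instructions per joule. The function name `mipj` and the dictionary layout are illustrative choices, not something taken from the paper.

```python
def mipj(mips: float, watts: float) -> float:
    """Millions of instructions per joule: MIPS divided by power in watts."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return mips / watts


# (MIPS, watts) pairs quoted on the slide.
examples = {
    "1984 68020": (2.0, 2.0),
    "1994 Alpha": (200.0, 40.0),
    "Motorola 68349": (6.0, 0.3),
}

for name, (mips, watts) in examples.items():
    print(f"{name}: {mipj(mips, watts):.0f} MIPJ")
```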


More recent data: Energy per Instruction

Source: "Energy per Instruction Trends in Intel® Microprocessors", Grochowski and Annavaram, Microarchitecture Research Lab, Intel Corporation
• "Core Duo and Pentium M reverse the trend towards ever-greater EPI" (but the arrow still points up)
• "(…) improving IPC has emerged as the more energy-efficient of the two techniques" (as opposed to increasing frequency).


How to improve MIPJ?

• Other things equal, MIPJ is unchanged by changes in clock speed.
  • Reducing clock speed causes a linear reduction in energy consumption, but also a linear reduction in MIPS → the two cancel
• But a reduced clock speed creates an opportunity for quadratic energy savings
  • Clock speed reduced by n → energy per cycle can be reduced by n^2.
• So dynamic control of the system clock speed by the OS scheduler does save energy
• Ways to obtain the quadratic savings: reducing voltage, reversible logic, adiabatic logic
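
The cancellation and the quadratic saving can be illustrated with a small Python sketch. It assumes that energy per cycle stays constant when only the clock is slowed, and scales with the square of the relative speed when the voltage is lowered along with it. The function names and the 100-cycle workload are hypothetical.

```python
def energy_fixed_voltage(cycles: float, speed: float) -> float:
    """Clock slowed, voltage unchanged: energy per cycle stays at 1.0.

    `speed` is deliberately unused: the work takes longer, but the total
    energy (and therefore MIPJ) does not change.
    """
    return cycles * 1.0


def energy_scaled_voltage(cycles: float, speed: float) -> float:
    """Voltage lowered with the clock: energy per cycle scales as speed**2."""
    return cycles * speed ** 2


def runtime(cycles: float, speed: float) -> float:
    """Time to finish `cycles` of work at relative speed `speed` (1.0 = full)."""
    return cycles / speed


cycles = 100.0
for speed in (1.0, 0.5):
    print(
        f"speed {speed}: runtime {runtime(cycles, speed)}, "
        f"fixed-voltage energy {energy_fixed_voltage(cycles, speed)}, "
        f"scaled-voltage energy {energy_scaled_voltage(cycles, speed)}"
    )
```

Under these assumptions, halving the speed doubles the runtime and cuts the energy for the same work to a quarter.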


Adiabatic Logic

Benjamin Gojman (August 8, 2004)
• As circuits get smaller and faster, their energy dissipation greatly increases
  • a problem that adiabatic circuits promise to solve
• Adiabatic process → no heat is exchanged with the surroundings.
• Term given to low-power electronic circuits that implement reversible logic.
• CMOS technology dissipates energy as heat mostly when switching.
• There are two fundamental rules CMOS adiabatic circuits must follow:
  • never turn on a transistor when there is a voltage difference between the drain and source.
  • never turn off a transistor that has current flowing through it.


The experiment

• Simulations over real traces
• Lengthen the runtime of individually scheduled segments of the trace in order to eliminate idle time.
• The idea is to stretch runtime into idle time


The experiments: Trace Data

• Taken from UNIX workstations
• Over periods of up to several hours on a work day
• Workload includes software development, documentation, e-mail, simulation, etc.
• Other traces taken during specific workloads


The experiment: assumptions (1/2)

• No reordering of tasks
• Sleep events classified into hard and soft
  • Disk request times are hard (non-deterministic): runtime cannot be stretched into them
  • Waits for keystrokes, for example, are soft: runtime can be stretched into them
• No energy consumption when idle
• Energy per instruction is proportional to n^2 when running at speed n
  • n varies between the minimum relative speed and 1.0 (full speed)


The experiment: assumptions (2/2)

• Switching speeds takes no time
• Periods when the machine would be turned off by power saving are skipped
• Lower bound on practical speed (5 V = full speed):
  • 0.2, 0.44 or 0.66, corresponding to 1.0, 2.2 and 3.3 V
• Speed is adjusted linearly with voltage
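
The mapping from minimum voltage to minimum relative speed follows directly from the linear assumption, as this Python sketch shows. The constant `FULL_SPEED_VOLTS` and the function name are illustrative. The energy factor printed uses the n^2 assumption from the previous slide.

```python
FULL_SPEED_VOLTS = 5.0  # from the slide: 5 V corresponds to full speed (1.0)


def min_relative_speed(min_volts: float) -> float:
    """Relative speed at a given voltage, assuming speed is linear in voltage."""
    if not 0 < min_volts <= FULL_SPEED_VOLTS:
        raise ValueError("voltage must be in (0, 5.0]")
    return min_volts / FULL_SPEED_VOLTS


for volts in (1.0, 2.2, 3.3):
    n = min_relative_speed(volts)
    # Energy per instruction relative to full speed, assuming it scales as n**2.
    print(f"{volts:.1f} V -> speed {n:.2f}, energy per instruction x{n * n:.3f}")
```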


Scheduling algorithms

• Three types of scheduling
  • OPT: unbounded-delay, perfect-future
  • FUTURE: bounded-delay, limited-future
  • PAST: bounded-delay, limited-past


OPT

• Takes the entire trace
• Stretches all the runtimes to fill all the idle times
• Off periods (more than 90% idle over 30 s) are not available for stretching
• Impractical: requires future knowledge
• Undesirable: large delays
  • no regard for interactivity
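
A minimal Python sketch of the OPT idea, under simplifying assumptions that are not taken from the paper: the trace is a list of hypothetical `(run_ms, idle_ms)` pairs measured at full speed, off periods have already been removed, and a single constant speed is chosen for the whole trace. Energy is assumed to scale as the square of the relative speed.

```python
def opt_speed(segments, min_speed=0.2):
    """Single speed that stretches all run time across all idle time.

    segments: list of (run_ms, idle_ms) pairs measured at full speed.
    The result is clamped to the range [min_speed, 1.0].
    """
    total_run = sum(run for run, _ in segments)
    total_idle = sum(idle for _, idle in segments)
    if total_run + total_idle == 0:
        return min_speed
    speed = total_run / (total_run + total_idle)
    return min(1.0, max(min_speed, speed))


def energy_saving(speed):
    """Fraction of energy saved versus full speed, assuming energy ~ speed**2."""
    return 1.0 - speed ** 2


trace = [(10, 30), (20, 20)]  # hypothetical trace: 30 ms of work, 50 ms idle
speed = opt_speed(trace)
print(f"speed {speed}, saving {energy_saving(speed):.1%}")
```

Because work from the start of the trace may finish arbitrarily late, this captures why OPT gives an upper bound on savings but ignores interactive response.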


FUTURE

• Like OPT, but peers only a small window into the future
• Stretches runtime into idle time only within this window
• With a window size of 10 to 50 ms, interactive response remains good
• Impractical: requires future knowledge
• Desirable: limited delay
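
The per-window behaviour of FUTURE can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's simulator: each window is a hypothetical `(run, idle)` pair at full speed, all idle time inside a window is treated as soft (stretchable), and energy scales as the square of the relative speed.

```python
def future_speeds(windows, min_speed=0.2):
    """Pick, for each window, the slowest speed that still finishes its work.

    windows: list of (run, idle) amounts per window, measured at full speed.
    Work is never pushed past its own window, so delay stays bounded.
    """
    speeds = []
    for run, idle in windows:
        total = run + idle
        needed = run / total if total > 0 else 0.0
        speeds.append(min(1.0, max(min_speed, needed)))
    return speeds


def total_energy(windows, speeds):
    """Energy for the whole trace, assuming energy per unit of work ~ speed**2."""
    return sum(run * s ** 2 for (run, _), s in zip(windows, speeds))


windows = [(10, 10), (5, 15), (20, 0)]  # hypothetical 20 ms windows
speeds = future_speeds(windows)
baseline = sum(run for run, _ in windows)  # energy at full speed
print(speeds, total_energy(windows, speeds), baseline)
```

Busy windows still run at full speed, which is one reason a window-limited scheduler saves less than OPT.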


PAST

• Practical version of FUTURE
• Looks at a fixed window into the past
• Assumes the next window will be like the previous one
• The algorithm follows...


PAST algorithm

• Processes the previous window and computes:
  • run_cycles: number of non-idle CPU cycles
  • idle_cycles: idle CPU cycles, hard and soft
  • excess_cycles: cycles left over because we ran too slow
  • run_percent = run_cycles / (idle_cycles + run_cycles)
• Adjusts speed accordingly:
  • if excess_cycles > idle_cycles → newspeed = 1.0
  • elseif run_percent > 0.7 → newspeed = speed + 0.2
  • elseif run_percent < 0.5 → newspeed = speed - (0.6 - run_percent)
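
The speed-setting rule above translates almost line by line into Python. Two details in this sketch are assumptions rather than something the slide states: the speed is left unchanged when run_percent falls between 0.5 and 0.7, and the result is clamped to the range [min_speed, 1.0], in line with the earlier assumption that n varies between the minimum relative speed and 1.0. The function name and the `min_speed` default are illustrative.

```python
def past_new_speed(speed, run_cycles, idle_cycles, excess_cycles, min_speed=0.2):
    """Choose the speed for the next window from the previous window's stats."""
    total = run_cycles + idle_cycles
    run_percent = run_cycles / total if total > 0 else 0.0

    if excess_cycles > idle_cycles:
        new_speed = 1.0                            # fell too far behind: full speed
    elif run_percent > 0.7:
        new_speed = speed + 0.2                    # busy window: speed up
    elif run_percent < 0.5:
        new_speed = speed - (0.6 - run_percent)    # mostly idle: slow down
    else:
        new_speed = speed                          # assumption: keep current speed

    # Assumption: clamp to the allowed range of relative speeds.
    return min(1.0, max(min_speed, new_speed))


# Hypothetical previous-window statistics (in cycles).
print(past_new_speed(0.5, run_cycles=80, idle_cycles=20, excess_cycles=0))
print(past_new_speed(0.5, run_cycles=10, idle_cycles=90, excess_cycles=0))
print(past_new_speed(0.5, run_cycles=10, idle_cycles=20, excess_cycles=50))
```

The asymmetry is visible in the rule itself: speed rises in fixed steps of 0.2 but can drop by up to 0.6 in one window when the previous window was almost entirely idle.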


Evaluating the Algorithms

[Figure: comparison by algorithm and minimum speed allowed]

PAST beats FUTURE because excess cycles are deferred


Penalty at 20ms

[Figure: excess cycles built up per 20 ms interval, measured as the time it would take to execute them at full speed]

Most intervals have no excess cycles


Penalty at 2.2V

The peak shifts right as the interval length increases


PAST (Min Volts, 20ms)

Minimum speed does not always result in the minimum energy

2.2 V is almost as good as 1.0 V

Trace: Kestrel march 1


PAST (2.2V vs. Interval)

Longer adjustment periods result in more savings


Excess Cycles

Lower minimum voltage → more excess cycles


Longer interval → more excess cycles


Conclusions (1/2)

• PAST, with a 50 ms window, saves energy:
  • up to 50% under conservative assumptions (3.3 V)
  • up to 70% under more aggressive assumptions (2.2 V)
• Savings depend on the interval between speed adjustments.
  • too fine: less power saved (CPU usage is bursty).
  • too coarse: excess cycles built up during a slow interval adversely affect interactive response.
  • an interval of 20 or 30 milliseconds is a good compromise between power savings and interactive response.


Conclusions (2/2)

• Too low a minimum speed → less efficiency
  • more excess cycles → must speed up to catch up.
• If an effective way of predicting workload can be found, then significant power can be saved
  • by adjusting the processor speed at a fine grain so it is just fast enough to accommodate the workload.
• The tortoise is more efficient than the hare:
  • better to spread work out by lowering the clock speed (and voltage) than to run the CPU at full speed for short bursts and then idle.
• But QoS is not actually taken into account
  • The hard/soft idle classification gives no guarantees for real-time (RT) systems