Leakage Pres

  • 8/10/2019 Leakage Pres

    1/44

    Leakage Modeling and Reduction

    Amit Agarwal, Lei He et al.

    Presenters: Qun Gu

    Ho-Yan Wong

    Courtesy of Lei He


    Outline

    Introduction

    Circuit level leakage reduction

    System level leakage reduction

    Coupled leakage and thermal simulation and management


    Power Trends

    Circuit Power

    Dynamic power: determined by circuit performance requirements, etc. Its percentage of total power is getting smaller.

    Short-circuit power: both the pull-up (PU) and pull-down (PD) circuits partially conduct. A small percentage of total power.

    Leakage Power Sources

    Subthreshold leakage
    Reverse-biased junction band-to-band tunneling (BTBT) leakage
    Gate leakage

    [Figure: MOSFET cross-section (gate, source, drain, n+ regions, bulk) marking the subthreshold leakage, gate leakage, and reverse-biased junction BTBT paths.]


    Leakage Dependences

    Circuit Techniques to Reduce Leakage

    Design Time Techniques
        Dual threshold CMOS

    Run Time Techniques
        Standby Leakage Reduction Techniques
            Natural Transistor Stacks
            Sleep Transistor (MTCMOS)
            Forward/Reverse Body Biasing (VTCMOS)
        Active Leakage Reduction Techniques
            Dynamic Vth Scaling (DVTS)

    Dual Threshold CMOS

    How?
        Low Vth for critical paths
        High Vth for non-critical paths

    Concerns:
        This is not so straightforward to do. Sometimes a trade-off exists between high Vth and low Vth applications.
        Varying Vth is not always successful at low supply voltages.
        Increasing the number of critical paths will sometimes hurt circuit performance.

    Approaches to adjusting Vth in fabrication:
        Adjustment of tox (the higher tox, the higher Vth)

    Natural Transistor Stacks

    How?
        Reduce the leakage by stacking the devices.

    Concerns:
        Trade-off between speed and power
        The reduction is determined by the data pattern
        Trade-off with other leakage power (gate leakage)

    Sleep Transistor (MTCMOS)

    How?
        Insert an extra series-connected transistor (a sleep transistor with high Vth) in the PU/PD path of a gate and turn it off in the standby mode of operation.

    Disadvantages:
        Increases area and delay
        Data retention problem
        Hard to turn on completely at very low supply voltages

    Improvements for MTCMOS

    Virtual power/ground Rails Clamp (VRC)
        Solves the data retention problem with diodes
        Virtual level changes are clamped
        Allows data to be retained in SRAM arrays

    Alternative: Super cutoff CMOS (SCCMOS), with low Vth
        In standby mode, the PMOS gate is at Vcc+0.4V and the NMOS gate is at Vss-0.4V to fully cut off leakage.

    Forward/Reverse Body Biasing (VTCMOS)

    RBB (Reverse Body Bias): zero body bias in active mode, a deep reverse bias in standby mode.
    Disadvantages:
        Increases PN junction reverse leakage
        Technology scaling worsens short channel effects and weakens the Vth modulation capability

    FBB (Forward Body Bias): high Vth in standby mode, forward body biasing to achieve better current drive in active mode.
    Disadvantages:
        Larger junction capacitance
        High body effect for stacked devices

    Technology improvement for high Vth:
        Different doping profile
        Higher work function materials

    Dynamic Vth Scaling (DVTS)

    The lowest Vth is delivered (NBB, no body bias) if the highest performance is required.
    When the performance demand is low, the clock frequency is lowered and Vth is raised via RBB to reduce the run time leakage power dissipation.

    How?
        When the critical path replica frequency is less than the reference CLK, adjust the bias to decrease Vth.
        Otherwise adjust the bias to increase Vth.
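    The DVTS rule above can be sketched as a single control step. This is only an illustration: the function name, the step size, and the bias limit are assumptions, and the body bias is represented as a reverse-bias magnitude where 0 V stands for NBB.

```python
def dvts_step(replica_freq_hz, ref_clk_hz, reverse_bias_v,
              step_v=0.05, max_reverse_bias_v=1.0):
    """Return the next reverse body bias (volts) for one control iteration.

    0.0 V means no body bias (NBB, lowest Vth); larger values mean a deeper
    reverse body bias (RBB), i.e. a higher Vth and less leakage.
    step_v and max_reverse_bias_v are illustrative values, not from the slides.
    """
    if replica_freq_hz < ref_clk_hz:
        # Replica is slower than the reference clock: relax the bias to decrease Vth.
        return max(0.0, reverse_bias_v - step_v)
    # Replica keeps up with the reference clock: deepen the bias to increase Vth.
    return min(max_reverse_bias_v, reverse_bias_v + step_v)


# Example: demand drops, so the reference clock is lowered below the replica speed.
# (A real replica would slow down as Vth rises; it is held fixed here for brevity.)
bias = 0.0
for _ in range(3):
    bias = dvts_step(replica_freq_hz=1.0e9, ref_clk_hz=0.5e9, reverse_bias_v=bias)
print(round(bias, 2))
```

    In hardware this comparison runs continuously, so the bias settles where the replica just meets the reference clock.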


    Process Variation and Leakage

    IDSAT and IOFF variation measured (150nm process).

    Variation sources:
        Channel length
        Transistor width
        Oxide thickness
        Flat-band voltage
        Random dopant effect

    The effects of a larger spread of leakage:
        Robustness of logic circuits
        Circuit design margin

    Circuit techniques for compensating process variation:
        Adaptive body biasing for process compensation
        Process variation compensation in dynamic circuits

    Adaptive Body Biasing for Process Compensation

    Due to the worsening parameter fluctuation:
        Some dies may not meet the target frequency.
        Others exceed the leakage power constraints.

    How?
        The slow dies which fail to meet the desired frequency can be forward body biased to improve performance, at the cost of more leakage power.
        On the other hand, dies with excess leakage can be reverse body biased to meet the leakage power specifications.

    Effects:
        Adaptive body bias reduces the spread of the die frequency distribution by 7X, compared to a conventional zero body bias.
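    A minimal sketch of the per-die decision described above. The function name, the return labels, and the "reject" outcome for a die that is both slow and leaky are assumptions made for illustration; the slides do not say how such a die is handled.

```python
def choose_body_bias(die_freq_hz, target_freq_hz, die_leakage_w, leakage_limit_w):
    """Pick a body-bias direction for one die from its measured speed and leakage."""
    too_slow = die_freq_hz < target_freq_hz
    too_leaky = die_leakage_w > leakage_limit_w
    if too_slow and too_leaky:
        # Assumption: one bias direction cannot fix both problems at once.
        return "reject"
    if too_slow:
        return "FBB"  # forward body bias: faster, but leaks more
    if too_leaky:
        return "RBB"  # reverse body bias: leaks less, but slower
    return "ZBB"      # zero body bias: the die already meets both targets


print(choose_body_bias(0.9e9, 1.0e9, 5.0, 10.0))
```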

    Process Variation Compensation in Dynamic Circuits (I)

    Dynamic circuits need keepers to compensate for leakage current and keep data.

    Considerations for keeper size:
        An unnecessarily large keeper will hurt circuit performance
        Dies with excess leakage cannot meet the robustness requirements without a large enough keeper

    Programmable keeper size scheme:
        A desired effective keeper width can be chosen among {0, W, 2W, 7W} according to the control bit.

    Process Variation Compensation in Dynamic Circuits (II)

    Simulation results:
        5X reduction in the number of dies failing robustness and 10% improvement in average performance.
        Variation spread of the robustness and delay distributions is reduced by 55% and 35%, respectively.

    System Level Leakage Reduction

    Motivation
    Leakage characteristics and reduction
    Coupled leakage and thermal simulation and management
        Power and thermal simulation
        Dynamic power and thermal management
        Vdd scaling with cooling selection

    Motivation

    Leakage current has increased due to scaling in Vt, L, and tox
    Leakage power becomes more important due to high leakage devices and low activity rates
    Leakage power depends greatly on temperature

    Power States at System Level

    3 power states defined at system level:
    1. Active mode: circuit in operation; P = Pd + Ps
    2. Standby mode: circuit is idle but ready to execute; P = Ps
    3. Inactive mode: circuit is deactivated by leakage reduction techniques; P < Ps

    System Level Leakage Power Modeling

    Early model:
        Ps = Vdd * NFET * kdesign * Ileakage

    Later model, with application of 2 leakage power reduction techniques (later):
        Ps = Vdd * Ngate * Iavg
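    The two models are plain products, so they are easy to evaluate directly. The sketch below uses hypothetical function names and made-up input values (gate count and average leakage current) purely to show the arithmetic.

```python
def static_power_early(vdd, n_fet, k_design, i_leakage):
    """Early model: Ps = Vdd * NFET * kdesign * Ileakage (watts)."""
    return vdd * n_fet * k_design * i_leakage


def static_power_later(vdd, n_gate, i_avg):
    """Later model: Ps = Vdd * Ngate * Iavg (watts)."""
    return vdd * n_gate * i_avg


# Illustrative inputs only: 1.2 V supply, one million gates, 10 nA average per gate.
ps = static_power_later(vdd=1.2, n_gate=1_000_000, i_avg=10e-9)
print(f"{ps * 1e3:.1f} mW")
```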

    Leakage Power Characteristics

    Minimum Idle Time (M.I.T.)
        M.I.T. = {Es-i + Ei-s - Pi * (ts-i + ti-s)} / (Ps - Pi)

    Idle Period
        Leakage power reduction is useful only when Idle Period > M.I.T.
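    A short sketch of the break-even calculation, reading the formula as (Es-i + Ei-s - Pi * (ts-i + ti-s)) / (Ps - Pi), where Es-i and Ei-s are the transition energies between standby and inactive mode and ts-i and ti-s the transition times. The function name and the example numbers are assumptions.

```python
def minimum_idle_time(e_si, e_is, t_si, t_is, p_s, p_i):
    """Idle time at which going inactive costs the same energy as staying in standby.

    e_si, e_is: transition energies (J); t_si, t_is: transition times (s);
    p_s: standby power (W); p_i: inactive power (W), which must be below p_s.
    """
    if p_i >= p_s:
        raise ValueError("inactive power must be lower than standby power")
    return (e_si + e_is - p_i * (t_si + t_is)) / (p_s - p_i)


# Illustrative numbers: 1 uJ and 1 us per transition, 1 W standby, 0.1 W inactive.
mit = minimum_idle_time(1e-6, 1e-6, 1e-6, 1e-6, p_s=1.0, p_i=0.1)
idle_period = 5e-6
print(idle_period > mit)  # shut down only if the idle period exceeds the M.I.T.
```

    At an idle period equal to the M.I.T., the standby energy Ps * T matches the transition energy plus the inactive energy spent outside the transitions, which is the break-even point the formula encodes.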

    Runtime Leakage Reduction for Caches

    Caches dissipate a large amount of leakage power due to large SRAM array structures
    Different techniques have been developed to reduce L1 cache Ps, e.g. DRI, SWAY
    The basic principle is to dynamically turn off part of the cache array structure

    Ps Reduction for L2 Caches

    L2 cache has a much larger miss penalty, so the approach for L1 cannot be directly applied
    Use VRC to reduce Ps, and use time-out based control mechanisms to shut down the L2 cache data portion
    Time-out threshold could be fixed (FTO), dynamic, or set by feedback control (FCTO)

    Ps Reduction for L2 Caches (contd)

    FTO: time-out threshold is set as M.I.T.

    FCTO: adjust the time-out threshold with the proportional-integral (PI) feedback controller
        Update the time-out threshold according to:
            N: L2 cache miss rate in previous time window
            Told: time-out threshold in previous time window
        New time-out threshold T = Told + (N - Setpoint) * Gain
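    The FCTO update can be sketched as one function applied once per time window, reading the rule as T = Told + (N - Setpoint) * Gain. The function name, the lower clamp, and the numbers in the example are assumptions for illustration.

```python
def update_timeout(t_old, n_miss, setpoint, gain, t_min=0):
    """Return the new time-out threshold from the previous window's miss count.

    More misses than the setpoint raise the threshold (shut down less eagerly);
    fewer misses lower it. t_min is an assumed lower clamp.
    """
    return max(t_min, t_old + (n_miss - setpoint) * gain)


# Illustrative run: setpoint of 10 misses per window, gain of 50 cycles per miss.
threshold = 1000
for misses in [12, 12, 9, 4]:
    threshold = update_timeout(threshold, misses, setpoint=10, gain=50)
    print(threshold)
```

    Because each window's correction is added to the previous threshold, the error is accumulated over time, which is what gives the loop its integral behaviour.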

    Circuits for FCTO

    [Figure: block diagram of the FCTO circuits. The request address (tag, index, block offset) accesses the tag portion and data portion; the tag match check produces the hit/miss signal and a mux selects the data word. The timeout controller (counter compared against the threshold register) generates the wakeup/shutdown signals. The threshold controller computes the threshold output from Nmiss, the setpoint, and the gain.]

    Comparison of L2 Leakage Reduction

                 Power reduction (%)             Performance penalty (%)
    Benchmark    FTO     FCTO    SWAY    DRI     FTO     FCTO    SWAY    DRI
    go           52.21   63.80   57.55   56.79   1.06    1.10    9.95    7.39
    li           12.92   27.87   26.64   26.56   0.93    1.07    7.28    7.71
    equake       35.75   48.61   46.40   45.71   0.84    1.01    9.73    10.58
    art          0.07    2.20    2.17    2.18    0.37    0.92    3.18    3.14

    Time-out schemes (FTO and FCTO) achieve a much smaller performance penalty
    Targeting a 1% performance loss, FCTO obtains more power reduction than FTO does.


    Temperature Aware Computing

    [Figure: temperature-aware computing flow. Inputs: initial conditions (T, delay), uArch, floorplan, packaging, and workload (e.g. Spec 2k). Blocks: performance simulator (e.g. SimpleScalar, IMPACT); dynamic power estimation (e.g. Wattch); leakage estimation; coupled power and thermal simulator (e.g. PTscalar, PowerImpact); temperature-aware architecture techniques (DVS, DTM, reconfigurability, power model, GALS, etc). Output: adjusted conditions (T, delay).]

    Leakage Model with Temperature Scaling

    Exponential scaling based on BSIM3v3

    Logic circuits in ITRS 100nm technology

    Memory units in ITRS 100nm technology

    [Equations: leakage power P(T, Vdd) for logic circuits, of the form Ngates * Vdd * Iavg with an exp(...) scaling term, and for memory units, scaling with words * wordsize with an exp(...) term. The fitted coefficients are not legible in this copy.]


    Thermal Modeling

    For the lumped RC thermal circuit:
        Thermal resistance Rth: the ability to remove heat to the ambient in steady-state condition
        Thermal capacitance Cth: captures the delay between a change in power and the corresponding change in temperature
        Thermal time constant = Rth * Cth

    A distributed model is needed for an accurate solution
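    A minimal sketch of the lumped RC thermal circuit, assuming the usual single-node form Cth * dT/dt = P - (T - Tamb) / Rth integrated with forward Euler steps. The function names and all numeric values are illustrative assumptions, not taken from the slides.

```python
def thermal_step(temp_c, power_w, r_th, c_th, t_amb_c, dt_s):
    """Advance Cth * dT/dt = P - (T - Tamb) / Rth by one forward Euler step."""
    d_temp = (power_w - (temp_c - t_amb_c) / r_th) / c_th
    return temp_c + dt_s * d_temp


def simulate(power_w, r_th, c_th, t_amb_c, duration_s, dt_s):
    """Start at ambient temperature and return the temperature after duration_s."""
    temp_c = t_amb_c
    for _ in range(int(round(duration_s / dt_s))):
        temp_c = thermal_step(temp_c, power_w, r_th, c_th, t_amb_c, dt_s)
    return temp_c


R_TH, C_TH = 0.8, 10.0   # assumed values, in degC/W and J/degC
tau = R_TH * C_TH        # thermal time constant = Rth * Cth
final = simulate(power_w=50.0, r_th=R_TH, c_th=C_TH, t_amb_c=45.0,
                 duration_s=10 * tau, dt_s=0.004 * tau)
print(round(final, 2))   # approaches the steady state Tamb + P * Rth
```

    After several time constants the temperature settles near Tamb + P * Rth, which is the steady-state role of Rth; Cth only sets how quickly that value is reached.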

    Coupled Power and Thermal Simulation

    A simulation time step ts < 0.5% of the time constant (~10^6 cycles) gives negligible temperature and power calculation errors
    Clock gating reduces dynamic power and also leakage energy
    Leakage energy changes with operating temperature

    Leakage Power at Different Temperature

    uP similar to DEC Alpha 21264 and with clock gating
    Leakage differs by up to 2X between 80 °C and 110 °C. It differs for different applications too.
    Coupled thermal and power simulation is a must

    [Figure: normalized total power (0% to 100%), split into dynamic power and leakage power, at temperatures of 35, 85, and 110 °C and "Dep", for benchmark art and benchmark gcc; 100nm, 3.33GHz, 1.2V.]

    Thermal Runaway

    Thermal runaway is caused by the positive feedback loop between on-resistance, temperature, and power
    It is also a result of the interaction between leakage power and temperature
        Component temperature rises -> leakage power rises exponentially -> temperature rises further
        If cooling is not adequate, both keep increasing

    Thermal Runaway (contd)

    Assuming no throttling and constant power consumption, the condition for thermal runaway is equivalent to d^2T/dt^2 > 0
    The lowest temperature that meets the thermal runaway (TR) criterion is the runaway temperature
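    One way to make the criterion concrete is a sketch under two assumptions that are not from the slides: a single-node thermal model C * dT/dt = Pdyn + Pleak(T) - (T - Tamb) / R, and a simple exponential leakage fit Pleak(T) = P0 * exp(k * (T - T0)). Differentiating gives d^2T/dt^2 = (dPleak/dT - 1/R) * (dT/dt) / C, so while the chip is heating the second derivative turns positive once dPleak/dT exceeds 1/R. All names and numbers below are hypothetical.

```python
import math


def leakage_power(temp_c, p0_w, k_per_c, t0_c):
    """Assumed exponential fit: leakage is p0_w at t0_c and grows as exp(k * dT)."""
    return p0_w * math.exp(k_per_c * (temp_c - t0_c))


def runaway_temperature(p0_w, k_per_c, t0_c, r_th):
    """Lowest temperature where dPleak/dT reaches 1/Rth under the assumed model."""
    return t0_c + math.log(1.0 / (r_th * p0_w * k_per_c)) / k_per_c


# Illustrative values: 10 W of leakage at 80 degC, doubling every 30 degC, Rth = 0.8.
K = math.log(2.0) / 30.0
t_run = runaway_temperature(p0_w=10.0, k_per_c=K, t0_c=80.0, r_th=0.8)
print(round(t_run, 1))
```

    Lowering the thermal resistance pushes this temperature up, which matches the statement that runaway sets in when cooling is not adequate.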

    Dynamic Power and Thermal Management (DPTM)

    Goal: maximize throughput subject to the maximum on-chip temperature constraint

    For each time window = X cycles, stop or throttle instruction fetch in cycles

    0

    Dynamic Power and Thermal Management (DPTM)

    Fetch toggling toggles the I-cache, I-TLB, branch prediction, and decode units
    Dynamic frequency scaling (DFS) and dynamic voltage scaling (DVS) adjust the clock frequency and Vdd
    stall
    Activity migration moves activities to another copy of the component at a lower temperature

    Need for Temperature Dependent Leakage Model

    Dynamic thermal management using fetch toggling with a PI feedback controller
    Implemented 2 models: simple (fixed Ps) and accurate (Ps is temperature dependent)

    Validation of PI-based DPTM

    Compared with two practices:
        No dynamic management
            Lower Vdd to avoid thermal violations
        Cooling down
            If reaching the thermal threshold, stop the whole processor until the maximum temperature is X °C lower than the threshold
            X = 5 in our experiments
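    The cooling-down baseline amounts to a threshold with hysteresis. A minimal sketch, with a hypothetical function name and a made-up temperature trace:

```python
def cooling_down_policy(running, max_temp_c, threshold_c, x_c=5.0):
    """Return True if the processor should run during the next interval.

    Stop when the maximum temperature reaches the threshold; resume only once
    it is at least x_c degrees below the threshold.
    """
    if running and max_temp_c >= threshold_c:
        return False
    if not running and max_temp_c <= threshold_c - x_c:
        return True
    return running


# Illustrative trace of maximum on-chip temperature with an 80 degC threshold.
running = True
states = []
for temp in [78, 80, 79, 76, 75, 77]:
    running = cooling_down_policy(running, temp, threshold_c=80.0)
    states.append(running)
print(states)
```

    The gap of X degrees keeps the processor from toggling on and off around the threshold, at the price of idle time that a feedback controller can avoid.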

    System Performance

    DPTM by feedback control may improve throughput by up to 11% compared to the no DPTM case
    DPTM allows designing for the common workload but not the worst case => thermal speculation

    [Figure: throughput (BIPS, 2.0 to 5.5) versus Vdd (1 to 1.3 V) for feedback control (max T = 80C), simple cooling down (max T = 80C), and no management (max T = 110C), with the maximum throughput marked.]

    Active Cooling

    Direct water-spray cooling
        Thermal resistance of 0.067 compared to 0.8 for a conventional heatsink

    Microchannel with liquid coolant,

    Impacts of Water Cooling

    Increases the maximum throughput by 30%
    Improves power efficiency by 9% and slows down the decay of power efficiency

    [Figure: throughput (BIPS, 0 to 7) and power efficiency (BIPS/W, 0 to 0.4) versus Vdd (1 to 1.8 V) for water cooling (max T = 60 °C) and air cooling (max T = 80 °C).]

    References

    Amit Agarwal et al., Leakage Mechanisms and Leakage Control for Nano-Scale CMOS Circuits, Purdue University.

    Lei He et al., System Level Leakage Reduction Considering the Interdependence of Temperature and Leakage, UCLA.