CSV881: Low-Power Design
Gate-Level Power Optimization
Lectures 10, 11, 12: Gate-level optimization

Vishwani D. Agrawal
James J. Danaher Professor
Dept. of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
[email protected]
http://www.eng.auburn.edu/~vagrawal

Copyright Agrawal, 2011

Components of Power
Dynamic
  Signal transitions
    Logic activity
    Glitches
  Short-circuit (often neglected)
Static
  Leakage

Power of a Transition
[Figure: switch-level model of a CMOS gate between VDD and ground, with two resistances R, input Vi, output Vo, load capacitance CL, and short-circuit current isc.]
Dynamic power = $C_L V_{DD}^2/2 + P_{sc}$

Dynamic Power
Each transition of a gate consumes $CV^2/2$.
Methods of power saving:
  Minimize load capacitances
    Transistor sizing
    Library-based gate selection
  Reduce transitions
    Logic design
    Glitch reduction

Glitch Power Reduction
Design a digital circuit for minimum transient energy consumption by eliminating hazards.

Theorem 1
For correct operation with minimum energy consumption, a Boolean gate must produce no more than one event per transition.
  Output logic state changes: one transition is necessary.
  Output logic state unchanged: no transition is necessary.

Event Propagation
[Figure: events propagating through a small gate network along paths P1, P2 and P3, annotated with gate delays and event times.]
  A single lumped inertial delay is modeled for each gate.
  Primary input (PI) transitions are assumed to occur without time skew.

Inertial Delay of an Inverter
$d = \dfrac{d_{HL} + d_{LH}}{2}$
[Figure: Vin and Vout waveforms versus time, marking the delays $d_{HL}$ and $d_{LH}$.]

Multi-Input Gate
DPD: differential path delay
[Figure: gate with inputs A and B, output C, and delay d < DPD. Waveforms A, B, C: the input events are DPD apart, and C shows a hazard or glitch.]

Balanced Path Delays
[Figure: the same gate, with delay d < DPD, and a delay buffer (labeled DPD) inserted in one input path. Waveforms A, B, C: no glitch at C.]

Glitch Filtering by Inertia
[Figure: gate with inputs A and B, output C, and delay d > DPD. Waveforms A, B, C: because d > DPD, the glitch at C is filtered.]

Theorem
Given that events occur at the input of a gate, whose inertial delay is $d$, at times $t_1 \le \ldots \le t_n$, the number of events at the gate output cannot exceed
$\min\left(n,\; 1 + \dfrac{t_n - t_1}{d}\right)$
[Figure: time axis marking input events at $t_1, t_2, t_3, \ldots, t_n$ and the interval $t_n - t_1$.]
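The bound in this theorem is easy to evaluate numerically. The sketch below is only an illustration: the function name is hypothetical, and rounding the quotient down is an assumption made here because an event count must be an integer.

```python
import math


def max_output_events(event_times, d):
    """Upper bound on the number of output events of a gate.

    event_times: arrival times t_1..t_n of the input events (any order).
    d: inertial delay of the gate, in the same time unit.
    """
    if d <= 0:
        raise ValueError("inertial delay d must be positive")
    if not event_times:
        return 0
    times = sorted(event_times)
    n = len(times)
    span = times[-1] - times[0]  # t_n - t_1
    # Assumption: round the quotient down, since an event count is an integer.
    return min(n, 1 + math.floor(span / d))


if __name__ == "__main__":
    print(max_output_events([0, 1, 2, 3], d=2))  # four events within 3 time units
    print(max_output_events([0, 1, 2], d=5))     # all events closer together than d
```

When every input event falls within less than one inertial delay, the bound collapses to a single output event, which is the situation the glitch-suppression methods aim for.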

Minimum Transient Design
Minimum transient energy condition for a Boolean gate:
$|t_i - t_j| < d$
where $t_i$ and $t_j$ are arrival times of input events and $d$ is the inertial delay of the gate.
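As a small illustration of this condition, the check below tests whether all pairwise arrival-time differences at one gate are smaller than its inertial delay. The function name is hypothetical; comparing the largest and smallest arrival times is equivalent to checking every pair.

```python
def satisfies_min_transient(arrival_times, d):
    """Return True if |t_i - t_j| < d holds for every pair of input events."""
    if len(arrival_times) < 2:
        return True  # a single input event cannot create a hazard by itself
    # The largest pairwise difference is max - min, so one comparison suffices.
    return max(arrival_times) - min(arrival_times) < d


if __name__ == "__main__":
    print(satisfies_min_transient([1, 2, 3], d=3))  # spread of 2 is below d
    print(satisfies_min_transient([0, 3], d=3))     # spread equals d, condition fails
```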

Balanced Delay Method
  All input events arrive simultaneously.
  Overall circuit delay is not increased.
  Delay buffers may have to be inserted.
[Figure: example circuit annotated with gate and buffer delays. No increase in critical path delay.]

Hazard Filter Method
  Gate delay is made greater than the maximum input path delay difference.
  No delay buffers needed (least transient energy).
  Overall circuit delay may increase.
[Figure: example circuit annotated with gate delays.]

Designing a Glitch-Free Circuit
  Maintain the specified critical path delay.
  Glitches are suppressed at all gates by
    path delay balancing,
    glitch filtering by increasing the inertial delay of gates, or by inserting delay buffers when necessary.
  A linear program optimally combines all objectives.
[Figure: gate with delay D whose two input paths have delays d1 and d2, with $|d_1 - d_2| < D$.]

Problem Complexity
  The number of paths in a circuit can be exponential in circuit size.
  Considering all paths through enumeration is infeasible for large circuits.
  Example: c880 has 6.96M path constraints.

Define Arrival Time Variables
$d_i$: gate delay.
Define two timing window variables per gate output:
  $t_i$: earliest time of signal transition at gate $i$.
  $T_i$: latest time of signal transition at gate $i$.
Glitch suppression constraint: $T_i - t_i < d_i$
[Figure: gate $i$ with delay $d_i$, input timing windows $(t_1, T_1), \ldots, (t_n, T_n)$ and output window $(t_i, T_i)$.]
Reference: T. Raja, Master's Thesis, Rutgers Univ., 2002.
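The timing-window idea can be tried out on a small netlist when the gate delays are already fixed, instead of being solved for. The sketch below is an assumption-laden illustration: the netlist, the names `timing_windows` and `glitch_prone`, and the convention that primary inputs switch at time 0 are all hypothetical. It propagates $(t_i, T_i)$ gate by gate and reports gates that violate $T_i - t_i < d_i$.

```python
def timing_windows(fanins, delays, primary_inputs):
    """Propagate (earliest, latest) transition times through a netlist.

    fanins: dict mapping each gate to its list of fanin signals, listed in
            topological order (every fanin appears before the gate it drives).
    delays: dict mapping each gate to its delay d_i.
    primary_inputs: signals assumed to switch at time 0 with no skew.
    """
    windows = {pi: (0, 0) for pi in primary_inputs}
    for gate, ins in fanins.items():
        t = min(windows[j][0] for j in ins) + delays[gate]  # earliest output event
        T = max(windows[j][1] for j in ins) + delays[gate]  # latest output event
        windows[gate] = (t, T)
    return windows


def glitch_prone(fanins, delays, primary_inputs):
    """List gates whose window violates T_i - t_i < d_i."""
    w = timing_windows(fanins, delays, primary_inputs)
    return [g for g in fanins if not (w[g][1] - w[g][0] < delays[g])]


if __name__ == "__main__":
    # Hypothetical two-gate netlist: g1 = f(a, b), g2 = f(g1, c).
    netlist = {"g1": ["a", "b"], "g2": ["g1", "c"]}
    inputs = ["a", "b", "c"]
    print(glitch_prone(netlist, {"g1": 1, "g2": 1}, inputs))  # unit delays
    print(glitch_prone(netlist, {"g1": 1, "g2": 2}, inputs))  # slower g2 filters the hazard
```

Only two variables per gate are tracked, so the work grows with the number of gates rather than with the number of paths.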

Linear Program
  Variables: gate and buffer delays, arrival time variables.
  Objective: minimize the number of buffers.
  Subject to: overall circuit delay constraint for all input-output paths.
  Subject to: minimum transient energy condition for all multi-input gates.

An Example: Full Adder add1b
[Figure: gate-level full adder add1b with every gate delay equal to 1. Critical path delay = 6.]

Linear Program
  Gate variables: $d_4, \ldots, d_{12}$
  Buffer delay variables: $d_{15}, \ldots, d_{29}$
  Window variables: $t_4, \ldots, t_{29}$ and $T_4, \ldots, T_{29}$

For Gate 7:
  $T_7 \ge T_5 + d_7$    $t_7 \le t_5 + d_7$
  $T_7 \ge T_6 + d_7$    $t_7 \le t_6 + d_7$
  $d_7 > T_7 - t_7$    (glitch suppression)

Multiple-Input Gate Constraints

Single-Input Gate Constraints
Buffer 19:
  $T_{16} + d_{19} = T_{19}$
  $t_{16} + d_{19} = t_{19}$

Critical Path Delay Constraints
  $T_{11} \le \mathit{maxdelay}$
  $T_{12} \le \mathit{maxdelay}$
  maxdelay is specified.

Objective Function
We need to minimize the number of buffers. Because that leads to a nonlinear objective function, we use an approximate criterion:
  minimize $\sum_{\text{all buffers}} (\text{buffer delay})$
i.e., minimize $d_{15} + d_{16} + \cdots + d_{29}$
This gives a near optimum result.
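For reference, the pieces of the add1b linear program shown on the preceding slides can be collected into one statement. The general fanin form below is an extrapolation from the Gate 7 and Buffer 19 examples, so treat it as a sketch rather than the exact model file used for the AMPL runs.

```latex
\begin{aligned}
\text{minimize}\quad & \sum_{k=15}^{29} d_k \\
\text{subject to}\quad
& T_i \ge T_j + d_i, \quad t_i \le t_j + d_i
  && \text{for each fanin } j \text{ of a multi-input gate } i \\
& d_i > T_i - t_i
  && \text{(glitch suppression at multi-input gate } i\text{)} \\
& T_i = T_j + d_i, \quad t_i = t_j + d_i
  && \text{for a buffer } i \text{ driven by } j \\
& T_{11} \le \mathit{maxdelay}, \quad T_{12} \le \mathit{maxdelay}
\end{aligned}
```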

AMPL Solution: maxdelay = 6
[Figure: full adder add1b annotated with the gate and buffer delays from the AMPL solution. Critical path delay = 6.]

AMPL Solution: maxdelay = 7
[Figure: full adder add1b annotated with the gate and buffer delays from the AMPL solution. Critical path delay = 7.]

AMPL Solution: maxdelay ≥ 11
[Figure: full adder add1b annotated with the gate delays from the AMPL solution. Critical path delay = 11.]

ALU4: Four-Bit ALU 74181
  maxdelay    Buffers inserted
      7              5
     10              2
     12              1
     15              0
Maximum power savings (zero-buffer design): Peak = 33%, Average = 21%

ALU4: Original and Low-Power

Benchmark Circuits
                                                  Normalized power
  Circuit   Max-delay (gates)   No. of buffers    Average    Peak
  ALU4              7                  5            0.80     0.68
  ALU4             15                  0            0.79     0.67
  C880             24                 62            0.68     0.54
  C880             48                 34            0.68     0.52
  C6288            47                294            0.40     0.36
  C6288            94                120            0.36     0.34
  c7552            43                366            0.44     0.34
  c7552            86                111            0.42     0.32

C7552 Circuit: Spice Simulation
Power saving: Average 58%, Peak 68%

References
  R. Fourer, D. M. Gay and B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, South San Francisco: The Scientific Press, 1993.
  M. Berkelaar and E. Jacobs, “Using Gate Sizing to Reduce Glitch Power,” Proc. ProRISC Workshop, Mierlo, The Netherlands, Nov. 1996, pp. 183-188.
  V. D. Agrawal, “Low Power Design by Hazard Filtering,” Proc. 10th Int’l Conf. VLSI Design, Jan. 1997, pp. 193-197.
  V. D. Agrawal, M. L. Bushnell, G. Parthasarathy and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and Linear Programming Method,” Proc. 12th Int’l Conf. VLSI Design, Jan. 1999, pp. 434-439.
  T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th Int’l Conf. VLSI Design, Jan. 2003, pp. 527-532.
  T. Raja, V. D. Agrawal and M. L. Bushnell, “Transistor Sizing of Logic Gates to Maximize Input Delay Variability,” J. Low Power Electron., vol. 2, no. 1, pp. 121-128, Apr. 2006.
  T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic for Low Power Design,” IEEE Trans. VLSI Systems, vol. 17, no. 10, pp. 1534-1545, Oct. 2009.

Exercise: Dynamic Power
An average gate:
  VDD, V = 1 volt
  Output capacitance, C = 1 pF
  Activity factor, α = 10%
  Clock frequency, f = 1 GHz
What is the dynamic power consumption of a 1 million gate VLSI chip?

Answer
Dynamic energy per transition = $0.5\,CV^2$
Dynamic power per gate = energy per second
  = $0.5\,CV^2 \alpha f$
  = $0.5 \times 10^{-12} \times 1^2 \times 0.1 \times 10^{9}$ W
  = $0.5 \times 10^{-4}$ W = 50 μW
Power for 1 million gate chip = 50 W
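The same arithmetic can be checked in a few lines. This is a minimal sketch using the exercise's numbers; the function and variable names are illustrative only.

```python
def dynamic_power_per_gate(c_farads, vdd_volts, activity, freq_hz):
    """Average dynamic power of one gate: 0.5 * C * V^2 * alpha * f, in watts."""
    return 0.5 * c_farads * vdd_volts ** 2 * activity * freq_hz


# Values from the exercise: C = 1 pF, VDD = 1 V, alpha = 10%, f = 1 GHz.
per_gate_w = dynamic_power_per_gate(1e-12, 1.0, 0.10, 1e9)
chip_w = 1_000_000 * per_gate_w  # one million such gates

if __name__ == "__main__":
    print(f"per gate: {per_gate_w * 1e6:.1f} uW")
    print(f"chip:     {chip_w:.1f} W")
```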

Components of Power
Dynamic
  Signal transitions
    Logic activity
    Glitches
  Short-circuit
Static
  Leakage

Subthreshold Conduction
$I_{ds} = I_0 \exp\left(\dfrac{V_{gs} - V_{th}}{nV_T}\right) \times \left(1 - \exp\left(\dfrac{-V_{ds}}{V_T}\right)\right)$
[Figure: $I_{ds}$ on a log scale (10 pA to 1 mA) versus $V_{gs}$ (0 to 1.8 V) for a transistor with terminals g, d and s. $V_{th}$ is marked on the $V_{gs}$ axis; labels show the subthreshold region, the subthreshold slope, and the saturation region.]

Thermal Voltage, $V_T$
$V_T = kT/q = 26$ mV at room temperature.
When $V_{ds}$ is several times greater than $V_T$:
$I_{ds} = I_0 \exp\left(\dfrac{V_{gs} - V_{th}}{nV_T}\right)$

Leakage Current
  Leakage current equals $I_{ds}$ when $V_{gs} = 0$.
  Leakage current, $I_{ds} = I_0 \exp(-V_{th}/nV_T)$
  At cutoff, $V_{gs} = V_{th}$, and $I_{ds} = I_0$.
  Lowering leakage to $10^{-b} \times I_0$, with $n = 1.5$ and $V_T = 26$ mV:
    $V_{th} = b\,nV_T \ln 10 = 1.5\,b \times 26 \ln 10 = 90\,b$ mV
  Example: to lower leakage to $I_0/1{,}000$:
    $V_{th} = 270$ mV

Threshold Voltage
$V_{th} = V_{t0} + \gamma\left[(\Phi_s + V_{sb})^{1/2} - \Phi_s^{1/2}\right]$
  $V_{t0}$ is the threshold voltage when the source is at body potential (0.4 V for a 180nm process).
  $\Phi_s = 2V_T \ln(N_A/n_i)$ is the surface potential.
  $\gamma = (2q\varepsilon_{si}N_A)^{1/2}\, t_{ox}/\varepsilon_{ox}$ is the body effect coefficient (0.4 to 1.0).
  $N_A$ is the doping level = $8 \times 10^{17}$ cm$^{-3}$.
  $n_i = 1.45 \times 10^{10}$ cm$^{-3}$.

Threshold Voltage, $V_{sb} = 1.1$ V
  Thermal voltage, $V_T = kT/q = 26$ mV
  $\Phi_s = 0.93$ V
  $\varepsilon_{ox} = 3.9 \times 8.85 \times 10^{-14}$ F/cm
  $\varepsilon_{si} = 11.7 \times 8.85 \times 10^{-14}$ F/cm
  $t_{ox} = 40$ Å
  $\gamma = 0.6$ V$^{1/2}$
  $V_{th} = V_{t0} + \gamma\left[(\Phi_s + V_{sb})^{1/2} - \Phi_s^{1/2}\right] = 0.68$ V
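The numbers on this slide can be reproduced from the formulas on the previous one. The sketch below assumes an electron charge of 1.602e-19 C (not stated on the slides) and $V_{t0}$ = 0.4 V from the 180nm example; all names are illustrative.

```python
import math

Q = 1.602e-19          # electron charge in coulombs (assumed value)
EPS0 = 8.85e-14        # permittivity of free space in F/cm, as on the slide
EPS_OX = 3.9 * EPS0    # oxide permittivity
EPS_SI = 11.7 * EPS0   # silicon permittivity


def surface_potential(n_a, n_i, v_t=0.026):
    """Phi_s = 2 * V_T * ln(N_A / n_i), in volts."""
    return 2 * v_t * math.log(n_a / n_i)


def body_effect_coefficient(n_a, t_ox_cm):
    """gamma = sqrt(2 * q * eps_si * N_A) * t_ox / eps_ox, in V^(1/2)."""
    return math.sqrt(2 * Q * EPS_SI * n_a) * t_ox_cm / EPS_OX


def threshold_voltage(v_t0, gamma, phi_s, v_sb):
    """V_th = V_t0 + gamma * (sqrt(Phi_s + V_sb) - sqrt(Phi_s))."""
    return v_t0 + gamma * (math.sqrt(phi_s + v_sb) - math.sqrt(phi_s))


phi_s = surface_potential(8e17, 1.45e10)
gamma = body_effect_coefficient(8e17, 40e-8)  # 40 angstroms = 40e-8 cm
v_th = threshold_voltage(0.4, gamma, phi_s, 1.1)

if __name__ == "__main__":
    print(f"Phi_s = {phi_s:.2f} V, gamma = {gamma:.2f} V^0.5, Vth = {v_th:.2f} V")
```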

A Sample Calculation
  VDD = 1.2 V, 100nm CMOS process
  Transistor width, W = 0.5 μm
  OFF device ($V_{gs} = V_{th}$) leakage:
    $I_0$ = 20 nA/μm for a low-threshold transistor
    $I_0$ = 3 nA/μm for a high-threshold transistor
  100M transistor chip:
    Power = $(100 \times 10^6/2)(0.5 \times 20 \times 10^{-9}\,\text{A})(1.2\,\text{V})$ = 600 mW for all low-threshold transistors
    Power = $(100 \times 10^6/2)(0.5 \times 3 \times 10^{-9}\,\text{A})(1.2\,\text{V})$ = 90 mW for all high-threshold transistors

Dual-Threshold Chip
Low threshold only for the 20% of transistors on the critical path.
Leakage power = 600 × 0.2 + 90 × 0.8
  = 120 + 72
  = 192 mW
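The sample calculation and the dual-threshold estimate can be reproduced together. In this sketch the factor 1/2 from the slide is exposed as a parameter named `off_fraction`; reading it as the fraction of transistors that are OFF at any time is an assumption, since the slides do not explain it.

```python
def chip_leakage_w(n_transistors, width_um, i0_a_per_um, vdd, off_fraction=0.5):
    """Leakage power in watts for a chip of identical transistors.

    off_fraction reproduces the slide's division by 2 (assumed meaning:
    the share of transistors that are OFF and therefore leaking).
    """
    return n_transistors * off_fraction * width_um * i0_a_per_um * vdd


low_w = chip_leakage_w(100e6, 0.5, 20e-9, 1.2)   # all low-threshold transistors
high_w = chip_leakage_w(100e6, 0.5, 3e-9, 1.2)   # all high-threshold transistors
dual_w = 0.2 * low_w + 0.8 * high_w              # 20% low-threshold, 80% high-threshold

if __name__ == "__main__":
    for label, p in (("low Vth", low_w), ("high Vth", high_w), ("dual Vth", dual_w)):
        print(f"{label:8s}: {p * 1e3:.0f} mW")
```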

Dual-Threshold CMOS Circuit

Dual-Threshold Design
  To maintain performance, all gates on critical paths are assigned low $V_{th}$.
  Most other gates are assigned high $V_{th}$.
  But some gates on non-critical paths may also be assigned low $V_{th}$ to prevent those paths from becoming critical.

Integer Linear Programming (ILP) to Minimize Leakage Power
  Use a dual-threshold CMOS process.
  First, assign all gates low $V_{th}$.
  Use an ILP model to find the delay ($T_c$) of the critical path.
  Use another ILP model to find the optimal $V_{th}$ assignment, as well as the reduced leakage power, for all gates without increasing $T_c$.
  Further reduction of leakage power is possible by letting $T_c$ increase.

ILP - Variables
For each gate $i$ define two variables.
  $T_i$: the longest time at which the output of gate $i$ can produce an event after the occurrence of an input event at a primary input of the circuit.
  $X_i$: a variable specifying low or high $V_{th}$ for gate $i$; $X_i$ is an integer in [0, 1]:
    1: gate $i$ is assigned low $V_{th}$,
    0: gate $i$ is assigned high $V_{th}$.

ILP - Objective Function
Leakage power: $P_{leak} = V_{dd} \sum_i I_{leak,i}$
Minimize the sum of all gate leakage currents, given by
  $\min \sum_i \left[ X_i I_{Li} + (1 - X_i) I_{Hi} \right]$
  $I_{Li}$ is the leakage current of gate $i$ with low $V_{th}$.
  $I_{Hi}$ is the leakage current of gate $i$ with high $V_{th}$.
  Using SPICE simulation results, construct a leakage current look-up table, which is indexed by the gate type and the input vector.

ILP - Constraints
For each gate $i$, where the output of gate $j$ is a fanin of gate $i$:
  (1) $T_i \ge T_j + X_i D_{Li} + (1 - X_i) D_{Hi}$
  (2) $0 \le X_i \le 1$
Max delay constraints for primary outputs (PO):
  (3) $T_i \le T_{max}$
$T_{max}$ is the maximum delay of the critical path.
[Figure: gate $j$, with output time $T_j$, driving gate $i$, with output time $T_i$.]

ILP Constraint Example
Assume all primary input (PI) signals on the left arrive at the same time.
[Figure: small circuit with gates 0, 1, 2 and 3.]
General form: $T_i \ge T_j + X_i D_{Li} + (1 - X_i) D_{Hi}$
For gate 2, constraints are
  $T_2 \ge T_0 + X_2 D_{L2} + (1 - X_2) D_{H2}$
  $T_2 \ge 0 + X_2 D_{L2} + (1 - X_2) D_{H2}$

ILP - Constraints (cont.)
  $D_{Hi}$ is the delay of gate $i$ with high $V_{th}$.
  $D_{Li}$ is the delay of gate $i$ with low $V_{th}$.
  A second look-up table is constructed and specifies the delay for given gate types and fanout numbers.

ILP - Finding Critical Delay
  Constraints involved: $T_i \le T_{max}$ and $0 \le X_i \le 1$.
  $T_{max}$ can be specified or be the delay of the longest path ($T_c$).
  To find $T_c$, we first delete the constraint $T_i \le T_{max}$ and assign all gates low $V_{th}$, i.e., set $X_i = 1$.
  The maximum $T_i$ in the ILP solution is $T_c$.
  If we replace $T_{max}$ with $T_c$, the objective function then minimizes leakage power without sacrificing performance.
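The two-step procedure can be illustrated on a toy circuit. The sketch below does not use an ILP solver: it swaps in exhaustive enumeration of the $X_i$ values, which only works for a handful of gates. The netlist, the function names, and the per-gate values (delay 5 or 12, leakage 10 or 1) are hypothetical illustration data.

```python
from itertools import product


def longest_delay(fanins, outputs, gate_delay):
    """Latest event time at any primary output (longest path delay).

    fanins lists gates in topological order; signals that are not gates
    are treated as primary inputs arriving at time 0.
    """
    T = {}
    for gate, ins in fanins.items():
        T[gate] = max(T.get(j, 0) for j in ins) + gate_delay[gate]
    return max(T[o] for o in outputs)


def min_leakage_assignment(fanins, outputs, d_low, d_high, i_low, i_high, t_max=None):
    """Return (T_c, best leakage, best X) by trying every X in {0,1}^n."""
    gates = list(fanins)
    # Step 1: all gates low Vth (X_i = 1) gives the critical path delay T_c.
    t_c = longest_delay(fanins, outputs, {g: d_low for g in gates})
    if t_max is None:
        t_max = t_c
    best_leak, best_x = None, None
    # Step 2: minimize total leakage subject to the delay limit T_max.
    for bits in product((1, 0), repeat=len(gates)):
        x = dict(zip(gates, bits))
        delay = {g: d_low if x[g] else d_high for g in gates}
        if longest_delay(fanins, outputs, delay) > t_max:
            continue
        leak = sum(i_low if x[g] else i_high for g in gates)
        if best_leak is None or leak < best_leak:
            best_leak, best_x = leak, x
    return t_c, best_leak, best_x


if __name__ == "__main__":
    # Hypothetical netlist: a four-gate chain g1-g2-g3-g4 plus a side gate g5 into g4.
    netlist = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g2"],
               "g5": ["d"], "g4": ["g3", "g5"]}
    print(min_leakage_assignment(netlist, ["g4"], 5, 12, 10, 1))
    print(min_leakage_assignment(netlist, ["g4"], 5, 12, 10, 1, t_max=27))
```

Enumeration grows as $2^n$ in the number of gates, which is why the slides formulate the problem as an ILP for real benchmark circuits.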

Power-Delay Tradeoff
[Figure: normalized leakage power (0.1 to 1) versus normalized critical path delay (1 to 1.5) for benchmark circuits C432, C880 and C1908.]

Power-Delay Tradeoff
  If we gradually increase $T_{max}$ from $T_c$, leakage power is further reduced, because more gates can be assigned high $V_{th}$.
  But the reduction tends to become slower.
  When $T_{max} = 130\%$ of $T_c$, the reduction about levels off because almost all gates are assigned high $V_{th}$.
  Maximum leakage reduction can be 98%.

Leakage & Dynamic Power Optimization: 70nm CMOS c7552 Benchmark Circuit @ 90°C
[Figure: bar chart of leakage power, dynamic power and total power in microwatts (0 to 900) for the original circuit and the optimized design. Annotation: leakage exceeds dynamic power.]
Y. Lu and V. D. Agrawal, “CMOS Leakage and Glitch Minimization for Power-Performance Tradeoff,” Journal of Low Power Electronics (JOLPE), vol. 2, no. 3, pp. 378-387, December 2006.

Summary
  Leakage power is a significant fraction of the total power in nanometer CMOS devices.
  Leakage power increases with temperature; it can be as much as dynamic power.
  Dual-threshold design can reduce leakage.
  Reference: Y. Lu and V. D. Agrawal, “CMOS Leakage and Glitch Minimization for Power-Performance Tradeoff,” J. Low Power Electronics, vol. 2, no. 3, pp. 378-387, December 2006.
  Access other papers at http://www.eng.auburn.edu/~vagrawal/TALKS/talks.html

Problem: Leakage Reduction
The following circuit is designed in 65nm CMOS technology using low threshold transistors. Each gate has a delay of 5ps and a leakage current of 10nA. Given that a gate with high threshold transistors has a delay of 12ps and a leakage of 1nA, optimally design the circuit with dual-threshold gates to minimize the leakage current without increasing the critical path delay. What is the percentage reduction in leakage power? What will the leakage power reduction be if a 30% increase in the critical path delay is allowed?

Solution 1: No Delay Increase
Three critical paths run from the first, second and third inputs to the last output, shown by a dashed line arrow. Each has five gates and a delay of 25ps. None of the five gates on the critical path (red arrow) can be assigned a high threshold. Also, the two inverters that are on four-gate long paths cannot be assigned high threshold, because then the delay of those paths would become 27ps. The remaining three inverters and the NOR gate can be assigned high threshold. These gates are shaded blue in the circuit.
The reduction in leakage power = 1 – (4×1 + 7×10)/(11×10) = 32.73%
Critical path delay = 25ps
[Figure: the circuit with seven low-threshold gates (5ps each) and four high-threshold gates (12ps each, shaded blue).]

Solution 2: 30% Delay Increase
A 30% increase raises the allowed critical path delay from 25ps to 32.5ps. Several solutions are possible. Notice that any 3-gate path can have 2 high threshold gates. Four and five gate paths can have only one high threshold gate. One solution is shown in the figure below, where six high threshold gates are shown with shading and the critical path is shown by a dashed red line arrow.
The reduction in leakage power = 1 – (6×1 + 5×10)/(11×10) = 49.09%
Critical path delay = 29ps
[Figure: the circuit with six high-threshold gates (12ps each, shaded) and five low-threshold gates (5ps each).]
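The per-path reasoning and the leakage percentages in both solutions can be checked numerically. The sketch below uses only the gate data given in the problem (5ps and 10nA for low threshold, 12ps and 1nA for high threshold) and does not model the circuit itself, whose figure is not reproduced here. The function names are illustrative.

```python
D_LOW, D_HIGH = 5, 12   # gate delays in ps (low / high threshold)
I_LOW, I_HIGH = 10, 1   # gate leakage in nA (low / high threshold)


def path_delay(n_gates, n_high):
    """Delay in ps of a path with n_gates gates, n_high of them high threshold."""
    return n_high * D_HIGH + (n_gates - n_high) * D_LOW


def max_high_gates(n_gates, limit_ps):
    """Largest number of high-threshold gates a path can hold within limit_ps."""
    return max(k for k in range(n_gates + 1) if path_delay(n_gates, k) <= limit_ps)


def leakage_reduction(total_gates, n_high):
    """Fractional leakage saving relative to an all-low-threshold design."""
    new = n_high * I_HIGH + (total_gates - n_high) * I_LOW
    return 1 - new / (total_gates * I_LOW)


if __name__ == "__main__":
    for limit in (25, 25 * 1.3):
        per_path = {n: max_high_gates(n, limit) for n in (3, 4, 5)}
        print(f"limit {limit} ps: high-threshold gates allowed per path {per_path}")
    print(f"Solution 1: {leakage_reduction(11, 4):.2%}")
    print(f"Solution 2: {leakage_reduction(11, 6):.2%}")
```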