Optimizing High-Performance Digital Circuits in Energy Constrained Environment
Vojin G. Oklobdzija
ACSEL Laboratory
University of California, Davis
www.ece.ucdavis.edu/acsel
4ème journées francophones d'études Faible Tension Faible Consommation
Présentation Invité May 15, 2003
May 15, 2003 4ème journées francophones d'études: Faible Tension Faible Consommation, Présentation Invité 2
Objective
• Compare Energy-Delay parameters for several commonly used high-performance adder topologies.
• Determine which topology is the best for given Energy-Delay.
• Determine which topology can stretch the furthest in terms of speed or power.
• Develop an estimation tool which can provide those answers before design is committed.
Representative 64-b Adders Delay (Static CMOS design)
[Figure: bar chart of total delay (FO4) for the MXA2, HC2, KS2, QTA2, KS4, and LNG4 adders; series: Static and Dynamic]
Realistic Design Decisions
• It is difficult to determine which design can provide most speed if energy is not constrained.
• Design can always be made faster by increasing power. However, there is a limit.
• Which design is fastest if power is no object?
• Which design is fastest for a given power budget?
• Which design works at the minimal power?
Energy-Delay Space
[Figure: energy vs. delay curve with Emin and Dmin marked]
The only way to compare!
Design Objective
• Design takes time:
  – comparing designs afterward does not bring much value
• There is a disconnect between estimates used on algorithmic level and what is obtained when implementation is finished.
• We need a simple tool that can evaluate different design trade-offs (speed/power) before we commit to design.
Methodology Background
• “Back of the Envelope” complexity – Logical Effort method
• “Logical Effort” accuracy is not sufficient
  – We needed to extend and refine the method
  – However, that becomes more than “Back of the Envelope”
• Excel – a platform of choice:
  – Simple enough
  – Can provide relatively complex computation quickly
  – Easy to enter a given design
• For accuracy, technology characterization is needed:
  – This needs to be done only once and should be available for every design afterwards
Delay in a Logic Gate
Delay of a logic gate has two components:
d = f + p
  – f: effort delay (stage effort), f = gh
  – p: parasitic delay
  – g: logical effort
  – h: electrical effort = Cout/Cin (also called “fanout”)
• Logical effort describes the relative ability of a gate topology to deliver current (defined to be 1 for an inverter)
• Electrical effort is the ratio of output to input capacitance
*from Mathew Sanu / D. Harris
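The delay expression above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the example values (g = 1, p = 1, fanout of 4) assume the normalized inverter convention rather than any measured technology data.

```python
def electrical_effort(c_out: float, c_in: float) -> float:
    """Electrical effort (fanout) h = Cout / Cin."""
    return c_out / c_in


def gate_delay(g: float, h: float, p: float) -> float:
    """Gate delay d = f + p, where the effort delay is f = g * h."""
    return g * h + p


# Illustrative case: a normalized inverter (g = 1, p = 1) driving
# four times its own input capacitance.
h = electrical_effort(c_out=4.0, c_in=1.0)
fo4_delay = gate_delay(g=1.0, h=h, p=1.0)
print(fo4_delay)
```

Delay here is expressed in normalized units; multiplying by a technology time constant would convert it to picoseconds.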
Logical Effort Parameters: Inverter
• d = gh + p
• Delay increases linearly with fanout
• More complex gates have greater g and p
[Figure: inverter delay vs. fanout h = Cout/Cin, with the line d = gh + p; g = 2.2 (logical effort), p = 3.8 ps (parasitic delay)]
*from Mathew Sanu / D. Harris
Normalized Logical Effort: Inverter
• Define delay of unloaded inverter = 1
• Define logical effort ‘g’ of inverter = 1
• Delay of complex gates can be defined w.r.t. d = 1
[Figure: normalized delay d vs. fanout h = Cout/Cin for an inverter, split into parasitic delay and effort delay; with g = 1 and p = 1, d = gh + p = h + 1]
*from Mathew Sanu / D. Harris
Computing Logical Effort
DEF: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current
• Measured from delay vs. fanout plots of simulated gates
• Or estimated, counting capacitance in units of transistor W
*from Mathew Sanu / D. Harris
L.E for Adder Gates
[Figure: delay (ps) vs. fanout for Inverter, Static CM, Dyn PG, Dyn CM, and Mux gates]
• Logical effort parameters obtained from simulation for std cells
• Define logical effort ‘g’ of inverter = 1
• Delay of complex gates can be defined w.r.t. d = 1
*from Mathew Sanu / D. Harris
Normalized L.E
• Logical effort & parasitic delay normalized to that of inverter
Gate type     Logical Eff. (g)   Parasitics (Pinv)
Inverter      1                  1
Dyn. Nand     0.6                1.34
Dyn. CM       0.6                1.62
Dyn. CM-4N    1                  3.71
Static CM     1.48               2.53
Mux           1.68               2.93
XOR           1.69               2.97
*from Mathew Sanu
Delay of a string of gates
• Delay of a path: $D = \sum_i d_i = \sum_i (g_i h_i + p_i)$
• $g_i$ and $p_i$ are constants
• To minimize path delay, the optimal values of $h_i$ are to be determined
D is minimized when each stage bears the same effort, i.e. $g_i h_i = g_{i+1} h_{i+1}$
*from Mathew Sanu / D. Harris
Minimizing path delay
• Logical Effort of a string of gates: $G = \prod_i g_i$
• Path Electrical Effort: $H = \prod_i h_i = C_{out(path)} / C_{in(path)}$
• Branching Effort: $b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path}$
• Path Branching Effort: $B = \prod_i b_i$
• Path Effort: $F = GBH$
Delay is minimized when each stage bears the same effort:
$f = g_i h_i = F^{1/N}$
The minimum delay of an N-stage path is: $N F^{1/N} + P$
*from Mathew Sanu / D. Horowitz
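The path quantities on this slide can be computed with a short Python sketch. The function name and the three-inverter example are hypothetical, chosen only to show how G, B, H, F, the per-stage effort, and the minimum delay relate; P is taken as the sum of the stage parasitic delays.

```python
from math import prod


def min_path_delay(g, b, c_out, c_in, p):
    """Return (F, f_opt, D_min) for an N-stage path.

    g, b, p are per-stage lists of logical effort, branching effort,
    and parasitic delay; c_out / c_in is the path electrical effort H.
    """
    G = prod(g)              # path logical effort
    B = prod(b)              # path branching effort
    H = c_out / c_in         # path electrical effort
    F = G * B * H            # path effort
    n = len(g)
    f_opt = F ** (1.0 / n)   # equal effort borne by every stage
    d_min = n * f_opt + sum(p)
    return F, f_opt, d_min


# Hypothetical example: three inverters, no branching, load 64x the input.
F, f_opt, d_min = min_path_delay(
    g=[1.0, 1.0, 1.0], b=[1.0, 1.0, 1.0], c_out=64.0, c_in=1.0, p=[1.0, 1.0, 1.0]
)
print(F, round(f_opt, 6), round(d_min, 6))
```

With a path effort of 64 over three stages, each stage bears an effort of about 4, which is the familiar fanout-of-4 sizing.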
Modeling interconnect cap.
• Include interconnect cap in branching factor
[Figure: a PG stage driving two CM0 gates one adder bit pitch apart, shown without and with the interconnect capacitance Cint]
Without interconnect: $b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path} = 2$
With interconnect: $b = (C_{on\text{-}path} + C_{off\text{-}path} + C_{int}) / C_{on\text{-}path} = 2 + C_{int} / C_{on\text{-}path} = 2 + I$
I: ratio of interconnect cap to gate cap in 1 adder bit pitch
Correction on Branching
[Figure: input CIN drives two branches: gates g0, g1 (stage efforts f0, f1) to COUT1, and gates g2, g3 (stage efforts f2, f3) to COUT2]
Logical Effort assumes the “branching” factor of this circuit to be 2. This is incorrect and can create significant inaccuracies.
Correction on Branching
[Figure: input CIN drives two branches: gates g0, g1 (stage efforts f0, f1) to COUT1, and gates g2, g3 (stage efforts f2, f3) to COUT2]
f0 = f1, f2 = f3
Td1 = (f0 + f1 + parasitics)
Td2 = (f2 + f3 + parasitics)
Minimum delay occurs when Td1 = Td2
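“Real” Branching Calculation
$F_1 = g_0 g_1 C_{out1} / C_{in}$, $F_2 = g_2 g_3 C_{out2} / C_{in}$
$B_1 = (F_1 + F_2) / F_1 = (g_0 g_1 C_{out1} + g_2 g_3 C_{out2}) / (g_0 g_1 C_{out1})$
$B_2 = (F_1 + F_2) / F_2 = (g_0 g_1 C_{out1} + g_2 g_3 C_{out2}) / (g_2 g_3 C_{out2})$
Branching only equals 2 when: $g_0 g_1 C_{out1} = g_2 g_3 C_{out2}$
This explains why we had to resort to Excel!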
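A minimal Python sketch of this branching correction follows. It assumes the slide's expressions read B1 = (F1 + F2)/F1 and B2 = (F1 + F2)/F2, with each branch weighted by the product of its logical efforts and its output load; the function name and the numeric values are hypothetical.

```python
def real_branching(g0, g1, c_out1, g2, g3, c_out2):
    """Branching factors seen by each of two parallel two-stage branches.

    Assumes each branch is weighted by g_a * g_b * Cout, so the common
    input capacitance cancels out of the ratios.
    """
    w1 = g0 * g1 * c_out1
    w2 = g2 * g3 * c_out2
    return (w1 + w2) / w1, (w1 + w2) / w2


# Symmetric branches: both factors equal 2, as plain Logical Effort assumes.
print(real_branching(1.0, 1.0, 10.0, 1.0, 1.0, 10.0))

# Asymmetric loads: the lightly loaded branch sees a much larger factor.
b1, b2 = real_branching(1.0, 1.0, 10.0, 1.0, 1.0, 30.0)
print(b1, b2)
```

Under this assumption the reciprocals of the two factors always sum to 1, which is a quick sanity check on any pair of values.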
Technology Characterization
Characterization Setup
• Logical Effort requirements:
  – Equalize input and output transitions. (We should also account for the input slope dependence of delay.)
• Logical Effort is characterized by varying the h (Cout/Cin) of a gate. By using a variable load of inverters each gate can be characterized over the same range of loads.
Characterization Setup
• The Logical Effort of each gate is characterized for each input.
• Energy is characterized for each output transition of the gate caused by each input transition.
  – e.g. for an inverter, energy is measured for tLH and tHL
LE Characterization Setup for Static Gates
[Figure: input In drives a chain of four gates terminated by a variable load; measured quantities: tLH, tHL, average, energy]
LE Characterization Setup for Dynamic Gates
[Figure: input In drives a chain of two gates terminated by a variable load; measured quantities: tHL, energy]
LE Table (Static CMOS)
• Technology: P/N Ratio = 2, gINV = 3.67 ps, pINV = 4.29 ps
• Measured on worst-case single-input switching

Delay (ps) by fan-out:
Fan-out   INV    NAND2  NAND3  NOR2   TGXORi  TGXORs  TGMUXi  TGMUXs  AOI    OAI
2         11.6   16.3   22.2   20.5   34.9    22.3    8.0     26.0    23.2   21.3
3         15.3   20.0   26.6   25.4   42.6    28.2    9.9     33.0    28.5   26.7
4         19.0   24.0   31.2   30.6   50.2    34.2    12.0    39.0    34.1   32.1
6         26.4   32.4   40.6   41.1   64.4    45.7    16.0    53.0    45.3   43.6
8         33.6   40.6   50.0   51.9   79.8    56.5    20.2    68.0    56.7   55.3

g (ps)    3.67   4.08   4.65   5.25   7.43    5.71    2.04    6.97    5.60   5.68
p (ps)    4.29   7.90   12.74  9.77   20.19   11.12   3.85    11.76   11.82  9.69
g (norm)  1.00   1.11   1.27   1.43   2.03    1.56    0.55    1.90    1.52   1.55
p (norm)  1.00   1.84   2.97   2.28   4.71    2.59    0.90    2.74    2.76   2.26
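The g and p rows of this table are consistent with a straight-line fit of delay against fan-out (slope g, intercept p). The sketch below applies an ordinary least-squares fit to the INV and NAND2 columns; the fitting method is an assumption, since the deck does not say how the parameters were extracted, and the helper name is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


fanout = [2, 3, 4, 6, 8]
inv_delay_ps = [11.6, 15.3, 19.0, 26.4, 33.6]     # INV column
nand2_delay_ps = [16.3, 20.0, 24.0, 32.4, 40.6]   # NAND2 column

g_inv, p_inv = fit_line(fanout, inv_delay_ps)
g_nand2, p_nand2 = fit_line(fanout, nand2_delay_ps)

# Normalizing to the inverter gives the "norm" rows of the table.
print(round(g_inv, 2), round(g_nand2, 2))
print(round(g_nand2 / g_inv, 2), round(p_nand2 / p_inv, 2))
```

The fitted slopes and intercepts land within rounding of the tabulated g (ps) and p (ps) values for both gates.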
Static CMOS Gates: Delay Graphs
[Figure: two plots of delay vs. fanout. First plot: INV, NAND2, NAND3, NOR2, AOI, OAI. Second plot: INV, TGXORi, TGXORs, TGMUXi, TGMUXs.]
LE Table (Static CMOS, Pull-up)
• Technology:
• Measured on worst-case single-input switching
• Pull-up path only

Delay (ps) by fan-out:
Fan-out   INV    NAND2  NAND3  NOR2   AOI    OAI
2         12.0   19.0   27.5   18.9   26.4   17.5
3         15.8   23.3   33.1   23.2   31.7   22.0
4         19.8   28.2   39.2   28.0   37.4   27.1
6         27.9   38.5   51.7   37.0   49.3   36.9
8         35.4   48.8   64.8   46.7   61.7   47.3

g (ps)    3.93   5.01   6.24   4.63   5.91   4.98
p (ps)    4.10   8.52   14.57  9.44   14.14  7.26
g (norm)  1.07   1.36   1.70   1.26   1.61   1.36
p (norm)  0.96   1.99   3.40   2.20   3.30   1.69
Static Gates: Pull-up Delay Graph
[Figure: pull-up delay vs. fanout for INV, NAND2, NAND3, NOR2, AOI, OAI]
LE Table (Dynamic CMOS)
• Technology:
• Minimum-sized keeper included
• Measured on all-input switching of worst path

Delay (ps) by fan-out:
Fan-out   DN2    DN3    DN4    Dk1ND2  Dk1NR2  DAOI_A  DOAI_O
2         9.9    12.7   16.0   13.7    10.6    10.1    8.8
3         12.6   14.7   19.1   16.7    13.2    12.1    11.3
4         16.0   18.3   23.2   20.7    16.7    14.7    14.0
6         21.7   24.7   30.2   27.9    23.2    20.0    19.2
8         27.3   31.2   37.8   36.1    29.5    24.8    24.0

g (ps)    2.92   3.15   3.65   3.75    3.19    2.49    2.55
p (ps)    4.04   5.82   8.46   5.76    3.95    4.86    3.75
g (norm)  0.80   0.86   1.00   1.02    0.87    0.68    0.69
p (norm)  0.94   1.36   1.97   1.34    0.92    1.13    0.87
LE Table (Dynamic CMOS)

Delay (ps) by fan-out:
Fan-out   DG4    DP4    DC4    Dsum   LG3    LP4    LC     Lsum
2         15.4   16.0   15.4   12.1   21.0   12.1   21.6   12.7
3         18.5   19.1   18.5   14.8   22.2   14.8   26.0   14.7
4         22.6   23.2   22.6   16.9   27.6   16.9   29.3   18.3
6         29.6   30.2   29.6   22.2   34.0   22.2   36.9   24.7
8         37.5   37.8   37.5   27.5   42.0   27.5   44.6   31.2

g (ps)    3.70   3.65   3.70   2.56   3.61   2.56   3.79   3.15
p (ps)    7.72   8.46   7.72   6.94   12.76  6.94   14.24  5.82
g (norm)  1.01   1.00   1.01   0.70   0.98   0.70   1.03   0.86
p (norm)  1.80   1.97   1.80   1.62   2.97   1.62   3.32   1.36

Fan-out   KSG4   KSP4   KSG16  KSP16  KSSum
2         18.9   15.2   12.1   12.1   12.7
3         22.1   17.0   14.6   14.6   14.7
4         25.0   19.6   16.4   16.4   18.3
6         31.5   24.7   21.2   21.2   24.7
8         38.5   30.7   27.0   27.0   31.2

g (ps)    3.25   2.61   2.45   2.45   3.15
p (ps)    12.23  9.45   6.99   6.99   5.82
g (norm)  0.89   0.71   0.67   0.67   0.86
p (norm)  2.85   2.20   1.63   1.63   1.36
Dynamic CMOS: Delay Graphs
[Figure: two delay graphs for dynamic gates. First plot: N2, N3, N4, k1ND2, k1NR2, AOI_A, OAI_O. Second plot: G4, P4, C4, STBSum.]
Dynamic CMOS: Delay Graphs
[Figure: two delay graphs for dynamic gates. First plot: LG3, LP4, G4, P4, LC, Lsum. Second plot: KSG4, KSP4, KSG16, KSP16, KSSum.]
Energy Calculation
Energy Calculation
Energy [fJ] vs. Output Load [u]
[Figure: dgk2 energy [fJ] vs. output load [u] for 8X and 16X minimal-size Dyn-NAND gates, with linear fits y = 2.3643x + 28.481 and y = 2.4975x + 19.35]
Energy Calculation
Offset (parasitic + wiring energy) vs. Size (in multiples of the gate size)
[Figure: offset vs. gate size (x) with a linear fit per gate; fits and legend entries paired in the order they appear]
inv      y = 0.8931x + 4.6411
dgck     y = 1.1413x + 10.22
oai_o    y = 1.6382x + 11.988
daoi     y = 0.5538x + 12.338
tgxor    y = 3.89x + 14.5
aoi_o    y = 1.9595x + 9.621
na2s     y = 1.2559x + 6.762
tgmuxs   y = 1.0592x + 1.71
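The two energy slides suggest a simple two-term model: energy grows linearly with output load, and the zero-load offset grows linearly with gate size. The sketch below combines them into one function. Combining the two fits this way is an assumption, the function name is hypothetical, and the example uses the first offset fit listed on this slide together with an assumed load slope of 2.35 fJ/u.

```python
def gate_energy_fj(load_u, size_x, load_slope, size_slope, e0):
    """Estimated switching energy [fJ] of one gate.

    load_slope: energy per unit of output load      (assumed, fJ/u)
    size_slope: offset growth per unit of gate size (fJ per size unit)
    e0:         offset at zero size and zero load   (fJ)
    """
    return load_slope * load_u + size_slope * size_x + e0


# Hypothetical operating point: size 10x driving a load of 20u.
energy = gate_energy_fj(
    load_u=20.0, size_x=10.0, load_slope=2.35, size_slope=0.8931, e0=4.6411
)
print(round(energy, 4))
```

Because both terms are linear, the model is cheap enough to evaluate for every gate of an adder inside a spreadsheet or a sizing loop.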
Energy Calculation: NAND-2
[Figure: surface plot of energy vs. size and fanout for a NAND-2 gate]
Energy Calculation
INV: Output Capacitance (u) and Energy [fJ] by Multiplier Factor

      Output Capacitance (u)                 Energy [fJ]
M     1      5      10     15     20         1         5         10        15        20
0     1.12   5.6    11.2   16.8   22.4       2.51E+00  1.26E+01  2.51E+01  3.77E+01  5.02E+01
1     2.24   11.2   22.4   33.6   44.8       3.70E+00  1.85E+01  3.70E+01  5.54E+01  7.39E+01
2     3.36   16.8   33.6   50.4   67.2       4.85E+00  2.42E+01  4.85E+01  7.27E+01  9.70E+01
3     4.48   22.4   44.8   67.2   89.6       6.16E+00  3.08E+01  6.16E+01  9.24E+01  1.23E+02
4     5.6    28     56     84     112        7.45E+00  3.73E+01  7.45E+01  1.12E+02  1.49E+02
5     6.72   33.6   67.2   100.8  134.4      8.74E+00  4.37E+01  8.74E+01  1.31E+02  1.75E+02
6     7.84   39.2   78.4   117.6  156.8      1.02E+01  5.08E+01  1.02E+02  1.52E+02  2.03E+02
7     8.96   44.8   89.6   134.4  179.2      1.15E+01  5.75E+01  1.15E+02  1.72E+02  2.30E+02
8     10.08  50.4   100.8  151.2  201.6      1.27E+01  6.36E+01  1.27E+02  1.91E+02  2.54E+02
9     11.2   56     112    168    224        1.42E+01  7.08E+01  1.42E+02  2.13E+02  2.83E+02
10    12.32  61.6   123.2  184.8  246.4      1.55E+01  7.76E+01  1.55E+02  2.33E+02  3.10E+02
11    13.44  67.2   134.4  201.6  268.8      1.69E+01  8.44E+01  1.69E+02  2.53E+02  3.37E+02
12    14.56  72.8   145.6  218.4  291.2      1.81E+01  9.05E+01  1.81E+02  2.71E+02  3.62E+02
13    15.68  78.4   156.8  235.2  313.6      1.97E+01  9.85E+01  1.97E+02  2.96E+02  3.94E+02
14    16.8   84     168    252    336        2.09E+01  1.04E+02  2.09E+02  3.13E+02  4.18E+02
15    17.92  89.6   179.2  268.8  358.4      2.26E+01  1.13E+02  2.26E+02  3.39E+02  4.52E+02
16    19.04  95.2   190.4  285.6  380.8      2.39E+01  1.20E+02  2.39E+02  3.59E+02  4.79E+02
17    20.16  100.8  201.6  302.4  403.2      2.53E+01  1.27E+02  2.53E+02  3.80E+02  5.06E+02
18    21.28  106.4  212.8  319.2  425.6      2.67E+01  1.34E+02  2.67E+02  4.01E+02  5.34E+02
19    22.4   112    224    336    448        2.81E+01  1.40E+02  2.81E+02  4.21E+02  5.61E+02

NAND-2 (as extracted): Energy Factors 1.211300121, 7.39E-01; Output Capacitance Factor
Example: 64-Bit Adders
• Han-Carlson (prefix-2, HC2): Static and Dynamic
• Han-Carlson (prefix-2, HC2-2): Dynamic-Static
• Kogge-Stone (prefix-2, KS2): Static and Dynamic
• Kogge-Stone (prefix-2, KS2-2): Dynamic-Static
• Quaternary-Tree (prefix-2, QT2): Static and Dynamic
Included wire delay, $t_{delay} = 0.7 R_{wire} C_{wire}$
Included wire energy, $E_w = C_{wire} V^2$

Len (um)     10    20    30    40    60    80    120   160   240   320   480
Delay (ps)   0.01  0.04  0.09  0.17  0.38  0.67  1.50  2.67  6.01  10.7  24.1
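Since both the resistance and the capacitance of a wire grow in proportion to its length, the 0.7·Rwire·Cwire delay grows with the square of the length. The sketch below checks that against the table: it lumps 0.7·r·c into one coefficient k, calibrated from the 160 um entry. The calibration point and the names are arbitrary choices made for illustration; the per-unit-length r and c are not given in the deck.

```python
lengths_um = [10, 20, 30, 40, 60, 80, 120, 160, 240, 320, 480]
delays_ps = [0.01, 0.04, 0.09, 0.17, 0.38, 0.67, 1.50, 2.67, 6.01, 10.7, 24.1]


def wire_delay_ps(length_um: float, k: float) -> float:
    """Wire delay t = 0.7 * (r * L) * (c * L) = k * L**2, with k = 0.7 * r * c."""
    return k * length_um ** 2


# Calibrate k (ps per um^2) from a single table entry.
k = 2.67 / 160 ** 2

# Largest deviation between the quadratic model and the tabulated delays.
max_err = max(abs(wire_delay_ps(L, k) - d) for L, d in zip(lengths_um, delays_ps))
print(k, round(max_err, 3))
```

The quadratic model tracks every table entry to within a tenth of a picosecond, which is within the rounding of the tabulated values.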
Han-Carlson Adder: Circuits
[Figure: circuit schematics of the Han-Carlson adder cells: generate (G) and propagate (P) merge cells, bit-level g/p generation from inputs a and b, and the sum stage (hsum, Cin, Sum), including clocked (CK) dynamic versions]
Han-Carlson Adder: Diagram
[Figure: Han-Carlson 64-bit prefix tree over bits 0 to 63 with levels L1 to L6 and carry-merge stages CM0 to CM6, separate even-bit and odd-bit paths, and the (p,g) and sum stages; gate types shown: XOR2, NAND2, NOR2, AOI, OAI]
Han-Carlson Adder: Critical Path
[Figure: Han-Carlson critical path from in0 and cIN through g0/p0, C0, C2 (2b), C6 (4b), C14 (8b), C30 (16b), C62 (32b), and C63 to S63]
Han-Carlson: Logical Effort Delay
Prefix-2 Han-Carlson Static

Stage             # Bits Span  Branch  LE    Parasitic  Energy/Load Slope  Energy/Size Slope  Energy @ 0 size, 0 Load [fJ]  Minimum Input Capacitance (u)
input                          3.0
g0 (nand2)        0            2.0     1.36  1.99       2.34               1.26               6.76                          0.40
C0 (OAI)          1            2.0     1.36  1.69       2.40               1.64               11.99                         0.60
C2 (AOI)          2            2.0     1.61  3.30       2.40               1.96               9.62                          0.60
C6 (OAI)          4            2.0     1.36  1.69       2.40               1.64               11.99                         0.60
C14 (AOI)         8            2.0     1.61  3.30       2.40               1.96               9.62                          0.60
C30 (OAI)         16           2.0     1.36  1.69       2.40               1.64               11.99                         0.60
C62 (AOI)         32           2.0     1.61  3.30       2.40               1.96               9.62                          0.60
Odd Carry (OAI)   1            2.0     1.36  1.69       2.40               1.64               11.99                         0.60
S63 (TGXORs)      0            1.0     1.56  3.03       3.95               3.89               14.50                         0.60
Cout (INV)        0            1.0     1.00  0.96       2.35               0.89               4.61                          0.60

Total Branch: 7.68E+02
Total LE: 3.03E+01
Path Effort: 2.33E+04
Parasitic Delay: 97.13
Wire Delay 460u [ps]: 14.24
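The totals on this slide can be reproduced from the per-stage columns. The sketch below multiplies the branch and LE columns and sums the parasitic column. Two points are assumptions rather than statements from the deck: the value 3.0 on the input row is treated as a branching factor, and the parasitic sum is scaled by the 4.29 ps inverter parasitic from the static LE table, because that reproduces the 97.13 figure.

```python
from math import prod

P_INV_PS = 4.29      # inverter parasitic delay, static LE table (assumed scaling)
INPUT_BRANCH = 3.0   # value on the "input" row, read here as a branching factor

# (stage, branch, logical effort, normalized parasitic) as listed on the slide
stages = [
    ("g0 (nand2)", 2.0, 1.36, 1.99),
    ("C0 (OAI)", 2.0, 1.36, 1.69),
    ("C2 (AOI)", 2.0, 1.61, 3.30),
    ("C6 (OAI)", 2.0, 1.36, 1.69),
    ("C14 (AOI)", 2.0, 1.61, 3.30),
    ("C30 (OAI)", 2.0, 1.36, 1.69),
    ("C62 (AOI)", 2.0, 1.61, 3.30),
    ("Odd Carry (OAI)", 2.0, 1.36, 1.69),
    ("S63 (TGXORs)", 1.0, 1.56, 3.03),
    ("Cout (INV)", 1.0, 1.00, 0.96),
]

total_branch = INPUT_BRANCH * prod(b for _, b, _, _ in stages)
total_le = prod(g for _, _, g, _ in stages)
path_effort = total_branch * total_le
parasitic_ps = P_INV_PS * sum(p for _, _, _, p in stages)

print(total_branch, round(total_le, 2), round(path_effort), round(parasitic_ps, 2))
```

The same loop applies to the other adder tables in the deck by swapping in their stage lists.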
Kogge-Stone Adder: Circuits
[Figure: circuit schematics of the Kogge-Stone adder cells: generate (G) and propagate (P) merge cells, bit-level g/p generation from inputs a and b, and the sum stage (hsum, Cin, Sum), including clocked (CK) dynamic versions]
Kogge-Stone Adder: Diagram
[Figure: Kogge-Stone 64-bit prefix tree over bits 0 to 63 with levels L1 to L6, followed by the Inv and Sum stages]
Kogge-Stone Adder: Critical Path
[Figure: Kogge-Stone critical path from in0 and cIN through g0/p0, C0, C2 (2b), C6 (4b), C14 (8b), C30 (16b), and C62 (32b) to S63]
Kogge-Stone Adder: Logical Effort Delay Calculation
Prefix-2 Kogge-Stone (2-0) (Dynamic)

Stage          # Bits Span  Branch  LE    Parasitic  Energy/Load Slope  Energy/Size Slope  Energy @ 0 size, 0 Load [fJ]  Minimum Input Capacitance (u)
input                       3.0
g0 (dnand2f)   0            2.0     1.02  1.34       2.40               1.14               10.22                         0.50
INV (INV)                   1.0     1.00  0.96       2.35               0.89               4.61                          0.60
C0 (DOAI)      1            2.0     0.68  1.33       2.30               0.55               12.34                         0.20
INV (INV)                   1.0     1.00  0.96       2.35               0.89               4.61                          0.60
C2 (DAOI)      2            2.0     0.68  1.33       2.30               0.55               12.34                         0.20
INV (INV)                   1.0     1.00  0.96       2.35               0.89               4.61                          0.60
C6 (DOAI)      4            2.0     0.68  1.33       2.30               0.55               12.34                         0.20
INV (INV)                   1.0     1.00  0.96       2.35               0.89               4.61                          0.60
C14 (DAOI)     8            2.0     0.68  1.33       2.30               0.55               12.34                         0.20
INV (INV)                   1.0     1.00  0.96       2.35               0.89               4.61                          0.60
C30 (DOAI)     16           2.0     0.68  1.33       2.30               0.55               12.34                         0.20
INV (INV)                   1.0     1.00  0.96       2.35               0.89               4.61                          0.60
C62 (DAOI)     32           2.0     0.68  1.33       2.30               0.55               12.34                         0.20
S63 (TGXORs)   0            1.0     1.56  3.03       3.95               3.89               14.50                         0.60
INV (INV)      0            1.0     1.00  0.96       2.35               0.89               4.61                          0.60

Total Branch: 3.84E+02
Total LE: 1.57E-01
Path Effort: 6.04E+01
Parasitic Delay: 81.81
Wire Delay 460u [ps]: 14.24
Quaternary-Tree Adder: Circuits
[Figure: circuit schematics of the Quaternary-Tree adder cells: generate (G) and propagate (P) merge cells, bit-level g/p generation from inputs a and b, including clocked (CK) dynamic versions, and a multiplexer (Sel, In1, In0, Out)]
*from Mathew Sanu, Intel AMR
Quaternary-Tree Adder: Diagram
[Figure: Quaternary-Tree 64-bit adder block diagram: inputs b0 to b63 feed a carry tree (2b, 4b, 8b, 12b, 16b merges) producing C15, C31, and C47; intermediate carry generators produce C3, C7, C11, C19, C23, C27, C35, C39, C43, C51, C55, and C59; 4-bit sum generators with Cin produce Sum[3:0] through Sum[63:60]]
*from Mathew Sanu, Intel AMR
Quaternary-Tree Adder: Logical Effort Delay Calculation
Prefix-2 Quaternary (2-2) (Static)

Stage          # Bits Span  Branch  LE    Parasitic  Energy/Load Slope  Energy/Size Slope  Energy @ 0 size, 0 Load [fJ]  Minimum Input Capacitance (u)
input                       3.0
g0 (nand2)     0            1.0     1.36  1.99       2.34               1.26               6.76                          0.40
C1 (OAI)       2            1.0     1.36  1.69       2.40               1.64               11.99                         0.60
C3 (AOI)       4            1.0     1.61  3.30       2.40               1.96               9.62                          0.60
C7 (OAI)       8            1.0     1.36  1.69       2.40               1.64               11.99                         0.60
C15 (AOI)      16           2.0     1.61  3.30       2.40               1.96               9.62                          0.60
C31 (OAI)      16           2.0     1.36  1.69       2.40               1.64               11.99                         0.60
C47 (AOI)      12           4.0     1.61  3.30       2.40               1.96               9.62                          0.60
C59 (TGMUXs)   4            4.0     1.90  2.74       0.40               1.06               1.71                          0.80
S63 (TGMUXs)   0            1.0     1.90  2.74       0.40               1.06               1.71                          0.80
INV (INV)      0            1.0     1.00  0.96       2.35               0.89               4.61                          0.60

Total Branch: 1.92E+02
Total LE: 5.15E+01
Quaternary-Tree Adder: Logical Effort Delay Calculation
Prefix-2 Quaternary (2-2) (Dynamic-Static)

Stage          # Bits Span  Branch  LE    Parasitic  Energy/Load Slope  Energy/Size Slope  Energy @ 0 size, 0 Load [fJ]  Minimum Input Capacitance (u)
input                       3.0
g0 (Gk1ND2)    0            1.0     1.02  1.34       2.40               1.14               10.22                         0.50
C1 (OAI)       2            1.0     1.36  1.69       2.40               1.64               11.99                         0.60
C3 (DAOI)      4            1.0     0.68  1.33       2.30               0.55               12.34                         0.20
C7 (OAI)       8            1.0     1.36  1.69       2.40               1.64               11.99                         0.60
C15 (DAOI)     16           2.0     0.68  1.33       2.30               0.55               12.34                         0.20
C31 (OAI)      16           2.0     1.36  1.69       2.40               1.64               11.99                         0.60
C47 (DAOI)     12           4.0     0.68  1.33       2.30               0.55               12.34                         0.20
C59 (TGMUXs)   4            4.0     1.90  2.74       0.40               1.06               1.71                          0.80
S63 (TGMUXs)   0            1.0     1.90  2.74       0.40               1.06               1.71                          0.80
INV (INV)      0            1.0     1.00  0.96       2.35               0.89               4.61                          0.60

Total Branch: 1.92E+02
Total LE: 2.91E+00
Path Effort: 5.59E+02
Simulation
Test Setup
[Figure: adder with inputs A0 to A63 and outputs S0 to S63, each output loaded by Cwire (1mm wire)]
H = (Cin + Cwire)/Cin
Wire Load Explanation
• The wire load is expressed in terms of the input gate capacitance.
• For a 1mm wire:
  – If the input gates are minimum size, then the wire load is roughly 80x the min. size inverter input cap (or H is roughly 80 in the worst case).
Kogge-Stone Adder (2-2) Analysis
Results
• The results are shown for a Kogge-Stone prefix 2 (2-2) adder with H=1
• No internal wiring is included.
• Measured delay is worst case delay from Clk to S.
Estimation vs. Simulation
Energy vs. Delay Kogge-Stone-2
[Figure: energy vs. delay for Kogge-Stone-2; curves: delay vs. Energy, Estimated]
Simple model (w/o branch correction)
Energy-Delay Calculation
Energy vs. Delay Analysis

Cin                    1            10           20         30           40           50           60           70
H                      153.6        15.36        7.68       5.12         3.84         3.072        2.56         2.194285714
f                      1.838654148  1.577006121  1.5057911  1.465633204  1.437791945  1.416561319  1.399447559  1.385139486
Delay                  183.0282108  168.624487   164.7041   162.4934079  160.9607466  159.7920006  158.8498881  158.0622287
Energy [J]             360.5587565  480.3520072  538.59688  580.6499651  615.0500515  644.824554   671.441635   695.7384103
Critical Path Energy   2418.006611  3456.676415  3961.6898  4326.311888  4624.578533  4882.738959  5113.522899  5324.188581
Per-gate rows (one block of three rows per gate; columns follow the Cin values 1, 10, 20, 30, 40, 50, 60, 70):

Input Capacitance [u]  0.333333333  3.333333333  6.6666667  10  13.33333333  16.66666667  20  23.33333333
Size                   0.666666667  6.666666667  13.333333  20  26.66666667  33.33333333  40  46.66666667
Energy [fJ]            12.04251502  26.39300879  41.448918  56.11848716  70.54453162  84.79309096  98.90235574  112.8972071

Input Capacitance [u]  0.300433684  2.576807388  4.9208858  7.184476491  9.397332977  11.57321339  13.72007411  15.84309869
Size                   0.500722807  4.29467898  8.2014763  11.97412748  15.66222163  19.28868899  22.86679019  26.40516449
Energy [fJ]            6.356320592  17.9961342  29.348829  40.05014041  50.35074284  60.36401926  70.15456226  79.76397111

Input Capacitance [u]  0.55239364  4.063641024  7.4098258  10.5298073  13.51140966  16.39416644  19.20052423  21.94490158
Size                   2.761968198  20.31820512  37.049129  52.6490365  67.55704831  81.97083218  96.00262115  109.7245079
Energy [fJ]            17.30290147  45.26564783  70.594916  93.69433198  115.4586974  136.2829861  156.3885039  175.9159682

Input Capacitance [u]  0.746809453  4.712049095  8.204154  11.34767295  14.28426175  17.07598679  19.75744616  22.35055124
Size                   1.244682422  7.853415159  13.67359  18.91278825  23.80710292  28.45997799  32.92907693  37.25091873
Energy [fJ]            8.949467974  29.08757121  45.854176  60.58609791  74.13694538  86.87318494  98.99630665  110.6325785

Input Capacitance [u]  1.373124299  7.430930267  12.353742  16.63152627  20.53779649  24.18918238  27.6495098  30.95863105
Size                   6.865621495  37.15465134  61.768708  83.15763133  102.6889825  120.9459119  138.247549  154.7931553
Energy [fJ]            24.67960998  72.55076325  109.46456  140.8379748  169.0849236  195.2157258  219.7763893  243.1045836

Input Capacitance [u]  1.856397564  8.616634205  13.678054  17.92332142  21.71255763  25.19519126  28.45149928  31.53089875
Size                   3.09399594  14.36105701  22.796757  29.87220237  36.18759605  41.99198543  47.41916546  52.55149791
Energy [fJ]            15.39543952  49.3697995  73.372069  93.02204918  110.2927721  125.9867956  140.5295525  154.180271

Input Capacitance [u]  3.413273081  13.58848489  20.596292  26.269015  31.21814047  35.69053337  39.81638122  43.67469288
Size                   17.06636541  67.94242443  102.98146  131.345075  156.0907024  178.4526669  199.0819061  218.3734644
Energy [fJ]            43.01642085  122.4453747  174.26845  215.2999958  250.5985885  282.1695087  311.0574365  337.8904978

Input Capacitance [u]  4.614579932  15.75670871  22.8042  28.30936811  33.00381685  37.1748743  40.97127759  44.48201591
Size                   7.690966553  26.26118118  38.007  47.18228018  55.00636142  61.95812383  68.28546265  74.13669318
Energy [fJ]            11.47980223  28.06486091  38.555051  46.74949443  53.73718139  59.9458004  65.5967467  70.82248068

Input Capacitance [u]  8.484616531  24.84842609  34.33836  41.49114989  47.45262204  52.66048899  57.33715443  61.61379664
Size                   42.42308266  124.2421304  171.6918  207.4557494  237.2631102  263.302445  286.6857722  308.0689832
Energy [fJ]            88.59754048  213.6845802  282.3102  332.9106058  374.502097  410.4676397  442.5057334  471.6091264

Input Capacitance [u]  11.47079072  28.81332356  38.019408  44.71382864  50.16691012  54.85059689  59.00025061  62.7527225
Size                   19.11798453  48.02220594  63.365681  74.52304773  83.61151686  91.41766148  98.33375102  104.5878708
Energy [fJ]            71.24869177  154.2807831  195.73871  225.172603  248.788957  268.8493131  286.4671013  302.2834208

Input Capacitance [u]  21.09081693  45.43878763  57.249285  65.53407194  72.12957929  77.6992339  82.56775672  86.92127377
Size                   105.4540846  227.1939382  286.24643  327.6703597  360.6478965  388.4961695  412.8387836  434.6063689
Energy [fJ]            201.901783  380.5281006  462.43857  518.6731647  562.8395914  599.7683219  631.7964492  660.2518357

Input Capacitance [u]  28.51376325  52.68915165  63.386369  70.62419988  76.25538833  80.93068331  84.96268061  88.52800622
Size                   47.52293875  87.81525275  105.64395  117.7069998  127.0923139  134.8844722  141.6044677  147.546677
Energy [fJ]            170.2570669  278.3029217  323.26118  352.9816766  375.7696959  394.4880196  410.4948677  424.5504842

Input Capacitance [u]  52.42694907  83.09111468  95.446627  103.5091724  109.6393831  114.6432755  118.900816  122.623637
Size                   262.1347453  415.4555734  477.23313  517.5458618  548.1969157  573.2163776  594.5040801  613.1181851
Energy [fJ]            483.5502262  600.7155042  643.62904  670.6330806  690.6976786  706.799734  720.3163847  732.0037153

Input Capacitance [u]  70.87869658  96.34940918  105.67844  111.5488823  115.9107514  119.4111983  122.3496006  124.8903246
Size                   118.131161  160.5823486  176.13073  185.9148039  193.1845857  199.0186638  203.916001  208.1505411
Energy [fJ]            804.010708  1023.894344  1102.573  1151.673008  1187.968424  1216.987393  1241.275748  1262.226473

Input Capacitance [u]  83.53936503  97.39974875  102.00618  104.8011191  106.8304774  108.4315927  109.7575961  110.891359
Size                   139.232275  162.3329146  170.01031  174.6685318  178.0507957  180.7193211  182.9293268  184.8189317
Energy [fJ]            489.9193448  510.550526  517.40721  521.5674658  524.5881657  526.9714257  528.9451818  530.6327879

Prefix-2 Kogge-Stone (2-0) (Dynamic)
Number of Gates
Gate labels (in extracted order): XOR2, INV, OAI, DAOI, OAI, DAOI, INV, INV, INV, INV, INV, INV, dnand2f, OAI, DAOI
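The H, f, and Delay rows of this spreadsheet can be reproduced with a few lines of Python. The sketch below is a reading of the numbers, not the authors' spreadsheet: it assumes H = 153.6/Cin, a stage effort f equal to the N-th root of the path effort times H with N = 15 stages, and a delay of N·f·τ plus the parasitic delay. The constants come from the Kogge-Stone (2-0) dynamic table (path effort 60.4, parasitic delay 81.81) and the static LE table (inverter g = 3.67 ps).

```python
TAU_PS = 3.67          # inverter g in ps, used as the unit effort delay (assumed)
PATH_EFFORT_GB = 60.4  # "Path Effort" of the KS (2-0) dynamic table (G * B)
PARASITIC_PS = 81.81   # "Parasitic Delay" of the same table
N_STAGES = 15          # stages on the KS (2-0) dynamic critical path
C_LOAD = 153.6         # output load implied by H = 153.6 at Cin = 1


def stage_effort(c_in: float) -> float:
    """Equal per-stage effort f = (G * B * H) ** (1 / N), with H = C_LOAD / c_in."""
    h_path = C_LOAD / c_in
    return (PATH_EFFORT_GB * h_path) ** (1.0 / N_STAGES)


def path_delay_ps(c_in: float) -> float:
    """Path delay = N * f * tau + total parasitic delay."""
    return N_STAGES * stage_effort(c_in) * TAU_PS + PARASITIC_PS


for c_in in (1, 10, 20, 70):
    print(c_in, round(stage_effort(c_in), 4), round(path_delay_ps(c_in), 2))
```

Increasing Cin lowers the stage effort and therefore the delay, at the cost of the larger gate sizes and energies listed in the per-gate rows.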
Energy vs. Delay Estimate Comparison
(Cout = 1mm wire)
[Figure: KS-2-2 dynamic, energy [J] vs. delay [S]; curves: Estimate assuming every gate switches, Spice, Critical Path Estimate, Estimate with correct switching]
Complete model (w/ branch correction)
Energy-Delay Estimates
Adders: Energy
Energy vs. Delay
Cout = 1mm wire (160u gate cap)
For Cin = ~minimum input to 50*minimum input
[Figure: energy [pJ] vs. delay [pS]; curves: HC Dynamic (2-2), KS Dynamic (2-0), HC Dynamic (2-0), KS Dynamic (2-2), KS Static Prefix 2, HC Static Prefix 2, Quaternary Dynamic (2-2), Quaternary Static; annotated groups: Dynamic (KS, HC), Static, Dynamic-Static, QT, KS, HC]
Energy-Delay comparison of 64-bit KS, HC and QT adders
[Figure: normalized energy vs. normalized delay; curves: QT Static, HC Static, KS Static, QT compound-domino, HC compound-domino, KS compound-domino]
Adders: Critical Path Energy
Critical Path Energy vs. Delay (no internal wire Energy)
Cout = 1mm wire (160u gate cap)
For Cin = ~minimum input to 50*minimum input
[Figure: energy [fJ] vs. delay [S]; curves: HC Dynamic (2-2), KS Dynamic (2-0), HC Dynamic (2-0), KS Dynamic (2-2), KS Static Prefix 2, HC Static Prefix 2, Quaternary (2-2), Quaternary Static (2-2); annotations: QT dynamic-static, HC dynamic, QT static, KS dynamic-static, HC-dynamic, KS dynamic, HC-static, KS-static]
Intel 32-bit Adder 0.13u 1.2V [VLSI-2002]
Comparison with Intel Measured Data
[Figure: energy [fJ] vs. delay [pS]; curves: Kogge-Stone (2-0), Quaternary (2-2), Intel Kogge-Stone (2-0), Intel Quaternary (2-2); annotations: QT, KS, KS estimated, QT Estimated]
Energy-Delay comparison of 32-bit QT and KS adders: estimated vs. simulation in 0.10µm technology
[Figure: energy [pJ] vs. delay [pS]; curves: KS [9], QT [9], KS Estimate, QT Estimate; annotated differences: 55% and 35%]
Energy-Delay Space
• LE does not provide energy optimization
• Our new results can save ~30-50% energy
[Figure: total size (um) vs. total delay (FO4); annotation: potential energy saving at equal performance]
Simulated Energy-Delay for 2 Adders
H-SPICE confirms potential energy savings
[Figure: simulated energy vs. delay (FO4); curves: KS_LE, KS_Opt, HC_LE, HC_Opt; annotation: saving]
Estimated Energy-Delay for 2 Adders
• Static prefix-2 Kogge-Stone & Han-Carlson
• Can save ~40% energy on average
[Figure: total energy (J) vs. delay (FO4); curves: KS_LE, HC_LE, KS_Opt, HC_Opt; annotation: saving]
Conclusion
• Relaxing the delay constraint by a small amount can result in a huge energy saving.
  – This is very dependent on where the design point is on the Energy-Delay curve (“Hardware Intensity”, V. Zyuban)
• We demonstrated our findings by simulation and confirmed them by matching against measured results
• We developed a tool for estimation of design trade-offs
  – The tool is important at the beginning of the design
• We found that a topology that is good for low power is not necessarily the most efficient for speed, and vice versa
• We are developing adder topologies suitable for both.