18
EE M216A .:. Fall 2010 Lecture 5 Logical Effort Prof. Dejan Marković [email protected] Logical Effort Recap Normalized delay d = g·h + p g is the logical effort of the gate g C /C g = C IN /C INV Inverter is sized such that R INV equals the gate’s drive strength h is the electrical effort h = C OUT /C IN The combination of g·h is essentially our fanout definition p is the parasitic effort p = C SELF /C INV D. Markovic / Slide 2 SELF INV Typically ~1 for an inverter May have different g, p for each type of gate structure The g, p may differ per input, and for pullup/down EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 2

Prof. Dejan Markovi ee216a@gmail - UCLAicslwebs.ee.ucla.edu/dejan/classwiki/images/7/7c/Lec-05_Logical... · Example: a 2 ‐input NAND gate

Embed Size (px)

Citation preview

1

EE M216A .:. Fall 2010Lecture 5

Logical Effort

Prof. Dejan Marković[email protected]

Logical Effort Recap

Normalized delay– d = g·h + p– g is the logical effort of the gate

g C /C● g = CIN/CINV

● Inverter is sized such that RINV equals the gate’s drive strength

– h is the electrical effort● h = COUT/CIN

● The combination of g·h is essentially our fan‐out definition

– p is the parasitic effort● p = CSELF/CINV

D. Markovic / Slide 2

p SELF/ INV

● Typically ~1 for an inverter

May have different g, p for each type of gate structure– The g, p may differ per input, and for pull‐up/down

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 2

2

A Side Note: Total Gate Effort (gTOT)

One way of evaluating a gate is to look at the total logical effort of the gate– gTOT = CIN_TOT/CINV

● A sum of the total input capacitance● A sum of the total input capacitance

– Example:● 3‐input NAND● Equivalent inverter is 4:2● gINPUT = 10/6=5/3● gTOT = 30/6 = 5 (or 3·gINPUT)

WP:WN = 4:6

D. Markovic / Slide 3

The total gate effort is not very useful to calculate delay– But it is an indication of the “cost” of the gate– Can be very useful in gate mapping of logic synthesis

● To find which gate is best to use to map a given Boolean expression

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 3

A Note on Asymmetry

The gates used in examples so far have been symmetric– All inputs are essentially identical

● Some are bundled

P:N ratio is approximately equal to the mobility ratio– P:N ratio is approximately equal to the mobility ratio● Same as the reference inverter

The truth is that inputs of most gates are not symmetric– Different inputs may see different capacitances– Even series stacked gates may not be the same size

Pull‐up and pull‐down is rarely equal resistanceC ll thi “ k d” t

D. Markovic / Slide 4

– Call this “skewed” gates– P:N sizing for optimal delay is β = root(µ)– Inverter in a domino gate often favors the PMOS to improve

speed

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 4

3

Asymmetric Examples

PMOS NOR: different sizingEquivalent inv is still 6:2– CinA = 10, CinB = 26

(10/8 5/4) t l t

For more complex gates:Equivalent inv is now 9:3− gEDC=21/12=7/4,

g 30/12 5/2– gA (10/8=5/4) not equal to gB (26/8=13/4)● B input is much worse…

(for a reason)

Bd

e

a’

b’cde

2412

12

18

18

999

gA’B’=30/12=5/2

D. Markovic / Slide 5

A

NOR

Outputb’a’

c

d

8

2 2

12

12

12 12

Same β = µ = 3

The larger g is BAD! (More effort to do logic)

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 5

What is the Reference Inverter?

The assumption of logical effort is that the reference inverter has equal rise and fall delays– β =µ

The implication for different pull‐up and pull‐down resistance is that the rising and falling delays are not equal– Similar to the parasitic delay calculation earlier

● dUP = gUP·h + pUP

● dDN = gDN·h + pDN

D. Markovic / Slide 6

Since static CMOS gates are inverting, the transitions through subsequent gates must be alternating– Use an average logical effort (and parasitic effort) – a good idea

● gAVG = (gUP + gDN)/2

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 6

4

Skewed P:N Ratio Gates

Assume: β = 1.7, µ = 3Example: Inverter– Pull‐up: reference inverter is sized P:N of WP:WP/µ

W (1 1/1 7)/W (1 1/ ) 1 19● gUP = WP(1+1/1.7)/WP(1+1/µ) = 1.19

– Pull‐down: reference inverter is sized P:N of µWN:WN

● gDN = WN(1+1.7)/WN(1+ µ) = 0.675

– gAVG = 0.93 (instead of 1)Example: 2‐input NAND Gate– gUP = WP(1+2/1.7)/WP(1+1/µ) = 1.63

W (2 1 7)/W (1 ) 0 925

D. Markovic / Slide 7

– gDN = WN(2+1.7)/WN(1+ µ) = 0.925– Average = 1.28 (instead of 5/4)

Skewing rise/fall transition in gates may improve delay(because reference gate wasn’t optimized for min tp)

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 7

Velocity Saturation

Velocity saturation complicates a little– Need to account for the change in resistance

A f i t i W 2WAssume reference inverter is WP=2WN

Example: a 2‐input NAND gate with sizing WP=2, WN=2– Assuming RN_nostack=(4/3)RN_stack, RP_nostack=(6/5)RP_stack

– The equivalent inverter with same drive resistance● Pull‐up: equiv inv = 2:1, gUP=4/3 (same as before)

D. Markovic / Slide 8

● Pull‐down: equiv inv = 8/3:4/3, gDN=4/4=1

This makes sense because velocity saturation allows the transition through the series stacking be faster (more current)

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 8

5

Choosing a Reference Inverter

The reference inverter can actually be any βLogical effort can be used for “optimally” sizing a logic chain– Part of optimal sizing is to add inverters to increase N

(the number of logic stages)(the number of logic stages)– It is inconvenient to add inverters that do not have a LE of 1

A common practice is to choose the reference × inverter β to be the one that is used. Use a normalization to adjust the g.– Calculate gINV(β =µ) assuming the reference × inverter gAVG = 1– Example:

● g = 1 with β = 1 7 (µ = 3)

D. Markovic / Slide 9

● gREF = 1 with β = 1.7 (µ = 3)● gAVG_INV(β=µ) = 1.16 for β = 3 (µ = 3) ● A 2‐input NAND β = 1.7 has g = 1.37● A 2‐input NAND β = 3 has g = 1.45 = 1.25·1.16

– g of 2‐input NAND with reference inverter of β = µ is 1.25

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 9

Generalization for Multistage Networks

Same concept applies at the path level

Stage effort: fi = gi· hi

Path electrical effort: Hpath = Cout /Cin

Path logical effort: Gpath = g1g2…gN

Branching effort: B th = b1b2 bN I hi

D. Markovic / Slide 10EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 10

Branching effort: Bpath b1b2…bN

Path effort = Gpath· Hpath· Bpath

Path delay D = Σdi = Σpi + Σgi· hi

Ignore thisfor now

6

Total Effort

A path can have a total effort

– G = Path Logical Effort ∏=

=N

iiPATH gG

1

– H = Path Electrical Effort

– F = Path Effort● Consider path effort as fanout. Total fanout

should be a product of all fanouts.

Treat the path as a single “gate”

∏=

=

=

=

N

iiPATH

IN

OUTPATH

i

fF

CCH

1

1

D. Markovic / Slide 11

– Treat the path as a single “gate”

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 11

Example: Total Effort

Calculations

– G = (4/3)2(5/3) = 3∏

=

=

OUT

N

iiPATH

CH

gG1

– H = 60/4 = 15

– F = 45; F ?= GH (yes for this case)

g=4/3p =2

WP:WN=2:2f = 4

∏=

=

=

N

iiPATH

IN

OUTPATH

fF

CH

1

WP:WN=24:6

D. Markovic / Slide 12EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 12

g=4/3

p =2

p =2

g=5/3

p =2

Cout=60

WP:WN=6:6

12

3f1 = 4

f2 = 3.33

f3 = 3.33

7

Example: Random Multi‐stage Networks

Delay of a multi‐stage network = sum of stage delays

– Path Effort Delay ∑=i

iF fD

– Path Parasitic Delay

– Total Path Delay

∑=i

ipP

∑ +==i

Fi PDdD

f1 = 4

f 3 33g = 4/3W :W = 2:2

D. Markovic / Slide 13EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 13

f2 = 3.33

f3 = 3.33

DF = 10.66, P = 6

D = 16.66 (τ delays)g = 4/3p = 2

g /3p = 2

g = 5/3p = 2

Cout = 60

WP:WN = 2:2

WP:WN = 6:6

12

3

WP:WN = 24:6

Add Branching Effort

Ratio of total to on‐path capacitance– How much more current is needed to supply on‐path

(given that some current “flows” off‐path)

Con-path

Con-path + Coff-pathb =

D. Markovic / Slide 14EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 14

cload

8

Branching Example #1

5

15

90

g =h =F =f

190/5 = 1818 (wrong!)(15+15)/5 6

Introduce new kind of effort to account for branching:

– Branching Effort:

15 90f1 =f2 =F =

(15+15)/5 = 690/15 = 636, not 18!

Con‐path + Coff‐path

Cb =

D. Markovic / Slide 15

– Path Branching Effort:

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 15

Con‐path

Π biB =

Now we can compute path effort: H = ∏g·h·b

Multistage Networks with Branching

General logical effort formulation

Stage effort: fi = gi· hi

Path electrical effort: Hpath = Cout /Cin

Path logical effort: Gpath = g1g2…gN

Branching effort: B th = b1b2 bN

D. Markovic / Slide 16EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 16

Branching effort: Bpath b1b2…bN

Path effort = Gpath· Hpath· Bpath

Path delay D = Σdi = Σpi + Σgi · bi · hi

Branching

9

Branching Example #2

g = 4/3p = 2

Cout = 60Cin = 4

WP:WN = 2:2 WP:WN = 24:6

WP:WN = 6:6 WP:WN = 24:6

12

3

f1 = 8

f2 = 6.66

f1 = 4

f2 = 3.33

branching w/o bran.

For circuits with branching:– GPATH is the same = 3– HPATH is the same = 15– FPATH differs

g = 4/3p = 2

g = 5/3p = 2

Cout 60

WP:WN = 6:6

2f3 = 3.33 f3 = 3.33

D. Markovic / Slide 17

FPATH differs● h1 = 24/4= 6, h2 = 60/12=5, h3=2● FPATH = 4/3 · 6 · 4/3 · 5 · 5/3 · 2 = 177.8

F is no longer GH– New FPATH with branching: FPATH = G·B·H

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 17

Interconnects

Capacitance from wires are the most difficult to deal withFixed load so intuitively, we would increase fanoutShort wires – small parasitic capacitance– Treat them as increasing p for each gateTreat them as increasing p for each gate

● Not exact but accounts for the effectLong wires – large capacitance (dominates gate loading)– Size of driving gate is as if driving a large C (inverter chain)– Size of receiving gate does not impact branching

● It is typically big and limited by area, power, and functionMedium wires – most difficult. Cap similar to gate loading

D. Markovic / Slide 18

– Brute force is to write delay as function of gate and wire capacitances ● Nth order polynomial and differentiate

– More realistic method is to iterate

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 18

10

Handling Wires & Fixed Loads

iCL

Cw

i

D. Markovic / Slide 19EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 19

Summary

Delay and/or power of a logic network depend significantly on the relative sizes of logic gates (not transistors within a gate)Inverter buffering is a simple example of the analysis– The analysis leads to ~FO4 as being optimal fanout for drivingThe analysis leads to FO4 as being optimal fanout for driving

larger capacitive loadsTo generalize analysis of delay, we introduce logical effort– Delay normalized by inverter delay, d = g·h + p– g and p are characteristics of a logic gate that depends on its

structure and does not depend on gate size.● May have different g’s and p’s for different inputs and PU / PD

D. Markovic / Slide 20

● Simplify by using gAVG and ignoring C’s of intermediate nodes– Once a table of g’s and p’s are created for the catalog of gates,

delay can be calculated quickly and easilyNext we will look at how to size a network (instead of just analyzing it)

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 20

11

Optimum Effort per Stage

When each stage bears the same effort:

Fanout of each stage:

Complex gates should drive smaller load!!!

D. Markovic / Slide 21EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 21

Minimum path delay

Gate Sizing Example (1/2)

From: David Harris

First, compute path effort:

Hpath

D. Markovic / Slide 22

The optimal stage effort is:

path

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 22

12

Gate Sizing Example (2/2)

We can now size the gates:

8.1345.1201 =⋅=z 5.14

45.135

=⋅=yx

D. Markovic / Slide 23

7.1245.13

4=⋅=

zy 1045.1

1 =⋅=xCin

The total normalized delay is (assuming pinv = 1):

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 23

Sizing Example with Branching

Size the gates for optimum delayOnce the path effort is determined, it is quite easy to determine the appropriate gate sizes

Start from the output of a pathiiout gC

C _=– Start from the output of a path– Work backwards to the input

● Check your work if the input is the same as the specification● Assuming each unit W has capacitance of unit C

optiin f

C _ =

g = 4/3Gate 2 (dup) Gate 3 (dup)

B = b1b2=4

F = 177.8, fopt = 5.62

D. Markovic / Slide 24EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 24

g = 4/3p = 2

g 4/3p = 2

g = 5/3p = 2

Cout = 601

23Cin = 4

Cin3 = 60*5/3*1/5.6 = 17.9

Cin2 = b2*17.9*4/3*1/5.6 = 8.5

Cin1 = b1*8.5*4/3*1/5.6 = 4

WP3 = 4/5*17.9 = 14.3

WP2 = ½*8.5 = 4.25

13

Branches: Same # Stages, Different Loads

Example: 2 paths with the same # of stages but different loads– Optimal system has all paths with equal delay– Branching for path 1, b = 1 + α

A ti i th t ~● Assumption is that p1 ~ p2

Cout1

2

2222

1

1111

2211

1 2

in

out

in

out

NN

CCGBF

CCGBF

pFNpFN

DD

===

+=+

=Cin1

Path 1

D. Markovic / Slide 25EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 25

Cout2

111

222

1

2

out

out

in

in

CGBCGB

CC

== αCin2

Path Effort Estimate

With equal‐stage branches, the Path Effort can be estimatedwithout knowing each stage’s, b– Htot = (Cout1+Cout2)/Cin (with B = 1)

Simple example for Path 1:– G = gNAND

– H1 = Cout1/Cin

– B = (1+Cout2/Cout1) ● Since F1 = F2 and G1 = G2, B = (1+Cout2/Cout1)

– F = G·H1·B = gNAND(Cout1+Cout2)/Cin

Cout1

Cout2

Cin1

Cin2

Path 1

Cin

D. Markovic / Slide 26

– Same as F = G·Htot

The errors occur with different g and p for the gates in the two different paths

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 26

14

Unequal‐Length Branches (Not Straightforward)

If both branches are long, then N1=N2 is an o.k. assumptionOtherwise, fopt differs between two paths

Solving precisely, example:C = H C

2dinv

1C1

– 2dinv − p = Cout1/C1

– (dinv − p)2 = Cout2/C2

– dnand = gnand(C1 + C2)/Cin

– D = dnand + 2dinv = gnand(Cout1/(2dinv − p) + Cout2 /(dinv − p)2) + 2dinv

– Take partial derivative of ∂D/∂(dinv)

Cout2

= H2·Cin

Cin = 1

Cout1 = H1·Cin

dinv dinvdnand 2

C2

D. Markovic / Slide 27

Not easy so ignore p to simplify– If gNAND = 4/3, H1 = H2 = 3: dnand= 2.35, dinv = 1.80– The per stage f = g·h (no p) is different per stage

Most branches are relatively long or not critical path

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 27

Re‐convergent Paths

Very similar to unequal branching, another logic structure that adds complexity to Logical Effort is recombined branches.– Typically constrains the sizing problem

Example: ignoring parasitics. Let x = Cin = 1– fa = (y+z)/x– fb = (3z/y)– fc = gNANDCout/z– 2 variables

Cout

Cinx

yz

fa

fb

fc2x NANDz

D. Markovic / Slide 28

– Constrain with fa = fb = fc, 2 vars equations (directly solve)● Any constraints are possible

– Increase variables by introducing more buffering (parallel to y)

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 28

15

Logic Opt. Example: 2‐Stage; 8‐Input AND

Assume symmetric 8‐input AND (β = µ =2, 3Wo is unit C)– gNAND=10/3, pNAND=8 for an 8‐input NAND– gINV = 1, pINV=1– Cout=100, Cin=1Cout 100, Cin 1

Logical effort– G = 10/3, B = 1, H = 100– F = 333.3– For 2 stages, fopt=18.3

● h2 = 18.3, Cin2 = 5.5, WP = 11Wo● h1 = 5.5, Cin1 = 1, WP = 0.6Wo

D l 36 6 9 45 6

x8

Cout

G1 G2

D. Markovic / Slide 29

● Delay = 36.6 + 9 = 45.6– Buffering

● For 3 stages, fopt = 7 (1 extra inverter), Delay = 31● For 4 stages, fopt = 4.3 (2 extra inverters), Delay = 28.2● For 5 stages, fopt = 3.19 (closer to optimal of 3.6), Delay=28● For 6 stages, fopt = 2.63 (below optimal of ~3.6), Delay=28.8

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 29

Logic Opt. Example: Multi‐2‐Input Gate 8‐AND

Many ways to implement this same function

Use a tree of fewer input AND gates– (A0A1)(A2A3)…– If multiple ANDs (as in a memory decoder), then partial

results can be shared

D. Markovic / Slide 30EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 30

16

2‐input Implementation

Assuming that 3W has capacitance of unit C

out

Cout = 100

G7G6G5G4G3G2

Cin = 1

Assuming that 3Wo has capacitance of unit C– F = B·G·H = 4/33 ·100 = 235– fopt = 2.48 (too small)– D = 6·2.48 + 3·2 + 3·1 = 14.88 + 9 = 23.88

● Still better than 8‐input NANDOptimal sizing– G7: Cin7 = 100ginv/f = 40, WP=80W0

D. Markovic / Slide 31

– G6: Cin6 = Cin7gnand/f = 21.5, WP=32W0

– G5: Cin5 = Cin6ginv/f = 8.7, WP = 17.4W0

– G4: Cin4 = Cin5gnand/f = 4.7, WP = 7W0

– G3: Cin3 = Cin4ginv/f = 1.88, WP = 3.8W0

– Double check G2: Cin = Cin3gnand/f = 1, WP=1.5W0

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 31

4‐input Implementation

out

Cout = 100

G5G4G3G2

Cin = 1

Assuming that unit for C = 3Wo

– F = B·G·H = 4/3·6/3·100 = 266– fopt = 4.04 (a tad high)– D = 4·4.04 + 1·4 + 1·2 + 2·1 = 16 .:. 16 + 8 = 24.16

● Slightly worse than 2‐input! Due to self‐loadingOptimal sizing

D. Markovic / Slide 32

p g– G5: Cin5 = 100ginv/f = 24.75, WP=50W0

– G4: Cin4 = Cin5gnand/f = 8.16, WP=12.5W0

– G3: Cin3 = Cin4ginv/f = 2.02, WP=4W0

– Double check G2: Cin=Cin3gnand4/f = 1, WP=1.0W0

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 32

17

Example: 2‐NOR Implementation

i h i f

out

Cout = 100

G5G4G3

G2

Cin = 1

Assuming that unit for C = 3Wo

– F= BGH = 4/32*5/3*100 = 296– fopt = 4.14 (a tad high)– D = 4·4.14 + 3·2 + 1·1 = 16.56 + 7 = 23.6 (Best one!)

Optimal sizing– G5: Cin5= 100ginv/f = 24.1, WP=48W0

D. Markovic / Slide 33

5 5 0

– G4: Cin4 = Cin5gnand/f = 7.76, WP=11.6W0

– G3: Cin3 = Cin4gnor/f = 3.125, WP=7.5W0

– Double check G2: Cin = Cin3gnand/f = 1, WP = 1.5W0

Lesson: use close to optimal f, use gates with small p

EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 33

Logical Effort “Design Flow”

Compute the path effort: Path Effort = F = ∏ g· h· b

Find the best number of stages: N* ~ log4(PathEffort)

Compute the stage effort: se* = (PathEffort)1/N

Working from either end, determine gate sizes:

D. Markovic / Slide 34EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 34

Reference: Sutherland, Sproull, Harris, “Logical Effort,” (Morgan-Kaufmann 1999)

18

Summary of LE Design Methodology

1. Draw network2. Buffer non‐critical paths with minimum‐sized gates

• Minimize loading on critical path• Simplifies sizing of non‐critical pathSimplifies sizing of non critical path

3. Estimate total effort along each path (without branching)4. Verify that the number of stages is appropriate

• Add inverters if fopt > 55. Assign branch ratio of each branch

• Estimate based on the ratio of the Effort of the paths• Ignore paths that have little effect (i.e. min‐sized)

D. Markovic / Slide 35

• Include wire capacitances6. Compute delays for the design (include parasitic delay)

• Adjust branching ratios (especially with wire capacitance)• Repeat if necessary until delay meets specification

7. Re‐optimize logic network if fopt is small (Return to step 3)EEM216A .:. Fall 2010 Lecture 5: Logical Effort | 35