8/18/2019 12a Timing Optimization
Logic Restructuring for Timing Optimization
Outline:
• Definitions and problem statement
• Overview of techniques (motivated by adders)
  - Tree height reduction (THR)
  - Generalized bypass transform (GBX)
  - Generalized select transform (GST)
  - Partial collapsing (?)
Timing Optimization
Factors determining delay of circuit:
• Underlying circuit technology
  - Circuit type (e.g. domino, static CMOS, etc.)
  - Gate type
  - Gate size
• Logical structure of circuit
  - Length of computation paths
  - False paths
  - Buffering
• Parasitics
  - Wire loads
  - Layout
Problem Statement
Given:
• Initial circuit function description
• Library of primitive functions
• Performance constraints (arrival/required times)
Generate:
an implementation of the circuit using the primitive functions, such that:
1. performance constraints are met
2. circuit area is minimized
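The arrival/required-time constraints can be made concrete with a small static timing sketch. It is only an illustration under assumptions that are not in the slides: a unit delay per gate, a hypothetical four-input netlist, and made-up names (`timing`, `fanins`, `gate_delay`). A negative slack marks a node where the performance constraint is not met.

```python
def timing(fanins, arrival_in, required_out, gate_delay=1.0):
    """Return (arrival, required, slack) for a DAG given as node -> list of fanins."""
    order, seen = [], set()

    def visit(n):                      # depth-first topological sort
        if n in seen:
            return
        seen.add(n)
        for f in fanins.get(n, []):
            visit(f)
        order.append(n)

    for n in fanins:
        visit(n)

    # Forward pass: latest arrival over all fanins plus one gate delay.
    arrival = {}
    for n in order:
        if n in arrival_in:
            arrival[n] = arrival_in[n]
        else:
            arrival[n] = gate_delay + max(arrival[f] for f in fanins[n])

    # Backward pass: earliest required time over all fanouts minus one gate delay.
    required = {n: float("inf") for n in order}
    required.update(required_out)
    for n in reversed(order):
        for f in fanins.get(n, []):
            required[f] = min(required[f], required[n] - gate_delay)

    slack = {n: required[n] - arrival[n] for n in order}
    return arrival, required, slack


# Hypothetical example: a three-gate chain that must settle by time 2.
fanins = {"x": ["a", "b"], "y": ["x", "c"], "out": ["y", "d"]}
arrival, required, slack = timing(
    fanins, {"a": 0, "b": 0, "c": 0, "d": 0}, {"out": 2.0}
)
```

Here the chain is three gates deep, so `out` arrives at time 3 against a required time of 2: restructuring has to remove one level from the path through `a` and `b`, while `d` has slack to spare.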
Current Design Process
Behavioral description
  -> Behavior optimization (scheduling)
Logic and latches
  -> Partitioning (retiming)
Logic equations
  -> Logic synthesis
     • Technology independent
     • Technology mapping
Gate netlist
  -> Timing driven place and route
Layout
Inputs to the flow:
• Gate library
• Perf. constraints
• Delay models
Technology mapping for delay
[Figure: function tree and buffer tree]
Overview of Solutions for delay
1. Circuit re-structuring
  - Rescheduling operations to reduce time of computation
2. Implementation of function trees (technology mapping)
  - Selection of gates from library
    • Minimum delay (load independent model - Kukimoto)
    • Minimize delay and area (Jongeneel, DAC 00) (combines Lehman-Watanabe and Kukimoto)
Circuit re-structuring
Approaches:
Local:
• Mimic optimization techniques in adders
  - Carry lookahead (THR tree height reduction)
  - Conditional sum (GST transformation)
  - Carry bypass (GBX transformation)
Global:
• Reduce depth of entire circuit
  - Partial collapsing
  - Boolean simplification
Re-structuring methods
Performance measured by
1. levels,
2. sensitizable paths,
[...]
  - Partial collapsing and simplification (Touati '91)
  - Generalized select transform (Berman '90)
• Sensitizable paths
  - Generalized bypass transform (McGeer '91)
Re-structuring for delay: tree-height reduction
[Figure: tree-height reduction example]
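Tree-height reduction can be sketched for the simplest case, a chain of one associative 2-input operator. The code below is an illustration only, not the procedure from the lecture: it assumes a unit gate delay, and the function names are made up. It compares the arrival time of a left-to-right chain with a tree built greedily by always combining the two earliest-arriving signals (a Huffman-style construction).

```python
import heapq


def chain_arrival(arrivals, gate_delay=1.0):
    """Output arrival time of the chain ((x0 op x1) op x2) op ..."""
    t = arrivals[0]
    for a in arrivals[1:]:
        t = max(t, a) + gate_delay
    return t


def reduced_tree_arrival(arrivals, gate_delay=1.0):
    """Output arrival time when the two earliest signals are always combined first."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        heapq.heappush(heap, max(a, b) + gate_delay)
    return heap[0]
```

With eight inputs that all arrive at time 0 the chain takes 7 gate delays and the tree takes 3. With arrivals `[5, 0, 0, 0]` the chain finishes at 8, while the restructured tree finishes at 6 because the late signal enters at the last gate.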
Restructuring for delay: path reduction
[Figure: path reduction example]
Generalized bypass transform (GBX)
• Make critical path false
  - Speed up the circuit
• Bypass logic of critical path(s)
McGeer '91
[Figure: chain f_m = f, f_{m+1}, ..., f_n = g before and after the transform; a multiplexer controlled by the Boolean difference dg/df bypasses the chain; a s-a-0 fault is redundant]
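The bypass idea can be checked on the adder that motivates it. The sketch below is an assumption-laden illustration, not McGeer's construction: it models a 4-bit ripple-carry chain, computes the Boolean difference of the carry-out with respect to the carry-in from the two cofactors, and uses it as the multiplexer select. In hardware the select would be built from the propagate signals rather than from cofactors; the function names are made up.

```python
from itertools import product


def ripple_carry(cin, a, b):
    """Carry-out of a ripple-carry chain; a and b are bit tuples, LSB first."""
    c = cin
    for ai, bi in zip(a, b):
        c = (ai & bi) | ((ai ^ bi) & c)
    return c


def bypassed_carry(cin, a, b):
    """Same function with a bypass multiplexer on the cin -> cout path."""
    g0 = ripple_carry(0, a, b)         # cofactor with cin = 0
    g1 = ripple_carry(1, a, b)         # cofactor with cin = 1
    diff = g0 ^ g1                     # Boolean difference d(cout)/d(cin)
    if diff:
        # cout depends on cin, so it equals cin up to phase: cin skips the chain.
        return cin ^ g0
    # cout does not depend on cin: the long path through the chain is not sensitized.
    return ripple_carry(cin, a, b)


# Exhaustive equivalence check over cin and two 4-bit operands.
equivalent = all(
    ripple_carry(bits[0], bits[1:5], bits[5:9])
    == bypassed_carry(bits[0], bits[1:5], bits[5:9])
    for bits in product((0, 1), repeat=9)
)
```

Whenever the select is 1 the late carry-in reaches the output through the multiplexer alone, and whenever it is 0 the output does not depend on the carry-in, which is why the original long path becomes false.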
GBX and KMS transform
GBX gives little area increase, BUT it creates an untestable fault (on the control input to the multiplexor)
KMS transform: (remove false paths without increasing delay)
1. f_k is the last node on the false path that fans out.
2. Duplicate false path {f_1, ..., f_k} -> {f'_1, ..., f'_k}
KMS (Keutzer, Malik, Saldanha '90)
[Figure: chain f_m, f_{m+1}, ..., f_k, f_{k+1}, ..., f_n before the transform; after it, the prefix is duplicated as f'_m, f'_{m+1}, ..., f'_k and a constant 0 input appears]
Delay is not increased
End of lecture 20
GST vs GBX
• Select transform appears to be more area efficient
• But Boolean difference generally more efficiently formed in practice
• No delay/speedup advantage for either transform
• Need
  - one MUX per fanout in GST,
  - only one MUX in GBX
[Figure: chain with late input a and side inputs b, c, d, e, f, g; under GST the logic is duplicated for a=0 and a=1, and multiplexers controlled by a produce out1 and out2]
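The "one MUX per fanout" cost of the select transform shows up in a small sketch. The two-output logic below is hypothetical and only meant to illustrate the idea: the late input `a` is replaced by both constants, the two copies are evaluated without waiting for `a`, and `a` then selects each output through its own multiplexer.

```python
from itertools import product


def logic(a, b, c, d, e):
    """Hypothetical logic where the late input a feeds the first gate."""
    n = (a & b) | c
    return n & d, n ^ e                # two outputs that both depend on a


def select_transformed(a, b, c, d, e):
    """Select transform on a: evaluate both cofactors, then multiplex."""
    out1_a0, out2_a0 = logic(0, b, c, d, e)   # copy of the logic with a = 0
    out1_a1, out2_a1 = logic(1, b, c, d, e)   # copy of the logic with a = 1
    out1 = out1_a1 if a else out1_a0          # first MUX controlled by a
    out2 = out2_a1 if a else out2_a0          # second MUX controlled by a
    return out1, out2


same = all(
    logic(*bits) == select_transformed(*bits)
    for bits in product((0, 1), repeat=5)
)
```

The late signal now passes through a single multiplexer on its way to each output, at the price of duplicating the logic between `a` and the outputs.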
Technology independent delay reductions
Generally THR, GBX, GST (critical path based methods) work OK, but not great
Why are technology independent delay reductions hard?
Lack of fast and accurate delay models
1. levels: fast but crude
2. levels + correction term (fanout, wires, ...): a little better, but still crude (what coefficients to use?)
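The two crude delay models can be written down in a few lines. This is a sketch under assumptions: the correction term is taken to be a single coefficient times the fanout count of each gate, the coefficient value 0.5 is arbitrary (which is exactly the difficulty the slide points out), and the netlist and the name `level_delay` are made up.

```python
def level_delay(fanins, outputs, fanout_coeff=0.0):
    """Longest path where each gate costs 1 + fanout_coeff * (its fanout count)."""
    fanout = {}
    for fs in fanins.values():
        for f in fs:
            fanout[f] = fanout.get(f, 0) + 1

    memo = {}

    def at(n):
        if n not in fanins:            # primary input
            return 0.0
        if n not in memo:
            cost = 1.0 + fanout_coeff * fanout.get(n, 0)
            memo[n] = cost + max(at(f) for f in fanins[n])
        return memo[n]

    return max(at(o) for o in outputs)


# Hypothetical netlist: x fans out to y and z, which reconverge at out.
fanins = {"x": ["a", "b"], "y": ["x", "c"], "z": ["x", "d"], "out": ["y", "z"]}
levels_only = level_delay(fanins, ["out"])                       # model 1
with_fanout = level_delay(fanins, ["out"], fanout_coeff=0.5)     # model 2
```

Model 1 reports 3 levels; model 2 reports 4.5 because the gate `x` drives two fanouts. Neither number says anything about the mapped gates or the wires.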
Clustering/partial-collapse
Traditional critical-path based methods require
  - Well defined critical path
  - Good delay/slack information
Problems:
  - Good delay information comes from mapper and layout
  - Delay estimates and models are weak
Possible solutions:
  - Better delay modeling at technology independent level
  - Make speedup insensitive to actual critical paths and mapped delays
Clustering/partial-collapse
Two-level circuits are fast
  - Collapse circuit to 2-level, but
    • Huge area penalty
    • Huge capacitive loading on inputs (can be much slower)
To avoid huge area penalty
  - Identify clusters of nodes
    • Each cluster has some fixed size
  - Perform collapse of each cluster
  - Simplify each node
Details
  - How to choose the clusters?
  - How to choose cluster size?
  - How to simplify each node?
Lawler's clustering algorithm
• Optimal in delay:
  - For a given clustering size
• May duplicate nodes (hence possible area penalty)
  - Not optimal w.r.t. duplication
  - Use a heuristic
• Fast: O(m x k)
  - m = number of edges in network
  - k = maximum cluster size
Clustering algorithm - overview
1. Label phase: (k is cluster size), inputs to outputs
  - If node u is an input, label(u) := L := 0
    • Else L := max label of fanin of u
  - If (# nodes in TFI(u) with label = L) >= k then label(u) := L+1, otherwise label(u) := L
2. Cluster phase: (outputs to inputs)
  - If node u is an output, L := infinity
    • Else L := max label of fanouts of u
  - If (label(u) < L) then create a new cluster with root u and with members all the nodes in TFI(u) with label = label(u)
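The two phases above can be sketched directly from the pseudocode. This is a literal, unoptimized reading and not a reference implementation: TFI sets are stored explicitly, so it does not reach the O(m x k) bound; primary inputs are counted as nodes when comparing against k; and, as an added assumption, a primary input on its own is not reported as a cluster root. The function name and the example netlist are made up.

```python
def lawler_cluster(fanins, outputs, k):
    """Label and cluster a DAG given as node -> list of fanins (inputs have no entry)."""
    order, seen = [], set()

    def visit(n):                      # depth-first topological sort
        if n in seen:
            return
        seen.add(n)
        for f in fanins.get(n, []):
            visit(f)
        order.append(n)

    for n in fanins:
        visit(n)

    tfi, label = {}, {}
    for u in order:                    # label phase, inputs to outputs
        fs = fanins.get(u, [])
        tfi[u] = set()
        for f in fs:
            tfi[u] |= tfi[f] | {f}     # transitive fanin, excluding u itself
        if not fs:
            label[u] = 0
            continue
        L = max(label[f] for f in fs)
        same = sum(1 for n in tfi[u] if label[n] == L)
        label[u] = L + 1 if same >= k else L

    fanouts = {n: [] for n in order}
    for u in order:
        for f in fanins.get(u, []):
            fanouts[f].append(u)

    clusters = {}
    for u in reversed(order):          # cluster phase, outputs to inputs
        if not fanins.get(u):          # assumption: skip lone primary inputs
            continue
        if u in outputs or not fanouts[u]:
            L = float("inf")
        else:
            L = max(label[v] for v in fanouts[u])
        if label[u] < L:
            clusters[u] = {u} | {n for n in tfi[u] if label[n] == label[u]}
    return label, clusters


# Hypothetical chain of four 2-input ANDs, clustered with k = 3.
fanins = {"x1": ["a", "b"], "x2": ["x1", "c"], "x3": ["x2", "d"], "out": ["x3", "e"]}
label, clusters = lawler_cluster(fanins, {"out"}, 3)
```

On this chain the labels are 0 for `x1` and 1 for `x2`, `x3` and `out`, giving two clusters rooted at `x1` and `out`, so the four-level chain becomes two levels of collapsed nodes.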
Example of clustering
[Figure: example network with node labels and the resulting clusters; k = ...]
Result: Lawler's algorithm gives minimum depth circuit
Typically,
1. we decompose initial circuit into 2-input ANDs and inverters.
2. then cluster size k reflects the number of 2-input ANDs to be collapsed together.
Choosing k
• l(k): number of levels, given k
• d(k): duplication ratio
  - Number of gates in cluster network divided by number of gates in original network
• Determine k_0 where k_0 / d(k_0) [?] 2.0
• For every k from 2 to k_0, compute d(k), l(k)
  - Use exhaustive enumeration: label and cluster (without collapse) for each k.
  - Each iteration is O(|E| k)
• Choose k such that
  - l(k) is minimized
• Break ties using d(k)
  - Minimize d(k)
[Figure: plot of d(k) and l(k) against k, with the k axis marked 1, 2, ..., k_0]
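The selection rule at the end of this slide (minimize the number of levels, break ties on the duplication ratio) reduces to a lexicographic minimum. The sketch below assumes the two tables have already been filled in by labeling and clustering for each k; the numbers are invented purely to exercise the rule and do not come from any benchmark.

```python
def choose_k(levels, dup):
    """Pick k with the fewest levels; among ties, the smallest duplication ratio."""
    return min(levels, key=lambda k: (levels[k], dup[k]))


# Invented tables for k = 2..5 (illustration only).
levels = {2: 6, 3: 4, 4: 3, 5: 3}
dup = {2: 1.0, 3: 1.2, 4: 1.5, 5: 1.9}
best_k = choose_k(levels, dup)
```

Here k = 4 and k = 5 both reach 3 levels, and k = 4 wins the tie because it duplicates fewer gates.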
Area recovery
Area increase is due to node duplication
  - this occurs when a node is in multiple clusters
Two solutions:
1. Break clusters into smaller pieces off critical path
2. After cluster and collapse, recover area
Relabeling procedure:
Attempt to increase node labels without exceeding cluster size
In reverse topological order
Start: assign [...]
Increase label(u) if
1. new-label(u) <= label(v) for each fanout v, and
2. new-label(u) = new-label(v) for each fanout v only if label(u) = label(v) before relabeling, and
[...]
Relabeling example
[Figure: node labels of the example network before and after relabeling]
Post-collapse area recovery
• Do algebraic factorization, but
  - Undo factorization if depth increases
• Full_simplify
  - Only consider node B as possible fanin of a node (B introduced by using don't cares) if level of B < level of node.
• Redundancy removal
Conclusions
• Variety of methods for delay optimization
  - No single technique dominates (KJ Singh PhD thesis)
• When applied to ripple-carry adder get
  - Carry-lookahead adder (THR)
  - Carry-bypass adder (GBX)
  - Carry-select adder (GST)
  - ? (partial collapse)
• All techniques ignore false paths when assessing the delay and critical regions
  - Can use KMS transform to eliminate false paths without increasing delay (area increase however).