High-Dimensional Metamodeling for Prediction of Clock Tree Synthesis Outcomes


Andrew B. Kahng, Bill Lin and Siddhartha Nath

VLSI CAD LABORATORY, UC San Diego

-2-

Outline

Challenges
Testcase generation
Design of experiments
New estimation technique
Prediction methodologies
Validation of our methodologies
Conclusions

-3-

Challenge: High Dimensionality

Why is CTS prediction hard?

[Figure: a CTS instance is defined by testcases, layout contexts, and tools & knobs; the outcomes to predict are power, skew, delay and wirelength]

CTS prediction is difficult due to inherent high dimensionality

-4-

Challenge: Sensitivity

Delay varies by up to 43% with clock entry point locations
Delay varies by up to 45% with core aspect ratio

[Figure: fall delay (ps, axis from 100 to 800) versus core aspect ratio (0.100 to 8.000), with series labeled BL, BLM, B, RBM and R]

CTS outcomes are sensitive to instance parameters

-5-

Challenge: Multicollinearity

Estimation error % = % difference between actual and predicted outcomes
Up to 5× larger estimation errors as D increases from 8 to 13 for the MARS, RBF and KG techniques

[Figure: estimation error (0% to 350%) for MARS, RBF, KG and HSM, each shown for LHS and AS, with D = 8 to 13]

Estimation errors increase at high dimensions

-6-

Challenge: Realistic Instances

Simple testcases and layout contexts do not reflect real-world CTS instances

[Figure: ISPD 2010 CTS Benchmark 01, showing sinks (x, y), a rectangular core and a placement blockage]

-7-

Contributions

Generate realistic testcases with real-world CTS structures
Study and identify appropriate modeling parameters
Propose hierarchical hybrid surrogate modeling (HHSM), a divide-and-conquer approach to overcome parameter collinearity issues
Develop prediction methodologies for practical use models
  – Which tool should be used?
  – How should the tool be driven?
  – How wrong can the model guidance be?
Validate methodologies on a new CTS instance

-8-

Related Works

Testcases
  – Tsay90: CTS testcases r1 - r5 with sink (x, y) coordinates
  – ISPD 2010: placement blockage, inverters/buffers in clock hierarchy
Prediction
  – Kahng02: CUBIST to estimate clock skew, insertion delay
  – Kahng13: MARS, RBF, KG, HSM to estimate several clock metrics; uniform placement of sinks, no combinational logic

Gaps in testcases and layout contexts



-10-

Example of Our CTS Testcase

Real-world clock structures
  – Clock-gating cells (CGCs)
  – Clock dividers
  – Glitch-free clock MUX
Multiple levels in the clock tree hierarchy (K6 vs. K2)
Generators, runscripts to be published

[Figure: clock network from the clock root pin (clk) to the sinks, with a glitch-free MUX (mux_en[0]), clock dividers DIV-4, DIV-8 and DIV-24, clock-gating cells (CGC) enabled by cg_en[0] to cg_en[6], and nodes labeled K1 to K6]

-11-

Example of Our CTS Instance

[Figure: layout of a CTS instance with labels for sinks (x, y), core (aspect ratio = 1), placement and routing blockage, buffers, clock-gating cells, clock entry point location, clock dividers, clock MUX and nonuniform sink placement]



-13-

Modeling Parameters

Microarchitectural
  – Msinks: # sinks
Floorplan context
  – Mcore, MAR: core area and aspect ratio
  – MCEP: clock entry point
  – Mblock: placement and routing blockage % of core area
Tool constraints
  – Mskew, Mdelay: max skew and insertion delay
  – Mbuftran, Msinktran: max buffer and sink transition time
  – MFO: max fanout
  – Mbufsize, Mwire: max buffer size and wire width
Nonuniformity measure
  – MDCT: nonuniformity in sink placement

-14-

Modeling Flow

[Flow diagram: Testcase Verilog RTL → Synthesis (DC) → Gate-level netlist → Generate placed DEF → CTS instance → CTS + CT route (ToolA, ToolB) → Extract CTS metrics → Metamodeling → Fitted models for metrics. Inputs along the way: floorplan parameters, CTS tool parameters, µArch parameter, nonuniformity parameter.]

-15-

Metamodeling Techniques

Accurate because they derive surrogate models from actual post-CTS data

Our techniques
  – Hybrid Surrogate Modeling (HSM) [Kahng13]
  – Multivariate Adaptive Regression Splines (MARS) [Friedman91]
  – Radial Basis Function (RBF) [Buhmann03]
  – Kriging (KG) [Matheron78]



-17-

Multicollinearity

If parameters are linear combinations of each other
  – Example: MAR, Mbuftran, Msinktran, Mwire
  – Matrix of parameters is ill-conditioned
  – Large variance in regression coefficients
  – Hard to determine relationship between parameters and output
  – Large errors between actual and predicted outputs as D increases
Previous work [Kahng13] reports large estimation errors (≥ 30%) when D ≥ 10

$$\hat{y}(\vec{x}) = f(\beta, \vec{x}) + \varepsilon(\vec{x})$$

$$f(\beta, \vec{x}) = \beta_0 + \sum_{i=1}^{D} \beta_i f(x_i)$$
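The ill-conditioning and coefficient-variance effects can be illustrated numerically. The sketch below is not from the slides: it uses synthetic data (sample count, noise level and coefficients are all assumptions) and plain least squares to compare an independent parameter matrix with one whose third column is almost a linear combination of the first two.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # number of synthetic samples (assumed)

# Three independent synthetic parameters.
x1, x2, x3 = rng.normal(size=(3, n))

# Independent matrix vs. one whose third column is almost x1 + x2.
X_indep = np.column_stack([x1, x2, x3])
X_coll = np.column_stack([x1, x2, x1 + x2 + 1e-6 * x3])

cond_indep = np.linalg.cond(X_indep)
cond_coll = np.linalg.cond(X_coll)


def coef_spread(X, trials=50, noise=0.01):
    """Largest std. dev. of least-squares coefficients over noisy refits."""
    y_true = X @ np.array([1.0, 2.0, 3.0])
    coefs = []
    for _ in range(trials):
        y = y_true + noise * rng.normal(size=n)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        coefs.append(beta)
    return float(np.std(coefs, axis=0).max())


spread_indep = coef_spread(X_indep)
spread_coll = coef_spread(X_coll)
print(f"condition number: independent={cond_indep:.1f}, collinear={cond_coll:.2e}")
print(f"max coefficient std: independent={spread_indep:.4f}, collinear={spread_coll:.2f}")
```

With the same small noise on the response, the collinear matrix has a far larger condition number and its fitted coefficients swing far more between refits, which is the behavior the bullets describe.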

-18-

Our Solution: HHSM

Hierarchical Hybrid Surrogate Modeling
Divide the parameters (D) into two sets
  – One set of k parameters has low collinearity
  – Other set of D – k parameters may have high collinearity
  – Derive HSM surrogate models for each set
  – Combine using weights from least-squares regression

$$\hat{y}(\vec{x}) = w_1 \hat{y}_{\mathrm{HSM}(k)}(\vec{x}) + w_2 \hat{y}_{\mathrm{HSM}(D-k)}(\vec{x})$$

where w1 and w2 are weights
  w1 : k parameters with low collinearity
  w2 : D – k parameters with high collinearity
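The combination step can be sketched as follows. This is an illustration only, not the authors' implementation: the two prediction arrays are hypothetical stand-ins for the outputs of the HSM(k) and HSM(D – k) models, the data are synthetic, and the function names are invented for this example.

```python
import numpy as np


def fit_combination_weights(pred_a, pred_b, y_actual):
    """Least-squares weights (w1, w2) for y ~ w1 * pred_a + w2 * pred_b."""
    A = np.column_stack([pred_a, pred_b])
    w, *_ = np.linalg.lstsq(A, y_actual, rcond=None)
    return w


def combine(pred_a, pred_b, w):
    """Weighted sum of the two surrogate predictions."""
    return w[0] * np.asarray(pred_a) + w[1] * np.asarray(pred_b)


def rmse(pred, actual):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(actual)) ** 2)))


# Hypothetical data standing in for measured outcomes and model predictions.
rng = np.random.default_rng(1)
y_actual = rng.uniform(100.0, 800.0, size=50)        # e.g. measured delay (ps)
pred_low = y_actual + rng.normal(0.0, 10.0, 50)      # stand-in for HSM(k)
pred_high = y_actual + rng.normal(0.0, 40.0, 50)     # stand-in for HSM(D - k)

w = fit_combination_weights(pred_low, pred_high, y_actual)
y_hat = combine(pred_low, pred_high, w)
print("weights:", w)
print("RMSE low/high/combined:",
      rmse(pred_low, y_actual), rmse(pred_high, y_actual), rmse(y_hat, y_actual))
```

On the fitting data, the least-squares combination can never be worse than either model alone, since the weight pairs (1, 0) and (0, 1) are among the candidates it minimizes over.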

-19-

HHSM Accuracy

Up to 4× reduction in estimation errors
HHSM errors vary by ≤ 2% as D increases from 8 to 13
Worst-case error ≤ 13%

[Figure: estimation error (0% to 40%) of HSM versus HHSM for skew, delay, power and wirelength (WL), with D = 8 to 13]



-21-

Use Models For Prediction

Develop methodologies to answer three questions
  – Q1: Which tool should be used?
  – Q2: How should the tool be driven?
  – Q3: How wrong can the model guidance be?

-22-

Q1: Which Tool Should Be Used?

Methodology
  – Determine the better tool using models
  – Compare with actual post-CTS data

Incorrect Tool Prediction %

D     Skew    Power   Delay   Wirelength
8     5.26    4.55    4.87    4.92
9     5.26    4.6     4.93    5.01
10    5.82    4.62    4.94    5.03
11    5.88    4.9     4.94    5.11
12    6.12    5.23    4.95    5.25
13    6.13    5.23    4.98    5.27

Errors increase for 8 ≤ D ≤ 11
Errors saturate for D ≥ 12
Worst-case prediction error = 6.13%
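One way to compute an incorrect-tool-prediction percentage is sketched below. It is an assumption-laden illustration, not the authors' script: it assumes a lower outcome is better (as for skew, power, delay and wirelength), breaks ties in favor of ToolA, and uses made-up numbers.

```python
def incorrect_tool_prediction_pct(pred_a, pred_b, act_a, act_b):
    """Percent of instances where the model-chosen tool differs from the
    tool that is actually better in post-CTS data (lower is better)."""
    n = len(pred_a)
    if n == 0 or not (len(pred_b) == len(act_a) == len(act_b) == n):
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = 0
    for pa, pb, aa, ab in zip(pred_a, pred_b, act_a, act_b):
        model_pick = "ToolA" if pa <= pb else "ToolB"
        actual_pick = "ToolA" if aa <= ab else "ToolB"
        if model_pick != actual_pick:
            wrong += 1
    return 100.0 * wrong / n


# Hypothetical skew values (ps) for four CTS instances.
pred_a, pred_b = [30, 42, 25, 60], [35, 40, 27, 55]
act_a, act_b = [31, 39, 26, 61], [34, 41, 28, 54]
print(incorrect_tool_prediction_pct(pred_a, pred_b, act_a, act_b))  # 25.0
```

In this toy data the model and the measurements disagree only on the second instance, so one of four choices is wrong.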

-23-

Q2: How Should The Tool Be Driven?

Methodology
  – Determine the smallest and largest values of parameters that deliver desired outcome

Parameter subspaces for tools

Desired      Max Skew (ps)          Max Delay (ns)            Max Buffer Transition (ps)
skew (ps)    ToolA      ToolB       ToolA        ToolB        ToolA       ToolB
5            N          N           N            N            N           N
25           10 - 25    25 - 50     1.0 - 1.75   1.5 - 2.50   275 - 450   300 - 475
50           10 - 50    25 - 100    1.0 - 2.0    1.5 - 1.75   275 - 575   300 - X
100          10 - 100   40 - 115    1.0 - X      1.5 - X      300 - X     300 - X
200          10 - 100   45 - 115    1.0 - X      1.5 - X      300 - X     300 - X
500          10 - 100   45 - 115    1.0 - X      1.5 - X      300 - X     300 - X

N: infeasible
X: unbounded
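A minimal sketch of how such a range could be read off a fitted model: sweep candidate values of one tool parameter and keep the smallest and largest that meet the target. The linear `toy_skew_model` is a made-up stand-in for a fitted model, and the candidate grid is an assumption.

```python
def feasible_range(model, candidates, target):
    """Smallest and largest candidate whose predicted outcome meets the
    target; None when no candidate is feasible."""
    ok = [c for c in candidates if model(c) <= target]
    return (min(ok), max(ok)) if ok else None


def toy_skew_model(max_skew_ps):
    """Hypothetical model: predicted post-CTS skew (ps) as a function of
    the max-skew constraint (ps). Not a fitted model."""
    return 8.0 + 0.6 * max_skew_ps


candidates = range(10, 121, 5)  # assumed sweep grid for the constraint (ps)
print(feasible_range(toy_skew_model, candidates, target=25))  # (10, 25)
print(feasible_range(toy_skew_model, candidates, target=5))   # None
```

A `None` result corresponds to an infeasible entry. This sketch has no notion of an unbounded range: when the largest feasible value equals the end of the sweep grid, the grid may simply have been too short.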

-24-

Q3: How Wrong Can The Guidance Be?

Methodology
  – Compare model and actual outcomes of tools
  – If model is wrong, compute the suboptimality:

$$SUB = \frac{\text{CTS outcome difference}}{\text{Better tool outcome}} \times 100$$

Wrong guidance % and suboptimality % (Power)

      ToolA           ToolB
D     SVM     SUB     SVM     SUB
8     5.38    5.89    5.22    9.08
9     5.38    9.07    5.24    9.07
10    5.78    9.2     5.67    9.22
11    5.8     8.25    6.04    8.96
12    5.8     6.45    6.22    8.93
13    5.81    3.12    6.22    8.93

Suboptimality ≤ 10%
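The SUB formula can be written as a small helper. The function name and the example power values are assumptions for illustration; the absolute value reflects a reading of "outcome difference" as a magnitude.

```python
def suboptimality_pct(chosen_outcome, better_outcome):
    """SUB = |chosen - better| / |better| * 100, in percent."""
    if better_outcome == 0:
        raise ValueError("better tool outcome must be nonzero")
    return abs(chosen_outcome - better_outcome) / abs(better_outcome) * 100.0


# Hypothetical case: the model picked a tool giving 10.5 mW of clock power,
# while the other tool would have given 10.0 mW.
sub = suboptimality_pct(10.5, 10.0)
print(f"SUB = {sub:.2f}%")
```

Here the wrong choice costs about 5% relative to the better tool's outcome.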



-26-

Validation on “New” CTS Instance

How well do our prediction methodologies generalize?
Goals
  – Apply methodologies to a new CTS instance
  – Obtain skew target ≤ 30ps
Determine parameter values from subspace results of Q2

Max Skew (ps)   Max Delay (ns)   Max Buffer Transition (ps)   CTS Tool   Post-CTS Skew (ps)   Number of CTS runs
15              1.1              325                          ToolA      29.8                 4
35              1.6              350                          ToolB      27.7                 5

Generalizes with small overhead
Few CTS runs to deliver the desired outcome



-28-

Conclusions

Study high-D CTS prediction with appropriate modeling parameters
Generate testcases with real-world CTS structures
Propose HHSM to limit error to ≤ 13% even with multicollinearity
Develop methodologies for practical use models
Ongoing work
  – Learning techniques to cure high-D multicollinearity
  – Methodologies to characterize EDA tools
  – Apply methodologies to reduce time and cost for IC implementation

-29-

Acknowledgments

Work supported by NSF, MARCO/DARPA, SRC and Qualcomm Inc.

-30-

Thank You!

-31-

Backup

-32-

Brief Background on Metamodeling

General form of estimation

$$\hat{y}(\vec{x}) = f(\beta, \vec{x}) + \varepsilon(\vec{x})$$

$$f(\beta, \vec{x}) = \beta_0 + \sum_{i=1}^{D} \beta_i f(x_i)$$

$$\vec{x} = \{x_1, x_2, \ldots, x_{D-1}, x_D\}$$

where,
  $\hat{y}(\vec{x})$: predicted response
  $f(\beta, \vec{x})$: deterministic response
  $\varepsilon(\vec{x})$: random noise function
  $\beta$: regression coefficients

-33-

Regression Function: MARS

$$f(x_i) = \prod_{j=1}^{I_i} \left[ b_{ji} \left( x_v - t_{ji} \right) \right]_+$$

where,
  $I_i$: # interactions in the ith basis function
  $b_{ji}$: ±1
  $x_v$: vth parameter
  $t_{ji}$: knot location

Knot = value of parameter where line segment changes slope
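A MARS basis function of this form can be evaluated as a product of hinge terms. The sketch below is illustrative only: the function names and the encoding of each factor as a `(param_index, knot, sign)` tuple are assumptions made for this example.

```python
def hinge(x, knot, sign):
    """[sign * (x - knot)]_+ : zero on one side of the knot, linear on the other."""
    return max(0.0, sign * (x - knot))


def mars_basis(x, terms):
    """Product of hinge factors; terms is a list of (param_index, knot, sign)."""
    value = 1.0
    for v, knot, sign in terms:
        value *= hinge(x[v], knot, sign)
    return value


# Two interacting factors: (x0 - 1)_+ and (7 - x1)_+.
terms = [(0, 1.0, +1), (1, 7.0, -1)]
print(mars_basis([2.0, 5.0], terms))  # (2 - 1) * (7 - 5) = 2.0
print(mars_basis([0.5, 5.0], terms))  # x0 is below its knot, so 0.0
```

Each factor contributes only on one side of its knot, which is why the fitted response is piecewise linear in each parameter.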

-34-

Regression Function: RBF

$$f(x_i) = \sum_{j=1}^{N} a_j K(\mu_j, r_j, x_i)$$

where,
  $a_j$: coefficients of the kernel function
  $K(\cdot)$: kernel function
  $\mu_j$: centroid
  $r_j$: scaling factors
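As an illustration of the RBF sum, the sketch below assumes a Gaussian kernel, K(µ, r, x) = exp(−‖x − µ‖² / (2r²)). The slide does not state which kernel is used, so this choice, the function names and the numbers are all assumptions.

```python
import math


def gaussian_kernel(mu, r, x):
    """Assumed kernel: exp(-||x - mu||^2 / (2 r^2))."""
    d2 = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-d2 / (2.0 * r * r))


def rbf_response(x, coeffs, centroids, scales):
    """Weighted sum of kernels, one term per centroid."""
    return sum(a * gaussian_kernel(mu, r, x)
               for a, mu, r in zip(coeffs, centroids, scales))


# Hypothetical two-kernel model in two dimensions.
coeffs = [2.0, -1.0]
centroids = [[0.0, 0.0], [1.0, 1.0]]
scales = [1.0, 0.5]
value = rbf_response([0.0, 0.0], coeffs, centroids, scales)
print(value)  # 2 * 1 - exp(-4), roughly 1.98
```

At the first centroid the first kernel contributes its full coefficient, while the second, narrower kernel has nearly decayed to zero.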

-35-

Regression Function: KG

$$f(\beta, \vec{x}) = \beta_0 + \sum_{i=1}^{D} \beta_i f(x_i)$$

$$\varepsilon(\vec{x}) = R(\theta, x_i, x_k) = \prod_{j=1}^{N} R_j(\theta, x_k - x_i)$$

where,
  $R(\cdot)$: correlation function (Gaussian, linear, spherical, cubic, …)
  $\theta$: correlation function parameter
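For the Gaussian option in the list of correlation functions, a common form is a product of per-coordinate terms exp(−θ d²), where d is the coordinate difference. The sketch below assumes that form, a single shared θ and made-up sample points; none of these choices are confirmed by the slides.

```python
import numpy as np


def gaussian_correlation(theta, xi, xk):
    """Product over coordinates of exp(-theta * (xk_j - xi_j)^2)."""
    diff = np.asarray(xk, dtype=float) - np.asarray(xi, dtype=float)
    return float(np.prod(np.exp(-theta * diff ** 2)))


def correlation_matrix(theta, samples):
    """Pairwise correlations between all sample points."""
    n = len(samples)
    return np.array([[gaussian_correlation(theta, samples[i], samples[k])
                      for k in range(n)] for i in range(n)])


# Hypothetical sample points in a two-parameter space.
samples = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
R = correlation_matrix(0.5, samples)
print(np.round(R, 4))
```

The resulting matrix is symmetric with ones on the diagonal, and the correlation decays as sample points move apart.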

-36-

Hybrid Surrogate Modeling (HSM)

Variant of Weighted Surrogate Modeling but uses least-squares regression to determine weights

$$\hat{y}(\vec{x}) = w_1 \hat{y}_{\mathrm{MARS}}(\vec{x}) + w_2 \hat{y}_{\mathrm{RBF}}(\vec{x}) + w_3 \hat{y}_{\mathrm{KG}}(\vec{x})$$

where w1, w2 and w3 are the weights of the predicted response of each surrogate model
  w1 : MARS
  w2 : RBF
  w3 : KG
