High-Dimensional Metamodeling for Prediction of Clock Tree Synthesis Outcomes

Andrew B. Kahng, Bill Lin and Siddhartha Nath
VLSI CAD LABORATORY, UC San Diego


Outline

• Challenges
• Testcase generation
• Design of experiments
• New estimation technique
• Prediction methodologies
• Validation of our methodologies
• Conclusions

Challenge: High Dimensionality

• Why is CTS prediction hard?

[Figure: diagram relating testcases, layout contexts, and tools & knobs to a CTS instance and its outcomes (power, skew, delay, wirelength)]

CTS prediction is difficult due to inherent high dimensionality

Challenge: Sensitivity

• Delay varies by up to 43% with clock entry point locations
• Delay varies by up to 45% with core aspect ratio

[Figure: fall delay (ps, 100 to 800) versus core aspect ratio (0.100 to 8.000), with series labeled BL, BLM, B, RBM and R]

CTS outcomes are sensitive to instance parameters

Challenge: Multicollinearity

• Estimation error % = % difference between actual and predicted outcomes
• Estimation errors grow by up to 5× as D increases from 8 to 13 for the MARS, RBF and KG techniques

[Figure: estimation error (0% to 350%) for MARS, RBF, KG and HSM, each with LHS and AS, for D = 8 to 13]

Estimation errors increase at high dimensions

Challenge: Realistic Instances

[Figure: ISPD 2010 CTS Benchmark 01, annotated with sinks (x, y), rectangular core, and placement blockage]

Simple testcases and layout contexts do not reflect real-world CTS instances

Contributions

• Generate realistic testcases with real-world CTS structures
• Study and identify appropriate modeling parameters
• Propose hierarchical hybrid surrogate modeling (HHSM), a divide-and-conquer approach to overcome parameter collinearity issues
• Develop prediction methodologies for practical use models
  – Which tool should be used?
  – How should the tool be driven?
  – How wrong can the model guidance be?
• Validate methodologies on a new CTS instance

Related Works

• Testcases
  – Tsay90: CTS testcases r1 - r5 with sink (x, y) coordinates
  – ISPD 2010: placement blockage; inverters/buffers in clock hierarchy
• Prediction
  – Kahng02: CUBIST to estimate clock skew, insertion delay
  – Kahng13: MARS, RBF, KG, HSM to estimate several clock metrics; uniform placement of sinks, no combinational logic

Gaps in testcases and layout contexts


Example of Our CTS Testcase

• Real-world clock structures
  – Clock-gating cells (CGCs)
  – Clock dividers
  – Glitch-free clock MUX
• Multiple levels in the clock tree hierarchy (K6 vs. K2)
• Generators, runscripts to be published

[Figure: example clock tree with the clk root pin, a glitch-free MUX (mux_en[0]), clock dividers DIV-4, DIV-8 and DIV-24, clock-gating cells (CGC, enables cg_en[0] to cg_en[6]), sinks, and labels K1 to K6]

Example of Our CTS Instance

[Figure: layout of a CTS instance annotated with sinks (x, y); core (aspect ratio = 1); placement and routing blockage; buffers; clock-gating cells; clock entry point location; clock dividers; clock MUX; nonuniform sink placement]


Modeling Parameters

• Microarchitectural
  – Msinks: # sinks
• Floorplan context
  – Mcore, MAR: core area and aspect ratio
  – MCEP: clock entry point
  – Mblock: placement and routing blockage % of core area
• Tool constraints
  – Mskew, Mdelay: max skew and insertion delay
  – Mbuftran, Msinktran: max buffer and sink transition time
  – MFO: max fanout
  – Mbufsize, Mwire: max buffer size and wire width
• Nonuniformity measure
  – MDCT: nonuniformity in sink placement

Modeling Flow

[Figure: flow diagram with the boxes: testcase Verilog RTL; synthesis (DC); gate-level netlist; floorplan parameters; generate placed DEF; CTS tool parameters; CTS instance; CTS + CT route (ToolA); CTS + CT route (ToolB); extract CTS metrics; µArch parameter; nonuniformity parameter; metamodeling; fitted models for metrics]

Metamodeling Techniques

• Accurate because they derive surrogate models from actual post-CTS data
• Our techniques
  – Hybrid Surrogate Modeling (HSM) [Kahng13]
  – Multivariate Adaptive Regression Splines (MARS) [Friedman91]
  – Radial Basis Function (RBF) [Buhmann03]
  – Kriging (KG) [Matheron78]


Multicollinearity

• If parameters are linear combinations of each other
  – Example: MAR, Mbuftran, Msinktran, Mwire
  – Matrix of parameters is ill-conditioned
  – Large variance in regression coefficients
  – Hard to determine relationship between parameters and output
  – Large errors between actual and predicted outputs as D increases
• Previous works [Kahng13] report large estimation errors (≥ 30%) when D ≥ 10

$$\hat{y}(\vec{x}) = f(\beta, \vec{x}) + \varepsilon(\vec{x})$$

$$f(\beta, \vec{x}) = \beta_0 + \sum_{i=1}^{D} \beta_i f(x_i)$$
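The ill-conditioning caused by collinear parameters can be illustrated with a minimal NumPy sketch. The synthetic columns, the noise level, and the helper name `condition_number` are assumptions made for illustration; they are not data or code from this work.

```python
import numpy as np

def condition_number(X):
    """Return the 2-norm condition number of a parameter matrix X (samples x parameters)."""
    return float(np.linalg.cond(X))

rng = np.random.default_rng(0)
n = 200

# Three independent synthetic parameters (illustrative values only).
independent = rng.uniform(0.0, 1.0, size=(n, 3))

# A fourth parameter that is almost a linear combination of the first two.
collinear_col = independent[:, 0] + 2.0 * independent[:, 1] + 1e-6 * rng.normal(size=n)
collinear = np.column_stack([independent, collinear_col])

cond_independent = condition_number(independent)
cond_collinear = condition_number(collinear)
print(cond_independent, cond_collinear)
```

Adding the nearly dependent column raises the condition number by several orders of magnitude, which is what makes least-squares coefficients unstable.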

Our Solution: HHSM

• Hierarchical Hybrid Surrogate Modeling
• Divide the parameters (D) into two sets
  – One set of k parameters has low collinearity
  – Other set of D – k parameters may have high collinearity
  – Derive HSM surrogate models for each set
  – Combine using weights from least-squares regression

$$\hat{y}(\vec{x}) = w_1 \, \hat{y}(\vec{x})_{HSM(k)} + w_2 \, \hat{y}(\vec{x})_{HSM(D-k)}$$

where w1 and w2 are weights
  w1: k parameters with low collinearity
  w2: D – k parameters with high collinearity
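The divide-and-combine step can be sketched as follows. This is a minimal illustration under stated assumptions: each sub-model is a plain ordinary-least-squares fit standing in for HSM, the parameter split (`low_idx`, `high_idx`) is given by hand, and all function names and data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Stand-in surrogate: ordinary least squares with an intercept (NOT the HSM of the slides)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef

def fit_hhsm(X, y, low_idx, high_idx):
    """Fit one sub-model per parameter set, then combine them with least-squares weights."""
    model_low = fit_linear(X[:, low_idx], y)
    model_high = fit_linear(X[:, high_idx], y)
    # Columns of P are the two sub-model predictions on the training data.
    P = np.column_stack([model_low(X[:, low_idx]), model_high(X[:, high_idx])])
    w, *_ = np.linalg.lstsq(P, y, rcond=None)  # w[0] plays the role of w1, w[1] of w2

    def predict(Xn):
        Pn = np.column_stack([model_low(Xn[:, low_idx]), model_high(Xn[:, high_idx])])
        return Pn @ w

    return predict, w

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 4))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + 0.5 * X[:, 3]
predict, w = fit_hhsm(X, y, low_idx=[0, 1], high_idx=[2, 3])
print(w)
```

Because the weights come from least squares, the combined training error is never worse than using either sub-model alone with weight 1.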

HHSM Accuracy

• Up to 4× reduction in estimation errors
• HHSM errors vary by ≤ 2% as D increases from 8 to 13
• Worst-case error ≤ 13%

[Figure: estimation error (0% to 40%) of HSM versus HHSM for skew, delay, power and WL, for D = 8 to 13, with annotations "≤ 2%" and "≤ 13%"]


Use Models For Prediction

• Develop methodologies to answer three questions
  – Q1: Which tool should be used?
  – Q2: How should the tool be driven?
  – Q3: How wrong can the model guidance be?

Q1: Which Tool Should Be Used?

• Methodology
  – Determine the better tool using models
  – Compare with actual post-CTS data

Incorrect Tool Prediction %

D     Skew   Power  Delay  Wirelength
8     5.26   4.55   4.87   4.92
9     5.26   4.60   4.93   5.01
10    5.82   4.62   4.94   5.03
11    5.88   4.90   4.94   5.11
12    6.12   5.23   4.95   5.25
13    6.13   5.23   4.98   5.27

• Errors increase for 8 ≤ D ≤ 11
• Errors saturate for D ≥ 12
• Worst-case prediction error = 6.13%
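One way to compute an incorrect-tool-prediction percentage is sketched below. It assumes that a smaller metric value is better and that ties go to ToolA; the function name and all numbers are made up for illustration and do not come from the experiments above.

```python
def incorrect_tool_prediction_pct(pred_a, pred_b, actual_a, actual_b):
    """Percentage of instances where the models pick a different tool than the actual data.

    Assumes a smaller metric value (e.g. skew in ps) is better, and ties go to ToolA.
    """
    if not (len(pred_a) == len(pred_b) == len(actual_a) == len(actual_b)):
        raise ValueError("all inputs must have the same length")
    wrong = 0
    for pa, pb, aa, ab in zip(pred_a, pred_b, actual_a, actual_b):
        model_choice = "ToolA" if pa <= pb else "ToolB"
        actual_choice = "ToolA" if aa <= ab else "ToolB"
        wrong += model_choice != actual_choice
    return 100.0 * wrong / len(pred_a)

# Made-up skew values (ps) for four CTS instances.
pct = incorrect_tool_prediction_pct(
    pred_a=[20.0, 35.0, 18.0, 40.0],
    pred_b=[25.0, 30.0, 22.0, 45.0],
    actual_a=[21.0, 33.0, 24.0, 41.0],
    actual_b=[26.0, 31.0, 20.0, 44.0],
)
print(pct)  # 25.0: only the third instance is mispredicted
```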

Q2: How Should The Tool Be Driven?

• Methodology
  – Determine the smallest and largest values of parameters that deliver desired outcome

Parameter subspaces for tools

            Max Skew (ps)          Max Delay (ns)           Max Buffer Transition (ps)
Skew (ps)   ToolA      ToolB       ToolA       ToolB        ToolA      ToolB
5           N          N           N           N            N          N
25          10 - 25    25 - 50     1.0 - 1.75  1.5 - 2.50   275 - 450  300 - 475
50          10 - 50    25 - 100    1.0 - 2.0   1.5 - 1.75   275 - 575  300 - X
100         10 - 100   40 - 115    1.0 - X     1.5 - X      300 - X    300 - X
200         10 - 100   45 - 115    1.0 - X     1.5 - X      300 - X    300 - X
500         10 - 100   45 - 115    1.0 - X     1.5 - X      300 - X    300 - X

N: infeasible; X: unbounded
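The search for the smallest and largest parameter values that meet a target can be sketched as a simple sweep over candidate values. Here `toy_skew_model` is a made-up stand-in for a fitted model, and the sketch assumes the feasible values form one contiguous range; neither is taken from the source.

```python
def feasible_range(model, candidates, target):
    """Return (smallest, largest) candidate whose predicted outcome meets the target, or None.

    `model` is any callable mapping a parameter value to a predicted outcome
    (hypothetical here); None corresponds to an infeasible entry (N) in the table.
    """
    ok = [c for c in candidates if model(c) <= target]
    if not ok:
        return None
    return min(ok), max(ok)

def toy_skew_model(max_skew_ps):
    # Made-up stand-in for a fitted skew model: predicted post-CTS skew in ps.
    return 12.5 + 0.5 * max_skew_ps

candidates = [10, 25, 40, 50, 75, 100]
range_25 = feasible_range(toy_skew_model, candidates, target=25.0)
range_5 = feasible_range(toy_skew_model, candidates, target=5.0)
print(range_25)  # (10, 25)
print(range_5)   # None
```

If the largest candidate in the sweep is still feasible, the true upper bound may lie beyond the sweep, which is one way an unbounded (X) entry could arise.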

Q3: How Wrong Can The Guidance Be?

• Methodology
  – Compare model and actual outcomes of tools
  – If model is wrong,

$$SUB = \frac{\text{CTS outcome difference}}{\text{Better tool outcome}} \times 100$$

Wrong guidance % and suboptimality % (Power)

      ToolA           ToolB
D     SVM    SUB      SVM    SUB
8     5.38   5.89     5.22   9.08
9     5.38   9.07     5.24   9.07
10    5.78   9.20     5.67   9.22
11    5.80   8.25     6.04   8.96
12    5.80   6.45     6.22   8.93
13    5.81   3.12     6.22   8.93

Suboptimality ≤ 10%
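The SUB formula can be written as a small helper. The sketch assumes that the "CTS outcome difference" is the absolute difference between the outcome of the tool the model chose and the outcome of the better tool; the function name and the power values are hypothetical.

```python
def suboptimality_pct(chosen_outcome, better_outcome):
    """SUB = (CTS outcome difference / better tool outcome) x 100."""
    if better_outcome == 0:
        raise ValueError("better tool outcome must be nonzero")
    return 100.0 * abs(chosen_outcome - better_outcome) / abs(better_outcome)

# Made-up clock power values: the model picked a tool that gives 10.5,
# while the better tool would have given 10.0.
sub = suboptimality_pct(chosen_outcome=10.5, better_outcome=10.0)
print(sub)  # 5.0
```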


Validation on “New” CTS Instance

• How well do our prediction methodologies generalize?
• Goals
  – Apply methodologies to a new CTS instance
  – Obtain skew target ≤ 30ps
• Determine parameter values from subspace results of Q2

Max Skew (ps)  Max Delay (ns)  Max Buffer Transition (ps)  CTS Tool  Post-CTS Skew (ps)  Number of CTS runs
15             1.1             325                         ToolA     29.8                4
35             1.6             350                         ToolB     27.7                5

• Generalizes with small overhead
• Few CTS runs to deliver the desired outcome


Conclusions

• Study high-D CTS prediction with appropriate modeling parameters
• Generate testcases with real-world CTS structures
• Propose HHSM to limit error to ≤ 13% even with multicollinearity
• Develop methodologies for practical use models
• Ongoing work
  – Learning techniques to cure high-D multicollinearity
  – Methodologies to characterize EDA tools
  – Apply methodologies to reduce time and cost for IC implementation

Acknowledgments

• Work supported by NSF, MARCO/DARPA, SRC and Qualcomm Inc.


Thank You!


Backup

Brief Background on Metamodeling

General form of estimation

$$\hat{y}(\vec{x}) = f(\beta, \vec{x}) + \varepsilon(\vec{x})$$

$$f(\beta, \vec{x}) = \beta_0 + \sum_{i=1}^{D} \beta_i f(x_i)$$

$$\vec{x} = \{x_1, x_2, \ldots, x_{D-1}, x_D\}$$

where,
  $\hat{y}(\vec{x})$: predicted response
  $f(\beta, \vec{x})$: deterministic response
  $\varepsilon(\vec{x})$: random noise function
  $\beta$: regression coefficients

Regression Function: MARS

$$f(x_i) = \prod_{j=1}^{I_i} \left[ b_{ji} \, (x_v - t_{ji}) \right]_+$$

where,
  $I_i$: # interactions in the ith basis function
  $b_{ji}$: ±1
  $x_v$: vth parameter
  $t_{ji}$: knot location

Knot = value of parameter where line segment changes slope
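A single MARS basis function of this form is a product of hinge terms, as in the minimal sketch below. The tuple layout `(b, v, t)` and the function names are assumptions chosen for illustration, and the knot values are made up.

```python
def hinge(u):
    """Positive part [u]_+ = max(u, 0)."""
    return max(u, 0.0)

def mars_basis(x, factors):
    """Evaluate one MARS basis function as a product of hinge terms.

    `x` is the parameter vector; `factors` is a list of (b, v, t) tuples with
    sign b in {+1, -1}, parameter index v, and knot location t.
    """
    value = 1.0
    for b, v, t in factors:
        value *= hinge(b * (x[v] - t))
    return value

# Made-up example: two interacting hinge terms on parameters 0 and 1.
factors = [(+1, 0, 2.0), (-1, 1, 5.0)]
active = mars_basis([3.0, 4.0], factors)    # (3 - 2) * (5 - 4) = 1.0
inactive = mars_basis([1.0, 4.0], factors)  # first hinge is zero, so 0.0
print(active, inactive)
```

Each hinge is zero on one side of its knot, which is why the fitted line segment changes slope there.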

Regression Function: RBF

$$f(x_i) = \sum_{j=1}^{N} a_j \, K(\mu_j, r_j, x_i)$$

where,
  $a_j$: coefficients of the kernel function
  $K(\cdot)$: kernel function
  $\mu_j$: centroid
  $r_j$: scaling factors
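The RBF sum can be evaluated as in the following sketch. The slide does not name a specific kernel, so the Gaussian kernel here is an assumption, as are the function names and the numeric values.

```python
import math

def gaussian_kernel(mu, r, x):
    """Gaussian kernel centered at mu with scaling factor r (one possible choice of K)."""
    return math.exp(-((x - mu) / r) ** 2)

def rbf(x, coeffs, centroids, scales, kernel=gaussian_kernel):
    """Evaluate f(x) = sum_j a_j * K(mu_j, r_j, x) for a scalar parameter x."""
    return sum(a * kernel(mu, r, x) for a, mu, r in zip(coeffs, centroids, scales))

# Made-up coefficients, centroids, and scaling factors.
value = rbf(1.0, coeffs=[2.0, -1.0], centroids=[1.0, 3.0], scales=[1.0, 2.0])
print(value)  # equals 2 - exp(-1)
```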

Regression Function: KG

$$f(\beta, \vec{x}) = \beta_0 + \sum_{i=1}^{D} \beta_i f(x_i)$$

$$\varepsilon(\vec{x}) = R(\theta, x_i, x_k) = \prod_{j=1}^{N} R_j(\theta, x_k - x_i)$$

where,
  $R(\cdot)$: correlation function (Gaussian, linear, spherical, cubic, …)
  $\theta$: correlation function parameter
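As a concrete instance of the product-form correlation, the sketch below uses a Gaussian correlation in each dimension. Using one theta value per dimension is an assumption (the slide writes a single theta), and the function name and sample points are hypothetical.

```python
import math

def gaussian_correlation(theta, x_i, x_k):
    """Product over dimensions of exp(-theta_j * (x_k[j] - x_i[j])**2)."""
    corr = 1.0
    for th, a, b in zip(theta, x_i, x_k):
        corr *= math.exp(-th * (b - a) ** 2)
    return corr

# Identical points are perfectly correlated; distant points are weakly correlated.
same = gaussian_correlation([1.0, 0.5], [0.2, 0.4], [0.2, 0.4])
far = gaussian_correlation([1.0, 0.5], [0.2, 0.4], [1.2, 2.4])
print(same, far)
```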

Hybrid Surrogate Modeling (HSM)

• Variant of Weighted Surrogate Modeling but uses least-squares regression to determine weights

$$\hat{y}(\vec{x}) = w_1 \, \hat{y}(\vec{x})_{MARS} + w_2 \, \hat{y}(\vec{x})_{RBF} + w_3 \, \hat{y}(\vec{x})_{KG}$$

where w1, w2, w3 are weights of predicted response of surrogate model for
  w1: MARS
  w2: RBF
  w3: KG