LVF Characterization in SiliconSmart. Ning Jin (GLOBALFOUNDRIES Inc), Suhas Subhaschandra (Synopsys Inc). March 31, 2016, SNUG Silicon Valley

  • SNUG 2016 1

    LVF Characterization in SiliconSmart

    Ning Jin, GLOBALFOUNDRIES Inc
    Suhas Subhaschandra, Synopsys Inc

    March 31, 2016, SNUG Silicon Valley

  • SNUG 2016 2

    Agenda

    Introduction to GLOBALFOUNDRIES 14nm Technology

    LVF Characterization in SiliconSmart

    LVF Correlation Results

    Conclusion

  • SNUG 2016 3

    Agenda

    Introduction to GLOBALFOUNDRIES 14nm Technology

  • SNUG 2016 4

    GLOBALFOUNDRIES 14nm FinFET Offers Breakthrough Performance & Power

    - 3D control over the channel
    - Gate length (Lg) shrink enables performance scaling
    - Double-gate reduces off-current
    - Intrinsically operates at lower supply voltage (FD behavior)
    - Reduced off-state leakage
    - Faster switching speed, high drive current

    [Chart: 28nm vs. 14LPP. Device performance +50%, total power -65%]

  • SNUG 2016 5

    Effect of Random Device Variability

    Delay variability (including CLK-to-Q delay)
    - The delay varies for each cell, edge (rise vs. fall), and type (early vs. late)
    - The delay varies with the active arc, input transition, and output load

    Constraint variability (setup and hold)
    - The constraints vary with the slew on both Data_pin and CLK_pin

    Output transition
    - The output transition can vary with input transition and output load

  • SNUG 2016 6

    Evolution of Variation Modeling

    Different design margining methodologies have evolved over the course of many years.

    AOCV: side file with derates; simulation at 1 input transition/output load

        object_type: lib_cell
        delay_type: cell
        rf_type: rise
        derate_type: early
        object_spec: */INV
        depth: 1 2 3 4 5
        distance: 0
        table: 0.89 0.91 0.93 0.95 0.97

    POCV: side file with coefficients; simulation at 1 input transition/output load

        ocvm_type: pocvm
        object_type: lib_cell
        delay_type: cell
        derate_type: early
        rf_type: rise
        object_spec: */INV
        coefficient: 0.0366

    LVF: sigmas modeled in the Liberty file as tables; simulation for every input transition/output load for every arc

        ocv_sigma_cell_rise(lut_4x4) {
          sigma_type : "early" ;
          index_1("0.000617, 0.00391, 0.0154, 0.0375");
          index_2("0.0001, 0.00154, 0.00663, 0.0163");
          values("0.00467, 0.0085, 0.0216, 0.0466",\
                 "0.00675, 0.0107, 0.0239, 0.0489",\
                 "0.0114, 0.0179, 0.0321, 0.0570",\
                 "0.0159, 0.0259, 0.0469, 0.0732");
        }
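To make the contrast between the side-file formats concrete, the sketch below applies the AOCV and POCV numbers from this slide to a hypothetical 100 ps inverter delay. It assumes the usual interpretation, which the slide itself does not spell out: an AOCV entry is a multiplicative derate selected by path depth, and a POCV coefficient is the ratio of delay sigma to nominal delay. The function names, the clamping of out-of-range depths, and the 100 ps value are illustrative assumptions.

```python
# AOCV early derates for */INV from the slide, indexed by path depth (distance 0).
AOCV_DEPTHS = [1, 2, 3, 4, 5]
AOCV_EARLY_DERATES = [0.89, 0.91, 0.93, 0.95, 0.97]

# POCV early coefficient for */INV from the slide.
POCV_EARLY_COEFF = 0.0366


def aocv_early_delay(nominal_ps, depth):
    """Derated early delay. Depths outside the table are clamped (assumption)."""
    clamped = min(max(depth, AOCV_DEPTHS[0]), AOCV_DEPTHS[-1])
    return nominal_ps * AOCV_EARLY_DERATES[clamped - AOCV_DEPTHS[0]]


def pocv_early_sigma(nominal_ps):
    """Delay sigma, assumed to be coefficient x nominal delay."""
    return nominal_ps * POCV_EARLY_COEFF


nominal = 100.0  # hypothetical inverter delay in ps
print(aocv_early_delay(nominal, depth=3))
print(pocv_early_sigma(nominal))
```

Both side files carry a single number per cell (or per depth), which is why one simulation point suffices; the LVF table instead stores a sigma for every slew/load pair.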

  • SNUG 2016 7

    LVF Characterization in SiliconSmart

  • SNUG 2016 8

    LVF Characterization in SiliconSmart

    Two methods available:
    - Monte Carlo (MC): golden reference method used for accuracy analysis purposes
    - Sensitivity-Based Analysis (SBA): recommended method for use in a production environment

  • SNUG 2016 9

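    Monte Carlo Method: Golden Reference

    - Generate random samples following the input distributions
    - The nominal point corresponds to no variation
    - Estimate sigma from the sampled data
    - Separate early/late sigmas
      - Model asymmetrical distributions
      - Each side is modeled as a Gaussian

    [Figure: random samples in the P1/P2 parameter space around the nominal point, and the resulting delay distribution with its early and late sides]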
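As a rough illustration of "separate early/late sigmas", the sketch below estimates one sigma per side of the nominal from a set of Monte Carlo delay samples. The estimator (one-sided RMS deviation from the nominal, which equals sigma for a half-Gaussian) is an assumption chosen for simplicity; the slides do not describe the estimator SiliconSmart uses. The lognormal sample generator and all names are hypothetical.

```python
import math
import random


def early_late_sigma(samples, nominal):
    """One-sided RMS deviation from the nominal on each side (assumed estimator)."""
    early = [nominal - d for d in samples if d < nominal]
    late = [d - nominal for d in samples if d > nominal]

    def rms(deviations):
        if not deviations:
            return 0.0
        return math.sqrt(sum(x * x for x in deviations) / len(deviations))

    return rms(early), rms(late)


# Hypothetical skewed delay samples (ps): nominal * exp(noise) has a longer late tail.
rng = random.Random(0)
nominal_delay = 50.0
delays = [nominal_delay * math.exp(rng.gauss(0.0, 0.1)) for _ in range(20000)]

sigma_early, sigma_late = early_late_sigma(delays, nominal_delay)
print(f"early sigma = {sigma_early:.2f} ps, late sigma = {sigma_late:.2f} ps")
```

A single symmetric sigma would understate the late side and overstate the early side of a skewed distribution like this one, which is the motivation for modeling the two sides separately.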

  • SNUG 2016 10

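    Sensitivity-Based Analysis: Production Method

    - Sample the nominal
    - Sample each parameter at two points (e.g. -3σ, +3σ)
    - Compute the sensitivity to each parameter (sp1, sp2, ...)
    - Cell delay variation is computed by addition in quadrature (also called RSS):

      $$\sigma_{delay} = \sqrt{(s_{P_1}\sigma_{P_1})^2 + (s_{P_2}\sigma_{P_2})^2 + \cdots + (s_{P_N}\sigma_{P_N})^2}$$

    - If two points are defined, SiliconSmart will model separate early/late sigmas

    [Figure: P1/P2 parameter space with the nominal and the two sample points per parameter; delay vs. P1 and delay vs. P2 curves with their early and late sides]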
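The addition in quadrature described on this slide can be sketched as follows. The sketch assumes each parameter is normalized to unit sigma, so a delay measured at ±3σ divided by 3 gives the delay shift per sigma of that parameter. How shifts are assigned to the early or late side (by sign here) is an assumption; the slide states only that two sample points allow separate early/late sigmas. All names and delay values are hypothetical.

```python
import math


def sba_sigmas(nominal, delays_minus, delays_plus, n_sigma=3.0):
    """Return (early_sigma, late_sigma) from per-parameter delays at -n_sigma/+n_sigma.

    Each sampled delay is converted to a per-sigma delay shift. Shifts that speed
    the cell up accumulate into the early sigma, shifts that slow it down into
    the late sigma, each by root-sum-square (RSS).
    """
    early_sq = 0.0
    late_sq = 0.0
    for d_minus, d_plus in zip(delays_minus, delays_plus):
        for d in (d_minus, d_plus):
            shift = (d - nominal) / n_sigma  # delay change per 1 sigma of this parameter
            if shift < 0:
                early_sq += shift * shift
            else:
                late_sq += shift * shift
    return math.sqrt(early_sq), math.sqrt(late_sq)


# Hypothetical delays (ps) for two parameters, each sampled at -3 and +3 sigma.
early, late = sba_sigmas(
    nominal=100.0,
    delays_minus=[91.0, 88.0],   # P1 and P2 at -3 sigma
    delays_plus=[112.0, 115.0],  # P1 and P2 at +3 sigma
)
print(f"early sigma = {early:.3f} ps, late sigma = {late:.3f} ps")
```

With only the nominal and one point per parameter, the same routine would yield a single usable side, which matches the slide's note that two points are needed for separate early/late sigmas.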

  • SNUG 2016 11

    Basics of SBA Characterization

    - Uses independent statistical parameters in the transistor models
      - GLOBALFOUNDRIES technology models variation with 6 independent parameters
    - Simulates each parameter independently (P1, P2, ..., P6)
      - At nominal
      - At 2 points (e.g. -3σ, +3σ) for every input transition and output load, for every timing arc
    - Computes the sensitivity for each parameter (sp1, sp2, ..., sp6)

    Example: D flip-flop
    - 24 transistors: 24 x 6 x 2 = 288 simulations per table point
    - 7 x 7 delay table (49 points)
    - SBA method total simulations: 288 x 49 + 49 nominal = 14,161
    - MC method total simulations: 1000 x 49 = 49,000
    - ~3.46X simulation reduction for SBA vs. MC

    [Figure: P1/P2 parameter space showing the nominal point and the -3σ/+3σ sample points]
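The simulation counts in the D flip-flop example can be reproduced with a few lines of arithmetic. The function and argument names below are illustrative; the numbers are the ones given on the slide.

```python
def sba_simulations(transistors, params_per_transistor, points_per_param, table_points):
    """Sensitivity runs per table point, plus one nominal run per table point."""
    per_point = transistors * params_per_transistor * points_per_param
    return per_point * table_points + table_points


def mc_simulations(samples_per_point, table_points):
    """Monte Carlo runs: every table point is sampled samples_per_point times."""
    return samples_per_point * table_points


table_points = 7 * 7  # 7 x 7 delay table
sba = sba_simulations(24, 6, 2, table_points)
mc = mc_simulations(1000, table_points)
print(sba, mc, round(mc / sba, 2))  # prints: 14161 49000 3.46
```

The SBA count grows with transistor count, while the MC count does not, so the advantage shrinks for larger cells unless the optimizations on the following slides are applied.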

  • SNUG 2016 12

    LVF Characterization in SiliconSmart

    Offers multiple optimization methods that improve throughput and the accuracy/runtime tradeoff:
    - Screening
    - Netlist pruning
    - Binning
    - Reduction factor

  • SNUG 2016 13

    Screening: SBA Optimization

    - Screening simulation identifies parameters that do not affect cell delay
      - These parameters are removed from further simulations
    - Screening is done for a small subset of points in the table
      - Other points reuse the results of the screening point they most closely match

    [Figure: 7 x 7 table (points 1 to 49; axes: input transition index, output load index) with screening simulation done for selected points]

    [Figure: normalized delay per sensitivity parameter, separating the parameters affecting cell delay from the screened parameters that do not impact cell delay significantly]

  • SNUG 2016 14

    Netlist Pruning: Optimization

    - Removes inactive nodes from the sensitized path
      - For arc B->Z, transistors not affecting netA in the AND gate will be removed
      - netA is set to an appropriate voltage
    - Pruning results in faster simulations by using:
      - Reduced netlist
      - Reduced number of sensitivity parameters
    - Observed no effect on accuracy
    - Netlist pruning and screening optimizations reduce 14,161 simulations to just 752 for the D flip-flop, a roughly 20X reduction in simulations

    [Figure: schematic of cell AO21 with output Z]

  • SNUG 2016 15

    Binning: Optimization

    - The reference/baseline library is analyzed to determine if timing groups for the same arc can be grouped (binned)
    - Tables are binned if all individual points between the tables match within a given threshold
    - Sigmas are characterized for a single table and modeled for the other binned tables

    Example: AOI22 delay (A->Z) for Cond1 vs. AOI22 delay (A->Z) for Cond2
    - The difference in delays of Cond1 and Cond2 is compared against a 1%, 1ps threshold
    - Cond1 and Cond2 are not binned because not all points meet the threshold
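The binning decision above can be sketched as a point-by-point table comparison. The sketch assumes that a point matches when it is within either the relative or the absolute tolerance (a common convention for "1%, 1ps" style thresholds); the slide does not say how SiliconSmart combines the two. The function names and delay tables are hypothetical.

```python
def point_matches(a, b, rel_tol=0.01, abs_tol_ps=1.0):
    """A point matches if it is within the absolute OR the relative tolerance (assumption)."""
    diff = abs(a - b)
    return diff <= abs_tol_ps or diff <= rel_tol * abs(a)


def can_bin(table_a, table_b, rel_tol=0.01, abs_tol_ps=1.0):
    """Two delay tables can be binned only if every point matches."""
    return all(
        point_matches(a, b, rel_tol, abs_tol_ps)
        for row_a, row_b in zip(table_a, table_b)
        for a, b in zip(row_a, row_b)
    )


# Hypothetical 2 x 2 delay tables (ps) for one arc under different side-input conditions.
cond1 = [[20.0, 45.0], [60.0, 150.0]]
cond2 = [[20.5, 45.8], [60.4, 153.0]]  # last point differs by 3 ps (2%): fails
cond3 = [[20.5, 45.8], [60.4, 151.2]]  # last point differs by 1.2 ps (0.8%): passes

print(can_bin(cond1, cond2))
print(can_bin(cond1, cond3))
```

A single out-of-tolerance point is enough to keep two conditions in separate bins, as in the AOI22 example on the slide.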

  • SNUG 2016 16

    Reduction Factor: Optimization

    - SiliconSmart characterizes sigmas for all points in the table by default
    - This optimization characterizes a reduced number of sigmas and populates the rest using interpolation
    - Provides the user with direct control over runtime

    [Figure: two 7 x 7 tables (points 0 to 48) distinguishing characterized sigmas from interpolated sigmas. Default = 1.0: characterize all points in the table. reduction_factor=0.6: characterizes 60% of points]
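As a rough sketch of "populate the rest using interpolation", the example below skips one point in the first row of the LVF table from the "Evolution of Variation Modeling" slide and fills it by linear interpolation along index_2. Which points SiliconSmart skips and how it interpolates are not described in the slides, so the one-dimensional linear scheme and the function name are assumptions.

```python
def interpolate_sigma(x, x_lo, sigma_lo, x_hi, sigma_hi):
    """Linear interpolation of sigma between two characterized index values."""
    fraction = (x - x_lo) / (x_hi - x_lo)
    return sigma_lo + fraction * (sigma_hi - sigma_lo)


# First row of the early ocv_sigma_cell_rise table from the earlier slide.
index_2 = [0.0001, 0.00154, 0.00663, 0.0163]
sigmas = [0.00467, 0.0085, 0.0216, 0.0466]

# Pretend the third point was not characterized and estimate it from its neighbors.
estimate = interpolate_sigma(index_2[2], index_2[1], sigmas[1], index_2[3], sigmas[3])
error = abs(estimate - sigmas[2]) / sigmas[2]
print(f"interpolated = {estimate:.5f}, characterized = {sigmas[2]:.5f}, error = {error:.2%}")
```

For this row the interpolated value lands well under 1% from the characterized sigma, which suggests why skipping points can cost little accuracy when sigma varies smoothly across the table.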

  • SNUG 2016 17

    Optimizations Order

    The optimizations are applied in this order before characterization:
    - Binning: analyze the baseline lib to reduce the arcs to be characterized for LVF. Effective for cells like AOI with a large number of arcs with similar data.
    - Pruning: reduce the number of transistors in the netlist. Effective for cells with large netlists where not all devices in a path switch.
    - Screening: reduce the number of simulations. Simulate the sensitivity parameters with the largest impact.
    - Reduction factor: reduce the number of points to be simulated. User control over the tradeoff between runtime and accuracy.

  • SNUG 2016 18

    LVF Characterization Results

  • SNUG 2016 19

    LVF Qualification Setup/Criteria

    - Accuracy: characterize LVF with Monte Carlo and with SBA, then compare the libs to verify the accuracy of SBA against the Monte Carlo reference
    - Runtime: compare LVF characterization runtime against the Liberty file with NLDM + CCSTN data
    - Correlation: implement an ARM Cortex-A9 processor, run timing analysis in PrimeTime, and compare the PT results against SPICE simulation to verify accuracy

  • SNUG 2016 20

    LVF Library Qualification: Criteria for Accuracy Evaluation

    - Experiment to compare the accuracy of Monte Carlo results against Sensitivity-Based Analysis
    - Based on a subset of cells in the GLOBALFOUNDRIES 14LPP library
      - Library of 139 cells (66% combinational, 34% sequential)
      - 1 cell per family used in the analysis
    - Library validated using compare_library in SiliconSmart

    [Flow: LVF Liberty file using Monte Carlo and LVF Liberty file using SBA, both fed to compare_library]

  • SNUG 2016 21

    LVF Library Qualification: Accuracy Experiment Results

    Comparison using Monte Carlo vs. Sensitivity-Based Analysis:
    - Delay sigma tolerance: 5%, 2ps. Pass rate: 99.9%
    - Slew sigma tolerance: 5%, 2ps. Pass rate: 99.9%
    - Constraint sigma tolerance: 5%, 5ps. Pass rate: 96% (constraint resolution: 5ps)

    [Figures: distributions of the delay, slew, and constraint sigma deltas, each plotted as frequency vs. sigma difference (ps)]

  • SNUG 2016 22

    LVF Library Qualification: Criteria for Runtime Evaluation

    - Based on all cells in the GLOBALFOUNDRIES 14LPP library
      - Library consisting of 809 cells (80% combinational, 20% sequential)
    - Baseline: NLDM/NLPM/CCST/CCS-Noise library generation
    - Add-on flow for LVF using Sensitivity-Based Analysis
    - 50 CPUs
    - Runtime compared between the baseline library and the LVF add-on characterization

  • SNUG 2016 23

    LVF Library Qualification: Runtime Experiment Results

    - Optimizations allowed a 10x reduction in runtime with negligible impact on accuracy
    - LVF characterization with optimization is comparable to baseline characterization

    Run time (hrs):
    - CCSTN + NLDM/NLPM: 12.6
    - LVF with SBA: 10.3

    [Chart: LVF runtime relative to the baseline (1X runtime), with no optimization and with binning, pruning, screening, and reduction factor]

  • SNUG 2016 24

    LVF Correlation Results

  • SNUG 2016 25

    PrimeTime Correlation Experiments: Setup

    - Implementation of ARM Cortex-A9 processor using the baseline library
    - Static timing analysis in PrimeTime using the baseline library with LVF add-on
    - SPICE Monte Carlo simulations (golden reference), (+3σ) calculated for setup
    - Compare the STA results against the SPICE results

  • SNUG 2016 26

    PrimeTime Correlation Experiments: SPICE Accuracy, Setup Analysis

    Setup analysis for 20 paths shows an average difference of 2.1% between STA using the LVF library and SPICE Monte Carlo simulations.

    [Chart: Setup Monte Carlo-based STA to SPICE Correlation. % difference (0.0% to -10.0%) vs. path ID (0 to 20); series: SPICE (+3σ), STA (+3σ), SPICE mean]

  • SNUG 2016 27

    Conclusion

  • SNUG 2016 28

    Conclusion

    - LVF provides a fine-grained representation of timing variation
      - It is the recommended variation methodology if reduced design margining is needed
      - Its exhaustive nature presents a challenge to characterization runtime
    - SiliconSmart LVF characterization provides an optimal balance between accuracy and runtime
      - SiliconSmart has multiple features that allow LVF characterization to be done in around 1X of baseline library characterization time
      - User control over the tradeoff between accuracy and runtime
    - A SiliconSmart-characterized LVF library yields close correlation (< 5%) between static timing analysis and SPICE Monte Carlo simulations

  • SNUG 2016 29

    Acknowledgement

    GLOBALFOUNDRIES Design Methodology Team

    Tamer Ragheb, Ramya Srinivasan

  • SNUG 2016 30

    Thank You

  • SNUG 2016 31

    Evolution of Variation Characterization: Which method to use?

    - Global flat derate: On-Chip Variation (OCV). 90nm and above
    - Table-based granular derates: Advanced OCV (AOCV). 65nm and below
    - Parameterized approach: Parametric OCV (POCV). 14nm and below
    - POCV with slew/load dependency: Liberty Variation Format (LVF). 14nm and below

    Pessimism is reduced moving from OCV to AOCV to POCV/LVF.

    SiliconSmart allows the characterization of AOCV, POCV, and slew/load-based POCV (LVF), and allows the generation of AOCV/POCV from LVF.

  • SNUG 2016 32

    Experiment Setup

    - We used an ARM Cortex-A9 processor implemented at a low-power 14nm technology node
    - Used SiliconSmart for characterization and PrimeTime for timing
      - SiliconSmart version 2015.06-SP2
      - PrimeTime version 2016.06-BETA
    - Library characterization corner: SS, 0.72V, 125C

