Using at-speed testing with OCC on a complex SoC A user experience BRUDER Bertrand Atmel June 11, 2013 Grenoble


  • SNUG 2013 1

    Using at-speed testing with OCC on a complex SoC

    A user experience

    BRUDER Bertrand

    Atmel

    June 11, 2013

    Grenoble

  • SNUG 2013 2

    Agenda

    1. Principle of at-speed Scan and On-Chip Clocking

    2. Scan at-speed Complex SoC Design Integration

    3. Synthesis Flow and STIL generation

    4. STA constraints

    5. Multi-Pass ATPG flow

    6. Results

    7. Conclusions and Future Work

  • SNUG 2013 3

    1/7 Principle of at-speed Scan Using On-Chip-Clocking

  • SNUG 2013 4

    OCC At-speed test : main concepts

    At-speed test detects delay-related defects

    The stuck-at fault model does not fit at-speed requirements : the transition fault model is used instead

    The Synopsys OCC is used to manage at-speed clock switching

    [Figure: OCC block diagram and waveforms. Labels: ATE slow clock, test_se, OCC output with launch and capture pulses, reference clock, PLL, OCC, bypass.]
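The launch/capture principle can be illustrated with a minimal sketch. It assumes a single path, ignores setup time and clock skew, and uses hypothetical delay values and a hypothetical 10 MHz tester clock; none of these numbers come from the slides.

```python
def capture_sees_transition(path_delay_ns, capture_period_ns):
    """True if the launched transition arrives before the capture edge."""
    return path_delay_ns <= capture_period_ns


def detects_delay_defect(nominal_ns, extra_ns, capture_period_ns):
    """A delay defect is detected when the good path passes and the defective one fails."""
    good_ok = capture_sees_transition(nominal_ns, capture_period_ns)
    bad_ok = capture_sees_transition(nominal_ns + extra_ns, capture_period_ns)
    return good_ok and not bad_ok


# Hypothetical numbers: a 2.0 ns path carrying a 1.0 ns delay defect.
AT_SPEED_NS = 2.5    # launch-to-capture period at 400 MHz
ATE_SLOW_NS = 100.0  # assumed 10 MHz tester clock

print(detects_delay_defect(2.0, 1.0, AT_SPEED_NS))  # True: caught at-speed
print(detects_delay_defect(2.0, 1.0, ATE_SLOW_NS))  # False: escapes a slow capture
```

The same defect escapes when launch and capture are driven by the slow ATE clock, which is why the OCC switches to the PLL clock for these two pulses.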

  • SNUG 2013 10

    2/7 Scan at-speed Complex SoC Design Integration

  • SNUG 2013 11

    At speed testing on Complex SoC

    Scan at-speed test drastically increases the number of patterns

    Compression is needed

    Complex SoC designs have multiple clocks at different frequencies

    In our case some of the clocks have different frequencies, but are synchronous.

    1. How to test all clock domains at their maximum speed ?

    2. How to test inter clock domain and multi-cycle paths at speed ?

    What has to be taken into account

    [Figure: launch (L) and capture (C) pulses for an inter clock domain path and for a multi-cycle path. Labels: PLL, div.]

  • SNUG 2013 13

    SAMA5D3 clock system analysis

    The clock system has 4 synchronous subsystems :

    ARM processor, which runs at 400 MHz

    LCD controller, which runs at 266 MHz

    System clock, which runs at 133 MHz

    1/3 of the peripherals, which run at 66 MHz

    [Figure: SAMA5D3 clock tree. PLLA (800 MHz) feeds the APMC, whose dividers (labeled ÷1.5, ÷3, ÷2, ÷2) produce armclock (400 MHz), lcdclock (266 MHz), hclocks (133 MHz) and pclocks (66 MHz).]

  • SNUG 2013 14

    Scan at-speed OCC insertion

    Solution 1 : One Synopsys OCC per clock domain

  • SNUG 2013 15

    OCC insertion : Solution 1

    4 OCCs are inserted

    All dividers have to keep their functionality in scan mode

    Advantages :

    Easy automated solution

    [Figure: One Synopsys OCC per clock domain. PLLA (800 MHz) feeds the APMC dividers (labeled ÷1.5, ÷3, ÷2, ÷2); four OCCs are shown for armclock (400 MHz), lcdclock (266 MHz), hclocks (133 MHz) and pclocks (66 MHz).]

    Drawbacks :

    Dividers are not tested at-speed

    Synopsys OCC limitation : all OCCs are considered asynchronous

    Data paths from one sub-domain to another cannot be tested

    Loss of 15% of at-speed coverage !

  • SNUG 2013 16

    Scan at-speed OCC insertion

    Solution 2 : Use of one custom synchronous OCC

  • SNUG 2013 17

    OCC insertion : Solution 2

    4 synchronous OCCs are inserted

    All dividers have to keep their functionality in scan mode

    Advantages :

    Inter clock domain and multi-cycle paths can be tested at-speed : no at-speed coverage loss

    ATPG is done in one pass

    [Figure: Use of one custom synchronous OCC per domain. PLLA (800 MHz) feeds the APMC dividers (labeled ÷1.5, ÷3, ÷2, ÷2); four OCCs are shown for armclock (400 MHz), lcdclock (266 MHz), hclocks (133 MHz) and pclocks (66 MHz).]

    Drawbacks :

    Dividers are not tested at-speed

    Development of a synchronous OCC is time-consuming and carries a risk of bugs

    Good solution, but due to our planning constraints and risk assessment study, this solution has not been retained

  • SNUG 2013 18

    Scan at-speed OCC insertion

    Solution 3 : One Synopsys OCC at the output of the PLL

  • SNUG 2013 19

    OCC insertion : Solution 3

    1 OCC is inserted

    All dividers are bypassed in scan mode

    4 test modes are added to select the frequency corresponding to the target domain

    The test mode controller drives the PLL frequency and the scan modes

    Advantages :

    Dividers are tested at-speed

    Inter clock domain and multi-cycle paths can be tested at-speed : no at-speed coverage loss

    Drawbacks :

    Not a fully automated solution

    Increased ATPG complexity with the addition of 4 modes => Multi-pass ATPG

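    [Figure: One Synopsys OCC at the output of PLLA. A single OCC sits after PLLA (800 MHz); the functional dividers (labeled ÷1.5, ÷3, ÷2, ÷2) are shown as ÷1 in scan mode for armclock (400 MHz), lcdclock (266 MHz), hclocks (133 MHz) and pclocks (66 MHz).]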

    Good trade-off between solutions (1) and (2) : solution retained

  • SNUG 2013 20

    OCC insertion : Solution 3

    Scan multiplexors are inserted on clock paths to protect clock branches against unsupported clock rates

    Frequency partitioning is done in a second step during ATPG, using multi-pass pattern generation

    4 scan modes are created :

    ARM scan mode

    ARM + LCD scan mode

    ARM + LCD + hclocks scan mode

    ARM + LCD + hclocks + pclocks scan mode

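    [Figure: Frequency partitioning. The OCC after PLLA (800 MHz) and the ATE clock both feed the APMC, where scan multiplexors select the source of armclock, lcdclock, hclocks and pclocks; dividers are shown as ÷1.]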

  • SNUG 2013 21

    OCC insertion : Solution 3

    PLLA is programmed to run at 400 MHz

    Muxes are controlled in such a way that only ARM gets the OCC clock

    The rest of the system is clocked by the ATE tester clock

    Mode 0 : ARM scan mode

    [Figure: Mode 0 clocking. PLLA at 400 MHz; armclock is driven by the OCC (400 MHz); lcdclock, hclocks and pclocks are driven by the ATE clock.]

  • SNUG 2013 22

    OCC insertion : Solution 3

    PLLA is programmed to run at 266 MHz

    Muxes are controlled in such a way that only ARM and LCD get the OCC clock

    The rest of the system is clocked by the ATE tester clock

    Inter clock domain paths between ARM and LCD are tested at 266 MHz

    Mode 1 : ARM + LCD scan mode

    [Figure: Mode 1 clocking. PLLA at 266 MHz; armclock and lcdclock are driven by the OCC (266 MHz); hclocks and pclocks are driven by the ATE clock.]

  • SNUG 2013 23

    OCC insertion : Solution 3

    PLLA is programmed to run at 133 MHz

    Muxes are controlled in such a way that only ARM, LCD and hclocks get the OCC clock

    pclocks is clocked by the ATE tester clock

    Inter clock domain paths between ARM, LCD and hclocks are tested at 133 MHz

    Mode 2 : ARM + LCD + hclocks scan mode

    [Figure: Mode 2 clocking. PLLA at 133 MHz; armclock, lcdclock and hclocks are driven by the OCC (133 MHz); pclocks is driven by the ATE clock.]

  • SNUG 2013 24

    OCC insertion : Solution 3

    PLLA is programmed to run at 66 MHz

    Muxes are controlled in such a way that the whole system gets the OCC clock

    Inter clock domain paths between ARM, LCD, hclocks and pclocks are tested at 66 MHz

    Mode 3 : ARM + LCD + hclocks + pclocks scan mode

    [Figure: Mode 3 clocking. PLLA at 66 MHz; armclock, lcdclock, hclocks and pclocks are all driven by the OCC (66 MHz).]
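The four scan modes can be summarized as a small lookup, sketched here under the assumption that mode i simply puts the i+1 fastest domains on the OCC clock and leaves the others on the ATE clock. The function names are hypothetical and only illustrate the partitioning described on the slides.

```python
# Domains ordered from fastest to slowest, and the PLLA frequency (MHz)
# programmed in each scan mode, as listed on the mode slides.
DOMAINS = ("armclock", "lcdclock", "hclocks", "pclocks")
PLLA_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}


def occ_domains(mode):
    """Domains that receive the OCC clock in the given mode."""
    return DOMAINS[: mode + 1]


def clock_source(mode, domain):
    """Clock seen by a domain in the given mode: OCC at the mode frequency, or ATE."""
    if domain in occ_domains(mode):
        return f"OCC ({PLLA_MHZ[mode]} MHz)"
    return "ATE"


def first_mode_covering(src, dst):
    """Earliest mode in which both ends of an inter-domain path are on the OCC clock."""
    return max(DOMAINS.index(src), DOMAINS.index(dst))


for mode in PLLA_MHZ:
    print(mode, {d: clock_source(mode, d) for d in DOMAINS})
```

For example, `first_mode_covering("armclock", "hclocks")` returns 2, which matches the slide stating that paths between ARM, LCD and hclocks are tested at 133 MHz.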

  • SNUG 2013 25

    3/7 Synthesis flow and STIL generation

  • SNUG 2013 26

    Synthesis flow

    To prepare the synthesis, the test mode controller must provide control of :

    OCC devices

    Scan mode selection in multi-pass flow

    PLL frequency

    OCC insertion is done with set_dft_clock_controller, using :

    chain_count : sets the number of FFs inside the clock chain

    cycles_per_clock : maximum number of clock pulses during capture
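The role of cycles_per_clock can be pictured with a toy model. This is only a sketch: it assumes that each clock-chain bit enables one at-speed pulse slot during capture (1 = pulse, 0 = gated), and the function name is hypothetical rather than part of any Synopsys tool.

```python
def capture_pulses(chain_bits, cycles_per_clock):
    """Return the capture slots (0-based) in which the OCC emits an at-speed pulse.

    Assumption: one clock-chain bit per slot, 1 = pulse, 0 = gated.
    """
    if len(chain_bits) != cycles_per_clock:
        raise ValueError("expected one chain bit per capture slot")
    return [slot for slot, bit in enumerate(chain_bits) if bit]


# Launch in slot 0 and capture in slot 1, as for a transition fault pattern.
print(capture_pulses([1, 1, 0, 0], cycles_per_clock=4))  # [0, 1]
```

A larger cycles_per_clock leaves room for more pulses per pattern, at the cost of more clock-chain bits to load during shift.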

  • SNUG 2013 28

    STIL file generation

    To reduce the complexity of DFT insertion, the individual frequency modes are not described during OCC insertion.

    Therefore, post-processing of the STIL files is needed to derive as many STIL files as there are scan frequency modes.

    Only the scan mode control signals are impacted

    If scan mode selection results from a sequential initialization of the test mode controller, post-processing consists in changing the test_setup sequence in the MacroDefs structures

    SAMA5D3 example :

    Top_ScanCompression_mode0.stil : PA[5:4] forced to 00

    Top_ScanCompression_mode1.stil : PA[5:4] forced to 01

    Top_ScanCompression_mode2.stil : PA[5:4] forced to 10

    Top_ScanCompression_mode3.stil : PA[5:4] forced to 11

    Specificity of our multi-pass ATPG flow
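The naming scheme of the SAMA5D3 example can be generated mechanically. The sketch below only computes the file name and the forced pin value for each mode; it does not edit STIL content, and the helper name is hypothetical.

```python
def stil_variants(base="Top_ScanCompression", modes=4, pins="PA[5:4]"):
    """Map each scan frequency mode to (STIL file name, pins, forced binary value)."""
    width = max(1, (modes - 1).bit_length())  # 2 bits for 4 modes
    return {
        mode: (f"{base}_mode{mode}.stil", pins, format(mode, f"0{width}b"))
        for mode in range(modes)
    }


for name, pins, value in stil_variants().values():
    print(f"{name}: {pins} forced to {value}")
```

The actual post-processing step would then rewrite the mode selection pins in each derived file accordingly.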

  • SNUG 2013 29

    4/7 STA constraints

  • SNUG 2013 31

    STA scan constraints

    Two scenarios : one shift and one capture

    In multi-pass flow, all modes are overlaid

    All clocks of each mode are created in the same STA scenario

    Exclusive clock groups are defined to remove inter-mode timing paths

    [Figure: overlaid scan clocks in the STA scenario. Around PLLA (800 MHz), the OCC and the APMC (dividers shown as ÷1) with mode selection, the following clocks are created for mode0, mode1 and mode2: clkplla_$freq_mode<i>, OCC_$freq_mode<i>, LCD_$freq_mode<i>, hclock_$freq_mode<i>, pclock_$freq_mode<i>, plus OCC_${OCC}_ATEClock. Outputs: armclock, lcdclock, hclocks, pclocks.]
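The overlay of all modes in one scenario amounts to creating one set of clocks per mode and declaring the sets mutually exclusive. The sketch below builds such a declaration as a string. Several points are assumptions: `$freq` is taken to be the mode frequency in MHz, the prefix list is read off the figure labels, and the `-logically_exclusive` flavor of `set_clock_groups` is a guess, since the slides only say "exclusive".

```python
MODE_FREQ_MHZ = {0: 400, 1: 266, 2: 133, 3: 66}
PREFIXES = ("clkplla", "OCC", "LCD", "hclock", "pclock")  # prefixes seen in the figure


def mode_clocks(mode):
    """Clock names for one mode, following the <prefix>_<freq>_mode<i> pattern."""
    freq = MODE_FREQ_MHZ[mode]
    return [f"{prefix}_{freq}_mode{mode}" for prefix in PREFIXES]


def exclusive_groups_command(modes):
    """Build one set_clock_groups command with one group per mode."""
    groups = " ".join("-group {" + " ".join(mode_clocks(m)) + "}" for m in modes)
    return "set_clock_groups -logically_exclusive " + groups


print(exclusive_groups_command([0, 1, 2]))
```

With one group per mode, no timing path is analyzed between a clock of one mode and a clock of another, which is the stated goal of removing inter-mode paths.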

  • SNUG 2013 36

    SDC generation for ATPG

    To avoid timing violations in scan mode, multi-cycle and false paths have to be given to TetraMAX

    The recommended Synopsys flow is to generate an SDC file during STA, which is then read by TetraMAX (read_sdc command)

    The pt2tmax.tcl script generates the SDC from timing violations

    write_exception_from_violation

    Multi-pass ATPG flow : one SDC per mode is generated

    False path and multicycle path management

  • SNUG 2013 41

    5/7 Multi-Pass ATPG flow

  • SNUG 2013 43

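    ATPG at-speed Multi-pass flow

    [Flow chart, partially recovered. Loop over the modes ("For 1" in the source, remainder lost). Inputs: mode$i.stil and the mode$i-1 dictionary. Steps: "add_faults launch \ capture $occ_clock" and "read_faults \ delete mode$i-1".]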
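The bookkeeping behind a multi-pass flow can be sketched with sets. This toy model assumes that faults already detected in an earlier mode are removed from the target list of later modes; the `detect` callable stands in for the ATPG run, and all names and data are hypothetical.

```python
def multi_pass(fault_universe, testable_in_mode, detect):
    """Run the modes in order; faults detected earlier are not targeted again.

    fault_universe: set of fault names
    testable_in_mode: list of sets, faults reachable at-speed in each mode
    detect: callable(mode, targets) -> set of detected faults (stands in for ATPG)
    """
    detected = set()
    targeted_per_mode = []
    for mode, reachable in enumerate(testable_in_mode):
        targets = (fault_universe & reachable) - detected
        targeted_per_mode.append(targets)
        detected |= detect(mode, targets)
    return detected, targeted_per_mode


# Toy data: 'x' (an ARM-to-LCD path) and 'l1' only become reachable in mode 1.
universe = {"a1", "a2", "x", "l1"}
reachable = [{"a1", "a2"}, {"a1", "a2", "x", "l1"}]
found, targeted = multi_pass(
    universe, reachable, lambda mode, targets: {f for f in targets if f != "a2"}
)
print(sorted(found), [sorted(t) for t in targeted])
```

Because the fastest mode runs first, each fault is credited at the highest frequency at which it can be detected.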

  • SNUG 2013 44

    Incremental ATPG Flow

    Complete the ATPG generation using stuck-at fault model

    [Flow chart. Elements: multi-pass flow with run_atpg (target : 80%); transition fault final dictionary; "update_faults \ direct_credit mode3" (stuck-at equivalency); "add_faults all"; stuck-at dictionary; run_atpg (95%); final dictionary and stuck-at patterns.]

  • SNUG 2013 50

    6/7 Scan At-speed Results

    SAMA5D3 product

  • SNUG 2013 51

    ATPG results : SAMA5D3 product

    Scan at-speed ATPG multi-pass incremental flow (with OCC)

                          # faults                        # patterns   coverage
    Transition fault model
      Mode 0 (400)        971,686                         2,081        76.25%
      Mode 1 (266)        8,114                           467          80.02%
      Mode 2 (133)        5,867,180                       12,874       75.51%
      Mode 3 (66)         12,856                          454          77.64%
      Total transition    6,859,836                       15,876       75.62% *
    Stuck-at fault model
      Total stuck-at      9,287,250 (incr : 2,507,386)    4,337        95.10%
    Total patterns                                        20,213

    Scan ATPG stuck-at only (without OCC)

                          # faults                        # patterns   coverage
      Total stuck-at      9,287,250                       8,615        95.15%

    Ratio : 2.4

    * Without inter-domain at-speed test, at-speed coverage would have been around 60%
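The totals in the table can be cross-checked from the per-mode rows. The sketch below assumes that the total transition coverage is a fault-weighted average of the per-mode coverages and that the quoted ratio compares pattern counts with and without OCC; both are interpretations, not statements from the slides.

```python
# Per-mode transition results: (faults, patterns, coverage in %)
transition = {
    "Mode 0 (400)": (971_686, 2_081, 76.25),
    "Mode 1 (266)": (8_114, 467, 80.02),
    "Mode 2 (133)": (5_867_180, 12_874, 75.51),
    "Mode 3 (66)": (12_856, 454, 77.64),
}

total_faults = sum(f for f, _, _ in transition.values())
total_patterns = sum(p for _, p, _ in transition.values())
weighted_cov = sum(f * c for f, _, c in transition.values()) / total_faults

stuck_at_patterns = 4_337   # incremental stuck-at pass
baseline_patterns = 8_615   # stuck-at only, without OCC
all_patterns = total_patterns + stuck_at_patterns

print(total_faults, total_patterns, all_patterns)
print(round(weighted_cov, 2), round(all_patterns / baseline_patterns, 2))
```

The sums reproduce the reported totals (6,859,836 faults, 15,876 and 20,213 patterns), and the weighted average lands close to the reported 75.62%.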

  • SNUG 2013 52

    SAMA5D3 : Early Silicon Results At-speed VS stuck-at

    Stuck-at patterns only : 17 fails

    At-speed patterns : 19 fails

    2 parts were caught thanks to scan at-speed (251 parts tested)

    [Figure: additional fails per mode. Mode 0 : 0.4% loss vs stuck-at; Mode 1 : 0% loss vs stuck-at; Mode 2 : 0.8% loss vs stuck-at; Mode 3 : 0% loss vs stuck-at. Annotations: high freq, low freq, small area, big area.]

  • SNUG 2013 53

    7/7 Conclusions and future work

  • SNUG 2013 54

    Conclusions and future work

    We have successfully deployed the Synopsys OCC in our DFT implementation flows at ATMEL Rousset, leading to the ATMEL SAMA5D3 product, in production since the beginning of 2013

    Gain of the approach : increase of 15% of at-speed coverage

    This methodology is now part of the official Atmel flow for all new SoCs

    For future work, we plan to implement a custom OCC as described in solution (2), improve the handling of functional timing exceptions in the flow, and consider complementary fault models such as Small Delay or Bridging Defects.

  • SNUG 2013 55

    Thank You

  • SNUG 2013 56

    Appendix A

    CTS constraints

  • SNUG 2013 57

    CTS constraints

    CTS constraints aim at :

    balancing all flip-flops in the OCC driven by the fast_clk OCC input,

    balancing all flip-flops in the OCC driven by the slow_clk input,

    balancing all flip-flops in the clock chain driven by the OCC output clk,

    tagging the nets from the PLL outputs to the OCC fast_clk input,

    tagging the net from the ATE clock pad to the OCC slow_clk input,

    tagging the net from the ATE ref clock pad to the PLL input,

    tagging the net from the OCC output (clk) to the APMC scan multiplexor.

    See example on next slide

    (Caution : CTS constraints are design-dependent. The example is not exhaustive.)

  • SNUG 2013 58

    CTS Constraints - Example

  • SNUG 2013 59

    CTS Constraints - Reusability

    CTS constraints are not reusable :

    Constraints at top level are design-dependent

    Constraints in the test mode controller are design-dependent

    Excluded pins at the input of APMC are automatically set by the ::MCTS::check_CTS_spec case_sensitive during the functional pass of CTS