57
Pontifícia Universidade Católica do Rio Grande do Sul Fault Classification of the Error Detection Logic in the Blade Resilient Template Felipe A. Kuentzer, and Alexandre M. Amory May 9th, 2016 Pontifcia Universidade Catlica do Rio Grande do Sul, Porto Alegre, Brazil

Fault Classification of the Error Detection Logic in the

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Fault Classification of the Error Detection

Logic in the Blade Resilient Template

Felipe A. Kuentzer, and Alexandre M. Amory

May 9th, 2016

Pontificia Universidade Catolica do Rio Grande do Sul, Porto Alegre, Brazil

Page 2: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Overheads

• Delay margins limits performance gains and increase power

consumption

MOTIVATION | 2

Cycle time of

clocked logic

Comb. Logic

Page 3: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Overheads

• Delay margins limits performance gains and increase power

consumption

MOTIVATION | 2

Cycle time of

clocked logic

Comb. LogicWorst-case data

Clock margin

PVT margin

Page 4: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Overheads

• Delay margins limits performance gains and increase power

consumption

• Increase desire to operate near threshold voltage levels

– Challenging in technologies like 45nm, 28nm and below due to PVT

MOTIVATION | 2

Cycle time of

clocked logic

Comb. LogicWorst-case data

Clock margin

PVT margin

Page 5: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Resilient Design

• Resilient design emerged as a solution to reduce timing

margins

• Allow the circuit to operate with relaxed timing constraints

– Extra logic to detect and recover from timing violations

MOTIVATION | 3

Page 6: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Resilient Design

• Resilient design emerged as a solution to reduce timing

margins

• Allow the circuit to operate with relaxed timing constraints

– Extra logic to detect and recover from timing violations

• Past synchronous resilient designs

– Metastability issues

– High recovery penalties

MOTIVATION | 3

Page 7: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Resilient Design

• Blade combines the advantages of the asynchronous design

and the best features of past resilient schemes

– 30% power reduction at the same performance for a 10% area

overhead

MOTIVATION | 4

Page 8: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Resilient Design

• Blade combines the advantages of the asynchronous design

and the best features of past resilient schemes

– 30% power reduction at the same performance for a 10% area

overhead

• Why resilient designs are not widely used?

MOTIVATION | 4

Page 9: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Resilient Design

• Blade combines the advantages of the asynchronous design

and the best features of past resilient schemes

– 30% power reduction at the same performance for a 10% area

overhead

• Why resilient designs are not widely used?

• Challenges in terms of testability

MOTIVATION | 4

Page 10: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Testability Issues

• Resilient circuits are inherently tolerant to timing violations

– Pass/fail criteria needs to be re-examined

MOTIVATION | 5

Page 11: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Testability Issues

• Resilient circuits are inherently tolerant to timing violations

– Pass/fail criteria needs to be re-examined

• Area overhead may be critical

– Limited chip area for test circuitry

MOTIVATION | 5

Page 12: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Testability Issues

• Resilient circuits are inherently tolerant to timing violations

– Pass/fail criteria needs to be re-examined

• Limited chip area for test circuitry

– Area overhead may be critical

• Is functional test enough?

• Can we use existing error detection logic for testing?

MOTIVATION | 5

Page 13: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Outline

• Blade Template

• Simulation Environment

• EDL Fault Classification

• EDL Fault Analysis

– Stuck-at Faults

– Delay Faults

• Conclusions and Next Steps

OUTLINE | 6

Page 14: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Blade Template

• Blade controller

– Two 2-phase BD channels

• L and R

• LE and RE

BLADE TEMPLATE | 7

[Hand et al., ASYNC 2015]

Page 15: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Blade Template

• Blade controller

– Two 2-phase BD channels

• L and R

• LE and RE

• Error Detection Logic (EDL)

BLADE TEMPLATE | 7

[Hand et al., ASYNC 2015]

Page 16: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Blade Template

• Blade controller

– Two 2-phase BD channels

• L and R

• LE and RE

• Error Detection Logic (EDL)

• Two reconfigurable delay lines

– δ time where data can be propagate through the EDL

– ∆ time that the latch is transparent

BLADE TEMPLATE | 7

[Hand et al., ASYNC 2015]

Page 17: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Error Detection Logic

BLADE TEMPLATE | 8

[Hand et al., ASYNC 2015]

Page 18: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Error Detection Logic

BLADE TEMPLATE | 8

[Hand et al., ASYNC 2015]

tcomp = tx,pd + tx,pw

tx,pd tx,pw

Page 19: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Error Detection Logic

BLADE TEMPLATE | 8

[Hand et al., ASYNC 2015]

tcomp = tx,pd + tx,pw

tcomp Δ

Page 20: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Error Detection Logic

BLADE TEMPLATE | 8

[Hand et al., ASYNC 2015]

tcomp

tCE,pd +tOR,pd +tQF,setup

tcomp = tx,pd + tx,pw

Page 21: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Error Detection Logic

BLADE TEMPLATE | 8

[Hand et al., ASYNC 2015]

tCE,pd +tOR,pd +tQF,setup

tcomp

tcomp = tx,pd + tx,pw

TRW

Δ

Page 22: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Simulation Environment

• Two types of faults are considered

– Stuck-at faults

• Wires stuck-at-0 (ST0) stuck-at-1 (ST1)

– Delay faults

• Longer propagation delay (LPD)

• Shorter propagation delay (SPD)

SIMULATION ENVIRONMENT | 9

Page 23: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Simulation Environment

• Two types of faults are considered

– Stuck-at faults

• Wires stuck-at-0 (ST0) stuck-at-1 (ST1)

– Delay faults

• Longer propagation delay (LPD)

• Shorter propagation delay (SPD)

• Only single faults are considered

• Data is not masked between stages

SIMULATION ENVIRONMENT | 9

Page 24: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Simulation Environment

• 3 stage pipeline

SIMULATION ENVIRONMENT | 10

stage 0 stage 1 stage 2

Page 25: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Simulation Environment

• 3 stage pipeline

– Dummy combinational logic – balanced stages

– Timing violations (TV) by extending the inverters delay

SIMULATION ENVIRONMENT | 10

stage 0 stage 1 stage 2

Page 26: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Simulation Environment

• 3 stage pipeline

– Dummy combinational logic – balanced stages

– Timing violations (TV) by extending the inverters delay

• Faults are inserted only in stage 1

• Error signals are observable

SIMULATION ENVIRONMENT | 10

stage 0 stage 1 stage 2

Page 27: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Fault Classification

• Generalizes the relationship between cause and effect of

the analyzed faults

EDL FAULT CLASSIFICATION | 11

Page 28: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Fault Classification

• Generalizes the relationship between cause and effect of

the analyzed faults

• Some faults only affect the performance

• Some faults impact directly the circuit resiliency

– Disables the EDL

EDL FAULT CLASSIFICATION | 11

Page 29: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Fault Classification

• Generalizes the relationship between cause and effect of

the analyzed faults

• Some faults only affect the performance

• Some faults impact directly the circuit resiliency

– Disables the EDL

EDL FAULT CLASSIFICATION | 11

Page 30: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Fault Classification

• Fault classification for stuck-at fault model

EDL FAULT CLASSIFICATION | 12

Page 31: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Fault Classification

• Fault classification for delay fault model

EDL FAULT CLASSIFICATION | 13

Page 32: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Stuck-at Faults

w1

– SA0

• Pipe halts (CLK never rises)

– SA1

• Pipe halts (CLK never falls)

EDL FAULT ANALYSIS | 14

Page 33: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Stuck-at Faults

w1

– SA0

• Pipe halts (CLK never rises)

– SA1

• Pipe halts (CLK never falls)

w4

– SA0

• Error is never captured and TV propagated to the next stage

– SA1

• Error is always generated, but circuit is still functional

EDL FAULT ANALYSIS | 14

Page 34: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Q-Flop, OR and AND

– SPD

• It won't affect the functionality

– LPD

• Only affects the performance

EDL FAULT ANALYSIS | 15

Page 35: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• C-element and OR Gate

– LPD

• Can miss a TV

EDL FAULT ANALYSIS | 16

Page 36: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• C-element and OR Gate

– LPD

• Can miss a TV

EDL FAULT ANALYSIS | 16

Page 37: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• C-element and OR Gate

– LPD

• Can miss a TV

EDL FAULT ANALYSIS | 16

Page 38: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• C-element and OR Gate

– LPD

• Can miss a TV

EDL FAULT ANALYSIS | 16

Page 39: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• C-element and OR Gate

– LPD

• Can miss a TV

• Captured by the next state

EDL FAULT ANALYSIS | 16

TV at the end of TRW

Page 40: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Delay line – tcomp

– LPD

• Undetectable without a TV

EDL FAULT ANALYSIS | 17

Page 41: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Delay line – tcomp

– LPD

• Undetectable without a TV

EDL FAULT ANALYSIS | 17

Page 42: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Delay line – tcomp

– LPD

• Undetectable without a TV

EDL FAULT ANALYSIS | 17

Page 43: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Delay line – tcomp

– LPD

• Undetectable without a TV

EDL FAULT ANALYSIS | 17

Page 44: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Delay line – tcomp

– LPD

• Undetectable without a TV

EDL FAULT ANALYSIS | 17

Page 45: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Delay line – tcomp

– LPD

• Undetectable without a TV

EDL FAULT ANALYSIS | 17

Page 46: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Delay Faults

• Delay line – tcomp

– LPD

• Undetectable without a TV

• TV must be generated

EDL FAULT ANALYSIS | 17

TV at the beginning of TRW

Page 47: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

EDL FAULT ANALYSIS | 18

• Three approaches evaluated

– Functional

Page 48: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

EDL FAULT ANALYSIS | 18

• Three approaches evaluated

– Functional

– Scan chain• Observability of Err1 and Err0

Page 49: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

EDL FAULT ANALYSIS | 18

• Three approaches evaluated

– Functional

– Scan chain• Observability of Err1 and Err0

– TV generator• Some faults are only observed

when the circuit has a TV

• Glitch generator

• Delay test pattern

Page 50: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

• Three approaches evaluated

– Functional

– Scan chain• Observability of Err1 and Err0

– TV generator• Some faults are only observed

when the circuit has a TV

• Glitch generator

• Delay test pattern

EDL FAULT ANALYSIS | 18

Page 51: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

EDL FAULT ANALYSIS | 18

• Three approaches evaluated

– Functional

– Scan chain• Observability of Err1 and Err0

– TV generator• Some faults are only observed

when the circuit has a TV

• Glitch generator

• Delay test pattern

Page 52: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

EDL FAULT ANALYSIS | 18

• Three approaches evaluated

– Functional

– Scan chain• Observability of Err1 and Err0

– TV generator• Some faults are only observed

when the circuit has a TV

• Glitch generator

• Delay test pattern

Page 53: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

EDL FAULT ANALYSIS | 18

• Three approaches evaluated

– Functional

– Scan chain• Observability of Err1 and Err0

– TV generator• Some faults are only observed

when the circuit has a TV

• Glitch generator

• Delay test pattern

Page 54: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Test Approaches

EDL FAULT ANALYSIS | 18

• Three approaches evaluated

– Functional

– Scan chain• Observability of Err1 and Err0

– TV generator• Some faults are only observed

when the circuit has a TV

• Glitch generator

• Delay test pattern

Page 55: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Conclusions and Next Steps

• Fault classification

– Identify faults based on the effects observed in the overall circuit

operation

– Extract fault coverage for three approaches

– Design for testability must be incorporated

• Increase observability and controlability

CONCLUSIONS | 19

Page 56: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Conclusions and Next Steps

• Fault classification

– Identify faults based on the effects observed in the overall circuit

operation

– Extract fault coverage for three approaches

– Design for testability must be incorporated

• Increase observability and controlability

• Next steps

– Investigate alternatives for TV generation

– Apply this fault analysis and classification to other resilient designs

CONCLUSIONS | 19

Page 57: Fault Classification of the Error Detection Logic in the

Pontifícia Universidade Católica do Rio Grande do Sul

Questions?