Detecting Multi-cycle Errors using Invariance Information

European Test Symposium, May 28, 2008

Nuno Alves, Jennifer Dworak, and R. Iris Bahar

Division of Engineering, Brown University

Providence, RI 02912

Kundan Nepal

Electrical Engineering Dept., Bucknell University

Lewisburg, PA 17837

Online Error Detection

GOAL: Detect transient faults that may occur in a circuit during operation

Becoming critical as circuits scale to smaller sizes

Error detection in memory is “relatively easy” since we know the answer a priori

What about random logic?

Determine that the functional transformation of the inputs to the outputs has occurred correctly.

But how do we know what it’s supposed to be?

Online Detection Techniques

Use of pre-computed test vectors and their expected responses (stored in hardware)

Duplicating the computation of disjoint hardware elements and voting on the result (e.g.: TMR or logic duplication)

Use of check bits (e.g.: parity bit or Hamming Codes)

Our Approach

Instead, find invariant relationships that are naturally present among circuit sites in a block of logic.

Discovered automatically without requiring knowledge of the circuit’s function or designer constraints.

Can be used for combinational or sequential circuits.

For sequential circuits, invariants can be within a single cycle or across multiple cycles.

Violations of these expected relationships can then be used to identify errors.

All error checking is done off the critical path.

[Figure: invariant relationships verified by checker logic]

Invariant Relationships in Circuits

In all digital circuits, certain relationships must always be present among particular circuit sites if the circuit is operating correctly.

Example: n5=1 ⇒ n8=0

These relationships can be thought of as logic invariants or logic implications.

[Figure: example circuit with sites n1 through n8]

Error Detection with Implications

Once we have discovered the implication, extra HW is added to the circuit to verify that the implication holds.

If n5=1 & n8=1, this implies an error occurred in the circuit.

[Figure: example circuit with sites n1 through n8 and added checker logic that raises ERROR when n5=1 ⇒ n8=0 is violated]
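As a minimal software sketch of what such checker logic computes (Python rather than hardware; the helper name make_checker is hypothetical and not from the talk), checking an implication a=va ⇒ b=vb reduces to flagging the single forbidden combination of values:

```python
def make_checker(a_val: int, b_val: int):
    """Build a checker for the implication (a == a_val) => (b == b_val).

    The returned function outputs 1 (ERROR) only when the antecedent
    holds but the consequent does not.
    """
    def error(a: int, b: int) -> int:
        return int(a == a_val and b != b_val)
    return error


# Checker for the slide's implication n5=1 => n8=0.
check_n5_n8 = make_checker(1, 0)

for n5 in (0, 1):
    for n8 in (0, 1):
        print(f"n5={n5} n8={n8} ERROR={check_n5_n8(n5, n8)}")
```

For this particular implication the checker is equivalent to a single AND gate on n5 and n8.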

Process Overview (1/2)

[Flow diagram: Verilog Description → Logic Simulation → Find & Validate Implications]

1. Run logic simulation to identify potential implication sites (parallel search).

2. Check all implications for validity using a SAT solver.

3. Eliminate implications subsumed by others.
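Steps 1 and 2 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the 3-input toy circuit, the site names, and the function names are hypothetical, and exhaustive enumeration stands in for the SAT solver, which is only feasible because the toy circuit is tiny.

```python
import itertools
import random


def simulate(a, b, c):
    """Hypothetical toy circuit; returns the value at every site."""
    n1 = a & b
    n2 = b | c
    n3 = n1 ^ 1          # NOT n1
    n4 = n2 & n3
    return {"a": a, "b": b, "c": c, "n1": n1, "n2": n2, "n3": n3, "n4": n4}


def find_candidates(vectors):
    """Step 1: propose (p=vp => q=vq) if simulation saw p=vp with q=vq
    at least once and never saw p=vp together with q!=vq."""
    observed = set()
    for vec in vectors:
        sites = simulate(*vec)
        for p, vp in sites.items():
            for q, vq in sites.items():
                if p != q:
                    observed.add((p, vp, q, vq))
    return {
        (p, vp, q, vq)
        for (p, vp, q, vq) in observed
        if (p, vp, q, 1 - vq) not in observed
    }


def holds_everywhere(implication):
    """Step 2 stand-in: exhaustive check instead of a SAT query."""
    p, vp, q, vq = implication
    for vec in itertools.product((0, 1), repeat=3):
        sites = simulate(*vec)
        if sites[p] == vp and sites[q] != vq:
            return False
    return True


random.seed(0)
vectors = [tuple(random.randint(0, 1) for _ in range(3)) for _ in range(4)]
candidates = find_candidates(vectors)
valid = {imp for imp in candidates if holds_everywhere(imp)}
print(len(candidates), "candidates,", len(valid), "validated")
```

Simulation can only rule implications out, so candidates found from a limited vector set may be false; the validation pass removes them.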

Process Overview (2/2)

[Flow diagram: Fault Analysis → Compress Implications → Return Verilog Description with Additional HW]

4. Determine the fault coverage of all valid implications.

5. Select a subset of the valid implications such that the highest fault coverage is obtained given a user-specified hardware budget.

6. Deliver a modified Verilog file with implication logic.
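The talk does not say which selection algorithm is used in step 5. One plausible sketch, offered purely as an assumption, is a greedy heuristic that repeatedly picks the implication covering the most not-yet-covered faults per unit of hardware cost. The implication names, fault names, costs, and the function select_implications are all hypothetical.

```python
def select_implications(coverage, cost, budget):
    """Greedy budgeted selection (an assumed heuristic, not the authors').

    coverage: implication -> set of faults it detects
    cost:     implication -> hardware cost of its checker
    budget:   total hardware cost allowed
    """
    chosen, covered, remaining = [], set(), budget
    candidates = set(coverage)
    while True:
        best, best_gain = None, 0.0
        for imp in sorted(candidates):          # sorted for determinism
            if cost[imp] > remaining:
                continue
            gain = len(coverage[imp] - covered) / cost[imp]
            if gain > best_gain:
                best, best_gain = imp, gain
        if best is None:                        # nothing useful still fits
            break
        chosen.append(best)
        covered |= coverage[best]
        remaining -= cost[best]
        candidates.remove(best)
    return chosen, covered


# Hypothetical fault-analysis results.
coverage = {
    "i1": {"f1", "f2", "f3"},
    "i2": {"f3", "f4"},
    "i3": {"f1"},
    "i4": {"f5"},
}
cost = {"i1": 2, "i2": 2, "i3": 1, "i4": 2}

chosen, covered = select_implications(coverage, cost, budget=4)
print(chosen, sorted(covered))
```

A greedy choice is not guaranteed to be optimal for a budgeted coverage problem; it is shown only because it is simple to follow.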

Multi-cycle Implications

Implications can also exist across latch boundaries, over multiple time cycles.

Faults not adequately covered by single-cycle implications may be better covered across cycles with additional spatial distance.

In particular, including multi-cycle implications in checker hardware can help detect errors near latch boundaries

May also be useful in detecting delay faults

Multi-cycle Implication Example

Consider only the combinational portion of this circuit:

[Figure: circuit with inputs A and B, internal sites X and Y, and output F]

There are no useful implications we can use for error checking.

What if we created a (virtual) copy of this circuit and searched for implications across time cycles?

[Figure sequence: the circuit copied into two time frames, annotated "B=0 ⇒ F=0" and "B1=0 ⇒ F2=0", with a checker output labeled ERROR]
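The slide's actual circuit is not recoverable from this text, so the sketch below uses a hypothetical circuit that merely reuses the names A, B, X, Y, and F. It assumes X is a latch output (present state) and Y the latch input (next state). It shows how unrolling into two time frames can expose an implication that does not hold within a single frame.

```python
import itertools


def comb(a, b, x):
    """Hypothetical combinational logic: returns (f, y).

    x is the latch output for the current cycle; y is latched and
    becomes x in the next cycle.
    """
    y = a & b
    f = x & (a | b)
    return f, y


def holds_single_cycle():
    """Does B=0 => F=0 hold in one frame, with the state x unconstrained?"""
    return all(
        comb(a, 0, x)[0] == 0
        for a, x in itertools.product((0, 1), repeat=2)
    )


def holds_across_two_cycles():
    """Does B1=0 => F2=0 hold when frame 1's y feeds frame 2's x?"""
    for a1, x1, a2, b2 in itertools.product((0, 1), repeat=4):
        _, y1 = comb(a1, 0, x1)      # frame 1 with B1 = 0
        f2, _ = comb(a2, b2, y1)     # frame 2 sees x2 = y1
        if f2 != 0:
            return False
    return True


print("single-cycle B=0 => F=0:", holds_single_cycle())
print("two-cycle B1=0 => F2=0:", holds_across_two_cycles())
```

In this toy circuit the single-frame check fails because the latch state is unconstrained, while the two-frame check succeeds because B1=0 forces the latched value to 0.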

Experimental Setup

Tested approach on circuits from ISCAS benchmarks

3-step process:

1. Initial (non-validated) implications generated using a random set of 32,000 input vectors

2. zChaff SAT solver used to validate the initial set of implications

3. Run fault coverage using these implications in the checker

Process completed for single cycle and across 2 time cycles

Covering Faults with Implications

For each random input vector, and at each fault, the implications-based circuit operation can fall into one of the following 4 categories:

[Table: Case 1 through Case 4, classified by whether the error propagates to an output and whether an implication is violated]

Detection of Observable Faults

Detection rate: Case1/[Case1+Case4]

Average 9.6% improvement in detection rate with multi-cycle implications
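The ratio above can be written as a small helper. This sketch assumes, from the slide title and the formula, that Case 1 counts observable errors that violate an implication and Case 4 counts observable errors that violate none; the function name and the example counts are hypothetical.

```python
def detection_rate(case1: int, case4: int) -> float:
    """Fraction of observable errors flagged by the checker.

    case1: observable errors that violate an implication (assumed meaning)
    case4: observable errors that violate no implication (assumed meaning)
    """
    observable = case1 + case4
    if observable == 0:
        raise ValueError("no observable errors were recorded")
    return case1 / observable


# Hypothetical counts, not results from the talk.
print(detection_rate(90, 10))
```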

Do We Need All Implications?

Generating checker logic for all discovered implications is unnecessary and wasteful

We only want to keep important implications that:

Detect many faults

Identify hard-to-detect faults

Cover faults not detected by other implications

Identifying Important Implications

Finding these important implications requires a combination of:

structural analysis to remove subsumed implications

fault analysis to determine the specific fault coverage for each implication.

Pruning Implications

Conclusions

We presented a practical online error detection alternative based on implication validation

Does not require modification of targeted logic

Checker logic is added off the critical path and runs in parallel with the rest of the circuit.

We show our approach can easily be expanded to discover implications across multiple time frames to improve fault coverage.

We detect up to 90% of observable faults

Multi-cycle implications boost detection rate by almost 10% on average
