Net Diagnosis using Stuck-at and Transition Fault Models Master’s Defense Lixing Zhao

Oct. 26, 2011 Lixing's MS Defense 1


Department of Electrical and Computer Engineering
Auburn University, AL 36849 USA

Thesis Advisor: Dr. Vishwani D. Agrawal
Thesis Committee: Dr. Adit Singh and Dr. Victor P. Nelson


Outline

Motivation
Background
Problem Statement
Contributions
  Proposed Fault Filtering System
  Proposed Fault Ranking System
  Proposed Net Ranking System
Conclusion



Motivation

Because of the high logic density of modern VLSI design and manufacturing, chips from the first round of tape-out often suffer from a relatively low yield that may be unacceptable.

Fault diagnosis can raise the yield in later manufacturing rounds by identifying the possible causes of defects in earlier tape-outs.

Net fault diagnosis is an important area of fault diagnosis. Because of the large routing area of modern VLSI devices, the routing interconnection nets are more vulnerable to certain defects.

In this work, we try to provide an effective method for solving the net-diagnosis problem.






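How does Fault Diagnosis Work?

[Figure: diagnosis flow. Test vectors are applied to the defective circuit to obtain the circuit responses, and to the circuit net-list to obtain the expected responses. The two sets of responses are compared, and the diagnosis algorithm reports the possible defect locations.]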
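The comparison step in this flow can be sketched in a few lines. This is only an illustration, assuming responses are given as one bit string per test pattern; the function name `failing_response` and the sample data are hypothetical and do not come from the thesis.

```python
def failing_response(expected, observed):
    """Compare expected and observed responses pattern by pattern.

    Returns {pattern_index: [indices of erroneous primary outputs]}
    for every pattern on which the two responses differ.
    """
    failures = {}
    for p, (exp, obs) in enumerate(zip(expected, observed)):
        epos = [i for i, (e, o) in enumerate(zip(exp, obs)) if e != o]
        if epos:
            failures[p] = epos
    return failures


# Hypothetical data: three patterns, three primary outputs each.
expected = ["010", "111", "001"]
observed = ["010", "101", "011"]
print(failing_response(expected, observed))
```

The keys of the result are the failing pattern indices and the values are the erroneous primary outputs, which is the information the diagnosis algorithm starts from.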


Circuit Under Diagnosis

The Circuit Under Diagnosis (CUD) can be classified into two groups:
  Combinational circuits
  Sequential circuits


Diagnosis Pattern

Random Pattern:
  Randomly or pseudo-randomly generated
  Produced by a computer program or a pattern generator (e.g., LFSR)

N-Detect Pattern:
  Each fault is detected by at least N different patterns
  ATPG-based

Fault-Model-Based Patterns:
  Patterns used for diagnosing faults based on a certain fault model
  Generated using a dedicated generation algorithm and ATPG

• Yu Zhang and Vishwani D. Agrawal, "A Diagnostic Test Generation System," in Proc. International Test Conf., Nov. 2010, pp. 1-9.
• Yu Zhang and Vishwani D. Agrawal, "Reduced Complexity Test Generation Algorithms for Transition Fault Diagnosis," in Proc. International Conf. on Computer Design, Oct. 2011, pp. 96-101.
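As an illustration of the pseudo-random pattern generator mentioned above, here is a minimal sketch of a left-shifting Fibonacci LFSR. The width, seed, and tap positions are assumptions chosen for the example (a 4-bit register with feedback from bits 3 and 2); the thesis does not state which generator was used.

```python
def lfsr_patterns(seed, taps, width, count):
    """Generate `count` patterns from a left-shifting Fibonacci LFSR.

    seed  : non-zero initial state
    taps  : bit positions (0 = least significant) XORed into the feedback
    width : register width in bits
    """
    mask = (1 << width) - 1
    state = seed & mask
    patterns = []
    for _ in range(count):
        patterns.append(format(state, f"0{width}b"))
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask
    return patterns


# Hypothetical 4-bit example; this tap choice cycles through all 15 non-zero states.
for pattern in lfsr_patterns(seed=0b0001, taps=(3, 2), width=4, count=5):
    print(pattern)
```

Each emitted bit string would be applied to the primary inputs as one test pattern.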



Fault Models

A fault model is an abstraction of a real defect in a chip, and different fault models are used to handle different types of defects in fault diagnosis.

Fault models that can be applied to a net:

Stuck-at fault

Transition fault

Bridging fault

Open fault

Open Fault

Interconnect Open: undesired breaks and electrical discontinuities on an interconnection line.

Resistive Open: a narrow crack, modeled as a transition delay fault.

Complete Open: a wide crack. Factors affecting the floating node:
  The coupling capacitors between the floating node and the supply and ground.
  The drain voltage of the driven gates.
  The effects from surrounding lines.

Net Structure


A net is a metal wire that forms a connection in the circuit.

Previous Works on Multiple Faults Diagnosis

Single-Fault-Simulation (SFS)-Based
Multiple-Faults-Simulation (MFS)-Based
Single-Location-At-a-Time (SLAT)-Based
Region-Model-Based


SLAT


X-Region


Diagnosis Strategies

Cause-Effect method:
  Diagnoses faults by comparing fault simulation results with the CUD response.
  Traditional simulation
  More information is available, but at a higher cost.

Effect-Cause method:
  Diagnoses faults by tracing back from the erroneous primary outputs.
  Back-trace simulation
  Some information is lost, but the cost is lower.
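To give a feel for the effect-cause direction, the sketch below traces back structurally from an erroneous primary output and collects every node in its fan-in cone. This is a simplification made for illustration: it ignores logic values, and the net-list representation (a dictionary from each node to its drivers) and the node names are assumptions, not the back-trace procedure used in the thesis.

```python
def fanin_cone(netlist, output):
    """Collect all nodes that can structurally reach `output`.

    netlist maps each node to the list of nodes driving it;
    primary inputs simply have no entry.
    """
    cone, stack = set(), [output]
    while stack:
        node = stack.pop()
        if node in cone:
            continue
        cone.add(node)
        stack.extend(netlist.get(node, ()))
    return cone


# Hypothetical two-gate circuit: g1 = f(a, b) drives PO1, g2 = f(b, c) drives PO2.
netlist = {"PO1": ["g1"], "g1": ["a", "b"], "PO2": ["g2"], "g2": ["b", "c"]}
suspects = fanin_cone(netlist, "PO1") & fanin_cone(netlist, "PO2")
print(sorted(suspects))
```

If both outputs fail under the same pattern and a single fault is assumed responsible, only nodes in the intersection of the two cones remain as suspects.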




Problem Statement

Given the failing response of the CUD:
  Failing pattern index
  Index of Erroneous Primary Outputs (EPO)

Given the net-list of the CUD:
  Verilog file

Find the locations of faulty nets with certain defects.






Proposed Fault Filtering System

Count Assignment: A Count is a value we assign to each fault candidate under a certain measurement method.

Contribution: A more balanced count assignment method for fault candidate filtering.

The Count we use in our filtering system is a ratio count:

$$\text{Count} = \frac{\text{Matched Value}}{\text{Expected Value in Fault Simulation}}$$




Failing Pattern Index Matching

DEF1: The union of all the failing pattern indices from the single fault simulation of a fault candidate is defined as the Detectable Pattern Set (DPS) of this fault under the test.

DEF2: The union of all the failing pattern indices of the observed CUD response is defined as the Failing Pattern Set (FPS) of the test.

DEF3: The shared part between the DPS of a fault candidate and the FPS of the CUD is called the Shared Pattern Set (SPS) of this candidate.

• THEOREM: If the CUD is a circuit with multiple faults and we assume that the multiple faults in the circuit do not totally cancel each other on the primary outputs, then the DPS of any one of the multiple faults in the circuit should be a subset of the FPS of the test. In other words, the SPS of the fault candidate equals its DPS.

• INTUITIVE ASSUMPTION: The fraction of the DPS of a fault that is covered by its SPS represents the likelihood that this fault is a real one in the CUD.

• $$\text{Count} = \frac{\text{Number of patterns in SPS}}{\text{Number of patterns in DPS}}$$
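The ratio count defined above maps directly onto set operations. The sketch below assumes the DPS and FPS are available as collections of pattern indices; the function name `fpim_count` and the sample sets are hypothetical.

```python
def fpim_count(dps, fps):
    """Failing Pattern Index Matching count: |SPS| / |DPS|.

    dps : pattern indices on which the candidate fault is detected in simulation
    fps : pattern indices on which the CUD fails
    """
    dps, fps = set(dps), set(fps)
    if not dps:
        return 0.0  # the candidate is never detected, so it explains nothing
    sps = dps & fps  # Shared Pattern Set
    return len(sps) / len(dps)


# Hypothetical data: the CUD fails on patterns 1, 2, 3 and 5.
fps = {1, 2, 3, 5}
print(fpim_count({1, 2, 3}, fps))     # every detecting pattern also fails on the CUD
print(fpim_count({1, 2, 4, 6}, fps))  # only half of the detecting patterns fail on the CUD
```

A candidate whose DPS is a subset of the FPS receives a count of 1, which is the situation the theorem describes for a real fault with no cancellation.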



FPIM with EPO-Hitting

DEF: Under the same test pattern, if the primary outputs affected in the fault simulation of a candidate share at least one erroneous output with the faulty response of the CUD, we say that this fault candidate 'hits' the EPO under this pattern, and this pattern is called a Hit-Pattern of this candidate.

$$\text{Count} = \frac{\text{Number of Hit-Patterns in SPS}}{\text{Number of patterns in DPS}}$$
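A minimal sketch of the EPO-Hitting count follows, assuming both the simulated response of the candidate and the CUD response are stored as dictionaries from a failing pattern to the set of outputs that fail under it. The function name and the sample data are hypothetical.

```python
def epo_hit_count(sim, cud):
    """FPIM with EPO-Hitting: (Hit-Patterns in SPS) / (patterns in DPS).

    sim : {pattern: set of outputs affected in the candidate's fault simulation}
    cud : {pattern: set of erroneous primary outputs observed on the CUD}
    """
    if not sim:
        return 0.0
    hits = 0
    for pattern, outputs in sim.items():
        # A pattern is a Hit-Pattern if it is in the SPS and shares at least one EPO.
        if pattern in cud and set(outputs) & set(cud[pattern]):
            hits += 1
    return hits / len(sim)


# Hypothetical data.
cud = {1: {"PO1", "PO2"}, 2: {"PO1"}, 3: {"PO3"}}
sim = {1: {"PO1"}, 2: {"PO2", "PO3"}, 4: {"PO1"}}
print(epo_hit_count(sim, cud))
```

In this example the DPS has three patterns, two of which are in the SPS, but only pattern 1 shares an erroneous output with the CUD, so the count is 1/3.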

EPO-Matching

DEF1: A Pattern-EPO-Pair (PEP) is a pair consisting of a failing pattern number and an EPO associated with it. For example, [P2, PO1] indicates that under pattern P2 an error is observed on PO1. A PEP can represent either the faulty response of CUD testing or the fault simulation result of a fault candidate.

DEF2: The union of all the PEPs from CUD testing is called the PEP-Set-of-CUD (PEPSC), and the union of the PEPs under all the patterns in the SPS of a fault candidate is called the PEP-Set-of-Fault (PEPSF). The shared part of the PEPSF of a fault candidate with the PEPSC of the CUD is called the Shared-PEP-Set (SPEPS) of this fault candidate.

THEOREM: If the CUD is a circuit with multiple faults and we assume that there is no cancelling effect among these faults, then the PEPSF of a fault candidate in single fault simulation should be a subset of the PEPSC of the CUD, and the SPEPS of the fault candidate should equal its PEPSF.

INTUITIVE ASSUMPTION: The fraction of the PEPSF of a fault candidate that is covered by its SPEPS indicates the likelihood that this fault is a real one in the CUD.

$$\text{Count} = \frac{\text{Number of PEPs in SPEPS}}{\text{Number of PEPs in PEPSF}}$$

Count of Fault1: 3/3 = 1
Count of Fault2: 3/4 = 0.75
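The EPO-Matching count can be sketched with sets of (pattern, output) pairs. The data below is hypothetical; it was chosen only so that the two candidates reproduce the ratios 3/3 and 3/4 and is not the example from the original slide.

```python
def epo_matching_count(sim, cud):
    """EPO-Matching count: |SPEPS| / |PEPSF|.

    sim : {pattern: set of outputs affected in the candidate's fault simulation}
    cud : {pattern: set of erroneous primary outputs observed on the CUD}
    """
    # PEP-Set-of-CUD: every (pattern, EPO) pair observed on the CUD.
    pepsc = {(p, po) for p, outs in cud.items() for po in outs}
    # PEP-Set-of-Fault: simulated pairs, restricted to patterns in the SPS.
    pepsf = {(p, po) for p, outs in sim.items() if p in cud for po in outs}
    if not pepsf:
        return 0.0
    speps = pepsf & pepsc  # Shared-PEP-Set
    return len(speps) / len(pepsf)


# Hypothetical CUD response and two hypothetical candidates.
cud = {"P1": {"PO1", "PO2"}, "P2": {"PO1"}}
fault1 = {"P1": {"PO1", "PO2"}, "P2": {"PO1"}}
fault2 = {"P1": {"PO1", "PO2", "PO3"}, "P2": {"PO1"}}
print(epo_matching_count(fault1, cud))
print(epo_matching_count(fault2, cud))
```

The second candidate predicts an error on PO3 under P1 that the CUD does not show, so one of its four PEPs is unmatched.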

• For each step, we set one threshold value to filter out the unrelated fault candidates. These threshold values depend on our assumption about the fault density of the CUD.
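The step-by-step thresholding can be sketched as a chain of filters, one per count. The threshold values and the candidate counts below are placeholders chosen for illustration; the thesis does not list the values it used.

```python
def filter_in_steps(candidates, steps):
    """Apply one threshold per filtering step and keep the survivors.

    candidates : iterable of fault identifiers
    steps      : list of (counts, threshold) pairs, where counts maps
                 each fault to its count for that step
    """
    survivors = list(candidates)
    for counts, threshold in steps:
        survivors = [f for f in survivors if counts[f] >= threshold]
    return survivors


# Hypothetical counts from two filtering steps.
fpim = {"f1": 1.0, "f2": 0.5, "f3": 0.9}
epo_match = {"f1": 0.8, "f2": 0.9, "f3": 0.3}
print(filter_in_steps(["f1", "f2", "f3"], [(fpim, 0.8), (epo_match, 0.5)]))
```

Lower thresholds keep more candidates, which is the more cautious choice when a higher fault density is assumed and masking between faults is more likely.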



Filtering Results

Circuit   # of Total Faults   Reduction Rate   Survival Rate
c432      524                 0.77             0.975
c880      942                 0.97             0.96
c1355     1574                0.75             0.97
c1908     1879                0.85             0.95
c2670     2747                0.925            0.97
c3540     3428                0.95             0.965
c6288     7744                0.933            0.96
c7552     7419                0.992            0.96






Candidate Ranking System

After getting a smaller list of fault candidates from the filtering stage, we need to rank the fault candidates so that we can achieve a better diagnosis resolution.

A structure called the EPO-Tree is used in our work.



Branch Ranking in EPO-Trees with Same Branch Combination

Observation: The activation situations are sometimes similar under certain test patterns; that is, these patterns activate the same set of injected faults in the CUD, and the EPO combinations observed from the CUD are the same.

Intuitive Assumption: Assume the circuit has a large enough number of primary outputs. When the failing output combinations are the same under different test patterns, the failures likely share the same cause, because different injected faults in the CUD are unlikely to repeat the same combination. If the corresponding branches in these EPO-Trees share a set of leaves, then these shared faults are more likely than the other faults in the branch to be the real faults.
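The idea of looking for shared leaves can be sketched as follows. Since the EPO-Tree structure is not spelled out here, this sketch makes a strong simplifying assumption: each failing pattern is reduced to one EPO combination and one set of leaf faults. The representation, the function name, and the data are all hypothetical.

```python
from collections import defaultdict


def shared_leaves(trees):
    """Group failing patterns by EPO combination and intersect their leaves.

    trees : {pattern: (set of EPOs, set of candidate faults)}
    Returns {EPO combination: faults common to all patterns showing it},
    for combinations that occur under more than one pattern.
    """
    groups = defaultdict(list)
    for epos, leaves in trees.values():
        groups[frozenset(epos)].append(set(leaves))
    return {
        epos: set.intersection(*leaf_sets)
        for epos, leaf_sets in groups.items()
        if len(leaf_sets) > 1
    }


# Hypothetical data: P1 and P4 fail on the same outputs, P6 on a different one.
trees = {
    "P1": ({"PO1", "PO3"}, {"f1", "f2", "f5"}),
    "P4": ({"PO3", "PO1"}, {"f2", "f5", "f7"}),
    "P6": ({"PO2"}, {"f3"}),
}
for epos, faults in shared_leaves(trees).items():
    print(sorted(epos), sorted(faults))
```

Faults that survive the intersection would be promoted within their branch, following the intuitive assumption above.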


Branch Ranking with Counts from Reduction Stage

This branch ranking procedure is used in steps two, three, and five.

The candidates in each branch already have an initial rank from the previous stage. We now use the count that each fault obtained in the reduction stage to rank the candidates within each group.


Branch Ranking with Leaves Count in Each EPO-Tree

Rule: If several fault candidates still have the same rank after the previous ranking steps, we assume that the ones with more leaves in the EPO-Tree are more likely to be real faults in the CUD, because it is much easier to activate just one or two faults than to activate many faults together to cause the same effects.


Final Fault Ranking

• We rank the fault candidates by considering the best rank they have among all the branches in all EPO-Trees.

• A Top-Single-Fault (TSF) is a single fault that has the top rank in a branch. These faults are the most suspicious fault candidates in diagnosis.

• Because we have applied many constraints in branch ranking, the earlier a TSF comes out, the more suspicious it is.
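Taking the best rank of each candidate over all branches can be sketched as below. The per-branch ranks are assumed to be given as dictionaries (1 is the top rank); breaking ties by fault name is an arbitrary choice made here so the output is deterministic, not a rule from the thesis.

```python
def final_ranking(branch_ranks):
    """Order fault candidates by their best (lowest) rank in any branch.

    branch_ranks : list of {fault: rank} dictionaries, one per branch,
                   collected over all EPO-Trees
    """
    best = {}
    for ranks in branch_ranks:
        for fault, rank in ranks.items():
            best[fault] = min(rank, best.get(fault, rank))
    # Ties are broken alphabetically (an assumption for this sketch).
    return sorted(best, key=lambda fault: (best[fault], fault))


# Hypothetical ranks from three branches.
branches = [{"f1": 1, "f2": 2}, {"f2": 1, "f3": 3}, {"f3": 2}]
print(final_ranking(branches))
```

Here f1 and f2 each hold the top rank in some branch, so both would count as Top-Single-Faults, while f3 never ranks better than second.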




Experimental Results


        Our Work (1 stuck-at)        Wang’s work (1 stuck-at)
Cir     Dia   FHR   Res   T(s)       Dia   FHR   Res   T(s)
c2670   1.0   1.0   1.0   6.4        1.0   1.27  1.3   0.01
c3540   1.0   1.0   1.0   0.5        1.0   1.2   1.5   0.01
c6288   1.0   1.0   1.0   0.6        1.0   1.1   1.3   0.01
c7552   1.0   1.0   1.0   1.5        1.0   1.15  1.6   0.01

Z. Wang, M. Marek-Sadowska, and J. Rajski, “Analysis and Methodology for Multiple-Fault Diagnosis,” IEEE Trans. on CAD of Integrated Circuits and Systems, March 2006, vol. 25, pp. 558-576.


        Our Work                     Wang’s work
Cir     Dia   FHR   Res   T(s)       Dia   FHR   Res   T(s)
c2670   0.97  1.0   2.0   8          0.97  1.35  2.0   0.05
c3540   0.95  1.0   1.0   0.6        0.95  1.2   1.7   0.06
c6288   0.99  1.0   2.0   1          0.97  1.28  2.0   0.2
c7552   0.93  1.0   2.0   1.7        0.925 1.25  2.0   0.2



        Our Work                     Wang’s work
Cir     Dia   FHR   Res   T(s)       Dia   FHR   Res   T(s)
c2670   0.94  1.0   2.0   13         0.925 1.35  2.6   0.1
c3540   0.94  1.0   2.0   0.7        0.92  1.15  2.4   0.1
c6288   0.95  1.0   2.0   2.7        0.93  1.15  2.5   0.5
c7552   0.90  1.04  2.0   2.9        0.92  1.2   2.3   0.25

        Our Work (4 stuck-at)        Wang’s work (4 stuck-at)
Cir     Dia   FHR   Res   T(s)       Dia   FHR   Res   T(s)
c2670   0.89  1.06  2.0   17         0.92  1.3   2.6   0.2
c3540   0.91  1.0   2.0   1.8        0.89  1.25  2.5   0.2
c6288   0.92  1.02  2.0   5.7        0.82  1.15  2.8   0.8
c7552   0.88  1.1   2.0   3.2        0.91  1.2   2.4   0.5

Fault List Extension

• Before we start the net diagnosis work, we first need to extend the collapsed faults to uncollapsed faults.

From the net-list of the circuit, we can get the corresponding net for each fault. Each group of equivalent faults can then be transformed into a set of nets.
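The transformation from equivalent-fault groups to net sets can be sketched with a lookup table. The fault naming scheme and the `net_of` mapping are hypothetical; in practice the mapping would be derived from the net-list.

```python
def faults_to_nets(fault_groups, net_of):
    """Turn each group of equivalent faults into the set of nets they sit on.

    fault_groups : list of groups, each a list of uncollapsed fault names
    net_of       : {fault name: net name}, derived from the net-list
    """
    return [sorted({net_of[fault] for fault in group}) for group in fault_groups]


# Hypothetical faults: two equivalence groups after extending the collapsed list.
net_of = {
    "n5/sa0": "n5",
    "g3.in1/sa0": "n5",  # a gate-input fault located on a branch of net n5
    "n8/sa1": "n8",
    "n7/sa1": "n7",
}
groups = [["n5/sa0", "g3.in1/sa0", "n8/sa1"], ["n7/sa1"]]
print(faults_to_nets(groups, net_of))
```

Several equivalent faults may fall on the same net, so a group of faults usually shrinks to a smaller set of net candidates.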






Net Ranking

First, we build a net pool, which includes all the net candidates of each rank group.

The final net candidate list includes two parts:
  The nets for which more than two members are found in the net pool.
  The nets that are found only once in the net pool.

The ranking of a net is based on the rank of the group it is derived from.



Experimental Results


For each fault model and each benchmark circuit, we randomly constructed 20-50 faulty circuits. For each injected net fault, we randomly selected 2-4 fault sites, which could be either the stem or the branches of the net. For the stuck-at model, we injected a single stuck-at fault at each fault site; for the transition net fault, we inserted a D flip-flop at each fault site to model transition delay behavior.


        One Net Fault (Stuck-at)     Two Net Faults (Stuck-at)
Cir     Dia   FHR   Res   T          Dia   FHR   Res   T
c432    1.0   1.1   2.0   0.1        0.98  1.2   3.0   0.2
c880    1.0   1.1   2.0   0.1        1.0   1.1   3.0   0.1
c1355   0.98  2.0   3.0   1.0        0.75  2.6   5.0   16
c1908   0.92  1.4   3.0   7.5        0.86  1.3   3.0   13
c2670   0.98  1.4   3.0   7.4        0.95  1.14  4.0   12
c3540   0.98  1.04  3.0   0.5        0.96  1.1   3.0   0.7
c6288   0.98  1.1   3.0   3          0.9   1.06  3.0   4
c7552   0.96  1.1   3.0   3          0.9   1.2   3.0   11


        One Net Fault (Transition)   Two Net Faults (Transition)
Cir     Dia   FHR   Res   T          Dia   FHR   Res   T
c432    1.0   1.05  2.0   0.48       0.9   1.2   3.0   0.7
c880    0.9   1.2   2.0   0.8        0.8   1.3   3.0   2.3
c1355   1.0   1.0   1.0   0.9        0.86  1.3   3.0   13
c1908   0.96  1.2   2.0   7.2        0.93  1.4   3.0   35
c2670   1.0   1.16  2.0   0.9        0.97  1.06  2.0   4.6
c3540   0.98  1.2   2.0   1.7        0.95  1.1   2.0   2.3
c6288   1.0   1.08  2.0   2.8        0.92  1.3   2.0   2.8
c7552   0.98  1.3   2.0   7.5        0.96  1.08  2.0   13.1





Advantages


Uses only single fault simulation, which avoids the exponential search space of multiple-fault-simulation-based approaches.

Does not require test patterns that activate only a single fault at a time.

A balanced candidate filtering system that effectively reduces the number of fault candidates and, to some extent, tolerates the non-stuck-at behavior caused by other types of faults.

A candidate ranking system with a high First Hit Rank (FHR), diagnosis resolution, and diagnosability.

Suitable for diagnosing multiple types of faults.


Conclusion

Traditional gate faults are closely related to the fault models and not necessarily to physical defects. Therefore, from a practical viewpoint, it makes more sense to diagnose a faulty net on a VLSI chip than to locate a 'modeled' fault.

Our use of stuck-at and transition fault models is for a practical reason, i.e., the availability of tools for test generation and fault simulation. These models are used only for the possibility of analysis they offer. In identifying faulty nets, no assumption is made about the actual fault on them except that those nets 'may' have caused the observed and simulated errors. The fault models may or may not be used as suggestions. We verified our work with injected multiple faults.

In the future, arbitrary defects such as bridges, opens, and shorts should be examined to evaluate the presented diagnosis algorithms.


References

1. Yu Zhang and Vishwani D. Agrawal, “A Diagnostic Test Generation System,” in Proc. International Test Conf., 2010, pp. 1-9.

2. Yu Zhang and Vishwani D. Agrawal, “Reduced Complexity Test Generation Algorithms for Transition Fault Diagnosis,” in Proc. International Conf. on Computer Design, 2011, pp. 96-101.

3. N. Sridhar and M. S. Hsiao, “On Efficient Error Diagnosis of Digital Circuits,” in Proc. International Test Conf., 2001, pp. 678-687.

4. S. M. Reddy, H. Tang, I. Pomeranz, S. Kajihara, and K. Kinoshita, “On Testing of Interconnect Open Defects in Combinational Logic Circuits with Stems of Large Fanout,” in Proc. International Test Conf., 2002, pp. 83-87.

5. Z. Wang, M. Marek-Sadowska, and J. Rajski, “Analysis and Methodology for Multiple-Fault Diagnosis,” IEEE Trans. on CAD of Integrated Circuits and Systems, March 2006, vol. 25, pp. 558-576.

6. S. Venkataraman and S. B. Drummonds, “Poirot: Applications of a Logic Fault Diagnosis Tool,” IEEE Design and Test of Computers, Jan. 2001, pp. 19-29.

7. J. Segura and C. F. Hawkins, “CMOS Electronics: How It Works, How It Fails,” Wiley-IEEE, Apr. 2004.


Thank You . . .


Questions
