ELEC 516 VLSI System Design and Design Automation, Spring 2010

Lecture 10 – Design for Testability

Reading Assignment:

Kang – CMOS Digital Integrated Circuit: Analysis and Design

Chapter 15


Testing your prototype!!!

• Testing is time consuming and test equipment is very expensive.

• Test cost contributes greatly to the cost of the system (20-30% of the chip cost).

• You must think about test during design; otherwise you end up with an untestable chip.
  – Test functionality as well as performance.

• If you don’t test it, it won’t work!

Prototype Specification?


Introduction

• Testing is important, probably as important as the design process.
• Testing a chip to make sure it is fully functional is highly complex and time-consuming.
• The cost of chip debugging is much higher than that of board-level debugging, which is in turn much higher than that of system-level debugging.
• In a production environment, many chips must be tested within a short time for timely delivery to customers.
• Design for testability therefore becomes critical.


Testing Classification

• Diagnostic test
  – Used in chip/board-level debugging
  – Defect localization
• “Go/no-go” or production test
  – Used in chip production
• Parametric test
  – Voltage and current tests, instead of logic tests
  – Checks other parameters such as noise margin (NM), threshold voltage (Vt), delay time (tp) and temperature (T)


Chip Debugging

• Design errors or fabrication defects?
• Micro-probing the die
• E-beam
• Single-die repair


Testing is Expensive

• A VLSI tester costs several million US dollars.
• Volume manufacturing requires a large number of testers, plus maintenance.
• Design companies often cannot afford this, so a rental model is commonly used; rent is charged by usage time.
• Tester time costs are in $/sec.
• Test cost contributes 20-30% of total chip cost.


Types of Testing

Step                | Error source                    | Test type
Design              | Design flaws                    | Design verification
Prototype           | Design flaws / prototype flaws  | Functional test
Manufacture         | Physical defects                | Manufacture test
Shipping            | Manu. test, transport           |
System integration  | Same                            | Functional test
Service             | Stress, age                     | Diagnosis


Manufacturing defects

• During Manufacturing: misalignment, dust and other particles, “stacking faults”, defects in dielectric, mask scratches, thickness variation: layer to layer shorts, discontinuous wires (open), circuit sensitivities (Vth, Lchannel): found during wafer probe of test structures.

• During packaging: Defects from scratching in handling, damage during bonding misalignment (need always to check the wire bonding), other defects undetected during wafer probe: found during test of packaged parts.

• During mounting: Defects from damage during board insertion(thermal, ESD), infant mortality (mfg defects that show up after a few hours of use). Noise problems, susceptibility to latch-up: found during testing/mounting on board.

• Long term: Defects that appear after months or years of utilization (metal migration, oxide damage during manufacture, impurities): found by the customer

• Errors can occur at different stages in the lifetime of a chip.


Testers for volume manufacturing

• Each pin on the chip is driven/observed by a separate set of circuitry, which typically can either:
  – drive the pin to one data value per cycle, or
  – observe the value of the pin at a particular point in a clock cycle.
• Timing of input transitions and sampling is controlled by high-resolution timing generators associated with each pin.
• The device under test (DUT) is mounted on the test head.


Test Strategy

• Testing on a tester proceeds in several steps:
  – Supply a set of test vectors that specify an input or output value for every pin on every cycle.
  – The tester loads the program into the pin cards.
  – Run the program and report any miscompares between an output value and the expected value.
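The force/compare loop described above can be sketched in a few lines of Python. This is only an illustrative model, not a real tester API: `run_test_program`, the vector format, and the two example gates are hypothetical names introduced here.

```python
# Hypothetical model of a tester run: each vector holds the input values for
# one cycle plus the expected output values; "x" means "do not compare".

def run_test_program(dut, vectors):
    """Apply each vector to dut and return a list of miscompares.

    dut     -- callable mapping a tuple of input bits to a tuple of output bits
    vectors -- list of (inputs, expected) pairs, one per cycle
    """
    miscompares = []
    for cycle, (inputs, expected) in enumerate(vectors):
        observed = dut(inputs)
        for pin, (exp, obs) in enumerate(zip(expected, observed)):
            if exp != "x" and exp != obs:
                miscompares.append((cycle, pin, exp, obs))
    return miscompares


def good_and(inputs):
    """Fault-free 2-input AND gate."""
    a, b = inputs
    return (a & b,)


def faulty_and(inputs):
    """Same gate with its output stuck at 0."""
    return (0,)


vectors = [((0, 0), (0,)), ((0, 1), (0,)), ((1, 0), (0,)), ((1, 1), (1,))]

print(run_test_program(good_and, vectors))    # no miscompares for the good part
print(run_test_program(faulty_and, vectors))  # one miscompare, on the (1, 1) cycle
```

Each miscompare is reported as (cycle, pin, expected, observed), which is roughly the information a tester log would need for diagnosis.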


Testers for volume manufacturing

[Figure: test flow. Labels: Specification, Behavioral model, Design cycle, Test patterns (I/O vectors), Memory, Force/Compare, Vcompare, error.]


How many test vectors do we need?

• For an exhaustive test of a digital circuit with 25 inputs and 50 state bits, 2^75 cycles are required. Assuming 1 µs/cycle, the test time is > 10^9 years.
• Exhaustive test is impractical and unnecessary.
  – We only need to verify that no faults are present, which may take fewer vectors.
  – In fact, many vectors can test the same fault.
• Combinational circuit with n inputs: 2^n input vectors are required to test it exhaustively.
• Sequential circuit with n inputs and m state bits: 2^(n+m) input vectors are required to test it exhaustively.
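The arithmetic behind the exhaustive-test estimate can be checked directly. The sketch below assumes one vector per cycle and a 365.25-day year; the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_test_years(n_inputs, m_state_bits, seconds_per_cycle=1e-6):
    """Time to apply all 2^(n+m) input/state combinations, in years."""
    cycles = 2 ** (n_inputs + m_state_bits)
    return cycles * seconds_per_cycle / SECONDS_PER_YEAR


years = exhaustive_test_years(25, 50)
print(f"{years:.2e} years")  # on the order of 10^9 years
```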


Fault Types and Models

• Testing Goal: to detect faults in fabrication, design and failures due to stressful operating conditions and reliability problems

• Test process:
  – Test vectors are applied to the device under test (DUT) as stimuli.
  – Measured outputs are compared with the expected correct responses to determine correctness.
  – Difficulty: only the system input and output pins are accessible.
  – Another difficulty: generating correct test vectors to detect all modeled faults and design errors.
  – Manual or automatic test pattern generation (ATPG) becomes a difficult task.


Defect causes

• Physical defects:
  – Defects in silicon substrate
  – Photolithographic defects
  – Mask contamination and scratches
  – Process variations and abnormalities
  – Oxide defects

• Physical defects -> electrical faults
  – Shorts (bridging faults) or opens
  – Transistor stuck-on, stuck-open
  – Resistive shorts and opens
  – Excessive change in threshold voltage and current

• Electrical faults -> logical faults
  – Logical stuck-at-0 or stuck-at-1
  – Slow transition (delay fault)
  – AND-bridging, OR-bridging


Fault models

• Traditional models, first developed for board-level tests, assume that a node gets “stuck” at “0” or “1”, presumably by shorting to GND or Vdd.

• If the output is faulty, the entire gate is “stuck”. There are also cases that correspond to a transistor stuck-on or stuck-off.

• Example: F = (A+B)’. What about Fx (F with a stuck-off fault)?


Fault Models

• Most Popular – “stuck-at” model

Sa0 (output stuck at 0)

Sa1 (input stuck at 1)

• Covers almost all (other) occurring faults, such as opens and shorts

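[Figure: example circuit with signals x, y, w, z and three defect sites labeled 1, 2, 3.]

Defect-to-fault mapping from the figure: 1, 3: x sa1; 2: y sa0 or x sa0; 3: z sa1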


Another example



Stuck-at fault

• Single stuck-at fault models are used frequently:
  – The complexity of test generation is greatly reduced.
  – The single stuck-at fault is independent of technology and design style.
  – Single stuck-at tests cover a large percentage of multiple stuck-at faults.
  – Single stuck-at tests cover a large percentage of unmodeled physical defects.


Delay fault

• Causes timing failures at the target speed
• Reasons for delay faults
  – Improper estimation of on-chip interconnect delay and other timing considerations
  – Excessive variation in the fabrication process, leading to variations in circuit delay and clock skew
  – Opens in metal lines connecting parallel transistors
  – Aging effects such as hot-carrier-induced delay increase
• Detecting delay faults is even more subtle than detecting functional faults in steady state.


Problem with stuck-at model: CMOS open fault

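• Sequential effect: two vectors are needed to ensure detection.

[Figure: CMOS gate with inputs x, y and output z, with an open fault.]

x | y              | z
0 | x (don't care) | 1
1 | 1              | 0
1 | 0              | z_{n-1} (output keeps its previous value)

• Other options:
  – Use stuck-open or stuck-short models.
  – This requires fault simulation and analysis at the switch or transistor level, which is very expensive.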
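A small behavioral model makes the two-vector requirement concrete. The sketch assumes the gate is a 2-input NAND whose y-side PMOS pull-up is stuck open; this is inferred from the truth table for this slide rather than stated in the source, and the function names are illustrative.

```python
def nand_with_open_pullup(x, y, z_prev):
    """2-input CMOS NAND whose y-side PMOS pull-up is stuck open (assumed fault)."""
    if x == 1 and y == 1:
        return 0        # series NMOS path pulls the output low
    if x == 0:
        return 1        # the intact x-side PMOS pulls the output high
    return z_prev       # x = 1, y = 0: output floats and keeps its old value


def good_nand(x, y):
    """Fault-free reference gate."""
    return 1 - (x & y)


# A single vector (1, 0) is not enough: if the output was already 1,
# the faulty gate gives the correct answer and the fault is masked.
masked = nand_with_open_pullup(1, 0, z_prev=1) == good_nand(1, 0)

# Two-vector test: (1, 1) initializes z to 0, then (1, 0) exposes the fault.
z = nand_with_open_pullup(1, 1, z_prev=1)
z = nand_with_open_pullup(1, 0, z_prev=z)
print(masked, z, good_nand(1, 0))
```

The first vector only initializes the output node; the second is the one whose response is compared.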


Problem with stuck-at model: CMOS short fault

• The short causes a conducting path between Vdd and GND for A = B = C = 0 and D = 1.
• Possible approach:
  – Supply current measurement (IDDQ)
  – Not applicable for gigascale integration

[Figure: CMOS gate with inputs A, B, C, D; the applied input values shown are 0, 0, 0, 1.]


Design for Testability

[Figure: combinational function (2^n input vectors required to test it exhaustively) and sequential function (2^(n+m) input vectors required to test it exhaustively).]

• Exhaustive test is impossible or impractical.
• We need to find meaningful vectors to test for possible faults.
  – Not easy because of limited I/O and increased complexity
  – Concepts: controllability and observability


Controllability and Observability

• Controllability: a measure of how easily the controller (test engineer) can establish a specific signal value at each node by setting values at the circuit inputs.

• Observability: a measure of how easily the controller (test engineer) can determine the signal value at any logic node by controlling the values at the primary inputs and observing the primary outputs.

• The degree of controllability and observability (testability) can be measured with respect to whether the test vectors are generated deterministically or randomly.


Path Sensitization

• Step1: Sensitize the circuit: Find input values that produce a value on the faulty node that’s different from the value forced by the fault. For our S-A-1 fault example, we want output of OR gate to be 0.

• Is this always possible? What would it mean if no such input values exist?

• Is the set of sensitizing input values unique?

• What’s left to do?


Error Propagation

• Step2: Fault propagation: Select a path that propagates the faulty value to an observed output (Z in our example)

• Step3: Backtracking: Find a set of input values that enables the selected path.

• Is this always possible?

• What would it mean if no such input values exist?


Testability

• Example of a non-testable error:
• For x = 1 we need both a and b = 1. Whatever the value of C, one of the three outputs is 1. Problem: two possible propagations of “1”. Problem: fault propagation.



Line 7 cannot be tested at the primary output, so this circuit is not fully testable. Reason: reconvergent fanout of line 7.

Fault test pattern generation:
• Fault sensitization: an input vector that sensitizes the fault.
• Fault propagation: conditions that propagate the fault to an output so that it can be observed.



• Circuits with poor controllability
  – Circuits with feedback
  – Decoders and clock generators
• Circuits with poor observability
  – Sequential circuits with long feedback loops
  – Circuits with reconvergent fanout
  – Redundant nodes
  – Embedded memories such as RAM, ROM, PLA
• Use self test for circuits with poor observability


Generating and Validating Test-vectors

• Automatic test-pattern generation (ATPG)
  – For a given fault, determine the excitation vector (called the test vector) that will propagate the error to a primary (observable) output.
  – Majority of available tools: combinational networks only
  – Sequential ATPG is available from academic research.
• Fault simulation
  – Determine the fault coverage of a proposed test-vector set.
  – Simulate the correct network in parallel with faulty networks.
• Both require adequate models of faults in CMOS ICs.


ATPG Process

Fault Selection

Fault Observe Point Assessment

Fault Excitation

Vector Generation

Fault Simulation

Fault Dropping


ATG for fanout-free combinational circuits

• Two steps:
  – Activate (excite) the fault from the primary inputs.
    • For signal l with a stuck-at-v fault, set the primary input values such that signal l equals v’.
    • This is called the justification problem: find an assignment of PI values that results in a desired value on a specified signal in the circuit.
  – Propagate the resulting error to a primary output.
    • Composite logic values are written v/vf, where v and vf are the values of the same signal in N and Nf, the fault-free and faulty circuits respectively.
    • The composite values 1/0 (denoted D) and 0/1 (denoted D’) represent errors.
    • They obey: D+D’ = 1, D.D’ = 0, D+D = D.D = D, D’+D’ = D’.D’ = D’, D+0 = D, D’+0 = D’, ...
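The composite-value identities listed above follow from applying each Boolean operation separately to the fault-free and faulty values. A minimal sketch, assuming each composite value is stored as a pair (value in N, value in Nf); the helper names are hypothetical and the don't-care value x is left out for brevity.

```python
# Composite value = (value in fault-free circuit N, value in faulty circuit Nf).
ZERO, ONE = (0, 0), (1, 1)
D, D_BAR = (1, 0), (0, 1)          # D = 1/0, D' = 0/1
NAMES = {ZERO: "0", ONE: "1", D: "D", D_BAR: "D'"}


def c_and(a, b):
    """AND applied component-wise to both circuits."""
    return (a[0] & b[0], a[1] & b[1])


def c_or(a, b):
    """OR applied component-wise to both circuits."""
    return (a[0] | b[0], a[1] | b[1])


def c_not(a):
    """Inversion applied component-wise to both circuits."""
    return (1 - a[0], 1 - a[1])


print(NAMES[c_or(D, D_BAR)])   # D + D' = 1
print(NAMES[c_and(D, D_BAR)])  # D . D' = 0
print(NAMES[c_or(D, ZERO)])    # D + 0 = D
print(NAMES[c_not(D)])         # inverting D gives D'
```

The same pair representation shows why an AND gate blocks propagation when a side input is 0: `c_and(D, ZERO)` is `ZERO`, so the error disappears.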


Test generation for the fault l stuck-at-v in a fanout-free circuit

Begin
    set all values to x
    Justify(l, v’)
    if v = 0 then Propagate(l, D)
    else Propagate(l, D’)
end


Example

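[Figure: fanout-free example circuit with signals a to j and a stuck-at-0 fault, shown twice: first unannotated, then with the generated test values (0, 1, x and D) marked on the signals.]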


Circuits with Fanout

• Two basic goals: fault activation and error propagation

• Fanout: there are several ways to propagate an error to a PO.
• Fundamental difficulty
  – Reconvergent fanout: the resulting line justification problems are no longer independent.


Example

[Figure: circuit with signals a, b, c, d, e, f1, f2, gates G1 to G6, and one s-a-1 fault.]

The only vector that can test the fault is 111x0.


Another Example

[Figure: circuit with signals a, b, c, d, e, f, h, k, l, m, n, o, p, q, r, s and one s-a-1 fault.]


Fault Simulation

• Apply a set of vectors to a structural (netlist) description of a design and determine how many and which faults are detected out of the total set of available faults.
• Concurrent fault simulation
  – Applies the vectors to many copies of the netlist at the same time.
  – Each copy contains one or more faults.
  – Each of these simulations runs concurrently with a good-circuit simulation.
  – If a difference is observed at a legal observation point between the good circuit and any faulty circuit simulation, the fault is listed as detected.
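As a rough illustration of fault coverage, the sketch below runs a serial fault simulation (one faulty copy at a time, which is simpler than the concurrent scheme described above) on a made-up two-gate netlist. The netlist, node names, and function names are assumptions for the example only.

```python
# Hypothetical netlist: n1 = a AND b, z = n1 OR c. Gates are in topological order.
NETLIST = [("n1", "and", ("a", "b")), ("z", "or", ("n1", "c"))]
INPUTS = ("a", "b", "c")
OUTPUT = "z"


def simulate(vector, fault=None):
    """Evaluate the netlist; fault is (node, stuck_value) or None for the good circuit."""
    values = {}

    def assign(node, value):
        # A stuck-at fault overrides whatever value the node would take.
        values[node] = fault[1] if fault and fault[0] == node else value

    for name, bit in zip(INPUTS, vector):
        assign(name, bit)
    for out, kind, (p, q) in NETLIST:
        result = values[p] & values[q] if kind == "and" else values[p] | values[q]
        assign(out, result)
    return values[OUTPUT]


def fault_coverage(vectors):
    """Return (set of detected single stuck-at faults, coverage fraction)."""
    nodes = list(INPUTS) + [gate[0] for gate in NETLIST]
    faults = [(node, v) for node in nodes for v in (0, 1)]
    detected = {f for f in faults for vec in vectors
                if simulate(vec, f) != simulate(vec)}
    return detected, len(detected) / len(faults)


one_vector = [(1, 1, 0)]
four_vectors = [(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
print(fault_coverage(one_vector)[1])    # 0.4: only four of ten faults detected
print(fault_coverage(four_vectors)[1])  # 1.0: all ten single stuck-at faults detected
```

Four well-chosen vectors reach full single stuck-at coverage here, compared with the eight vectors of an exhaustive test.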


What can we do to increase testability?

• Increase observability:
  – Add more pins (can be a problem).
  – Add a small “probe” bus and selectively enable different values onto the bus.
  – Compress a sequence of values (for example the value of a bus over many clock cycles) into a small number of bits for later read-out.
• Increase controllability:
  – Use multiplexers to isolate sub-modules and select sources of test data as inputs.
  – Provide easy setup of internal states.


Test approaches

• Scan-based testing
• Built-in self test


Scan-based technique

• Minimize the use of additional I/O pins for testing.
• Use scan registers with both shift and parallel load capabilities.
• Storage cells in registers act as observation points, control points, or both.
• Reduce testing of a sequential circuit to that of a combinational circuit.


Scan: the idea

• Two modes of operation: normal mode, and a test mode in which all registers are chained into one long shift register that can be loaded and read out serially.

[Figure: scan-based structure. Registers (reg) following two combinational logic blocks are linked into a chain from Scan-in to Scan-out.]


Scan-based structure


Scan-path register

[Figure: scan-path register cell. Labels: in, out, scanin, scanout, scan, load, keep, 2, 1.]


Scan-based test: operation

[Figure: four-bit scan chain. Latches with inputs In0 to In3 and outputs Out0 to Out3 are linked from scanin to scanout and controlled by the Test signal; labels 1 and 2 also appear.]

Test sequence: N-cycle scan-in, 1-cycle evaluation, N-cycle scan-out.

Testing time per test pattern increases due to shifting time in long register.
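The cycle count per pattern can be illustrated with a small behavioral model of a scan chain. This is a sketch under simplifying assumptions: the `ScanChain` class and `apply_pattern` helper are hypothetical, the combinational logic is an arbitrary bit-wise inverter, and scan-out is not overlapped with the next scan-in.

```python
class ScanChain:
    """N-bit register that shifts serially in test mode or loads in parallel."""

    def __init__(self, n):
        self.bits = [0] * n

    def shift(self, scan_in):
        """One test-mode clock: shift by one position, return the scan-out bit."""
        scan_out = self.bits[-1]
        self.bits = [scan_in] + self.bits[:-1]
        return scan_out

    def capture(self, logic):
        """One normal-mode clock: load the response of the combinational logic."""
        self.bits = list(logic(self.bits))


def apply_pattern(chain, pattern, logic):
    """Scan in a pattern, evaluate once, scan out the response; count cycles."""
    cycles = 0
    for bit in reversed(pattern):       # last bit first, so pattern[0] lands in bits[0]
        chain.shift(bit)
        cycles += 1
    chain.capture(logic)
    cycles += 1
    response = []
    for _ in pattern:
        response.append(chain.shift(0))
        cycles += 1
    response.reverse()                  # bits leave the chain last-position first
    return response, cycles


def invert_all(bits):
    """Stand-in for the combinational logic between registers."""
    return [1 - b for b in bits]


response, cycles = apply_pattern(ScanChain(4), [1, 0, 1, 1], invert_all)
print(response, cycles)  # [0, 1, 0, 0] 9
```

For an N-bit chain this gives 2N + 1 cycles per pattern, which is why long chains increase test time.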


Scan-path Testing

[Figure: datapath with registers (Reg), an adder (+) and a comparator (>), inputs in0 and in1, output out, and a scan path from scanin to scanout.]


JTAG – Boundary Scan

• For testing PCBs and multichip modules carrying multiple chips

• Shift registers are placed in each chip close to the I/O pins to form a chain around the board for PCB testing.

[Figure: scan path linking the chips on a board.]


Built-In Self Test: the idea

• Problem: the scan-based approach is very useful for testing combinational logic, but can be impractical for testing memory blocks because of the number of separate test values required to get adequate fault coverage.

• Solution: use on-chip circuitry to generate test data and check the results. This can be run at every power-on to verify correct operation.
  – Generate pseudo-random data for most circuits using, e.g., a linear feedback shift register (LFSR).
  – For pseudo-random input data, compute some output values and compare them against an expected value (the “signature”) at the end of the test.


Built-in self test

• Parts of the circuit are used to test the circuit itself.
• Essential circuit modules:
  – Pseudo-random pattern generator (PRPG)
  – Output response analyzer (ORA)


PRPG using LFSR

• LFSR: linear feedback shift register

State sequence:

Q0 | Q1 | Q2
1  | 0  | 0
1  | 1  | 0
1  | 1  | 1
0  | 1  | 1
1  | 0  | 1
0  | 1  | 0
0  | 0  | 1
1  | 0  | 0
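The state sequence listed above can be reproduced with a three-line update rule. The feedback used here (new Q0 = Q0 XOR Q2, with Q1 and Q2 taking the old Q0 and Q1) is inferred from the listed states, not taken from the slide's schematic, so treat it as an assumption.

```python
def lfsr_states(seed=(1, 0, 0), steps=7):
    """Return successive (Q0, Q1, Q2) states of a 3-bit LFSR.

    Assumed feedback (inferred from the listed sequence):
    new Q0 = Q0 XOR Q2; Q1 and Q2 take the old Q0 and Q1.
    """
    q0, q1, q2 = seed
    states = [(q0, q1, q2)]
    for _ in range(steps):
        q0, q1, q2 = q0 ^ q2, q0, q1
        states.append((q0, q1, q2))
    return states


for state in lfsr_states():
    print(*state)
```

Starting from 1 0 0, the register visits all seven non-zero states before repeating, which is the maximum period for three bits; the all-zero state is never reached.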


Signature Analysis

• To reduce chip area, data compression schemes are used so that compacted test responses are compared instead of the entire raw test data.

• Signature analysis – based on cyclic redundancy checking

• Use polynomial division which divides the polynomial representation of the test output data by a characteristic polynomial and then finds the remainder as the signature.

• The signature is then compared with the expected signature to determine whether the device is faulty or not.

• Sometimes a fault goes undetected because the faulty response compacts to the same signature as the correct one; this is called aliasing.


ORA by LFSR

• The signature is the content of this register after the last input bit has been sampled.

• The input sequence {a_n} is represented by G(x) and the output sequence by Q(x): G(x) = Q(x)P(x) + R(x), where P(x) is the characteristic polynomial of the LFSR and R(x) is the remainder.

• For the above LFSR, P(x) = 1 + x^2 + x^4 + x^5.

• For the input sequence [11110101], G(x) = x^7 + x^6 + x^5 + x^4 + x^2 + 1 and the remainder is R(x) = x^4 + x^2, which corresponds to the register content [00101].
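The remainder quoted above can be checked with polynomial long division over GF(2). In this sketch polynomials are stored as Python integers whose bit i is the coefficient of x^i; the function name and this encoding are choices made for the example.

```python
def poly_mod2_divide(dividend, divisor):
    """Divide polynomials over GF(2); bit i of each int is the coefficient of x^i."""
    quotient = 0
    remainder = dividend
    shift = remainder.bit_length() - divisor.bit_length()
    while shift >= 0:
        quotient |= 1 << shift          # record the term x^shift in the quotient
        remainder ^= divisor << shift   # subtract (XOR) the shifted divisor
        shift = remainder.bit_length() - divisor.bit_length()
    return quotient, remainder


P = 0b110101      # P(x) = 1 + x^2 + x^4 + x^5
G = 0b11110101    # G(x) = x^7 + x^6 + x^5 + x^4 + x^2 + 1
q, r = poly_mod2_divide(G, P)
print(bin(q), bin(r))                      # 0b101 0b10100
print([(r >> i) & 1 for i in range(5)])    # [0, 0, 1, 0, 1]
```

The quotient is Q(x) = x^2 + 1 and the remainder is R(x) = x^4 + x^2; listing the coefficients of x^0 through x^4 gives the register content 00101.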


ORA

• On-chip storage of a fault dictionary containing all test inputs with the corresponding outputs is too expensive.

• A simple alternative is to compare the outputs of two identical circuits for the same input.
  – This cannot detect the case where both circuits have the same fault.

• Self-checking design: detect faults autonomously during on-line operation.
  – A checker circuit is inserted that generates and sends out a signal when on-line faults occur.


Built-in Logic Block Observer (BILBO)

• A form of ORA, used in each cluster of partitioned registers

• Allows monitoring of circuit operation through exclusive ORing into LFSR at multiple points, which corresponds to the signature analyzer with multiple inputs

C0 | C1 | Mode
0  | 0  | linear shift
1  | 0  | signature analysis
1  | 1  | data latch
0  | 1  | reset


BILBO Application

[Figure: BILBO-A and BILBO-B registers placed around two combinational logic blocks, with In, Out, scanIn and scanOut connections.]


Memory Self-Test

[Figure: memory under test driven by an FSM (address and R/W control, data-in); data-out feeds signature analysis.]

Patterns:
• Writing/reading 0s, 1s
• Walking 0s, 1s
• Galloping 0s, 1s
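One of the listed patterns, walking 1s, can be sketched as follows. The model assumes a one-bit-wide memory and a single stuck-at-0 cell; `walking_ones` and `FaultyMemory` are hypothetical names, and a real memory BIST would run the equivalent loop in an on-chip FSM.

```python
def walking_ones(memory_write, memory_read, size):
    """Return the addresses whose read-back was wrong during a walking-1s pass."""
    failures = set()
    for addr in range(size):
        memory_write(addr, 0)                  # background of 0s
    for target in range(size):
        memory_write(target, 1)                # walk a single 1 through the array
        for addr in range(size):
            expected = 1 if addr == target else 0
            if memory_read(addr) != expected:
                failures.add(addr)
        memory_write(target, 0)                # restore the background
    return failures


class FaultyMemory:
    """One-bit-wide memory model with one cell stuck at 0 (assumed fault)."""

    def __init__(self, size, stuck_at_zero):
        self.cells = [0] * size
        self.stuck = stuck_at_zero

    def write(self, addr, bit):
        self.cells[addr] = 0 if addr == self.stuck else bit

    def read(self, addr):
        return self.cells[addr]


mem = FaultyMemory(8, stuck_at_zero=5)
print(walking_ones(mem.write, mem.read, 8))  # {5}
```

The pass costs on the order of size squared read operations, which hints at why such patterns are generated on chip rather than shifted in through a scan chain.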


Current monitoring: IDDQ –The idea

• CMOS logic should draw no current when it’s not switching. So after initializing circuit and disabling pseudo-NMOS gates, the power supply current should be zero after all signals have settled.

• Good for detecting short faults. Need to try several different circuit states to ensure all parts of the chip have been observed.


Current Monitoring IDDQ Test

• Under a bridging fault, the static current drawn from the power supply is much larger than the leakage current.

• Tests different defect types:
  – Gate oxide short
  – Channel punch-through
  – p-n diode leakage
  – Transmission-gate defect

• IDDQ test only needs sensitization, not propagation.

• It is less effective for open-drain and open-gate defects.
