Chapter 2
AT-SPEED TESTING AND LOGIC BUILT-IN SELF-TEST
2.1 Introduction
This chapter describes the concepts of scan-based testing, the issues in testing, the need
for logic BIST, and trends in VLSI testing. Scan and logic BIST are the primary
design-for-test (DFT) methods used to control test cost. Both ATPG-based and logic
BIST-based pattern generation depend on the scan architecture of the design. The
growth in test data volume makes it prohibitively expensive, and often impractical,
to apply enough patterns to achieve a target fault coverage. Several researchers have
therefore explored multiple techniques for test data volume reduction.
2.2 The Issue: Test Data Volume
2.2.1 ATE Based Testing
Exhaustive IC chip testing after manufacturing is critical to ensure product quality.
Full scan based IC testing using ATE is the most widely used approach. As chip
designs become larger and new process technologies require more complex fault
models, the length of scan chains and the number of scan patterns required have
increased dramatically (Agrawal, 2002). To maintain adequate fault coverage, test
application time and test data volume have also escalated, which has driven up the
cost of test. As test data volume increases, test power also increases rapidly. Scan
chain values currently dominate the total stimulus and observe values of the test
patterns, which constitute the test data volume. With the limited availability of
I/O ports as terminals for scan chains, the number of flip-flops per scan chain has
increased dramatically. As a result, the time required to operate the scan chains -
the test application time - has increased to the extent that it is becoming uneconomic
to employ scan test on complex designs.
Built-in self-test (BIST) has emerged as an alternative to ATE-based external
testing. BIST offers a number of key advantages. It allows pre-computed test sets
to be embedded in the test sequences generated by on-chip hardware, supports test
reuse and at-speed testing, and protects intellectual property. To understand the
impact of shift time on test application time, consider the typical sequence involved
in processing a single scan test pattern shown in Figure 2.1. The shift operations take
as many clock periods as required by the longest scan chain. Optimizations (such
as overlapping of scan operations of adjacent test patterns) still do not adequately
reduce the unwieldy test application time required by the scan operation. For a
scan-based test, test data volume can be approximately expressed as
Test data volume ≈ scan cells ∗ scan patterns (2.1)
Assuming balanced scan chains, the relationship between test time and test data
volume is then
Figure 2.1: Scan-based testing (Wang, 2006)
Test time ≈ (scan cells ∗ scan patterns)/(scan chains ∗ frequency) (2.2)
Consider a circuit having 60000 flip-flops. Assume the number of scan chains is 60
(sc). Each scan chain will have 1000 flip-flops (len). Assume the number of vectors
is 20000 (tv). The time taken to test the IC is given by,
Number of clock cycles for flush test = len + 4 (2.3)
Number of clock cycles for scan test = (tv ∗ (1 + len)) + len (2.4)
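As a quick check, the clock-cycle counts of Equations (2.3) and (2.4) can be evaluated directly for the figures above (a minimal Python sketch; the variable names mirror sc, len and tv from the example):

```python
# Test-time estimate for the worked example: 60 scan chains of
# 1000 flip-flops each, 20000 test vectors, 100 MHz ATE clock.
sc, chain_len, tv, freq_hz = 60, 1000, 20_000, 100e6

flush_cycles = chain_len + 4                       # Equation (2.3)
scan_cycles = (tv * (1 + chain_len)) + chain_len   # Equation (2.4)

test_time_s = (flush_cycles + scan_cycles) / freq_hz
print(f"scan-test cycles: {scan_cycles:,}")   # 20,021,000
print(f"test time: {test_time_s:.3f} s")      # about 0.2 s per die
```

Even a fraction of a second per die adds up across millions of parts, which is why reducing shift cycles dominates test-cost discussions.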
Most automatic test equipment operates at a few MHz to a few tens of MHz. Even
with an ATE running at 100 MHz, the above example takes
approximately 0.2 seconds (by Equation (2.2)), which is still a long test time in
high-volume production. For all fault models, the following metrics are used to
characterize the quality of a test set.
Definition 2.1: The fault coverage (FC) is the percentage of detected faults with
respect to the total number of faults. The terms test coverage (TCov) and fault
coverage (FC) are used interchangeably by different EDA vendors.
Definition 2.2: The fault efficiency (FE) is the percentage of detected faults with
respect to the total number of testable faults.
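The two definitions differ only in their denominator. As a minimal sketch (the fault counts below are invented for illustration):

```python
# Fault coverage (FC) vs. fault efficiency (FE), per Definitions 2.1/2.2.
# Illustrative numbers: 1000 total faults, 20 of them untestable,
# 950 detected by the test set.
total_faults, untestable_faults, detected_faults = 1000, 20, 950

fc = 100.0 * detected_faults / total_faults                        # vs. all faults
fe = 100.0 * detected_faults / (total_faults - untestable_faults)  # vs. testable faults
print(f"FC = {fc:.1f}%  FE = {fe:.2f}%")   # FC = 95.0%  FE = 96.94%
```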
As designs grow larger, maintaining high test coverage becomes increasingly
expensive because the test equipment must store a prohibitively large volume of test
data and the test application time increases. Moreover, a very high and continually
increasing logic-to-pin ratio creates a test data transfer bottleneck at the chip pins.
Accordingly, the overall efficiency of any test scheme strongly depends on the method
employed to reduce the amount of test data.
2.2.2 ATE Limitations over LBIST
Test data storage and test data bandwidth requirements between the tester and
chip grow rapidly, and ATE capability cannot scale indefinitely for increasingly
complex IC designs. Buying a new tester with more memory, more channels, and a
higher speed of operation is also not a good solution: the cost of a high-speed
tester had already reached around 20 million USD by 2010 (ITRS Annual Report, 2011).
Some applications of ATE in the IC back-end process are shown in Figure 2.2. The
difficulty of using ATE can be avoided by eliminating the costly ATE and using BIST,
which relies on on-chip test pattern generation by linear feedback shift registers
(LFSRs).
Figure 2.2: Applications of ATE in the IC back-end process (chromaate, 2004).
2.2.3 The Need for Logic BIST
Any electronic system employed in safety-critical applications is expected to have a
periodic self-checking scheme for sustained error-free operation. For example, medical
electronic devices need to test themselves to assure the continued safety of patients.
LBIST has found use mainly in safety-critical (automotive, medical, military),
mission-critical (deep-space, aviation) and high-availability (telecom) applications.
However, as process technologies plunge below 22 nm, LBIST will become compulsory
for Application-Specific Integrated Circuits (ASICs), Application-Specific Standard
Products (ASSPs) and complex commercial ICs (Nan Li, et al., 2015). Another
example is automotive electronics. With the explosion in the growth of the automo-
tive semiconductors industry comes an associated and intense focus on high silicon
quality and reliability. The last thing anyone wants is a brake system failure due to
a latent silicon defect, and concerns over reliability are driving changes in the testing
requirements for these chips. The electronics must meet certain safety standards set
forth by the automotive manufacturers, such as ISO 26262 and AEC-Q100, which
outline acceptable failure rates for key functionality (Meehl, 2014). DFT methods
and testing strategies need to be restructured to meet these high quality metrics.
The most needed strategy is to test ICs throughout their entire lifespan, and on-chip
testing has emerged as the solution that enables effective testing of these ICs. A
power-on test ensures that the silicon is functioning as required each time the vehicle
is started, with LBIST logic coverage typically falling in the range of 75% to 80%. In a typical
application, if the LBIST test signals a silicon failure, an associated error code is
posted to the vehicle computer and the check engine light comes on. In this way,
LBIST provides the means to quickly and easily identify potential safety issues with-
out needing dedicated and complex equipment at car dealerships. Another benefit
of LBIST is that it provides a low pin interface and very small test data volume
when testing the chip on the tester.
2.2.4 Fault Coverage in Logic BIST
Logic BIST's fault coverage is typically reduced by random-pattern-resistant (RPR)
faults, so the coverage falls well short of 100 percent and very long test sequences
are required. To achieve test coverage close to that achievable by deterministic
ATPG patterns, the number of BIST patterns must be significantly greater, which
forces a tradeoff between increased test application time and reduced test coverage.
Several solutions have been devised to address this problem. The number of faults
detected per random pattern is usually high for the first few patterns and then
decreases with further patterns (Agrawal, 2002). The detected faults
are removed from the total fault list after the application of each test pattern. In
general, after a certain number of patterns, the fault coverage of random patterns
saturates, and even hundreds of further patterns may detect only a few faults. These
faults are the most difficult to detect, and many additional random patterns may
not detect a single new fault in the CUT. Such undetected faults are referred to as
random-pattern-resistant (RPR) faults. Figure 2.3 shows the general trend in fault
coverage with the number of random patterns. It is not known precisely when to
stop generating random patterns; generation normally stops when no further
improvement in fault coverage is seen.
Figure 2.3: Typical logic BIST fault coverage map
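The saturation behavior of Figure 2.3 can be reproduced on a toy example. The sketch below (circuit, net names and fault list all invented for illustration) fault-simulates random patterns against every single stuck-at fault of a small combinational block and records the cumulative coverage:

```python
import random

# Toy coverage curve: random patterns applied to a small circuit,
# out = (a AND b) XOR (c OR d), with stuck-at faults on every net.
NETS = ["a", "b", "c", "d", "n1", "n2", "out"]

def simulate(pattern, fault=None):
    """Evaluate the circuit; fault = (net, stuck_value) or None."""
    def stuck(net, val):
        return fault[1] if fault and fault[0] == net else val
    v = {n: stuck(n, val) for n, val in zip("abcd", pattern)}
    v["n1"] = stuck("n1", v["a"] & v["b"])
    v["n2"] = stuck("n2", v["c"] | v["d"])
    return stuck("out", v["n1"] ^ v["n2"])

faults = [(net, sv) for net in NETS for sv in (0, 1)]
random.seed(1)
detected, coverage = set(), []
for _ in range(100):
    p = tuple(random.randint(0, 1) for _ in range(4))
    good = simulate(p)
    detected |= {f for f in faults if simulate(p, f) != good}
    coverage.append(100.0 * len(detected) / len(faults))
print(coverage[:5], "...", coverage[-1])
```

Coverage climbs steeply over the first patterns and then flattens; at production scale the flat tail is exactly where the random-pattern-resistant faults live.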
2.2.5 At-Speed and On-Chip Testing of Logic BIST
Around 2012, there was a lot of interest in LBIST in mission-critical markets -
military, medical, and especially automotive. LBIST is generally not used for a
one-time manufacturing test but for repeated testing in the field (or in-system
testing) to ensure that a car, a satellite or a medical device is working as expected.
For example, in an electronically-controlled braking system, LBIST could run every
time the car is turned on. On each run, the computed signature is compared against
the expected one; if the signatures do not match, the car's electronic controls can
alert the driver or perhaps prevent the transmission from going into drive. Of course,
logic ICs still run through the usual manufacturing tests, but most likely the LBIST
will not be turned on in production testing.
Meehl (Meehl, 2014) observed that LBIST requires more test patterns than
conventional testing, may take an extra millisecond or two to run, and probably
would not provide high enough test coverage without additional test vectors. For
manufacturing test, the combination of LBIST and ATPG-driven tests is the most
common approach for the majority of users. During ATPG testing, LBIST is turned
off and treated like functional logic.
2.2.6 An Automated Logic BIST Insertion Flow
Two logic BIST configurations are in practice, distinguished by how the LBIST is
accessed from an external device. In the first, a production facility can perform
efficient, first-pass quality checks of the silicon on multiple sites in parallel, because
the test pin interface is reduced to the JTAG pins only. This type, called JTAG
Access LBIST, uses the JTAG interface and protocols to start and execute LBIST.
The other type, called Direct Access LBIST, is executed simply by holding an extra
pin high or low. This method uses the
I/O pins of the device directly to access the LBIST blocks. The Cadence Encounter
Test product has supported automatic test pattern generation (ATPG) for many
years. A newer capability, introduced with the Encounter 13.1 release, is the
automatic insertion of the LBIST macro with the Cadence RTL (register transfer
level) Compiler (RC). The automated flow first builds scan chains, then inserts
compression channels, and then builds the LBIST circuitry. Designers can check
coverage and manually add more test points to improve coverage. Encounter Test
then validates that the PRPG and MISR are working properly, generates the final
signature, and provides the fault coverage for a given LBIST run. The diagram in
Figure 2.4 below shows the LBIST flow in more detail (OPCG is on-product clock
generation). A comparative feature list of the ATE based testing and logic BIST
based testing is given in Table 2.1.
2.3 Trends in VLSI Testing
In the last three decades, the world of electronics (especially microprocessors) has
expanded exponentially in both productivity and performance. Integrated circuits
(ICs) are considered the basis of modern products. The IC industry has followed a
steady path of constantly shrinking device dimensions and hence increasing chip
density.
IC testing is one of the critical processes involved in the life cycle of a product.
Testing difficulty increases with the integration level of the devices. The rule of ten
Figure 2.4: Automated LBIST insertion with Cadence RTL Compiler (RC) and Encounter Test (ET) (Cadence, 2012)
says that the cost of detecting a faulty IC increases by an order of magnitude as
it moves through each stage of manufacturing, from device level to board level to
system level and finally to system operation in the field. Indeed, the rule of ten is
turning into a rule of twenty because chips, boards and systems have become
enormously complex. Feedback obtained from IC testing is the only way to analyze and
isolate many defects in the manufacturing process. Time-to-yield, time-to-market
and time-to-quality are all gated by testing. With the increasing demand for
high-quality electronic products, IC test is used for debugging, diagnosing and
repairing parts of the device. Increasing reliability, availability and serviceability
requirements also result in periodic testing of devices throughout their life cycle.
VLSI testing is very important to designers, product engineers, test engineers,
managers, manufacturers and end-users.
Table 2.1: Differences between LBIST and ATE-based testing.

     Logic BIST                                  ATE-based testing
1    At-speed testing                            Slow (ATE dependent)
2    Reduces test cost                           High ATE cost
3    Less test time                              More test time
4    Random patterns (low pattern efficiency)    Deterministic patterns (high pattern efficiency)
5    Low stuck-at coverage                       High stuck-at coverage
6    On-chip                                     Needs a test house
7    No memory needed                            Large memory needed
8    Any stage of implementation                 Only after manufacturing
9    Debug without removal from the system       Debug on ATE only; needs removal from the system
10   Complex debug                               Relatively simpler
As we move deep into submicron technology, single ICs have been replaced by
SoC-type designs. An SoC is a heterogeneous mixture of a number of cores obtained
from various sources. Conventional test and diagnosis methods have become
inadequate and costly for SoC designs, and the traditional bed-of-nails method is
not able to control and observe every point in the device. The ATE (automatic test
equipment) is computer-controlled equipment that applies test patterns to a DUT
(device under test) and, based on the response collected from the DUT, marks the
device as good or bad. Since core-based systems introduce new complexities in
testing, designers have to search for alternative approaches to chip development in
order to keep up with diminishing time-to-market requirements and stay within
budgets. One alternative is embedding the testing capabilities, or test circuits,
behind the primary I/O ports within the internal structure of the chip. The process
of incorporating test circuits into the
design of a device is called Design for Testability (DFT). DFT techniques (embedded
test) enable the production of higher-quality devices in less time. The use of
embedded test raises margins and significantly reduces the time required for system
verification, test and debug. One of the oldest DFT techniques is the scan chain,
which reduces test complexity to a large extent. In the case of SoCs, boundary scan
and BIST methods are used to perform testing. Test cost contributes a major part
of the total device cost; it includes the cost of ATE equipment, test development,
test vector generation, programming, etc. Scan-based techniques significantly reduce
the cost of test generation, and BIST methods can lower the complexity and cost of
ATE. Today's complex ICs impose three scaling trends on test - increased complexity,
increased performance, and higher density - each of which has a direct impact on
test methods.
2.3.1 Implication on Complexity
Transistor count in an IC keeps increasing from generation to generation, as stated
by Moore's law. The drastic increase in the ratio of transistors per pin continuously
reduces the accessibility of the transistors from the chip pins, creating a big problem
for IC test. The amount of test data needed to test an IC to a certain level of fault
coverage grows with the transistor count of the IC under test. An SoC is a mixed IC
that contains various non-homogeneous circuits (such as analog blocks, digital blocks,
memory elements, etc.) collected from different vendors. Different types of circuits
exhibit distinct defect behavior and need different testing techniques. Typically,
distinct external test equipment is used to test each of these circuits, which is an
expensive process. Embedded tests such as BIST and boundary scan remain a
solution for testing these types of mixed circuits.
2.3.2 Implication on Performance
With the continuous increase in IC internal speed, performance-related defect
coverage has become increasingly important, and complex chips demand a
comprehensive performance test. Performance tests applied from external test
equipment cannot adequately and cost-effectively test at high clock speeds or provide
the necessary performance-related defect coverage, because such equipment is
typically built from older technology than the chips it tests, and higher-speed test
equipment is substantially more expensive. Since an embedded test circuit is built
on the same IC, it operates at the same speed as the chip's internal logic.
2.3.3 Implication on Density
Continuous advancement in technology keeps increasing silicon (device) density.
Increased density reduces defect sizes and hence increases the difficulty of fault
localization. Embedded test hardware - whether in embedded memories, cores,
user-defined logic or analog blocks - can act as the infrastructure to collect fault
data from the block under test, helping to isolate defects quickly.
2.4 Fault Models of Interest
This section describes the logical fault model used in this work. Logical faults
represent the effect of physical defects on the logic behavior of the modeled system.
Restricting the analysis of physical defects to the level of the logic behavior has
several advantages.
The complexity is reduced by transforming a physical problem into a logical one.
The space of physical defects is larger than the space of logical faults, so a single
fault model can cover several physical defect types. Moreover, tests derived for
certain logical faults may cover physical defects for which no accurate fault model
is known (Agrawal, 2002). Also, most logical fault models are technology-independent,
so testing and diagnosis methods developed for them are applicable to many
technologies.
Functional faults are commonly defined at RTL or higher levels (like behavioral
or system level) and they affect the proper execution of the operations used at
these levels. Shorts and opens are two examples of structural faults. A short is
formed by connecting points not intended to be connected, while an open results
from breaking a connection. In this work, only structural, permanent and single
faults of combinational logic are considered. Intermittent, transient, or multiple
faults are not taken into account, and any analog or memory elements that may be
present in the circuit under test are not considered. Under the single-fault
assumption, at most one logical fault is present in the system at any time. This
assumption is justified by the fact that, in most cases, a multiple fault can be
detected by the tests designed for the individual single faults that compose it.
2.5 Stuck-at Faults
The logical fault corresponding to a signal line being stuck at a fixed logic value
(0/1) is referred to as a single stuck-at 0/1 fault. Physical defects which can be
modeled with the help of a stuck-at 0/1 fault on the signal line i include an open on
the fan-out lines driven by the line i, a short to power/ground or an internal error
in the component driving the line i (Gherman, 2006).
Despite the fact that the single stuck-at fault model does not cover all the physical
defects that can appear in a digital circuit, it is very useful due to the following
properties:
Compared to other fault models, the number of single stuck-at faults in a circuit
grows linearly with its size. Moreover, the number of faults that must be explicitly
considered can be reduced by fault collapsing: equivalence-based and dominance-based
fault collapsing can reduce the number of faults to be explicitly analyzed by about
50% and 40%, respectively (Abramovici, 1990).
Test sets generated for single stuck-at faults may detect many faults belonging to
other fault models, and the model is technology-independent. The single stuck-at
fault model and its analysis can be used to construct and analyze other fault models,
such as the transition fault model. A combinational circuit that contains an
undetectable stuck-at fault is said to be redundant, since such a circuit can always
be simplified by removing at least one gate or input. The test generation problem
for stuck-at faults belongs to the class of NP-complete problems (worst-case
behavior) (Ibarra, 1975), and undetectable (redundant) faults are usually the ones
that cause test generation algorithms to exhibit their worst-case behavior
(Abramovici, 1990).
A straightforward extension of the single stuck-at fault model is the multiple
stuck-at fault model, which is more difficult to handle. The fault list for a circuit
having N possible sites for single stuck-at faults can contain up to 2N single and
3^N - 1 multiple stuck-at faults (Abramovici, 1990). Fortunately, the importance of
the multiple stuck-at fault model is reduced by the fact that tests with complete
detection of the single stuck-at faults detect most of the multiple stuck-at faults as
well (Hughes, 1984).
2.6 Test-per-Scan Schemes
Test-per-scan BIST schemes require a scan-based design. In the case of sequential
circuits, this means that all the storage cells can be configured as one or several scan
paths (chains), which are used as serial shift registers in test mode (Figure 2.6). In
this way, each storage device of the CUT becomes easily controllable and observable.
The test stimuli/responses are shifted into/out of the scan paths (Abramovici, 1990)
(Agrawal, 2002) (Wang, 2006). Scan-based design helps to reduce the problem of
testing sequential circuits to the simpler problem of testing combinational circuits.
The BIST control unit (BCU) in Figure 2.5 must contain at least a shift counter
and a pattern counter. The shift counter controls the bit stream which is generated
and shifted into the scan path by a TPG. The pattern counter controls the length
of the test sequence. A system clock cycle (also called capture or functional clock
cycle) is applied to load the CUT response to the current test pattern into the scan
path. During the so-called shift mode (also called scan or test mode) a new test
pattern is shifted into the scan path, while the CUT response to the previous pattern
is shifted out and compressed by a test response evaluator (TRE). A very common
and effective parallel-serial mixed scheme is obtained by partitioning a full scan path
into multiple scan chains (Figure 2.6).
Figure 2.5: Test-per-scan scheme (Wun, 1998)
Figure 2.6: STUMPS architecture for parallel-serial mixed scheme (Wun, 2002)
In Figure 2.6, the test patterns are generated by a pseudo-random pattern generator
(PRPG) and the responses are compacted by a multiple-input signature register
(MISR). Both the PRPG and the MISR are typically implemented as linear feedback
shift registers (LFSRs). Such a scheme is called Self-Test Using MISR and Parallel
Shift register sequence generator (STUMPS) (Bardell, 1982).
The basic design with multiple scan chains suffers from highly correlated patterns.
To solve this problem, XOR trees (phase shifters, PS) may be inserted between the
LFSR and the scan chain inputs (Figure 2.6) (Bardell, 1990) (Rajski, 1998).
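A PRPG of this kind is easy to sketch. The example below models a 4-bit Fibonacci LFSR with the primitive polynomial x^4 + x^3 + 1 (a toy width chosen for illustration; production PRPGs are far wider) and confirms the maximal-length property:

```python
# 4-bit Fibonacci LFSR, recurrence s_n = s_{n-1} XOR s_{n-4}
# (primitive polynomial x^4 + x^3 + 1).
WIDTH = 4

def lfsr_step(state):
    fb = ((state >> (WIDTH - 1)) ^ state) & 1    # s_{n-1} XOR s_{n-4}
    return ((state << 1) | fb) & ((1 << WIDTH) - 1)

def period(seed):
    state, steps = lfsr_step(seed), 1
    while state != seed:
        state, steps = lfsr_step(state), steps + 1
    return steps

# Any nonzero seed cycles through all 2^4 - 1 = 15 nonzero states.
print(period(0b0001))   # 15
```

Feeding adjacent scan chains straight from adjacent LFSR bits hands each chain a time-shifted copy of the same sequence, which is precisely the correlation the phase shifters are inserted to break.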
Test-per-scan schemes have several advantages: (a) high fault/defect coverage,
(b) reduced test data size (compared to sequential test patterns), (c) relatively low
test generation time, (d) reduced test cost (no special requirement for costly ATEs
for functional testing), (e) low impact on system behavior, as only scan paths are
added to the mission logic, and (f) separation of the pattern generator from the
CUT, so that it can be synthesized at a later step of the design flow.
The drawbacks of test-per-scan schemes are: (i) the long test application time
required by the scan mode, (ii) functionally untestable faults can be activated,
(iii) reduced testability for faults whose detection requires pairs of test patterns,
and (iv) reduced system performance if scan elements are introduced into critical
paths. If partial scan paths (Trischler, 1980) are used, such problems can be reduced
and more test patterns may be applied within the same test time.
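On the response side, the MISR can be sketched the same way: an LFSR whose next state also XORs in one slice of CUT response per capture cycle. The 4-bit width, feedback polynomial and response values below are invented for illustration:

```python
# 4-bit MISR (feedback polynomial x^4 + x^3 + 1) compacting one 4-bit
# response slice per clock into a final signature.
WIDTH = 4

def misr_compact(responses, seed=0):
    state = seed
    for slice_ in responses:
        fb = ((state >> (WIDTH - 1)) ^ state) & 1
        state = (((state << 1) | fb) ^ slice_) & ((1 << WIDTH) - 1)
    return state

good = [0b1010, 0b0111, 0b0001, 0b1100, 0b0110]
bad = list(good)
bad[2] ^= 0b0100            # single-bit error in one response slice

print(f"{misr_compact(good):04b} vs {misr_compact(bad):04b}")
# A single-bit error never aliases: the MISR transition is linear and
# invertible, so one nonzero input difference always survives to the
# signature. Aliasing only becomes possible with multi-bit errors.
```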
2.7 Test-per-Clock Schemes
In a test-per-clock scheme (Koenemann, 1979)(Krasniewski, 1989)(Stroele, 1994)
(Wang, 1986), a test pattern is applied to the CUT every clock cycle. This scheme
is best suited for register-based design. This kind of scheme employs a specific
BIST architecture using the built-in logic block observer (BILBO) (Koenemann,
1979), which is a more sophisticated register that can function as a normal state
register, scan register, PRPG or MISR. All functionality of the BILBO depends on
the mode input signals B0 and B1. Signal B0 controls all the registers to switch
between the global and local modes (Figure 2.7).
Figure 2.7: Control signals of a BILBO (Wun, 1998)
Figure 2.8: Test-per-clock scheme (Wun, 2002)
The global mode covers the functional and scan modes. In the local mode, the
registers may act as pattern generators or response evaluators. In order to select
each of these sub-modes associated with the global or local mode, the signal B1
is used. In contrast to signal B0, which is unique for all registers, the signal B1
depends upon the addressed register. Figure 2.8 shows how testing is facilitated by
changing the functionality of the BILBO registers. Initially, the registers R1 and R2
are initialized in scan mode. Then register R1 is set to PRPG mode
for the combinational logic C1, and the test responses are observed by register R2,
which functions in response-evaluation mode as a MISR. The combinational logic C2
is tested after the test outcome contained in R2 is shifted out and the functionalities
of R1 and R2 are interchanged. Finally, the new test outcome contained in R1 is
shifted out.
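The R1/R2 schedule for testing C1 can be sketched end to end: R1 steps as a PRPG, C1 responds, and R2 folds each response into its signature. The 4-bit width, the block C1 and the injected defect are all invented for illustration:

```python
# Test-per-clock session for C1 (Figure 2.8 style): R1 as PRPG, R2 as
# MISR, both 4-bit LFSRs with feedback polynomial x^4 + x^3 + 1.
WIDTH, MASK = 4, 0xF

def step(state, injected=0):
    """One clock: plain LFSR when injected == 0, MISR otherwise."""
    fb = ((state >> (WIDTH - 1)) ^ state) & 1
    return (((state << 1) | fb) ^ injected) & MASK

def c1(x, fault_free=True):
    """Invented combinational block; the defect corrupts one response."""
    y = (x ^ (x >> 1)) & MASK
    if not fault_free and x == 0b0111:
        y ^= 0b0001
    return y

def run_bist(num_patterns, fault_free=True):
    r1, r2 = 0b0001, 0b0000      # seeds loaded via scan mode
    for _ in range(num_patterns):
        r2 = step(r2, injected=c1(r1, fault_free))  # R2 compacts response
        r1 = step(r1)                               # R1 next pattern
    return r2                                       # final signature in R2

golden = run_bist(15)
print(f"golden signature: {golden:04b}")
print(f"faulty signature: {run_bist(15, fault_free=False):04b}")
```

After 15 patterns the maximal-length R1 has exercised every nonzero input once, so the defect is excited exactly once and the faulty signature is guaranteed to differ from the golden one.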
Compared to test-per-scan schemes, test-per-clock schemes have both advantages
and disadvantages. The advantages are: (a) shorter test times and better support
for two-pattern testing (Cockburn, 1998), as a new pattern can be applied in each
clock cycle, and (b) better support for at-speed testing, as no pattern shifting
(which is generally done at a lower speed) is required.
The disadvantages of test-per-clock schemes are: (i) larger hardware overhead and
(ii) stronger impact on system behavior and design flow. The overhead is also
affected by the increased complexity of the test-per-clock schedule, which requires
the synthesis of a rather complex BCU. One reason for these disadvantages is that
additional test registers have to be included, because a normal BILBO register
cannot work as TPG and TRE simultaneously. Wang (1986) introduced a special
type of BILBO register, called the concurrent BILBO, which can perform signature
analysis and pattern generation concurrently.
2.8 Summary
This chapter introduced the concepts of VLSI testing, ATE-based testing, logic
BIST-based testing and the two types of logic BIST access. Most commercial DFT
tools use the STUMPS architecture, which is a test-per-scan scheme. Various
researchers have attempted to make logic BIST-based testing more efficient by
reducing the number of random test patterns in different ways. The test pattern
compression techniques and schemes proposed previously in the literature are
detailed in Chapter 3.