
198 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 4, NO. 2, FEBRUARY 1993

A Novel Concurrent Error Detection Scheme for FFT Networks

D. L. Tao, Member, IEEE, and C. R. P. Hartmann, Senior Member, IEEE

Abstract-Algorithm-based fault tolerance techniques have been proposed to obtain reliable results at very low hardware overhead. Even though 100% fault coverage can theoretically be obtained with these techniques, the system performance, i.e., fault coverage and throughput, can be drastically reduced by practical problems such as round-off errors. In this paper, we propose a novel algorithm-based fault tolerance scheme for fast Fourier transform (FFT) networks. We show that the proposed scheme theoretically achieves 100% fault coverage. We analyze and provide an accurate measure of the fault coverage for FFT networks by taking the round-off error into account. We show that the proposed scheme provides concurrent error detection capability to FFT networks with low hardware overhead, high throughput, and high fault coverage.

Index Terms-Concurrent error detection, digital signal processing, fast Fourier transform, fault tolerance, VLSI processor array.

I. INTRODUCTION

The fast Fourier transform (FFT) has played a key role in digital signal processing because it drastically reduces the computational complexity of large discrete Fourier transforms (DFTs). With the advances in VLSI technology, special-purpose VLSI chips are available for constructing dedicated FFT systems. One obvious implementation of the $N$-point FFT network uses two-input butterflies; such an implementation consists of $\log_2 N$ stages, each containing $N/2$ two-input butterflies, i.e., it contains $(N/2) \times \log_2 N$ processors. It has been reported in [3], [14], [15] that a single wafer containing an 8-point FFT network and a 16-point FFT network is already available. Therefore, it is conceivable that fabricating large-scale FFT networks on a single wafer will be feasible in the near future.

For real-time computation, high performance parallel systems are required to be extremely reliable. Hence, fault-tolerant designs are necessary in high performance systems, e.g., in FFT networks, to ensure the correctness of results. Three schemes have been proposed for designing fault-tolerant FFT networks [2], [6], [13]. In [2], Choi and Malek proposed a scheme, called recomputing by alternate path, for concurrent error detection and fault diagnosis in FFT networks. The basic idea is to compute the operands twice in their modified FFT structure so that any computation goes through two different butterflies. As a result, any fault within a butterfly produces different outputs, and hence the fault is detectable. The 100 percent fault coverage is achieved on the assumption that the fault is restricted to the butterfly only. The hardware overhead includes $(100/\log_2 N)$ percent of extra butterflies, registers, and other logic used to control computation paths. Moreover, since time redundancy is used, the throughput of this scheme is only 50% of the throughput of the FFT network without concurrent error detection (CED) capability. In [2], they also proposed an algorithm to locate the faulty butterfly within (log N + 5) cycles.

Manuscript received January 26, 1990; revised March 18, 1991 and June 17, 1991. This work was supported in part by the Research Foundation of SUNY at Stony Brook under a Faculty Development Grant.

D. L. Tao is with the Department of Electrical Engineering, SUNY at Stony Brook, Stony Brook, NY 11794.

C. R. P. Hartmann is with the School of Computer and Information Science, Syracuse University, Syracuse, NY 13234.

IEEE Log Number 9204627.

Traditional techniques for designing fault-tolerant systems, e.g., TMR, are too costly because of their hardware overhead. Recently, algorithm-based fault tolerance (ABFT) has been suggested as a means of designing fault-tolerant VLSI processor arrays. The potential advantage of ABFT is that errors caused by permanent or transient failures in the system can be detected or corrected with very low hardware overhead and at high throughput. ABFT was originated by Huang and Abraham [4]. In [4], a checksum approach was proposed to provide concurrent error detection/correction for matrix operations. The technique has been extended to matrix computations [5], FFT [6], [13], matrix equation solvers [7], the QR decomposition, the LU decomposition, eigenvalue and singular value problems [1], and matrix triangularization [9].

In [6], Jou and Abraham proposed an ABFT scheme for FFT networks. The block diagram of the scheme in [6] is shown in Fig. 1. They showed in [6] that 100% fault coverage and throughput can be achieved theoretically. Theoretically, only a fault will result in a discrepancy between the input operands, namely $X$ and $Y$, of the comparator shown in Fig. 1. However, since the processing elements in FFT networks (especially in VLSI implementations) have only finite precision, round-off errors exist inherently in FFT networks, and so the inputs of the comparator may not be equal to each other even if the FFT network is free of any fault. In other words, round-off errors will cause false alarms. Consequently, the throughput, the fault coverage, or both could be seriously reduced. In [6], taking into account the effect of the round-off errors, Jou and Abraham estimated the throughput and the fault coverage of their scheme. In addition, another algorithm, which distinguishes the type of error (round-off or functional) and locates the faulty butterfly, has been proposed; this scheme requires approximately $(2/\log_2 N) \times 100$ percent of extra hardware. However, the fault coverage given in [6] is not accurate.

Fig. 1. The block diagram of the fault-tolerant FFT network designed by Jou and Abraham.

Fig. 2. The block diagram of the fault-tolerant FFT network designed by the proposed scheme.

Fig. 3. A butterfly: $c = a + W_N^k \times b$, $d = a - W_N^k \times b$, where $W_N = \exp(-j 2\pi / N)$.

Another algorithm-based fault-tolerant scheme for FFT networks was proposed by Reddy and Banerjee [13]. The fault model used in [13] is much weaker than that used in [6], and the hardware overhead in [13] is approximately the same as that in [6]. In [13], Reddy and Banerjee provide simulation results for the fault coverage and throughput of 8-point one-dimensional FFT networks. However, the throughput and fault coverage of $N$-point ($N > 8$) fault-tolerant FFT networks constructed by their scheme are unknown.

We propose a new algorithm-based concurrent error detection scheme for FFT networks. The proposed scheme maintains the low hardware overhead and high throughput of Jou and Abraham's scheme, and at the same time increases the fault coverage significantly. The block diagram of the proposed technique is shown in Fig. 2. In this paper, we show that the proposed scheme has 100% fault coverage theoretically, and we analyze the fault coverage and throughput of the proposed scheme by taking the round-off error effect into consideration. We show that, for the same throughput, the fault coverage of the proposed technique is significantly better than that of the scheme proposed in [6].

Fig. 4. An 8-point FFT network.

The $N$-point FFT network considered in this paper consists of $(N/2) \times \log_2 N$ two-input butterflies, where $N = 2^k$ and $k$ is an integer. A butterfly consists of two complex adders and one complex multiplier as shown in Fig. 3, and an eight-point FFT network is shown in Fig. 4. The fault model used in this paper is the same as that used in [6], i.e., a functional fault is modeled as additive noise at an input or an output of a butterfly. In other words, a fault can be modeled as an additive error at (a), (b), (c), or (d) as shown in Fig. 3. For the sake of simplicity, we use the words "fault" and "error" interchangeably in the rest of the paper. Moreover, since the proposed scheme is used for fault detection during the system's normal operation, we assume that only one butterfly can be faulty at a time, and that the fault is basically transient or intermittent in nature.
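The fault model above can be illustrated with a minimal sketch, assuming the butterfly equations of Fig. 3 and the sign convention $W_N = \exp(-j 2\pi/N)$. The function names `twiddle` and `butterfly` and the `fault_site` parameter are hypothetical names chosen for this illustration; they do not come from the paper.

```python
import cmath

def twiddle(n, k):
    """Return W_n^k = exp(-j*2*pi*k/n) (sign convention assumed here)."""
    return cmath.exp(-2j * cmath.pi * k / n)

def butterfly(a, b, w, fault_site=None, e=0j):
    """Two-input butterfly c = a + w*b, d = a - w*b.

    A functional fault is modeled as an additive error e on exactly one of
    the four lines 'a', 'b' (inputs) or 'c', 'd' (outputs).
    """
    if fault_site == "a":
        a = a + e
    elif fault_site == "b":
        b = b + e
    c = a + w * b
    d = a - w * b
    if fault_site == "c":
        c = c + e
    elif fault_site == "d":
        d = d + e
    return c, d
```

An error injected on an input line disturbs both outputs, whereas an error injected on an output line disturbs only that output; this is why the four sites are treated as distinct modeled errors.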

In Section II, we propose an algorithm-based concurrent error detection scheme for FFT networks. In Section III, by taking the round-off error into consideration, we analyze and estimate the fault coverage and throughput of the proposed scheme as well as of Jou and Abraham's scheme. The comparisons are also given in Section III.

II. A NOVEL CED SCHEME FOR FFT NETWORK

In this section, we propose a new concurrent error detection scheme for detecting all errors in an $N$-point FFT network. The proposed CED scheme consists of two separate concurrent error detection schemes: the CED using a predicted sum (PSCED) and the simple concurrent error detection (SCED) [6]. Although neither of these CED schemes alone can detect all of the errors in an $N$-point FFT network, the combination of the two can. We first describe these two CED schemes, and then combine them to form the proposed CED scheme.


A. The CED for FFT Network by Using Predicted Sum: PSCED

Before introducing the PSCED, we give an example to illustrate the basic strategy used in the design of PSCED.

Example 1: Let us consider the PSCED for the 4-point FFT network shown in Fig. 5. In Fig. 5, the inputs of the totally self-checking (TSC) comparator are labeled $S_1'$ and $S_1''$, respectively. $S_1'$ is the encoded output and $S_1''$ is used to provide an error-free reference. $S_1'$ can be represented as follows:

$$\begin{aligned}
S_1' &= W_4^0 [X(0) - X(1)] + W_4^1 [X(3) - X(2)] \\
     &= W_4^0 X(0) - W_4^0 X(1) + W_4^1 X(3) - W_4^1 X(2) \\
     &= W_4^0 X(0) + W_4^2 X(1) + W_4^3 X(2) + W_4^1 X(3) \\
     &= a_0^{(4)} X(0) + a_1^{(4)} X(1) + a_2^{(4)} X(2) + a_3^{(4)} X(3)
\end{aligned}$$

and $S_1''$ as

$$S_1'' = \begin{pmatrix} 0 & 2W_4^0 & 2(W_4^0 - W_4^1) & 2W_4^1 \end{pmatrix}
\begin{pmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \end{pmatrix}.$$

It can be easily proved that any error except the one at $x(0)$ will result in $S_1' \neq S_1''$, i.e., the error is detected. Note that the labels on some lines in Fig. 5 will be explained later.
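The 4-point example can be checked numerically. The following is a minimal sketch, assuming the DFT convention $X(l) = \sum_j x(j) W_4^{jl}$ with $W_4 = \exp(-j 2\pi/4)$ and using a direct $O(N^2)$ DFT in place of the butterfly network. The names `dft`, `encoded_output`, and `predicted_sum` are hypothetical.

```python
import cmath

def dft(x):
    """Direct DFT, X(l) = sum_j x(j) * W_N^(j*l), standing in for the FFT network."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * l / n) for j in range(n))
            for l in range(n)]

W4 = cmath.exp(-2j * cmath.pi / 4)                     # W_4 = -j
A4 = [W4**0, W4**2, W4**3, W4**1]                      # output weights a_l^(4)
B4 = [0, 2 * W4**0, 2 * (W4**0 - W4**1), 2 * W4**1]    # input weights of the predictor

def encoded_output(X):
    """Weighted sum of the FFT outputs (one comparator input)."""
    return sum(a * v for a, v in zip(A4, X))

def predicted_sum(x):
    """Same quantity predicted directly from the FFT inputs (other comparator input)."""
    return sum(b * v for b, v in zip(B4, x))
```

Because the weight attached to $x(0)$ in the predictor is zero, an additive error on the line carrying $x(0)$ into the network changes neither comparator input relative to the other, which is the one undetectable case noted in the example.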

As shown in Example 1, the PSCED detects errors in the FFT network by comparing the encoded output $S_1'$ with the encoded input $S_1''$. We now provide an algorithm to construct the PSCED for an $N$-point FFT network.

Fig. 5. PSCED for a 4-point FFT network.

Algorithm:

1. multiply a weight $a_l^{(N)}$ to the output $X(l)$ ($0 \le l \le N - 1$) of the FFT network,

2. form the sum of weighted outputs, $S_1' = \sum_{l=0}^{N-1} a_l^{(N)} X(l)$,

3. predict the expected error-free output $S_1'$ by using $x(j)$; such a value is denoted by $S_1''$, where

$$S_1'' = \sum_{j=0}^{N-1} b_j^{(N)} x(j),$$

where $b_j^{(N)} = \sum_{k=0}^{N-1} a_k^{(N)} W_N^{kj}$ and $W_N = \exp(-j \frac{2\pi}{N})$,

4. compare $S_1'$ with $S_1''$. If $S_1' \neq S_1''$, then an error is indicated.
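The four steps of the algorithm can be sketched for arbitrary weights as follows. This is an illustration under stated assumptions rather than the authors' implementation: the weights `a` are passed in by the caller, the comparison uses a small tolerance `eta` instead of exact equality, and `reference_dft` is a direct DFT used only to produce test data. All function names are hypothetical.

```python
import cmath

def predicted_weights(a):
    """b_j = sum_k a_k * W_N^(k*j), with W_N = exp(-j*2*pi/N)."""
    n = len(a)
    return [sum(a[k] * cmath.exp(-2j * cmath.pi * k * j / n) for k in range(n))
            for j in range(n)]

def psced_error_indicated(x, X, a, eta=1e-9):
    """Return True when the two comparator inputs differ by more than eta."""
    s_out = sum(a_l * X_l for a_l, X_l in zip(a, X))        # steps 1 and 2
    b = predicted_weights(a)
    s_pred = sum(b_j * x_j for b_j, x_j in zip(b, x))       # step 3
    return abs(s_out - s_pred) > eta                        # step 4

def reference_dft(x):
    """Direct DFT used here in place of the butterfly network."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * l / n) for j in range(n))
            for l in range(n)]
```

With the 4-point weights of Example 1, `predicted_weights([1, -1, 1j, -1j])` gives approximately `[0, 2, 2+2j, -2j]`, matching the row vector in that example.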

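In the above algorithm, the critical part is to select the coefficients $a_l^{(N)}$, where $0 \le l \le N - 1$. We now provide a method to construct $a_i^{(N)}$. Let $a_i^{(N)} = W_N^{c_i^{(N)}}$, where $0 \le i \le N - 1$ and $c_i^{(N)}$ is recursively defined, starting from

$$c_0^{(2)} = 0 \quad \text{and} \quad c_1^{(2)} = 1.$$

[The recursive step for $N > 2$ is not recoverable from the source.]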

Example 2: The coefficients $a_i^{(2)}$, $a_i^{(4)}$, and $a_i^{(8)}$ for the 2-point FFT, 4-point FFT, and 8-point FFT are shown in Table I.

We now provide two properties of the coefficients $a_i^{(N)}$. By using these properties, the complexity of the decoder can be significantly reduced.

Lemma 1: $a_{2i}^{(N)} = -a_{2i+1}^{(N)}$, for $0 \le i \le \frac{N}{2} - 1$.

The proof of Lemma 1 can be found in Appendix I. By using the property given in Lemma 1, we can compute $a_{2i}^{(N)} X(2i) + a_{2i+1}^{(N)} X(2i+1)$ by using $a_{2i}^{(N)} [X(2i) - X(2i+1)]$. Hence, a complex multiplier is saved.
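The saving from Lemma 1 can be seen in a short sketch that evaluates the weighted sum both ways. It assumes only that the weights satisfy the pairing property of Lemma 1; the function names are hypothetical.

```python
def weighted_sum_full(a, X):
    """One complex multiplication per output: sum_l a_l * X(l)."""
    return sum(a[l] * X[l] for l in range(len(X)))

def weighted_sum_paired(a, X):
    """One complex multiplication per pair of outputs, using a_(2i+1) = -a_(2i)."""
    return sum(a[2 * i] * (X[2 * i] - X[2 * i + 1]) for i in range(len(X) // 2))
```

The paired form replaces one multiplication in every pair by a subtraction, so it uses $N/2$ multiplications instead of $N$ and yields the same value whenever the weights obey Lemma 1.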

Lemma 2: The following equations are valid:

[The equations of Lemma 2 are not recoverable from the source.]

for $0 \le i \le \frac{N}{8} - 1$, where $*$ denotes complex conjugation.

The proof of Lemma 2 can be found in Appendix I. By using the properties given in Lemmas 1 and 2, we obtain an expression [not recoverable from the source] whose multiplier constants are, respectively, the real and imaginary parts of a single coefficient $a_i^{(N)}$. Hence, instead of using four complex multipliers, only one complex multiplier is required. Since $a_i^{(N)} = a_i^{(N/2)}$ for $0 \le i \le (N/2) - 1$, Lemma 2 can be recursively applied to the coefficients $a_i$, where $0 \le i < (N/4) - 1$. As a result of applying Lemmas 1 and 2, $(N/4) + 1$ complex multipliers are required instead of $N$ complex multipliers in the decoder.

We now introduce a definition and then consider the fault detection capability of PSCED.

Definition: Line (0, 0) is the input line of the butterfly connected to $x(0)$ of the FFT network.

Theorem 1: Except the one at line (0, 0), all the errors in an $N$-point FFT network are detectable by using PSCED.

The proof of the theorem can be found in Appendix I.

B. The Simple CED for FFT Network

As stated in Theorem 1, the only undetectable error in PSCED for the FFT network is the one at line (0, 0). We use a SCED technique [6] to detect the fault at line (0, 0). Let us consider the butterfly whose inputs are connected to $x(0)$ and $x(\frac{N}{2})$, and let us denote its outputs by $x'(0)$ and $x'(\frac{N}{2})$, respectively, where

$$x'(0) = x(0) + x\left(\tfrac{N}{2}\right) \quad \text{and} \quad x'\left(\tfrac{N}{2}\right) = x(0) - x\left(\tfrac{N}{2}\right).$$

Let $S_2'$ and $S_2''$ be determined by the following equations:

$$S_2' = x'(0) + x'\left(\tfrac{N}{2}\right) \quad \text{and} \quad S_2'' = 2x(0).$$

We construct the SCED by simply feeding $S_2'$ and $S_2''$ into the inputs of a TSC comparator. Fig. 6 illustrates the SCED used in a 4-point FFT network and in an $N$-point FFT network. It can be seen that if an error occurs at line (0, 0), then it will be reflected in the input of the TSC comparator. Therefore, it is detectable by SCED.
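A minimal sketch of the SCED check follows, assuming the additive-error model on line (0, 0) and a small comparison tolerance `eta` in place of exact equality. The function name and parameters are hypothetical.

```python
def sced_error_indicated(x0, x_half, e=0j, eta=1e-9):
    """SCED check for the first-level butterfly fed by x(0) and x(N/2).

    e models an additive error on line (0, 0), i.e., on the butterfly input
    that carries x(0); the reference 2*x(0) is taken from the clean input.
    """
    line00 = x0 + e                  # value actually entering the butterfly
    out0 = line00 + x_half           # x'(0)
    out_half = line00 - x_half       # x'(N/2)
    s_out = out0 + out_half          # sum of the two butterfly outputs
    s_ref = 2 * x0                   # reference formed from the input
    return abs(s_out - s_ref) > eta
```

In the fault-free case the two comparator inputs agree, while an error `e` on line (0, 0) shows up as a difference of `2*e`, which is exactly the case PSCED cannot see.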

C. A Novel CED for FFT Networks

We combine the PSCED and SCED to construct the proposed CED scheme for an $N$-point FFT network.

Example 3: The proposed CED scheme for the 8-point FFT network is shown in Fig. 7. The $a_i^{(8)}$'s are constructed as given in Table I of Example 2, and the coefficient $b_j^{(8)}$ is constructed as follows:

$$b_j^{(8)} = \sum_{k=0}^{7} a_k^{(8)} W_8^{kj}.$$

Fig. 6. (a) SCED for a 4-point FFT network. (b) SCED for an $N$-point FFT network.

Theorem 2: All of the errors in an N-point FFT network are detectable in the proposed scheme.

Proof: It has been shown in Theorem 1 that, except the one at line (0, 0), all of the errors in an $N$-point FFT network are detectable by using PSCED. The error at line (0, 0) will be detected by using the SCED discussed in Section II-B. From the definition of a TSC checker, the output of a TSC checker is a codeword iff a codeword appears at its inputs. Therefore, if an error can be detected by both of these two schemes, e.g., the error at the first level which is associated with $x(\frac{N}{2})$, then an error is indicated at the outputs of the TSC checker because a noncodeword appears at its inputs. Q.E.D.

III. SYSTEM PERFORMANCE

It is known that there is an inherent limitation on the accuracy of processors because only a finite number of bits are available. As a result, round-off errors are unavoidable in FFT networks, and care must be taken to prevent overflow.

Fig. 7. The proposed CED for the 8-point FFT network.

The analysis of the round-off error has been discussed in detail in [11, ch. 9]. In [11], the mean-square value of the round-off error and the output noise-to-signal ratio are considered separately for fixed-point and floating-point arithmetic. Due to the existence of round-off errors, the inputs of the TSC comparator may not be equal to each other even if the FFT network is free of any functional error. Hence, when an error is indicated, a second execution is required to distinguish a functional error from a round-off error [6]. Since round-off errors always exist, the second execution is almost always necessary, and so the throughput is very low. One approach to solving this problem is to allow a small difference (threshold), $\eta$, between the inputs of the TSC comparator during the comparison [6]. By using this approach, we expect to obtain a high throughput and a high fault coverage (< 100%). However, there is a tradeoff in selecting $\eta$: a small $\eta$ would reduce the throughput of the FFT system, whereas a large $\eta$ would reduce the fault coverage. In this section, we discuss the throughput and fault coverage of the proposed scheme. Since the effect of round-off errors on PSCED is much more serious than that on SCED, the system performance of the proposed scheme (taking round-off errors into account) is basically determined by the PSCED. Similarly, since the round-off errors in the FFT network are dominant, we only consider the effect of round-off errors on $S_1'$, which is one input of the TSC comparator for PSCED. In this section, we mainly concentrate on the system performance of fixed-point systems; the system performance of floating-point systems will be briefly discussed in Section III-E. Moreover, we will compare the proposed scheme with the hardware redundancy scheme (JAH) proposed in [6] in terms of fault coverage.

A. Round-off Error Consideration

In fixed-point arithmetic, the mean-square value of the output noise at the $k$th DFT value due to the round-off error, $F(k)$, can be described by the following equation [11]:

$$E[|F(k)|^2] = \sigma_f^2 = (N - 1)\sigma_B^2 = (N - 1)(2^{-2b}/3) \tag{2}$$

where
$\sigma_B^2$: the variance of the complex noise of each butterfly computation,
$\sigma_f^2$: the mean-square value of the output noise due to the round-off error,
$b$: the word length.

In addition, as discussed in [6] and [11], $F(0), F(1), \ldots,$ and $F(N - 1)$ are considered as random variables with mean 0 and variance $\sigma_f^2$ that are statistically independent of each other. Since the encoder and the decoder in the PSCED are in parallel with the FFT network as shown in Fig. 2, the mean-square value of the output noise due to the round-off error of the proposed scheme, $E[|F(k)|^2]$, remains the same as that given in (2). However, in [6], the encoder and the decoder are cascaded at the input and output of the FFT network, respectively, and so the encoder/decoder will change the mean-square value of the output noise due to round-off errors. The mean-square value of the output noise in the $k$th DFT value in [6], $F_{ja}(k)$, can be described as

[Equation (3) is not recoverable from the source.]

Note that the divisor 3 is introduced in the above equation because of the attenuation by the decoding multipliers. The total round-off error appearing at the input of the TSC comparator in PSCED¹ can be represented as

$$F = \sum_{k=0}^{N-1} F(k). \tag{4}$$

When $N$ is very large, by the central limit theorem [12], we can consider $F$ as a random variable with normal distribution. In addition, the mean and the variance of $F$ are equal to 0 and $N\sigma_f^2$, respectively. Similarly, the total round-off error appearing at the input of the TSC comparator in JAH can be represented as

$$F_{ja} = \sum_{k=0}^{N-1} F_{ja}(k).$$

In the next two subsections, we analyze how $F$ ($F_{ja}$) affects the throughput and fault coverage of an FFT network, where throughput is defined as the reciprocal of the average number of tries per job (if an error is indicated, then two tries are required) and fault coverage is the probability of a functional error being detected.

¹ Although the multipliers in the decoder also contribute round-off error, this effect can be ignored compared to that in the FFT network because round-off errors in the FFT network accumulate at each level.

Fig. 8. The throughput of the FFT networks.
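The round-off statistics used in this subsection can be evaluated with a small sketch. It assumes the fixed-point expression of (2), with per-butterfly variance $2^{-2b}/3$, and the stated variance $N\sigma_f^2$ of the total round-off error $F$. The function names are hypothetical.

```python
import math

def butterfly_noise_variance(b):
    """Variance of the complex round-off noise of one butterfly, 2^(-2b)/3."""
    return 2.0 ** (-2 * b) / 3.0

def output_noise_variance(n, b):
    """Mean-square output noise per DFT value, (N - 1) * 2^(-2b)/3, as in (2)."""
    return (n - 1) * butterfly_noise_variance(b)

def total_roundoff_std(n, b):
    """Standard deviation of the total round-off error F, sqrt(N) * sigma_f."""
    return math.sqrt(n * output_noise_variance(n, b))
```

For instance, `total_roundoff_std(1024, 16)` gives the scale against which a threshold for a 16-bit 1024-point network would be chosen; thresholds in this section are expressed as multiples of this quantity.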

B. Throughputs

Since both round-off errors and functional errors can trigger an alarm, it is necessary to use a retry (second execution) to distinguish these two types of errors. Hence, a large round-off error will degrade the system performance because two executions are required. The degradation occurs when $|F| > \eta$. The average throughput of the FFT system with CED capability is defined as in [6, p. 555] as follows:

$$\begin{aligned}
T(\eta, N, \sigma_f) &= 1 / \text{average tries per job} \\
&= 1 / [1 \cdot \Pr(|F| \le \eta) + 2 \cdot \Pr(|F| > \eta)] \\
&= 1 / [2 - \Pr(|F| \le \eta)] \\
&= 1 \Big/ \left[ 3 - 2\Pr\left( \frac{F}{\sqrt{N}\sigma_f} \le \frac{\eta}{\sqrt{N}\sigma_f} \right) \right].
\end{aligned}$$

Since the random variable $F$ is normal, $T(\eta, N, \sigma_f)$ can also be represented by using the error function [12] as follows:

[Equation not recoverable from the source.]

The relationship between $\eta$ and the throughput of the system is shown in Fig. 8. When an error is indicated, a second execution is required, and the algorithm for distinguishing a round-off error from a functional error is similar to that given in [6].
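The throughput expression can be evaluated numerically with the standard-library error function. This is a sketch under the assumption stated in the text that $F$ is treated as a real zero-mean normal variable with standard deviation $\sqrt{N}\sigma_f$; the form written with `math.erf` is one possible way to express it and is not necessarily the form printed in the paper. The function name is hypothetical.

```python
import math

def throughput(eta, n, sigma_f):
    """T = 1 / (2 - Pr(|F| <= eta)) for F ~ Normal(0, N * sigma_f^2)."""
    z = eta / (math.sqrt(n) * sigma_f)          # threshold in standard deviations
    p_within = math.erf(z / math.sqrt(2.0))     # Pr(|Z| <= z) for a standard normal Z
    return 1.0 / (2.0 - p_within)
```

Setting the threshold to one, two, and three times $\sqrt{N}\sigma_f$ gives values that agree with the throughput column of Table III (0.759, 0.957, 0.997) to within about 0.001. The result depends only on the ratio $\eta / (\sqrt{N}\sigma_f)$.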

C. Fault Coverage

Since the functional error in a butterfly, $e$, is modeled as additive white noise at its input or its output, it is assumed as in [6] that the real and imaginary parts of the noise are independent and that their amplitudes are uniformly distributed between $-1$ and $1$ (in a fixed-point system). The density function of the functional error is shown in Fig. 9(a). In an $N$-point FFT network, there are $(N/2) \times \log_2 N$ butterflies, and the error in the faulty butterfly may appear either at its inputs or at its outputs. Hence, there exist $N \times (\log_2 N + 1)$ distinct lines where a functional error can occur. Let us use a two-tuple $(i, j)$ to denote the $j$th line of the $i$th level, where $0 \le i \le \log_2 N$ and $0 \le j \le N - 1$. If a functional error, $e$, occurs at $(i, j)$, it will be amplified by the succeeding butterflies. The functional error appearing at the input of the TSC comparator will be equal to $e \times S_j^i$, where $S_j^i$ is the transfer function from the error site to the input of the TSC comparator. Note that, given a two-tuple $(i, j)$, $S_j^i$ can be computed by (I-16) in Appendix I.

TABLE II
THE DISTRIBUTIONS OF THE $|S_j^i|$'S AND $|S_{j(ja)}^i|$'S

Range              N = 256         N = 512         N = 1024        N = 2048
                   tao    jou      tao    jou      tao    jou      tao    jou
(10^2, 10^1]       0.072  0.010    0.086  0.009    0.098  0.009    0.101  0.008
(10^1, 10^0]       0.505  0.237    0.469  0.216    0.430  0.198    0.406  0.183
(10^0, 10^-1]      0.263  0.323    0.225  0.294    0.195  0.270    0.173  0.249
(10^-1, 10^-2]     0.144  0.061    0.179  0.056    0.192  0.052    0.176  0.048
(10^-2, 10^-3]     0.011  0.036    0.036  0.034    0.074  0.031    0.118  0.029
(10^-3, 10^-4]     0.000  0.042    0.000  0.039    0.008  0.036    0.019  0.033
(10^-4, 10^-5]     0.000  0.018    0.000  0.016    0.000  0.016    0.003  0.015
(10^-5, 10^-6]     0.000  0.018    0.000  0.016    0.000  0.016    0.000  0.015
(10^-6, 10^-7]     0.000  0.023    0.000  0.022    0.000  0.020    0.000  0.019
(10^-7, 10^-8]     0.000  0.018    0.000  0.016    0.000  0.016    0.000  0.015
(10^-8, 10^-9]     0.000  0.012    0.000  0.011    0.000  0.010    0.000  0.009
(10^-9, 0)         0.000  0.202    0.000  0.268    0.000  0.326    0.000  0.377

Fig. 9. (a) The distribution function of the functional error: $f(R, I) = 1/4$ for $|R| \le 1$ and $|I| \le 1$, and $f(R, I) = 0$ otherwise. (b) The shaded area $= 1 - \Pr(|e| \le \eta)$, where $\eta \le 1$. (c) The shaded area $= 1 - \Pr(|e| \le \eta)$, where $1 \le \eta \le \sqrt{2}$.

Example 4: Let us consider the PSCED for the 4-point FFT network shown in Fig. 5. According to the $(i, j)$ notation introduced above, we label some lines in Fig. 5. By using (I-16), we obtain

[The values obtained in Example 4 are not recoverable from the source.]

Since round-off errors always exist, the total error (functional error and round-off errors) appearing at $S_1'$ will be equal to $e \times S_j^i + F$. When a functional error occurs, it will be detected iff $|e \times S_j^i + F| > \eta$. The fault coverage for the error occurring at a line $(i, j)$, $C_L(\eta, N, i, j)$, can be computed as follows:

$$C_L(\eta, N, i, j) = \Pr(\eta < |e \times S_j^i + F|) = 1 - \Pr(|e \times S_j^i + F| \le \eta). \tag{5}$$

In an $N$-point FFT network, there are $N \times (\log_2 N + 1)$ $S_j^i$'s. Each of the $S_j^i$'s could have a different value as shown in Example 4, which will result in different fault coverages. Based on (I-16) given in Appendix I, we calculate all of the $|S_j^i|$'s for $4 \le N \le 2048$. The distributions of the $|S_j^i|$'s in different ranges for $N$ = 256, 512, 1024, and 2048 are shown in Table II. For example, when $N$ = 1024, 19.5% of the $|S_j^i|$'s are in the range of [0.1, 1.0), whereas 7.4% of the $|S_j^i|$'s are in the range of [0.001, 0.01).

We have assumed that at most one butterfly can be faulty, that each butterfly is equally likely to be faulty, and that the error can occur either at its input or at its output. Hence, there exist $(N/2) \times \log_2 N$ butterflies and $4 \times (N/2) \times \log_2 N = 2 \times N \times \log_2 N$ distinct modeled functional errors. In addition, in an $N$-point FFT network, there exist $N \times (\log_2 N + 1)$ lines where a functional error can occur. On the input (output) lines of the first (last) level butterflies, only one modeled error can occur, whereas two modeled functional errors can occur on the other lines. Thus,

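$$\Pr(\text{the error occurs at } (i, j)) =
\begin{cases}
\dfrac{1}{2 \times N \times \log_2 N} & \text{for } i = 0 \text{ or } i = \log_2 N, \\[2ex]
\dfrac{1}{N \times \log_2 N} & \text{otherwise.}
\end{cases} \tag{6}$$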
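The line probabilities in (6) can be checked with a short sketch that confirms they sum to one over all $N \times (\log_2 N + 1)$ lines. The function name and its `levels` parameter (standing for $\log_2 N$) are hypothetical.

```python
def line_error_probability(i, n, levels):
    """Pr(the error occurs at line (i, j)) as in (6); levels stands for log2(N).

    Boundary lines (i = 0 or i = levels) carry one modeled error each, all
    other lines carry two, out of 2 * N * log2(N) equally likely errors.
    """
    if i == 0 or i == levels:
        return 1.0 / (2 * n * levels)
    return 1.0 / (n * levels)
```

For an 8-point network (`n = 8`, `levels = 3`), the 16 boundary lines contribute a total of 1/3 and the 16 interior lines contribute 2/3.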

The fault coverage of the FFT network, $C_{FN}(\eta, N)$, can be calculated as follows:

$$\begin{aligned}
C_{FN}(\eta, N) &= 1 - \Pr(\text{the error being masked}) \\
&= 1 - \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} \{ \Pr(\text{the error being masked} \mid \text{the error occurs at } (i, j)) \\
&\qquad\qquad \times \Pr(\text{the error occurs at } (i, j)) \} \\
&= 1 - \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} \Pr(|e \times S_j^i + F| \le \eta) \times \Pr(\text{the error occurs at } (i, j)).
\end{aligned} \tag{7}$$

TABLE III
FAULT COVERAGES OF 16-BIT FFT NETWORKS ASSUMING $F = 0$

Threshold                      Throughput   N = 256        N = 512        N = 1024       N = 2048
                                            tao    jou     tao    jou     tao    jou     tao    jou
$\eta = \sqrt{N}\sigma_f$      0.759        0.996  0.676   0.968  0.601   0.888  0.537   0.773  0.482
$\eta = 2\sqrt{N}\sigma_f$     0.957        0.988  0.663   0.941  0.589   0.827  0.525   0.717  0.468
$\eta = 3\sqrt{N}\sigma_f$     0.997        0.982  0.655   0.916  0.581   0.793  0.516   0.686  0.456

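From (6) and (7), we obtain

$$
\begin{aligned}
CFN(\eta, N) ={}& \frac{1}{2 \times N \times \log_2 N} \sum_{j=0}^{N-1} \Pr(|e \times S_j^0 + F| > \eta)
+ \frac{1}{2 \times N \times \log_2 N} \sum_{j=0}^{N-1} \Pr(|e \times S_j^{\log_2 N} + F| > \eta) \\
&+ \frac{1}{N \times \log_2 N} \sum_{i=1}^{\log_2 N - 1} \sum_{j=0}^{N-1} \Pr(|e \times S_j^i + F| > \eta). \qquad (8)
\end{aligned}
$$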

Although the fault coverage of the FFT network can be calculated by using (8), it is rather difficult to determine exact values because both the functional error and the round-off error (e and F) are complex random variables with different distributions. In Section III-C1, we first calculate the fault coverages by assuming F = 0. When F = 0, the exact values of the fault coverages can be computed. In Section III-C2, we then use these results to calculate a lower bound and an upper bound of the fault coverages by taking the effect of F into account.

1) Fault Coverages when F = 0: In this case, (5) can be simplified as follows:

$$
CL(\eta, N, i, j)_{F=0} = \Pr(|e \times S_j^i| > \eta).
$$

Since e is assumed to be uniformly distributed within a square as shown in Fig. 9(a), if $S_j^i = 1$ and $\eta \le 1$, then $CL(\eta, N, i, j)_{F=0}$ is equal to the shaded area as shown in Fig. 9(b). If $S_j^i = 1$ and $1 < \eta \le \sqrt{2}$, $CL(\eta, N, i, j)_{F=0}$ is equal to the shaded area as shown in Fig. 9(c). For an arbitrary $S_j^i$ ($S_j^i \ne 0$), $CL(\eta, N, i, j)_{F=0}$ can be calculated as follows:

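$$
CL(\eta, N, i, j)_{F=0} =
\begin{cases}
1 - \dfrac{\pi}{4} \left( \dfrac{\eta}{|S_j^i|} \right)^2 & \text{if } \dfrac{\eta}{|S_j^i|} \le 1, \\[2ex]
1 - \left[ \dfrac{\pi}{4} \left( \dfrac{\eta}{|S_j^i|} \right)^2 - \left( \dfrac{\eta}{|S_j^i|} \right)^2 \arccos \dfrac{|S_j^i|}{\eta} + \sqrt{\left( \dfrac{\eta}{|S_j^i|} \right)^2 - 1} \right] & \text{if } 1 < \dfrac{\eta}{|S_j^i|} \le \sqrt{2}, \\[2ex]
0 & \text{if } \dfrac{\eta}{|S_j^i|} > \sqrt{2}.
\end{cases}
\qquad (9)
$$

$$
CFN(\eta, N)_{F=0} = \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} CL(\eta, N, i, j)_{F=0} \times \Pr(\text{the error occurs at } (i, j)). \qquad (10)
$$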
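The F = 0 line coverage can be evaluated in closed form once the distribution of e is fixed. The sketch below is not taken from the paper. It assumes that e is uniform over the square whose real and imaginary parts lie in [-1, 1], which is an inference from the conditions $\eta \le 1$ and $1 < \eta \le \sqrt{2}$ above. It computes the masked probability as the area of a disc of radius $\eta / |S_j^i|$ clipped to that square. The bin fractions in `BINS_1024` are the N = 1024 values quoted from Table II in Example 5; their assignment to decades is a reading of that example, and all names are hypothetical.

```python
import math


def masked_probability(eta: float, s_abs: float) -> float:
    """Pr(|e| * |S| <= eta) for e uniform on the square [-1, 1] x [-1, 1] (assumed)."""
    if s_abs == 0.0:
        return 1.0                      # the error never reaches the comparator
    r = eta / s_abs                     # radius of the "masked" disc in the e-plane
    if r <= 1.0:
        area = math.pi * r * r          # disc lies entirely inside the square
    elif r < math.sqrt(2.0):
        # Remove the four circular caps that stick out beyond the square's sides.
        cap = r * r * math.acos(1.0 / r) - math.sqrt(r * r - 1.0)
        area = math.pi * r * r - 4.0 * cap
    else:
        area = 4.0                      # disc covers the whole square
    return area / 4.0                   # the square has area 4


def line_coverage(eta: float, s_abs: float) -> float:
    """Fault coverage of one line when F = 0."""
    return 1.0 - masked_probability(eta, s_abs)


# Decade exponent k -> fraction of lines with 10^k <= |S| < 10^(k+1), N = 1024.
# The fractions sum to 0.997; the remainder is left unassigned, as in Example 5.
BINS_1024 = {1: 0.098, 0: 0.430, -1: 0.195, -2: 0.192, -3: 0.074, -4: 0.008}


def coverage_bounds(eta: float, bins: dict) -> tuple:
    """Bound the coverage by evaluating each decade at its smallest / largest |S|."""
    lower = 1.0 - sum(p * masked_probability(eta, 10.0 ** k) for k, p in bins.items())
    upper = 1.0 - sum(p * masked_probability(eta, 10.0 ** (k + 1)) for k, p in bins.items())
    return lower, upper


if __name__ == "__main__":
    print(masked_probability(0.027, 0.1))
    print(coverage_bounds(0.027, BINS_1024))
```

Under these assumptions the masked probability for $\eta = 0.027$ and $|S_j^i| = 10^{-1}$ comes out close to the 0.0573 used in Example 5, and the two bounds land near the 0.715 and 0.907 reported there.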

As discussed above, we have calculated all $S_j^i$ for $4 \le N \le 2048$. Given an $\eta$ and N, the fault coverages of the proposed scheme can be determined by using (9) and (10). We have calculated the fault coverages of the proposed scheme for word lengths of 16 and 32 bits, which are shown in Table III and Table IV, respectively. For example, when the proposed technique is applied to a 16-bit 1024-point FFT network with a throughput of 95.7%, the fault coverage of the system, assuming F = 0, is equal to 82.7%, as shown in Table III.

We use the same method to calculate the distribution of $|S_j^i|$ in JAH, which is denoted by $|S_j^i(ja)|$, in different ranges for N = 256, 512, 1024, and 2048. The distributions of the $|S_j^i(ja)|$'s are also shown in Table II. From Table II, it can be seen that a large percentage of the $|S_j^i(ja)|$'s are below lo-'. Since $\eta$ is selected in such a way that round-off errors alone will rarely generate an alarm, if $|S_j^i(ja)|$ is very small, then a fault at line (i, j) will not be able to generate an alarm. As a result, the fault coverage of the hardware redundancy scheme proposed in [6] should be much lower than the fault coverage of the proposed scheme under the same throughput. In order to make comparable analyses, we employ the method described above to calculate the fault coverage of the hardware redundancy scheme proposed in [6]. The fault coverages are also shown in Tables III and IV. In what follows, we provide an example to illustrate why these two different schemes have different fault coverages under the same throughput.

TABLE IV
FAULT COVERAGES OF 32-BIT FFT NETWORKS ASSUMING F = 0

| $\eta$ | Throu. | N = 256, tao | N = 256, jou | N = 512, tao | N = 512, jou | N = 1024, tao | N = 1024, jou | N = 2048, tao | N = 2048, jou |
|---|---|---|---|---|---|---|---|---|---|
| $\sqrt{N}\sigma_f$ | 0.759 | 0.999 | 0.801 | 0.999 | 0.725 | 0.999 | 0.658 | 0.999 | 0.600 |
| $2\sqrt{N}\sigma_f$ | 0.957 | 0.999 | 0.795 | 0.999 | 0.719 | 0.999 | 0.652 | 0.999 | 0.595 |
| $3\sqrt{N}\sigma_f$ | 0.997 | 0.999 | 0.791 | 0.999 | 0.715 | 0.999 | 0.649 | 0.999 | 0.592 |

Example 5: Let us consider the 16-bit 1024-point FFT network with the CED capability. If we select $\eta$ to be equal to $3\sqrt{N}\sigma_f$, then the throughput of the system will be 99.8%. By using Table II, we calculate a lower bound and an upper bound of the fault coverages when F = 0 as follows:

Case 1: The system is constructed by using the proposed scheme, in which fault coverage is denoted by $CFN^{th}(\eta, N)_{F=0}$.

For a 16-bit system, $\eta = 3\sqrt{N}\sigma_f = 0.027$.

$$
\begin{aligned}
CFN^{th}(0.027, 1024)_{F=0}
&= 1 - \sum_{i} \sum_{j} \Pr(|e||S_j^i| \le \eta) \times \Pr(\text{the error occurs at } (i, j)) \\
&= 1 - \sum_{k=-5}^{1} \Pr(10^k \le |S_j^i| < 10^{k+1}) \times \Pr(|e||S_j^i| \le 0.027 \mid 10^k \le |S_j^i| < 10^{k+1}) \\
&> 1 - \sum_{k=-5}^{1} \Pr(10^k \le |S_j^i| < 10^{k+1}) \times \Pr(|e||S_j^i| \le 0.027 \mid |S_j^i| = 10^k).
\end{aligned}
$$

From (9), we obtain

| $|S_j^i|$ | $10^0$ or larger | $10^{-1}$ | $10^{-2}$ or smaller |
|---|---|---|---|
| $\Pr(|e||S_j^i| \le 0.027)$ | 0 | 0.0573 | 1.0 |

Hence,

$$
\begin{aligned}
CFN^{th}(0.027, 1024)_{F=0}
&> 1 - (0.098 + 0.430) \times 0 - 0.195 \times 0.0573 - (0.192 + 0.074 + 0.008) \times 1.0 \\
&= 1 - (0.011 + 0.192 + 0.074 + 0.008) \\
&= 1 - 0.285 \\
&= 0.715.
\end{aligned}
$$

Similarly,

$$
\begin{aligned}
CFN^{th}(0.027, 1024)_{F=0}
&< 1 - \sum_{k=-5}^{1} \Pr(10^k \le |S_j^i| < 10^{k+1}) \times \Pr(|e||S_j^i| \le 0.027 \mid |S_j^i| = 10^{k+1}) \\
&= 1 - (0.098 + 0.430 + 0.195) \times 0 - 0.192 \times 0.0573 - (0.074 + 0.008) \times 1.0 \\
&= 1 - (0.011 + 0.074 + 0.008) \\
&= 1 - 0.093 \\
&= 0.907.
\end{aligned}
$$

Therefore, $0.715 < CFN^{th}(0.027, 1024)_{F=0} < 0.907$.

Case 2: The system is constructed by using the hardware redundancy scheme in [6], in which fault coverage is denoted by $CFN^{ja}(\eta, N)_{F=0}$.

For a 16-bit system, $\eta = 3\sqrt{N}\sigma_f = 0.0156$.²

$$
\begin{aligned}
CFN^{ja}(0.0156, 1024)_{F=0}
&= 1 - \sum_{i} \sum_{j} \Pr(|e||S_j^i(ja)| \le \eta) \times \Pr(\text{the error occurs at } (i, j)) \\
&> 1 - \sum_{k=-9}^{1} \Pr(10^k \le |S_j^i(ja)| < 10^{k+1}) \times \Pr(|e||S_j^i(ja)| \le 0.0156 \mid |S_j^i(ja)| = 10^k) \\
&\quad - \Pr(0 < |S_j^i(ja)| < 10^{-9}) \times \Pr(|e||S_j^i(ja)| \le 0.0156 \mid 0 < |S_j^i(ja)| < 10^{-9}),
\end{aligned}
$$

so that

$$
\begin{aligned}
CFN^{ja}(0.0156, 1024)_{F=0}
&> 1 - (0.009 + 0.198) \times 0 - 0.270 \times 0.019 \\
&\quad - (0.052 + 0.031 + 0.036 + 0.016 + 0.016 + 0.020 + 0.016 + 0.010 + 0.326) \times 1.0 \\
&= 1 - (0.005 + 0.523) \\
&= 1 - 0.528 \\
&= 0.472.
\end{aligned}
$$

Similarly, we have

$$
\begin{aligned}
CFN^{ja}(0.0156, 1024)_{F=0}
&= 1 - \sum_{i} \sum_{j} \Pr(|e||S_j^i(ja)| \le \eta) \times \Pr(\text{the error occurs at } (i, j)) \\
&< 1 - \sum_{k=-9}^{1} \Pr(10^k \le |S_j^i(ja)| < 10^{k+1}) \times \Pr(|e||S_j^i(ja)| \le 0.0156 \mid |S_j^i(ja)| = 10^{k+1}) \\
&\quad - \Pr(0 < |S_j^i(ja)| < 10^{-9}) \times \Pr(|e||S_j^i(ja)| \le 0.0156 \mid 0 < |S_j^i(ja)| < 10^{-9}) \\
&= 1 - (0.009 + 0.198 + 0.270) \times 0 - 0.052 \times 0.019 \\
&\quad - (0.031 + 0.036 + 0.016 + 0.016 + 0.020 + 0.016 + 0.010 + 0.326) \times 1.0 \\
&= 1 - (0.001 + 0.471) \\
&= 1 - 0.472 \\
&= 0.528.
\end{aligned}
$$

Therefore, $0.472 < CFN^{ja}(0.0156, 1024)_{F=0} < 0.528$. In this case, approximately 45% of the $|S_j^i(ja)|$'s are below $\eta / \sqrt{2} = 0.011$. Thus the functional errors occurring at these 45% of the lines are treated as small round-off errors, i.e., they are simply masked, although many of these errors will produce erroneous outputs. □

²Since the encoding/decoding scheme proposed in [6] is different from the proposed one, the $\sigma_f$'s in these two schemes are different. However, the throughputs of these two systems are the same.

2) Fault Coverages by Taking the Round-Off Errors into Account: As discussed above, it is extremely complicated to find the precise value of $CL(\eta, N, i, j)$ by using (5) because e and F are complex random variables with different distributions. Thus, we provide a method for calculating an upper bound and a lower bound of the fault coverage for a given (i, j), and an upper bound and a lower bound of the fault coverage for FFT systems.

Theorem 3: Upper bound and lower bound on the fault coverage for a given line (i, j), denoted respectively by $CL_{upper}(\eta, N, i, j)$ and $CL_{lower}(\eta, N, i, j)$, are given as follows:

$$
\begin{aligned}
CL_{upper}(\eta, N, i, j) ={}& \Pr(\eta_h < |e \times S_j^i|) + \{0.01 + \Pr(\eta + \eta_l < |F|)\} \cdot \Pr(|e \times S_j^i| < \eta_l) \\
&+ \Pr(\eta_l \le |e \times S_j^i| \le \eta_h), \\
CL_{lower}(\eta, N, i, j) ={}& 0.9973 \times \Pr(\eta_h < |e \times S_j^i|) + \Pr(\eta + \eta_l < |F|) \times \Pr(|e \times S_j^i| < \eta_l),
\end{aligned}
$$

where $\eta_h = \eta + 3\sqrt{N} \times \sigma_f$ and $\eta_l = 0.01 \times \eta$.

The proof of the theorem can be found in Appendix II.

By Theorem 3 and (9), we obtain an upper bound and a lower bound of the fault coverage for the FFT networks as follows:

$$
CFN_{upper}(\eta, N) = \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} CL_{upper}(\eta, N, i, j) \times \Pr(\text{the error occurs at } (i, j)), \qquad (11)
$$

$$
CFN_{lower}(\eta, N) = \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} CL_{lower}(\eta, N, i, j) \times \Pr(\text{the error occurs at } (i, j)). \qquad (12)
$$

For a given $\eta$ and N, by using (9), (11), (12), and Theorem 3, upper bounds and lower bounds of the fault coverage for the FFT networks can be determined. We have calculated the bounds of the fault coverage for the FFT networks, which are given in Table V and Table VI. We also calculate upper bounds and lower bounds for JAH. Since a large percentage of the $|S_j^i(ja)|$'s are less than $2^{-b}$, where b is the word length, we consider that $CL^{ja}_{upper}(\eta, N, i, j) = CL^{ja}_{lower}(\eta, N, i, j) = 0$ when $|S_j^i(ja)| < 2^{-b}$. The upper bounds and lower bounds for JAH are also shown in Table V and Table VI. From Table V and Table VI, it can be seen that the lower bounds of the fault coverages of the proposed scheme are much higher than the upper bounds of the fault coverages of JAH under the same throughput.

TABLE V
UPPER AND LOWER BOUNDS OF FAULT COVERAGES OF 16-BIT FFT NETWORKS

| $\eta$ | Throu. | Bound | N = 256, tao | N = 256, jou | N = 512, tao | N = 512, jou | N = 1024, tao | N = 1024, jou | N = 2048, tao | N = 2048, jou |
|---|---|---|---|---|---|---|---|---|---|---|
| $\sqrt{N}\sigma_f$ | 0.759 | upper | 0.999 | 0.740 | 0.999 | 0.671 | 0.999 | 0.609 | 0.995 | 0.556 |
| $\sqrt{N}\sigma_f$ | 0.759 | lower | 0.974 | 0.650 | 0.894 | 0.577 | 0.769 | 0.513 | 0.669 | 0.451 |
| $2\sqrt{N}\sigma_f$ | 0.957 | upper | 0.999 | 0.736 | 0.999 | 0.664 | 0.999 | 0.600 | 0.986 | 0.544 |
| $2\sqrt{N}\sigma_f$ | 0.957 | lower | 0.967 | 0.646 | 0.873 | 0.572 | 0.750 | 0.504 | 0.651 | 0.441 |
| $3\sqrt{N}\sigma_f$ | 0.997 | upper | 0.999 | 0.732 | 0.999 | 0.659 | 0.997 | 0.594 | 0.979 | 0.536 |
| $3\sqrt{N}\sigma_f$ | 0.997 | lower | 0.959 | 0.641 | 0.855 | 0.566 | 0.733 | 0.496 | 0.637 | 0.432 |

TABLE VI
UPPER AND LOWER BOUNDS OF FAULT COVERAGES OF 32-BIT FFT NETWORKS

| $\eta$ | Throu. | Bound | N = 256, tao | N = 256, jou | N = 512, tao | N = 512, jou | N = 1024, tao | N = 1024, jou | N = 2048, tao | N = 2048, jou |
|---|---|---|---|---|---|---|---|---|---|---|
| $\sqrt{N}\sigma_f$ | 0.759 | upper | 0.999 | 0.829 | 0.999 | 0.756 | 0.999 | 0.690 | 0.999 | 0.633 |
| $\sqrt{N}\sigma_f$ | 0.759 | lower | 0.999 | 0.789 | 0.999 | 0.714 | 0.999 | 0.649 | 0.999 | 0.592 |
| $2\sqrt{N}\sigma_f$ | 0.957 | upper | 0.999 | 0.826 | 0.999 | 0.752 | 0.999 | 0.686 | 0.999 | 0.628 |
| $2\sqrt{N}\sigma_f$ | 0.957 | lower | 0.999 | 0.787 | 0.999 | 0.711 | 0.999 | 0.645 | 0.999 | 0.589 |
| $3\sqrt{N}\sigma_f$ | 0.997 | upper | 0.999 | 0.824 | 0.999 | 0.750 | 0.999 | 0.684 | 0.999 | 0.626 |
| $3\sqrt{N}\sigma_f$ | 0.997 | lower | 0.999 | 0.785 | 0.999 | 0.710 | 0.999 | 0.643 | 0.999 | 0.586 |

Fig. 10. An attenuate butterfly.

It should be mentioned that the fault coverage of JAH is also considered in [6]. In [6], G, the total complex noise appearing at the input of the TSC comparator due to the functional error, is considered as follows:

$$
G = \sum_{k=0}^{N-1} G(k),
$$

where G(k) is the complex noise at the kth DFT value caused by the functional error. In [6], the fault coverages are calculated based on G, and G is calculated under the assumption that "G(k)'s are random variables with mean 0 and variance $\sigma_g^2$ and statistically independent of each other." If we consider many functional errors during a computation, then "G(k)'s are statistically independent of each other." However, since it is assumed in [6] that only one functional fault occurs at a time, the G(k)'s are not independent, as shown in Appendix III. As a result, the fault coverages given in [6] are not accurate.
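The dependence among the G(k)'s under a single fault can be illustrated with a small experiment. The sketch below is an assumption-laden illustration, not the model of [6]: it takes the simplest case of one additive error e on input sample n of a plain N-point DFT, for which linearity gives an output error of $e \times W_N^{kn}$ at bin k. Every G(k) is then a fixed multiple of the same e, and in particular $G(N/2) = (-1)^n G(0)$. The function name is hypothetical.

```python
import cmath


def output_errors(n_points: int, position: int, error: complex) -> list:
    """Error seen at each DFT output when `error` is added to input sample `position`.

    By linearity of the DFT, bin k sees error * W_N^(k * position),
    with W_N = exp(-2j * pi / N).
    """
    w = cmath.exp(-2j * cmath.pi / n_points)
    return [error * w ** (k * position) for k in range(n_points)]


if __name__ == "__main__":
    g = output_errors(8, 3, 0.5 + 0.25j)
    # For an odd input position, the error at bin N/2 is the negative of the error at bin 0.
    print(g[0], g[4])
```

Because G(N/2) is fully determined by G(0) in this setting, the two cannot be statistically independent unless both are constant.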

D. Reducing the Round-Off Errors by Using Alternative Scaling

An alternative scaling procedure for reducing the effect of round-off errors is suggested in [16]. In [16], the butterfly is modified as shown in Fig. 10. It can be seen that a scaling by 1/2 is introduced at the inputs of each butterfly. By incorporating these butterflies, the mean-square value of the output noise in the kth DFT value, F(k), can be reduced to

In this subsection, we denote the FFT networks using the butterflies shown in Fig. 10 by $FFT_{att}$, and we consider the fault coverage of the $FFT_{att}$ with CED capability. If we apply the proposed scheme to $FFT_{att}$, the $\eta$ will be reduced from k m , , / m to $\eta_{att}$ = k f l , / w ; whereas if we apply JAH to $FFT_{att}$, then $\eta$ will be reduced from k m , , / w to $\eta_{att}$ = k m , , / w . Since $\eta_{att}$ is much smaller than $\eta$, it is conjectured in [6] that $FFT_{att}$ has better fault coverage. However, since functional errors will be attenuated simultaneously with round-off errors, these functional errors could be masked, and so it is worthwhile to evaluate the fault coverage of the $FFT_{att}$.

The analysis of the fault coverages for $FFT_{att}$ proceeds as in Section III-C. We will first calculate the fault coverages of the $FFT_{att}$ by assuming F = 0. We then provide an upper bound and a lower bound of the fault coverages for the $FFT_{att}$.

1) Fault Coverages of the $FFT_{att}$ when F = 0: Let us consider an N-point $FFT_{att}$ network, and let us apply the proposed scheme to construct an $FFT_{att}$ network with CED capability. Due to the attenuation at each stage, if a functional error, e, occurs at the input of a butterfly at the first level, then the error appearing at the input of the TSC comparator will be equal to

$$
e \times S_j^0 \times \left( \frac{1}{2} \right)^{\log_2 N} = \frac{e \times S_j^0}{N}.
$$

This error is detectable iff

$$
\left| \frac{e \times S_j^0}{N} \right| > \eta_{att}.
$$

Similarly, a functional error, e, occurring at the input of the (i + 1)th ($1 \le i \le \log_2 N$) level is detectable iff

$$
\left| e \times \frac{1}{2^{\log_2 N - i}} \times S_j^i \right| > \eta_{att}.
$$

For a given (i, j), the fault coverage for the error occurring at (i, j) can be calculated by using (14) instead of using (9).
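The effect of the per-stage scaling on a functional error can be sketched as follows. This is a minimal illustration under an explicit assumption: an error entering at level i passes through the remaining $\log_2 N - i$ butterfly stages and is halved once per stage. The function names and the detectability test are hypothetical, and the threshold `eta_att` is simply passed in rather than derived.

```python
def attenuation_factor(level: int, n_points: int) -> float:
    """Factor applied to an error entering at `level` (0 = first-level inputs).

    Assumes one halving per remaining stage and n_points a power of two.
    """
    stages = n_points.bit_length() - 1      # log2(N)
    return 0.5 ** (stages - level)


def is_detectable(e_abs: float, s_abs: float, level: int,
                  n_points: int, eta_att: float) -> bool:
    """True when the attenuated error at the comparator exceeds the threshold (F = 0)."""
    return e_abs * s_abs * attenuation_factor(level, n_points) > eta_att


if __name__ == "__main__":
    print(attenuation_factor(0, 1024))      # error at a first-level input
    print(attenuation_factor(10, 1024))     # error at a network output
```

An error injected near the inputs is attenuated as strongly as the round-off noise, which is why a smaller threshold alone does not guarantee better coverage.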

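Comparing (14) with (9), we find that for a given (i, j), $CL_{F=0}(\eta, N, i, j)$ is usually larger than or equal to $CL_{att|F=0}(\eta_{att}, N, i, j)$ for $0 \le i \le \frac{\log_2 N}{2}$, whereas $CL_{F=0}(\eta, N, i, j)$ is smaller than $CL_{att|F=0}(\eta_{att}, N, i, j)$ for $\frac{\log_2 N}{2} < i \le \log_2 N$. Similar to (10), we can find the fault coverages of $FFT_{att}$ by using the next equation:

$$
CFN_{att}(\eta_{att}, N)_{F=0} = \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} CL_{att|F=0}(\eta_{att}, N, i, j) \times \Pr(\text{the error occurs at } (i, j)). \qquad (15)
$$

The fault coverages of $FFT_{att}$ constructed by using the proposed scheme for N = 256, 512, 1024, and 2048 are shown in Table VII and Table VIII. The fault coverages of $FFT_{att}$ constructed by using JAH are also given in Table VII and Table VIII.

TABLE VII
FAULT COVERAGES OF 16-BIT $FFT_{att}$ NETWORKS ASSUMING F = 0

| $\eta$ | Throu. | N = 256, tao | N = 256, jou | N = 512, tao | N = 512, jou | N = 1024, tao | N = 1024, jou | N = 2048, tao | N = 2048, jou |
|---|---|---|---|---|---|---|---|---|---|
| $\sqrt{N}\sigma_f$ | 0.759 | 0.972 | 0.652 | 0.895 | 0.584 | 0.783 | 0.527 | 0.662 | 0.479 |
| $2\sqrt{N}\sigma_f$ | 0.957 | 0.946 | 0.639 | 0.841 | 0.572 | 0.714 | 0.515 | 0.602 | 0.468 |
| $3\sqrt{N}\sigma_f$ | 0.997 | 0.921 | 0.631 | 0.804 | 0.565 | 0.673 | 0.508 | 0.575 | 0.461 |

TABLE VIII
FAULT COVERAGES OF 32-BIT $FFT_{att}$ NETWORKS ASSUMING F = 0

| $\eta$ | Throu. | N = 256, tao | N = 256, jou | N = 512, tao | N = 512, jou | N = 1024, tao | N = 1024, jou | N = 2048, tao | N = 2048, jou |
|---|---|---|---|---|---|---|---|---|---|
| $\sqrt{N}\sigma_f$ | 0.759 | 0.999 | 0.784 | 0.999 | 0.711 | 0.999 | 0.647 | 0.999 | 0.592 |
| $2\sqrt{N}\sigma_f$ | 0.957 | 0.999 | 0.778 | 0.999 | 0.705 | 0.999 | 0.641 | 0.999 | 0.587 |
| $3\sqrt{N}\sigma_f$ | 0.997 | 0.999 | 0.774 | 0.999 | 0.702 | 0.999 | 0.638 | 0.999 | 0.584 |

2) Fault Coverages of $FFT_{att}$ by Taking the Round-Off Errors into Account: As discussed in Section III-C2, we can find an upper bound and a lower bound of the fault coverage for a given $S_j^i$, and an upper bound and a lower bound of the fault coverage for $FFT_{att}$ systems, as follows:

$$
\begin{aligned}
CL_{att\,upper}(\eta_{att}, N, i, j) ={}& \Pr\left(\eta_{ah} < \left|e \times \tfrac{1}{2^{\log_2 N - i}} \times S_j^i\right|\right)
+ \{0.01 + \Pr(\eta_{att} + \eta_{al} < |F|)\} \cdot \Pr\left(\left|e \times \tfrac{1}{2^{\log_2 N - i}} \times S_j^i\right| < \eta_{al}\right) \\
&+ \Pr\left(\eta_{al} \le \left|e \times \tfrac{1}{2^{\log_2 N - i}} \times S_j^i\right| \le \eta_{ah}\right), \qquad (16)
\end{aligned}
$$

$$
CL_{att\,lower}(\eta_{att}, N, i, j) = 0.9973 \times \Pr\left(\eta_{ah} < \left|e \times \tfrac{1}{2^{\log_2 N - i}} \times S_j^i\right|\right)
+ \Pr(\eta_{att} + \eta_{al} < |F|) \times \Pr\left(\left|e \times \tfrac{1}{2^{\log_2 N - i}} \times S_j^i\right| < \eta_{al}\right), \qquad (17)
$$

$$
CFN_{att\,upper}(\eta_{att}, N) = \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} CL_{att\,upper}(\eta_{att}, N, i, j) \times \Pr(\text{the error occurs at } (i, j)), \qquad (18)
$$

$$
CFN_{att\,lower}(\eta_{att}, N) = \sum_{i=0}^{\log_2 N} \sum_{j=0}^{N-1} CL_{att\,lower}(\eta_{att}, N, i, j) \times \Pr(\text{the error occurs at } (i, j)). \qquad (19)
$$

By using (14), (16), (17), (18), and (19), upper bounds and lower bounds of the fault coverage for the $FFT_{att}$ networks are calculated as given in Table IX and Table X. Comparing Table III with Table VII (Table IV with Table VIII, Table V with Table IX, and Table VI with Table X), we can conclude that the fault coverage of the $FFT_{att}$ systems is not higher than the fault coverage of the FFT systems when a CED scheme (the proposed one or JAH) is incorporated.

TABLE IX
UPPER AND LOWER BOUNDS OF FAULT COVERAGES OF 16-BIT $FFT_{att}$ NETWORKS

| $\eta$ | Throu. | Bound | N = 256, tao | N = 256, jou | N = 512, tao | N = 512, jou | N = 1024, tao | N = 1024, jou | N = 2048, tao | N = 2048, jou |
|---|---|---|---|---|---|---|---|---|---|---|
| $\sqrt{N}\sigma_f$ | 0.759 | upper | 0.999 | 0.685 | 0.997 | 0.615 | 0.983 | 0.558 | 0.953 | 0.507 |
| $\sqrt{N}\sigma_f$ | 0.759 | lower | 0.900 | 0.633 | 0.773 | 0.569 | 0.645 | 0.514 | 0.558 | 0.467 |
| $2\sqrt{N}\sigma_f$ | 0.957 | upper | 0.999 | 0.666 | 0.997 | 0.595 | 0.979 | 0.538 | 0.939 | 0.487 |
| $2\sqrt{N}\sigma_f$ | 0.957 | lower | 0.883 | 0.623 | 0.750 | 0.556 | 0.624 | 0.501 | 0.543 | 0.452 |
| $3\sqrt{N}\sigma_f$ | 0.997 | upper | 0.999 | 0.656 | 0.995 | 0.587 | 0.972 | 0.529 | 0.923 | 0.480 |
| $3\sqrt{N}\sigma_f$ | 0.997 | lower | 0.867 | 0.617 | 0.731 | 0.552 | 0.608 | 0.495 | 0.532 | 0.447 |

TABLE X
UPPER AND LOWER BOUNDS OF FAULT COVERAGES OF 32-BIT $FFT_{att}$ NETWORKS

| $\eta$ | Throu. | Bound | N = 256, tao | N = 256, jou | N = 512, tao | N = 512, jou | N = 1024, tao | N = 1024, jou | N = 2048, tao | N = 2048, jou |
|---|---|---|---|---|---|---|---|---|---|---|
| $\sqrt{N}\sigma_f$ | 0.759 | upper | 0.999 | 0.796 | 0.999 | 0.722 | 0.999 | 0.658 | 0.999 | 0.603 |
| $\sqrt{N}\sigma_f$ | 0.759 | lower | 0.999 | 0.777 | 0.999 | 0.705 | 0.999 | 0.642 | 0.998 | 0.589 |
| $2\sqrt{N}\sigma_f$ | 0.957 | upper | 0.999 | 0.785 | 0.999 | 0.711 | 0.999 | 0.647 | 0.999 | 0.592 |
| $2\sqrt{N}\sigma_f$ | 0.957 | lower | 0.999 | 0.771 | 0.999 | 0.698 | 0.999 | 0.636 | 0.998 | 0.581 |
| $3\sqrt{N}\sigma_f$ | 0.997 | upper | 0.999 | 0.780 | 0.999 | 0.707 | 0.999 | 0.643 | 0.999 | 0.588 |
| $3\sqrt{N}\sigma_f$ | 0.997 | lower | 0.999 | 0.768 | 0.999 | 0.696 | 0.999 | 0.633 | 0.998 | 0.579 |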

E. Floating-point Systems

It has been mentioned in [6] that the system performance of floating-point systems, i.e., the throughput and the fault coverage, can be analyzed in the same way as discussed above. Since floating-point systems are less affected by round-off errors than fixed-point systems, we expect that the fault coverage of a floating-point system constructed by using the proposed scheme will be better than that of a fixed-point system constructed by the same method.

In this subsection, we compare the proposed scheme with three existing schemes, i.e., the hardware redundancy scheme (JAH) in [6], the time redundancy scheme in [2], and the scheme proposed in [13]. The time redundancy schemes use less hardware than the proposed scheme, but their throughputs are only 50% of those of the hardware redundancy schemes. In the proposed scheme, extra hardware is required both in the encoding and in the decoding circuits. Since a multiplier is much more complex than an adder, we consider the hardware overhead only in terms of multipliers. As a result of using Lemma 1 and Lemma 2, the number of multipliers required in the decoder is equal to N/4 + 1. The number of multipliers required in the encoder is equal to N - 1. Hence, the total number of complex multipliers used in the encoder and decoder is equal to $\frac{5N}{4}$, which is 5% more extra hardware than that used in the hardware redundancy schemes in [6], [13]. Note that, since an FFT network can be used to compute either the FFT or the inverse FFT, we do not assume that the input signals are real, as is done in [13]. The extra time delay in the proposed scheme is slightly less than that in JAH because the encoder is not cascaded with the FFT network. Moreover, it has been shown in Section III that the fault coverage of the proposed scheme is significantly better than that of JAH. The comparison of the proposed scheme with other existing schemes is shown in Table XI.
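The multiplier count quoted above can be reproduced with a few lines of arithmetic. The sketch below only adds the decoder and encoder counts stated in the text (N/4 + 1 and N - 1) and, for scale, the number of butterflies in the network itself; the function names are illustrative.

```python
def extra_complex_multipliers(n_points: int) -> int:
    """Encoder plus decoder multipliers of the proposed scheme, as stated in the text."""
    decoder = n_points // 4 + 1
    encoder = n_points - 1
    return encoder + decoder


def butterflies_in_network(n_points: int) -> int:
    """Number of two-input butterflies, N/2 * log2(N), for N a power of two."""
    stages = n_points.bit_length() - 1
    return (n_points // 2) * stages


if __name__ == "__main__":
    for n in (256, 512, 1024, 2048):
        print(n, extra_complex_multipliers(n), butterflies_in_network(n))
```

For N = 1024 this gives 1280 extra complex multipliers, i.e., 5N/4, against 5120 butterflies in the unprotected network.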

IV. CONCLUSION

In this paper, we have proposed a novel concurrent error detection scheme for FFT networks. We have shown that the proposed scheme can achieve 100% fault coverage and throughput theoretically, and that it uses 5% more extra hardware and less extra time delay than the scheme proposed in [6].

TABLE XI
COMPARISON WITH EXISTING TECHNIQUES

| Technique | Hardware Overhead | Fault Coverage | Throughput |
|---|---|---|---|
| Choi and Malek | &i" | High | Low |
| Jou and Abraham | &% | Low | High |
| Reddy and Banerjee | *% | unknown | unknown |
| The proposed one | a". | High | High |

Even though 100% fault coverage and throughput can be obtained theoretically, many practical problems, such as overflow and the effect of round-off error, must be considered during the implementation [7], [10]. Hence, for most of the FFT networks which are widely used in practice, we have provided realistic fault coverages by taking round-off error into consideration. The results in this paper show that the fault coverage of the proposed scheme is significantly better than that of the existing one. Furthermore, comparing the proposed scheme with the hardware redundancy scheme proposed in [6], we can see that the performance of a system with concurrent error detection capability is determined not only by $\eta$, but also by the encoding/decoding scheme. Therefore, further study is still required to find encoding/decoding schemes which can:

1) minimize the extra hardware,
2) minimize the extra delay,
3) be implemented easily,
4) maximize the fault coverage and throughput by taking the round-off errors into account.

APPENDIX I

Lemma 1: $a_{2i}^{(N)} = -a_{2i+1}^{(N)}$, for $0 \le i \le \frac{N}{2} - 1$.

Proof: From the construction procedure for $a_i^{(N)}$, we know that when $0 \le i \le N - 1$,

where $W_N = \exp(-j\frac{2\pi}{N})$. (I-1)

We now prove Lemma 1 by using mathematical induction on l, $l = \log_2 N$.

1. Base step: l = 1. From the definition, $a_0^{(2)} = W_2^0 = 1$, and $a_1^{(2)} = W_2^1 = -1$.

2. Induction step: assume that it is valid for l = k, i.e., $N = 2^k$. We consider l = k + 1, i.e., $2^{k+1} = 2N$.

a. 0 5 i 5 N - 1 By (1-1) and induction hypothesis, we have that

b. N 5 i 5 $ N - 1 By the construction procedure for a!2") and induction hypothesis, we have that

c. $ N 5 i 5 2N - 1 By the construction procedure for a!2N) and induction hypothesis, we have that

= (NI (NI a ( 2 N ) 2% a 2 1 - L i ~ W i N = - a 2 a + 1 - q ~ W 2 1 N = - 22+1'

Q.E.D. Before we prove Lemma 2, we introduce another Lemma. Lemma A l : The following equations are valid for 0 5 a 5

_ - Ay 1, and 4 5 N :

(1-2) ( N ) - N - - 2, -


Lemma 2: The following equations are valid:

N 8

f o r 8 5 N , O < i < - - l

where [z]*is the conjugate of a complex variable z.

Proof: Since W z = -1 and by Lemma 1, we have N

By Lemma A1 and for 8 5 N , we have

Hence,

f N ) ( N I

by (1-13) and (I-14), we obtain

that is

Therefore,

Q.E.D.

In the following, we first provide the conditions for detecting all modeled errors (additive error at the input or output of a butterfly). We then prove that all modeled errors, except the error at line (0, 0), are detectable in PSCED.

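An output signal of the N-point FFT network can be represented by

$$
X(m) = \sum_{n=0}^{N-1} x(n) W_N^{mn}, \quad \text{for } 0 \le m \le N - 1.
$$

Let

$$
m = 2^l \times m_1 + m_2, \quad \text{where } 0 \le m_1 \le \frac{N}{2^l} - 1 \text{ and } 0 \le m_2 \le 2^l - 1,
$$

$$
n = \frac{N}{2^l} \times n_1 + n_2, \quad \text{where } 0 \le n_1 \le 2^l - 1 \text{ and } 0 \le n_2 \le \frac{N}{2^l} - 1.
$$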

We obtain

n = O

(2' Xml +m*) x ( n 2 ) . W N

We then have

N where x ' ( 3 x m2 + n2

Note that $x^l(\frac{N}{2^l} \times m_2 + n_2)$ is an input signal of an (l + 1)th level butterfly for $0 \le l \le \log_2 N$, and $x^{\log_2 N}(m_2)$ is an output signal of the last level butterfly. As discussed in Section III-C, each line in the FFT network is represented by a two-tuple; hence we use $(l, \frac{N}{2^l} \times m_2 + n_2)$ to denote the line where $x^l(\frac{N}{2^l} \times m_2 + n_2)$ is located. From (I-15), it can be seen that the signal at the line $(l, \frac{N}{2^l} \times m_2 + n_2)$ will only affect the output signals $X(m_2)$, $X(2^l + m_2)$, $X(2^l \times 2 + m_2)$, ..., and $X(2^l \times (\frac{N}{2^l} - 1) + m_2)$. In addition, the transfer functions from this line to the outputs $X(m_2)$, $X(2^l + m_2)$, $X(2^l \times 2 + m_2)$, ..., and $X(2^l \times (\frac{N}{2^l} - 1) + m_2)$ are, respectively, equal to $W_N^{m_2 \times n_2}$, $W_N^{(2^l + m_2) \times n_2}$, $W_N^{(2^l \times 2 + m_2) \times n_2}$, ..., and $W_N^{(2^l \times (\frac{N}{2^l} - 1) + m_2) \times n_2}$. Hence, when an error, e, occurs at this line, $W_N^{m_2 \times n_2} e$, $W_N^{(2^l + m_2) \times n_2} e$, $W_N^{(2^l \times 2 + m_2) \times n_2} e$, ..., $W_N^{(2^l \times (\frac{N}{2^l} - 1) + m_2) \times n_2} e$ will appear at $X(m_2)$, $X(2^l + m_2)$, $X(2^l \times 2 + m_2)$, ..., and $X(2^l \times (\frac{N}{2^l} - 1) + m_2)$, respectively. Moreover, the error occurring at the input of the


Base Step: For Part lE, we consider N = 2 as the base base. In this

case, we have U: = 1 and U: = -1. x 2 + m z ) x n a x f: x a;+m2 + WE1 x e x a?x2+m2+

x e x 8 SA = ai # 0.

(2' x (3 - l)+m2) x n l (21 x ( $ - l ) + m , ) For Part 2E, we consider N = 4 as the base base. In this

case, we have U: = W j , U: = W:, a; = W,", and U ; = W i . By (I-16), we have

. ' . + W N

5-1 x a?xr+mz '

sy = 2w:, s; = 2(W,0 - W,'), s: = 2w:. 1 = e [ w ; i x r + m z ) x %

r=O

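Let us define

$$
S^l_{\frac{N}{2^l} \times m_2 + n_2} = \sum_{r=0}^{\frac{N}{2^l} - 1} W_N^{(2^l \times r + m_2) \times n_2} \times a^{(N)}_{2^l \times r + m_2}, \qquad \text{(I-16)}
$$

where $0 \le l \le \log_2 N$, $0 \le m_2 \le 2^l - 1$, and $0 \le n_2 \le \frac{N}{2^l} - 1$.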

It can be easily seen that the error, e, at the line $(l, \frac{N}{2^l} \times m_2 + n_2)$ is detectable by using PSCED iff $S^l_{\frac{N}{2^l} \times m_2 + n_2} \ne 0$. Therefore, we introduce the following lemmas.

LemmaA2: In order to detect all errors at an input of (1 + 1)th level butterflies by using PSCED, the following equation has to be satisfied:

N 2'

for O < 1 5 1 0 g 2 N - l . O L j < - - l .

and k E (0 .1 :... 2' - l}.

Lemma A3: In order to detect an error at an output of last level butterflies, X ( j ) (0 5 j 5 N - l), by using PSCED, the

Thus, Part 1E and Part 2E are valid in the base case. Induction Step: Assume that it is true for 7 2 1, we have

N N 2 22' for 1 5 1 5 log, -, 0 5 j 5 - - 1,

and k E (0.1, . . . .2' - 1). E-1

N Part 2E: 2 t = O u:')Wz # 0 , for 1 I j 5 - 2 - 1.

Let us consider the N-point FFT network. We need to prove

i=O

N 2'

and k E { 0 , 1 , . . . , 2 ' - 1).

for 1 5 1 5 log, N , 0 5 j 5 - - 1,

following equation has to be satisfied: N-1

Part 2E: S,"( N ) = W;;" s:Opz = a ! N ) ~ : # 0, for 0 5 j 5 N - 1. (1-18) 2=0

N - 1 Theorem 1:

Proof: From (1-18) and (I-17), we need to prove that

Except the error at the line (O,O), all errors in c a{N)W$i # 0, an N-point FFT network are detectable by using the PSCED. i=O

for 1 5 j = 2 r < N - l .

To prove Part lE , we consider three subcases in terms of 1 , 0 5 1 5 log,N.

Part lE.l : 1 = log, N. In this subcase, 0 5 j 5 0, and k E {0 ,1 , . . . , N - 1).

i=O

N

2' We have for 1 2 15 log2N. 0 5 j 5 - - 1, and 0 5 k 5 2' - 1;

N-1

The following two cases, i.e., j is even and j is odd, are

In this case, we prove it by using math induction on N ,

Part 2 in this case are, respectively, denoted by Part 1E and Part 2E.

Since have

belongs to the set { Wgr, W k , . . . , W,"-'}, we considered: Case 1. j is even, i.e., j = 2r.

( N ) # 0. where N is the number of points in FFT network. Part 1 and sp2 N

Part 1E.2 : 1 = log, N - 1.


In this subcase, 0 5 j 5 1, and k E (0. l:.., $ - l}. Since j is even, j can only be equal to 0.

l o g L N - 1 (NI = s 2 x k ( N )

log* N-1 ' 2 x k + 3

i=0

From the coefficient construction, we know that I. ay:"', where 0 5 k 5 $ - 1, belongs to the set

2. a K k , where 0 5 k 5 $ - 1, belongs to the set {Wi,, w;, . . . ,WE-'},

{Wk7? w;, . . . , W y }

Part 1E.3 : 1 5 1 < log,N - 1. In this subcase, 0 5 j = 27- 5 !$ - 1, and k E

(0.1, ' . . ,2L - 1}.

i=0

1=0

(by the rules of constructing a!Av)7s)

i=o

[by (1- 1311.

By the induction hypothesis, we know that

Since w:.w.!' + + k ) 2 r E {W1.WJ. . . . ~ W"--'}, we have

# 0. 1 w(2' + + k ) 2 r

1 + w.y 'v

Therefore, S $ x k + 3 ( N ) # 0.

conclude that the equation in Part IE is valid.

2' From the proofs in Part lE . l , 1E.2, and 1E.3, we can

We now prove Part 2E.


We need to prove that m - 1

By the induction hypothesis, we know that

%-I

In addition, 1 + WiWi!)2r = 1 f W,k # 0 Therefore, Part 2E is valid, i.e.,

N-1

S,O(N) = a,(%;:' # 0. 2=0

Case 2. j is odd, i.e., j = 2r + 1.

Part 10 and Part 20.

I , 0 5 1 5 log, N .

In this case, Part 1 and Part 2 are respectively denoted by

To prove Part 1 0 , we consider three subcases in terms of

Part 10.1: 1 = log, N . In this subcase, 0 5 j L 0, and k E {0 ,1 :... N - 1}.

Part 10.2: 1 = log, N - 1. In this subcase, 0 5 j 5 1, and k E ( 0 , l . . . . ~ q - 1).

Since j is odd in Case 2, nothing needs to be proved.

Since j is odd, j can only be equal to 1.

log, n. - 1 log, ,v- 1 s2 x k+J (NI = S,Xk+l ( N )

From the coefficient construction, we know that 1) a i N ) , where 0 5 k $ - 1, belongs to

2 ) uE),,+, where 0 5 k 5 $ - 1, belongs to { W i , w;, . . ' , w-t-",

{Wk, w;, . * . , W y } . log, N-1 Hence, S2xk+l ( N ) = a y . " " - a("), # O'

Part 10 .3 : 1 5 1 < log:,N - 1.

the set

the set

- _ In this subcase, 0 5 j = 2r + 1 5 2 - 1, and k E

l0.1:..,2' - 11. Let m = x .

S i X k + 3 ( N ) = . _ al;ikWi2'+')' # 0. (1-19) 2=O

7

We can decompose the (1-19) into four parts as follows: m-1

i=O F-1 F-1

_ - " 1

i=O

= W k J [ P : - ( N ) + P i p ) + WkW$'(Pl(N)

+ P:(N)W$J)]

where: -- 7 1

P;(N) = al;\kw$21)j a=O _ _ 7 1

and P,'(N) = az(;ikW$2')3. (1-20) t=m

From (I-20), it can be seen that 1. since 1 5 1, ( i2 ') j is an even number; 2. when 0 5 i 5 - 1, we have 22' + k < g.

From the rule of constructing a!N)'s, we know that E {Wi,W$,...,W;-'} when 0 5 i 5 f - 1.


Therefore, we can conclude that Pf ( N ) + PJ ( N ) is a linear combination of the elements in {W:,. . K7?-'} and

WkW$J[P:(N) - P i ( N ) ] is a linear combination of the elements in {W&, W,:.. . . . . WA$-'}. I t is know that given a set of real numbers ( ~ 0 . q . ~ . . . . . T L - , } , (1-21) is valid only if T , = 0 for all 1,

i=O

if S+xk+..(N) = 0, then the following equations must be satished:

P:(N) + P i ( N ) = 0 and P ; ( N ) + Pj(N)U7.*J = 0.

It can be easily seen that the (1-22) are satisfied only if (1-22)

_ - 'i' 1

In what follows, we show that S',

Two possibilities are considered: Possibility a). y = 1. In this subcase, we have

(.V) # (1 by showing

that Pf ( N ) = 0 and Pi ( N ) = 0 are not valid simultaneously. ; r X " J

Therefore, we can conclude that S, ( N ) # 0. Possibility b). > 1. By using the same decomposition technique, we can obtain

1=O

_ _ 6' 1

i=O L= ilL

= P:(N) + P;(N) (1-24)

2 = O

If P ; ( N ) = 0 and P i ( N ) = 0, from (1-24) and (I-25), it can be seen that both P: ( N ) and P; ( N ) must be equal to 0. If 2 = 1, then we have P:(N) # 0 (the proof is similar to the possibility a); otherwise, we can further decompose P:(N) and P:(N). Similarly and recursively, we reach the conclusion that P ; ( N ) = 0 and P i ( N ) = 0 are not valid simultaneously. Note that the number of decompositions is a finite number because 2' < $.

From the proofs in Part 1 0 . 1 , 10.2, and 10.3, we can conclude that the equation in Part 1 0 is valid.

We now prove Part 20.

r,, - 1

The proof is similar to that of Part 1O.3.

It should be noted that a single fault can also occur in the encoding circuit or the decoding circuit. Such a fault can occur at one of the encoding (decoding) multipliers, at one of the encoding (decoding) adders, or at the input (output) summation adders. Since a fault occurs either in the encoding circuit or in the decoding circuit (but not both), only one of the two inputs of a TSC comparator is erroneous. As a result, a fault occurring in the encoding circuit (the decoding circuit) is always detectable. Q.E.D.

APPENDIX I1 Theorem 3: A n upper bound (a lower bound) of the fault

coverage for a given Sr , Cupper (7. N. i, j), (Clower(v, N , i,

Page 20: A novel concurrent error detection scheme for FFT networks

T A 0 A N D HARTMANN: ERROR DETECTION SCHEME FOR FFT NETWORKS 217

j)) is given as follows: Again by using the theorem of total probability, we have

' Pr( I f x s; 1 < / / / ) + PI ( / / / 5 I f x s; 1 5 t / h )

= 0.9'373 x Pr(7],l < I / x S i ] ) + Pl[// + 'I, < IFI) ciouCl (7/. N . 1 . 1 )

x P r ( J t x SJI < r i d where 7 / h = 71 + 3 6 x of and r / / = 0.01 x r l .

Proof: By ( 5 ) and by the theorem of total probability (Section 11-C of [12], we can obtain (11-1).

C(7)" 1 . I ) = Pr(q < (( x s; + F/) = Pr(71 < I C x S; + F / 7/tl < I f x S)i)

x Pr(r/rl < I f x S)l)

+ Pr(r/ < If, x s; + FllIf x SJ1 < / I / )

x Pr( / r x S11 < r l l )

+ Pr(q < / c x SJ + Fllt1/ I x wr// I I f x s;I I ) / / I )

x S ~ I 5 r/tl)

(11- 1)

where / / h = rl + 3v% x crf and rll = 0.01 x '1.

Note that the selection of r /h and r / l is not unique, but they should be selected to obtain closer upper and lower bounds of fault coverage. Since P and F are independent of each other, by using the theorem of total probability, we have

Pr(r/ < ~c x SJ + FjIii,, < I f x s;\) = Pr(r/ < / e x LSJ + FI r / , , < ( 6 x

and 7/tz ~ ri < IFI) x Pr(m - r / < / F I ) + Pr(q < x S; + F1(7//, < x S i 1

and IF1 5 r/h - 71) x Pi(lFl 5 r l r , - 1 1 ) . (11-2)

When $\eta_h < |e \times S_j|$ and $|F| \le \eta_h - \eta$, we have $|e \times S_j + F| \ge |e \times S_j| - |F| > \eta$. Hence,

$$\Pr(\eta < |e \times S_j + F| \mid \eta_h < |e \times S_j| \text{ and } |F| \le \eta_h - \eta) = 1. \tag{II-3}$$

As discussed in Section III-A2, the random variable $F$ can be viewed as having a Gaussian distribution (by the central limit theorem), and its mean and variance are equal to 0 and $N\sigma_f^2$, respectively. Thus,

$$\Pr(|F| \le \eta_h - \eta) = \Pr(|F| \le 3\sqrt{N} \times \sigma_f) = 0.9973,$$

$$\Pr(\eta_h - \eta < |F|) = 1 - 0.9973 = 0.0027. \tag{II-4}$$
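The constants 0.9973 and 0.0027 are the usual three-standard-deviation probabilities of a zero-mean Gaussian. A minimal sketch that checks them numerically is given below; the function name `prob_within_k_sigma` is an illustrative assumption and does not come from the paper.

```python
import math

def prob_within_k_sigma(k: float) -> float:
    """Pr(|F| <= k * std) for a zero-mean Gaussian random variable F."""
    # For a Gaussian, Pr(|F| <= k*std) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2.0))

# eta_h - eta = 3 * sqrt(N) * sigma_f is three standard deviations of F,
# because the variance of F is N * sigma_f**2.
inside = prob_within_k_sigma(3.0)
outside = 1.0 - inside
print(f"{inside:.4f} {outside:.4f}")
```

The result does not depend on N or on sigma_f, since both scale the threshold and the standard deviation by the same factor.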

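By (II-2), (II-3), and (II-4), we have

$$0.9973 \le \Pr(\eta < |e \times S_j + F| \mid \eta_h < |e \times S_j|) \le 1. \tag{II-5}$$

Again by using the theorem of total probability, the case $|e \times S_j| < \eta_l$ is split according to whether $\eta + \eta_l < |F|$. When $|e \times S_j| < \eta_l$ and $\eta + \eta_l < |F|$, we have $|e \times S_j + F| \ge |F| - |e \times S_j| > \eta$. Hence,

$$\Pr(\eta < |e \times S_j + F| \mid |e \times S_j| < \eta_l \text{ and } \eta + \eta_l < |F|) = 1. \tag{II-8}$$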

Moreover, we found that

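By (II-6)-(II-9), a lower bound and an upper bound for $\Pr(\eta < |e \times S_j + F| \mid |e \times S_j| < \eta_l)$ are given in (II-10).

It is known that

$$0 \le \Pr(\eta < |e \times S_j + F| \mid \eta_l \le |e \times S_j| \le \eta_h) \le 1. \tag{II-11}$$

By applying the results in (II-5), (II-10), and (II-11) to (II-1), we obtain the bounds $C_{upper}(\eta, N, i, j)$ and $C_{lower}(\eta, N, i, j)$ stated in Theorem 3.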


Q.E.D.
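As a numerical illustration of the lower bound $C_{lower} = 0.9973 \times \Pr(\eta_h < |e \times S_j|) + \Pr(\eta + \eta_l < |F|) \times \Pr(|e \times S_j| < \eta_l)$, the sketch below compares it with a simulated fault coverage. Several things here are assumptions made only for illustration: $e$ and $F$ are treated as real-valued, $e$ is drawn uniformly from $[-1, 1]$ (the paper's own distribution of $e$ is not reproduced), and the function name and parameter values are hypothetical.

```python
import math
import random

def lower_bound_vs_simulation(eta, n, sigma_f, s_j, samples=100_000, seed=1):
    """Return (c_lower, c_sim) for one threshold eta.

    Assumptions (not from the paper): e ~ Uniform(-1, 1), real-valued e and F.
    F is Gaussian with mean 0 and variance n * sigma_f**2.
    """
    rng = random.Random(seed)
    std_f = math.sqrt(n) * sigma_f
    eta_h = eta + 3.0 * std_f      # eta_h = eta + 3 * sqrt(N) * sigma_f
    eta_l = 0.01 * eta             # eta_l = 0.01 * eta
    detected = above_h = below_l = 0
    for _ in range(samples):
        e = rng.uniform(-1.0, 1.0)
        f = rng.gauss(0.0, std_f)
        es = abs(e * s_j)
        detected += eta < abs(e * s_j + f)   # event eta < |e*S_j + F|
        above_h += eta_h < es                # event eta_h < |e*S_j|
        below_l += es < eta_l                # event |e*S_j| < eta_l
    # Pr(eta + eta_l < |F|) in closed form for a zero-mean Gaussian F.
    tail = math.erfc((eta + eta_l) / (std_f * math.sqrt(2.0)))
    c_sim = detected / samples
    c_lower = 0.9973 * above_h / samples + tail * below_l / samples
    return c_lower, c_sim

c_lower, c_sim = lower_bound_vs_simulation(eta=0.1, n=16, sigma_f=0.001, s_j=1.0)
print(c_lower, c_sim)
```

Under these assumptions the simulated coverage stays above the lower bound; the gap comes mostly from the middle region $\eta_l \le |e \times S_j| \le \eta_h$, which the lower bound counts as undetected.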


APPENDIX III

In [6], just as in almost all the work in the area of concurrent error detection/correction, a fundamental assumption is made as follows:

Assumption 1: Faults occur one at a time.

In [6], the fault coverage of the JAH is calculated based on $G$, the total complex noise appearing at the input of the TSC comparator due to the functional error, where

$$G = \sum_{k=0}^{N-1} G(k),$$

and $G(k)$ is the complex noise at the $k$th DFT value caused by the functional error. In order to find $G$, another assumption is made in [6] as follows:

Assumption 2: The $G(i)$'s are random variables with mean 0 and variance $\sigma^2$ and are statistically independent of each other.

If the $G(i)$'s are statistically independent of each other, then any two of them are statistically independent of each other [12, ch. 8] (see footnote 3). Under Assumption 1, we show that the $G(k)$'s are not statistically independent by showing that $G(0)$ and $G(N/2)$ are not independent. In other words, Assumption 2 is not valid.

Let $(i_1, i_2)$ (defined as in [6], with $i_1 \in \{0, 1, \ldots, \log_2 N + 1\}$ and $i_2 \in \{0, 1, \ldots, N - 1\}$) denote a line where a functional error can occur, and let $g(i_1, i_2, k)$ be the transfer function from an error site $(i_1, i_2)$ to the $k$th output. Since there is only one fault in the FFT network (under Assumption 1), let the random variables $I_1$ and $I_2$ denote the error site. Hence,

$$\Pr(I_1 = i_1, I_2 = i_2) = \Pr\{\text{error occurs at } (i_1, i_2)\},$$

and $G(k)$ can be represented by

$$G(k) = e \times g(I_1 = i_1, I_2 = i_2, k).$$

Thus, we have

$$G(0) = e \times g(I_1 = i_1, I_2 = i_2, 0), \tag{III-1}$$

$$G(N/2) = e \times g(I_1 = i_1, I_2 = i_2, N/2). \tag{III-2}$$

Footnote 3: Since the $G(i)$'s are complex variables, the definition of $G(i)$ and $G(k)$ being independent of each other can be found in [12, p. 238].

We now show that the following equation is not valid:

$$P[G(0) \le x, G(N/2) \le x] = P[G(0) \le x] \times P[G(N/2) \le x]. \tag{III-3}$$

To prove that (III-3) is not valid, we only need to prove that it is not valid when the complex variable $x = 0 + j\delta$. From (III-1) and (III-2), $G(0)$ and $G(N/2)$ are functions of the random variable $e$ (which has the distribution shown in Fig. 9(a)) and the random variables $(I_1, I_2)$. Since an outcome of $(I_1, I_2)$ is $(i_1, i_2)$, where $i_1 \in \{0, 1, \ldots, \log_2 N + 1\}$ and $i_2 \in \{0, 1, \ldots, N - 1\}$, the outcomes $(i_1, i_2)$ are categorized into the following five cases:

Case 1: $S_1 = \{(i_1, i_2) \mid g(i_1, i_2, 0) = \tfrac{1}{3}\, g(i_1, i_2, N/2) \text{ and } g(i_1, i_2, 0) \ne 0\}$.

Case 2: $S_2 = \{(i_1, i_2) \mid g(i_1, i_2, 0) = -\tfrac{1}{3}\, g(i_1, i_2, N/2) \text{ and } g(i_1, i_2, 0) \ne 0\}$.

Case 3: $S_3 = \{(i_1, i_2) \mid g(i_1, i_2, 0) = g(i_1, i_2, N/2) = 0\}$.

Case 4: $S_4 = \{(i_1, i_2) \mid g(i_1, i_2, 0) \ne 0 \text{ and } g(i_1, i_2, N/2) = 0\}$.

Case 5: $S_5 = \{(i_1, i_2) \mid g(i_1, i_2, 0) = 0 \text{ and } g(i_1, i_2, N/2) \ne 0\}$.

Example: Let us consider the 4-point fault-tolerant FFT network constructed by JAH, which is shown in Fig. 11. In Fig. 11, each line is labeled as $(i_1, i_2)$.

Moreover, we find that $g(I_1 = i_1, I_2 = i_2, 0)$ and $g(I_1 = i_1, I_2 = i_2, 2)$ are given as follows:

In Fig. 11, since a functional error can occur at the input or output of a butterfly or a decoding multiplier, there exist $4 \times 2 \times \log_2 4 + 2 \times 4 = 24$ distinct modeled faults. It is assumed in [6] that these errors are equally likely to occur and that only one occurs at a time. We obtain

$$\Pr((I_1, I_2) \in S_1) = \frac{2 + 2}{4 \times 2 \times 2 + 8} = \frac{4}{24},$$

$$\Pr((I_1, I_2) \in S_2) = \frac{2 + 2}{4 \times 2 \times 2 + 8} = \frac{4}{24},$$

$$\Pr((I_1, I_2) \in S_3) = \frac{2 + 2(2 + 2)}{4 \times 2 \times 2 + 8} = \frac{10}{24},$$

$$\Pr((I_1, I_2) \in S_4) = \frac{2 + 1}{4 \times 2 \times 2 + 8} = \frac{3}{24},$$

$$\Pr((I_1, I_2) \in S_5) = \frac{2 + 1}{4 \times 2 \times 2 + 8} = \frac{3}{24}.$$

In an $N$-point FFT network, the above equations can be generalized as follows:

$$\Pr((I_1, I_2) \in S_1) = \frac{\frac{N}{2} + 2(\frac{N}{2} - 1)}{4 \times \frac{N}{2} \times \log_2 N + 2N} = \frac{3N - 4}{4 \times N \times (\log_2 N + 1)} \tag{III-4}$$

$$\Pr((I_1, I_2) \in S_2) = \frac{\frac{N}{2} + 2(\frac{N}{2} - 1)}{4 \times \frac{N}{2} \times \log_2 N + 2N} = \frac{3N - 4}{4 \times N \times (\log_2 N + 1)} \tag{III-5}$$

$$\Pr((I_1, I_2) \in S_3) = 1 - \frac{2(3N - 4) + 12}{4 \times N \times (\log_2 N + 1)} = \frac{2 \times N \times \log_2 N - N - 2}{2 \times N \times (\log_2 N + 1)} \tag{III-6}$$

$$\Pr((I_1, I_2) \in S_4) = \frac{3}{2 \times N \times (\log_2 N + 1)} \tag{III-7}$$

$$\Pr((I_1, I_2) \in S_5) = \frac{3}{2 \times N \times (\log_2 N + 1)} \tag{III-8}$$

We now compute $\Pr[G(0) \le x]$.

$$\Pr[G(0) \le x] = \sum_{i_1} \sum_{i_2} P[G(0) \le x \mid I_1 = i_1, I_2 = i_2] \times P[I_1 = i_1, I_2 = i_2]. \tag{III-9}$$

Since $\Pr[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = \Pr[e \times g(I_1 = i_1, I_2 = i_2, 0) \le x]$, the following five cases are considered.

Case 1: $(i_1, i_2) \in S_1$. In this case, $g(I_1 = i_1, I_2 = i_2, 0) \in \{W_N^0, W_N^1, \ldots, W_N^{N-1}\}$ and $x = 0 + j\delta$; from the distribution of $e$, we obtain $\Pr[e \times g(I_1 = i_1, I_2 = i_2, 0) \le x] = 1/2$. Hence,

$$\Pr[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2.$$

Case 2: $(i_1, i_2) \in S_2$. Similar to case 1, $\Pr[e \times g(I_1 = i_1, I_2 = i_2, 0) \le x] = 1/2$. Hence,

$$\Pr[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2.$$

Case 3: $(i_1, i_2) \in S_3$. In this case, since $g(I_1 = i_1, I_2 = i_2, 0) = 0$, $G(0) \le x$ is always true. Hence,

$$\Pr[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1.$$

Case 4: $(i_1, i_2) \in S_4$. Similar to case 1, $\Pr[e \times g(I_1 = i_1, I_2 = i_2, 0) \le x] = 1/2$. Hence,

$$\Pr[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2.$$

Case 5: $(i_1, i_2) \in S_5$. Similar to case 3,

$$\Pr[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1.$$

From the above five cases and (III-9), we have

$$\begin{aligned}
P[G(0) \le x] &= \sum_{i_1} \sum_{i_2} P[G(0) \le x \mid I_1 = i_1, I_2 = i_2] \times \Pr[I_1 = i_1, I_2 = i_2] \\
&= \frac{1}{2} \sum_{l = 1, 2, 4} \Pr[(I_1, I_2) \in S_l] + \sum_{l = 3, 5} \Pr[(I_1, I_2) \in S_l] \\
&= \frac{1}{2} \times \frac{(3N - 4) + (3N - 4) + 6}{4 \times N \times (\log_2 N + 1)} + \frac{(4 \times N \times \log_2 N - 2N - 4) + 6}{4 \times N \times (\log_2 N + 1)} \\
&= \frac{4 \times N \times \log_2 N + N + 1}{4 \times N \times (\log_2 N + 1)}.
\end{aligned}$$

By using the same argument, we obtain

$$\begin{aligned}
P[G(N/2) \le x] &= \sum_{i_1} \sum_{i_2} P[G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] \times \Pr[I_1 = i_1, I_2 = i_2] \\
&= \frac{1}{2} \times \frac{(3N - 4) + (3N - 4) + 6}{4 \times N \times (\log_2 N + 1)} + \frac{(4 \times N \times \log_2 N - 2N - 4) + 6}{4 \times N \times (\log_2 N + 1)} \\
&= \frac{4 \times N \times \log_2 N + N + 1}{4 \times N \times (\log_2 N + 1)}.
\end{aligned}$$
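The class probabilities and the marginal probability above can be checked with exact rational arithmetic. The sketch below is only an illustration: the function names `class_probabilities` and `marginal_g0` are hypothetical, and N is assumed to be a power of two.

```python
from fractions import Fraction

def class_probabilities(n: int) -> dict:
    """Pr((I1, I2) in S_k), k = 1..5, for an N-point network (N a power of two)."""
    stages = n.bit_length() - 1          # log2(N) for a power of two
    d = 4 * n * (stages + 1)             # common denominator 4 * N * (log2 N + 1)
    return {
        1: Fraction(3 * n - 4, d),
        2: Fraction(3 * n - 4, d),
        3: Fraction(4 * n * stages - 2 * n - 4, d),
        4: Fraction(6, d),
        5: Fraction(6, d),
    }

def marginal_g0(n: int) -> Fraction:
    """P[G(0) <= x]: conditional probability 1/2 on S1, S2, S4 and 1 on S3, S5."""
    p = class_probabilities(n)
    return Fraction(1, 2) * (p[1] + p[2] + p[4]) + p[3] + p[5]

p4 = class_probabilities(4)
print(p4, marginal_g0(4))
```

For N = 4 this reproduces the values 4/24, 4/24, 10/24, 3/24, and 3/24 of the example, and the five classes sum to one.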


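We now consider $P[G(0) \le x, G(N/2) \le x]$.

$$P[G(0) \le x, G(N/2) \le x] = \sum_{i_1} \sum_{i_2} P[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] \times \Pr[I_1 = i_1, I_2 = i_2].$$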

The following five cases are considered:

Case 1: $(i_1, i_2) \in S_1$. In this case,

$$P[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = P[G(0) \le x, 3G(0) \le x \mid I_1 = i_1, I_2 = i_2].$$

Since $G(0) = e \times g(I_1 = i_1, I_2 = i_2, 0)$ and $g(I_1 = i_1, I_2 = i_2, 0) \in \{W_N^0, W_N^1, \ldots, W_N^{N-1}\}$, from the distribution of $e$, if $G(0) \le x$, then $3G(0) \le x$. Therefore, we have

$$P[G(0) \le x, 3G(0) \le x \mid I_1 = i_1, I_2 = i_2] = P[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2,$$

i.e.,

$$\Pr[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2.$$

Case 2: $(i_1, i_2) \in S_2$. In this case,

$$\begin{aligned}
P[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] &= P[G(0) \le x, -3G(0) \le x \mid I_1 = i_1, I_2 = i_2] \\
&= P[-\tfrac{x}{3} \le G(0) \le x \mid I_1 = i_1, I_2 = i_2].
\end{aligned}$$

Since $G(0) = e \times g(I_1 = i_1, I_2 = i_2, 0)$ and $g(I_1 = i_1, I_2 = i_2, 0) \in \{W_N^0, W_N^1, \ldots, W_N^{N-1}\}$, from the distribution of $e$, we know that

$$P[-\tfrac{x}{3} \le G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2,$$

i.e.,

$$\Pr[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2.$$

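Case 3: $(i_1, i_2) \in S_3$. In this case, since $g(I_1 = i_1, I_2 = i_2, 0) = 0$ and $g(I_1 = i_1, I_2 = i_2, N/2) = 0$, $G(0) \le x$ and $G(N/2) \le x$ are always true. Hence,

$$P[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = 1.$$

Case 4: $(i_1, i_2) \in S_4$. In this case, since $g(I_1 = i_1, I_2 = i_2, N/2) = 0$, $G(N/2) \le x$ is always true. Hence, $P[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = P[G(0) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2$, i.e.,

$$\Pr[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2.$$

Case 5: $(i_1, i_2) \in S_5$. Similar to case 4, $P[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2$, i.e.,

$$\Pr[G(0) \le x, G(N/2) \le x \mid I_1 = i_1, I_2 = i_2] = 1/2.$$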

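From the above five cases, we have

$$\begin{aligned}
P[G(0) \le x, G(N/2) \le x] &= \frac{1}{2} \times \frac{(3N - 4) + (3N - 4) + 6 + 6}{4 \times N \times (\log_2 N + 1)} + \frac{4 \times N \times \log_2 N - 2N - 4}{4 \times N \times (\log_2 N + 1)} \\
&= \frac{4 \times N \times \log_2 N + N - 2}{4 \times N \times (\log_2 N + 1)}.
\end{aligned}$$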

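Since

$$P[G(0) \le x] = P[G(N/2) \le x] = \frac{4 \times N \times \log_2 N + N + 1}{4 \times N \times (\log_2 N + 1)}$$

and

$$P[G(0) \le x, G(N/2) \le x] = \frac{4 \times N \times \log_2 N + N - 2}{4 \times N \times (\log_2 N + 1)},$$

it can be seen that

$$P[G(0) \le x, G(N/2) \le x] \ne P[G(0) \le x] \times P[G(N/2) \le x],$$

i.e., (III-3) is not valid.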
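The comparison between the joint probability and the product of the marginals can be evaluated exactly from the two closed forms above. The following sketch is an illustration only; the function name `joint_and_product` is hypothetical, and N is assumed to be a power of two.

```python
from fractions import Fraction

def joint_and_product(n: int):
    """Return (P[G(0)<=x, G(N/2)<=x], P[G(0)<=x] * P[G(N/2)<=x]) for N = n."""
    stages = n.bit_length() - 1                 # log2(N) for a power of two
    d = 4 * n * (stages + 1)                    # 4 * N * (log2 N + 1)
    marginal = Fraction(4 * n * stages + n + 1, d)
    joint = Fraction(4 * n * stages + n - 2, d)
    return joint, marginal * marginal

for n in (4, 8, 16, 1024):
    joint, product = joint_and_product(n)
    # The two values differ, so G(0) and G(N/2) cannot be independent.
    print(n, float(joint), float(product), joint != product)
```

For N = 4 the joint probability is 34/48 while the product of the marginals is (37/48)^2, so the factorization in (III-3) fails.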


ACKNOWLEDGMENT

The authors would like to thank Y. S. Chen, Prof. J. Murray, Prof. H. S. Don of the SUNY at Stony Brook, and Prof. K. Mehrotra of Syracuse University, for their helpful discussions. We are also grateful to the referees, especially Referee A, for giving us detailed comments which made the manuscript more readable.

REFERENCES

[1] C. Y. Chen and J. A. Abraham, "Fault tolerance systems for the computation of eigenvalues and singular values," Proc. SPIE, vol. 696, Advanced Algorithms and Architectures for Signal Processing, pp. 222-227, 1986.

[2] Y. H. Choi and M. Malek, "A fault tolerant FFT processor," IEEE Trans. Comput., vol. C-37, no. 5, pp. 617-621, May 1988.

[3] S. L. Garverick and E. A. Pierce, "A single wafer 16-point 16 MHz FFT processor," in Proc. Custom Integrated Circuits Conf., 1983.

[4] K. H. Huang and J. A. Abraham, "Algorithm-based fault-tolerance for matrix operations," IEEE Trans. Comput., vol. C-33, no. 6, pp. 518-529, June 1984.

[5] J. Y. Jou and J. A. Abraham, "Fault tolerant matrix arithmetic and signal processing on highly concurrent computing structures," Proc. IEEE, vol. 74, pp. 732-741, May 1986.

[6] J. Y. Jou and J. A. Abraham, "Fault tolerant FFT networks," IEEE Trans. Comput., vol. C-37, no. 5, pp. 548-561, May 1988.

[7] F. T. Luk, "Algorithm-based fault tolerance for parallel matrix equation solvers," Proc. SPIE, vol. 564, Real Time Signal Processing, pp. 49-53, 1985.

[8] F. T. Luk and H. Park, "An analysis of algorithm-based fault tolerance techniques," Proc. SPIE, vol. 696, Advanced Algorithms and Architectures for Signal Processing, pp. 222-227, 1986.

[9] F. T. Luk and H. Park, "A fault tolerance matrix triangularizations on systolic arrays," IEEE Trans. Comput., vol. C-37, no. 11, pp. 1434-1438, Nov. 1988.

[10] V. S. S. Nair and J. A. Abraham, "General linear codes for fault-tolerant matrix operations on processor arrays," in Proc. FTCS-18, June 1988, pp. 180-185.

[11] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975, ch. 9.

[12] A. Papoulis, Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill, 1965.

[13] A. L. N. Reddy and P. Banerjee, "Algorithm-based fault detection for signal processing applications," IEEE Trans. Comput., vol. C-39, no. 11, pp. 1304-1308, Nov. 1990.

[14] K. Yamashita et al., "A wafer-scale 170 000-gate FFT processor with built-in test circuits," IEEE J. Solid-State Circuits, vol. 23, no. 2, pp. 336-342, Apr. 1988.

[15] P. W. Wyatt and J. I. Raffel, "Restructurable VLSI - A demonstrated wafer scale technology," in Proc. IEEE Int. Conf. Wafer Scale Integration, Jan. 1989, pp. 13-20.

[16] P. D. Welch, "A fixed-point fast Fourier transform error analysis," IEEE Trans. Audio Electroacoust., vol. AU-17, pp. 153-157, June 1969.

D. L. Tao (S'85-M'87) received the B.S. degree in electrical engineering from Beijing Institute of Posts and Telecommunications in 1982, and the M.S. degree in electrical engineering and the Ph.D. degree in computer engineering from Syracuse University in 1984 and 1988, respectively.

Since 1988 he has been an Assistant Professor with the Department of Electrical Engineering, State University of New York at Stony Brook. His research interests include fault-tolerant computing and VLSI testing.

C. R. P. Hartmann (S'61-M'73-SM'91) was born in Sao Paulo, Brazil, on October 22, 1939. He received the B.S. and M.S. degrees in electrical engineering from the Instituto Tecnologico de Aeronautica (ITA), Sao Jose dos Campos (S.P.), Brazil, in 1963 and 1966, respectively, and the Ph.D. degree from the University of Illinois, Urbana, in 1970.

He was an Instructor at ITA from 1964 to 1965. From 1966 to 1970 he was a Research Assistant at the Coordinated Science Laboratory, University of Illinois. He is currently a Professor in the School of Computer and Information Science, Syracuse University, Syracuse, NY. He is interested in error control for digital systems, fault detection in digital circuits, fault-tolerant computer design, analysis and design of algorithms, and data compression.

Dr. Hartmann is a member of Phi Kappa Phi and the Brazilian Conselho Regional de Engenharia e Arquitetura. From 1981 to 1983 he was the Associate Editor for Coding Theory for the IEEE TRANSACTIONS ON INFORMATION THEORY.