18
Multichannel Blind Identification: From Subspace to Maximum Likelihood Methods LANG TONG, MEMBER, IEEE, AND SYLVIE PERREAU Invited Paper A review of recent blind channel estimation algorithms is pre- sented. From the (second-order) moment-based methods to the maximum likelihood approaches, under both statistical and de- terministic signal models, we outline basic ideas behind several new developments, the assumptions and identifiability conditions required by these approaches, and the algorithm characteristics and their performance. This review serves as an introductory reference for this currently active research area. Keywords— Blind equalization, parameter estimation, system identification. I. INTRODUCTION A. What Is Blind Channel Estimation and Why? There have been considerable interests from both signal processing and communications communities in the so- called “blind” problem. This is evident from titles of recent publications in both societies’ journals and annual conferences. The basic blind channel estimation problem involves a channel model shown in Fig. 1, where only the observation signal is available for processing in the identification and estimation of channel . This is in contrast to the classical input–output system identification and estimation problem where both input and observation are used. The impetus behind the increased research activities in blind techniques is perhaps their potential applications in wireless communications, which are currently experiencing explosive growth. For example, the distortion caused by multipath interference affects both transmission quality and efficiency in wireless communications. Conventional designs of receivers that mitigate such distortions require either the knowledge of the channel or the access to the in- Manuscript received May 23, 1997; revised May 8, 1998. This work was supported in part by the National Science Foundation under Contract NCR-9321813 and by the Office of Naval Research under Contract N00014-96-1-0895. L. Tong is with the School of Electrical Engineering, Cornell University, Ithaca, NY 14853 USA (e-mail: [email protected]). S. Perreau is with the Institute for Telecommunications Research, The Levels, SA 5095 Australia (e-mail: [email protected]). Publisher Item Identifier S 0018-9219(98)06969-2. put so that certain “training” signals can be transmitted. The latter is the case in many, if not most, communication sys- tems design. The transmission of training signals obviously decreases communications throughput although, for time invariant channels, the loss is insignificant because only one training is necessary. For time varying channels, however, the loss of throughput becomes an issue. For example, in high-frequency (HF) communications, the time used to transmit training signals can be as much as 50% of the overall transmission. Even the group special mobile (GSM) system for cellular mobile communication has consider- able overhead associated training. Yet another example, described by Godard [46] in his pioneer work in blind equalization, is in computer networks where links between terminal and central computers need to be established in an asynchronous way such that, in some instances, training is impossible. Another example is the potential application of blind equalization in high-definition television (HDTV) broadcasting [35]. Outside of the communications arena, blind channel estimation has long been an interest in geoscience. Some of the earlier pioneers in this field include Robinson [99], Wiggins [133], and Donoho [24]. Recent results can be found in [79] and [97]. Blind channel esti- mation also has applications in image restoration problems. Already, blind estimation techniques have been proposed for image deblurring applications [40], [42]. At first glance, the estimation problem illustrated in Fig. 1 may not seem tractable. How is it possible to distinguish the signal from the channel when neither is known? The essence of blind channel estimation rests on the exploitation of structures of the channel and properties of the input. A familiar case is when the input has known prob- abilistic description, such as distributions and moments. In such a case, the problem of estimating the channel using the output statistics is related to time series analysis. In communications applications, for example, the input signals may have the finite alphabet property, or sometimes exhibit cyclostationarity. This last property was exploited in [118] to demonstrate the possibility of estimating a nonminimum 0018–9219/98$10.00 1998 IEEE PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998 1951

Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

Multichannel Blind Identification: FromSubspace to Maximum Likelihood Methods

LANG TONG, MEMBER, IEEE, AND SYLVIE PERREAU

Invited Paper

A review of recent blind channel estimation algorithms is pre-sented. From the (second-order) moment-based methods to themaximum likelihood approaches, under both statistical and de-terministic signal models, we outline basic ideas behind severalnew developments, the assumptions and identifiability conditionsrequired by these approaches, and the algorithm characteristicsand their performance. This review serves as an introductoryreference for this currently active research area.

Keywords—Blind equalization, parameter estimation, systemidentification.

I. INTRODUCTION

A. What Is Blind Channel Estimation and Why?

There have been considerable interests from both signalprocessing and communications communities in the so-called “blind” problem. This is evident from titles ofrecent publications in both societies’ journals and annualconferences. The basic blind channel estimation probleminvolves a channel model shown in Fig. 1, where onlythe observation signal is available for processing inthe identification and estimation of channel. This is incontrast to the classical input–output system identificationand estimation problem where both inputand observation

are used.The impetus behind the increased research activities in

blind techniques is perhaps their potential applications inwireless communications, which are currently experiencingexplosive growth. For example, the distortion caused bymultipath interference affects both transmission qualityand efficiency in wireless communications. Conventionaldesigns of receivers that mitigate such distortions requireeither the knowledge of the channel or the access to the in-

Manuscript received May 23, 1997; revised May 8, 1998. This workwas supported in part by the National Science Foundation under ContractNCR-9321813 and by the Office of Naval Research under ContractN00014-96-1-0895.

L. Tong is with the School of Electrical Engineering, Cornell University,Ithaca, NY 14853 USA (e-mail: [email protected]).

S. Perreau is with the Institute for Telecommunications Research, TheLevels, SA 5095 Australia (e-mail: [email protected]).

Publisher Item Identifier S 0018-9219(98)06969-2.

put so that certain “training” signals can be transmitted. Thelatter is the case in many, if not most, communication sys-tems design. The transmission of training signals obviouslydecreases communications throughput although, for timeinvariant channels, the loss is insignificant because only onetraining is necessary. For time varying channels, however,the loss of throughput becomes an issue. For example,in high-frequency (HF) communications, the time used totransmit training signals can be as much as 50% of theoverall transmission. Even the group special mobile (GSM)system for cellular mobile communication has consider-able overhead associated training. Yet another example,described by Godard [46] in his pioneer work in blindequalization, is in computer networks where links betweenterminal and central computers need to be established inan asynchronous way such that, in some instances, trainingis impossible. Another example is the potential applicationof blind equalization in high-definition television (HDTV)broadcasting [35]. Outside of the communications arena,blind channel estimation has long been an interest ingeoscience. Some of the earlier pioneers in this field includeRobinson [99], Wiggins [133], and Donoho [24]. Recentresults can be found in [79] and [97]. Blind channel esti-mation also has applications in image restoration problems.Already, blind estimation techniques have been proposedfor image deblurring applications [40], [42].

At first glance, the estimation problem illustrated inFig. 1 may not seem tractable. How is it possible todistinguish the signal from the channel when neither isknown? The essence of blind channel estimation rests on theexploitation of structures of the channel and properties ofthe input. A familiar case is when the input has known prob-abilistic description, such as distributions and moments. Insuch a case, the problem of estimating the channel usingthe output statistics is related to time series analysis. Incommunications applications, for example, the input signalsmay have the finite alphabet property, or sometimes exhibitcyclostationarity. This last property was exploited in [118]to demonstrate the possibility of estimating a nonminimum

0018–9219/98$10.00 1998 IEEE

PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998 1951

Page 2: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

Fig. 1. Schematic of blind channel estimation.

(a) (b)

Fig. 2. (a) Classification of blind channel estimators and (b) contents of the paper.

phase channel using only the second-order statistics, whichled to the development of many subspace-based blindchannel estimation algorithms.

B. The Goal and the Scope of this Paper

By complementing recent surveys [72], [73], the goal ofthis paper is to review developments in blind channel iden-tification and estimation within the estimation theoreticalframework. We have paid special attention to the issue ofidentifiability, which is at the center of all blind channelestimation problems. Various existing algorithms are clas-sified into the moment-based and the maximum likelihood(ML) methods. We further divide these algorithms based onthe modeling of the input signal. If input is assumed tobe random with prescribed statistics (or distributions), thecorresponding blind channel estimation schemes are consid-ered to be statistical. On the other hand, if the source doesnot have a statistical description, or although the sourceis random but the statistical properties of the source arenot exploited, the corresponding estimation algorithms areclassified as deterministic. Fig. 2 shows a map for differentclasses of algorithms and the organization of the paper.

Space limitations force us to be brief in discussing somealgorithms and, unfortunately, to exclude other importantapproaches. We have excluded discussions related to the“dual” problem of blind channel estimation, namely, theblind signal estimation, where the goal is to estimateor detect the input signal without knowing the channel.Formulated as blind signal estimation problems are someof the first applications of blind estimation (equalization)method in communications, including the celebrated workof Godard [46], Sato [101], and Trechler and Agee [124].These earlier works are based on some forms of higherorder statistics. Under the multichannel model, direct blindequalization becomes possible using only the second-orderstatistics, which potentially may have faster convergencerates. It is shown by Liu and Dong [75] that in theabsence of noise, a whitening filter is in fact a perfectequalizer. Direct equalization using the second-order sta-tistics was first recognized by Slock [108], and there area number of new developments [16], [34], [45], [74],[76], [77], [110]. Also not considered here are neuralnetwork-based direct blind signal estimation techniques (see[6]).

1952 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 3: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

(a) (b)

Fig. 3. (a) A multichannel model and (b) a multirate channel model.

In presenting moment-based methods, this paper fo-cuses on the second-order moment techniques that havereceived considerable research attentio since the publica-tion of [118]. Consequently, applications of the algorithmssurveyed in this paper are limited to certain types of chan-nels and sources specified by the identifiability condition.Without elaboration, we briefly mention two alternativeapproaches.

1) The Higher Order Statistical Approaches:Many appli-cations may not have the multichannel model considered inthis paper. In such a case it may be necessary to exploithigher order statistics. There is an extensive literature deal-ing with blind channel estimation using higher order statis-tics in both time and frequency domains. See, for example,[12], [30], [39], [54], [90], [125], and [126] and a tutorialin [81]. For the multiuser case, see [38], [115], and [116].

2) The Bayesian Approach:In this paper, the channelis modeled by finite dimensional deterministic unknownparameters. In some applications, however, channels canbe modeled as a random vector or a random process. Forexample, Rayleigh fading channels can be modeled as aGaussian random process with a certain power spectrum. Insuch cases we have a Bayesian estimation problem. State-space model of the channel is also used in some applicationswhere the extended Kalman filter can be applied [67]. Morerecent approaches can be found in [60], [61], and [68].

3) Notations: Most notations are standard: vectors andmatrices are boldface small and capital letters, respectively;the matrix transpose, the complex conjugate, the Hermitian,and pseudoinverse are denoted by, , , and ,respectively; stands for the Kronecker product; is the

identity matrix; is the mathematical expectation.

II. PROBLEM FORMULATION

In contrast to classical input–output channel identificationand estimation problems, the so-called blind channel iden-tification and estimation involves only the channel output.The basic problem considered here is: given the received(perhaps noisy) signal, estimate the channel impulse re-sponse.

A. Channel Models

We consider two equivalent channel models shown inFig. 3. The multichannel model is natural in applications

involving multiple receivers, whereas the multirate chan-nel model comes directly from communication problemsinvolving linear modulations.

1) The Multichannel Model:Considered in this paper isthe identification and estimation of a discrete-time linearsingle-input -output channel model shown in Fig. 3(a).Denoting the vector impulse response and its-transformby

(1)

we have the following system equations:

(2)

where is the noiseless channeloutput and is the received (noisy) signal.

It is often convenient to consider the channel model fora block of consecutive samples. Denoting

(3)

(4)

(5)

we then have

(6)

(7)

where the filtering matrix and the datamatrix are defined by

......

......

...

(8)

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1953

Page 4: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

We drop the subscript whenever its omission does notcause confusion.

2) Multirate Channel Model:An equivalent alternativeto multichannel representation is the multirate channelmodel shown in Fig. 3(b). If the input sequenceis widesense stationary, then is wide sense cyclostationary withperiod . It is the cyclostationarity of the received signalthat makes the identification using second-order statisticspossible. See [32] for detailed discussions.

The system equations are given by

(9)

The relation between in the multirate channelmodel and in the multichannel model isgiven by

(10)

(11)

B. Channel and Source Conditions

As in classical system identification problems, certainconditions about the channel and the source must be sat-isfied to ensure identifiability. In the multichannel blindidentification problem, two conditions are shared by manydifferent approaches.

1) Channel Diversity:What makes identification of themultichannel model different from that of the single channelcase is the channel diversity. By diversity we mean thatdifferent subchannels have different modes. When they aremodeled as finite impulse response channels, this meansthat they have different zeros, or in other words, they arecoprime.

The significance of coprimeness among subchannels canbe understood more clearly in the deterministic settingwhere no stochastic models are used to describe the in-put sequence. Consider the multichannel model shown inFig. 3. If the subchannels are not coprime, then there existsa common factor such that

(12)

Consequently, without further information, it is difficult todistinguish whether is part of the input signal or partof the channel. In fact, one can replace by factorsof the input sequence without affecting the observation.Therefore, the identification cannot be made unique (unlessother properties of the input sequence and the channel areused).

Another ramification of coprimeness among subchannelsis the existence of finite impulse response (FIR) inverse.When there is only one channel it is not possible toreconstruct the input sequence by using an FIR filter. This isnot necessarily true in the multichannel case. Indeed, thereexists a bank of filters such that

(13)

Fig. 4. A multichannel with FIR inverse.

if and only if subchannels are coprime. Equation (13)is the so-called Bezout equation. In other words, giventhe noiseless channel output , the channel inputcan be reconstructed using a bank of FIR filtersas shown in Fig. 4. In communication applications, filter

is often referred to as the zero-forcing receiver for itcompletely eliminates the channel distortion. The conceptof FIR inverse of a vector FIR channel has been exploitedin coding [80].

The coprimeness has several equivalent forms, and theyare summarized below.

Property 1: Subchannels do not share com-mon zeros:

P1) if and only if there exists a vector polynomialsuch that

(14)

P2) if and only if has full column rank for some.

P1) comes from a basic property of ring (e.g., see [31,p. 10]). P2) was shown by Sylvester (in 1840) as a testfor coprimeness [31], [63] (also see [29]). This conditionplays a particularly important role in the development ofmany blind identification algorithms based on second-orderstatistical and deterministic formulations.

2) Linear Complexity and Persistent Excitation:Linearcomplexity measures the predictability of a finite-lengthdeterministic sequence. Persistent excitation measures therichness of the infinite-length signal in the frequencydomain. Both concepts are relevant in the deterministicsetting of blind channel identification where the source isnot assumed to have a probabilistic model. We adopt thefollowing definition given in [10] and [47].

Definition 1: The linear complexity of sequenceis defined as the smallest value offor which there exists

such that

(15)

An infinite sequence is weakly persistently excitingof order if

...

(16)

where .

1954 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 5: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

A connection between linear complexity and persistentexcitation can be observed through the sample covariance ofthe input sequence, which enters the definition of persistentexcitation directly. We define a Toeplitz matrix by

......

...(17)

If has linear complexity or greater, then has fullcolumn rank. Hence, the sample covariance of the vectorsequence has full rank. Onthe other hand, if has linear complexity no greater than,the sample covariance of is rank deficient. For a quasi-stationary sequence, persistent excitation implies that thesample covariance matrix is full rank and its spectrum hasat least nonzero points (see [78]).

III. T HE SUBSPACE METHODS

Many recent blind channel estimation techniques exploitsubspace structures of observation. The key idea is thatthe channel (or part of the channel) vector is in a one-dimensional subspace of either the observation statistics ora block of noiseless observations. These methods, which areoften referred to as subspace algorithms, have the attractiveproperty that the channel estimates can often be obtainedin a closed form from optimizing a quadratic cost function

(18)

where is a set that specifies the domain of.Subspace methods can sometimes be considered part of

the moment methods. They are attractive because of theclosed-form identification. On the other hand, as they relyon the property that the channel lies in a unique direction(subspace), they may not be robust against modeling errors,especially when the channel matrix is close to beingsingular. The second disadvantage is that they are oftenmore computationally expensive.

A. Deterministic Subspace Methods

Deterministic subspace methods do not assume that theinput source has a specific statistical structure. Perhapsa more striking property of deterministic subspace meth-ods is the so-called finite sample convergence property.Namely, when there is no noise, the estimator producesthe exact channel using only a finite number of samples,provided that, of course, the identifiability condition issatisfied. Therefore, these methods are most effective athigh SNR and for small data sample scenarios. On onehand, deterministic methods can be applied to a muchwider range of source signals; on the other hand, not usingthe source statistics affects its asymptotic performance,especially when the identifiability condition is close to beviolated.

1) Assumptions and Identifiability:Deterministic sub-space methods assume the following conditions.

Assumption 1:

1.1) The noise is zero mean, white with knowncovariance .

1.2) The channel has known order.

The assumption that is known may not be practical.To address this problem, there are three approaches. First,channel order detection and parameter estimation can beperformed separately. There are well-known order detectionschemes that can be used in practice (e.g., see [5], [96],and [129]). Second, some statistical subspace methods [3]require only the upper bound of. Third, channel orderdetection and parameter estimation can be performed jointly[123]. Similarly, the noise variance may not be knownin practice, but it can be estimated in many ways. Forexample, the noise variance estimation and channel orderdetection can be performed jointly using singular values ofthe estimated covariance matrix [129].

For deterministic methods it is necessary to imposeconditions on the input sequence, which significantlycomplicates the identifiability condition. Xuet al. in [134]and Hua and Wax in [59] gave the necessary and suffi-cient conditions of identifiability. Here we present only asufficient condition for identifiability.

Theorem 1: Under Assumption 1, the channel (or) can be uniquely identified up to a constant factor from

the noiseless observation if

1) subchannels are coprime;2) the source sequence has linear complexity greater

than .

The condition that subchannels are coprime is also nec-essary for identifiability. It was shown in [134] that itis necessary that the linear complexity of the source,characterized by “modes” in [134], is greater than. Whenthe source is a realization of an ergodic process, the linearcomplexity condition is satisfied automatically, and wehave the same identifiability condition as the stochasticformulation. The persistent excitation of the source, alongwith the coprime condition of subchannels certainly alsoensures the identifiability.

2) The Cross Relation Approach:The cross relation (CR)approach, a termed coined by Hua [56], wisely exploitsthe multichannel structure. This algorithm was discoveredindependently and in different forms by Liuet al. [71],[134], Gurreli and Nikias [50], [51], Baccala and Roy [7],[8], and Robinson [98]. An adaptive implementation usingneural network is presented in [22] and [23].

Consider the noiseless multichannel model involvingchannels and shown in Fig. 5. We simply have for all

and

(19)

In the matrix form, we have

... (20)

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1955

Page 6: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

Fig. 5. The cross relation between two channels.

where can be constructed from the received datasamples

......

...(21)

Of course, one can consider all possible pairs to obtain thefollowing identification equation

... (22)

where is the data selection transform [71], [138] asshown in (23), shown at the bottom of the page. It is shownin [134] that, under the identifiability conditions,is column-rank deficient by one. Hence, the solution of(22) provides the channel identification up to a scalingfactor. When noise presents only are available.The Cramer–Rao approach minimizes the following leastsquares (LS) cost

(24)

Equivalently, the channel estimatecan be obtained fromthe singular vector of associated with thesmallest singular value. It can be shown further [138] that

can be replaced by the sample covarianceof . By subtracting the noise statistics, a mean-squareconsistent estimator can be obtained.

a) Algorithm characteristics:Unlike statistical meth-ods, the CR method is very effective for small data sampleapplications at high SNR. Under the condition that sub-channels are coprime and linear complexity conditions,observations of samples are sufficient [59]. Insimulations, Hua [56] showed that CR method combinedwith the ML approach offers performance close to theCramer–Rao lower bound. The main problem of the CR

method is that the channel ordercannot be over estimated,in contrast to some of the statistical subspace approaches.For finite samples, this algorithm may also be biased.

3) Noise Subspace Approach:The (noise) subspacemethod, proposed by Moulineset al. [82], [83], exploits thestructure of the filtering matrix directly. The basicidea is to force the signal space to have the block Toeplitzform of . The dual of this approach is to force theToeplitz structure of presented in [127], thus bothcan be considered as forms of subspace intersection. See[127] for this connection.

Suppose that is in the orthogonalcomplement of the range space of , i.e.,

......

. . .. . . (25)

The above can also be written as a linear equation withrespect to the channel parameter. Specifically, we have

......

(26)

Here, is the th subvector of :. The above equation can be used

to identify the channel vector provided that (26) hasa unique solution. Moulineset al. gave the followingtheorem.

Theorem 2 [83]: Let span be theorthogonal complement of the column space of . Forany and satisfying the condition that subchannels arecoprime, if and only if . Further,for all , satisfies the following equation:

(27)

Having the estimated basis of the orthogonal com-plement of , identification of channel can be accom-plished by the following optimization:

(28)

......

......

......

......

.... . .

.... . .

......

......

...

(23)

1956 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 7: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

With the above theorem, the estimation of the channelcan now be accomplished by first estimating the orthogonalcomplement of the . This can be achieved in a numberof ways. One of the frequently used approaches is thesignal–noise space decomposition. From the multichannelmodel, we have

(29)

The singular value decomposition of has the form

diag

(30)

where are the singular values of . If subchan-nels are coprime, i.e., is full column rank, theorthogonal complement of the range space of , alsoreferred to as the noise subspace, is given by the singularvectors of associated with the singularvalue . Note that when there is no noise canbe obtained directly, using only a finite number of datasamples from the eigen-decomposition of the data matrix

. Incidentally, vectors in noise spacecan also be viewed as linear prediction error (or theblocking) filters. With this interpretation, Slock presented alinear prediction-based subspace approach [108].

a) Algorithm characteristics:There is a strong connec-tion between the CR and the noise subspace approaches. Aspointed out in [2], they are different only in their choicesof parameterizing the signal and the noise subspaces. Fora special but important case when [138], these twoalgorithms are in fact identical. Similar to the CR method,the noise subspace method also requires the knowledgeof the channel order . Overdetermination of the channelorder renders the algorithm ineffective without additionalprocessing. The noise subspace method is also suitablefor short data size applications. Although it is a bit morecomplex than the CR method, it appears to offer improvedperformance in many simulations. Several extensions havebeen obtained. Huaet al. [57] investigated the minimumnoise subspace for channel identification. The multiusercase is presented in [2].

4) Identification via Least Squares Smoothing:Althoughdeterministic approaches enjoy the advantage of having fastconvergence, they share some common difficulties. Forexample, the determination of channel order is requiredand often difficult. Second, the adaptive implementationof these algorithms is not straightforward. Recently, Tongand Zhao proposed approaches based on the least squaressmoothing (LSS) of the observation process [121]–[123],[141], [142].

The key idea of LSS rests on the isomorphic relationbetween the input and the observation spaces. Given theinput sequence and the noiseless observation , we

define

(31)

span

span (32)

where and are spaces spanned by the past inputand (noiseless) observation vectors. It can be shown thatwhen the channel is identifiable, there exists asuch that

(33)

This implies that the input and observation spaces are iden-tical. We now change the problem of identifying channelusing the input subspaces.

For simplicity, consider the case for . We have

We define a projection space that satisfies the followingtwo conditions: 1) and 2)

. Because of the isomorphic relation between theinput and observation spaces, it can be shown that

(34)

which is the space spanned by the past and future observa-tions. The smoothing error of has the followingform:

(35)

The channel vector can then be obtained from the projectionerror matrix . A general formulation that does not requirethe knowledge of the channel order is given in [121] and[123].

a) Algorithm characteristics:This approach has twoattractive features. First, it converts a channel estimationproblem to a linear LSS problem for which there areefficient adaptive implementations [141], [142] using latticefilters. Second, a joint order detection and channel algorithm[121], [123] (J-LSS) can be derived that determines thebest channel order and channel coefficients to minimize thesmoothing error. J-LSS is perhaps the only deterministicapproach that enables channel identification with only theknowledge of the upper bound of the channel order.

B. Second-Order Statistical Subspace Methods

1) Assumptions and Identifiability:In statistical subspaceapproaches, it is assumed that the sourceis a randomsequence with known second-order statistics. Althoughalgorithms discussed here can be extended in many differentways, we shall assume the following assumptions in ourdiscussion.

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1957

Page 8: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

Assumption 2:

2.1) The source sequence is zero mean, white, with unitvariance.

2.2) The noise sequence , uncorrelated with , iszero mean, white, with known covariance .

2.3) The channel order is known.

Most algorithms can be extended to cases when the noise iscolored but with known correlations. Some of the statisticalmethods do not require knowledge of the channel order.They require, instead, the upper bound of the channel order.

One of the most important questions is channel identifia-bility, i.e., given the second-order statistics of, canbe uniquely determined up to a constant factor? The answerto this question is affirmative provided that the subchannelsare coprime.

To illuminate this issue, we present a frequency-domainargument given in [117], but in a slightly different form.Instead of using the multirate channel model we considerthe multichannel model where the second-order statistics ofthe noiseless received signal are given by

(36)

It can be verified that is related to the channel by

(37)

We therefore have

(38)

Hence, those zeros of not shared by can beidentified from the zeros of . Thus, all zerosof , and itself, can be identified from the zerosof if all channels do not share a commonzero. Conversely, if all channels share a common zero,then one can replace with without affectingfor all and . Thus the channel is not identifiable. Wethen have the following necessary and sufficient conditionfor channel identifiability.

Theorem 3 [117], [119]: Under Assumption 2, the

channel (or ) can be uniquely identifiedup to a constant factor from the autocorrelationfunction of the multichannelmodel [or, equivalently, the autocorrelation function

in the multirate channel model] ifand only if subchannels are coprime.

2) Identification via Cyclic Spectra:Having shown thatthe channel is uniquely determined from the second-order statistics, the next question is how to estimate thechannel. Indeed, the arguments leading to the identifiabilitycondition already suggest that channels can be identifiedfrom the zeros of the output spectra. Unfortunately, findingzeros of the estimated spectra accurately is difficult because

locations of the zeros can vary significantly with smallvariations of the estimated autocorrelation. To circumventsuch difficulties, a subspace approach was first proposed in[117] and later independently in [41].

We present next an approach based on the multiratechannel model. Although an equivalent method can beobtained for the multichannel model, the multirate modelexploits the cyclostationary properties of the signal, whichare also used in more recent approaches [15], [104] to solveblind channel estimation problems for cases that do not havea clear multichannel representation.

For the multirate channel model, it can be shownthat the observation process is cyclostationary, i.e.,

is periodic in with period . Letbe the “instantaneous spectrum”

(39)

Since is periodic in , it has a Fourier seriesrepresentation where is referred to asthe th cyclic spectrum. It is easy to show [117] that

(40)

Since is assumed to be known (or can be estimated inpractice), we are dealing with the following identificationequation in the frequency domain

(41)

where is the unit impulse, and the left-hand side of theabove equation is known. A line of arguments identical tomultichannel case can be used to show that the channel isidentifiable if and only if the multirate channel doesnot have uniformly -spaced zeros, which is equivalentto the channel diversity condition, i.e., all subchannels inthe vector channel model are coprime. To obtain channelidentification, observe that, for any and

(42)

It is clear that the time domain equivalence of the aboveidentification equation (from the inversetransform of theabove) leads to a set of linear equations with respect to thechannel coefficients

(43)

where can be constructed from the cyclic cor-relations of the received signal. The specific forms of

can be found in [117]. Therefore, is in thenull space of matrix . Combining cyclic spectrafor all , the intersection of the null spaces of all

gives the unique one-dimensional subspace towhich the channel vector belongs.

1958 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 9: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

A practical estimation algorithm can be derived from thefollowing optimization:

(44)

where the channel estimate is given by the quadraticoptimization

(45)

where is the estimated .a) Algorithm characteristics and related work:This al-

gorithm exploits the complete cyclic statistics of the re-ceived and source signals, as well as the FIR structure ofthe channel model. The disadvantage of this algorithm isthat it requires the convergence of source statistics, whichmeans that, even when there is no noise, there is estimationerror for any fixed sample size, although the algorithm ismean square consistent.

In related work, Li and Ding [69] developed a frequencydomain nonparametric approach that identifies the magni-tude and phase response separately from the cyclostationarystatistics. Aghajanet al. [4] obtained its extension formultiuser scenarios. A similar approach is also presentedin [14].

3) Identification via Filtering Transform:The first second-order statistical approach to blind channel estimation wasproposed in [118], [119]. In this approach, the authorspresented a two-step closed-form identification algorithm.The algorithm finds first the filtering matrix and thenestimates the channel from the estimated filtering matrix.

Considering the time domain channel (2), we have

(46)

(47)

where is the filtering transform , and is the“shifting” matrix with the first lower off-diagonal entriesbeing one and zero elsewhere. It was then shown thatcan be computed from and .

a) Algorithm characteristics and related work:The im-plementation of this algorithm requires the channel orderand the noise variance, both of which, in principle, canbe estimated from the SVD of the estimated covariancematrix . While it is consistent, this approach may notperform well for two reasons. First, the algorithm failsto take advantage of the special structure of the filteringtransform . Second, the performance of such a two-step procedure is often affected by the quality of theestimation in the first step. On the other hand, since nostructure in is assumed, when a large number ofchannels is available, this algorithm can be applied to[instead of ] directly, which may have computationaladvantages. An extension of this approach to colored source

was presented in [58] and its performance analysis wasperformed in [92]. The extension to the multiuser case wasgiven in [70].

4) Identification via Linear Prediction:Introduced firstby Slock [108]–[110], the linear prediction formulation ofthe multichannel problem plays an important role in thedevelopment of several algorithms. We present next onesuch approach by Abed-Meraimet al. [3].

Consider the multichannel model given in (2). The keyidea comes from the recognition that the multichannel MAprocess is also autoregressive. Under the condition thatsubchannels are coprime there exists a causal suchthat

(48)

Substituting above into the multichannelmodel, we have

(49)

Since is a white sequence, is orthogonal to allfor . Since is causal, is the summation

of the optimal linear prediction and the innovation (theprediction error) process .

The identification of the channel involves two steps.

1) Identification of : The prediction error covarianceis thus given by the covariance of the innovationprocess, i.e.,

Cov (50)

Given the autocorrelation function of , theleft-hand side can be computed explicitly using thestandard theory in linear prediction. The right-handside is a rank one matrix made of the vector ofthe first coefficient of the channel impulse response.Therefore, we can obtain for some un-known from the eigenvector of the prediction errorcovariance associated with the largest eigenvalue.

2) Identification of : Once is obtained, from (49),the input sequence can be constructed directly fromthe innovation sequence

(51)

With the estimated input, we essentially have thestandard input–output channel identification problem.

a) Algorithm characteristics and related work:The al-gorithm uses all second-order statistics of the receivedsignal, and it is mean square consistent. It does not re-quire the exact channel order, thus it is robust againstoverdetermination of the channel order. Derived from thenoiseless model, the linear prediction idea is no longervalid in the presence of noise. However, when channelparameters are estimated from the autocorrelation functions,

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1959

Page 10: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

the effect of noise can be lessened by subtracting the termsrelated to the noise correlation. The main disadvantageof this algorithm is that it is a two-step approach whoseperformance depends on the accuracy of the estimated

. When noise presents and is small, performancedegradation may be significant. Like all statistical momentmethods, the convergence of the source statistics is alsorequired.

The linear prediction-based approaches appear to berooted in a somewhat surprising result by Liu and Dong[75]. It is shown that, for the multichannel model, awhitening filter is in fact a perfect equalizer, which is nottrue in the single channel case. Specifically, a finite order

is a whitening filter, i.e., is a whitesequence if and only if for some . Inthe spectral domain this result is a consequence of themaximum modulus theorem [100, p. 212].

A number of new approaches have been proposed re-cently. Based on the linear prediction framework, Gorokhovet al. proposed a weighted least squares approach in [48].Ding [20] proposed the outer-product decomposition algo-rithm (OPDA) that obtains the channel directly, henceavoiding the problem of small . Although OPDA wasnot derived from the linear prediction view point, it hasthe same identification equation as the multistep linearprediction approach derived by Gesbert and Duhamel [33].

C. Other Related Subspace Approaches

Space limits our exposition of many channel estimationapproaches developed recently. Here we mention two re-lated classes of approaches that can be applied to generalsubspace methods for improved performance.

1) Weighted Subspace Approaches:Subspace approachesusually involve estimating the channel vector (or perhapspart of the channel vector) by optimizing a quadratic costfunction

(52)

where is obtained from the received data. The weightedsubspace approaches, successfully used in the direction ofarrival estimation in array signal processing (see [128]),employ an additional weighting matrix which is chosenoptimally in some ways

(53)

The optimal selection of the weighting matrix is, however,nontrivial, and it is often a function of the true channelparameters. A practical solution is to use a consistentestimate of the channel to construct the optimal weightingmatrix (see [1], [13], [48], and [66]).

2) Exploiting Signal Waveforms:Exploiting side infor-mation proves to be an effective way of circumventingthe difficulties associated with the ill-conditioning of thechannel matrix. Recognizing that in many communicationapplications the waveforms used in the transmission is oftenknown, Schellet al. first proposed a subchannel response

matching approach [102], [103]. Principle componentstructure of the channel was used in [139]. Ding and Maopresented a knowledge-based approach [21]. In the multipleantenna array setting, while Gunther and Swindlehust[49] developed an ESPRIT-like subspace approach byparameterizing the channel using physical parameters suchas relative delays of multipaths. Applications to IS-136 arereported in [143].

IV. OPTIMAL MOMENT METHODS: PERFORMANCE

AND MATCHING TECHNIQUES

When the source has a statistical model, most subspacemethods are part of the moment methods. Specifically,they all can be viewed as estimating channel parametersfrom the estimated second-order momentsof the receivedsignal. For the class of consistent estimators, asymptoticnormalized mean square error (ANMSE) can be used asa performance measure. Specifically, given the estimatedsecond-order moments from observations, the

ANMSE of the estimator is defined by

ANMSE (54)

when the limit exists. By normalization we mean that boththe channel and its estimate are normalized to unit 2-norm.Further, to obtain a meaningful MSE, we also assume thatthe scaler ambiguity of the estimate has been removed.ANMSE measures the MSE of the consistent estimator

for a sufficiently large sample size

ANMSE

Obviously, smaller ANMSE is desired. WhenANMSE , it implies that the estimatordoes not have convergence at the rate of .

In analyzing the class of blind channel estimators usingthe second-order moments, we pose the following ques-tions.

1) What is the achievable ANMSE among all consistentestimators using consistent estimates of second-ordermoments?

2) What are fundamental limitations to the ANMSEof blind channel estimators using the second-orderstatistics?

3) What is the ANMSE of existing subspace estimatorsand what are their performance limitations?

4) How much potential improvement can be made overthe existing subspace based moment estimators?

These questions are addressed in part in [1], [2], [43], [44],[93], [137], [139], and [140].

A. The Achievable ANMSE and Performance Bounds

The question considered here is the following: givenconsistent estimates of the second-order moments ofthe observation, what is the minimum ANMSE that canbe achieved by an estimator using? The answer to

1960 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 11: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

this problem can be obtained by applying the asymptoticperformance analysis of general moment methods [91]. Forthe case involving real signals Zeng and Tong gave thefollowing theorem. The complex case can be found in [44]and [136].

Theorem 4 [140]: Let be the vector consisting of(nonredundant) autocorrelation coefficients. Assume that

, the Jacobian of the autocorrelation vectorwithrespect to the channel vector, is full column rank. Let

be the estimated autocorrelation vector obtained fromwith normalized asymptotic

covariance . Let be a channel parameterestimator such that . Then the ANMSE ofis lower bounded by

ANMSE tr

SNR(55)

where is a constant, is the condition number of, and the SNR is defined as

SNR (56)

Moreover, there exists an estimator that achievesthe lower bound tr .

From (55), it is clear that the performance of all momentmethods are limited by the condition number of the Jaco-bian , which leads to the following question: whenis singular? This question has a surprisingly simplecondition.

Lemma 1 [140]: is singular if and only ifshare common conjugate reciprocal zeros (CRZ), or equiv-alently, share common zeros.

The above condition shows an interesting difference fromthe condition of identifiability (subchannels are coprime).Note that the violation of the identifiability condition doesnot imply that no moment algorithm can achieve theANMSE bound. When subchannels do have common zeros,there are multiple but possibly finite numbers of possiblesolutions to the identification equation. If one can restrictthe parameter set to the neighborhood of the true chan-nel, optimal algorithms with minimum ANMSE do exist[44]. This, of course, is not unique to the multichannelidentification.

It is also interesting to compare the performance boundfor the CR and the noise subspace method. For the specialcase when , the ANMSE of both CR and thenoise subspace methods can be obtained easily if thecovariance matrix has the Wishart distribution. Underthis assumption, it can be shown that

ANMSESNR

(57)

where arethe singular values of the , is the conditionnumber of and is a constant. If the source isGaussian, Abed-Meraimet al. obtained a different bound[2].

The above bound shows that the CR and noise subspacemethods are limited by the condition number of the channelmatrix or the locations of channel zeros. Indeed,subspace methods often suffer from the ill-conditioningof the matrix from which they are derived. For example,certain channels have closely located zeros, which causesthe ill-conditioning of the channel matrix. This effect wasillustrated in Endreset al. [25].

B. Moment Matching Techniques

The moment matching approach is motivated by theexistence of a moment method that achieves the minimumANMSE. Giannakis and Halford investigated the generalmoment matching approach of the following form:

(58)

where is a weighting matrix. By choosing appropriate, as a function of , the so-called asymptotic best con-

sistent (ABC) estimator achieving the minimum ANMSEwas proposed in [43] and [44]. The suboptimal approachwith no weighting was investigated in [120].

While moment matching methods have a much morerobust performance against channel order selection andthe channel condition, they are unfortunately not easy toimplement because of the existence of local minima in theoptimization. To incorporate the subspace structure into themoment matching approach, Zeng and Tong proposed in[139] the following channel estimation criterion:

(59)

where is a linear subspace containing that used in thesubspace algorithms. The selection ofleads to a methodthat combines both subspace and moment matches.

V. THE ML M ETHODS

One of the most popular parameter estimation algorithmsis the ML method. Not only can such methods be de-rived in a systematic way, but perhaps more importantly,the class of maximum likelihood estimators are usuallyoptimal for large data records as they approximate theminimum variance unbiased estimators. Asymptotically,under certain regularity conditions, the variance of MLestimators approach the Cramer–Rao bound (CRB), whichis the lower bound on variance for all unbiased estimators.Unfortunately, unlike subspace based approaches, the MLmethods usually cannot be obtained in closed form. Theirimplementations are further complicated by the existenceof local minima. However, ML approaches can be madevery effective by including the subspace and other subop-timal approaches as initialization procedures. The general

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1961

Page 12: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

formulation of the ML estimation can be found in manytextbooks (e.g., see [91]).

The problem at hand is to estimate the deterministic(vector) parameter given the probabilistic model of theobservation. Specifically, let be the probabilitydensity function of random variable parameterized by

. Given an observation , is estimated bymaximizing

(60)

where , when viewed as the function of, is referredto as the likelihood function.

The ML-based blind channel estimation can be derivedbased on either the statistical or the deterministic settingdepending on the model of the source signal.

SML Statistical ML estimation:In such a case, theinput sequence is assumed to be random with aknown distribution. In such a formulation, the onlyunknown parameter is the channel vector (

). In this case, the dimension of the unknownparameter is fixed with respect to the data size.

DML Deterministic ML estimation;Here the input se-quence is part of the unknown parameters, i.e.,

, although one may only be in-terested in estimating . In such a case, thedimension of the parameters increases with thesize of the observation.

These two classes of ML estimators are discussed next.

A. DML Approach

The DML approach assumes no statistical model for theinput sequence . In other words, both the channel vectorand the input source vectorare parameters to be estimated.In this paper, we shall only consider the estimation of thechannel.

Consider the multichannel model in (2)

(61)

The DML problem can be stated as follows: given,estimate by

(62)

where is the density function of the observationvectors parameterized by both the channeland theinput source .

When the noise is zero-mean Gaussian with covariance, the ML estimates can be obtained by the nonlinear

least squares optimization

(63)

1) Assumptions and Identifiability:In considering the de-terministic model, we assume the following assumptions.

Assumption 3:

3.1) The noise is zero mean, Gaussian, with knowncovariance .

3.2) The channel has known order.

We note that the noise variance can also be consideredas part of the parameters. For simplicity and consistencywith other approaches it is assumed to be known in ourdiscussion. Note also that the set of assumptions for DMLis almost the same as that for the deterministic subspacemethods, except that the noise in DML is assumed to beGaussian. Again, the channel modelmust be known foridentifiability reasons.

It is not surprising that the identifiability condition forDML is the same as that for the deterministic second-ordermoment methods. Specifically, the channel is identifiableif subchannels are coprime and the source has linear com-plexity greater than . The reason is that, when thenoise is Gaussian, all information about the channel in thelikelihood function resides in the second-order moments ofthe observations. Readers are referred to Theorem 1 forsufficient conditions and related discussions.

2) IQML, TSML, and Other Iterative Methods:These al-gorithms are developed by Hua [55], [56] and around thesame time by Slock [108]. The iterative quadratic maximumlikelihood (IQML) approach, proposed by Bresler andMacovski [11] for estimating superimposed exponentialsignals, transforms the DML problem into a sequence ofquadratic optimization problems for which simple solutionscan be obtained. It turns out that IQML has a related formin blind channel estimation using DML. This connection,first pointed out by Slock and Papadias [108], [110], has itsroot in the linear prediction formulation in both problems.

The joint optimization of the likelihood function in boththe channel and the source parameter spaces is difficult.Fortunately, the observation is linear in both the channeland the input parameters. In other words, we have aseparable nonlinear LS problem, which allows us to reducethe complexity considerably. The nonlinear LS optimizationcan be achieved sequentially in one of the following ways:

(64)

(65)

Considering next the optimization in (64), we have

(66)

where is a projection transform of into the orthog-onal complement of the range space of or the noisesubspace of the observation.

The key of IQML type of algorithms is the parameteri-zation of . Hua in [56] obtained directly from

1962 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 13: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

the channel vector . Fig. 5 provided the clue for such aconstruction, where it is clear that the channel itself canbe used to null the noiseless observation, a process called“blocking” by Slock. Hua’s construction of uses thedata selection transform defined in (64) to obtain the IQMLform

(67)

where can be obtained easily from, and isa matrix constructed from . To implement the DMLestimation, Hua proposed a two-step approach referred toas the two-step maximum likelihood (TSML) method that1) uses the CR method to obtain an initial estimate of thechannel and 2) substitute the initial estimate into andoptimize (67) recursively.

a) Algorithm characteristics and related work:A num-ber of IQML type of approaches have been proposeddepending on the parameterization of the projection .In Slock’s “minimum null-space parameterization” [110],IQML is applied to the blocking filter. A different approachwas developed by Harikumar and Bresler [53]. This IQMLtype of algorithm (not surprisingly) offers more efficientchannel estimates when compared with moment methods.Hua demonstrated that TSML is both “high SNR” con-sistent and efficient. Similarly, Harikumar and Bresler alsoshowed that the CR method used in Hua’s TSML is a coarseapproximation of IQML, which ultimately supports Hua’sTSML. The performance comparison with the Cram´er–Raobound has also been obtained in [53], [56], and [85]. As a“dual” to the IQML-type of algorithms, Feder and Catipovic[27] proposed a DML by obtaining first by optimizingfirst the inner term in (65). Since the estimation of theinput is obtained first, it suffers from the fact that thedimension of the problem increases with the sample size,which renders this approach not practical for large datasize applications. For cases when the input sequence hasthe finite alphabet property, simplifications can be obtained(see [27]).

3) DML for Finite Alphabet Input:Similar to SML withhidden Markov model (HMM), finite alphabet propertiescan also be incorporated into DML. Because of the finitealphabet property, it is difficult to apply the separationidea in IQML-type approach. Consequently, this class ofalgorithms, first proposed by Seshadri [105] and Ghosh andWeber [36], iterates between estimates of the channel andthe input. At iteration , with an initial guess of the channel

, the algorithm estimates the input sequence andthe channel for the next iteration by

(68)

(69)

where is the (discrete) domain of. The optimizationin (69) is a linear least squares problem whereas theoptimization in (68) can be achieved by using the Viterbi

algorithm [28]. The convergence of such approaches arenot guaranteed in general.

a) Algorithm characteristics and related work:The fi-nite alphabet nature of the input makes the evaluation of theCramer–Rao lower bound difficult. Paris argued in [86] that,if the input sequence is equally probable, the probabilitythat the above estimatediffers from the ML estimate ofwith known diminishes with the noise variance. Similarly,at high SNR, one can expect that the above channel estimateis close to the ML channel estimate with known input.There are many variations in the implementation of thenonlinear LS to reduce the implementation complexity.Seshadri presented “blind” trellis search techniques [106].Reduced-state sequence estimation [26] was proposed in[36]. The so-called iterative LS with projection (ILSP)proposed by Talwaret al. [111], [112] is a relaxationtechnique that first ignores the finite alphabet property andthen projects the estimate to its nearest discrete value.Raheli et al. proposed a per-survivor processing techniquein [95]. An algebraic approach was presented by Yellin andPorat [135].

B. Statistical Maximum Likelihood Approach

We consider the statistical model where the source se-quence is random. The formulation of the problemis straightforward in principle. Recalling the multichannelmodel (2) where we consider a block of received vectors

(70)

where we have omitted the time index becausehasincluded all observations. The SML problem can be statedas follows: given , estimate by

(71)where is the density function of the observationvectors parameterized by .

1) Assumptions and Identifiability:The SML estimationhinges on the availability and the evaluation of the like-lihood function. Although SML applies to more generalcases, we shall make the following assumptions in ourdiscussion.

Assumption 4:

4.1) components of and are jointly independent;4.2) is zero mean Gaussian with covariance ;4.3) components of are independently, identically dis-

tributed (i.i.d.) with known probability density func-tion.

Identifiability remains to be an important issue in SMLapproach. The identifiability condition tells when SML canbe applied. A main issue is whether the likelihood func-tion provides sufficient information to distinguish differentmodels. Specifically, is identifiable if(almost everywhere) implies for some . It is not

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1963

Page 14: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

surprising to see that the class of channels identifiable bySML is larger than that by moments. Obviously, parametersidentifiable by moments are identifiable by the likelihoodfunction. It can be shown further that as long asis non-Gaussian, the linear structure in (70) is uniquely determinedby the density function [62]. Indeed, under thenon-Gaussian assumption of the source, subspace methodsdeveloped by Giannakis and Mendel [39] can be usedto obtain closed-form identification based on higher ordercumulants of the observation. When is Gaussian, allstatistical information about the channel is contained in thesecond-order moment. In such a case, identifiability can beensured if subchannels are coprime.

Theorem 5: Under Assumption 4, the channel parameteris identifiable by the likelihood function if and only if

one of the following conditions is satisfied:

1) is non-Gaussian;2) subchannels are coprime.

2) The EM Approach:The SML optimization in (71) isin general difficult because is nonconvex. Theexpectation-maximization (EM) algorithm [9], [19] can beapplied to transform the complicated optimization (71) to asequence of quadratic optimizations. We shall give an intu-itive explanation of this idea. More rigorous developmentcan be found in [19].

If the input is known, e.g., , the ML estimation ofis a simple LS problem involving maximizing a quadraticcost

(72)

where is a constant. When is unknown but with knowndistribution, we should consider maximizing the abovecost averaged over all possible input sequences. Given thereceived signal , this average should be performed usingthe a posteriori distribution of

(73)

Unfortunately, the computation of requires the knowl-edge of the true channel parameter. To circumvent thisdifficulty, we may consider an approximation of the abovecost function using, at iteration, the current estimate

(74)

(75)

(76)

It is clear that the maximization of with respectto is in fact a quadratic optimization problem with thenew estimate given by

(77)

One recognizes that the only difference between the abovesolution and that when the input is known is the conditionalexpectation.

Although the above arguments are solid heuristics, it is

not clear whether . This is indeed the caseprovided a good initial guess of the channel is available.

Theorem 6 [19]: If ,then .

The above theorem implies that the maximization ofthe likelihood function can be achieved by a sequentialmaximization the “auxiliary function” , whichhas a closed-form solution

E: Compute:

M: Maximization:

a) Algorithm characteristics and related work:The per-formance of EM algorithm depends on its initialization,which may be facilitated by moment techniques such asthose described in Section III (see [89]). When EM con-verges globally, the estimate achieves asymptotically theCRB for the case of i.i.d. sequences (which is not the casehere). See [18] for the evaluation of CRB when the inputis Gaussian.

Various algorithms are implemented either in “on-line”or batch modes. Kaleh and Vallet [64] first applied the EMalgorithm to the equalization of communication channelswith input sequence having finite alphabet property. Byusing an HMM they developed a batch (off-line) procedurethat includes the so-called forward and backward recursions[94]. The complexity of this algorithm increases exponen-tially with the channel memory. Shao and Nikias [107]proposed an approximation for calculating the elements ofthe conditional autocorrelation matrix and correlation vectorinvolved in (77). Such an approximation becomes exact asthe block size approaches infinity. At high SNR, furtherreduction of complexity can be achieved as shown in [87].

To relax the memory requirements and facilitate channeltracking, “on-line” sequential approaches have been pro-posed in [113], [114], and [130] for general input, and in[65] for input with finite alphabet properties under an HMMformulation. Given the appropriate regularity conditions[113] and a good initialization guess, it can be shownthat these algorithms converge (almost surely and in themean-square sense) to the true channel value. To reducethe implementation complexity associated with the HMMformulation, suboptimal approaches have been proposed in[131] and [132]. The complexity of the implementationin [88], [132] increases linearly with the channel. WhenSNR approaches infinity, the suboptimal implementationachieves optimality.

1964 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 15: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

VI. CONCLUSION

In this paper, we have presented some recent develop-ments in blind identification and estimation of single inputand multiple output channels. Depending on the application,the problem of blind identification is a problem of exploit-ing structural information of the channel and properties ofits input. Because different applications utilize differentstructures to specify the unknown parameters, there isgreat diversity in developing new approaches. Recently,there have been several developments in semiblind channelestimation techniques. Semiblind channel estimation standsfor cases when part of the input is accessible. This case issignificant in several ways. First, the availability of certaininput should improve the performance of any blind channelestimators. de Carvalho and Slock evaluated the CRB forblind channel estimation for both deterministic and Gauss-ian input cases [17], [18]. Second, previously unidentifiablechannel may become identifiable. If zeros are introducedperiodically in the input data sequence, Giannakis [37]showed that any FIR channel can be identified, and asubspace algorithm was proposed in [52]. Pal proposed asemiblind channel estimation technique in [84].

While it is clear that these methods have potential appli-cations in many different problems, it is still too early toassess their impact, for most studies have been conductedeither in simulation or with real data but in a controlledmanner.

REFERENCES

[1] K. Abed-Meraim, J. Cardoso, A. Gorokhov, P. Loubaton, andE. Moulines, “On subspace methods for blind identification ofsingle-input multiple-output FIR systems, ”IEEE Trans. SignalProcessing,vol. 45, pp. 42–55, Jan. 1997.

[2] K. Abed-Meraim, P. Loubaton, and E. Moulines, “A sub-space algorithm for certain blind identification problems,”IEEETrans. Inform. Theory,vol. 43, pp. 499–511, Mar. 1997.

[3] K. Abed-Meraim, E. Moulines, and P. Loubaton, “Predictionerror method for second-order blind identification,”IEEE Trans.Signal Processing,vol. 45, pp. 694–705, Mar. 1997.

[4] H. Aghajan, B. Hassibi, B. Khalaj, A. Paulraj, and T. Kailath,“Blind identification of FIR channels with multiple users viaspatio-temporal processing,” inProc. IEEE GLOBECOM,SanFrancisco, CA, Nov. 1994, vol. 3, pp. 1899–1903.

[5] H. Akaike, “A new look at the statistical model identification,”IEEE Trans. Automat. Contr.,vol. AC-19, pp. 716–723, Dec.1974.

[6] S. Amari and A. Cichocki, “Adaptive blind signal process-ing–Neural network approaches,” this issue, pp. 2026–2048.

[7] L. A. Baccala and S. Roy, “A new blind time-domain channelidentificaiton method based on cyclostationarity,” presented at26th Conf. Information Sciences and Systems, Princeton, NJ,Mar. 1994.

[8] , “A new blind time-domain channel identification methodbased on cyclostationarity,”IEEE Signal Processing Lett.,vol.1, pp. 89–91, June 1994.

[9] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, “A maximiza-tion technique occuring in the statistical analysis of probabilisticfunctions of Markov chains,”Ann. Math. Stat.,vol. 41, pp.164–171, 1970.

[10] R. E. Blahut, Algebraic Methods for Signal Processing andCommunications Coding.New York: Springer-Verlag, 1992.

[11] Y. Bresler and A. Macovski, “Exact maximum likelihoodparameter estimation of supperimposed exponential signals innoise,” IEEE Trans. Audio, Speech, Signal Processing,vol.ASSP-34, pp. 1081–1089, Oct. 1986.

[12] D. Brillinger, “The identification of polynomial systems bymeans of higher-order spectra,”J. Sound Vib.,vol. 20, pp.301–313, 1970.

[13] J.-F. Cardoso, P. Loubaton, and E. Moulines, “On weightedsubspace estimates in system identification,” inProc. 8th IEEESignal Processing Workshop Statistical and Array Signal Pro-cessing,Corfu, Greece, June 1996, pp. 352–355.

[14] A. Chevreuil and P. Loubaton, “On the use of conjugatecyclo-stationarity: A blind second-order multi-user equalizationmethod,” in Proc. IEEE Intl. Conf. Acoustics, Speech, SignalProcessing,Atlanta, GA, May 1996, vol. 5, pp. 2439–2442.

[15] , “Blind second-order identification of FIR channels:Forced cyclo-stationarity and structured subspace method,” inProc. 1st IEEE Signal Processing Workshop Signal ProcessingAdvances in Wireless Communication,Paris, France, Apr. 1997,vol. 1, pp. 121–124.

[16] S. Choi and R. Liu, “An adaptive system for direct blind multi-channel equalization,” inProc. 1st IEEE Signal ProcessingWorkshop Signal Processing Advances in Wireless Communi-cation, Paris, France, Apr. 1997, vol. 1, pp. 85–88.

[17] E. de Carvalho and D. T. M. Slock, “Cramer–Rao Boundsfor semi blind, blind and training sequence based channelestimation,” inProc. 1st IEEE Signal Processing Workshop onSignal Processing Advances in Wireless Communication,Apr.1997, vol. 1, pp. 129–132, Paris, France.

[18] , “Maximum-likelihood blind FIR multi-channel estima-tion with Gaussian prior for the symbols,” inProc. IEEE Intl.Conf. Acoustics, Speech, Signal Processing,Munich, Germany,Apr. 1997, vol. 5, pp. 3593–3596.

[19] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximumlikelihood from incomplete data via EM algorithm,”J. RoyalStatist. Soci.,vol. 39, ser. B, 1977.

[20] Z. Ding, “A blind channel identification algorithm based onmatrix outer-product,” inProc. 1995 IEEE Int. Conf. Commu-nications,Dallas, TX, pp. 852–856.

[21] Z. Ding and Z. Mao, “Knowledge based identification of frac-tionally sampled channels,” inProc. 1995 Int. Conf. Acoustics,Speech, Signal Processing,Detroit, MI, pp. 1996–1999.

[22] G. Dong and R. Liu, “Adaptive blind channel identification,” inProc. 1995 IEEE Int. Conf. Communications,Dallas, TX, June1996, pp. 828–831.

[23] , “An orthogonal learning rule for null-space trackingwith implementation to blind two-cahnnel identification,”IEEETrans. Circuits Syst. I,vol. 45, pp. 26–33, Jan 1998.

[24] D. Donoho, “On minimum entropy deconvolution,”AppliedTime Series Analysis II,D. F. Findley, Ed. New York: Aca-demic, 1981.

[25] T. Endres, B. D. O. Anderson, C. R. Johnson, and L. Tong, “Onthe robustness of FIR channel identification from second-orderstatistics,” IEEE Signal Processing Lett.,vol. 3, pp. 153–155,May 1996.

[26] M. V. Eyuboglu and S. U. Qureshi, “Reduced-state sequenceestimation with set partitioning and decision feedback,”IEEETrans. Commun.,vol. COM-36, pp. 13–20, Jan. 1988.

[27] M. Feder and J. A. Catipovic, “Algorithms for joint channelestimating and data recovery—Application to underwater com-munications,”IEEE J. Oceanic Eng,vol. 16, pp. 42–55, Jan.1991.

[28] G. D. Forney, “The Viterbi algorithm,”Proc. IEEE, vol. 61,pp. 268–278, Mar. 1972.

[29] , “Minimal bases of rational vector spaces, with applica-tions to multivariable linear systems,”SIAM J. Control,vol. 13,no. 3, pp. 493–520, May 1975.

[30] B. Friedlander and B. Porat, “Asymptotically optimal estimationof MA and ARMA parameters of non-Gaussian processes fromhigher-order moments,”IEEE Trans. Automat. Control,vol. 35,pp. 27–35, Jan. 1990.

[31] P. A. Fuhrmann,A Polynomial Approach to Linear Algebra.New York: Springer-Verlag, 1996.

[32] W. A. Gardner,Cyclostationarity in Communications and Sig-nal Processing. New York: IEEE Press, 1994.

[33] D. Gesbert and P. Duhamel, “Robust blind identification andequalization based on multi-step predictors,” inProc. IEEE Intl.Conf. Acoust. Speech, Sig. Proc.,Munich, Germany, Apr. 1997,vol. 5, pp. 2621–2624.

[34] D. Gesbert, C. B. Papadias, and A. Paulraj, “Blind equalizationof polyphase FIR channels: A whitening approach,” Asilomar,CA, Nov. 1997.

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1965

Page 16: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

[35] M. Ghosh, “Blind decision feedback equalization for terrestrialtelevision receivers,” this issue, pp. 2070–2081.

[36] M. Ghosh and C. L. Weber, “Maximum-likelihood blind equal-ization,” Opt. Eng.,vol. 31, no. 6, pp. 1224–1228, June 1992.

[37] G. Giannakis, “Filterbanks for blind channel identification andequalization,”IEEE Signal Processing Lett.,vol. 7, July 1997.

[38] G. Giannakis, Y. Inouye, and J. Mendel, “Cumulant basedidentification of multichannel moving-average models,”IEEETrans. Automat. Contr.,vol. 34, no. 7, pp. 783–787, July 1989.

[39] G. Giannakis and J. Mendel, “Identification of nonminimumphase systems using higher order statistics,”IEEE Trans. Acous-tics, Speech, Signal Processing,vol. 37, pp. 7360–377, Mar.1989.

[40] G. B. Giannakis and R. W. Heath, Jr., “Blind identificationof multichannel FIR blurs and perfect image restoration,”submitted for publication.

[41] G. B. Giannakis, “Linear cyclic correlation approach for blindidentification of FIR channels,” inProc. 28th Asilomar Conf.Signals, Systems, and Computers,Pacific Grove, CA, Nov.1994, pp. 420–423.

[42] G. B. Giannakis and W. Chen, “Blind blur identification andmultichannel image restoration using cyclostationarity,” inProc. IEEE Workshop Nonlinear Signal and Image Processing,Halkidiki, Greece, June 1996, pp. 543–546.

[43] G. B. Giannakis and S. D. Halford, “Performance analysis ofblind equalizers based on cyclostationary statistics,” inProc.26th Conf. Information Sciences and Systems,Princeton, NJ,Mar. 1994, pp. 711–716.

[44] , “Asymptotically optimal blind fractionally-spaced chan-nel estimation and performance analysis,”IEEE Trans. SignalProcessing,vol. 45, pp. 1815–1830, July 1997.

[45] G. B. Giannakis and C. Tepedelenlioglu, “Batch and adaptivedirect blind equalizers of multiple FIR Channels: A determinis-tic approach,” inProc. 30th Asilomar Conf. on Signals, Systems,and Computers,Pacific Grove, CA, Nov. 1996.

[46] D. N. Godard, “Self-recovering equalization and carrier trackingin two-dimensional data communication systems,”IEEE Trans.Commun.,vol. COM-28, pp. 1867–1875, Nov. 1980.

[47] G. C. Goodwin and K. S. Sin,Adaptive Filtering, Prediction,and Control. Englewood Cliffs, NJ: Prentice-Hall, 1984.

[48] A. Gorokhov, P. Loubaton, and E. Moulines, “Second-orderblind equalization in multiple input multiple output systems:A weighted least squares approach,” inProc. IEEE Intl. Conf.Acoustics, Speech, Signal Processing,Atlanta, GA, May 1996,vol. 5, pp. 2415–2418.

[49] J. Gunther and A. Swindlehust, “Algorithms for blind equal-ization with multiple antennas based on frequency domainsubspaces,” inProc. IEEE Intl. Conf. Acoustics, Speech, SignalProcessing,Atlanta, GA, May 1996, vol. 5, pp. 2419–2422.

[50] M. L. Gurelli and C. L. Nikias, “EVAM: An eigenvector-baseddeconvolution of input colored signals,”IEEE Trans. SignalProcessing,vol. 43, pp. 134–149, Jan 1995.

[51] M. I. Gurreli and C. L. Nikias, “A new eigenvector-basedalgorithm for multichannel blind deconvolution of input coloredsignals,” in Proc. IEEE Intl. Conf. Acoustics, Speech, SignalProc., Minneapolis, MN, Apr. 1993, vol. 4, pp. 448–451.

[52] S. D. Halford and G. B. Giannakis, “Direct blind equalizationfor transmitter induced cyclostationarity,” inProc. 1st IEEESignal Processing Workshop Signal Processing Advances inWireless Communication,Paris, France, Apr. 1997, vol. 1, pp.117–120.

[53] G. Harikuma and Y. Bresler, “Analysis and comparative evalua-tion of techniques for multichannel blind equalization,” inProc.8th IEEE Signal Processing Workshop Statistical and ArraySignal Processing,Corfu, Greece, June 1996, pp. 332–335.

[54] D. Hatzinakos and C. Nikias, “Estimation of multipath channelresponse in frequency selective channels,”IEEE J. Select. AreasCommun.,vol. 7, pp. 12–19, Jan. 1989.

[55] Y. Hua, “Fast maximum likelihood for blind identification ofmultiple FIR channels,” inProc. 28th Asilomar Conf. Signals,Systems, and Computers,Pacific Grove, CA, Nov. 1994.

[56] Y. Hua, “Fast maximum likelihood for blind identification ofmultiple FIR channels,”IEEE Trans. Signal Processing,vol.44, pp. 661–672, Mar. 1996.

[57] Y. Hua, K. Abed-Meraim, and M. Wax, “Blind system iden-tification using minimum noise subspace,” inProc. 8th IEEESignal Processing Workshop Statistical and Array Signal Pro-cessing,Corfu, Greece, June 1996, pp. 308–311.

[58] Y. Hua and W. Qiu, “Source correlation compensation for blindchannel identification based on second-order statistics,”IEEESignal Processing Lett.,vol. 1, pp. 119–120, Aug. 1994.

[59] Y. Hua and M. Wax, “Strict identifibility of multiple FIRchannels driven by an unknown arbitrary sequence,”IEEETrans. Signal Processing,vol. SP-44, pp. 756–759, Mar. 1996.

[60] R. A. Iltis, “A Bayesian maximum-likelihood sequence esti-mation algorithm fora priori unknown channels and symboltiming,” IEEE J. Select. Areas Commun.,vol. 10, pp. 579–588,Apr. 1992.

[61] R. A. Iltis, J. J. Shynk, and K. Giridhar, “Bayesian algorithmsfor blind equalization using parallel adaptive filtering,”IEEETrans. Commun.,vol. 42, pp. 1017–1032, Feb.–Apr. 1994.

[62] A. Kagan, Y. V. Linnik, and C. R. Rao,CharacterizationProblems in Mathematical Statistics.New York: Wiley, 1973.

[63] T. Kailath, Linear Systems. Englewood Cliffs, NJ: Prentice-Hall, 1980.

[64] G. K. Kaleh and R. Vallet, “Joint parameter estimation andsymbol detection for linear or nonlinear unknown dispersivechannels,”IEEE Trans. Commun.,vol. 42, pp. 2406–2413, July1994.

[65] V. Krishnamurphy and J. B. Moore, “On-line estimation ofhidden Markov model parameters based on Kullback–Leiblerinformation measure,”IEEE Trans. Signal Processing,vol. 41,pp. 2557–2573, Aug. 1993.

[66] M. Kristensson and B. Ottersten, “Statistical analysis of asubspace method for blind channel identification,” inProc.IEEE Intl. Conf. Acoustics, Speech, Signal Processing,Atlanta,GA, May 1996, vol. 5, pp. 2435–2438.

[67] R. L. Lawrence and H. Kaufman, “The Kalman filter for theequalization of digital communication channel,”IEEE Trans.Commun.,vol. COM-19, Dec. 1971.

[68] G. Lee, S. B. Gelfand, and M. P. Fitz, “Bayesian techniquesfor blind deconvolution,”IEEE Trans. Commun.,vol. 44, pp.826–835, July 1996.

[69] Y. Li and Z. Ding, “A new nonparametric cepstral method forblind channel identification from cyclostationary statistics,” pre-sented at 27th Asilomar Conf. Signals, Systems, and Computers,Asilomar, CA, Oct. 1993.

[70] Y. Li and K. J. R. Liu, “On blind MIMO channel identificationusing second-order statistics,” inProc. 1996 Conf. InformationScience and Systems,Princeton, NJ, vol. 2, pp. 1166–1171.

[71] H. Liu, G. Xu, and L. Tong, “A deterministic approach toblind identification of multichannel FIR systems,” presented at27th Asilomar Conference on Signals, Systems, and Computers,Asilomar, CA, Oct. 1993.

[72] H. Liu, G. Xu, L. Tong, and T. Kailath, “Recent develop-ments in blind channel equalization: From cyclostationarity tosubspaces,”Signal Processing,vol. 50, pp. 83–99, 1996.

[73] R. Liu, “Blind signal processing: An introduction,” inProc.1996 Intl. Symp. Circuits and Systems,Atlanta, GA, vol. 2, pp.81–83.

[74] , “A fundamental theorem on direct blind equalization ofmultiple-channel information transmission,” inProc. 1996 Intl.Symp. Circuits and Systems,vol. 1, pp. 705–708.

[75] R. Liu and G. Dong, “A fundamental theorem for multiple-channel blind equalization,”IEEE Trans. Circuits Syst. I,vol.44, pp. 472–473, May 1997.

[76] R. Liu and H. Luo, “Some results on multiple-channel blindequalization,” inProc.’97 Euro. Conf. Circuit Theory and De-sign, pp. 1347–1351.

[77] , “Direct blind separation of independent non-Gaussiansignals with dynamic channels,” presented at 5th IEEE Int.Workshop Cellular Neural Networks and their Applications,Apr. 1998.

[78] L. Ljung, System Identification: Theory for the User.Engle-wood Cliffs, NJ: Prentice-Hall, 1987.

[79] H. Luo and Y. Li, “Application of blind channel identificationtechniques to prestack seismic deconvolution,” this issue, pp.2082–2089.

[80] J. Massey and M. Sain, “Inverse of linear sequential circuits,”IEEE Trans. Comput.,vol. C-17, pp. 330–337, 1968.

[81] J. M. Mendel, “Tutorial on higher-order statistics (spectra) insignal processing and system theory: Theoretical results andsome applications,”Proc. IEEE, vol. 79, pp. 278–305, Mar.1991.

[82] E. Moulines, P. Duhamel, J. F. Cardoso, and S. Mayrargue,“Subspace-methods for the blind identification of multichannel

1966 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998

Page 17: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

FIR filters,” presented at ICASSP’94 Conf., Adelaide, Australia,Apr. 1994.

[83] , “Subspace-methods for the blind identification of mul-tichannel FIR filters,”IEEE Trans. Signal Processing,vol. 43,pp. 516–525, Feb. 1995.

[84] D. Pal, “Fractionally spaced semiblind decision feedback equal-ization of wireless channels,” inProc. 1993 IEEE Intl. Conf.Communications,Geneva, Switzerland, pp. 1139–1143.

[85] C. B. Papadias, “Methods for blind equalization and identifi-cation of linear channels,” Ph.D. dissertation,Ecole NationaleSuperieure des Telecommunications, Paris, France, Mar. 1995.

[86] B. P. Paris, “Self-adaptive maximum-likelihood sequence esti-mation,” presented at IEEE GLOBECOM, 1993.

[87] S. Perreau, “A sub-optimal off-line approach for maximumlikelihood blind equalization of a multiusers system,” Univ.Connecticut, Storrs, Tech. Rep. 97-03, May 1997.

[88] S. Perreau, L. White, and P. Duhamel, “A new decision-feedback equalizer incorporating fixed-lag smoothing,” submit-ted for publication.

[89] , “A reduced computation multichannel adaptive equalizerbased on HMM,” inProc. 8th IEEE Signal Processing WorkshopStatistical and Array Signal Processing,Corfu, Greece, June1996, pp. 156–159.

[90] B. Porat and B. Friedlander, “Blind equalization of digitalcommunication channels using high-order moments,”IEEETrans. Signal Processing,vol. 39, pp. 522–526, 1991.

[91] B. Porat,Digital Processing of Random Signals.EnglewoodCliffs, NJ: Prentice-Hall, 1993.

[92] W. Qiu and Y. Hua, “Performance analysis of a blind channelidentification method based on second-order statistics,” inProc.21st Asilomar Conf. Signals, Systems, and Computers,PacificGrove, CA, Nov. 1995, vol. 2, pp. 1458–1462.

[93] , “Performance analysis of the subspace method for blindchannel identification,”Signal Processing,vol. 50, no. 71–81,1996.

[94] L. Rabiner, “A tutorial on hidden Markov models and selectedapplications in speech recognition,”Proc. IEEE,vol. 77, no. 2,pp. 257–285, Feb. 1989.

[95] R. Raheli, A. Polydoros, and C. K. Tzou, “Per-survivor process-ing: A general approach to MLSE in uncertain environments,”IEEE Trans. Commun.,vol. 43, pp. 354–364, Feb.–Apr. 1995.

[96] J. Rissanen, “Modeling by shortest data description,”Automat-ica, vol. 14, pp. 465–471, 1978.

[97] E. Robinson, “Predictive deconvolution and wavelet analysis,”presented at Intl. Conf. Communications, Computing, Control,Signal Processing, Stanford, CA, June 1996.

[98] E. A. Robinson, “T. Tomographic deconvolution of echograms,”in Communications, Computation, Control and Signal Process-ing: A Tribute to Thomas Kailath,A. Paularj, V. Roychowdhuryand C. Schaper, Eds. Norwell, MA,: Kluwer, 1997.

[99] , “Predictive decomposition of seismic traces,”Geophisics,vol. 22, pp. 767–778, 1957.

[100] W. Rudin,Real and Complex Analysis.New York: McGraw-Hill, 1986.

[101] Y. Sato, “A method of self-recovering equalization formultilevel amplitude-modulation,”IEEE Trans. Commun.,vol.COM-23, pp. 679–682, June 1975.

[102] S. Schell and D. Smith, “Improved performance of blindequalization using prior knowledge of transmitter filter,” inProc. 1994 IEEE MILCOM,Fort Monmouth, NJ, vol. 1, pp.128–132.

[103] S. V. Schell, D. L. Smith, and S. Roy, “Blind channel identifi-caiton using subchannel response matching,” presented at 26thConf. Information Sciences and Systems, Princeton, NJ, Mar.1994.

[104] E. Serpedin and G. Giannakis, “Blind channel identification andequalization with modulation induced cyclostationarity,” pre-sented at 31st Annu. Conf. Information Sciences and Systems,Baltimore, MD, Mar. 1997.

[105] N. Seshadri, “Joint data and channel estimation using fast blindtrellis search techniques,” inProc. Globecom’90,1991, pp.1659–1663.

[106] , “Joint data and channel estimation using blind trel-lis search techniques,”IEEE Trans. Commun.,vol. 42, pp.1000–1011, Feb.–Apr. 1994.

[107] M. Shao and C. L. Nikias, “An ML/MMSE estimation approachto blind equalization,” inProc. ICASSP’94 Conf.,Adelaide,Australia, vol. 4, pp. IV–569–572.

[108] D. Slock, “Blind fractionally-spaced equalization, perfect re-construction filterbanks, and multilinear prediction,” presentedat ICASSP’94 Adelaide, Australia.

[109] , “Blind joint equalization of multiple synchronous mobileusers using oversampling and/or multiple antennas,” presentedat 28th Asilomar Conf. Signals, Systems, Computers, Monter-rey, CA, Oct. 1994.

[110] D. Slock and C. B. Papadias, “Further results on blind iden-tification and equalization of multiple FIR channels,” inIEEEProc. Intl. Conf. Acoustics, Speech, Signal Processing,Detroit,MI, Apr. 1995, pp. 1964–1967.

[111] S. Talwar, M. Viberg, and A. Paulraj, “Blind estimation ofmultiple co-channel digital signals using an antenna array,”IEEE Signal Processing Lett.,vol. 1, pp. 29–31, Feb. 1994.

[112] , “Blind separation of synchronous co-channel digitalsignals using an antenna array. I. Algorithms,”IEEE Trans.Signal Processing,vol. 44, pp. 1184–1197, May 1996.

[113] D. M. Titterington, A. F. M. Smith, and U. E. Makov,Statis-tical Analysis of Finite Mixture Distributions(Wiley Series inProbability and Mathematical Statistics). New York: Wiley,1985.

[114] D. M. Titterington, “Recursive parameter estimation using in-complete data,”J. Royal Statist. Soc. B,vol. 46, no. 2, pp.257–267, 1984.

[115] L. Tong, “Identification of multichannel MA parameters usinghigher-order statistics,” submitted for publication.

[116] L. Tong, Y. Inouye, and R. Liu, “A finite-step global conver-gence algorithm for the parameter estimation of multichannelMA processes,”IEEE Trans. Signal Processing,vol. 40, pp.2547–2558, Oct. 1992.

[117] L. Tong, G. Xu, B. Hassibi, and T. Kailath, “Blind identificationand equalization of multipath channels: A frequency domainapproach,”IEEE Trans. Inform. Theory,vol. 41, pp. 329–334,Jan. 1995.

[118] L. Tong, G. Xu, and T. Kailath, “A new approach to blindidentification and equalization of multipath channels,” presentedat 25th Asilomar Conf., Pacific Grove, CA, Nov. 1991.

[119] , “Blind identification and equalization based on second-order statistics: A time domain approach,”IEEE Trans. Inform.Theory,vol. 40, pp. 340–349, Mar. 1994.

[120] L. Tong and H. Zeng, “Blind channel identification using cyclicspectra,” inProc. 26th Conf. Information Sciences and Systems,Princeton, NJ, Mar. 1994, pp. 711–716.

[121] , “Joint order detection and blind channel estimation byleast squares smoothing,”IEEE Trans. Signal Processing, to bepublished.

[122] L. Tong and Q. Zhao, “Blind channel estimation by least squaressmoothing,” inProc. 1998 IEEE Intl. Conf. Acoustics, Speech,Signal Processing, Seattle, WA, vol. 5, pp. 2121–2124.

[123] , “Joint order detection and channel estimation by leastsquares smoothing,” presented at 30th Conf. Information Sci-ences and Systems, Princeton, NJ, Mar. 1998.

[124] J. R. Treichler and B. G. Agee, “A new approach to multipathcorrection of constant modulus signals,”IEEE Trans. Audio,Speech, Signal Processing,vol. ASSP-31, pp. 459–472, Apr.1983.

[125] J. K. Tugnait, “Identification of nonminimum phase linearstochastic systems,”Automatica,vol. 22, pp. 454–464, 1986.

[126] , “Blind equalization and estimation of digital communi-cation FIR channels using cumulant matching,”IEEE Trans.Commun.,vol. 43, pp. 1240–1245, Feb.–Apr. 1995.

[127] A. van der Veen, S. Talwar, and A. Paulraj, “A subspaceapproach to blind space-time signal processing for wirelesscommunication systems,”IEEE Trans. Signal Processing,vol.45, pp. 173–190, Jan 1997.

[128] M. Viberg, “Subspace fitting concepts in sensor array process-ing,” Ph.D. dissertation, Linkoping Univ., Linkoping, Sweden,Oct. 1989.

[129] M. Wax and T. Kailath, “Detection of signals by informationtheoretic criteria,”IEEE Trans. Audio, Speech, Signal Process-ing, vol. ASSP-33, pp. 387–392, Apr. 1985.

[130] E. Weinstein, M. Feder, and A. Oppenheim, “Sequential algo-rithms for parameter estimation based on the Kullback–Leiblerinformation measure,”IEEE Trans. Signal Processing,vol. SP-38, pp. 1652–1654, Sept. 1990.

[131] L. B. White and V. Krishnamurphy, “Adaptive blind equaliza-tion of FIR channels using hidden Markov Models,” presentedat Int. Conf. Communications, Geneva, June 1993.

TONG AND PERREAU: MULTICHANNEL BLIND IDENTIFICATION 1967

Page 18: Multichannel Blind Identification: From Subspace to Maximum …acsp.ece.cornell.edu/papers/TongPerreau.pdf · 2002. 12. 8. · of blind equalization in high-definition television

[132] L. B. White, S. Perreau, and P. Duhamel, “Reduced computationblind equalization for FIR channel input Markov models,” inProc. Int. Conf. Communications,Seattle, WA, June 1995, pp.993–997.

[133] R. A. Wiggings, “Minimum entropy deconvolution,”Geoexplo-ration, vol. 16, pp. 21–35, 1978.

[134] G. Xu, H. Liu, L. Tong, and T. Kailath, “A least-squaresapproach to blind channel identification,”IEEE Trans. SignalProcessing,vol. 43, pp. 2982–2993, Dec. 1995.

[135] D. Yellin and B. Porat, “Blind identification of FIR systemsexcited by discrete-alphabet inputs,”IEEE Trans. Signal Pro-cessing,vol. 41, pp. 1331–1339, Mar. 1993.

[136] H. Zeng, “Blind estimation: Performance analysis and newalgorithms,” Ph.D. dissertation, Univ. Connecticut, Storrs, May1997.

[137] H. Zeng and L. Tong, “Some new results on blind channelestimation: Performance and algorithms,” presented at 27thConf. Information Sciences and Systems, Baltimore, MD, Mar.1995.

[138] H. Zeng and L. Tong, “Connections between the least-squaresand subspace approaches to blind channel estimation,”IEEETrans. Signal Processing,vol. 44, pp. 1593–1596, June 1996.

[139] , “Blind channel estimation using the second-order sta-tistics: Algorithms,” IEEE Trans. Signal Processing,vol. 45,Aug. 1997.

[140] , “Blind channel estimation using the second-order sta-tistics: Asymptotic performance analysis,”IEEE Trans. SignalProcessing,vol. 45, Aug. 1997.

[141] Q. Zhao and L. Tong, “Adaptive blind channel estimation byleast squares smoothing,” submitted for publication.

[142] , “Fast convergent adaptive blind channel estimation anddeconvolution,” presented at 30th Conf. Information Sciencesand Systems, Princeton, NJ, Mar. 1998.

[143] M. Zoltowski and D. Tseng, “Blind channel identificationfor narrowband digital communication based on parametricmodeling of channel impulse response,” inProc. 1997 AllertonConf. Communications, Control, and Computing,Urbana, IL,pp. 503–512.

Lang Tong (Member, IEEE), for a photograph and biography, see thisissue, p. 1906.

Sylvie Perreauwas born in France in 1969. In1993 she graduated from theEcole NationaleSuperieure de l’Electronique et des Applicationsand obtained the M.Sc. degree from the Univer-sity of Cergy-Pontoise. In 1997, she received thePh.D. degree in signal processing from TelecomParis.

In 1997 she worked at the University ofConnecticut, Storrs. She is now with the Institutefor Telecommunications Research in Adelaide,Australia. Her research interests are in blind

equalization, interference mitigation, and blind array processing.

1968 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 10, OCTOBER 1998