



Signal Processing 80 (2000) 1795-1805

Blind and semi-blind equalization using hidden Markov models and clustering techniques

Kristina Georgoulakis, Sergios Theodoridis*

Divisions of Communications and Signal Processing, Department of Informatics, University of Athens, Panepistimioupolis, TYPA Buildings, 15784 Athens, Greece

Received 26 November 1999; received in revised form 6 March 2000

Abstract

A novel blind channel equalizer is proposed which is suitable for both linear and nonlinear channels. The proposed equalizer consists of the following three steps: (a) identification of the clusters formed by the received data samples, via an unsupervised learning clustering technique, (b) labeling of the identified clusters, by using a hidden Markov modeling (HMM) of the process, and (c) channel equalization by means of a cluster-based sequence equalizer. The performance of the equalizer is investigated for a variety of channels (minimum/non-minimum phase, linear/nonlinear channels). The case of channel order mismatch is studied. A semi-blind clustering-based equalizer, where a short sequence of known data is used for training, is also considered. © 2000 Elsevier Science B.V. All rights reserved.


Keywords: Blind equalization; Clustering techniques; Hidden Markov models

*Corresponding author. Tel.: 30-1-7275328; fax: 30-1-72-19-561. E-mail addresses: kristina@di.uoa.gr (K. Georgoulakis), stheodor@di.uoa.gr (S. Theodoridis).

0165-1684/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0165-1684(00)00089-X


Fig. 1. Blind clustering-based sequence equalizer.

1. Introduction

Intersymbol interference (ISI) is a major impairment in today's high bit-rate communication systems [12]. Channel equalizers, used in the receiver part, aim to suppress the effect of ISI. In most cases, the communication channel is unknown, and the design of the equalizer is performed on the basis of a known training sequence of information bits. However, there are many cases in which transmission of a training sequence is not possible or desirable. This mode of equalizer design is known as 'blind'. Blind equalizers have been suggested for several applications. Typical examples include multipoint networks [5], digital communications over fading and multipath channels [8], very rapidly time-varying channels when high bit-rates are desirable [12], etc.

Blind channel equalization is a challenging task and has been the focus of intense research effort. Recently, interest has arisen in approaches based on data-clustering techniques [9,16,4]. A major advantage of such approaches is that no explicit channel modeling is required, which makes them attractive when nonlinear channels are involved [15]. A cluster-based blind channel estimation algorithm consists of two steps: (a) the data clusters are first estimated via an unsupervised learning technique and (b) the estimated clusters are labeled by unraveling the information hidden in the sequence of received data [9,16]. When the channel estimation task is completed, a cluster-based sequence equalizer [3] can be employed for signal detection. A block diagram of a cluster-based blind equalizer is given in Fig. 1.

In this paper, a novel cluster-based blind channel estimation procedure is proposed. The novelty of the technique lies in the way the identified clusters are labeled, that is, in the way each group of transmitted bits is assigned to its respective cluster. Labeling is performed using a hidden Markov modeling (HMM) of the estimation process and by relating data clusters to HMM states. The probability that each received data cluster corresponds to a specific label is treated as the unknown parameter of the HMM learning task, implemented by the expectation-maximization (EM) algorithm.

Most existing communication systems require some reference signals for various operational purposes (e.g., synchronization and user identification). Such auxiliary information can also be used to enhance blind identification techniques. The corresponding equalization methods are called 'semi-blind', and, in this case, the information from both the training and the blind part of the received data is exploited [6]. In this paper, the semi-blind operation of the proposed clustering equalizer is also considered.

The paper is organized as follows. The concept of channel equalization as a classification task is given in Section 2. The proposed blind equalizer is presented in Section 3. Section 4 deals with the semi-blind operation of the proposed equalizer. Simulation results are presented in Section 5 and conclusions are summarized in Section 6.

2. Channel equalization as a classification task

2.1. System description

Let us consider the received signal g(t) of an ISI- and noise-impaired system (see Fig. 1). In the general case, g(t) can be written as

g(t) = c(t) + w(t),  (1)

where

c(t) = f(I(t), I(t-1), ..., I(t-L))  (2)

is the noiseless channel output sequence, f(·) is the function representing the channel action, I(t) is an equiprobable sequence of transmitted data taken from an M-ary alphabet, i.e., I(t) ∈ {I_1, ..., I_M}, and w(t) is an additive white Gaussian noise (AWGN) sequence. The noiseless channel output sequence, c(t), is a discrete-valued signal with M^{L+1} different elements. This is because the channel length is finite (the channel spans L+1 symbols) and I(t) belongs to a finite alphabet (see Eq. (2)). In the special case of a linear channel, g(t) can be written as

g(t) = Σ_{i=0}^{L} h(i) I(t-i) + w(t),  (3)

where h(i) is the channel impulse response.
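To make the signal model concrete, the short sketch below generates g(t) according to Eqs. (1)-(3). It is illustrative only: the function name, the SNR convention and the default impulse response (the H(z) = 1 + 0.5z^{-1} channel used later in Example 1) are our own choices, not part of the paper.

```python
import numpy as np

def simulate_linear_channel(n_symbols, h=(1.0, 0.5), snr_db=30.0, seed=0):
    """Return bipolar symbols I(t) and noisy received samples g(t)."""
    rng = np.random.default_rng(seed)
    I = rng.choice([-1.0, 1.0], size=n_symbols)         # equiprobable input, M = 2
    c = np.convolve(I, h)[:n_symbols]                   # noiseless output c(t), Eq. (2)
    noise_var = np.mean(c**2) / 10**(snr_db / 10)       # set AWGN power from the SNR
    w = rng.normal(0.0, np.sqrt(noise_var), n_symbols)  # w(t)
    return I, c + w                                     # g(t) = c(t) + w(t), Eq. (1)
```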

2.2. Clustering-based sequence equalization

The equalizer, placed in the receiver part, aims to recover the transmitted sequence of information bits I(t), based on the corrupted received sequence g(t). In [15] a clustering-based sequence equalizer (CBSE) was proposed which treats equalization as a classification task. This method focuses on the clusters which the received data form. The received data samples are clustered around specific points whose number and constellation shape are determined by the spread of the channel and the impairment characteristics.

Consider the D×1 vector of successively received samples:

u(t) = [g(t), g(t-1), ..., g(t-D+1)]^T.  (4)

According to (1) and (2), in the absence of noise, u(t) is associated with Q = M^{L+D} points in the D-dimensional space. Each point corresponds to one of the M^{L+D} possible realizations of the sequence of transmitted bits (I(t), ..., I(t-L-D+1)). If the received data are corrupted by AWGN, the randomness of the noise leads to the formation of a cluster around each point. The radius of the clusters depends on the noise variance. Each cluster is represented by a suitably chosen representative, which corresponds to the noiseless channel response vector in the D-dimensional space, i.e., c(t) = [c(t), ..., c(t-D+1)]^T, with c(t) ∈ {c_k; k = 1, ..., Q}.

Due to the interdependence that ISI imposes on successive received samples, only specific transitions among the different clusters are possible. Thus, the CBSE employs a Viterbi-type procedure, dictated by the specific transitions among clusters. Assuming that received samples are treated in groups of D, an M^{L+D-1}-state trellis can be considered, where the state S(t), at time t, uniquely identifies the channel memory

S(t) ↔ (I(t-1), I(t-2), ..., I(t-L-D+1)).

In the Viterbi trellis diagram, the transition from one state, S(t-1), to another, S(t), corresponds to the emission of a specific cluster representative, dictated by the sequence of bits associated with the current state and the new information bit transmitted. We call this sequence of bits the label and denote it by

X(t) ↔ (I(t), I(t-1), ..., I(t-L-D+1)).

For the implementation of the Viterbi algorithm, an appropriate distance metric is adopted (e.g., Euclidean or Mahalanobis) in order to measure the distance between the received data vector and the representatives of the various clusters.

Training of the clusters is equivalent to (a) identifying the cluster representatives and (b) performing the labeling of the clusters. In supervised equalization, during the training period, a sequence of known information bits is transmitted and each cluster representative is computed by a simple averaging of all the data vectors, u(t), belonging to the respective cluster [3,15].
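As an illustration of the detection step, the sketch below runs a Viterbi search over the M^L channel states for D = 1, scoring each transition by the squared Euclidean distance between g(t) and the representative of the transition's label. The dictionary `reps`, mapping a label (I(t), ..., I(t-L)) to its representative, is assumed to come from training; all names are hypothetical.

```python
import itertools
import numpy as np

def cbse_detect(g, reps, L, alphabet=(-1.0, 1.0)):
    """Cluster-based sequence detection (D = 1): Viterbi over M^L states."""
    states = list(itertools.product(alphabet, repeat=L))   # memory (I(t-1), ..., I(t-L))
    cost = {s: 0.0 for s in states}                        # unknown start: uniform costs
    paths = {s: [] for s in states}
    for t in range(len(g)):
        new_cost, new_paths = {}, {}
        for s_new in states:
            symbol = s_new[0]                              # newest symbol I(t)
            best, best_prev = np.inf, None
            for s_old in states:
                if s_old[:-1] != s_new[1:]:                # only allowable transitions
                    continue
                label = (symbol,) + s_old                  # (I(t), I(t-1), ..., I(t-L))
                d = cost[s_old] + (g[t] - reps[label])**2  # Euclidean branch metric
                if d < best:
                    best, best_prev = d, s_old
            new_cost[s_new], new_paths[s_new] = best, paths[best_prev] + [symbol]
        cost, paths = new_cost, new_paths
    return paths[min(cost, key=cost.get)]                  # survivor with smallest cost
```

For the channel of Example 1, for instance, supervised training would ideally yield reps = {(1.0, 1.0): 1.5, (1.0, -1.0): 0.5, (-1.0, 1.0): -0.5, (-1.0, -1.0): -1.5}.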

3. Clustering-based blind sequence equalizer

In blind mode, no training sequence is available for training the clusters. In this paper, training of the clusters is performed in two steps: (a) the cluster representatives are estimated via an unsupervised clustering technique and (b) labeling of the clusters is achieved by using a discrete-observations HMM. Once the training of the equalizer is accomplished, a Viterbi algorithm is subsequently used for signal detection.

The proposed clustering-based blind equalizer differs in concept from the blind equalizer presented in [10], which performs cluster training in a single step by means of a continuous-observations HMM. In contrast, the proposed equalizer decouples the two steps of the training procedure (cluster estimation and labeling) and offers convergence-rate and complexity advantages, a fact which is experimentally verified (see Section 5).

3.1. Unsupervised clustering

The task of cluster identification can be performed using various unsupervised clustering techniques. Typical examples are the Isodata algorithm [2,16], the neural gas network [11,9], least-squares vector quantization by a recursive dynamic programming procedure [14], etc. A comprehensive review is presented in [17]. In this paper, the Isodata algorithm is adopted for the estimation of the cluster representatives. This algorithm is computationally simple and is suitable for recovering compact clusters [17], as is the case in the communication systems examined.

In contrast to other blind clustering equalizers suggested elsewhere [9,16], for the proposed method it suffices to exploit the received information in one dimension, i.e., u(t) = [g(t)] and D = 1. In [16,9] the clustering procedure is performed using the two-dimensional observation space (D = 2) and the corresponding data vector is u(t) = [g(t) g(t-1)]^T. The two-dimensional representation of clusters is required in [9] in order to determine the possible transitions among the different clusters. In [16], the two-dimensional approach is needed to minimize the uncertainty arising from clusters which overlap in the one-dimensional space. In the proposed algorithm, cluster transition information is implicitly provided by the HMM used to model the one-dimensional received data. Moreover, the HMM formulation of the problem offers as a by-product the probability of each cluster corresponding to any one of the possible labels. Thus, this methodology can accommodate cluster overlap, and the use of a one-dimensional data vector is adequate. This gives the proposed algorithm complexity benefits compared to the clustering algorithms employing the two-dimensional data vector set-up (since in the former case only M^{L+1} clusters need to be treated, instead of the M^{L+2} clusters involved in the latter case).
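A sketch of step (a) in one dimension is given below. Plain k-means is used here as a stand-in for the Isodata algorithm adopted in the paper (both recover compact clusters), and the quantizer implements the mapping of g(t) to the discrete symbols y(t) needed by the HMM of Section 3.2; names and defaults are ours.

```python
import numpy as np

def estimate_representatives(g, Q, n_iter=50, seed=0):
    """Estimate the Q one-dimensional cluster representatives (k-means sketch)."""
    rng = np.random.default_rng(seed)
    reps = rng.choice(g, size=Q, replace=False)                      # initial guesses
    for _ in range(n_iter):
        idx = np.argmin(np.abs(g[:, None] - reps[None, :]), axis=1) # nearest rep
        for k in range(Q):
            if np.any(idx == k):
                reps[k] = g[idx == k].mean()                         # update centroid
    return np.sort(reps)

def quantize(g, reps):
    """Index of the closest representative for each g(t): the HMM observations y(t)."""
    return np.argmin(np.abs(g[:, None] - reps[None, :]), axis=1)
```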

3.2. HMM and cluster labeling

Once the cluster representatives have been estimated, their corresponding labels have to be determined. For this purpose, a discrete-observations HMM is used to model the process, utilizing the already estimated values of the cluster representatives. The discrete-observations HMM is characterized by the following elements [10,7,17]:

(1) The states of the model, which in our case are

S(t) ↔ (I(t-1), ..., I(t-L)),  (5)

where I(t) is the i.i.d. sequence of transmitted symbols and L is the channel memory. For an M-ary alphabet, the number of states is N = M^L, that is, S(t) ∈ {1, ..., N}.

(2) The state transition probabilities a_ij, where

a_ij = P[S(t+1) = j | S(t) = i],  1 ≤ i, j ≤ N.  (6)

In the blind equalization case,

a_ij = 1/M for an allowable transition, and 0 otherwise.

For every allowable transition (a_ij = 1/M) a specific noiseless channel output occurs. In other words, each state transition uniquely specifies a cluster label. The cluster labels are specified by

X(t) ↔ (I(t), I(t-1), ..., I(t-L)),  (7)

where X(t) ∈ {1, ..., Q}, Q = M^{L+1}. Assume, for example, that L = 1 and M = 2; then the transition from state 1 (I(t-1) = -1) to state 1 (I(t) = -1) defines the label which corresponds to (I(t) I(t-1)) = (-1 -1), and characterizes a specific cluster.

(3) The distinct observation symbols per transition, which, in this case, are the cluster representatives, C = {c_k; k = 1, ..., Q}.

(4) The probabilities for each observation symbol to occur for each state transition i to j. In our case, this parameter corresponds to the probability of a specific cluster representative (symbol) corresponding to a specific label (state transition). For simplicity, we use indices of labels and not of state transitions, since there is a unique correspondence between labels and state transitions, as stated earlier. Thus, this element is defined as

b_n(c_k) = P[c_k observed | S(t) = i, S(t+1) = j] = P[c_k observed | X(t) = n],  1 ≤ n, k ≤ Q, 1 ≤ i, j ≤ N,  (8)

where n corresponds to the label that uniquely specifies the state transition from i to j.

(5) The initial state distribution π_i = P[S(1) = i], for 1 ≤ i ≤ N.

The discrete observations of the HMM are derived from the sequence of received data. However, due to the presence of noise, the received data g(t) at the channel output do not take discrete values (see Eq. (1)). This sequence of continuous values cannot be fed into the discrete-observations HMM described above. Thus, the received data g(t) are quantized to the closest cluster representative, and the resulting sequence of discrete data is denoted by y(t), with y(t) ∈ {c_k; k = 1, ..., Q}. This is the discrete observation sequence that feeds the HMM.

In the proposed blind channel estimation algorithm, cluster labeling is treated as an HMM learning problem; that is, we model the unknown probability of a specific cluster corresponding to a specific label as an unknown parameter of the HMM, and then seek the optimal parameters of the HMM which best match the given observation sequence. The unknown probabilities are expected to converge to 1, for the correct cluster-label correspondence, and to zero otherwise. A usual practice for handling the learning problem in HMMs is the maximization of the probability of the observation sequence of length T, Y = (y(1), ..., y(T)), given the model parameters h, i.e., the probability P(Y|h), with respect to h. The expectation-maximization algorithm (equivalently, the Baum-Welch (B-W) reestimation formulae) is a commonly used iterative numerical scheme for obtaining maximum likelihood (ML) estimates of an HMM. The resulting ML estimate is given by

ĥ = arg max_h P(Y|h).  (9)

In our case, we define

h = [b_n(c_k)],  n, k = 1, ..., Q.  (10)

That is, h is the probability matrix that maps labels to clusters, and it is expected to converge to a matrix whose elements are equal to (a) 1, if a specific symbol (y(t) = c_k) corresponds to a specific label n, and (b) zero otherwise.

According to the above-described model settings, the maximization of P(Y|h) by means of the B-W algorithm leads to the labeling procedure described in Table 1. Convergence of the algorithm is achieved when P(Y|h) = Σ_i α_{T+1}(i) > P, with P a predetermined threshold.

Note that, in the procedure described above, in order to simplify the presentation, the assumption that the initial state S(1) is known is adopted. The generalization to the case where S(1) is unknown can easily be accommodated: the algorithm is able to estimate the prior probabilities P[S(1) = i] [10,13,17].
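The state/label correspondence of Eqs. (5)-(7) can be checked mechanically. The short script below enumerates the allowable transitions and their labels for the M = 2, L = 1 case used as an example above (purely illustrative; all names are ours):

```python
import itertools

M, L = 2, 1
alphabet = (-1, 1)
states = list(itertools.product(alphabet, repeat=L))    # states (I(t-1),), N = M^L = 2
for s_old in states:
    for new_sym in alphabet:                            # M allowable transitions per state
        s_new = (new_sym,) + s_old[:-1]                 # shift the new symbol I(t) in
        label = (new_sym,) + s_old                      # X(t) = (I(t), I(t-1)), Eq. (7)
        print(f"state {s_old} -> state {s_new}: label {label}")
# Four transitions are printed, e.g. state (-1,) -> state (-1,) carries
# label (-1, -1), matching the example given in the text.
```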

4. Semi-blind clustering-based sequence equalizer

In the case of semi-blind transmission, there is a (short) sequence of known symbols in the transmitted data [6,1]. The known symbols are exploited to enhance the blind clustering procedure.

Specifically, the HMM procedure described above is now modified to incorporate the known-symbols information as follows. Let us assume that the transmitted sequence contains a block of known data of length N_tr. The corresponding sequence of received (quantized) data is denoted by y_tr = [y(n_0), ..., y(n_0 + N_tr - 1)]. For the blind part of the data, the transition probabilities are set equal to (a) a_ij = 1/M for an allowable state transition and (b) a_ij = 0 otherwise. For the known part of the data, the transition probabilities are known. Assume that the received sample y(t) belongs to the sequence y_tr; then (a) a_ij = 1 if the symbol y(t) corresponds to the label X(t) = n, that is, if S(t) = i and S(t+1) = j (label n is related to the state transition i to j), and (b) a_ij = 0 otherwise. Apart from these modifications in the values of the transition probabilities, the HMM labeling procedure remains the same as in the blind case.
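One way to realize this modification, sketched below under our own naming, is to make the transition matrix time-dependent: outside the training block the blind 1/M matrix is used, while inside the block only the transition dictated by the known symbol survives. The forward/backward recursions of Table 1 then simply use this time-dependent matrix in place of the fixed one.

```python
import numpy as np

def transition_matrix_at(t, a_blind, states, known):
    """Semi-blind transition probabilities at time t (sketch, names ours).

    known -- dict mapping a time index inside the training block to the
             transmitted symbol I(t); other times fall back to the blind case.
    """
    if t not in known:
        return a_blind                        # blind part: a_ij = 1/M or 0
    a = np.zeros_like(a_blind)
    for i, s_old in enumerate(states):
        s_new = (known[t],) + s_old[:-1]      # the only state reachable from i
        a[i, states.index(s_new)] = 1.0       # a_ij = 1 on the known path
    return a
```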


Table 1
Cluster labeling via HMM

Initialization:
Set N = M^L and Q = M^{L+1}.
Take a block of T received data g(t) and map them to their closest representative c_k, k = 1, ..., Q.
Set b_n(c_k) = 1/Q, for n = 1, ..., Q, k = 1, ..., Q.
Set a_ij = P[S(t+1) = j | S(t) = i] = 1/M if there is an allowable transition from state i to j; otherwise set a_ij = 0, for 1 ≤ i, j ≤ N.
Set α_1(i) = 1 for the known initial state i and 0 otherwise, and β_{T+1}(j) = 1 for j = 1, ..., N
(α_t(i) and β_t(j) are the forward and backward parameters, respectively).

Main part - Recursion:
(1) Use the forward recursion to calculate α_t(j) for t = 2, ..., T+1, j = 1, ..., N:

    α_t(j) = Σ_{i=1}^{N} α_{t-1}(i) P[S(t) = j | S(t-1) = i] b_n(y(t-1)).

(2) Use the backward recursion to calculate β_t(i) for t = T, T-1, ..., 1, i = 1, ..., N:

    β_t(i) = Σ_{j=1}^{N} β_{t+1}(j) P[S(t+1) = j | S(t) = i] b_n(y(t)).

(3) Calculate ξ_t(i, j), the probability of being in state i at time t and in state j at time t+1, given the model and the observation sequence [13]:

    ξ_t(i, j) = α_t(i) a_ij b_n(y(t)) β_{t+1}(j) / Σ_{i=1}^{N} Σ_{j=1}^{N} α_t(i) a_ij b_n(y(t)) β_{t+1}(j).

(4) Finally, use the reestimation formulae to update b_n(c_k), for n = 1, ..., Q, k = 1, ..., Q:

    b_n(c_k) = (expected number of transitions from i to j with symbol c_k observed) / (expected number of transitions from i to j)
             = Σ_{t=1, y(t)=c_k}^{T-1} ξ_t(i, j) / Σ_{t=1}^{T-1} ξ_t(i, j).

Repeat the recursion until convergence.
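A compact sketch of the recursion of Table 1 follows, under the table's own assumptions (known initial state, b_n(c_k) initialized to 1/Q, a_ij = 1/M on allowable transitions). No scaling is applied to the forward/backward variables, which is adequate for the short blocks (T ≈ 20-40) used in the examples; all function and variable names are ours.

```python
import itertools
import numpy as np

def label_clusters(y, alphabet, L, n_recursions=20):
    """Baum-Welch reestimation of b_n(c_k) as in Table 1 (sketch).

    y        -- quantized observations, as indices into the representatives
    alphabet -- the M-ary symbol alphabet, e.g. (-1.0, 1.0)
    L        -- assumed channel memory
    """
    M = len(alphabet)
    states = list(itertools.product(alphabet, repeat=L))       # N = M^L states
    labels = list(itertools.product(alphabet, repeat=L + 1))   # Q = M^(L+1) labels
    N, Q = len(states), len(labels)
    lab = -np.ones((N, N), dtype=int)             # label n of transition i -> j
    for i, s_old in enumerate(states):
        for j, s_new in enumerate(states):
            if s_old[:-1] == s_new[1:]:           # allowable transition
                lab[i, j] = labels.index((s_new[0],) + s_old)   # Eq. (7)
    a = np.where(lab >= 0, 1.0 / M, 0.0)          # Eq. (6)
    b = np.full((Q, Q), 1.0 / Q)                  # b_n(c_k), uniform start
    T = len(y)
    for _ in range(n_recursions):
        def emit(t):
            # per-transition emission a_ij * b_n(y(t)), n the label of (i, j)
            return a * b[lab.clip(min=0), y[t]]
        alpha = np.zeros((T + 1, N)); alpha[0, 0] = 1.0   # known initial state assumed
        for t in range(T):                                # forward recursion (step 1)
            alpha[t + 1] = alpha[t] @ emit(t)
        beta = np.zeros((T + 1, N)); beta[T] = 1.0
        for t in range(T - 1, -1, -1):                    # backward recursion (step 2)
            beta[t] = emit(t) @ beta[t + 1]
        num, den = np.zeros((Q, Q)), np.zeros(Q)
        for t in range(T):                                # steps 3 and 4
            xi = alpha[t][:, None] * emit(t) * beta[t + 1][None, :]
            xi /= max(xi.sum(), 1e-300)                   # normalized xi_t(i, j)
            for i, j in zip(*np.nonzero(lab >= 0)):
                num[lab[i, j], y[t]] += xi[i, j]          # symbol y(t) under label lab[i, j]
                den[lab[i, j]] += xi[i, j]
        b = num / np.maximum(den[:, None], 1e-300)        # reestimated b_n(c_k)
    return b, labels
```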

5. Simulations

Example 1. First, a simple linear channel is considered to clarify the operation of the proposed blind algorithm. The assumed channel has the transfer function H(z) = 1 + 0.5z^{-1} and L = 1. The SNR is 30 dB. The transmitted data are assumed bipolar (I(t) = ±1, M = 2). The clustering-HMM algorithm proceeds as follows. (a) The unsupervised algorithm identifies the cluster representatives (i.e., c_1 = 0.49, c_2 = -0.507, c_3 = -1.49, c_4 = 1.512, with the number of data used K = 30). Note that, when bipolar data are utilized, the clusters appear in a symmetric way, i.e., ±c_k. Thus, the number of data used for the unsupervised algorithm can be halved, since only half (e.g., the positive) of the clusters need to be estimated. (b) The HMM is formed. The number of states is set equal to N = 2, the number of clusters is Q = 4 and the number of data processed per recursion is T = 20. The matrix of b_n(c_k) converges to the matrix appearing in Table 2, after 8 recursions.

From this table, we can conclude the clusters' labeling, i.e., the representative c_1 has the label which corresponds to the sequence (+1 -1), etc. Once the channel estimation procedure is completed, the CBSE can be used for signal detection.
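For illustration, the three steps can be wired together for this example using the sketch functions introduced earlier (a hypothetical run; it inherits their assumption that the initial state is known, whereas in practice the prior P[S(1) = i] can be estimated, as noted in Section 3.2):

```python
# (a) cluster the received samples, (b) label the clusters, (c) detect.
I, g = simulate_linear_channel(2000, h=(1.0, 0.5), snr_db=30.0)
reps = estimate_representatives(g[:30], Q=4)         # K = 30 samples, as in the text
y = quantize(g[:20], reps)                           # T = 20 HMM observations
b, labels = label_clusters(y, alphabet=(-1.0, 1.0), L=1)
# read the converged matrix: each label n maps to its most probable representative
rep_of = {labels[n]: reps[k] for n, k in enumerate(b.argmax(axis=1))}
I_hat = cbse_detect(g, rep_of, L=1)                  # Viterbi detection
```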

Example 2 (Overlapping clusters). In this experiment, the data are assumed bipolar (M = 2), the channel is H(z) = 0.5 + 0.7z^{-1} + 0.5z^{-2} and the noise level is at SNR = 20 dB. This is a high-ISI channel, resulting in overlapping clusters in the one-dimensional space [10]. Thus, instead of the 8 clusters


Table 2
Probability matrix b_n(c_k), after convergence, for the channel H(z) = 1 + 0.5z^{-1}

Label (I(t) I(t-1))    c_1   c_2   c_3   c_4
(-1 -1)                 0     0     1     0
(-1  1)                 0     1     0     0
( 1 -1)                 1     0     0     0
( 1  1)                 0     0     0     1

Fig. 2. Clusters formed from noisy received data in the one-dimensional space. Channel: H(z) = 0.5 + 0.7z^{-1} + 0.5z^{-2}. Solid lines correspond to noiseless channel outputs.

Table 3
Probability matrix b_n(c_k), after convergence, for the channel H(z) = 0.5 + 0.7z^{-1} + 0.5z^{-2}

Label (I(t) I(t-1) I(t-2))   c_1   c_2   c_3   c_4   c_5   c_6
(-1 -1 -1)                    0     0     0     0     0     1
(-1 -1  1)                    0     0     1     0     0     0
(-1  1 -1)                    0     0     0     1     0     0
(-1  1  1)                    1     0     0     0     0     0
( 1 -1 -1)                    0     0     1     0     0     0
( 1 -1  1)                    0     1     0     0     0     0
( 1  1 -1)                    1     0     0     0     0     0
( 1  1  1)                    0     0     0     0     1     0

(Here c_1 stands for the overlapping pair c_11 = c_12 and c_3 for the pair c_31 = c_32; see the text.)

expected to appear (Q = 2^{L+D} = 8, for L = 2 and D = 1), only 6 clusters are observed (see Fig. 2). In this case, the blind algorithm proceeds as follows. (a) Cluster representative estimation results in the following representatives: c_1 = 0.7, c_2 = 0.3, c_3 = -0.7, c_4 = -0.3, c_5 = 1.7, c_6 = -1.7. (b) The number of HMM states is N = 4 and Q = 6. The matrix of b_n(c_k), for n = 1, ..., 8, k = 1, ..., 6, converges to the matrix appearing in Table 3 (in 7 recursions, with T = 40).

From this matrix, we recover the labeling of the clusters: cluster representative c_1 (actually corresponding to two overlapping clusters, c_11 = c_12 = 0.7) has two labels, corresponding to (-1 +1 +1) and (+1 +1 -1), respectively. This is a result of the symmetry of the channel, which actually causes the overlapping of the clusters. The labels of the other representatives are determined in the same manner.

From this example, it is apparent that the situation of overlapping clusters does not cause any problem for the proposed algorithm. It should be emphasized that overlapping clusters (usually resulting from symmetric channels) can lead the cluster-based algorithms of [9,16] to incorrect cluster labeling. The algorithms of [9,16] need the information of a starting point for their initialization. This starting point is a cluster that jumps to itself, namely the cluster which corresponds to a label of identical symbols (i.e., (+1 +1 +1) if L = 2). However, in the overlapping-clusters case, a cluster (which actually corresponds to 2 or more overlapping clusters with different labels) can seem to jump to itself although its label is not of this form. For example, in the above-described channel, cluster c_11 = 0.7, with label corresponding to (+1 +1 -1), has a transition to cluster c_12 = 0.7, with label corresponding to (-1 +1 +1). If this overlapping cluster is assumed as the starting point (since it seems to jump to itself), then a false starting point is adopted. Since the subsequent cluster labeling procedure depends on the correctness of this starting-point cluster, it is apparent that this situation can cause serious problems for the algorithms of [9,16].

Fig. 3. Performance achieved for the nonlinear channel (Example 3). 'L': proposed blind equalizer; '-': CBSE with known cluster representatives; '.': MLSE with linear channel estimator.

Example 3 (Nonlinear channel). Consider, for example, the channel with H(z) = 0.34 + 0.87z^{-1} + 0.34z^{-2} and the nonlinear function g(t) + 0.05g(t)^2 - 0.1g(t)^3. Fig. 3 illustrates the performance of three equalizers: (a) the proposed cluster-HMM blind equalizer, (b) a CBSE using a training sequence and (c) a classical maximum likelihood sequence estimator (MLSE) with channel estimation achieved through an RLS algorithm. From this figure, it is apparent that the performance of the proposed equalizer is the same as that of the supervised CBSE. Moreover, the performance of the proposed equalizer is substantially better than that of the conventional MLSE.
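For reference, this test channel can be sketched as follows, assuming (as is common in such simulations) that the memoryless nonlinearity acts on the noiseless output of the linear part before the noise is added; the function name is ours.

```python
import numpy as np

def nonlinear_channel(I, snr_db=20.0, seed=0):
    """Linear part H(z) = 0.34 + 0.87z^-1 + 0.34z^-2 plus the cubic nonlinearity."""
    rng = np.random.default_rng(seed)
    c = np.convolve(I, (0.34, 0.87, 0.34))[:len(I)]   # linear ISI part, L = 2
    c = c + 0.05 * c**2 - 0.1 * c**3                  # memoryless nonlinearity
    noise_var = np.mean(c**2) / 10**(snr_db / 10)
    return c + rng.normal(0.0, np.sqrt(noise_var), len(I))
```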

It should also be noted that the proposed equalizer is able to treat channel nonlinearities without having to model them. In contrast, the equalizers of [9,16], due to the special way of initialization that they adopt, are constrained to monotonic nonlinearities only.

The performance of the proposed blind algorithm has been investigated on a variety of channels and for a number of signaling schemes. All the results indicate the robustness of the proposed scheme. The adoption of one-dimensional observation data in the unsupervised clustering procedure gives the proposed equalizer complexity and convergence benefits compared with the corresponding clustering equalizers of [16,9].

Example 4. In this example, the convergence rate and the complexity of the proposed method are compared with those of the blind equalizer presented in [10]. In order to determine the accuracy of the achieved values of the cluster representatives, we introduce the following measure:

m = (1/Q) Σ_{k=1}^{Q} (c_k - ĉ_k)^2,  (11)

which represents the average squared error between the cluster representatives c_k and the estimated representatives ĉ_k, where Q is the number of representatives. In the above equation, ĉ_k corresponds to the value achieved in steady state (after the algorithm has converged).
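In code, the measure amounts to a mean squared error after matching clusters by value (a sketch; the sorting-based matching is our simplification):

```python
import numpy as np

def representative_mse(c_true, c_est):
    """Eq. (11): average squared error between true and estimated representatives."""
    c_true, c_est = np.sort(c_true), np.sort(c_est)
    return np.mean((c_true - c_est) ** 2)

# For the Example 1 figures quoted below:
# representative_mse([-1.5, -0.5, 0.5, 1.5], [-1.49, -0.507, 0.49, 1.512])
# evaluates to roughly 1e-4, matching the reported m = 10^-4.
```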

Consider the channel of Example 1. As becomes apparent from that example, the proposed algorithm needs 30 data for the batch clustering procedure, and 20 of those data are reused for feeding the HMM. The HMM converges in 8 iterations. The average squared error of the achieved representatives is m = 10^{-4}.

In the method of [10] a continuous HMM is used for training the clusters (both representative estimation and labeling). For the same experiment, in order to obtain estimates of the same accuracy as above, the number of data used per iteration needs to be set equal to T = 100, and convergence of the algorithm is achieved in 16 iterations. The initial values of the representatives are set to 0. Fig. 4 illustrates the achieved average squared error, in steady state, for different values of the number of data used per iteration, estimated over 100 independent realizations of the experiment.

Similar behaviour, concerning the convergence and the required number of data of the two compared algorithms, has been verified via a number of different experiments. The fact that the proposed algorithm decouples the tasks of representative identification and labeling makes the whole task more robust and offers convergence and complexity benefits compared to the algorithm of [10]. The complexity of both algorithms, per iteration, is


Fig. 4. Average squared error versus T (number of data used per iteration) for the algorithm of [10].

Table 4
Probability matrix b_n(c_k) for the channel H(z) = 1 + 0.5z^{-1}, with an overdetermined number of clusters

Label (I(t) I(t-1) I(t-2))   c_1   c_2   c_3   c_4   c_5   c_6   c_7   c_8
(-1 -1 -1)                    0     0     0     1     0     1     0     0
(-1 -1  1)                    0     0     0     1     0     1     0     0
(-1  1 -1)                    0     0     0     0     1     0     0     1
(-1  1  1)                    0     0     0     0     1     0     0     1
( 1 -1 -1)                    0     1     0     0     0     0     1     0
( 1 -1  1)                    0     1     0     0     0     0     1     0
( 1  1 -1)                    1     0     1     0     0     0     0     0
( 1  1  1)                    1     0     1     0     0     0     0     0

roughly O(TN^2), where T is the number of data used per iteration and N is the number of states. However, the complexity of the discrete HMM is lower than that of the continuous HMM. Moreover, the number of iterations required for convergence and the parameter T have much lower values in the case of the proposed algorithm, as already pointed out. The complexity of the proposed algorithm also depends on the complexity of the clustering procedure. However, the complexity of many clustering algorithms, e.g. Isodata, is low (linearly dependent on the number of data) and adds little to the total complexity. Thus, the total computational burden of the proposed algorithm compares favorably to that of [10].

Example 5 (Channel order mismatch). In all algorithms for blind equalization, channel order estimation is a major problem, and usually the channel length is assumed known (e.g., [10,16,9]). Here, the effect of channel order mismatch on the proposed cluster-based blind equalizer is discussed. Two different situations are considered: overdetermined and underdetermined channel order.

5.1. Overdetermined channel order

In the case of overdetermined channel order, M^{L+1+r} clusters are assumed instead of M^{L+1} (or, equivalently, the channel length is assumed to be L+1+r instead of L+1).

Consider, for example, a binary alphabet and the channel of Example 1, which results in 4 clusters in the one-dimensional space. Suppose that 8 clusters are assumed. Then the unsupervised algorithm results in 8 cluster representatives (i.e., c_1 = -1.5, c_2 = -1.49, c_3 = -0.5, c_4 = -0.49, c_5 = 0.5, c_6 = 0.49, c_7 = 1.5, c_8 = 1.51). When these (redundant) cluster representatives feed the HMM, the learning procedure results in the matrix appearing in Table 4. From this matrix, we observe that each label (horizontal line) corresponds to 2 clusters, and this indicates the overdetermination of the channel order, since the last bit I(t-2) carries no relevant information.

5.2. Underdetermined channel order

In this case, M^{L+1-r} clusters are assumed instead of M^{L+1} (or, equivalently, the channel length is assumed to be L+1-r instead of L+1). That is, grouping of clusters takes place [3]. Two different situations are considered in this case.

(a) The clusters which are grouped correspond to the same value of the transmitted symbol I(t). Take, for example, the channel with H(z) = 1 + 0.4z^{-1} - 0.2z^{-2}, which forms 8 clusters in the one-dimensional space (binary data assumed). Assume that only 4 clusters are considered. The unsupervised clustering technique


Table 5
Probability matrix b_n(c_k) for the channel H(z) = 1 + 0.4z^{-1} - 0.2z^{-2}, with an underdetermined number of clusters

Label (I(t) I(t-1))    c_1   c_2   c_3   c_4
(-1 -1)                 1     0     0     0
(-1  1)                 0     0     1     0
( 1 -1)                 0     1     0     0
( 1  1)                 0     0     0     1

Table 6
Probability matrix b_n(c_k) for the channel H(z) = 0.5 + 0.3z^{-1} + 0.6z^{-2}, with an underdetermined number of clusters

Label (I(t) I(t-1))    c_1   c_2   c_3   c_4
(-1 -1)                 0    0.5   0.5    0
(-1  1)                 0     0    0.5   0.5
( 1 -1)                 0     0    0.5   0.5
( 1  1)                0.5    0     0    0.5

results in the clusters c_1 = -1.4, c_2 = -0.6, c_3 = 0.6, c_4 = 1.4 (SNR = 20 dB). Each of these clusters results from the grouping of two clusters whose labels have the bits I(t) and I(t-1) in common. For example, the cluster -1.4 results from the grouping of the clusters -1.2 and -1.6, corresponding to the labels with (I(t) I(t-1) I(t-2)) = (-1 -1 -1) and (-1 -1 +1), respectively. The HMM procedure results in the probability matrix appearing in Table 5. With the aid of this matrix, identification of the 4 assumed clusters is performed. Next, these labeled clusters can feed a CBSE working in reduced-states mode [3]. Such a mode of operation, although it may affect the performance slightly, is not catastrophic.

(b) The clusters that are grouped correspond to different values of the symbol I(t). Consider, for example, the channel H(z) = 0.5 + 0.3z^{-1} + 0.6z^{-2}, resulting in 8 clusters in the one-dimensional space, namely the clusters -1.4, -0.2, -0.8, 0.4, -0.4, 0.8, 0.2, 1.4. When only 4 clusters are assumed, the unsupervised learning results in the clusters c_1 = 1.4, c_2 = -1.4, c_3 = -0.47, c_4 = 0.47. However, cluster -0.47 comes from the grouping of the clusters -0.2, -0.8 and -0.4, corresponding to the labels with (I(t) I(t-1) I(t-2)) = (-1 -1 +1), (-1 +1 -1) and (+1 -1 -1), respectively. Then, there is no two-symbol labeling sequence (i.e., (I(t) I(t-1))) to which the cluster -0.47 could be mapped. In this case, the labeling procedure results in the 4×4 matrix appearing in Table 6. From this matrix it is observed that the unknown probabilities b_n(c_k) converge to 0.5 and 0, instead of 1 and 0. The convergence to a probability of value 0.5 indicates that the respective label corresponds to two clusters. For example, the label (-1 -1) corresponds to both clusters c_2 and c_3. Alternatively, from the matrix it is seen that the cluster c_3 comes from 3 grouped clusters with labels corresponding to (-1 -1), (-1 +1) and (+1 -1). Thus, underdetermining the number of clusters is reflected in the values of the resulting matrix. Once this is detected, the clusters can be reestimated using a higher channel order (so as to achieve an exact determination of the channel order). Alternatively, the CBSE can work in a reduced-states mode by using the information obtained from the estimated probability matrix (e.g., from Table 6 it is concluded that cluster -0.47 (c_3) is assigned to the two-bit labels (-1 -1), (-1 1) and (1 -1), etc.).

Example 6 (Semi-blind case). In this example, the contribution of the use of known symbols to the blind algorithm is examined. Consider the channel with H(z) = 0.34 + 0.87z^{-1} + 0.34z^{-2} and SNR = 20 dB. The CBSE is used for channel equalization in both blind and semi-blind mode. The number of data processed per iteration in the HMM learning procedure is set equal to T = 40. A different number of training data is used per experiment, i.e., N_tr = 0, 5, 10, 15, 20, 30. Note that the cases of more than 20 training data (half the data block), although they do not correspond to a semi-blind detection set-up, are examined for comparative purposes. Fig. 5 illustrates the number of iterations needed for convergence versus the number of training data used. For example, in the blind case (N_tr = 0) 8 iterations are needed, whereas with 5 training data the number of iterations is reduced to 6, etc. From this example, we


Fig. 5. Number of iterations for the convergence of the semi-blind equalizer versus the number of training bits.

can see the benefit offered by the use of (even a few) known symbols to the convergence rate.

The above results have also been verified for a number of different channels. All the results indicate the improvement achieved over the blind algorithm when some known symbols are used.

6. Conclusions

A novel clustering-based blind sequence equalizer was described. The blind CBSE consists of three parts: (a) unsupervised estimation of the cluster representatives, (b) labeling of the clusters via an HMM procedure and (c) a Viterbi algorithm for signal detection. The proposed blind equalization procedure is insensitive to the nature of the channel, i.e., linear or nonlinear, minimum phase or non-minimum phase. The case of unknown channel order was studied. The operation of the equalizer in semi-blind mode was also investigated. Simulation results have shown that even very short training sequences lead the HMM labeling algorithm to fast convergence.

References

[1] H.A. Cirpan, M.K. Tsatsanis, Stochastic maximum likelihood methods for semi-blind channel estimation, IEEE Signal Process. Lett. 5 (1) (January 1998) 21-24.

[2] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.

[3] K. Georgoulakis, S. Theodoridis, Efficient clustering techniques for channel equalization in hostile environments, Signal Process. 58 (1997) 153-164.

[4] K. Georgoulakis, S. Theodoridis, Blind equalization for nonlinear channels using HMM, Proceedings of EUSIPCO 98, Vol. III, Rodos, Greece, September 1998, pp. 1613-1616.

[5] D.N. Godard, Self-recovering equalization and carrier tracking in two-dimensional data communication systems, IEEE Trans. Commun. COM-28 (November 1980) 1867-1875.

[6] A. Gorokhov, P. Loubaton, Semi-blind second order identification of convolutive channels, Proceedings of ICASSP, Munich, Germany, April 1997, pp. 3905-3908.

[7] X.D. Huang, Y. Ariki, M.A. Jack, Hidden Markov Models for Speech Recognition, Edinburgh University Press, Edinburgh, 1990.

[8] N.K. Jablon, Joint blind equalization, carrier recovery and timing recovery for 64-QAM and 128-QAM signal constellations, Proceedings of the 1989 IEEE International Conference on Communications, Boston, MA, June 1989, pp. 1043-1049.

[9] Y.J. Jeng, C.C. Yeh, Cluster-based blind nonlinear-channel estimation, IEEE Trans. Signal Process. 45 (5) (May 1997) 1161-1172.

[10] G.K. Kaleh, R. Vallet, Joint parameter estimation and symbol detection for linear or nonlinear unknown channels, IEEE Trans. Commun. 42 (7) (July 1994) 2406-2413.

[11] T.M. Martinetz, S.G. Berkovich, K.J. Schulten, 'Neural-gas' network for vector quantization and its application to time-series prediction, IEEE Trans. Neural Networks 4 (4) (July 1993) 558-569.

[12] J.G. Proakis, Digital Communications, McGraw-Hill, New York, 1983.

[13] L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE 77 (2) (February 1989) 257-286.

[14] N. Seshadri, Joint data and channel estimation using blind trellis search techniques, IEEE Trans. Commun. 42 (2/3/4) (1994) 1000-1011.

[15] S. Theodoridis, C.F.N. Cowan, C.P. Callender, C.M.S. Lee, Schemes for equalization of communications channels with nonlinear impairments, IEE Proc. Commun. 142 (1995) 165-171.

[16] S. Theodoridis, K. Georgoulakis, Efficient clustering techniques for supervised and blind channel equalization in a hostile environment, Proceedings of EUSIPCO 1996, Trieste, Italy, September 1996, pp. 611-614.

[17] S. Theodoridis, K. Koutroumbas, Pattern Recognition, Academic Press, New York, 1998.
