
[IEEE MILCOM 2007 - IEEE Military Communications Conference - Orlando, FL, USA (2007.10.29-2007.10.31)] MILCOM 2007 - IEEE Military Communications Conference - Population Size Identification


POPULATION SIZE IDENTIFICATION FOR CDMA EAVESDROPPING

Ming Li, Stella N. Batalama* and Dimitris A. Pados

Department of Electrical Engineering, State University of New York at Buffalo, Buffalo, NY 14260.

Email: {mingli, batalama, pados}@eng.buffalo.edu

ABSTRACT

In this paper we develop a novel iterative-least-squares-type method that identifies the population size of direct-sequence CDMA signal mixtures without any knowledge about the signatures of the active spread-spectrum signals in the system or the channel state and in the absence of pilot signaling (training sequence). Simulation studies demonstrate the performance of the proposed method and offer favorable comparisons with corresponding schemes based on the conventional Akaike information criterion, the minimum description length criterion, as well as their more recent modified versions.

I. INTRODUCTION

In this paper, we aim at identifying the size of the spread-spectrum user population without any knowledge of the users' spreading codes (signatures) and in the absence of any pilot signal (training sequence) or channel state information.

We understand that identifying the population size of the active code-division multiplexed users is critical for the successful extraction of their information data. The problem of detecting the number of active (code or space) coexisting users/signals in a system has been considered in the past literature under various signal models, channel conditions and application environments. As an example, one may consider the early work in [1]-[3], where the number of users was found by utilizing an estimate of the noise subspace obtained through a sequence of binary hypothesis testing problems that identified the eigenvalues of the sample average correlation matrix pertinent to the noise subspace. Among MUSIC-type solutions we may consider the work in [4]. The proposed solution in [4] appears to be ineffective in multipath fading channels (due to an exhaustive search incorporated in the user identification process). Additional efforts and elegant proposals were based on the Akaike information criterion (AIC) and the minimum-description-length (MDL) criterion, with primary example the work in [5] that developed pertinent techniques for array signal processing applications. At the same time, it was noticed that the performance of either AIC- or MDL-type solutions was not satisfactory under small sample support or high density of user population environments [6],[7] (the same holds true for eigen-decomposition based solutions that rely on small sample support).

* Corresponding author.

In this paper, a novel population size identification method for CDMA signal eavesdropping is proposed. This method is intended to be integrated in a scheme that aims at the extraction of the information data of concurrent CDMA users with no knowledge about the size of the user population, their signature waveforms and/or channel state. After introducing the signal model and notation in Section II, we present, in Section III, a lightweight iterative least-squares (ILS) method for coupled signature estimation and information/data extraction that assumes a known number of active users [8]. In Section IV, we develop an ILS-type algorithm that identifies the population size of direct-sequence CDMA signal mixtures. Simulation studies and comparisons are presented in Section V, while a few concluding remarks are drawn in Section VI.

1-4244-1513-06/07/$25.00 ©2007 IEEE

II. SIGNAL MODEL AND PROBLEM STATEMENT

We consider a CDMA system with K users communicating over independent frequency-selective Rayleigh fading channels. User k, k = 1, 2, ..., K, is assigned a normalized signature $s_k(l) \in \{\pm 1/\sqrt{L}\}$, $l = 1, \ldots, L$, where L is the processing gain. Without loss of generality, we assume that all users have the same number of resolvable paths. Let M be the number of resolvable paths, and $\mathbf{h}_k = [h_k(1), h_k(2), \ldots, h_k(M)]^T$ (T denotes matrix transpose) be the channel response vector of user k. Then the channel-processed composite signature $\mathbf{c}_k$ of user k is an $L_M \times 1$ vector, where $L_M = L + M - 1$, and can be written as

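$$\mathbf{c}_k = \mathbf{S}_k \mathbf{h}_k \qquad (1)$$

where

$$\mathbf{S}_k = \begin{bmatrix} s_k(1) & & 0 \\ \vdots & \ddots & \\ s_k(L) & & s_k(1) \\ & \ddots & \vdots \\ 0 & & s_k(L) \end{bmatrix}_{L_M \times M}. \qquad (2)$$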
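As an illustration of (1)-(2), the following minimal sketch builds the shifted-signature matrix and the composite signature with NumPy. The function name `signature_matrix`, the random $\pm 1/\sqrt{L}$ sequence (a stand-in for an actual spreading code) and the complex Gaussian channel taps are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def signature_matrix(s, M):
    """Return the (L+M-1) x M matrix whose m-th column is s shifted down by m chips."""
    L = len(s)
    S = np.zeros((L + M - 1, M), dtype=complex)
    for m in range(M):
        S[m:m + L, m] = s
    return S

rng = np.random.default_rng(0)
L, M = 31, 3                                        # processing gain, resolvable paths
s = rng.choice([-1.0, 1.0], size=L) / np.sqrt(L)    # hypothetical unit-norm signature
h = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2 * M)  # hypothetical taps

S = signature_matrix(s, M)
c = S @ h                                           # composite signature, length L + M - 1
```

Because each column of the matrix is a delayed copy of the signature, the product in (1) is the linear convolution of the signature with the channel response, which gives a quick consistency check (`np.convolve(s, h)`).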

In a synchronous CDMA environment, the discrete-time received signal vector for the i-th symbol interval (after chip-matched filtering and chip-rate sampling) can be expressed as

$$\mathbf{y}(i) = \sum_{k=1}^{K} \left\{ \sqrt{E_k}\,\mathbf{c}_k b_k(i) + \sqrt{E_k}\,\mathbf{c}_k^{-} b_k(i-1) + \sqrt{E_k}\,\mathbf{c}_k^{+} b_k(i+1) \right\} + \mathbf{n} \qquad (3)$$

where $\mathbf{y}(i) \in \mathbb{C}^{L_M \times 1}$ is the combined data/observation vector (snapshot) at the i-th transmission period, and $E_k$ and $b_k(i) \in \{\pm 1\}$ are the energy level and the i-th bit of user k, k = 1, ..., K, respectively. The bits are viewed as binary equiprobable random variables that are independent within a user stream and across users. $\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I})$ denotes additive white Gaussian noise (AWGN), while $\sqrt{E_k}\,\mathbf{c}_k^{-} b_k(i-1)$ and $\sqrt{E_k}\,\mathbf{c}_k^{+} b_k(i+1)$ indicate ISI from past and future bits, respectively. Since the effect of ISI is negligible for most applications of practical interest, where the number of resolvable multipaths is much less than the processing gain, for mathematical convenience we will not consider the ISI terms in our theoretical developments (ISI will, however, be considered in our simulation studies). If we denote $\mathbf{v}_k \triangleq \sqrt{E_k}\,\mathbf{c}_k$, then the received signal can be expressed as

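$$\begin{aligned} \mathbf{y}(i) &= \sum_{k=1}^{K} \mathbf{v}_k b_k(i) + \mathbf{n} && (4) \\ &= \mathbf{V}\mathbf{b}(i) + \mathbf{n} && (5) \end{aligned}$$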

where $\mathbf{V} = [\mathbf{v}_1, \ldots, \mathbf{v}_K]$ is the energy-absorbing channel-processed signature matrix and $\mathbf{b}(i) = [b_1(i), \ldots, b_K(i)]^T$ is the vector of bits sent by all K users at the i-th transmission period. We collect the tapped received signal $\mathbf{y}(i)$ for N snapshots and rewrite it in matrix form as

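$$\mathbf{Y} = \mathbf{V}\mathbf{B} + \mathbf{N} \qquad (6)$$

where $\mathbf{Y} \in \mathbb{C}^{L_M \times N}$ is the data/observation matrix and $\mathbf{B} = [\mathbf{b}(1), \ldots, \mathbf{b}(N)]$ is a $K \times N$ matrix that contains the N symbols transmitted by the K users. $\mathbf{N}$ is an $L_M \times N$ matrix with columns modeled as Gaussian random vectors.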
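The matrix model (6) is straightforward to simulate. The sketch below is only an illustration under stated assumptions: the helper name `simulate_snapshots` is hypothetical, the ISI terms are ignored as in the theoretical treatment, the columns of V are drawn as complex Gaussian vectors rather than built from real signatures and channels, and the noise is taken to be circular complex Gaussian with total variance `noise_var` per entry.

```python
import numpy as np

def simulate_snapshots(V, N, noise_var, rng):
    """Draw N snapshots Y = V B + noise with equiprobable +/-1 bits (ISI ignored)."""
    L_M, K = V.shape
    B = rng.choice([-1.0, 1.0], size=(K, N))
    noise = np.sqrt(noise_var / 2) * (
        rng.standard_normal((L_M, N)) + 1j * rng.standard_normal((L_M, N))
    )
    return V @ B + noise, B

rng = np.random.default_rng(1)
L_M, K, N = 33, 4, 200
# Hypothetical energy-absorbing signature matrix with complex Gaussian columns.
V = (rng.standard_normal((L_M, K)) + 1j * rng.standard_normal((L_M, K))) / np.sqrt(2)
Y, B = simulate_snapshots(V, N, noise_var=1.0, rng=rng)
```

The eavesdropper only observes Y; neither V, B nor the row dimension K of B is available to it.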

Our technical objective is the identification of the user population size (i.e., the row dimension of $\mathbf{B}$). In the sequel, we develop an iterative procedure that achieves this goal and exhibits low computational complexity and fast convergence.

III. ITERATIVE LEAST SQUARES ALGORITHM

We consider the signal model in (1)-(6), where the channel matrix $\mathbf{V}$ is assumed unknown. In this section, we assume that the number of active users K is known [8].

Clearly, $\mathbf{V}$ can be estimated in a supervised fashion, i.e., by utilizing a training sequence (pilot signaling). However, supervised channel estimation may not be a preferred approach for rapidly changing communication environments due to the need for frequent piloting (retraining and system adaptation to the new channel conditions) and the associated bandwidth waste. In addition, for our specific application of interest (eavesdropping) a training sequence is not, in general, available.

Taking the above considerations into account, our approach begins by formulating the CDMA signal extraction problem as a joint detection and estimation problem with the following least-squares (LS)-type solution

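$$\hat{\mathbf{V}}, \hat{\mathbf{B}} = \arg \min_{\mathbf{V} \in \mathbb{C}^{L_M \times K},\; \mathbf{B} \in \{\pm 1\}^{K \times N}} \|\mathbf{Y} - \mathbf{V}\mathbf{B}\|_F^2 \qquad (7)$$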

where $\|\cdot\|_F$ denotes the matrix Frobenius norm. The LS solution above coincides with the conditional maximum likelihood solution (treating $\mathbf{V}$ and $\mathbf{B}$ as deterministic unknowns). In any case, regretfully, joint detection and estimation has complexity exponential in KN. We consider this cost unacceptable and attempt to reach a quality approximation of the solution by alternating least-squares estimates of $\mathbf{V}$ and $\mathbf{B}$, iteratively.

The basic idea behind such an iterative least-squares (ILS) solution is simple [9]-[11]. Each time we compute an LS update ($\hat{\mathbf{B}}$) of one of the unknown (matrix) parameters conditioned on a previously obtained estimate ($\hat{\mathbf{V}}$) of the other (matrix) parameter, we proceed to update the estimate of the other matrix ($\hat{\mathbf{V}}$), until convergence of the LS cost function is reached. More specifically, we iterate estimation of $\mathbf{V}$ and $\mathbf{B}$, treating $\mathbf{B}$ initially as a continuous-valued matrix. We then project the estimate of the continuous-valued matrix $\mathbf{B}$ to the digital space $\{\pm 1\}^{K \times N}$ (two-level zero-threshold hard-limiter quantization). The iterative least-squares procedure is presented below. Superscripts denote the iteration index.

Table I

Iterative scheme for data extraction

1) $p = 0$; initialize $\hat{\mathbf{V}}^{(0)}$.
2) $p = p + 1$;

$$\hat{\mathbf{B}}^{(p)} = \mathrm{sgn}\left\{ \Re\left\{ \left[ (\hat{\mathbf{V}}^{(p-1)})^H \hat{\mathbf{V}}^{(p-1)} \right]^{-1} (\hat{\mathbf{V}}^{(p-1)})^H \mathbf{Y} \right\} \right\} \qquad (8)$$

$$\hat{\mathbf{V}}^{(p)} = \mathbf{Y} (\hat{\mathbf{B}}^{(p)})^H \left[ \hat{\mathbf{B}}^{(p)} (\hat{\mathbf{B}}^{(p)})^H \right]^{-1} \qquad (9)$$

3) Repeat Step 2 until $(\hat{\mathbf{B}}^{(p)}, \hat{\mathbf{V}}^{(p)}) = (\hat{\mathbf{B}}^{(p-1)}, \hat{\mathbf{V}}^{(p-1)})$.
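The alternating updates of Table I can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `ils`, the `max_iter` safeguard, the tie-breaking of zero-valued decisions toward +1 and the use of `np.linalg.pinv` in place of the explicit inverses (the two agree when the matrices involved have full rank) are all assumptions of this example.

```python
import numpy as np

def ils(Y, V0, max_iter=100):
    """Alternate the two LS updates of Table I, starting from V0, until B stops changing."""
    V, B = V0.astype(complex), None
    for _ in range(max_iter):
        # Unconstrained LS estimate of B given V, projected onto {+1, -1}.
        B_new = np.sign(np.real(np.linalg.pinv(V) @ Y))
        B_new[B_new == 0] = 1.0          # assumption: break ties toward +1
        # LS estimate of V given the quantized B.
        V_new = Y @ np.linalg.pinv(B_new)
        converged = B is not None and np.array_equal(B_new, B)
        V, B = V_new, B_new
        if converged:
            break
    return V, B

# Toy run on synthetic data (all values below are hypothetical).
rng = np.random.default_rng(2)
L_M, K, N = 16, 2, 100
V_true = (rng.standard_normal((L_M, K)) + 1j * rng.standard_normal((L_M, K))) / np.sqrt(2)
B_true = rng.choice([-1.0, 1.0], size=(K, N))
noise = (rng.standard_normal((L_M, N)) + 1j * rng.standard_normal((L_M, N))) / np.sqrt(2)
Y = V_true @ B_true + 0.1 * noise
V0 = (rng.standard_normal((L_M, K)) + 1j * rng.standard_normal((L_M, K))) / np.sqrt(2)
V_hat, B_hat = ils(Y, V0)
```

Since nothing in the cost function fixes the ordering of the users or the sign of each signature, a blind procedure of this kind can recover the rows of B at best up to a permutation and a sign flip per row.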

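IV. COMBINED POPULATION SIZE IDENTIFICATION AND DATA DETECTION

In Section III we developed an ILS-type scheme that can successfully extract the information bits of coexisting CDMA users under the assumption that the number of active users is known. In this section we propose a combined scheme that estimates the number of active users and simultaneously detects their information data symbols in the absence of any knowledge about their signatures or channel state.

Most of the existing methods for identifying the size of the user/signal population are based on the AIC and MDL criteria. If $L_M = L + M - 1$ is the dimension of the channel-processed signature vector, [5] proposed to estimate the number of active (space-division) multiplexed signals as the value $\hat{K}$ that minimizes the AIC or MDL criterion given by

$$\begin{aligned} \hat{K} &= \arg \min_{k \in \{1,2,\ldots,L_M\}} \mathrm{AIC}(k) && (10) \\ &= \arg \min_{k \in \{1,2,\ldots,L_M\}} \left[ -2 \log \mathcal{L}_k^{(L_M - k)N} + 2 v(k, L_M) \right], && (11) \end{aligned}$$

or

$$\begin{aligned} \hat{K} &= \arg \min_{k \in \{1,2,\ldots,L_M\}} \mathrm{MDL}(k) && (12) \\ &= \arg \min_{k \in \{1,2,\ldots,L_M\}} \left[ -\log \mathcal{L}_k^{(L_M - k)N} + \frac{1}{2} v(k, L_M) \log N \right], && (13) \end{aligned}$$

where

$$\mathcal{L}_k = \frac{\prod_{i=k+1}^{L_M} \lambda_i^{1/(L_M - k)}}{\frac{1}{L_M - k} \sum_{i=k+1}^{L_M} \lambda_i},$$

$v(k, L_M) = k(2L_M - k)$ is a penalty function, and $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{L_M}$ denote the eigenvalues of the sample average covariance matrix $\hat{\mathbf{R}} \triangleq \frac{1}{N} \mathbf{Y}\mathbf{Y}^H$. Criteria (11) and (13) exhibit poor performance when the coexisting signals are correlated. Such signals are better handled by the modified versions proposed in [12], which exhibit higher computational complexity.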
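For reference, the two eigenvalue-based criteria of [5] can be evaluated as in the sketch below. It is only an illustration: the function name `aic_mdl_order` is hypothetical, the search is restricted to $k = 1, \ldots, L_M - 1$ because the ratio of geometric to arithmetic mean is undefined when no eigenvalues remain, and the small floor on the eigenvalues is a numerical safeguard added here.

```python
import numpy as np

def aic_mdl_order(Y):
    """Return (k_aic, k_mdl), the minimizers of the AIC and MDL criteria over k = 1..L_M-1."""
    L_M, N = Y.shape
    R = (Y @ Y.conj().T) / N                          # sample average covariance matrix
    lam = np.sort(np.linalg.eigvalsh(R))[::-1]        # eigenvalues in descending order
    lam = np.maximum(lam, 1e-12)                      # assumption: floor to keep logs finite
    ks = np.arange(1, L_M)
    aic, mdl = [], []
    for k in ks:
        tail = lam[k:]                                # the L_M - k smallest eigenvalues
        m = L_M - k
        log_ratio = np.mean(np.log(tail)) - np.log(np.mean(tail))  # log(geometric/arithmetic mean)
        penalty = k * (2 * L_M - k)
        aic.append(-2 * m * N * log_ratio + 2 * penalty)
        mdl.append(-m * N * log_ratio + 0.5 * penalty * np.log(N))
    return int(ks[np.argmin(aic)]), int(ks[np.argmin(mdl)])

# Toy run: three strong, uncorrelated users (hypothetical values).
rng = np.random.default_rng(4)
L_M, K, N = 16, 3, 500
V = 2 * (rng.standard_normal((L_M, K)) + 1j * rng.standard_normal((L_M, K))) / np.sqrt(2)
B = rng.choice([-1.0, 1.0], size=(K, N))
noise = (rng.standard_normal((L_M, N)) + 1j * rng.standard_normal((L_M, N))) / np.sqrt(2)
k_aic, k_mdl = aic_mdl_order(V @ B + noise)
```

Because the MDL penalty is heavier than the AIC penalty once $\log N > 2$, the MDL estimate cannot exceed the AIC estimate for the same data.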

If we assume that all active users in the system have sufficient signal power in the sense that $\mathbf{v}_k^H \mathbf{v}_k > \sigma^2$, $k = 1, \ldots, K$, where $\sigma^2$ is the noise power, then our proposed approach evaluates the ILS estimates of matrix $\mathbf{V}$ (and $\mathbf{B}$) as follows. First, we run the ILS algorithm of (8)-(9) by setting the number of active users q = 1 and initialize the algorithm with $\hat{\mathbf{V}}_q^{(0)}$, where the subscript q identifies the number of columns of $\hat{\mathbf{V}}$; when q = 1, $\hat{\mathbf{V}}_q^{(0)}$ is merely a randomly chosen complex Gaussian vector.

The ILS algorithm returns a channel-processed signature vector (q = 1) or matrix (q > 1) $\hat{\mathbf{V}}_q$ and a data vector or matrix $\hat{\mathbf{B}}_q$, respectively. Let $\hat{\sigma}_q^2 = \frac{\|\mathbf{Y} - \hat{\mathbf{V}}_q \hat{\mathbf{B}}_q\|_F^2}{L_M N}$ be our proposed estimate of the received signal residual power (i.e., after subtracting the signal of the ILS-estimated users). As long as the squared norm of each column of $\hat{\mathbf{V}}_q$ is larger than $\hat{\sigma}_q^2$, we repeat the ILS process using each time as initial estimate $\hat{\mathbf{V}}_{q+1}^{(0)} = [\hat{\mathbf{V}}_q, \mathbf{a}_{L_M \times 1}]$, where $\hat{\mathbf{V}}_q$ is the estimate returned by the ILS procedure in the previous step; i.e., the matrix $\hat{\mathbf{V}}_q$ is now expanded by a randomly chosen Gaussian vector $\mathbf{a}$ and the ILS process is repeated with q := q + 1. The procedure stops once at least one column of the new matrix $\hat{\mathbf{V}}_q$ has squared norm less than $\hat{\sigma}_q^2$. Then the number of users in the system is estimated to be equal to q - 1 and the detected bits are given by $\hat{\mathbf{B}}_{q-1}$, obtained during the previous execution of the process. Our proposed combined population size identification and data detection scheme is summarized below.


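Table II

User identification and data extraction scheme

1) $q = 1$; initialize $\hat{\mathbf{V}}_1^{(0)} = [\mathbf{a}]_{L_M \times 1}$.
2) Execute the iterative procedure of Table I initialized at $\hat{\mathbf{V}}_q^{(0)}$.
3) Let $\hat{\mathbf{V}}_q$, $\hat{\mathbf{B}}_q$ be the estimates returned by the procedure in Table I; let $\hat{\mathbf{v}}_{\min}$ be the vector in $\hat{\mathbf{V}}_q$ with the smallest norm.
   If $\hat{\mathbf{v}}_{\min}^H \hat{\mathbf{v}}_{\min} > \frac{\|\mathbf{Y} - \hat{\mathbf{V}}_q \hat{\mathbf{B}}_q\|_F^2}{L_M N}$, then $\hat{\mathbf{V}}_{q+1}^{(0)} = [\hat{\mathbf{V}}_q, \mathbf{a}]$, $q = q + 1$, go to 2); else exit.
4) Set number of active users $\hat{K} = q - 1$.
5) Extract the information data bits as $\hat{\mathbf{B}}_{q-1}$.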
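The steps of Table II can be sketched as a loop around the ILS routine. The code below is a hedged illustration, not the authors' implementation: the names `ils`, `estimate_population` and `random_column` are hypothetical, `np.linalg.pinv` replaces the explicit inverses of (8)-(9), and the `max_iter` and `q_max` limits are safeguards added here so that the example always terminates.

```python
import numpy as np

def ils(Y, V0, max_iter=100):
    """Alternating LS updates of Table I; pinv stands in for the explicit inverses."""
    V, B = V0.astype(complex), None
    for _ in range(max_iter):
        B_new = np.sign(np.real(np.linalg.pinv(V) @ Y))
        B_new[B_new == 0] = 1.0                      # assumption: break ties toward +1
        V_new = Y @ np.linalg.pinv(B_new)
        converged = B is not None and np.array_equal(B_new, B)
        V, B = V_new, B_new
        if converged:
            break
    return V, B

def estimate_population(Y, rng, q_max=None):
    """Grow the model one column at a time until the weakest column falls below the residual power."""
    L_M, N = Y.shape
    q_max = L_M if q_max is None else q_max          # assumption: upper limit on q

    def random_column():
        return (rng.standard_normal((L_M, 1)) + 1j * rng.standard_normal((L_M, 1))) / np.sqrt(2)

    q, V0, B_prev = 1, random_column(), None
    while True:
        V, B = ils(Y, V0)                                                   # step 2
        residual_power = np.linalg.norm(Y - V @ B, "fro") ** 2 / (L_M * N)  # step 3
        weakest = np.min(np.sum(np.abs(V) ** 2, axis=0))
        if weakest <= residual_power or q >= q_max:
            break
        B_prev, V0, q = B, np.hstack([V, random_column()]), q + 1
    return q - 1, B_prev                                                    # steps 4 and 5

# Toy run on synthetic data (all values below are hypothetical).
rng = np.random.default_rng(3)
L_M, K, N = 16, 2, 100
V_true = 2 * (rng.standard_normal((L_M, K)) + 1j * rng.standard_normal((L_M, K))) / np.sqrt(2)
B_true = rng.choice([-1.0, 1.0], size=(K, N))
noise = (rng.standard_normal((L_M, N)) + 1j * rng.standard_normal((L_M, N))) / np.sqrt(2)
K_hat, B_hat = estimate_population(V_true @ B_true + noise, rng, q_max=8)
```

If the test in step 3 already fails at q = 1, this sketch returns a population size of zero and no bit matrix, which matches setting the estimate to q - 1 in step 4.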

V. SIMULATION STUDIES

[Figure: probability of error versus SNR in dB; curves for AIC, MDL, AIC_mod, MDL_mod and the proposed scheme.]

Fig. 1. Probability of error versus SNR (user population K = 10, sample support N = 200).

Our simulations focus on the performance evaluation of the proposed scheme in terms of the probability of error in identifying the number of active users in the system. We consider a multiuser CDMA system with users that utilize Gold signatures with processing gain L = 31. Each user signal is transmitted over a multipath Rayleigh fading channel (the channels are considered to be different among users) and in the presence of additive white Gaussian noise with variance fixed at $\sigma^2 = 1$. We identify the user population by executing the method proposed in Table II and calculate the probability of error $P(\hat{K} \neq K)$ over $10^5$ experiments. For comparison purposes we also implement the corresponding schemes based on the conventional AIC and MDL criteria [5] as well as their modified versions [12], which we denote as $\mathrm{AIC}_{\mathrm{mod}}$ and $\mathrm{MDL}_{\mathrm{mod}}$.

In Fig. 1 we consider K = 10 users with equal SNR ($\mathrm{SNR}_1 = \mathrm{SNR}_2 = \cdots = \mathrm{SNR}_{10}$) ranging from 2 dB to 12 dB. N = 200 data sample records are collected and applied. We plot the probability of error as a function of the signal-to-noise ratio (SNR). We observe that the proposed method outperforms all other schemes, while the relative performance of AIC, MDL, $\mathrm{AIC}_{\mathrm{mod}}$ and $\mathrm{MDL}_{\mathrm{mod}}$ is as expected. Similar conclusions are drawn from Fig. 2, where K and N are increased to 15 and 300, respectively.

[Figure: probability of error versus SNR in dB; curves for AIC, MDL, AIC_mod, MDL_mod and the proposed scheme.]

Fig. 2. Probability of error versus SNR (user population K = 15, sample support N = 300).

The performance of the proposed scheme as a function of the data record size is examined in Fig. 3. We consider K = 10 users with powers $E_k$, k = 1, ..., 10, ranging between 4 dB and 7 dB. We plot the probability of error as a function of data record size (N ranges from 192 to 400). Fig. 4 repeats this study for K = 15.

Fig. 5 (3-D figure) captures the performance of the proposed scheme in terms of probability of error as a function of both the data record size N and the size of the user population.


[Figure: probability of error versus sample record size.]

Fig. 3. Probability of error versus data record size (SNR = 4-7 dB, user population K = 10).

[Figure: probability of error versus sample record size.]

Fig. 4. Probability of error versus data record size (SNR = 4-7 dB, user population K = 15).

[Figure: 3-D plot of probability of error versus sample record size and user population.]

Fig. 5. Probability of error versus sample record size and user population (SNR = 5 dB).

Finally, in Fig. 6 we plot the empirical pdf of the difference $\hat{K} - K$ between the number of users $\hat{K}$ identified by a corresponding algorithm and the "genie-assisted" true value K. This figure captures the level of over- or under-estimation of the number of active users in the system by each algorithm implemented. High concentration around the 0 point indicates that the corresponding scheme estimates the number of users correctly with high probability.
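The empirical distribution of the estimation error described above is simple to compute from a list of per-experiment estimates. The sketch below uses made-up estimates purely for illustration; the function name `estimation_error_pmf` is hypothetical and the numbers are not results from the paper.

```python
from collections import Counter

def estimation_error_pmf(estimates, true_k):
    """Empirical distribution of the difference between each estimate and the true population size."""
    counts = Counter(k_hat - true_k for k_hat in estimates)
    total = len(estimates)
    return {d: counts[d] / total for d in sorted(counts)}

# Hypothetical estimates from eight experiments with true K = 15.
pmf = estimation_error_pmf([15, 15, 14, 16, 15, 15, 13, 15], true_k=15)
error_probability = 1.0 - pmf.get(0, 0.0)   # probability that the estimate differs from K
```

Negative keys correspond to under-estimation and positive keys to over-estimation; the mass at 0 is the fraction of experiments in which the population size was identified correctly.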

[Figure: histogram of $\hat{K} - K$, from under-estimating (negative values) to over-estimating (positive values), for AIC, MDL, AIC_mod, MDL_mod and the proposed scheme.]

Fig. 6. Histogram of the difference $\hat{K} - K$ for $\hat{K}$ returned by each of the five implemented schemes (K = 15, N = 200, SNR = 7 dB).

VI. CONCLUSIONS

In this paper we proposed a novel population size identification method for CDMA eavesdropping. This method is intended to be integrated in a scheme that aims at the extraction of the information data of concurrent CDMA users with no knowledge about the size of the user population, their signature waveforms and/or channel state. Simulation studies examined the performance of the proposed scheme in terms of probability of error in determining the number of users in the system and offered favorable comparisons with pertinent structures that are based on the AIC and MDL criteria.

REFERENCES

[1] M. S. Bartlett, "A note on the multiplying factors for various $\chi^2$ approximations," J. Roy. Stat. Soc., ser. B, vol. 16, pp. 296-298, 1954.

[2] D. N. Lawley, "Tests of significance of the latent roots of the covariance and correlation matrices," Biometrika, vol. 43, pp. 128-136, 1956.


[3] T. W. Anderson, "Asymptotic theory for principal component analysis," Ann. Math. Stat., vol. 34, pp. 122-148, 1963.

[4] A. Haghighat and M. R. Soleymani, "A MUSIC-based algorithm for spreading sequence discovery in multiuser DS-CDMA," Vehicular Technology Conference, Orlando, FL, Oct. 2003, vol. 2, pp. 978-981.

[5] M. Wax and T. Kailath, "Detection of signals by information theoretic criteria," IEEE Trans. Acoust., Speech, and Signal Proc., vol. 33, pp. 387-392, April 1985.

[6] Q. T. Zhang, K. M. Wong, P. C. Yip and J. P. Reilly, "Statistical analysis of the performance of information theoretic criteria in the detection of the number of signals in array processing," IEEE Trans. Acoust., Speech, and Signal Proc., vol. 37, pp. 1557-1567, Oct. 1989.

[7] H. Wang and M. Kaveh, "On the performance of signal-subspace processing - Part I: Narrow-band systems," IEEE Trans. Acoust., Speech, and Signal Proc., vol. 34, pp. 1201-1209, Oct. 1986.

[8] M. Li, S. N. Batalama, D. A. Pados and J. D. Matyjas, "Multiuser CDMA signal extraction," Military Communications Conference (MILCOM), Washington D.C., Oct. 2006, pp. 1-5.

[9] S. Talwar, M. Viberg and A. Paulraj, "Blind separation of synchronous co-channel digital signals using an antenna array - Part I: Algorithms," IEEE Trans. Signal Proc., vol. 44, pp. 1184-1197, May 1996.

[10] T. Li and N. D. Sidiropoulos, "Blind digital signal separation using successive interference cancellation iterative least squares," IEEE Trans. Signal Proc., vol. 48, pp. 3146-3152, Nov. 2000.

[11] M. Gkizeli, D. A. Pados, S. N. Batalama, and M. J. Medley, "Blind iterative recovery of spread-spectrum steganographic messages," in Proc. IEEE Intern. Conf. on Image Proc. (ICIP), Genoa, Italy, Sept. 2005, vol. 2, pp. 1098-1101.

[12] M. Wax, "Detection and localization of multiple sources via the stochastic signals model," IEEE Trans. Signal Proc., vol. 39, pp. 2450-2456, Nov. 1991.

[13] S. Verdú, Multiuser Detection. Cambridge University Press: New York, 1998.

[14] J. G. Proakis, Digital Communications. McGraw-Hill: New York, 4th edition, 2000.
