IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 17, NO. 4, JULY 2006 975

Nonlinear Spatial–Temporal Prediction Based on Optimal Fusion

Youshen Xia, Senior Member, IEEE, and Henry Leung, Member, IEEE

Abstract—The problem of spatial–temporal signal processing and modeling has been of great interest in recent years. A new spatial–temporal prediction method is presented in this paper. An optimal fusion scheme based on fourth-order statistics is first employed to combine the received signals at different spatial domains. The fused signal is then used to construct a spatial–temporal predictor by a support vector machine. It is shown theoretically that the proposed method has an improved performance even in non-Gaussian environments. To demonstrate the practicality of this spatial–temporal predictor, we apply it to model real-life radar sea-scattered signals. Experimental results show that the proposed method can provide a more accurate model for sea clutter than the conventional methods.

Index Terms—Chaos, data fusion, prediction, radar, spatial–temporal signal processing, support vector machine (SVM).

I. INTRODUCTION

The problem of spatial–temporal signal modeling and processing has been of great interest in recent years. It arises in many subjects such as video processing, sensor surveillance, wireless communications, and many other fields [1]–[4]. For example, space-time coding has recently been proposed for wireless communication. In fact, space-time signal processing techniques are required for the new generation of wireless technologies that employ antenna arrays. Another problem that involves space-time processing is to model signals for network sensor surveillance. For instance, radar backscatter from a sea surface is basically a spatial–temporal phenomenon. The scattered signals come from a moving surface, and thus spatial effects in nearby areas cannot be ignored for accurate radar detection. In all these cases, spatial–temporal signal processing techniques are required since the signal processes involve both spatial and temporal information. In this paper, we are concerned with one particular signal processing function: spatial–temporal prediction.

Prediction is a very important signal processing function since it is widely used in coding, noise cancellation, and modeling. Many nonlinear prediction techniques based on Volterra systems [5], neural networks [6], adaptive rational function filters [7], and fuzzy logic [8] have been developed in the literature. However, most of these techniques assume that the spatial structures remain frozen and do not fully exploit the available information in the spatial domain. To extend these temporal nonlinear predictors to the spatial–temporal domain, we have to incorporate the spatial information into these nonlinear predictors. A conventional approach to spatial–temporal signal processes is based on a linear decomposition of the spatial modes into a high-dimensional vector space. This approach tries to find a group of dominating models whose combination provides a good approximation of the underlying spatial–temporal system. It is effective only if such a decomposition has a few dominating models. But even for low-dimensional systems, the number of dominating models is usually found to be quite large [9].

Manuscript received July 25, 2005; revised January 11, 2006. The authors are with the Department of Electrical and Computer Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/TNN.2006.875985

One relatively new spatial–temporal computational scheme called the coupled map lattice (CML) [10] has been proposed in the physics literature. It has been mainly used to simulate spatial–temporal nonlinear dynamical behavior. The CML uses a linear combiner to fuse the spatial data at the measurement level [11]–[13]. However, the conventional CML does not optimize the coefficients of the linear combiner and may perform poorly in practical applications. The situation gets worse when the signal is contaminated by noise, which may be non-Gaussian or a combination of several noise types [14]. As a result, high-order cumulant methods have been introduced to improve performance in non-Gaussian environments ([15]–[21] and references therein). Recently, one semiblind identification method for the quadratic nonlinear model was presented in [22]. This method uses third-order cumulants to cleanse the noisy measurement data, since third-order cumulants are identically zero for Gaussian disturbances. But in many practical applications, the nonlinear model structure is unknown. Thus the blind identification approach cannot be applied here effectively.

It is the purpose of this paper to develop a new spatial–temporal signal prediction method that works directly from the measurements when the model structure is unknown. By using an optimal data fusion technique, we first combine the measured signals at different spatial domains and then pass the fused signal to a nonlinear predictor based on a support vector machine (SVM). The proposed approach is somewhat like the CML scheme, but an optimal combination based on the fourth-order statistic is employed here. As a result, we prove that the proposed fusion method has an improved performance even in a non-Gaussian environment. Moreover, the proposed spatial–temporal predictor is also shown to have smaller errors than the conventional temporal predictor. To demonstrate the effectiveness of the proposed method, we apply it to model autoregressive (AR) signals and radar sea-scattered signals. Computed results show that the proposed method indeed gives more accurate predictions than the conventional approaches.

This paper is organized as follows. In Section II, we formulate the problem of reconstructed spatial–temporal modeling. In Section III, we propose a spatial–temporal prediction method based on optimal fusion and the SVM technique. In Section IV, the theoretical performance of the proposed method is analyzed. In Section V, we apply the proposed method to AR signals and real-life sea-scattered signals. Concluding remarks are given in Section VI.

1045-9227/$20.00 © 2006 IEEE

II. SPATIAL–TEMPORAL MODELLING

Consider a spatial–temporal signal process whose state is a one-dimensional variable defined at each spatial location and time, where the spatial location is in general a multidimensional vector. For example, when the location is one-dimensional, the entire state of the spatial–temporal signal process can be represented by a matrix whose entries are the state values of the spatial–temporal system at each time and spatial site.

The observed spatial–temporal signal is the true signal corrupted by measurement noise. Our objective is to find a spatial–temporal predictor, based on the observed signal, whose predicted values solve an optimization problem defined through the expectation of the prediction error with respect to the true spatial–temporal signal at the corresponding time and site.

Assume that the predictor is a smooth function of a reconstructed time-delay vector of the observed signal. Under this model, the prediction error separates into a term due to the measurement noise and a remaining term, called the approximation error. Since the signal and the predictor are usually independent of the measurement noise, when the measurement noise is zero mean the optimization problem can be expressed in terms of these two contributions.

It can be seen that the error in reconstructing a spatial–temporal predictor mainly comes from the measurement noise and the approximation error. To enhance the accuracy of a spatial–temporal predictor, one has to reduce both.

Motivated by the above observation, we propose a novel method for constructing an optimal spatial–temporal predictor: fuse the spatial–temporal signals by combining the observed signals at adjacent spatial domains based on an optimal fusion, and then pass the fused signal at different locations to a nonlinear SVM predictor.
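The displayed equations of this section do not survive in this text, so the following is only a sketch of one consistent way to write the setup described above. Every symbol here ($x$ for the true signal, $y$ for the observation, $n$ for the measurement noise, $F$ for the predictor, $d$ for the length of the time-delay vector) is an assumed notation, and the additive-noise and squared-error forms are assumptions as well rather than the authors' stated equations.

```latex
\begin{align}
  y(s,t) &= x(s,t) + n(s,t)
    && \text{(observation, additive noise assumed)} \\
  \mathbf{y}(s,t) &= \bigl[\, y(s,t),\; y(s,t-1),\; \dots,\; y(s,t-d+1) \,\bigr]^{T}
    && \text{(time-delay vector)} \\
  \hat{x}(s,t+1) &= F\bigl(\mathbf{y}(s,t)\bigr)
    && \text{(smooth predictor)} \\
  \min_{F}\; & E\Bigl[\bigl(x(s,t+1) - \hat{x}(s,t+1)\bigr)^{2}\Bigr]
    && \text{(squared-error criterion assumed)}
\end{align}
```

Under these assumptions the two error sources named in the text are visible directly: $n$ enters through $\mathbf{y}$, and whatever $F$ cannot capture even with noise-free inputs is the approximation error.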

III. SPATIAL–TEMPORAL PREDICTION-BASED OPTIMAL FUSION

In this section, we introduce an optimal fusion technique and a support vector machine, and then propose an optimal fusion-based spatial–temporal prediction method.

A. Optimal Fusion Based on Fourth-Order Statistics

Data fusion is the process of integrating complementary information from multisensor data by minimizing the uncertainty of the fused information [23]. To reduce the uncertainty of observed signals, we propose an optimal fusion method to combine the experimental data collected at different spatial regions.

Consider a group of sites adjacent to a given site in the spatial domain, together with the corresponding observed signals at those adjacent sites. We now combine these observed signals based on optimal fusion. There are several optimal data fusion techniques available in the literature. Their optimality is mostly based on second-order statistics, such as minimum variance (MV) and linearly constrained least squares (LCLS) [24]–[26]. However, these optimality conditions require the Gaussian assumption. In many real-life applications, including the radar clutter suppression problem to be considered here, the received signals are usually non-Gaussian. To deal with non-Gaussian signals, we propose an optimal fusion approach based on fourth-order statistics.

The proposed approach employs the standard cumulant as the objective function for optimality, subject to an unbiasedness constraint. More precisely, we find a set of optimal weight coefficients, collected in a weighting vector, by solving the constrained optimization problem (1). The optimal fusion is thus the solution of (1), and the optimal fused data are the corresponding weighted combination of the observed signals.

The fourth-order cumulant method has been used in blind deconvolution and in signal and image processing ([15]–[17] and references therein). Similar to the third-order cumulant method [18]–[21], the fourth-order cumulant term is introduced to improve the fusion performance when the received signals are non-Gaussian. This is because the cumulant term is capable of extracting more non-Gaussian information from the noisy input data. However, unlike the third-order cumulant term, the fourth-order cumulant term introduced here is closely related to the kurtosis. In fact, the optimization problem can be restated in terms of the kurtosis under a constraint of the same kind. Using this kurtosis property, we analyze the performance improvement in Section IV.
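To make the quantities in this subsection concrete, here is a minimal NumPy sketch of the sample fourth-order cumulant, the normalized kurtosis, and a linear fusion of neighboring observations. The function names are hypothetical, and treating the unbiasedness constraint as "weights sum to one" is an assumption for illustration, since the constraint in (1) is not reproduced in this text. Equal weights are used only as an example, not as the optimal solution.

```python
import numpy as np

def fourth_cumulant(z):
    """Sample fourth-order cumulant of a 1-D signal: E[zc^4] - 3 (E[zc^2])^2."""
    zc = np.asarray(z, dtype=float)
    zc = zc - zc.mean()
    return np.mean(zc**4) - 3.0 * np.mean(zc**2) ** 2

def kurtosis(z):
    """Normalized (excess) kurtosis: fourth cumulant over squared variance."""
    zc = np.asarray(z, dtype=float)
    zc = zc - zc.mean()
    return fourth_cumulant(zc) / np.mean(zc**2) ** 2

def fuse(Y, w):
    """Linear fusion of the rows of Y (one row per site) with weights w."""
    w = np.asarray(w, dtype=float)
    if not np.isclose(w.sum(), 1.0):
        # Assumed form of the unbiasedness constraint.
        raise ValueError("weights must sum to one")
    return w @ np.asarray(Y, dtype=float)

rng = np.random.default_rng(0)
signal = np.sin(0.1 * np.arange(5000))
noise = rng.uniform(-1.0, 1.0, size=(5, 5000))  # non-Gaussian measurement noise
Y = signal + noise                              # five neighboring observations
fused = fuse(Y, np.full(5, 0.2))                # equal weights, for illustration

print("kurtosis of one noise channel:", kurtosis(noise[0]))
print("fourth cumulant of fused error:", fourth_cumulant(fused - signal))
```

A Gaussian signal has zero fourth-order cumulant, so a nonzero value is a direct indicator of the non-Gaussian content that second-order criteria ignore.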

B. Support Vector Machine for Prediction

The SVM, motivated by statistical learning theory, has recently received considerable attention in the field of machine learning [27]–[31]. Its foundation was developed by Vapnik [27]. The main feature of SVMs is that they use structural risk minimization rather than empirical risk minimization. Moreover, training an SVM is equivalent to solving a linearly constrained convex quadratic programming problem when the employed kernel functions are positive semidefinite. Since MLP and RBF neural networks minimize a nonlinear error function that may be nonconvex, the local-minimum problem in their training can be avoided by using the SVM approach.

Consider the problem of approximating a set of input–output data with a regression function of the form (2), in which feature functions, which map the input space into a high-dimensional (possibly infinite-dimensional) space, are combined through a weight vector and a bias term.

SVM learning finds this unknown function by minimizing the functional (3), which combines a loss function, indicating how the difference between the function output and the desired output should be penalized, with a scaling constant for the tradeoff between minimizing the training errors and the model complexity. An SVM for chaotic time series prediction was first proposed by using an ε-insensitive loss function [29], [30], in which ε is a scaling constant. For a simple implementation, we here use the Huber loss function (4).
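As a concrete reference for the loss just mentioned, the following is a minimal sketch of the Huber loss in its common textbook form: quadratic for small residuals and linear for large ones. The threshold name `eps` and the exact scaling are assumptions; the paper's own expression (4) is not reproduced in this text and may be scaled differently.

```python
import numpy as np

def huber_loss(residual, eps=1.0):
    """Textbook Huber loss: 0.5*r^2 if |r| <= eps, else eps*|r| - 0.5*eps^2."""
    r = np.abs(np.asarray(residual, dtype=float))
    return np.where(r <= eps, 0.5 * r**2, eps * r - 0.5 * eps**2)

# Small residuals are penalized quadratically, large ones only linearly,
# which limits the influence of outliers on the fitted predictor.
print(huber_loss([-2.0, 0.0, 0.5, 1.0]))
```

The two branches agree at `|r| == eps`, so the loss is continuous and convex, which is what keeps the resulting training problem a convex program.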

It was shown in [32] and [36] that the weight vector can be represented in terms of the feature functions evaluated at the training inputs, and that the function minimizing this functional has the form (5), whose coefficients are the optimal solutions of a linearly constrained quadratic optimization problem. According to Mercer's theorem, one can choose a kernel function equal to the inner product of the feature functions. The coefficients then become the optimal solutions of the quadratic optimization problem (6), stated in terms of the kernel function.

Using the complementary property of the optimal coefficients, each pair of coefficients can be replaced by a single variable, and the problem can be rewritten in the linearly constrained form (7). Therefore, training an SVM with the Huber loss function is equivalent to solving the quadratic programming problem (8). Since (8) is a convex quadratic programming problem, the local-minimum problem does not occur.
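The convexity claim rests on the kernel being positive semidefinite. As an illustration only, the sketch below builds a Gaussian RBF Gram matrix and checks that property numerically through its eigenvalues. The function names and the `gamma` width parameter are hypothetical and not taken from the paper.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) for rows x_i of X."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off

def is_positive_semidefinite(K, tol=1e-8):
    """True if the symmetric part of K has no eigenvalue below -tol."""
    eigvals = np.linalg.eigvalsh((K + K.T) / 2.0)
    return bool(eigvals.min() >= -tol)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))   # 30 hypothetical time-delay vectors of length 3
K = rbf_gram(X)
print("RBF Gram matrix is PSD:", is_positive_semidefinite(K))
```

With a positive semidefinite Gram matrix, the quadratic term of the dual objective is convex, so any solution a QP solver returns is a global minimizer.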

C. Spatial–Temporal Prediction

Combining optimal fusion with SVM, we propose the following spatial–temporal prediction algorithm.

0) For each location in the spatial domain, take a group of adjacent sites.

i) Find an optimal fusion solution by solving the constrained minimization problem (8).

ii) Use the optimal weights to determine the fused signal at the site.

iii) Embed the optimal fused data to form the reconstructed time-delay vectors given by (9), using an embedding dimension [10].

iv) Use the desired prediction values and the reconstructed time-delay vectors as the data set to train the SVM.

v) Find the optimal solution of (8), determine the support vectors, and compute the bias given by (10), which involves the number of support vectors and the index set SV of the support vectors, corresponding to the nonzero solutions of (8).

vi) The spatial–temporal predictor is then given by the resulting kernel expansion over the support vectors plus the bias.

Fig. 1. Spatial–temporal prediction: (a) two-dimensional and (b) one-dimensional site.

Fig. 1 illustrates the proposed method for both one- and two-dimensional sites with five neighboring cells. It shows how the information collected at the current time at a desired site and its neighborhood is used to predict the behavior at the next time step at that particular site.
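The embedding step of the algorithm above can be sketched as follows. This is a hypothetical helper, not the authors' code: it turns a fused series into time-delay vectors paired with the next sample as the training target, and the name `delay_embed` and the ordering of each vector (most recent sample first) are assumptions.

```python
import numpy as np

def delay_embed(z, m):
    """Build training pairs from a 1-D series z with embedding dimension m.

    Returns (X, target) where each row of X is [z[t], z[t-1], ..., z[t-m+1]]
    and the matching target entry is z[t+1].
    """
    z = np.asarray(z, dtype=float)
    n = len(z) - m                      # number of (vector, next value) pairs
    if n <= 0:
        raise ValueError("series too short for embedding dimension m")
    X = np.stack([z[i:i + m][::-1] for i in range(n)])
    target = z[m:]
    return X, target

# Usage on a toy fused series: three samples per vector, one-step-ahead target.
fused = np.arange(6.0)
X, target = delay_embed(fused, m=3)
print(X)
print(target)
```

The rows of `X` and the entries of `target` are what a kernel regressor would be trained on in step iv; the same helper applied to a single unfused range cell gives the purely temporal baseline.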

The proposed spatial–temporal prediction method has several advantages over the conventional techniques. First, SVM usually performs better than other neural network predictors [6]–[8], [12], [14], since SVM learning does not have the local-minimum problem provided that the chosen kernel function is positive semidefinite. Second, instead of using a simple combination of spatial data [11], [12], the proposed spatial–temporal predictor uses an optimal combination to minimize the uncertainty. Note that the proposed method is actually a generalization of CML with optimally chosen coupling parameters. Third, most works on spatial–temporal prediction in the literature are based on computer simulation only [4], [9], [11]–[14]. In this paper, the proposed spatial–temporal prediction method is proved theoretically to improve the prediction quality through the use of the optimal fusion. Finally, unlike the algorithm in [22], the proposed method only requires training for one predictor and is thus suitable for real-time signal processing applications.

Remark: The bias term can be obtained by the least squares SVM technique [36].

IV. PERFORMANCE ANALYSIS

In this section, we first prove that the proposed optimal fusion technique can improve the quality of the input data without the need for the standard Gaussian assumption. We then show that the proposed spatial–temporal predictor can increase the quality of the SVM output.

For convenience, we assume that are zero mean andmutually uncorrelated.

Lemma 1: Assume that are adjacent observed signals of . Then

(11)

where , which is a zero-mean process, is dependent of and .

Proof: The proof is given in the Appendix.

With Lemma 1, we establish the two main results of this paper.

Theorem 1: The optimal fused data are unbiased.

Proof: The proof is given in the Appendix.

Theorem 2: Assume that have the same second-order and fourth-order statistics. Then

where and are the optimal fused data.

Proof: The proof is given in the Appendix.

Based on the results obtained above, we now show that the proposed spatial–temporal predictor has a smaller error than a conventional temporal predictor. To show this, we use an approximate analysis for simplicity. According to the discussion in the previous section, the proposed spatial–temporal predictor can be expressed as

while the SVM temporal predictor is given by

where is the chosen kernel function and and are time-delayed vectors based on the fused data and measurement data, respectively. Let

and

where is measurement noise and is the error between and . Then their time-delayed vectors are represented as

and

According to Theorem 2, and thus for , where and are the elements of the time-delayed vectors and , respectively. When measurement noise is small

and thus

It follows that

and

Therefore

and

Let and . The above equations become

We now show that . Since is independent of , by Theorem 2, we have

and

where is the element of and is defined in the proof of Theorem 2. From , it follows that . Therefore, we conclude that the proposed spatial–temporal predictor has a better performance than the conventional temporal predictor in terms of prediction errors.
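The effect that the analysis formalizes can be illustrated numerically. The sketch below is a simulation under simplifying assumptions, not the paper's proof: the same true signal is observed at every neighboring site, the noises are independent and identically distributed (Laplacian here, to keep them non-Gaussian), and equal weights stand in for the optimal weights. All names and values are hypothetical.

```python
import numpy as np

def compare_mse(n_sites=5, n_samples=20000, scale=0.5, seed=2):
    """Return (mse_single, mse_fused) for one noisy channel vs. the fused signal."""
    rng = np.random.default_rng(seed)
    x = np.sin(0.05 * np.arange(n_samples))            # common true signal
    noise = rng.laplace(scale=scale, size=(n_sites, n_samples))
    y = x + noise                                       # one row per site
    w = np.full(n_sites, 1.0 / n_sites)                 # weights sum to one
    fused = w @ y
    mse_single = float(np.mean((y[0] - x) ** 2))
    mse_fused = float(np.mean((fused - x) ** 2))
    return mse_single, mse_fused

mse_single, mse_fused = compare_mse()
print("single-site MSE:", mse_single)
print("fused MSE:      ", mse_fused)
```

With uncorrelated, equal-variance noises, averaging over the sites divides the noise variance by the number of sites, so the inputs handed to the predictor are cleaner than any single measurement channel.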

V. APPLICATION

In this section, we apply the proposed spatial–temporal prediction method to model AR signals and radar sea-scattered signals.

A. Application to Modeling of Linear Noisy AR Signals

Consider the noisy AR signal system

(12)

where is the unknown AR parameter vector; is the regression vector of a noise-free AR process of order ; is the unknown noise error; and is the unknown measurement white noise. Let the noisy signal vector be denoted by , and let be the noise vector. The AR signal model can be rewritten as

(13)

where . Assume that we are given from the noisy system (13). By solving (8) we obtain the following SVM predictor:

where is given by (11), is the number of support vectors, and is the support vector. Using the proposed optimal fusion method, we get a set of fused data . The proposed predictor is then given by

Example 1: Consider the noisy AR model example given in [33], where the AR model parameter vector is

But here is a white Gaussian noise with unit variance and is a chaotic process generated by the logistic map

where . Obviously, defined in (13) is non-Gaussian noise. We generate 250 points, which are divided into a training set of 100 points and a testing set of 150 points. We set , , , , and take the polynomial function as the kernel of SVM. Note that the AR model has no constant term. We thus set threshold . After solving (8), we have the parameter estimation

where is the optimal solution of (8). The SVM predictor is thus given by

while our SVM-fusion predictor for the noisy AR model is then given by

Fig. 2 shows the variation of the predicted signal from 100 to 250 using the SVM predictor, and Fig. 3 shows the variation of the predicted signal from 100 to 250 using the proposed SVM-fusion predictor. Fig. 4 displays the absolute error between the original signal and the predicted signal given by the SVM predictor and the proposed SVM-fusion predictor, respectively. Compared to the SVM predictor, the present SVM-fusion predictor has much smaller errors.
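For readers who want to reproduce the flavor of this experiment, the sketch below generates a chaotic sequence with the logistic map and splits 250 points into the 100/150 training and testing sets described above. The map parameter `r = 4.0` and the initial value `x0 = 0.3` are assumptions chosen for illustration, since the values used in the example are not reproduced in this text.

```python
import numpy as np

def logistic_map(n, r=4.0, x0=0.3):
    """Iterate x[k] = r * x[k-1] * (1 - x[k-1]) and return the first n values."""
    x = np.empty(n)
    x[0] = x0
    for k in range(1, n):
        x[k] = r * x[k - 1] * (1.0 - x[k - 1])
    return x

chaotic = logistic_map(250)                    # non-Gaussian disturbance sequence
train, test = chaotic[:100], chaotic[100:]     # 100 training, 150 testing points
print(len(train), len(test))
```

A sequence like this is bounded and far from Gaussian, which is the property the example relies on to exercise the fourth-order fusion criterion.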

B. Application to Modeling of Sea-Scattering Signals

Spatial–temporal signal modeling arises in many areas such as video processing, sensor surveillance, and wireless communications [37]–[40]. In this section, we consider sea-scattering signal modeling.

Based on the recent discovery that sea clutter can be modeled using chaos [4], [34], we describe the sea clutter dynamics using the following equations:

where is a smooth unknown nonlinear mapping, is the sea-scattered data received by a radar, and is a zero-mean measurement noise process. When a radar is in its staring mode for remote sensing, sea clutter is basically a spatial–temporal process with a one-dimensional spatial coordinate [12]. We apply the proposed method to real-life sea clutter data to evaluate the effectiveness of the spatial–temporal modeling approach for sea clutter.

Fig. 2. Variation of the signal by the SVM predictor versus the number of measurements in Example 1.

Fig. 3. Variation of the signal by the SVM-fusion predictor versus the number of measurements in Example 1.

Example 2: Consider real-life sea clutter data. The radar used for data collection is a coherent dual-polarized X-band radar. The radar site was located on a cliff at Osborne Head, Nova Scotia, Canada, facing the Atlantic Ocean at a height of about 30 m above mean sea level and with an open ocean view of about 130°. The database contains a variety of targets, including small boats and beach ball targets [35]. Three data sets are used in this study to investigate the spatial–temporal dynamics of sea clutter. They are all staring data (ST1, ST2, and ST3), where the antenna is aimed at a single azimuth at all times. Applications of the staring mode include remote sensing and target identification. The properties of these data sets are listed in Table I. Their original radar images are plotted in Figs. 5(a), 6(a), and 7(a), respectively. Here, one axis denotes the range and the other represents the time. When visible, the targets appear in the staring images as a dark horizontal line. The target was not anchored and so fluctuates with the waves, making it easier to detect at some times than at others. Apparently, the clutter condition in ST3 is less severe than that in ST1. The target, which is a boat, appears as a solid dark line in the image without any breaks.

TABLE I. CHARACTERISTICS OF THE REAL RADAR DATA USED IN THIS PAPER

Fig. 4. Error variation of the two signal predictions in Example 1: the solid line corresponds to the SVM-fusion predictor and the dotted line to the SVM predictor.

Fig. 5. (a) Radar image of the staring data set 1 (ST1). (b) Prediction radar image of ST1 with K = 4.

Fig. 6. (a) Radar image of the staring data set 2 (ST2). (b) Prediction radar image of ST2 with K = 4.

Fig. 7. (a) Radar image of the staring data set 3 (ST3). (b) Prediction radar image of ST3 with K = 4.

Fig. 8. Prediction error of ST1 generalized over range. (a) The solid line corresponds to the SVM and the dashed line to the proposed method with K = 2. (b) The solid line corresponds to the SVM and the dashed line to the proposed method with K = 4.

We fix the prediction parameters and choose the Gaussian RBF function as the kernel of the SVM. For the measurement fusion, we fuse neighboring cells in the spatial domain. Here, we use K = 4, and the resulting images are plotted in Figs. 5(b), 6(b), and 7(b), respectively. We also compare the proposed spatial–temporal method with the SVM. Fig. 8(a) and (b) displays the mean square prediction errors (MSE) of ST1, where the solid line corresponds to the conventional SVM predictor and the dashed line to the proposed fusion predictor with K = 2 and K = 4, respectively. Fig. 9(a) and (b) displays the MSE of ST2 generalized over range, with data from one range cell used for training. The solid line corresponds to the conventional SVM predictor and the dashed lines represent the proposed fusion predictor with K = 2 and K = 4, respectively. Fig. 10(a) and (b) shows the same MSE analysis for ST3. Based on the results obtained, we have the following observations.

Fig. 9. Prediction error of ST2 generalized over range. (a) The solid line corresponds to the SVM and the dashed line to the proposed method with K = 2. (b) The solid line corresponds to the SVM and the dashed line to the proposed method with K = 4.

• All three predicted radar images look very close to their originals, which means that the proposed spatial–temporal predictor provides a good model for these sea clutter data. In addition, the predicted image for ST3 shows that the proposed method not only models the background sea clutter accurately but also highlights the target signal, which is highly desirable for target detection.

• The use of spatial–temporal prediction produces a better model than a purely temporal SVM predictor. This is because the prediction fusion approach can successfully reduce the MSE by incorporating the extra spatial information in the model. This observation matches our theoretical analysis.

• The prediction performance with is consistentlybetter than that with .

• The improvement due to the spatial–temporal processing over the temporal method varies for different data sets depending on the sea conditions.

Fig. 10. Prediction error of ST3 generalized over range. (a) The solid line corresponds to the SVM and the dashed line to the proposed method with K = 2. (b) The solid line corresponds to the SVM and the dashed line to the proposed method with K = 4.

One main application of predictive clutter modeling is clutter suppression to enhance target detection. Radar detection is usually formulated as a binary detection problem given by

$$H_0:\; x(n) = c(n), \qquad H_1:\; x(n) = s(n) + c(n)$$

where $x(n)$ is the received signal, $s(n)$ is the target signal, and $c(n)$ is the clutter (or clutter plus thermal noise) process. A prediction-based detector using a radial basis function (RBF) neural network has been proposed in [31] and [32] for radar detection in sea clutter. This predictive detection method considers a target as an abrupt change in the clutter image. When a received signal containing a target is fed into a predictor that is trained using the clutter data, the prediction error of the received signal is expected to be relatively large since target and clutter dynamics are usually quite different. In other words, the above binary decision problem can be decided by the magnitude of the error of the clutter-trained predictor on the incoming signal $x(n)$. It has been shown in the previous section that the spatial–temporal SVM method can accurately approximate the sea clutter dynamics. This ability is essential for the predictive detection approach. As the approximation is close to the real sea clutter data, a target embedded in the clutter will result in a significant deviation in the prediction error.

Fig. 11. Receiver operating characteristics of the predictive detector for ST1. The dashed line corresponds to the SVM and the solid line to the proposed method with K = 4.
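The predictive detection idea can be sketched as follows: a predictor is fitted to clutter-only data, and the magnitude of its one-step prediction error on the incoming signal serves as the detection statistic. This is an illustration under stated assumptions, not the detector used in the paper: a least-squares AR(1) predictor replaces the spatial–temporal SVM predictor, the clutter is a synthetic AR(1) process, the target is modeled as a single additive jump, and the names `fit_ar1` and `prediction_errors` are hypothetical.

```python
import random

def fit_ar1(series):
    """Least-squares AR(1) coefficient (assumed stand-in for the SVM predictor)."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def prediction_errors(series, coeff):
    """Magnitude of the one-step prediction error at each time index t >= 1."""
    return [abs(series[t] - coeff * series[t - 1])
            for t in range(1, len(series))]

# Placeholder clutter: a synthetic AR(1) process, not real sea clutter.
rng = random.Random(1)
clutter = [0.0]
for _ in range(999):
    clutter.append(0.9 * clutter[-1] + rng.gauss(0.0, 0.1))

coeff = fit_ar1(clutter[:500])     # train on clutter-only data

received = clutter[500:]           # incoming signal (a copy of the remaining samples)
received[200] += 10.0              # hypothetical target: an abrupt additive change
errors = prediction_errors(received, coeff)

# errors[k] belongs to time index k + 1, so the target appears at errors[199].
peak = max(range(len(errors)), key=errors.__getitem__)
print("largest prediction error at time index", peak + 1)
```

Because the predictor has only seen clutter, the sample carrying the target is poorly predicted and its error stands well above the clutter-only errors, which is what a threshold test on the error magnitude exploits.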

To determine a suitable threshold on the prediction error for optimal detection performance, a standard approach in radar is constant-false-alarm-rate (CFAR) processing. CFAR detection involves determining a threshold for a given probability of false alarm. The threshold can be computed analytically if the noise characteristics are known; otherwise, the threshold must be computed numerically. For the predictive detection method considered here, since clutter data are used to train the predictor, the probability distribution under the clutter-only hypothesis can be estimated from a histogram of the clutter prediction error. In this paper, we set the constant false alarm rate at 0.001. To evaluate the detection performance, we use the receiver operating characteristic (ROC) curve, which plots the probability of detection versus the probability of false alarm. As many target points are required to develop a histogram of target deflection to estimate the probability of detection, ST1, ST2, and ST3 are used here in the ROC evaluation. Here we compare the proposed spatial–temporal approach with the SVM method, since the RBF method may be viewed as a special case of the SVM method from the viewpoint of the predictive model, where the kernel function is taken as the RBF. Figs. 11–13 plot the ROC for ST1, ST2, and ST3, respectively. For all three data sets, the proposed spatial–temporal approach outperforms the SVM method.
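The numerical threshold selection and the ROC evaluation described above can be sketched in a few lines. The sketch assumes that prediction-error magnitudes are already available for clutter-only samples and for samples containing a target; the toy values, the empirical-quantile rule used for the threshold, and the function names `cfar_threshold` and `roc_points` are illustrative assumptions rather than the procedure used in the paper.

```python
import math

def cfar_threshold(clutter_errors, pfa):
    """Threshold such that at most a fraction `pfa` of clutter errors exceed it."""
    ordered = sorted(clutter_errors)
    n = len(ordered)
    # Number of clutter samples allowed above the threshold
    # (the small offset guards against floating-point rounding).
    allowed = min(math.floor(pfa * n + 1e-9), n - 1)
    return ordered[n - allowed - 1]

def roc_points(clutter_errors, target_errors):
    """Empirical (P_fa, P_d) pairs obtained by sweeping the threshold downward."""
    thresholds = sorted(set(clutter_errors) | set(target_errors), reverse=True)
    thresholds.append(float("-inf"))   # last point: everything is declared a target
    points = []
    for thr in thresholds:
        pfa = sum(e > thr for e in clutter_errors) / len(clutter_errors)
        pd = sum(e > thr for e in target_errors) / len(target_errors)
        points.append((pfa, pd))
    return points

# Toy prediction-error magnitudes (placeholders, not measured data).
clutter_errors = [0.1, 0.2, 0.3, 0.4]
target_errors = [0.35, 0.5, 0.6, 0.7]

thr = cfar_threshold(clutter_errors, 0.25)
pd_at_thr = sum(e > thr for e in target_errors) / len(target_errors)
roc = roc_points(clutter_errors, target_errors)
print("threshold:", thr, "probability of detection:", pd_at_thr)
```

With a false alarm rate as low as 0.001, the empirical threshold is only meaningful when several thousand clutter-only error samples are available; with fewer samples the rule above simply returns the largest observed clutter error.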


Fig. 12. Receiver operating characteristics of the predictive detector for ST2. The dashed line corresponds to the SVM and the solid line to the proposed method with K = 4.

Fig. 13. Receiver operating characteristics of the predictive detector for ST3. The dashed line corresponds to the SVM and the solid line to the proposed method with K = 4.

VI. CONCLUSION

In this paper, we proposed a novel spatial–temporal prediction method based on optimal fusion and SVM. The proposed method can obtain an optimal solution and guarantees that the fused information has an improved performance even in a non-Gaussian environment. Theoretical analysis also shows that the proposed method can indeed improve the accuracy of signal prediction provided that the signal comes from a spatial–temporal system. Using real-life oceanic radar signals, we demonstrated that the proposed method is effective in predicting spatial–temporal signals. We also conclude that the spatial–temporal clutter model is more effective for enhancing target detection.

APPENDIX

Proof of Lemma 1

Without loss of generality, we consider only the case of in the two-dimensional space. That is, we try to prove the following:

(14)

Since

where for is zero mean and mutually uncorrelated. Using the time-delay vector coordinates, we have the expression in the vector form

Subtracting these equations, we have

Assume that there exists a nonlinear smooth map satisfying

where is a zero-mean white noise. From the mean value theorem, it follows that

and


where is the gradient of . Thus

where

and

Since is independent of the signal process and are adjacent observed signals

Finally, note that

Then

and hence

It can be verified that

Thus (15) holds, and this completes the proof.

Proof of Theorem 1

Since

applying the expectation operator, we have

According to Lemma 1, we have

where is a zero-mean process. Then

It follows that

This completes the proof.

Proof of Theorem 2

First, let

Then

(15)

Let

Then

where is the kurtosis. Let

By the property of the kurtosis, we can get the following relationship:


In fact, it is enough to show the following equalities:

(16)

Let . Since are mutually uncorrelated, are mutually uncorrelated. We have

and

Thus

Using mathematical induction we can show that the above two equalities for , (17), hold. Now, note that

. Then

Thus

By (17), we have

It can be seen that

and the equality holds if and only if there is only one nonzero component. Since , the above inequality is strict. It follows that

Because

That is

This completes the proof.

ACKNOWLEDGMENT

The authors would like to thank the Associate Editor and reviewers for their valued comments, which helped in improving the quality of this paper.

REFERENCES

[1] A. M. Tekalp, Digital Video Processing. Englewood Cliffs, NJ: Prentice-Hall, 1995.

[2] P. V. Rooyen, M. Lotter, and D. V. Wyk, Space-Time Processing for CDMA Mobile Communications. Boston, MA: Kluwer Academic, 2000.

[3] N. Cressie and J. J. Majure, “Spatial-temporal statistical modeling of livestock waste in streams,” J. Agricult., Biol., Environ. Statist., vol. 2, pp. 24–47, 1997.

[4] H. Leung and T. Lo, “A spatial temporal dynamical model for multipath scattering from the sea,” IEEE Trans. Geosci. Remote Sens., vol. 33, no. 2, pp. 441–448, Feb. 1995.

[5] J. Lee and V. J. Mathews, “A fast recursive least squares adaptive second-order Volterra filter and its performance analysis,” IEEE Trans. Signal Process., vol. 41, no. 3, pp. 1087–1097, Mar. 1993.

[6] J. T. Lo, “Synthetic approach to optimal filtering,” IEEE Trans. Neural Netw., vol. 5, no. 5, pp. 803–811, Sep. 1994.

[7] H. Leung and S. Haykin, “Detection and estimation using an adaptive rational function filter,” IEEE Trans. Signal Process., vol. 42, no. 11, pp. 3366–3376, Nov. 1994.

[8] H. Leung, “Nonlinear clutter cancellation and detection using a memory-based predictor,” IEEE Trans. Aerosp. Electron. Syst., vol. 32, no. 4, pp. 1249–1256, Oct. 1996.

[9] U. Parlitz and C. Merkwirth, “Predicting low-dimensional spatiotemporal dynamics using discrete wavelet transforms,” Phys. Rev. E, vol. 51, no. 4, p. R2710, 1995.

[10] K. Kaneko, Theory and Application of Coupled Map Lattices. New York: Wiley, 1993.

[11] U. Parlitz and C. Merkwirth, “Prediction of spatial-temporal time series based on reconstructed local states,” Phys. Rev. Lett., vol. 84, pp. 1890–1893, 2000.

[12] H. Leung, G. Hennessey, and A. Drosopoulos, “Signal detection using the radial basis function coupled map lattice,” IEEE Trans. Neural Netw., vol. 11, no. 5, pp. 1133–1151, Sep. 2000.

[13] A. Kumar and S. K. Mullick, “Nonlinear dynamical analysis of speech,” J. Acoust. Soc. Amer., vol. 100, pp. 615–629, Jul. 1996.


[14] E. Jakeman and R. J. A. Tough, “Non-Gaussian models for the statistics of scattered waves,” Adv. Phys., vol. 37, no. 5, pp. 471–529, 1988.

[15] D. Donoho, “On minimum entropy deconvolution,” in Applied Time Series Analysis II. New York: Academic, 1981, pp. 565–608.

[16] E. Walach and B. Widrow, “The least mean fourth (LMF) adaptive algorithm and its family,” IEEE Trans. Inf. Theory, vol. IT-30, no. 2, pp. 275–283, Mar. 1984.

[17] M. C. Hong, T. Stathaki, and A. K. Katsaggelos, “Iterative regularized mixed norm multichannel image restoration,” J. Electron. Imag., vol. 14, no. 1, Mar. 2005, Art. 013004.

[18] T. W. S. Chow and H. Z. Tan, “Third-order cumulant RLS algorithm for nonminimum ARMA system identification,” Signal Process., vol. 61, pp. 23–38, 1997.

[19] C. T. Leung and T. W. S. Chow, “Adaptive regularization parameter selection method for enhancing generalization capability of neural networks,” Artif. Intell., vol. 107, pp. 347–356, 1999.

[20] ——, “Least third order cumulant method with adaptive parameter regularization parameter selection method for neural networks,” Artif. Intell., vol. 127, pp. 169–197, 2001.

[21] A. J. Bell and T. J. Sejnowski, “An information-maximisation approach to blind separation and blind deconvolution,” Neural Comput., vol. 7, no. 6, pp. 1004–1034, 1995.

[22] Y. S. Xia, H. Leung, and H. Chan, “A prediction fusion method for reconstructing spatial temporal dynamics,” IEEE Trans. Circuits Syst. II, Express Briefs, vol. 53, no. 1, pp. 62–66, Jan. 2006.

[23] P. K. Varshney, “Scanning the issue,” Proc. IEEE, vol. 85, no. 1, pp. 3–5, Jan. 1997.

[24] H. F. Durrant-Whyte, Integration, Coordination and Control of Multi-Sensor Robot Systems. Norwell, MA: Kluwer Academic, 1988.

[25] D. Avitzour, “A maximum likelihood approach to data association,” IEEE Trans. Aerosp. Electron. Syst., vol. 28, no. 2, pp. 560–566, Apr. 1992.

[26] Y. Xia, H. Leung, and E. Bossé, “Neural data fusion algorithms based on a linearly constrained least square method,” IEEE Trans. Signal Process., vol. 53, no. 2, pt. I, pp. 672–685, Feb. 2005.

[27] V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.

[28] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learn., vol. 20, pp. 273–297, 1995.

[29] V. Vapnik, S. Golowich, and A. J. Smola, “Support vector method for function approximation, regression estimation and signal processing,” in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 1996, vol. 9, pp. 281–287.

[30] S. Mukherjee, E. Osuna, and F. Girosi, “Nonlinear prediction of chaotic time series using support vector machines,” in Proc. IEEE NNSP’97, Amelia Island, FL, 1997, pp. 24–26.

[31] Z. Chen and S. Haykin, “On different facets of regularization theory,” Neural Comput., vol. 14, pp. 2791–2846, 2002.

[32] A. J. Smola and B. Schölkopf, “On a kernel-based method for pattern recognition, regression, approximation, and operator inversion,” Algorithmica, vol. 22, pp. 211–231, 1998.

[33] W. X. Zheng, “Autoregressive parameter estimation from noise data,” IEEE Trans. Circuits Syst., vol. 47, no. 1, pt. II, pp. 71–75, Jan. 2000.

[34] S. Haykin and S. Puthusserypady, “Chaotic dynamics of sea clutter,” Chaos, vol. 7, pp. 777–802, 1997.

[35] A. Drosopoulos, Description of the OHGR database, Defence Research Establishment, Ottawa, ON, Canada, Dec. 1994, Tech. Note 94-14.

[36] J. A. K. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural Process. Lett., vol. 9, pp. 293–300, 1999.

[37] H. Leung, T. Lo, and S. C. Wang, “Prediction of noisy chaotic time series using an optimal radial basis function neural network,” IEEE Trans. Neural Netw., vol. 12, no. 5, pp. 1163–1172, Sep. 2001.

[38] X. Tang, X. B. Gao, and J. Z. Liu, “A spatial-temporal approach for video caption detection and recognition,” IEEE Trans. Neural Netw., vol. 13, no. 4, pp. 961–971, Jul. 2002.

[39] J. Q. Wang, M. R. Azimi-Sadjadi, and D. Reinke, “A temporally adaptive classifier for multispectral imagery,” IEEE Trans. Neural Netw., vol. 15, no. 1, pp. 159–165, Jan. 2004.

[40] N. Xie and H. Leung, “Blind equalization using a predictive radial basis function neural network,” IEEE Trans. Neural Netw., vol. 16, no. 3, pp. 709–720, May 2005.

Youshen Xia (M’96–SM’01) received the B.S. and M.S. degrees in computational mathematics from Nanjing University, China, in 1982 and 1989, respectively, and the Ph.D. degree from the Department of Automation and Computer-Aided Engineering, The Chinese University of Hong Kong, in 2000.

His present research interests include applications of recurrent neural networks for constrained optimization in control and signal and image processing.

Henry Leung (M’90) received the Ph.D. degree in electrical and computer engineering from McMaster University, Hamilton, ON, Canada.

He is now a Professor in the Department of Electrical and Computer Engineering, University of Calgary, Calgary, AB, Canada. Before that, he was with the Defence Research Establishment, Ottawa, ON, Canada, where he was involved in the design of automated systems for air and maritime multisensor surveillance. His research interests include chaos, computational intelligence, data mining, nonlinear signal processing, multimedia, radar, sensor fusion, and wireless communications.