
Recurrent networks for compressive sampling

Chi-Sing Leung a,*, John Sum b, A.G. Constantinides c

a Department of Electronic Engineering, City University of Hong Kong, Kowloon Tong, KLN, Hong Kong
b Institute of Technology Management, National Chung Hsing University, Taiwan
c Imperial College, UK

Article info

Article history:
Received 14 April 2013
Received in revised form 31 July 2013
Accepted 19 September 2013
Communicated by D. Wang
Available online 1 November 2013

Keywords:
Neural circuit
Stability

Abstract

This paper develops two neural network models, based on Lagrange programming neural networks (LPNNs), for recovering sparse signals in compressive sampling. The first model is for the standard recovery of sparse signals. The second one is for the recovery of sparse signals from noisy observations. Their properties, including the optimality of the solutions and the convergence behavior of the networks, are analyzed. We show that for the first case, the network converges to the global minimum of the objective function. For the second case, the network is locally stable at the optimal solution.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Nonlinear constrained optimization problems have been studied over several decades [1–3]. Conventional ways of solving them are based on numerical methods [1,2]. As mentioned by many neural network pioneers [3–7], when real-time solutions are required, the neural circuit approach [3–5] is more effective. In the neural circuit approach, we do not solve the problem on a digital computer. Instead, we set up an associated neural circuit for the given constrained optimization problem. After the neural circuit settles down at one of the equilibrium points, the solution is obtained by measuring the neuron output voltages at this stable equilibrium point. So, one of the important issues is the stability of the equilibrium points.

In the past two decades, many results on neural circuits have been reported. For instance, Hopfield [4] investigated a neural circuit for solving quadratic optimization problems. In [5,8–10], a number of models were proposed to solve various nonlinear constrained optimization problems. The neural circuit approach is able to solve many engineering problems effectively. For example, it can be used for optimizing microcode [11]. Besides, it is able to search for the maximum of a set of numbers [12–14]. In [15], the Lagrange programming neural network (LPNN) model was proposed to solve general nonlinear constrained optimization problems. Over the years, many neural models for engineering problems have been proposed. However, little attention has been paid to analog neural circuits for compressive sampling.

In compressive sampling [16–18], a sparse signal is sampled (measured) by a few random-like basis functions. The task of compressive sampling is to recover the sparse signal from the measurements. Compressive sampling can also be applied to a non-sparse signal by first increasing the sparsity of the signal, for example with a transform coding technique. Conventional approaches for recovering sparse signals are based on Newton's method [19–21].

As the neural circuit approach is a good alternative for nonlinear optimization, it is interesting to investigate ways of applying it to compressive sampling. This paper proposes two analog neural network models, based on LPNNs, for compressive sampling. One is for the standard recovery; the other is for the recovery from noisy measurements. Since the l1-norm measure is not differentiable at the origin, it is difficult to construct a neural circuit directly. Hence this paper proposes an approximation for the objective function. With this approximation, the resulting dynamics involve the hyperbolic tangent function, a commonly used activation function in the neural network community. We use experiments to investigate how the hyperbolic tangent parameter affects the performance of the proposed models. Since the convergence and stability of neural circuits are important issues, this paper investigates the stability and optimality of our approaches. For the case of the standard recovery, we show that the proposed neural model converges to the global minimum of the objective function. For the case of the recovery from noisy measurements, we show that the neural model is locally stable.

This paper is organized as follows. In Section 2, the background of compressive sampling and LPNNs is reviewed. Section 3 formulates neural models for compressive sampling; the theoretical analysis of the proposed neural models is also presented there.


* Corresponding author. E-mail address: [email protected] (C.-S. Leung).


Section 4 presents our simulation results. Section 5 concludes the paper.

2. Background

2.1. Compressive sampling

In compressive sampling [16–18], we would like to find a sparse solution $x \in \mathbb{R}^n$ of an underdetermined system, given by
$$b = \Phi x, \qquad (1)$$
where $b \in \mathbb{R}^m$ is the observation vector, $\Phi \in \mathbb{R}^{m \times n}$ is the measurement matrix with a rank of $m$, $x \in \mathbb{R}^n$ is the unknown sparse vector to be recovered, and $m < n$. In a more precise way, the sparsest solution is defined as

$$\min \|x\|_0 \qquad (2a)$$
$$\text{subject to } b = \Phi x. \qquad (2b)$$
Unfortunately, problem (2) is NP-hard. Therefore, we usually replace the $\ell_0$-norm measure with the $\ell_1$-norm measure. The problem for recovering sparse signals becomes

$$\min \|x\|_1 \qquad (3a)$$
$$\text{subject to } b = \Phi x. \qquad (3b)$$
This problem is known as basis pursuit (BP) [17,19]. Let $\phi_j$ be the $j$-th column of $\Phi$. Define the mutual coherence [18] of $\Phi$ as
$$\mu(\Phi) = \max_{i \neq j} \frac{|\phi_i^T \phi_j|}{\|\phi_i\|_2 \|\phi_j\|_2}. \qquad (4)$$

If the cardinality of the true solution obeys $\|x\|_0 < \tfrac{1}{2}\bigl(1 + 1/\mu(\Phi)\bigr)$, then the solution of (3) is exactly the same as that of (2). When the measurement matrix has independent Gaussian entries and the number $m$ of measurements is greater than $2k\log(n/k)$, where $k$ denotes the number of non-zero entries, BP [22] can reconstruct the sparse signal with high probability.
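As a small illustration (not from the paper), the following sketch estimates the mutual coherence (4) of a random ±1 measurement matrix and evaluates the corresponding exact-recovery bound $\tfrac{1}{2}(1 + 1/\mu(\Phi))$; the matrix size is an arbitrary choice.

```python
# A small illustration (not from the paper): estimating the mutual coherence (4) of
# a random +/-1 measurement matrix and the resulting exact-recovery sparsity bound
# (1/2)(1 + 1/mu). The matrix size is an arbitrary choice.
import numpy as np

rng = np.random.default_rng(0)
m, n = 60, 256
Phi = rng.choice([-1.0, 1.0], size=(m, n))

cols = Phi / np.linalg.norm(Phi, axis=0)          # unit-norm columns
gram = np.abs(cols.T @ cols)
np.fill_diagonal(gram, 0.0)                       # ignore the diagonal
mu = gram.max()

print(f"mutual coherence mu(Phi) ~ {mu:.3f}")
print(f"guaranteed recovery when ||x||_0 < {0.5 * (1 + 1 / mu):.2f}")
```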

When there is measurement noise in $b$, the sampling process becomes
$$b = \Phi x + \xi, \qquad (5)$$
where $\xi = [\xi_1, \xi_2, \ldots, \xi_m]^T$, and the $\xi_i$'s are independent and identically distributed random variables with zero mean and variance $\sigma^2$. In this case, our problem becomes
$$\min \|x\|_1 \qquad (6a)$$
$$\text{subject to } \|b - \Phi x\|^2 \le m\sigma^2. \qquad (6b)$$

2.2. Lagrange programming neural networks

The LPNN approach aims at solving a general nonlinear constrained optimization problem, given by
$$\text{EP}: \quad \min f(x) \qquad (7a)$$
$$\text{subject to } h(x) = 0, \qquad (7b)$$
where $x \in \mathbb{R}^n$ is the state of the system, $f: \mathbb{R}^n \to \mathbb{R}$ is the objective function, and $h: \mathbb{R}^n \to \mathbb{R}^m$ ($m < n$) describes the $m$ equality constraints. The two functions $f$ and $h$ are assumed to be twice differentiable.

The LPNN approach first sets up a Lagrange function for EP, given by
$$L(x, \lambda) = f(x) + \lambda^T h(x), \qquad (8)$$
where $\lambda = [\lambda_1, \ldots, \lambda_m]^T$ is the Lagrange multiplier vector. The LPNN approach then defines two kinds of neurons: variable neurons and Lagrange neurons. The variable neurons hold the state variable vector $x$, and the Lagrange neurons hold the Lagrange multiplier vector $\lambda$.

The LPNN approach defines two updating equations for these neurons, given by
$$\frac{1}{\varepsilon}\frac{dx}{dt} = -\nabla_x L(x, \lambda) \qquad (9a)$$
$$\frac{1}{\varepsilon}\frac{d\lambda}{dt} = \nabla_\lambda L(x, \lambda), \qquad (9b)$$
where $\varepsilon$ is the time constant of the circuit. The time constant depends on the circuit resistance and the circuit capacitance. Without loss of generality, we set $\varepsilon$ to 1. The variable neurons seek a state with the minimum cost, while the Lagrange neurons constrain the system state so that it falls into the feasible region. With (9), the network settles down at a stable state [15] if it satisfies some conditions.
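To make the dynamics (9) concrete, here is a minimal sketch (not from the paper) that integrates the LPNN equations with a simple Euler scheme for a toy equality-constrained problem, minimizing $\tfrac{1}{2}\|x\|^2$ subject to $Ax = b$; the problem, step size, and iteration count are illustrative assumptions.

```python
# A minimal sketch (not from the paper) of the LPNN dynamics (9), integrated with a
# simple Euler scheme for a toy problem: minimize (1/2)||x||^2 subject to A x = b.
# Here L(x, lam) = 0.5*||x||^2 + lam^T (A x - b); the problem, step size, and
# iteration count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x, lam = np.zeros(n), np.zeros(m)
dt = 0.01                                        # Euler step (with epsilon = 1)
for _ in range(20000):
    x = x - dt * (x + A.T @ lam)                 # dx/dt   = -grad_x L
    lam = lam + dt * (A @ x - b)                 # dlam/dt = +grad_lam L

x_mn = A.T @ np.linalg.solve(A @ A.T, b)         # closed-form minimum-norm solution
print("constraint residual   :", np.linalg.norm(A @ x - b))
print("gap to min-norm answer:", np.linalg.norm(x - x_mn))
```

After the dynamics settle, the variable-neuron state matches the minimum-norm solution and the constraint residual is essentially zero.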

3. LPNNs for compressive sampling

3.1. Sparse signal

From (3) and (7), one may suggest setting up an LPNN to solve problem (3). However, the absolute operator $|\cdot|$ is not differentiable at $x_i = 0$. Hence, we need an approximation. In this paper, we use the following approximation:
$$|x_i| \approx \frac{\ln(\cosh(a x_i))}{a}, \qquad (10)$$
where $a > 1$. Fig. 1 shows the shape of $(1/a)\ln(\cosh(ax))$. From the figure, the approximation¹ is quite accurate for a large $a$.

Since the derivative of $(1/a)\ln(\cosh(ax))$ is $\tanh(ax)$, the hyperbolic tangent function appears in the dynamic equations of the neural circuit. It should be noticed that the hyperbolic tangent function [23] is a commonly used activation in the neural network community. Also, comparators or amplifiers [24] built from differential bipolar pairs exhibit a hyperbolic-tangent relation between input and output.

Fig. 1. $y = (1/a)\ln(\cosh(ax))$ for $a = 10, 20, 50$, compared with $y = |x|$.

¹ In Section 4, we will use simulation examples to show the effectiveness of this approximation.


With the approximation, the problem becomes
$$\text{EP}_{cs}: \quad \min \frac{1}{a}\sum_{i=1}^{n} \ln(\cosh(a x_i)) \qquad (11a)$$
$$\text{subject to } b - \Phi x = 0. \qquad (11b)$$
From (8), the Lagrange function for EP$_{cs}$ is given by
$$L_{cs}(x, \lambda) = \frac{1}{a}\sum_{i=1}^{n} \ln(\cosh(a x_i)) + \lambda^T (b - \Phi x). \qquad (12)$$

Applying (9) to (12), we obtain the neural dynamics for EP$_{cs}$, given by
$$\frac{dx}{dt} = -\tanh(ax) + \Phi^T \lambda \qquad (13a)$$
$$\frac{d\lambda}{dt} = b - \Phi x. \qquad (13b)$$
In the above equations, $\Phi$ is the interconnection matrix between the variable neurons and the Lagrange neurons. This kind of neural dynamics can be considered as a special form of bidirectional associative memory (BAM) [25,26]. The information flow between the two types of neurons is achieved through the interconnection matrix.

The modified constrained optimization problem (11) has several nice properties. First, the objective function is twice differentiable. Second, the objective function is convex and the constraint $b - \Phi x = 0$ is affine. Hence any local optimum is the global optimum. Also, according to the Karush–Kuhn–Tucker (KKT) theorem, the necessary conditions are also sufficient for optimality. Based on these properties, we have the following theorem for EP$_{cs}$.

Theorem 1. There is a unique solution for EP$_{cs}$. A state $\{x^*, \lambda^*\}$ is the unique solution of EP$_{cs}$ if and only if
$$\nabla_x L_{cs}(x^*, \lambda^*) = 0 \qquad (14a)$$
$$b - \Phi x^* = 0. \qquad (14b)$$
From (12), (14a) can be rewritten as
$$-\tanh(a x^*) + \Phi^T \lambda^* = 0. \qquad (15)$$

It should be noticed that Theorem 1 tells us the optimality conditions of EP$_{cs}$ only. It does not tell us that our proposed neural model, given by (13), converges to the optimal solution.

In the following, we will show that the proposed neural model, given by (13), leads to the optimal solution of EP$_{cs}$. That is, $x(t) \to x^*$ and $\lambda(t) \to \lambda^*$ as $t \to \infty$.

Theorem 2. Given that the rank of $\Phi$ is $m$, the neural model, given by (13), leads to the unique optimal solution $\{x^*, \lambda^*\}$ of EP$_{cs}$. That is, $x(t) \to x^*$ and $\lambda(t) \to \lambda^*$ as $t \to \infty$.

Proof. Define a scalar function $V(x, \lambda)$, given by
$$V(x, \lambda) = \tfrac{1}{2}\|\tanh(ax) - \Phi^T\lambda\|^2 + \tfrac{1}{2}\|b - \Phi x\|^2 + \tfrac{1}{2}\|x - x^*\|^2 + \tfrac{1}{2}\|\lambda - \lambda^*\|^2. \qquad (16)$$
We will show that the scalar function (16) is a Lyapunov function for (13), and that the system is globally asymptotically stable. Clearly, $V(x^*, \lambda^*) = 0$ and $V(x, \lambda) > 0$ for $\{x, \lambda\} \neq \{x^*, \lambda^*\}$. Hence $V(x, \lambda)$ is lower bounded. Also, $V(x, \lambda) \to \infty$ as $\|x\| \to \infty$ or $\|\lambda\| \to \infty$. This means $V(x, \lambda)$ is radially unbounded. Now we are going to show that $\dot{V}(x, \lambda) = dV(x, \lambda)/dt = 0$ for $\{x, \lambda\} = \{x^*, \lambda^*\}$, and $\dot{V}(x, \lambda) < 0$ for $\{x, \lambda\} \neq \{x^*, \lambda^*\}$.

The derivative of $V(x, \lambda)$ with respect to time is
$$\dot{V}(x, \lambda) = \frac{\partial V(x, \lambda)}{\partial x}\frac{dx}{dt} + \frac{\partial V(x, \lambda)}{\partial \lambda}\frac{d\lambda}{dt}. \qquad (17)$$

Define
$$V(x, \lambda) = V_1(x, \lambda) + V_2(x, \lambda) \qquad (18)$$
$$V_1(x, \lambda) = \tfrac{1}{2}\|\tanh(ax) - \Phi^T\lambda\|^2 + \tfrac{1}{2}\|b - \Phi x\|^2 \qquad (19)$$
$$V_2(x, \lambda) = \tfrac{1}{2}\|x - x^*\|^2 + \tfrac{1}{2}\|\lambda - \lambda^*\|^2. \qquad (20)$$

It is easy to show that
$$\dot{V}_1(x, \lambda) = -(\tanh(ax) - \Phi^T\lambda)^T A\, (\tanh(ax) - \Phi^T\lambda), \qquad (21)$$
where
$$A = a\,\mathrm{diag}\bigl(1 - \tanh^2(a x_1),\; \ldots,\; 1 - \tanh^2(a x_n)\bigr).$$
Note that $A$ is positive definite. Also, we have
$$\dot{V}_2(x, \lambda) = -(x - x^*)^T(\tanh(ax) - \Phi^T\lambda) + (\lambda - \lambda^*)^T(b - \Phi x). \qquad (22)$$

From (14b) and (15), (22) becomes
$$\dot{V}_2(x, \lambda) = -(x - x^*)^T(\tanh(ax) - \tanh(ax^*)) - (x - x^*)^T(\Phi^T\lambda^* - \Phi^T\lambda) + (\lambda - \lambda^*)^T(b - \Phi x - b + \Phi x^*)$$
$$= -(x - x^*)^T(\tanh(ax) - \tanh(ax^*)). \qquad (23)$$

For $\{x, \lambda\} = \{x^*, \lambda^*\}$, we have $\dot{V}_1(x^*, \lambda^*) = 0$ and $\dot{V}_2(x^*, \lambda^*) = 0$. This means $\dot{V}(x^*, \lambda^*) = 0$. For $x = x^*$ but $\lambda \neq \lambda^*$, we have $\tanh(ax^*) - \Phi^T\lambda \neq 0$ because the rank of $\Phi$ is $m$.² Hence $\dot{V}_1(x^*, \lambda) < 0$ and $\dot{V}_2(x^*, \lambda) = 0$. This means $\dot{V}(x^*, \lambda) < 0$. For $x \neq x^*$, we have $\dot{V}_1(x, \lambda) \le 0$. Since $\tanh(\cdot)$ is a strictly monotonic function, $\dot{V}_2(x, \lambda) < 0$. This means $\dot{V}(x, \lambda) < 0$. To sum up, we have $\dot{V}(x, \lambda) < 0$ for $\{x, \lambda\} \neq \{x^*, \lambda^*\}$, $\dot{V}(x^*, \lambda^*) = 0$, and $V(x, \lambda)$ is lower bounded at $\{x^*, \lambda^*\}$ and radially unbounded. Hence $V(x, \lambda)$ is a Lyapunov function for (13). The network is globally asymptotically stable and converges to the unique equilibrium point $\{x^*, \lambda^*\}$. The proof is complete. □

Theorem 2 tells us that the proposed neural model, given by (13), can find the optimal solution of EP$_{cs}$. Note that the condition of Theorem 2 (the rank of $\Phi$ is $m$) usually holds in compressive sampling because the measurement vectors are usually independent.
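A simple numerical sanity check (my own sketch, not from the paper) is to evaluate $V$ in (16) along a simulated trajectory of (13) and confirm that it does not increase; here the equilibrium $\{x^*, \lambda^*\}$ is approximated by integrating the dynamics for a long time, which is an assumption of the sketch.

```python
# A numerical sanity check (my own sketch, not from the paper): along a simulated
# trajectory of (13), the Lyapunov function V in (16) should not increase. The
# equilibrium {x*, lambda*} is approximated here by integrating for a long time.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(3)
n, m, k, a = 64, 24, 4, 20

x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
Phi = rng.choice([-1.0, 1.0], size=(m, n))      # +/-1 measurement matrix, as in the paper
b = Phi @ x_true

def rhs(t, y):
    x, lam = y[:n], y[n:]
    return np.concatenate([-np.tanh(a * x) + Phi.T @ lam, b - Phi @ x])

# Long run to approximate the equilibrium, then a shorter run to sample V(t).
y_star = solve_ivp(rhs, (0.0, 200.0), np.zeros(n + m), rtol=1e-8, atol=1e-10).y[:, -1]
x_star, lam_star = y_star[:n], y_star[n:]

sol = solve_ivp(rhs, (0.0, 50.0), np.zeros(n + m), dense_output=True,
                rtol=1e-8, atol=1e-10)
V = []
for t in np.linspace(0.0, 50.0, 200):
    x, lam = sol.sol(t)[:n], sol.sol(t)[n:]
    V.append(0.5 * np.sum((np.tanh(a * x) - Phi.T @ lam) ** 2)
             + 0.5 * np.sum((b - Phi @ x) ** 2)
             + 0.5 * np.sum((x - x_star) ** 2)
             + 0.5 * np.sum((lam - lam_star) ** 2))

print("largest increase of V between samples:", max(np.diff(V)))  # ~ <= 0 up to solver error
```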

We use a synthetic 1D sparse signal, shown in Fig. 2(a), to illustrate the convergence of the neural circuit. The signal length is 256. In this signal, 15 data points have non-zero values. The connection matrix is a ±1 random matrix. Fig. 2(b) shows the recovery from 60 measured values with no measurement noise. From the figure, the recovered signal is nearly the same as the original signal. Moreover, the network converges after 14 characteristic times, as shown in Fig. 2(c).

² Since the rank of $\Phi$ is equal to $m$, there is a unique $\lambda$, denoted as $\lambda^*$, such that $\tanh(ax^*) - \Phi^T\lambda^* = 0$.


3.2. Noisy measurements

As mentioned before, when there is measurement noise in $b$, the compressive sampling process can be represented as
$$b = \Phi x + \xi, \qquad (24)$$
where $\xi = [\xi_1, \xi_2, \ldots, \xi_m]^T$, and the $\xi_i$'s are independent and identically distributed random variables with zero mean and variance $\sigma^2$. In this case, from (6) and (10), our problem becomes

$$\text{EP}_{cn}: \quad \min \frac{1}{a}\sum_{i=1}^{n} \ln(\cosh(a x_i)) \qquad (25a)$$
$$\text{subject to } \|b - \Phi x\|^2 \le m\sigma^2. \qquad (25b)$$
In our formulation, we assume that the signal power is greater than the noise power, i.e., $\|b\|^2 > m\sigma^2$. Since the objective function and the constraint are convex, any local optimum is a global optimum. According to the KKT theorem, we have the following theorem.

Theorem 3. There is a unique solution for EP$_{cn}$. A state $\{x^*, \lambda^*\}$ is a solution of EP$_{cn}$ if and only if
$$\tanh(a x^*) - \lambda^*\Phi^T(b - \Phi x^*) = 0 \qquad (26)$$
$$\|b - \Phi x^*\|^2 - m\sigma^2 \le 0 \qquad (27)$$
$$\lambda^* \ge 0 \qquad (28)$$
$$\lambda^*\bigl(\|b - \Phi x^*\|^2 - m\sigma^2\bigr) = 0. \qquad (29)$$
Note that in the noisy case $\lambda^*$ is a scalar.

From Theorem 3 and the assumption $\|b\|^2 > m\sigma^2$, we have the following theorem.

Theorem 4. Given $\|b\|^2 > m\sigma^2$, the optimization problem EP$_{cn}$ is equivalent to
$$\text{EP}_{cn'}: \quad \min \frac{1}{a}\sum_{i=1}^{n} \ln(\cosh(a x_i)) \qquad (30a)$$
$$\text{subject to } \|b - \Phi x\|^2 - m\sigma^2 = 0. \qquad (30b)$$

Proof. From (29), $\lambda^* > 0$ is equivalent to $\|b - \Phi x^*\|^2 - m\sigma^2 = 0$. From Theorem 3, EP$_{cn}$ has a unique solution $\{x^*, \lambda^*\}$ and $\lambda^* \ge 0$. At $\{x^*, \lambda^*\}$, $\lambda^*$ is either greater than zero or equal to zero. If $\lambda^* = 0$, then from (26) we have $x^* = 0$. From (27), this implies $\|b\|^2 \le m\sigma^2$, which contradicts our earlier assumption $\|b\|^2 > m\sigma^2$. Hence, from (29), $\lambda^*$ must be greater than zero and $\|b - \Phi x^*\|^2 - m\sigma^2 = 0$. Therefore, the inequality constraint in (25) is active and can be replaced by an equality. The proof is complete. □

From Theorem 4, we consider the Lagrange function for (30), given by
$$L_{cn'}(x, \lambda) = \frac{1}{a}\sum_{i=1}^{n} \ln(\cosh(a x_i)) + \lambda\bigl(\|b - \Phi x\|^2 - m\sigma^2\bigr). \qquad (31)$$

Now, from (9) the neural model is given by
$$\frac{dx}{dt} = -\tanh(ax) + 2\lambda\,\Phi^T(b - \Phi x) \qquad (32)$$
$$\frac{d\lambda}{dt} = \|b - \Phi x\|^2 - m\sigma^2. \qquad (33)$$
From Theorem 3, we can obtain the stability property of (32) and (33), given by the following theorem.

Theorem 5. Given $\|b\|^2 > m\sigma^2$, the unique solution $\{x^*, \lambda^*\}$ of EP$_{cn}$ and EP$_{cn'}$ is an asymptotically stable point of the neural dynamics.

Proof. The linearized neural dynamics around $\{x^*, \lambda^*\}$ is given by
$$\begin{bmatrix} \dfrac{dx}{dt} \\[4pt] \dfrac{d\lambda}{dt} \end{bmatrix} = -\begin{bmatrix} A^* + 2\lambda^*\Phi^T\Phi & 2\Phi^T(b - \Phi x^*) \\ -2(b - \Phi x^*)^T\Phi & 0 \end{bmatrix}\begin{bmatrix} x - x^* \\ \lambda - \lambda^* \end{bmatrix} \qquad (34)$$

Fig. 2. The 1D sparse artificial signal recovery. (a) The sparse signal, (b) the recovered signal from the LPNN approach, and (c) the dynamics of the recovered signal.


where
$$A^* = a\,\mathrm{diag}\bigl(1 - \tanh^2(a x^*_1),\; \ldots,\; 1 - \tanh^2(a x^*_n)\bigr).$$
Define
$$\Gamma = \begin{bmatrix} A^* + 2\lambda^*\Phi^T\Phi & 2\Phi^T(b - \Phi x^*) \\ -2(b - \Phi x^*)^T\Phi & 0 \end{bmatrix}. \qquad (35)$$

If the real part of each eigenvalue of $\Gamma$ is strictly positive, then $\{x^*, \lambda^*\}$ is an asymptotically stable point [27]. Denote by $\tilde{x}$ the complex conjugate of $x$. Let $\alpha$ be an eigenvalue of $\Gamma$, and $(\chi^T, \tau)^T \neq (0^T, 0)^T$ be the corresponding eigenvector. Clearly, if $(\chi^T, \tau)^T$ is an eigenvector of $\Gamma$, it cannot be a zero vector. Now we consider two cases, either $\chi = 0$ or $\chi \neq 0$. If $\chi = 0$, from the definition of an eigenvector we have
$$\Gamma \begin{bmatrix} 0 \\ \tau \end{bmatrix} = \alpha \begin{bmatrix} 0 \\ \tau \end{bmatrix}. \qquad (36)$$
From the definition of $\Gamma$ (see (35)), we also have
$$\Gamma \begin{bmatrix} 0 \\ \tau \end{bmatrix} = \begin{bmatrix} 2\tau\,\Phi^T(b - \Phi x^*) \\ 0 \end{bmatrix}. \qquad (37)$$
For $\|b\|^2 > m\sigma^2$, from (26), we have $x^* \neq 0$, $\lambda^* \neq 0$ and $\Phi^T(b - \Phi x^*) \neq 0$. Therefore, $\tau = 0$. This contradicts the assumption that $(\chi^T, \tau)^T$ is an eigenvector. This means $\chi \neq 0$. Since $(\chi^T, \tau)^T$ is the eigenvector, we have
$$\mathrm{Re}\left\{ [\tilde{\chi}^T \;\; \tilde{\tau}]\, \Gamma \begin{bmatrix} \chi \\ \tau \end{bmatrix} \right\} = \mathrm{Re}(\alpha)\,(\|\chi\|^2 + |\tau|^2). \qquad (38)$$
From the definition of $\Gamma$, we also have
$$\mathrm{Re}\left\{ [\tilde{\chi}^T \;\; \tilde{\tau}]\, \Gamma \begin{bmatrix} \chi \\ \tau \end{bmatrix} \right\} = \mathrm{Re}\{\tilde{\chi}^T (A^* + 2\lambda^*\Phi^T\Phi)\chi\}. \qquad (39)$$
This means
$$\mathrm{Re}\{\tilde{\chi}^T (A^* + 2\lambda^*\Phi^T\Phi)\chi\} = \mathrm{Re}(\alpha)\,(\|\chi\|^2 + |\tau|^2). \qquad (40)$$
Since $A^*$ is positive definite and $\lambda^* > 0$, the left-hand side must be greater than zero. This means $\mathrm{Re}(\alpha) > 0$. The proof is complete. □

We use the previous synthetic sparse signal (Fig. 2) to illustrate the convergence behavior. The measurement noise variance is 0.01. Fig. 3(a) shows the recovery from 60 measured values. The recovered signal has converged after 5 characteristic times, as shown in Fig. 3(b).
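For illustration (not the authors' code), the sketch below integrates the noisy-measurement dynamics (32)–(33) for a setup loosely mirroring this experiment (60 measurements, noise variance 0.01). Since Theorem 5 only guarantees local stability, the zero initial state and the time horizon are assumptions, and convergence from this initialization is not assured by the analysis.

```python
# An illustrative sketch (not the authors' code) of the noisy-measurement
# dynamics (32)-(33). Only local stability is guaranteed, so the initial state
# used here is an assumption.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(4)
n, m, k, a, sigma2 = 256, 60, 15, 50, 0.01

x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
Phi = rng.choice([-1.0, 1.0], size=(m, n))
b = Phi @ x_true + np.sqrt(sigma2) * rng.standard_normal(m)

def rhs(t, y):
    x, lam = y[:n], y[n]
    r = b - Phi @ x
    dx = -np.tanh(a * x) + 2.0 * lam * (Phi.T @ r)   # (32)
    dlam = r @ r - m * sigma2                        # (33)
    return np.concatenate([dx, [dlam]])

sol = solve_ivp(rhs, (0.0, 40.0), np.zeros(n + 1), rtol=1e-6, atol=1e-8)
x_rec = sol.y[:n, -1]
print("reconstruction MSE:", np.mean((x_rec - x_true) ** 2))
```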

3.3. LPNNs for recovering non-sparse signal

Compressive sampling can be extended to handle non-sparse signals [28]. Let $z$ be a non-sparse signal, and let $\Psi$ be an invertible transform which can transform a non-sparse signal into a sparse one. In this case, the measurement signal is given by $b = \Phi\Psi z$. Clearly, the compressed sensing problem for non-sparse signals [28] becomes
$$\min \|\Psi z\|_1 \qquad (41a)$$
$$\text{subject to } \|b - \Phi\Psi z\|^2 \le \varepsilon, \qquad (41b)$$
where $\varepsilon > 0$ is the error level in the residual. By considering $x = \Psi z$, we can use the neural model presented in Section 3.2 to recover non-sparse signals.
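A toy end-to-end sketch (my own, not from the paper): $\Psi$ is taken to be an orthonormal Haar matrix, the test signal $z$ is piecewise constant so that $x = \Psi z$ is exactly sparse, and, since the measurements here are noiseless, the dynamics (13) are used for recovery instead of the Section 3.2 model; $z$ is then reconstructed as $\Psi^T x$.

```python
# A toy end-to-end sketch (my own, not from the paper) of recovering a non-sparse
# signal z through an orthonormal Haar sparsifying transform Psi.
import numpy as np
from scipy.integrate import solve_ivp

def haar_matrix(N):
    """Orthonormal Haar transform matrix, N a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < N:
        top = np.kron(H, [1.0, 1.0])
        bottom = np.kron(np.eye(H.shape[0]), [1.0, -1.0])
        H = np.vstack([top, bottom]) / np.sqrt(2.0)
    return H

rng = np.random.default_rng(5)
n, m, a = 128, 60, 50
Psi = haar_matrix(n)

z = np.repeat(rng.standard_normal(8), n // 8)     # non-sparse, piecewise-constant signal
Phi = rng.choice([-1.0, 1.0], size=(m, n))
b = Phi @ (Psi @ z)                               # b = Phi Psi z

def rhs(t, y):
    x, lam = y[:n], y[n:]
    return np.concatenate([-np.tanh(a * x) + Phi.T @ lam, b - Phi @ x])

sol = solve_ivp(rhs, (0.0, 40.0), np.zeros(n + m), rtol=1e-6, atol=1e-8)
z_rec = Psi.T @ sol.y[:n, -1]                     # invert the orthonormal transform
print("reconstruction MSE of z:", np.mean((z_rec - z) ** 2))
```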

3.4. Activation function in the dynamics

In (13) and (32), the hyperbolic tangent function is involved. In a hardware implementation, comparators or amplifiers [24] built from differential bipolar pairs exhibit a hyperbolic-tangent relation between input and output. Besides, the hyperbolic tangent function [23] is a commonly used activation.

Instead of using the hyperbolic tangent function as the activation function, we can use any activation function $\varphi(x)$ with the following properties:
$$\lim_{x \to -\infty} \varphi(x) = -1 \qquad (42a)$$
$$\lim_{x \to \infty} \varphi(x) = 1 \qquad (42b)$$
$$\varphi(0) = 0 \qquad (42c)$$
$$\lim_{x \to \pm\infty} \frac{d\varphi(x)}{dx} = 0 \qquad (42d)$$
$$\frac{d\varphi(x)}{dx} > 0. \qquad (42e)$$

Fig. 3. The 1D sparse artificial signal recovery from noisy measurements. (a) The recovered signal from our approach where the measurements contain noise and the noise variance is 0.01. (b) The dynamics of the recovered signal.


With those properties, we can also obtain the same theoretical results about the stability and optimality, because the proofs of Theorems 1–5 are based on those properties. For example, we can cascade two low-gain neurons to form a high-gain neuron, given by
$$\varphi(x) = \tanh(a\tanh(ax)). \qquad (43)$$
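The following short check (not from the paper) verifies numerically that the cascade activation (43) behaves as required by (42); the gain $a = 20$ and the evaluation points are arbitrary choices.

```python
# A small numerical check (not from the paper) that the cascade activation (43),
# phi(x) = tanh(a*tanh(a*x)), behaves as required by (42): phi(0) = 0, it saturates
# at +/-tanh(a) (~ +/-1 for a = 20), it is strictly increasing, and its derivative
# vanishes far from the origin.
import numpy as np

a = 20.0
phi = lambda x: np.tanh(a * np.tanh(a * x))
# d/dx phi(x) = a^2 * sech^2(a*tanh(a*x)) * sech^2(a*x), written with cosh for stability
dphi = lambda x: (a * a) / (np.cosh(a * np.tanh(a * x)) ** 2 * np.cosh(a * x) ** 2)

print("phi at -inf/+inf ~", phi(-1e6), phi(1e6))        # (42a), (42b): ~ -1 and ~ +1
print("phi(0) =", phi(0.0))                             # (42c)
print("dphi at +/-5   :", dphi(-5.0), dphi(5.0))        # (42d): essentially zero
x = np.linspace(-0.5, 0.5, 10001)
print("min dphi on [-0.5, 0.5]:", dphi(x).min())        # (42e): small but positive
```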

4. Simulations

4.1. Recovery for sparse signals

In our approach, we use the hyperbolic tangent for the approximation. This section tests our neural model with different hyperbolic tangent parameter values, a = {10, 20, 50}. The cascade approach, with activation function φ(x) = tanh(20 tanh(20x)), is also considered. As a comparison, we use the standard BP [21] from l1-Magic to recover sparse signals.

The example is a standard data configuration used in many papers [29,21]. The signal length is 512 and 15 data points have non-zero values. We randomly generate 100 signals in our experiment. One of the 100 sparse signals is shown in Fig. 4. The measurement matrix Φ is a ±1 random matrix. We vary the number of measurements from 60 to 120. We repeat our experiment 100 times with different random matrices Φ and different sparse signals x.

Fig. 5 shows the average mean square error (MSE) values of the recovered signals. From the figure, we can observe that for the standard BP, we need around 90 measurements to recover the sparse signal. For our method, when the value of the hyperbolic tangent parameter is too small, such as 10 or 20, there is noticeable degradation in MSE values. When the hyperbolic tangent parameter a is equal to 50, there is no noticeable difference in MSE values between the standard BP method and our neural approach. Also, with the cascade activation function (φ(x) = tanh(20 tanh(20x))), the MSE values are again indistinguishable from those of the standard BP method.
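A scaled-down sketch (my own, not the authors' experiment code) of this protocol is shown below: average reconstruction MSE versus the number of measurements m, using the dynamics (13) with a = 50. The trial count and the m-grid are reduced for brevity.

```python
# A scaled-down sketch (my own, not the authors' experiment code) of the Section 4.1
# protocol: average reconstruction MSE versus the number of measurements m.
import numpy as np
from scipy.integrate import solve_ivp

def recover(Phi, b, a=50.0, T=40.0):
    m, n = Phi.shape
    def rhs(t, y):
        x, lam = y[:n], y[n:]
        return np.concatenate([-np.tanh(a * x) + Phi.T @ lam, b - Phi @ x])
    sol = solve_ivp(rhs, (0.0, T), np.zeros(n + m), rtol=1e-6, atol=1e-8)
    return sol.y[:n, -1]

rng = np.random.default_rng(6)
n, k, trials = 512, 15, 5
for m in (60, 90, 120):
    mse = 0.0
    for _ in range(trials):
        x_true = np.zeros(n)
        x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
        Phi = rng.choice([-1.0, 1.0], size=(m, n))
        x_rec = recover(Phi, Phi @ x_true)
        mse += np.mean((x_rec - x_true) ** 2) / trials
    print(f"m = {m:3d}: average MSE ~ {mse:.5f}")
```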

4.2. Recovery from noisy measurements

This section tests our neural model for recovering sparse signals from noisy measurements. The simulation setting is similar to that of Section 4.1. We add Gaussian noise to the measurement signals. The experiment considers three signal-to-noise ratio (SNR) levels, {45 dB, 31 dB, 25 dB}, in the measurement signal. For our neural approach, three different hyperbolic tangent parameter values, a = {10, 20, 50}, are considered. In the simulation, we also consider the cascade activation function approach with φ(x) = tanh(20 tanh(20x)).

As a comparison, we use the "Min-l1 with quadratic constraints" (ML1QC) algorithm from l1-Magic [21] to recover sparse signals. We repeat our experiment 100 times with different random matrices Φ and different sparse signals x. Fig. 6 shows the average MSE values of the recovered signals. From the figure, we can observe that for the ML1QC algorithm, we need around 90 measurements to recover the sparse signal. For our method, when the value of the hyperbolic tangent parameter is too small, such as 10 or 20, there is noticeable degradation. When the hyperbolic tangent parameter a is equal to 50, there is no noticeable difference in MSE values between the ML1QC method and our approach. Also, with the cascade activation function (φ(x) = tanh(20 tanh(20x))), the MSE values are again indistinguishable from those of the ML1QC method.

4.3. Recovery for non-sparse signal

For recovering non-sparse signals, we can consider the situation of data compression [28]. We use the algorithm presented in Section 3.3 to recover non-sparse signals. We consider a 1D signal which is extracted from one row of the image "CameraMan"; the transform used is the Haar transform.

Fig. 4. The sparse signal. One of the 100 sparse signals used in Sections 4.1 and 4.2.

Fig. 5. Average MSE of the recovered signals under the noiseless condition (standard BP; neural approach with a = 10, 20, 50; cascade approach with tanh(20 tanh(20x))). The signal length is 512 and 15 data points have non-zero values. We repeat our experiment 100 times with different random matrices and different sparse signals.

Fig. 6. Average MSE of the recovered signals from noisy measurements (SNR = 45 dB, 31 dB, and 25 dB). At each number of measurements and at each measurement noise level, we repeat our experiment 100 times with different random matrices and different sparse signals.


As a comparison, we also consider the ML1QC algorithm. We vary the number of measurements in our experiment. The reconstructed signals for different numbers of measurements are shown in Fig. 7. From the figure, when more measurements are used, we get a better reconstruction. Also, the quality obtained with our neural method is similar to that obtained with the ML1QC algorithm. We repeat our experiment 100 times with different random matrices. Fig. 8 shows that our neural approach and the ML1QC algorithm produce similar MSE performance.

5. Conclusion

We formulated two analog neural models for signal recovery in compressive sampling. We proposed neural dynamics to handle two cases: the standard recovery and the recovery from noisy measurements. We not only proposed the neural models but also investigated their stability. We showed that for the case of the standard recovery, the proposed neural dynamics converge to the unique global minimum of the objective function. For the case of recovery from noisy measurements, we showed that the proposed neural model is locally stable. With our stability analysis, the neural dynamics have great potential for application in compressive sampling. Experimental results showed that there are no significant differences between our analog approach and the traditional numerical methods. One future direction is to modify the dynamics for the noisy situation so that the neural dynamics are globally stable.

Acknowledgment

The work presented in this paper is supported by a research grant (CityU 115612) from the Research Grants Council of the Government of the Hong Kong Special Administrative Region.

References

[1] M.S. Bazaraa, H.D. Sherali, C.M. Shetty, Nonlinear Programming: Theory and Algorithms, 2nd ed., Wiley, New York, 1993.
[2] A. Ruszczynski, Nonlinear Optimization, Princeton University Press, 2006.
[3] A. Cichocki, R. Unbehauen, Neural Networks for Optimization and Signal Processing, Wiley, London, UK, 1993.
[4] J.J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. U.S.A. 79 (1982) 2554–2558.
[5] L.O. Chua, G.N. Lin, Nonlinear programming without computation, IEEE Trans. Circuits Syst. 31 (1984) 182–188.
[6] Y. Xia, G. Feng, J. Wang, A novel recurrent neural network for solving nonlinear optimization problems with inequality constraints, IEEE Trans. Neural Networks 19 (8) (2008) 1340–1353.
[7] Q. Liu, J. Wang, A one-layer recurrent neural network with a discontinuous hard-limiting activation function for quadratic programming, IEEE Trans. Neural Networks 19 (4) (2008) 558–570.
[8] C. Dang, Y. Leung, X. Bao Gao, K. Zhou Chen, Neural networks for nonlinear and mixed complementarity problems and their applications, Neural Networks 17 (2004) 271–283.
[9] X. Hu, J. Wang, Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network, IEEE Trans. Neural Networks 17 (6) (2006) 1487–1499.
[10] X. Hu, J. Wang, A recurrent neural network for solving a class of general variational inequalities, IEEE Trans. Syst. Man Cybern., Part B: Cybern. 37 (3) (2007) 528–539.
[11] S. Bharitkar, K. Tsuchiya, Y. Takefuji, Microcode optimization with neural networks, IEEE Trans. Neural Networks 10 (3) (1999) 698–703.
[12] J. Sum, C.-S. Leung, P. Tam, G. Young, W. Kan, L.-W. Chan, Analysis for a class of winner-take-all model, IEEE Trans. Neural Networks 10 (1) (1999) 64–71.
[13] J. Wang, Analysis and design of a k-winners-take-all model with a single state variable and the Heaviside step activation function, IEEE Trans. Neural Networks 21 (9) (2010) 1496–1506.
[14] Y. Xiao, Y. Liu, C.S. Leung, J. Sum, K. Ho, Analysis on the convergence time of dual neural network-based KWTA, IEEE Trans. Neural Networks Learn. Syst. 23 (4) (2012) 676–682.
[15] S. Zhang, A.G. Constantinides, Lagrange programming neural networks, IEEE Trans. Circuits Syst. II 39 (1992) 441–452.
[16] D. Donoho, X. Huo, Uncertainty principles and ideal atomic decomposition, IEEE Trans. Inf. Theory 47 (7) (2001) 2845–2862.
[17] S. Chen, D. Donoho, M. Saunders, Atomic decomposition by basis pursuit, SIAM J. Sci. Comput. 43 (1) (2001) 129–159.
[18] D. Donoho, M. Elad, Optimally sparse representation in general dictionaries via l1 minimization, Proc. Natl. Acad. Sci. U.S.A. 100 (5) (2003) 2197–2202.
[19] E.J. Candès, M.B. Wakin, An introduction to compressive sampling, IEEE Signal Process. Mag. 25 (2008) 21–30.
[20] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B (Methodological) 58 (1996) 267–288.
[21] E. Candès, J. Romberg, l1-Magic: Recovery of Sparse Signals via Convex Programming. URL ⟨http://users.ece.gatech.edu/~justin/l1magic/⟩.
[22] D.L. Donoho, J. Tanner, Counting faces of randomly-projected polytopes when the projection radically lowers dimension, J. Amer. Math. Soc. (2009) 1–53.
[23] A. Annema, Hardware realisation of a neuron transfer function and its derivative, Electron. Lett. 30 (7) (1994) 576–577.
[24] A. Moscovici, High Speed A/D Converters: Understanding Data Converters Through SPICE, Kluwer Academic Publishers, 2001.

Fig. 7. Signal recovery for the non-sparse signal with 80, 120, and 140 measurements.

Fig. 8. MSE of signal recovery for non-sparse signals (ML1QC with the Haar transform; neural approach with a = 50; cascade function with tanh(20 tanh(20x))). We repeat our experiment 100 times with different random matrices Φ.


[25] B. Kosko, Bidirectional associative memories, IEEE Trans. Syst. Man Cybern. 18 (1) (1988) 49–60.
[26] C.S. Leung, L.W. Chan, E. Lai, Stability and statistical properties of second-order bidirectional associative memory, IEEE Trans. Neural Networks 8 (2) (1997) 267–277.
[27] J.-J.E. Slotine, W. Li, Applied Nonlinear Control, Prentice Hall, 1991.
[28] Y. Tsaig, D.L. Donoho, Extensions of compressed sensing, Signal Process. 86 (3) (2006) 549–571.
[29] S. Ji, Y. Xue, L. Carin, Bayesian compressive sensing, IEEE Trans. Signal Process. 56 (6) (2007) 2346–2356.

Chi-Sing Leung received the B.Sci. degree in electronics, the M.Phil. degree in information engineering, and the Ph.D. degree in computer science from the Chinese University of Hong Kong in 1989, 1991, and 1995, respectively. He is currently an Associate Professor in the Department of Electronic Engineering, City University of Hong Kong. His research interests include neural computing, data mining, and computer graphics. In 2005, he received the 2005 IEEE Transactions on Multimedia Prize Paper Award for his paper titled "The Plenoptic Illumination Function", published in 2002. He was a member of the Organizing Committee of ICONIP2006. He was the Program Chair of ICONIP2009 and ICONIP2012. He is/was a guest editor of several journals, including Neurocomputing and Neural Computing and Applications. He is a governing board member of the Asian Pacific Neural Network Assembly (APNNA) and Vice President of APNNA.

John Sum received the B.Eng. in Electronic Engineering from the Hong Kong Polytechnic University in 1992, and the M.Phil. and Ph.D. in CSE from the Chinese University of Hong Kong in 1995 and 1998. John spent 6 years teaching in several universities in Hong Kong, including the Hong Kong Baptist University, the Open University of Hong Kong and the Hong Kong Polytechnic University. In 2005, John moved to Taiwan and started to teach in Chung Shan Medical University. Currently, he is an associate professor in the Institute of E-Commerce, the National Chung Hsing University, Taichung, ROC. His research interests include neural computation, mobile sensor networks and scale-free networks. John Sum is a senior member of IEEE and an associate editor of the International Journal of Computers and Applications.

A.G. Constantinides is the Professor of Signal Processing and the head of the Communications and Signal Processing Group of the Department of Electrical and Electronic Engineering. He has been actively involved in research in various aspects of digital filter design, digital signal processing, and communications for more than 30 years. Professor Constantinides' research spans a wide range of Digital Signal Processing and Communications, both from the theoretical as well as the practical points of view. His recent work has been directed toward the demanding signal processing problems arising from the various areas of mobile telecommunication. This work is supported by research grants and contracts from various government and industrial organisations.

Professor Constantinides has published several books and over 250 papers in learned journals in the area of Digital Signal Processing and its applications. He has served as the First President of the European Association for Signal Processing (EURASIP) and has contributed in this capacity to the establishment of the European Journal for Signal Processing. He has been on, and is currently serving as, a member of many technical program committees of the IEEE, the IEE and other international conferences. He organised the first ever international series of meetings on Digital Signal Processing, in London initially in 1967, and in Florence (with Vito Cappellini) since 1972. In 1985 he was awarded the Honour of Chevalier, Palmes Académiques, by the French government, and in 1996, the promotion to Officier, Palmes Académiques. He holds honorary doctorates from European and Far Eastern universities, several Visiting Professorships, Distinguished Lectureships, Fellowships and other honours around the world.
