
Salt-and-pepper Noise Removal by a Spectral Conjugate Gradient Method

Wei Xue, Dandan Cui, and Jinhong Huang

Abstract—Denoising is an important problem in signal processing. This paper proposes an efficient two-phase method for salt-and-pepper noise removal. In the first phase, an adaptive median filter is used to detect the contaminated pixels. In the second phase, the candidate pixels are restored by minimizing a regularization functional. To this end, a spectral conjugate gradient method is considered, whose global convergence can be established under suitable conditions. Experimental results show that the proposed approach is efficient and practical.

I. INTRODUCTION

During acquisition or transmission, digital images are often corrupted by impulse noise [1], which is independent of image content. An important type of impulse noise is salt-and-pepper noise. When images are corrupted by this kind of noise, only part of the pixels are changed, and the noisy pixels sprinkle over the image like white and black dots. Recently, a two-phase method was proposed for removing salt-and-pepper impulse noise [2]. First, an adaptive median filter [3] is used to identify pixels that are likely to be contaminated. Then the noisy pixels are restored by minimizing an edge-preserving functional in the second phase. Because it restores noisy pixels one by one, this approach has poor computational efficiency. Motivated to improve the two-phase method, an efficient spectral conjugate gradient method (SCG) is introduced in this paper.
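For concreteness, the detection phase can be sketched as follows. This is a minimal Python sketch of an adaptive median filter in the spirit of [3]; the function name, the window bound `w_max`, and the level-A/level-B tests are our assumptions, not the exact implementation used in the experiments of Section IV.

```python
import numpy as np

def detect_impulse_pixels(img, w_max=7):
    """Flag likely salt-and-pepper pixels with a simplified adaptive median test.

    For each pixel the window grows (3x3, 5x5, ...) until the window median is
    not an extreme value (level A); the pixel is then declared clean iff its
    own value lies strictly between the window min and max (level B).
    `img` is a 2-D uint8 array; returns a boolean noise mask.
    """
    M, N = img.shape
    noisy = np.zeros((M, N), dtype=bool)
    for i in range(M):
        for j in range(N):
            for w in range(3, w_max + 1, 2):
                h = w // 2
                win = img[max(0, i - h):i + h + 1, max(0, j - h):j + h + 1]
                zmin, zmax, zmed = win.min(), win.max(), np.median(win)
                if zmin < zmed < zmax:                           # level A passed
                    noisy[i, j] = not (zmin < img[i, j] < zmax)  # level B test
                    break
            else:                                 # no window passed level A
                noisy[i, j] = True
    return noisy
```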

Let I be the numerical image, let $\mathcal{A} = \{1, 2, \dots, M\} \times \{1, 2, \dots, N\}$ be the index set of I, and let $\mathcal{N} \subset \mathcal{A}$ denote the set of indices of the noise pixels detected in the first phase. The second phase restores the noise pixels by minimizing the following functional:

$$F_\alpha(u) = \sum_{(i,j)\in\mathcal{N}} |u_{i,j} - y_{i,j}| + \frac{\mu}{2} \sum_{(i,j)\in\mathcal{N}} \left(2\, T^0_{i,j} + T^1_{i,j}\right), \qquad (1)$$

where $\mu > 0$ is a parameter, $u = [u_{i,j}]_{(i,j)\in\mathcal{N}}$ is a column vector of length $l$ ordered lexicographically, $l$ is the number of elements of $\mathcal{N}$, and $y_{i,j}$ denotes the observed pixel value of the image at position $(i,j)$. Both $T^0_{i,j} = \sum_{(m,n)\in\mathcal{V}_{i,j}\setminus\mathcal{N}} \varphi_\alpha(u_{i,j} - y_{m,n})$ and $T^1_{i,j} = \sum_{(m,n)\in\mathcal{V}_{i,j}\cap\mathcal{N}} \varphi_\alpha(u_{i,j} - u_{m,n})$ are regularization terms, where $\mathcal{V}_{i,j}$ denotes the set of the four closest neighbors of the pixel at position $(i,j) \in \mathcal{A}$.

Wei Xue, Dandan Cui and Jinhong Huang are with the School of Mathematics and Computer Sciences, GanNan Normal University, Ganzhou, Jiangxi Province, People's Republic of China (email: [email protected], [email protected], [email protected]).

This work was partly supported by the Postgraduate Student Innovation Foundation of Gannan Normal University (No. YCX10B006) and the National Natural Science Foundation of China (No. 11001060).

An explanation of the extra factor "2" in the second summation in (1) can be found in [4]. $\varphi_\alpha$ is an edge-preserving functional with a regularization parameter $\alpha$. Examples of such $\varphi_\alpha$ are $\varphi_\alpha(u) = \sqrt{\alpha + u^2}$ and $\varphi_\alpha(u) = |u|^\alpha$; see [5]-[9].

The two-phase method can restore large patches of noisy pixels because the pertinent prior information is contained in the regularization term. In addition, since the adaptive median filter detects almost all of the noisy pixels with very high accuracy, and the noisy pixel values are independent of the pixel values before corruption, the data-fitting term in the objective functional of edge-preserving regularization is no longer necessary. Furthermore, the objective functional should be simple enough to keep the computational complexity of the algorithm low. So we can drop the nonsmooth data-fitting term in the second phase, where only noisy pixels are restored in the minimization. Many optimization methods can then be extended to minimize the following smooth edge-preserving regularization functional [4], [10]:

$$F_\alpha(u) = \sum_{(i,j)\in\mathcal{N}} \left(2\, T^0_{i,j} + T^1_{i,j}\right). \qquad (2)$$
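For illustration, a minimal Python sketch of evaluating (2) and its gradient over the free (noisy) pixels, using $\varphi_\alpha(t) = \sqrt{\alpha + t^2}$; the function names and the boolean-mask representation of $\mathcal{N}$ are our own conventions, and the default $\alpha = 100$ anticipates the choice made in Section IV. Since $\varphi_\alpha$ is even, each noisy-noisy pair occurs in two $T^1$ terms, which is why each ordered pair contributes symmetrically to the gradient.

```python
import numpy as np

def phi(t, alpha=100.0):
    """Edge-preserving potential: phi_alpha(t) = sqrt(alpha + t^2)."""
    return np.sqrt(alpha + t * t)

def dphi(t, alpha=100.0):
    """Its derivative: phi'_alpha(t) = t / sqrt(alpha + t^2)."""
    return t / np.sqrt(alpha + t * t)

def F_and_grad(u, y, noisy, alpha=100.0):
    """Value and gradient of the smooth functional (2).

    u, y  : current estimate and observed image (2-D float arrays);
    noisy : boolean mask of the detected set N. The gradient is zero
            outside N because only noisy pixels are free variables.
    """
    M, N = u.shape
    F, g = 0.0, np.zeros_like(u)
    for i, j in zip(*np.nonzero(noisy)):
        for m, n in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if not (0 <= m < M and 0 <= n < N):
                continue
            if noisy[m, n]:            # T^1 term: each ordered pair visited once
                d = u[i, j] - u[m, n]
                F += phi(d, alpha)
                g[i, j] += dphi(d, alpha)
                g[m, n] -= dphi(d, alpha)
            else:                      # T^0 term, with the factor 2 from (2)
                d = u[i, j] - y[m, n]
                F += 2.0 * phi(d, alpha)
                g[i, j] += 2.0 * dphi(d, alpha)
    return F, g
```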

Now we are in a position to present some properties of the smooth edge-preserving regularization functional (2), which have been discussed in [10].

Proposition 1.1. If $\varphi_\alpha(u)$ is second-order Lipschitz continuous, continuously differentiable, convex, strictly convex, or coercive, then the functional $F_\alpha(u)$ is second-order Lipschitz continuous, continuously differentiable, convex, strictly convex, or coercive, respectively.

Proposition 1.2. If $\varphi_\alpha(u)$ is even, continuous, and strictly increasing with respect to $|u|$, then a global minimum of $F_\alpha(u)$ exists and any global minimizer $u^*$ lies in the dynamic range, i.e., $u^*_{i,j} \in [t_{min}, t_{max}]$ for all $(i,j) \in \mathcal{N}$.

The outline of the paper is as follows. We present the spectral conjugate gradient algorithm in the next section. In Section 3, the global convergence of the proposed algorithm is studied. Section 4 presents numerical results illustrating that our method is practical and promising. Finally, we close with a conclusion section.

II. ALGORITHM

Conjugate gradient methods are quite useful in large-scale unconstrained optimization. Consider the general unconstrained problem

$$\min\{f(x) \mid x \in \Re^n\}, \qquad (3)$$

where $f : \Re^n \to \Re$ is smooth and its gradient is available.



In order to accelerate the conjugate gradient method, spectral conjugate gradient methods have been given by Birgin and Martinez [11] and then studied by Andrei [12] and Yu et al. [13], [14]. In this paper we consider a spectral conjugate gradient method of the following form:

$$d_k = \begin{cases} -g_k & \text{for } k = 1, \\ -\frac{1}{\delta_k} g_k + \beta_k d_{k-1} & \text{for } k \ge 2, \end{cases} \qquad (4)$$

$$x_{k+1} = x_k + \alpha_k d_k, \qquad (5)$$

with

$$\alpha_k = -\frac{g_k^T d_k}{\delta_k \|d_k\|^2_{\Gamma_k}}, \qquad (6)$$

$$\delta_k = \frac{y_{k-1}^T s_{k-1}}{\|s_{k-1}\|^2}, \qquad (7)$$

$$\beta_k = \frac{\|g_k\|^2}{\delta_k\, d_{k-1}^T y^*_{k-1}}, \qquad (8)$$

$$y_{k-1} = g_k - g_{k-1}, \qquad s_{k-1} = x_k - x_{k-1}, \qquad (9)$$

$$y^*_{k-1} = y_{k-1} + \Phi_{k-1}, \qquad \Phi_{k-1} = \max(v_{k-1}, 0)\, s_{k-1}, \qquad (10)$$

$$v_{k-1} = \frac{a\, (g_k + g_{k-1})^T s_{k-1} - b\, (f_k - f_{k-1})}{\|s_{k-1}\|^2}, \qquad (11)$$

where $\alpha_k$ is a steplength, $\beta_k$ is a scalar, $a$ and $b$ are two positive parameters, $g_k$ denotes the gradient of $f$ at $x_k$, and $f_k$ denotes $f(x_k)$. The form of $y^*_{k-1}$ appeared in [15], and it possesses some interesting properties.

The SCG algorithm can be written as follows.

Algorithm 2.1 (SCG method)
Step 1. (Initial step) Choose an initial point $x_0 \in \Re^n$, set $\alpha_0 = \sqrt{99}/16$ and $d_0 = -g_0$, and let $k := 1$. Compute $x_k = x_{k-1} + \alpha_{k-1} d_{k-1}$.
Step 2. (Termination test) If $\|g_k\| = 0$, stop.
Step 3. (Generating direction) Compute $\delta_k$ and generate $\beta_k$ by formulas (7)-(11). Set $d_k = -\frac{g_k}{\delta_k} + \beta_k d_{k-1}$.
Step 4. (Iteration) Compute $\alpha_k$ via (6) and set $x_{k+1} = x_k + \alpha_k d_k$. Then let $k := k + 1$ and go to Step 2.

We choose $\Gamma_k \equiv I$ (the identity matrix) in this algorithm.
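As a sketch, Algorithm 2.1 with $\Gamma_k \equiv I$ can be realized as follows (Python). The flattening of the free pixels into a vector, and the tolerance-based stopping test in place of the exact test $\|g_k\| = 0$ (we borrow the practical criterion (51) of Section IV), are our assumptions.

```python
import numpy as np

def scg(f_and_grad, x0, a=3.0, b=6.0, tol=1e-6, max_iter=500):
    """Spectral conjugate gradient method (Algorithm 2.1) with Gamma_k = I.

    f_and_grad(x) must return the pair (f(x), grad f(x)).
    The defaults a = 3, b = 6 follow the choices in Section IV.
    """
    x_prev = np.asarray(x0, dtype=float)
    f_prev, g_prev = f_and_grad(x_prev)
    d = -g_prev                                    # d_0 = -g_0
    x = x_prev + (np.sqrt(99.0) / 16.0) * d        # Step 1, alpha_0 = sqrt(99)/16
    f, g = f_and_grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol * (1.0 + abs(f)):   # Step 2 (practical test)
            break
        s = x - x_prev                             # s_{k-1}, cf. (9)
        y = g - g_prev                             # y_{k-1}, cf. (9)
        delta = y.dot(s) / s.dot(s)                # spectral parameter, cf. (7)
        v = (a * (g + g_prev).dot(s) - b * (f - f_prev)) / s.dot(s)   # cf. (11)
        y_star = y + max(v, 0.0) * s               # modified secant vector, cf. (10)
        beta = g.dot(g) / (delta * d.dot(y_star))  # cf. (8)
        d = -g / delta + beta * d                  # Step 3, cf. (4)
        alpha = -g.dot(d) / (delta * d.dot(d))     # Step 4, cf. (6) with Gamma_k = I
        x_prev, f_prev, g_prev = x, f, g
        x = x + alpha * d
        f, g = f_and_grad(x)
    return x
```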

III. GLOBAL CONVERGENCE

In this section, we study the global convergence of Algorithm 2.1. We make the following assumptions.

Assumption 3.1. The level set

$$\Omega = \{x \mid f(x) \le f(x_0)\}$$

is contained in a bounded convex set $D$.

Assumption 3.2. The function $f$ in (3) is continuously differentiable on $D$ and there exists a constant $L > 0$ such that

$$\|g_{k+1} - g_k\| \le L \|x_{k+1} - x_k\|$$

for any $x_k, x_{k+1} \in D$.

Assumption 3.3. The function $f$ in (3) is strongly convex, i.e., there exists a constant $\gamma > 0$ such that

$$\gamma \|x_{k+1} - x_k\|^2 \le (g_{k+1} - g_k)^T (x_{k+1} - x_k)$$

for any $x_k, x_{k+1} \in D$.

Let $\{\Gamma_k\}$ be a sequence of positive definite matrices. Assume that there exist $\rho_{min} > 0$ and $\rho_{max} > 0$ such that for all $p \in \Re^n$,

$$\rho_{min}\, p^T p \le p^T \Gamma_k p \le \rho_{max}\, p^T p. \qquad (12)$$

This condition is satisfied, for example, if $\Gamma_k = \Gamma$ and $\Gamma$ is positive definite. Let the steplength $\alpha_k$ be computed by (6), where

$$\|d_k\|_{\Gamma_k} := \sqrt{d_k^T \Gamma_k d_k}, \qquad \delta_k > \frac{L}{\rho_{min}}. \qquad (13)$$

Note that this specification of $\delta_k$ ensures $L/(\delta_k \rho_{min}) < 1$.

Theorem 3.1: Suppose that $x_k$ is given by (4), (5) and (6). Then

$$g_{k+1}^T d_k = \tau_k\, g_k^T d_k \qquad (14)$$

holds for all $k$, where $\tau_k = 1 - \frac{\psi_k}{\delta_k}$ and

$$\psi_k = \begin{cases} 0 & \text{if } \alpha_k = 0, \\[4pt] \dfrac{(g_{k+1} - g_k)^T (x_{k+1} - x_k)}{\|x_{k+1} - x_k\|^2} & \text{if } \alpha_k \ne 0. \end{cases} \qquad (15)$$

Proof: The case $\alpha_k = 0$ implies $\tau_k = 1$ and $g_{k+1} = g_k$, so (14) holds. We now prove the case $\alpha_k \ne 0$. From (5) and (6), we have

$$\begin{aligned}
g_{k+1}^T d_k &= g_k^T d_k + (g_{k+1} - g_k)^T d_k \\
&= g_k^T d_k + \alpha_k^{-1} (g_{k+1} - g_k)^T (x_{k+1} - x_k) \\
&= g_k^T d_k + \alpha_k^{-1} \psi_k \|x_{k+1} - x_k\|^2 \\
&= g_k^T d_k - \frac{g_k^T d_k}{\delta_k \|d_k\|^2}\, \psi_k \|d_k\|^2 \\
&= \left(1 - \frac{\psi_k}{\delta_k}\right) g_k^T d_k.
\end{aligned} \qquad (16)$$

Corollary 3.1: If Assumption 3.2 holds and $\alpha_k \ne 0$, then

$$\tau_k \ge 1 - \frac{L}{\delta_k \rho_{min}} \qquad (17)$$

holds for all $k$; and under Assumption 3.3,

$$\tau_k \le 1 - \frac{\gamma}{\delta_k}. \qquad (18)$$

From (17), (18) and the requirement $\delta_k > L/\rho_{min}$ in (13), we easily obtain

$$0 < \tau_k \le 1 + \frac{L}{\delta_k \rho_{min}}. \qquad (19)$$

Theorem 3.2: Suppose that Assumption 3.2 holds and $x_k$ is given by (4), (5) and (6). Then

$$\sum_{d_k \ne 0} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \qquad (20)$$

Proof: By the mean-value theorem we obtain

$$f(x_{k+1}) - f(x_k) = g^T (x_{k+1} - x_k), \qquad (21)$$


where $g = \nabla f(x)$ for some $x \in [x_k, x_{k+1}]$. Assumption 3.2, the Cauchy-Schwarz inequality, (5) and (6) yield

$$\begin{aligned}
g^T (x_{k+1} - x_k) &= g_k^T (x_{k+1} - x_k) + (g - g_k)^T (x_{k+1} - x_k) \\
&\le g_k^T (x_{k+1} - x_k) + \|g - g_k\|\, \|x_{k+1} - x_k\| \\
&\le \alpha_k g_k^T d_k + L \alpha_k^2 \|d_k\|^2 \\
&= \alpha_k g_k^T d_k \left(1 - \frac{L \|d_k\|^2}{\delta_k \|d_k\|^2_{\Gamma_k}}\right) \\
&\le \alpha_k g_k^T d_k \left(1 - \frac{L}{\delta_k \rho_{min}}\right) \\
&= -\frac{(g_k^T d_k)^2}{\delta_k \|d_k\|^2_{\Gamma_k}} \left(1 - \frac{L}{\delta_k \rho_{min}}\right),
\end{aligned} \qquad (22)$$

i.e.,

$$f(x_{k+1}) - f(x_k) \le -\frac{(g_k^T d_k)^2}{\delta_k \|d_k\|^2_{\Gamma_k}} \left(1 - \frac{L}{\delta_k \rho_{min}}\right), \qquad (23)$$

which implies $f(x_{k+1}) - f(x_k) < 0$. Since $f$ is decreasing and, by Assumption 3.1, bounded below on $\Omega$, the limit $\lim_{k\to\infty} f(x_k)$ exists. Then we obtain

$$\frac{(g_k^T d_k)^2}{\|d_k\|^2} \le \rho_{max} \frac{(g_k^T d_k)^2}{\|d_k\|^2_{\Gamma_k}} \le \frac{\delta_k \rho_{max}}{1 - L/(\delta_k \rho_{min})} \left[f(x_k) - f(x_{k+1})\right]. \qquad (24)$$

Hence the inequality (20) is valid.

Theorem 3.3: Suppose that the Assumptions hold and $x_k$ is given by (4), (5) and (6). Then $\liminf_{k\to\infty} \|g_k\| \ne 0$ implies

$$\sum_{d_k \ne 0} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \qquad (25)$$

Proof: If $\liminf_{k\to\infty} \|g_k\| \ne 0$, there exists a constant $\theta > 0$ such that $\|g_k\| \ge \theta$ for all $k$. Let $\lambda_k = |g_k^T d_k| / \|d_k\|$; since (20) forces $\lambda_k \to 0$, Theorem 3.2 gives $\lambda_k \le \theta/4$ for all sufficiently large $k$. From Theorem 3.1 and (19) we have

$$|g_k^T d_{k-1}| = |\tau_{k-1}\, g_{k-1}^T d_{k-1}| \le \left(1 + \frac{L}{\delta_{k-1} \rho_{min}}\right) |g_{k-1}^T d_{k-1}| < 2\, |g_{k-1}^T d_{k-1}|. \qquad (26)$$

Considering (4), we have

$$g_k = \delta_k \beta_k d_{k-1} - \delta_k d_k = \delta_k (\beta_k d_{k-1} - d_k). \qquad (27)$$

Multiplying both sides of (27) by $g_k^T$, we see that

$$\|g_k\|^2 = \delta_k (\beta_k\, g_k^T d_{k-1} - g_k^T d_k). \qquad (28)$$

Then (26) and (28) yield

$$\begin{aligned}
\frac{\|g_k\|^2}{\|d_k\|} &= \frac{\delta_k (\beta_k\, g_k^T d_{k-1} - g_k^T d_k)}{\|d_k\|} \\
&\le \frac{\delta_k \left[2 |\beta_k|\, |g_{k-1}^T d_{k-1}| + |g_k^T d_k|\right]}{\|d_k\|} \\
&= \delta_k \left[\frac{|g_k^T d_k|}{\|d_k\|} + \frac{2 |g_{k-1}^T d_{k-1}|}{\|d_{k-1}\|} \frac{\|\beta_k d_{k-1}\|}{\|d_k\|}\right] \\
&= \delta_k \left[\lambda_k + 2 \lambda_{k-1} \frac{\|d_k + g_k/\delta_k\|}{\|d_k\|}\right] \\
&\le \delta_k \left[\lambda_k + 2 \lambda_{k-1} \left(1 + \frac{1}{\delta_k} \frac{\|g_k\|^2}{\|d_k\|\, \|g_k\|}\right)\right] \\
&\le \delta_k \left(\lambda_k + 2 \lambda_{k-1} + 2 \lambda_{k-1} \frac{\|g_k\|^2}{\theta\, \delta_k \|d_k\|}\right),
\end{aligned} \qquad (29)$$

i.e.,

$$\frac{\|g_k\|^2}{\|d_k\|} \le \delta_k \left(\lambda_k + 2 \lambda_{k-1} + 2 \cdot \frac{\theta}{4} \cdot \frac{\|g_k\|^2}{\theta\, \delta_k \|d_k\|}\right) \le \delta_k \lambda_k + 2 \delta_k \lambda_{k-1} + \frac{\|g_k\|^2}{2 \|d_k\|}, \qquad (30)$$

i.e.,

$$\frac{\|g_k\|^2}{\|d_k\|} \le 4 \delta_k (\lambda_k + \lambda_{k-1}). \qquad (31)$$

Then

$$\frac{\|g_k\|^4}{\|d_k\|^2} \le 32\, \delta_k^2 (\lambda_k^2 + \lambda_{k-1}^2) \qquad (32)$$

holds for all sufficiently large $k$. Now inequality (25) follows from (32).

Theorem 3.4: Suppose that the Assumptions hold and the sequence $\{x_k\}$ is generated by Algorithm 2.1. Then

$$\liminf_{k\to\infty} \|g_k\| = 0.$$

Proof: Suppose to the contrary that $\liminf_{k\to\infty} \|g_k\| \ne 0$. Then there exists a constant $\theta > 0$ such that $\|g_k\| \ge \theta$ for all $k$, and by Theorem 3.3 we have

$$\sum_{d_k \ne 0} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \qquad (33)$$

We consider the following two cases.

First, if $v_{k-1} < 0$, i.e., $y^*_{k-1} = y_{k-1}$, then by (4), (8), (18) and Theorem 3.1 we obtain

$$\begin{aligned}
g_k^T d_k &= g_k^T \left(-\frac{1}{\delta_k} g_k + \frac{\|g_k\|^2}{\delta_k\, d_{k-1}^T y_{k-1}}\, d_{k-1}\right) \\
&= -\frac{1}{\delta_k} \|g_k\|^2 \left(1 - \frac{\tau_{k-1}\, g_{k-1}^T d_{k-1}}{(\tau_{k-1} - 1)\, g_{k-1}^T d_{k-1}}\right) \\
&= -\frac{1}{\delta_k (1 - \tau_{k-1})} \|g_k\|^2.
\end{aligned} \qquad (34)$$

(4), (8), (34) and Theorem 3.1 yield

$$\begin{aligned}
\|d_k\|^2 &= \left\|-\frac{1}{\delta_k} g_k + \frac{\|g_k\|^2}{\delta_k\, d_{k-1}^T y_{k-1}}\, d_{k-1}\right\|^2 \\
&= \frac{\|g_k\|^2}{\delta_k^2} + \frac{\|g_k\|^4 \|d_{k-1}\|^2}{\delta_k^2 (d_{k-1}^T y_{k-1})^2} - \frac{2 \|g_k\|^2}{\delta_k^2} \frac{g_k^T d_{k-1}}{d_{k-1}^T y_{k-1}} \\
&= \frac{\|g_k\|^2}{\delta_k^2} + \frac{\|g_k\|^4 \|d_{k-1}\|^2}{\delta_k^2 (d_{k-1}^T y_{k-1})^2} + \frac{2 \tau_{k-1}}{1 - \tau_{k-1}} \frac{\|g_k\|^2}{\delta_k^2} \\
&= \frac{\|d_{k-1}\|^2 \|g_k\|^4}{\delta_k^2 (\tau_{k-1} - 1)^2 (g_{k-1}^T d_{k-1})^2} + \frac{\|g_k\|^2}{\delta_k^2} \frac{1 + \tau_{k-1}}{1 - \tau_{k-1}} \\
&= \frac{\delta_{k-1}^2 (1 - \tau_{k-2})^2}{\delta_k^2 (1 - \tau_{k-1})^2} \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4}\, \|g_k\|^4 + \frac{1}{\delta_k^2} \frac{1 + \tau_{k-1}}{1 - \tau_{k-1}}\, \|g_k\|^2.
\end{aligned} \qquad (35)$$

The above relation can be rewritten as

$$\frac{\|d_k\|^2}{\|g_k\|^4} = \frac{\delta_{k-1}^2 (1 - \tau_{k-2})^2}{\delta_k^2 (1 - \tau_{k-1})^2} \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{1}{\delta_k^2} \frac{1 + \tau_{k-1}}{1 - \tau_{k-1}} \frac{1}{\|g_k\|^2}. \qquad (36)$$

Hence we get

$$\begin{aligned}
\frac{\|d_k\|^2}{\|g_k\|^4} &\le \frac{\delta_{k-1}^2 (1 - \tau_{k-2})^2}{\delta_k^2 (1 - \tau_{k-1})^2} \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{1}{\delta_k^2} \frac{1 + \tau_{k-1}}{1 - \tau_{k-1}} \frac{1}{\theta^2} \\
&\le \frac{\delta_{k-2}^2 (1 - \tau_{k-3})^2}{\delta_k^2 (1 - \tau_{k-1})^2} \frac{\|d_{k-2}\|^2}{\|g_{k-2}\|^4} + \frac{1}{\delta_k^2} \left[\frac{1 - \tau_{k-1}^2}{(1 - \tau_{k-1})^2} + \frac{1 - \tau_{k-2}^2}{(1 - \tau_{k-1})^2}\right] \frac{1}{\theta^2} \\
&\le \cdots \\
&\le \left(\frac{\delta_2}{\delta_k}\right)^2 \left(\frac{1 - \tau_1}{1 - \tau_{k-1}}\right)^2 \frac{\|d_2\|^2}{\|g_2\|^4} + \frac{1}{\delta_k^2} \left[\frac{1 - \tau_{k-1}^2}{(1 - \tau_{k-1})^2} + \cdots + \frac{1 - \tau_2^2}{(1 - \tau_{k-1})^2}\right] \frac{1}{\theta^2}.
\end{aligned} \qquad (37)$$

According to Corollary 3.1, the quantities

$$\left(\frac{1 - \tau_1}{1 - \tau_{k-1}}\right)^2, \quad \frac{1 - \tau_{k-1}^2}{(1 - \tau_{k-1})^2}, \quad \cdots, \quad \frac{1 - \tau_2^2}{(1 - \tau_{k-1})^2} \qquad (38)$$


have a common upper bound. Denote this common bound by $\varpi$, and set $c = \varpi/\theta^2$ and $d = \varpi \|d_2\|^2 / \|g_2\|^4$. From (37) we obtain

$$\frac{\|g_k\|^4}{\|d_k\|^2} \ge \frac{\delta_k^2}{(k-2)\,c + \delta_2^2\, d} \ge \frac{L^2}{\left[(k-2)\,c + L^2 d\right] \rho_{min}^2}, \qquad (39)$$

which indicates

$$\sum_{d_k \ne 0} \frac{\|g_k\|^4}{\|d_k\|^2} = \infty. \qquad (40)$$

This contradicts (33).

Second, if $v_{k-1} > 0$, i.e., $y^*_{k-1} = y_{k-1} + v_{k-1} s_{k-1}$, then by (4), (8), (18) and Theorem 3.1 we get

$$\begin{aligned}
g_k^T d_k &= g_k^T \left(-\frac{1}{\delta_k} g_k + \frac{\|g_k\|^2\, d_{k-1}}{\delta_k \left(d_{k-1}^T y_{k-1} + v_{k-1} \alpha_{k-1} \|d_{k-1}\|^2\right)}\right) \\
&= -\frac{\|g_k\|^2}{\delta_k} \left(1 - \frac{\tau_{k-1}\, g_{k-1}^T d_{k-1}}{(\tau_{k-1} - 1)\, g_{k-1}^T d_{k-1} + v_{k-1} \alpha_{k-1} \|d_{k-1}\|^2}\right) \\
&= -\frac{\|g_k\|^2}{\delta_k} \left[1 + \frac{\delta_{k-1} \tau_{k-1} \|d_{k-1}\|^2_{\Gamma_{k-1}}}{(1 - \tau_{k-1})\, \delta_{k-1} \|d_{k-1}\|^2_{\Gamma_{k-1}} + v_{k-1} \|d_{k-1}\|^2}\right] \\
&\le -\frac{1}{\delta_k} \|g_k\|^2.
\end{aligned} \qquad (41)$$

Furthermore, we have

$$-g_k^T d_k \ge \|g_k\|^2 / \delta_k \;\Longrightarrow\; (g_k^T d_k)^2 \ge \|g_k\|^4 / \delta_k^2. \qquad (42)$$

(4), (8) and Theorem 3.1 yield

$$\begin{aligned}
\|d_k\|^2 &= \left\|-\frac{1}{\delta_k} g_k + \frac{\|g_k\|^2}{\delta_k\, d_{k-1}^T (y_{k-1} + v_{k-1} s_{k-1})}\, d_{k-1}\right\|^2 \\
&= \frac{\|g_k\|^2}{\delta_k^2} - \frac{2 \|g_k\|^2}{\delta_k^2} \frac{g_k^T d_{k-1}}{d_{k-1}^T (y_{k-1} + v_{k-1} s_{k-1})} + \frac{\|g_k\|^4 \|d_{k-1}\|^2}{\delta_k^2 \left(d_{k-1}^T y_{k-1} + v_{k-1}\, d_{k-1}^T s_{k-1}\right)^2}.
\end{aligned} \qquad (43)$$

The above relation can be rewritten as

$$\begin{aligned}
\frac{\|d_k\|^2}{\|g_k\|^4} &= \frac{1}{\delta_k^2 \|g_k\|^2} - \frac{2 \tau_{k-1}\, g_{k-1}^T d_{k-1}}{\delta_k^2 \|g_k\|^2 \left[(\tau_{k-1} - 1)\, g_{k-1}^T d_{k-1} + \alpha_{k-1} v_{k-1} \|d_{k-1}\|^2\right]} \\
&\quad + \frac{\|d_{k-1}\|^2}{\delta_k^2 \left[(\tau_{k-1} - 1)\, g_{k-1}^T d_{k-1} + \alpha_{k-1} v_{k-1} \|d_{k-1}\|^2\right]^2} \\
&\le \frac{1}{\delta_k^2 \|g_k\|^2} + \frac{\|d_{k-1}\|^2}{\delta_k^2 \left[(\tau_{k-1} - 1)\, g_{k-1}^T d_{k-1} + \alpha_{k-1} v_{k-1} \|d_{k-1}\|^2\right]^2} \\
&\le \frac{\|d_{k-1}\|^2}{\delta_k^2 (1 - \tau_{k-1})^2 (g_{k-1}^T d_{k-1})^2} + \frac{1}{\delta_k^2} \frac{1}{\|g_k\|^2}.
\end{aligned} \qquad (44)$$

From (42) and (44) we have

$$\begin{aligned}
\frac{\|d_k\|^2}{\|g_k\|^4} &\le \frac{\delta_{k-1}^2}{\delta_k^2} \frac{\|d_{k-1}\|^2}{(1 - \tau_{k-1})^2 \|g_{k-1}\|^4} + \frac{1}{\delta_k^2} \frac{1}{\|g_k\|^2} \\
&\le \frac{\delta_{k-1}^2}{\delta_k^2} \frac{1}{(1 - \tau_{k-1})^2} \frac{\|d_{k-1}\|^2}{\|g_{k-1}\|^4} + \frac{1}{\delta_k^2} \frac{1}{\theta^2} \\
&\le \frac{\delta_{k-2}^2}{\delta_k^2} \frac{1}{(1 - \tau_{k-1})^2 (1 - \tau_{k-2})^2} \frac{\|d_{k-2}\|^2}{\|g_{k-2}\|^4} + \frac{1}{\delta_k^2} \left[1 + \frac{1}{(1 - \tau_{k-1})^2}\right] \frac{1}{\theta^2} \\
&\le \frac{\delta_{k-3}^2}{\delta_k^2} \frac{1}{(1 - \tau_{k-1})^2 (1 - \tau_{k-2})^2 (1 - \tau_{k-3})^2} \frac{\|d_{k-3}\|^2}{\|g_{k-3}\|^4} \\
&\quad + \frac{1}{\delta_k^2} \left[1 + \frac{1}{(1 - \tau_{k-1})^2} + \frac{1}{(1 - \tau_{k-1})^2 (1 - \tau_{k-2})^2}\right] \frac{1}{\theta^2} \\
&\le \cdots \\
&\le \frac{\delta_2^2}{\delta_k^2} \frac{1}{\prod_{i=1}^{k-2} (1 - \tau_{k-i})^2} \frac{\|d_2\|^2}{\|g_2\|^4} + \frac{1}{\delta_k^2} \left[1 + \sum_{j=1}^{k-3} \frac{1}{\prod_{i=1}^{j} (1 - \tau_{k-i})^2}\right] \frac{1}{\theta^2}.
\end{aligned} \qquad (45)$$

According to Corollary 3.1, the quantities

$$\frac{1}{\prod_{i=1}^{k-2} (1 - \tau_{k-i})^2}, \quad \frac{1}{(1 - \tau_{k-1})^2}, \quad \frac{1}{(1 - \tau_{k-1})^2 (1 - \tau_{k-2})^2}, \quad \cdots, \quad \frac{1}{\prod_{i=1}^{k-3} (1 - \tau_{k-i})^2}$$

have a common upper bound. Denoting this common bound by $\omega$, from (45) we obtain

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \frac{1}{\delta_k^2} \left\{\omega \delta_2^2 \frac{\|d_2\|^2}{\|g_2\|^4} + \left[1 + (k - 4)\,\omega\right] \frac{1}{\theta^2}\right\}. \qquad (46)$$

We denote $\omega \delta_2^2 \|d_2\|^2 / \|g_2\|^4 + [1 + (k - 4)\,\omega]/\theta^2$ by $\Omega(k)$. Then the above relation can be rewritten as

$$\frac{\|d_k\|^2}{\|g_k\|^4} \le \frac{1}{\delta_k^2}\, \Omega(k) \le \frac{\rho_{min}^2\, \Omega(k)}{L^2}, \qquad (47)$$

i.e.,

$$\frac{\|g_k\|^4}{\|d_k\|^2} \ge \frac{L^2}{\rho_{min}^2\, \Omega(k)}. \qquad (48)$$

From (48) we have

$$\sum_{d_k \ne 0} \frac{\|g_k\|^4}{\|d_k\|^2} = \infty. \qquad (49)$$

Again this contradicts (33). Thus Theorem 3.4 is valid, and the proof is complete.

IV. NUMERICAL EXPERIMENTS

In this section, numerical experiments are presented to show the performance of Algorithm 2.1 for salt-and-pepper noise removal. The experiments were carried out in Matlab 7.0, and all test images are 512×512 gray-level images.

According to the numerical comparison in [4], the Polak-Ribiere conjugate gradient method (PRCG) is the most efficient method there. Here we test the SCG method and compare it with the PRCG method. The computational results are given in Table I. We report the number of iterations (NI), the CPU-time required for the whole denoising process, and the PSNR of the restored images.

It should be stressed that we are mainly concerned with the speed of solving the minimization of (2). We choose $\varphi_\alpha(u) = \sqrt{\alpha + u^2}$ with $\alpha = 100$ and set the parameters $a = 3$ and $b = 6$ in (11) throughout the tests.
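Tying the earlier sketches together, a hypothetical end-to-end run of the two-phase method could look like this (the adapter `fg`, which maps the flat vector of free pixels to the image and back, and the rough median initialization are our own conventions):

```python
import numpy as np

# y: observed noisy image (2-D uint8 array), loaded elsewhere.
noisy = detect_impulse_pixels(y)             # phase 1: detection
u0 = y.astype(float)
u0[noisy] = float(np.median(y))              # rough initial fill of the noisy set
idx = np.nonzero(noisy)

def fg(x_free):
    """Adapter: evaluate (2) as a function of the flat vector of noisy pixels."""
    u = u0.copy()
    u[idx] = x_free
    F, g = F_and_grad(u, y.astype(float), noisy, alpha=100.0)
    return F, g[idx]

x_opt = scg(fg, u0[idx], a=3.0, b=6.0)       # phase 2: minimization
u_restored = u0.copy()
u_restored[idx] = x_opt
```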

To assess the restoration performance quantitatively, we use the PSNR (peak signal-to-noise ratio; see details in [1]) defined as

$$PSNR = 10 \log_{10} \frac{255^2}{\frac{1}{MN} \sum_{i,j} \left(u^r_{i,j} - u^*_{i,j}\right)^2}, \qquad (50)$$

where $u^r_{i,j}$ and $u^*_{i,j}$ denote the pixel values of the restored image and the original image, respectively. The stopping criteria for both methods are

$$\|g_k\| \le 10^{-6} (1 + |f(u_k)|) \qquad (51)$$

and

$$\frac{|f(u_k) - f(u_{k-1})|}{|f(u_k)|} \le 10^{-6}. \qquad (52)$$

Fig. 1. Number of iterations of the PRCG/SCG methods for the ten test images in Table I.

Fig. 2. CPU-time of the PRCG/SCG methods for the ten test images in Table I.

Table I indicates that the proposed SCG method is roughly three to five times faster than the well-known PRCG method on all of the test images, while the PSNR values attained by the two methods are very similar. Figs. 1 and 2 also show that the SCG method is less sensitive to the particular problem. Fig. 3 shows some of the original test images, the corresponding noisy images with noise level $r = 70\%$, and the restoration results of the PRCG and SCG methods. These results show that the proposed spectral conjugate gradient method (SCG) is feasible and restores corrupted images quite well in an efficient manner.

Fig. 3. The first line is the original images: Head DTI / Brain MRI / Lena. The second line is the noisy images with noise level r = 70%: Head DTI / Brain MRI / Lena. The third line is the restored images via PRCG (the PSNR values are 32.892 dB, 35.1812 dB and 31.167 dB, respectively). The fourth line is the restored images via SCG (the PSNR values are 32.712 dB, 35.315 dB and 31.207 dB, respectively).

V. CONCLUSION

In this paper, we present a spectral conjugate gradient method (SCG) to minimize the smooth regularization functional for salt-and-pepper noise removal. Its global convergence can be established under suitable conditions. Numerical experiments illustrate that the SCG method can significantly reduce the CPU-time of image restoration while attaining essentially the same restored image quality.

Our motivation comes from [14]-[16]. In this paper we simply let $\Gamma_k \equiv I$ (the identity matrix), which already brings good results. One could try new forms of $\Gamma_k$, for instance updated by quasi-Newton methods such as the classical BFGS and DFP update formulas.

TABLE I
PERFORMANCE OF SALT-AND-PEPPER DENOISING VIA PRCG AND SCG

Test Image   Noise Level   PRCG: PSNR (dB) / CPU-time / NI   SCG: PSNR (dB) / CPU-time / NI
Boat             70%        27.938  /  46.3281  / 194         27.978  / 11.6533 / 45
Bridge           70%        24.998  /  60.672   / 255         24.853  / 14.461  / 55
Cameraman        70%        30.755  /  76.2344  / 228         30.799  / 12.012  / 48
Goldhill         70%        29.792  /  48.1094  / 144         29.857  /  9.5005 / 38
Head DTI         70%        32.892  /  47.8281  / 196         32.712  / 12.199  / 48
Lena             70%        31.167  /  42.2813  / 177         31.207  / 12.4021 / 49
Pepper           70%        30.3889 /  78.2188  / 234         30.417  / 12.729  / 50
Texture          70%        19.84   / 104.9214  / 306         19.852  / 19.0165 / 56
Brain MRI        70%        35.1812 /  41.9688  / 123         35.315  /  9.297  / 37
Brain MRI        90%        30.247  /  59.3125  / 257         30.222  / 12.6985 / 50

REFERENCES

[1] A. Bovik, Handbook of Image and Video Processing. Academic, New York, 2000.

[2] R.H. Chan, C.W. Ho and M. Nikolova, "Salt-and-pepper noise removal by median-type noise detectors and detail-preserving regularization," IEEE Trans. Image Process., vol. 14, pp. 1479-1485, 2005.

[3] H. Hwang and R.A. Haddad, "Adaptive median filters: New algorithms and results," IEEE Trans. Image Process., vol. 4, pp. 499-502, 1995.

[4] J.F. Cai, R.H. Chan and C.D. Fiore, "Minimization of a detail-preserving regularization function for impulse noise removal," J. Math. Imaging Vision, vol. 27, pp. 79-91, 2007.

[5] T. Lukić, J. Lindblad and N. Sladoje, "Regularized image denoising based on spectral gradient optimization," Inverse Problems, vol. 27, 2011.

[6] M. Black and A. Rangarajan, "On the unification of line processes, outlier rejection, and robust statistics with applications to early vision," International Journal of Computer Vision, vol. 19, pp. 57-91, 1996.

[7] C. Bouman and K. Sauer, "On discontinuity-adaptive smoothness priors in computer vision," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 576-586, 1995.

[8] P. Charbonnier, L. Blanc-Feraud, G. Aubert and M. Barlaud, "Deterministic edge-preserving regularization in computed imaging," IEEE Transactions on Image Processing, vol. 6, pp. 298-311, 1997.

[9] P.J. Green, "Bayesian reconstructions from emission tomography data using a modified EM algorithm," IEEE Transactions on Medical Imaging, vol. MI-9, pp. 84-93, 1990.



[10] J.F. Cai, R.H. Chan and B. Morini, "Minimization of an edge-preserving regularization function by conjugate gradient type methods," in: Image Processing Based on Partial Differential Equations, Mathematics and Visualization, Springer, Berlin, Heidelberg, pp. 109-122, 2007.

[11] E.G. Birgin and J.M. Martinez, "A spectral conjugate gradient method for unconstrained optimization," Appl. Math. Optim., vol. 43, pp. 117-128, 2001.

[12] N. Andrei, "Scaled memoryless BFGS preconditioned conjugate gradient algorithm for unconstrained optimization," Optim. Methods Softw., vol. 22, pp. 561-571, 2007.

[13] G.H. Yu, L.T. Guan and W.F. Chen, "Spectral conjugate gradient methods with sufficient descent property for large-scale unconstrained optimization," Optim. Methods Softw., vol. 23, pp. 275-293, 2008.

[14] G.H. Yu, J.H. Huang and Y. Zhou, "A descent spectral conjugate gradient method for impulse noise removal," Applied Mathematics Letters, vol. 23, pp. 555-560, 2010.

[15] G.Y. Li, C.M. Tang and Z.X. Wei, "New conjugate condition and related new conjugate gradient methods for unconstrained optimization," Journal of Computational and Applied Mathematics, vol. 202, pp. 523-529, 2007.

[16] J. Sun and J.P. Zhang, "Global convergence of conjugate gradient methods without line search," Annals of Operations Research, vol. 103, pp. 161-173, 2001.
