Research Article

Inexact Version of Bregman Proximal Gradient Algorithm

S. Kabbadj

Department of Mathematics, Faculty of Sciences of Meknes, B.P. 11201, Meknes, Morocco

Correspondence should be addressed to S. Kabbadj; kabbajsaid63@yahoo.com

Received 31 August 2019; Revised 19 January 2020; Accepted 21 January 2020; Published 1 April 2020

Academic Editor: Allan Peterson

Abstract and Applied Analysis, Volume 2020, Article ID 1963980, 11 pages, https://doi.org/10.1155/2020/1963980

Copyright © 2020 S. Kabbadj. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The Bregman Proximal Gradient (BPG) algorithm is an algorithm for minimizing the sum of two convex functions, with one being nonsmooth. The supercoercivity of the objective function is necessary for the convergence of this algorithm, precluding its use in many applications. In this paper, we give an inexact version of the BPG algorithm while circumventing the condition of supercoercivity by replacing it with a simple condition on the parameters of the problem. Our study covers the existing results, while giving others.

1. Introduction

We consider the following minimization problem:

inf{ Ψ(x) := f(x) + g(x) : x ∈ ℝ^d },  (1)

where f is a convex, proper, lower-semicontinuous (l.s.c.) function and g is a convex, continuously differentiable function. This problem arises in many applications, including compressed sensing [1], signal recovery [2], and the phase retrieval problem [3]. One classical algorithm for solving this problem is the proximal gradient (PG) method:

x_n := argmin{ f(u) + ⟨∇g(x_{n−1}), u⟩ + (1/(2λ_n))‖u − x_{n−1}‖² },  n ∈ ℕ*,  (2)

where λ_n is the stepsize at each iteration. The proximal gradient method and its variants [4–14] have long been a hot topic in the optimization field due to their simple forms. A central property required in the analysis of gradient methods is the Lipschitz continuity of the gradient of the smooth part g. However, in many applications the differentiable function does not have such a property, e.g., in the broad class of Poisson inverse problems. In [15], by introducing the Bregman distance [16] generated by some reference convex function h, defined by

D_h(x, y) = h(x) − h(y) − ⟨x − y, ∇h(y)⟩,  (3)

the authors could replace the intricate question of Lipschitz continuity of gradients by a convexity condition that is easy to verify, which we call below the LC property. Thereby, they proposed and studied the algorithm called NoLips, defined by

x_n = argmin{ f(u) + ⟨∇g(x_{n−1}), u⟩ + (1/λ_n) D_h(u, x_{n−1}) },  n ∈ ℕ*.  (4)

When g = 0, equation (4) is the Bregman Proximal (BP) algorithm studied in [17–21].

In this article, we give an inexact version of the BPG algorithm, defined by

x_n ∈ ε_n-argmin{ f(u) + ⟨∇g(x_{n−1}), u⟩ + (1/λ_n) D_h(u, x_{n−1}) }.  (5)

While circumventing the condition of supercoercivity required in [15, 22] by replacing it with a simple condition on the parameters of the problem, our study covers the existing results, while giving others.


Our notation is fairly standard: ⟨·, ·⟩ is the scalar product on ℝ^d and ‖·‖ the associated norm. The closure of a set C is denoted by C̄ and its relative interior by ri C. For any convex function f, we denote by:

(1) dom f = {x ∈ ℝ^d : f(x) < +∞} its effective domain;
(2) ∂_ε f(x) = {v : f(y) ≥ f(x) + ⟨v, y − x⟩ − ε, ∀y} its ε-subdifferential;
(3) argmin f = {x ∈ ℝ^d : f(x) = inf f};
(4) ε-argmin f = {x ∈ ℝ^d : f(x) ≤ inf f + ε}.

2. Preliminary

In this section, we present the main convergence results for NoLips.

Definition 1 (see [23]). Let C be a nonempty convex subset of ℝ^d.

(i) A convex function h : ℝ^d → ]−∞, +∞] is of Legendre type on C if it verifies the three following conditions:
(a) C = int(dom h);
(b) h is differentiable on C;
(c) lim ‖∇h(x_i)‖ = +∞ for any sequence {x_i} of C that converges towards a boundary point of C.

(ii) The class of strictly convex functions verifying (a), (b), and (c) is called the class of Legendre functions on C and is denoted by E(C).

Definition 2 (see [23]). Let F : ℝ^d → ]−∞, +∞]; we say that F is supercoercive if

lim inf_{‖x‖→∞} F(x)/‖x‖ = ∞.  (6)

Consider the following assumptions.

Assumption 1.
(i) h : X ⊂ ℝ^d → ]−∞, +∞] is of Legendre type.
(ii) g : X → ]−∞, +∞] is convex, proper, l.s.c., with dom h ⊂ dom g, and differentiable on int(dom h).
(iii) f : X → ]−∞, +∞] is convex, proper, lower semicontinuous (l.s.c.).
(iv) dom f ∩ int(dom h) ≠ ∅.
(v) inf{Ψ(x) : x ∈ dom h} > −∞.

We consider the following minimization problem:
(P) inf{ Ψ(x) := f(x) + g(x) : x ∈ dom h }.

Let the operator Tλ be defined by

Tλ(x) = argmin{ f(u) + ⟨∇g(x), u − x⟩ + (1/λ) D_h(u, x) }.  (7)

Lemma 1 (well-posedness of the method). Under Assumption 1, suppose one of the following assumptions holds:
(i) argmin{Ψ(x) : x ∈ dom h} is nonempty and compact;
(ii) ∀λ > 0, h + λf is supercoercive.
Then the map Tλ defined in (7) is nonempty and single-valued from int(dom h) to int(dom h).

Definition 3. The couple (g, h) verifies a Lipschitz-like Convexity Condition (LC) if ∃L > 0 such that Lh − g is convex on int(dom h).

By posing

prox^h_{λf}(x) := argmin{ f(u) + (1/λ) D_h(u, x) },  (8)

they showed that

Tλ(x) = prox^h_{λf} ∘ prox^h_{λp}(x),  (9)

where p(u) = ⟨∇g(x), u⟩. The operator Tλ thus appears as the composition of two prox operators. The NoLips algorithm then becomes

x_n = T_{λ_n}(x_{n−1}),  ∀n ∈ ℕ*.  (10)
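A minimal Python sketch (ours, not from the paper) of the composition (9) in the Euclidean case h = (1/2)‖·‖²: the first prox reduces to a gradient step on g, so NoLips reduces to the classical proximal gradient method (2). We take f = α‖·‖₁, for which prox^h_{λf} is soft-thresholding; the toy smooth part g below is our own choice.

```python
import numpy as np

def prox_l1(v, t):
    # prox of t*||.||_1 under h = (1/2)||.||^2: componentwise soft-thresholding
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def nolips_step_euclidean(x, grad_g, lam, alpha):
    y = x - lam * grad_g(x)           # prox^h_{lam p}(x), with p(u) = <grad g(x), u>
    return prox_l1(y, lam * alpha)    # prox^h_{lam f}(y)

# One step with the toy smooth part g(x) = (1/2)||x - z||^2:
z = np.array([1.0, -0.5, 0.2])
x1 = nolips_step_euclidean(np.zeros(3), lambda x: x - z, lam=0.5, alpha=0.1)
print(x1)
```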

Assumption 2.
(i) argmin{Ψ(x) : x ∈ dom h} is nonempty and compact, or ∀λ > 0, h + λf is supercoercive.
(ii) For every x ∈ int(dom h) and r ∈ ℝ, the level set L1(x, r) = {y ∈ int(dom h) : D_h(x, y) ≤ r} is bounded.
(iii) If {x_n} converges to some x in int(dom h), then D_h(x, x_n) → 0.
(iv) Reciprocally, if x ∈ int(dom h) and if {x_n} is such that D_h(x, x_n) → 0, then x_n → x.
(v) ∃L > 0 with Lh − g convex on int(dom h) (LC).

Theorem 1 (global convergence). Assume that
(i) cl(dom h) = dom h;
(ii) ∑ λ_n = +∞;
and that Assumptions 1 and 2 are satisfied. Then the sequence {x_n} converges to some solution x* of (P).

Our contribution can be summarized in two essential points:

(1) Improvements of some assumptions.

(a) In [15, 22], f and g are both supposed to be convex; we show that this hypothesis can be weakened by supposing only that Ψ is convex, which allows us to distinguish two interesting cases that had not yet been studied, neither for BPG nor for PG:
(i) the nonsmooth part f is possibly not convex and the smooth part g is convex;
(ii) the nonsmooth part f is convex and the smooth part g is possibly not convex.

(b) The assumption "argmin{Ψ(x) : x ∈ dom h} is compact or ∀λ > 0, h + λf is supercoercive" is a condition on f and g (see [15]) which precludes the application of NoLips to non-supercoercive functions Ψ. In this work, we show that this condition can be circumvented by coupling the LC property with the boundedness of the level sets

L2(x, r) = {y ∈ S̄ : D_h(y, x) ≤ r}.  (11)

This is a condition which relates only to the parameter h and which is verified by most of the interesting Bregman distances.

(2) Inexact version of NoLips.

We propose an inexact version of NoLips, called ε-NoLips, which verifies

x_n ∈ ε_n-argmin{ f(u) + ⟨∇g(x_{n−1}), u⟩ + (1/λ_n) D_h(u, x_{n−1}) },  n ∈ ℕ*.  (12)

The convergence result is established in Section 4. This study covers the convergence results given for PG and BPG and gives new results, in particular the convergence of the inexact version of the interior method with Bregman distance studied in [24]; this result had not been established until now.

We also note that the convergence of NoLips is given under the condition

cl(dom h) = dom h.  (13)

For this reason, and for the clarity of the hypotheses, we suppose in what follows that h : S → ℝ, with S an open convex subset of ℝ^d.

3. Main Results

In order to clarify the status of the parameter h, we give the following definitions. Let S be a convex open subset of ℝ^d and h : S → ℝ. Let us consider the following hypotheses:

H1: h is continuously differentiable on S.
H2: h is continuous and strictly convex on S̄.
H3: ∀r ≥ 0, ∀x ∈ S̄, the sets below are bounded:
L1(x, r) = {y ∈ S : D_h(x, y) ≤ r}.  (14)
H4: ∀r ≥ 0, ∀x ∈ S, the sets below are bounded:
L2(x, r) = {y ∈ S̄ : D_h(y, x) ≤ r}.  (15)
H5: if {x_n} ⊂ S and x_n → x* ∈ S̄, then
D_h(x*, x_n) → 0.  (16)
H6: if {x_n} ⊂ S, x* ∈ S̄, and D_h(x*, x_n) → 0, then
x_n → x*.  (17)

Definition 4.
(i) h : S → ℝ is a Bregman function on S, or "D-function", if h verifies H1, H2, H3, H4, H5, and H6.
(ii) D_h(·, ·) : S̄ × S → ℝ is defined, ∀x ∈ S̄ and ∀y ∈ S, by

D_h(x, y) = h(x) − h(y) − ⟨x − y, ∇h(y)⟩.  (18)

D_h is called a Bregman distance if h is a Bregman function. We put the following classes:

A(S) = {h : S → ℝ verifying H1, H2},
B(S) = {h : S → ℝ verifying H1, H2, H3, H4, H5, and H6},
E(S) = {h : S → ℝ of Legendre type on S}.

Proposition 1. Let h and h′ verify H1. Then, ∀λ,

D_{λh+h′}(·, ·) = λD_h(·, ·) + D_{h′}(·, ·).  (19)

Lemma 2. ∀h ∈ A(S), ∀a ∈ S̄, and ∀b, c ∈ S,

D_h(a, b) + D_h(b, c) − D_h(a, c) = ⟨a − b, ∇h(c) − ∇h(b)⟩.  (20)
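The three-point identity (20) is the main computational tool behind the estimates of Section 4. A quick numerical check (ours), in Python, for the entropy kernel of Example 2 below:

```python
import numpy as np
rng = np.random.default_rng(0)

# D_h for h(x) = sum x_i log x_i, whose gradient is log(y) + 1
D = lambda x, y: np.sum(x * np.log(x / y) + y - x)
grad_h = lambda y: np.log(y) + 1.0

a, b, c = (rng.uniform(0.2, 2.0, 4) for _ in range(3))
lhs = D(a, b) + D(b, c) - D(a, c)
rhs = (a - b) @ (grad_h(c) - grad_h(b))
assert np.isclose(lhs, rhs)   # identity (20)
```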

Example 1. If S_0 = ℝ^d and h_0(x) = (1/2)‖x‖², then

D_{h_0}(x, y) = (1/2)‖x − y‖².  (21)

Example 2. If S_1 = ℝ^d_{++} := {x ∈ ℝ^d : x_i > 0, i = 1, …, d} and

h_1(x) = ∑_{i=1}^d x_i log x_i,  ∀x ∈ S̄_1,  (22)

with the convention 0 log 0 = 0, then, ∀(x, y) ∈ S̄_1 × S_1,

D_{h_1}(x, y) = ∑_{i=1}^d x_i log(x_i/y_i) + y_i − x_i.  (23)

Example 3. If S_2 = ]−1, 1[^d and h_2(x) = −∑_{i=1}^d √(1 − x_i²), then D_{h_2}(x, y) = h_2(x) + ∑_{i=1}^d (1 − x_i y_i)/√(1 − y_i²), ∀(x, y) ∈ S̄_2 × S_2.
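The closed forms of Examples 1–3 can be checked directly against definition (18). A small Python sketch (ours) doing so for the less familiar Example 3:

```python
import numpy as np
rng = np.random.default_rng(1)

def bregman(h, grad_h):
    # D_h(x, y) = h(x) - h(y) - <x - y, grad h(y)>, definition (18)
    return lambda x, y: h(x) - h(y) - (x - y) @ grad_h(y)

h2 = lambda x: -np.sum(np.sqrt(1.0 - x**2))        # kernel of Example 3
grad_h2 = lambda y: y / np.sqrt(1.0 - y**2)
D2 = bregman(h2, grad_h2)

x, y = rng.uniform(-0.9, 0.9, 3), rng.uniform(-0.9, 0.9, 3)
closed_form = h2(x) + np.sum((1.0 - x * y) / np.sqrt(1.0 - y**2))
assert np.isclose(D2(x, y), closed_form)
```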


Proposition 2 (see [19]). h_i ∈ B(S_i) ∩ E(S_i), i = 0, 1, 2.

We consider the following minimization problem:

(P) inf{ Ψ(x) := f(x) + g(x) : x ∈ S̄ }.  (24)

The following assumptions on the problem's data are made throughout the paper (and referred to as the blanket assumptions).

Assumption 3.
(i) h ∈ A(S) ∩ E(S).
(ii) g : ℝ^d → ]−∞, +∞] is proper, l.s.c., with S̄ ⊂ dom g, and continuously differentiable on S.
(iii) f : ℝ^d → ]−∞, +∞] is proper, l.s.c.
(iv) ri(dom Ψ) ∩ S ≠ ∅.
(v) Ψ* := inf{Ψ(x) : x ∈ S̄} > −∞.

We consider the operator Tλ defined, ∀x ∈ S, by

Tλ(x) = argmin{ f(u) + ⟨∇g(x), u − x⟩ + (1/λ) D_h(u, x) : u ∈ S̄ }.  (25)

We give in the following a series of lemmas allowing the establishment of Theorem 2, which assures the well-posedness of the method proposed in Section 4.

Lemma 3. ∀λ > 0 and ∀x ∈ S,

Tλ(x) = argmin{ Ψ(u) + (1/λ) D_{h−λg}(u, x) : u ∈ S̄ }.  (26)

Proof. ∀x ∈ S ⊂ dom g,

(25) ⟹ Tλ(x) = argmin{ f(u) + g(x) + ⟨∇g(x), u − x⟩ + (1/λ) D_h(u, x) : u ∈ S̄ },  (27)

and, ∀u ∈ S̄ ⊂ dom g, we have

f(u) + g(x) + ⟨∇g(x), u − x⟩ + (1/λ)D_h(u, x)
= Ψ(u) + g(x) − g(u) + ⟨∇g(x), u − x⟩ + (1/λ)D_h(u, x)
= Ψ(u) − D_g(u, x) + (1/λ)D_h(u, x)
= Ψ(u) + (1/λ)D_{h−λg}(u, x).  (28)

Lemma 4. If the pair (g, h) verifies the condition (LC), then ∃L > 0 such that, ∀x ∈ S and ∀u ∈ S̄:

(i) D_g(u, x) ≤ L D_h(u, x);
(ii) ∀λ ∈ ]0, 1/L[, D_{h−λg}(u, x) ≥ 0.

Proof.
(i) If Lh − g is convex on S, then

D_{Lh−g}(u, x) ≥ 0,  ∀x ∈ S, ∀u ∈ S.  (29)

Let u ∈ S̄∖S. There exists a sequence {u_n} ⊂ S such that u_n → u; then we have

D_{Lh−g}(u_n, x) ≥ 0.  (30)

Lh − g is continuous on S̄; then D_{Lh−g}(u, x) ≥ 0.

(ii) Let λ ∈ ]0, 1/L[:

D_{Lh−g}(u, x) ≥ 0 ⟹ L D_h(u, x) ≥ D_g(u, x) ⟹ (1/λ) D_h(u, x) ≥ D_g(u, x) ⟹ D_{h−λg}(u, x) ≥ 0.  (31)

Lemma 5. If h is Legendre on S, then h − λg is also Legendre on S for all λ such that 0 < λ < 1/L.

Proof. Conditions (a) and (b) of Definition 1 being verified, let us demonstrate that condition (c) is verified too. Let {x_i} be such that x_i → x* ∈ Fr(S) = S̄∖S. We have

‖∇h(x_i) − λ∇g(x_i)‖² ≥ ‖∇h(x_i)‖ (‖∇h(x_i)‖ − 2λ‖∇g(x_i)‖) + λ²‖∇g(x_i)‖².  (32)

Then

lim_{x_i → x* ∈ Fr(S)} ‖∇(h − λg)(x_i)‖ = +∞.  (33)

Moreover, h − λg is strictly convex on S. Indeed, let x, y ∈ S with x ≠ y; we have D_{h−λg}(x, y) ≥ 0 and

D_{h−λg}(x, y) = 0 ⟹ D_h(x, y) = λ D_g(x, y) ⟹ D_h(x, y) ≤ λL D_h(x, y).  (34)

h is strictly convex on S and x ≠ y, so D_h(x, y) ≠ 0 ⟹ 1 ≤ λL, which is absurd. Hence h − λg is strictly convex on S.

Lemma 6. ∀λ ∈ ]0, 1/L[ and ∀u ∈ S,

∂(D_{h−λg}(·, u))(x*) = {∇(h − λg)(x*) − ∇(h − λg)(u)} if x* ∈ S, and ∅ if not.  (35)

Proof. Since h − λg is a Legendre function on S, D_{h−λg}(·, u) is also Legendre. By application of Theorem 26.1 in [23], ∂(D_{h−λg}(·, u)) verifies the following:

(i) If x* ∈ int(dom D_{h−λg}(·, u)) = S, then

∂(D_{h−λg}(·, u))(x*) = {∇D_{h−λg}(·, u)(x*)}.  (36)

(ii) If x* ∉ S, then ∂(D_{h−λg}(·, u))(x*) = ∅.

Theorem 2 (well-posedness of the method). We assume that
(i) Ψ is convex;
(ii) the pair (g, h) verifies the condition (LC);
(iii) ∀r ≥ 0, ∀x ∈ S, the sets below are bounded:

L2(x, r) = {y ∈ S̄ : D_h(y, x) ≤ r}.  (37)

Then, ∀λ ∈ ]0, 1/L[, the map Tλ defined in (25) is nonempty and single-valued from S to S.

Proof. Let x ∈ S. Tλ(x) is nonempty: for this, it is enough to demonstrate that, ∀r ∈ ℝ, the set

L(x, r) = {u ∈ S̄ : Ψ(u) + λ⁻¹ D_{h−λg}(u, x) ≤ r}  (38)

is closed and is bounded when it is nonempty. We have

u ∈ L(x, r) ⟹ Ψ(u) + λ⁻¹D_{h−λg}(u, x) ≤ r
⟹ D_{h−λg}(u, x) ≤ λ(r − Ψ*)
⟹ D_h(u, x) ≤ λ(r − Ψ*) + λD_g(u, x)
⟹ D_h(u, x) ≤ λ(r − Ψ*) + LλD_h(u, x)
⟹ D_h(u, x) ≤ λ(r − Ψ*)/(1 − Lλ).  (39)

It follows that

L(x, r) ⊂ L2(x, λ(r − Ψ*)/(1 − Lλ));  (40)

thanks to H4, L2(x, λ(r − Ψ*)/(1 − Lλ)) is bounded, which implies that L(x, r) is bounded too; this shows that

Tλ(x) ≠ ∅.  (41)

Let x* ∈ Tλ(x) and let us suppose that x* ∉ S. By (26), 0 ∈ ∂(Ψ(·) + (1/λ)D_{h−λg}(·, x))(x*). Since ri(dom Ψ) ∩ ri(dom D_{h−λg}(·, x)) = ri(dom Ψ) ∩ ri(S̄) = ri(dom Ψ) ∩ S ≠ ∅, from [10], we may write

0 ∈ ∂Ψ(x*) + ∂((1/λ)D_{h−λg}(·, x))(x*).  (42)

It follows that

∃u ∈ ∂Ψ(x*)  (43)

such that

−λu ∈ ∂(D_{h−λg}(·, x))(x*),  (44)

which contradicts Lemma 6. Then Tλ(x) ⊂ S.

On the other hand, h − λg is strictly convex on S and Ψ is convex, so Ψ(·) + (1/λ)D_{h−λg}(·, x) is strictly convex on S. Then Tλ(x) has a unique value for all x ∈ S.

Remark 1. This result dispenses with the supercoercivity of Ψ and with the simultaneous convexity of f and g, as required by Lemma 2 of [15].

Proposition 3. ∀x ∈ S and ∀λ ∈ ]0, 1/L[,

(∇(h − λg)(x) − ∇(h − λg)(Tλ(x)))/λ ∈ ∂Ψ(Tλ(x)),  (45)

∃x̄ ∈ S: (∇(h − λg)(x) − ∇(h − λg)(x̄))/λ ∈ ∂_εΨ(x̄).  (46)

Proof. Since Tλ(x) ∈ S, we have

0 ∈ ∂(Ψ(·) + (1/λ)D_{h−λg}(·, x))(Tλ(x))
⟹ −∇((1/λ)D_{h−λg}(·, x))(Tλ(x)) ∈ ∂Ψ(Tλ(x))
⟹ (45).  (47)

For (46), just take x̄ = Tλ(x), since

∂Ψ(Tλ(x)) ⊂ ∂_εΨ(Tλ(x)).  (48)

Proposition 4. ∀x ∈ S and ∀λ ∈ ]0, 1/L[,

Tλ(x) = prox^{h−λg}_{λΨ}(x) = prox^h_{λf} ∘ prox^h_{λp}(x),  (49)

where p(u) = ⟨∇g(x), u⟩.

Proof. The first equality is due to Lemma 3. The second is established in [15]. The first equality played a decisive role in the development of this paper.

4. Analysis of the ε-NoLips Algorithm

In this section, we propose an Inexact Bregman Proximal Gradient (IBPG) algorithm, which is an inexact version of the BPG algorithm described in [15, 22]; the IBPG framework allows an error in the subgradient inclusion by using the error ε_n. We study two algorithms:

(i) Algorithm 1: the inexact Bregman Proximal Gradient (IBPG) algorithm without relative error criterion;
(ii) Algorithm 2: the inexact Bregman Proximal Gradient (IBPG) algorithm with relative error criterion, which we call ε-NoLips.

We establish the main convergence properties of the proposed algorithms. In particular, we prove their global rate of convergence, showing that they share the claimed sublinear rate O(1/n) of basic first-order methods such as the classical PG and BPG. We also derive global convergence of the sequence generated by NoLips to a minimizer of (P).

Assumption 4.
(i) h ∈ B(S) ∩ E(S).
(ii) Ψ is convex.
(iii) The pair (g, h) verifies the condition (LC).
(iv) argmin Ψ ≠ ∅.

In our analysis, Ψ is supposed to be a convex function; this allows us to distinguish two interesting cases:

(i) the nonsmooth part f is possibly not convex and the smooth part g is convex;
(ii) the nonsmooth part f is convex and the smooth part g is possibly not convex.

In what follows, the choice of the sequence {λ_n} depends on the convexity of g. Let λ̄ be such that 0 < λ̄ < 1/L, and set λ_0 := 0. If g is not convex, then we choose

λ_n = λ (0 < λ ≤ λ̄),  n ≥ 1.  (50)

If g is convex, then we choose

λ_n ≤ λ_{n+1} ≤ λ̄,  n ≥ 1.  (51)

Under those conditions, we easily show that, ∀x, y ∈ S, ∀n ∈ ℕ, 0 < λ_n ≤ λ̄ and

(λ_{n+1} − λ_n) D_g(x, y) ≥ 0.  (52)

We pose

h_n = h − λ_n g,  n ≥ 1.  (53)

Algorithm 1 (Inexact Bregman Proximal Gradient (IBPG)).
(1) Input: x_0 ∈ S ∩ ri(dom Ψ).
(2) For n = 1, 2, …, with ε_n > 0, we obtain x_n such that (∇h_n(x_{n−1}) − ∇h_n(x_n))/λ_n ∈ ∂_{ε_n}Ψ(x_n).

Proposition 5. The sequence {x_n} defined by IBPG exists and verifies, for all n ∈ ℕ*,

x_n ∈ ε_n-argmin{ Ψ(u) + (1/λ_n) D_{h_n}(u, x_{n−1}) : u ∈ S̄ }.  (54)

Proof. Existence is deduced trivially from (45). We have

Ω_n := (∇h_n(x_{n−1}) − ∇h_n(x_n))/λ_n ∈ ∂_{ε_n}Ψ(x_n)
⟹ Ψ(u) ≥ Ψ(x_n) + ⟨u − x_n, Ω_n⟩ − ε_n.  (55)

By applying Lemma 2, we have

Ψ(u) ≥ Ψ(x_n) + λ_n⁻¹[D_{h_n}(u, x_n) + D_{h_n}(x_n, x_{n−1}) − D_{h_n}(u, x_{n−1})] − ε_n
⟹ Ψ(x_n) + λ_n⁻¹ D_{h_n}(x_n, x_{n−1}) ≤ Ψ(u) + λ_n⁻¹ D_{h_n}(u, x_{n−1}) + ε_n, ∀u ∈ S̄
⟹ x_n ∈ T^{ε_n}_{λ_n}(x_{n−1}),  (56)

where T^ε_λ(x) = ε-argmin{ Ψ(u) + (1/λ)D_{h−λg}(u, x) }.

Remark 2. This result shows that IBPG is an inexact version of BPG, which reduces exactly to BPG when ε_n = 0; i.e.,

(i) x_n ∈ T^{ε_n}_{λ_n}(x_{n−1}) ⟹ x_n ≈ T_{λ_n}(x_{n−1});
(ii) ε_n = 0 ⟹ x_n = T_{λ_n}(x_{n−1}).
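To make the ε_n-argmin inclusion (54) concrete: in the separable entropy setting of Section 5, the exact step T_λ(x) has a closed form, so the error ε of any approximate iterate can be measured as its objective gap in the model minimized in (5). A Python sketch (ours; the perturbation is artificial, standing in for an inexact inner solver):

```python
import numpy as np
rng = np.random.default_rng(2)

A = rng.uniform(0.1, 1.0, (4, 3)); b = rng.uniform(0.5, 2.0, 4)
alpha, lam = 0.1, 0.2
x = rng.uniform(0.5, 1.5, 3)                     # current iterate, x > 0

c = A.T @ np.log(A @ x / b)                      # grad g(x)
def model(u):
    # f(u) + <grad g(x), u> + (1/lam) * D_h(u, x), with h = entropy
    return alpha * u.sum() + c @ u + (u * np.log(u / x) + x - u).sum() / lam

u_exact = x * np.exp(-lam * (alpha + c))         # exact minimizer T_lam(x)
u_approx = u_exact * np.exp(rng.uniform(-0.01, 0.01, 3))  # inexact iterate
eps = model(u_approx) - model(u_exact)           # eps >= 0, so u_approx is
print(eps)                                       # an eps-argmin of the model
```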

Proposition 6. For all n ∈ ℕ*:

(i) λ_n(Ψ(x_n) − Ψ(x_{n−1})) ≤ −D_{h_n}(x_n, x_{n−1}) + λ_nε_n.  (57)
(ii) D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n) ≤ (λ̄/(1 − λ̄L))(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n/(1 − λ̄L).  (58)

Proof. (55) ⟹

λ_n(Ψ(x_n) − Ψ(u)) ≤ D_{h_n}(u, x_{n−1}) − [D_{h_n}(u, x_n) + D_{h_n}(x_n, x_{n−1})] + ε_nλ_n.  (59)

(i) We put u = x_{n−1} in (59); we get (57).
(ii) Putting u = x_{n−1} in (59), we have

D_{h_n}(x_n, x_{n−1}) + D_{h_n}(x_{n−1}, x_n) ≤ λ_n(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n
⟹ D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n) ≤ λ_n(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n + λ_nD_g(x_n, x_{n−1}) + λ_nD_g(x_{n−1}, x_n)
⟹ D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n) ≤ λ̄(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n + λ̄L(D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n))
⟹ D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n) ≤ (λ̄/(1 − λ̄L))(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n/(1 − λ̄L).  (60)

Corollary 1.
(i) If ε_n = 0, the sequence {Ψ(x_n)} is nonincreasing.
(ii) Summability: if ∑_{n=1}^∞ λ_nε_n < +∞, then ∑_{n=1}^∞ D_h(x_n, x_{n−1}) < +∞ and ∑_{n=1}^∞ D_h(x_{n−1}, x_n) < +∞.

Proof.
(i) From (57), D_{h_n}(x_n, x_{n−1}) ≥ 0 ⟹ Ψ(x_n) ≤ Ψ(x_{n−1}), ∀n ∈ ℕ*.
(ii) From (58),

∑_{n=1}^{p} [D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n)] ≤ (λ̄/(1 − λ̄L)) ∑_{n=1}^{p} (Ψ(x_{n−1}) − Ψ(x_n)) + (1/(1 − λ̄L)) ∑_{n=1}^{p} λ_nε_n
⟹ ∑_{n=1}^{p} [D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n)] ≤ (λ̄/(1 − λ̄L))(Ψ(x_0) − Ψ*) + (1/(1 − λ̄L)) ∑_{n=1}^{p} λ_nε_n.  (61)

In the following, we pose

t_p := ∑_{n=1}^{p} λ_n, ∀p ∈ ℕ*, t_0 = λ_0 = 0,
A_n(u) := D_{h_n}(u, x_n), ∀u ∈ S̄, ∀n ∈ ℕ,
Φ(x_p) := min{ Ψ(x_k) : 1 ≤ k ≤ p }, ∀p ∈ ℕ*.  (62)

Proposition 7 (global estimate in function values). For all u ∈ S̄ and all p ∈ ℕ*,

Φ(x_p) − Ψ(u) ≤ (1/t_p)[D_h(u, x_0) + ∑_{n=1}^∞ λ_nε_n].  (63)

Proof. We have D_{h_n}(u, x_{n−1}) = A_{n−1}(u) − (λ_n − λ_{n−1})D_g(u, x_{n−1}); hence, from (59),

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − D_{h_n}(x_n, x_{n−1}) − (λ_n − λ_{n−1})D_g(u, x_{n−1}) + λ_nε_n.  (64)

From (52), we have

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) + λ_nε_n
⟹ ∑_{n=1}^{p} λ_n(Ψ(x_n) − Ψ(u)) ≤ ∑_{n=1}^{p} [A_{n−1}(u) − A_n(u)] + ∑_{n=1}^{p} λ_nε_n
⟹ ∑_{n=1}^{p} λ_n(Ψ(x_n) − Ψ(u)) ≤ A_0(u) − A_p(u) + ∑_{n=1}^{p} λ_nε_n
⟹ (Φ(x_p) − Ψ(u)) ∑_{n=1}^{p} λ_n ≤ A_0(u) − A_p(u) + ∑_{n=1}^{p} λ_nε_n.  (65)

A_0(u) = D_h(u, x_0) (λ_0 = 0) and A_p(u) ≥ 0, so

Φ(x_p) − Ψ(u) ≤ (1/t_p)[D_h(u, x_0) + ∑_{n=1}^∞ λ_nε_n].  (66)

This proposition covers the evaluation of the global convergence rate given in [6, 12], as shown by the following corollary.

Corollary 2. We assume that
(a) ε_n = 0, n ≥ 1;
(b) λ_n = λ = 1/(2L), n ≥ 1;
(c) x* ∈ argmin Ψ.
Then

Ψ(x_p) − Ψ* ≤ (2L/p) D_h(x*, x_0),  (67)

i.e., Ψ(x_p) − Ψ* = O(1/p).

Proof. Immediate consequence of Proposition 7, since Φ(x_p) = Ψ(x_p) when ε_n = 0 (Corollary 1 (i)) and t_p = p/(2L).

Now we derive a global convergence of the sequence generated by Algorithm 1 to a minimizer of (P).

Theorem 3. We assume that ∑ λ_nε_n < +∞ and that one of the following assumptions holds:
(i) ∃λ̲ > 0 such that λ̲ ≤ λ_n ≤ λ̄, n ≥ 1;
(ii) the sequence {Ψ(x_n)} is nonincreasing and ∑ λ_n = +∞.
Then (a) Ψ(x_n) → inf Ψ and (b) x_n → x* ∈ argmin Ψ.

Proof.
(a) Suppose (i). Let x* ∈ argmin Ψ and put u = x* in (64); we have

λ̲(Ψ(x_n) − Ψ(x*)) ≤ A_{n−1}(x*) − A_n(x*) + λ_nε_n.  (68)

In particular,

A_n(x*) ≤ A_{n−1}(x*) + λ_nε_n,  (69)

and

∑ λ_nε_n < +∞ ⟹ A_n(x*) → l ∈ ℝ.  (70)

Summing (68), we get λ̲ ∑_n (Ψ(x_n) − Ψ(x*)) ≤ A_0(x*) + ∑_n λ_nε_n < +∞; hence (68) and (70) give Ψ(x_n) → inf Ψ.

Suppose (ii). Then

Φ(x_p) = min{Ψ(x_k) : 1 ≤ k ≤ p} = Ψ(x_p).  (71)

For u = x* ∈ argmin Ψ in (63), we have 0 ≤ Ψ(x_p) − Ψ(x*) ≤ (1/t_p)[D_h(x*, x_0) + ∑_{n=1}^∞ λ_nε_n], so ∑ λ_n = +∞ ⟹ Ψ(x_n) → inf Ψ.

(b) (69) ⟹ ∃α > 0 such that A_n(x*) ≤ α. Since A_n(x*) = D_{h_n}(x*, x_n) = D_h(x*, x_n) − λ_n D_g(x*, x_n), we get

D_h(x*, x_n) ≤ α + λ_n D_g(x*, x_n) ⟹ D_h(x*, x_n) ≤ α/(1 − λ̄L).  (72)

Then {D_h(x*, x_n)} is bounded and, from H3, {x_n} is bounded as well. Let u* be a cluster point of {x_n}; there exists then a subsequence {x_{n_i}} of {x_n} such that x_{n_i} → u* ∈ S̄. From H5, D_h(u*, x_{n_i}) → 0. On the other hand,

0 ≤ D_g(u*, x_{n_i}) ≤ L·D_h(u*, x_{n_i}),  (73)

so D_g(u*, x_{n_i}) → 0. Moreover, u* ∈ argmin Ψ; indeed,

inf Ψ ≤ Ψ(u*) ≤ lim Ψ(x_{n_i}) = inf Ψ ⟹ inf Ψ = Ψ(u*).  (74)

Then

D_{h_{n_i}}(u*, x_{n_i}) = D_h(u*, x_{n_i}) − λ_{n_i} D_g(u*, x_{n_i}) → 0.  (75)

Since u* ∈ argmin Ψ, the sequence {D_{h_n}(u*, x_n)} converges ((69) and (70) hold with x* replaced by u*), and a subsequence of it tends to 0 by (75); hence

D_{h_n}(u*, x_n) → 0.  (76)

We have

D_h(u*, x_n) = D_{h_n}(u*, x_n) + λ_n D_g(u*, x_n) ≤ D_{h_n}(u*, x_n) + Lλ̄·D_h(u*, x_n),  (77)

so (1 − Lλ̄)D_h(u*, x_n) ≤ D_{h_n}(u*, x_n); then

D_h(u*, x_n) → 0.  (78)

And from H6 we have x_n → u* ∈ argmin Ψ.

The IBPG algorithm generates a sequence for which {Ψ(x_n)} is not necessarily nonincreasing; for this reason, and to improve the global estimate in function values, we now propose ε-NoLips, an inexact version of BPG with a relative error criterion. Let σ with 0 ≤ σ < 1 be given; the method is as follows.

Algorithm 2 (ε-NoLips).
Input: x_0 ∈ S ∩ ri(dom Ψ).
(1) Choose λ_n > 0 and find x_n ∈ S, σ_n ∈ [0, σ], and ε_n ≥ 0 such that
(∇h_n(x_{n−1}) − ∇h_n(x_n))/λ_n ∈ ∂_{ε_n}Ψ(x_n),
λ_nε_n ≤ σ_n D_{h_n}(x_n, x_{n−1}).
(2) Set n ← n + 1 and go to step 1.

In what follows, we will derive a convergence rate result (Theorem 4) for the ε-NoLips framework. First, we need to establish a few technical lemmas. In the following, {x_n} denotes the sequence generated by ε-NoLips.
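An abstract Python sketch (ours) of how the relative error test of Algorithm 2 can drive an inner solver: the candidate step is recomputed at higher accuracy until λ_nε_n ≤ σ_n D_{h_n}(x_n, x_{n−1}) holds. The callables inexact_step and D_hn are assumptions standing in for a problem-specific inner solver and Bregman distance.

```python
def eps_nolips(x0, inexact_step, D_hn, lam, sigma, n_iter=100):
    """inexact_step(x, lam, tol) -> (x_new, eps): a candidate iterate for (54)
    together with an upper bound eps on its eps-argmin error."""
    x = x0
    for _ in range(n_iter):
        tol = 1.0
        while True:
            x_new, eps = inexact_step(x, lam, tol)
            if lam * eps <= sigma * D_hn(x_new, x):   # relative error criterion
                break
            tol /= 10.0                               # ask for a tighter inner solve
        x = x_new
    return x
```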

Lemma 7. For every u ∈ S̄ and all n ∈ ℕ*,

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − (1 − σ)(1 − λ̄L) D_h(x_n, x_{n−1}).  (79)

Proof. Since (λ_n − λ_{n−1})D_g(u, x_{n−1}) ≥ 0, we have from (64)

λ_n(Ψ(x_n) − Ψ(u)) ≤ −D_{h_n}(x_n, x_{n−1}) + A_{n−1}(u) − A_n(u) + λ_nε_n.  (80)

From Algorithm 2, we have

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) + (σ − 1)D_{h_n}(x_n, x_{n−1}).  (81)

From the condition (LC), we have

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − (1 − σ)(1 − λ̄L)D_h(x_n, x_{n−1}).  (82)

Remark 3. We now notice that {Ψ(x_n)} is nonincreasing: just replace u with x_{n−1} in (79).

Lemma 8. For every n ∈ ℕ* and x* ∈ argmin Ψ, we have

t_n(Ψ(x_n) − Ψ*) + λ_n⁻¹ t_n (1 − σ)(1 − λ̄L) D_h(x_n, x_{n−1}) ≤ t_{n−1}(Ψ(x_{n−1}) − Ψ*) + A_{n−1}(x*) − A_n(x*).  (83)

Proof. Replacing u by x_{n−1} in (79), and since A_n(x_{n−1}) ≥ 0 and A_{n−1}(x_{n−1}) = 0, we have

t_{n−1}(Ψ(x_n) − Ψ*) + t_{n−1}λ_n⁻¹(1 − σ)(1 − λ̄L)D_h(x_n, x_{n−1}) ≤ t_{n−1}(Ψ(x_{n−1}) − Ψ*).  (84)

Replacing u by x* in (79), we have

λ_n(Ψ(x_n) − Ψ(x*)) + (1 − σ)(1 − λ̄L)D_h(x_n, x_{n−1}) ≤ A_{n−1}(x*) − A_n(x*).  (85)

Since t_{n−1} + λ_n = t_n, by adding (84) and (85) we obtain (83).

Lemma 9. For every k ∈ ℕ*,

t_k(Ψ(x_k) − Ψ*) + (1 − σ)(1 − λ̄L) ∑_{n=1}^{k} λ_n⁻¹ t_n D_h(x_n, x_{n−1}) ≤ A_0(x*) − A_k(x*).  (86)

Proof. This result is obtained by adding inequality (83) from 1 to k (t_0 = λ_0 = 0).

We are now ready to state the convergence rate result for the ε-NoLips framework. This result improves and completes the one given in Proposition 7.

Theorem 4. For every k ∈ ℕ*, the following statements hold, where we pose ρ_k := ∑_{n=1}^{k} λ_n⁻¹ t_n:

(a) Ψ(x_k) − Ψ* ≤ D_h(x*, x_0)/t_k.  (87)
(b) c_k := min_{1≤n≤k} D_h(x_n, x_{n−1}) ≤ D_h(x*, x_0)/((1 − σ)(1 − λ̄L)ρ_k).  (88)

Proof. From (86), and since A_0(x*) = D_h(x*, x_0) (λ_0 = 0) and A_k(x*) ≥ 0, we immediately have (87) and (88).

Corollary 3. Consider an instance of the ε-NoLips framework with λ_n = λ for every n ∈ ℕ*. Then, for every k ∈ ℕ*, the following statements hold:

(a) Ψ(x_k) − Ψ* = O(1/k).  (89)
(b) c_k = min_{1≤n≤k} D_h(x_n, x_{n−1}) = O(1/k²).  (90)

Proof. t_k = kλ, so (87) ⟹ (89); moreover,

ρ_k = k(k + 1)/2 ≥ k²/2,  (91)

and (88) ⟹ (90).

Remark 4. (87) and (89) represent exactly the convergence rate established in [15, 22]. (90) is a new result, not established in [15, 22]; it shows that c_k converges to zero at a rate of O(1/k²).

Theorem 5. If ∑ λ_n = +∞, then
(a) Ψ(x_n) → inf Ψ;
(b) x_n → x* ∈ argmin Ψ.

Proof. Replacing u by x* in (81), we have

(1 − σ)D_{h_n}(x_n, x_{n−1}) ≤ A_{n−1}(x*) − A_n(x*).  (92)

By adding inequality (92) from 1 to k, we have

(1 − σ) ∑_{n=1}^{k} D_{h_n}(x_n, x_{n−1}) ≤ A_0(x*) − A_k(x*).  (93)

A_k(x*) ≥ 0, so ∑_{n=1}^∞ D_{h_n}(x_n, x_{n−1}) < +∞. From the relative error criterion of Algorithm 2, we then have

∑_{n=1}^∞ λ_nε_n < +∞.  (94)

Since {Ψ(x_n)} is nonincreasing (Remark 3) and (94) holds, by applying Theorem 3 (ii) we obtain the results of Theorem 5.

5. Application to Nonnegative Linear Inverse Problem

In Poisson inverse problems (e.g., [25, 26]), we are given a nonnegative observation matrix A ∈ ℝ^{m×d}_+ and a noisy measurement vector b ∈ ℝ^m_+, and the goal is to reconstruct the signal x ∈ ℝ^d_+ such that Ax ≈ b. We can naturally adopt the distance D(Ax, b) to measure the residuals between two nonnegative points, with

D(Ax, b) = ∑_{i=1}^{m} ⟨a_i, x⟩ log(⟨a_i, x⟩/b_i) + b_i − ⟨a_i, x⟩,  (95)

where a_i denotes the i-th row of A.

In this section, we propose an approach for solving the nonnegative linear inverse problem defined by

(P_α) inf{ α‖x‖₁ + D(Ax, b) : x ∈ ℝ^d_+ } (α > 0).  (96)

We take

f(x) := α‖x‖₁,
g(x) := D(Ax, b),
h(x) := h_1(x) = ∑_{i=1}^{d} x_i log x_i.  (97)

It is shown in [15] that the couple (g, h) verifies a Lipschitz-like Convexity Condition (LC) on ℝ^d_+ for any L such that

L ≥ max_{1≤j≤d} ∑_{i=1}^{m} a_{ij} = max_{1≤j≤d} ‖A_j‖₁,  (98)

where A_j denotes the j-th column of A.
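The bound (98) is easy to check numerically. A small Python sketch (ours) sampling random points and verifying the inequality D_g ≤ L·D_h of Lemma 4 (i) for this pair (g, h), with L = max_j ‖A_j‖₁:

```python
import numpy as np
rng = np.random.default_rng(0)

A = rng.uniform(0.1, 1.0, (5, 3)); b = rng.uniform(0.5, 2.0, 5)
L = A.sum(axis=0).max()                    # max column sum, bound (98)

h  = lambda x: np.sum(x * np.log(x))       # entropy kernel h1
gh = lambda y: np.log(y) + 1.0             # grad h
g  = lambda x: np.sum(A@x * np.log(A@x / b) + b - A@x)
gg = lambda y: A.T @ np.log(A@y / b)       # grad g

D = lambda F, gF: (lambda x, y: F(x) - F(y) - (x - y) @ gF(y))
Dh, Dg = D(h, gh), D(g, gg)

for _ in range(1000):
    x, y = rng.uniform(0.2, 2.0, 3), rng.uniform(0.2, 2.0, 3)
    assert Dg(x, y) <= L * Dh(x, y) + 1e-10
```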

For λ_n := λ, ∀n, Theorem 3 is applicable and guarantees the global convergence of Algorithm 1 to an optimal solution of (P_α).

Given x^n ∈ ℝ^d_{++}, the iteration

x^{n+1} ≈ T_λ(x^n)  (99)

amounts to solving one-dimensional problems: for j = 1, …, d,

x_j^{n+1} ≈ argmin{ αt + c_j t + (1/λ)(t log(t/x_j^n) + x_j^n − t) : t > 0 },  (100)

where c_j is the j-th component of ∇g(x^n).
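Each problem (100) has a closed-form solution: setting the derivative α + c_j + (1/λ) log(t/x_j^n) to zero gives the multiplicative update t = x_j^n e^{−λ(α+c_j)}. A Python sketch (ours) of the resulting exact-step instance (ε_n = 0) of Algorithm 1 for (P_α); the function name and the synthetic data are our own:

```python
import numpy as np

def bpg_poisson_l1(A, b, alpha, lam, n_iter=500, x0=None):
    """BPG/NoLips for (P_alpha): min alpha*||x||_1 + D(Ax, b) over x >= 0,
    with h = entropy. Solving (100) exactly gives, for every coordinate j,
    x_j <- x_j * exp(-lam*(alpha + c_j)), which keeps the iterate in R^d_++."""
    x = np.ones(A.shape[1]) if x0 is None else x0.copy()
    for _ in range(n_iter):
        c = A.T @ np.log(A @ x / b)          # c = grad g(x^n)
        x = x * np.exp(-lam * (alpha + c))   # closed-form solve of (100)
    return x

# Usage on a synthetic instance, with lam below 1/L for L as in (98):
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, (20, 5)); x_true = rng.uniform(0.1, 2.0, 5)
b = A @ x_true
x_hat = bpg_poisson_l1(A, b, alpha=1e-3, lam=0.9 / A.sum(axis=0).max())
```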

6. Conclusion

The proposed algorithms constitute a unified frame for the existing algorithms BPG, BP, and PG, while yielding others, in particular the inexact version of the interior method with Bregman distance studied in [24]. More precisely:

(i) When ε_n = 0, ∀n ∈ ℕ*, our algorithm is the NoLips algorithm studied in [15, 22].
(ii) When g = 0, our algorithm is the inexact version of the Bregman Proximal (BP) algorithm studied in [19].
(iii) When ε_n = 0, ∀n ∈ ℕ*, and g = 0, our algorithm is the Bregman Proximal (BP) algorithm studied in [17, 21].
(iv) When f = 0, our algorithm is the inexact version of the interior method with Bregman distance studied in [24].
(v) When f = 0 and ε_n = 0, ∀n ∈ ℕ*, our algorithm is the interior method with Bregman distance studied in [24].
(vi) When h = (1/2)‖·‖², our algorithm is the proximal gradient method (PG) and its variants [4–14, 27].

Our analysis is different from, and simpler than, the one given in [15, 22], and it allows us to weaken some hypotheses, in particular the supercoercivity of Ψ as well as the simultaneous convexity of f and g.

Data Availability

No data were used to support this study.

Conflicts of Interest

The author declares that there are no conflicts of interest.

References

[1] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
[2] A. Beck and Y. C. Eldar, "Sparsity constrained nonlinear optimization: optimality conditions and algorithms," SIAM Journal on Optimization, vol. 23, no. 3, pp. 1480–1509, 2013.
[3] D. R. Luke, "Phase retrieval, what's new?," SIAG/OPT Views and News, vol. 25, no. 1, pp. 1–5, 2017.
[4] A. Auslender, "Numerical methods for nondifferentiable convex optimization," in Mathematical Programming Studies, J. P. Vial and C. B. Nguyen, Eds., vol. 30, pp. 102–126, Springer, Berlin, Germany, 1987.
[5] A. I. Chen and A. Ozdaglar, "A fast distributed proximal-gradient method," in Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 601–608, IEEE, Monticello, IL, USA, October 2012.
[6] J.-B. Hiriart-Urruty, "ε-subdifferential calculus," in Convex Analysis and Optimization, Research Notes in Mathematics Series, vol. 57, Pitman Publishers, London, UK, 1982.
[7] K. Jiang, D. Sun, and K.-C. Toh, "An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP," SIAM Journal on Optimization, vol. 22, no. 3, pp. 1042–1064, 2012.
[8] B. Lemaire, "About the convergence of the proximal method," in Advances in Optimization (Proceedings of the 6th French-German Conference on Optimization, 1991), Lecture Notes, pp. 39–51, Springer, Berlin, Germany, 1992.
[9] B. Lemaire, "The proximal algorithm," International Series of Numerical Mathematics, vol. 87, pp. 73–87, 1989.
[10] B. Martinet, "Perturbation des méthodes d'optimisation. Application," Modélisation Mathématique et Analyse Numérique, vol. 12, no. 2, pp. 153–171, 1976.
[11] N. Parikh, "Proximal algorithms," Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.
[12] M. Schmidt, N. Le Roux, and F. R. Bach, "Convergence rates of inexact proximal-gradient methods for convex optimization," in Advances in Neural Information Processing Systems, vol. 28, pp. 1458–1466, MIT Press, Cambridge, MA, USA, 2011.
[13] N. D. Vanli, M. Gurbuzbalaban, and A. Ozdaglar, "Global convergence rate of proximal incremental aggregated gradient methods," SIAM Journal on Optimization, vol. 28, no. 2, pp. 1282–1300, 2018.
[14] L. Xiao and T. Zhang, "A proximal stochastic gradient method with progressive variance reduction," SIAM Journal on Optimization, vol. 24, no. 4, pp. 2057–2075, 2014.
[15] J. Bolte, S. Sabach, M. Teboulle, and Y. Vaisbourd, "First order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems," SIAM Journal on Optimization, vol. 28, no. 3, pp. 2131–2151, 2018.
[16] L. M. Bregman, "The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming," USSR Computational Mathematics and Mathematical Physics, vol. 7, no. 3, pp. 200–217, 1967.
[17] J. Eckstein, "Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming," Mathematics of Operations Research, vol. 18, no. 1, pp. 202–226, 1993.
[18] S. Kabbadj, "Theoretical aspect of diagonal Bregman proximal methods," Journal of Applied Mathematics, vol. 2020, Article ID 3108056, 9 pages, 2020.
[19] S. Kabbadj, "Méthodes proximales entropiques," Thèse, Université Montpellier II, Montpellier, France, 1994.
[20] R. T. Rockafellar, "Monotone operators and the proximal point algorithm," SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.
[21] G. Chen and M. Teboulle, "Convergence analysis of a proximal-like minimization algorithm using Bregman functions," SIAM Journal on Optimization, vol. 3, no. 3, pp. 538–543, 1993.
[22] D. H. Gutman and J. F. Peña, "A unified framework for Bregman proximal methods: subgradient, gradient, and accelerated gradient schemes," 2018, https://arxiv.org/pdf/1812.10198.pdf.
[23] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, USA, 1970.
[24] A. Auslender and M. Teboulle, "Interior gradient and proximal methods for convex and conic optimization," SIAM Journal on Optimization, vol. 16, no. 3, pp. 697–725, 2006.
[25] M. Bertero, P. Boccacci, G. Desidera, and G. Vicidomini, "Image deblurring with Poisson data: from cells to galaxies," Inverse Problems, vol. 25, no. 12, 2009.
[26] I. Csiszár, "Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems," The Annals of Statistics, vol. 19, no. 4, pp. 2032–2066, 1991.
[27] A. Nitanda, "Stochastic proximal gradient descent with acceleration techniques," in Advances in Neural Information Processing Systems, pp. 1574–1582, MIT Press, Cambridge, MA, USA, 2014.

Abstract and Applied Analysis 11

Page 2: ResearchArticle ...downloads.hindawi.com/journals/aaa/2020/1963980.pdf4:∀r≥0,∀x ∈S,thesetsbelowarebounded L 2(x,r) y∈S;D h(y,x)≤r . (15) H 5:if{ }xn n ⊂S,thenxn x∗∈S,so

Our notation is fairly standard lang rang is the scalar producton Rd and the associated norm middot )e closure of the set C(relative interior) is denoted by C (riC respectively) For anyconvex function f we denote by

(1) domf x isin Rd f(x)lt +infin1113864 1113865 its effective domain(2) zεf(middot) v f(y) gef(middot) + langv y minus rang minus εforally1113864 1113865 its εminus

subdifferential(3) argminf x isin Rd f(x) inf f1113864 1113865 its argmin f(4) ε minus argminf x isin Rd f(x)le inff + ε1113864 1113865 its εminus

argminf

2 Preliminary

In this section we present the main results of the conver-gence of NoLips

Definition 1 (see [23]) Let C be a convex not empty of Rd

(i) A convex function h Rd⟶ ] minus infin +infin] is ofLegendre on C if it verifies the three followingconditions

(a) C int (dom h)(b) h is differentiable on C(c) limnablah(xi) +infin for any sequence xi1113864 1113865 of C

that converges towards a boundary point of C

(ii) )e class of strictly convex functions verifying a band c is called the class of Legendrersquos functions on Cand denoted by E(C)

Definition 2 (see [23]) Let F Rd⟶ ] minus infin +infin] we saythat F is supercoercive if

lim infx⟶infin

F(x)

xinfin (6)

Consider the following assumptions

Assumption 1

(i) h X sub Rd⟶ ] minus infin +infin] is of Legendre type(ii) g X⟶ ] minus infin +infin] is convex proper lsc with

dom h sub domg which is differentiable onint(dom h)

(iii) f X⟶ ] minus infin +infin] is convex proper lower semi-continuous (lsc)

(iv) domfcap int(dom h)neempty(v) inf Ψ(x) x isin dom h1113966 1113967gt minus infin

We consider the following minimization problem (P)inf Ψ(x) ≔ f(x) + g(x) x isin dom h1113966 1113967

Let the operator Tλ be defined by

Tλ(x) argmin f(u) +langnablag(x) u minus xrang +1λDh(u x)1113882 1113883

(7)

Lemma 1 (well-posedness of the method) Under As-sumption 1 suppose one of the following assumptions holds

(i) argmin Ψ(x) x isin dom h1113966 1113967 is nonempty and compact(ii) forallλgt 0 h + λf is supercoercive and the map Tλ de-

fined in (7) is nonempty and single-valued from int(domh) to int (domh)

Definition 3 )e couple (g h) verified a Lipschitz-likeConvexity Condition (LC) if existLgt 0 with Lh-g convex on int(dom h)

By posing

proxhλf(x) ≔ argmin f(u) +

1λDh(u x)1113882 1113883 (8)

they showed that

Tλ(x) proxhλf o proxh

λp(x) (9)

where p(u) langnablag(x) urang )e operator Tλ thus appears ascomposed of two operators prox)eNoLips algorithm thenbecomes

xn

Tλnx

nminus 11113872 1113873 foralln isin N

lowast (10)

Assumption 2

(i) argmin Ψ(x) x isin dom h1113966 1113967 is nonempty and com-pact or forallλgt 0 h + λf is supercoercive

(ii) For every x isin int(dom h) and r isin R the level setL1(x r) y isin int(dom h) Dh(x y)le r1113864 1113865 isbounded

(iii) If xn n converges to some x in int (dom h) thenDh(x xn)⟶ 0

(iv) Reciprocally if x in int (dom h) and if xn n is suchthat Dh(x xn)⟶ 0 then xn⟶ x

(v) exist Lgt 0 with Lh-g convex on int (dom h) (LC)

Theorem 1 (Global Convergence) Assume that

(i) dom h dom h(ii) 1113936 λn +infin and Assumptions 1 and 2 are satisfied

en the sequence xn n converges to some solutionxlowast of (P)

Our contribution is resumed in two essential points

(1) Improvements of some assumptions

(a) Suppose f and g are both are convex (see[15 22]) we show that we can reduce this hy-pothesis by supposing only that Ψ is convexwhich allows to distinguish two interesting casesthat are still not yet studied neither in the case ofthe BPG nor in the case PG

2 Abstract and Applied Analysis

(i) )e nonsmooth part f is possibly not convexand the smooth part g is convex

(ii) )e nonsmooth part f is convex and thesmooth part g is possibly not convex

(b) )e assumption is as follows argmin Ψ(x)

x isin dom h is compact or forallλgt 0 h + λf issupercoercive

)is is a condition on f and g (see [15]) whichprecludes the application of NoLips for the functionsΨ non-supercoercive In this work we show that wecan circumvent this condition by coupling the LCproperty with the bounded level sets as follows

L2(x r) y isin S Dh(y x)le r1113864 1113865 (11)

It is a condition which relates to the parameter h andwhich is verified by most of the interesting Bregmandistances

(2) Inexact version of NoLips

We propose an inexact version of NoLips calledεminus NoLips which verifies

xn

εn minus argmin f(u) +langnablag xnminus 1

1113872 1113873 urang +1λn

Dh u xnminus 1

1113872 11138731113896 1113897

n isin Nlowast

(12)

)e convergence result is established in Section 4 )isstudy covers the convergence results given for PG and BPGby giving new results in particular the convergence of theinexact version of the interior method with Bregman dis-tance studied in [24] this result has not been establisheduntil now

We also note that the convergence of NoLips is givenwith the following condition

dom h dom h (13)

It is for that and for the clarity of the hypothesis we supposein what follows that h S⟶ R with S being an open convexset of Rd

3 Main Results

In order to clarify the status of parameter h we give thefollowing definitions Let S be a convex open subset of Rd

and h S⟶ R Let us consider the following hypotheses

H1 h is continuously differentiable on SH2 h is continuous and strictly convex on S

H3 forallrge 0forallx isin S the sets below are bounded

L2(x r) y isin S Dh(y x)le r1113864 1113865 (14)

H4 forallrge 0forallx isin S the sets below are bounded

L2(x r) y isin S Dh(y x)le r1113864 1113865 (15)

H5 if xn n sub S then xn⟶ xlowast isin S so

Dh xlowast x

n( 1113857⟶ 0 (16)

H6 if xn n sub S then Dh(xlowast xn)⟶ 0 so

xn⟶ x

lowast (17)

Definition 4

(i) h S⟶ R is a Bregman function on S orldquoD-func-tionrdquo if h verifies H1 H2 H3 H4 H5 and H6

(ii) Dh( ) SXS⟶ R such that forallx isin S forally isin S

Dh(x y) h(x) minus h(y) minus langx minus ynablah(y)rang (18)

Eq (18) is called Bregman distance if h is a Bregmanfunction We put the following conditions

A(S) h S⟶ R verifingH1 H21113864 1113865

B(S) h S⟶ R verifingH1 H2 H3 H4 H5 andH61113864 1113865

E(S) h S⟶ R the Legendre type of S1113864 1113865

Proposition 1 Let h and hprime verify H1

forallλ Dλh+hprime( ) λDh( ) + Dhprime( ) (19)

Lemma 2 forallh isin A(S) foralla isin S and forallb c isin S

Dh(a b) + Dh(b c) minus Dh(a c) langa minus bnablah(c) minus nablah(b)rang

(20)

Example 1 If S0 Rd and h0(x) (12)x2 then

Dh0(x y)

12x minus y

2 (21)

Example 2 If S1 Rd++ ≔ x isin Rdxi gt 0 i 1 d1113864 1113865 and

h1(x) 1113944id

i1xilog xi forallx isin S1 (22)

with the convention 0 log 0 0 then forall(x y) isin S1XS1

Dh1(x y) 1113944

d

i1xilog

xi

yi

+ yi minus xi (23)

Example 3 If S2 ]minus 1 1]d and h2(x) minus 1113936idi1

1 minus x2

i

1113969

thenDh2(x y) h2(x) + 1113936

di11 minus xiyiy2

i forall(x y) isin S2XS2

Abstract and Applied Analysis 3

Proposition 2 (see [19]) hi isin B(Si)capE(Si) i 0 1 2 Weconsider the following minimization problem

(p) inf Ψ(x) ≔ f(x) + g(x) x isin S1113864 1113865 (24)

)e following assumptions on the problemrsquos data aremade throughout the paper (and referred to as the blanketassumptions)

Assumption 3

(i) h isin A(S)capE(S)

(ii) g Rd⟶ ]minus infin +infin] is proper (lsc) withS sub domg which is and continuously differentiableon S

(iii) f Rd⟶ ]minus infin +infin] is proper (lsc)(iv) ri(domΨ)cap Sneempty(v) Ψlowast ≔ inf Ψ(x) x isin S1113864 1113865gt minus infin

We consider the operator Tλ defined by forallx isin S

Tλ(x) argmin f(u) +langnablag(x) u minus xrang1113882

+1λDh(u x) u isin S1113883

(25)

We give in the following a series of lemmas allowingestablishment of )eorem 2 which assures the well-pos-edness of the method proposed in Section 4

Lemma 3 forallλgt 0 and forallx isin S

Tλ(x) argmin Ψ(u) +1λDhminus λg(u x) u isin S1113880 1113883 (26)

Proof forallx isin S sub domg

(25)⟹ Tλ(x) argmin f(u) + g(x) +langnablag(x) u minus xrang1113882

+1λDh(u x) u isin S1113883

(27)

Whenforallu isin S sub domg we have

f(u) + g(x) + langnablag(x) u minus xrang +1λDh(u x)

Ψ(u) + g(x) minus g(u) + langnablag(x) u minus xrang +1λDh(u x)

Ψ(u) minus Dg(u x) +1λDh(u x)

Ψ(u) +1λDhminus λg(u x)

(28)

Lemma 4 If the pair (g h) verified the condition (LC) thenexist Lgt 0 forallx isin S forallu isin S

(i) Dg(u x)leLDh(u x)

(ii) forallλ isin ]0 1L[ Dhminus λg(u x)ge 0

Proof

(i) If Lh-g is convex on S then

DLhminus g(u x)ge 0 forallx isin S forallu isin S (29)

Let u isin (SS) )ere exists a sequence un n sub S suchthat un⟶ u then we have

DLhminus g un x( 1113857ge 0 (30)

Lh-g is continuous in S then DLhminus g(u x)ge 0

(ii) Let λ isin ]0 1L[

DLhminus g(u x)ge 0⟹ LDh(u x)geDg(u x)

⟹1λDh(u x)geDg(u x)

⟹Dhminus λg(u x)ge 0

(31)

Lemma 5 If h is the Legendre on S then h minus λg is also theLegendre on S for all λ such that 0lt λlt (1L)

Proof Conditions (a) and (b) of Definition 1 being verifiedlet us demonstrate that the condition (c) is verified too Letxi1113864 1113865 be xi⟶ xlowast isin Fr(S) (SS)

nablah xi( 1113857 minus λnablag xi( 1113857

2 ge nablah xi( 1113857

nablah xi( 1113857

minus 2λ nablag xi( 1113857

1113872 1113873

+ λ2 nablag xi( 1113857

2

(32)

)en

limxi⟶xlowastisinFr(S)

nabla(h minus λg) xi( 1113857

+infin (33)

h minus λg is strictly convex in S Indeed let x y isin S xney wehave Dhminus λg(x y)ge 0

Dhminus λg(x y) 0⟹Dh(x y)

λDg(x y)⟹Dh(x y)le λLDh(x y)

(34)

h is strongly convex on S x y isin S xney thenDh(x y)ne 0⟹1le λL which is absurd Hence h minus λg isstrictly convex in S

Lemma 6 Consider the following

4 Abstract and Applied Analysis

forallλ isin 01L

1113877 1113876 forallu isin S z Dhminus λg( u)1113872 1113873 xlowast

( 1113857

nabla(h minus λg) xlowast( ) minus nabla(h minus λg)(u)1113864 1113865 if xlowast isin S

empty if not

⎧⎪⎨

⎪⎩

(35)

Proof Since h minus λg is a Legendre function on S Dhminus λg( u)

is also a Legendre By application of )eorem 261 in [23]z(Dhminus λg( u)) verifies the following

(i) If xlowast isin int(domDhminus λg( u)) S then

z Dhminus λg( u)1113872 1113873 xlowast

( 1113857 nablaDhminus λg( u) xlowast

( 11138571113966 1113967 (36)

(ii) If xlowast notin S then z(Dh( u))(xlowast) empty

Theorem 2 (well-posedness of the method) We assumethat

(i) Ψ is convex(ii) e pair (g h) verified the condition (LC)(iii) forallrge 0 forallx isin S the sets below are bounded

L2(x r) y isin S Dh(y x)le r1113864 1113865 (37)

Then forallλ isin ]0 (1L)[ and the map Tλ defined in (25) isnonempty and single-valued from S to S

Proof forallx isin S and Tλ(x) is nonempty for this it is enoughto demonstrate that forallr isin R

L(x r) u isin S Ψ(u) + λminus 1Dhminus λg(u x)le r1113966 1113967 (38)

which is closed and is bounded when it is nonempty

u isin L(x r)⟹Ψ(u) + λminus 1Dhminus λg(u x)le r

⟹Dhminus λg(u x)le λ r minus Ψlowast( 1113857

⟹Dh(u x)le λ r minus Ψlowast( 1113857 + λDg(u x)

⟹Dh(u x)le λ r minus Ψlowast( 1113857

+ LλDh(u x)

⟹Dh(u x)leλ r minus Ψlowast( )

1 minus Lλ

(39)

It follows that

L(x r) sub L2 xλ r minus Ψlowast( )

1 minus Lλ1113888 1113889 (40)

thanks to H4 L2(x (λ(r minus Ψlowast)1 minus Lλ)) is bounded whichleads that L(x r) is bounded too which shows that

Tλ(x)neempty (41)

Let xlowast isin Tλ(x) Let us suppose that xlowast isin S(26)⟹ 0 isinz(Ψ(middot) + 1λDhminus λg( x))(xlowast) since ri(domΨ)cap ri(domDhminus λg

( x)) ri(domΨ)cap ri(S) ri(domΨ)cap Sneempty from [10]which allows to write that

0 isin zΨ xlowast

( 1113857 + z1λDhminus λg( x)1113874 1113875 x

lowast( 1113857 (42)

It follows that

existu isin zΨ xlowast

( 1113857 (43)

such that

minus λu isin zDhminus λg( x) xlowast

( 1113857 (44)

is in contradiction with Lemma 6 )en Tλ(x) sub S

On the other hand h minus λg is strictly convex in S and Ψ isconvex so Ψ(middot) + Dhminus λg( x) is strongly convex in S )enTλ(x) has a unique value for all x isin S

Remark 1 )is result is liberated from the supercoercivityof Ψ and the simultaneous convexity of f and g as requiredby Lemma 2 [15]

Proposition 3 forallx isin Sforallλ isin ]0 (1L)[nabla(h minus λg)(x) minus nabla(h minus λg) Tλ(x)( 1113857

λisin zΨ Tλ(x)( 1113857 (45)

existx isin Snabla(h minus λg)(x) minus nabla(h minus λg)(x)

λisin zεΨ(x) (46)

Proof Since Tλ(x) isin S we have

0 isin z Ψ(middot) +1λDhminus λg( x)1113874 1113875 Tλ(x)( 1113857

⟹ minus nabla1λDhminus λg( x)1113874 1113875 Tλ(x)( 1113857( 1113857 isin z Ψ Tλ(x)( 1113857( 1113857

⟹ (45)

(47)

For (46) just take x Tλ(x) since

zΨ Tλ(x)( 1113857 sub zεΨ Tλ(x)( 1113857 (48)

Proposition 4 forallx isin Sforallλ isin ]0 (1L)[

Tλ(x) proxhminus λg

λΨ (x) proxhλf o proxh

λp(x) (49)

where p(u) langnablag(x) urang

Proof )e first equality is due to Lemma 3 )e second isestablished in [15]

)e first equality played a decisive role in the devel-opment of this paper

4 Analysis of the εminusNoLips Algorithm

In this section we propose an Inexact Bregman ProximalGradient Algorithm (IBPG) which is an inexact version ofthe BPG algorithm described in [15 22] the IBPG

Abstract and Applied Analysis 5

framework allows an error in the subgradient inclusion byusing the error εn We study two algorithms

(i) Algorithm 1 inexact Bregman Proximal Gradient(IBPG) algorithm without relative error criterion

(ii) Algorithm 2 inexact Bregman Proximal Gradient(IBPG) algorithm with relative error criterion whichwe call εminus NoLips

We establish the main convergence properties of theproposed algorithms In particular we prove its global rateof convergence showing that it shares the claimed sublinearrate O(1n) of basic first-order methods such as the classicalPG and BPG We also derive a global convergence of thesequence generated by NoLips to a minimizer of (P)

Assumptions 4

(i) h isin B(S)capE(S)

(ii) Ψ is convex(iii) )e pair (g h) verified the condition (LC)(iv) Argmin Ψneempty

In our analysis Ψ is supposed to be a convex function itallows to distinguish two interesting cases

(i) )e nonsmooth part f is possibly not convex and thesmooth part g is convex

(ii) )e nonsmooth part f is convex and the smooth partg is possibly not convex

In what follows the choice of the sequence λn1113864 1113865 dependsof the convexity of g

Let λ such that 0lt λlt (1L) λ0 ≔ 0

If g is not convex then we choose

λn λ (0lt λle λ) n 1 (50)

If g is convex then we choose

λn le λn+1 le λ n 1 (51)

In those conditions we easily show thatforallx y isin S foralln isin N 0lt λn le λ such that

λn+1 minus λn( 1113857Dg(x y)ge 0 (52)

We pose

hn h minus λng n 1 (53)

Proposition 5 e sequence xn n defined by (IBPG) existsand is verified for all n isin Nlowast

xn isin εn minus argmin Ψ(u) +

1λn

Dhnu x

nminus 11113872 1113873 u isin S1113896 1113897 (54)

Proof Existence is deduced trivially from (45)

Ωn ≔nablahn xnminus 1( 1113857 minus nablahn xn( )

λn

isin zεnΨ x

n( 1113857

⟹Ψ(u)geΨ xn

( 1113857 +langu minus xnΩnrang minus εn

(55)

By applying Lemma 2 we have

Ψ(u)geΨ xn

( 1113857 + λminus 1n Dhn

u xn

( 1113857 + Dhnx

n x

nminus 11113872 1113873 minus Dhn

u xnminus 1

1113872 11138731113960 1113961 minus εn

⟹Ψ xn

( 1113857 + λminus 1n Dhn

xn x

nminus 11113872 1113873leΨ(u) + λminus 1

n Dhnu x

nminus 11113872 1113873 + εn forallu isin S

⟹ xn isin T

εn

λnx

nminus 11113872 1113873

(56)

where Tελ(x) ε minus argmin Ψ(u) + (1λ)Dhminus λg(u x)1113966 1113967

Remark 2 )is result shows that IBPG is an inexact versionof BPG and this is exactly the BPG when εn 0 ie

(i) xn isin Tεn

λn(xnminus 1)⟹ xn ≃Tλn

(xnminus 1)

(ii) εn 0⟹ xn Tλn(xnminus 1)

Proposition 6 For all n isin Nlowast

(i) Ψ xn

( 1113857 minus Ψ xnminus 1

1113872 11138731113872 1113873le minus Dhnx

n x

nminus 11113872 1113873 + λnεn (57)

(ii) Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le

λ1 minus λLΨ x

nminus 11113872 11138731113872

minus Ψ xn

( 11138571113857 +λnεn

1 minus λL

(58)

Proof

(55)⟹λn Ψ xn

( 1113857 minus Ψ(u)( 1113857le Dhnu x

nminus 11113872 1113873 minus Dhn

u xn

( 1113857 minus Dhnx

n x

nminus 11113872 11138731113960 1113961 + εnλn (59)

6 Abstract and Applied Analysis

We put u xnminus 1 in (59) we get (57)(ii) Put u xnminus 1 in (59) we have

D_{h_n}(x_n, x_{n−1}) + D_{h_n}(x_{n−1}, x_n) ≤ λ_n(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n

⟹ D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n) ≤ λ_n(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n + λ_nD_g(x_n, x_{n−1}) + λ_nD_g(x_{n−1}, x_n)

⟹ D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n) ≤ λ̄(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n + λ̄L(D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n))

⟹ D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n) ≤ (λ̄/(1 − λ̄L))(Ψ(x_{n−1}) − Ψ(x_n)) + λ_nε_n/(1 − λ̄L).  (60)

Corollary 1

(i) If ε_n = 0, the sequence {Ψ(x_n)} is nonincreasing.

(ii) Summability: if Σ_{n=1}^{∞} λ_nε_n < +∞, then Σ_{n=1}^{∞} D_h(x_n, x_{n−1}) < +∞ and Σ_{n=1}^{∞} D_h(x_{n−1}, x_n) < +∞.

Proof.

(i) From (57) with ε_n = 0, D_{h_n}(x_n, x_{n−1}) ≥ 0 ⟹ Ψ(x_n) ≤ Ψ(x_{n−1}), ∀n ∈ N*.

(ii) From (58),

Σ_{n=1}^{p} [D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n)] ≤ (λ̄/(1 − λ̄L)) Σ_{n=1}^{p} (Ψ(x_{n−1}) − Ψ(x_n)) + (1/(1 − λ̄L)) Σ_{n=1}^{p} λ_nε_n

⟹ Σ_{n=1}^{p} [D_h(x_n, x_{n−1}) + D_h(x_{n−1}, x_n)] ≤ (λ̄/(1 − λ̄L))(Ψ(x_0) − Ψ*) + (1/(1 − λ̄L)) Σ_{n=1}^{p} λ_nε_n.  (61)

In the following, we pose

t_p := Σ_{n=1}^{p} λ_n, ∀p ∈ N*, t_0 = λ_0 = 0;
A_n(u) := D_{h_n}(u, x_n), ∀u ∈ S, ∀n ∈ N*;
Φ(x_p) := min{Ψ(x_k) : 1 ≤ k ≤ p}, ∀p ∈ N*.  (62)

Proposition 7 (global estimate in function values). For all u ∈ S and all p ∈ N*,

Φ(x_p) − Ψ(u) ≤ (1/t_p)[D_h(u, x_0) + Σ_{n=1}^{∞} λ_nε_n].  (63)

Proof. We have D_{h_n}(u, x_{n−1}) = A_{n−1}(u) − (λ_n − λ_{n−1})D_g(u, x_{n−1}); from (59), we have

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − D_{h_n}(x_n, x_{n−1}) − (λ_n − λ_{n−1})D_g(u, x_{n−1}) + λ_nε_n.  (64)

From (52), we have

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) + λ_nε_n

⟹ Σ_{n=1}^{p} λ_n(Ψ(x_n) − Ψ(u)) ≤ Σ_{n=1}^{p} [A_{n−1}(u) − A_n(u)] + Σ_{n=1}^{p} λ_nε_n

⟹ Σ_{n=1}^{p} λ_n(Ψ(x_n) − Ψ(u)) ≤ A_0(u) − A_p(u) + Σ_{n=1}^{p} λ_nε_n

⟹ (Φ(x_p) − Ψ(u)) Σ_{n=1}^{p} λ_n ≤ A_0(u) − A_p(u) + Σ_{n=1}^{p} λ_nε_n.  (65)

Since A_0(u) = D_h(u, x_0) (λ_0 = 0) and A_p(u) ≥ 0, we obtain

Φ(x_p) − Ψ(u) ≤ (1/t_p)[D_h(u, x_0) + Σ_{n=1}^{∞} λ_nε_n].  (66)

This result covers the evaluation of the global convergence rate given in [6, 12], as shown by the following corollary.

Corollary 2. We assume that

(a) ε_n = 0, n ≥ 1;
(b) λ_n = λ = 1/(2L), n ≥ 1;
(c) x* ∈ argmin Ψ.

Then

Ψ(x_p) − Ψ* ≤ (2L/p) D_h(x*, x_0),  (67)

i.e., Ψ(x_p) − Ψ* = O(1/p).

Proof. Immediate consequence of Proposition 7: with ε_n = 0, the sequence {Ψ(x_n)} is nonincreasing (Corollary 1), so Φ(x_p) = Ψ(x_p), and t_p = p/(2L).

Now, we derive the global convergence of the sequence generated by Algorithm 1 to a minimizer of (P).


Theorem 3. Assume that Σ λ_nε_n < +∞ and that one of the following assumptions holds:

(i) ∃λ > 0 such that λ ≤ λ_n ≤ λ̄, n ≥ 1;

(ii) the sequence {Ψ(x_n)} is nonincreasing and Σ λ_n = +∞.

Then (a) Ψ(x_n) → inf Ψ and (b) x_n → x* ∈ argmin Ψ.

Proof.

(a) Suppose (i). Let x* ∈ argmin Ψ and put u = x* in (64); dropping the nonpositive terms and using λ ≤ λ_n together with Ψ(x_n) − Ψ(x*) ≥ 0, we have

λ(Ψ(x_n) − Ψ(x*)) ≤ A_{n−1}(x*) − A_n(x*) + λ_nε_n,  (68)

A_n(x*) ≤ A_{n−1}(x*) + λ_nε_n,  (69)

Σ λ_nε_n < +∞ ⟹ A_n(x*) → l ∈ R.  (70)

Summing (68) and using (70) together with Σ λ_nε_n < +∞, we get Σ λ(Ψ(x_n) − Ψ(x*)) < +∞, hence Ψ(x_n) → inf Ψ.

Suppose (ii). Since {Ψ(x_n)} is nonincreasing,

Φ(x_p) = min{Ψ(x_k) : 1 ≤ k ≤ p} = Ψ(x_p).  (71)

For u = x* ∈ argmin Ψ in (63), we have 0 ≤ Ψ(x_p) − Ψ(x*) ≤ (1/t_p)[D_h(x*, x_0) + Σ_{n=1}^{∞} λ_nε_n], so Σ λ_n = +∞ ⟹ Ψ(x_n) → inf Ψ.

(b) From (69) and (70), ∃α > 0 such that A_n(x*) ≤ α. Since A_n(x*) = D_{h_n}(x*, x_n) = D_h(x*, x_n) − λ_nD_g(x*, x_n), we have

D_h(x*, x_n) ≤ α + λ_nD_g(x*, x_n) ⟹ D_h(x*, x_n) ≤ α/(1 − λ̄L).  (72)

Then {D_h(x*, x_n)} is bounded and, from H3, {x_n} is bounded as well. Let u* be an accumulation point of {x_n} (u* ∈ Adh{x_n}); there exists then a subsequence {x_{n_i}} of {x_n} such that x_{n_i} → u* ∈ S̄. From H5, D_h(u*, x_{n_i}) → 0. On the other hand,

0 ≤ D_g(u*, x_{n_i}) ≤ L · D_h(u*, x_{n_i}),  (73)

so D_g(u*, x_{n_i}) → 0. Moreover, u* ∈ argmin Ψ; indeed, by the lower semicontinuity of Ψ,

inf Ψ ≤ Ψ(u*) ≤ lim Ψ(x_{n_i}) = inf Ψ ⟹ inf Ψ = Ψ(u*),  (74)

which shows that u* ∈ argmin Ψ. Then

D_{h_{n_i}}(u*, x_{n_i}) = D_h(u*, x_{n_i}) − λ_{n_i}D_g(u*, x_{n_i}) → 0.  (75)

Since D_{h_{n_i}}(u*, x_{n_i}) → 0, u* ∈ argmin Ψ, and the whole sequence {A_n(u*)} converges (by (70) applied at u*), we have

D_{h_n}(u*, x_n) → 0.  (76)

We have

D_h(u*, x_n) = D_{h_n}(u*, x_n) + λ_nD_g(u*, x_n) ≤ D_{h_n}(u*, x_n) + λ̄L · D_h(u*, x_n),  (77)

so (1 − λ̄L)D_h(u*, x_n) ≤ D_{h_n}(u*, x_n); then

D_h(u*, x_n) → 0.  (78)

And from H6, we have x_n → u* ∈ argmin Ψ.

The IBPG algorithm generates a sequence {Ψ(x_n)} that is not necessarily nonincreasing. For this reason, and to improve the global estimate in function values, we now propose ε-NoLips, an inexact version of BPG with a relative error criterion. Let σ be given such that 0 ≤ σ < 1; the method is stated below.

ALGORITHM 2: ε-NoLips.
(1) Input: x_0 ∈ S ∩ ri(dom Ψ).
(2) Choose λ_n > 0 and find x_n ∈ S, σ_n ∈ [0, σ], and ε_n ≥ 0 such that (∇h_n(x_{n−1}) − ∇h_n(x_n))/λ_n ∈ ∂_{ε_n}Ψ(x_n) and λ_nε_n ≤ σ_n D_{h_n}(x_n, x_{n−1}).
(3) Set n ← n + 1 and go to step (2).
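As a minimal sketch of how the relative error criterion of Algorithm 2 can be enforced, assume a hypothetical inner routine approx_argmin(x_prev, tol) that returns a candidate point together with a certified upper bound eps on its objective gap for the subproblem; the routine's name, its signature, and the geometric tightening schedule below are illustrative assumptions, not part of the paper.

```python
def eps_nolips_step(x_prev, lam, sigma, D_hn, approx_argmin, tol0=1.0, shrink=0.1):
    # One eps-NoLips step (Algorithm 2): solve the subproblem inexactly and
    # accept only when lam * eps <= sigma * D_{h_n}(x, x_prev); otherwise
    # tighten the inner tolerance and retry.
    tol = tol0
    while True:
        x, eps = approx_argmin(x_prev, tol)  # hypothetical eps-approximate solver
        if lam * eps <= sigma * D_hn(x, x_prev):
            return x
        tol *= shrink
```

With σ_n ≡ σ, each accepted step satisfies λ_nε_n ≤ σ D_{h_n}(x_n, x_{n−1}) by construction, which is exactly the inequality used in the proofs of Lemmas 7–9 below.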

In what follows, we derive a convergence rate result (Theorem 4) for the ε-NoLips framework. First, we need to establish a few technical lemmas. In the following, {x_n} denotes the sequence generated by ε-NoLips.

Lemma 7. For every u ∈ S and all n ∈ N*,

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − (1 − σ)(1 − λ̄L) D_h(x_n, x_{n−1}).  (79)

Proof. Since (λ_n − λ_{n−1})D_g(u, x_{n−1}) ≥ 0, we have from (64)

λ_n(Ψ(x_n) − Ψ(u)) ≤ −D_{h_n}(x_n, x_{n−1}) + A_{n−1}(u) − A_n(u) + λ_nε_n.  (80)

From Algorithm 2 (λ_nε_n ≤ σ_n D_{h_n}(x_n, x_{n−1}) ≤ σ D_{h_n}(x_n, x_{n−1})), we have

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) + (σ − 1) D_{h_n}(x_n, x_{n−1}).  (81)

From the condition (LC), D_{h_n}(x_n, x_{n−1}) ≥ (1 − λ̄L) D_h(x_n, x_{n−1}), and we have

λ_n(Ψ(x_n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − (1 − σ)(1 − λ̄L) D_h(x_n, x_{n−1}).  (82)

Remark 3. We now notice that {Ψ(x_n)} is nonincreasing: just replace u with x_{n−1} in (79).

Lemma 8. For every n ∈ N* and x* ∈ argmin Ψ, we have

t_n(Ψ(x_n) − Ψ*) + λ_n^{−1}t_n(1 − σ)(1 − λ̄L) D_h(x_n, x_{n−1}) ≤ t_{n−1}(Ψ(x_{n−1}) − Ψ*) + A_{n−1}(x*) − A_n(x*).  (83)

Proof. Replacing u by x_{n−1} in (79), using A_n(x_{n−1}) ≥ 0 and A_{n−1}(x_{n−1}) = 0, and multiplying by t_{n−1}/λ_n ≥ 0, we have

t_{n−1}(Ψ(x_n) − Ψ*) + t_{n−1}λ_n^{−1}(1 − σ)(1 − λ̄L) D_h(x_n, x_{n−1}) ≤ t_{n−1}(Ψ(x_{n−1}) − Ψ*).  (84)

Replacing u by x* in (79), we have

λ_n(Ψ(x_n) − Ψ(x*)) + (1 − σ)(1 − λ̄L) D_h(x_n, x_{n−1}) ≤ A_{n−1}(x*) − A_n(x*).  (85)

Since t_{n−1} + λ_n = t_n, adding (84) and (85) yields (83).

Lemma 9. For every k ∈ N*,

t_k(Ψ(x_k) − Ψ*) + (1 − σ)(1 − λ̄L) Σ_{n=1}^{k} λ_n^{−1}t_n D_h(x_n, x_{n−1}) ≤ A_0(x*) − A_k(x*).  (86)

Proof. This result is obtained by adding inequality (83) from 1 to k (t_0 = λ_0 = 0).

We are now ready to state the convergence rate result for the ε-NoLips framework. This result improves and completes the one given in Proposition 7.

Theorem 4. Pose ρ_k := Σ_{n=1}^{k} λ_n^{−1}t_n. For every k ∈ N*, the following statements hold:

(a) Ψ(x_k) − Ψ* ≤ D_h(x*, x_0)/t_k;  (87)

(b) c_k := min_{1≤n≤k} D_h(x_n, x_{n−1}) ≤ D_h(x*, x_0)/((1 − σ)(1 − λ̄L)ρ_k).  (88)

Proof. From (86), since A_0(x*) = D_h(x*, x_0) (λ_0 = 0) and A_k(x*) ≥ 0, we immediately have (87) and (88).

Corollary 3. Consider an instance of the ε-NoLips framework with λ_n = λ for every n ∈ N*. Then, for every k ∈ N*, the following statements hold:

(a) Ψ(x_k) − Ψ* = O(1/k);  (89)

(b) c_k = min_{1≤n≤k} D_h(x_n, x_{n−1}) = O(1/k²).  (90)

Proof. t_k = kλ and (87) imply (89);

ρ_k = Σ_{n=1}^{k} λ^{−1}(nλ) = k(k + 1)/2 ≥ k²/2,  (91)

and (88) implies (90).

Remark 4. (87) and (89) represent exactly the convergence rate established in [15, 22]. (90) is a new result, not established in [15, 22]; it shows that c_k converges to zero at a rate of O(1/k²).

Theorem 5. If Σ λ_n = +∞, then

(a) Ψ(x_n) → inf Ψ;

(b) x_n → x* ∈ argmin Ψ.

Proof. Replacing u by x* in (81) and using Ψ(x_n) − Ψ(x*) ≥ 0, we have

(1 − σ) D_{h_n}(x_n, x_{n−1}) ≤ A_{n−1}(x*) − A_n(x*).  (92)

Adding inequality (92) from 1 to k, we have

(1 − σ) Σ_{n=1}^{k} D_{h_n}(x_n, x_{n−1}) ≤ A_0(x*) − A_k(x*).  (93)

A_k(x*) ≥ 0, so Σ_{n=1}^{∞} D_{h_n}(x_n, x_{n−1}) < +∞. From the relative error criterion of Algorithm 2 (λ_nε_n ≤ σ D_{h_n}(x_n, x_{n−1})), we then have

Σ_{n=1}^{∞} λ_nε_n < +∞.  (94)

Since {Ψ(x_n)} is nonincreasing (Remark 3) and (94) holds, applying Theorem 3(ii) gives the results of Theorem 5.

5. Application to Nonnegative Linear Inverse Problem

In Poisson inverse problems (e.g., [25, 26]), we are given a nonnegative observation matrix A ∈ R^{m×d}_+ and a noisy measurement vector b ∈ R^m_+, and the goal is to reconstruct the signal x ∈ R^d_+ such that Ax ≃ b. We can naturally adopt the distance D(Ax, b) to measure the residuals between two nonnegative points, with



D(Ax, b) = Σ_{i=1}^{m} [〈a_i, x〉 log(〈a_i, x〉/b_i) + b_i − 〈a_i, x〉],  (95)

where a_i denotes the ith row of A.

In this section, we propose an approach for solving the nonnegative linear inverse problem defined by

(P_α)  inf{α‖x‖₁ + D(Ax, b) : x ∈ R^d_+} (α > 0).  (96)

We take

f(x) := α‖x‖₁,  g(x) := D(Ax, b),  h(x) := h₁(x) = Σ_{i=1}^{d} x_i log x_i.  (97)

It is shown in [15] that the couple (g, h) verifies the Lipschitz-like/Convexity Condition (LC) on R^d_+ for any L such that

L ≥ max_{1≤j≤d} Σ_{i=1}^{m} a_{ij} = max_{1≤j≤d} ‖A_j‖₁,  (98)

where A_j denotes the jth column of A.

For λ_n := λ, ∀n, Theorem 3 is applicable and guarantees the global convergence of Algorithm 1 to an optimal solution of (P_α).

Given x^n ∈ R^d_{++}, the iteration

x^{n+1} ≃ T_λ(x^n)  (99)

amounts to solving d one-dimensional problems: for j = 1, ..., d,

x_j^{n+1} ≃ argmin{αt + c_jt + (1/λ)(t log(t/x_j^n) + x_j^n − t) : t > 0},  (100)

where c_j is the jth component of ∇g(x^n).
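Setting the derivative of the objective in (100) to zero gives log(t/x_j^n) = −λ(α + c_j), so each coordinate subproblem has the closed-form solution x_j^{n+1} = x_j^n exp(−λ(α + c_j)), a multiplicative update that keeps the iterates in R^d_{++}. The sketch below implements this exact step (ε_n = 0); the function names and the fixed iteration budget are illustrative choices, and we assume every column of A is nonzero and that Ax > 0 componentwise along the iterates, so that the logarithms are well defined.

```python
import numpy as np

def grad_g(A, b, x):
    # c = grad g(x) for g(x) = D(Ax, b): c_j = sum_i a_ij * log(<a_i, x> / b_i).
    return A.T @ np.log(A @ x / b)

def nolips_step(A, b, x, lam, alpha):
    # Exact Bregman proximal gradient step for (P_alpha) with the entropy
    # kernel h_1: coordinatewise closed form of the subproblem (100).
    return x * np.exp(-lam * (alpha + grad_g(A, b, x)))

def solve_p_alpha(A, b, alpha, n_iter=500):
    # Constant stepsize lam < 1/L with L = max_j ||A_j||_1, cf. (98);
    # A is assumed entrywise nonnegative, as in the problem setting.
    L = A.sum(axis=0).max()
    lam = 0.99 / L
    x = np.ones(A.shape[1])  # strictly positive start, x^0 in R^d_++
    for _ in range(n_iter):
        x = nolips_step(A, b, x, lam, alpha)
    return x
```

Tracking min_{n≤k} D_{h₁}(x^n, x^{n−1}) along such a run against the bound (88) would illustrate the O(1/k²) rate of Corollary 3.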

6. Conclusion

The proposed algorithms constitute a unified frame for the existing algorithms BPG, BP, and PG, while yielding others, in particular the inexact version of the interior method with Bregman distance studied in [24]. More precisely:

(i) when ε_n = 0, ∀n ∈ N*, our algorithm is the NoLips algorithm studied in [15, 22];

(ii) when g = 0, our algorithm is the inexact version of the Bregman Proximal (BP) algorithm studied in [19];

(iii) when ε_n = 0, ∀n ∈ N*, and g = 0, our algorithm is the Bregman Proximal (BP) algorithm studied in [17, 21];

(iv) when f = 0, our algorithm is the inexact version of the interior method with Bregman distance studied in [24];

(v) when f = 0 and ε_n = 0, ∀n ∈ N*, our algorithm is the interior method with Bregman distance studied in [24];

(vi) when h = (1/2)‖·‖², our algorithm is the proximal gradient method (PG) and its variants [4–14, 27].

Our analysis is different from, and simpler than, the one given in [15, 22], and it allows us to drop some hypotheses, in particular the supercoercivity of Ψ as well as the simultaneous convexity of f and g.

Data Availability

No data were used to support this study

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.

[2] A. Beck and Y. C. Eldar, "Sparsity constrained nonlinear optimization: optimality conditions and algorithms," SIAM Journal on Optimization, vol. 23, no. 3, pp. 1480–1509, 2013.

[3] D. R. Luke, "Phase retrieval, what's new?" SIAG/OPT Views and News, vol. 25, no. 1, pp. 1–5, 2017.

[4] A. Auslender, "Numerical methods for nondifferentiable convex optimization," in Mathematical Programming Studies, J. P. Vial and C. B. Nguyen, Eds., vol. 30, pp. 102–126, Springer, Berlin, Germany, 1987.

[5] A. I. Chen and A. Ozdaglar, "A fast distributed proximal-gradient method," in Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 601–608, IEEE, Monticello, IL, USA, October 2012.

[6] J.-B. Hiriart-Urruty, "ε-subdifferential calculus," in Convex Analysis and Optimization, Research Notes in Mathematics Series, vol. 57, Pitman, London, UK, 1982.

[7] K. Jiang, D. Sun, and K.-C. Toh, "An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP," SIAM Journal on Optimization, vol. 22, no. 3, pp. 1042–1064, 2012.



[8] B. Lemaire, "About the convergence of the proximal method," in Advances in Optimization (Proceedings of the 6th French-German Conference on Optimization, 1991), Lecture Notes, pp. 39–51, Springer, Berlin, Germany, 1992.

[9] B. Lemaire, "The proximal algorithm," International Series of Numerical Mathematics, vol. 87, pp. 73–87, 1989.

[10] B. Martinet, "Perturbation des méthodes d'optimisation. Application," Modélisation Mathématique et Analyse Numérique, vol. 12, no. 2, pp. 153–171, 1976.

[11] N. Parikh, "Proximal algorithms," Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.

[12] M. Schmidt, N. Le Roux, and F. R. Bach, "Convergence rates of inexact proximal-gradient methods for convex optimization," in Advances in Neural Information Processing Systems, vol. 28, pp. 1458–1466, MIT Press, Cambridge, MA, USA, 2011.

[13] N. D. Vanli, M. Gurbuzbalaban, and A. Ozdaglar, "Global convergence rate of proximal incremental aggregated gradient methods," SIAM Journal on Optimization, vol. 28, no. 2, pp. 1282–1300, 2018.

[14] L. Xiao and T. Zhang, "A proximal stochastic gradient method with progressive variance reduction," SIAM Journal on Optimization, vol. 24, no. 4, pp. 2057–2075, 2014.

[15] J. Bolte, S. Sabach, M. Teboulle, and Y. Vaisbourd, "First order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems," SIAM Journal on Optimization, vol. 28, no. 3, pp. 2131–2151, 2018.

[16] L. M. Bregman, "The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming," USSR Computational Mathematics and Mathematical Physics, vol. 7, no. 3, pp. 200–217, 1967.

[17] J. Eckstein, "Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming," Mathematics of Operations Research, vol. 18, no. 1, pp. 202–226, 1993.

[18] S. Kabbadj, "Theoretical aspect of diagonal Bregman proximal methods," Journal of Applied Mathematics, vol. 2020, Article ID 3108056, 9 pages, 2020.

[19] S. Kabbadj, "Méthodes proximales entropiques," Thèse, Université Montpellier II, Montpellier, France, 1994.

[20] R. T. Rockafellar, "Monotone operators and the proximal point algorithm," SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.

[21] G. Chen and M. Teboulle, "Convergence analysis of a proximal-like minimization algorithm using Bregman functions," SIAM Journal on Optimization, vol. 3, no. 3, pp. 538–543, 1993.

[22] D. H. Gutman and J. F. Peña, "A unified framework for Bregman proximal methods: subgradient, gradient, and accelerated gradient schemes," 2018, https://arxiv.org/pdf/1812.10198.pdf.

[23] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, USA, 1970.

[24] A. Auslender and M. Teboulle, "Interior gradient and proximal methods for convex and conic optimization," SIAM Journal on Optimization, vol. 16, no. 3, pp. 697–725, 2006.

[25] M. Bertero, P. Boccacci, G. Desidera, and G. Vicidomini, "Image deblurring with Poisson data: from cells to galaxies," Inverse Problems, vol. 25, no. 12, 2009.

[26] I. Csiszár, "Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems," The Annals of Statistics, vol. 19, no. 4, pp. 2032–2066, 1991.

[27] A. Nitanda, "Stochastic proximal gradient descent with acceleration techniques," in Advances in Neural Information Processing Systems, pp. 1574–1582, MIT Press, Cambridge, MA, USA, 2014.


Proof )e first equality is due to Lemma 3 )e second isestablished in [15]

)e first equality played a decisive role in the devel-opment of this paper

4 Analysis of the εminusNoLips Algorithm

In this section we propose an Inexact Bregman ProximalGradient Algorithm (IBPG) which is an inexact version ofthe BPG algorithm described in [15 22] the IBPG

Abstract and Applied Analysis 5

framework allows an error in the subgradient inclusion byusing the error εn We study two algorithms

(i) Algorithm 1 inexact Bregman Proximal Gradient(IBPG) algorithm without relative error criterion

(ii) Algorithm 2 inexact Bregman Proximal Gradient(IBPG) algorithm with relative error criterion whichwe call εminus NoLips

We establish the main convergence properties of theproposed algorithms In particular we prove its global rateof convergence showing that it shares the claimed sublinearrate O(1n) of basic first-order methods such as the classicalPG and BPG We also derive a global convergence of thesequence generated by NoLips to a minimizer of (P)

Assumptions 4

(i) h isin B(S)capE(S)

(ii) Ψ is convex(iii) )e pair (g h) verified the condition (LC)(iv) Argmin Ψneempty

In our analysis Ψ is supposed to be a convex function itallows to distinguish two interesting cases

(i) )e nonsmooth part f is possibly not convex and thesmooth part g is convex

(ii) )e nonsmooth part f is convex and the smooth partg is possibly not convex

In what follows the choice of the sequence λn1113864 1113865 dependsof the convexity of g

Let λ such that 0lt λlt (1L) λ0 ≔ 0

If g is not convex then we choose

λn λ (0lt λle λ) n 1 (50)

If g is convex then we choose

λn le λn+1 le λ n 1 (51)

In those conditions we easily show thatforallx y isin S foralln isin N 0lt λn le λ such that

λn+1 minus λn( 1113857Dg(x y)ge 0 (52)

We pose

hn h minus λng n 1 (53)

Proposition 5 e sequence xn n defined by (IBPG) existsand is verified for all n isin Nlowast

xn isin εn minus argmin Ψ(u) +

1λn

Dhnu x

nminus 11113872 1113873 u isin S1113896 1113897 (54)

Proof Existence is deduced trivially from (45)

Ωn ≔nablahn xnminus 1( 1113857 minus nablahn xn( )

λn

isin zεnΨ x

n( 1113857

⟹Ψ(u)geΨ xn

( 1113857 +langu minus xnΩnrang minus εn

(55)

By applying Lemma 2 we have

Ψ(u)geΨ xn

( 1113857 + λminus 1n Dhn

u xn

( 1113857 + Dhnx

n x

nminus 11113872 1113873 minus Dhn

u xnminus 1

1113872 11138731113960 1113961 minus εn

⟹Ψ xn

( 1113857 + λminus 1n Dhn

xn x

nminus 11113872 1113873leΨ(u) + λminus 1

n Dhnu x

nminus 11113872 1113873 + εn forallu isin S

⟹ xn isin T

εn

λnx

nminus 11113872 1113873

(56)

where Tελ(x) ε minus argmin Ψ(u) + (1λ)Dhminus λg(u x)1113966 1113967

Remark 2 )is result shows that IBPG is an inexact versionof BPG and this is exactly the BPG when εn 0 ie

(i) xn isin Tεn

λn(xnminus 1)⟹ xn ≃Tλn

(xnminus 1)

(ii) εn 0⟹ xn Tλn(xnminus 1)

Proposition 6 For all n isin Nlowast

(i) Ψ xn

( 1113857 minus Ψ xnminus 1

1113872 11138731113872 1113873le minus Dhnx

n x

nminus 11113872 1113873 + λnεn (57)

(ii) Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le

λ1 minus λLΨ x

nminus 11113872 11138731113872

minus Ψ xn

( 11138571113857 +λnεn

1 minus λL

(58)

Proof

(55)⟹λn Ψ xn

( 1113857 minus Ψ(u)( 1113857le Dhnu x

nminus 11113872 1113873 minus Dhn

u xn

( 1113857 minus Dhnx

n x

nminus 11113872 11138731113960 1113961 + εnλn (59)

6 Abstract and Applied Analysis

We put u xnminus 1 in (59) we get (57)(ii) Put u xnminus 1 in (59) we have

Dhnx

n x

nminus 11113872 1113873 + Dhn

xnminus 1

xn

1113872 1113873le λn Ψ xnminus 1

1113872 1113873 minus Ψ xn

( 11138571113872 1113873 + λnεn

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le λn Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 + λnεn + λnDg x

n x

nminus 11113872 1113873 + λnDg x

nminus 1 x

n1113872 1113873

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le λ Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 + λnεn + λL Dh x

n x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 11138731113872 1113873

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le

λ1 minus λLΨ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 +

λnεn

1 minus λL

(60)

Corollary 1

(i) If εn 0 the sequence Ψ(xn) n is nonincreasing(ii) Summability if 1113936

infinn1λnεn lt +infin then 1113936

infinn1

Dh(xn xnminus 1)lt +infin and 1113936infinn1Dh(xnminus 1 xn)lt +infin

Proof

(i) From (57) Dhn(xn xnminus 1)ge 0⟹Ψ(xn)leΨ(xnminus 1)

foralln isin Nlowast(ii) From (58) 1113936

npn1Dh(xn xnminus 1) + Dh(xnminus 1 xn)le

λ1 minus λL

1113944

np

n1Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 +

11 minus λL

1113944

np

n1λnεn

⟹1113944

np

n1Dh x

n x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873

leλ

1 minus λLΨ x

01113872 1113873 minus Ψlowast1113872 1113873 +

11 minus λL

1113944

np

n1λnεn

(61)

In the following we pose

tp ≔ 1113944

np

n1λn forallp isin N

lowast t0 λ0 0

An(u) Dhnu x

n( 1113857 forallu isin S foralln isin N

lowast

Φ xp

( 1113857 ≔ min Ψ xk

1113872 1113873 1le klep1113966 1113967 forallp isin Nlowast

(62)

Proposition 7 (Global Estimate in Function Values) (a) Forall u isin S and forallp isin Nlowast

Φ xp

( 1113857 minus Ψ(u)le1tp

Dh u x0

1113872 1113873 + 1113944infin

n1λnεn

⎡⎣ ⎤⎦ (63)

Proof We have Dhn(u xnminus 1) Anminus 1(u) minus (λn minus λnminus 1)Dg

(u xnminus 1) from (59) we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) minus Dhnx

n x

nminus 11113872 1113873

minus λn minus λnminus 1( 1113857Dg u xnminus 1

1113872 1113873 + λnεn

(64)

From (52) we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) + λnεn

⟹1113944

np

n1λn Ψ x

n( 1113857 minus Ψ(u)( 1113857le 1113944

np

n1Anminus 1(u) minus An(u) + 1113944

np

n1λnεn

⟹1113944

np

n1λn Ψ x

n( 1113857 minus Ψ(u)( 1113857leA0(u) minus Ap(u) + 1113944

np

n1λnεn

⟹ Φ xp

( 1113857 minus Ψ(u)( 1113857 1113944n1

np

λn leA0(u) minus Ap(u) + 1113944

np

n1λnεn

(65)

A0(u) Dh(u x0) (λ0 0) so

Φ xp

( 1113857 minus Ψ(u)le1tp

Dh u x0

1113872 1113873 + 1113944infin

n1λnεn

⎡⎣ ⎤⎦ (66)

)is theorem covers the evaluation of the global con-vergence rate given in [6 12] as shown by the followingcorollary

Corollary 2 We assume that

(a) εn 0 n 1

(b) λn λ 12L n 1

(c) xlowast isin argminΨ

en

Ψ xp

( 1113857 minus Ψlowast le2L

pDh xlowast x

01113872 1113873 (67)

ie Ψ(xp) minus Ψlowast O(1p)

Proof Immediate consequence of Proposition 7

Now we derive a global convergence of the sequencegenerated by Algorithm 1 to a minimizer of (P)

Abstract and Applied Analysis 7

Theorem 3 We assume that 1113936 λnεn lt +infin if one of thefollowing assumptions holds

(i) exist λ gt 0 such that λ le λn le λ n 1

(ii) e sequence Ψ(xn) is nonincreasing and1113936 λn +infin then (a) Ψ(xn)⟶ inf Ψ and (b)xn⟶ xlowast isin argminΨ

Proof

(a) Suppose

(i) Let xlowast isin argminΨ and we put u xlowast in (69) wehave

λ Ψ xn

( 1113857 minus Ψ xlowast

( 1113857( 1113857leAnminus 1 xlowast

( 1113857 minus An xlowast

( 1113857 + λnεn

(68)

An xlowast

( 1113857leAnminus 1 xlowast

( 1113857 + λnεn (69)

1113944 λnεn lt +infin⟹An xlowast

( 1113857⟶ l isin R (70)

(68) and (suppose (69))⟹Ψ(xn)⟶ inf Ψ

Suppose (ii)

Φ xp

( 1113857 min Ψ xk

1113872 1113873 1le klep1113966 1113967 Ψ xp

( 1113857 (71)

For u xlowast isin argminΨ in (63) we have 0leΨ(xp) minus Ψ(xlowast) le (1tp)[Dh(xlowast x0) + 1113936

infinn1λnεn]

so 1113936 λn +infin⟹Ψ(xn)⟶ inf Ψ

(b) (69)⟹existαgt 0 An(xlowast)le αAn(xlowast) Dhn(xlowast xn)

Dh(xlowast xn) minus λnDg(xlowast xn) so

Dh xlowast x

n( 1113857le α + λnDg x

lowast x

n( 1113857

dArr

Dh xlowast x

n( 1113857le

α1 minus λL

(72)

Then Dh(xlowast xn)1113864 1113865 is bounded and from H3 xn isbounded as well Let ulowast isin Adh xn there exists then asubsequence xni of xn such that xni⟶ ulowast isin S From H5Dh(ulowast xni )⟶ 0 On the other hand

0leDg ulowast x

ni( 1113857leL middot Dh ulowast x

ni( 1113857 (73)

so Dg(ulowast xni )⟶ 0ulowast isin argminΨ Indeed

inf ΨleΨ ulowast

( 1113857le lim Ψ xni( 1113857 inf Ψ

dArr

inf Ψ Ψ ulowast

( 1113857

(74)

which shows that⟹ulowast isin argminΨ )en

Dhniulowast x

ni( 1113857 Dh ulowast x

ni( 1113857 minus λniDg u

lowast x

ni( 1113857⟶ 0 (75)

Since Dhni(ulowast xni )⟶ 0 and ulowast isin argminΨ we have

Dhnulowast x

n( 1113857⟶ 0 (76)

We have

Dh ulowast x

n( 1113857 Dhn

ulowast x

n( 1113857 + λnDg u

lowast x

n( 1113857

leDhnulowast x

n( 1113857 + L middot λDh u

lowast x

n( 1113857

(77)

so (1 minus L middot λ)Dh(ulowast xn)leDhn(ulowast xn) then

Dh ulowast x

n( 1113857⟶ 0 (78)

And from H6 we have xn⟶ ulowast isin argminΨ )e IBPG algorithm generates a sequence such thatΨ(xn) n does not necessarily be nonincreasing for thisreason and for improvement of the global estimate infunction values we now propose εminus NoLips which is aninexact version of BPG with a relative error criterion Let σsuch that 0le σ lt 1 be given as follows

In what follows we will derive a convergence rate result()eorem 4) for the εminus NoLips framework First we need toestablish a few technical lemmas

In the following xn n denotes the sequence generatedby εminus NoLips

Lemma 7 For every u isin S for all n isin Nlowast

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u)

minus (1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

(79)

Proof Since (λn minus λnminus 1)Dg(u xnminus 1)ge 0 we have from (62)

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857le minus Dhnx

n x

nminus 11113872 1113873 + Anminus 1(u)

amp9 minus An(u) + λnεn(80)

From Algorithm 2 we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) + (σ minus 1)Dhnx

n x

nminus 11113872 1113873

(81)

From the condition LC we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u)

minus (1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

(82)

Remark 3 We now notice that Ψ(xn) n is nonincreasingJust replace u with xnminus 1 in (79)

Lemma 8 For every n isin Nlowast and xlowast isin argminΨ we have

tn Ψ xn

( 1113857 minus Ψlowast( 1113857 + λminus 1n tn(1 minus σ)(1 minus λL)Dh x

n x

nminus 11113872 1113873

le tnminus 1 Ψ xnminus 1

1113872 1113873 minus Ψlowast1113872 1113873 + Anminus 1 xlowast

( 1113857 minus An xlowast

( 1113857

(83)

Proof Replacing u by xnminus 1 in (79) and since An(xnminus 1)ge 0and Anminus 1(xnminus 1) 0 we have

8 Abstract and Applied Analysis

tnminus 1 Ψ xn

( 1113857 minus Ψlowast( 11138571113857 + tnminus 1λminus 1n (1 minus σ)(1 minus λL)Dh x

n x

nminus 11113872 1113873

le tnminus 1 Ψ xnminus 1

1113872 1113873 minus Ψlowast1113872 11138731113873

(84)

Replacing u by xlowast in (79) we have

λn Ψ xn

( 1113857 minus Ψ xlowast

( 1113857( 1113857 +(1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

+ leAnminus 1 xlowast

( 1113857 minus An xlowast

( 1113857(85)

Since tnminus 1 + λn tn by adding (84) and (85) we have(83)

Lemma 9 For every k isin Nlowast

tk Ψ xk

1113872 1113873 minus Ψlowast1113872 1113873 +(1 minus σ)(1 minus λL) 1113944nk

n1λminus 1

n tnDh xn x

nminus 11113872 1113873

leA0 xlowast

( 1113857 minus Ak xlowast

( 1113857

(86)

Proof )is result is obtained by adding inequality (83) from1 to k (t0 λ0 0)

We are now ready to state the convergence rate result forthe εminus NoLips framework )is result improves and com-pletes the one given in the Proposition 7

Theorem 4 For every k isin Nlowast the following statements holdwe pose ρk ≔ 1113936

nkn1λ

minus 1n tn Consider

(a)Ψ xk

1113872 1113873 minus Ψlowast leDh xlowast x0( 1113857

tk

1113888 1113889 (87)

(b) ck ≔ min1lenlek

Dh xn x

nminus 11113872 1113873le

Dh xlowast x0( 1113857

(1 minus σ)(1 minus λL)ρk

1113888 1113889

(88)

Proof From (86) and since A0(xlowast) Dh(xlowast x0)(λ0 0)

and Ak(xlowast)ge 0 we immediately have (87) and (88)

Corollary 3 Consider an instance of the εminus NoLips frame-work with λn λ for every n isin Nlowasten for every k isin Nlowast thefollowing statements hold

(a)Ψ xk

1113872 1113873 minus Ψlowast O1k

1113874 1113875 (89)

(b) ck min1lenlek

Dh xn x

nminus 11113872 1113873 O

1k21113874 1113875 (90)

Proof tk kλ and (87)⟹ (90)

ρk k(k + 1)

2ge

k2

2 (91)

and (88)⟹ (90)

Remark 4 (87) and (89) represent exactly the convergencerate established in [15 22] (90) is a new result not estab-lished in [15 22] this result shows that ck converges to zeroat a rate of O(1k2)

Theorem 5 If 1113936 λn +infin then

(a) Ψ(xn)⟶ inf Ψ(b) xn⟶ xlowast isin argminΨ

Proof Replacing u by xlowast in (81) we have

(1 minus σ)Dhnx

n x

nminus 11113872 1113873leAnminus 1 x

lowast( 1113857 minus An x

lowast( 1113857 (92)

By adding the inequality (92) from 1 to k we have

(1 minus σ) 1113944nk

n1Dhn

xn x

nminus 11113872 1113873leA0 x

lowast( 1113857 minus Ak x

lowast( 1113857 (93)

Ak(xlowast)ge 0 so 1113936infinn1Dhn

(xn xnminus 1)lt +infin From (69) we have

1113944

infin

n1λnεn lt +infin (94)

Since Ψ(xn) n is nonincreasing and (94) by applyingthe )eorem 3 (ii) we have the results of )eorem 5

5 Application to Nonnegative LinearInverse Problem

In Poisson inverse problems (eg [25 26]) we are given anonnegative observation matrix A isin Rmtimesd

+ and a noisymeasurement vector b isin Rm

+ and the goal is to reconstructthe signal x isin Rd

+ such that Ax≃ b We can naturally adoptthe distance D(Ax b) to measure the residuals between twononnegative points with

(1) Input x0 isin Scap ri(domΨ)

(2) For n 1 2 with εn gt 0 we obtain (nablahn(xnminus 1) minus nablahn(xn)λn) isin zεnΨ(xn)

ALGORITHM 1 Inexact Bregman Proximal Gradient (IBPG)

Abstract and Applied Analysis 9

D(Ax b) 1113944m

i1langai xranglog

langai xrangbi

+ bi minus langai xrang (95)

where ai denotes the ith line of AIn this section we propose an approach for solving the

nonnegative linear inverse problem defined by

Pα( 1113857 inf αx1 + D(Ax b) x isin Rd+1113966 1113967 (αgt 0) (96)

We take

f(x) ≔ αx1

g(x) ≔ D(Ax b)

h(x) ≔ h1(x) 1113944d

i1xi logxi

(97)

It is shown in [15] that the couple (g h) verified aLipschitz-likeConvexity Condition (LC) on Rd

+ for any Lsuch that

Lge max1lejled

1113944

m

i1aij max

1lejledAj

1 (98)

where Aj denotes the jth column of AFor λn ≔ λ foralln )eorem 3 is applicable and global

warrant of Algorithm 1 convergence to an optimal solutionof (Pα)

Given xn isin Rd++ the iteration

xn+1≃Tλ x

n( 1113857 (99)

amounts to solving the one-dimensional problemFor j 1 d

xn+1j ≃ argmin αt + cjt +

t logt

xnj

+ xnj minus t⎛⎝ ⎞⎠ tgt 0

⎧⎨

⎫⎬

(100)

where cj is the jth component of nablag(xn)

6 Conclusion

)e proposed algorithms constitute a unified frame for theexisting algorithms BPG BP and PG by giving others inparticular the inexact version of the interior method withBregman distance studied in [24] More precisely

(i) When εn 0 foralln isin Nlowast our algorithm is the NoLipsstudied in [15 22]

(ii) When g 0 our algorithm is the inexact version ofBregman Proximal (BP) studied in [19]

(iii) When εn 0 foralln isin Nlowast and g 0 our algorithm isthe Bregman Proximal (BP) studied in [17 21]

(iv) When f 0 our algorithm is the inexact version ofthe interior method with Bregman distance studiedin [24]

(v) When f 0 and εn 0 foralln isin Nlowast our algorithm isthe interior method with Bregman distance studiedin [24]

(vi) When h (12)middot2 our algorithm is the proximalgradient method (PG) and its variants [4ndash14 27]

Our analysis is different and more simple than the onegiven in [15 22] and allows to reduce some hypothesis inparticular the supercoercivity of Ψ as well as the simulta-neous convexity of f and g

Data Availability

No data were used to support this study

Conflicts of Interest

)e authors declare that they have no conflicts of interest

References

[1] D L Donoho ldquoCompressed sensingrdquo IEEE Transactions onInformation eory vol 52 no 4 pp 1289ndash1306 2006

[2] A Beck and Y C Eldar ldquoSparsity constrained nonlinearoptimization optimality conditions and algorithmsrdquo SIAMJournal on Optimization vol 23 no 3 pp 1480ndash1509 2013

[3] D R Luke ldquoPhase retrieval whatrsquos newrdquo SIAGOPT Viewsand News vol 25 no 1 pp 1ndash5 2017

[4] A Auslender ldquoNumrrical methods for non-differentiableconvex optimizationrdquo inMathematical Programming StudiesJ P Vial and C B Nguyen Eds vol 30 pp 102ndash126Springer Berlin Germany 1987

[5] A I Chen and A Ozdaglar ldquoA fast distributed proximal-gradient methodrdquo in Proceedings of the 2012 50th AnnualAllerton Conference on Communication Control and Com-puting (Allerton) pp 601ndash608 IEEE Monticello IL USAOctober 2012

[6] J B Hiriart-Urruty ldquoε-subdifferentiel calculs Convex anal-ysis and optimizationrdquo in Research Notes in MathematicsSeries Vol 57 Pitman Publishers London UK 1982

[7] K Jiang D Sun and K-C Toh ldquoAn inexact acceleratedproximal gradient method for large scale linearly constrainedconvex SDPrdquo SIAM Journal on Optimization vol 22 no 3pp 1042ndash1064 2012

(1) Input x0 isin Scap ri(domΨ)

(2) Choose λn gt 0 and find xn isin S σn isin [0 σ] and εn ge 0 such that((nablahn(xnminus 1) minus nablahn(xn))λn) isin zεn

Ψ(xn)

λnεn le σnDhn(xn xnminus 1)

(3) Set n⟵ n + 1 and go to step 1

ALGORITHM 2 εminus NoLips

10 Abstract and Applied Analysis

[8] B Lemaire ldquoAbout the convergence of the proximal method proceedings 6th French German conference on optimisationrdquoin 1991 Advances in Optimization Lecture Notes pp 39ndash51Springer Berlin Germany 1992

[9] B Lemaire ldquo)e proximal algorithmrdquo International Series ofNumerical Mathematics vol 87 pp 73ndash87 1989

[10] B Martinet ldquoPerturbation des methodes drsquooptimisation-Applicationrdquo Modelisation Mathematique et AnalyseNumerique vol 12 no 2 pp 153ndash171 1976

[11] N Parikh ldquoProximal algorithmsrdquo Foundations and Trends inOptimization vol 1 no 3 pp 127ndash239 2014

[12] M Schmidt N L Roux and F R Bach ldquoConvergence rates ofinexact proximal-gradient methods for convex optimizationrdquoin Advances in Neural Information Processing Systems vol 28pp 1458ndash1466 MIT Press Cambridge MA USA 2011

[13] N D Vanli M Gurbuzbalaban and A Ozdaglar ldquoGlobalconvergence rate of proximal incremental aggregated gradientmethodsrdquo SIAM Journal on Optimization vol 28 no 2pp 1282ndash1300 2018

[14] L Xiao and T Zhang ldquoA proximal stochastic gradient methodwith progressive variance reductionrdquo SIAM Journal on Op-timization vol 24 no 4 pp 2057ndash2075 2014

[15] J Bolte S Sabach M Teboulle and Y Vaisbourd ldquoFirst ordermethods beyond convexity and Lipschitz gradient continuitywith applications to quadratic inverse problemsrdquo SIAMJournal on Optimization vol 28 no 3 pp 2131ndash2151 2018

[16] L M Bregman ldquo)e relaxationmethod of finding the commonpoint of convex sets and its application to the solution ofproblems in convex programming finding the common pointof convex sets and its application to the solution of problems inconve programmingrdquo USSR Computational Mathematics andMathematical Physics vol 7 no 3 pp 200ndash217 1967

[17] J Eckstein ldquoNonlinear proximal point algorithms using Bregmanfunctions with applications to convex programmingrdquo Mathe-matics of Operations Research vol 18 no 1 pp 202ndash226 1993

[18] S Kabbadj ldquo)eoretical aspect of diagonal bregman proximalmethodsrdquo Journal of Applied Mathematics vol 2020 ArticleID 3108056 9 pages 2020

[19] S Kabbadj ldquoMethodes proximales entropiquesrdquo )eseUniversite Montpellier II Montpellier France 1994

[20] R T Rockafellar ldquoMonotone operators and the proximalpoint algorithmrdquo SIAM Journal on Control and Optimizationvol 14 no 5 pp 877ndash898 1975

[21] M Teboulle and C Gong ldquoConvergence analysis of a proximal-like minimization algorithm using Bregman functionrdquo SIAMJournal on Optimization vol 3 no 3 pp 538ndash543 1993

[22] D H Gutman and J F Pena ldquoUnified framework for bregmanproximal methodssubgradient gradient and accelerated gra-dient schemesrdquo 2018 httpsarxivorgpdf181210198pdf

[23] R T Rockafellar Convex Analysis Princeton UniversityPress Princeton NJ USA 1970

[24] A Auslender and M Teboulle ldquoInterior gradient and prox-imal methods for convex and conic optimizationrdquo SIAMJournal on Optimization vol 16 no 3 pp 697ndash725 2006

[25] M Bertero P Boccacci G Desidera and G VicidominildquoImage deblurring with Poisson data from cells to galaxiesrdquoInverse Problems vol 25 no 12 2009

[26] I Csiszar ldquoWhy least squares and maximum entropy Anaxiomatic approach to inference for linear inverse problemsrdquoe Annals of Statistics vol 19 no 4 pp 2032ndash2066 1991

[27] A Nitanda ldquoStochastic proximal gradient descent with ac-celeration techniquesrdquo in Advances in Neural InformationProcessing Systems pp 1574ndash1582 MIT Press CambridgeMA USA 2014

Abstract and Applied Analysis 11

Page 4: ResearchArticle ...downloads.hindawi.com/journals/aaa/2020/1963980.pdf4:∀r≥0,∀x ∈S,thesetsbelowarebounded L 2(x,r) y∈S;D h(y,x)≤r . (15) H 5:if{ }xn n ⊂S,thenxn x∗∈S,so

Proposition 2 (see [19]) hi isin B(Si)capE(Si) i 0 1 2 Weconsider the following minimization problem

(p) inf Ψ(x) ≔ f(x) + g(x) x isin S1113864 1113865 (24)

)e following assumptions on the problemrsquos data aremade throughout the paper (and referred to as the blanketassumptions)

Assumption 3

(i) h isin A(S)capE(S)

(ii) g Rd⟶ ]minus infin +infin] is proper (lsc) withS sub domg which is and continuously differentiableon S

(iii) f Rd⟶ ]minus infin +infin] is proper (lsc)(iv) ri(domΨ)cap Sneempty(v) Ψlowast ≔ inf Ψ(x) x isin S1113864 1113865gt minus infin

We consider the operator Tλ defined by forallx isin S

Tλ(x) argmin f(u) +langnablag(x) u minus xrang1113882

+1λDh(u x) u isin S1113883

(25)

We give in the following a series of lemmas allowingestablishment of )eorem 2 which assures the well-pos-edness of the method proposed in Section 4

Lemma 3 forallλgt 0 and forallx isin S

Tλ(x) argmin Ψ(u) +1λDhminus λg(u x) u isin S1113880 1113883 (26)

Proof forallx isin S sub domg

(25)⟹ Tλ(x) argmin f(u) + g(x) +langnablag(x) u minus xrang1113882

+1λDh(u x) u isin S1113883

(27)

Whenforallu isin S sub domg we have

f(u) + g(x) + langnablag(x) u minus xrang +1λDh(u x)

Ψ(u) + g(x) minus g(u) + langnablag(x) u minus xrang +1λDh(u x)

Ψ(u) minus Dg(u x) +1λDh(u x)

Ψ(u) +1λDhminus λg(u x)

(28)

Lemma 4 If the pair (g h) verified the condition (LC) thenexist Lgt 0 forallx isin S forallu isin S

(i) Dg(u x)leLDh(u x)

(ii) forallλ isin ]0 1L[ Dhminus λg(u x)ge 0

Proof

(i) If Lh-g is convex on S then

DLhminus g(u x)ge 0 forallx isin S forallu isin S (29)

Let u isin (SS) )ere exists a sequence un n sub S suchthat un⟶ u then we have

DLhminus g un x( 1113857ge 0 (30)

Lh-g is continuous in S then DLhminus g(u x)ge 0

(ii) Let λ isin ]0 1L[

DLhminus g(u x)ge 0⟹ LDh(u x)geDg(u x)

⟹1λDh(u x)geDg(u x)

⟹Dhminus λg(u x)ge 0

(31)

Lemma 5 If h is the Legendre on S then h minus λg is also theLegendre on S for all λ such that 0lt λlt (1L)

Proof Conditions (a) and (b) of Definition 1 being verifiedlet us demonstrate that the condition (c) is verified too Letxi1113864 1113865 be xi⟶ xlowast isin Fr(S) (SS)

nablah xi( 1113857 minus λnablag xi( 1113857

2 ge nablah xi( 1113857

nablah xi( 1113857

minus 2λ nablag xi( 1113857

1113872 1113873

+ λ2 nablag xi( 1113857

2

(32)

)en

limxi⟶xlowastisinFr(S)

nabla(h minus λg) xi( 1113857

+infin (33)

h minus λg is strictly convex in S Indeed let x y isin S xney wehave Dhminus λg(x y)ge 0

Dhminus λg(x y) 0⟹Dh(x y)

λDg(x y)⟹Dh(x y)le λLDh(x y)

(34)

h is strongly convex on S x y isin S xney thenDh(x y)ne 0⟹1le λL which is absurd Hence h minus λg isstrictly convex in S

Lemma 6 Consider the following

4 Abstract and Applied Analysis

forallλ isin 01L

1113877 1113876 forallu isin S z Dhminus λg( u)1113872 1113873 xlowast

( 1113857

nabla(h minus λg) xlowast( ) minus nabla(h minus λg)(u)1113864 1113865 if xlowast isin S

empty if not

⎧⎪⎨

⎪⎩

(35)

Proof Since h minus λg is a Legendre function on S Dhminus λg( u)

is also a Legendre By application of )eorem 261 in [23]z(Dhminus λg( u)) verifies the following

(i) If xlowast isin int(domDhminus λg( u)) S then

z Dhminus λg( u)1113872 1113873 xlowast

( 1113857 nablaDhminus λg( u) xlowast

( 11138571113966 1113967 (36)

(ii) If xlowast notin S then z(Dh( u))(xlowast) empty

Theorem 2 (well-posedness of the method) We assumethat

(i) Ψ is convex(ii) e pair (g h) verified the condition (LC)(iii) forallrge 0 forallx isin S the sets below are bounded

L2(x r) y isin S Dh(y x)le r1113864 1113865 (37)

Then forallλ isin ]0 (1L)[ and the map Tλ defined in (25) isnonempty and single-valued from S to S

Proof forallx isin S and Tλ(x) is nonempty for this it is enoughto demonstrate that forallr isin R

L(x r) u isin S Ψ(u) + λminus 1Dhminus λg(u x)le r1113966 1113967 (38)

which is closed and is bounded when it is nonempty

u isin L(x r)⟹Ψ(u) + λminus 1Dhminus λg(u x)le r

⟹Dhminus λg(u x)le λ r minus Ψlowast( 1113857

⟹Dh(u x)le λ r minus Ψlowast( 1113857 + λDg(u x)

⟹Dh(u x)le λ r minus Ψlowast( 1113857

+ LλDh(u x)

⟹Dh(u x)leλ r minus Ψlowast( )

1 minus Lλ

(39)

It follows that

L(x r) sub L2 xλ r minus Ψlowast( )

1 minus Lλ1113888 1113889 (40)

thanks to H4 L2(x (λ(r minus Ψlowast)1 minus Lλ)) is bounded whichleads that L(x r) is bounded too which shows that

Tλ(x)neempty (41)

Let xlowast isin Tλ(x) Let us suppose that xlowast isin S(26)⟹ 0 isinz(Ψ(middot) + 1λDhminus λg( x))(xlowast) since ri(domΨ)cap ri(domDhminus λg

( x)) ri(domΨ)cap ri(S) ri(domΨ)cap Sneempty from [10]which allows to write that

0 isin zΨ xlowast

( 1113857 + z1λDhminus λg( x)1113874 1113875 x

lowast( 1113857 (42)

It follows that

existu isin zΨ xlowast

( 1113857 (43)

such that

minus λu isin zDhminus λg( x) xlowast

( 1113857 (44)

is in contradiction with Lemma 6 )en Tλ(x) sub S

On the other hand h minus λg is strictly convex in S and Ψ isconvex so Ψ(middot) + Dhminus λg( x) is strongly convex in S )enTλ(x) has a unique value for all x isin S

Remark 1 )is result is liberated from the supercoercivityof Ψ and the simultaneous convexity of f and g as requiredby Lemma 2 [15]

Proposition 3 forallx isin Sforallλ isin ]0 (1L)[nabla(h minus λg)(x) minus nabla(h minus λg) Tλ(x)( 1113857

λisin zΨ Tλ(x)( 1113857 (45)

existx isin Snabla(h minus λg)(x) minus nabla(h minus λg)(x)

λisin zεΨ(x) (46)

Proof Since Tλ(x) isin S we have

0 isin z Ψ(middot) +1λDhminus λg( x)1113874 1113875 Tλ(x)( 1113857

⟹ minus nabla1λDhminus λg( x)1113874 1113875 Tλ(x)( 1113857( 1113857 isin z Ψ Tλ(x)( 1113857( 1113857

⟹ (45)

(47)

For (46) just take x Tλ(x) since

zΨ Tλ(x)( 1113857 sub zεΨ Tλ(x)( 1113857 (48)

Proposition 4 forallx isin Sforallλ isin ]0 (1L)[

Tλ(x) proxhminus λg

λΨ (x) proxhλf o proxh

λp(x) (49)

where p(u) langnablag(x) urang

Proof )e first equality is due to Lemma 3 )e second isestablished in [15]

)e first equality played a decisive role in the devel-opment of this paper

4 Analysis of the εminusNoLips Algorithm

In this section we propose an Inexact Bregman ProximalGradient Algorithm (IBPG) which is an inexact version ofthe BPG algorithm described in [15 22] the IBPG

Abstract and Applied Analysis 5

framework allows an error in the subgradient inclusion byusing the error εn We study two algorithms

(i) Algorithm 1 inexact Bregman Proximal Gradient(IBPG) algorithm without relative error criterion

(ii) Algorithm 2 inexact Bregman Proximal Gradient(IBPG) algorithm with relative error criterion whichwe call εminus NoLips

We establish the main convergence properties of theproposed algorithms In particular we prove its global rateof convergence showing that it shares the claimed sublinearrate O(1n) of basic first-order methods such as the classicalPG and BPG We also derive a global convergence of thesequence generated by NoLips to a minimizer of (P)

Assumptions 4

(i) h isin B(S)capE(S)

(ii) Ψ is convex(iii) )e pair (g h) verified the condition (LC)(iv) Argmin Ψneempty

In our analysis Ψ is supposed to be a convex function itallows to distinguish two interesting cases

(i) )e nonsmooth part f is possibly not convex and thesmooth part g is convex

(ii) )e nonsmooth part f is convex and the smooth partg is possibly not convex

In what follows the choice of the sequence λn1113864 1113865 dependsof the convexity of g

Let λ such that 0lt λlt (1L) λ0 ≔ 0

If g is not convex then we choose

λn λ (0lt λle λ) n 1 (50)

If g is convex then we choose

λn le λn+1 le λ n 1 (51)

In those conditions we easily show thatforallx y isin S foralln isin N 0lt λn le λ such that

λn+1 minus λn( 1113857Dg(x y)ge 0 (52)

We pose

hn h minus λng n 1 (53)

Proposition 5 e sequence xn n defined by (IBPG) existsand is verified for all n isin Nlowast

xn isin εn minus argmin Ψ(u) +

1λn

Dhnu x

nminus 11113872 1113873 u isin S1113896 1113897 (54)

Proof Existence is deduced trivially from (45)

Ωn ≔nablahn xnminus 1( 1113857 minus nablahn xn( )

λn

isin zεnΨ x

n( 1113857

⟹Ψ(u)geΨ xn

( 1113857 +langu minus xnΩnrang minus εn

(55)

By applying Lemma 2 we have

Ψ(u)geΨ xn

( 1113857 + λminus 1n Dhn

u xn

( 1113857 + Dhnx

n x

nminus 11113872 1113873 minus Dhn

u xnminus 1

1113872 11138731113960 1113961 minus εn

⟹Ψ xn

( 1113857 + λminus 1n Dhn

xn x

nminus 11113872 1113873leΨ(u) + λminus 1

n Dhnu x

nminus 11113872 1113873 + εn forallu isin S

⟹ xn isin T

εn

λnx

nminus 11113872 1113873

(56)

where Tελ(x) ε minus argmin Ψ(u) + (1λ)Dhminus λg(u x)1113966 1113967

Remark 2 )is result shows that IBPG is an inexact versionof BPG and this is exactly the BPG when εn 0 ie

(i) xn isin Tεn

λn(xnminus 1)⟹ xn ≃Tλn

(xnminus 1)

(ii) εn 0⟹ xn Tλn(xnminus 1)

Proposition 6. For all n ∈ N^*:

(i) λ_n(Ψ(x^n) − Ψ(x^{n−1})) ≤ −D_{h_n}(x^n, x^{n−1}) + λ_n ε_n.   (57)

(ii) D_h(x^n, x^{n−1}) + D_h(x^{n−1}, x^n) ≤ (λ̄/(1 − λ̄L))(Ψ(x^{n−1}) − Ψ(x^n)) + λ_n ε_n/(1 − λ̄L).   (58)

Proof. From (55),

λ_n(Ψ(x^n) − Ψ(u)) ≤ D_{h_n}(u, x^{n−1}) − [D_{h_n}(u, x^n) + D_{h_n}(x^n, x^{n−1})] + ε_n λ_n.   (59)

(i) Putting u = x^{n−1} in (59), we get (57).

(ii) Putting u = x^{n−1} in (59), we have

D_{h_n}(x^n, x^{n−1}) + D_{h_n}(x^{n−1}, x^n) ≤ λ_n(Ψ(x^{n−1}) − Ψ(x^n)) + λ_n ε_n
⟹ D_h(x^n, x^{n−1}) + D_h(x^{n−1}, x^n) ≤ λ_n(Ψ(x^{n−1}) − Ψ(x^n)) + λ_n ε_n + λ_n D_g(x^n, x^{n−1}) + λ_n D_g(x^{n−1}, x^n)
⟹ D_h(x^n, x^{n−1}) + D_h(x^{n−1}, x^n) ≤ λ̄(Ψ(x^{n−1}) − Ψ(x^n)) + λ_n ε_n + λ̄L(D_h(x^n, x^{n−1}) + D_h(x^{n−1}, x^n))
⟹ D_h(x^n, x^{n−1}) + D_h(x^{n−1}, x^n) ≤ (λ̄/(1 − λ̄L))(Ψ(x^{n−1}) − Ψ(x^n)) + λ_n ε_n/(1 − λ̄L).   (60)

(Here we used D_{h_n} = D_h − λ_n D_g and the (LC) condition D_g ≤ L·D_h.)

Corollary 1.
(i) If ε_n = 0, the sequence {Ψ(x^n)}_n is nonincreasing.
(ii) (Summability) If Σ_{n≥1} λ_n ε_n < +∞, then Σ_{n≥1} D_h(x^n, x^{n−1}) < +∞ and Σ_{n≥1} D_h(x^{n−1}, x^n) < +∞.

Proof.
(i) From (57), D_{h_n}(x^n, x^{n−1}) ≥ 0 ⟹ Ψ(x^n) ≤ Ψ(x^{n−1}), ∀n ∈ N^*.
(ii) From (58),

Σ_{n=1}^{p} [D_h(x^n, x^{n−1}) + D_h(x^{n−1}, x^n)] ≤ (λ̄/(1 − λ̄L)) Σ_{n=1}^{p} (Ψ(x^{n−1}) − Ψ(x^n)) + (1/(1 − λ̄L)) Σ_{n=1}^{p} λ_n ε_n
⟹ Σ_{n=1}^{p} [D_h(x^n, x^{n−1}) + D_h(x^{n−1}, x^n)] ≤ (λ̄/(1 − λ̄L))(Ψ(x^0) − Ψ^*) + (1/(1 − λ̄L)) Σ_{n=1}^{p} λ_n ε_n.   (61)

Note that the summability assumption in (ii) holds, for instance, for the constant stepsize λ_n = λ together with ε_n = ε_0/n^2.

In the following, we set

t_p := Σ_{n=1}^{p} λ_n, ∀p ∈ N^*, t_0 = λ_0 = 0,
A_n(u) := D_{h_n}(u, x^n), ∀u ∈ S, ∀n ∈ N^*,
Φ(x^p) := min{Ψ(x^k) : 1 ≤ k ≤ p}, ∀p ∈ N^*.   (62)

Proposition 7 (Global Estimate in Function Values). For all u ∈ S and ∀p ∈ N^*,

Φ(x^p) − Ψ(u) ≤ (1/t_p)[D_h(u, x^0) + Σ_{n=1}^{∞} λ_n ε_n].   (63)

Proof. We have D_{h_n}(u, x^{n−1}) = A_{n−1}(u) − (λ_n − λ_{n−1}) D_g(u, x^{n−1}); hence, from (59),

λ_n(Ψ(x^n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − D_{h_n}(x^n, x^{n−1}) − (λ_n − λ_{n−1}) D_g(u, x^{n−1}) + λ_n ε_n.   (64)

From (52), we have

λ_n(Ψ(x^n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) + λ_n ε_n
⟹ Σ_{n=1}^{p} λ_n(Ψ(x^n) − Ψ(u)) ≤ Σ_{n=1}^{p} [A_{n−1}(u) − A_n(u)] + Σ_{n=1}^{p} λ_n ε_n
⟹ Σ_{n=1}^{p} λ_n(Ψ(x^n) − Ψ(u)) ≤ A_0(u) − A_p(u) + Σ_{n=1}^{p} λ_n ε_n
⟹ (Φ(x^p) − Ψ(u)) Σ_{n=1}^{p} λ_n ≤ A_0(u) − A_p(u) + Σ_{n=1}^{p} λ_n ε_n.   (65)

Since A_0(u) = D_h(u, x^0) (λ_0 = 0) and A_p(u) ≥ 0,

Φ(x^p) − Ψ(u) ≤ (1/t_p)[D_h(u, x^0) + Σ_{n=1}^{∞} λ_n ε_n].   (66)

This result covers the evaluation of the global convergence rate given in [6, 12], as shown by the following corollary.

Corollary 2. We assume that

(a) ε_n = 0, n ≥ 1;
(b) λ_n = λ = 1/(2L), n ≥ 1;
(c) x^* ∈ argmin Ψ.

Then

Ψ(x^p) − Ψ^* ≤ (2L/p) D_h(x^*, x^0),   (67)

i.e., Ψ(x^p) − Ψ^* = O(1/p).

Proof. Immediate consequence of Proposition 7: since ε_n = 0, the sequence {Ψ(x^n)} is nonincreasing by Corollary 1, so Φ(x^p) = Ψ(x^p), and t_p = p/(2L).

We now derive global convergence of the sequence generated by Algorithm 1 to a minimizer of (P).

Theorem 3. Assume that Σ λ_n ε_n < +∞ and that one of the following assumptions holds:

(i) ∃ λ̲ > 0 such that λ̲ ≤ λ_n ≤ λ̄, n ≥ 1;
(ii) the sequence {Ψ(x^n)} is nonincreasing and Σ λ_n = +∞.

Then (a) Ψ(x^n) → inf Ψ and (b) x^n → x^* ∈ argmin Ψ.

Proof.
(a) Suppose (i). Let x^* ∈ argmin Ψ and put u = x^* in (64); we have

λ̲(Ψ(x^n) − Ψ(x^*)) ≤ A_{n−1}(x^*) − A_n(x^*) + λ_n ε_n,   (68)

A_n(x^*) ≤ A_{n−1}(x^*) + λ_n ε_n,   (69)

Σ λ_n ε_n < +∞ ⟹ A_n(x^*) → l ∈ R.   (70)

Summing (68) and using (70), we get λ̲ Σ_{n≥1}(Ψ(x^n) − Ψ(x^*)) ≤ A_0(x^*) − l + Σ λ_n ε_n < +∞, whence Ψ(x^n) → inf Ψ.

Suppose (ii). Since {Ψ(x^n)} is nonincreasing,

Φ(x^p) = min{Ψ(x^k) : 1 ≤ k ≤ p} = Ψ(x^p).   (71)

For u = x^* ∈ argmin Ψ in (63), we have 0 ≤ Ψ(x^p) − Ψ(x^*) ≤ (1/t_p)[D_h(x^*, x^0) + Σ_{n=1}^{∞} λ_n ε_n], so Σ λ_n = +∞ ⟹ Ψ(x^n) → inf Ψ.

(b) From (69), there exists α > 0 such that A_n(x^*) ≤ α for all n. Since A_n(x^*) = D_{h_n}(x^*, x^n) = D_h(x^*, x^n) − λ_n D_g(x^*, x^n), we have

D_h(x^*, x^n) ≤ α + λ_n D_g(x^*, x^n) ≤ α + λ̄L·D_h(x^*, x^n)
⟹ D_h(x^*, x^n) ≤ α/(1 − λ̄L).   (72)

Then {D_h(x^*, x^n)} is bounded and, from H3, {x^n} is bounded as well. Let u^* be a cluster point of {x^n}: there exists a subsequence {x^{n_i}} of {x^n} such that x^{n_i} → u^* ∈ S̄. From H5, D_h(u^*, x^{n_i}) → 0. On the other hand,

0 ≤ D_g(u^*, x^{n_i}) ≤ L·D_h(u^*, x^{n_i}),   (73)

so D_g(u^*, x^{n_i}) → 0. Moreover, u^* ∈ argmin Ψ; indeed,

inf Ψ ≤ Ψ(u^*) ≤ lim Ψ(x^{n_i}) = inf Ψ ⟹ inf Ψ = Ψ(u^*).   (74)

Then

D_{h_{n_i}}(u^*, x^{n_i}) = D_h(u^*, x^{n_i}) − λ_{n_i} D_g(u^*, x^{n_i}) → 0.   (75)

Since u^* ∈ argmin Ψ, (70) applied with x^* = u^* shows that {D_{h_n}(u^*, x^n)} converges; as it tends to 0 along the subsequence {n_i},

D_{h_n}(u^*, x^n) → 0.   (76)

We have

D_h(u^*, x^n) = D_{h_n}(u^*, x^n) + λ_n D_g(u^*, x^n) ≤ D_{h_n}(u^*, x^n) + λ̄L·D_h(u^*, x^n),   (77)

so (1 − λ̄L) D_h(u^*, x^n) ≤ D_{h_n}(u^*, x^n); then

D_h(u^*, x^n) → 0.   (78)

From H6, we conclude that x^n → u^* ∈ argmin Ψ.

The IBPG algorithm generates a sequence for which {Ψ(x^n)}_n is not necessarily nonincreasing; for this reason, and to improve the global estimate in function values, we now propose ε−NoLips, an inexact version of BPG with a relative error criterion. Let σ with 0 ≤ σ < 1 be given as follows:

ALGORITHM 2: ε−NoLips.
(1) Input: x^0 ∈ S ∩ ri(dom Ψ).
(2) Choose λ_n > 0 and find x^n ∈ S, σ_n ∈ [0, σ], and ε_n ≥ 0 such that
    (∇h_n(x^{n−1}) − ∇h_n(x^n))/λ_n ∈ ∂_{ε_n}Ψ(x^n),
    λ_n ε_n ≤ σ_n D_{h_n}(x^n, x^{n−1}).
(3) Set n ← n + 1 and go to step 2.
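For concreteness, the quantities entering the relative error test of Algorithm 2 can be evaluated directly from the definition (3) of the Bregman distance. The sketch below is our illustration, not the paper's code: the Boltzmann–Shannon entropy h(x) = Σ_i x_i log x_i is an assumed choice of kernel, and the helper names are ours. It computes D_{h_n} = D_h − λ_n D_g and checks the acceptance inequality of step (2).

    import numpy as np

    def bregman(fun, grad, u, x):
        # Bregman distance D_F(u, x) = F(u) - F(x) - <grad F(x), u - x>, cf. (3)
        return fun(u) - fun(x) - grad(x) @ (u - x)

    def entropy(x):
        # Boltzmann-Shannon entropy h(x) = sum_i x_i log x_i, for x > 0
        return np.sum(x * np.log(x))

    def grad_entropy(x):
        return np.log(x) + 1.0

    def accepts(x_new, x_old, lam, eps, sigma_n, g, grad_g):
        # Relative error criterion of Algorithm 2:
        #   lam * eps <= sigma_n * D_{h_n}(x_new, x_old), with h_n = h - lam * g
        d_h = bregman(entropy, grad_entropy, x_new, x_old)
        d_g = bregman(g, grad_g, x_new, x_old)
        return lam * eps <= sigma_n * (d_h - lam * d_g)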

In what follows, we derive a convergence rate result (Theorem 4) for the ε−NoLips framework. First, we need to establish a few technical lemmas. In the following, {x^n}_n denotes the sequence generated by ε−NoLips.

Lemma 7. For every u ∈ S and all n ∈ N^*,

λ_n(Ψ(x^n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − (1 − σ)(1 − λ̄L) D_h(x^n, x^{n−1}).   (79)

Proof. Since (λ_n − λ_{n−1}) D_g(u, x^{n−1}) ≥ 0, we have from (64)

λ_n(Ψ(x^n) − Ψ(u)) ≤ −D_{h_n}(x^n, x^{n−1}) + A_{n−1}(u) − A_n(u) + λ_n ε_n.   (80)

From the error criterion of Algorithm 2 (λ_n ε_n ≤ σ_n D_{h_n}(x^n, x^{n−1}) ≤ σ D_{h_n}(x^n, x^{n−1})), we have

λ_n(Ψ(x^n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) + (σ − 1) D_{h_n}(x^n, x^{n−1}).   (81)

From the condition (LC), D_{h_n}(x^n, x^{n−1}) ≥ (1 − λ̄L) D_h(x^n, x^{n−1}), so

λ_n(Ψ(x^n) − Ψ(u)) ≤ A_{n−1}(u) − A_n(u) − (1 − σ)(1 − λ̄L) D_h(x^n, x^{n−1}).   (82)

Remark 3. We notice that {Ψ(x^n)}_n is now nonincreasing: just replace u with x^{n−1} in (79).

Lemma 8. For every n ∈ N^* and x^* ∈ argmin Ψ, we have

t_n(Ψ(x^n) − Ψ^*) + λ_n^{−1} t_n (1 − σ)(1 − λ̄L) D_h(x^n, x^{n−1}) ≤ t_{n−1}(Ψ(x^{n−1}) − Ψ^*) + A_{n−1}(x^*) − A_n(x^*).   (83)

Proof. Replacing u by x^{n−1} in (79), multiplying by t_{n−1}/λ_n, and using A_n(x^{n−1}) ≥ 0 and A_{n−1}(x^{n−1}) = 0, we have

t_{n−1}(Ψ(x^n) − Ψ^*) + t_{n−1} λ_n^{−1}(1 − σ)(1 − λ̄L) D_h(x^n, x^{n−1}) ≤ t_{n−1}(Ψ(x^{n−1}) − Ψ^*).   (84)

Replacing u by x^* in (79), we have

λ_n(Ψ(x^n) − Ψ(x^*)) + (1 − σ)(1 − λ̄L) D_h(x^n, x^{n−1}) ≤ A_{n−1}(x^*) − A_n(x^*).   (85)

Since t_{n−1} + λ_n = t_n, adding (84) and (85) gives (83).

Lemma 9. For every k ∈ N^*,

t_k(Ψ(x^k) − Ψ^*) + (1 − σ)(1 − λ̄L) Σ_{n=1}^{k} λ_n^{−1} t_n D_h(x^n, x^{n−1}) ≤ A_0(x^*) − A_k(x^*).   (86)

Proof. This result is obtained by summing inequality (83) from 1 to k (t_0 = λ_0 = 0).

We are now ready to state the convergence rate result for the ε−NoLips framework. This result improves and completes the one given in Proposition 7.

Theorem 4. Set ρ_k := Σ_{n=1}^{k} λ_n^{−1} t_n. For every k ∈ N^*, the following statements hold:

(a) Ψ(x^k) − Ψ^* ≤ D_h(x^*, x^0)/t_k.   (87)

(b) c_k := min_{1≤n≤k} D_h(x^n, x^{n−1}) ≤ D_h(x^*, x^0)/[(1 − σ)(1 − λ̄L) ρ_k].   (88)

Proof. From (86), since A_0(x^*) = D_h(x^*, x^0) (λ_0 = 0) and A_k(x^*) ≥ 0, we immediately have (87) and (88).

Corollary 3. Consider an instance of the ε−NoLips framework with λ_n = λ for every n ∈ N^*. Then, for every k ∈ N^*, the following statements hold:

(a) Ψ(x^k) − Ψ^* = O(1/k).   (89)

(b) c_k = min_{1≤n≤k} D_h(x^n, x^{n−1}) = O(1/k^2).   (90)

Proof. t_k = kλ, so (87) ⟹ (89); moreover,

ρ_k = Σ_{n=1}^{k} n = k(k + 1)/2 ≥ k^2/2,   (91)

so (88) ⟹ (90).

Remark 4. (87) and (89) represent exactly the convergence rate established in [15, 22]. (90) is a new result, not established in [15, 22]; it shows that c_k converges to zero at a rate of O(1/k^2).

Theorem 5. If Σ λ_n = +∞, then

(a) Ψ(x^n) → inf Ψ;
(b) x^n → x^* ∈ argmin Ψ.

Proof. Replacing u by x^* in (81) and using Ψ(x^n) − Ψ(x^*) ≥ 0, we have

(1 − σ) D_{h_n}(x^n, x^{n−1}) ≤ A_{n−1}(x^*) − A_n(x^*).   (92)

Summing inequality (92) from 1 to k, we have

(1 − σ) Σ_{n=1}^{k} D_{h_n}(x^n, x^{n−1}) ≤ A_0(x^*) − A_k(x^*).   (93)

A_k(x^*) ≥ 0, so Σ_{n=1}^{∞} D_{h_n}(x^n, x^{n−1}) < +∞. From the error criterion of Algorithm 2, we then have

Σ_{n=1}^{∞} λ_n ε_n < +∞.   (94)

Since {Ψ(x^n)}_n is nonincreasing (Remark 3) and (94) holds, applying Theorem 3(ii) gives the results of Theorem 5.

5. Application to Nonnegative Linear Inverse Problem

In Poisson inverse problems (e.g., [25, 26]), we are given a nonnegative observation matrix A ∈ R^{m×d}_+ and a noisy measurement vector b ∈ R^m_+, and the goal is to reconstruct the signal x ∈ R^d_+ such that Ax ≃ b. We can naturally adopt the distance D(Ax, b) to measure the residual between two nonnegative points, with

D(Ax, b) = Σ_{i=1}^{m} [⟨a_i, x⟩ log(⟨a_i, x⟩/b_i) + b_i − ⟨a_i, x⟩],   (95)

where a_i denotes the ith row of A.
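In code, (95) is the generalized Kullback–Leibler divergence between Ax and b; a direct transcription (our sketch; the function name is an assumption) reads:

    import numpy as np

    def kl_residual(A, x, b):
        # D(Ax, b) of (95); assumes Ax > 0 and b > 0 componentwise.
        z = A @ x
        return np.sum(z * np.log(z / b) + b - z)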

In this section, we propose an approach for solving the nonnegative linear inverse problem defined by

(P_α)  inf{α‖x‖_1 + D(Ax, b) : x ∈ R^d_+}  (α > 0).   (96)

We take

f(x) := α‖x‖_1,
g(x) := D(Ax, b),
h(x) := h_1(x) = Σ_{i=1}^{d} x_i log x_i.   (97)

It is shown in [15] that the couple (g, h) verifies the Lipschitz-like/Convexity Condition (LC) on R^d_+ for any L such that

L ≥ max_{1≤j≤d} Σ_{i=1}^{m} a_{ij} = max_{1≤j≤d} ‖A_j‖_1,   (98)

where A_j denotes the jth column of A. For λ_n := λ, ∀n, Theorem 3 applies (with a constant stepsize, assumption (i) holds) and guarantees global convergence of Algorithm 1 to an optimal solution of (P_α).

Given x^n ∈ R^d_{++}, the iteration

x^{n+1} ≃ T_λ(x^n)   (99)

amounts to solving, for j = 1, …, d, the one-dimensional problem

x^{n+1}_j ≃ argmin{αt + c_j t + t log(t/x^n_j) + x^n_j − t : t > 0},   (100)

where c_j is the jth component of ∇g(x^n), namely c_j = Σ_{i=1}^{m} a_{ij} log(⟨a_i, x^n⟩/b_i). Setting the derivative of the objective in (100) to zero gives α + c_j + log(t/x^n_j) = 0, so the subproblem has the closed-form solution t = x^n_j exp(−(α + c_j)); with the stepsize λ of (99) made explicit, the update reads x^{n+1}_j = x^n_j exp(−λ(α + c_j)), a multiplicative update that keeps the iterates in R^d_{++}.
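The closed form above yields the following minimal implementation of the exact-step instance (ε_n = 0) of scheme (99) for (P_α). This is an illustrative sketch under stated assumptions, not the paper's reference code: the function name, the all-ones starting point, and the concrete stepsize 0.99/L are ours.

    import numpy as np

    def ibpg_poisson(A, b, alpha, n_iter=500):
        # Exact-step instance (eps_n = 0) of (99) for problem (P_alpha).
        # Assumes A >= 0 with no zero row and b > 0, so A @ x stays positive.
        m, d = A.shape
        L = A.sum(axis=0).max()        # admissible constant: max column sum, cf. (98)
        lam = 0.99 / L                 # any fixed stepsize 0 < lam < 1/L
        x = np.ones(d)                 # x^0 in R^d_{++}
        for _ in range(n_iter):
            c = A.T @ np.log(A @ x / b)         # c = grad g(x^n)
            x = x * np.exp(-lam * (alpha + c))  # closed-form solution of the subproblem
        return x

The update is multiplicative, so the iterates remain componentwise positive without any projection; since exact steps satisfy the relative error criterion of Algorithm 2 trivially (σ_n = 0), the O(1/k) rate of Corollary 3 applies to this instance.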

6. Conclusion

The proposed algorithms constitute a unified frame for the existing algorithms BPG, BP, and PG, while giving others, in particular the inexact version of the interior method with Bregman distance studied in [24]. More precisely:

(i) when ε_n = 0, ∀n ∈ N^*, our algorithm is the NoLips studied in [15, 22];
(ii) when g = 0, our algorithm is the inexact version of the Bregman Proximal (BP) studied in [19];
(iii) when ε_n = 0, ∀n ∈ N^*, and g = 0, our algorithm is the Bregman Proximal (BP) studied in [17, 21];
(iv) when f = 0, our algorithm is the inexact version of the interior method with Bregman distance studied in [24];
(v) when f = 0 and ε_n = 0, ∀n ∈ N^*, our algorithm is the interior method with Bregman distance studied in [24];
(vi) when h = (1/2)‖·‖^2, our algorithm is the proximal gradient method (PG) and its variants [4–14, 27].

Our analysis is different from, and simpler than, the one given in [15, 22], and allows us to drop some hypotheses, in particular the supercoercivity of Ψ as well as the simultaneous convexity of f and g.

Data Availability

No data were used to support this study.

Conflicts of Interest

The author declares that there are no conflicts of interest.

References

[1] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
[2] A. Beck and Y. C. Eldar, "Sparsity constrained nonlinear optimization: optimality conditions and algorithms," SIAM Journal on Optimization, vol. 23, no. 3, pp. 1480–1509, 2013.
[3] D. R. Luke, "Phase retrieval, what's new?" SIAG/OPT Views and News, vol. 25, no. 1, pp. 1–5, 2017.
[4] A. Auslender, "Numerical methods for nondifferentiable convex optimization," in Mathematical Programming Studies, J. P. Vial and C. B. Nguyen, Eds., vol. 30, pp. 102–126, Springer, Berlin, Germany, 1987.
[5] A. I. Chen and A. Ozdaglar, "A fast distributed proximal-gradient method," in Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 601–608, IEEE, Monticello, IL, USA, October 2012.
[6] J.-B. Hiriart-Urruty, "ε-subdifferential calculus," in Convex Analysis and Optimization, Research Notes in Mathematics Series, vol. 57, Pitman Publishers, London, UK, 1982.
[7] K. Jiang, D. Sun, and K.-C. Toh, "An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP," SIAM Journal on Optimization, vol. 22, no. 3, pp. 1042–1064, 2012.



[8] B. Lemaire, "About the convergence of the proximal method," in Advances in Optimization (Proceedings of the 6th French-German Conference on Optimization, 1991), Lecture Notes, pp. 39–51, Springer, Berlin, Germany, 1992.
[9] B. Lemaire, "The proximal algorithm," International Series of Numerical Mathematics, vol. 87, pp. 73–87, 1989.
[10] B. Martinet, "Perturbation des méthodes d'optimisation. Application," Modélisation Mathématique et Analyse Numérique, vol. 12, no. 2, pp. 153–171, 1976.
[11] N. Parikh, "Proximal algorithms," Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.
[12] M. Schmidt, N. L. Roux, and F. R. Bach, "Convergence rates of inexact proximal-gradient methods for convex optimization," in Advances in Neural Information Processing Systems, vol. 28, pp. 1458–1466, MIT Press, Cambridge, MA, USA, 2011.
[13] N. D. Vanli, M. Gurbuzbalaban, and A. Ozdaglar, "Global convergence rate of proximal incremental aggregated gradient methods," SIAM Journal on Optimization, vol. 28, no. 2, pp. 1282–1300, 2018.
[14] L. Xiao and T. Zhang, "A proximal stochastic gradient method with progressive variance reduction," SIAM Journal on Optimization, vol. 24, no. 4, pp. 2057–2075, 2014.
[15] J. Bolte, S. Sabach, M. Teboulle, and Y. Vaisbourd, "First order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems," SIAM Journal on Optimization, vol. 28, no. 3, pp. 2131–2151, 2018.
[16] L. M. Bregman, "The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming," USSR Computational Mathematics and Mathematical Physics, vol. 7, no. 3, pp. 200–217, 1967.
[17] J. Eckstein, "Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming," Mathematics of Operations Research, vol. 18, no. 1, pp. 202–226, 1993.
[18] S. Kabbadj, "Theoretical aspect of diagonal Bregman proximal methods," Journal of Applied Mathematics, vol. 2020, Article ID 3108056, 9 pages, 2020.
[19] S. Kabbadj, "Méthodes proximales entropiques," Thèse, Université Montpellier II, Montpellier, France, 1994.
[20] R. T. Rockafellar, "Monotone operators and the proximal point algorithm," SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.
[21] G. Chen and M. Teboulle, "Convergence analysis of a proximal-like minimization algorithm using Bregman functions," SIAM Journal on Optimization, vol. 3, no. 3, pp. 538–543, 1993.
[22] D. H. Gutman and J. F. Peña, "A unified framework for Bregman proximal methods: subgradient, gradient, and accelerated gradient schemes," 2018, https://arxiv.org/pdf/1812.10198.pdf.
[23] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, USA, 1970.
[24] A. Auslender and M. Teboulle, "Interior gradient and proximal methods for convex and conic optimization," SIAM Journal on Optimization, vol. 16, no. 3, pp. 697–725, 2006.
[25] M. Bertero, P. Boccacci, G. Desidera, and G. Vicidomini, "Image deblurring with Poisson data: from cells to galaxies," Inverse Problems, vol. 25, no. 12, 2009.
[26] I. Csiszár, "Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems," The Annals of Statistics, vol. 19, no. 4, pp. 2032–2066, 1991.
[27] A. Nitanda, "Stochastic proximal gradient descent with acceleration techniques," in Advances in Neural Information Processing Systems, pp. 1574–1582, MIT Press, Cambridge, MA, USA, 2014.


Page 5: ResearchArticle ...downloads.hindawi.com/journals/aaa/2020/1963980.pdf4:∀r≥0,∀x ∈S,thesetsbelowarebounded L 2(x,r) y∈S;D h(y,x)≤r . (15) H 5:if{ }xn n ⊂S,thenxn x∗∈S,so

forallλ isin 01L

1113877 1113876 forallu isin S z Dhminus λg( u)1113872 1113873 xlowast

( 1113857

nabla(h minus λg) xlowast( ) minus nabla(h minus λg)(u)1113864 1113865 if xlowast isin S

empty if not

⎧⎪⎨

⎪⎩

(35)

Proof Since h minus λg is a Legendre function on S Dhminus λg( u)

is also a Legendre By application of )eorem 261 in [23]z(Dhminus λg( u)) verifies the following

(i) If xlowast isin int(domDhminus λg( u)) S then

z Dhminus λg( u)1113872 1113873 xlowast

( 1113857 nablaDhminus λg( u) xlowast

( 11138571113966 1113967 (36)

(ii) If xlowast notin S then z(Dh( u))(xlowast) empty

Theorem 2 (well-posedness of the method) We assumethat

(i) Ψ is convex(ii) e pair (g h) verified the condition (LC)(iii) forallrge 0 forallx isin S the sets below are bounded

L2(x r) y isin S Dh(y x)le r1113864 1113865 (37)

Then forallλ isin ]0 (1L)[ and the map Tλ defined in (25) isnonempty and single-valued from S to S

Proof forallx isin S and Tλ(x) is nonempty for this it is enoughto demonstrate that forallr isin R

L(x r) u isin S Ψ(u) + λminus 1Dhminus λg(u x)le r1113966 1113967 (38)

which is closed and is bounded when it is nonempty

u isin L(x r)⟹Ψ(u) + λminus 1Dhminus λg(u x)le r

⟹Dhminus λg(u x)le λ r minus Ψlowast( 1113857

⟹Dh(u x)le λ r minus Ψlowast( 1113857 + λDg(u x)

⟹Dh(u x)le λ r minus Ψlowast( 1113857

+ LλDh(u x)

⟹Dh(u x)leλ r minus Ψlowast( )

1 minus Lλ

(39)

It follows that

L(x r) sub L2 xλ r minus Ψlowast( )

1 minus Lλ1113888 1113889 (40)

thanks to H4 L2(x (λ(r minus Ψlowast)1 minus Lλ)) is bounded whichleads that L(x r) is bounded too which shows that

Tλ(x)neempty (41)

Let xlowast isin Tλ(x) Let us suppose that xlowast isin S(26)⟹ 0 isinz(Ψ(middot) + 1λDhminus λg( x))(xlowast) since ri(domΨ)cap ri(domDhminus λg

( x)) ri(domΨ)cap ri(S) ri(domΨ)cap Sneempty from [10]which allows to write that

0 isin zΨ xlowast

( 1113857 + z1λDhminus λg( x)1113874 1113875 x

lowast( 1113857 (42)

It follows that

existu isin zΨ xlowast

( 1113857 (43)

such that

minus λu isin zDhminus λg( x) xlowast

( 1113857 (44)

is in contradiction with Lemma 6 )en Tλ(x) sub S

On the other hand h minus λg is strictly convex in S and Ψ isconvex so Ψ(middot) + Dhminus λg( x) is strongly convex in S )enTλ(x) has a unique value for all x isin S

Remark 1 )is result is liberated from the supercoercivityof Ψ and the simultaneous convexity of f and g as requiredby Lemma 2 [15]

Proposition 3 forallx isin Sforallλ isin ]0 (1L)[nabla(h minus λg)(x) minus nabla(h minus λg) Tλ(x)( 1113857

λisin zΨ Tλ(x)( 1113857 (45)

existx isin Snabla(h minus λg)(x) minus nabla(h minus λg)(x)

λisin zεΨ(x) (46)

Proof Since Tλ(x) isin S we have

0 isin z Ψ(middot) +1λDhminus λg( x)1113874 1113875 Tλ(x)( 1113857

⟹ minus nabla1λDhminus λg( x)1113874 1113875 Tλ(x)( 1113857( 1113857 isin z Ψ Tλ(x)( 1113857( 1113857

⟹ (45)

(47)

For (46) just take x Tλ(x) since

zΨ Tλ(x)( 1113857 sub zεΨ Tλ(x)( 1113857 (48)

Proposition 4 forallx isin Sforallλ isin ]0 (1L)[

Tλ(x) proxhminus λg

λΨ (x) proxhλf o proxh

λp(x) (49)

where p(u) langnablag(x) urang

Proof )e first equality is due to Lemma 3 )e second isestablished in [15]

)e first equality played a decisive role in the devel-opment of this paper

4 Analysis of the εminusNoLips Algorithm

In this section we propose an Inexact Bregman ProximalGradient Algorithm (IBPG) which is an inexact version ofthe BPG algorithm described in [15 22] the IBPG

Abstract and Applied Analysis 5

framework allows an error in the subgradient inclusion byusing the error εn We study two algorithms

(i) Algorithm 1 inexact Bregman Proximal Gradient(IBPG) algorithm without relative error criterion

(ii) Algorithm 2 inexact Bregman Proximal Gradient(IBPG) algorithm with relative error criterion whichwe call εminus NoLips

We establish the main convergence properties of theproposed algorithms In particular we prove its global rateof convergence showing that it shares the claimed sublinearrate O(1n) of basic first-order methods such as the classicalPG and BPG We also derive a global convergence of thesequence generated by NoLips to a minimizer of (P)

Assumptions 4

(i) h isin B(S)capE(S)

(ii) Ψ is convex(iii) )e pair (g h) verified the condition (LC)(iv) Argmin Ψneempty

In our analysis Ψ is supposed to be a convex function itallows to distinguish two interesting cases

(i) )e nonsmooth part f is possibly not convex and thesmooth part g is convex

(ii) )e nonsmooth part f is convex and the smooth partg is possibly not convex

In what follows the choice of the sequence λn1113864 1113865 dependsof the convexity of g

Let λ such that 0lt λlt (1L) λ0 ≔ 0

If g is not convex then we choose

λn λ (0lt λle λ) n 1 (50)

If g is convex then we choose

λn le λn+1 le λ n 1 (51)

In those conditions we easily show thatforallx y isin S foralln isin N 0lt λn le λ such that

λn+1 minus λn( 1113857Dg(x y)ge 0 (52)

We pose

hn h minus λng n 1 (53)

Proposition 5 e sequence xn n defined by (IBPG) existsand is verified for all n isin Nlowast

xn isin εn minus argmin Ψ(u) +

1λn

Dhnu x

nminus 11113872 1113873 u isin S1113896 1113897 (54)

Proof Existence is deduced trivially from (45)

Ωn ≔nablahn xnminus 1( 1113857 minus nablahn xn( )

λn

isin zεnΨ x

n( 1113857

⟹Ψ(u)geΨ xn

( 1113857 +langu minus xnΩnrang minus εn

(55)

By applying Lemma 2 we have

Ψ(u)geΨ xn

( 1113857 + λminus 1n Dhn

u xn

( 1113857 + Dhnx

n x

nminus 11113872 1113873 minus Dhn

u xnminus 1

1113872 11138731113960 1113961 minus εn

⟹Ψ xn

( 1113857 + λminus 1n Dhn

xn x

nminus 11113872 1113873leΨ(u) + λminus 1

n Dhnu x

nminus 11113872 1113873 + εn forallu isin S

⟹ xn isin T

εn

λnx

nminus 11113872 1113873

(56)

where Tελ(x) ε minus argmin Ψ(u) + (1λ)Dhminus λg(u x)1113966 1113967

Remark 2 )is result shows that IBPG is an inexact versionof BPG and this is exactly the BPG when εn 0 ie

(i) xn isin Tεn

λn(xnminus 1)⟹ xn ≃Tλn

(xnminus 1)

(ii) εn 0⟹ xn Tλn(xnminus 1)

Proposition 6 For all n isin Nlowast

(i) Ψ xn

( 1113857 minus Ψ xnminus 1

1113872 11138731113872 1113873le minus Dhnx

n x

nminus 11113872 1113873 + λnεn (57)

(ii) Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le

λ1 minus λLΨ x

nminus 11113872 11138731113872

minus Ψ xn

( 11138571113857 +λnεn

1 minus λL

(58)

Proof

(55)⟹λn Ψ xn

( 1113857 minus Ψ(u)( 1113857le Dhnu x

nminus 11113872 1113873 minus Dhn

u xn

( 1113857 minus Dhnx

n x

nminus 11113872 11138731113960 1113961 + εnλn (59)

6 Abstract and Applied Analysis

We put u xnminus 1 in (59) we get (57)(ii) Put u xnminus 1 in (59) we have

Dhnx

n x

nminus 11113872 1113873 + Dhn

xnminus 1

xn

1113872 1113873le λn Ψ xnminus 1

1113872 1113873 minus Ψ xn

( 11138571113872 1113873 + λnεn

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le λn Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 + λnεn + λnDg x

n x

nminus 11113872 1113873 + λnDg x

nminus 1 x

n1113872 1113873

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le λ Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 + λnεn + λL Dh x

n x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 11138731113872 1113873

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le

λ1 minus λLΨ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 +

λnεn

1 minus λL

(60)

Corollary 1

(i) If εn 0 the sequence Ψ(xn) n is nonincreasing(ii) Summability if 1113936

infinn1λnεn lt +infin then 1113936

infinn1

Dh(xn xnminus 1)lt +infin and 1113936infinn1Dh(xnminus 1 xn)lt +infin

Proof

(i) From (57) Dhn(xn xnminus 1)ge 0⟹Ψ(xn)leΨ(xnminus 1)

foralln isin Nlowast(ii) From (58) 1113936

npn1Dh(xn xnminus 1) + Dh(xnminus 1 xn)le

λ1 minus λL

1113944

np

n1Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 +

11 minus λL

1113944

np

n1λnεn

⟹1113944

np

n1Dh x

n x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873

leλ

1 minus λLΨ x

01113872 1113873 minus Ψlowast1113872 1113873 +

11 minus λL

1113944

np

n1λnεn

(61)

In the following we pose

tp ≔ 1113944

np

n1λn forallp isin N

lowast t0 λ0 0

An(u) Dhnu x

n( 1113857 forallu isin S foralln isin N

lowast

Φ xp

( 1113857 ≔ min Ψ xk

1113872 1113873 1le klep1113966 1113967 forallp isin Nlowast

(62)

Proposition 7 (Global Estimate in Function Values) (a) Forall u isin S and forallp isin Nlowast

Φ xp

( 1113857 minus Ψ(u)le1tp

Dh u x0

1113872 1113873 + 1113944infin

n1λnεn

⎡⎣ ⎤⎦ (63)

Proof We have Dhn(u xnminus 1) Anminus 1(u) minus (λn minus λnminus 1)Dg

(u xnminus 1) from (59) we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) minus Dhnx

n x

nminus 11113872 1113873

minus λn minus λnminus 1( 1113857Dg u xnminus 1

1113872 1113873 + λnεn

(64)

From (52) we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) + λnεn

⟹1113944

np

n1λn Ψ x

n( 1113857 minus Ψ(u)( 1113857le 1113944

np

n1Anminus 1(u) minus An(u) + 1113944

np

n1λnεn

⟹1113944

np

n1λn Ψ x

n( 1113857 minus Ψ(u)( 1113857leA0(u) minus Ap(u) + 1113944

np

n1λnεn

⟹ Φ xp

( 1113857 minus Ψ(u)( 1113857 1113944n1

np

λn leA0(u) minus Ap(u) + 1113944

np

n1λnεn

(65)

A0(u) Dh(u x0) (λ0 0) so

Φ xp

( 1113857 minus Ψ(u)le1tp

Dh u x0

1113872 1113873 + 1113944infin

n1λnεn

⎡⎣ ⎤⎦ (66)

)is theorem covers the evaluation of the global con-vergence rate given in [6 12] as shown by the followingcorollary

Corollary 2 We assume that

(a) εn 0 n 1

(b) λn λ 12L n 1

(c) xlowast isin argminΨ

en

Ψ xp

( 1113857 minus Ψlowast le2L

pDh xlowast x

01113872 1113873 (67)

ie Ψ(xp) minus Ψlowast O(1p)

Proof Immediate consequence of Proposition 7

Now we derive a global convergence of the sequencegenerated by Algorithm 1 to a minimizer of (P)

Abstract and Applied Analysis 7

Theorem 3 We assume that 1113936 λnεn lt +infin if one of thefollowing assumptions holds

(i) exist λ gt 0 such that λ le λn le λ n 1

(ii) e sequence Ψ(xn) is nonincreasing and1113936 λn +infin then (a) Ψ(xn)⟶ inf Ψ and (b)xn⟶ xlowast isin argminΨ

Proof

(a) Suppose

(i) Let xlowast isin argminΨ and we put u xlowast in (69) wehave

λ Ψ xn

( 1113857 minus Ψ xlowast

( 1113857( 1113857leAnminus 1 xlowast

( 1113857 minus An xlowast

( 1113857 + λnεn

(68)

An xlowast

( 1113857leAnminus 1 xlowast

( 1113857 + λnεn (69)

1113944 λnεn lt +infin⟹An xlowast

( 1113857⟶ l isin R (70)

(68) and (suppose (69))⟹Ψ(xn)⟶ inf Ψ

Suppose (ii)

Φ xp

( 1113857 min Ψ xk

1113872 1113873 1le klep1113966 1113967 Ψ xp

( 1113857 (71)

For u xlowast isin argminΨ in (63) we have 0leΨ(xp) minus Ψ(xlowast) le (1tp)[Dh(xlowast x0) + 1113936

infinn1λnεn]

so 1113936 λn +infin⟹Ψ(xn)⟶ inf Ψ

(b) (69)⟹existαgt 0 An(xlowast)le αAn(xlowast) Dhn(xlowast xn)

Dh(xlowast xn) minus λnDg(xlowast xn) so

Dh xlowast x

n( 1113857le α + λnDg x

lowast x

n( 1113857

dArr

Dh xlowast x

n( 1113857le

α1 minus λL

(72)

Then Dh(xlowast xn)1113864 1113865 is bounded and from H3 xn isbounded as well Let ulowast isin Adh xn there exists then asubsequence xni of xn such that xni⟶ ulowast isin S From H5Dh(ulowast xni )⟶ 0 On the other hand

0leDg ulowast x

ni( 1113857leL middot Dh ulowast x

ni( 1113857 (73)

so Dg(ulowast xni )⟶ 0ulowast isin argminΨ Indeed

inf ΨleΨ ulowast

( 1113857le lim Ψ xni( 1113857 inf Ψ

dArr

inf Ψ Ψ ulowast

( 1113857

(74)

which shows that⟹ulowast isin argminΨ )en

Dhniulowast x

ni( 1113857 Dh ulowast x

ni( 1113857 minus λniDg u

lowast x

ni( 1113857⟶ 0 (75)

Since Dhni(ulowast xni )⟶ 0 and ulowast isin argminΨ we have

Dhnulowast x

n( 1113857⟶ 0 (76)

We have

Dh ulowast x

n( 1113857 Dhn

ulowast x

n( 1113857 + λnDg u

lowast x

n( 1113857

leDhnulowast x

n( 1113857 + L middot λDh u

lowast x

n( 1113857

(77)

so (1 minus L middot λ)Dh(ulowast xn)leDhn(ulowast xn) then

Dh ulowast x

n( 1113857⟶ 0 (78)

And from H6 we have xn⟶ ulowast isin argminΨ )e IBPG algorithm generates a sequence such thatΨ(xn) n does not necessarily be nonincreasing for thisreason and for improvement of the global estimate infunction values we now propose εminus NoLips which is aninexact version of BPG with a relative error criterion Let σsuch that 0le σ lt 1 be given as follows

In what follows we will derive a convergence rate result()eorem 4) for the εminus NoLips framework First we need toestablish a few technical lemmas

In the following xn n denotes the sequence generatedby εminus NoLips

Lemma 7 For every u isin S for all n isin Nlowast

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u)

minus (1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

(79)

Proof Since (λn minus λnminus 1)Dg(u xnminus 1)ge 0 we have from (62)

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857le minus Dhnx

n x

nminus 11113872 1113873 + Anminus 1(u)

amp9 minus An(u) + λnεn(80)

From Algorithm 2 we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) + (σ minus 1)Dhnx

n x

nminus 11113872 1113873

(81)

From the condition LC we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u)

minus (1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

(82)

Remark 3 We now notice that Ψ(xn) n is nonincreasingJust replace u with xnminus 1 in (79)

Lemma 8 For every n isin Nlowast and xlowast isin argminΨ we have

tn Ψ xn

( 1113857 minus Ψlowast( 1113857 + λminus 1n tn(1 minus σ)(1 minus λL)Dh x

n x

nminus 11113872 1113873

le tnminus 1 Ψ xnminus 1

1113872 1113873 minus Ψlowast1113872 1113873 + Anminus 1 xlowast

( 1113857 minus An xlowast

( 1113857

(83)

Proof Replacing u by xnminus 1 in (79) and since An(xnminus 1)ge 0and Anminus 1(xnminus 1) 0 we have

8 Abstract and Applied Analysis

tnminus 1 Ψ xn

( 1113857 minus Ψlowast( 11138571113857 + tnminus 1λminus 1n (1 minus σ)(1 minus λL)Dh x

n x

nminus 11113872 1113873

le tnminus 1 Ψ xnminus 1

1113872 1113873 minus Ψlowast1113872 11138731113873

(84)

Replacing u by xlowast in (79) we have

λn Ψ xn

( 1113857 minus Ψ xlowast

( 1113857( 1113857 +(1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

+ leAnminus 1 xlowast

( 1113857 minus An xlowast

( 1113857(85)

Since tnminus 1 + λn tn by adding (84) and (85) we have(83)

Lemma 9 For every k isin Nlowast

tk Ψ xk

1113872 1113873 minus Ψlowast1113872 1113873 +(1 minus σ)(1 minus λL) 1113944nk

n1λminus 1

n tnDh xn x

nminus 11113872 1113873

leA0 xlowast

( 1113857 minus Ak xlowast

( 1113857

(86)

Proof )is result is obtained by adding inequality (83) from1 to k (t0 λ0 0)

We are now ready to state the convergence rate result forthe εminus NoLips framework )is result improves and com-pletes the one given in the Proposition 7

Theorem 4 For every k isin Nlowast the following statements holdwe pose ρk ≔ 1113936

nkn1λ

minus 1n tn Consider

(a)Ψ xk

1113872 1113873 minus Ψlowast leDh xlowast x0( 1113857

tk

1113888 1113889 (87)

(b) ck ≔ min1lenlek

Dh xn x

nminus 11113872 1113873le

Dh xlowast x0( 1113857

(1 minus σ)(1 minus λL)ρk

1113888 1113889

(88)

Proof From (86) and since A0(xlowast) Dh(xlowast x0)(λ0 0)

and Ak(xlowast)ge 0 we immediately have (87) and (88)

Corollary 3 Consider an instance of the εminus NoLips frame-work with λn λ for every n isin Nlowasten for every k isin Nlowast thefollowing statements hold

(a)Ψ xk

1113872 1113873 minus Ψlowast O1k

1113874 1113875 (89)

(b) ck min1lenlek

Dh xn x

nminus 11113872 1113873 O

1k21113874 1113875 (90)

Proof tk kλ and (87)⟹ (90)

ρk k(k + 1)

2ge

k2

2 (91)

and (88)⟹ (90)

Remark 4 (87) and (89) represent exactly the convergencerate established in [15 22] (90) is a new result not estab-lished in [15 22] this result shows that ck converges to zeroat a rate of O(1k2)

Theorem 5 If 1113936 λn +infin then

(a) Ψ(xn)⟶ inf Ψ(b) xn⟶ xlowast isin argminΨ

Proof Replacing u by xlowast in (81) we have

(1 minus σ)Dhnx

n x

nminus 11113872 1113873leAnminus 1 x

lowast( 1113857 minus An x

lowast( 1113857 (92)

By adding the inequality (92) from 1 to k we have

(1 minus σ) 1113944nk

n1Dhn

xn x

nminus 11113872 1113873leA0 x

lowast( 1113857 minus Ak x

lowast( 1113857 (93)

Ak(xlowast)ge 0 so 1113936infinn1Dhn

(xn xnminus 1)lt +infin From (69) we have

1113944

infin

n1λnεn lt +infin (94)

Since Ψ(xn) n is nonincreasing and (94) by applyingthe )eorem 3 (ii) we have the results of )eorem 5

5 Application to Nonnegative LinearInverse Problem

In Poisson inverse problems (eg [25 26]) we are given anonnegative observation matrix A isin Rmtimesd

+ and a noisymeasurement vector b isin Rm

+ and the goal is to reconstructthe signal x isin Rd

+ such that Ax≃ b We can naturally adoptthe distance D(Ax b) to measure the residuals between twononnegative points with

(1) Input x0 isin Scap ri(domΨ)

(2) For n 1 2 with εn gt 0 we obtain (nablahn(xnminus 1) minus nablahn(xn)λn) isin zεnΨ(xn)

ALGORITHM 1 Inexact Bregman Proximal Gradient (IBPG)

Abstract and Applied Analysis 9

D(Ax b) 1113944m

i1langai xranglog

langai xrangbi

+ bi minus langai xrang (95)

where ai denotes the ith line of AIn this section we propose an approach for solving the

nonnegative linear inverse problem defined by

Pα( 1113857 inf αx1 + D(Ax b) x isin Rd+1113966 1113967 (αgt 0) (96)

We take

f(x) ≔ αx1

g(x) ≔ D(Ax b)

h(x) ≔ h1(x) 1113944d

i1xi logxi

(97)

It is shown in [15] that the couple (g h) verified aLipschitz-likeConvexity Condition (LC) on Rd

+ for any Lsuch that

Lge max1lejled

1113944

m

i1aij max

1lejledAj

1 (98)

where Aj denotes the jth column of AFor λn ≔ λ foralln )eorem 3 is applicable and global

warrant of Algorithm 1 convergence to an optimal solutionof (Pα)

Given xn isin Rd++ the iteration

xn+1≃Tλ x

n( 1113857 (99)

amounts to solving the one-dimensional problemFor j 1 d

xn+1j ≃ argmin αt + cjt +

t logt

xnj

+ xnj minus t⎛⎝ ⎞⎠ tgt 0

⎧⎨

⎫⎬

(100)

where cj is the jth component of nablag(xn)

6 Conclusion

)e proposed algorithms constitute a unified frame for theexisting algorithms BPG BP and PG by giving others inparticular the inexact version of the interior method withBregman distance studied in [24] More precisely

(i) When εn 0 foralln isin Nlowast our algorithm is the NoLipsstudied in [15 22]

(ii) When g 0 our algorithm is the inexact version ofBregman Proximal (BP) studied in [19]

(iii) When εn 0 foralln isin Nlowast and g 0 our algorithm isthe Bregman Proximal (BP) studied in [17 21]

(iv) When f 0 our algorithm is the inexact version ofthe interior method with Bregman distance studiedin [24]

(v) When f 0 and εn 0 foralln isin Nlowast our algorithm isthe interior method with Bregman distance studiedin [24]

(vi) When h (12)middot2 our algorithm is the proximalgradient method (PG) and its variants [4ndash14 27]

Our analysis is different and more simple than the onegiven in [15 22] and allows to reduce some hypothesis inparticular the supercoercivity of Ψ as well as the simulta-neous convexity of f and g

Data Availability

No data were used to support this study

Conflicts of Interest

)e authors declare that they have no conflicts of interest

References

[1] D L Donoho ldquoCompressed sensingrdquo IEEE Transactions onInformation eory vol 52 no 4 pp 1289ndash1306 2006

[2] A Beck and Y C Eldar ldquoSparsity constrained nonlinearoptimization optimality conditions and algorithmsrdquo SIAMJournal on Optimization vol 23 no 3 pp 1480ndash1509 2013

[3] D R Luke ldquoPhase retrieval whatrsquos newrdquo SIAGOPT Viewsand News vol 25 no 1 pp 1ndash5 2017

[4] A Auslender ldquoNumrrical methods for non-differentiableconvex optimizationrdquo inMathematical Programming StudiesJ P Vial and C B Nguyen Eds vol 30 pp 102ndash126Springer Berlin Germany 1987

[5] A I Chen and A Ozdaglar ldquoA fast distributed proximal-gradient methodrdquo in Proceedings of the 2012 50th AnnualAllerton Conference on Communication Control and Com-puting (Allerton) pp 601ndash608 IEEE Monticello IL USAOctober 2012

[6] J B Hiriart-Urruty ldquoε-subdifferentiel calculs Convex anal-ysis and optimizationrdquo in Research Notes in MathematicsSeries Vol 57 Pitman Publishers London UK 1982

[7] K Jiang D Sun and K-C Toh ldquoAn inexact acceleratedproximal gradient method for large scale linearly constrainedconvex SDPrdquo SIAM Journal on Optimization vol 22 no 3pp 1042ndash1064 2012

(1) Input x0 isin Scap ri(domΨ)

(2) Choose λn gt 0 and find xn isin S σn isin [0 σ] and εn ge 0 such that((nablahn(xnminus 1) minus nablahn(xn))λn) isin zεn

Ψ(xn)

λnεn le σnDhn(xn xnminus 1)

(3) Set n⟵ n + 1 and go to step 1

ALGORITHM 2 εminus NoLips

10 Abstract and Applied Analysis

[8] B Lemaire ldquoAbout the convergence of the proximal method proceedings 6th French German conference on optimisationrdquoin 1991 Advances in Optimization Lecture Notes pp 39ndash51Springer Berlin Germany 1992

[9] B Lemaire ldquo)e proximal algorithmrdquo International Series ofNumerical Mathematics vol 87 pp 73ndash87 1989

[10] B Martinet ldquoPerturbation des methodes drsquooptimisation-Applicationrdquo Modelisation Mathematique et AnalyseNumerique vol 12 no 2 pp 153ndash171 1976

[11] N Parikh ldquoProximal algorithmsrdquo Foundations and Trends inOptimization vol 1 no 3 pp 127ndash239 2014

[12] M Schmidt N L Roux and F R Bach ldquoConvergence rates ofinexact proximal-gradient methods for convex optimizationrdquoin Advances in Neural Information Processing Systems vol 28pp 1458ndash1466 MIT Press Cambridge MA USA 2011

[13] N D Vanli M Gurbuzbalaban and A Ozdaglar ldquoGlobalconvergence rate of proximal incremental aggregated gradientmethodsrdquo SIAM Journal on Optimization vol 28 no 2pp 1282ndash1300 2018

[14] L Xiao and T Zhang ldquoA proximal stochastic gradient methodwith progressive variance reductionrdquo SIAM Journal on Op-timization vol 24 no 4 pp 2057ndash2075 2014

[15] J Bolte S Sabach M Teboulle and Y Vaisbourd ldquoFirst ordermethods beyond convexity and Lipschitz gradient continuitywith applications to quadratic inverse problemsrdquo SIAMJournal on Optimization vol 28 no 3 pp 2131ndash2151 2018

[16] L M Bregman ldquo)e relaxationmethod of finding the commonpoint of convex sets and its application to the solution ofproblems in convex programming finding the common pointof convex sets and its application to the solution of problems inconve programmingrdquo USSR Computational Mathematics andMathematical Physics vol 7 no 3 pp 200ndash217 1967

[17] J Eckstein ldquoNonlinear proximal point algorithms using Bregmanfunctions with applications to convex programmingrdquo Mathe-matics of Operations Research vol 18 no 1 pp 202ndash226 1993

[18] S Kabbadj ldquo)eoretical aspect of diagonal bregman proximalmethodsrdquo Journal of Applied Mathematics vol 2020 ArticleID 3108056 9 pages 2020

[19] S Kabbadj ldquoMethodes proximales entropiquesrdquo )eseUniversite Montpellier II Montpellier France 1994

[20] R T Rockafellar ldquoMonotone operators and the proximalpoint algorithmrdquo SIAM Journal on Control and Optimizationvol 14 no 5 pp 877ndash898 1975

[21] M Teboulle and C Gong ldquoConvergence analysis of a proximal-like minimization algorithm using Bregman functionrdquo SIAMJournal on Optimization vol 3 no 3 pp 538ndash543 1993

[22] D H Gutman and J F Pena ldquoUnified framework for bregmanproximal methodssubgradient gradient and accelerated gra-dient schemesrdquo 2018 httpsarxivorgpdf181210198pdf

[23] R T Rockafellar Convex Analysis Princeton UniversityPress Princeton NJ USA 1970

[24] A Auslender and M Teboulle ldquoInterior gradient and prox-imal methods for convex and conic optimizationrdquo SIAMJournal on Optimization vol 16 no 3 pp 697ndash725 2006

[25] M Bertero P Boccacci G Desidera and G VicidominildquoImage deblurring with Poisson data from cells to galaxiesrdquoInverse Problems vol 25 no 12 2009

[26] I Csiszar ldquoWhy least squares and maximum entropy Anaxiomatic approach to inference for linear inverse problemsrdquoe Annals of Statistics vol 19 no 4 pp 2032ndash2066 1991

[27] A Nitanda ldquoStochastic proximal gradient descent with ac-celeration techniquesrdquo in Advances in Neural InformationProcessing Systems pp 1574ndash1582 MIT Press CambridgeMA USA 2014

Abstract and Applied Analysis 11

Page 6: ResearchArticle ...downloads.hindawi.com/journals/aaa/2020/1963980.pdf4:∀r≥0,∀x ∈S,thesetsbelowarebounded L 2(x,r) y∈S;D h(y,x)≤r . (15) H 5:if{ }xn n ⊂S,thenxn x∗∈S,so

framework allows an error in the subgradient inclusion byusing the error εn We study two algorithms

(i) Algorithm 1 inexact Bregman Proximal Gradient(IBPG) algorithm without relative error criterion

(ii) Algorithm 2 inexact Bregman Proximal Gradient(IBPG) algorithm with relative error criterion whichwe call εminus NoLips

We establish the main convergence properties of theproposed algorithms In particular we prove its global rateof convergence showing that it shares the claimed sublinearrate O(1n) of basic first-order methods such as the classicalPG and BPG We also derive a global convergence of thesequence generated by NoLips to a minimizer of (P)

Assumptions 4

(i) h isin B(S)capE(S)

(ii) Ψ is convex(iii) )e pair (g h) verified the condition (LC)(iv) Argmin Ψneempty

In our analysis Ψ is supposed to be a convex function itallows to distinguish two interesting cases

(i) )e nonsmooth part f is possibly not convex and thesmooth part g is convex

(ii) )e nonsmooth part f is convex and the smooth partg is possibly not convex

In what follows the choice of the sequence λn1113864 1113865 dependsof the convexity of g

Let λ such that 0lt λlt (1L) λ0 ≔ 0

If g is not convex then we choose

λn λ (0lt λle λ) n 1 (50)

If g is convex then we choose

λn le λn+1 le λ n 1 (51)

In those conditions we easily show thatforallx y isin S foralln isin N 0lt λn le λ such that

λn+1 minus λn( 1113857Dg(x y)ge 0 (52)

We pose

hn h minus λng n 1 (53)

Proposition 5 e sequence xn n defined by (IBPG) existsand is verified for all n isin Nlowast

xn isin εn minus argmin Ψ(u) +

1λn

Dhnu x

nminus 11113872 1113873 u isin S1113896 1113897 (54)

Proof Existence is deduced trivially from (45)

Ωn ≔nablahn xnminus 1( 1113857 minus nablahn xn( )

λn

isin zεnΨ x

n( 1113857

⟹Ψ(u)geΨ xn

( 1113857 +langu minus xnΩnrang minus εn

(55)

By applying Lemma 2 we have

Ψ(u)geΨ xn

( 1113857 + λminus 1n Dhn

u xn

( 1113857 + Dhnx

n x

nminus 11113872 1113873 minus Dhn

u xnminus 1

1113872 11138731113960 1113961 minus εn

⟹Ψ xn

( 1113857 + λminus 1n Dhn

xn x

nminus 11113872 1113873leΨ(u) + λminus 1

n Dhnu x

nminus 11113872 1113873 + εn forallu isin S

⟹ xn isin T

εn

λnx

nminus 11113872 1113873

(56)

where Tελ(x) ε minus argmin Ψ(u) + (1λ)Dhminus λg(u x)1113966 1113967

Remark 2 )is result shows that IBPG is an inexact versionof BPG and this is exactly the BPG when εn 0 ie

(i) xn isin Tεn

λn(xnminus 1)⟹ xn ≃Tλn

(xnminus 1)

(ii) εn 0⟹ xn Tλn(xnminus 1)

Proposition 6 For all n isin Nlowast

(i) Ψ xn

( 1113857 minus Ψ xnminus 1

1113872 11138731113872 1113873le minus Dhnx

n x

nminus 11113872 1113873 + λnεn (57)

(ii) Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le

λ1 minus λLΨ x

nminus 11113872 11138731113872

minus Ψ xn

( 11138571113857 +λnεn

1 minus λL

(58)

Proof

(55)⟹λn Ψ xn

( 1113857 minus Ψ(u)( 1113857le Dhnu x

nminus 11113872 1113873 minus Dhn

u xn

( 1113857 minus Dhnx

n x

nminus 11113872 11138731113960 1113961 + εnλn (59)

6 Abstract and Applied Analysis

We put u xnminus 1 in (59) we get (57)(ii) Put u xnminus 1 in (59) we have

Dhnx

n x

nminus 11113872 1113873 + Dhn

xnminus 1

xn

1113872 1113873le λn Ψ xnminus 1

1113872 1113873 minus Ψ xn

( 11138571113872 1113873 + λnεn

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le λn Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 + λnεn + λnDg x

n x

nminus 11113872 1113873 + λnDg x

nminus 1 x

n1113872 1113873

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le λ Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 + λnεn + λL Dh x

n x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 11138731113872 1113873

⟹Dh xn x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873le

λ1 minus λLΨ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 +

λnεn

1 minus λL

(60)

Corollary 1

(i) If εn 0 the sequence Ψ(xn) n is nonincreasing(ii) Summability if 1113936

infinn1λnεn lt +infin then 1113936

infinn1

Dh(xn xnminus 1)lt +infin and 1113936infinn1Dh(xnminus 1 xn)lt +infin

Proof

(i) From (57) Dhn(xn xnminus 1)ge 0⟹Ψ(xn)leΨ(xnminus 1)

foralln isin Nlowast(ii) From (58) 1113936

npn1Dh(xn xnminus 1) + Dh(xnminus 1 xn)le

λ1 minus λL

1113944

np

n1Ψ x

nminus 11113872 1113873 minus Ψ x

n( 11138571113872 1113873 +

11 minus λL

1113944

np

n1λnεn

⟹1113944

np

n1Dh x

n x

nminus 11113872 1113873 + Dh x

nminus 1 x

n1113872 1113873

leλ

1 minus λLΨ x

01113872 1113873 minus Ψlowast1113872 1113873 +

11 minus λL

1113944

np

n1λnεn

(61)

In the following we pose

tp ≔ 1113944

np

n1λn forallp isin N

lowast t0 λ0 0

An(u) Dhnu x

n( 1113857 forallu isin S foralln isin N

lowast

Φ xp

( 1113857 ≔ min Ψ xk

1113872 1113873 1le klep1113966 1113967 forallp isin Nlowast

(62)

Proposition 7 (Global Estimate in Function Values) (a) Forall u isin S and forallp isin Nlowast

Φ xp

( 1113857 minus Ψ(u)le1tp

Dh u x0

1113872 1113873 + 1113944infin

n1λnεn

⎡⎣ ⎤⎦ (63)

Proof We have Dhn(u xnminus 1) Anminus 1(u) minus (λn minus λnminus 1)Dg

(u xnminus 1) from (59) we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) minus Dhnx

n x

nminus 11113872 1113873

minus λn minus λnminus 1( 1113857Dg u xnminus 1

1113872 1113873 + λnεn

(64)

From (52) we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) + λnεn

⟹1113944

np

n1λn Ψ x

n( 1113857 minus Ψ(u)( 1113857le 1113944

np

n1Anminus 1(u) minus An(u) + 1113944

np

n1λnεn

⟹1113944

np

n1λn Ψ x

n( 1113857 minus Ψ(u)( 1113857leA0(u) minus Ap(u) + 1113944

np

n1λnεn

⟹ Φ xp

( 1113857 minus Ψ(u)( 1113857 1113944n1

np

λn leA0(u) minus Ap(u) + 1113944

np

n1λnεn

(65)

A0(u) Dh(u x0) (λ0 0) so

Φ xp

( 1113857 minus Ψ(u)le1tp

Dh u x0

1113872 1113873 + 1113944infin

n1λnεn

⎡⎣ ⎤⎦ (66)

)is theorem covers the evaluation of the global con-vergence rate given in [6 12] as shown by the followingcorollary

Corollary 2 We assume that

(a) εn 0 n 1

(b) λn λ 12L n 1

(c) xlowast isin argminΨ

en

Ψ xp

( 1113857 minus Ψlowast le2L

pDh xlowast x

01113872 1113873 (67)

ie Ψ(xp) minus Ψlowast O(1p)

Proof Immediate consequence of Proposition 7

Now we derive a global convergence of the sequencegenerated by Algorithm 1 to a minimizer of (P)

Abstract and Applied Analysis 7

Theorem 3 We assume that 1113936 λnεn lt +infin if one of thefollowing assumptions holds

(i) exist λ gt 0 such that λ le λn le λ n 1

(ii) e sequence Ψ(xn) is nonincreasing and1113936 λn +infin then (a) Ψ(xn)⟶ inf Ψ and (b)xn⟶ xlowast isin argminΨ

Proof

(a) Suppose

(i) Let xlowast isin argminΨ and we put u xlowast in (69) wehave

λ Ψ xn

( 1113857 minus Ψ xlowast

( 1113857( 1113857leAnminus 1 xlowast

( 1113857 minus An xlowast

( 1113857 + λnεn

(68)

An xlowast

( 1113857leAnminus 1 xlowast

( 1113857 + λnεn (69)

1113944 λnεn lt +infin⟹An xlowast

( 1113857⟶ l isin R (70)

(68) and (suppose (69))⟹Ψ(xn)⟶ inf Ψ

Suppose (ii)

Φ xp

( 1113857 min Ψ xk

1113872 1113873 1le klep1113966 1113967 Ψ xp

( 1113857 (71)

For u xlowast isin argminΨ in (63) we have 0leΨ(xp) minus Ψ(xlowast) le (1tp)[Dh(xlowast x0) + 1113936

infinn1λnεn]

so 1113936 λn +infin⟹Ψ(xn)⟶ inf Ψ

(b) (69)⟹existαgt 0 An(xlowast)le αAn(xlowast) Dhn(xlowast xn)

Dh(xlowast xn) minus λnDg(xlowast xn) so

Dh xlowast x

n( 1113857le α + λnDg x

lowast x

n( 1113857

dArr

Dh xlowast x

n( 1113857le

α1 minus λL

(72)

Then Dh(xlowast xn)1113864 1113865 is bounded and from H3 xn isbounded as well Let ulowast isin Adh xn there exists then asubsequence xni of xn such that xni⟶ ulowast isin S From H5Dh(ulowast xni )⟶ 0 On the other hand

0leDg ulowast x

ni( 1113857leL middot Dh ulowast x

ni( 1113857 (73)

so Dg(ulowast xni )⟶ 0ulowast isin argminΨ Indeed

inf ΨleΨ ulowast

( 1113857le lim Ψ xni( 1113857 inf Ψ

dArr

inf Ψ Ψ ulowast

( 1113857

(74)

which shows that⟹ulowast isin argminΨ )en

Dhniulowast x

ni( 1113857 Dh ulowast x

ni( 1113857 minus λniDg u

lowast x

ni( 1113857⟶ 0 (75)

Since Dhni(ulowast xni )⟶ 0 and ulowast isin argminΨ we have

Dhnulowast x

n( 1113857⟶ 0 (76)

We have

Dh ulowast x

n( 1113857 Dhn

ulowast x

n( 1113857 + λnDg u

lowast x

n( 1113857

leDhnulowast x

n( 1113857 + L middot λDh u

lowast x

n( 1113857

(77)

so (1 minus L middot λ)Dh(ulowast xn)leDhn(ulowast xn) then

Dh ulowast x

n( 1113857⟶ 0 (78)

And from H6 we have xn⟶ ulowast isin argminΨ )e IBPG algorithm generates a sequence such thatΨ(xn) n does not necessarily be nonincreasing for thisreason and for improvement of the global estimate infunction values we now propose εminus NoLips which is aninexact version of BPG with a relative error criterion Let σsuch that 0le σ lt 1 be given as follows

In what follows we will derive a convergence rate result()eorem 4) for the εminus NoLips framework First we need toestablish a few technical lemmas

In the following xn n denotes the sequence generatedby εminus NoLips

Lemma 7 For every u isin S for all n isin Nlowast

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u)

minus (1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

(79)

Proof Since (λn minus λnminus 1)Dg(u xnminus 1)ge 0 we have from (62)

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857le minus Dhnx

n x

nminus 11113872 1113873 + Anminus 1(u)

amp9 minus An(u) + λnεn(80)

From Algorithm 2 we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u) + (σ minus 1)Dhnx

n x

nminus 11113872 1113873

(81)

From the condition LC we have

λn Ψ xn

( 1113857 minus Ψ(u)( 1113857leAnminus 1(u) minus An(u)

minus (1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

(82)

Remark 3 We now notice that Ψ(xn) n is nonincreasingJust replace u with xnminus 1 in (79)

Lemma 8 For every n isin Nlowast and xlowast isin argminΨ we have

tn Ψ xn

( 1113857 minus Ψlowast( 1113857 + λminus 1n tn(1 minus σ)(1 minus λL)Dh x

n x

nminus 11113872 1113873

le tnminus 1 Ψ xnminus 1

1113872 1113873 minus Ψlowast1113872 1113873 + Anminus 1 xlowast

( 1113857 minus An xlowast

( 1113857

(83)

Proof Replacing u by xnminus 1 in (79) and since An(xnminus 1)ge 0and Anminus 1(xnminus 1) 0 we have

8 Abstract and Applied Analysis

tnminus 1 Ψ xn

( 1113857 minus Ψlowast( 11138571113857 + tnminus 1λminus 1n (1 minus σ)(1 minus λL)Dh x

n x

nminus 11113872 1113873

le tnminus 1 Ψ xnminus 1

1113872 1113873 minus Ψlowast1113872 11138731113873

(84)

Replacing u by xlowast in (79) we have

λn Ψ xn

( 1113857 minus Ψ xlowast

( 1113857( 1113857 +(1 minus σ)(1 minus λL)Dh xn x

nminus 11113872 1113873

+ leAnminus 1 xlowast

( 1113857 minus An xlowast

( 1113857(85)

Since tnminus 1 + λn tn by adding (84) and (85) we have(83)

Lemma 9 For every k isin Nlowast

tk Ψ xk

1113872 1113873 minus Ψlowast1113872 1113873 +(1 minus σ)(1 minus λL) 1113944nk

n1λminus 1

n tnDh xn x

nminus 11113872 1113873

leA0 xlowast

( 1113857 minus Ak xlowast

( 1113857

(86)

Proof )is result is obtained by adding inequality (83) from1 to k (t0 λ0 0)

We are now ready to state the convergence rate result forthe εminus NoLips framework )is result improves and com-pletes the one given in the Proposition 7

Theorem 4 For every k isin Nlowast the following statements holdwe pose ρk ≔ 1113936

nkn1λ

minus 1n tn Consider

(a)Ψ xk

1113872 1113873 minus Ψlowast leDh xlowast x0( 1113857

tk

1113888 1113889 (87)

(b) ck ≔ min1lenlek

Dh xn x

nminus 11113872 1113873le

Dh xlowast x0( 1113857

(1 minus σ)(1 minus λL)ρk

1113888 1113889

(88)

Proof From (86) and since A0(xlowast) Dh(xlowast x0)(λ0 0)

and Ak(xlowast)ge 0 we immediately have (87) and (88)

Corollary 3 Consider an instance of the εminus NoLips frame-work with λn λ for every n isin Nlowasten for every k isin Nlowast thefollowing statements hold

(a)Ψ xk

1113872 1113873 minus Ψlowast O1k

1113874 1113875 (89)

(b) ck min1lenlek

Dh xn x

nminus 11113872 1113873 O

1k21113874 1113875 (90)

Proof tk kλ and (87)⟹ (90)

ρk k(k + 1)

2ge

k2

2 (91)

and (88)⟹ (90)

Remark 4 (87) and (89) represent exactly the convergencerate established in [15 22] (90) is a new result not estab-lished in [15 22] this result shows that ck converges to zeroat a rate of O(1k2)

Theorem 5 If 1113936 λn +infin then

(a) Ψ(xn)⟶ inf Ψ(b) xn⟶ xlowast isin argminΨ

Proof Replacing u by xlowast in (81) we have

(1 minus σ)Dhnx

n x

nminus 11113872 1113873leAnminus 1 x

lowast( 1113857 minus An x

lowast( 1113857 (92)

By adding the inequality (92) from 1 to k we have

(1 minus σ) 1113944nk

n1Dhn

xn x

nminus 11113872 1113873leA0 x

lowast( 1113857 minus Ak x

lowast( 1113857 (93)

Ak(xlowast)ge 0 so 1113936infinn1Dhn

(xn xnminus 1)lt +infin From (69) we have

1113944

infin

n1λnεn lt +infin (94)

Since Ψ(xn) n is nonincreasing and (94) by applyingthe )eorem 3 (ii) we have the results of )eorem 5

5 Application to Nonnegative LinearInverse Problem

In Poisson inverse problems (eg [25 26]) we are given anonnegative observation matrix A isin Rmtimesd

+ and a noisymeasurement vector b isin Rm

+ and the goal is to reconstructthe signal x isin Rd

+ such that Ax≃ b We can naturally adoptthe distance D(Ax b) to measure the residuals between twononnegative points with

(1) Input x0 isin Scap ri(domΨ)

(2) For n 1 2 with εn gt 0 we obtain (nablahn(xnminus 1) minus nablahn(xn)λn) isin zεnΨ(xn)

ALGORITHM 1 Inexact Bregman Proximal Gradient (IBPG)

Abstract and Applied Analysis 9

D(Ax b) 1113944m

i1langai xranglog

langai xrangbi

+ bi minus langai xrang (95)

where ai denotes the ith line of AIn this section we propose an approach for solving the

nonnegative linear inverse problem defined by

Pα( 1113857 inf αx1 + D(Ax b) x isin Rd+1113966 1113967 (αgt 0) (96)

We take

f(x) ≔ αx1

g(x) ≔ D(Ax b)

h(x) ≔ h1(x) 1113944d

i1xi logxi

(97)

It is shown in [15] that the couple (g h) verified aLipschitz-likeConvexity Condition (LC) on Rd

+ for any Lsuch that

Lge max1lejled

1113944

m

i1aij max

1lejledAj

1 (98)

where Aj denotes the jth column of AFor λn ≔ λ foralln )eorem 3 is applicable and global

warrant of Algorithm 1 convergence to an optimal solutionof (Pα)

Given xn isin Rd++ the iteration

xn+1≃Tλ x

n( 1113857 (99)

amounts to solving the one-dimensional problemFor j 1 d

xn+1j ≃ argmin αt + cjt +

t logt

xnj

+ xnj minus t⎛⎝ ⎞⎠ tgt 0

⎧⎨

⎫⎬

(100)

where cj is the jth component of nablag(xn)

6 Conclusion

)e proposed algorithms constitute a unified frame for theexisting algorithms BPG BP and PG by giving others inparticular the inexact version of the interior method withBregman distance studied in [24] More precisely

(i) When $\varepsilon_n = 0$ for all $n \in \mathbb{N}^*$, our algorithm is the NoLips algorithm studied in [15, 22].

(ii) When $g = 0$, our algorithm is the inexact version of the Bregman Proximal (BP) algorithm studied in [19].

(iii) When $\varepsilon_n = 0$ for all $n \in \mathbb{N}^*$ and $g = 0$, our algorithm is the Bregman Proximal (BP) algorithm studied in [17, 21].

(iv) When $f = 0$, our algorithm is the inexact version of the interior method with Bregman distance studied in [24].

(v) When $f = 0$ and $\varepsilon_n = 0$ for all $n \in \mathbb{N}^*$, our algorithm is the interior method with Bregman distance studied in [24].

(vi) When $h = (1/2)\|\cdot\|^2$, our algorithm is the proximal gradient method (PG) and its variants [4–14, 27].

Our analysis is different from and simpler than the one given in [15, 22], and it allows us to weaken some hypotheses, in particular the supercoercivity of $\Psi$ as well as the simultaneous convexity of $f$ and $g$.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.

[2] A. Beck and Y. C. Eldar, "Sparsity constrained nonlinear optimization: optimality conditions and algorithms," SIAM Journal on Optimization, vol. 23, no. 3, pp. 1480–1509, 2013.

[3] D. R. Luke, "Phase retrieval, what's new?," SIAG/OPT Views and News, vol. 25, no. 1, pp. 1–5, 2017.

[4] A. Auslender, "Numerical methods for nondifferentiable convex optimization," in Mathematical Programming Studies, J. P. Vial and C. B. Nguyen, Eds., vol. 30, pp. 102–126, Springer, Berlin, Germany, 1987.

[5] A. I. Chen and A. Ozdaglar, "A fast distributed proximal-gradient method," in Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 601–608, IEEE, Monticello, IL, USA, October 2012.

[6] J. B. Hiriart-Urruty, "ε-subdifferential calculus," in Convex Analysis and Optimization, Research Notes in Mathematics Series, vol. 57, Pitman Publishers, London, UK, 1982.

[7] K. Jiang, D. Sun, and K.-C. Toh, "An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP," SIAM Journal on Optimization, vol. 22, no. 3, pp. 1042–1064, 2012.


[8] B. Lemaire, "About the convergence of the proximal method," in Advances in Optimization (Proceedings of the 6th French-German Conference on Optimization, 1991), Lecture Notes, pp. 39–51, Springer, Berlin, Germany, 1992.

[9] B. Lemaire, "The proximal algorithm," International Series of Numerical Mathematics, vol. 87, pp. 73–87, 1989.

[10] B. Martinet, "Perturbation des méthodes d'optimisation. Application," Modélisation Mathématique et Analyse Numérique, vol. 12, no. 2, pp. 153–171, 1976.

[11] N. Parikh, "Proximal algorithms," Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.

[12] M. Schmidt, N. Le Roux, and F. R. Bach, "Convergence rates of inexact proximal-gradient methods for convex optimization," in Advances in Neural Information Processing Systems, vol. 28, pp. 1458–1466, MIT Press, Cambridge, MA, USA, 2011.

[13] N. D. Vanli, M. Gurbuzbalaban, and A. Ozdaglar, "Global convergence rate of proximal incremental aggregated gradient methods," SIAM Journal on Optimization, vol. 28, no. 2, pp. 1282–1300, 2018.

[14] L. Xiao and T. Zhang, "A proximal stochastic gradient method with progressive variance reduction," SIAM Journal on Optimization, vol. 24, no. 4, pp. 2057–2075, 2014.

[15] J. Bolte, S. Sabach, M. Teboulle, and Y. Vaisbourd, "First order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems," SIAM Journal on Optimization, vol. 28, no. 3, pp. 2131–2151, 2018.

[16] L. M. Bregman, "The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming," USSR Computational Mathematics and Mathematical Physics, vol. 7, no. 3, pp. 200–217, 1967.

[17] J. Eckstein, "Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming," Mathematics of Operations Research, vol. 18, no. 1, pp. 202–226, 1993.

[18] S. Kabbadj, "Theoretical aspect of diagonal Bregman proximal methods," Journal of Applied Mathematics, vol. 2020, Article ID 3108056, 9 pages, 2020.

[19] S. Kabbadj, "Méthodes proximales entropiques," Thèse, Université Montpellier II, Montpellier, France, 1994.

[20] R. T. Rockafellar, "Monotone operators and the proximal point algorithm," SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.

[21] G. Chen and M. Teboulle, "Convergence analysis of a proximal-like minimization algorithm using Bregman functions," SIAM Journal on Optimization, vol. 3, no. 3, pp. 538–543, 1993.

[22] D. H. Gutman and J. F. Peña, "A unified framework for Bregman proximal methods: subgradient, gradient, and accelerated gradient schemes," 2018, https://arxiv.org/pdf/1812.10198.pdf.

[23] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, USA, 1970.

[24] A. Auslender and M. Teboulle, "Interior gradient and proximal methods for convex and conic optimization," SIAM Journal on Optimization, vol. 16, no. 3, pp. 697–725, 2006.

[25] M. Bertero, P. Boccacci, G. Desiderà, and G. Vicidomini, "Image deblurring with Poisson data: from cells to galaxies," Inverse Problems, vol. 25, no. 12, 2009.

[26] I. Csiszár, "Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems," The Annals of Statistics, vol. 19, no. 4, pp. 2032–2066, 1991.

[27] A. Nitanda, "Stochastic proximal gradient descent with acceleration techniques," in Advances in Neural Information Processing Systems, pp. 1574–1582, MIT Press, Cambridge, MA, USA, 2014.
