A UNIFIED FRAMEWORK FOR SOME
INEXACT PROXIMAL POINT ALGORITHMS*
M. V. Solodov and B. F. Svaiter
Instituto de Matematica Pura e Aplicada, Estrada Dona Castorina 110, Jardim Botanico, Rio de Janeiro, RJ 22460-320, Brazil. E-mail: [email protected] and [email protected]
ABSTRACT
We present a unified framework for the design and convergence analysis
of a class of algorithms based on approximate solution of proximal
point subproblems. Our development further enhances the constructive
approximation approach of the recently proposed hybrid projection–
proximal and extragradient–proximal methods. Specifically, we introduce
an even more flexible error tolerance criterion, as well as provide a
unified view of these two algorithms. Our general method possesses
global convergence and local (super)linear rate of convergence under
standard assumptions, while using a constructive approximation cri-
terion suitable for a number of specific implementations. For example,
we show that close to a regular solution of a monotone system of
semismooth equations, two Newton iterations are sufficient to solve the
proximal subproblem within the required error tolerance. Such systems
of equations arise naturally when reformulating the nonlinear comple-
mentarity problem.
Key Words: Maximal monotone operator; Proximal point algorithm;
Approximation criteria
NUMER. FUNCT. ANAL. AND OPTIMIZ., 22(7&8), 1013–1035 (2001)
Copyright © 2001 by Marcel Dekker, Inc. www.dekker.com
*Research of the first author is supported by CNPq Grant 300734/95-6, by PRONEX-Optimization, and by FAPERJ; research of the second author is supported by CNPq Grant 301200/93-9 (RN), by PRONEX-Optimization, and by FAPERJ.
AMS Subject Classifications: 90C30; 90C33
1. INTRODUCTION AND MOTIVATION
We consider the classical problem
find $x \in H$ such that $0 \in T(x)$,  (1)

where $H$ is a real Hilbert space and $T$ is a maximal monotone operator (or a multifunction) on $H$, i.e., $T : H \to \mathcal{P}(H)$, where $\mathcal{P}(H)$ stands for the family of subsets of $H$. It is well known that many problems in applied mathematics, economics and engineering can be cast in this general form.
One of the fundamental regularization techniques in this setting is the proximal point method. Using the current approximation to the solution of (1), $x^k \in H$, this method generates the next iterate $x^{k+1}$ by solving the subproblem

$0 \in c_k T(x) + (x - x^k)$,  (2)

where $c_k > 0$ is a regularization parameter. Equivalently, this can be written as

$x^{k+1} = (c_k T + I)^{-1}(x^k)$.  (3)
Theoretically, the proximal point method has quite nice global and local convergence properties [31], see also [20]. Its main practical drawback is that the subproblems are structurally as difficult to solve as the original problem. This makes straightforward application of the method impractical in most situations. Nevertheless, the proximal point methodology is important for the development of a variety of successful numerical techniques, such as operator splitting and bundle methods, to name a few. It is therefore of importance to develop proximal-type algorithms with improved practical characteristics. One possibility in this sense is inexact solution of subproblems. In particular, approximation criteria should be as constructive and realistic as possible. We next discuss previous work in this direction.
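As a concrete (and deliberately simple) illustration of iteration (3), the sketch below applies the exact proximal point step to the affine monotone operator $T(x) = Ax + b$ with $A$ positive semidefinite; the operator, the data, and the function name are our own illustrative choices, not from the paper. For this $T$, each subproblem (2) reduces to a single linear system.

```python
import numpy as np

# Exact proximal point iteration x^{k+1} = (c T + I)^{-1}(x^k) for the
# affine monotone operator T(x) = A x + b (A positive semidefinite).
# The subproblem 0 = c(A x + b) + x - x^k is the linear system
# (c A + I) x = x^k - c b.

def proximal_point(A, b, x0, c=1.0, iters=200):
    n = len(x0)
    x = x0.astype(float)
    for _ in range(iters):
        x = np.linalg.solve(c * A + np.eye(n), x - c * b)
    return x

A = np.array([[2.0, 0.0], [0.0, 1.0]])   # positive definite, so T is monotone
b = np.array([-2.0, 1.0])
x = proximal_point(A, b, np.zeros(2))
# the iterates approach the zero of T, i.e. A x + b = 0
```

For positive definite $A$ the map is a contraction with factor $\max_i 1/(1 + c\lambda_i)$, so the iterates converge linearly, in line with the classical theory cited above.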
The first inexact proximal scheme was introduced in [31], and until very recently criteria of the same (summability) type remained the only choice available (for example, see [8,10,12,42]). Specifically, in [31] Eq. (3) is relaxed, and the approximation rule is the following:

$\|x^{k+1} - (c_k T + I)^{-1}(x^k)\| \le e_k$, $\quad \sum_{k=0}^{\infty} e_k < \infty$.  (4)
This condition, together with $\{c_k\}$ being bounded away from zero, ensures convergence of the iterates to some $\bar{x} \in T^{-1}(0)$, provided the latter set is nonempty [31, Theorem 1]. Convergence here is understood in the weak topology. In general, proximal point iterates may fail to converge strongly [18]. However, strong convergence can be forced by some simple modifications [40], see also [1]. Regarding the rate of convergence, the classical result is the following. If the iterates are bounded, $c_k \ge c > 0$, and $T^{-1}$ is Lipschitz-continuous at zero, then condition
$\|x^{k+1} - (c_k T + I)^{-1}(x^k)\| \le d_k \|x^{k+1} - x^k\|$, $\quad \sum_{k=0}^{\infty} d_k < \infty$  (5)
implies convergence (in the strong sense) at the linear rate [31, Theorem 2]. It is interesting to note that by itself, (5) does not guarantee convergence of the sequence. Only if $T^{-1}$ is globally Lipschitz-continuous (which is a rather strong assumption) is the boundedness assumption on the iterates superfluous [31, Proposition 5].
Regarding the error tolerance rules (4) and (5), it is important to keep in mind that $(I + c_k T)^{-1}(x^k)$ is the point to be approximated, so it is not available. Therefore, the above two criteria usually cannot be directly used in implementations of the algorithm. In [31, Proposition 3] it was established that (4) and (5) are implied, respectively, by the following two conditions:
$\mathrm{dist}(0, c_k T(x^{k+1}) + x^{k+1} - x^k) \le e_k$, $\quad \sum_{k=0}^{\infty} e_k < \infty$,  (6)
and
$\mathrm{dist}(0, c_k T(x^{k+1}) + x^{k+1} - x^k) \le d_k \|x^{k+1} - x^k\|$, $\quad \sum_{k=0}^{\infty} d_k < \infty$.  (7)
These two conditions are usually easier to verify in practice, because they only require evaluation of $T$ at $x^{k+1}$. Note that since $T(x)$ is closed, the inexact proximal point algorithm managed by criteria (6) or (7) can be restated as follows. Having a current iterate $x^k$, find $(x^{k+1}, v^{k+1}) \in H \times H$ such that

$v^{k+1} \in T(x^{k+1})$,
$c_k v^{k+1} + x^{k+1} - x^k = r^k$,  (8)
where $r^k \in H$ is the error associated with the approximation. With this notation, criterion (6) becomes

$\|r^k\| \le e_k$, $\quad \sum_{k=0}^{\infty} e_k < \infty$,  (9)
and criterion (7) becomes
$\|r^k\| \le d_k \|x^{k+1} - x^k\|$, $\quad \sum_{k=0}^{\infty} d_k < \infty$.  (10)
One question which arises immediately if these rules are to be used is how to choose the value of $e_k$ or $d_k$ for each $k$ when solving a specific problem. Obviously, the number of choices at every step is infinite, and it is not quite clear how the choice should be related to the behavior of the method on the given problem. This goes to say that, in this sense, these rules are not of a constructive nature. Developing constructive approximation criteria is one of the subjects of the present paper (following [37,38]). Another important question is how to solve the subproblems efficiently within the approximation rule, once it is chosen. This question can be addressed most effectively under some further assumptions on the structure of the operator $T$; e.g., $T$ is the subdifferential of a convex function, $T$ is smooth, $T$ is a sum of two operators with certain structure, $T$ represents a variational inequality, etc. Developments for some of these cases can be found in [35,36,37,39,44], where some Newton-type and splitting-type methods were considered. We note that the use of the constructive approximation criteria of [37,38] was crucial in those references. The case of a general maximal monotone operator was discussed in [32], where subproblems are solved by bundle techniques, similar to [5]. In this paper we consider one more application, which has to do with solving systems of semismooth (but not, in general, differentiable) equations. Monotone systems of this type arise, for example, in reformulations of complementarity problems.
The next question we shall discuss is whether and how (9), (10) can be relaxed. A natural rule to consider in this setting would be

$\|r^k\| \le \sigma \|x^{k+1} - x^k\|$, $\quad 0 \le \sigma < 1$,  (11)
where, for simplicity, the relaxation parameter $\sigma$ is fixed (the important point is not to have it necessarily fixed, but rather to keep it bounded away from one). The above condition is obviously weaker than (10). It is also less restrictive than (9), because a sequence $\{x^k\}$ generated by the proximal point method based on it may not converge (unlike when (9) is used). We refer the reader to [38], where an example of divergence in the finite-dimensional space $\mathbb{R}^2$ is given. Hence, condition (11) cannot be used without modifications to the algorithm. Fortunately, the modifications that are needed are quite simple and do not increase the computational burden per iteration of the algorithm. This development is another goal of this paper, following [37,38]. We note that conditions in the spirit of (11) are very natural and computationally realistic. We mention classical inexact Newton methods, e.g. [11], as one example where similar approximations are used.
We now turn our attention to the more recent research in the area of inexact proximal point methods. Let us consider the ‘‘proximal system’’

$v \in T(x)$,
$c_k v + x - x^k = 0$,  (12)
which is clearly equivalent to the subproblem (2). This splitting of equation and inclusion seems to be a natural view of the subproblems, the advantages of which will be further evident later. In [38], the following notion of inexact solutions was introduced. A pair $(\tilde{x}^{k+1}, \tilde{v}^{k+1})$ is an approximate solution of (12) if

$\tilde{v}^{k+1} \in T(\tilde{x}^{k+1})$,
$c_k \tilde{v}^{k+1} + \tilde{x}^{k+1} - x^k = r^k$,  (13)

and

$\|r^k\| \le \sigma \max\left\{ c_k \|\tilde{v}^{k+1}\|, \ \|\tilde{x}^{k+1} - x^k\| \right\}$,  (14)
where $\sigma \in [0, 1)$ is fixed. Note that this condition is even more general than (11). The algorithm of [38], called the Hybrid Projection–Proximal Point Method (HPPPM), proceeds as follows. Instead of taking $\tilde{x}^{k+1}$ as the next iterate $x^{k+1}$ (recall that this may not work even in $\mathbb{R}^2$), $x^{k+1}$ is defined as the projection of $x^k$ onto the halfspace

$H_{k+1} := \{ x \in H \mid \langle \tilde{v}^{k+1}, x - \tilde{x}^{k+1} \rangle \le 0 \}$.
Note that since $\tilde{v}^{k+1} \in T(\tilde{x}^{k+1})$, any solution of (1) belongs to $H_{k+1}$, by the monotonicity of $T$. On the other hand, using (14), it can be verified that if $x^k$ is not a solution then

$\langle \tilde{v}^{k+1}, x^k - \tilde{x}^{k+1} \rangle > 0$,

and so $x^k \notin H_{k+1}$. Hence, the projected point $x^{k+1}$ is closer to the solution set $T^{-1}(0)$ than $x^k$, which essentially ensures global convergence of the method. Recall that projection onto a halfspace is explicit (so it does not entail any nontrivial computational cost). Specifically, it is given by

$x^{k+1} := x^k - \dfrac{\langle \tilde{v}^{k+1}, x^k - \tilde{x}^{k+1} \rangle}{\|\tilde{v}^{k+1}\|^2} \, \tilde{v}^{k+1}$.  (15)
It is also worth mentioning that the same rule (14) used for global convergence also gives a local linear rate of convergence under standard assumptions. Observe that if $(\tilde{x}^{k+1}, \tilde{v}^{k+1})$ happens to be the exact solution of (12) (for example, this will necessarily be the case if one takes $\sigma = 0$ in (14)), then $x^{k+1}$ will be equal to $\tilde{x}^{k+1}$, and we retrieve the exact proximal iteration. Of course, the case of interest is when $\sigma \ne 0$. Convergence properties of HPPPM are precisely the same as of the standard proximal point method, see [38]. The advantage of HPPPM is that the approximation rule (14) is constructive and less restrictive than the classical one. This is important not only theoretically, but even more so in applications. For example, tolerance rule (14) proved crucial in the design of truly globally convergent Newton-type algorithms [35,36], which combine the proximal and projection methods of [34,38] with the linearization methodology. An extension of (14) to proximal methods with Bregman-distance regularization can be found in [41].
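For illustration, here is a minimal sketch of the explicit halfspace projection (15); the vectors below are hypothetical data of our own, chosen only so that $x^k$ lies outside the halfspace $H_{k+1}$.

```python
import numpy as np

# Projection of xk onto H = {x : <v_tilde, x - x_tilde> <= 0}, i.e. the
# explicit formula (15). If xk is already in H, it is its own projection.

def hpppm_projection(xk, x_tilde, v_tilde):
    t = np.dot(v_tilde, xk - x_tilde)
    if t <= 0:                          # xk already in the halfspace
        return xk.copy()
    return xk - (t / np.dot(v_tilde, v_tilde)) * v_tilde

xk = np.array([2.0, 0.0])
x_tilde = np.array([1.0, 0.0])
v_tilde = np.array([1.0, 1.0])
x_next = hpppm_projection(xk, x_tilde, v_tilde)
# the projected point lies on the boundary hyperplane of H:
# <v_tilde, x_next - x_tilde> = 0
```

Since the projection is a closed-form vector update, the cost per iteration beyond the subproblem solve is just one inner product and one vector subtraction, as the text notes.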
Let us go back to the proximal system (12). Up to now, we discussed relaxing the equation part of this system. The inclusion $v \in T(x)$ can also be relaxed using some ‘‘outer approximation’’ of $T$. In [3,4] $T^{\varepsilon}$, an enlargement of a maximal monotone operator $T$, was introduced. Specifically, given $\varepsilon \ge 0$, define $T^{\varepsilon} : H \to \mathcal{P}(H)$ as

$T^{\varepsilon}(x) = \{ v \in H \mid \langle w - v, z - x \rangle \ge -\varepsilon \ \text{for all} \ z \in H, \ w \in T(z) \}$.  (16)
Since $T$ is maximal monotone, $T^0(x) = T(x)$ for all $x \in H$. Furthermore, the relation

$T(x) \subseteq T^{\varepsilon}(x)$

holds for any $\varepsilon \ge 0$, $x \in H$. So, $T^{\varepsilon}$ may be seen as a certain outer approximation of $T$. In the particular case $T = \partial f$, where $f$ is a proper closed convex function, it holds that $T^{\varepsilon}(x) \subseteq \partial_{\varepsilon} f(x)$, where $\partial_{\varepsilon} f$ stands for the standard $\varepsilon$-subdifferential as defined in Convex Analysis. Some further properties and applications of $T^{\varepsilon}$ can be found in [4,5,6,7,32,39].
In [37], a variation of the proximal point method was proposed, where both the equation and the inclusion were relaxed in system (12). In addition, the projection step of [38] was replaced by a simpler extragradient-like step. This algorithm is called the Hybrid Extragradient–Proximal Point Method (HEPPM). Specifically, HEPPM works as follows. A pair $(\tilde{x}^{k+1}, \tilde{v}^{k+1})$ is considered an acceptable approximate solution of the proximal system (12) if, for some $\varepsilon_{k+1} \ge 0$,

$\tilde{v}^{k+1} \in T^{\varepsilon_{k+1}}(\tilde{x}^{k+1})$,
$c_k \tilde{v}^{k+1} + \tilde{x}^{k+1} - x^k = r^k$,  (17)

and

$\|r^k\|^2 + 2 c_k \varepsilon_{k+1} \le \sigma^2 \|\tilde{x}^{k+1} - x^k\|^2$,  (18)
where $\sigma \in [0, 1)$ is fixed. Having in hand those objects, the next iterate is

$x^{k+1} := x^k - c_k \tilde{v}^{k+1}$.  (19)

Convergence properties of this method are again the same as of the original proximal point algorithm. We note that for $\varepsilon_{k+1} = 0$, condition (18) for HEPPM is somewhat stronger than condition (14) for HPPPM, although it should not be much more difficult to satisfy in practice. It has to be stressed, however, that the introduction of $\varepsilon_{k+1} \ne 0$ is not merely formal or artificial here. It is important for several reasons; see [37,39] for a more detailed discussion. Here, we shall mention that $T^{\varepsilon}$ arises naturally in the context of variational inequalities where subproblems are solved using the regularized gap function [15]. Another situation where $T^{\varepsilon}$ is useful is bundle techniques for general maximal monotone inclusions, see [5,32] (unlike $T$, $T^{\varepsilon}$ has certain continuity properties [6], much like the $\varepsilon$-subdifferential in nonsmooth convex optimization).
The main goal of the present paper is to unify the methods of [38] and [37] within a more general framework which, in addition, utilizes an even better approximation criterion for the solution of subproblems than the ones studied previously. A new application to solving monotone systems of semismooth equations is also discussed.
2. THE GENERAL FRAMEWORK

Given any $x \in H$ and $c > 0$, consider the proximal point problem of solving the system

$v \in T(y)$,
$c v + y - x = 0$.  (20)
We start directly with our new notion of approximate solutions of (20).
Definition 1. We say that a triple $(y, v, \varepsilon) \in H \times H \times \mathbb{R}_+$ is an approximate solution of the proximal system (20) with error tolerance $\sigma \in [0, 1)$, if

$v \in T^{\varepsilon}(y)$,
$\|c v + y - x\|^2 + 2 c \varepsilon \le \sigma^2 \left( \|c v\|^2 + \|y - x\|^2 \right)$.  (21)
First note that condition (21) is more general than either (14) or (18). This is easy to see because the right-hand side in (21) is larger. In addition, (14) works only with the exact values of the operator (i.e., with $\varepsilon = 0$).

Observe that if $(y, v)$ is the exact solution of (20) then, taking $\varepsilon = 0$, we conclude that $(y, v, \varepsilon)$ satisfies the approximation criterion (21) for any $\sigma \in [0, 1)$. Conversely, if $\sigma = 0$, then only the exact solution of (20) (with $\varepsilon = 0$) will satisfy this approximation criterion. So this definition is quite natural. For $\sigma \in (0, 1)$, system (20) has at least one, and typically many, approximate solutions in the sense of Definition 1.
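The inequality in (21) is directly checkable once a candidate triple is at hand; only the inclusion $v \in T^{\varepsilon}(y)$ requires problem-specific certification. The following sketch transcribes the inequality; the operator and the data are our own toy choices, not from the paper.

```python
import numpy as np

# Numerical test of inequality (21) from Definition 1. Certifying the
# inclusion v in T^eps(y) is problem-dependent and assumed done elsewhere.

def satisfies_criterion_21(x, y, v, eps, c, sigma):
    lhs = np.linalg.norm(c * v + y - x) ** 2 + 2.0 * c * eps
    rhs = sigma ** 2 * (np.linalg.norm(c * v) ** 2 + np.linalg.norm(y - x) ** 2)
    return lhs <= rhs

# For T(y) = y, the exact solution of (20) is y = x / (1 + c) with v = y
# and eps = 0; it passes for every sigma in [0, 1), including sigma = 0.
x = np.array([2.0])
c = 1.0
y = x / (1.0 + c)
v = y.copy()
assert satisfies_criterion_21(x, y, v, 0.0, c, sigma=0.0)
```

An inexact candidate, by contrast, passes only when $\sigma$ is large enough relative to its residual, which is exactly how the tolerance $\sigma$ governs the amount of admissible inexactness.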
We next establish some basic properties of the approximate solutions defined above.
Lemma 2. Take $x \in H$, $c > 0$. The triple $(y, v, \varepsilon) \in H \times H \times \mathbb{R}_+$ being an approximate solution of the proximal system (20) with error tolerance $\sigma \in [0, 1)$ is equivalent to the following relations:

$v \in T^{\varepsilon}(y)$, $\quad \langle v, x - y \rangle - \varepsilon \ge \dfrac{1 - \sigma^2}{2c} \left( \|c v\|^2 + \|y - x\|^2 \right)$.  (22)
In addition, it holds that

$\dfrac{1 - \theta}{1 - \sigma^2} \, c \|v\| \le \|y - x\| \le \dfrac{1 + \theta}{1 - \sigma^2} \, c \|v\|$,  (23)

where

$\theta := \sqrt{1 - (1 - \sigma^2)^2}$.

Furthermore, the three conditions

1. $0 \in T(x)$,
2. $v = 0$,
3. $y = x$

are equivalent and imply $\varepsilon = 0$.
Proof. Re-writing (21), we have

$\sigma^2 \left( \|c v\|^2 + \|y - x\|^2 \right) \ge \|c v + y - x\|^2 + 2 c \varepsilon = 2 c \varepsilon + \|c v\|^2 + \|y - x\|^2 + 2 c \langle v, y - x \rangle$,

which is further equivalent to (22), by a simple rearrangement of terms.

Furthermore, by the Cauchy-Schwarz inequality and the nonnegativity of $\varepsilon$, using (22), we have that

$\|c v\| \, \|y - x\| \ge c \langle v, x - y \rangle \ge c \langle v, x - y \rangle - c \varepsilon \ge \dfrac{1 - \sigma^2}{2} \left( \|c v\|^2 + \|y - x\|^2 \right)$.

Denoting $t := \|y - x\|$ and resolving the quadratic inequality in $t$

$t^2 - \dfrac{2 \|c v\|}{1 - \sigma^2} \, t + \|c v\|^2 \le 0$

proves (23).

Suppose now that $0 \in T(x)$. Since $v \in T^{\varepsilon}(y)$, we have that

$-\varepsilon \le \langle v - 0, y - x \rangle = \langle v, y - x \rangle$,

and it follows that the right-hand side of (22) is zero. Hence, $v = 0$ and $x = y$. If we assume that $v = 0$, then (22) implies that $x = y$, and vice versa. And obviously, these conditions imply that $0 \in T(x)$. It is also clear from (22) that in those cases $\varepsilon = 0$. □
The following lemma is the key to developing convergent iterative schemes based on approximate solutions of the proximal subproblems defined above.
Lemma 3. Take any $x \in H$. Suppose that

$\langle v, x - y \rangle - \varepsilon > 0$,

where $\varepsilon \ge 0$ and $v \in T^{\varepsilon}(y)$. Then for any $x^* \in T^{-1}(0)$ and any $\rho \ge 0$, it holds that

$\|x^* - x^+\|^2 \le \|x^* - x\|^2 - \left( 1 - (1 - \rho)^2 \right) \|a v\|^2$,

where

$x^+ := x - \rho \, a v$, $\quad a := \dfrac{\langle v, x - y \rangle - \varepsilon}{\|v\|^2}$.
Proof. Define the closed halfspace $\bar{H}$ as

$\bar{H} := \{ z \in H \mid \langle v, z - y \rangle - \varepsilon \le 0 \}$.

By the assumption, $x \notin \bar{H}$. Let $\bar{x}$ be the projection of $x$ onto $\bar{H}$. It is well known, and easy to check, that

$\bar{x} = x - a v$,

where the quantity $a$ is defined above. For any $z \in \bar{H}$, it holds that $\langle z - \bar{x}, v \rangle \le 0$, and so

$\langle z - \bar{x}, x^+ - x \rangle \ge 0$.

Using the definition of $x^+$, we obtain

$\|z - x\|^2 = \|(z - x^+) + (x^+ - x)\|^2$
$= \|z - x^+\|^2 + \|x^+ - x\|^2 + 2 \langle z - x^+, x^+ - x \rangle$
$= \|z - x^+\|^2 + \|x^+ - x\|^2 + 2 \langle \bar{x} - x^+, x^+ - x \rangle + 2 \langle z - \bar{x}, x^+ - x \rangle$
$\ge \|z - x^+\|^2 + \|x^+ - x\|^2 + 2 \langle \bar{x} - x^+, x^+ - x \rangle$
$= \|z - x^+\|^2 + \left( 1 - (1 - \rho)^2 \right) a^2 \|v\|^2$.

Suppose $x^* \in T^{-1}(0)$. By the definition of $T^{\varepsilon}$,

$\langle v - 0, y - x^* \rangle \ge -\varepsilon$.

Therefore, $\langle v, x^* - y \rangle - \varepsilon \le 0$, which means that $x^* \in \bar{H}$. Setting $z = x^*$ in the chain of inequalities above completes the proof. □
Lemma 3 shows that if we take $\rho \in (0, 2)$ then the point $x^+$ is closer to the solution set $T^{-1}(0)$ than the point $x$. Using further Lemma 2, this can serve as a basis for constructing a convergent iterative algorithm. Allowing the parameter $\rho$ to vary within the interval $(0, 2)$ will make it possible to unify the algorithms of [38] and [37].
Algorithm 2.1. Choose any $x^0 \in H$, $0 \le \bar{\sigma} < 1$, $\bar{c} > 0$, and $\tau \in (0, 1)$.

1. Choose $c_k \ge \bar{c}$ and find $(v^k, y^k, \varepsilon_k) \in H \times H \times \mathbb{R}_+$, an approximate solution with error tolerance $\sigma_k \le \bar{\sigma}$ of the proximal system (12), i.e.,

$v^k \in T^{\varepsilon_k}(y^k)$,
$\|c_k v^k + y^k - x^k\|^2 + 2 c_k \varepsilon_k \le \sigma_k^2 \left( \|c_k v^k\|^2 + \|y^k - x^k\|^2 \right)$.

2. If $y^k = x^k$, STOP. Otherwise,
3. Choose $\rho_k \in [1 - \tau, 1 + \tau]$, and set

$a_k := \dfrac{\langle v^k, x^k - y^k \rangle - \varepsilon_k}{\|v^k\|^2}$,
$x^{k+1} := x^k - \rho_k a_k v^k$,

and go to Step 1.
Note that the algorithm stops when $y^k = x^k$ which, by Lemma 2, means that $x^k \in T^{-1}(0)$. In our convergence analysis we shall assume that this does not occur, and so an infinite sequence of iterates is generated.
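To fix ideas, the following sketch instantiates Algorithm 2.1 for the affine monotone operator $T(x) = Ax + b$, with each proximal system (20) solved exactly (so $\varepsilon_k = 0$ and the criterion holds for any $\sigma_k$); the operator, data, and parameter values are our own illustrative choices. When the subproblem is solved exactly, $c_k v^k = x^k - y^k$, hence $a_k = c_k$, and with $\rho_k = 1$ the update reduces to the exact proximal iteration $x^{k+1} = y^k$.

```python
import numpy as np

# Algorithm 2.1 with exact subproblem solutions for T(x) = A x + b,
# A positive semidefinite. Step 1 solves 0 = c(Ay + b) + y - x for y;
# Step 3 applies the projection-type update with relaxation rho.

def algorithm_2_1(A, b, x0, c=1.0, rho=1.0, iters=300, tol=1e-10):
    n = len(x0)
    x = x0.astype(float)
    for _ in range(iters):
        y = np.linalg.solve(c * A + np.eye(n), x - c * b)  # Step 1 (exact)
        v = A @ y + b                                       # v in T(y), eps = 0
        if np.linalg.norm(y - x) <= tol:                    # Step 2: 0 in T(x)
            break
        a = np.dot(v, x - y) / np.dot(v, v)                 # Step 3: a_k
        x = x - rho * a * v
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = algorithm_2_1(A, b, np.zeros(2))
# x_star approximately solves A x + b = 0
```

Replacing the exact solve in Step 1 by any routine that certifies criterion (21) yields an inexact variant, which is the point of the framework.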
We next show that Algorithm 2.1 contains the methods proposed in [38] and [37] as its special cases. First, as was already remarked, any approximate solution of the proximal point subproblem satisfying (14) or (18) (used in [38] and [37], respectively) certainly satisfies the approximation criterion (21) employed in Algorithm 2.1. It remains to show that the ways $x^{k+1}$ is obtained in (15) and (19) (used in [38] and [37], respectively) are also two special cases of the update rule in Algorithm 2.1. With respect to [38] this is completely obvious, as (15) corresponds to taking $\rho_k = 1$ in Algorithm 2.1 (with $\varepsilon_k = 0$, as in the setting of [38]). For the method of [37], on the other hand, this issue is not immediately clear. Essentially, one has to show that under the approximation criterion (18), there exists $\rho_k \in [1 - \tau, 1 + \tau]$ such that $\rho_k a_k = c_k$, or equivalently, there exists $\tau \in (0, 1)$ such that $(1 - \tau) a_k \le c_k \le (1 + \tau) a_k$. The following proposition establishes this fact.
Proposition 4. Take $x, y, v \in H$, $\varepsilon \ge 0$, $c > 0$ and $\sigma \in [0, 1)$ such that $0 \notin T(x)$ and

$v \in T^{\varepsilon}(y)$,
$\|c v + y - x\|^2 + 2 c \varepsilon \le \sigma^2 \|y - x\|^2$.

Then

$(1 - \sigma) a \le c \le (1 + \sigma) a$,

where

$a = \dfrac{\langle v, x - y \rangle - \varepsilon}{\|v\|^2}$.
Proof. First note that by the hypotheses of the proposition, Lemma 2 is applicable, and we have that $v \ne 0$ and $a > 0$.

Furthermore, by our assumptions it holds that $\|c v + y - x\| \le \sigma \|y - x\|$, from which, by the triangle inequality, it follows that

$(1 - \sigma) \|y - x\| \le c \|v\| \le (1 + \sigma) \|y - x\|$.  (24)

By the nonnegativity of $\varepsilon$ and the Cauchy-Schwarz inequality, we have that

$a \le \langle v, x - y \rangle / \|v\|^2 \le \|y - x\| / \|v\|$.

Combining the latter relation with (24), we conclude that

$(1 - \sigma) a \le (1 - \sigma) \|y - x\| / \|v\| \le c$,

which proves one part of the assertion.

To prove the other part, note that under our assumptions

$a = \dfrac{\|c v\|^2 + \|y - x\|^2 - \left( \|c v + y - x\|^2 + 2 c \varepsilon \right)}{2 c \|v\|^2} \ge \dfrac{\|c v\|^2 + (1 - \sigma^2) \|y - x\|^2}{2 c \|v\|^2} = \dfrac{c}{2} \left( 1 + (1 - \sigma^2) \dfrac{\|x - y\|^2}{\|c v\|^2} \right)$.

Using further (24), we obtain

$a \ge \dfrac{c}{2} \left( 1 + \dfrac{1 - \sigma^2}{(1 + \sigma)^2} \right) = \dfrac{c}{1 + \sigma}$,

which concludes the proof. □
Proposition 4 implies that if we choose $\tau \ge \bar{\sigma}$ in Algorithm 2.1, then for each $k$ there exists $\rho_k \in [1 - \tau, 1 + \tau]$ such that $\rho_k a_k = c_k$. Hence, HEPPM falls within the presented framework of Algorithm 2.1.
3. CONVERGENCE ANALYSIS

As already remarked, if Algorithm 2.1 terminates finitely, it does so at a solution of the problem. From now on, we assume that infinite sequences $\{x^k\}$, $\{v^k\}$, $\{y^k\}$ and $\{\varepsilon_k\}$ are generated. Using Lemma 2, we conclude that for all $k$, $v^k \ne 0$, $y^k \ne x^k$ and

$\langle v^k, x^k - y^k \rangle - \varepsilon_k \ge \dfrac{1 - \sigma_k^2}{2 c_k} \left( \|c_k v^k\|^2 + \|y^k - x^k\|^2 \right) > 0$.  (25)
Therefore, by the definition of $a_k$,

$a_k \ge \dfrac{1 - \sigma_k^2}{2} \cdot \dfrac{\|c_k v^k\|^2 + \|y^k - x^k\|^2}{c_k \|v^k\|^2}$.  (26)

Furthermore, using the fact that $\|c_k v^k\|^2 + \|y^k - x^k\|^2 \ge 2 c_k \|v^k\| \, \|y^k - x^k\|$, we obtain that

$a_k \|v^k\| \ge (1 - \sigma_k^2) \|y^k - x^k\|$.  (27)
Proposition 5. If the solution set of problem (1) is nonempty, then

1. The sequence $\{x^k\}$ is bounded;
2. $\sum_{k=0}^{\infty} \|a_k v^k\|^2 < \infty$;
3. $\lim_{k \to \infty} \|y^k - x^k\| = \lim_{k \to \infty} \|v^k\| = \lim_{k \to \infty} \varepsilon_k = 0$.
Proof. Applying (25) and Lemma 3, we have that for any $x^* \in T^{-1}(0)$ it holds that

$\|x^* - x^{k+1}\|^2 \le \|x^* - x^k\|^2 - \left( 1 - (1 - \rho_k)^2 \right) a_k^2 \|v^k\|^2 \le \|x^* - x^k\|^2 - (1 - \tau^2) \|a_k v^k\|^2$.

It immediately follows that the sequence $\{x^k\}$ is bounded, and that the second item of the proposition holds. Therefore, we also have that $0 = \lim_{k \to \infty} a_k \|v^k\|$. From (26) it easily follows that

$a_k \|v^k\| \ge \dfrac{1 - \sigma_k^2}{2} \, \|c_k v^k\|$.

Using this relation and (27), we immediately conclude that $0 = \lim_{k \to \infty} \|v^k\| = \lim_{k \to \infty} \|y^k - x^k\|$. Since $\varepsilon_k \ge 0$, relation (25) also implies that $0 = \lim_{k \to \infty} \varepsilon_k$. □
We are now in a position to prove convergence of our algorithm.

Theorem 6. If the solution set of problem (1) is nonempty, then the sequence $\{x^k\}$ converges weakly to a solution.
Proof. By Proposition 5, the sequence $\{x^k\}$ is bounded, and so it has at least one weak accumulation point, say $\bar{x}$. Let $\{x^{k_j}\}$ be some subsequence weakly converging to $\bar{x}$. Since $\|y^k - x^k\| \to 0$ (by Proposition 5), it follows that the subsequence $\{y^{k_j}\}$ also converges weakly to $\bar{x}$. Moreover, we know that $v^{k_j} \to 0$ (strongly) and $\varepsilon_{k_j} \to 0$. Take any $x \in H$ and $u \in T(x)$. Because $v^k \in T^{\varepsilon_k}(y^k)$, for any index $j$ it holds that

$\langle u - v^{k_j}, x - y^{k_j} \rangle \ge -\varepsilon_{k_j}$.

Therefore

$\langle u - 0, x - y^{k_j} \rangle \ge \langle v^{k_j}, x - y^{k_j} \rangle - \varepsilon_{k_j}$.

Since $\{y^{k_j}\}$ converges weakly to $\bar{x}$ (in particular, $\{y^{k_j}\}$ is bounded), $\{v^{k_j}\}$ converges strongly to zero, and $\{\varepsilon_{k_j}\}$ converges to zero, taking the limit as $j \to \infty$ in the above inequality we obtain that

$\langle u - 0, x - \bar{x} \rangle \ge 0$.

But $(x, u)$ was taken as an arbitrary element in the graph of $T$. From the maximality of $T$, it follows that $0 \in T(\bar{x})$, i.e., $\bar{x}$ is a solution of (1). We have thus proved that every weak accumulation point of $\{x^k\}$ is a solution.

The proof of uniqueness of the weak accumulation point in this setting is standard. For example, uniqueness follows easily by applying Opial's Lemma [23] (observe that $\{\|x^k - \bar{x}\|\}$ is nonincreasing), or by using the analysis of [31]. □
When $T^{-1}(0) = \emptyset$, i.e., no solution exists, the sequence $\{x^k\}$ can be shown to be unbounded. Because one has to work with an enlargement of the operator, the proof of this fact is somewhat different from the classical one (e.g., [31]). We refer the reader to [37], where the required analysis is similar.
We next turn our attention to the study of the convergence rate of Algorithm 2.1. This will be done under the standard assumption that $T^{-1}$ is Lipschitz-continuous at zero, i.e., there exist the unique $x^* \in T^{-1}(0)$ and some $\delta, L > 0$ such that

$v \in T(x), \ \|v\| \le \delta \ \Longrightarrow \ \|x - x^*\| \le L \|v\|$.  (28)
We shall assume, for the sake of simplicity, that $\rho_k = 1$ for all $k$, i.e., there is no relaxation in the ‘‘projection’’ step of Algorithm 2.1.
The following error bound, established in [39], will be the key to the analysis.

Theorem 7. [39, Corollary 2.1] Let $(\bar{y}, \bar{v})$ be the exact solution of the proximal system (20). Then for any $y \in H$, $v \in T^{\varepsilon}(y)$, it holds that

$\|y - \bar{y}\|^2 + c^2 \|v - \bar{v}\|^2 \le \|c v + y - x\|^2 + 2 c \varepsilon$.
A lower bound for $a_k$ will also be needed. Using (26), we have

$a_k \ge \dfrac{1 - \sigma_k^2}{2} \left( 1 + \left( \dfrac{\|y^k - x^k\|}{\|c_k v^k\|} \right)^2 \right) c_k \ge \dfrac{1 - \bar{\sigma}^2}{2} \left( 1 + \left( \dfrac{\|y^k - x^k\|}{\|c_k v^k\|} \right)^2 \right) \bar{c}$.

Using also (23), after some algebraic manipulations, we obtain

$a_k \ge \left[ \dfrac{1 + \sqrt{1 - (1 - \bar{\sigma}^2)^2}}{1 - \bar{\sigma}^2} \right]^{-1} \bar{c}$.  (29)
We are now ready to establish the linear rate of convergence of our algorithm.

Theorem 8. Suppose that $T^{-1}$ is Lipschitz-continuous at zero, and $\rho_k = 1$ for all $k$. Then for $k$ sufficiently large,

$\|x^* - x^{k+1}\| \le \dfrac{\mu}{\sqrt{1 + \mu^2}} \, \|x^* - x^k\|$,

where

$\mu := \sqrt{\gamma^2 + 1} \, \sqrt{\beta^2 - 1} + \gamma \beta$,

with

$\gamma := (L / \bar{c}) \, \dfrac{1 + \sqrt{1 - (1 - \bar{\sigma}^2)^2}}{1 - \bar{\sigma}^2}$, $\quad \beta := 1 / (1 - \bar{\sigma}^2)$.
Proof. Define, for each $k$, the pair $(\hat{x}^k, \hat{v}^k)$, $\hat{v}^k \in T(\hat{x}^k)$, as the exact solution of the proximal problem $0 \in a_k T(x) + x - x^k$, where $a_k$ was defined in Algorithm 2.1 (recall that $a_k > 0$). Since $v^k \in T^{\varepsilon_k}(y^k)$, by Theorem 7 it follows that

$\|\hat{x}^k - y^k\|^2 + a_k^2 \|\hat{v}^k - v^k\|^2 \le \|a_k v^k + y^k - x^k\|^2 + 2 a_k \varepsilon_k$
$= \|a_k v^k + y^k - x^k\|^2 + 2 a_k \left( \langle v^k, x^k - y^k \rangle - a_k \|v^k\|^2 \right)$
$= \|y^k - x^k\|^2 - \|a_k v^k\|^2$.

Using also (27), we get

$\|\hat{x}^k - y^k\|^2 + a_k^2 \|\hat{v}^k - v^k\|^2 \le \left( (1 - \sigma_k^2)^{-2} - 1 \right) \|a_k v^k\|^2$.  (30)
Therefore,

$\|\hat{v}^k - v^k\|^2 \le \left( (1 - \bar{\sigma}^2)^{-2} - 1 \right) \|v^k\|^2$.

Now, since $v^k$ converges to zero (strongly), we conclude that $\hat{v}^k$ also converges to zero. Hence, for $k$ large enough, it holds that $\|\hat{v}^k\| \le \delta$, and, by (28),

$\|x^* - \hat{x}^k\| \le L \|\hat{v}^k\| = (L / a_k) \|\hat{x}^k - x^k\|$.
In the sequel, we assume that $k$ is large enough, so that the above bound holds. By the triangle inequality, we further obtain

$\|x^* - x^{k+1}\| \le \|x^* - \hat{x}^k\| + \|\hat{x}^k - x^{k+1}\|$
$\le (L / a_k) \|\hat{x}^k - x^k\| + \|\hat{x}^k - x^{k+1}\|$
$\le (L / a_k) \|\hat{x}^k - y^k\| + (L / a_k) \|y^k - x^k\| + \|\hat{x}^k - x^{k+1}\|$.
Note that because $\hat{x}^k = x^k - a_k \hat{v}^k$ and $x^{k+1} = x^k - a_k v^k$, we have that $a_k \|\hat{v}^k - v^k\| = \|\hat{x}^k - x^{k+1}\|$. Using this relation and (30), we obtain that

$\|\hat{x}^k - y^k\|^2 + \|\hat{x}^k - x^{k+1}\|^2 \le \left( (1 - \sigma_k^2)^{-2} - 1 \right) \|a_k v^k\|^2$.
Using the Cauchy-Schwarz inequality, it now follows that

$(L / a_k) \|\hat{x}^k - y^k\| + \|\hat{x}^k - x^{k+1}\| \le \sqrt{(L / a_k)^2 + 1} \, \sqrt{\|\hat{x}^k - y^k\|^2 + \|\hat{x}^k - x^{k+1}\|^2}$
$\le \sqrt{(L / a_k)^2 + 1} \, \sqrt{(1 - \sigma_k^2)^{-2} - 1} \, \|a_k v^k\|$.
Hence,

$\|x^* - x^{k+1}\| \le \sqrt{(L / a_k)^2 + 1} \, \sqrt{(1 - \sigma_k^2)^{-2} - 1} \, \|a_k v^k\| + (L / a_k) \|y^k - x^k\|$
$\le \left[ \sqrt{(L / a_k)^2 + 1} \, \sqrt{(1 - \sigma_k^2)^{-2} - 1} + (L / a_k) (1 - \sigma_k^2)^{-1} \right] \|a_k v^k\|$,

where (27) was used again. Recalling the definition of $\mu$, and using (29), we get

$\|x^* - x^{k+1}\| \le \mu \, \|a_k v^k\|$.
Using also Lemma 3, we now have that

$\|x^* - x^k\|^2 \ge \|x^* - x^{k+1}\|^2 + \|a_k v^k\|^2 \ge \|x^* - x^{k+1}\|^2 + (1 / \mu)^2 \|x^* - x^{k+1}\|^2$,

and it follows that

$\|x^* - x^{k+1}\| \le \dfrac{\mu}{\sqrt{1 + \mu^2}} \, \|x^* - x^k\|$,

which establishes the claim. □
Remark 9. It is easy to see that we could repeat the analysis of Theorem 8 with $\gamma_k$, $\beta_k$ and $\mu_k$ defined for each $k$ as functions of $c_k$ and $\sigma_k$ (instead of their bounds $\bar{c}$ and $\bar{\sigma}$). The analysis above then shows that if $c_k \to \infty$ and $\sigma_k \to 0$, then $\gamma_k \to 0$, $\beta_k \to 1$, and hence $\mu_k \to 0$. We conclude that, under the additional assumptions that $c_k \to \infty$ and $\sigma_k \to 0$, in the setting of this section $\{x^k\}$ converges to $x^*$ superlinearly.
In the more general case where $\rho_k \in [1 - \tau, 1 + \tau]$, under the same condition (28), the following estimate is valid:

$\|x^* - x^{k+1}\| \le \dfrac{\mu + \tau}{\sqrt{(1 - \tau^2) + (\mu + \tau)^2}} \, \|x^* - x^k\|$.
4. SYSTEMS OF SEMISMOOTH EQUATIONS
AND REFORMULATIONS OF MONOTONE
COMPLEMENTARITY PROBLEMS
In this section, we consider the specific situation where

$H = \mathbb{R}^n$, $\quad T = F$, $\quad F : \mathbb{R}^n \to \mathbb{R}^n$,

and $F$ is a locally Lipschitz-continuous, semismooth, monotone function. As will be discussed below, one interesting example of problems having this structure is equation-based reformulations [16,27] of monotone nonlinear complementarity problems [13,25]. A globally convergent Newton method for solving systems of monotone equations was proposed in [36]. The attractive feature of this algorithm is that, given an arbitrary starting point, it is guaranteed to generate a sequence of iterates converging to a solution of the problem without any regularity-type assumptions. This is a very desirable property, not shared by standard globalization strategies based on merit functions. We refer the reader to [35,36] for a detailed discussion of this issue. We next state a related method based on the more general framework of Algorithm 2.1.
Algorithm 4.1. Choose any $x^0 \in \mathbb{R}^n$, $0 \le \bar{\sigma} < 1$, $\bar{c} > 0$, and $\lambda, \tau \in (0, 1)$.

1. (Newton Step)

Choose $c_k \ge \bar{c}$, $\sigma_k \le \bar{\sigma}$, and a positive semidefinite matrix $H_k$. Compute $d^k$, the solution of the linear system

$F(x^k) + (H_k + c_k^{-1} I) d = 0$.

Stop if $d^k = 0$. If

$\|c_k F(x^k + d^k) + d^k\|^2 \le \sigma_k^2 \left( \|c_k F(x^k + d^k)\|^2 + \|d^k\|^2 \right)$,

then set $y^k := x^k + d^k$, and go to the Projection Step. Otherwise,

2. (Linesearch Step)

Find $m_k$, the smallest nonnegative integer $m$, such that

$-c_k \langle F(x^k + \lambda^m d^k), d^k \rangle \ge \sigma_k^2 \|d^k\|^2$.

Set $y^k := x^k + \lambda^{m_k} d^k$.

3. (Projection Step)

Choose $\rho_k \in [1 - \tau, 1 + \tau]$, and set

$x^{k+1} := x^k + \rho_k \, \dfrac{\langle F(y^k), d^k \rangle}{\|F(y^k)\|^2} \, F(y^k)$.

Set $k := k + 1$, and go to the Newton Step.
The key idea in Algorithm 4.1 is to try to solve the proximal point subproblem by just one Newton-type iteration. When this is not possible, the Newton point is refined by means of a linesearch. The following properties of Algorithm 4.1 can be established, essentially using the line of analysis developed in [36].
Theorem 10. Suppose that $F$ is continuous and monotone, and let $\{x^k\}$ be any sequence generated by Algorithm 4.1. Then $\{x^k\}$ is bounded. Furthermore, if $\|H_k\| \le C_1$ and $c_k \le C_2 \|F(x^k)\|^{-1}$ for some $C_1, C_2 > 0$, then $\{x^k\}$ converges to some $\bar{x}$ such that $F(\bar{x}) = 0$.

In addition, if $F(\cdot)$ is differentiable in a neighborhood of $\bar{x}$ with $F'(\cdot)$ Lipschitz-continuous, $F'(\bar{x})$ is positive definite, $H_k = F'(x^k)$ for all $k \ge k_0$, and $0 = \lim_{k \to \infty} c_k^{-1} = \lim_{k \to \infty} c_k \|F(x^k)\|$, then the convergence rate is superlinear.

Consider the classical nonlinear complementarity problem (NCP) [25,?], which is to find an $x \in \mathbb{R}^n$ such that

$g(x) \ge 0$, $\quad x \ge 0$, $\quad \langle g(x), x \rangle = 0$,  (31)
where $g : \mathbb{R}^n \to \mathbb{R}^n$ is differentiable. One of the most useful approaches to the numerical and theoretical treatment of the NCP consists in reformulating it as a system of (nonsmooth) equations [27,?]. One popular choice is given by the so-called natural residual [?]

    F_i(x) = \min\{x_i, \alpha g_i(x)\}, \quad i = 1, \ldots, n,   (32)

where $\alpha > 0$ is a parameter. It is easy to check that for this mapping the solution set of the system of equations $F(x) = 0$ coincides with the solution set of the NCP (31). Furthermore, it is known [45] (see also [17,33]) that $F$ given by (32) is monotone for all sufficiently small $\alpha$ if $g$ is co-coercive (for the definition of co-coerciveness and its role for variational inequalities, see [2,43,46]). Since $F$ is obviously continuous, it immediately follows that for a co-coercive NCP, Algorithm 4.1 applied to the natural residual $F$ generates a sequence of iterates globally converging to some solution $x^*$ of the NCP, provided the parameters are chosen as specified. However, it is clear that this $F$ is in general not differentiable. In particular, it is not differentiable at a solution $x^*$ of the NCP unless the strict complementarity condition $x_i^* + g_i(x^*) > 0$, $i = 1, \ldots, n$, holds. This condition, however, is known to be very restrictive. Hence, the superlinear rate of convergence stated in Theorem 10 does not apply in this context.
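As a concrete illustration (a minimal sketch, not code from the paper: the function names, the parameter name `alpha`, and the affine test problem are invented for this example), the natural residual (32) can be computed componentwise:

```python
def natural_residual(x, g, alpha=1.0):
    """Natural residual of (32): F_i(x) = min{x_i, alpha * g_i(x)}."""
    gx = g(x)
    return [min(xi, alpha * gi) for xi, gi in zip(x, gx)]

# Illustrative NCP with affine monotone g(x) = M x + q:
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-3.0, -3.0]
g = lambda x: [M[i][0] * x[0] + M[i][1] * x[1] + q[i] for i in range(2)]

x_star = [1.0, 1.0]      # solves the NCP: x* >= 0, g(x*) = (0, 0)
assert natural_residual(x_star, g) == [0.0, 0.0]
assert natural_residual([0.0, 0.0], g) == [-3.0, -3.0]   # min attained at g
```

A point solves the NCP (31) exactly when this residual vanishes, which is the equivalence noted above.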
On the other hand, by Remark 9, Algorithm 2.1 does converge locally superlinearly under appropriate assumptions. Of course, it is important to study the computational cost involved in finding acceptable approximate solutions of the proximal point subproblems in Algorithm 2.1. We now turn our attention to this issue. Specifically, we shall show that when $x^k$ is
PROXIMAL POINT ALGORITHMS 1029
close to a solution $x^*$ with certain properties, two steps of the generalized Newton method applied to solve the $k$-th proximal subproblem

    0 = F_k(x) := F(x) + c_k^{-1}(x - x^k)   (33)

are sufficient to guarantee our error tolerance criterion. As a consequence, the local superlinear rate of convergence of Algorithm 2.1 is ensured by solving at most two systems of linear equations at each iteration, and by taking $c_k \to +\infty$, $\sigma_k \to 0$ in an appropriate way.
Let $\partial F(x)$ denote Clarke's generalized Jacobian [9] of $F$ at $x \in \mathbb{R}^n$. Specifically, $\partial F(x)$ is the convex hull of the B-subdifferential [30] of $F$ at $x$, which is the set

    \partial_B F(x) = \{ H \in \mathbb{R}^{n \times n} \mid \exists \{x^k\} \subset D_F : x^k \to x \text{ and } F'(x^k) \to H \},

with $D_F$ being the set of points at which $F$ is differentiable (by Rademacher's Theorem, a locally Lipschitz-continuous function is differentiable almost everywhere). Recall that $F$ is semismooth [22,28,29] at $x \in \mathbb{R}^n$ if it is directionally differentiable at $x$, and

    Hd - F'(x; d) = o(\|d\|), \quad \forall\, d \to 0, \ \forall\, H \in \partial F(x + d),

where $F'(x; d)$ stands for the usual directional derivative (this is one of a number of equivalent ways to define semismoothness). Similarly, $F$ is said to be strongly semismooth at $x \in \mathbb{R}^n$ if

    Hd - F'(x; d) = O(\|d\|^2), \quad \forall\, d \to 0, \ \forall\, H \in \partial F(x + d).
When we say that $F$ is (strongly) semismooth, we mean that this property holds at all points $x \in \mathbb{R}^n$. By the calculus of semismooth mappings [14], if $g$ is (strongly) semismooth, then so is $F$ given by (32). In particular, if $g$ is continuously differentiable, then $F$ is semismooth; and $F$ is strongly semismooth if the derivative of $g$ is Lipschitz-continuous (see also [21]). Furthermore, at any point $x \in \mathbb{R}^n$, elements of $\partial F(x)$ and $\partial_B F(x)$ are easily computable by explicit formulas; see, e.g., [19]. It is known [19,21] that if $x^*$ is a b-regular [?] solution of the NCP (b-regularity is one of the weakest regularity conditions for the NCP), then all elements of $\partial_B F(x^*)$ are nonsingular (the so-called BD-regularity of $F$ at $x^*$).

Consider $\{x^k\} \to x^*$, and assume that $x^*$ is a BD-regular solution of $F(x) = 0$. The latter implies that $x^*$ is an isolated solution, and the following error bound holds [27, Proposition 3]: there exist $M_1 > 0$ and $\delta > 0$ such that

    M_1 \|x - x^*\| \le \|F(x)\| \quad \forall\, x \in \mathbb{R}^n \text{ such that } \|x - x^*\| \le \delta.   (34)
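Following the remark above that elements of $\partial_B F(x)$ admit explicit formulas (cf. [19]), here is a hedged sketch for the natural residual; the helper names (`b_jacobian_nr`, `jac_g`, `alpha`) and the affine test data are invented for illustration:

```python
def b_jacobian_nr(x, g, jac_g, alpha=1.0):
    """One element H of the B-subdifferential of the natural residual
    F_i(x) = min{x_i, alpha * g_i(x)}: row i of H is e_i^T where the
    min is attained at x_i, and alpha * (row i of g'(x)) otherwise.
    At a tie either limiting row is admissible; we pick e_i^T."""
    n = len(x)
    gx, J = g(x), jac_g(x)
    H = []
    for i in range(n):
        if x[i] <= alpha * gx[i]:
            H.append([1.0 if j == i else 0.0 for j in range(n)])  # row e_i^T
        else:
            H.append([alpha * J[i][j] for j in range(n)])         # row of alpha*g'(x)
    return H

# Affine illustration: g(x) = M x + q, so g'(x) = M everywhere.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-3.0, -3.0]
g = lambda x: [M[i][0] * x[0] + M[i][1] * x[1] + q[i] for i in range(2)]
jac_g = lambda x: M

assert b_jacobian_nr([1.0, 1.0], g, jac_g) == M                      # min at g(x)
assert b_jacobian_nr([-1.0, 5.0], g, jac_g) == [[1.0, 0.0], [0.0, 1.0]]
```

At the solution $(1,1)$ of this small example the element returned is $M$ itself, which is nonsingular, consistent with BD-regularity.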
Denote by $z^k$ the exact solution of (33); then $\{z^k\} \to x^*$. By the well-known properties of the (exact) proximal point method [31],

    \|z^k - x^*\|^2 \le \|x^k - x^*\|^2 - \|z^k - x^k\|^2,
from which, using $x^k - z^k = c_k F(z^k)$ and (34), we obtain

    \|z^k - x^*\| = O(c_k^{-1} \|x^k - x^*\|).   (35)
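To make the step to (35) explicit (a short expansion of the argument, using only relations already stated): since $z^k$ solves (33), $F(z^k) = c_k^{-1}(x^k - z^k)$, and applying the error bound (34) at $x = z^k$ gives

```latex
M_1 \|z^k - x^*\|
  \;\le\; \|F(z^k)\|
  \;=\; c_k^{-1} \|x^k - z^k\|
  \;\le\; c_k^{-1}\bigl( \|x^k - x^*\| + \|z^k - x^*\| \bigr)
  \;\le\; 2\, c_k^{-1} \|x^k - x^*\|,
```

where the last inequality uses $\|z^k - x^*\| \le \|x^k - x^*\|$ from the proximal inequality above; dividing by $M_1$ yields $\|z^k - x^*\| \le (2/M_1)\, c_k^{-1} \|x^k - x^*\|$, i.e., (35).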
Suppose $F$ is strongly semismooth. Then $F_k$ is clearly strongly semismooth, and $z^k$ is a BD-regular solution of (33) (elements of $\partial_B F_k(z^k)$ are nonsingular once $x^k$ is sufficiently close to $x^*$, so that $z^k$ is also close to $x^*$).

Let $\hat y^k \in \mathbb{R}^n$ be the point obtained after one iteration of the generalized Newton method applied to (33):

    \hat y^k = x^k - P_k^{-1} F_k(x^k), \qquad P_k \in \partial_B F_k(x^k).
Since $P_k = H_k + c_k^{-1} I$, $H_k \in \partial_B F(x^k)$, we have that

    \|\hat y^k - z^k\| = \|x^k - z^k - (H_k + c_k^{-1} I)^{-1} F(x^k)\|
        \le \|x^k - H_k^{-1} F(x^k) - x^*\| + \|z^k - x^*\| + \|(H_k^{-1} - (H_k + c_k^{-1} I)^{-1}) F(x^k)\|
        = O(\|x^k - x^*\|^2) + O(c_k^{-1} \|x^k - x^*\|) + O(c_k^{-1} \|F(x^k)\|)
        = O(c_k^{-1} \|F(x^k)\|),   (36)
where the second equality follows from the quadratic convergence [?,29] of the generalized Newton method applied to $F(x) = 0$, from (35), and from the classical linear-algebra fact about the inverse of a perturbed nonsingular matrix; and the last equality follows from (34). By the Lipschitz-continuity of $F$, the triangle inequality, and (35), we also have

    \|F(x^k)\| \le M_2 \|x^k - x^*\|
        \le M_2 (\|z^k - x^k\| + \|z^k - x^*\|)
        \le M_2 \|z^k - x^k\| + O(c_k^{-1} \|x^k - x^*\|).
Using (34) in the last relation, it follows that $\|F(x^k)\| = O(\|z^k - x^k\|)$. From (36), we then obtain that

    \|\hat y^k - z^k\| = O(c_k^{-1} \|x^k - z^k\|).
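The linear-algebra fact invoked for the third term in (36) can be spelled out: for nonsingular $H_k$ and $\varepsilon := c_k^{-1} > 0$, the identity $A^{-1} - B^{-1} = A^{-1}(B - A)B^{-1}$ with $A = H_k$, $B = H_k + \varepsilon I$ gives

```latex
H_k^{-1} - (H_k + \varepsilon I)^{-1}
  = \varepsilon\, H_k^{-1} (H_k + \varepsilon I)^{-1},
\qquad\text{hence}\qquad
\bigl\| \bigl( H_k^{-1} - (H_k + \varepsilon I)^{-1} \bigr) F(x^k) \bigr\|
  \le \varepsilon\, \|H_k^{-1}\|\, \|(H_k + \varepsilon I)^{-1}\|\, \|F(x^k)\|
  = O\bigl( c_k^{-1} \|F(x^k)\| \bigr),
```

since, by BD-regularity, $\|H_k^{-1}\|$ and $\|(H_k + \varepsilon I)^{-1}\|$ remain uniformly bounded for $x^k$ near $x^*$.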
Denoting by $y^k$ the next Newton step for (33),

    y^k = \hat y^k - \hat P_k^{-1} F_k(\hat y^k), \qquad \hat P_k \in \partial_B F_k(\hat y^k),
we have

    \|y^k - z^k\| = O(c_k^{-1} \|\hat y^k - z^k\|),

and, using (36), we conclude that

    \|y^k - z^k\| = O(c_k^{-2} \|F(x^k)\|).   (37)
The left-hand side of the tolerance criterion (21) for the proximal subproblem (33) can be bounded as follows:

    \|c_k F(y^k) + y^k - x^k\| \le c_k \|F(y^k) - F(z^k)\| + \|y^k - z^k\|
        \le 2 M_2 c_k \|y^k - z^k\|
        = O(c_k^{-1} \|F(x^k)\|),   (38)
where the first inequality follows from the definition of $z^k$ and the triangle inequality; the second inequality follows from the Lipschitz-continuity of $F$; and the last equality follows from (37). On the other hand, we have that
    \|y^k - x^k\| \ge \|x^k - x^*\| - \|y^k - z^k\| - \|z^k - x^*\|
        \ge \|x^k - x^*\| - O(c_k^{-2} \|F(x^k)\|) - O(c_k^{-1} \|x^k - x^*\|)
        \ge M_3 \|F(x^k)\|,   (39)
where the second inequality follows from (37) and (35); and the last inequality follows, with some $M_3 > 0$, from the Lipschitz-continuity of $F$. Using (39) and (38), we conclude that the error tolerance condition (21) is guaranteed to be satisfied by $y^k$, the second Newton iterate for solving (33), if we choose the parameters $\sigma_k$ and $c_k$ so that

    c_k^{-1} = o(\sigma_k).

With this choice, in the framework of Algorithm 2.1, $y^k$ is an acceptable approximate solution of the proximal point subproblem. By Remark 9, the convergence rate of $\{x^k\}$ generated by Algorithm 2.1 is then superlinear.
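The two-Newton-step argument can be mimicked numerically. The sketch below is illustrative only: the affine test problem, the helper names, and the choice of parameter value $1$ in the natural residual are assumptions of this example, and the exact form of criterion (21) is not restated here; we merely report the two quantities estimated in (38) and (39). It applies two B-subdifferential Newton steps to the subproblem (33):

```python
# Two generalized Newton steps on the proximal subproblem (33) for the
# affine example g(x) = M x + q, with F(x) = min(x, g(x)) componentwise.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-3.0, -3.0]

def g(x):
    return [M[i][0] * x[0] + M[i][1] * x[1] + q[i] for i in range(2)]

def F(x):
    return [min(xi, gi) for xi, gi in zip(x, g(x))]

def db_F(x):
    # One element of the B-subdifferential of the natural residual.
    gx = g(x)
    return [[1.0, 0.0] if x[0] <= gx[0] else M[0][:],
            [0.0, 1.0] if x[1] <= gx[1] else M[1][:]]

def solve2(A, b):                      # Cramer's rule for a 2x2 system
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / d,
            (A[0][0] * b[1] - b[0] * A[1][0]) / d]

def two_newton_steps(xk, ck):
    y = xk[:]
    for _ in range(2):
        Fk = [F(y)[i] + (y[i] - xk[i]) / ck for i in range(2)]
        H = db_F(y)
        Pk = [[H[i][j] + (1.0 / ck if i == j else 0.0) for j in range(2)]
              for i in range(2)]       # Pk = Hk + (1/ck) I
        s = solve2(Pk, Fk)
        y = [y[i] - s[i] for i in range(2)]
    return y

xk, ck = [1.05, 0.95], 1.0e3           # start near the solution (1, 1)
y = two_newton_steps(xk, ck)
lhs = max(abs(ck * F(y)[i] + y[i] - xk[i]) for i in range(2))  # as in (38)
assert lhs <= 1e-6 * max(abs(y[i] - xk[i]) for i in range(2))
```

As the analysis predicts, the quantity on the left of (38) is several orders of magnitude smaller than $\|y^k - x^k\|$ after the second step, so any fixed relative tolerance of the kind discussed above is met.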
In fact, comparing (36) and (38), we can see that for the first Newton point the left- and right-hand sides of the tolerance criterion are already of the same order. Since these are merely rough upper and lower estimates, as a practical matter one might expect the error tolerance rule to be satisfied already after the first Newton step.
REFERENCES
1. H.H. Bauschke and P.L. Combettes. A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces. Mathematics of Operations Research, 26, (2001), 248–264.
2. R.E. Bruck. An iterative solution of a variational inequality for certain monotone operators in a Hilbert space. Bulletin of the American Mathematical Society, 81, (1975), 890–892.
3. R.S. Burachik, A.N. Iusem, and B.F. Svaiter. Enlargement of monotone operators with applications to variational inequalities. Tech. Rep. 02/96 (1996), Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, Brazil.
4. R.S. Burachik, A.N. Iusem, and B.F. Svaiter. Enlargement of monotone opera-
tors with applications to variational inequalities. Set-Valued Analysis, 5, (1997),
159–180.
5. R.S. Burachik, C.A. Sagastizábal, and B.F. Svaiter. Bundle methods for

maximal monotone operators. In R. Tichatschke and M. Théra, editors,
Ill-posed variational problems and regularization techniques, Lecture Notes in
Economics and Mathematical Systems, No. 477, pages 49–64. Springer–
Verlag, 1999.
6. R.S. Burachik, C.A. Sagastizábal, and B.F. Svaiter. ε-Enlargements of maximal
monotone operators: Theory and applications. In M. Fukushima and L. Qi,
editors, Reformulation – Nonsmooth, Piecewise Smooth, Semismooth and
Smoothing Methods, pages 25–44. Kluwer Academic Publishers, 1999.
7. R.S. Burachik and B.F. Svaiter. ε-Enlargements of maximal monotone opera-
tors in Banach spaces. Set-Valued Analysis, 7, (1999), 117–132.
8. J.V. Burke and M. Qian. A variable metric proximal point algorithm for
monotone operators. SIAM Journal on Control and Optimization, 37,
(1998), 353–375.
9. F.H. Clarke. Optimization and Nonsmooth Analysis. John Wiley & Sons,
New York, 1983.
10. R. Cominetti. Coupling the proximal point algorithm with approximation
methods. Journal of Optimization Theory and Applications, 95, (1997),
581–600.
11. R.S. Dembo, S.C. Eisenstat, and T. Steihaug. Inexact Newton methods. SIAM
Journal on Numerical Analysis, 19, (1982), 400–408.
12. J. Eckstein. Approximate iterations in Bregman-function-based proximal algo-
rithms. Mathematical Programming, 83, (1998), 113–123.
13. M.C. Ferris and C. Kanzow. Complementarity and related problems. In P.M.
Pardalos and M.G.C. Resende, editors, Handbook of Applied Optimization.
Oxford University Press, 2000.
14. A. Fischer. Solution of monotone complementarity problems with locally
Lipschitzian functions. Mathematical Programming, 76, (1997), 513–532.
15. M. Fukushima. Equivalent differentiable optimization problems and descent
methods for asymmetric variational inequality problems. Mathematical
Programming, 53, (1992), 99–110.
16. M. Fukushima and L. Qi (editors). Reformulation – Nonsmooth, Piecewise
Smooth, Semismooth and Smoothing Methods. Kluwer Academic Publishers,
1999.
17. D. Gabay. Applications of the method of multipliers to variational inequalities.
In M. Fortin and R. Glowinski, editors, Augmented Lagrangian Methods:
Applications to the Solution of Boundary-Value Problems, pages 299–331.
North-Holland, 1983.
18. O. Güler. On the convergence of the proximal point algorithm for convex
minimization. SIAM Journal on Control and Optimization, 29, (1991),
403–419.
19. C. Kanzow and M. Fukushima. Solving box constrained variational inequality
problems by using the natural residual with D-gap function globalization.
Operations Research Letters, 23, (1998), 45–51.
20. B. Lemaire. The proximal algorithm. In J.-P. Penot, editor, New Methods of
Optimization and Their Industrial Use. International Series of Numerical
Mathematics 87, pages 73–87. Birkhäuser, Basel, 1989.
21. T. De Luca, F. Facchinei, and C. Kanzow. A theoretical and numerical com-
parison of some semismooth algorithms for complementarity problems.
Computational Optimization and Applications, 16, (2000), 173–205.
22. R. Mifflin. Semismooth and semiconvex functions in constrained optimization.
SIAM Journal on Control and Optimization, 15, (1977), 957–972.
23. Z. Opial. Weak convergence of the sequence of successive approximations for
nonexpansive mappings. Bulletin of the American Mathematical Society, 73,
(1967), 591–597.
24. J.-S. Pang. Inexact Newton methods for the nonlinear complementarity pro-
blem. Mathematical Programming, 36, (1986), 54–71.
25. J.-S. Pang. Complementarity problems. In R. Horst and P. Pardalos, editors,
Handbook of Global Optimization, pages 271–338. Kluwer Academic
Publishers, Boston, Massachusetts, 1995.
26. J.-S. Pang and S.A. Gabriel. NE/SQP: A robust algorithm for the nonlinear
complementarity problem. Mathematical Programming, 60, (1993), 295–337.
27. J.-S. Pang and L. Qi. Nonsmooth equations: Motivation and algorithms. SIAM
Journal on Optimization, 3, (1993), 443–465.
28. L. Qi. Convergence analysis of some algorithms for solving nonsmooth equa-
tions. Mathematics of Operations Research, 18, (1993), 227–244.
29. L. Qi and J. Sun. A nonsmooth version of Newton’s method. Mathematical
Programming, 58, (1993), 353–367.
30. S.M. Robinson. Local structure of feasible sets in nonlinear programming, Part
III: Stability and sensitivity. Mathematical Programming Study, 30, (1987),
45–66.
31. R.T. Rockafellar. Monotone operators and the proximal point algorithm.
SIAM Journal on Control and Optimization, 14, (1976), 877–898.
32. C.A. Sagastizábal and M.V. Solodov. On the relation between bundle
methods for maximal monotone inclusions and hybrid proximal point algo-
rithms. In D. Butnariu, Y. Censor, and S. Reich, editors, Inherently Parallel
Algorithms in Feasibility and Optimization and their Applications, volume 8 of
Studies in Computational Mathematics, pages 441–455. Elsevier Science B.V.,
2001.
33. M. Sibony. Méthodes itératives pour les équations et inéquations aux dérivées

partielles non linéaires de type monotone. Calcolo, 7, (1970), 65–183.
34. M.V. Solodov and B.F. Svaiter. A new projection method for variational
inequality problems. SIAM Journal on Control and Optimization, 37, (1999),
765–776.
35. M.V. Solodov and B.F. Svaiter. A truly globally convergent Newton-type
method for the monotone nonlinear complementarity problem. SIAM Journal
on Optimization, 10, (2000), 605–625.
36. M.V. Solodov and B.F. Svaiter. A globally convergent inexact Newton method
for systems of monotone equations. In M. Fukushima and L. Qi, editors,
Reformulation – Nonsmooth, Piecewise Smooth, Semismooth and Smoothing
Methods, pages 355–369. Kluwer Academic Publishers, 1999.
37. M.V. Solodov and B.F. Svaiter. A hybrid approximate extragradient–proximal point algorithm using the enlargement of a maximal monotone operator. Set-Valued Analysis, 7, (1999), 323–345.
38. M.V. Solodov and B.F. Svaiter. A hybrid projection–proximal point algorithm. Journal of Convex Analysis, 6, (1999), 59–70.
39. M.V. Solodov and B.F. Svaiter. Error bounds for proximal point subproblems and associated inexact proximal point algorithms. Mathematical Programming, 88, (2000), 371–389.
40. M.V. Solodov and B.F. Svaiter. Forcing strong convergence of proximal point iterations in a Hilbert space. Mathematical Programming, 87, (2000), 189–202.
41. M.V. Solodov and B.F. Svaiter. An inexact hybrid generalized proximal point algorithm and some new results on the theory of Bregman functions. Mathematics of Operations Research, 25, (2000), 214–230.
42. P. Tossings. The perturbed proximal point algorithm and some of its applications. Applied Mathematics and Optimization, 29, (1994), 125–159.
43. P. Tseng. Further applications of a splitting algorithm to decomposition in variational inequalities and convex programming. Mathematical Programming, 48, (1990), 249–263.
44. P. Tseng. A modified forward-backward splitting method for maximal monotone mappings. SIAM Journal on Control and Optimization, 38, (2000), 431–446.
45. Y.-B. Zhao and D. Li. Monotonicity of fixed point and normal mappings associated with variational inequality and its application. SIAM Journal on Optimization, 11, (2001), 962–973.
46. D.L. Zhu and P. Marcotte. Co-coercivity and its role in the convergence of iterative schemes for solving variational inequalities. SIAM Journal on Optimization, 6, (1996), 714–726.