
A UNIFIED FRAMEWORK FOR SOME INEXACT PROXIMAL POINT ALGORITHMS*

M. V. Solodov and B. F. Svaiter

Instituto de Matemática Pura e Aplicada, Estrada Dona Castorina 110, Jardim Botânico, Rio de Janeiro, RJ 22460-320, Brazil
E-mail: [email protected] and [email protected]

ABSTRACT

We present a unified framework for the design and convergence analysis of a class of algorithms based on approximate solution of proximal point subproblems. Our development further enhances the constructive approximation approach of the recently proposed hybrid projection–proximal and extragradient–proximal methods. Specifically, we introduce an even more flexible error tolerance criterion, as well as provide a unified view of these two algorithms. Our general method possesses global convergence and a local (super)linear rate of convergence under standard assumptions, while using a constructive approximation criterion suitable for a number of specific implementations. For example, we show that close to a regular solution of a monotone system of semismooth equations, two Newton iterations are sufficient to solve the proximal subproblem within the required error tolerance. Such systems of equations arise naturally when reformulating the nonlinear complementarity problem.

Key Words: Maximal monotone operator; Proximal point algorithm; Approximation criteria

NUMER. FUNCT. ANAL. AND OPTIMIZ., 22(7&8), 1013–1035 (2001)
Copyright © 2001 by Marcel Dekker, Inc. www.dekker.com

*Research of the first author is supported by CNPq Grant 300734/95-6, by PRONEX-Optimization, and by FAPERJ; research of the second author is supported by CNPq Grant 301200/93-9(RN), by PRONEX-Optimization, and by FAPERJ.


AMS Subject Classifications: 90C30; 90C33

1. INTRODUCTION AND MOTIVATION

We consider the classical problem

find $x \in H$ such that $0 \in T(x)$,   (1)

where $H$ is a real Hilbert space, and $T$ is a maximal monotone operator (or a multifunction) on $H$, i.e., $T : H \to \mathcal{P}(H)$, where $\mathcal{P}(H)$ stands for the family of subsets of $H$. It is well known that many problems in applied mathematics, economics and engineering can be cast in this general form.

One of the fundamental regularization techniques in this setting is the proximal point method. Using the current approximation to the solution of (1), $x^k \in H$, this method generates the next iterate $x^{k+1}$ by solving the subproblem

$0 \in c_k T(x) + (x - x^k)$,   (2)

where $c_k > 0$ is a regularization parameter. Equivalently, this can be written as

$x^{k+1} = (c_k T + I)^{-1}(x^k)$.   (3)
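To make the exact iteration (3) concrete, the following minimal numerical sketch (ours, not part of the paper) computes one proximal step for the affine monotone operator $T(x) = Ax + b$ with $A$ positive semidefinite; in that case (2)-(3) reduce to a single linear solve.

```python
import numpy as np

def exact_proximal_step(A, b, x_k, c_k):
    """One exact proximal step x^{k+1} = (c_k T + I)^{-1}(x^k) for the affine
    monotone operator T(x) = A x + b with A positive semidefinite.
    Condition (2) reads 0 = c_k (A x + b) + x - x^k, i.e. (c_k A + I) x = x^k - c_k b."""
    n = len(x_k)
    return np.linalg.solve(c_k * A + np.eye(n), x_k - c_k * b)

# Example: iterating the step drives x^k towards the zero of T, here x* = -A^{-1} b.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
x = np.zeros(2)
for _ in range(30):
    x = exact_proximal_step(A, b, x, c_k=1.0)
# x is now close to [-0.5, 1.0]
```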

Theoretically, the proximal point method has quite nice global and local convergence properties [31], see also [20]. Its main practical drawback is that the subproblems are structurally as difficult to solve as the original problem. This makes straightforward application of the method impractical in most situations. Nevertheless, the proximal point methodology is important for the development of a variety of successful numerical techniques, such as operator splitting and bundle methods, to name a few. It is therefore of importance to develop proximal-type algorithms with improved practical characteristics. One possibility in this sense is inexact solution of subproblems. In particular, approximation criteria should be as constructive and realistic as possible. We next discuss previous work in this direction.

The first inexact proximal scheme was introduced in [31], and until very recently criteria of the same (summability) type remained the only choice available (for example, see [8,10,12,42]). Specifically, in [31] Eq. (3) is relaxed, and the approximation rule is the following:

$\|x^{k+1} - (c_k T + I)^{-1}(x^k)\| \le e_k$,  $\sum_{k=0}^{\infty} e_k < \infty$.   (4)

This condition, together with $\{c_k\}$ being bounded away from zero, ensures convergence of the iterates to some $\bar{x} \in T^{-1}(0)$, provided the latter set is nonempty [31, Theorem 1]. Convergence here is understood in the weak topology. In general, proximal point iterates may fail to converge strongly [18]. However, strong convergence can be forced by some simple modifications [40], see also [1]. Regarding the rate of convergence, the classical result is the following. If the iterates are bounded, $c_k \ge c > 0$, and $T^{-1}$ is Lipschitz-continuous at zero, then the condition

$\|x^{k+1} - (c_k T + I)^{-1}(x^k)\| \le \delta_k \|x^{k+1} - x^k\|$,  $\sum_{k=0}^{\infty} \delta_k < \infty$   (5)

implies convergence (in the strong sense) at a linear rate [31, Theorem 2]. It is interesting to note that, by itself, (5) does not guarantee convergence of the sequence. Only if $T^{-1}$ is globally Lipschitz-continuous (which is a rather strong assumption) is the boundedness assumption on the iterates superfluous [31, Proposition 5].

Regarding the error tolerance rules (4) and (5), it is important to keep in mind that $(I + c_k T)^{-1}(x^k)$ is the point to be approximated, so it is not available. Therefore, the above two criteria usually cannot be directly used in implementations of the algorithm. In [31, Proposition 3] it was established that (4), (5) are implied, respectively, by the following two conditions:

$\mathrm{dist}(0, c_k T(x^{k+1}) + x^{k+1} - x^k) \le e_k$,  $\sum_{k=0}^{\infty} e_k < \infty$,   (6)

and

$\mathrm{dist}(0, c_k T(x^{k+1}) + x^{k+1} - x^k) \le \delta_k \|x^{k+1} - x^k\|$,  $\sum_{k=0}^{\infty} \delta_k < \infty$.   (7)

These two conditions are usually easier to verify in practice, because they only require evaluation of $T$ at $x^{k+1}$. Note that since $T(x)$ is closed, the inexact proximal point algorithm managed by criteria (6) or (7) can be restated as follows. Having a current iterate $x^k$, find $(x^{k+1}, v^{k+1}) \in H \times H$ such that

$v^{k+1} \in T(x^{k+1})$,
$c_k v^{k+1} + x^{k+1} - x^k = r^k$,   (8)

where $r^k \in H$ is the error associated with the approximation. With this notation, criterion (6) becomes

$\|r^k\| \le e_k$,  $\sum_{k=0}^{\infty} e_k < \infty$,   (9)

and criterion (7) becomes

$\|r^k\| \le \delta_k \|x^{k+1} - x^k\|$,  $\sum_{k=0}^{\infty} \delta_k < \infty$.   (10)
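As an illustration only (not from the paper), the residual form (8) and the tests (9)/(10) can be coded directly once some element $v^{k+1} \in T(x^{k+1})$ is available; that element and the tolerance sequences are assumptions supplied by the caller.

```python
import numpy as np

def residual_and_tests(c_k, x_k, x_next, v_next, e_k, delta_k):
    """Form r^k of (8) for a user-supplied v_next in T(x^{k+1}) and evaluate the
    classical tolerance tests (9) and (10) for this single iteration.
    Summability of {e_k} and {delta_k} must be enforced by the caller."""
    r_k = c_k * v_next + x_next - x_k           # residual in (8)
    test_9 = np.linalg.norm(r_k) <= e_k         # absolute, summable tolerance (9)
    test_10 = np.linalg.norm(r_k) <= delta_k * np.linalg.norm(x_next - x_k)  # relative test (10)
    return r_k, test_9, test_10
```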


One question which arises immediately if these rules are to be used is how to choose the value of $e_k$ or $\delta_k$ for each $k$ when solving a specific problem. Obviously, the number of choices at every step is infinite, and it is not quite clear how the choice should be related to the behavior of the method on the given problem. This goes to say that, in this sense, these rules are not of a constructive nature. Developing constructive approximation criteria is one of the subjects of the present paper (following [37,38]). Another important question is how to solve the subproblems efficiently within the approximation rule, once it is chosen. This question can be addressed most effectively under some further assumptions on the structure of the operator $T$, e.g., $T$ is the subdifferential of a convex function, $T$ is smooth, $T$ is a sum of two operators with certain structure, $T$ represents a variational inequality, etc. Developments for some of these cases can be found in [35,36,37,39,44], where some Newton-type and splitting-type methods were considered. We note that the use of the constructive approximation criteria of [37,38] was crucial in those references. The case of a general maximal monotone operator was discussed in [32], where subproblems are solved by bundle techniques, similar to [5]. In this paper we consider one more application, which has to do with solving systems of semismooth (but not, in general, differentiable) equations. Monotone systems of this type arise, for example, in reformulations of complementarity problems.

The next question we shall discuss is whether and how (9), (10) can be relaxed. A natural rule to consider in this setting would be

$\|r^k\| \le \sigma \|x^{k+1} - x^k\|$,  $0 \le \sigma < 1$,   (11)

where, for simplicity, the relaxation parameter $\sigma$ is fixed (the important point is not to have it necessarily fixed, but rather to keep it bounded away from zero). The above condition is obviously weaker than (10). It is also less restrictive than (9), because a sequence $\{x^k\}$ generated by the proximal point method based on it may not converge (unlike when (9) is used). We refer the reader to [38], where an example of divergence in the finite-dimensional space $R^2$ is given. Hence, condition (11) cannot be used without modifications to the algorithm. Fortunately, the modifications that are needed are quite simple and do not increase the computational burden per iteration of the algorithm. This development is another goal of this paper, following [37,38]. We note that conditions in the spirit of (11) are very natural and computationally realistic. We mention classical inexact Newton methods, e.g. [11], as one example where similar approximations are used.

We now turn our attention to the more recent research in the area of inexact proximal point methods. Let us consider the "proximal system"

$v \in T(x)$,
$c_k v + x - x^k = 0$,   (12)


which is clearly equivalent to the subproblem (2). This splitting of the equation and the inclusion seems to be a natural view of the subproblems, the advantages of which will become evident later. In [38], the following notion of inexact solutions was introduced. A pair $(\tilde{x}^{k+1}, \tilde{v}^{k+1})$ is an approximate solution of (12) if

$\tilde{v}^{k+1} \in T(\tilde{x}^{k+1})$,
$c_k \tilde{v}^{k+1} + \tilde{x}^{k+1} - x^k = r^k$,   (13)

and

$\|r^k\| \le \sigma \max\{ c_k \|\tilde{v}^{k+1}\|, \|\tilde{x}^{k+1} - x^k\| \}$,   (14)

where $\sigma \in [0, 1)$ is fixed. Note that this condition is even more general than (11). The algorithm of [38], called the Hybrid Projection–Proximal Point Method (HPPPM), proceeds as follows. Instead of taking $\tilde{x}^{k+1}$ as the next iterate $x^{k+1}$ (recall that this may not work even in $R^2$), $x^{k+1}$ is defined as the projection of $x^k$ onto the halfspace

$H_{k+1} := \{ x \in H \mid \langle \tilde{v}^{k+1}, x - \tilde{x}^{k+1} \rangle \le 0 \}$.

Note that since $\tilde{v}^{k+1} \in T(\tilde{x}^{k+1})$, any solution of (1) belongs to $H_{k+1}$, by the monotonicity of $T$. On the other hand, using (14), it can be verified that if $x^k$ is not a solution then

$\langle \tilde{v}^{k+1}, x^k - \tilde{x}^{k+1} \rangle > 0$,

and so $x^k \notin H_{k+1}$. Hence, the projected point $x^{k+1}$ is closer to the solution set $T^{-1}(0)$ than $x^k$, which essentially ensures global convergence of the method. Recall that projection onto a halfspace is explicit (so it does not entail any nontrivial computational cost). Specifically, it is given by

$x^{k+1} := x^k - \dfrac{\langle \tilde{v}^{k+1}, x^k - \tilde{x}^{k+1} \rangle}{\|\tilde{v}^{k+1}\|^2} \, \tilde{v}^{k+1}$.   (15)

It is also worth mentioning that the same rule (14) used for global convergence also gives a local linear rate of convergence under standard assumptions. Observe that if $(\tilde{x}^{k+1}, \tilde{v}^{k+1})$ happens to be the exact solution of (12) (for example, this will necessarily be the case if one takes $\sigma = 0$ in (14)), then $x^{k+1}$ will be equal to $\tilde{x}^{k+1}$, and we retrieve the exact proximal iteration. Of course, the case of interest is when $\sigma \ne 0$. Convergence properties of HPPPM are precisely the same as those of the standard proximal point method, see [38]. The advantage of HPPPM is that the approximation rule (14) is constructive and less restrictive than the classical one. This is important not only theoretically, but even more so in applications. For example, the tolerance rule (14) proved crucial in the design of truly globally convergent Newton-type algorithms [35,36], which combine the proximal and projection methods of [34,38] with the linearization methodology. An extension of (14) to proximal methods with Bregman-distance regularization can be found in [41].
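For concreteness, here is a minimal sketch (ours, not taken from [38]) of the projection step (15), assuming the caller has already produced a pair $(\tilde{x}^{k+1}, \tilde{v}^{k+1})$ satisfying (13)-(14).

```python
import numpy as np

def hpppm_projection_step(x_k, x_tilde, v_tilde):
    """Projection of x^k onto the halfspace H_{k+1} = {x : <v~, x - x~> <= 0},
    i.e. formula (15). v_tilde is assumed to be a nonzero element of T(x_tilde)."""
    gap = np.dot(v_tilde, x_k - x_tilde)
    if gap <= 0.0:
        # x^k already lies in H_{k+1}; by the discussion above this can only
        # happen when x^k itself solves (1), so no move is needed.
        return x_k.copy()
    return x_k - (gap / np.dot(v_tilde, v_tilde)) * v_tilde
```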

Let us go back to the proximal system (12). Up to now, we discussed relaxing the equation part of this system. The inclusion $v \in T(x)$ can also be relaxed using some "outer approximation" of $T$. In [3,4], $T^{\varepsilon}$, an enlargement of a maximal monotone operator $T$, was introduced. Specifically, given $\varepsilon \ge 0$, define $T^{\varepsilon} : H \to \mathcal{P}(H)$ as

$T^{\varepsilon}(x) = \{ v \in H \mid \langle w - v, z - x \rangle \ge -\varepsilon \ \text{for all} \ z \in H, \ w \in T(z) \}$.   (16)

Since $T$ is maximal monotone, $T^{0}(x) = T(x)$ for all $x \in H$. Furthermore, the relation

$T(x) \subseteq T^{\varepsilon}(x)$

holds for any $\varepsilon \ge 0$, $x \in H$. So, $T^{\varepsilon}$ may be seen as a certain outer approximation of $T$. In the particular case $T = \partial f$, where $f$ is a proper closed convex function, it holds that $T^{\varepsilon}(x) \subseteq \partial_{\varepsilon} f(x)$, where $\partial_{\varepsilon} f$ stands for the standard $\varepsilon$-subdifferential as defined in Convex Analysis. Some further properties and applications of $T^{\varepsilon}$ can be found in [4,5,6,7,32,39].
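As a concrete illustration of the enlargement idea (our example, not from the paper): for $f(x) = \tfrac{1}{2}\|x\|^2$, so that $T = \partial f = I$, the $\varepsilon$-subdifferential has the closed form $\partial_{\varepsilon} f(x) = \{ v : \tfrac{1}{2}\|v - x\|^2 \le \varepsilon \}$, a ball around the single exact value $T(x) = \{x\}$.

```python
import numpy as np

def in_eps_subdifferential_half_sq_norm(v, x, eps):
    """Membership test for the eps-subdifferential of f(x) = 0.5*||x||^2.
    The defining inequality f(z) >= f(x) + <v, z - x> - eps for all z reduces,
    after minimizing over z, to 0.5*||v - x||^2 <= eps."""
    return 0.5 * np.dot(v - x, v - x) <= eps

x = np.array([1.0, 0.0])
print(in_eps_subdifferential_half_sq_norm(x, x, 0.0))                          # True: exact gradient
print(in_eps_subdifferential_half_sq_norm(x + np.array([0.3, 0.0]), x, 0.05))  # True: 0.045 <= 0.05
print(in_eps_subdifferential_half_sq_norm(x + np.array([0.5, 0.0]), x, 0.05))  # False: 0.125 > 0.05
```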

In [37], a variation of the proximal point method was proposed, where both the equation and the inclusion were relaxed in system (12). In addition, the projection step of [38] was replaced by a simpler extragradient-like step. This algorithm is called the Hybrid Extragradient–Proximal Point Method (HEPPM). Specifically, HEPPM works as follows. A pair $(\tilde{x}^{k+1}, \tilde{v}^{k+1})$ is considered an acceptable approximate solution of the proximal system (12) if, for some $\varepsilon_{k+1} \ge 0$,

$\tilde{v}^{k+1} \in T^{\varepsilon_{k+1}}(\tilde{x}^{k+1})$,
$c_k \tilde{v}^{k+1} + \tilde{x}^{k+1} - x^k = r^k$,   (17)

and

$\|r^k\|^2 + 2 c_k \varepsilon_{k+1} \le \sigma^2 \|\tilde{x}^{k+1} - x^k\|^2$,   (18)

where $\sigma \in [0, 1)$ is fixed. Having in hand those objects, the next iterate is

$x^{k+1} := x^k - c_k \tilde{v}^{k+1}$.   (19)

Convergence properties of this method are again the same as those of the original proximal point algorithm. We note that for $\varepsilon_{k+1} = 0$, condition (18) for HEPPM is somewhat stronger than condition (14) for HPPPM, although it should not be much more difficult to satisfy in practice. It has to be stressed, however, that the introduction of $\varepsilon_{k+1} \ne 0$ is not merely formal or artificial here. It is important for several reasons, see [37,39] for a more detailed discussion. Here, we shall mention that $T^{\varepsilon}$ arises naturally in the context of variational inequalities where subproblems are solved using the regularized gap function [15]. Another situation where $T^{\varepsilon}$ is useful is bundle techniques for general maximal monotone inclusions, see [5,32] (unlike $T$, $T^{\varepsilon}$ has certain continuity properties [6], much like the $\varepsilon$-subdifferential in nonsmooth convex optimization).
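The following sketch (ours, not from [37]) spells out one HEPPM iteration: the candidate triple is assumed to satisfy the inclusion in (17), the test (18) is evaluated, and the extragradient-like update (19) is applied.

```python
import numpy as np

def heppm_step(x_k, c_k, x_tilde, v_tilde, eps_next, sigma):
    """One HEPPM iteration. The caller guarantees v_tilde in T^{eps_next}(x_tilde);
    this routine only checks the error criterion (18) and applies the update (19)."""
    r_k = c_k * v_tilde + x_tilde - x_k                      # residual of (17)
    lhs = np.dot(r_k, r_k) + 2.0 * c_k * eps_next            # left-hand side of (18)
    rhs = sigma**2 * np.dot(x_tilde - x_k, x_tilde - x_k)    # right-hand side of (18)
    if lhs > rhs:
        raise ValueError("candidate rejected by criterion (18); refine the inner solver")
    return x_k - c_k * v_tilde                               # extragradient-like step (19)
```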

The main goal of the present paper is to unify the methods of [38] and [37] within a more general framework which, in addition, utilizes an even better approximation criterion for the solution of subproblems than the ones studied previously. A new application to solving monotone systems of semismooth equations is also discussed.

2. THE GENERAL FRAMEWORK

Given any $x \in H$ and $c > 0$, consider the proximal point problem of solving the system

$v \in T(y)$,
$c v + y - x = 0$.   (20)

We start directly with our new notion of approximate solutions of (20).

Definition 1. We say that a triple $(y, v, \varepsilon) \in H \times H \times R_{+}$ is an approximate solution of the proximal system (20) with error tolerance $\sigma \in [0, 1)$ if

$v \in T^{\varepsilon}(y)$,
$\|c v + y - x\|^2 + 2 c \varepsilon \le \sigma^2 \left( \|c v\|^2 + \|y - x\|^2 \right)$.   (21)

First note that condition (21) is more general than either (14) or (18). This is easy to see because the right-hand side in (21) is larger. In addition, (14) works only with the exact values of the operator (i.e., with $\varepsilon = 0$).

Observe that if $(y, v)$ is the exact solution of (20) then, taking $\varepsilon = 0$, we conclude that $(y, v, \varepsilon)$ satisfies the approximation criterion (21) for any $\sigma \in [0, 1)$. Conversely, if $\sigma = 0$, then only the exact solution of (20) (with $\varepsilon = 0$) will satisfy this approximation criterion. So this definition is quite natural. For $\sigma \in (0, 1)$, system (20) has at least one, and typically many, approximate solutions in the sense of Definition 1.
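In code, the acceptance test of Definition 1 is a one-line inequality; the sketch below (ours) takes the membership $v \in T^{\varepsilon}(y)$ as a separate assumption certified by whatever inner solver produced the triple.

```python
import numpy as np

def satisfies_criterion_21(x, c, y, v, eps, sigma):
    """Error tolerance test (21) for a candidate triple (y, v, eps).
    The inclusion v in T^eps(y) must be certified separately by the inner solver."""
    lhs = np.linalg.norm(c * v + y - x) ** 2 + 2.0 * c * eps
    rhs = sigma ** 2 * (np.linalg.norm(c * v) ** 2 + np.linalg.norm(y - x) ** 2)
    return lhs <= rhs
```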

We next establish some basic properties of the approximate solutions defined above.

Lemma 2. Take $x \in H$, $c > 0$. The triple $(y, v, \varepsilon) \in H \times H \times R_{+}$ being an approximate solution of the proximal system (20) with error tolerance $\sigma \in [0, 1)$ is equivalent to the following relations:

$v \in T^{\varepsilon}(y)$,  $\langle v, x - y \rangle - \varepsilon \ge \dfrac{1 - \sigma^2}{2 c} \left( \|c v\|^2 + \|y - x\|^2 \right)$.   (22)

In addition, it holds that

$\dfrac{1 - \rho}{1 - \sigma^2} c \|v\| \le \|y - x\| \le \dfrac{1 + \rho}{1 - \sigma^2} c \|v\|$,   (23)

where

$\rho := \sqrt{1 - (1 - \sigma^2)^2}$.

Furthermore, the three conditions

1. $0 \in T(x)$,
2. $v = 0$,
3. $y = x$

are equivalent and imply $\varepsilon = 0$.

Proof. Rewriting (21), we have

$\sigma^2 \left( \|c v\|^2 + \|y - x\|^2 \right) \ge \|c v + y - x\|^2 + 2 c \varepsilon = 2 c \varepsilon + \|c v\|^2 + \|y - x\|^2 + 2 c \langle v, y - x \rangle$,

which is further equivalent to (22), by a simple rearrangement of terms.

Furthermore, by the Cauchy–Schwarz inequality and the nonnegativity of $\varepsilon$, using (22), we have that

$\|c v\| \, \|y - x\| \ge c \langle v, x - y \rangle \ge c \langle v, x - y \rangle - c \varepsilon \ge \dfrac{1 - \sigma^2}{2} \left( \|c v\|^2 + \|y - x\|^2 \right)$.

Denoting $t := \|y - x\|$ and resolving the quadratic inequality in $t$,

$t^2 - \dfrac{2 \|c v\|}{1 - \sigma^2} t + \|c v\|^2 \le 0$,

proves (23).

Suppose now that $0 \in T(x)$. Since $v \in T^{\varepsilon}(y)$, we have that

$-\varepsilon \le \langle v - 0, y - x \rangle = \langle v, y - x \rangle$,

and it follows that the right-hand side of (22) is zero. Hence, $v = 0$ and $x = y$. If we assume that $v = 0$, then (22) implies that $x = y$, and vice versa. And obviously, these conditions imply that $0 \in T(x)$. It is also clear from (22) that in those cases $\varepsilon = 0$. □
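For completeness, here is the elementary step left implicit above (our addition): the roots of the quadratic give exactly the two-sided bound (23).

```latex
t^2 - \frac{2\|cv\|}{1-\sigma^2}\,t + \|cv\|^2 \le 0
\;\Longleftrightarrow\;
t \in [\,t_-,\,t_+\,],
\qquad
t_{\pm} = \frac{\|cv\|}{1-\sigma^2}\left(1 \pm \sqrt{1-(1-\sigma^2)^2}\right)
        = \frac{1 \pm \rho}{1-\sigma^2}\,\|cv\| .
```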

The following lemma is the key to developing convergent iterative schemes based on the approximate solutions of proximal subproblems defined above.

Lemma 3. Take any $x \in H$. Suppose that

$\langle v, x - y \rangle - \varepsilon > 0$,

where $\varepsilon \ge 0$ and $v \in T^{\varepsilon}(y)$. Then for any $x^{*} \in T^{-1}(0)$ and any $\theta \ge 0$, it holds that

$\|x^{*} - x^{+}\|^2 \le \|x^{*} - x\|^2 - \left( 1 - (1 - \theta)^2 \right) \|a v\|^2$,

where

$x^{+} := x - \theta a v$,  $a := \dfrac{\langle v, x - y \rangle - \varepsilon}{\|v\|^2}$.

Proof. Define the closed halfspace $\mathcal{H}$ as

$\mathcal{H} := \{ z \in H \mid \langle v, z - y \rangle - \varepsilon \le 0 \}$.

By the assumption, $x \notin \mathcal{H}$. Let $\bar{x}$ be the projection of $x$ onto $\mathcal{H}$. It is well known, and easy to check, that

$\bar{x} = x - a v$,

where the quantity $a$ is defined above. For any $z \in \mathcal{H}$, it holds that $\langle z - \bar{x}, v \rangle \le 0$, and so

$\langle z - \bar{x}, x^{+} - x \rangle \ge 0$.

Using the definition of $x^{+}$, we obtain

$\|z - x\|^2 = \|(z - x^{+}) + (x^{+} - x)\|^2$
$= \|z - x^{+}\|^2 + \|x^{+} - x\|^2 + 2 \langle z - x^{+}, x^{+} - x \rangle$
$= \|z - x^{+}\|^2 + \|x^{+} - x\|^2 + 2 \langle \bar{x} - x^{+}, x^{+} - x \rangle + 2 \langle z - \bar{x}, x^{+} - x \rangle$
$\ge \|z - x^{+}\|^2 + \|x^{+} - x\|^2 + 2 \langle \bar{x} - x^{+}, x^{+} - x \rangle$
$= \|z - x^{+}\|^2 + \left( 1 - (1 - \theta)^2 \right) a^2 \|v\|^2$.

Suppose $x^{*} \in T^{-1}(0)$. By the definition of $T^{\varepsilon}$,

$\langle v - 0, y - x^{*} \rangle \ge -\varepsilon$.

Therefore, $\langle v, x^{*} - y \rangle - \varepsilon \le 0$, which means that $x^{*} \in \mathcal{H}$. Setting $z = x^{*}$ in the chain of inequalities above completes the proof. □

Lemma 3 shows that if we take $\theta \in (0, 2)$ then the point $x^{+}$ is closer to the solution set $T^{-1}(0)$ than the point $x$. Using further Lemma 2, this can serve as a basis for constructing a convergent iterative algorithm. Allowing the parameter $\theta$ to vary within the interval $(0, 2)$ will make it possible to unify the algorithms of [38] and [37].


Algorithm 2.1. Choose any $x^0 \in H$, $0 \le \bar{\sigma} < 1$, $\bar{c} > 0$, and $\tau \in (0, 1)$.

1. Choose $c_k \ge \bar{c}$ and find $(v^k, y^k, \varepsilon_k) \in H \times H \times R_{+}$, an approximate solution with error tolerance $\sigma_k \le \bar{\sigma}$ of the proximal system (12), i.e.,

$v^k \in T^{\varepsilon_k}(y^k)$,
$\|c_k v^k + y^k - x^k\|^2 + 2 c_k \varepsilon_k \le \sigma_k^2 \left( \|c_k v^k\|^2 + \|y^k - x^k\|^2 \right)$.

2. If $y^k = x^k$, STOP. Otherwise,

3. Choose $\theta_k \in [1 - \tau, 1 + \tau]$, and set

$a_k := \dfrac{\langle v^k, x^k - y^k \rangle - \varepsilon_k}{\|v^k\|^2}$,
$x^{k+1} := x^k - \theta_k a_k v^k$,

and go to Step 1.

Note that the algorithm stops when $y^k = x^k$, which, by Lemma 2, means that $x^k \in T^{-1}(0)$. In our convergence analysis we shall assume that this does not occur, and so an infinite sequence of iterates is generated.
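A minimal end-to-end sketch of Algorithm 2.1 (ours, for illustration only), specialized to the affine monotone operator $T(x) = Ax + b$ used earlier; the proximal system (20) is solved exactly here, so $\varepsilon_k = 0$ and criterion (21) holds for any $\sigma_k$, but an inexact inner solver satisfying (21) could be substituted without changing the outer loop.

```python
import numpy as np

def algorithm_2_1_affine(A, b, x0, c_bar=1.0, theta=1.0, max_iter=100, tol=1e-12):
    """Sketch of Algorithm 2.1 for T(x) = A x + b, A positive semidefinite.
    Step 1 solves (20) exactly (eps_k = 0), Step 2 is the stopping test,
    Step 3 is the relaxed projection update x^{k+1} = x^k - theta_k a_k v^k."""
    x = np.array(x0, dtype=float)
    n = len(x)
    for _ in range(max_iter):
        c_k = c_bar                                             # any c_k >= c_bar is allowed
        y = np.linalg.solve(c_k * A + np.eye(n), x - c_k * b)   # exact solution of (20)
        v = A @ y + b                                           # v in T(y); here eps_k = 0
        if np.linalg.norm(y - x) <= tol:                        # Step 2: y^k = x^k (numerically)
            break
        a_k = np.dot(v, x - y) / np.dot(v, v)                   # Step 3 with eps_k = 0
        x = x - theta * a_k * v
    return x

# Example: the iterates approach the solution of 0 = A x + b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
print(algorithm_2_1_affine(A, b, np.zeros(2)))   # close to np.linalg.solve(A, -b)
```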

We next show that Algorithm 2.1 contains the methods proposed in [38] and [37] as its special cases. First, as was already remarked, any approximate solution of the proximal point subproblem satisfying (14) or (18) (used in [38] and [37], respectively) certainly satisfies the approximation criterion (21) employed in Algorithm 2.1. It remains to show that the ways $x^{k+1}$ is obtained in (15) and (19) (used in [38] and [37], respectively) are also two special cases of the update rule in Algorithm 2.1. With respect to [38] this is completely obvious, as (15) corresponds to taking $\theta_k = 1$ in Algorithm 2.1 (with $\varepsilon_k = 0$, as in the setting of [38]). For the method of [37], on the other hand, this issue is not immediately clear. Essentially, one has to show that under the approximation criterion (18) there exists $\theta_k \in [1 - \tau, 1 + \tau]$ such that $\theta_k a_k = c_k$, or, equivalently, that $(1 - \tau) a_k \le c_k \le (1 + \tau) a_k$ for some $\tau \in (0, 1)$. The following proposition establishes this fact.

Proposition 4. Take $x, y, v \in H$, $\varepsilon \ge 0$, $c > 0$ and $\sigma \in [0, 1)$ such that $0 \notin T(x)$ and

$v \in T^{\varepsilon}(y)$,
$\|c v + y - x\|^2 + 2 c \varepsilon \le \sigma^2 \|y - x\|^2$.

Then

$(1 - \sigma) a \le c \le (1 + \sigma) a$,

where

$a = \dfrac{\langle v, x - y \rangle - \varepsilon}{\|v\|^2}$.


Proof. First note that, by the hypotheses of the proposition, Lemma 2 is applicable, and we have that $v \ne 0$ and $a > 0$.

Furthermore, by our assumptions it holds that $\|c v + y - x\| \le \sigma \|y - x\|$, from which, by the triangle inequality, it follows that

$(1 - \sigma) \|y - x\| \le c \|v\| \le (1 + \sigma) \|y - x\|$.   (24)

By the nonnegativity of $\varepsilon$ and the Cauchy–Schwarz inequality, we have that

$a \le \langle v, x - y \rangle / \|v\|^2 \le \|y - x\| / \|v\|$.

Combining the latter relation with (24), we conclude that

$(1 - \sigma) a \le (1 - \sigma) \|y - x\| / \|v\| \le c$,

which proves one part of the assertion.

To prove the other part, note that under our assumptions

$a = \dfrac{\|c v\|^2 + \|y - x\|^2 - \left( \|c v + y - x\|^2 + 2 c \varepsilon \right)}{2 c \|v\|^2} \ge \dfrac{\|c v\|^2 + (1 - \sigma^2) \|y - x\|^2}{2 c \|v\|^2} = \dfrac{c}{2} \left( 1 + (1 - \sigma^2) \dfrac{\|x - y\|^2}{\|c v\|^2} \right)$.

Using further (24), we obtain

$a \ge \dfrac{c}{2} \left( 1 + \dfrac{1 - \sigma^2}{(1 + \sigma)^2} \right) = \dfrac{c}{1 + \sigma}$,

which concludes the proof. □

Proposition 4 implies that if we choose $\tau \ge \bar{\sigma}$ in Algorithm 2.1, then for each $k$ there exists $\theta_k \in [1 - \tau, 1 + \tau]$ such that $\theta_k a_k = c_k$. Hence, HEPPM falls within the presented framework of Algorithm 2.1.

3. CONVERGENCE ANALYSIS

As already remarked, if Algorithm 2.1 terminates finitely, it does so at a solution of the problem. From now on, we assume that infinite sequences $\{x^k\}$, $\{v^k\}$, $\{y^k\}$ and $\{\varepsilon_k\}$ are generated. Using Lemma 2, we conclude that for all $k$, $v^k \ne 0$, $y^k \ne x^k$ and

$\langle v^k, x^k - y^k \rangle - \varepsilon_k \ge \dfrac{1 - \sigma_k^2}{2 c_k} \left( \|c_k v^k\|^2 + \|y^k - x^k\|^2 \right) > 0$.   (25)


Therefore, by the definition of $a_k$,

$a_k \ge \dfrac{1 - \sigma_k^2}{2} \, \dfrac{\|c_k v^k\|^2 + \|y^k - x^k\|^2}{c_k \|v^k\|^2}$.   (26)

Furthermore, using the fact that $\|c_k v^k\|^2 + \|y^k - x^k\|^2 \ge 2 c_k \|v^k\| \, \|y^k - x^k\|$, we obtain that

$a_k \|v^k\| \ge (1 - \sigma_k^2) \|y^k - x^k\|$.   (27)

Proposition 5. If the solution set of problem (1) is nonempty, then

1. The sequence $\{x^k\}$ is bounded;
2. $\sum_{k=0}^{\infty} \|a_k v^k\|^2 < \infty$;
3. $\lim_{k \to \infty} \|y^k - x^k\| = \lim_{k \to \infty} \|v^k\| = \lim_{k \to \infty} \varepsilon_k = 0$.

Proof. Applying (25) and Lemma 3, we have that for any $x^{*} \in T^{-1}(0)$ it holds that

$\|x^{*} - x^{k+1}\|^2 \le \|x^{*} - x^k\|^2 - \left( 1 - (1 - \theta_k)^2 \right) a_k^2 \|v^k\|^2 \le \|x^{*} - x^k\|^2 - (1 - \tau^2) \|a_k v^k\|^2$.

It immediately follows that the sequence $\{x^k\}$ is bounded, and that the second item of the proposition holds. Therefore we also have that $0 = \lim_{k \to \infty} a_k \|v^k\|$. From (26) it easily follows that

$a_k \|v^k\| \ge \dfrac{1 - \sigma_k^2}{2} \|c_k v^k\|$.

Using this relation and (27), we immediately conclude that $0 = \lim_{k \to \infty} \|v^k\| = \lim_{k \to \infty} \|y^k - x^k\|$. Since $\varepsilon_k \ge 0$, relation (25) also implies that $0 = \lim_{k \to \infty} \varepsilon_k$. □

We are now in a position to prove convergence of our algorithm.

Theorem 6. If the solution set of problem (1) is nonempty, then the sequence $\{x^k\}$ converges weakly to a solution.

Proof. By Proposition 5, the sequence $\{x^k\}$ is bounded, and so it has at least one weak accumulation point, say $\bar{x}$. Let $\{x^{k_j}\}$ be some subsequence weakly converging to $\bar{x}$. Since $\|y^k - x^k\| \to 0$ (by Proposition 5), it follows that the subsequence $\{y^{k_j}\}$ also converges weakly to $\bar{x}$. Moreover, we know that $v^{k_j} \to 0$ (strongly) and $\varepsilon_{k_j} \to 0$. Take any $x \in H$ and $u \in T(x)$. Because $v^k \in T^{\varepsilon_k}(y^k)$, for any index $j$ it holds that

$\langle u - v^{k_j}, x - y^{k_j} \rangle \ge -\varepsilon_{k_j}$.

Therefore,

$\langle u - 0, x - y^{k_j} \rangle \ge \langle v^{k_j}, x - y^{k_j} \rangle - \varepsilon_{k_j}$.

Since $\{y^{k_j}\}$ converges weakly to $\bar{x}$, $\{v^{k_j}\}$ converges strongly to zero while $\{y^{k_j}\}$ is bounded, and $\{\varepsilon_{k_j}\}$ converges to zero, taking the limit as $j \to \infty$ in the above inequality we obtain that

$\langle u - 0, x - \bar{x} \rangle \ge 0$.

But $(x, u)$ was taken as an arbitrary element in the graph of $T$. From the maximality of $T$ it follows that $0 \in T(\bar{x})$, i.e., $\bar{x}$ is a solution of (1). We have thus proved that every weak accumulation point of $\{x^k\}$ is a solution.

The proof of uniqueness of the weak accumulation point in this setting is standard. For example, uniqueness follows easily by applying Opial's Lemma [23] (observe that $\{\|x^k - \bar{x}\|\}$ is nonincreasing), or by using the analysis of [31]. □

When $T^{-1}(0) = \emptyset$, i.e., no solution exists, the sequence $\{x^k\}$ can be shown to be unbounded. Because one has to work with an enlargement of the operator, the proof of this fact is somewhat different from the classical one (e.g., [31]). We refer the reader to [37], where the required analysis is similar.

We next turn our attention to the study of the convergence rate of Algorithm 2.1. This will be done under the standard assumption that $T^{-1}$ is Lipschitz-continuous at zero, i.e., there exist the unique $x^{*} \in T^{-1}(0)$ and some $\delta, L > 0$ such that

$v \in T(x)$, $\|v\| \le \delta$  $\Rightarrow$  $\|x - x^{*}\| \le L \|v\|$.   (28)

We shall assume, for the sake of simplicity, that $\theta_k = 1$ for all $k$, i.e., there is no relaxation in the "projection" step of Algorithm 2.1.

The following error bound, established in [39], will be the key to the analysis.

Theorem 7. [39, Corollary 2.1] Let $(y^{*}, v^{*})$ be the exact solution of the proximal system (20). Then for any $y \in H$, $v \in T^{\varepsilon}(y)$, it holds that

$\|y - y^{*}\|^2 + c^2 \|v - v^{*}\|^2 \le \|c v + y - x\|^2 + 2 c \varepsilon$.

A lower bound for $a_k$ will also be needed. Using (26), we have

$a_k \ge \dfrac{1 - \sigma_k^2}{2} \left( 1 + \left( \dfrac{\|y^k - x^k\|}{\|c_k v^k\|} \right)^2 \right) c_k \ge \dfrac{1 - \bar{\sigma}^2}{2} \left( 1 + \left( \dfrac{\|y^k - x^k\|}{\|c_k v^k\|} \right)^2 \right) \bar{c}$.

Using also (23), after some algebraic manipulations, we obtain

$a_k \ge \left[ \dfrac{1 + \sqrt{1 - (1 - \bar{\sigma}^2)^2}}{1 - \bar{\sigma}^2} \right]^{-1} \bar{c}$.   (29)


We are now ready to establish the linear rate of convergence of our algorithm.

Theorem 8. Suppose that $T^{-1}$ is Lipschitz-continuous at zero, and $\theta_k = 1$ for all $k$. Then for $k$ sufficiently large,

$\|x^{*} - x^{k+1}\| \le \dfrac{\mu}{\sqrt{1 + \mu^2}} \, \|x^{*} - x^k\|$,

where

$\mu := \sqrt{\gamma^2 + 1} \, \sqrt{\alpha^2 - 1} + \gamma \alpha$,

with

$\gamma := \dfrac{L}{\bar{c}} \, \dfrac{1 + \sqrt{1 - (1 - \bar{\sigma}^2)^2}}{1 - \bar{\sigma}^2}$,  $\alpha := \dfrac{1}{1 - \bar{\sigma}^2}$.

Proof. Define, for each $k$, $\hat{x}^k$ and $\hat{v}^k \in T(\hat{x}^k)$ as the exact solution of the proximal problem $0 \in a_k T(x) + x - x^k$, where $a_k$ was defined in Algorithm 2.1 (recall that $a_k > 0$). Since $v^k \in T^{\varepsilon_k}(y^k)$, by Theorem 7 it follows that

$\|\hat{x}^k - y^k\|^2 + a_k^2 \|\hat{v}^k - v^k\|^2 \le \|a_k v^k + y^k - x^k\|^2 + 2 a_k \varepsilon_k$
$= \|a_k v^k + y^k - x^k\|^2 + 2 a_k \left( \langle v^k, x^k - y^k \rangle - a_k \|v^k\|^2 \right)$
$= \|y^k - x^k\|^2 - \|a_k v^k\|^2$.

Using also (27), we get

$\|\hat{x}^k - y^k\|^2 + a_k^2 \|\hat{v}^k - v^k\|^2 \le \left( (1 - \sigma_k^2)^{-2} - 1 \right) \|a_k v^k\|^2$.   (30)

Therefore,

$\|\hat{v}^k - v^k\|^2 \le \left( (1 - \bar{\sigma}^2)^{-2} - 1 \right) \|v^k\|^2$.

Now, since $v^k$ converges to zero (strongly), we conclude that $\hat{v}^k$ also converges to zero. Hence, for $k$ large enough, it holds that $\|\hat{v}^k\| \le \delta$, and, by (28),

$\|x^{*} - \hat{x}^k\| \le L \|\hat{v}^k\| = (L / a_k) \|\hat{x}^k - x^k\|$.

In the sequel, we assume that $k$ is large enough, so that the above bound holds. By the triangle inequality, we further obtain

$\|x^{*} - x^{k+1}\| \le \|x^{*} - \hat{x}^k\| + \|\hat{x}^k - x^{k+1}\|$
$\le (L / a_k) \|\hat{x}^k - x^k\| + \|\hat{x}^k - x^{k+1}\|$
$\le (L / a_k) \|\hat{x}^k - y^k\| + (L / a_k) \|y^k - x^k\| + \|\hat{x}^k - x^{k+1}\|$.

Note that because $\hat{x}^k = x^k - a_k \hat{v}^k$ and $x^{k+1} = x^k - a_k v^k$, we have that $a_k \|\hat{v}^k - v^k\| = \|\hat{x}^k - x^{k+1}\|$. Using this relation and (30), we obtain that

$\|\hat{x}^k - y^k\|^2 + \|\hat{x}^k - x^{k+1}\|^2 \le \left[ (1 - \sigma_k^2)^{-2} - 1 \right] \|a_k v^k\|^2$.

Using the Cauchy–Schwarz inequality, it now follows that

$(L / a_k) \|\hat{x}^k - y^k\| + \|\hat{x}^k - x^{k+1}\| \le \sqrt{(L / a_k)^2 + 1} \, \sqrt{\|\hat{x}^k - y^k\|^2 + \|\hat{x}^k - x^{k+1}\|^2}$
$\le \sqrt{(L / a_k)^2 + 1} \, \sqrt{(1 - \sigma_k^2)^{-2} - 1} \, \|a_k v^k\|$.

Hence,

$\|x^{*} - x^{k+1}\| \le \sqrt{(L / a_k)^2 + 1} \, \sqrt{(1 - \sigma_k^2)^{-2} - 1} \, \|a_k v^k\| + (L / a_k) \|y^k - x^k\|$
$\le \left( \sqrt{(L / a_k)^2 + 1} \, \sqrt{(1 - \sigma_k^2)^{-2} - 1} + (L / a_k)(1 - \sigma_k^2)^{-1} \right) \|a_k v^k\|$,

where (27) was used again. Recalling the definition of $\mu$, and using (29), we get

$\|x^{*} - x^{k+1}\| \le \mu \|a_k v^k\|$.

Using also Lemma 3, we now have that

$\|x^{*} - x^k\|^2 \ge \|x^{*} - x^{k+1}\|^2 + \|a_k v^k\|^2 \ge \|x^{*} - x^{k+1}\|^2 + (1 / \mu)^2 \|x^{*} - x^{k+1}\|^2$,

and it follows that

$\|x^{*} - x^{k+1}\| \le \dfrac{\mu}{\sqrt{1 + \mu^2}} \, \|x^{*} - x^k\|$,

which establishes the claim. □

Remark 9. It is easy to see that we could repeat the analysis of Theorem 8 with $\gamma_k$, $\alpha_k$ and $\mu_k$ defined for each $k$ as functions of $c_k$ and $\sigma_k$ (instead of their bounds $\bar{c}$ and $\bar{\sigma}$). Then the analysis above shows that if $c_k \to \infty$ and $\sigma_k \to 0$, then $\gamma_k \to 0$, $\alpha_k \to 1$, and hence, $\mu_k \to 0$. We conclude that, under the additional assumptions that $c_k \to \infty$ and $\sigma_k \to 0$, in the setting of this section $\{x^k\}$ converges to $x^{*}$ superlinearly.

In the more general case where $\theta_k \in [1 - \tau, 1 + \tau]$, under the same condition (28), the following estimate is valid:

$\|x^{*} - x^{k+1}\| \le \dfrac{\mu + \tau}{\sqrt{(1 - \tau^2) + (\mu + \tau)^2}} \, \|x^{*} - x^k\|$.


4. SYSTEMS OF SEMISMOOTH EQUATIONS AND REFORMULATIONS OF MONOTONE COMPLEMENTARITY PROBLEMS

In this section, we consider the specific situation where

$H = R^n$,  $T = F$,  $F : R^n \to R^n$,

and $F$ is a locally Lipschitz-continuous semismooth monotone function. As will be discussed below, one interesting example of problems having this structure is equation-based reformulations [16,27] of monotone nonlinear complementarity problems [13,25]. A globally convergent Newton method for solving systems of monotone equations was proposed in [36]. The attractive feature of this algorithm is that, given an arbitrary starting point, it is guaranteed to generate a sequence of iterates convergent to a solution of the problem without any regularity-type assumptions. This is a very desirable property, not shared by standard globalization strategies based on merit functions. We refer the reader to [35,36] for a detailed discussion of this issue. We next state a related method based on the more general framework of Algorithm 2.1.

Algorithm 4.1. Choose any $x^0 \in R^n$, $0 \le \bar{\sigma} < 1$, $\bar{c} > 0$, and $\beta, \tau \in (0, 1)$.

1. (Newton Step)
Choose $c_k \ge \bar{c}$, $\sigma_k \le \bar{\sigma}$, and a positive semidefinite matrix $H_k$. Compute $d^k$, the solution of the linear system

$F(x^k) + (H_k + c_k^{-1} I) d = 0$.

Stop if $d^k = 0$. If

$\|c_k F(x^k + d^k) + d^k\|^2 \le \sigma_k^2 \left( \|c_k F(x^k + d^k)\|^2 + \|d^k\|^2 \right)$,

then set $y^k := x^k + d^k$, and go to the Projection Step. Otherwise,

2. (Linesearch Step)
Find $m_k$, the smallest nonnegative integer $m$ such that

$-c_k \langle F(x^k + \beta^m d^k), d^k \rangle \ge \sigma_k^2 \|d^k\|^2$.

Set $y^k := x^k + \beta^{m_k} d^k$.

3. (Projection Step)
Choose $\theta_k \in [1 - \tau, 1 + \tau]$, and set

$x^{k+1} := x^k + \theta_k \dfrac{\langle F(y^k), d^k \rangle}{\|F(y^k)\|^2} F(y^k)$.

Set $k := k + 1$, and go to the Newton Step.
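The following Python sketch (ours, simplified) mirrors the three steps of Algorithm 4.1 as stated above, for a user-supplied monotone map $F$ and matrices $H_k$; it is an illustration rather than a reference implementation.

```python
import numpy as np

def algorithm_4_1(F, H, x0, c_bar=1.0, sigma_bar=0.5, beta=0.5, tau=0.5,
                  theta=1.0, max_iter=50, max_ls=30):
    """Sketch of Algorithm 4.1: a Newton step on the proximal subproblem,
    a linesearch fallback, and the projection-type update of Step 3.
    F(x) returns the monotone map, H(x) a positive semidefinite matrix."""
    x = np.array(x0, dtype=float)
    n = len(x)
    for _ in range(max_iter):
        c_k, sigma_k = c_bar, sigma_bar
        d = np.linalg.solve(H(x) + (1.0 / c_k) * np.eye(n), -F(x))   # Newton Step
        if np.linalg.norm(d) == 0.0:
            return x                                                  # d^k = 0: stop
        F_trial = F(x + d)
        if (np.linalg.norm(c_k * F_trial + d) ** 2
                <= sigma_k ** 2 * (np.linalg.norm(c_k * F_trial) ** 2 + np.dot(d, d))):
            y = x + d                                                 # Newton point accepted
        else:
            m = 0                                                     # Linesearch Step
            while (-c_k * np.dot(F(x + beta ** m * d), d) < sigma_k ** 2 * np.dot(d, d)
                   and m < max_ls):
                m += 1
            y = x + beta ** m * d
        Fy = F(y)
        if np.dot(Fy, Fy) == 0.0:
            return y                                                  # y solves F(x) = 0
        x = x + theta * (np.dot(Fy, d) / np.dot(Fy, Fy)) * Fy         # Projection Step
    return x
```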


The key idea in Algorithm 4.1 is to try solving the proximal point subproblem by just one Newton-type iteration. When this is not possible, the Newton point is refined by means of a linesearch. The following properties of Algorithm 4.1 can be established essentially using the line of analysis developed in [36].

Theorem 10. Suppose that $F$ is continuous and monotone, and let $\{x^k\}$ be any sequence generated by Algorithm 4.1. Then $\{x^k\}$ is bounded. Furthermore, if $\|H_k\| \le C_1$ and $c_k \ge C_2 \|F(x^k)\|^{-1}$ for some $C_1, C_2 > 0$, then $\{x^k\}$ converges to some $\bar{x}$ such that $F(\bar{x}) = 0$.

In addition, if $F(\cdot)$ is differentiable in a neighborhood of $\bar{x}$ with $F'(\cdot)$ Lipschitz-continuous, $F'(\bar{x})$ is positive definite, $H_k = F'(x^k)$ for all $k \ge k_0$, and $0 = \lim_{k \to \infty} c_k^{-1} = \lim_{k \to \infty} c_k \|F(x^k)\|$, then the convergence rate is superlinear.

Consider the classical nonlinear complementarity problem (NCP) [25], which is to find an $x \in R^n$ such that

$g(x) \ge 0$,  $x \ge 0$,  $\langle g(x), x \rangle = 0$,   (31)

where $g : R^n \to R^n$ is differentiable. One of the most useful approaches to the numerical and theoretical treatment of the NCP consists in reformulating it as a system of (nonsmooth) equations [27]. One popular choice is given by the so-called natural residual

$F_i(x) = \min\{ x_i, \lambda g_i(x) \}$,  $i = 1, \ldots, n$,   (32)

where $\lambda > 0$ is a parameter. It is easy to check that for this mapping the solution set of the system of equations $F(x) = 0$ coincides with the solution set of the NCP (31). Furthermore, it is known [45] (see also [17,33]) that $F$ given by (32) is monotone for all sufficiently small $\lambda$ if $g$ is co-coercive (for the definition of co-coerciveness and its role for variational inequalities, see [2,43,46]). Since $F$ is obviously continuous, it immediately follows that for co-coercive NCPs, Algorithm 4.1 applied to the natural residual $F$ generates a sequence of iterates globally converging to some solution $x^{*}$ of the NCP, provided the parameters are chosen as specified. However, it is clear that this $F$ is in general not differentiable. In particular, it is not differentiable at a solution $x^{*}$ of the NCP unless the strict complementarity condition $x^{*}_i + g_i(x^{*}) > 0$, $i = 1, \ldots, n$, holds. This condition, however, is known to be very restrictive. Hence, the superlinear rate of convergence stated in Theorem 10 does not apply in this context.
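A small sketch (ours) of the natural-residual reformulation (32), writing $\lambda$ for the scaling parameter: the zero set of $F$ coincides with the solution set of the NCP (31).

```python
import numpy as np

def natural_residual(g, x, lam=1.0):
    """Natural residual F_i(x) = min{x_i, lam * g_i(x)} of the NCP defined by g.
    F(x) = 0 exactly when x >= 0, g(x) >= 0 and <g(x), x> = 0."""
    return np.minimum(x, lam * np.asarray(g(x)))

# Example with g(x) = M x + q, M positive semidefinite:
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
g = lambda x: M @ x + q
x = np.array([0.5, 0.0])
print(natural_residual(g, x))   # [0.0, 0.0] -> this x solves the NCP
```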

On the other hand, by Remark 9, Algorithm 2.1 does converge locally superlinearly under appropriate assumptions. Of course, it is important to study the computational cost involved in finding acceptable approximate solutions for the proximal point subproblems in Algorithm 2.1. We now turn our attention to this issue. Specifically, we shall show that when $x^k$ is close to a solution $x^{*}$ with certain properties, two steps of the generalized Newton method applied to solve the $k$-th proximal subproblem

$0 = F_k(x) := F(x) + c_k^{-1} (x - x^k)$   (33)

are sufficient to guarantee our error tolerance criterion. As a consequence, the local superlinear rate of convergence of Algorithm 2.1 is ensured by solving at most two systems of linear equations at each iteration, and by taking $c_k \to +\infty$, $\sigma_k \to 0$ in an appropriate way.

Let $\partial F(x)$ denote Clarke's generalized Jacobian [9] of $F$ at $x \in R^n$. Specifically, $\partial F(x)$ is the convex hull of the B-subdifferential [30] of $F$ at $x$, which is the set

$\partial_B F(x) = \{ H \in R^{n \times n} \mid \exists \{x^k\} \subset D_F : x^k \to x \ \text{and} \ F'(x^k) \to H \}$,

with $D_F$ being the set of points at which $F$ is differentiable (by Rademacher's Theorem, a locally Lipschitz-continuous function is differentiable almost everywhere). Recall that $F$ is semismooth [22,28,29] at $x \in R^n$ if it is directionally differentiable at $x$, and

$H d - F'(x; d) = o(\|d\|)$,  $\forall \, d \to 0$,  $\forall \, H \in \partial F(x + d)$,

where $F'(x; d)$ stands for the usual directional derivative (this is one of a number of equivalent ways to define semismoothness). Similarly, $F$ is said to be strongly semismooth at $x \in R^n$ if

$H d - F'(x; d) = O(\|d\|^2)$,  $\forall \, d \to 0$,  $\forall \, H \in \partial F(x + d)$.

When we say that $F$ is (strongly) semismooth, we mean that this property holds for all points $x \in R^n$. By the calculus for semismooth mappings [14], if $g$ is (strongly) semismooth, then so is $F$ given by (32). In particular, if $g$ is continuously differentiable, then $F$ is semismooth; and $F$ is strongly semismooth if the derivative of $g$ is Lipschitz-continuous (see also [21]). Furthermore, at any point $x \in R^n$, elements in $\partial F(x)$ or $\partial_B F(x)$ are easily computable by explicit formulas, e.g., see [19]. It is known [19,21] that if $x^{*}$ is a b-regular solution of the NCP (which is one of the weakest regularity conditions for the NCP), then all elements in $\partial_B F(x^{*})$ are nonsingular (the so-called BD-regularity of $F$ at $x^{*}$).

Consider $\{x^k\} \to x^{*}$, and assume that $x^{*}$ is a BD-regular solution of $F(x) = 0$. The latter implies that $x^{*}$ is an isolated solution, and the following error bound holds [27, Proposition 3]: there exist $M_1 > 0$ and $\delta > 0$ such that

$M_1 \|x - x^{*}\| \le \|F(x)\|$  for all $x \in R^n$ such that $\|x - x^{*}\| \le \delta$.   (34)
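To illustrate the remark above that elements of $\partial_B F(x)$ for the natural residual are available in closed form, here is a hypothetical sketch (ours; see [19] for the precise formulas and tie-breaking rules): row $i$ is the $i$-th unit row where $x_i$ is the active branch of the min, and $\lambda \nabla g_i(x)$ where $\lambda g_i(x)$ is active.

```python
import numpy as np

def b_subdifferential_element(x, gx, Jg, lam=1.0):
    """One candidate element of the B-subdifferential of F(x) = min(x, lam*g(x)).
    gx = g(x), Jg = Jacobian of g at x; at ties (x_i == lam*g_i(x)) either branch
    yields a valid row, and we arbitrarily pick the identity row."""
    n = len(x)
    H = np.empty((n, n))
    for i in range(n):
        if x[i] <= lam * gx[i]:
            H[i, :] = np.eye(n)[i, :]      # branch F_i(x) = x_i
        else:
            H[i, :] = lam * Jg[i, :]       # branch F_i(x) = lam * g_i(x)
    return H
```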

Denote by $z^k$ the exact solution of (33); $\{z^k\} \to x^{*}$. By the well-known properties of the (exact) proximal point method [31],

$\|z^k - x^{*}\|^2 \le \|x^k - x^{*}\|^2 - \|z^k - x^k\|^2$,

from which, using $x^k - z^k = c_k F(z^k)$ and (34), we obtain

$\|z^k - x^{*}\| = O(c_k^{-1} \|x^k - x^{*}\|)$.   (35)

Suppose $F$ is strongly semismooth. Clearly, $F_k$ is strongly semismooth, and $z^k$ is a BD-regular solution of (33) (elements in $\partial_B F_k(z^k)$ are nonsingular once $x^k$ is sufficiently close to $x^{*}$, so that $z^k$ is also close to $x^{*}$).

Let $\hat{y}^k \in R^n$ be the point obtained after one iteration of the generalized Newton method applied to (33):

$\hat{y}^k = x^k - P_k^{-1} F_k(x^k)$,  $P_k \in \partial_B F_k(x^k)$.

Since $P_k = H_k + c_k^{-1} I$, $H_k \in \partial_B F(x^k)$, we have that

$\|\hat{y}^k - z^k\| = \|x^k - z^k - (H_k + c_k^{-1} I)^{-1} F(x^k)\|$
$\le \|x^k - H_k^{-1} F(x^k) - x^{*}\| + \|z^k - x^{*}\| + \|(H_k^{-1} - (H_k + c_k^{-1} I)^{-1}) F(x^k)\|$
$= O(\|x^k - x^{*}\|^2) + O(c_k^{-1} \|x^k - x^{*}\|) + O(c_k^{-1} \|F(x^k)\|)$
$= O(c_k^{-1} \|F(x^k)\|)$,   (36)

where the second equality follows from the quadratic convergence [29] of the generalized Newton method applied to $F(x) = 0$, from (35), and from the classical fact of linear algebra about the inverse of a perturbed nonsingular matrix; the last equality follows from (34). By the Lipschitz-continuity of $F$, the triangle inequality, and (35), we also have

$\|F(x^k)\| \le M_2 \|x^k - x^{*}\| \le M_2 \left( \|z^k - x^k\| + \|z^k - x^{*}\| \right) \le M_2 \|z^k - x^k\| + O(c_k^{-1} \|x^k - x^{*}\|)$.

Using (34) in the last relation, it follows that $\|F(x^k)\| \le O(\|z^k - x^k\|)$. From (36), we then obtain that

$\|\hat{y}^k - z^k\| = O(c_k^{-1} \|x^k - z^k\|)$.

Denoting by $y^k$ the next Newton step for (33),

$y^k = \hat{y}^k - \hat{P}_k^{-1} F_k(\hat{y}^k)$,  $\hat{P}_k \in \partial_B F_k(\hat{y}^k)$,

we have

$\|y^k - z^k\| = O(c_k^{-1} \|\hat{y}^k - z^k\|)$,

and, using (36), we conclude that

$\|y^k - z^k\| = O(c_k^{-2} \|F(x^k)\|)$.   (37)


The left-hand side of the tolerance criterion (21) for the proximal subproblem (33) can be bounded as follows:

$\|c_k F(y^k) + y^k - x^k\| \le c_k \|F(y^k) - F(z^k)\| + \|y^k - z^k\| \le 2 M_2 c_k \|y^k - z^k\| = O(c_k^{-1} \|F(x^k)\|)$,   (38)

where the first inequality follows from the definition of $z^k$ and the triangle inequality; the second inequality follows from the Lipschitz-continuity of $F$; and the last equality follows from (37). On the other hand, we have that

$\|y^k - x^k\| \ge \|x^k - x^{*}\| - \|y^k - z^k\| - \|z^k - x^{*}\| \ge \|x^k - x^{*}\| - O(c_k^{-2} \|F(x^k)\|) - O(c_k^{-1} \|x^k - x^{*}\|) \ge M_3 \|F(x^k)\|$,   (39)

where the second inequality follows from (37) and (35); and the last inequality follows, with some $M_3 > 0$, by the Lipschitz-continuity of $F$.

Using (39) and (38), we conclude that the error tolerance condition (21) is guaranteed to be satisfied for $y^k$, the second Newton iterate for solving (33), if we choose the parameters $\sigma_k$ and $c_k$ so that

$c_k^{-1} = o(\sigma_k)$.

With this choice, in the framework of Algorithm 2.1, $y^k$ is an acceptable approximate solution of the proximal point subproblem. By Remark 9, the convergence rate of $\{x^k\}$ generated by Algorithm 2.1 is superlinear.

In fact, comparing (36) and (38), we can see that for the first Newton point the left- and right-hand sides of the tolerance criterion are already of the same order. Since these are merely rough upper and lower estimates, as a practical matter one might expect the error tolerance rule to be satisfied already after the first Newton step.
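Summarizing the derivation above in code, the sketch below (ours) performs the two generalized Newton steps on the subproblem (33) and then applies the acceptance test (21) with $\varepsilon_k = 0$; the map F and a routine returning some element of $\partial_B F$ are assumptions supplied by the caller.

```python
import numpy as np

def two_newton_steps_for_prox_subproblem(F, B_element, x_k, c_k, sigma_k):
    """Two generalized Newton steps for F_k(x) = F(x) + (x - x_k)/c_k, followed by
    the test (21) (with eps = 0) that decides whether the second iterate y^k is an
    acceptable approximate solution of the k-th proximal subproblem."""
    n = len(x_k)
    y = x_k.copy()
    for _ in range(2):
        P = B_element(y) + (1.0 / c_k) * np.eye(n)        # element of the B-subdifferential of F_k at y
        F_k_y = F(y) + (y - x_k) / c_k
        y = y - np.linalg.solve(P, F_k_y)                  # one generalized Newton step
    v = F(y)                                               # v in T(y) = F(y), with eps = 0
    lhs = np.linalg.norm(c_k * v + y - x_k) ** 2
    rhs = sigma_k ** 2 * (np.linalg.norm(c_k * v) ** 2 + np.linalg.norm(y - x_k) ** 2)
    return y, v, lhs <= rhs
```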

REFERENCES

1. H.H. Bauschke and P.L. Combettes. A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces. Mathematics of Operations Research, 26 (2001), 248–264.
2. R.E. Bruck. An iterative solution of a variational inequality for certain monotone operators in a Hilbert space. Bulletin of the American Mathematical Society, 81 (1975), 890–892.
3. R.S. Burachik, A.N. Iusem, and B.F. Svaiter. Enlargement of monotone operators with applications to variational inequalities. Tech. Rep. 02/96 (1996), Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, Brazil.


4. R.S. Burachik, A.N. Iusem, and B.F. Svaiter. Enlargement of monotone operators with applications to variational inequalities. Set-Valued Analysis, 5 (1997), 159–180.
5. R.S. Burachik, C.A. Sagastizábal, and B.F. Svaiter. Bundle methods for maximal monotone operators. In R. Tichatschke and M. Théra, editors, Ill-posed Variational Problems and Regularization Techniques, Lecture Notes in Economics and Mathematical Systems, No. 477, pages 49–64. Springer-Verlag, 1999.
6. R.S. Burachik, C.A. Sagastizábal, and B.F. Svaiter. ε-Enlargements of maximal monotone operators: Theory and applications. In M. Fukushima and L. Qi, editors, Reformulation – Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, pages 25–44. Kluwer Academic Publishers, 1999.
7. R.S. Burachik and B.F. Svaiter. ε-Enlargements of maximal monotone operators in Banach spaces. Set-Valued Analysis, 7 (1999), 117–132.
8. J.V. Burke and M. Qian. A variable metric proximal point algorithm for monotone operators. SIAM Journal on Control and Optimization, 37 (1998), 353–375.
9. F.H. Clarke. Optimization and Nonsmooth Analysis. John Wiley & Sons, New York, 1983.
10. R. Cominetti. Coupling the proximal point algorithm with approximation methods. Journal of Optimization Theory and Applications, 95 (1997), 581–600.
11. R.S. Dembo, S.C. Eisenstat, and T. Steihaug. Inexact Newton methods. SIAM Journal on Numerical Analysis, 19 (1982), 400–408.
12. J. Eckstein. Approximate iterations in Bregman-function-based proximal algorithms. Mathematical Programming, 83 (1998), 113–123.
13. M.C. Ferris and C. Kanzow. Complementarity and related problems. In P.M. Pardalos and M.G.C. Resende, editors, Handbook of Applied Optimization. Oxford University Press, 2000.
14. A. Fischer. Solution of monotone complementarity problems with locally Lipschitzian functions. Mathematical Programming, 76 (1997), 513–532.
15. M. Fukushima. Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems. Mathematical Programming, 53 (1992), 99–110.
16. M. Fukushima and L. Qi (editors). Reformulation – Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods. Kluwer Academic Publishers, 1999.
17. D. Gabay. Applications of the method of multipliers to variational inequalities. In M. Fortin and R. Glowinski, editors, Augmented Lagrangian Methods: Applications to the Solution of Boundary-Value Problems, pages 299–331. North-Holland, 1983.
18. O. Güler. On the convergence of the proximal point algorithm for convex minimization. SIAM Journal on Control and Optimization, 29 (1991), 403–419.
19. C. Kanzow and M. Fukushima. Solving box constrained variational inequality problems by using the natural residual with D-gap function globalization. Operations Research Letters, 23 (1998), 45–51.


20. B. Lemaire. The proximal algorithm. In J.-P. Penot, editor, New Methods of Optimization and Their Industrial Use, International Series of Numerical Mathematics 87, pages 73–87. Birkhäuser, Basel, 1989.
21. T. De Luca, F. Facchinei, and C. Kanzow. A theoretical and numerical comparison of some semismooth algorithms for complementarity problems. Computational Optimization and Applications, 16 (2000), 173–205.
22. R. Mifflin. Semismooth and semiconvex functions in constrained optimization. SIAM Journal on Control and Optimization, 15 (1977), 957–972.
23. Z. Opial. Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bulletin of the American Mathematical Society, 73 (1967), 591–597.
24. J.-S. Pang. Inexact Newton methods for the nonlinear complementarity problem. Mathematical Programming, 36 (1986), 54–71.
25. J.-S. Pang. Complementarity problems. In R. Horst and P. Pardalos, editors, Handbook of Global Optimization, pages 271–338. Kluwer Academic Publishers, Boston, Massachusetts, 1995.
26. J.-S. Pang and S.A. Gabriel. NE/SQP: A robust algorithm for the nonlinear complementarity problem. Mathematical Programming, 60 (1993), 295–337.
27. J.-S. Pang and L. Qi. Nonsmooth equations: Motivation and algorithms. SIAM Journal on Optimization, 3 (1993), 443–465.
28. L. Qi. Convergence analysis of some algorithms for solving nonsmooth equations. Mathematics of Operations Research, 18 (1993), 227–244.
29. L. Qi and J. Sun. A nonsmooth version of Newton's method. Mathematical Programming, 58 (1993), 353–367.
30. S.M. Robinson. Local structure of feasible sets in nonlinear programming, Part III: Stability and sensitivity. Mathematical Programming Study, 30 (1987), 45–66.
31. R.T. Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14 (1976), 877–898.
32. C.A. Sagastizábal and M.V. Solodov. On the relation between bundle methods for maximal monotone inclusions and hybrid proximal point algorithms. In D. Butnariu, Y. Censor, and S. Reich, editors, Inherently Parallel Algorithms in Feasibility and Optimization and their Applications, volume 8 of Studies in Computational Mathematics, pages 441–455. Elsevier Science B.V., 2001.
33. M. Sibony. Méthodes itératives pour les équations et inéquations aux dérivées partielles non linéaires de type monotone. Calcolo, 7 (1970), 65–183.
34. M.V. Solodov and B.F. Svaiter. A new projection method for variational inequality problems. SIAM Journal on Control and Optimization, 37 (1999), 765–776.
35. M.V. Solodov and B.F. Svaiter. A truly globally convergent Newton-type method for the monotone nonlinear complementarity problem. SIAM Journal on Optimization, 10 (2000), 605–625.
36. M.V. Solodov and B.F. Svaiter. A globally convergent inexact Newton method for systems of monotone equations. In M. Fukushima and L. Qi, editors, Reformulation – Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, pages 355–369. Kluwer Academic Publishers, 1999.


37. M.V. Solodov and B.F. Svaiter. A hybrid approximate extragradient–proximal point algorithm using the enlargement of a maximal monotone operator. Set-Valued Analysis, 7 (1999), 323–345.
38. M.V. Solodov and B.F. Svaiter. A hybrid projection–proximal point algorithm. Journal of Convex Analysis, 6 (1999), 59–70.
39. M.V. Solodov and B.F. Svaiter. Error bounds for proximal point subproblems and associated inexact proximal point algorithms. Mathematical Programming, 88 (2000), 371–389.
40. M.V. Solodov and B.F. Svaiter. Forcing strong convergence of proximal point iterations in a Hilbert space. Mathematical Programming, 87 (2000), 189–202.
41. M.V. Solodov and B.F. Svaiter. An inexact hybrid generalized proximal point algorithm and some new results on the theory of Bregman functions. Mathematics of Operations Research, 25 (2000), 214–230.
42. P. Tossings. The perturbed proximal point algorithm and some of its applications. Applied Mathematics and Optimization, 29 (1994), 125–159.
43. P. Tseng. Further applications of a splitting algorithm to decomposition in variational inequalities and convex programming. Mathematical Programming, 48 (1990), 249–263.
44. P. Tseng. A modified forward-backward splitting method for maximal monotone mappings. SIAM Journal on Control and Optimization, 38 (2000), 431–446.
45. Y.-B. Zhao and D. Li. Monotonicity of fixed point and normal mappings associated with variational inequality and its application. SIAM Journal on Optimization, 11 (2001), 962–973.
46. D.L. Zhu and P. Marcotte. Co-coercivity and its role in the convergence of iterative schemes for solving variational inequalities. SIAM Journal on Optimization, 6 (1996), 714–726.
