
Studies on Methods for

Mathematical Programs with

Equilibrium Constraints

Gui-Hua LIN

Submitted in partial fulfillment of the requirement for the degree of

DOCTOR OF INFORMATICS (Applied Mathematics and Physics)

Kyoto University, Kyoto 606-8501, Japan

December, 2003


Preface

A mathematical program with equilibrium constraints, abbreviated as MPEC, is a constrained optimization problem in which the essential constraints are defined by some parametric variational inequalities or a parametric complementarity system. This problem can be thought of as a generalization of the so-called bilevel programming problem, which is a mathematical program with optimization constraints. MPEC is also closely related to the well-known Stackelberg game. As a result, MPEC plays a very important role in many fields such as engineering design, economic equilibrium, multilevel games, and mathematical programming theory itself, and it has been receiving much attention in the optimization community in recent years.

On the other hand, MPEC is very difficult to deal with: from the geometric point of view, its feasible region is in general neither convex nor connected, and in theory, its constraints fail to satisfy standard constraint qualifications such as the linear independence constraint qualification or the Mangasarian-Fromovitz constraint qualification at any feasible point. Therefore, the well-developed nonlinear programming theory cannot be applied to MPECs directly. Several approaches have been proposed in the literature on MPECs, such as the sequential quadratic programming approach, the implicit programming approach, the penalty function approach, the interior point approach, and the reformulation approach.

Our main purpose is to develop more efficient methods for solving MPECs. Moreover, we notice that, in many practical problems, some elements may involve uncertain data, and hence we also pay great attention to stochastic mathematical programs with equilibrium constraints (SMPECs). The thesis may thus be divided into two parts:

The first part consists of Chapters 2 to 5, in which we focus on the study of MPECs, particularly a special and important subclass: the mathematical programs with complementarity constraints (MPCCs). We first give some modified exact penalty results for nonlinear programs and MPECs in Chapter 2 and then, in Chapters 3 and 4, we propose two relaxation methods for MPCCs, one of which uses an expansive simplex instead of the nonnegative orthant involved in the complementarity constraints, while the other suggests a scheme with a bi-hyperbola approximation strategy. Convergence results are given for the proposed methods. In Chapter 5, we consider a hybrid approach with active index set identification for MPCCs. It is shown that, unlike most existing methods, the hybrid approach may solve an MPCC in a finite number of iterations.

The second part includes Chapters 6 to 8, in which we deal with SMPECs. We discuss two kinds of SMPECs: one is the lower-level wait-and-see model, in which the upper-level decision is made at once and a lower-level decision may be made after a random event is observed; the other is the here-and-now model, which requires us to make all decisions before a random event is observed. It is shown that many practical decision problems can be formulated as SMPECs. Several methods are proposed in Chapters 6, 7, and 8, respectively. In particular, in Chapter 6, we suggest a smoothing implicit programming approach for both the lower-level wait-and-see decision model and the here-and-now decision model. Subsequently, we consider a special here-and-now problem in Chapters 7 and 8. We first give some equivalent reformulations of the problem and then, based on these reformulations, we propose some penalty methods and a regularization method in Chapters 7 and 8, respectively.

The importance of MPECs and SMPECs has been recognized in the optimization community and, in particular, the study of SMPECs is still in its initial stages. We hope that the results obtained in this thesis will help to advance the study in this field.

Gui-Hua Lin

December, 2003


Contents

Preface

1 Introduction
  1.1 Background on MPECs
  1.2 Background on SMPECs
  1.3 Main Contributions
  1.4 Preliminaries
    1.4.1 Notations
    1.4.2 Terminologies

2 Exact Penalty Results for Nonlinear Programs and MPECs
  2.1 Preliminaries
  2.2 Penalty Results for Nonlinear Programs
  2.3 Penalty Results for MPECs
  2.4 Some Properties Related to Strong Convexity

3 New Relaxation Method for MPECs
  3.1 Relaxed Problem
  3.2 Convergence Analysis
  3.3 Computational Results
  3.4 Concluding Remarks

4 A Modified Scheme for MPECs
  4.1 Some Results on Constraint Qualifications
  4.2 Convergence Analysis
  4.3 Concluding Remarks

5 Hybrid Approach with Active Set Identification for MPECs
  5.1 Preliminaries
  5.2 A Hybrid Algorithm for MPCC
  5.3 Modified Hybrid Method with Index Addition Strategy
  5.4 Modified Hybrid Method with Index Subtraction Strategy
  5.5 Further Discussions
    5.5.1 Remarks on the assumptions
    5.5.2 Stopping criteria
    5.5.3 Comparison of the algorithms
    5.5.4 Extensions
  5.6 Computational Results
    5.6.1 Computational results for Algorithm H
    5.6.2 Computational results for Algorithms HIA and HIS

6 Smoothing Implicit Programming Approach for SMPECs
  6.1 Introduction
  6.2 Smoothing Implicit Programming Method for Lower-Level Wait-And-See Problems
    6.2.1 Preliminaries
    6.2.2 Method
    6.2.3 Limiting behavior of local optimal solutions
    6.2.4 Limiting behavior of stationary points
  6.3 Smoothing Implicit Programming Method for Here-And-Now Problems
  6.4 Conclusions

7 Some Reformulations and Algorithms for SMPECs
  7.1 Examples
  7.2 Reformulations
    7.2.1 Properties of the function Q
    7.2.2 Continuous case
    7.2.3 Discrete case
  7.3 Further Discussions on Discrete Problems
  7.4 Smoothed Penalty Methods for Discrete Problems
    7.4.1 Smoothed penalty method (I)
    7.4.2 Smoothed penalty method (II)
    7.4.3 Numerical results
  7.5 Conclusions

8 Regularization Method for SMPECs
  8.1 Preliminaries and Regularization Method
  8.2 Convergence Analysis
  8.3 Conclusions

Bibliography

Acknowledgements


Chapter 1

Introduction

A mathematical program with equilibrium constraints (MPEC) is a constrained optimization problem whose constraints include some parametric variational inequalities or a parametric complementarity system. This problem plays a very important role in many fields such as engineering design, economic equilibrium, multilevel games, and mathematical programming theory itself, and it has been receiving much attention in the optimization community in recent years. In this chapter, we give a brief overview of MPECs. We also introduce some background on stochastic mathematical programs with equilibrium constraints (SMPECs).

1.1 Background on MPECs

MPEC is generally an optimization problem with two types of variables, an upper-level variable $x \in \Re^n$ and a lower-level variable $y \in \Re^m$:

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) \\
\mbox{subject to} & (x, y) \in Z, \\
& y \mbox{ solves } \mathrm{VI}(F(x, \cdot), C(x)).
\end{array} \tag{1.1}
\]

Here, $Z$ is a subset of $\Re^{n+m}$; $f : \Re^{n+m} \to \Re$, $F : \Re^{n+m} \to \Re^m$, and $C : \Re^n \to 2^{\Re^m}$ are mappings; and $\mathrm{VI}(F(x, \cdot), C(x))$ denotes the variational inequality problem defined by the pair $(F(x, \cdot), C(x))$; that is, $y$ solves $\mathrm{VI}(F(x, \cdot), C(x))$ if and only if $y \in C(x)$ and

\[
(v - y)^T F(x, y) \ge 0, \quad \forall v \in C(x).
\]

It is well-known [36] that, for a given variational inequality problem $\mathrm{VI}(G, Y)$, if the function $G$ is the gradient mapping of a differentiable function $g : \Re^m \to \Re$ and the set $Y$ is convex in $\Re^m$, then $\mathrm{VI}(G, Y)$ is just a restatement of the first-order necessary conditions of optimality for the optimization problem

\[
\begin{array}{ll}
\mbox{minimize} & g(y) \\
\mbox{subject to} & y \in Y.
\end{array}
\]

Therefore, MPEC (1.1) can be regarded as a generalization of the so-called bilevel programming problem, which is a mathematical program with optimization constraints. Moreover, MPEC is also closely related to the well-known Stackelberg game; see [62, 66].

When $C(x) \equiv \Re^m_+$ for each $x$ in problem (1.1), the parametric variational inequality constraints reduce to a parametric complementarity system and then problem (1.1) is equivalent to the following mathematical program with complementarity constraints (MPCC):

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) \\
\mbox{subject to} & (x, y) \in Z, \\
& y \ge 0, \ F(x, y) \ge 0, \\
& y^T F(x, y) = 0.
\end{array} \tag{1.2}
\]
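To see this equivalence, which is used above without proof, note that $y$ solves $\mathrm{VI}(F(x, \cdot), \Re^m_+)$ if and only if $y \ge 0$ and $(v - y)^T F(x, y) \ge 0$ for all $v \ge 0$. Choosing $v = y + e_i$ yields $F_i(x, y) \ge 0$ for each $i$, and choosing $v = 0$ yields $y^T F(x, y) \le 0$, whence

\[
y \ge 0, \quad F(x, y) \ge 0, \quad y^T F(x, y) = 0,
\]

since $y \ge 0$ and $F(x, y) \ge 0$ already force $y^T F(x, y) \ge 0$. Conversely, under these conditions, $(v - y)^T F(x, y) = v^T F(x, y) \ge 0$ for every $v \ge 0$.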

On the other hand, if the set-valued function $C$ in problem (1.1) is defined by

\[
C(x) := \{ y \in \Re^m \mid c(x, y) \le 0 \},
\]

where $c : \Re^{n+m} \to \Re^s$ is continuously differentiable, then, under some suitable conditions, the variational inequality problem $\mathrm{VI}(F(x, \cdot), C(x))$ has an equivalent Karush-Kuhn-Tucker representation [68]:

\[
\begin{array}{l}
F(x, y) + \nabla_y c(x, y) \lambda = 0, \\
\lambda \ge 0, \quad c(x, y) \le 0, \quad \lambda^T c(x, y) = 0,
\end{array}
\]

where $\lambda$ is the Lagrange multiplier vector. Hence, problem (1.1) can be reformulated as a program like (1.2) under some conditions; see the monograph [62] for details. Problem (1.2) therefore constitutes an important subclass of MPECs. In this thesis, we particularly concentrate on this kind of MPECs.

As mentioned above, MPEC plays a very important role in many fields and it has been receiving more and more attention in the optimization community. For more details, we refer to the monographs [62, 66] and the references at the end of the thesis.

On the other hand, MPEC is very difficult to deal with: from the geometric point of view, its feasible region is in general neither convex nor connected, and in theory, its constraints fail to satisfy standard constraint qualifications such as the linear independence constraint qualification or the Mangasarian-Fromovitz constraint qualification at any feasible point [17]. Therefore, the well-developed nonlinear programming theory may not be applied to the MPEC class directly. At present, a natural and popular approach is to find suitable approximations of an MPEC so that the MPEC can be solved by solving a sequence of subproblems. Along this way, many methods have been proposed in the literature.
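As a small illustration of the geometric difficulty, not taken from the thesis itself, consider the simplest complementarity constraints with $n = m = 1$, $Z = \Re^2$, and $F(x, y) := x$:

\[
y \ge 0, \quad x \ge 0, \quad xy = 0.
\]

The feasible region is the union of the two nonnegative half-axes, $\{(x, 0) \mid x \ge 0\} \cup \{(0, y) \mid y \ge 0\}$, which is nonconvex; intersecting it with a set $Z$ that excludes a neighborhood of the origin even makes it disconnected.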

One family of methods employs smoothing or relaxation techniques. In particular, Facchinei et al. [24] and Fukushima and Pang [31] make use of the Fischer-Burmeister function to generate smooth approximations of MPECs; subsequently, a similar scheme was presented by Scholtes [76].

Another family of methods uses a penalty technique. For example, Huang et al. [41] utilized the Fischer-Burmeister function to penalize the whole complementarity system, whereas Hu and Ralph [38] suggested a method that penalizes the complementarity constraints only. Moreover, the works [39, 40, 63, 77] also belong to this family.

Other methods in the literature on MPECs include the sequential quadratic programming methods [29, 44, 62], implicit programming methods [12, 62], interior point methods [60, 62], and an implementable active-set method [33]. In addition, nonsmooth methods for MPECs can be found in [66]. One purpose of the thesis is to develop more efficient methods for solving MPECs.

1.2 Background on SMPECs

A stochastic mathematical program with equilibrium constraints (SMPEC) can be formulated as follows:

\[
\begin{array}{ll}
\mbox{minimize} & E_\omega[f(x, y, \omega)] \\
\mbox{subject to} & (x, y) \in Z, \ \omega \in \Omega, \\
& y \mbox{ solves } \mathrm{VI}(F(x, \cdot, \omega), C(x, \omega)),
\end{array} \tag{1.3}
\]

where $Z$ is a subset of $\Re^{n+m}$, $\Omega$ denotes the underlying sample space, $E_\omega$ means expectation with respect to the random variable $\omega \in \Omega$, and $f : \Re^{n+m} \times \Omega \to \Re$, $F : \Re^{n+m} \times \Omega \to \Re^m$, $C : \Re^n \times \Omega \to 2^{\Re^m}$ are mappings. Obviously, if $\Omega$ is a singleton, then problem (1.3) reduces to an ordinary MPEC, and so the SMPEC (1.3) can be thought of as a generalization of the MPEC (1.1). Since an MPEC is already very hard to handle, SMPECs are even more difficult to deal with, because the number of random events is usually very large in practice.


The SMPEC (1.3) is also closely related to the so-called two-stage stochastic program with recourse [72]:

\[
\begin{array}{ll}
\mbox{minimize} & p(x) + E_\omega[Q(x, \omega)] \\
\mbox{subject to} & x \in X,
\end{array} \tag{1.4}
\]

where $p : \Re^n \to \Re$, $X \subseteq \Re^n$, and $Q : \Re^n \times \Omega \to \Re$ is defined by

\[
Q(x, \omega) := \inf_{y \in Y(x, \omega)} g(y, \omega)
\]

with $Y : \Re^n \times \Omega \to 2^{\Re^m}$ and $g : \Re^m \times \Omega \to \Re$. Many applications of problem (1.4) can be found in practice, especially in financial planning. For further details, see [3, 11, 15, 16, 79].
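When $\Omega$ is finite or sampled, the objective of (1.4) can be evaluated by averaging the recourse values over the scenarios. The following Python sketch is only an illustration under an assumed toy recourse problem $Q(x, \omega) = \min_{y \ge 0} (y - \omega x)^2$, whose closed-form solution is $y^* = \max(0, \omega x)$; the functions and data are hypothetical, not part of the thesis.

    import random

    def recourse_value(x, omega):
        # Toy second-stage problem: Q(x, omega) = min_{y >= 0} (y - omega*x)^2.
        # Its minimizer is y* = max(0, omega*x), so Q = 0 when omega*x >= 0
        # and Q = (omega*x)^2 otherwise.
        y_star = max(0.0, omega * x)
        return (y_star - omega * x) ** 2

    def two_stage_objective(x, scenarios, p=lambda x: x * x):
        # Sample-average approximation of p(x) + E_omega[Q(x, omega)].
        expected_q = sum(recourse_value(x, w) for w in scenarios) / len(scenarios)
        return p(x) + expected_q

    random.seed(0)
    scenarios = [random.gauss(0.0, 1.0) for _ in range(1000)]
    print(two_stage_objective(0.5, scenarios))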

SMPEC was first discussed in [72], the main results of which concern the existence of solutions, the convexity and directional differentiability of an implicit objective function, and links between SMPECs and bilevel models. In fact, no effective algorithms have been suggested for solving SMPECs so far. In the second half of the thesis, we study SMPECs systematically. We discuss two kinds of SMPECs: one is the lower-level wait-and-see model, in which the upper-level decision is made at once and a lower-level decision may be made after a random event is observed; the other is the here-and-now model, which requires us to make all decisions before a random event is observed. See Chapter 6 for details.

1.3 Main Contributions

The purpose of the thesis is to develop efficient methods for solving MPECs and SMPECs. We may divide the thesis into two parts: the first part includes Chapters 2 to 5, in which we deal with MPECs, and the second part consists of Chapters 6 to 8, in which we study SMPECs. Our main results can be summarized as follows.

In Chapter 2, we give some modified exact penalty results for nonlinear programs and MPECs. In particular, instead of the abstract subanalytic property and error bounds employed in [62], some of our results use a kind of convexity that is also discussed in detail.

In Chapter 3, we propose a new relaxation method for MPCCs. Our method replaces the complementarity constraints by a variational inequality defined on an expansive simplex. It is well known that such a variational inequality problem can be represented by a finite number of inequalities. We remove some of these inequalities and obtain a standard smooth nonlinear program. We investigate the limiting behavior of the relaxed problems and obtain some encouraging convergence results. In particular, some conditions assumed in the convergence theory are new and can be verified easily in practice.

In Chapter 4, we suggest another relaxation method for MPCCs, based on a bi-hyperbola approximation. This method possesses properties similar to those of the regularization method [76], while the subproblems in the new method have fewer constraints.

In Chapter 5, we develop some methods that enable us to compute a solution, or a point with some kind of stationarity, by solving a finite number of nonlinear programs. To this end, we apply an active set identification technique to some existing methods and present some hybrid algorithms. We show that, under suitable assumptions, the algorithms indeed possess a finite termination property, unlike most existing methods, which require solving an infinite sequence of nonlinear programs. Further discussions and extensions are also included.

We study SMPECs in the rest of the thesis.

In Chapter 6, we first introduce the problems and show that many practical decision problems can be formulated as SMPECs. We discuss both the lower-level wait-and-see decision model and the here-and-now decision model. For the lower-level wait-and-see model, we propose a smoothing implicit programming method and establish a comprehensive convergence theory. For the here-and-now decision problem, we apply a penalty technique and suggest a similar method. We show that the two methods possess similar convergence properties.

In Chapters 7 and 8, we consider a special here-and-now problem. We show that the stochastic linear complementarity problem may be formulated as this kind of SMPEC. We give some equivalent reformulations of the problem and then propose some penalty methods and a regularization method in Chapters 7 and 8, respectively.

1.4 Preliminaries

We will use the following notation and terminology throughout this thesis.

1.4.1 Notations

Throughout, all vectors are regarded as column vectors and $T$ denotes the transpose operation. For $u \in \Re^s$, let $\|u\|$ and $\|u\|_1$ denote the norms defined by

\[
\|u\| := \Big( \sum_{i=1}^s u_i^2 \Big)^{1/2} \quad \mbox{and} \quad \|u\|_1 := \sum_{i=1}^s |u_i|,
\]

respectively. For a nonempty closed set $V \subseteq \Re^s$, we denote

\[
\mathrm{dist}(u, V) := \min_{v \in V} \|u - v\|
\]

and

\[
\Pi_V(u) := \{ v \in V \mid \|u - v\| = \mathrm{dist}(u, V) \}.
\]

Moreover, $B(u, \delta)$ stands for the closed ball $\{ v \in \Re^s \mid \|u - v\| \le \delta \}$ and $\Re^s_+$ denotes the nonnegative orthant in $\Re^s$. For a real scalar $a$, we denote $(a)_+ := \max\{0, a\}$. For two vectors $u$ and $v$ in $\Re^s$, $u \perp v$ means $u^T v = 0$, and both $\min(u, v)$ and $\max(u, v)$ are understood to be taken componentwise, i.e.,

\[
\min(u, v) := (\min\{u_1, v_1\}, \cdots, \min\{u_s, v_s\})^T,
\]
\[
\max(u, v) := (\max\{u_1, v_1\}, \cdots, \max\{u_s, v_s\})^T.
\]

For a given function $G : \Re^s \to \Re^{s'}$ and a vector $u \in \Re^s$, $\nabla G(u)$ is the transposed Jacobian of $G$ at $u$, whereas for a real-valued function $g : \Re^s \to \Re$, $\nabla g(u)$ denotes the gradient vector of $g$ at $u$. Moreover,

\[
I_G(u) := \{ i \mid G_i(u) = 0 \}
\]

stands for the active index set of $G$ at $u$. In addition, $e_i$ denotes the unit vector whose $i$th element is 1; $I$ and $O$ denote the identity matrix and the zero matrix of suitable dimension, respectively.
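For readers wishing to experiment numerically, the pieces of notation above translate directly into array operations. The following Python sketch is merely illustrative and not part of the thesis; it computes $\mathrm{dist}(u, V)$ and $\Pi_V(u)$ for the special case where $V$ is a finite set of points, together with $(a)_+$ and the componentwise min:

    import numpy as np

    def dist_and_projection(u, V):
        # V is a finite set of points, given as the rows of an array.
        # dist(u, V) is the smallest distance ||u - v|| over v in V;
        # the projection set collects all minimizers.
        norms = np.linalg.norm(V - u, axis=1)
        d = norms.min()
        projection = V[np.isclose(norms, d)]
        return d, projection

    u = np.array([1.0, 2.0])
    V = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0]])
    print(dist_and_projection(u, V))

    print(np.maximum(0.0, -3.5))        # (a)_+ = max{0, a}
    print(np.minimum([1, 4], [2, 3]))   # componentwise min(u, v)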

1.4.2 Terminologies

We first recall some basic concepts for the standard nonlinear programming problem

\[
\begin{array}{ll}
\mbox{minimize} & f(z) \\
\mbox{subject to} & c_i(z) \le 0, \quad i = 1, \cdots, l, \\
& c_i(z) = 0, \quad i = l + 1, \cdots, s,
\end{array} \tag{1.5}
\]

where $f : \Re^n \to \Re$ and $c : \Re^n \to \Re^s$ are twice continuously differentiable.

Definition 1.1 The linear independence constraint qualification (LICQ) is said to hold at a feasible point $z$ of problem (1.5) if the set of vectors $\{ \nabla c_i(z) \mid i \in I_c(z) \}$ is linearly independent.


Definition 1.2 We say that $z$ is a stationary point of problem (1.5) if it is feasible to (1.5) and there exists a Lagrange multiplier vector $\lambda \in \Re^s$ such that

\[
\begin{array}{l}
\nabla f(z) + \nabla c(z) \lambda = 0, \\
\lambda_i \ge 0, \quad \lambda_i c_i(z) = 0, \quad i = 1, \cdots, l.
\end{array}
\]
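In computation, stationarity is typically checked by measuring how far a candidate pair $(z, \lambda)$ is from satisfying these conditions. The Python sketch below is an illustration only (the problem data are hypothetical); it evaluates such a residual for a problem with one inequality constraint:

    import numpy as np

    def stationarity_residual(grad_f, grad_c, c_val, lam):
        # Residual of: grad f(z) + grad c(z) * lambda = 0,
        # lambda_i >= 0 and lambda_i * c_i(z) = 0 for inequality constraints.
        grad_lagrangian = grad_f + grad_c @ lam
        return (np.linalg.norm(grad_lagrangian)
                + np.linalg.norm(np.minimum(lam, 0.0))
                + np.abs(lam * c_val).sum())

    # Example: minimize z1^2 + z2^2 subject to 1 - z1 <= 0.
    z = np.array([1.0, 0.0])
    lam = np.array([2.0])
    grad_f = 2.0 * z                        # gradient of the objective
    grad_c = np.array([[-1.0], [0.0]])      # columns are gradients of c_i
    c_val = np.array([1.0 - z[0]])
    print(stationarity_residual(grad_f, grad_c, c_val, lam))  # 0.0 at a KKT pair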

Definition 1.3 Let $z$ be a stationary point of problem (1.5) and $\lambda$ be a Lagrange multiplier vector corresponding to $z$. We say the weak second-order necessary condition (WSONC) holds at $z$ if we have

\[
d^T \Big( \nabla^2 f(z) + \sum_{i=1}^s \lambda_i \nabla^2 c_i(z) \Big) d \ge 0
\]

for any $d \in T(z) := \{ d \in \Re^n \mid d^T \nabla c_i(z) = 0, \ \forall i \in I_c(z) \}$.

We next consider the mathematical program with complementarity constraints (MPCC)

\[
\begin{array}{ll}
\mbox{minimize} & f(z) \\
\mbox{subject to} & g(z) \le 0, \ h(z) = 0, \\
& G(z) \ge 0, \ H(z) \ge 0, \\
& G(z)^T H(z) = 0,
\end{array} \tag{1.6}
\]

where $f : \Re^n \to \Re$, $g : \Re^n \to \Re^p$, $h : \Re^n \to \Re^q$, and $G, H : \Re^n \to \Re^m$ are all twice continuously differentiable functions. Let $\mathcal{F}$ denote the feasible region of the above problem.

Definition 1.4 The MPEC-linear independence constraint qualification (MPEC-LICQ) is said to hold at $z \in \mathcal{F}$ if the set of vectors

\[
\{ \nabla g_l(z), \nabla h_r(z), \nabla G_i(z), \nabla H_j(z) \mid l \in I_g(z), \ r = 1, \cdots, q, \ i \in I_G(z), \ j \in I_H(z) \}
\]

is linearly independent.

This condition is not particularly stringent [78] and has often been assumed in the literature on MPCCs [31, 38, 41, 54, 76]. Note that this definition differs from the standard definition of LICQ in nonlinear programming theory, which would require the gradient of the function $G(z)^T H(z)$ to be linearly independent of the above vectors; this can never happen, since at any feasible point $\nabla (G(z)^T H(z)) = \sum_{i \in I_G(z)} H_i(z) \nabla G_i(z) + \sum_{i \in I_H(z)} G_i(z) \nabla H_i(z)$ is itself a linear combination of those vectors.

In the study of MPCCs, several kinds of stationarity have been defined for problem (1.6) [75].

Definition 1.5 We say that $z \in \mathcal{F}$ is a Bouligand or B-stationary point of problem (1.6) if it satisfies

\[
d^T \nabla f(z) \ge 0, \quad \forall d \in \mathcal{T}(z, \mathcal{F}),
\]

where

\[
\mathcal{T}(z, \mathcal{F}) := \{ d \in \Re^n \mid t_k(z^k - z) \to d, \ z^k \to z, \ z^k \in \mathcal{F}, \ t_k \ge 0, \ k = 1, 2, \cdots \}
\]

stands for the tangent cone of $\mathcal{F}$ at $z$.

Definition 1.6 (1) $z \in \mathcal{F}$ is called weakly stationary to problem (1.6) if there exist multiplier vectors $\lambda \in \Re^p$, $\mu \in \Re^q$, and $u, v \in \Re^m$ such that

\[
\begin{array}{rl}
\nabla f(z) + \nabla g(z) \lambda + \nabla h(z) \mu - \nabla G(z) u - \nabla H(z) v = 0, & (1.7) \\
\lambda \ge 0, \quad \lambda^T g(z) = 0, & (1.8) \\
u_i = 0, \quad i \notin I_G(z), & (1.9) \\
v_i = 0, \quad i \notin I_H(z). & (1.10)
\end{array}
\]

(2) $z \in \mathcal{F}$ is called a Clarke or C-stationary point of problem (1.6) if there exist multiplier vectors $\lambda \in \Re^p$, $\mu \in \Re^q$, and $u, v \in \Re^m$ such that (1.7)–(1.10) hold with

\[
u_i v_i \ge 0, \quad i \in I_G(z) \cap I_H(z),
\]

and we say $z$ is Mordukhovich or M-stationary to problem (1.6) if, furthermore, either $u_i > 0, v_i > 0$ or $u_i v_i = 0$ for all $i \in I_G(z) \cap I_H(z)$.

(3) $z \in \mathcal{F}$ is called a strongly or S-stationary point of problem (1.6) if there exist multiplier vectors $\lambda$, $\mu$, $u$, and $v$ such that (1.7)–(1.10) hold with

\[
u_i \ge 0, \quad v_i \ge 0, \quad i \in I_G(z) \cap I_H(z).
\]

It is well-known [75] that, if the MPEC-LICQ holds at $z$, B-stationarity is equivalent to S-stationarity.


Definition 1.7 A weakly stationary point $z \in \mathcal{F}$ of problem (1.6) is said to satisfy the upper level strict complementarity (ULSC) condition if there exist multiplier vectors $\lambda$, $\mu$, $u$, and $v$ satisfying (1.7)–(1.10) and

\[
u_i v_i \ne 0, \quad i \in I_G(z) \cap I_H(z).
\]

The ULSC condition is clearly weaker than the so-called lower level strict complementarity (LLSC) condition, which means $I_G(z) \cap I_H(z) = \emptyset$; in this case, $z$ is also said to be nondegenerate. Moreover, it is obvious that any M-stationary point of problem (1.6) satisfying the upper level strict complementarity condition must be a B-stationary point.
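For quick reference, the stationarity concepts in Definitions 1.5 and 1.6 line up as follows; this summary chain is implicit in the definitions rather than stated in the text:

\[
\mbox{S-stationary} \ \Longrightarrow \ \mbox{M-stationary} \ \Longrightarrow \ \mbox{C-stationary} \ \Longrightarrow \ \mbox{weakly stationary},
\]

since $u_i, v_i \ge 0$ forces either $u_i v_i = 0$ or $u_i > 0, v_i > 0$ (so S implies M), and both of these cases give $u_i v_i \ge 0$ (so M implies C). Under the MPEC-LICQ, the strongest concept coincides with B-stationarity [75].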


Chapter 2

Exact Penalty Results for Nonlinear Programs and MPECs

Because of the presence of variational inequality or complementarity constraints, an MPEC has the intrinsic feature that its feasible region is nonconvex or nonsmooth in general, and hence it is very difficult to handle. At present, a popular approach is to reformulate an MPEC as a standard nonlinear program. In this respect, penalty functions have provided a powerful approach, both as a theoretical tool and as a computational vehicle. Recently, based on the study of subanalytic optimization problems and with the help of the theory of error bounds, some exact penalty results for nonlinear programs and MPECs were proved by Luo, Pang, and Ralph [62]. In this chapter, we show that those results remain valid under some other mild conditions. In particular, instead of the subanalytic property and error bounds, which are somewhat abstract and difficult to verify in practice, some of our results use a property called strong convexity with order $\sigma$, which is a generalization of ordinary strong convexity [49] and will be discussed in detail.

2.1 Preliminaries

The following definitions will be used later on.

Definition 2.1 (See [62]) A set $X \subseteq \Re^n$ is said to be subanalytic if for any $u \in \Re^n$, there exist a neighborhood $U$ of $u$ and a bounded set $Z \subseteq \Re^{n+p}$ with some nonnegative integer $p$ such that

(a) for any $v \in \Re^{n+p}$, there exist a neighborhood $V$ of $v$ and a finite family $\{ Z_{ij} \mid 1 \le i \le l, \ 1 \le j \le q \}$ of sets $Z_{ij} = \{ z \in V \mid f_{ij}(z) = 0 \}$ or $\{ z \in V \mid f_{ij}(z) < 0 \}$, defined by some real analytic functions $f_{ij}$ on $V$, such that

\[
Z \cap V = \bigcup_{i=1}^l \bigcap_{j=1}^q Z_{ij};
\]

(b) $X \cap U = \{ x \in \Re^n \mid (x, y) \in Z \mbox{ for some } y \in \Re^p \}$.

A function $f : \Re^n \to \Re$ is said to be subanalytic if its graph is subanalytic.

The class of subanalytic functions is broader than the class of analytic functions and has been employed in many papers, although it is somewhat abstract. For more details, we refer the reader to [2, 20, 61, 62].

Definition 2.2 (See [64]) Let $0 < p \le 1$ be a constant and $G : \Re^n \to \Re^m$ be a mapping. We say that $G$ is Hölder continuous with order $p$ on $X \subseteq \Re^n$ if there exists a constant $L$ such that

\[
\|G(x) - G(y)\| \le L \|x - y\|^p, \quad \forall x, y \in X. \tag{2.1}
\]

This concept is a generalization of Lipschitz continuity, which is, by definition, Hölder continuity with order $p = 1$. Note that Hölder continuity makes sense only when $0 < p \le 1$. In fact, when $p > 1$, condition (2.1) implies that all directional derivatives of $G$ at any interior point are zero, so that $G$ is quite trivial. In addition, for $0 < p \ne p' \le 1$, the Hölder continuous functions with order $p$ and those with order $p'$ constitute different classes of functions. For example, the function

\[
G(x) := \|x\|^{\frac{1}{2}}, \quad \forall x \in \Re^n
\]

is Hölder continuous with order $p = \frac{1}{2}$ on $\Re^n$ and not Lipschitz continuous on $\Re^n$.
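A one-line verification of this example, supplied here for completeness, uses the elementary inequalities $|\sqrt{a} - \sqrt{b}| \le \sqrt{|a - b|}$ for $a, b \ge 0$ and $|\,\|x\| - \|y\|\,| \le \|x - y\|$:

\[
\big| \|x\|^{1/2} - \|y\|^{1/2} \big| \le \big| \|x\| - \|y\| \big|^{1/2} \le \|x - y\|^{1/2},
\]

so (2.1) holds with $L = 1$ and $p = \frac{1}{2}$, while near $x = 0$ the quotient $\|x\|^{1/2} / \|x\|$ is unbounded, which rules out Lipschitz continuity.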

Definition 2.3 A function $f : \Re^n \to \Re$ is said to be strongly convex with order $\sigma > 0$ on a convex set $X \subseteq \Re^n$ if there exists a constant $c > 0$ such that

\[
f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y) - c\, t (1 - t) \|x - y\|^\sigma \tag{2.2}
\]

for any $x, y \in X$ and any $t \in [0, 1]$.

When $\sigma = 2$, this property reduces to strong convexity in the ordinary sense [49]. But if $\sigma \ne 2$, the two notions are different. For example, we can see from the results given in Section 2.4 that the function $f(x) = x^4$ is strongly convex with order 4, but not strongly convex with order 2, on $\Re$.


2.2 Penalty Results for Nonlinear Programs

Consider the following nonlinear program:

\[
\begin{array}{ll}
\mbox{minimize} & \theta(x) \\
\mbox{subject to} & x \in X, \\
& g(x) \le 0, \ h(x) = 0,
\end{array} \tag{2.3}
\]

where $\theta : \Re^n \to \Re$, $g : \Re^n \to \Re^m$, and $h : \Re^n \to \Re^l$ are all continuous functions and $X \subseteq \Re^n$ is a nonempty closed set. Let $W$ denote the feasible region of (2.3) and let

\[
r(x) := \sum_{i=1}^m (g_i(x))_+ + \sum_{j=1}^l |h_j(x)|
\]

be the residual of the constraints in (2.3) at $x \in X$. The function $r$ may then be used as a penalty function for problem (2.3). The following theorem is shown in [62]:

Theorem 2.1 Let $X \subseteq \Re^n$ be a compact subanalytic set, $\theta$ be Lipschitz continuous on $X$, and $g_i, h_j$ be continuous subanalytic. Suppose problem (2.3) is feasible. Then there exist positive constants $\alpha^*$ and $\gamma^*$ such that for $\alpha \ge \alpha^*$ and $\gamma \ge \gamma^*$, problem (2.3) is equivalent to

\[
\begin{array}{ll}
\mbox{minimize} & \theta(x) + \alpha\, r(x)^{1/\gamma} \\
\mbox{subject to} & x \in X
\end{array} \tag{2.4}
\]

in the sense that $x^*$ solves (2.3) if and only if it solves (2.4).
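The passage from (2.3) to (2.4) is mechanical once $r$ is available and can be prototyped in a few lines. The sketch below is a toy illustration only (the problem data and the off-the-shelf solver call are assumptions of this note, not the thesis's method); it builds the penalized objective and hands it to a bound-constrained minimizer:

    import numpy as np
    from scipy.optimize import minimize

    def residual(x, g_list, h_list):
        # r(x) = sum_i (g_i(x))_+ + sum_j |h_j(x)|
        return (sum(max(0.0, g(x)) for g in g_list)
                + sum(abs(h(x)) for h in h_list))

    def penalized(theta, g_list, h_list, alpha, gamma):
        return lambda x: theta(x) + alpha * residual(x, g_list, h_list) ** (1.0 / gamma)

    # Toy instance: minimize x1 + x2 on X = [0, 2]^2 with g(x) = 1 - x1 <= 0.
    theta = lambda x: x[0] + x[1]
    obj = penalized(theta, [lambda x: 1.0 - x[0]], [], alpha=10.0, gamma=1.0)
    res = minimize(obj, x0=np.array([2.0, 2.0]), bounds=[(0.0, 2.0)] * 2)
    print(res.x)  # expected near (1, 0)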

Moreover, the following result will be used:

Theorem 2.2 (Łojasiewicz Inequality [2]) Let $\phi, \psi : S \to \Re$ be continuous subanalytic and $S \subseteq \Re^n$ be compact subanalytic. If $\phi^{-1}(0) \subseteq \psi^{-1}(0)$, then there exist constants $\rho > 0$ and $N^* > 0$ such that

\[
\rho\, |\psi(x)|^{N^*} \le |\phi(x)|, \quad \forall x \in S. \tag{2.5}
\]

Now we give our penalty results for problem (2.3). First of all, we define a new function. Suppose that problem (2.3) is feasible, i.e., $W \ne \emptyset$. Then we can take a vector $d \in W$ and define a function $\theta_d$ on $X$ by

\[
\theta_d(x) := \theta(x_d), \quad \forall x \in X,
\]

where $x_d := (1 - t_d) x + t_d d$ and $t_d$ is the smallest number $t \in [0, 1]$ such that $(1 - t) x + t d \in W$.


Theorem 2.3 Suppose that $X$, $g$, and $h$ are the same as in Theorem 2.1, problem (2.3) is feasible, and the function $\theta - \theta_d$ is continuous subanalytic for some $d \in W$. Then the conclusion of Theorem 2.1 remains valid.

Proof: Let $r|_X$ denote the restriction of $r$ to $X$. Noticing that both $r$ and $\theta - \theta_d$ are continuous subanalytic and

\[
(r|_X)^{-1}(0) = W \subseteq (\theta - \theta_d)^{-1}(0),
\]

we have from Theorem 2.2 that there exist constants $\rho > 0$ and $N^* > 0$ such that

\[
\rho\, |\theta(x) - \theta(x_d)|^{N^*} \le r(x), \quad \forall x \in X. \tag{2.6}
\]

Let

\[
\mu = \max\{1, \max_{x \in X} r(x)\}, \quad \alpha^* > (\mu / \rho)^{1/N^*}, \quad \gamma^* = N^*,
\]

and take $\alpha \ge \alpha^*$ and $\gamma \ge \gamma^*$.

(a) Assume that $\bar{x}$ solves problem (2.3). Then for any $x \in X$, we have

\[
\begin{array}{rl}
\theta(x) + \alpha\, r(x)^{1/\gamma} & = \theta(x_d) + (\theta(x) - \theta(x_d)) + \alpha\, r(x)^{1/N^*} r(x)^{1/\gamma - 1/N^*} \\
& \ge \theta(\bar{x}) - |\theta(x) - \theta(x_d)| + \alpha\, \rho^{1/N^*} |\theta(x) - \theta(x_d)|\, \mu^{1/\gamma - 1/N^*} \\
& \ge \theta(\bar{x}) + (\alpha\, \rho^{1/N^*} \mu^{-1/N^*} - 1) |\theta(x) - \theta(x_d)| \\
& \ge \theta(\bar{x}) \\
& = \theta(\bar{x}) + \alpha\, r(\bar{x})^{1/\gamma}.
\end{array}
\]

Therefore, $\bar{x}$ is a global optimal solution of problem (2.4).

(b) If $\hat{x}$ solves (2.4), we claim that $\hat{x}$ is an optimal solution of problem (2.3). In fact, since $W$ is compact and problem (2.3) is feasible, (2.3) has an optimal solution, denoted by $\bar{x}$. In a way similar to (a), we have

\[
\begin{array}{rl}
\theta(\bar{x}) & = \theta(\bar{x}) + \alpha\, r(\bar{x})^{1/\gamma} \\
& \ge \theta(\hat{x}) + \alpha\, r(\hat{x})^{1/\gamma} \\
& \ge \theta(\bar{x}) + (\alpha\, \rho^{1/N^*} \mu^{-1/N^*} - 1) |\theta(\hat{x}) - \theta(\hat{x}_d)| \\
& \ge \theta(\bar{x}).
\end{array}
\]

This implies $\theta(\hat{x}) = \theta(\hat{x}_d)$ and then

\[
\theta(\hat{x}) = \theta(\hat{x}_d) \ge \theta(\bar{x}) \ge \theta(\hat{x}) + \alpha\, r(\hat{x})^{1/\gamma},
\]

where the first inequality holds because $\bar{x}$ solves (2.3) and $\hat{x}_d$ is feasible for (2.3). Hence, we have $r(\hat{x}) = 0$ and $\theta(\hat{x}) = \theta(\bar{x})$. The former implies $\hat{x} \in W$, and so $\hat{x}$ is an optimal solution to problem (2.3). This completes the proof.

The new condition given in Theorem 2.3 may be satisfied by choosing $d$ appropriately, even if $W$ is not convex and $\theta$ is not Lipschitz continuous, as the next example shows.

Example 2.1 Consider the following problem:

\[
\begin{array}{ll}
\mbox{minimize} & \theta(x) := \sin^2(3 x^{\frac{1}{3}}) \\
\mbox{subject to} & x \in [0, \pi^3], \quad \cos(3 x^{\frac{1}{3}}) \le 0.
\end{array}
\]

Then the feasible region is given by

\[
W = \Big[ \frac{\pi^3}{216}, \frac{\pi^3}{8} \Big] \bigcup \Big[ \frac{125 \pi^3}{216}, \pi^3 \Big],
\]

which is nonconvex. We note that $\theta$ is not Lipschitz continuous on $[0, \pi^3]$, which means the conditions of Theorem 2.1 are not satisfied for this problem. However, we can show that the assumptions of Theorem 2.3 hold. In fact, for $d = \frac{\pi^3}{10}$, the function

\[
\theta_d(x) = \left\{
\begin{array}{ll}
1, & x \in [0, \frac{\pi^3}{216}) \\[2pt]
\sin^2(3 x^{\frac{1}{3}}), & x \in [\frac{\pi^3}{216}, \frac{\pi^3}{8}] \\[2pt]
1, & x \in (\frac{\pi^3}{8}, \frac{125 \pi^3}{216}) \\[2pt]
\sin^2(3 x^{\frac{1}{3}}), & x \in [\frac{125 \pi^3}{216}, \pi^3]
\end{array}
\right.
\]

is continuous and piecewise smooth, and so it is continuous subanalytic on $[0, \pi^3]$.

We next consider another kind of error bound for problem (2.3), which is different from (2.6). We say that a function $u : X \to [0, \infty)$ provides an error bound of order $\nu > 0$ on $W$ if there exists a positive constant $\beta$ such that

\[
u(x) \ge \beta \big( \mathrm{dist}(x, W) \big)^\nu, \quad \forall x \in X.
\]

For more details on error bounds, we refer the reader to [64, 69] and the references therein.

Theorem 2.4 Let $X$ be a closed subset of $\Re^n$, $g$ and $h$ be continuous on $X$, and $\theta$ be Hölder continuous with order $p > 0$ and Hölder constant $L$ on $X$. Assume that $r(x)$ provides an error bound of order $\nu > 0$ on $W$ with the corresponding constant $\beta$, and suppose that problem (2.3) is feasible. Then problem (2.3) has the same solution set as the problem

\[
\begin{array}{ll}
\mbox{minimize} & \theta(x) + \alpha\, r(x)^{N^*} \\
\mbox{subject to} & x \in X,
\end{array} \tag{2.7}
\]

where $N^* := \frac{p}{\nu}$ and $\alpha > L \beta^{-N^*}$.

Proof: By the assumption of the theorem, we have

\[
r(x) \ge \beta \big( \mathrm{dist}(x, W) \big)^\nu, \quad \forall x \in X. \tag{2.8}
\]

(a) If $\bar{x}$ solves problem (2.3), then for any $x \in X$, we have from (2.8) and the Hölder continuity of $\theta$ that

\[
\begin{array}{rl}
\theta(x) + \alpha\, r(x)^{N^*} & = \theta(z) + (\theta(x) - \theta(z)) + \alpha\, r(x)^{p/\nu} \\
& \ge \theta(z) + (\alpha \beta^{p/\nu} - L) \big( \mathrm{dist}(x, W) \big)^p \\
& \ge \theta(\bar{x}) \\
& = \theta(\bar{x}) + \alpha\, r(\bar{x})^{N^*},
\end{array}
\]

where $z \in \Pi_W(x)$. Therefore, $\bar{x}$ is a global optimal solution of problem (2.7).

(b) Let $\hat{x} \in X$ be a solution of problem (2.7). Then for any $x \in W$,

\[
\theta(\hat{x}) + \alpha\, r(\hat{x})^{N^*} \le \theta(x) + \alpha\, r(x)^{N^*} = \theta(x). \tag{2.9}
\]

Let $t := \inf_{x \in W} \theta(x)$. Then for any $\varepsilon > 0$, we can find an $x_\varepsilon \in W$ such that $\theta(x_\varepsilon) \le t + \varepsilon$. By (2.8), (2.9), and the Hölder continuity of $\theta$, we have

\[
\begin{array}{rl}
t + \varepsilon \ \ge\ \theta(x_\varepsilon) & \ge \theta(\hat{x}) + \alpha\, r(\hat{x})^{N^*} \\
& = \theta(z) + (\theta(\hat{x}) - \theta(z)) + \alpha\, r(\hat{x})^{N^*} \\
& \ge t + (\alpha \beta^{p/\nu} - L) \|\hat{x} - z\|^p,
\end{array}
\]

where $z \in \Pi_W(\hat{x})$. Therefore,

\[
\|\hat{x} - z\|^p \le (\alpha \beta^{p/\nu} - L)^{-1} \varepsilon
\]

for any $\varepsilon > 0$, and so $\hat{x} = z \in W$. Consequently, (2.9) becomes

\[
\theta(\hat{x}) \le \theta(x), \quad \forall x \in W,
\]

i.e., $\hat{x}$ solves problem (2.3). This completes the proof.


In the last theorem, the set $X$ need not be compact and the functions $g$ and $h$ need not be subanalytic, in contrast with Theorems 2.1 and 2.3. If $X$ is compact and $g, h$ are subanalytic, as in Theorems 2.1 and 2.3, the exponent of the penalty term can be chosen flexibly. This result is stated in the following theorem, whose proof is omitted here.

Theorem 2.5 Assume that $X$, $g$, and $h$ are the same as in Theorem 2.1, $\theta$ is Hölder continuous on $X$, and problem (2.3) is feasible. Then the conclusion of Theorem 2.1 remains true.

Now we consider the following special case of problem (2.3):

\[
\begin{array}{ll}
\mbox{minimize} & \theta(x) \\
\mbox{subject to} & x \in X, \\
& g(x) \le 0.
\end{array} \tag{2.10}
\]

We will show some new penalty results for problem (2.10), which will be applied to the mathematical program with a nonlinear complementarity system in the next section. In the rest of this section, we let $\varphi$ denote the function defined by

\[
\varphi(x) := \max_{1 \le i \le m} g_i(x).
\]

In general, condition (2.8) is difficult to verify in practice. The proof of the following theorem indicates that it holds when $X$ is convex and $\varphi$ is strongly convex with order $\sigma$ on $X$.

Theorem 2.6 Assume that $X \subseteq \Re^n$ is a closed convex set, $\theta$ is Hölder continuous with order $p > 0$ and Hölder constant $L$ on $X$, and $\varphi$ is strongly convex with order $\sigma > 0$ and corresponding constant $c$ on $X$. Suppose that problem (2.10) is feasible. Then problem (2.10) has the same solution set as problem (2.7) with

\[
r(x) := \sum_{i=1}^m \big( g_i(x) \big)_+, \quad N^* := \frac{p}{\sigma}, \quad \alpha > L \Big( \frac{c}{2} \Big)^{-N^*}.
\]

Proof: By Theorem 2.4 and its proof, it is enough to prove that (2.8) holds with $\beta := \frac{c}{2}$ and $\nu := \sigma$ for any $x \in X$. In fact, assume that $\varphi(x) > 0$ and $\varphi(z) = 0$, where $z \in \Pi_S(x)$ with $S := \{ x \in X \mid \varphi(x) \le 0 \}$. Since $\varphi$ is strongly convex with order $\sigma$ and constant $c$ on $X$, it follows from (2.2) that

\[
\varphi\Big( \frac{x + z}{2} \Big) \le \frac{1}{2} \varphi(x) - \frac{c}{4} \|x - z\|^\sigma.
\]

Note that $\varphi(\frac{x+z}{2}) > 0$; otherwise, since $\frac{x+z}{2} \in X$, this would contradict $z \in \Pi_S(x)$. In consequence,

\[
\frac{c}{2} \|x - z\|^\sigma \le \varphi(x) = (\varphi(x))_+ \le r(x),
\]

i.e., (2.8) holds with $\beta = \frac{c}{2}$ and $\nu = \sigma$. This completes the proof.

Note that it is easy to verify that if each $g_i$ is strongly convex with order $\sigma$, then the function $\varphi$ is also strongly convex with order $\sigma$. We also have the following result.

Theorem 2.7 Assume that $X \subseteq \Re^n$ is compact and convex and the other conditions are the same as in Theorem 2.6. Let $\gamma \ge \frac{\sigma}{p}$ and $\alpha > L (\frac{c}{2})^{-p/\sigma}$. Then problem (2.10) has the same solution set as the problem

\[
\begin{array}{ll}
\mbox{minimize} & \theta(x) + \alpha\, r(x)^{1/\gamma} \\
\mbox{subject to} & x \in X.
\end{array}
\]

2.3 Penalty Results for MPECs

Consider the following mathematical program with equilibrium constraints (MPEC):

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) \\
\mbox{subject to} & (x, y) \in Z, \\
& y \mbox{ solves } \mathrm{VI}(F(x, \cdot), C(x)),
\end{array} \tag{2.11}
\]

where $f : \Re^{n+m} \to \Re$, $F : \Re^{n+m} \to \Re^m$, $Z \subseteq \Re^{n+m}$, and $C : \Re^n \to 2^{\Re^m}$ is defined by a continuously differentiable function $g : \Re^{n+m} \to \Re^l$ as

\[
C(x) := \{ y \in \Re^m \mid g(x, y) \le 0 \}.
\]

Let $\mathcal{F}$ denote the feasible region of problem (2.11), which is assumed to be nonempty. If $F$ is continuous, $g_i(x, \cdot)$ is convex for all $x \in X$, where

\[
X := \{ x \in \Re^n \mid (x, y) \in Z \mbox{ for some } y \in \Re^m \},
\]

$\nabla_y g_i(x, y)$ exists and is continuous at every $(x, y)$ in an open set containing $\mathcal{F}$ for each $i = 1, \cdots, l$, $Z$ is compact, and the constraint qualification called SBCQ [62] holds on $\mathcal{F}$, then problem (2.11) is equivalent to the following mathematical program for some $\delta > 0$ ([62], Theorem 1.3.5):

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) \\
\mbox{subject to} & (x, y, \lambda) \in Z \times (B(0, \delta) \cap \Re^l_+), \\
& F(x, y) + \sum_{i=1}^l \lambda_i \nabla_y g_i(x, y) = 0, \\
& g(x, y) \le 0, \quad \lambda^T g(x, y) = 0.
\end{array} \tag{2.12}
\]

Roughly speaking, the SBCQ means that for any $(x, y) \in \mathcal{F}$, problem (2.12) is feasible, and for a bounded subset of $\mathcal{F}$, the corresponding set of Lagrange multipliers is also bounded. Let

\[
W := \{ (x, y) \in \Re^{n+m} \mid (x, y, \lambda) \mbox{ satisfies the constraints of (2.12) for some } \lambda \}.
\]

This set is nonempty if, under the SBCQ, $\mathcal{F}$ is nonempty. We choose some $d \in W$ and define the function $f_d$ in a way similar to the definition of $\theta_d$ in the previous section. Then, comparing (2.12) with (2.3) and applying Theorems 2.3 and 2.5, we obtain the following result directly:

Theorem 2.8 Let $F$, $g_i$, $\nabla_y g_i$ be continuous subanalytic and $Z$ be compact subanalytic. Let $f$ be Hölder continuous with order $p$ on $Z$, or let $f - f_d$ be continuous subanalytic for some $d \in W$. Furthermore, assume that each $g_i(x, \cdot)$ is convex for all $x \in X$ and the SBCQ holds on $\mathcal{F}$. Then there exist constants $\delta > 0$, $\alpha^* > 0$, and $\gamma^* > 0$ such that for any $\alpha \ge \alpha^*$ and $\gamma \ge \gamma^*$, problem (2.11) is equivalent to the problem

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) + \alpha\, r(x, y, \lambda)^{1/\gamma} \\
\mbox{subject to} & (x, y, \lambda) \in Z \times (B(0, \delta) \cap \Re^l_+),
\end{array} \tag{2.13}
\]

where

\[
r(x, y, \lambda) := \Big\| F(x, y) + \sum_{i=1}^l \lambda_i \nabla_y g_i(x, y) \Big\|_1 + \sum_{i=1}^l \big( (g_i(x, y))_+ + \lambda_i |g_i(x, y)| \big),
\]

in the sense that $(x^*, y^*)$ solves (2.11) if and only if $(x^*, y^*, \lambda^*)$ solves (2.13) for some $\lambda^* \in \Re^l_+$.

Now we consider a special class of MPECs:

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) \\
\mbox{subject to} & (x, y) \in Z, \\
& y \ge 0, \ F(x, y) \ge 0, \\
& y^T F(x, y) = 0,
\end{array} \tag{2.14}
\]

i.e., the mathematical programs with complementarity constraints. Let $S$ denote the feasible region of problem (2.14), $Z_1 := Z \cap (\Re^n \times \Re^m_+)$,

\[
r(x, y) := \sum_{i=1}^m \big( -F_i(x, y) \big)_+ + |y^T F(x, y)|, \tag{2.15}
\]

and

\[
\psi(x, y) := \min\Big\{ \min_{1 \le i \le m} F_i(x, y), \ -y^T F(x, y) \Big\}.
\]

In a way similar to Theorems 2.4 and 2.6, we can show the following results.

Theorem 2.9 Assume that $Z$ is a closed subset of $\Re^{n+m}$, $f$ is Hölder continuous with order $p$ and Hölder constant $L$ on $Z_1$, and $F$ is continuous on $Z_1$. Assume that $r(x, y)$ defined by (2.15) provides an error bound of order $\nu > 0$ with the corresponding constant $\beta$ on $S$, and that problem (2.14) is feasible. Then problem (2.14) has the same solution set as the problem

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) + \alpha\, r(x, y)^{N^*} \\
\mbox{subject to} & (x, y) \in Z_1,
\end{array} \tag{2.16}
\]

where $N^* := \frac{p}{\nu}$ and $\alpha > L \beta^{-N^*}$.

Theorem 2.10 Assume that $F$ and $f$ are the same as in Theorem 2.9 and $Z$ is closed and convex. Suppose that problem (2.14) is feasible. If the function $-\psi$ is strongly convex with order $\sigma$ and corresponding constant $c$ on $Z$, then problem (2.14) is equivalent to problem (2.16) with $N^* := \frac{p}{\sigma}$ and $\alpha > L (\frac{c}{2})^{-N^*}$, in the sense that $(x^*, y^*)$ solves (2.14) if and only if it solves (2.16).

2.4 Some Properties Related to Strong Convexity

For the strong convexity employed in Theorems 2.6, 2.7, and 2.10, we have the following results:

Theorem 2.11 If each $f_i$, $i = 1, \cdots, m$, is strongly convex with order $\sigma$ on a convex set $X$, then $\sum_{i=1}^m t_i f_i$ and $\max_{1 \le i \le m} f_i$ are also strongly convex with order $\sigma$ on $X$, where $t_i > 0$, $i = 1, \cdots, m$.

Proof: Immediate from Definition 2.3.


Theorem 2.12 Suppose that $X \subseteq \Re^n$ is convex and $f : \Re^n \to \Re$ is continuously differentiable on an open set containing $X$. Then $f$ is strongly convex with order $\sigma$ on $X$ if and only if there exists a constant $c > 0$ such that

\[
f(y) \ge f(x) + (y - x)^T \nabla f(x) + c \|x - y\|^\sigma, \quad \forall x, y \in X. \tag{2.17}
\]

Proof: Assume that $f$ is strongly convex with order $\sigma$ on $X$ and $c$ is a constant that appears in (2.2). Then for any $x, y \in X$ and $t \in (0, 1)$, we have

\[
\begin{array}{rl}
f(y) - f(x) & \ge \frac{1}{t} \big( f(ty + (1 - t)x) - f(x) \big) + c (1 - t) \|x - y\|^\sigma \\
& = (y - x)^T \nabla f(x + \xi (y - x)) + c (1 - t) \|x - y\|^\sigma
\end{array}
\]

for some $\xi \in (0, t)$. Letting $t \to 0$, we obtain (2.17) from the continuity of $\nabla f$.

Conversely, suppose (2.17) holds for some $c > 0$. For any $x, y \in X$ and $t \in (0, 1)$, we have

\[
f(x) - f(tx + (1 - t)y) \ge (1 - t)(x - y)^T \nabla f(tx + (1 - t)y) + c (1 - t)^\sigma \|x - y\|^\sigma
\]

and

\[
f(y) - f(tx + (1 - t)y) \ge t (y - x)^T \nabla f(tx + (1 - t)y) + c\, t^\sigma \|x - y\|^\sigma.
\]

In consequence, we have

\[
f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y) - c\, t (1 - t) \big( (1 - t)^{\sigma - 1} + t^{\sigma - 1} \big) \|x - y\|^\sigma. \tag{2.18}
\]

If $0 < \sigma \le 2$, then

\[
(1 - t)^{\sigma - 1} + t^{\sigma - 1} \ge (1 - t) + t = 1.
\]

If $\sigma > 2$, since the real function $\phi(t) = t^{\sigma - 1}$ is convex on $(0, 1)$, then

\[
(1 - t)^{\sigma - 1} + t^{\sigma - 1} \ge \Big( \frac{1}{2} \Big)^{\sigma - 2}.
\]

It follows from (2.18) that there exists some constant $c' > 0$, independent of $x$, $y$, and $t$, such that

\[
f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y) - c' t (1 - t) \|x - y\|^\sigma,
\]

i.e., $f$ is strongly convex with order $\sigma$ on $X$.
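As a concrete check of the claim made after Definition 2.3, one can verify via (2.17) that $f(x) = x^4$ is strongly convex with order 4 on $\Re$; the computation below is supplied here for illustration. Writing $h = y - x$,

\[
f(y) - f(x) - f'(x)(y - x) = (x + h)^4 - x^4 - 4x^3 h = 6x^2 h^2 + 4x h^3 + h^4 \ge \frac{1}{3} h^4,
\]

since $6x^2 + 4xh + \frac{2}{3}h^2 \ge 0$ for all $x$ (its discriminant $16h^2 - 16h^2$ vanishes). Thus (2.17) holds with $c = \frac{1}{3}$ and $\sigma = 4$, whereas no $c > 0$ works with $\sigma = 2$ because $f''(0) = 0$.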

For a given concept of convexity, there usually exists some kind of monotonicity related to it; see [49] and the references therein. We now define strong monotonicity with order $\sigma$ and discuss its relation to strong convexity with order $\sigma$.


Definition 2.4 A mapping $G : \Re^n \to \Re^n$ is said to be strongly monotone with order $\sigma$ on a convex set $X$ if there exists a constant $\beta > 0$ such that

\[
(y - x)^T \big( G(y) - G(x) \big) \ge \beta \|y - x\|^\sigma, \quad \forall x, y \in X. \tag{2.19}
\]

Theorem 2.13 Let $X \subseteq \Re^n$ be convex and $f : \Re^n \to \Re$ be continuously differentiable on an open set containing $X$. Then $f$ is strongly convex with order $\sigma$ on $X$ if and only if $\nabla f$ is strongly monotone with order $\sigma$ on $X$.

Proof: Suppose that $f$ is strongly convex with order $\sigma$ on $X$. By Theorem 2.12, there exists a constant $c > 0$ such that (2.17) holds. Then for any $x, y \in X$, one has

\[
f(y) - f(x) \ge (y - x)^T \nabla f(x) + c \|x - y\|^\sigma
\]

and

\[
f(x) - f(y) \ge (x - y)^T \nabla f(y) + c \|x - y\|^\sigma.
\]

Therefore,

\[
(y - x)^T \big( \nabla f(y) - \nabla f(x) \big) \ge 2c \|x - y\|^\sigma,
\]

i.e., $\nabla f$ is strongly monotone with order $\sigma$ on $X$ with $\beta = 2c$.

Conversely, assume that (2.19) holds for some $\beta > 0$ and $G = \nabla f$. Set

\[
t_i := \frac{i}{m + 1}, \quad i = 0, 1, \cdots, m + 1,
\]

where $m$ is a positive integer. By the mean-value theorem, there exist $\xi_i \in (t_i, t_{i+1})$, $0 \le i \le m$, such that

\[
f(x + t_{i+1}(y - x)) - f(x + t_i(y - x)) = (t_{i+1} - t_i)(y - x)^T \nabla f(x + \xi_i (y - x)).
\]

Hence, it follows from (2.19) that

\[
\begin{array}{rl}
f(y) - f(x) & = \displaystyle\sum_{i=0}^m \big( f(x + t_{i+1}(y - x)) - f(x + t_i(y - x)) \big) \\
& = \displaystyle\sum_{i=0}^m (t_{i+1} - t_i)(y - x)^T \big( \nabla f(x + \xi_i (y - x)) - \nabla f(x) \big) + (y - x)^T \nabla f(x) \\
& \ge \beta \|y - x\|^\sigma \displaystyle\sum_{i=0}^m \xi_i^{\sigma - 1} (t_{i+1} - t_i) + (y - x)^T \nabla f(x).
\end{array}
\]

Letting $m \to +\infty$ and noticing that

\[
\lim_{m \to +\infty} \sum_{i=0}^m \xi_i^{\sigma - 1} (t_{i+1} - t_i) = \int_0^1 t^{\sigma - 1}\, dt = \frac{1}{\sigma},
\]

we have

\[
f(y) - f(x) \ge \frac{\beta}{\sigma} \|y - x\|^\sigma + (y - x)^T \nabla f(x).
\]

By Theorem 2.12, $f$ is strongly convex with order $\sigma$ on $X$.


Chapter 3

New Relaxation Method for MPECs

Consider the following mathematical program with complementarity constraints:

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) \\
\mbox{subject to} & g(x, y) \le 0, \ h(x, y) = 0, \\
& F(x, y) \ge 0, \ y \ge 0, \\
& y^T F(x, y) = 0,
\end{array} \tag{3.1}
\]

where $f : \Re^{n+m} \to \Re$, $g : \Re^{n+m} \to \Re^p$, $h : \Re^{n+m} \to \Re^q$, and $F : \Re^{n+m} \to \Re^m$ are all twice continuously differentiable functions. As mentioned in Chapter 1, the major difficulty in solving (3.1) is that its constraints fail to satisfy a standard constraint qualification at any feasible point [17], which is necessary for the regularity of a nonlinear program, so that standard methods are likely to fail for this problem. Several approaches have been proposed, such as the sequential quadratic programming (SQP) approach, the implicit programming approach, the penalty function approach, and the reformulation approach. In this chapter, we study problem (3.1) from another point of view. We use an expansive simplex instead of the nonnegative orthant involved in the complementarity constraints. In other words, our method replaces the complementarity constraints by a variational inequality defined on an expansive simplex. It is well known that such a variational inequality problem can be represented by a finite number of inequalities. We remove some of these inequalities and obtain a standard nonlinear program. We will show that the linear independence constraint qualification (LICQ) or the Mangasarian-Fromovitz constraint qualification (MFCQ) holds for the relaxed problem under some mild conditions. We also consider the limiting behavior of the relaxed problem. In particular, some sufficient conditions for B-stationarity of a feasible point of the original problem are new and can be verified easily in practice.

3.1 Relaxed Problem

Let $e := (1, 1, \cdots, 1)^T$. For $i = 1, 2, \cdots, m$, $e_i \in \Re^m$ denotes the $i$-th column of the $m \times m$ identity matrix. Also, we let $e_0 \in \Re^m$ denote the zero vector, i.e., $e_0 := (0, 0, \cdots, 0)^T$. Then the expansive simplex mentioned above is defined by

\[
\Omega^k := \mathrm{co}\{ e^k_0, e^k_1, \cdots, e^k_m \},
\]

where $\mathrm{co}$ stands for the convex hull, $k$ is a positive integer, and

\[
e^k_i := \frac{1}{k} e + k e_i, \quad i = 0, 1, \cdots, m. \tag{3.2}
\]

For a fixed $x \in \Re^n$, the problem $\mathrm{VI}(F(x, \cdot), \Omega^k)$ is obviously equivalent to finding a $y \in \Re^m$ such that

\[
y \in \Omega^k, \quad (e^k_i - y)^T F(x, y) \ge 0, \quad i = 0, 1, \cdots, m,
\]

or equivalently,

\[
\sum_{j=1}^m y_j \le \frac{m}{k} + k, \quad y_i \ge \frac{1}{k}, \quad (e^k_i - y)^T F(x, y) \ge 0, \quad i = 0, 1, \cdots, m.
\]

In order to simplify the relaxed problem, we replace the condition $y \in \Omega^k$ by $y \ge 0$ and consider the following problem as an approximation of problem (3.1):

\[
\begin{array}{ll}
\mbox{minimize} & f(x, y) \\
\mbox{subject to} & g(x, y) \le 0, \ h(x, y) = 0, \ y \ge 0, \\
& (e^k_i - y)^T F(x, y) \ge 0, \quad i = 0, 1, \cdots, m.
\end{array} \tag{3.3}
\]

Let $\mathcal{F}$ and $\mathcal{F}_k$ denote the feasible sets of problems (3.1) and (3.3), respectively, and let

\[
\phi^k_i(x, y) := (e^k_i - y)^T F(x, y), \quad i = 0, 1, \cdots, m. \tag{3.4}
\]

By (3.2), we have

\[
\phi^k_i(x, y) = \phi^k_0(x, y) + k F_i(x, y), \quad i = 1, 2, \cdots, m. \tag{3.5}
\]
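The quantities in (3.2), (3.4), and (3.5) are straightforward to compute, and the identity (3.5) gives a convenient consistency check. The following Python sketch is an illustration only (the map $F$ is an arbitrary toy choice, not from the thesis); it evaluates the relaxed constraint functions and verifies (3.5) numerically:

    import numpy as np

    def e_k(i, k, m):
        # e_i^k = (1/k) e + k e_i, with e_0 the zero vector (3.2).
        v = np.full(m, 1.0 / k)
        if i >= 1:
            v[i - 1] += k
        return v

    def phi(i, k, x, y, F):
        # phi_i^k(x, y) = (e_i^k - y)^T F(x, y)   (3.4)
        return (e_k(i, k, len(y)) - y) @ F(x, y)

    # Toy data: m = 3 and an arbitrary smooth F.
    F = lambda x, y: y + x * np.ones(3)
    x, y, k = 0.5, np.array([0.2, 0.0, 1.0]), 4
    for i in range(1, 4):
        lhs = phi(i, k, x, y, F)
        rhs = phi(0, k, x, y, F) + k * F(x, y)[i - 1]   # identity (3.5)
        print(i, np.isclose(lhs, rhs))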

Then we have the following results.


Theorem 3.1 For problems (3.1) and (3.3), we have:

(i) for any $k$, $\mathcal{F} \subseteq \mathcal{F}_{k+1} \subseteq \mathcal{F}_k$;

(ii) $\mathcal{F} = \bigcap_{k=1}^\infty \mathcal{F}_k$, which, together with the continuity of the functions involved, implies that any accumulation point of a sequence $\{ (x^k, y^k) \mid (x^k, y^k) \in \mathcal{F}_k \}$ belongs to $\mathcal{F}$.

Proof: (i) $\mathcal{F} \subseteq \mathcal{F}_{k+1}$ is clear. Let $(x, y) \in \mathcal{F}_{k+1}$. Then, since for each $i = 0, 1, \cdots, m$, $e^k_i$ can be represented as

\[
e^k_i = \sum_{j=0}^m t_{ij} e^{k+1}_j, \quad \sum_{j=0}^m t_{ij} = 1, \quad t_{ij} \ge 0, \quad j = 0, 1, \cdots, m,
\]

we have

\[
(e^k_i - y)^T F(x, y) = \sum_{j=0}^m t_{ij} (e^{k+1}_j - y)^T F(x, y) \ge 0, \quad i = 0, 1, \cdots, m,
\]

i.e., $(x, y) \in \mathcal{F}_k$. Hence $\mathcal{F}_{k+1} \subseteq \mathcal{F}_k$.

(ii) From (i), we only need to prove $\bigcap_{k=1}^\infty \mathcal{F}_k \subseteq \mathcal{F}$. Let $(x, y) \in \bigcap_{k=1}^\infty \mathcal{F}_k$. Then we have

\[
g(x, y) \le 0, \quad h(x, y) = 0, \quad y \ge 0
\]

and, for every $i = 1, 2, \cdots, m$,

\[
\phi^k_i(x, y) = \Big( \frac{1}{k} e + k e_i - y \Big)^T F(x, y) \ge 0, \quad \forall k,
\]

which implies

\[
\Big( \frac{1}{k^2} e + e_i - \frac{1}{k} y \Big)^T F(x, y) \ge 0, \quad \forall k. \tag{3.6}
\]

Letting $k \to \infty$ in (3.6), we have

\[
F_i(x, y) = e_i^T F(x, y) \ge 0
\]

and hence $F(x, y) \ge 0$. On the other hand,

\[
(e^k_0 - y)^T F(x, y) \ge 0, \quad \forall k
\]

implies $-y^T F(x, y) \ge 0$; since $y \ge 0$ and $F(x, y) \ge 0$ give $y^T F(x, y) \ge 0$, we have $y^T F(x, y) = 0$. Therefore, $(x, y) \in \mathcal{F}$ and so $\bigcap_{k=1}^\infty \mathcal{F}_k \subseteq \mathcal{F}$. This completes the proof.

The following result shows that problem (3.3) may satisfy a standard constraint qualification at its feasible points. This is in contrast with problem (3.1), for which a standard constraint qualification fails to hold at any feasible point.


Theorem 3.2 For any $(\bar{x}, \bar{y}) \in \mathcal{F}$ with $F(\bar{x}, \bar{y}) \ne 0$, we have

\[
\phi^k_i(\bar{x}, \bar{y}) > 0, \quad i = 0, 1, \cdots, m, \quad \forall k,
\]

and so these are inactive constraints at $(\bar{x}, \bar{y})$ in problem (3.3). In this case, if the system

\[
g(x, y) \le 0, \quad h(x, y) = 0, \quad y \ge 0
\]

satisfies some constraint qualification such as the LICQ or MFCQ at $(\bar{x}, \bar{y})$, then, for any fixed $k$, there exists a neighborhood $U^k(\bar{x}, \bar{y})$ of $(\bar{x}, \bar{y})$ such that problem (3.3) satisfies the same constraint qualification at any point $(x, y) \in U^k(\bar{x}, \bar{y})$.

Proof: The first part follows immediately from the definition (3.4) of $\phi^k_i$ and

\[
e^k_i > 0, \quad 0 \ne F(\bar{x}, \bar{y}) \ge 0, \quad \bar{y}^T F(\bar{x}, \bar{y}) = 0.
\]

The second part follows directly from the continuity of $g$, $h$, $\phi^k_i$, $i = 0, 1, \cdots, m$, and their gradients.

In what follows, we let $G(x, y) := y$ and

\[
\phi^k(x, y) := (\phi^k_0(x, y), \phi^k_1(x, y), \cdots, \phi^k_m(x, y))^T,
\]

where the $\phi^k_i$ are defined by (3.4). Note that the gradients of $G_j$, $j = 1, \cdots, m$, are constant vectors. Nevertheless, we will often write $\nabla G_j(x, y)$, etc., to specify the point under consideration.

Theorem 3.3 For any $(\bar{x}, \bar{y}) \in \mathcal{F}$, if the set of vectors

\[
\{ \nabla F_i(\bar{x}, \bar{y}), \nabla G_i(\bar{x}, \bar{y}), \nabla g_l(\bar{x}, \bar{y}), \nabla h_r(\bar{x}, \bar{y}) \mid i = 1, \cdots, m, \ l \in I_g(\bar{x}, \bar{y}), \ r = 1, \cdots, q \}
\]

is linearly independent, then there exist a neighborhood $U(\bar{x}, \bar{y})$ of $(\bar{x}, \bar{y})$ and a positive integer $K$ such that, for any $(x, y) \in U(\bar{x}, \bar{y})$ and any $k \ge K$, the following conditions hold:

(i) $I_F(x, y) \subseteq I_F(\bar{x}, \bar{y})$, $I_G(x, y) \subseteq I_G(\bar{x}, \bar{y})$, $I_g(x, y) \subseteq I_g(\bar{x}, \bar{y})$;

(ii) $\{ \nabla F_i(x, y), \nabla G_i(x, y), \nabla g_l(x, y), \nabla h_r(x, y) \mid i = 1, \cdots, m, \ l \in I_g(\bar{x}, \bar{y}), \ r = 1, \cdots, q \}$ is linearly independent;

(iii) $I_{\phi^k}(x, y) \subseteq \{0\} \cup I_F(\bar{x}, \bar{y})$.

Proof: It is obvious from the continuity of $F$, $G$, $g$, $\nabla F$, $\nabla g$, and $\nabla h$ that there exists a neighborhood $U_1(\bar{x}, \bar{y})$ such that conditions (i) and (ii) hold for any $(x, y) \in U_1(\bar{x}, \bar{y})$. Now we show that there exist a neighborhood $U_2(\bar{x}, \bar{y})$ and a positive integer $K$ satisfying condition (iii) for any $(x, y) \in U_2(\bar{x}, \bar{y})$ and any $k \ge K$. Otherwise, there must be an $i_0 \notin \{0\} \cup I_F(\bar{x}, \bar{y})$, a subsequence $\{k_j\}$ of $\{k\}$, and a sequence $\{(x^j, y^j)\}$ converging to $(\bar{x}, \bar{y})$ such that

\[
\phi^{k_j}_{i_0}(x^j, y^j) = 0, \quad \forall j.
\]

Since

\[
\frac{1}{k_j} \phi^{k_j}_{i_0}(x^j, y^j) = F_{i_0}(x^j, y^j) + \frac{1}{k_j^2} \sum_{i=1}^m F_i(x^j, y^j) - \frac{1}{k_j} (y^j)^T F(x^j, y^j),
\]

we have

\[
\lim_{j \to \infty} \frac{1}{k_j} \phi^{k_j}_{i_0}(x^j, y^j) = F_{i_0}(\bar{x}, \bar{y}) > 0.
\]

This implies that

\[
\lim_{j \to \infty} \phi^{k_j}_{i_0}(x^j, y^j) = +\infty,
\]

which is a contradiction, and so the neighborhood $U_2(\bar{x}, \bar{y})$ and positive integer $K$ mentioned above exist. Let

\[
U(\bar{x}, \bar{y}) = U_1(\bar{x}, \bar{y}) \cap U_2(\bar{x}, \bar{y}).
\]

Then conditions (i)–(iii) hold for any $(x, y) \in U(\bar{x}, \bar{y})$ and any $k \ge K$.

We further have the following result about constraint qualifications.

Theorem 3.4 Let $(\bar{x}, \bar{y}) \in \mathcal{F}$ be nondegenerate, i.e., $I_F(\bar{x}, \bar{y}) \cap I_G(\bar{x}, \bar{y}) = \emptyset$, and assume

\[
F(\bar{x}, \bar{y}) \ne 0. \tag{3.7}
\]

If the set of vectors

\[
\{ \nabla F_i(\bar{x}, \bar{y}), \nabla G_i(\bar{x}, \bar{y}), \nabla g_l(\bar{x}, \bar{y}), \nabla h_r(\bar{x}, \bar{y}) \mid i = 1, \cdots, m, \ l \in I_g(\bar{x}, \bar{y}), \ r = 1, \cdots, q \}
\]

is linearly independent, then there exists a neighborhood $U(\bar{x}, \bar{y})$ of $(\bar{x}, \bar{y})$ such that, for any sufficiently large $k$, problem (3.3) satisfies the standard LICQ at any point $(x, y) \in U(\bar{x}, \bar{y}) \cap \mathcal{F}_k$.

Proof: By Theorem 3.3, there exist a neighborhood U(x̄, ȳ) and a positive integer K such that Theorem 3.3 (i)–(iii) hold for any (x, y) ∈ U(x̄, ȳ) and any k ≥ K. Now we let k ≥ K and choose an arbitrary point (x, y) ∈ U(x̄, ȳ) ∩ F_k.

Suppose that the LICQ does not hold at (x, y) for problem (3.3). This means the set of vectors

{∇φ_i^k(x, y), ∇G_j(x, y), ∇g_l(x, y), ∇h_r(x, y) | i ∈ I_{φ^k}(x, y), j ∈ I_G(x, y), l ∈ I_g(x, y), r = 1, · · · , q},


which is, by (3.5),

{∇φ_0^k(x, y), k∇F_i(x, y) + ∇φ_0^k(x, y), ∇G_j(x, y), ∇g_l(x, y), ∇h_r(x, y) | 0 ≠ i ∈ I_{φ^k}(x, y), j ∈ I_G(x, y), l ∈ I_g(x, y), r = 1, 2, · · · , q}

in the case where 0 ∈ I_{φ^k}(x, y), or

{k∇F_i(x, y) + ∇φ_0^k(x, y), ∇G_j(x, y), ∇g_l(x, y), ∇h_r(x, y) | i ∈ I_{φ^k}(x, y), j ∈ I_G(x, y), l ∈ I_g(x, y), r = 1, 2, · · · , q}

in the case where 0 ∉ I_{φ^k}(x, y), is linearly dependent. Hence, by Theorem 3.3 (ii), ∇φ_0^k(x, y) can be represented as a linear combination of the vectors

{∇F_i(x, y), ∇G_j(x, y), ∇g_l(x, y), ∇h_r(x, y) | i ∈ I_{φ^k}(x, y) \ {0}, j ∈ I_G(x, y), l ∈ I_g(x, y), r = 1, 2, · · · , q}.

Therefore, there exist numbers

{λ_i, µ_j, u_l, v_r | i ∈ I_{φ^k}(x, y) \ {0}, j ∈ I_G(x, y), l ∈ I_g(x, y), r = 1, 2, · · · , q}

such that

∇φ_0^k(x, y) = Σ_{i∈I_{φ^k}(x,y)\{0}} λ_i ∇F_i(x, y) + Σ_{j∈I_G(x,y)} µ_j ∇G_j(x, y) + Σ_{l∈I_g(x,y)} u_l ∇g_l(x, y) + Σ_{r=1}^q v_r ∇h_r(x, y).

Since φ_0^k(x, y) = ((1/k)e − y)^T F(x, y), we then have

Σ_{i∈I_{φ^k}(x,y)\{0}} (y_i − 1/k + λ_i) ∇F_i(x, y) + Σ_{i∉I_{φ^k}(x,y)\{0}} (y_i − 1/k) ∇F_i(x, y) + Σ_{j∈I_G(x,y)} (µ_j + F_j(x, y)) ∇G_j(x, y) + Σ_{j∉I_G(x,y)} F_j(x, y) ∇G_j(x, y) + Σ_{l∈I_g(x,y)} u_l ∇g_l(x, y) + Σ_{r=1}^q v_r ∇h_r(x, y) = 0.

By Theorem 3.3 (ii), we have

y_i = 1/k − λ_i for i ∈ I_{φ^k}(x, y) \ {0},  and  y_i = 1/k for i ∉ I_{φ^k}(x, y) \ {0},  (3.8)


and

F_i(x, y) = 0, i ∉ I_G(x, y).  (3.9)

Suppose that φ_0^k(x, y) = 0. We then have by (3.9) that

φ_i^k(x, y) = kF_i(x, y) + φ_0^k(x, y) = 0, i ∉ I_G(x, y).  (3.10)

On the other hand, we have

I_G(x, y) = I_G(x̄, ȳ).  (3.11)

Otherwise, since I_G(x, y) ⊆ I_G(x̄, ȳ), there exists an i_0 ∈ I_G(x̄, ȳ) \ I_G(x, y). Then we must have φ_{i_0}^k(x, y) = 0 by (3.10). But, by the nondegeneracy of (x̄, ȳ), i_0 ∈ I_G(x̄, ȳ) means i_0 ∉ I_F(x̄, ȳ). So, by Theorem 3.3 (iii), we have φ_{i_0}^k(x, y) > 0. This is a contradiction, and so (3.11) holds. This means that

y_i = 0, i ∈ I_G(x̄, ȳ).  (3.12)

Note that I_G(x̄, ȳ) ≠ ∅ by (3.7). So, if i ∈ I_G(x̄, ȳ), then i ∉ I_F(x̄, ȳ) by the nondegeneracy assumption, and hence i ∉ I_{φ^k}(x, y) \ {0} by Theorem 3.3 (iii). Hence, from (3.8), we have y_i = 1/k. This contradicts (3.12). Therefore, we have φ_0^k(x, y) ≠ 0 and so, by (3.9),

φ_i^k(x, y) = kF_i(x, y) + φ_0^k(x, y) ≠ 0, i ∉ I_G(x, y).  (3.13)

By Theorem 3.3 (iii) and the fact that 0 ∉ I_{φ^k}(x, y), φ_i^k(x, y) = 0 means that i ∈ I_F(x̄, ȳ). On the other hand, for any i ∈ I_F(x̄, ȳ), by the nondegeneracy of (x̄, ȳ), we have i ∉ I_G(x̄, ȳ) and so i ∉ I_G(x, y) by Theorem 3.3 (i), which implies φ_i^k(x, y) ≠ 0 by (3.13). Hence I_{φ^k}(x, y) = ∅, i.e., the last (m + 1) inequality constraints in problem (3.3) are all inactive at (x, y), and so, by Theorem 3.3 (ii), the LICQ holds at (x, y). This also contradicts our assumption. Therefore, the LICQ holds at (x, y) for problem (3.3). This completes the proof.

3.2 Convergence Analysis

In this section, we investigate the behavior of problem (3.3) as k → ∞. We first consider the convergence of global optimal solutions.

Theorem 3.5 Suppose that (x^k, y^k) is a global optimal solution of problem (3.3) and (x*, y*) is an accumulation point of the sequence {(x^k, y^k)} as k → ∞. Then (x*, y*) is a global optimal solution of problem (3.1).


Proof: Taking a subsequence if necessary, we assume without loss of generality that

lim_{k→∞} (x^k, y^k) = (x*, y*).

By Theorem 3.1, (x*, y*) ∈ F. Since F ⊆ F_k for all k, we have

f(x^k, y^k) ≤ f(x, y), ∀(x, y) ∈ F, ∀k.

Letting k → ∞, we have from the continuity of f that

f(x*, y*) ≤ f(x, y), ∀(x, y) ∈ F,

i.e., (x*, y*) is a global optimal solution of problem (3.1).

In a similar way, we can prove the next theorem.

Theorem 3.6 Let {ε_k} ⊆ (0, +∞) be convergent to 0 and (x^k, y^k) ∈ F_k be an approximate solution of problem (3.3) satisfying

f(x^k, y^k) − ε_k ≤ f(x, y), ∀(x, y) ∈ F_k.

Then any accumulation point of {(x^k, y^k)} is a global optimal solution of problem (3.1).

Now we consider the limiting behavior of stationary points of problem (3.3).

Theorem 3.7 Let (x^k, y^k) ∈ F_k be a stationary point of problem (3.3) with Lagrange multiplier vectors λ^k, µ^k, δ^k, and γ^k satisfying (3.15)–(3.16), and let (x̄, ȳ) be an accumulation point of the sequence {(x^k, y^k)}. Suppose the set of vectors

{∇F_i(x̄, ȳ), ∇G_i(x̄, ȳ), ∇g_l(x̄, ȳ), ∇h_r(x̄, ȳ) | i = 1, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, · · · , q}

is linearly independent. Then we have the following statements.

(a) (x̄, ȳ) is a weakly stationary point of problem (3.1) and, if F(x̄, ȳ) ≠ 0, (x̄, ȳ) is C-stationary. In particular, if (x̄, ȳ) is nondegenerate, it is B-stationary;

(b) If (x^k, y^k) ∈ F for some k, then (x^k, y^k) is B-stationary to problem (3.1), and if (x^k, y^k) ∈ F for infinitely many k, then (x̄, ȳ) is B-stationary;

(c) If 0 ∉ I_{φ^k}(x^k, y^k) for infinitely many k, then (x̄, ȳ) is a B-stationary point of (3.1).

Proof: Without loss of generality, we assume that

lim_{k→∞} (x^k, y^k) = (x̄, ȳ).  (3.14)


Then by Theorem 3.1, we have (x̄, ȳ) ∈ F. By Theorem 3.3, for any sufficiently large k, we have

I_F(x^k, y^k) ⊆ I_F(x̄, ȳ),  I_G(x^k, y^k) ⊆ I_G(x̄, ȳ),
I_g(x^k, y^k) ⊆ I_g(x̄, ȳ),  I_{φ^k}(x^k, y^k) ⊆ {0} ∪ I_F(x̄, ȳ),

and

{∇F_i(x^k, y^k), ∇G_i(x^k, y^k), ∇g_l(x^k, y^k), ∇h_r(x^k, y^k) | i = 1, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, · · · , q}

is linearly independent. Note that the MPEC-LICQ holds at (x̄, ȳ) for problem (3.1).

By the stationarity of (x^k, y^k) for problem (3.3), there exist Lagrange multiplier vectors λ^k, µ^k, δ^k, and γ^k such that

∇f(x^k, y^k) − Σ_{i∈I_{φ^k}(x^k,y^k)} λ_i^k ∇φ_i^k(x^k, y^k) − Σ_{j∈I_G(x^k,y^k)} µ_j^k ∇G_j(x^k, y^k) + Σ_{l∈I_g(x^k,y^k)} δ_l^k ∇g_l(x^k, y^k) + Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k) = 0  (3.15)

and

λ^k ≥ 0, µ^k ≥ 0, δ^k ≥ 0.  (3.16)

Since

∇φ_i^k(x^k, y^k) = ∇φ_0^k(x^k, y^k) + k∇F_i(x^k, y^k), i = 1, 2, · · · , m,

and

∇φ_0^k(x^k, y^k) = Σ_{i=1}^m (1/k − y_i^k) ∇F_i(x^k, y^k) − Σ_{j=1}^m F_j(x^k, y^k) ∇G_j(x^k, y^k),  (3.17)

it follows from (3.15) that

∇f(x^k, y^k) = Σ_{i∈I_{φ^k}(x^k,y^k)} λ_i^k ∇φ_i^k(x^k, y^k) + Σ_{j∈I_G(x^k,y^k)} µ_j^k ∇G_j(x^k, y^k) − Σ_{l∈I_g(x^k,y^k)} δ_l^k ∇g_l(x^k, y^k) − Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k)

= Σ_{0≠i∈I_{φ^k}(x^k,y^k)} kλ_i^k ∇F_i(x^k, y^k) + a_k ∇φ_0^k(x^k, y^k) + Σ_{j∈I_G(x^k,y^k)} µ_j^k ∇G_j(x^k, y^k) − Σ_{l∈I_g(x^k,y^k)} δ_l^k ∇g_l(x^k, y^k) − Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k)  (3.18)

= Σ_{i∈I_F(x̄,ȳ)} u_i^k ∇F_i(x^k, y^k) + Σ_{i∉I_F(x̄,ȳ)} a_k(1/k − y_i^k) ∇F_i(x^k, y^k) + Σ_{j∈I_G(x̄,ȳ)} v_j^k ∇G_j(x^k, y^k) − Σ_{j∉I_G(x̄,ȳ)} a_k F_j(x^k, y^k) ∇G_j(x^k, y^k) − Σ_{l∈I_g(x̄,ȳ)} w_l^k ∇g_l(x^k, y^k) − Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k),  (3.19)

where

a_k := Σ_{i∈I_{φ^k}(x^k,y^k)} λ_i^k if I_{φ^k}(x^k, y^k) ≠ ∅, and a_k := 0 if I_{φ^k}(x^k, y^k) = ∅;  (3.20)

u_i^k := kλ_i^k + a_k(1/k − y_i^k) for 0 ≠ i ∈ I_{φ^k}(x^k, y^k), and u_i^k := a_k(1/k − y_i^k) for i ∈ I_F(x̄, ȳ) \ I_{φ^k}(x^k, y^k);  (3.21)

v_j^k := µ_j^k − a_k F_j(x^k, y^k) for j ∈ I_G(x^k, y^k), and v_j^k := −a_k F_j(x^k, y^k) for j ∈ I_G(x̄, ȳ) \ I_G(x^k, y^k);  (3.22)

w_l^k := δ_l^k for l ∈ I_g(x^k, y^k), and w_l^k := 0 for l ∈ I_g(x̄, ȳ) \ I_g(x^k, y^k).

Since

{∇F_i(x̄, ȳ), ∇G_i(x̄, ȳ), ∇g_l(x̄, ȳ), ∇h_r(x̄, ȳ) | i = 1, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, · · · , q}

is linearly independent, it follows from (3.14) and (3.19) that the multiplier sequences

{u_i^k | i ∈ I_F(x̄, ȳ)},  {v_j^k | j ∈ I_G(x̄, ȳ)},  {a_k(1/k − y_i^k) | i ∉ I_F(x̄, ȳ)},
{w_l^k | l ∈ I_g(x̄, ȳ)},  {−a_k F_j(x^k, y^k) | j ∉ I_G(x̄, ȳ)},  {γ_r^k | r = 1, · · · , q}  (3.23)

are convergent. Next we consider several cases to prove statements (a)–(c).

(I) First we show that if (x^k, y^k) ∈ F for some k, then it is a B-stationary point of problem (3.1), namely, there exist multiplier vectors λ, µ, γ, and δ ≥ 0 such that

∇f(x, y) − Σ_{i∈I_F(x,y)} λ_i ∇F_i(x, y) − Σ_{j∈I_G(x,y)} µ_j ∇G_j(x, y) + Σ_{l∈I_g(x,y)} δ_l ∇g_l(x, y) + Σ_{r=1}^q γ_r ∇h_r(x, y) = 0  (3.24)


holds with

λ_i ≥ 0, µ_i ≥ 0, i ∈ I_F(x, y) ∩ I_G(x, y).  (3.25)

In fact, if F(x^k, y^k) ≠ 0, then, from Theorem 3.2, I_{φ^k}(x^k, y^k) = ∅ and so (3.15)–(3.16) mean that (x^k, y^k) is a B-stationary point of problem (3.1). If F(x^k, y^k) = 0, then I_{φ^k}(x^k, y^k) = {0, 1, · · · , m} and so it follows from (3.17) and (3.18) that

0 = ∇f(x^k, y^k) − Σ_{i=1}^m kλ_i^k ∇F_i(x^k, y^k) − a_k ∇φ_0^k(x^k, y^k) − Σ_{j∈I_G(x^k,y^k)} µ_j^k ∇G_j(x^k, y^k) + Σ_{l∈I_g(x^k,y^k)} δ_l^k ∇g_l(x^k, y^k) + Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k)

= ∇f(x^k, y^k) − Σ_{i∈I_F(x^k,y^k)} (a_k(1/k − y_i^k) + kλ_i^k) ∇F_i(x^k, y^k) − Σ_{j∈I_G(x^k,y^k)} µ_j^k ∇G_j(x^k, y^k) + Σ_{l∈I_g(x^k,y^k)} δ_l^k ∇g_l(x^k, y^k) + Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k).

For i ∈ I_F(x^k, y^k) ∩ I_G(x^k, y^k), we have from (3.16) and (3.20) that

a_k(1/k − y_i^k) + kλ_i^k = (1/k)a_k + kλ_i^k ≥ 0,  µ_i^k ≥ 0,

and hence, comparing with (3.24) and (3.25), we see that (x^k, y^k) is a B-stationary point of problem (3.1). This shows the first half of statement (b). Next we suppose (x^{k'}, y^{k'}) ∈ F for infinitely many k' and show that (x̄, ȳ) is a B-stationary point of problem (3.1). In fact, since for any sufficiently large k',

I_F(x^{k'}, y^{k'}) ⊆ I_F(x̄, ȳ),  I_G(x^{k'}, y^{k'}) ⊆ I_G(x̄, ȳ),  I_g(x^{k'}, y^{k'}) ⊆ I_g(x̄, ȳ),

we have

∇f(x^{k'}, y^{k'}) = Σ_{i∈I_F(x̄,ȳ)} u_i^{k'} ∇F_i(x^{k'}, y^{k'}) + Σ_{j∈I_G(x̄,ȳ)} v_j^{k'} ∇G_j(x^{k'}, y^{k'}) − Σ_{l∈I_g(x̄,ȳ)} w_l^{k'} ∇g_l(x^{k'}, y^{k'}) − Σ_{r=1}^q γ_r^{k'} ∇h_r(x^{k'}, y^{k'}).

By the assumptions of the theorem, the multiplier sequences converge. Letting k' → ∞, we obtain the B-stationarity of (x̄, ȳ). This shows the second half of statement (b).

(II) Next we assume that (x^k, y^k) ∉ F for all sufficiently large k.

(IIa) We consider the case where I_F(x̄, ȳ) ≠ ∅.


(i) We first prove statement (c), i.e., if there is a subsequence {k_l} of {k} such that 0 ∉ I_{φ^{k_l}}(x^{k_l}, y^{k_l}) for all l, then (x̄, ȳ) is a B-stationary point of problem (3.1). In fact, noting that, by (3.20) and (3.21),

Σ_{i∈I_F(x̄,ȳ)} u_i^{k_l} = Σ_{i∈I_{φ^{k_l}}(x^{k_l},y^{k_l})} (k_l λ_i^{k_l} + a_{k_l}(1/k_l − y_i^{k_l})) + Σ_{i∈I_F(x̄,ȳ)\I_{φ^{k_l}}(x^{k_l},y^{k_l})} a_{k_l}(1/k_l − y_i^{k_l})
= a_{k_l} (k_l + Σ_{i∈I_F(x̄,ȳ)} (1/k_l − y_i^{k_l}))

and

lim_{l→∞} Σ_{i∈I_F(x̄,ȳ)} u_i^{k_l} exists,  lim_{l→∞} (k_l + Σ_{i∈I_F(x̄,ȳ)} (1/k_l − y_i^{k_l})) = +∞,

we have

lim_{l→∞} a_{k_l} = 0.  (3.26)

Therefore, we obtain

lim_{k→∞} a_k(1/k − y_i^k) = lim_{l→∞} a_{k_l}(1/k_l − y_i^{k_l}) = 0,  i ∉ I_F(x̄, ȳ)  (3.27)

and

lim_{k→∞} a_k F_j(x^k, y^k) = lim_{l→∞} a_{k_l} F_j(x^{k_l}, y^{k_l}) = 0,  j ∉ I_G(x̄, ȳ).  (3.28)

On the other hand, by (3.16), (3.26), and (3.21)–(3.22), we have

lim_{k→∞} u_i^k ≥ 0,  lim_{k→∞} v_i^k ≥ 0,  i ∈ I_G(x̄, ȳ) ∩ I_F(x̄, ȳ).  (3.29)

It then follows from (3.19) and (3.27)–(3.29) that conditions (3.24) and (3.25) hold. Therefore, (x̄, ȳ) is a B-stationary point of problem (3.1). This shows statement (c). The rest of the proof is devoted to showing statement (a).

(ii) Suppose that 0 ∈ I_{φ^k}(x^k, y^k) for all sufficiently large k. Then it follows from (3.5) that

I_{φ^k}(x^k, y^k) = {0} ∪ I_F(x^k, y^k).  (3.30)

(iia) If there exist a subsequence {k_l} of {k} and an index i_0 such that

i_0 ∉ I_G(x̄, ȳ),  i_0 ∈ I_F(x̄, ȳ) \ I_F(x^{k_l}, y^{k_l}),  ∀l


or

i_0 ∉ I_F(x̄, ȳ),  i_0 ∈ I_G(x̄, ȳ) \ I_G(x^{k_l}, y^{k_l}),  ∀l,

then, by (3.21) and (3.22),

u_{i_0}^{k_l} = a_{k_l}(1/k_l − y_{i_0}^{k_l}),  ∀l  (3.31)

or

v_{i_0}^{k_l} = −a_{k_l} F_{i_0}(x^{k_l}, y^{k_l}),  ∀l  (3.32)

holds. Since

lim_{l→∞} (1/k_l − y_{i_0}^{k_l}) = −ȳ_{i_0} < 0

in the former case, or

lim_{l→∞} F_{i_0}(x^{k_l}, y^{k_l}) = F_{i_0}(x̄, ȳ) > 0

in the latter case, it follows from (3.31) or (3.32) that {a_{k_l}} converges. Then we also have (3.27)–(3.29), and hence (x̄, ȳ) is a B-stationary point of problem (3.1).

(iib) Now suppose that

{1, · · · , m} \ I_F(x̄, ȳ) ⊆ I_G(x^k, y^k)  (3.33)

and

{1, · · · , m} \ I_G(x̄, ȳ) ⊆ I_F(x^k, y^k)  (3.34)

for all sufficiently large k. Then, since F_j(x^k, y^k) = 0 for any j ∉ I_G(x̄, ȳ) and y_i^k = 0 for any i ∉ I_F(x̄, ȳ), (3.19) yields

∇f(x^k, y^k) = Σ_{i∈I_F(x̄,ȳ)} u_i^k ∇F_i(x^k, y^k) + Σ_{j∈I_G(x̄,ȳ)} v_j^k ∇G_j(x^k, y^k) + Σ_{i∉I_F(x̄,ȳ)} (a_k/k) ∇F_i(x^k, y^k) − Σ_{l∈I_g(x̄,ȳ)} w_l^k ∇g_l(x^k, y^k) − Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k)  (3.35)

for all sufficiently large k. If F(x̄, ȳ) = 0, i.e., I_F(x̄, ȳ) = {1, · · · , m}, then (3.35) implies that the limit (x̄, ȳ) of {(x^k, y^k)} satisfies the weak stationarity condition (3.24) for problem (3.1). If F(x̄, ȳ) ≠ 0, then there exists an index i such that F_i(x̄, ȳ) > 0 and ȳ_i = 0, which implies

I_G(x̄, ȳ) \ I_F(x̄, ȳ) ≠ ∅


and

Σ_{i∈I_G(x̄,ȳ)\I_F(x̄,ȳ)} F_i(x̄, ȳ) > 0.  (3.36)

By (3.33) and (3.34), for all sufficiently large k, we have

0 = φ_0^k(x^k, y^k)
= Σ_{i=1}^m (1/k − y_i^k) F_i(x^k, y^k)
= Σ_{i∈I_G(x̄,ȳ)} (1/k − y_i^k) F_i(x^k, y^k)
= Σ_{i∈I_G(x̄,ȳ)∩I_F(x̄,ȳ)} (1/k − y_i^k) F_i(x^k, y^k) + Σ_{i∈I_G(x̄,ȳ)\I_F(x̄,ȳ)} (1/k) F_i(x^k, y^k).  (3.37)

For any i ∈ I_F(x̄, ȳ) ∩ I_G(x̄, ȳ), it follows from (3.21) and (3.30) that

|a_k(1/k − y_i^k) F_i(x^k, y^k)| = 0 if i ∈ I_F(x^k, y^k), and |a_k(1/k − y_i^k) F_i(x^k, y^k)| = |u_i^k F_i(x^k, y^k)| if i ∈ I_F(x̄, ȳ) \ I_F(x^k, y^k); in either case,

|a_k(1/k − y_i^k) F_i(x^k, y^k)| ≤ |u_i^k F_i(x^k, y^k)|,

and so

lim_{k→∞} a_k(1/k − y_i^k) F_i(x^k, y^k) = 0.  (3.38)

Hence, by (3.36), (3.37), and (3.38), we have

lim_{k→∞} (a_k/k) = −lim_{k→∞} (N_k/D_k) = 0,  (3.39)

where

N_k = Σ_{i∈I_G(x̄,ȳ)∩I_F(x̄,ȳ)} a_k(1/k − y_i^k) F_i(x^k, y^k)

and

D_k = Σ_{i∈I_G(x̄,ȳ)\I_F(x̄,ȳ)} F_i(x^k, y^k).

Therefore, taking a limit in (3.35), we obtain (3.24) from (3.39). Now we proceed to showing

λ_i µ_i ≥ 0, i ∈ I_F(x̄, ȳ) ∩ I_G(x̄, ȳ),  (3.40)

i.e., (x̄, ȳ) is C-stationary. Let i ∈ I_F(x̄, ȳ) ∩ I_G(x̄, ȳ). Note that, by the assumption of (ii),

k F_i(x^k, y^k) = φ_i^k(x^k, y^k) − φ_0^k(x^k, y^k) = φ_i^k(x^k, y^k) ≥ 0, i = 1, · · · , m,


i.e.,

F(x^k, y^k) ≥ 0  (3.41)

for all sufficiently large k. Suppose that there exists a subsequence {k_l} of {k} such that

y_i^{k_l} F_i(x^{k_l}, y^{k_l}) ≠ 0, ∀l.

It follows from (3.21) and (3.22) that

u_i^{k_l} = a_{k_l}(1/k_l − y_i^{k_l}),  v_i^{k_l} = −a_{k_l} F_i(x^{k_l}, y^{k_l}).

By (3.39) and (3.41), we have

lim_{k→∞} u_i^k v_i^k = lim_{l→∞} u_i^{k_l} v_i^{k_l} = lim_{l→∞} a_{k_l}² y_i^{k_l} F_i(x^{k_l}, y^{k_l}) ≥ 0.  (3.42)

Next we suppose that

y_i^k F_i(x^k, y^k) = 0  (3.43)

for all sufficiently large k. First consider the case where there exists a subsequence {k_l} of {k} such that

y_i^{k_l} ≠ 0, ∀l.

Then, by (3.43) and (3.22), F_i(x^{k_l}, y^{k_l}) = 0 and hence v_i^{k_l} = 0 for any sufficiently large l. So we obtain

lim_{k→∞} u_i^k v_i^k = lim_{l→∞} u_i^{k_l} v_i^{k_l} = 0.  (3.44)

Next consider the case where y_i^k = 0 for all sufficiently large k. If there exists a subsequence {k_l} of {k} such that

F_i(x^{k_l}, y^{k_l}) ≠ 0, ∀l,

then, by (3.21) and (3.39), we have

lim_{l→∞} u_i^{k_l} = lim_{l→∞} (a_{k_l}/k_l) = 0

and so (3.44) also holds. If for any sufficiently large k,

y_i^k = 0,  F_i(x^k, y^k) = 0,


then, by (3.16), (3.21)–(3.22), and (3.39),

lim_{k→∞} u_i^k v_i^k = lim_{k→∞} µ_i^k (kλ_i^k + a_k/k) = lim_{k→∞} kλ_i^k µ_i^k ≥ 0.

Therefore, we always have lim_{k→∞} u_i^k v_i^k ≥ 0, i.e., (x̄, ȳ) is a C-stationary point of problem (3.1). Moreover, if (x̄, ȳ) is nondegenerate, then it readily follows from the definitions of weak stationarity and nondegeneracy that (x̄, ȳ) is B-stationary to problem (3.1).

(IIb) Consider the case where I_F(x̄, ȳ) = ∅. Then I_G(x̄, ȳ) = {1, · · · , m} and so (x̄, ȳ) is nondegenerate. Moreover, (3.19) becomes

∇f(x^k, y^k) = Σ_{i=1}^m a_k(1/k − y_i^k) ∇F_i(x^k, y^k) + Σ_{j=1}^m v_j^k ∇G_j(x^k, y^k) − Σ_{l∈I_g(x̄,ȳ)} w_l^k ∇g_l(x^k, y^k) − Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k).  (3.45)

For any sufficiently large k, since (x^k, y^k) ∉ F, there exists an index j ∈ I_G(x̄, ȳ) \ I_G(x^k, y^k). Therefore, we can choose an index j_0 and a subsequence {k_l} of {k} such that

j_0 ∈ I_G(x̄, ȳ) \ I_G(x^{k_l}, y^{k_l}), ∀l,

i.e., by (3.22),

v_{j_0}^{k_l} = −a_{k_l} F_{j_0}(x^{k_l}, y^{k_l}), ∀l.

Since {v_{j_0}^{k_l}} converges and, by I_F(x̄, ȳ) = ∅,

lim_{l→∞} F_{j_0}(x^{k_l}, y^{k_l}) > 0,

it follows that the sequence {a_{k_l}} is convergent. Noticing that y^k tends to ȳ = 0 as k → ∞, we have that, for each j,

lim_{k→∞} a_k(1/k − y_j^k) = lim_{l→∞} a_{k_l}(1/k_l − y_j^{k_l}) = 0.

Letting k → ∞ in (3.45) and denoting

v_j = lim_{k→∞} v_j^k,  w_l = lim_{k→∞} w_l^k,  γ_r = lim_{k→∞} γ_r^k,

we obtain

∇f(x̄, ȳ) = Σ_{j=1}^m v_j ∇G_j(x̄, ȳ) − Σ_{l∈I_g(x̄,ȳ)} w_l ∇g_l(x̄, ȳ) − Σ_{r=1}^q γ_r ∇h_r(x̄, ȳ).


This, together with

I_F(x̄, ȳ) ∩ I_G(x̄, ȳ) = ∅,

implies that (x̄, ȳ) is a B-stationary point of problem (3.1).

Combining case (IIa)(ii) and case (IIb) shows that statement (a) holds. This completes the proof.

For a sequence {(x^k, y^k)} of stationary points of problem (3.3), let us define

I_1 := {i | y_i^k > 0 for infinitely many k},
I_2 := {i | F_i(x^k, y^k) ≠ 0 for infinitely many k}.

Then we have

{1, · · · , m} \ I_G(x̄, ȳ) ⊆ I_1 ∩ I_F(x̄, ȳ),
{1, · · · , m} \ I_F(x̄, ȳ) ⊆ I_2 ∩ I_G(x̄, ȳ).

From the proof of Theorem 3.7, we have the next corollary immediately.

Corollary 3.1 Let the assumptions in Theorem 3.7 be satisfied. If I_1 \ I_F(x̄, ȳ) ≠ ∅ or I_2 \ I_G(x̄, ȳ) ≠ ∅, then (x̄, ȳ) is a B-stationary point of problem (3.1).

Next we consider some other sufficient conditions for M- and B-stationarity for problem (3.1). We say (x^k, y^k) ∈ F_k satisfies the second-order necessary conditions if there exist multiplier vectors λ^k ∈ ℜ^{m+1}, µ^k ∈ ℜ^m, γ^k ∈ ℜ^q, and δ^k ∈ ℜ^p such that

λ^k ≥ 0, µ^k ≥ 0, δ^k ≥ 0,  (3.46)

(λ^k)^T φ^k(x^k, y^k) = 0,  (µ^k)^T G(x^k, y^k) = 0,  (δ^k)^T g(x^k, y^k) = 0,  (3.47)

∇_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) = 0,  (3.48)

and

d^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d ≥ 0,  ∀d ∈ T_k(x^k, y^k),  (3.49)

where

L_k(x, y, λ, µ, δ, γ) := f(x, y) − λ^T φ^k(x, y) − µ^T G(x, y) + δ^T g(x, y) + γ^T h(x, y)

stands for the Lagrangian of problem (3.3) and, for (x, y) ∈ F_k,

T_k(x, y) := {d ∈ ℜ^{n+m} | d^T ∇φ_i^k(x, y) = 0, i ∈ I_{φ^k}(x, y); d^T ∇G_j(x, y) = 0, j ∈ I_G(x, y); d^T ∇g_l(x, y) = 0, l ∈ I_g(x, y); d^T ∇h_r(x, y) = 0, r = 1, 2, · · · , q}.


We next introduce a new kind of condition weaker than the second-order necessary conditions for problem (3.3). Suppose that α_k is a nonnegative number. We say that, at a stationary point (x^k, y^k) of problem (3.3), the matrix ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) is bounded below with constant α_k on the corresponding tangent space T_k(x^k, y^k) if

d^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d ≥ −α_k ‖d‖²,  ∀d ∈ T_k(x^k, y^k).  (3.50)

Condition (3.50) is clearly weaker than (3.49). In fact, for the matrix ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k), there always exists a number α_k such that (3.50) holds; for example, any nonnegative number α such that −α is less than the smallest eigenvalue of ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) satisfies (3.50). Condition (3.49), in contrast, requires the matrix ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) to be positive semidefinite on the tangent space T_k(x^k, y^k). Note that, in (3.50), the constant −α_k may be larger than the smallest eigenvalue mentioned above.
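Numerically, the smallest admissible constant α_k in (3.50) can be estimated by reducing the Hessian of the Lagrangian to the tangent space. The following MATLAB sketch illustrates this under the assumption that the columns of a (hypothetical) matrix A collect the gradients of the constraints active at (x^k, y^k), and L holds ∇²_{(x,y)}L_k there:

    Z = null(A');                        % orthonormal basis of T_k = {d : A'*d = 0}
    Hred = Z' * L * Z;                   % Hessian reduced to the tangent space
    lmin = min(eig((Hred + Hred')/2));   % smallest eigenvalue of the reduction
    alpha_k = max(-lmin, 0);             % smallest constant satisfying (3.50)
    % the second-order condition (3.49) would additionally require lmin >= 0

Since Z has orthonormal columns, ‖d‖ = ‖w‖ for d = Z w, so the bound d^T L d ≥ lmin ‖d‖² on T_k follows directly.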

Theorem 3.8 Let (x^k, y^k) ∈ F_k be a stationary point of problem (3.3) with multiplier vectors λ^k, µ^k, δ^k, and γ^k satisfying conditions (3.46)–(3.48) and, for each k, let ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) be bounded below with constant α_k on the corresponding tangent space T_k(x^k, y^k). Suppose that (x̄, ȳ) is an accumulation point of the sequence {(x^k, y^k)} with F(x̄, ȳ) ≠ 0, the sequence {α_k} is bounded, and the set of vectors

{∇F_i(x̄, ȳ), ∇G_i(x̄, ȳ), ∇g_l(x̄, ȳ), ∇h_r(x̄, ȳ) | i = 1, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, · · · , q}

is linearly independent. Then (x̄, ȳ) is an M-stationary point of problem (3.1). Furthermore, if (x̄, ȳ) satisfies the upper level strict complementarity condition, it is B-stationary to problem (3.1).

Proof: Since (3.46)–(3.48) are equivalent to (3.15) and (3.16), it follows from Theorem 3.7 (a) that (x̄, ȳ) is a C-stationary point of problem (3.1). By the proof of Theorem 3.7, (x̄, ȳ) may fail to be B-stationary only in case (IIa)(iib), i.e., when, for all sufficiently large k,

(x^k, y^k) ∉ F,  I_F(x̄, ȳ) ≠ ∅,  (3.51)
I_{φ^k}(x^k, y^k) = {0} ∪ I_F(x^k, y^k),  (3.52)
{1, 2, · · · , m} \ I_F(x̄, ȳ) ⊆ I_G(x^k, y^k),  (3.53)
{1, 2, · · · , m} \ I_G(x̄, ȳ) ⊆ I_F(x^k, y^k).  (3.54)

In the rest of the proof, we therefore assume (3.51)–(3.54) and use the same setting as in the proof of Theorem 3.7. Then (3.19) holds with (3.20)–(3.22). Suppose that (x̄, ȳ) is


not an M-stationary point of problem (3.1). Then, by the definitions of C-stationarity and M-stationarity, there exists an i_0 ∈ I_F(x̄, ȳ) ∩ I_G(x̄, ȳ) such that

u_{i_0} = lim_{k→∞} u_{i_0}^k < 0,  v_{i_0} = lim_{k→∞} v_{i_0}^k < 0,  (3.55)

where we use the fact that both the sequences {u_{i_0}^k} and {v_{i_0}^k} are convergent.

We claim that

y_{i_0}^k F_{i_0}(x^k, y^k) ≠ 0  (3.56)

for all sufficiently large k. In fact, if there exists a subsequence {k_l} of {k} such that

y_{i_0}^{k_l} F_{i_0}(x^{k_l}, y^{k_l}) = 0, ∀l,

namely,

y_{i_0}^{k_l} = 0 or F_{i_0}(x^{k_l}, y^{k_l}) = 0, ∀l,  (3.57)

then we have from (3.21)–(3.22) and (3.57) that

u_{i_0}^{k_l} ≥ 0 or v_{i_0}^{k_l} ≥ 0, ∀l.

This contradicts (3.55), and so (3.56) holds for all sufficiently large k. Then (3.55) becomes

u_{i_0} = lim_{k→∞} u_{i_0}^k = lim_{k→∞} a_k(1/k − y_{i_0}^k) < 0,  (3.58)
v_{i_0} = lim_{k→∞} v_{i_0}^k = −lim_{k→∞} a_k F_{i_0}(x^k, y^k) < 0  (3.59)

by (3.21) and (3.22). By Theorem 3.3 (ii), we may suppose that k is sufficiently large so that for any k,

{∇F_i(x^k, y^k), ∇G_i(x^k, y^k), ∇g_l(x^k, y^k), ∇h_r(x^k, y^k) | i = 1, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, · · · , q}

is linearly independent. Note that

lim_{k→∞} (1/k − y_{i_0}^k)/F_{i_0}(x^k, y^k) = −u_{i_0}/v_{i_0} < 0  (3.60)

by (3.58) and (3.59). Therefore, we can choose a bounded sequence {d^k} ⊆ ℜ^{n+m} such that for all sufficiently large k,

(d^k)^T ∇F_i(x^k, y^k) = 0, i = 1, · · · , m, i ≠ i_0;  (3.61)


(d^k)^T ∇G_j(x^k, y^k) = 0, j = 1, · · · , m, j ≠ i_0;  (3.62)
(d^k)^T ∇F_{i_0}(x^k, y^k) = 1;  (3.63)
(d^k)^T ∇G_{i_0}(x^k, y^k) = (1/k − y_{i_0}^k)/F_{i_0}(x^k, y^k);  (3.64)
(d^k)^T ∇g_l(x^k, y^k) = 0, l ∈ I_g(x̄, ȳ);  (3.65)
(d^k)^T ∇h_r(x^k, y^k) = 0, r = 1, 2, · · · , q.  (3.66)

Since

∇φ_0^k(x^k, y^k) = Σ_{i=1}^m (1/k − y_i^k) ∇F_i(x^k, y^k) − Σ_{j=1}^m F_j(x^k, y^k) ∇G_j(x^k, y^k),  (3.67)

we have from (3.61)–(3.64) that

(d^k)^T ∇φ_0^k(x^k, y^k) = Σ_{i=1}^m (1/k − y_i^k)(d^k)^T ∇F_i(x^k, y^k) − Σ_{j=1}^m F_j(x^k, y^k)(d^k)^T ∇G_j(x^k, y^k) = 0.  (3.68)

On the other hand, noting that i_0 ∉ I_F(x^k, y^k) ∪ I_G(x^k, y^k) for all sufficiently large k by (3.56), we have from (3.5), (3.61), and (3.68) that

(d^k)^T ∇φ_i^k(x^k, y^k) = (d^k)^T ∇φ_0^k(x^k, y^k) + k(d^k)^T ∇F_i(x^k, y^k) = 0,  0 ≠ i ∈ I_{φ^k}(x^k, y^k).  (3.69)

It follows from (3.62), (3.64)–(3.66), and (3.68)–(3.69) that

d^k ∈ T_k(x^k, y^k)  (3.70)

for all sufficiently large k. By (3.67), we have

∇²φ_0^k(x^k, y^k) = Σ_{i=1}^m (1/k − y_i^k) ∇²F_i(x^k, y^k) − Σ_{i=1}^m ∇F_i(x^k, y^k) ∇G_i(x^k, y^k)^T − Σ_{j=1}^m ∇G_j(x^k, y^k) ∇F_j(x^k, y^k)^T,  (3.71)

where we use the fact that ∇G_j(x^k, y^k), j = 1, · · · , m, are constant vectors. On the other hand, we can write

∇_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) = ∇f(x^k, y^k) − Σ_{i=0}^m λ_i^k ∇φ_i^k(x^k, y^k) − Σ_{j=1}^m µ_j^k ∇G_j(x^k, y^k) + Σ_{l=1}^p δ_l^k ∇g_l(x^k, y^k) + Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k)


= ∇f(x^k, y^k) − a_k ∇φ_0^k(x^k, y^k) − Σ_{i=1}^m kλ_i^k ∇F_i(x^k, y^k) − Σ_{j=1}^m µ_j^k ∇G_j(x^k, y^k) + Σ_{l=1}^p δ_l^k ∇g_l(x^k, y^k) + Σ_{r=1}^q γ_r^k ∇h_r(x^k, y^k),

where a_k = Σ_{i=0}^m λ_i^k is the same as in the proof of Theorem 3.7, and so we have from (3.71) that

∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) = ∇²f(x^k, y^k) − a_k ∇²φ_0^k(x^k, y^k) − Σ_{i=1}^m kλ_i^k ∇²F_i(x^k, y^k) + Σ_{l=1}^p δ_l^k ∇²g_l(x^k, y^k) + Σ_{r=1}^q γ_r^k ∇²h_r(x^k, y^k)

= ∇²f(x^k, y^k) + a_k Σ_{i=1}^m ∇F_i(x^k, y^k) ∇G_i(x^k, y^k)^T + a_k Σ_{j=1}^m ∇G_j(x^k, y^k) ∇F_j(x^k, y^k)^T − Σ_{i=1}^m (kλ_i^k + a_k(1/k − y_i^k)) ∇²F_i(x^k, y^k) + Σ_{l=1}^p δ_l^k ∇²g_l(x^k, y^k) + Σ_{r=1}^q γ_r^k ∇²h_r(x^k, y^k).

Since ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) is bounded below with constant α_k on the corresponding tangent space T_k(x^k, y^k), we have from (3.50) and (3.70) that there exists a constant C such that

(d^k)^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d^k ≥ −α_k ‖d^k‖² ≥ C,  (3.72)

where the last inequality follows from the boundedness of the sequences {α_k} and {d^k}. Note that

(d^k)^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d^k
= (d^k)^T ∇²f(x^k, y^k) d^k + a_k Σ_{i=1}^m (d^k)^T ∇F_i(x^k, y^k) ∇G_i(x^k, y^k)^T d^k + a_k Σ_{j=1}^m (d^k)^T ∇G_j(x^k, y^k) ∇F_j(x^k, y^k)^T d^k − Σ_{i=1}^m (kλ_i^k + a_k(1/k − y_i^k)) (d^k)^T ∇²F_i(x^k, y^k) d^k + Σ_{l=1}^p δ_l^k (d^k)^T ∇²g_l(x^k, y^k) d^k + Σ_{r=1}^q γ_r^k (d^k)^T ∇²h_r(x^k, y^k) d^k

= (d^k)^T ∇²f(x^k, y^k) d^k + 2a_k(1/k − y_{i_0}^k)/F_{i_0}(x^k, y^k) − Σ_{i=1}^m (kλ_i^k + a_k(1/k − y_i^k)) (d^k)^T ∇²F_i(x^k, y^k) d^k + Σ_{l=1}^p δ_l^k (d^k)^T ∇²g_l(x^k, y^k) d^k + Σ_{r=1}^q γ_r^k (d^k)^T ∇²h_r(x^k, y^k) d^k.  (3.73)

By the twice continuous differentiability of the functions involved, the boundedness of the sequence {d^k}, and the convergence of the sequences {(x^k, y^k)}, {δ_l^k}, and {γ_r^k}, the terms

(d^k)^T ∇²f(x^k, y^k) d^k,  Σ_{l=1}^p δ_l^k (d^k)^T ∇²g_l(x^k, y^k) d^k,  Σ_{r=1}^q γ_r^k (d^k)^T ∇²h_r(x^k, y^k) d^k

are all bounded. Noticing that, for all sufficiently large k, i ∉ I_F(x^k, y^k) implies i ∉ I_{φ^k}(x^k, y^k) by (3.52) and so λ_i^k = 0 by (3.46) and (3.47), we have from the convergence of the sequences in (3.23) and the definition (3.21) of u_i^k that the sequence {kλ_i^k + a_k(1/k − y_i^k)} is bounded for any i = 1, 2, · · · , m. Hence, the term

Σ_{i=1}^m (kλ_i^k + a_k(1/k − y_i^k)) (d^k)^T ∇²F_i(x^k, y^k) d^k

is also bounded. However, since lim_{k→∞} y_{i_0}^k = ȳ_{i_0} = 0, we have a_k → +∞ by (3.58) and so

2a_k(1/k − y_{i_0}^k)/F_{i_0}(x^k, y^k) → −∞

as k → ∞ by (3.60). Therefore, it follows from (3.73) that

(d^k)^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d^k → −∞

as k → ∞. This contradicts (3.72), and hence (x̄, ȳ) is M-stationary to problem (3.1). This completes the proof of the first part of the theorem. The second part of the theorem follows immediately from the definitions of M-stationarity and the upper level strict complementarity.

Corollary 3.2 Let {(x^k, y^k)} and (x̄, ȳ) be the same as in Theorem 3.8. If (x^k, y^k), together with the corresponding multiplier vectors λ^k, µ^k, δ^k, and γ^k, satisfies the second-order necessary conditions (3.46)–(3.49) and the set of vectors

{∇F_i(x̄, ȳ), ∇G_i(x̄, ȳ), ∇g_l(x̄, ȳ), ∇h_r(x̄, ȳ) | i = 1, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, · · · , q}

is linearly independent, then the conclusion of Theorem 3.8 remains true.

Corollary 3.2 establishes convergence to a B-stationary point under the second-order necessary conditions and the upper level strict complementarity. These or similar conditions have also been assumed in [31, 62], but they are somewhat restrictive and may be difficult to verify in practice. The next theorem provides a new condition for convergence to a B-stationary point, which can be dealt with more easily. We note that, unlike [31, 62], it relies on neither the upper level strict complementarity nor the asymptotic weak nondegeneracy.

Theorem 3.9 Let {(x^k, y^k)} and (x̄, ȳ) be the same as in Theorem 3.8 and λ^k, µ^k, δ^k, and γ^k be the multiplier vectors corresponding to (x^k, y^k) with (3.46)–(3.49). Let β_k be the smallest eigenvalue of the matrix ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k). If the sequence {β_k} is bounded below and

{∇F_i(x̄, ȳ), ∇G_i(x̄, ȳ), ∇g_l(x̄, ȳ), ∇h_r(x̄, ȳ) | i = 1, 2, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, 2, · · · , q}

is linearly independent, then (x̄, ȳ) is a B-stationary point of problem (3.1).

Proof: It is easy to see that the assumptions of Theorem 3.8 are satisfied with α_k = max{−β_k, 0}, and so (x̄, ȳ) is an M-stationary point of problem (3.1). Suppose that (x̄, ȳ) is not B-stationary to problem (3.1). As mentioned at the beginning of the proof of Theorem 3.8, this occurs only in the case where (3.51)–(3.54) hold for all sufficiently large k. By the definitions of B- and M-stationarity, there exists an i_0 ∈ I_F(x̄, ȳ) ∩ I_G(x̄, ȳ) such that

u_{i_0} = lim_{k→∞} u_{i_0}^k < 0,  v_{i_0} = lim_{k→∞} v_{i_0}^k = 0  (3.74)

or

u_{i_0} = lim_{k→∞} u_{i_0}^k = 0,  v_{i_0} = lim_{k→∞} v_{i_0}^k < 0.  (3.75)

From (3.20)–(3.22) and (3.46), we know that either of (3.74) and (3.75) implies

lim_{k→∞} a_k = +∞.  (3.76)

By Theorem 3.3, we may suppose that k is large enough that (3.51)–(3.54) hold,

I_F(x^k, y^k) ⊆ I_F(x̄, ȳ),  I_G(x^k, y^k) ⊆ I_G(x̄, ȳ),  I_g(x^k, y^k) ⊆ I_g(x̄, ȳ),

and

{∇F_i(x^k, y^k), ∇G_i(x^k, y^k), ∇g_l(x^k, y^k), ∇h_r(x^k, y^k) | i = 1, · · · , m, l ∈ I_g(x̄, ȳ), r = 1, · · · , q}


is linearly independent. Therefore, we can choose a vector d^k ∈ ℜ^{n+m} such that (3.61)–(3.62) and (3.65)–(3.66) hold and

(d^k)^T ∇F_{i_0}(x^k, y^k) = 1,  (d^k)^T ∇G_{i_0}(x^k, y^k) = −1.

Furthermore, we can choose the sequence {d^k} to be bounded. By the assumptions of the theorem, there exists a constant C such that

(d^k)^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d^k ≥ β_k ‖d^k‖² ≥ C  (3.77)

holds for all sufficiently large k. Note that, by the definition of d^k and (3.73),

(d^k)^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d^k
= (d^k)^T ∇²f(x^k, y^k) d^k − 2a_k − Σ_{i=1}^m (kλ_i^k + a_k(1/k − y_i^k)) (d^k)^T ∇²F_i(x^k, y^k) d^k + Σ_{l=1}^p δ_l^k (d^k)^T ∇²g_l(x^k, y^k) d^k + Σ_{r=1}^q γ_r^k (d^k)^T ∇²h_r(x^k, y^k) d^k.  (3.78)

In a way similar to the proof of Theorem 3.8, we can show that all the terms on the right-hand side of (3.78) except the term −2a_k are bounded. This, together with (3.76), implies that

(d^k)^T ∇²_{(x,y)} L_k(x^k, y^k, λ^k, µ^k, δ^k, γ^k) d^k → −∞

as k → ∞. This contradicts (3.77), and hence (x̄, ȳ) is B-stationary to problem (3.1). This completes the proof.

3.3 Computational Results

We have tested the method on various small-scale examples of MPECs that have been used to test other methods in the literature. We applied the MATLAB 6.0 built-in solver fmincon to problem (3.3) with various values of k. The computational results are summarized in Tables 3.1–3.4, which indicate that the proposed method produces good approximate solutions of (3.1) in a small number of iterations. In the tables, (x^k, y^k) is the (approximate) solution of (3.1) produced by solving (3.3), Ite stands for the number of iterations spent by fmincon, and r(x^k, y^k) denotes the residual for the constraints of problem (3.1) at (x^k, y^k), i.e.,

r(x^k, y^k) := Σ_{l=1}^p (g_l(x^k, y^k))_+ + Σ_{r=1}^q |h_r(x^k, y^k)| + Σ_{j=1}^m (−y_j^k)_+ + Σ_{i=1}^m (−F_i(x^k, y^k))_+ + |(y^k)^T F(x^k, y^k)|.
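For instance, the residual can be evaluated by a short MATLAB function of the following kind, where the handles gfun, hfun, and Ffun (hypothetical names) return the values of g, h, and F:

    function r = residual(x, y, gfun, hfun, Ffun)
    % Residual of the MPEC constraints of (3.1) at the point (x, y)
    gv = gfun(x, y); hv = hfun(x, y); Fv = Ffun(x, y);
    r = sum(max(gv, 0)) + sum(abs(hv)) + sum(max(-y, 0)) ...
        + sum(max(-Fv, 0)) + abs(y' * Fv);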


Table 3.1: Computational results for Problem 3.1

Size (p, m, n): (1, 1, 2);  initial point: (3, 0, 0).
k = 10, 10²:  (x^k, y^k) = (2.7101, 0.5365, 0),  Ite = 7,  f(x^k, y^k) = 10.4925,  r(x^k, y^k) = 0.

Table 3.2: Computational results for Problem 3.2

Size (m, n, p, q): (2, 4, 4, 2);  initial point: (1, 1, 1, 1, 0, 0).
k = 10, 10², 10⁴:  (x^k, y^k) = (0.5000, 0.5000, 0.5000, 0.5000, 0, 0),  Ite = 2,  f(x^k, y^k) = −1.0000,  r(x^k, y^k) = 0.

Problem 3.1. This problem is given in [77]; it has two upper-level variables (x_1, x_2) ∈ ℜ² and one lower-level variable y ∈ ℜ:

minimize  x_1² + 10(x_2 − 1)² + (y + 1)²
subject to  x_2 ≥ 0,  x_1 − e^{x_2} − e^y ≥ 0,
y ≥ 0,  y(x_1 − e^{x_2} − e^y) = 0.
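As an illustration of the experiments behind Table 3.1, the relaxed problem (3.3) for this example (m = 1, with variable z = (x_1, x_2, y)) can be handed to fmincon along the following lines; the sketch assumes the relaxed constraints are y ≥ 0 and (e_i^k − y)^T F(x, y) ≥ 0, i = 0, 1, with e_0^k = 1/k and e_1^k = 1/k + k:

    k = 100;
    f = @(z) z(1)^2 + 10*(z(2)-1)^2 + (z(3)+1)^2;
    Ffun = @(z) z(1) - exp(z(2)) - exp(z(3));       % F(x,y), m = 1
    % fmincon expects c(z) <= 0, so the relaxed constraints are negated:
    nonlcon = @(z) deal([-(1/k     - z(3))*Ffun(z);    % -phi_0^k(z) <= 0
                         -(1/k + k - z(3))*Ffun(z)], []);
    lb = [-Inf; 0; 0];                              % x2 >= 0 and y >= 0
    z0 = [3; 0; 0];                                 % initial point of Table 3.1
    zk = fmincon(f, z0, [], [], [], [], lb, [], nonlcon);

A run of this kind is expected to return an approximate solution close to the point (2.7101, 0.5365, 0) reported in Table 3.1.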

Problem 3.2. This is equivalent to Problem 5 in [24] and goes back to [67].

minimize  x_1² − 2x_1 + x_2² − 2x_2 + x_3² + x_4²
subject to  0 ≤ x_1 ≤ 2,  0 ≤ x_2 ≤ 2,
x_3 − x_1 + x_3 y_1 − y_1 = 0,
x_4 − x_2 + x_4 y_2 − y_2 = 0,
y_1 ≥ 0,  y_2 ≥ 0,
F(x, y) ≥ 0,  y^T F(x, y) = 0,

where

F(x, y) = (0.25 − (x_3 − 1)²,  0.25 − (x_4 − 1)²)^T.


Table 3.3: Computational results for Problem 3.3

Size (p, m, n): (3, 6, 2);  initial point: (0, 0, 1.60, 0.20, 0.44, 1.36, 0, 0).
k = 10²:  (x^k, y^k) = (0, 2, 1.9034, 0.9276, 0, 1.2689, 0, 0),  Ite = 7,  f(x^k, y^k) = −12.7533,  r(x^k, y^k) = 0.0703.
k = 10⁴:  (x^k, y^k) = (0, 2, 1.8753, 0.9065, 0, 1.2502, 0, 0),  Ite = 6,  f(x^k, y^k) = −12.6795,  r(x^k, y^k) = 0.0007.
k = 10⁶, 10⁸:  (x^k, y^k) = (0, 2, 1.8750, 0.9063, 0, 1.2500, 0, 0),  Ite = 6,  f(x^k, y^k) = −12.6787,  r(x^k, y^k) = 0.0004.

Problem 3.3. This is Problem 11 in [24], which is equivalent to the following MPEC:

minimize  −x_1² − 3x_2 − 4y_1 + y_2²
subject to  y ≥ 0,  F(x, y) ≥ 0,  y^T F(x, y) = 0,
x_1² + 2x_2 ≤ 4,  x_1 ≥ 0,  x_2 ≥ 0,

where

F(x, y) = (2y_1 + 2y_3 − 3y_4 − y_5,
           −5 − y_3 + 4y_4 − y_6,
           x_1² − 2x_1 + x_2² − 2y_1 + y_2 + 3,
           x_2 + 3y_1 − 4y_2 − 4,
           y_1,
           y_2)^T.

Problem 3.4. This is equivalent to Problem 10 in [24].

minimize  (x_5 + x_7 − 200)(x_5 + x_7) + (x_6 + x_8 − 160)(x_6 + x_8)
subject to  0 ≤ x_1 ≤ 10,  0 ≤ x_2 ≤ 5,
0 ≤ x_3 ≤ 15,  0 ≤ x_4 ≤ 20,
x_1 + x_2 + x_3 + x_4 ≤ 40,
x_5 − 4 + 0.4y_1 + 0.6y_2 − y_3 + y_4 = 0,
x_6 − 13 + 0.7y_1 + 0.3y_2 − y_5 + y_6 = 0,
x_7 − 35 + 0.4y_7 + 0.6y_8 − y_9 + y_{10} = 0,
x_8 − 2 + 0.7y_7 + 0.3y_8 − y_{11} + y_{12} = 0,
y ≥ 0,  F(x, y) ≥ 0,  y^T F(x, y) = 0,

where

F(x, y) = (x_1 − 0.4x_5 − 0.7x_6,
           x_2 − 0.6x_5 − 0.3x_6,
           x_5,  −x_5 + 20,
           x_6,  −x_6 + 20,
           x_3 − 0.4x_7 − 0.7x_8,
           x_4 − 0.6x_7 − 0.3x_8,
           x_7,  −x_7 + 40,
           x_8,  −x_8 + 40)^T.

Since y stands for the Lagrange multiplier vector in the original problem [24], we only list the values of x in Table 3.4.

3.4 Concluding Remarks

Suppose that the condition Σ_{j=1}^m y_j ≤ m/k + k is retained in the constraints of problem (3.3), i.e., problem (3.3) is replaced by the problem

minimize  f(x, y)
subject to  g(x, y) ≤ 0,  h(x, y) = 0,  (3.79)
y ≥ 0,  Σ_{j=1}^m y_j ≤ m/k + k,
(e_i^k − y)^T F(x, y) ≥ 0,  i = 0, 1, · · · , m.

Then, since the constraint

Σ_{j=1}^m y_j ≤ m/k + k


Table 3.4: Computational results for Problem 3.4

Size (m, n, p, q): (12, 8, 9, 4);  initial point x⁰ = (0, 0, 0, 0, 0, 0, 0, 0).
k = 10²:  x^k = (6.9858, 2.9766, 12.0064, 18.0312, −0.0173, 10.0143, 30.0896, −0.0173),  Ite = 19,  f(x^k, y^k) = −6.6097e+003,  r(x^k, y^k) = 0.5500.
k = 10⁴:  x^k = (7.0369, 3.0553, 11.9632, 17.9447, 0.0921, 10.0000, 29.9079, 0),  Ite = 20,  f(x^k, y^k) = −6.6000e+003,  r(x^k, y^k) = 0.1953e−004.
k = 10⁶:  x^k = (6.4449, 2.7621, 12.3172, 18.4758, 0, 9.2069, 30.7930, 0.0001),  Ite = 7,  f(x^k, y^k) = −6.5987e+003,  r(x^k, y^k) = 0.1200e−003.
k = 10⁸:  x^k = (6.4447, 2.7620, 12.3173, 18.4758, 0, 9.2068, 30.7932, 0),  Ite = 7,  f(x^k, y^k) = −6.5987e+003,  r(x^k, y^k) = 0.1200e−005.

eventually becomes inactive at any fixed point as k tends to ∞, all the results established in the previous sections remain true, except that the conclusion of Theorem 3.1 is replaced by F = lim_{k→∞} F_k. When the set Z := {z ∈ ℜ^{n+m} | g(z) ≤ 0, h(z) = 0} is bounded, problem (3.79) has a compact feasible region and so it is solvable for any k as long as it is feasible.

In addition, we remark that the term (1/k)e in (3.2) is necessary for problem (3.3) to have desirable properties. In fact, the problem

minimize  f(x, y)
subject to  g(x, y) ≤ 0,  h(x, y) = 0,  y ≥ 0,  (3.80)
(k e_i − y)^T F(x, y) ≥ 0,  i = 0, 1, · · · , m

is difficult to handle because problem (3.80) does not satisfy the MFCQ at any point (x, y) ∈ F for all sufficiently large k. For simplicity, we assume that the constraints


g(x, y) ≤ 0 and h(x, y) = 0 are absent and let

G(x, y) = y,  ψ_i^k(x, y) = (k e_i − y)^T F(x, y),  i = 0, 1, · · · , m.  (3.81)

Note that

ψ_i^k(x, y) = kF_i(x, y) + ψ_0^k(x, y),  i = 1, 2, · · · , m.  (3.82)

At (x, y) ∈ F, the set of active constraints is

{ψ_0^k, ψ_i^k, G_j | i ∈ I_F(x, y), j ∈ I_G(x, y)}.

Suppose the MFCQ holds at (x, y) for problem (3.80). Then there exists a vector (x̃, ỹ) ∈ ℜ^{n+m} such that

∇ψ_0^k(x, y)^T (x̃, ỹ) > 0,  (3.83)
∇ψ_i^k(x, y)^T (x̃, ỹ) > 0,  i ∈ I_F(x, y),  (3.84)
ỹ_j = ∇G_j(x, y)^T (x̃, ỹ) > 0,  j ∈ I_G(x, y).  (3.85)

(i) Assume that I_F(x, y) ≠ ∅ and k is large enough to satisfy

1 − (1/k) Σ_{i∈I_F(x,y)} y_i > 0.  (3.86)

By (3.82) and (3.84), we have

−∇F_i(x, y)^T (x̃, ỹ) < (1/k) ∇ψ_0^k(x, y)^T (x̃, ỹ),  i ∈ I_F(x, y).  (3.87)

It then follows from (3.81), (3.85), and (3.87) that

∇ψ_0^k(x, y)^T (x̃, ỹ) = −Σ_{i=1}^m y_i ∇F_i(x, y)^T (x̃, ỹ) − Σ_{j=1}^m F_j(x, y) ỹ_j
= −Σ_{i∈I_F(x,y)} y_i ∇F_i(x, y)^T (x̃, ỹ) − Σ_{j∈I_G(x,y)} F_j(x, y) ỹ_j
≤ ((1/k) Σ_{i∈I_F(x,y)} y_i) ∇ψ_0^k(x, y)^T (x̃, ỹ),

i.e.,

(1 − (1/k) Σ_{i∈I_F(x,y)} y_i) ∇ψ_0^k(x, y)^T (x̃, ỹ) ≤ 0.


By (3.86), we have

∇ψ_0^k(x, y)^T (x̃, ỹ) ≤ 0.

This contradicts (3.83), and hence the MFCQ does not hold at (x, y) for problem (3.80) when k is sufficiently large.

(ii) Suppose that I_F(x, y) = ∅. Then we have

y = 0,  F(x, y) > 0

and, by (3.85), ỹ > 0. It follows that

∇ψ_0^k(x, y)^T (x̃, ỹ) = −Σ_{i=1}^m y_i ∇F_i(x, y)^T (x̃, ỹ) − Σ_{j=1}^m F_j(x, y) ỹ_j = −Σ_{j=1}^m F_j(x, y) ỹ_j < 0,

which also contradicts (3.83), and so the MFCQ does not hold at (x, y) for problem (3.80) in this case either.


Chapter 4

A Modified Scheme for MPECs

Consider the following mathematical program with complementarity constraints:

minimize  f(z)
subject to  g(z) ≤ 0,  h(z) = 0,  (4.1)
G(z) ≥ 0,  H(z) ≥ 0,
G(z)^T H(z) = 0,

where f : ℜ^n → ℜ, g : ℜ^n → ℜ^p, h : ℜ^n → ℜ^q, and G, H : ℜ^n → ℜ^m are all twice continuously differentiable functions. Recently, Scholtes [76] presented the regularization scheme

minimize  f(z)
subject to  g(z) ≤ 0,  h(z) = 0,  (4.2)
G(z) ≥ 0,  H(z) ≥ 0,
G_i(z)H_i(z) ≤ ε,  i = 1, 2, · · · , m,

where ε is a positive parameter, as an approximation of problem (4.1) and proved, under the MPEC-LICQ and the upper level strict complementarity condition, that an accumulation point of stationary points satisfying the weak second-order necessary conditions for the relaxed problems is a B-stationary point of the original problem. In this chapter, we employ the following scheme as an approximation of problem (4.1):

minimize  f(z)
subject to  g(z) ≤ 0,  h(z) = 0,
G_i(z)H_i(z) ≤ ε²,  (4.3)
(G_i(z) + ε)(H_i(z) + ε) ≥ ε²,
i = 1, 2, · · · , m,

in which there are fewer constraints than in problem (4.2). We will show that the standard linear independence constraint qualification (LICQ) holds for the new relaxed problem under some mild conditions. Furthermore, we will give some sufficient conditions for B-stationarity of a feasible point of the original problem.
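In fmincon form, the nonlinear constraints of (4.3) can be written as a short function of the following kind; this is a minimal sketch in which the handles Gfun and Hfun (hypothetical names) evaluate G and H, and the constraints g(z) ≤ 0 and h(z) = 0 would be supplied separately:

    function [c, ceq] = nonlcon43(z, Gfun, Hfun, ep)
    % Nonlinear inequality constraints of problem (4.3), written as c(z) <= 0
    Gv = Gfun(z); Hv = Hfun(z);
    c = [Gv .* Hv - ep^2;                    % G_i(z) H_i(z) <= ep^2
         ep^2 - (Gv + ep) .* (Hv + ep)];     % (G_i(z)+ep)(H_i(z)+ep) >= ep^2
    ceq = [];

Note that only 2m inequalities appear here, whereas the Scholtes scheme (4.2) requires 3m.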

4.1 Some Results on Constraint Qualifications

In this section, we discuss constraint qualifications for problem (4.3). We let F and F_ε denote the feasible sets of problems (4.1) and (4.3), respectively, and let, for i = 1, · · · , m and z ∈ ℜ^n,

φ_{ε,i}(z) := (G_i(z) + ε)(H_i(z) + ε) − ε²,
ψ_{ε,i}(z) := G_i(z)H_i(z) − ε²,

and

Φ_ε(z) := (φ_{ε,1}(z), φ_{ε,2}(z), · · · , φ_{ε,m}(z))^T,
Ψ_ε(z) := (ψ_{ε,1}(z), ψ_{ε,2}(z), · · · , ψ_{ε,m}(z))^T.

Then we have

∇φ_{ε,i}(z) = (G_i(z) + ε)∇H_i(z) + (H_i(z) + ε)∇G_i(z),  (4.4)
∇ψ_{ε,i}(z) = H_i(z)∇G_i(z) + G_i(z)∇H_i(z)  (4.5)

for i = 1, · · · , m, and

∇Φ_ε(z) = (∇φ_{ε,1}(z), · · · , ∇φ_{ε,m}(z)),
∇Ψ_ε(z) = (∇ψ_{ε,1}(z), · · · , ∇ψ_{ε,m}(z)).

Theorem 4.1 We have F = ⋂_{ε>0} F_ε and, for any ε > 0 and any z ∈ F_ε,

I_{Φ_ε}(z) ∩ I_{Ψ_ε}(z) = ∅.  (4.6)

Proof: First of all, F ⊆ ⋂_{ε>0} F_ε is evident. Let z ∈ ⋂_{ε>0} F_ε. Then, for any ε > 0,

G_i(z)H_i(z) ≤ ε²,
G_i(z)H_i(z) + ε(G_i(z) + H_i(z)) ≥ 0,

and so

ε + (G_i(z) + H_i(z)) ≥ 0.

Letting ε → 0, we have

G_i(z)H_i(z) = 0,  G_i(z) + H_i(z) ≥ 0,  i = 1, 2, · · · , m.

This means that z ∈ F, and hence F = ⋂_{ε>0} F_ε.

Next we prove (4.6). Suppose that, for some ε > 0 and some z ∈ F_ε, i ∈ I_{Φ_ε}(z) ∩ I_{Ψ_ε}(z). Then

G_i(z)H_i(z) = ε²,
G_i(z)H_i(z) + ε(G_i(z) + H_i(z)) = 0.

Combining these equalities, we have

G_i(z) + H_i(z) + ε = 0.

It then follows that

0 = ε² − G_i(z)H_i(z) = ε² + H_i(z)² + εH_i(z) = (H_i(z) + ε/2)² + (3/4)ε²,

which is a contradiction, and so (4.6) holds.
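As a concrete illustration (with m = 1 and the hypothetical choice G(z) = z_1, H(z) = z_2), the relaxed feasible region is

F_ε = {(z_1, z_2) | z_1 z_2 ≤ ε², (z_1 + ε)(z_2 + ε) ≥ ε²},

a band around the nonnegative axes bounded by the hyperbolas z_1 z_2 = ε² and z_1 z_2 + ε(z_1 + z_2) = 0. As ε → 0, this band shrinks onto the L-shaped set {z | z ≥ 0, z_1 z_2 = 0} = F, and, in accordance with (4.6), no point of F_ε lies on both bounding hyperbolas at once.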

Next we show that, in contrast with problem (4.1), problem (4.3) satisfies the standard LICQ at a feasible point under some conditions.

Theorem 4.2 For any z̄ ∈ F, if the set of vectors

{∇g_l(z̄), ∇h_r(z̄), ∇G_i(z̄), ∇H_i(z̄) | l ∈ I_g(z̄), r = 1, · · · , q, i ∈ I_G(z̄) ∩ I_H(z̄)}

is linearly independent, then, for any fixed ε > 0, there exists a neighborhood U_ε(z̄) of z̄ such that problem (4.3) satisfies the LICQ at any point z ∈ U_ε(z̄) ∩ F_ε.

Proof: For any z̄ ∈ F, it is obvious that

ψ_{ε,i}(z̄) < 0,  i = 1, 2, · · · , m,

and

φ_{ε,i}(z̄) = 0 ⟺ i ∈ I_G(z̄) ∩ I_H(z̄).

On the other hand, it follows from the continuity of the functions g, Φ_ε, and Ψ_ε that, for any fixed ε > 0, there exists a neighborhood U_ε(z̄) of z̄ such that, for any point z ∈ U_ε(z̄) ∩ F_ε,

I_g(z) ⊆ I_g(z̄),  I_{Φ_ε}(z) ⊆ I_{Φ_ε}(z̄),  I_{Ψ_ε}(z) ⊆ I_{Ψ_ε}(z̄).

This means that all the functions

φ_{ε,i}, ψ_{ε,j},  i ∉ I_G(z̄) ∩ I_H(z̄),  j = 1, 2, · · · , m,

are inactive at z in problem (4.3). In addition, we have

H_i(z) + ε ≠ 0,  G_i(z) + ε ≠ 0,  i ∈ I_{Φ_ε}(z).

From (4.4), we obtain the conclusion immediately.

Note that, if z̄ ∈ F is nondegenerate, or lower level strictly complementary, namely,

I_G(z̄) ∩ I_H(z̄) = ∅,

then the condition in Theorem 4.2 becomes very simple. Furthermore, under the MPEC-LICQ, we have the following stronger result, in which the neighborhood is independent of the parameter ε.

Theorem 4.3 For any z̄ ∈ F, if the MPEC-LICQ holds at z̄, which means that

{∇g_l(z̄), ∇h_r(z̄), ∇G_i(z̄), ∇H_j(z̄) | l ∈ I_g(z̄), r = 1, · · · , q, i ∈ I_G(z̄), j ∈ I_H(z̄)}

is linearly independent, then there exist a neighborhood U(z̄) of z̄ and a positive constant ε̄ such that problem (4.3) satisfies the LICQ at any point z ∈ U(z̄) ∩ F_ε for any ε ∈ (0, ε̄).

Proof: We first consider the matrix functions whose columns consist of the vectors

∇g_l(z), l ∈ I_g(z̄);
∇h_r(z), r = 1, · · · , q;
∇G_i(z), i ∈ I_G(z̄) ∩ I_H(z̄);
∇G_i(z) + ((G_i(z) + ε)/(H_i(z) + ε)) ∇H_i(z)  or  ∇G_i(z) + (G_i(z)/H_i(z)) ∇H_i(z), i ∈ I_G(z̄) \ I_H(z̄);
∇H_j(z), j ∈ I_G(z̄) ∩ I_H(z̄);
∇H_j(z) + ((H_j(z) + ε)/(G_j(z) + ε)) ∇G_j(z)  or  ∇H_j(z) + (H_j(z)/G_j(z)) ∇G_j(z), j ∈ I_H(z̄) \ I_G(z̄).

Note that there are finitely many such matrix functions; we denote them by

A_1(z, ε), A_2(z, ε), · · · , A_N(z, ε).  (4.7)


Rearranging components if necessary, we may suppose that all these matrices converge to the same matrix A(z̄) with columns

∇g_l(z̄), l ∈ I_g(z̄),  (4.8)
∇h_r(z̄), r = 1, · · · , q,  (4.9)
∇G_i(z̄), i ∈ I_G(z̄),  (4.10)
∇H_j(z̄), j ∈ I_H(z̄),  (4.11)

respectively, as z → z̄ and ε → 0. It follows from the MPEC-LICQ assumption of the theorem that A(z̄) has full column rank. Since the functions G, H, and g are continuous, there exist a neighborhood U(z̄) of z̄ and a positive constant ε̄ such that, for any ε ∈ (0, ε̄) and any point z ∈ U(z̄) ∩ F_ε, all the matrices in (4.7) have full column rank and

I_G(z) ⊆ I_G(z̄),  I_H(z) ⊆ I_H(z̄),  I_g(z) ⊆ I_g(z̄).  (4.12)

Now we let ε ∈ (0, ε̄) and z ∈ U(z̄) ∩ F_ε and show that problem (4.3) satisfies the LICQ at z. Suppose that the multiplier vectors λ, µ, δ, and γ satisfy

Σ_{l∈I_g(z)} λ_l ∇g_l(z) + Σ_{r=1}^q µ_r ∇h_r(z) + Σ_{i∈I_{Φ_ε}(z)} δ_i ∇φ_{ε,i}(z) + Σ_{j∈I_{Ψ_ε}(z)} γ_j ∇ψ_{ε,j}(z) = 0.  (4.13)

By (4.4) and (4.5), we have

Σ_{i∈I_{Φ_ε}(z)} δ_i ∇φ_{ε,i}(z) = Σ_{i∈I_{Φ_ε}(z)∩I_G(z̄)∩I_H(z̄)} δ_i ((H_i(z) + ε)∇G_i(z) + (G_i(z) + ε)∇H_i(z))
+ Σ_{i∈I_{Φ_ε}(z)\I_H(z̄)} δ_i (H_i(z) + ε)(∇G_i(z) + ((G_i(z) + ε)/(H_i(z) + ε))∇H_i(z))
+ Σ_{i∈I_{Φ_ε}(z)\I_G(z̄)} δ_i (G_i(z) + ε)(∇H_i(z) + ((H_i(z) + ε)/(G_i(z) + ε))∇G_i(z))

and

Σ_{j∈I_{Ψ_ε}(z)} γ_j ∇ψ_{ε,j}(z) = Σ_{j∈I_{Ψ_ε}(z)∩I_G(z̄)∩I_H(z̄)} γ_j (H_j(z)∇G_j(z) + G_j(z)∇H_j(z))
+ Σ_{j∈I_{Ψ_ε}(z)\I_H(z̄)} γ_j H_j(z)(∇G_j(z) + (G_j(z)/H_j(z))∇H_j(z))
+ Σ_{j∈I_{Ψ_ε}(z)\I_G(z̄)} γ_j G_j(z)(∇H_j(z) + (H_j(z)/G_j(z))∇G_j(z)).


Note that (4.6) and (4.12) hold. Then, renumbering terms if necessary, we can choose a matrix A_k(z, ε) from (4.7) so that (4.13) can be rewritten as

A_k(z, ε) (λ; µ; δ_I(H_I(z) + εe_I); γ_II H_II(z); 0; δ_III(H_III(z) + εe_III); γ_IV H_IV(z); 0; δ_I(G_I(z) + εe_I); γ_II G_II(z); 0; δ_V(G_V(z) + εe_V); γ_VI G_VI(z); 0) = 0,  (4.14)

where the products are taken componentwise,

I := I_{Φ_ε}(z) ∩ I_G(z̄) ∩ I_H(z̄),
II := I_{Ψ_ε}(z) ∩ I_G(z̄) ∩ I_H(z̄),
III := I_{Φ_ε}(z) \ I_H(z̄),
IV := I_{Ψ_ε}(z) \ I_H(z̄),
V := I_{Φ_ε}(z) \ I_G(z̄),
VI := I_{Ψ_ε}(z) \ I_G(z̄),

and e_I := (1, 1, · · · , 1)^T ∈ ℜ^{|I|}. Since A_k(z, ε) has full column rank, it follows from (4.14) that the multiplier vector in (4.14) is zero. Noticing that

H_i(z) + ε ≠ 0,  G_i(z) + ε ≠ 0,  i ∈ I_{Φ_ε}(z),
H_i(z) ≠ 0,  G_i(z) ≠ 0,  i ∈ I_{Ψ_ε}(z),

and that the nonzero components of δ and γ are

δ = (δ_I, δ_III, δ_V),  γ = (γ_II, γ_IV, γ_VI),


we have from (4.14) that

(λ^T, µ^T, δ^T, γ^T) = 0,

which implies that problem (4.3) satisfies the LICQ at z. This completes the proof.

4.2 Convergence Analysis

In this section, we consider the limiting behavior of problem (4.3) as ε → 0. First we give the convergence of global optimal solutions.

Theorem 4.4 Let {ε_k} ⊆ (0, +∞) be convergent to 0 and suppose that z^k is a global optimal solution of problem (4.3) with ε = ε_k. If z* is an accumulation point of the sequence {z^k} as k → ∞, then z* is a global optimal solution of problem (4.1).

Proof: Taking a subsequence if necessary, we assume without loss of generality that lim_{k→∞} z^k = z*. By Theorem 4.1, z* ∈ F. Since F ⊆ F_{ε_k} for all k, we have

f(z^k) ≤ f(z), ∀z ∈ F, ∀k.

Letting k → ∞, we have from the continuity of f that

f(z*) ≤ f(z), ∀z ∈ F,

i.e., z* is a global optimal solution of problem (4.1).

In a similar way, we can prove the next theorem.

Theorem 4.5 Let both {ε_k} ⊆ (0, +∞) and {ε̂_k} ⊆ (0, +∞) be convergent to 0 and z^k ∈ F_{ε_k} be an ε̂_k-approximate solution of problem (4.3) with ε = ε_k, i.e.,

f(z^k) − ε̂_k ≤ f(z), ∀z ∈ F_{ε_k}.

Then any accumulation point of {z^k} is a global optimal solution of problem (4.1).

Now we consider the limiting behavior of stationary points of problem (4.3). Recall that z ∈ F is a C-stationary point if and only if there exist multiplier vectors λ ∈ ℜ^p, µ ∈ ℜ^q, and u, v ∈ ℜ^m such that

∇f(z) + ∇g(z)λ + ∇h(z)µ − ∇G(z)u − ∇H(z)v = 0,  (4.15)
λ ≥ 0,  z ∈ F,  λ^T g(z) = 0,  (4.16)
u_i = 0,  i ∉ I_G(z),  (4.17)
v_i = 0,  i ∉ I_H(z),  (4.18)

and

u_i v_i ≥ 0,  i ∈ I_G(z) ∩ I_H(z),  (4.19)

and z ∈ F is M-stationary to problem (4.1) if, furthermore, either u_i > 0, v_i > 0 or u_i v_i = 0 for all i ∈ I_G(z) ∩ I_H(z). Moreover, under the MPEC-LICQ, z ∈ F is a B-stationary point if and only if there exist multiplier vectors λ ∈ ℜ^p, µ ∈ ℜ^q, and u, v ∈ ℜ^m such that (4.15)–(4.18) hold and

u_i ≥ 0,  v_i ≥ 0,  i ∈ I_G(z) ∩ I_H(z).  (4.20)

Then we have the following convergence results.

Theorem 4.6 Let {ε_k} ⊆ (0, +∞) be convergent to 0 and z^k ∈ F_{ε_k} be a stationary point of problem (4.3) with ε = ε_k for each k. Suppose that z̄ is an accumulation point of the sequence {z^k}. If the MPEC-LICQ holds at z̄, then z̄ is a C-stationary point of problem (4.1).

Proof: Without loss of generality, we assume that

lim_{k→∞} z^k = z̄.  (4.21)

Since all the functions involved in problem (4.1) are continuous, F is closed and hence z̄ ∈ F by Theorem 4.1. It follows from the MPEC-LICQ assumption, (4.21), and Theorem 4.3 that, for any sufficiently large k, problem (4.3) with ε = ε_k satisfies the LICQ at z^k and hence, by the stationarity of z^k, there exist unique Lagrange multiplier vectors λ^k ∈ ℜ^p, µ^k ∈ ℜ^q, and δ^k, γ^k ∈ ℜ^m such that

∇f(z^k) + ∇g(z^k)λ^k + ∇h(z^k)µ^k − ∇Φ_{ε_k}(z^k)δ^k + ∇Ψ_{ε_k}(z^k)γ^k = 0,  (4.22)
λ^k ≥ 0,  δ^k ≥ 0,  γ^k ≥ 0,  (4.23)
g(z^k) ≤ 0,  h(z^k) = 0,  Φ_{ε_k}(z^k) ≥ 0,  Ψ_{ε_k}(z^k) ≤ 0,  (4.24)
g(z^k)^T λ^k = 0,  Φ_{ε_k}(z^k)^T δ^k = 0,  Ψ_{ε_k}(z^k)^T γ^k = 0.  (4.25)

It follows from (4.23)–(4.25) that

λ_i^k = 0,  i ∉ I_g(z^k),  (4.26)
δ_i^k = 0,  i ∉ I_{Φ_{ε_k}}(z^k),  (4.27)
γ_i^k = 0,  i ∉ I_{Ψ_{ε_k}}(z^k).  (4.28)

Now suppose that, for all sufficiently large k, (4.22)–(4.25) hold and, in addition, the conditions

I_G(z^k) ⊆ I_G(z̄),  I_H(z^k) ⊆ I_H(z̄),  I_g(z^k) ⊆ I_g(z̄)  (4.29)


hold and all the matrix functions A_i(z, ε), i = 1, · · · , N, in (4.7) defined in the proof of Theorem 4.3 have full column rank at (z^k, ε_k). By (4.4) and (4.5), we have

∇Φ_{ε_k}(z^k)δ^k = Σ_{i∈I_G(z̄)∩I_H(z̄)} δ_i^k ((H_i(z^k) + ε_k)∇G_i(z^k) + (G_i(z^k) + ε_k)∇H_i(z^k))
+ Σ_{i∈I_G(z̄)\I_H(z̄)} δ_i^k (H_i(z^k) + ε_k)(∇G_i(z^k) + ((G_i(z^k) + ε_k)/(H_i(z^k) + ε_k))∇H_i(z^k))
+ Σ_{i∈I_H(z̄)\I_G(z̄)} δ_i^k (G_i(z^k) + ε_k)(∇H_i(z^k) + ((H_i(z^k) + ε_k)/(G_i(z^k) + ε_k))∇G_i(z^k))  (4.30)

and

∇Ψ_{ε_k}(z^k)γ^k = Σ_{j∈I_G(z̄)∩I_H(z̄)} γ_j^k (H_j(z^k)∇G_j(z^k) + G_j(z^k)∇H_j(z^k))
+ Σ_{j∈I_G(z̄)\I_H(z̄)} γ_j^k H_j(z^k)(∇G_j(z^k) + (G_j(z^k)/H_j(z^k))∇H_j(z^k))
+ Σ_{j∈I_H(z̄)\I_G(z̄)} γ_j^k G_j(z^k)(∇H_j(z^k) + (H_j(z^k)/G_j(z^k))∇G_j(z^k)).  (4.31)

Then, taking into account (4.6), we have from (4.22) and (4.26)–(4.31) that

Then, taking into account (4.6), we have from (4.22) and (4.26)–(4.31) that

−∇f(zk) = ∇g(zk)λk +∇h(zk)µk

−∑

i∈IG(z)∩IH(z)

(δki (Hi(zk) + εk)− γk

i Hi(zk))∇Gi(zk)

−∑

i∈IΦεk(zk)\IH(z)

δki (Hi(zk) + εk)

(∇Gi(zk) +

Gi(zk) + εk

Hi(zk) + εk∇Hi(zk)

)

−∑

i∈IΨεk(zk)\IH(z)

(− γk

i Hi(zk))(∇Gi(zk) +

Gi(zk)Hi(zk)

∇Hi(zk))

−∑

i∈IG(z)∩IH(z)

(δki (Gi(zk) + εk)− γk

i Gi(zk))∇Hi(zk)

−∑

i∈IΦεk(zk)\IG(z)

δki (Gi(zk) + εk)

(∇Hi(zk) +

Hi(zk) + εk

Gi(zk) + εk∇Gi(zk)

)

−∑

i∈IΨεk(zk)\IH(z)

(− γk

i Gi(zk))(∇Hi(zk) +

Hi(zk)Gi(zk)

∇Gi(zk))

Page 70: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

64 4. A Modified Scheme for MPECs

= ANk(zk, εk)

λkIg(z)

µk

uk

vk

, (4.32)

where u^k and v^k are given by

u_i^k := δ_i^k(H_i(z^k) + ε_k) for i ∈ I_{Φ_{ε_k}}(z^k) ∩ I_G(z̄);  u_i^k := −γ_i^k H_i(z^k) for i ∈ I_{Ψ_{ε_k}}(z^k) ∩ I_G(z̄);  u_i^k := 0 for i ∈ I_G(z̄) \ (I_{Φ_{ε_k}}(z^k) ∪ I_{Ψ_{ε_k}}(z^k)),  (4.33)

v_i^k := δ_i^k(G_i(z^k) + ε_k) for i ∈ I_{Φ_{ε_k}}(z^k) ∩ I_H(z̄);  v_i^k := −γ_i^k G_i(z^k) for i ∈ I_{Ψ_{ε_k}}(z^k) ∩ I_H(z̄);  v_i^k := 0 for i ∈ I_H(z̄) \ (I_{Φ_{ε_k}}(z^k) ∪ I_{Ψ_{ε_k}}(z^k)),  (4.34)

respectively, and A_{N_k}(z, ε) is one of the matrix functions in (4.7). As we assumed above, A_{N_k}(z^k, ε_k) has full column rank for all sufficiently large k. In consequence, it follows from (4.21) and (4.32) that all the multiplier sequences

{λ_l^k | l ∈ I_g(z̄)},  {µ_r^k | r = 1, · · · , q},  (4.35)
{u_i^k | i ∈ I_G(z̄)},  {v_j^k | j ∈ I_H(z̄)}  (4.36)

are convergent. Let λ ∈ ℜ^p, µ ∈ ℜ^q, and u, v ∈ ℜ^m be as follows:

λ_l := lim_{k→∞} λ_l^k for l ∈ I_g(z̄), and λ_l := 0 for l ∉ I_g(z̄);  (4.37)
µ_r := lim_{k→∞} µ_r^k, r = 1, 2, · · · , q;  (4.38)
u_i := lim_{k→∞} u_i^k for i ∈ I_G(z̄), and u_i := 0 for i ∉ I_G(z̄);  (4.39)
v_j := lim_{k→∞} v_j^k for j ∈ I_H(z̄), and v_j := 0 for j ∉ I_H(z̄).  (4.40)

Letting k → ∞ in (4.32) and noticing that

lim_{k→∞} A_{N_k}(z^k, ε_k) = A(z̄),

where A(z̄) is the matrix with the columns (4.8)–(4.11), we have from (4.37)–(4.40) that

−∇f(z̄) = ∇g(z̄)λ + ∇h(z̄)µ − ∇G(z̄)u − ∇H(z̄)v,

i.e., (4.15) holds for the multiplier vectors λ, µ, u, v. On the other hand, we have (4.16)–(4.18) immediately from (4.23), (4.24), (4.37), (4.39), and (4.40). Then the rest of the proof is to show (4.19). In fact, for each i ∈ I_G(z̄) ∩ I_H(z̄), we have from (4.6) and (4.33)–(4.34) that

u_i^k v_i^k = (δ_i^k)²(H_i(z^k) + ε_k)(G_i(z^k) + ε_k) = (δ_i^k ε_k)² for i ∈ I_{Φ_{ε_k}}(z^k);
u_i^k v_i^k = (γ_i^k)² H_i(z^k) G_i(z^k) = (γ_i^k ε_k)² for i ∈ I_{Ψ_{ε_k}}(z^k);
u_i^k v_i^k = 0 for i ∉ I_{Φ_{ε_k}}(z^k) ∪ I_{Ψ_{ε_k}}(z^k).

Letting k → ∞, we obtain (4.19) since the sequences {u_i^k} and {v_i^k} in (4.36) are convergent. Hence z̄ is a C-stationary point of problem (4.1). This completes the proof.

From the definitions of B- and C-stationarity, we have the following result immediately.

Corollary 4.1 Let the assumptions in Theorem 4.6 be satisfied. If, in addition, z̄ is nondegenerate, then it is a B-stationary point of problem (4.1).

On the other hand, we can prove convergence results similar to those in Chapter 3. Let

L_ε(z, λ, µ, δ, γ) := f(z) + λ^T g(z) + µ^T h(z) − δ^T Φ_ε(z) + γ^T Ψ_ε(z)

stand for the Lagrangian of problem (4.3) and

T_ε(z) := {d ∈ ℜ^n | d^T ∇φ_{ε,i}(z) = 0, i ∈ I_{Φ_ε}(z); d^T ∇ψ_{ε,j}(z) = 0, j ∈ I_{Ψ_ε}(z); d^T ∇g_l(z) = 0, l ∈ I_g(z); d^T ∇h_r(z) = 0, r = 1, 2, · · · , q}.

Suppose that α is a nonnegative number. Recall that, at a stationary point z of problem (4.3), the matrix ∇²_z L_ε(z, λ, µ, δ, γ) being bounded below with constant α on the corresponding tangent space T_ε(z) means that

d^T ∇²_z L_ε(z, λ, µ, δ, γ) d ≥ −α‖d‖²,  ∀d ∈ T_ε(z).  (4.41)

See Section 3.2 for more details about this property.

Theorem 4.7 Let {ε_k} ⊆ (0, +∞) be convergent to 0 and z^k ∈ F_{ε_k} be a stationary point of problem (4.3) with ε = ε_k and multiplier vectors λ^k, µ^k, δ^k, and γ^k. Suppose that, for each k, ∇²_z L_{ε_k}(z^k, λ^k, µ^k, δ^k, γ^k) is bounded below with constant α_k on the corresponding tangent space T_{ε_k}(z^k). Let z̄ be an accumulation point of the sequence {z^k}. If the sequence {α_k} is bounded and the MPEC-LICQ holds at z̄, then z̄ is an M-stationary point of problem (4.1).

Proof: Assume without loss of generality that lim_{k→∞} z^k = z̄. First of all, we note from Theorem 4.6 that z̄ is a C-stationary point of problem (4.1). To prove the theorem, we assume to the contrary that z̄ is not M-stationary to problem (4.1). Then it follows from the definitions of C-stationarity and M-stationarity that there must exist an i_0 ∈ I_G(z̄) ∩ I_H(z̄) such that

u_{i_0} < 0,  v_{i_0} < 0.  (4.42)

By (4.33)–(4.34) and (4.39)–(4.40), we have

i_0 ∈ I_{Φ_{ε_k}}(z^k) ∪ I_{Ψ_{ε_k}}(z^k)

for every sufficiently large k. First we consider the case where i0 ∈ IΨεk(zk) for infinitely

many k. Furthermore, taking a subsequence if necessary, we may assume without lossof generality that

i0 ∈ IΨεk(zk) (4.43)

for all sufficiently large k. Then, by (4.33) and (4.34),

ui0 = − limk→∞

γki0Hi0(z

k) < 0, (4.44)

vi0 = − limk→∞

γki0Gi0(z

k) < 0, (4.45)

and so

limk→∞

Hi0(zk)

Gi0(zk)=

ui0

vi0

> 0. (4.46)

In what follows, we suppose that, for all sufficiently large k, (4.22)–(4.25), (4.29), and

    H_{i_0}(z^k)/G_{i_0}(z^k) > 0

hold and all the matrix functions A_i(z, ε), i = 1, ⋯, N, in (4.7) have full column rank at (z^k, ε_k). For such k, the matrix A_{N_k}(z^k, ε_k) whose columns consist of the vectors

    ∇g_l(z^k) : l ∈ I_g(z),

    ∇h_r(z^k) : r = 1, ⋯, q,

    ∇G_i(z^k) : i ∈ (I_G(z) ∩ I_H(z)) ∪ (I_G(z) \ (I_{Φ_{εk}}(z^k) ∪ I_{Ψ_{εk}}(z^k))),

    ∇G_i(z^k) + ((G_i(z^k) + ε_k)/(H_i(z^k) + ε_k)) ∇H_i(z^k) : i ∈ I_{Φ_{εk}}(z^k) \ I_H(z),

    ∇G_i(z^k) + (G_i(z^k)/H_i(z^k)) ∇H_i(z^k) : i ∈ I_{Ψ_{εk}}(z^k) \ I_H(z),

    ∇H_j(z^k) : j ∈ (I_G(z) ∩ I_H(z)) ∪ (I_H(z) \ (I_{Φ_{εk}}(z^k) ∪ I_{Ψ_{εk}}(z^k))),

    ∇H_j(z^k) + ((H_j(z^k) + ε_k)/(G_j(z^k) + ε_k)) ∇G_j(z^k) : j ∈ I_{Φ_{εk}}(z^k) \ I_G(z),

    ∇H_j(z^k) + (H_j(z^k)/G_j(z^k)) ∇G_j(z^k) : j ∈ I_{Ψ_{εk}}(z^k) \ I_G(z)

has full column rank. Therefore, we can choose a vector d^k ∈ ℜ^n such that

    (d^k)^T ∇g_l(z^k) = 0,   l ∈ I_g(z);   (4.47)

    (d^k)^T ∇h_r(z^k) = 0,   r = 1, ⋯, q;   (4.48)

    (d^k)^T ∇G_i(z^k) = 0,   i ∈ (I_G(z) ∩ I_H(z)) ∪ (I_G(z) \ (I_{Φ_{εk}}(z^k) ∪ I_{Ψ_{εk}}(z^k))), i ≠ i_0;   (4.49)

    (d^k)^T (∇G_i(z^k) + ((G_i(z^k) + ε_k)/(H_i(z^k) + ε_k)) ∇H_i(z^k)) = 0,   i ∈ I_{Φ_{εk}}(z^k) \ I_H(z);   (4.50)

    (d^k)^T (∇G_i(z^k) + (G_i(z^k)/H_i(z^k)) ∇H_i(z^k)) = 0,   i ∈ I_{Ψ_{εk}}(z^k) \ I_H(z);   (4.51)

    (d^k)^T ∇H_j(z^k) = 0,   j ∈ (I_G(z) ∩ I_H(z)) ∪ (I_H(z) \ (I_{Φ_{εk}}(z^k) ∪ I_{Ψ_{εk}}(z^k))), j ≠ i_0;   (4.52)

    (d^k)^T (∇H_j(z^k) + ((H_j(z^k) + ε_k)/(G_j(z^k) + ε_k)) ∇G_j(z^k)) = 0,   j ∈ I_{Φ_{εk}}(z^k) \ I_G(z);   (4.53)

    (d^k)^T (∇H_j(z^k) + (H_j(z^k)/G_j(z^k)) ∇G_j(z^k)) = 0,   j ∈ I_{Ψ_{εk}}(z^k) \ I_G(z);   (4.54)

    (d^k)^T ∇G_{i_0}(z^k) = 1;   (4.55)

    (d^k)^T ∇H_{i_0}(z^k) = −H_{i_0}(z^k)/G_{i_0}(z^k).

Then for any i ∈ I_{Φ_{εk}}(z^k) and any j ∈ I_{Ψ_{εk}}(z^k), since

    ∇φ_{εk,i}(z^k) = (G_i(z^k) + ε_k) ∇H_i(z^k) + (H_i(z^k) + ε_k) ∇G_i(z^k),

    ∇ψ_{εk,j}(z^k) = H_j(z^k) ∇G_j(z^k) + G_j(z^k) ∇H_j(z^k),

we have

    (d^k)^T ∇φ_{εk,i}(z^k) = 0,   i ∈ I_{Φ_{εk}}(z^k),

    (d^k)^T ∇ψ_{εk,j}(z^k) = 0,   j ∈ I_{Ψ_{εk}}(z^k),


and so d^k ∈ T_{εk}(z^k). Furthermore, we can choose the sequence {d^k} to be bounded. Since ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k) is bounded below with constant α_k on the corresponding tangent space T_{εk}(z^k), we have from (4.41) that there exists a constant C such that

    (d^k)^T ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k) d^k ≥ −α_k ‖d^k‖² ≥ C,   (4.56)

where the last inequality follows from the boundedness of the sequences {α_k} and {d^k}. Note that, by (4.26)–(4.28) and

    ∇²φ_{εk,i}(z^k) = ∇G_i(z^k) ∇H_i(z^k)^T + ∇H_i(z^k) ∇G_i(z^k)^T + (G_i(z^k) + ε_k) ∇²H_i(z^k) + (H_i(z^k) + ε_k) ∇²G_i(z^k),

    ∇²ψ_{εk,j}(z^k) = ∇G_j(z^k) ∇H_j(z^k)^T + ∇H_j(z^k) ∇G_j(z^k)^T + G_j(z^k) ∇²H_j(z^k) + H_j(z^k) ∇²G_j(z^k),

there holds

    ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k) = ∇²f(z^k) + Σ_{l=1}^p λ_l^k ∇²g_l(z^k) + Σ_{r=1}^q µ_r^k ∇²h_r(z^k) − Σ_{i=1}^m δ_i^k ∇²φ_{εk,i}(z^k) + Σ_{j=1}^m γ_j^k ∇²ψ_{εk,j}(z^k)

    = ∇²f(z^k) + Σ_{l∈I_g(z)} λ_l^k ∇²g_l(z^k) + Σ_{r=1}^q µ_r^k ∇²h_r(z^k) − Σ_{i∈I_{Φ_{εk}}(z^k)} δ_i^k ∇²φ_{εk,i}(z^k) + Σ_{j∈I_{Ψ_{εk}}(z^k)} γ_j^k ∇²ψ_{εk,j}(z^k).

We then have

    (d^k)^T ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k) d^k

    = (d^k)^T ∇²f(z^k) d^k + Σ_{l∈I_g(z)} λ_l^k (d^k)^T ∇²g_l(z^k) d^k + Σ_{r=1}^q µ_r^k (d^k)^T ∇²h_r(z^k) d^k

    − Σ_{i∈I_{Φ_{εk}}(z^k)} δ_i^k ( (d^k)^T ∇G_i(z^k) ∇H_i(z^k)^T d^k + (d^k)^T ∇H_i(z^k) ∇G_i(z^k)^T d^k + (G_i(z^k) + ε_k)(d^k)^T ∇²H_i(z^k) d^k + (H_i(z^k) + ε_k)(d^k)^T ∇²G_i(z^k) d^k )

    + Σ_{j∈I_{Ψ_{εk}}(z^k)} γ_j^k ( (d^k)^T ∇G_j(z^k) ∇H_j(z^k)^T d^k + (d^k)^T ∇H_j(z^k) ∇G_j(z^k)^T d^k + G_j(z^k)(d^k)^T ∇²H_j(z^k) d^k + H_j(z^k)(d^k)^T ∇²G_j(z^k) d^k ).   (4.57)

By the twice continuous differentiability of the functions, the boundedness of the sequence {d^k}, and the convergence of the sequences {z^k}, {λ_l^k} and {µ_r^k} (by (4.37)–(4.38)), the terms

    (d^k)^T ∇²f(z^k) d^k,   Σ_{l∈I_g(z)} λ_l^k (d^k)^T ∇²g_l(z^k) d^k,   Σ_{r=1}^q µ_r^k (d^k)^T ∇²h_r(z^k) d^k

are all bounded. Consider arbitrary indices i and j such that i ∈ I_{Φ_{εk}}(z^k) for infinitely many k and j ∈ I_{Ψ_{εk}}(z^k) \ {i_0} for infinitely many k, respectively. If

    i ∈ I_G(z) ∩ I_H(z)   or   j ∈ I_G(z) ∩ I_H(z),

then

    (d^k)^T ∇G_i(z^k) = 0   or   (d^k)^T ∇H_j(z^k) = 0

and, by (4.33)–(4.34) and (4.39)–(4.40), the sequences

    {δ_i^k (G_i(z^k) + ε_k)},   {δ_i^k (H_i(z^k) + ε_k)},   {γ_j^k G_j(z^k)},   {γ_j^k H_j(z^k)}

are all convergent. If

    i, j ∉ I_G(z) ∩ I_H(z),

then, also by (4.33)–(4.34) and (4.39)–(4.40), the sequences {δ_i^k} and {γ_j^k} are convergent. Therefore, we have that the terms

    Σ_{i∈I_{Φ_{εk}}(z^k)} δ_i^k ( (d^k)^T ∇G_i(z^k) ∇H_i(z^k)^T d^k + (d^k)^T ∇H_i(z^k) ∇G_i(z^k)^T d^k + (G_i(z^k) + ε_k)(d^k)^T ∇²H_i(z^k) d^k + (H_i(z^k) + ε_k)(d^k)^T ∇²G_i(z^k) d^k )

and

    Σ_{j∈I_{Ψ_{εk}}(z^k)\{i_0}} γ_j^k ( (d^k)^T ∇G_j(z^k) ∇H_j(z^k)^T d^k + (d^k)^T ∇H_j(z^k) ∇G_j(z^k)^T d^k + G_j(z^k)(d^k)^T ∇²H_j(z^k) d^k + H_j(z^k)(d^k)^T ∇²G_j(z^k) d^k )

are bounded. On the other hand, however, we have (4.43) for all sufficiently large k and

    γ_{i_0}^k ( (d^k)^T ∇G_{i_0}(z^k) ∇H_{i_0}(z^k)^T d^k + (d^k)^T ∇H_{i_0}(z^k) ∇G_{i_0}(z^k)^T d^k + G_{i_0}(z^k)(d^k)^T ∇²H_{i_0}(z^k) d^k + H_{i_0}(z^k)(d^k)^T ∇²G_{i_0}(z^k) d^k )   (4.58)

    = −2γ_{i_0}^k H_{i_0}(z^k)/G_{i_0}(z^k) + γ_{i_0}^k ( G_{i_0}(z^k)(d^k)^T ∇²H_{i_0}(z^k) d^k + H_{i_0}(z^k)(d^k)^T ∇²G_{i_0}(z^k) d^k ).


Since (4.46) holds and γ_{i_0}^k → +∞ as k → ∞ by (4.23) and (4.44), we have

    −2γ_{i_0}^k H_{i_0}(z^k)/G_{i_0}(z^k) → −∞

as k → ∞. Note that, by (4.44) and (4.45), the sequences

    {γ_{i_0}^k G_{i_0}(z^k)},   {γ_{i_0}^k H_{i_0}(z^k)}

are also convergent. We then have that the term (4.58) tends to −∞ as k → ∞. Therefore, it follows from (4.57) that

    (d^k)^T ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k) d^k → −∞

as k → ∞. This contradicts (4.56) and hence z is M-stationary to problem (4.1).

Finally we consider the case where i_0 ∈ I_{Φ_{εk}}(z^k) for infinitely many k. By (4.33) and (4.34), we have from (4.42) that

    u_{i_0} = lim_{k→∞} δ_{i_0}^k (H_{i_0}(z^k) + ε_k) < 0,

    v_{i_0} = lim_{k→∞} δ_{i_0}^k (G_{i_0}(z^k) + ε_k) < 0,

and so

    lim_{k→∞} (H_{i_0}(z^k) + ε_k)/(G_{i_0}(z^k) + ε_k) = u_{i_0}/v_{i_0} > 0.

Therefore, we can also choose a bounded sequence {d^k} such that (4.47)–(4.55) and

    (d^k)^T ∇H_{i_0}(z^k) = −(H_{i_0}(z^k) + ε_k)/(G_{i_0}(z^k) + ε_k)

hold for each k. In a similar way, we then obtain a contradiction and so z is M-stationary to problem (4.1). This completes the proof.

Corollary 4.2 Let {ε_k}, {z^k}, and z be the same as in Theorem 4.7. If z^k together with the corresponding multiplier vectors λ^k, µ^k, δ^k, and γ^k satisfies the weak second-order necessary conditions for problem (4.3) with ε = ε_k and the MPEC-LICQ holds at z, then z is an M-stationary point of problem (4.1).

Corollary 4.3 Let the assumptions in Theorem 4.7 be satisfied. If, in addition, z satisfies the upper level strict complementarity condition, then it is a B-stationary point of problem (4.1).


Theorem 4.8 Let {ε_k}, {z^k}, and z be the same as in Theorem 4.7 and λ^k, µ^k, δ^k, and γ^k be the multiplier vectors corresponding to z^k. Let β_k be the smallest eigenvalue of the matrix ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k). If the sequence {β_k} is bounded below and the MPEC-LICQ holds at z, then z is a B-stationary point of problem (4.1).

Proof: It is easy to see that the assumptions of Theorem 4.7 are satisfied with α_k = max{−β_k, 0} and so z is an M-stationary point of problem (4.1). Suppose that z is not B-stationary to problem (4.1). Then, by the definitions of B- and M-stationarity, there exists an i_0 ∈ I_G(z) ∩ I_H(z) such that

    u_{i_0} < 0,   v_{i_0} = 0   (4.59)

or

    u_{i_0} = 0,   v_{i_0} < 0.

By (4.33)–(4.34) and (4.39)–(4.40), we have

    i_0 ∈ I_{Φ_{εk}}(z^k) ∪ I_{Ψ_{εk}}(z^k)

for every sufficiently large k. Without loss of generality, we assume that (4.59) holds. First we consider the case where i_0 ∈ I_{Ψ_{εk}}(z^k) for infinitely many k. By taking a subsequence if necessary, we assume

    i_0 ∈ I_{Ψ_{εk}}(z^k)   (4.60)

for all sufficiently large k. Then, it follows from (4.33), (4.34), and (4.59) that

    u_{i_0} = −lim_{k→∞} γ_{i_0}^k H_{i_0}(z^k) < 0

and so, by (4.23), we have

    lim_{k→∞} γ_{i_0}^k = +∞.   (4.61)

Now we suppose that, for all sufficiently large k, (4.22)–(4.25) and (4.29) hold and the matrix A_{N_k}(z^k, ε_k) defined in the proof of Theorem 4.7 has full column rank. Therefore, we can choose a vector d^k ∈ ℜ^n such that

    (d^k)^T ∇g_l(z^k) = 0,   l ∈ I_g(z);

    (d^k)^T ∇h_r(z^k) = 0,   r = 1, 2, ⋯, q;

    (d^k)^T ∇G_i(z^k) = 0,   i ∈ (I_G(z) ∩ I_H(z)) ∪ (I_G(z) \ (I_{Φ_{εk}}(z^k) ∪ I_{Ψ_{εk}}(z^k))), i ≠ i_0;

    (d^k)^T (∇G_i(z^k) + ((G_i(z^k) + ε_k)/(H_i(z^k) + ε_k)) ∇H_i(z^k)) = 0,   i ∈ I_{Φ_{εk}}(z^k) \ I_H(z);

    (d^k)^T (∇G_i(z^k) + (G_i(z^k)/H_i(z^k)) ∇H_i(z^k)) = 0,   i ∈ I_{Ψ_{εk}}(z^k) \ I_H(z);

    (d^k)^T ∇H_j(z^k) = 0,   j ∈ (I_G(z) ∩ I_H(z)) ∪ (I_H(z) \ (I_{Φ_{εk}}(z^k) ∪ I_{Ψ_{εk}}(z^k))), j ≠ i_0;

    (d^k)^T (∇H_j(z^k) + ((H_j(z^k) + ε_k)/(G_j(z^k) + ε_k)) ∇G_j(z^k)) = 0,   j ∈ I_{Φ_{εk}}(z^k) \ I_G(z);

    (d^k)^T (∇H_j(z^k) + (H_j(z^k)/G_j(z^k)) ∇G_j(z^k)) = 0,   j ∈ I_{Ψ_{εk}}(z^k) \ I_G(z);

    (d^k)^T ∇G_{i_0}(z^k) = 1;

    (d^k)^T ∇H_{i_0}(z^k) = −1.

Furthermore, we can choose the sequence {d^k} to be bounded. By the assumptions of the theorem, there exists a constant C such that

    (d^k)^T ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k) d^k ≥ β_k ‖d^k‖² ≥ C   (4.62)

holds for all k. In a similar way to the proof of Theorem 4.7, we can show that all the terms on the right-hand side of (4.57) except

    γ_{i_0}^k ( (d^k)^T ∇G_{i_0}(z^k) ∇H_{i_0}(z^k)^T d^k + (d^k)^T ∇H_{i_0}(z^k) ∇G_{i_0}(z^k)^T d^k + G_{i_0}(z^k)(d^k)^T ∇²H_{i_0}(z^k) d^k + H_{i_0}(z^k)(d^k)^T ∇²G_{i_0}(z^k) d^k )

are bounded. On the other hand,

    γ_{i_0}^k ( (d^k)^T ∇G_{i_0}(z^k) ∇H_{i_0}(z^k)^T d^k + (d^k)^T ∇H_{i_0}(z^k) ∇G_{i_0}(z^k)^T d^k ) = −2γ_{i_0}^k → −∞

by the definition of d^k and (4.61), and

    γ_{i_0}^k ( G_{i_0}(z^k)(d^k)^T ∇²H_{i_0}(z^k) d^k + H_{i_0}(z^k)(d^k)^T ∇²G_{i_0}(z^k) d^k )

is bounded by the convergence of the sequences

    {γ_{i_0}^k G_{i_0}(z^k)},   {γ_{i_0}^k H_{i_0}(z^k)}.

In consequence, we have

    (d^k)^T ∇²_z L_{εk}(z^k, λ^k, µ^k, δ^k, γ^k) d^k → −∞

as k → ∞. This contradicts (4.62) and hence z is B-stationary to problem (4.1).


For the case where i_0 ∈ I_{Φ_{εk}}(z^k) for infinitely many k, we can show that z is B-stationary to problem (4.1) in a similar way as in the proof of Theorem 4.7. This completes the proof.

The next example shows that the new condition in Theorem 4.8 is actually independent of the upper level strict complementarity condition employed in Corollary 4.3 and [38, 41, 57, 76].

Example 4.1 Consider the following problem:

    minimize   f(z)
    subject to z_1 + z_2 ≥ 0,  z_2 ≥ 0,   (4.63)
               z_2(z_1 + z_2) = 0.

Then the modified relaxation scheme (4.3) for (4.63) can be written as

    minimize   f(z)
    subject to z_2(z_1 + z_2) + ε(z_1 + 2z_2) ≥ 0,   (4.64)
               z_2(z_1 + z_2) − ε² ≤ 0.

Let

    G(z) := z_1 + z_2,
    H(z) := z_2,
    φ_ε(z) := z_2(z_1 + z_2) + ε(z_1 + 2z_2),
    ψ_ε(z) := z_2(z_1 + z_2) − ε²,
    L_ε(z, δ, γ) := f(z) − δφ_ε(z) + γψ_ε(z)

and denote by z a solution of (4.63) with multipliers (u, v) satisfying (4.15)–(4.20) and by z_ε a solution of (4.64) with Lagrange multipliers (δ_ε, γ_ε). In order to show that the two conditions do not imply each other, we consider the following two cases.

(I) Let f(z) := (z_1 + z_2)² + z_2². Then we have z = z_ε = (0, 0) for any ε > 0. We can show that the smallest eigenvalue β_ε of ∇²_z L_ε(z_ε, δ_ε, γ_ε) is independent of ε, whereas the upper level strict complementarity condition does not hold at z. In fact, since δ_ε = γ_ε = 0 for any ε > 0, we have

    ∇²_z L_ε(z_ε, δ_ε, γ_ε) = [ 2  2 − δ_ε + γ_ε ; 2 − δ_ε + γ_ε  4 − 2δ_ε + 2γ_ε ] = [ 2  2 ; 2  4 ]

and hence β_ε = 3 − √5 for any ε > 0. On the other hand, we have u = v = 0. This, together with I_G(z) ∩ I_H(z) = {1}, implies that the upper level strict complementarity condition does not hold at z.

(II) Let f(z) := (z_1 + 1)² + (z_2 + 2)². Then, for any ε > 0, we have z = z_ε = (0, 0). On the one hand, since u = v = 2 and I_G(z) ∩ I_H(z) = {1}, the upper level strict complementarity condition holds at z. On the other hand, we have

    δ_ε = 2ε⁻¹,   γ_ε = 0

and

    ∇²_z L_ε(z_ε, δ_ε, γ_ε) = [ 2  −δ_ε + γ_ε ; −δ_ε + γ_ε  2 − 2δ_ε + 2γ_ε ] = [ 2  −2ε⁻¹ ; −2ε⁻¹  2 − 4ε⁻¹ ].

It then follows that β_ε = 2 − 2(1 + √2)ε⁻¹, which tends to −∞ as ε → 0⁺. This means that the condition given in Theorem 4.8 does not hold.

4.3 Concluding Remarks

In this chapter, we have proposed a modified relaxation scheme for a mathematical program with complementarity constraints. The new relaxed problem involves fewer constraints than the one considered by Scholtes [76]. All desirable properties established in [76] remain valid for the new relaxed problem. In addition, we obtain some new sufficient conditions for B-stationarity described by the eigenvalues of the Hessian matrix of the Lagrangian of the relaxed problem. From the proof, it is easy to see that, even if the matrix mentioned above is replaced by the Hessian matrix of the simpler function

    L_ε(z, γ, δ) := γ^T Ψ_ε(z) − δ^T Φ_ε(z),

all the results remain true. A similar extension is possible for the relaxation schemes presented by Scholtes [76] and Lin and Fukushima [57] as well.

Finally, we remark that, if z is degenerate, the gradient of φ_{ε,i} or ψ_{ε,i} at z_ε tends to 0 as ε → 0⁺ for i ∈ I_G(z) ∩ I_H(z). In fact, the method presented in [76] also has a similar problem. This is a possible deficiency of these methods, which may cause some numerical instability in practical calculations.


Chapter 5

Hybrid Approach with Active Set Identification for MPECs

Consider the following mathematical program with complementarity constraints (MPCC):

    minimize   f(z)
    subject to g(z) ≤ 0,  h(z) = 0,   (5.1)
               G(z) ≥ 0,  H(z) ≥ 0,
               G(z)^T H(z) = 0,

where f : ℜ^n → ℜ, g : ℜ^n → ℜ^p, h : ℜ^n → ℜ^q, and G, H : ℜ^n → ℜ^m are all twice continuously differentiable functions. Several approaches have been proposed for this problem, such as the sequential quadratic programming approach [29, 44, 62], the implicit programming approach [12, 62], the penalty function approach [38, 41, 59, 62, 63, 77], the active-set approach [33], and the reformulation approach [24, 31, 54, 57, 76]. However, these methods require solving an infinite sequence of nonlinear programs. The purpose of this chapter is to develop methods that enable us to compute a solution, or a point with some kind of stationarity, of problem (5.1) by solving a finite number of nonlinear programs. To this end, we will apply an active set identification technique to a smoothing continuation method [31] and present some hybrid algorithms. Further discussions and some extensions will also be included.


5.1 Preliminaries

Consider the smoothing continuation method [31] that uses the perturbed Fischer-Burmeister function

    φ_ε(a, b) := a + b − √(a² + b² + 2ε²),

where ε ≥ 0. Define the function Φ_ε : ℜ^n → ℜ^m by

    Φ_ε(z) := ( φ_ε(G_1(z), H_1(z)), ⋯, φ_ε(G_m(z), H_m(z)) )^T

and consider the nonlinear programming problem

    minimize   f(z)
    subject to Φ_ε(z) = 0,   (P_ε)
               g(z) ≤ 0,  h(z) = 0.

Note that (P_0) is equivalent to problem (5.1) and Φ_ε is differentiable everywhere for any ε > 0. We assume that problem (P_ε) has a solution (or a stationary point) z_ε for each small scalar ε > 0. We may expect to find a solution or a point with some kind of stationarity to problem (5.1) by tracing the trajectory {z_ε} as ε → 0. Suppose that {ε_k} is a positive sequence converging to zero. The following convergence result is given in [31].
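As a small illustration (ours, not from [31]), the perturbed Fischer-Burmeister function is straightforward to implement; for ε > 0 its zero set is exactly {(a, b) : a > 0, b > 0, ab = ε²}, which is what the check below exercises. A minimal Python sketch with numpy (the names phi_eps and Phi_eps are ours):

    import numpy as np

    def phi_eps(a, b, eps):
        # Perturbed Fischer-Burmeister function; for eps > 0,
        # phi_eps(a, b) = 0  iff  a > 0, b > 0 and a*b = eps**2.
        return a + b - np.sqrt(a**2 + b**2 + 2.0*eps**2)

    def Phi_eps(G, H, eps):
        # Componentwise constraint map Phi_eps(z) built from G(z), H(z).
        return phi_eps(np.asarray(G, float), np.asarray(H, float), eps)

    eps = 0.1
    print(phi_eps(0.5, eps**2 / 0.5, eps))  # ~0: the pair lies on the smoothed curve
    print(phi_eps(0.5, 0.0, eps))           # < 0: exact complementarity is cut off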

Theorem 5.1 Let z^k be a stationary point of problem (P_{εk}) and the sequence {z^k} converge to z* as ε_k → 0. Suppose that the WSONC holds at each z^k, the MPCC-LICQ holds at z*, and {z^k} is asymptotically weakly nondegenerate. Then z* is B-stationary to problem (5.1).

We now recall the asymptotically weak nondegeneracy [31], which is assumed in the above theorem and will also be employed in the subsequent analysis. Suppose {z^k} converges to z* as ε_k → 0. Then z* ∈ F. It can be shown [31] that, for each i ∈ I_G(z*) ∩ I_H(z*),

    ∇Φ_{εk,i}(z^k) = (H_i(z^k)/(G_i(z^k) + H_i(z^k))) ∇G_i(z^k) + (G_i(z^k)/(G_i(z^k) + H_i(z^k))) ∇H_i(z^k).   (5.2)

Therefore, every accumulation point r of {∇Φ_{εk,i}(z^k)} can be represented as

    r = ξ_i(r) ∇G_i(z*) + η_i(r) ∇H_i(z*)   (5.3)

for some (ξ_i(r), η_i(r)) with (1 − ξ_i(r))² + (1 − η_i(r))² ≤ 1. We say that {z^k} is asymptotically weakly nondegenerate if, for each i ∈ I_G(z*) ∩ I_H(z*), neither ξ_i(r) nor η_i(r) vanishes for any accumulation point r of {∇Φ_{εk,i}(z^k)}.

Roughly speaking, the asymptotically weak nondegeneracy of {z^k} means that, for each i ∈ I_G(z*) ∩ I_H(z*), G_i(z^k) and H_i(z^k) approach zero in the same order of magnitude. This property is obviously weaker than nondegeneracy (lower-level strict complementarity), because it holds vacuously when z* is nondegenerate. We will show in Section 5.5 that it is even weaker than the upper-level strict complementarity (ULSC) condition, which is often employed in the literature on MPCC; see [38, 41, 54, 57, 76].

In addition, we can prove that the smoothing continuation method possesses convergence properties similar to those of the methods proposed in Chapters 3 and 4. Here, we state one such result.

Theorem 5.2 Let z^k be a stationary point of problem (P_{εk}) and, for each k, (λ_g^k, λ_h^k, λ_Φ^k) be a multiplier vector corresponding to z^k. Suppose that the sequence {z^k} converges to z* as ε_k → 0 and, for each k, ∇²_z L_{εk}(z^k, λ_g^k, λ_h^k, λ_Φ^k) is bounded below with constant α_k ≥ 0 on the corresponding tangent space T_{εk}(z^k), which means

    d^T ∇²_z L_{εk}(z^k, λ_g^k, λ_h^k, λ_Φ^k) d ≥ −α_k ‖d‖²,   ∀d ∈ T_{εk}(z^k),   (5.4)

where

    L_ε(z, λ_g, λ_h, λ_Φ) := f(z) + λ_g^T g(z) + λ_h^T h(z) + λ_Φ^T Φ_ε(z),

    T_ε(z) := { d ∈ ℜ^n | d^T ∇Φ_{ε,i}(z) = 0,  i = 1, 2, ⋯, m;
                d^T ∇g_l(z) = 0,  l ∈ I_g(z);
                d^T ∇h_r(z) = 0,  r = 1, 2, ⋯, q }.

If the sequence {α_k} is bounded, {z^k} is asymptotically weakly nondegenerate, and the MPCC-LICQ holds at z*, then z* is a B-stationary point of problem (5.1).

Actually, the condition that problem (P_{εk}) satisfies the WSONC at z^k for each k means that ∇²_z L_{εk}(z^k, λ_g^k, λ_h^k, λ_Φ^k) is bounded below with constant 0. In consequence, Theorem 5.1 is actually a corollary of Theorem 5.2. Note that, for the matrix ∇²_z L_{εk}(z^k, λ_g^k, λ_h^k, λ_Φ^k), there must exist a number α_k such that (5.4) holds. For example, any nonnegative scalar α_k such that (−α_k) is less than the smallest eigenvalue of ∇²_z L_{εk}(z^k, λ_g^k, λ_h^k, λ_Φ^k) must satisfy (5.4). However, the WSONC means that the matrix should have some kind of semi-definiteness on the tangent space T_{εk}(z^k). Therefore, the assumptions in Theorem 5.2 are weaker than the conditions of Theorem 5.1. Since the proof of Theorem 5.2 is similar to that of Theorem 5.1, it is omitted here; see [31].


5.2 A Hybrid Algorithm for MPCC

We first introduce an active set identification technique for MPCC. Active set identification plays an important role in optimization theory [6, 8, 22, 23, 82]. Accurate identification of active constraints is important from both theoretical and practical points of view. For problem (5.1), by means of active set identification, the combinatorial constraints

    G(z) ≥ 0,   H(z) ≥ 0,   G(z)^T H(z) = 0   (5.5)

may be replaced by some equality and/or inequality constraints that are easier to deal with.

For a point z ∈ F, let α(z), β(z) and γ(z) be the index sets defined by

    α(z) := { i | G_i(z) > 0, H_i(z) = 0 },
    β(z) := { i | G_i(z) = 0, H_i(z) = 0 },
    γ(z) := { i | G_i(z) = 0, H_i(z) > 0 },

respectively. Obviously, α(z) ∪ β(z) ∪ γ(z) = {1, ⋯, m} and these index sets are mutually disjoint. Let a sequence {z^k} be generated so that it converges to z* ∈ F. If z* is nondegenerate, namely, if β(z*) = ∅, it is generally not difficult to identify the correct index sets finitely. However, when z* is degenerate, it is not necessarily easy to identify the active index sets. In the following, we will particularly be interested in the case where {z^k} converges to a degenerate point z* ∈ F.

Let {ε_k} be a positive sequence converging to zero and let z^k stand for a solution or a stationary point of problem (P_{εk}) for each k. Suppose that {z^k} converges to some z* throughout this chapter. Note that

    G_i(z^k) > 0,   H_i(z^k) > 0,   G_i(z^k) H_i(z^k) = ε_k²   (5.6)

for each k and each i. We try to estimate the index sets α(z*), β(z*) and γ(z*) by some index sets α_k, β_k and γ_k, respectively, which are obtained from z^k and satisfy α_k ∪ β_k ∪ γ_k = {1, ⋯, m} with α_k, β_k, γ_k being mutually disjoint. Given the index sets α_k, β_k and γ_k, we then solve the nonlinear programming problem

    minimize   f(z)
    subject to G_i(z) ≥ 0,  H_i(z) = 0,  i ∈ α_k,
               G_i(z) = 0,  H_i(z) = 0,  i ∈ β_k,   (5.7)
               G_i(z) = 0,  H_i(z) ≥ 0,  i ∈ γ_k,
               g(z) ≤ 0,  h(z) = 0.


This problem is no longer an MPCC and hence easier to deal with. Denote by z̄^k a stationary point of problem (5.84). Obviously, we always have β_k ⊆ β(z̄^k), and β(z̄^k) may contain some i ∈ α_k with G_i(z̄^k) = 0 or some i ∈ γ_k with H_i(z̄^k) = 0. If the Lagrange multipliers corresponding to the constraints

    G_i(z) ≥ 0,  H_i(z) = 0,  i ∈ α_k ∩ β(z̄^k),
    G_i(z) = 0,  H_i(z) = 0,  i ∈ β_k,   (5.8)
    G_i(z) = 0,  H_i(z) ≥ 0,  i ∈ γ_k ∩ β(z̄^k)

are all nonnegative, then z̄^k is a B-stationary point of problem (5.1) under the MPCC-LICQ assumption at that point [33]. Therefore, assuming that (5.84) can be solved exactly, we may terminate the method in finitely many steps, unlike the method in [31], which needs to solve an infinite sequence of nonlinear programs.

The key to success is to define the index sets α_k, β_k and γ_k such that

    α_k = α(z*),   β_k = β(z*),   γ_k = γ(z*)   (5.9)

for all k large enough. To this end, we may use an identification function ρ : ℜ^n → [0, +∞) satisfying

    lim_{k→∞} ρ(z^k) = 0   (5.10)

and, for all k large enough,

    max_{i∈β(z*)} max{G_i(z^k), H_i(z^k)} ≤ ρ(z^k),   (5.11)

    max_{i∈α(z*)∪γ(z*)} min{G_i(z^k), H_i(z^k)} ≤ ρ(z^k),   (5.12)

and consider the following hybrid algorithm that combines the smoothing continuation method with an active set identification technique.

Algorithm H:

Step 0: Choose ε_0 > 0 and set k := 0.

Step 1: Solve problem (P_{εk}) and denote by z^k one of its stationary points. Set

    α_k := { i | G_i(z^k) > ρ(z^k), H_i(z^k) ≤ ρ(z^k) },   (5.13)

    β_k := { i | G_i(z^k) ≤ ρ(z^k), H_i(z^k) ≤ ρ(z^k) },   (5.14)

    γ_k := { i | G_i(z^k) ≤ ρ(z^k), H_i(z^k) > ρ(z^k) }.   (5.15)

If α_k ∪ β_k ∪ γ_k = {1, ⋯, m}, go to Step 2. Otherwise, go to Step 4.


Step 2: Solve problem (5.84) to get a stationary point z̄^k and go to Step 3.

Step 3: If the Lagrange multipliers corresponding to the constraints (5.85) are all nonnegative, then terminate. Else, go to Step 4.

Step 4: Choose an ε_{k+1} ∈ (0, ε_k) and let k := k + 1. Go to Step 1.
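In code, Step 1 is a componentwise thresholding of G(z^k) and H(z^k) against ρ(z^k). A minimal Python sketch (our naming; numpy assumed):

    import numpy as np

    def estimate_index_sets(G, H, rho):
        # Estimates (5.13)-(5.15) from G(z^k), H(z^k) and rho = rho(z^k).
        G, H = np.asarray(G, float), np.asarray(H, float)
        alpha = set(np.where((G > rho) & (H <= rho))[0])
        beta  = set(np.where((G <= rho) & (H <= rho))[0])
        gamma = set(np.where((G <= rho) & (H > rho))[0])
        return alpha, beta, gamma

    # Partition test of Step 1: if some component has both G_i and H_i above
    # rho, the union misses an index and the algorithm goes to Step 4.
    alpha, beta, gamma = estimate_index_sets([0.9, 1e-3, 2e-3], [1e-3, 0.8, 1e-3], 0.05)
    covers_all = len(alpha | beta | gamma) == 3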

Next, we make some remarks on the identification function ρ and Algorithm H. First of all, we have that, if β(z*) ≠ ∅,

    max_{i∈α(z*)∪γ(z*)} min{G_i(z^k), H_i(z^k)} < min_{i∈β(z*)} min{G_i(z^k), H_i(z^k)}   (5.16)

holds for all k large enough as long as α(z*) ∪ γ(z*) ≠ ∅. In fact, we have from (5.6) that

    lim_{k→∞} ε_k²/min{G_i(z^k), H_i(z^k)} = lim_{k→∞} max{G_i(z^k), H_i(z^k)} = 0,   i ∈ β(z*),

and

    lim_{k→∞} ε_k²/min{G_j(z^k), H_j(z^k)} = lim_{k→∞} max{G_j(z^k), H_j(z^k)} = G_j(z*) + H_j(z*) > 0,   j ∈ α(z*) ∪ γ(z*).

In consequence,

    lim_{k→∞} min{G_j(z^k), H_j(z^k)}/min{G_i(z^k), H_i(z^k)} = 0   (5.17)

holds for each i ∈ β(z*) and each j ∈ α(z*) ∪ γ(z*), and hence we have (5.16). This inequality means that condition (5.11) implies condition (5.12) for all k sufficiently large if β(z*) ≠ ∅ and α(z*) ∪ γ(z*) ≠ ∅.

Moreover, it is obvious that α_k, β_k, γ_k defined by (5.13)–(5.15) are mutually disjoint for each k. On the other hand, conditions (5.11) and (5.12) ensure that, when k is sufficiently large,

    min{G_i(z^k), H_i(z^k)} ≤ ρ(z^k),   ∀i.

This means α_k ∪ β_k ∪ γ_k = {1, ⋯, m} and so {α_k, β_k, γ_k} defined in Step 1 is a partition of {1, ⋯, m} for all k sufficiently large.

Furthermore, we have from (5.10) that, when k is sufficiently large,

    α_k ⊇ α(z*),   β_k ⊆ β(z*),   γ_k ⊇ γ(z*).


In addition, it follows from (5.11) and (5.14) that β(z*) ⊆ β_k for all k sufficiently large. Note that both {α_k, β_k, γ_k} and {α(z*), β(z*), γ(z*)} are partitions of {1, ⋯, m}. Therefore, (5.9) holds when k is large enough.

The above analysis indicates that Algorithm H may possess a finite termination property. The key question is, of course, how to define the identification function ρ. This is not a trivial task because, in general, such a function may depend on the unknown point z*. Next, we consider the case where {z^k} is asymptotically weakly nondegenerate.

Theorem 5.3 Suppose the sequence {z^k} generated by Algorithm H is asymptotically weakly nondegenerate. Let

    ρ_1(z) := τ ‖min(G(z), H(z))‖^σ,   (5.18)

where τ > 0 and σ ∈ (0, 1) are constants and

    min(G(z), H(z)) := ( min{G_1(z), H_1(z)}, ⋯, min{G_m(z), H_m(z)} )^T.   (5.19)

Then (5.10)–(5.12) hold with ρ = ρ_1, i.e., the function ρ_1 can serve as an identification function.

Proof: Note that the complementarity constraints (5.5) are equivalent to ρ_1(z) = 0. By the fact that {z^k} converges to z* ∈ F and the continuity of ρ_1, we have

    lim_{k→∞} ρ_1(z^k) = ρ_1(z*) = 0.

Therefore, condition (5.10) holds with ρ = ρ_1. On the other hand, for any i ∈ α(z*) ∪ γ(z*), we have

    min{G_i(z^k), H_i(z^k)} ≤ ‖min(G(z^k), H(z^k))‖ ≤ ρ_1(z^k)

for all k sufficiently large, where the first inequality follows from (5.19) and the second inequality follows from the fact that ‖min(G(z^k), H(z^k))‖ converges to 0 and the constant σ lies in the interval (0, 1). This means that (5.12) with ρ = ρ_1 holds when k is sufficiently large.

We next prove that (5.11) with ρ = ρ_1 holds when k is sufficiently large. We may assume β(z*) ≠ ∅, because (5.11) holds vacuously if β(z*) is empty. Let i ∈ β(z*). Since the set of accumulation points of the sequence {∇Φ_{εk,i}(z^k)} is compact, by the asymptotically weak nondegeneracy of {z^k}, the set of the coefficient pairs in (5.3) is a compact subset of

    ℜ²₊₊ := { (ξ, η)^T ∈ ℜ² | ξ > 0, η > 0 }.

Then, by (5.76) and (5.3), there exist positive constants a_i < b_i such that

    a_i ≤ G_i(z^k)/H_i(z^k) ≤ b_i,   ∀k.

Let a := min_{i∈β(z*)} a_i and b := max_{i∈β(z*)} b_i. It follows that

    0 < a ≤ G_i(z^k)/H_i(z^k) ≤ b   (5.20)

for each k and each i ∈ β(z*). We then have from (5.20) that

    G_i(z^k) ≤ b H_i(z^k),   H_i(z^k) ≤ a⁻¹ G_i(z^k)

and so

    max{G_i(z^k), H_i(z^k)} ≤ (a⁻¹ + b) min{G_i(z^k), H_i(z^k)}   (5.21)

for each k and each i ∈ β(z*). By (5.21) and (5.19), we have

    max{G_i(z^k), H_i(z^k)} ≤ (a⁻¹ + b) ‖min(G(z^k), H(z^k))‖ ≤ ρ_1(z^k)

for all k sufficiently large, where the last inequality follows from the same facts as above. This completes the proof of (5.11), and so the function ρ_1 given by (5.18) can serve as an identification function. □

Theorem 5.4 Suppose that the sequence {z^k} generated by Algorithm H is asymptotically weakly nondegenerate. Then the function

    ρ_2(z) := τ ‖Φ_0(z)‖^σ,   τ > 0,  σ ∈ (0, 1)   (5.22)

is an identification function.

Proof: Noting that, for each i and each k,

    (2/(2 + √2)) ‖min(G(z^k), H(z^k))‖ ≤ ‖Φ_0(z^k)‖ ≤ (2 + √2) ‖min(G(z^k), H(z^k))‖

(see [81]), we have

    τ (2/(2 + √2))^σ ‖min(G(z^k), H(z^k))‖^σ ≤ ρ_2(z^k) ≤ τ (2 + √2)^σ ‖min(G(z^k), H(z^k))‖^σ.   (5.23)

Let

    ρ̲_1(z) := τ (2/(2 + √2))^σ ‖min(G(z), H(z))‖^σ,

    ρ̄_1(z) := τ (2 + √2)^σ ‖min(G(z), H(z))‖^σ.

By Theorem 5.3, both ρ̲_1 and ρ̄_1 are identification functions for Algorithm H. As a result, condition (5.10) holds with ρ̄_1. This, together with the second inequality in (5.23), implies that condition (5.10) holds with ρ = ρ_2. On the other hand, the first inequality in (5.23), together with the fact that conditions (5.11) and (5.12) with ρ = ρ̲_1 hold for all k sufficiently large, means that (5.11) and (5.12) hold with ρ = ρ_2 when k is sufficiently large. In consequence, the function ρ_2 satisfies conditions (5.10)–(5.12). This completes the proof. □
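Both identification functions are inexpensive to evaluate from the residuals at z^k. A minimal Python sketch (our naming; numpy and the Euclidean norm assumed):

    import numpy as np

    def rho1(G, H, tau=1.0, sigma=0.5):
        # rho_1(z) = tau * ||min(G(z), H(z))||^sigma, cf. (5.18)-(5.19).
        return tau * np.linalg.norm(np.minimum(G, H)) ** sigma

    def rho2(G, H, tau=1.0, sigma=0.5):
        # rho_2(z) = tau * ||Phi_0(z)||^sigma, cf. (5.22), where Phi_0 is the
        # unperturbed Fischer-Burmeister map (eps = 0).
        phi0 = G + H - np.sqrt(G**2 + H**2)
        return tau * np.linalg.norm(phi0) ** sigma

    G = np.array([0.9, 1e-3, 1e-3])
    H = np.array([1e-3, 0.8, 2e-3])
    print(rho1(G, H), rho2(G, H))  # comparable values, as (5.23) predicts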

The next lemma indicates that problem (P_ε) satisfies the standard LICQ under some appropriate assumptions, unlike problem (5.1), which fails to satisfy any constraint qualification at any feasible point. A proof of the lemma may be found in [31].

Lemma 5.1 If the MPCC-LICQ holds at z*, then there exist a neighborhood U(z*) of z* and a positive constant ε* such that, for any ε ∈ (0, ε*), problem (P_ε) satisfies the standard LICQ at any feasible point in U(z*).

Summarizing the above arguments, we obtain the following concluding result.

Theorem 5.5 Let σ ∈ (0, 1), τ > 0, and ρ be ρ_1 defined by (5.18) or ρ_2 defined by (5.22). Suppose that the sequence {z^k} of stationary points of problems (P_{εk}) converges to z* as ε_k → 0, the MPCC-LICQ holds at z*, and the sequence {z^k} is asymptotically weakly nondegenerate. Then, for all k sufficiently large, we have

(i) problem (P_{εk}) satisfies the standard LICQ at z^k;

(ii) the sets α_k, β_k, and γ_k defined by (5.13), (5.14), and (5.15) satisfy (5.9).

This theorem indicates that, by means of the technique introduced above, we can identify the active sets α(z*), β(z*), and γ(z*) in a finite number of iterations under some mild conditions. As a result, if the problem

    minimize   f(z)
    subject to G_i(z) ≥ 0,  H_i(z) = 0,  i ∈ α(z*),
               G_i(z) = 0,  H_i(z) = 0,  i ∈ β(z*),   (5.24)
               G_i(z) = 0,  H_i(z) ≥ 0,  i ∈ γ(z*),
               g(z) ≤ 0,  h(z) = 0

can be solved exactly, we may expect that Algorithm H terminates in finitely many steps. In particular, if z* is a stationary point obtained by solving (5.24), then, under the assumptions of either Theorem 5.1 or Theorem 5.2, the algorithm terminates in a finite number of iterations by producing a B-stationary point of problem (5.1). Note that it is possible that the algorithm terminates by producing a B-stationary point of (5.1) before we solve problem (5.24).

Remark 5.1 In order to ensure α_k ∪ β_k ∪ γ_k = {1, 2, ⋯, m} in Step 1 for each k, we may define

    β_k := {1, ⋯, m} \ (α_k ∪ γ_k)

instead of (5.14) in Algorithm H.

Remark 5.2 Under the assumption of asymptotically weak nondegeneracy of {z^k}, we may define α_k, β_k, γ_k in a different manner in Step 1. For example, we may put

    α_k := { i | G_i(z^k) > √ε_k, H_i(z^k) ≤ √ε_k },   (5.25)

    β_k := { i | G_i(z^k) ≤ √ε_k, H_i(z^k) ≤ √ε_k },   (5.26)

    γ_k := { i | G_i(z^k) ≤ √ε_k, H_i(z^k) > √ε_k }   (5.27)

instead of (5.13)–(5.15). In fact, the proof of Theorem 5.3 indicates that there exist two positive numbers a and b such that (5.20) holds for each i ∈ β(z*) and each k. Therefore, by the equality in (5.6) and the asymptotically weak nondegeneracy, we deduce

    G_i(z^k) = O(ε_k),   H_i(z^k) = O(ε_k),   i ∈ β(z*),   (5.28)

    min{G_i(z^k), H_i(z^k)} = O(ε_k²),   i ∈ α(z*) ∪ γ(z*).   (5.29)

Since √ε_k approaches zero more slowly than both ε_k and ε_k², we have from (5.28)–(5.29) that

    min{G_i(z^k), H_i(z^k)} ≤ √ε_k

for each i and each k sufficiently large. This means α_k ∪ β_k ∪ γ_k = {1, 2, ⋯, m} when k is sufficiently large. Noticing that α_k, β_k, γ_k defined by (5.25)–(5.27) are mutually disjoint for each k, we have that {α_k, β_k, γ_k} given by (5.25)–(5.27) is a partition of {1, ⋯, m} for all k sufficiently large. On the other hand, it is obvious that, when k is sufficiently large,

    α_k ⊇ α(z*),   β_k ⊆ β(z*),   γ_k ⊇ γ(z*).

Moreover, it follows from (5.26) and (5.28) that β(z*) ⊆ β_k for all k sufficiently large. Note that both {α_k, β_k, γ_k} and {α(z*), β(z*), γ(z*)} are partitions of {1, ⋯, m}. Therefore, (5.9) holds when k is large enough. So, we may employ (5.25)–(5.27) instead of (5.13)–(5.15) in Step 1 of Algorithm H. From the computational point of view, (5.25)–(5.27) are simpler than (5.13)–(5.15).
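A minimal sketch of the simpler test (5.25)–(5.27) follows (ours; numpy assumed). The single threshold √ε_k separates the O(ε_k) components belonging to β(z*) from the others, whose min-residual is O(ε_k²).

    import numpy as np

    def estimate_index_sets_sqrt(G, H, eps):
        # Estimates (5.25)-(5.27) with the threshold sqrt(eps_k).
        t = np.sqrt(eps)
        G, H = np.asarray(G, float), np.asarray(H, float)
        alpha = set(np.where((G > t) & (H <= t))[0])
        beta  = set(np.where((G <= t) & (H <= t))[0])
        gamma = set(np.where((G <= t) & (H > t))[0])
        return alpha, beta, gamma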

5.3 Modified Hybrid Method with Index Addition Strategy

For Algorithm H, the asymptotically weak nondegeneracy condition is a key assumption. Although this condition is not excessively stringent, because it is implied by the ULSC condition (see Section 5.5), it is certainly desirable to lessen the required assumptions. In this section, we introduce a modified method with an index addition strategy, and in the next section, we describe another method with the converse strategy. Neither of these two methods requires the assumption of asymptotically weak nondegeneracy.

Let ρ be a function defined by (5.18) or (5.22) with τ > 0 and σ ∈ (0, 1).

Algorithm HIA:

Step 0: Choose θ_0 > 0 and ε_0 > 0. Set k := 0.

Step 1: Solve problem (P_{εk}) to obtain a stationary point z^k and set

    α_0^k := { i | G_i(z^k) > ρ(z^k), H_i(z^k) ≤ ρ(z^k) },   (5.30)

    β_0^k := { i | G_i(z^k) ≤ ρ(z^k), H_i(z^k) ≤ ρ(z^k) },   (5.31)

    γ_0^k := {1, 2, ⋯, m} \ (α_0^k ∪ β_0^k),   (5.32)

    δ_0^k := +∞,

and j := 0. Go to Step 2.

Step 2: Solve the problem

    minimize   f(z)
    subject to G_i(z) ≥ 0,  H_i(z) = 0,  i ∈ α_j^k,
               G_i(z) = 0,  H_i(z) = 0,  i ∈ β_j^k,   (5.33)
               G_i(z) = 0,  H_i(z) ≥ 0,  i ∈ γ_j^k,
               g(z) ≤ 0,  h(z) = 0

to get a stationary point z_j^k.

Step 3: If, in (5.86), the Lagrange multipliers corresponding to the constraints

    G_i(z) ≥ 0,  H_i(z) = 0,  i ∈ α_j^k ∩ β(z_j^k),
    G_i(z) = 0,  H_i(z) = 0,  i ∈ β_j^k,   (5.34)
    G_i(z) = 0,  H_i(z) ≥ 0,  i ∈ γ_j^k ∩ β(z_j^k)

are all nonnegative, then terminate. Else, if there is an ī ∈ α_j^k ∪ γ_j^k such that

    min_{i∈α_j^k∪γ_j^k} max{G_i(z^k), H_i(z^k)} = max{G_ī(z^k), H_ī(z^k)} < θ_k,   (5.35)

then set

    δ_{j+1}^k := min{ δ_j^k, (1/2) min{ G_i(z_j^k) + H_i(z_j^k) | i ∈ α(z_j^k) ∪ γ(z_j^k) } }   (5.36)

and

    α_{j+1}^k := α_j^k \ {ī},   (5.37)

    β_{j+1}^k := β_j^k ∪ {ī},   (5.38)

    γ_{j+1}^k := γ_j^k \ {ī},   (5.39)

    j := j + 1

and go to Step 2. Otherwise, let j_k := j and δ_k := δ_j^k and go to Step 4.

Step 4: Choose ε_{k+1} ∈ (0, ε_k) and set θ_{k+1} := min{θ_k, δ_k}. Go to Step 1 with k := k + 1.
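The index addition rule in Step 3 can be isolated into a small routine. The Python sketch below is our own illustration of the selection (5.35) and the transfer (5.37)–(5.39); the update (5.36) of δ_j^k is omitted since it needs the subproblem solution z_j^k, and set operations use Python sets for clarity.

    def hia_transfer(Gk, Hk, alpha, beta, gamma, theta):
        # One index addition: pick i_bar in alpha U gamma minimizing
        # max{G_i(z^k), H_i(z^k)}; if the value is below theta_k, move
        # i_bar into beta (cf. (5.35), (5.37)-(5.39)).
        candidates = alpha | gamma
        if not candidates:
            return alpha, beta, gamma, False
        i_bar = min(candidates, key=lambda i: max(Gk[i], Hk[i]))
        if max(Gk[i_bar], Hk[i_bar]) < theta:
            return alpha - {i_bar}, beta | {i_bar}, gamma - {i_bar}, True
        return alpha, beta, gamma, False  # no transfer: proceed to Step 4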

The next theorem describes the relations between the sets α_j^k, β_j^k, γ_j^k, j = 0, 1, ⋯, j_k, and α(z*), β(z*), γ(z*).

Theorem 5.6 Suppose that the sequence {z^k} generated by Algorithm HIA converges to z* as ε_k → 0. Then there is an integer k_0 ≥ 0 such that, for any k ≥ k_0,

(i) α_0^k ⊇ α(z*), β_0^k ⊆ β(z*), γ_0^k ⊇ γ(z*);

(ii) if β(z*) = ∅, namely, z* is nondegenerate, then we have

    α_0^k = α(z*),   β_0^k = ∅,   γ_0^k = γ(z*);   (5.40)

(iii) if β(z*) ≠ ∅ and, for some j, β_j^k is a proper subset of β(z*), then any index ī ∈ α_j^k ∪ γ_j^k satisfying (5.35) belongs to β(z*) and hence we have

    α_0^k ⊇ α_1^k ⊇ ⋯ ⊇ α_j^k ⊇ α_{j+1}^k ⊇ α(z*),   (5.41)

    β_0^k ⊆ β_1^k ⊆ ⋯ ⊆ β_j^k ⊆ β_{j+1}^k ⊆ β(z*),   (5.42)

    γ_0^k ⊇ γ_1^k ⊇ ⋯ ⊇ γ_j^k ⊇ γ_{j+1}^k ⊇ γ(z*).   (5.43)

Proof: First of all, we note that

    α_0^k ⊇ α_1^k ⊇ ⋯ ⊇ α_j^k ⊇ α_{j+1}^k,

    β_0^k ⊆ β_1^k ⊆ ⋯ ⊆ β_j^k ⊆ β_{j+1}^k,

    γ_0^k ⊇ γ_1^k ⊇ ⋯ ⊇ γ_j^k ⊇ γ_{j+1}^k

by (5.37)–(5.39). Next we show the existence of the integer k_0. We only consider the case where ρ = ρ_2. We may deal with the other case similarly.

Since {z^k} converges to z* ∈ F, we have from the continuity of the function ρ that

    lim_{k→∞} ρ(z^k) = 0   (5.44)

and hence, for each i ∈ α(z*) ∪ γ(z*),

    max{G_i(z^k), H_i(z^k)} > ρ(z^k) ( := τ ‖Φ_0(z^k)‖^σ )   (5.45)

    ≥ 2 ‖Φ_0(z^k)‖

    ≥ 2 φ_0(G_i(z^k), H_i(z^k))

    = 2 ( G_i(z^k) + H_i(z^k) − √((G_i(z^k))² + (H_i(z^k))²) )

    = 4 G_i(z^k) H_i(z^k) / ( G_i(z^k) + H_i(z^k) + √((G_i(z^k))² + (H_i(z^k))²) )

    ≥ min{G_i(z^k), H_i(z^k)}   (5.46)

when k is large enough, where the second inequality follows from the fact that ‖Φ_0(z^k)‖ converges to 0 and the constant σ lies in the interval (0, 1), and the last inequality follows from (5.6) and the fact that

    G_i(z^k) + H_i(z^k) + √((G_i(z^k))² + (H_i(z^k))²) ≤ 2(G_i(z^k) + H_i(z^k)) ≤ 4 max{G_i(z^k), H_i(z^k)}.

Thus, we have from (5.45)–(5.46) and the continuity of the functions G and H that, for any k sufficiently large,

    G_i(z^k) > ρ(z^k),  H_i(z^k) ≤ ρ(z^k),   i ∈ α(z*),   (5.47)

    G_i(z^k) ≤ ρ(z^k),  H_i(z^k) > ρ(z^k),   i ∈ γ(z*).   (5.48)

Note that (5.30)–(5.32) imply

    γ_0^k ⊇ { i | G_i(z^k) ≤ ρ(z^k), H_i(z^k) > ρ(z^k) }   (5.49)

for each k. Moreover, it is obvious from (5.44) that β_0^k ⊆ β(z*) when k is sufficiently large. It then follows from (5.30)–(5.32) and (5.47)–(5.49) that there exists an integer k_1 ≥ 0 such that, for any k ≥ k_1,

    α_0^k ⊇ α(z*),   β_0^k ⊆ β(z*),   γ_0^k ⊇ γ(z*).   (5.50)

If z* is nondegenerate, then (5.40) follows from (5.50) and the fact that both {α_0^k, β_0^k, γ_0^k} and {α(z*), β(z*), γ(z*)} are partitions of the set {1, 2, ⋯, m}. Therefore, (i) and (ii) hold for all k ≥ k_1.

Suppose β(z*) ≠ ∅. If β(z*) = {1, 2, ⋯, m}, then we have β_{j+1}^k ⊆ β(z*) for any k. Next we suppose that β(z*) is a proper subset of {1, 2, ⋯, m}. We will show that, when k is large enough, if there is an ī ∈ α_j^k ∪ γ_j^k satisfying (5.35) and β_j^k is a proper subset of β(z*), then ī ∈ β(z*). Note that

    lim_{k→∞} max{G_i(z^k), H_i(z^k)} = G_i(z*) + H_i(z*) > 0,   i ∉ β(z*),   (5.51)

    lim_{k→∞} max{G_i(z^k), H_i(z^k)} = 0,   i ∈ β(z*).   (5.52)

We have from (5.51) and (5.52) that there exists an integer k_0 ≥ k_1 such that, for any k ≥ k_0,

    max_{i∈β(z*)} max{G_i(z^k), H_i(z^k)} < min_{i∉β(z*)} max{G_i(z^k), H_i(z^k)}.   (5.53)

This inequality means that the index ī satisfying (5.35) must be in β(z*) as long as β_j^k is a proper subset of β(z*), namely,

    (α_j^k ∪ γ_j^k) ∩ β(z*) = β(z*) \ β_j^k ≠ ∅.

By (5.38), we have β_{j+1}^k ⊆ β(z*) and therefore, (5.41)–(5.43) hold for all k ≥ k_0. Note that, since k_0 ≥ k_1, (i) and (ii) also hold for all k ≥ k_0. This completes the proof. □

We now analyze the convergence properties of Algorithm HIA in detail. First, we make the following assumption:


A1: Even if an identical subproblem appears in Step 2 at infinitely many iterations and this problem has an infinite number of solutions, we always obtain the same solution, or at most finitely many different solutions.

This assumption seems reasonable in practice, since an iterative method applied to solve a subproblem will generate an identical sequence as long as the same starting point is chosen.

In Algorithm HIA, for each k, the parameter θ_k is expected to serve as a positive lower bound of

    min_{i∉β(z*)} max{G_i(z*), H_i(z*)} = min_{i∉β(z*)} ( G_i(z*) + H_i(z*) ) > 0.   (5.54)

In fact, a key technique for obtaining a finite termination property of Algorithm HIA is to choose the index sets α_j^k, β_j^k and γ_j^k so that

    α_{j'_k}^k = α(z*),   β_{j'_k}^k = β(z*),   γ_{j'_k}^k = γ(z*)   (5.55)

hold for some index j'_k ∈ {0, 1, ⋯, j_k} when k is large enough. In order to ensure this, the number θ_k needs to be small enough to exclude all the indices in α(z*) ∪ γ(z*) from β_{j'_k}^k for all k sufficiently large. Another requirement is that all the indices in β(z*) remain in β_{j'_k}^k when k is sufficiently large.

Since the index set {1, 2, ⋯, m} has a finite number of partitions, there are a finite number of subproblems (5.86). By Assumption A1, the set

    S := { z_j^k | 0 ≤ j ≤ j_k, k = 0, 1, ⋯ }   (5.56)

is a finite set. Recall that z_j^k ∈ F for any k and j. We consider the following two cases.

Case I: ⋃_{z_j^k∈S} ( α(z_j^k) ∪ γ(z_j^k) ) ≠ ∅. In this case, we have

    min_{z_j^k∈S} min{ G_i(z_j^k) + H_i(z_j^k) | i ∈ α(z_j^k) ∪ γ(z_j^k) } > 0,

since S is a finite set. It then follows from (5.36) and the way of updating δ_k in Step 3 that the parameter δ_k stays at a positive constant when k is sufficiently large. So, by the updating rule of θ_k, there exists an integer k̄ > 0 such that

    θ_k = θ_{k̄} > 0,   k ≥ k̄.   (5.57)

Since

    lim_{k→∞} max{G_i(z^k), H_i(z^k)} = 0,   ∀i ∈ β(z*),

we have

    max{G_i(z^k), H_i(z^k)} < θ_k,   ∀i ∈ β(z*)   (5.58)

for all sufficiently large k. Moreover, by the definition of β(z*), we have

    max_{i∈β(z*)} max{G_i(z^k), H_i(z^k)} < min_{i∉β(z*)} max{G_i(z^k), H_i(z^k)}   (5.59)

for any k large enough. Taking into account (5.35), we deduce from (5.58) and (5.59) that the indices in β(z*) are inevitably included in some β_j^k and, in Step 3, these indices must be chosen earlier than the indices in α(z*) ∪ γ(z*). As a result, there is some j ∈ {0, 1, ⋯, j_k} such that β(z*) ⊆ β_j^k whenever k is sufficiently large. This, together with Theorem 5.6, means that, when k is sufficiently large, there must be some index j'_k ∈ {0, 1, ⋯, j_k} satisfying (5.55), and then problem (5.86) with j = j'_k is actually equivalent to problem (5.24). As long as a solution of problem (5.86) with j = j'_k yields a B-stationary point of problem (5.1), Algorithm HIA may terminate in a finite number of iterations. Of course, it may happen that the algorithm stops by getting another B-stationary point of (5.1) before we identify the correct index sets.

Furthermore, we make the following assumption, which is most likely to hold when problem (5.1) has finitely many B-stationary points:

A2: The limit point z* of {z^k}, which is a sequence of stationary points of (P_{εk}), is a B-stationary point of problem (5.1) and it belongs to the set S given by (5.56).

Then, it follows from (5.36) and the updating rule of θ_k that the number θ_{k̄} in (5.57) is actually a positive lower bound of (5.54). Thus, our requirements are fulfilled: that is, in Step 3,

(a) all the indices in α(z*) ∪ γ(z*) are excluded from β_{j_k}^k eventually;

(b) all the indices in β(z*) remain in β_{j_k}^k eventually.

Therefore, we are able to identify the correct index sets α(z*), β(z*), and γ(z*) in a finite number of iterations and, furthermore, we may terminate the algorithm finitely by getting z*. Note that Algorithm HIA may stop prematurely by producing another B-stationary point of problem (5.1).

Case II: α(z_j^k) ∪ γ(z_j^k) = ∅ for all z_j^k ∈ S. In this case, we have from the updating rules of δ_k and θ_k that δ_k remains +∞ and so

    θ_k ≡ θ_0,   ∀k.

Since the strategy in Algorithm HIA is to add some indices to β_j^k one after another, by Theorem 5.6, we have the same conclusion as in Case I: when k is sufficiently large, there exists some index j'_k ∈ {0, 1, ⋯, j_k} satisfying (5.55). Furthermore, suppose that Assumption A2 holds. In the present case, this means

    β(z*) = {1, 2, ⋯, m}.

Recall that θ_k ≡ θ_0 for all k. As a result, all indices are eventually chosen into β_{j_k}^k when k becomes large enough, i.e., we can also identify the index sets α(z*), β(z*), and γ(z*) in a finite number of steps.

The preceding analysis together with Lemma 5.1 yields the following concluding result.

Theorem 5.7 Suppose that the sequence {z^k} generated by Algorithm HIA converges to z* as ε_k → 0 and the MPCC-LICQ holds at z*. Then, under Assumption A1, we have that, for any sufficiently large k,

(i) problem (P_{εk}) satisfies the standard LICQ at z^k;

(ii) there exists j'_k ∈ {0, 1, ⋯, j_k} such that α_{j'_k}^k, β_{j'_k}^k, and γ_{j'_k}^k satisfy condition (5.55).

If, furthermore, Assumption A2 also holds, then

    α_{j_k}^k = α(z*),   β_{j_k}^k = β(z*),   γ_{j_k}^k = γ(z*)   (5.60)

hold for all k sufficiently large.

In consequence, without the assumption of asymptotically weak nondegeneracy, we have attained the same goal as Algorithm H, which is to identify the index sets α(z*), β(z*), and γ(z*) finitely. Thus, Algorithm HIA may terminate in a finite number of iterations by producing a B-stationary point of problem (5.1).

On the other hand, if the sequence {z^k} is asymptotically weakly nondegenerate, the sets α_0^k, β_0^k, γ_0^k given in Step 1 of Algorithm HIA are the same as the sets α_k, β_k, γ_k given in Step 1 of Algorithm H when k is large enough. Theorem 5.5(ii) immediately yields the following result.

Theorem 5.8 Suppose the sequence {z^k} generated by Algorithm HIA converges to z* as ε_k → 0, the MPCC-LICQ holds at z*, and {z^k} is asymptotically weakly nondegenerate. Then the sets α_0^k, β_0^k, and γ_0^k satisfy (5.55) with j'_k = 0 when k is large enough.

The main strategy in Algorithm HIA is to add some indices, which are chosen from α_j^k ∪ γ_j^k, to β_j^k. Since α_j^k ∪ γ_j^k contains β(z*) \ β_j^k for any k large enough, condition (5.55) holds for some j'_k ∈ {0, 1, ⋯, j_k} when k is large enough. In order to ensure this, the inclusions in (5.50) are necessary. In the above discussion, we supposed that τ > 0 and σ ∈ (0, 1), just as in Algorithm H. Actually, from the proof of Theorem 5.6, (5.50) remains true for the case where τ ≥ 2 and σ = 1 and, furthermore, so do Theorems 5.6 and 5.7. Moreover, since the functions G and H play a symmetric role in problem (5.1), we may exchange the definitions of α_0^k and γ_0^k in Step 1 of Algorithm HIA, namely, let

    γ_0^k := { i | G_i(z^k) ≤ ρ(z^k), H_i(z^k) > ρ(z^k) },

    α_0^k := {1, 2, ⋯, m} \ (β_0^k ∪ γ_0^k)

instead of (5.32) and (5.30), respectively.

5.4 Modified Hybrid Method with Index Subtraction Strategy

In this section, we consider another hybrid algorithm that adopts an index subtraction strategy. One advantage of this algorithm is that the function ρ employed in the last section can be replaced by a sequence of positive numbers.

Algorithm HIS:

Step 0: Choose η > 0, θ_0 > 0, ξ_0 > 0, and ε_0 > 0. Set k := 0.

Step 1: Solve problem (P_{εk}) and let z^k denote one of its stationary points. Set

    α_0^k := { i | G_i(z^k) > η, H_i(z^k) ≤ ξ_k },   (5.61)

    γ_0^k := { i | G_i(z^k) ≤ ξ_k, H_i(z^k) > η } \ α_0^k,   (5.62)

    β_0^k := {1, 2, ⋯, m} \ (α_0^k ∪ γ_0^k),   (5.63)

    δ_0^k := +∞,

and j := 0. Go to Step 2.

Step 2: If problem (5.86) is solvable, let z_j^k denote one of its stationary points and go to Step 3. Otherwise, go to Step 4.

Step 3: If, in (5.86), the Lagrange multipliers corresponding to the constraints (5.34) are all nonnegative, then terminate. Else, if there is an ī ∈ β_j^k such that

    max_{i∈β_j^k} max{G_i(z^k), H_i(z^k)} = max{G_ī(z^k), H_ī(z^k)} > θ_k,   (5.64)

then set

    δ_{j+1}^k := min{ δ_j^k, (1/2) min{ G_i(z_j^k) + H_i(z_j^k) | i ∈ α(z_j^k) ∪ γ(z_j^k) } }   (5.65)

and

    α_{j+1}^k := α_j^k ∪ {ī} if G_ī(z^k) ≥ H_ī(z^k), and α_{j+1}^k := α_j^k otherwise,

    β_{j+1}^k := β_j^k \ {ī},

    γ_{j+1}^k := γ_j^k if G_ī(z^k) ≥ H_ī(z^k), and γ_{j+1}^k := γ_j^k ∪ {ī} otherwise,

    j := j + 1

and go to Step 2. Otherwise, let j_k := j and δ_k := δ_j^k and go to Step 4.

Step 4: Choose ε_{k+1} ∈ (0, ε_k), ξ_{k+1} ∈ (0, ξ_k], and set θ_{k+1} := min{θ_k, δ_k}. Let k := k + 1 and go to Step 1.
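Symmetrically to Algorithm HIA, Step 3 of Algorithm HIS removes from β_j^k the index with the largest residual when it exceeds θ_k, routing it to α_j^k or γ_j^k according to which of G_ī(z^k), H_ī(z^k) dominates. A minimal Python sketch under the same conventions as before (ours; the δ update is again omitted):

    def his_transfer(Gk, Hk, alpha, beta, gamma, theta):
        # One index subtraction: pick i_bar in beta maximizing
        # max{G_i(z^k), H_i(z^k)}; if the value exceeds theta_k, move
        # i_bar to alpha when G >= H and to gamma otherwise (cf. (5.64)).
        if not beta:
            return alpha, beta, gamma, False
        i_bar = max(beta, key=lambda i: max(Gk[i], Hk[i]))
        if max(Gk[i_bar], Hk[i_bar]) > theta:
            if Gk[i_bar] >= Hk[i_bar]:
                return alpha | {i_bar}, beta - {i_bar}, gamma, True
            return alpha, beta - {i_bar}, gamma | {i_bar}, True
        return alpha, beta, gamma, False  # no transfer: proceed to Step 4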

Note that, since G_i(z^k) → 0 for each i ∈ β(z*) ∪ γ(z*), the index set α_0^k determined in Step 1 will eventually consist of indices in α(z*) only. Similarly, γ_0^k will eventually consist of indices in γ(z*) only. Therefore, we must have

    α_0^k ⊆ α(z*),   β_0^k ⊇ β(z*),   γ_0^k ⊆ γ(z*)   (5.66)

for all k sufficiently large. This is the key requirement for Algorithm HIS, which subtracts some indices from β_j^k so that condition (5.55) holds for some j'_k ∈ {0, 1, ⋯, j_k} when k is large enough.

In Algorithm HIS, the strict positivity of η is essential, and the sequence {ξ_k} plays only a subsidiary role: the sets α_0^k and γ_0^k should be chosen not too large, so that the key condition (5.66) holds as early as possible. To this end, we may choose {ξ_k} to be a null sequence. On the other hand, from the computational viewpoint, it is desirable that the set β_0^k be as small as possible, i.e., that the sets α_0^k and γ_0^k be as large as possible. In consequence, it is important to choose the constant η > 0 and the sequence {ξ_k} appropriately.

Another practical choice is simply to remove {ξ_k} from Algorithm HIS. For example, we may define

    α_0^k := { i | G_i(z^k) > η },

    γ_0^k := { i | H_i(z^k) > η } \ α_0^k,

    β_0^k := {1, 2, ⋯, m} \ (α_0^k ∪ γ_0^k)

instead of (5.61)–(5.63), or preferably, let

    α_0^k := { i | G_i(z^k) > η, H_i(z^k) ≤ η },

    γ_0^k := { i | G_i(z^k) ≤ η, H_i(z^k) > η },

    β_0^k := {1, 2, ⋯, m} \ (α_0^k ∪ γ_0^k),

which is equivalent to letting ξ_k ≡ η (∀k) in (5.61)–(5.63).

Moreover, the parameter θ_k is also expected to be a positive lower bound of (5.54) so that the indices outside β(z*) can eventually be removed from β_{j_k}^k in Step 3. As analyzed in the last section, Assumptions A1 and A2 guarantee that the parameter θ_k satisfies this requirement, i.e., θ_k can serve as a positive lower bound of (5.54) when k is large enough.

In a similar way to the last section, a comprehensive and detailed analysis can be given for Algorithm HIS. In particular, under Assumption A1, θ_k stays at a positive constant when k is sufficiently large in both Cases I and II considered in the last section. It then follows from (5.64) that the indices in β(z*) cannot be selected in Step 3 for all k large enough, i.e., they remain in β_{j_k}^k eventually. Thus, we have the following result.

Theorem 5.9 Suppose that the sequence {z^k} generated by Algorithm HIS converges to z* as ε_k → 0 and Assumption A1 holds. Then, for any sufficiently large k, we have

    α_0^k ⊆ α_1^k ⊆ ⋯ ⊆ α_{j_k}^k ⊆ α(z*),

    β_0^k ⊇ β_1^k ⊇ ⋯ ⊇ β_{j_k}^k ⊇ β(z*),

    γ_0^k ⊆ γ_1^k ⊆ ⋯ ⊆ γ_{j_k}^k ⊆ γ(z*).

If, furthermore, Assumption A2 holds, then we have (5.60) for all k sufficiently large.

This theorem indicates that, under assumptions similar to those of the previous section, Algorithm HIS also has a finite termination property.

5.5 Further Discussions

We have mainly considered B-stationarity for MPCC in the previous sections. In this section, we make some remarks on the algorithms, including the employed assumptions, the stopping criteria, and some extensions.


5.5.1 Remarks on the assumptions

We first investigate the connection between the ULSC condition and the asymptotically weak nondegeneracy condition [31]. Our result can be stated as follows.

Theorem 5.10 Let z^k be stationary to problem (P_{εk}) for each k and the sequence {z^k} converge to z* as ε_k → 0. Suppose that the MPCC-LICQ holds at z*. If the ULSC condition holds at z*, then {z^k} is asymptotically weakly nondegenerate.

Proof: The theorem trivially holds when I_G(z*) ∩ I_H(z*) = ∅. So we assume I_G(z*) ∩ I_H(z*) ≠ ∅. First, we show that z* is a weakly stationary point, i.e., there exist multiplier vectors λ_g, λ_h, λ_G, and λ_H satisfying

    ∇f(z*) + ∇g(z*)λ_g + ∇h(z*)λ_h − ∇G(z*)λ_G − ∇H(z*)λ_H = 0,   (5.67)

    λ_g ≥ 0,   λ_g^T g(z*) = 0,   (5.68)

    λ_{G,i} = 0,   i ∉ I_G(z*),   (5.69)

    λ_{H,i} = 0,   i ∉ I_H(z*).   (5.70)

Note that the ULSC condition means

    λ_{G,i} λ_{H,i} ≠ 0,   i ∈ I_G(z*) ∩ I_H(z*).   (5.71)

By the continuity of the functions involved and the assumptions that {z^k} converges to z* and the MPCC-LICQ holds at z*, we have that, for all k large enough,

    I_G(z^k) ⊆ I_G(z*),   I_H(z^k) ⊆ I_H(z*),   I_g(z^k) ⊆ I_g(z*)   (5.72)

and the set

    { ∇g_l(z^k), ∇h_r(z^k) : l ∈ I_g(z*), r = 1, ⋯, q }
    ∪ { ∇G_i(z^k), ∇H_i(z^k) : i ∈ I_G(z*) ∩ I_H(z*) }
    ∪ { ∇G_i(z^k) + (G_i(z^k)/H_i(z^k)) ∇H_i(z^k) : i ∈ I_G(z*) \ I_H(z*) }
    ∪ { ∇H_i(z^k) + (H_i(z^k)/G_i(z^k)) ∇G_i(z^k) : i ∈ I_H(z*) \ I_G(z*) }   (5.73)

is linearly independent. It follows from the stationarity of z^k that there exist Lagrange multiplier vectors λ_g^k ∈ ℜ^p, λ_h^k ∈ ℜ^q, and λ_Φ^k ∈ ℜ^m such that

    ∇f(z^k) + ∇g(z^k)λ_g^k + ∇h(z^k)λ_h^k − ∇Φ_{εk}(z^k)λ_Φ^k = 0,   (5.74)

    λ_g^k ≥ 0,   g(z^k)^T λ_g^k = 0.   (5.75)


Note that

    ∇Φ_{εk,i}(z^k) = (H_i(z^k)/(G_i(z^k) + H_i(z^k))) ∇G_i(z^k) + (G_i(z^k)/(G_i(z^k) + H_i(z^k))) ∇H_i(z^k)   (5.76)

holds for each i and k [31]. Moreover, by (5.72) and (5.75), we have λ_{g,l}^k = 0 for every l ∉ I_g(z*). Therefore, the equation (5.74) becomes

    0 = ∇f(z^k) + Σ_{l∈I_g(z*)} λ_{g,l}^k ∇g_l(z^k) + ∇h(z^k)λ_h^k

    − Σ_{i∈I_G(z*)∩I_H(z*)} ( (λ_{Φ,i}^k H_i(z^k)/(G_i(z^k) + H_i(z^k))) ∇G_i(z^k) + (λ_{Φ,i}^k G_i(z^k)/(G_i(z^k) + H_i(z^k))) ∇H_i(z^k) )

    − Σ_{i∈I_G(z*)\I_H(z*)} (λ_{Φ,i}^k H_i(z^k)/(G_i(z^k) + H_i(z^k))) ( ∇G_i(z^k) + (G_i(z^k)/H_i(z^k)) ∇H_i(z^k) )

    − Σ_{i∈I_H(z*)\I_G(z*)} (λ_{Φ,i}^k G_i(z^k)/(G_i(z^k) + H_i(z^k))) ( ∇H_i(z^k) + (H_i(z^k)/G_i(z^k)) ∇G_i(z^k) ).   (5.77)

By the linear independence of the set (5.73), we have that all the multiplier sequences

    { λ_{g,l}^k : l ∈ I_g(z*) },   { λ_{h,r}^k : r = 1, 2, ⋯, q },

    { λ_{Φ,i}^k H_i(z^k)/(G_i(z^k) + H_i(z^k)) : i ∈ I_G(z*) },

    { λ_{Φ,j}^k G_j(z^k)/(G_j(z^k) + H_j(z^k)) : j ∈ I_H(z*) }

are convergent. Define λ_g, λ_h, λ_G, and λ_H as follows:

    λ_{g,l} = { lim_{k→∞} λ_{g,l}^k,  l ∈ I_g(z*);   0,  l ∉ I_g(z*) };   (5.78)

    λ_{h,r} = lim_{k→∞} λ_{h,r}^k,   r = 1, 2, ⋯, q;   (5.79)

    λ_{G,i} = { lim_{k→∞} λ_{Φ,i}^k H_i(z^k)/(G_i(z^k) + H_i(z^k)),  i ∈ I_G(z*);   0,  i ∉ I_G(z*) };   (5.80)

    λ_{H,j} = { lim_{k→∞} λ_{Φ,j}^k G_j(z^k)/(G_j(z^k) + H_j(z^k)),  j ∈ I_H(z*);   0,  j ∉ I_H(z*) }.   (5.81)

Letting k → ∞ in (5.77) and taking into account (5.75) and (5.78)–(5.81), we have that (5.67)–(5.70) hold. Next we suppose that the ULSC holds at z*. It then follows from (5.71) and (5.80)–(5.81) that, for each i ∈ I_G(z*) ∩ I_H(z*),

    λ_{G,i} = lim_{k→∞} λ_{Φ,i}^k H_i(z^k)/(G_i(z^k) + H_i(z^k)) ≠ 0,   (5.82)

    λ_{H,i} = lim_{k→∞} λ_{Φ,i}^k G_i(z^k)/(G_i(z^k) + H_i(z^k)) ≠ 0.   (5.83)


Since both G_i(z^k) and H_i(z^k) are positive for each i and k, it follows from (5.82) and (5.83) that

    0 < |λ_{G,i}| ≤ liminf_{k→∞} |λ_{Φ,i}^k|,

    lim_{k→∞} λ_{Φ,i}^k = λ_{G,i} + λ_{H,i}

for each i ∈ I_G(z*) ∩ I_H(z*). Therefore, {λ_{Φ,i}^k} is convergent to a nonzero number for each i ∈ I_G(z*) ∩ I_H(z*). We then have from (5.82) and (5.83) that, for each i ∈ I_G(z*) ∩ I_H(z*),

    lim_{k→∞} H_i(z^k)/(G_i(z^k) + H_i(z^k)) = λ_{G,i}/(λ_{G,i} + λ_{H,i}) ≠ 0,

    lim_{k→∞} G_i(z^k)/(G_i(z^k) + H_i(z^k)) = λ_{H,i}/(λ_{G,i} + λ_{H,i}) ≠ 0.

This, together with (5.76), implies that the sequence {z^k} is asymptotically weakly nondegenerate. This completes the proof of the theorem. □

Let us make a remark on Assumption A2 employed by Algorithms HIA and HIS. Recall that we assume the sequence {z^k} converges to z*. Actually, the sequence {z^k} may have multiple limit points in general. In this case, it is easy to see that, as long as one of the limit points satisfies the assumptions made for z* in Algorithms HIA and HIS, we may obtain similar conclusions. Thus, Assumption A2 can be restated as follows:

A2′: The set S (defined in Section 5.3) contains an accumulation point of the sequence {z^k} that is a B-stationary point of problem (5.1).

This indicates that our assumptions for Algorithms HIA and HIS are really not very restrictive.

5.5.2 Stopping criteria

The stopping criterion in Step 3 of Algorithms H, HIA, and HIS is used to check the B-stationarity of the point zk. It is based on the fact that, for given (αk, βk, γk), if zk is a stationary point of the problem

    minimize   f(z)
    subject to Gi(z) ≥ 0, Hi(z) = 0,  i ∈ αk,
               Gi(z) = 0, Hi(z) = 0,  i ∈ βk,      (5.84)
               Gi(z) = 0, Hi(z) ≥ 0,  i ∈ γk,
               g(z) ≤ 0, h(z) = 0

with nonnegative Lagrange multipliers related to the constraints

    Gi(z) ≥ 0, Hi(z) = 0,  i ∈ αk ∩ β(zk),
    Gi(z) = 0, Hi(z) = 0,  i ∈ βk,      (5.85)
    Gi(z) = 0, Hi(z) ≥ 0,  i ∈ γk ∩ β(zk),

and the MPCC-LICQ holds at zk, then zk is a B-stationary point of problem (5.1) [33]. From the meaning of B-stationarity, we see that B-stationary points are obviously good candidates for local minimizers of problem (5.1). However, it may be difficult to obtain such a point for MPCCs in general. Actually, we may use some other conditions to check M-stationarity or C-stationarity, in view of the fact that, if the MPCC-LICQ holds at zk and the Lagrange multipliers λ^k_{G,i} and λ^k_{H,i} corresponding to (5.85) satisfy some suitable conditions, then zk is C- or M-stationary to problem (5.1). Hence, the algorithms may terminate finitely by producing a C- or M-stationary point of problem (5.1) under some weaker conditions.
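In an implementation built on the MATLAB solver fmincon (used in Section 5.6 below), such a test can be carried out on the Lagrange multipliers that the solver returns in its fifth output argument. The following is a minimal sketch, not the author's code: the index vectors idx85_in and idx85_eq, assumed to locate the (5.85)-constraints among the nonlinear inequalities and equalities passed to the solver, and the tolerance tol are illustrative assumptions.

    % Hedged sketch: numerical test of the sign condition on the multipliers of (5.85).
    [zk, fk, flag, out, lambda] = fmincon(f, z0, [], [], [], [], [], [], nonlcon);
    tol    = 1e-8;                           % tolerance (an assumption)
    mult85 = [lambda.ineqnonlin(idx85_in);   % multipliers of the (5.85)-inequalities
              lambda.eqnonlin(idx85_eq)];    % multipliers of the (5.85)-equalities
    stop   = all(mult85 >= -tol);            % numerically nonnegative multipliers:
                                             % zk passes the B-stationarity check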

5.5.3 Comparison of the algorithms

Compared with Algorithm H, Algorithms HIA and HIS may need to solve more subproblems

    minimize   f(z)
    subject to Gi(z) ≥ 0, Hi(z) = 0,  i ∈ α_j^k,
               Gi(z) = 0, Hi(z) = 0,  i ∈ β_j^k,      (5.86)
               Gi(z) = 0, Hi(z) ≥ 0,  i ∈ γ_j^k,
               g(z) ≤ 0, h(z) = 0.

On the other hand, Algorithm H may have to solve more subproblems (Pε) than the other two algorithms in general. From both theoretical and computational points of view, problem (5.86) is easier to deal with than problem (Pε). For example, under the condition that the functions G, H, h are all affine and each gi is convex, the feasible region of problem (5.86) is convex, but that of problem (Pε) is not convex; indeed, even for affine Gi and Hi, a constraint of the form Gi(z)Hi(z) = ε² (cf. (5.87) below) traces a nonconvex set, such as a branch of a hyperbola when Gi(z) = z1 and Hi(z) = z2. A minimal solver sketch for a subproblem of the form (5.86) is given below.
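As referenced above, here is a minimal sketch, not the author's implementation, of how a subproblem of the form (5.86) might be set up for fmincon. The function handles f, G, H, g, h and the index sets alpha, beta, gamma are assumed to be given; fmincon expects the nonlinear constraints as a function returning inequalities c(z) ≤ 0 and equalities ceq(z) = 0.

    function z = solve586(f, G, H, g, h, alpha, beta, gamma, z0)
    % Sketch: solve one subproblem (5.86) with fmincon (assumed handles and index sets).
        z = fmincon(f, z0, [], [], [], [], [], [], @constraints);

        function [c, ceq] = constraints(z)
            Gz = G(z); Hz = H(z);
            c   = [-Gz(alpha);               % G_i(z) >= 0, i in alpha
                   -Hz(gamma);               % H_i(z) >= 0, i in gamma
                    g(z)];                   % g(z) <= 0
            ceq = [ Hz(alpha);               % H_i(z) = 0, i in alpha
                    Gz(beta); Hz(beta);      % G_i(z) = H_i(z) = 0, i in beta
                    Gz(gamma);               % G_i(z) = 0, i in gamma
                    h(z)];                   % h(z) = 0
        end
    end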


5.5.4 Extensions

We have presented Algorithms H, HIA, and HIS by applying an active set identification technique to the smoothing continuation method. Actually, the proposed approaches may be extended by using other subproblems instead of (Pε) in Step 1 of the algorithms.

(I) Since problem (Pε) is equivalent to

    minimize   f(z)
    subject to g(z) ≤ 0, h(z) = 0,      (5.87)
               G(z) + H(z) ≥ 0,
               Gi(z)Hi(z) = ε²,  i = 1, · · · , m,

we may use problem (5.87) instead of (Pε) in Step 1 of the algorithms at each iteration. It is obvious that all analysis and conclusions remain valid. Note that, for any ε > 0, the constraints

    Gi(z) + Hi(z) ≥ 0,  i = 1, 2, · · · , m

are always inactive, and so problem (5.87) seems simpler than (Pε).
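To see why (Pε) and (5.87) are equivalent, recall that (Pε) imposes Φε,i(z) = 0 for each i. Assuming the smoothing function of [31] takes the form Φε,i(z) = ½( Gi(z) + Hi(z) − √((Gi(z) − Hi(z))² + 4ε²) ), which is consistent with the gradient formula (5.76), we have

    Φε,i(z) = 0  ⟺  Gi(z) + Hi(z) = √((Gi(z) − Hi(z))² + 4ε²)
             ⟺  Gi(z) + Hi(z) ≥ 0  and  Gi(z)Hi(z) = ε²,

since squaring the middle identity cancels the terms Gi(z)² + Hi(z)² and leaves 4Gi(z)Hi(z) = 4ε². These are exactly the constraints of (5.87).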

(II) The regularization scheme

    minimize   f(z)
    subject to g(z) ≤ 0, h(z) = 0,      (5.88)
               G(z) ≥ 0, H(z) ≥ 0,
               Gi(z)Hi(z) ≤ ε,  i = 1, · · · , m,

and the penalty scheme

    minimize   f(z) + ε⁻¹ G(z)ᵀH(z)
    subject to g(z) ≤ 0, h(z) = 0,      (5.89)
               G(z) ≥ 0, H(z) ≥ 0,

where ε is a positive parameter, have been proposed as approximate problems of (5.1) in [76] and [38], respectively. These two methods share similar properties with the smoothing continuation method. We may replace (Pε) by (5.88) or (5.89) in Step 1 of the algorithms at each iteration and obtain similar results.


5.6 Computational Results

We have tested the proposed algorithms on various instances of MPCCs. In our experiments, we employed the MATLAB 6.0 built-in solver function fmincon to solve the subproblems at each iteration. The computational results indicate that the proposed approach can find an optimal solution of an MPCC in a small number of iterations. We report the details below.

Table 5.1: Computational results for Problem 5.1

    Smoothing        z̄      ε0 = 10⁻²   (0.5211, 0.5211, 0.5232, 0.5232, 0.0044, 0.0044)
    Continuation            ε1 = 10⁻⁴   (0.5011, 0.5011, 0.5011, 0.5011, 0.0000, 0.0000)
    Method                  ε2 = 10⁻⁶   (0.5000, 0.5000, 0.5000, 0.5000, -0.0000, -0.0000)
                     Ite                27
    Algorithm H      z      ε0 = 10⁻²   (0.5000, 0.5000, 0.5000, 0.5000, 0, 0)
                     β(z)   ε0 = 10⁻²   {1, 2}
                     Ite                15
    Regularization   z̄      ε0 = 10⁻²   (0.5000, 0.5000, 0.5000, 0.5000, 0.0000, 0)
    Method           Ite                5
    Algorithm H      z      ε0 = 10⁻²   (0.5000, 0.5000, 0.5000, 0.5000, 0, 0)
                     β(z)   ε0 = 10⁻²   {1, 2}
                     Ite                6
    Penalty          z̄      ε0 = 10⁻²   (0.5000, 0.5000, 0.5000, 0.5000, 0.0000, -0.0000)
    Method           Ite                6
    Algorithm H      z      ε0 = 10⁻²   (0.5000, 0.5000, 0.5000, 0.5000, 0, 0)
                     β(z)   ε0 = 10⁻²   {1, 2}
                     Ite                7

5.6.1 Computational results for Algorithm H

In order to get a comprehensive computational experience, we have investigated Algorithm H incorporating the various methods mentioned in the last section. In our testing, we set ε0 = 10⁻² and updated this parameter by εk+1 = 10⁻²εk. The point zk is used as the starting point for the next step in Algorithm H and the related methods. We employed

    ρ1(z) = ‖min(G(z), H(z))‖^{1/2}


Table 5.2: Computational results for Problem 5.2

    Smoothing        (z̄1, · · · , z̄5)   ε0 = 10⁻²   (4.6978, 3.9999, 4.1277, 0.0000, 0.0000)
    Continuation                        ε1 = 10⁻⁴   (4.8294, 4.0000, 4.0569, 0.0000, -0.0000)
    Method                              ε2 = 10⁻⁶   (5.0408, 4.0000, 2.1732, 0.0000, -0.0000)
                                        ε3 = 10⁻⁸   (5.0000, 4.0000, 2.0000, 0.0000, 0.0000)
                     Ite                            117
    Algorithm H      (z1, · · · , z5)   ε0 = 10⁻²   (5.0000, 4.0000, 2.0000, 0.0000, -0.0000)
                     β(z)               ε0 = 10⁻²   ∅
                     Ite                            30
    Regularization   (z̄1, · · · , z̄5)   ε0 = 10⁻²   (5.0000, 4.0000, 2.0000, 0.0001, 0.0002)
    Method                              ε1 = 10⁻⁴   (5.0000, 4.0000, 2.0000, -0.0000, -0.0000)
                     Ite                            15
    Algorithm H      (z1, · · · , z5)   ε0 = 10⁻²   (5.0000, 4.0000, 2.0000, -0.0000, -0.0000)
                     β(z)               ε0 = 10⁻²   {2, 3}
                     Ite                            15
    Penalty          (z̄1, · · · , z̄5)   ε0 = 10⁻²   (5.0000, 4.0000, 2.0000, 0.0000, -0.0000)
    Method           Ite                            5
    Algorithm H      (z1, · · · , z5)   ε0 = 10⁻²   (5, 4, 2, 0, 0)
                     β(z)               ε0 = 10⁻²   {2, 3}
                     Ite                            6

and

    ρ2(z) = ‖Φ0(z)‖^{1/2}

as the identification function ρ in Step 1 of Algorithm H, and we found that the two functions yielded almost the same numerical results for all examples solved. This is not surprising because ρ1(zk) and ρ2(zk) tend to 0 in the same order as zk → z∗.
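In MATLAB, both identification functions are one-liners. The sketch below also shows one plausible form of the resulting estimate β(z), namely collecting the indices where both Gi(z) and Hi(z) fall below the threshold ρ1(z); the handles G, H, Phi0 and this particular thresholding rule are assumptions made here for illustration, not the precise definitions of Chapter 5.

    % Hedged sketch of the identification functions and an active-set estimate.
    rho1 = @(z) sqrt(norm(min(G(z), H(z))));   % rho_1(z) = ||min(G(z),H(z))||^(1/2)
    rho2 = @(z) sqrt(norm(Phi0(z)));           % rho_2(z) = ||Phi_0(z)||^(1/2)
    beta = find(G(z) <= rho1(z) & H(z) <= rho1(z));   % hypothetical estimate of beta(z)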

First, we report on numerical results for problems from an AMPL collection of MPECs called MacMPEC [50]. We notice that, since most problems are small-scale and, especially, the cardinalities of the lower-level degenerate index sets at the solutions are quite low, the proposed approach always solved the problems in only one iteration. Here we only show detailed results for three problems, Problem 5.1, Problem 5.2, and Problem 5.3, which are coded as desilva.mod, ex.9.1.1.mod, and bilevel1.mod, respectively, in MacMPEC. The computational results with the identification function ρ1 are reported in Tables 5.1–5.3, where z and z̄ denote the points obtained by the hybrid algorithms and the related methods, respectively, and Ite stands for the total number of iterations spent by the solver fmincon. In addition, β(z) denotes the lower-level degenerate index set estimated at the point z. Recall that the objective of the hybrid approach is to identify the set β(z∗).


Table 5.3: Computational results for Problem 5.3

    Smoothing        (z̄1, · · · , z̄5)   ε0 = 10⁻²   (24.9959, 30.0163, 4.9959, 10.0041, 0.0000)
    Continuation                        ε1 = 10⁻⁴   (25.0000, 30.0002, 5.0000, 10.0000, 0.0000)
    Method                              ε2 = 10⁻⁶   (25.0000, 30.0000, 5.0000, 10.0000, 0.0000)
                     Ite                            57
    Algorithm H      (z1, · · · , z5)   ε0 = 10⁻²   (25.0000, 30.0000, 5.0000, 10.0000, 0.0000)
                     β(z)               ε0 = 10⁻²   {6}
                     Ite                            22
    Regularization   (z̄1, · · · , z̄5)   ε0 = 10⁻²   (24.9998, 29.9995, 5.0002, 9.9997, 0.0007)
    Method                              ε1 = 10⁻⁴   (25.0000, 30.0000, 5.0000, 10.0000, 0.0000)
                     Ite                            13
    Algorithm H      (z1, · · · , z5)   ε0 = 10⁻²   (25.0000, 30.0000, 5.0000, 10.0000, 0.0000)
                     β(z)               ε0 = 10⁻²   {6}
                     Ite                            13
    Penalty          (z̄1, · · · , z̄5)   ε0 = 10⁻²   (25.0000, 30.0000, 5.0000, 10.0000, -0.0000)
    Method           Ite                            7
    Algorithm H      (z1, · · · , z5)   ε0 = 10⁻²   (25, 30, 5, 10, 0)
                     β(z)               ε0 = 10⁻²   {6}
                     Ite                            8

Problem 5.1 This is Problem 5 in [24]. In MacMPEC, it is called desilva.mod.

    minimize   z1² − 2z1 + z2² − 2z2 + z3² + z4²
    subject to 0 ≤ z1 ≤ 2,  0 ≤ z2 ≤ 2,
               z3 − z1 + z3z5 − z5 = 0,
               z4 − z2 + z4z6 − z6 = 0,
               G(z) ≥ 0, H(z) ≥ 0, G(z)ᵀH(z) = 0,

where

    G(z) = (z5, z6)ᵀ,   H(z) = ( 0.25 − (z3 − 1)²,  0.25 − (z4 − 1)² )ᵀ.

For this problem, z∗ = (0.5, 0.5, 0.5, 0.5, 0, 0) and β(z∗) = {1, 2}. In our testing, we used z = (1, 1, · · · , 1) as the initial point for all methods.

Problem 5.2 This problem is equivalent to the bilevel programming problem 9.2.2 in [28]. It corresponds to ex.9.1.1.mod in MacMPEC.

    minimize   −z1 − 3z2 + 2z3
    subject to 0 ≤ z1 ≤ 8,
               z4 + 3z5 + z6 − z7 + z8 = 1,
               4z4 − 2z5 − 3z6 = 0,
               G(z) ≥ 0, H(z) ≥ 0, G(z)ᵀH(z) = 0,


Table 5.4: Problems 5.4–5.7 generated by QPECgen

    Parameters in QPECgen   Problem 5.4   Problem 5.5   Problem 5.6   Problem 5.7
    qpec_type               300           300           300           300
    (n, m)                  (8, 5)        (6, 10)       (10, 10)      (6, 12)
    (l, p)                  (4, 5)        (4, 10)       (5, 10)       (4, 12)
    cond_P                  100           100           100           100
    scale_P                 100           100           100           100
    convex_f                1             1             1             1
    symm_M                  1             1             1             1
    mono_M                  1             1             1             1
    cond_M                  200           200           200           200
    scale_M                 200           200           200           200
    second_deg              2             6             5             6
    first_deg               2             2             3             2
    mix_deg                 2             2             3             4
    tol_deg                 1.0e-6        1.0e-6        1.0e-6        1.0e-6
    implicit                1             1             0             1
    rand_seed               0             0             0             0
    output                  3             3             3             3

where

    G(z) = (z4, z5, z6, z7, z8)ᵀ,
    H(z) = ( 2z1 − z2 − 4z3 + 16,  −8z1 − 3z2 + 2z3 + 48,  2z1 − z2 + 3z3 − 12,  z2,  −z2 + 4 )ᵀ.

For this problem, z∗ = (5, 4, 2, 0, 0, 0, 0, 1) and β(z∗) = {2, 3}. We employed z = (0.5, · · · , 0.5) as the initial point for all methods.

Problem 5.3 This is bilevel1.mod in MacMPEC and goes back to [24].

    minimize   2z1 + 2z2 − 3z3 − 3z4 − 60
    subject to 0 ≤ z1 ≤ 50,  0 ≤ z2 ≤ 50,
               z1 + z2 + z3 − 2z4 ≤ 40,
               2z3 − 2z1 − z5 + z6 + 2z9 = −40,
               2z4 − 2z2 − z7 + z8 + 2z10 = −40,
               G(z) ≥ 0, H(z) ≥ 0, G(z)ᵀH(z) = 0,


Table 5.5: Some data obtained by QPECgen for Problems 5.4–5.7

    Problem 5.4   x∗          (0.0723, 0.4345, 0.2387, -0.4003, -0.2870, -0.5857, -0.4421, 0.2286)
                  y∗          (0, 0, 0, 0, 0)
                  β(x∗, y∗)   {1, 2}
    Problem 5.5   x∗          (0.0872, 0.2576, -0.1181, 0.2958, -0.1939, -0.0858)
                  y∗          (0, 0, 0, 0, 0, 0, 0, 0.7559, 0.6660, 0.0115)
                  β(x∗, y∗)   {1, 2, 3, 4, 5, 6}
    Problem 5.6   x∗          (0.6369, 0.6371, 0.1739, -0.7158, -0.8703, -0.7478, -0.4383, 0.1886, 0.0741, 0.1494)
                  y∗          (0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
                  β(x∗, y∗)   {1, 2, 3, 4, 5}
    Problem 5.7   x∗          (-0.1018, 0.0166, -0.3092, 0.1948, -0.4273, 0.0296)
                  y∗          (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.8546, 0.3146)
                  β(x∗, y∗)   {1, 2, 3, 4, 5, 6}

where

    G(z) = (z5, z6, z7, z8, z9, z10)ᵀ,
    H(z) = ( z3 + 10,  −z3 + 20,  z4 + 10,  −z4 + 20,  z1 − 2z3 − 10,  z2 − 2z4 − 10 )ᵀ.

We have z∗ = (25, 30, 5, 10, 0, 0, 0, 0, 0, 0) and β(z∗) = {6} for this problem, and we used z = (10, · · · , 10) as the initial point for all methods.

Tables 5.1–5.3 show that the proposed methods were able to find the solutions of Problems 5.1, 5.2, and 5.3 very quickly. This, as we mentioned above, may be due to the small scale of the problems and the low cardinality of the lower-level degenerate index sets at the solutions.

Next we report on numerical results for somewhat larger test problems generated by QPECgen of Jiang and Ralph [43]. The QPECgen generator is a MATLAB program that uses a set of parameters; see Table 5.4 or [43] for details. Once these parameters are specified, the program can randomly generate a quadratic program with linear complementarity constraints

    minimize   (1/2)(xᵀ, yᵀ) P (xᵀ, yᵀ)ᵀ + cᵀx + dᵀy
    subject to A (xᵀ, yᵀ)ᵀ + a ≤ 0,
               y ≥ 0,  Nx + My + q ≥ 0,
               yᵀ(Nx + My + q) = 0,


Table 5.6: Computational results for Problem 5.4ᵃ

    Smoothing        (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.0730, 0.4335, 0.2367, -0.4022, -0.2890)
    Continuation                        ε1 = 10⁻⁴   (0.0724, 0.4345, 0.2386, -0.4003, -0.2870)
    Method           Ite                            43
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.0723, 0.4345, 0.2386, -0.4003, -0.2870)
                     β(x, y)            ε0 = 10⁻²   {1, 2}
                     Ite                            32
    Regularization   (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.0727, 0.4341, 0.2386, -0.4002, -0.2886)
    Method                              ε1 = 10⁻⁴   (0.0723, 0.4345, 0.2386, -0.4003, -0.2871)
                                        ε2 = 10⁻⁶   (0.0723, 0.4345, 0.2386, -0.4004, -0.2871)
                     Ite                            52
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.0731, 0.4353, 0.2377, -0.4006, -0.2883)
                                        ε1 = 10⁻⁴   (0.0723, 0.4346, 0.2386, -0.4003, -0.2870)
                     β(x, y)            ε0 = 10⁻²   {1, 2}
                                        ε1 = 10⁻⁴   {1, 2}
                     Ite                            51
    Penalty          (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.0723, 0.4345, 0.2387, -0.4003, -0.2870)
    Method           Ite                            31
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.0723, 0.4345, 0.2386, -0.4003, -0.2870)
                     β(x, y)            ε0 = 10⁻²   {1, 2}
                     Ite                            35

ᵃ We used (0.5, 0.5, · · · , 0.5) as the initial point for all methods.

where P, A, N, M are constant matrices and c, d, a, q are constant vectors with appropriate dimensions. QPECgen also outputs an approximate solution of a generated problem.

We set the QPECgen parameters as in Table 5.4 to generate Problems 5.4–5.7. In particular, the parameters n and m denote the dimensions of the variables x and y, respectively, and second_deg stands for the cardinality of the lower-level degenerate index set at a solution.

Some data obtained by the QPECgen generator are summarized in Table 5.5, in which (x∗, y∗) denotes the solution given by QPECgen and β(x∗, y∗) stands for the lower-level degenerate index set at (x∗, y∗), i.e., β(x∗, y∗) := { i | (Nx∗ + My∗ + q)i = 0 = y∗i }.
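Given the QPECgen output, this index set is directly computable. A small sketch (with a numerical tolerance tol in place of exact zeros, since QPECgen returns an approximate solution; the variable names are ours):

    % Sketch: the lower-level degenerate index set beta(x*, y*) from QPECgen data.
    tol  = 1e-6;                        % tolerance; an assumption of the sketch
    ws   = N*xs + M*ys + q;             % slack w* = N x* + M y* + q
    beta = find(abs(ws) <= tol & abs(ys) <= tol);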

The computational results with the identification function ρ1 for Problems 5.4–5.7 are reported in Tables 5.6–5.9. In the tables, we only list the values of the first five components of the variable x, i.e., (x1, · · · , x5), because that is sufficient to illustrate the behavior of the tested algorithms. Similarly, (xk, yk) and (x̄k, ȳk) denote the points obtained by Algorithm H and the related methods at iteration k, respectively.


Table 5.7: Computational results for Problem 5.5ᵃ

    Smoothing        (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.0800, 0.2520, -0.1084, 0.3054, -0.1941)
    Continuation                        ε1 = 10⁻⁴   (0.0867, 0.2586, -0.1176, 0.2971, -0.1944)
    Method                              ε2 = 10⁻⁶   (0.0872, 0.2576, -0.1181, 0.2958, -0.1940)
                     Ite                            103
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.0800, 0.2520, -0.1084, 0.3054, -0.1941)
                                        ε1 = 10⁻⁴   (0.0871, 0.2577, -0.1181, 0.2957, -0.1940)
                     β(x, y)            ε0 = 10⁻²   {1, 2, 3, 4, 5, 6, 10}
                                        ε1 = 10⁻⁴   {1, 2, 3, 4, 5, 6}
                     Ite                            59
    Regularization   (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.0864, 0.2614, -0.1160, 0.2941, -0.1931)
    Method                              ε1 = 10⁻⁴   (0.0869, 0.2577, -0.1179, 0.2958, -0.1938)
                                        ε2 = 10⁻⁶   (0.0872, 0.2576, -0.1181, 0.2958, -0.1940)
                     Ite                            30
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.0864, 0.2614, -0.1160, 0.2941, -0.1931)
                                        ε1 = 10⁻⁴   (0.0871, 0.2577, -0.1181, 0.2957, -0.1940)
                     β(x, y)            ε0 = 10⁻²   {1, 2, 3, 4, 5, 6, 10}
                                        ε1 = 10⁻⁴   {1, 2, 3, 4, 5, 6}
                     Ite                            30
    Penalty Method   (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.0872, 0.2576, -0.1181, 0.2958, -0.1940)
                                        ε1 = 10⁻⁴   (0.0871, 0.2576, -0.1181, 0.2957, -0.1940)
                                        ε2 = 10⁻⁶   (0.0871, 0.2593, -0.1169, 0.2969, -0.1965)
                                        ε3 = 10⁻⁸   (-0.1005, 0.2845, -0.0061, 0.2642, -0.1371)
                     Ite                            59
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.0871, 0.2577, -0.1181, 0.2957, -0.1940)
                                        ε1 = 10⁻⁴   (0.0871, 0.2577, -0.1181, 0.2957, -0.1940)
                                        ε2 = 10⁻⁶   (0.0871, 0.2577, -0.1181, 0.2957, -0.1940)
                                        ε3 = 10⁻⁸   (0.0871, 0.2577, -0.1181, 0.2957, -0.1940)
                     β(x, y)            ε0 = 10⁻²   {1, 2, 3, 4, 5, 6}
                                        ε1 = 10⁻⁴   {1, 2, 3, 4, 5, 6}
                                        ε2 = 10⁻⁶   {1, 2, 3, 4, 5, 6}
                                        ε3 = 10⁻⁸   {1}
                     Ite                            71

ᵃ We used (2, 2, · · · , 2) as the initial point for all methods.

The results shown in the tables reveal that it was not difficult to identify the active sets by Algorithm H, at least for the test problems used in our numerical experiments, although we have observed that the penalty method may not be very stable when the parameter ε becomes small. In fact, we got the correct active sets in no more than three steps in all cases and, since the computed points satisfy the B-stationarity conditions, Algorithm H terminated. Especially, as mentioned in the last section, Algorithm H may terminate by finding a solution before the correct index sets are obtained; see Table 5.2. Moreover, we notice that, since the number of active indices is larger than the dimension of z, Problems 5.2, 5.5, and 5.7 do not satisfy the MPCC-LICQ at the solutions. Nevertheless, we were able to obtain the solutions successfully, which shows the robustness of the proposed approach.


We have also solved these test problems by some solvers from NEOS [21]. The results indicate that the hybrid approach is comparable to those methods. For example, the total iterations spent by the solvers DONLP2 / MINOS / SNOPT for Problems 5.1, 5.2, and 5.3 are 8 / 4 / 8, 12 / 3 / 6, and 9 / 11 / 9, respectively. Moreover, it should be emphasized that the proposed algorithms are theoretically guaranteed to solve MPCCs under mild conditions, while the methods in NEOS are in general not, even if they could solve the test problems successfully.

5.6.2 Computational results for Algorithms HIA and HIS

In this subsection, we examine the effectiveness of Algorithms HIA and HIS on some examples of MPCC. Since the numerical results shown in the last subsection have revealed that Algorithm H is comparable to some existing methods such as the smoothing continuation method [31], the penalty function method [38], and the regularization method [76], we only compare Algorithms HIA and HIS with Algorithm H.

We show the QPECgen parameters used to generate Problems 5.8 and 5.9 in Table 5.10, and summarize some data output by the QPECgen generator in Table 5.11. In our experiments, we set ε0 = 10⁻² and updated this parameter by εk+1 = 10⁻¹εk.

Moreover, we employ

    ρ(x, y) = ‖min(Nx + My + q, y)‖^{4/5}

as the identification function in Step 1 of both Algorithm H and Algorithm HIA. In Algorithm HIS, we use the sequence {ξk} given by

    ξk = ρ(xk, yk),  k = 0, 1, 2, · · · ,

which is a reasonable choice for comparison with the other two methods. See the tables for the settings of the other parameters involved.

The computational results for Problems 5.8 and 5.9 are reported in Tables 5.12 and 5.13, respectively. In the tables, distance denotes the distance between the obtained point and the solution (x∗, y∗) measured by the infinity norm. In the Ite column, a sum ν1 + ν2 means that ν1 is the number of iterations spent by fmincon for solving problem (Pε) and ν2 denotes the number of iterations spent by solving subproblem (5.86), whereas a single number ν stands for the number of iterations spent by solving subproblem (5.86) (as there is no need to solve problem (Pε) in these cases).


The results shown in the tables reveal that both Algorithms HIA and HIS were able to identify the active sets successfully. As mentioned in the previous section, Algorithms HIA and HIS need to solve more problems of the form (5.86) than Algorithm H, whereas the latter has to solve more problems of the form (Pε). Our experiments show that problem (5.86) can generally be solved in fewer iterations than problem (Pε).


Table 5.8: Computational results for Problem 5.6ᵃ

    Smoothing        (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.6053, 0.7516, 0.1275, -0.6123, -0.6650)
    Continuation                        ε1 = 10⁻⁴   (0.6358, 0.6367, 0.1739, -0.7155, -0.8697)
    Method                              ε2 = 10⁻⁶   (0.6366, 0.6368, 0.1736, -0.7157, -0.8698)
                                        ε3 = 10⁻⁸   (0.6370, 0.6366, 0.1738, -0.7158, -0.8703)
                     Ite                            185
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.7361, 0.4854, 0.3201, -0.6325, -0.6232)
                                        ε1 = 10⁻⁴   (0.6368, 0.6371, 0.1739, -0.7158, -0.8704)
                     β(x, y)            ε0 = 10⁻²   {2, 4, 7}
                                        ε1 = 10⁻⁴   {1, 2, 3, 4, 5}
                     Ite                            132
    Regularization   (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.6368, 0.6321, 0.1734, -0.7145, -0.8746)
    Method                              ε1 = 10⁻⁴   (0.6368, 0.6368, 0.1737, -0.7158, -0.8704)
                                        ε2 = 10⁻⁶   (0.6367, 0.6371, 0.1739, -0.7158, -0.8702)
                                        ε3 = 10⁻⁸   (0.6369, 0.6370, 0.1739, -0.7158, -0.8703)
                     Ite                            105
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.6369, 0.6371, 0.1739, -0.7159, -0.8703)
                                        ε1 = 10⁻⁴   (0.6368, 0.6370, 0.1739, -0.7158, -0.8704)
                                        ε2 = 10⁻⁶   (0.6368, 0.6370, 0.1739, -0.7158, -0.8704)
                     β(x, y)            ε0 = 10⁻²   {1, 4, 5}
                                        ε1 = 10⁻⁴   {2, 3, 4, 5}
                                        ε2 = 10⁻⁶   {1, 2, 3, 4, 5}
                     Ite                            102
    Penalty Method   (x̄1, · · · , x̄5)   ε0 = 10⁻²   (0.6369, 0.6370, 0.1739, -0.7158, -0.8703)
                                        ε1 = 10⁻⁴   (0.6368, 0.6370, 0.1739, -0.7158, -0.8702)
                                        ε2 = 10⁻⁶   (0.6365, 0.6368, 0.1742, -0.7158, -0.8709)
                                        ε3 = 10⁻⁸   (0.6360, 0.6343, 0.1735, -0.7148, -0.8683)
                     Ite                            104
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (0.6368, 0.6370, 0.1739, -0.7158, -0.8703)
                                        ε1 = 10⁻⁴   (0.6368, 0.6371, 0.1739, -0.7158, -0.8704)
                                        ε2 = 10⁻⁶   (0.6369, 0.6370, 0.1739, -0.7158, -0.8703)
                                        ε3 = 10⁻⁸   (0.6368, 0.6371, 0.1739, -0.7158, -0.8703)
                     β(x, y)            ε0 = 10⁻²   {1, 2, 3, 4, 5}
                                        ε1 = 10⁻⁴   {1, 2, 3, 4, 5}
                                        ε2 = 10⁻⁶   {3, 4, 5}
                                        ε3 = 10⁻⁸   {2, 3, 4, 5}
                     Ite                            132

ᵃ (0.5, 0.5, · · · , 0.5) was employed as the initial point for all methods.


Table 5.9: Computational results for Problem 5.7ᵃ

    Smoothing        (x̄1, · · · , x̄5)   ε0 = 10⁻²   (-0.0915, 0.0196, -0.3204, 0.2065, -0.4307)
    Continuation                        ε1 = 10⁻⁴   (-0.0991, 0.0163, -0.3107, 0.1972, -0.4279)
    Method                              ε2 = 10⁻⁶   (-0.1030, 0.0150, -0.3080, 0.1958, -0.4340)
                                        ε3 = 10⁻⁸   (-0.1002, 0.0159, -0.3091, 0.1947, -0.4276)
                     Ite                            104
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (-0.1018, 0.0167, -0.3091, 0.1947, -0.4275)
                                        ε1 = 10⁻⁴   (-0.1018, 0.0167, -0.3091, 0.1947, -0.4275)
                     β(x, y)            ε0 = 10⁻²   ∅
                                        ε1 = 10⁻⁴   {1, 2, 3, 4, 5, 6}
                     Ite                            41
    Regularization   (x̄1, · · · , x̄5)   ε0 = 10⁻²   (-0.1007, 0.0160, -0.3093, 0.1952, -0.4274)
    Method                              ε1 = 10⁻⁴   (-0.1007, 0.0162, -0.3095, 0.1954, -0.4274)
                                        ε2 = 10⁻⁶   (-0.1016, 0.0166, -0.3092, 0.1948, -0.4273)
                                        ε3 = 10⁻⁸   (-0.1016, 0.0167, -0.3093, 0.1948, -0.4271)
                     Ite                            35
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (-0.1007, 0.0160, -0.3093, 0.1952, -0.4274)
                                        ε1 = 10⁻⁴   (-0.1018, 0.0167, -0.3091, 0.1947, -0.4275)
                     β(x, y)            ε0 = 10⁻²   {1, 2, 3, 4, 5, 6, 8, 10}
                                        ε1 = 10⁻⁴   {1, 2, 3, 4, 5, 6}
                     Ite                            32
    Penalty Method   (x̄1, · · · , x̄5)   ε0 = 10⁻²   (-0.1017, 0.0166, -0.3092, 0.1947, -0.4273)
                                        ε1 = 10⁻⁴   (-0.1018, 0.0166, -0.3092, 0.1947, -0.4273)
                                        ε2 = 10⁻⁶   (-0.0885, 0.0124, -0.3119, 0.1926, -0.4274)
                                        ε3 = 10⁻⁸   (0.2274, 0.4623, -0.1441, -0.1743, -1.2077)
                     Ite                            65
    Algorithm H      (x1, · · · , x5)   ε0 = 10⁻²   (-0.1018, 0.0167, -0.3091, 0.1947, -0.4275)
                                        ε1 = 10⁻⁴   (-0.1018, 0.0167, -0.3092, 0.1947, -0.4275)
                                        ε2 = 10⁻⁶   (-0.1017, 0.0167, -0.3091, 0.1947, -0.4275)
                                        ε3 = 10⁻⁸   (-0.0645, -0.0118, -0.2910, 0.1588, -0.4824)
                     β(x, y)            ε0 = 10⁻²   {1, 2, 3, 4, 5, 6}
                                        ε1 = 10⁻⁴   {1, 3, 4, 5, 6}
                                        ε2 = 10⁻⁶   {2, 5, 6}
                                        ε3 = 10⁻⁸   {8}
                     Ite                            79

ᵃ We used (0.5, 0.5, · · · , 0.5) as the initial point for all methods.

Table 5.10: Parameters in QPECgen for Problems 5.8–5.9

    Problem   qpec_type   (n, m)     (l, p)    cond_P   scale_P   convex_f
    # 5.8     300         (10, 10)   (5, 10)   100      100       1
    # 5.9     300         (8, 14)    (4, 14)   100      100       1

    Problem   symm_M   mono_M   cond_M   scale_M   second_deg   first_deg
    # 5.8     1        1        200      200       4            3
    # 5.9     1        1        200      200       4            2

    Problem   mix_deg   tol_deg   implicit   rand_seed   output
    # 5.8     3         1.0e-6    0          0           3
    # 5.9     1         1.0e-6    0          0           3


Table 5.11: Some data obtained by QPECgen for Problems 5.8–5.9

    Problem   x∗                                           y∗                                       β(x∗, y∗)
    # 5.8     (0.6369, 0.6371, 0.1739, -0.7158, -0.8703,   (0, 0, · · · , 0)                        {1, 2, 3, 4}
              -0.7478, -0.4383, 0.1886, 0.0741, 0.1494)
    # 5.9     (-0.7330, -0.2090, -0.4140, 0.0168,          (0, · · · , 0, 0.4681, 0.4739, 0.2088)   {1, 2, 3, 4}
              -0.7084, -0.1104, 0.0030, -0.4658)

Table 5.12: Computational results for Problem 5.8ᵃ

                     εk      degenerate set                  distance   Ite
    Algorithm H      10⁻²    βk = ∅                          0.0007     25+10
                     10⁻³    βk = ∅                          0.0010     29+10
                     10⁻⁴    βk = {4}                        0.0005     23+11
    Algorithm HIA    10⁻²    βk0 = ∅                         0.0007     25+10
                             βk1 = {4}                       0.0007     10
                             βk2 = {2, 4}                    0.0005     11
                             βk3 = {1, 2, 4}                 0.0004     10
                             βk4 = {1, 2, 3, 4}              0.0004     10
    Algorithm HIS    10⁻²    βk0 = {1, 2, 3, 4, 7, 9, 10}    1.5944     25+6
                             βk1 = {1, 2, 3, 4, 9, 10}       1.2922     5
                             βk2 = {1, 2, 3, 4, 10}          0.4131     9
                             βk3 = {1, 2, 3, 4}              0.0004     10

ᵃ We used (1, 1, · · · , 1) as the initial point for all methods and we set θ0 = 0.2 in Algorithms HIA and HIS. In addition, the parameter η in Algorithm HIS is set to be 0.5.

Table 5.13: Computational results for Problem 5.9ᵃ

                     εk      degenerate set             distance   Ite
    Algorithm H      10⁻²    βk = ∅                     0.0003     27+6
                     10⁻³    βk = {1, 2, 3}             0.0004     19+5
                     10⁻⁴    βk = {1, 2, 3, 4}          0.0003     35+7
    Algorithm HIA    10⁻²    βk0 = ∅                    0.0003     27+6
                             βk1 = {2}                  0.0003     6
                             βk2 = {1, 2}               0.0009     3
                             βk3 = {1, 2, 3}            0.0009     3
                             βk4 = {1, 2, 3, 4}         0.0009     3
    Algorithm HIS    10⁻²    βk0 = {1, 2, 3, 4, 11}     0.8554     27+6
                             βk1 = {1, 2, 3, 4}         0.0009     3

ᵃ We employed (1, 1, · · · , 1) as the initial point for all methods. We set θ0 = 0.2 in Algorithms HIA and HIS, and η = 0.2 in Algorithm HIS.


Chapter 6

Smoothing Implicit Programming Approach for SMPECs

From this chapter, we begin to study the stochastic mathematical programs with equilibrium constraints (SMPECs), which can be thought of as generalizations of the mathematical programs with equilibrium constraints. We first introduce the problems and then show that many practical decision problems can be formulated as SMPECs. We discuss two kinds of models: the lower-level wait-and-see decision model and the here-and-now decision model. For the lower-level wait-and-see model, we propose a smoothing implicit programming method and establish a comprehensive convergence theory. For the here-and-now decision problem, we apply a penalty technique and suggest a similar method. We show that the two methods possess similar convergence properties.

6.1 Introduction

In this chapter, we discuss MPECs under uncertainty. That is, we consider the stochastic mathematical program with equilibrium constraints (SMPEC):

    minimize   Eω[f(x, y, ω)]
    subject to x ∈ X, ω ∈ Ω,      (6.1)
               y solves VI(F(x, ·, ω), C(x, ω)),

where X is a subset of ℝ^n, Ω stands for the underlying sample space, Eω means expectation with respect to the random variable ω ∈ Ω, and f : ℝ^{n+m} × Ω → ℝ, F : ℝ^{n+m} × Ω → ℝ^m, and C : ℝ^n × Ω → 2^{ℝ^m} are mappings. Obviously, if Ω is a singleton,


then problem (6.1) reduces to an ordinary MPEC, and so SMPECs can be thought of as generalized MPECs. It is well known that an MPEC is a hard problem because its constraints fail to satisfy a standard constraint qualification at any feasible point [17]. This suggests that SMPECs may be even more difficult to deal with, because the number of random events is usually very large in practice.

When C(x, ω) ≡ ℝ^m_+ for any x ∈ X and any ω ∈ Ω in problem (6.1), the variational inequality constraints reduce to complementarity constraints and problem (6.1) is equivalent to the following stochastic mathematical program with complementarity constraints (SMPCC):

    minimize   Eω[f(x, y, ω)]
    subject to x ∈ X, ω ∈ Ω,      (6.2)
               y ≥ 0, F(x, y, ω) ≥ 0,
               yᵀF(x, y, ω) = 0.

On the other hand, if the set C(x, ω) is defined by

    C(x, ω) = { y ∈ ℝ^m | c(x, y, ω) ≤ 0 },

where c(·, ·, ω) is continuously differentiable, then, under some suitable conditions, the variational inequality problem VI(F(x, ·, ω), C(x, ω)) has an equivalent Karush-Kuhn-Tucker representation

    F(x, y, ω) + ∇y c(x, y, ω)λ(x, ω) = 0,
    λ(x, ω) ≥ 0,  c(x, y, ω) ≤ 0,  λ(x, ω)ᵀc(x, y, ω) = 0,

where λ(x, ω) is the Lagrange multiplier vector [68]. As a result, problem (6.1) can be reformulated as a program like (6.2) under some conditions; see the monograph [62] for details. Hence, problem (6.2) constitutes an important subclass of SMPECs. In this chapter, we concentrate on this kind of SMPECs.

Problem (6.2) looks like a standard MPEC. However, the presence of the random variable ω means that (6.2) involves multiple complementarity-type constraints, and it is therefore generally more difficult to solve than an ordinary MPEC.

We call (6.2) a lower-level wait-and-see model with an upper-level decision x and a lower-level decision y. In this kind of decision problem, we wait until an observation is made on the random events, and then we make an optimal lower-level decision y = y(ω) based on the observed information. This kind of problem has been discussed in [72], the main results of which are concerned with the existence of solutions, the convexity


and directional differentiability of an implicit objective function, and links between SMPEC and bilevel models. Actually, no effective algorithms have been suggested for solving SMPECs so far. In this chapter, we will propose an implicit programming approach for (6.2). The problems dealt with in this chapter include not only the lower-level wait-and-see model, but also a more practical problem that requires us to make all decisions at once, before ω is observed:

    minimize   Eω[f(x, y, ω) + dᵀz(ω)]
    subject to x ∈ X, ω ∈ Ω,
               y ≥ 0, F(x, y, ω) + z(ω) ≥ 0,      (6.3)
               yᵀ(F(x, y, ω) + z(ω)) = 0,
               z(ω) ≥ 0,

where z(ω) is called a recourse variable and d ∈ ℝ^m is a vector with positive elements. We call (6.3) a here-and-now model.

We note that, in both the lower-level wait-and-see and here-and-now models, the upper-level decision is made 'here-and-now'. Therefore, both models yield decision problems, unlike single-level stochastic programming, in which the wait-and-see model is not a decision problem [46, 83]. The following example illustrates the two models.

Example 6.1 There are a food company who makes picnic lunches and a vendor who sells lunches to hikers every Sunday. The company and the vendor have the following contract:

C1: The vendor buys lunches from the company at the price x ∈ [a, b] determined by the company, where a and b are two positive constants.

C2: The vendor decides the amount y of lunches that he buys from the company, where y must be no less than the minimum amount c > 0.

C3: The vendor pays the company for all the lunches he buys, i.e., the vendor pays xy to the company.

C4: The vendor sells lunches to hikers at the price 2x and gets the proceeds for the total number of lunches actually sold.

C5: Even if there are any unsold lunches, the vendor cannot return them to the company, but he can dispose of the unsold lunches at no cost.


We suppose that the demand for lunches depends on the price and the weather on that day. Since the weather is uncertain, we may treat it as a random variable. More specifically, we suppose that the demand is given by the function

    φ(x, ω) := D(ω) − d(ω)x,  ω ∈ Ω,

where D(ω) ≥ 0 and d(ω) ≥ 0 are random variables. Therefore, the actual amount of lunches sold is given by min(y, φ(x, ω)), which also depends on the weather on that day.

The decisions by the company and the vendor are x and y, respectively. The company's objective is to maximize its total profit xy, while the vendor's objective is to maximize his total profit 2x min(y, φ(x, ω)) − xy. The latter problem may be written as

    maximize_{y,t}  x(2t − y)
    subject to      y ≥ c,  y − t ≥ 0,
                    D(ω) − d(ω)x − t ≥ 0,

whose optimality conditions are stated as

    ( x ; −2x ) − u ( 1 ; 0 ) − v ( 1 ; −1 ) − w ( 0 ; −1 ) = 0,      (6.4)

    0 ≤ u ⊥ (y − c) ≥ 0,
    0 ≤ v ⊥ (y − t) ≥ 0,
    0 ≤ w ⊥ (D(ω) − d(ω)x − t) ≥ 0.      (6.5)

Here, λ ⊥ µ means λµ = 0. It follows from (6.4) that

    u = x − v,  w = 2x − v.

This implies that w = x + u ≥ a > 0, which together with (6.5) yields t = D(ω) − d(ω)x. Thus the above optimality conditions may further be rewritten as

    0 ≤ (x − v) ⊥ (y − c) ≥ 0,
    0 ≤ v ⊥ (y − D(ω) + d(ω)x) ≥ 0.      (6.6)

Then the company's problem may be written as the following stochastic MPEC:

    minimize_{x,y}  −xy
    subject to      a ≤ x ≤ b,
                    0 ≤ (x − v) ⊥ (y − c) ≥ 0,
                    0 ≤ v ⊥ (y − D(ω) + d(ω)x) ≥ 0.


Now there are two cases.

Here-and-now model: Suppose that both the company and the vendor have to make their decisions on Saturday, without knowing the weather of Sunday. In this case, there is in general no (x, v) satisfying (6.6) for all ω ∈ Ω. So, by introducing the recourse variables, the company's problem is represented as the following model:

    minimize   −xy + βEω[ z(ω) ]
    subject to a ≤ x ≤ b,
               0 ≤ (x − v) ⊥ (y − c) ≥ 0,
               0 ≤ v ⊥ (y − D(ω) + d(ω)x + z(ω)) ≥ 0,
               z(ω) ≥ 0,  ω ∈ Ω,

where β > 0 is a constant.

Lower-level wait-and-see model: Suppose that the company makes a decision on Saturday, but the vendor can make a decision on Sunday morning after knowing the weather of that day. In this case, the vendor's decision may depend on the observation of ω; it is given by (y(ω), v(ω)) satisfying

    0 ≤ (x − v(ω)) ⊥ (y(ω) − c) ≥ 0,
    0 ≤ v(ω) ⊥ (y(ω) − D(ω) + d(ω)x) ≥ 0

for each ω ∈ Ω. Therefore the company's problem is represented as the following model:

    minimize   Eω[ −x y(ω) ]
    subject to a ≤ x ≤ b,  ω ∈ Ω,
               0 ≤ (x − v(ω)) ⊥ (y(ω) − c) ≥ 0,
               0 ≤ v(ω) ⊥ (y(ω) − D(ω) + d(ω)x) ≥ 0.

Organization of this chapter: In the next section, we will present a smoothing implicit programming method for the lower-level wait-and-see problem with linear complementarity constraints. A comprehensive convergence theory will also be included. In Section 6.3, we deal with the here-and-now problem and, by means of a penalty technique, we suggest a similar method for solving this kind of problem. Concluding remarks are made in the final section.

We will use the following notations in this and the next two chapters: All vectors are thought of as column vectors, and x[i] stands for the ith coordinate of the vector x ∈ ℝ^n, whereas for a matrix M, we denote by M[i] the vector whose elements consist of the ith row of M. If K is an index set, we let M[K] be the principal submatrix of M whose elements consist of those of M indexed by K. For any vectors u and v of the same dimension, we write u ⊥ v to mean uᵀv = 0. For two vectors u, v ∈ ℝ^s, min(u, v) is understood to be taken componentwise, i.e.,

    min(u, v) = ( min{u[1], v[1]}, · · · , min{u[s], v[s]} )ᵀ.

In addition, ei denotes the ith unit vector with ei[i] = 1; I and O denote the identity matrix and the zero matrix of suitable dimension, respectively.

6.2 Smoothing Implicit Programming Method for Lower-Level Wait-And-See Problems

In this section, we consider the following stochastic mathematical program with linear complementarity constraints (SMPLCC):

    minimize   Σ_{ℓ=1}^{L} pℓ f(x, yℓ)
    subject to g(x) ≤ 0, h(x) = 0,      (6.7)
               yℓ ≥ 0, Nℓx + Mℓyℓ + qℓ ≥ 0,
               yℓᵀ(Nℓx + Mℓyℓ + qℓ) = 0,  ℓ = 1, · · · , L,

which corresponds to the discrete case where Ω = {ω1, ω2, · · · , ωL}. Here, pℓ denotes the probability of the random event ωℓ ∈ Ω, i.e.,

    Σ_{ℓ=1}^{L} pℓ = 1,  pℓ ≥ 0,  ℓ = 1, · · · , L,

the functions f : ℝ^{n+m} → ℝ, g : ℝ^n → ℝ^{s1}, and h : ℝ^n → ℝ^{s2} are all continuously differentiable, and Nℓ ∈ ℝ^{m×n}, Mℓ ∈ ℝ^{m×m}, qℓ ∈ ℝ^m are realizations of the random coefficients. Problem (6.7) represents a lower-level wait-and-see model, since the lower-level decisions yℓ are associated with the possible outcomes ωℓ of the random variable ω, which means that, unlike the upper-level decision, the lower-level decisions are made after a random event is observed. Throughout we assume pℓ > 0 for all ℓ = 1, · · · , L.

By letting

    y = (y1; · · · ; yL),  N = (N1; · · · ; NL),  M = diag(M1, · · · , ML),  q = (q1; · · · ; qL),      (6.8)

where y, N, and q are obtained by stacking the yℓ, Nℓ, and qℓ vertically and M is the block-diagonal matrix with blocks M1, · · · , ML, and

    f(x, y) = Σ_{ℓ=1}^{L} pℓ f(x, yℓ),      (6.9)


problem (6.7) can be rewritten as

    minimize   f(x, y)
    subject to g(x) ≤ 0, h(x) = 0,      (6.10)
               y ≥ 0, Nx + My + q ≥ 0,
               yᵀ(Nx + My + q) = 0,

which is an ordinary MPEC. However, since L is usually very large in practice, problem (6.10) is a large-scale program with variables (x, y) ∈ ℝ^{n+mL}, so that some methods for MPECs may run into computational difficulties. Here, we treat problem (6.7) as a mathematical program with multiple complementarity-type constraints.
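As a side remark, the deterministic-equivalent data (6.8) are straightforward to assemble in MATLAB; the following sketch assumes the realizations are stored in cell arrays Ncell, Mcell, qcell of length L (the names are ours):

    % Sketch: assembling the data (6.8) from the L scenario realizations.
    N = vertcat(Ncell{:});    % stack N_1, ..., N_L vertically
    M = blkdiag(Mcell{:});    % block-diagonal matrix with blocks M_1, ..., M_L
    q = vertcat(qcell{:});    % stack q_1, ..., q_L vertically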

Recently, Chen and Fukushima [12] have suggested a smoothing method for an MPEC with P-matrix linear complementarity constraints. We will develop a similar smoothing method for solving problem (6.7), or equivalently (6.10), with M being a P0-matrix. Specifically, in addition to smoothing, we will employ a regularization technique to make an implicit programming approach applicable. We will investigate the limiting behavior of the method under appropriate assumptions.

6.2.1 Preliminaries

We first recall some basic concepts and properties that will be used later on.

Definition 6.1 [19] Suppose that M is an m × m matrix. We call M a P-matrix if all the principal minors of M are positive, or equivalently,

    max_{1≤i≤m} y[i](My)[i] > 0,  ∀y ∈ ℝ^m, y ≠ 0,

and we call M a P0-matrix if all the principal minors of M are nonnegative, or equivalently,

    max_{1≤i≤m} y[i](My)[i] ≥ 0,  ∀y ∈ ℝ^m.

It is obvious that a P-matrix must be a P0-matrix and, if M is a P0-matrix and µ is a positive number, then the matrix M + µI is a P-matrix.

Definition 6.2 [19] A square matrix is said to be nondegenerate if all of its principal submatrices are nonsingular.


It is easy to see that a P-matrix is nondegenerate.
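For a concrete illustration (our example, not taken from [19]): the matrix

    M = ( 0  0
          0  1 )

has principal minors 0, 1, and det M = 0, so it is a P0-matrix but not a P-matrix; it is also degenerate, since its 1 × 1 principal submatrix (0) is singular. For any µ > 0, however, M + µI = diag(µ, 1 + µ) has all principal minors positive and is therefore a P-matrix, in accordance with the remark after Definition 6.1.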

For given N ∈ ℝ^{m×n}, M ∈ ℝ^{m×m}, q ∈ ℝ^m, and two positive numbers ε and µ, we define the function

    Φε,µ(x, y, w; N, M, q) = ( Nx + (M + εI)y + q − w ;  φµ(y[1], w[1]) ;  · · · ;  φµ(y[m], w[m]) ),      (6.11)

where the right-hand side is a column vector and φµ : ℝ² → ℝ is the perturbed Fischer-Burmeister function

    φµ(a, b) = a + b − √(a² + b² + 2µ²).
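The essential property of φµ, recorded in Theorem 6.1(2) below, is that φµ(a, b) = 0 holds exactly when a ≥ 0, b ≥ 0, and ab = µ². A quick MATLAB check (a sketch; the sample values are arbitrary):

    % The perturbed Fischer-Burmeister function and its zero set.
    phi = @(a, b, mu) a + b - sqrt(a.^2 + b.^2 + 2*mu^2);
    mu  = 0.1;
    phi(0.2, mu^2/0.2, mu)    % ab = mu^2 by construction, so this returns 0
    phi(0.2, 0.3, mu)         % ab > mu^2 here, so this returns a positive value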

Then we have the following well-known result [12, 47].

Theorem 6.1 Suppose that M is a P0-matrix. Then, for given x ∈ ℝ^n, ε > 0, and µ > 0, we have the following statements:

(1) The function Φε,µ defined by (6.11) is continuously differentiable with respect to (y, w), and the Jacobian matrix ∇_{(y,w)}Φε,µ(x, y, w; N, M, q) is nonsingular everywhere;

(2) The equation

    Φε,µ(x, y, w; N, M, q) = 0      (6.12)

has a unique solution (y(x, ε, µ), w(x, ε, µ)), which is continuously differentiable with respect to x and satisfies

    y(x, ε, µ) > 0,  w(x, ε, µ) > 0,
    y(x, ε, µ)[i] w(x, ε, µ)[i] = µ²,  i = 1, · · · , m.      (6.13)

In the following, to mitigate the notational complication, we assume ε = µ and denote Φε,µ, y(x, ε, µ), and w(x, ε, µ) by Φµ, y(x, µ), and w(x, µ), respectively. Our analysis will remain valid, however, even though the two parameters are treated independently.

Suppose that each Mℓ is a P0-matrix in problem (6.7) and µ > 0. Theorem 6.1 indicates that, for any fixed ℓ and µ > 0, the smooth equation

    Φµ(x, yℓ, wℓ; Nℓ, Mℓ, qℓ) = 0      (6.14)


gives two functions yℓ(·, µ) and wℓ(·, µ) that are well-defined and continuously differentiable. Note that

    φµ(a, b) = 0  ⟺  a ≥ 0, b ≥ 0, ab = µ².

As a result, the equation (6.14) is equivalent to the system

    yℓ ≥ 0,  Nℓx + (Mℓ + µI)yℓ + qℓ ≥ 0,
    yℓ[i] ( Nℓx + (Mℓ + µI)yℓ + qℓ )[i] = µ²,  i = 1, · · · , m,      (6.15)

in the sense that yℓ(x, µ) solves (6.15) if and only if

    Φµ(x, yℓ(x, µ), wℓ(x, µ); Nℓ, Mℓ, qℓ) = 0      (6.16)

with

    wℓ(x, µ) = Nℓx + (Mℓ + µI)yℓ(x, µ) + qℓ.      (6.17)

Since the system (6.15) with µ = 0 reduces to the linear complementarity problem LCP(x; Nℓ, Mℓ, qℓ):

    yℓ ≥ 0,  Nℓx + Mℓyℓ + qℓ ≥ 0,
    yℓᵀ( Nℓx + Mℓyℓ + qℓ ) = 0,      (6.18)

we see that yℓ(x, µ) tends to a solution of (6.18) as µ → 0, provided that it is convergent.
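Numerically, (6.14) is a smooth square system in (yℓ, wℓ) for fixed x, and Theorem 6.1 guarantees that its Jacobian is nonsingular, so a plain Newton iteration is a natural way to evaluate yℓ(x, µ). The following MATLAB sketch is our illustration, not the author's code; the starting point, iteration limit, and tolerance are arbitrary, and the scenario data are denoted N, M, q:

    % Sketch: Newton's method on Phi_mu(x, y, w; N, M, q) = 0 for fixed x and mu.
    m = length(q);
    y = ones(m, 1);  w = ones(m, 1);             % arbitrary starting point
    for it = 1:100
        s = sqrt(y.^2 + w.^2 + 2*mu^2);
        r = [N*x + (M + mu*eye(m))*y + q - w;    % top block of (6.11) with eps = mu
             y + w - s];                         % phi_mu applied componentwise
        if norm(r) <= 1e-12, break; end
        J = [M + mu*eye(m),  -eye(m);            % Jacobian of the two blocks
             diag(1 - y./s),  diag(1 - w./s)];
        d = -J \ r;
        y = y + d(1:m);  w = w + d(m+1:2*m);
    end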

In our analysis, we will simply assume that yℓ(x, µ) is bounded as µ → 0. In particular, if Mℓ is a P-matrix, then (6.18) has a unique solution for any x, and it can be shown that yℓ(x, µ) actually converges to it as µ → 0, even without the regularization term µI in (6.15); see [12].

The following lemma will be used later on.

Lemma 6.1 [29] Let M be a P0-matrix and D = diag(d1, · · · , dm) with 0 ≤ di ≤ 1 for each i. If the principal submatrix M[K] is nonsingular, where K = { i | di = 1 }, then the matrix

    ( M       −I
      I − D    D )

is nonsingular.


6.2.2 Method

Let {µk} be a sequence of positive numbers converging to 0. A smoothing implicit programming method for problem (6.7), which we call SIP-I, generates a sequence {x(k)} by solving the problems

    minimize   θµk(x)      (6.19)
    subject to g(x) ≤ 0, h(x) = 0,

where

    θµk(x) = Σ_{ℓ=1}^{L} pℓ f(x, yℓ(x, µk))      (6.20)

and yℓ(x, µk) satisfies the system (6.16)–(6.17) with µ = µk. Let

    y(k)ℓ = yℓ(x(k), µk),  ℓ = 1, · · · , L.      (6.21)

Throughout, we denote by F1 and X the feasible regions of problems (6.7) and (6.19), respectively. Moreover, particular sequences generated by the method will be denoted by {x(k)}, {y(k)}, etc., while general sequences will be denoted by {xk}, {yk}, etc.

Note that, by Theorem 6.1, problem (6.19) is a smooth mathematical program. In particular, if X = ℝ^n, problem (6.19) reduces to a smooth unconstrained optimization problem. Moreover, under some suitable conditions, (6.19) is a convex program; see [12] for details. Therefore, we may expect problem (6.19) to be relatively easy to deal with, provided the evaluation of the function yℓ(x, µk) is not very expensive.
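The outer structure of SIP-I is short enough to sketch in MATLAB. Everything below is conceptual rather than the author's implementation: theta_mu is an assumed helper evaluating (6.20) by solving the L smoothed systems (6.14) (for instance with the Newton iteration sketched above), and g, h, x0, kmax are assumed given.

    % Conceptual sketch of SIP-I: solve the smooth problems (6.19) for mu_k -> 0.
    mu = 1e-1;  x = x0;
    for k = 1:kmax
        obj     = @(xx) theta_mu(xx, mu);        % theta_{mu_k} of (6.20), assumed helper
        nonlcon = @(xx) deal(g(xx), h(xx));      % g(x) <= 0 (inequalities), h(x) = 0 (equalities)
        x  = fmincon(obj, x, [], [], [], [], [], [], nonlcon);  % warm start at previous x(k)
        mu = 0.1 * mu;                           % drive the smoothing parameter to 0
    end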

6.2.3 Limiting behavior of local optimal solutions

In this section, we first give some properties of the functions yℓ(·, µ) and then investigate the limiting behavior of sequences of local optimal solutions of problem (6.19). To this end, the following assumptions are assumed throughout this subsection:

A1: SIP-I produces a bounded sequence {x(k)} of local optimal solutions of (6.19).

A2: For any bounded sequence {xk} in X, the sequence {yℓ(xk, µk)} is bounded for each ℓ.


Lemma 6.2 Suppose that all Mℓ, ℓ = 1, · · · , L, are P0-matrices in problem (6.7) and (x∗, y∗1, · · · , y∗L) ∈ F1. Assume that, for each ℓ, the submatrix Mℓ[K∗ℓ] is nondegenerate, where

    K∗ℓ = { i | (Nℓx∗ + Mℓy∗ℓ + qℓ)[i] = 0 }.

Then there exist a neighborhood U∗ of (x∗, y∗1, · · · , y∗L) and a positive constant π∗ such that

    ‖yℓ(x, µk) − yℓ‖ ≤ µkπ∗(‖yℓ‖ + √m),  ℓ = 1, · · · , L      (6.22)

holds for any (x, y1, · · · , yL) ∈ U∗ ∩ F1 and every k.

Proof: For any (x, y1, · · · , yL) ∈ ℝ^{n+mL}, we let

    wℓ = Nℓx + Mℓyℓ + qℓ,  ℓ = 1, · · · , L.

Since (x, y1, · · · , yL) ∈ F1 implies

    Φ0(x, yℓ, wℓ; Nℓ, Mℓ, qℓ) = 0,  ℓ = 1, · · · , L,      (6.23)

(6.16) and (6.23) yield

    0 = Φµk(x, yℓ(x, µk), wℓ(x, µk); Nℓ, Mℓ, qℓ) − Φ0(x, yℓ, wℓ; Nℓ, Mℓ, qℓ)

      = ( Mℓ + µkI        −I
          I − Dℓ(x, µk)   Dℓ(x, µk) ) ( yℓ(x, µk) − yℓ
                                        wℓ(x, µk) − wℓ ) − µk ( −yℓ ; 2aℓ^k[1]µk ; · · · ; 2aℓ^k[m]µk ).      (6.24)

Here, Dℓ(x, µk) = diag( aℓ^k[i](yℓ(x, µk)[i] + yℓ[i]) ) and

    aℓ^k[i] = 1 / ( √((yℓ(x, µk)[i])² + (wℓ(x, µk)[i])² + 2µk²) + √((yℓ[i])² + (wℓ[i])²) )
            = 1 / ( yℓ(x, µk)[i] + wℓ(x, µk)[i] + yℓ[i] + wℓ[i] ),      (6.25)

where the last equality follows from (6.16) and (6.23). From (6.13), we see that, for any ℓ, i, and any k,

    0 < aℓ^k[i]( yℓ(x, µk)[i] + yℓ[i] ) < 1.      (6.26)

Note that the matrix

    ( Mℓ + µkI        −I
      I − Dℓ(x, µk)   Dℓ(x, µk) )


is nonsingular [29]. We next prove that there exist a neighborhood U∗ of (x∗, y∗1, · · · , y∗L) and a positive constant π∗ such that

    ‖ ( Mℓ + µkI        −I
        I − Dℓ(x, µk)   Dℓ(x, µk) )⁻¹ ‖ ≤ π∗,  ℓ = 1, · · · , L      (6.27)

holds for any (x, y1, · · · , yL) ∈ U∗ ∩ F1 and any k. Otherwise, there must be an index ℓ, a subsequence {kj} of {k}, and a sequence {(xj, yj1, · · · , yjL)} ⊂ F1 such that

    lim_{j→∞} (xj, yj1, · · · , yjL) = (x∗, y∗1, · · · , y∗L)      (6.28)

and

    lim_{j→∞} ‖ ( Mℓ + µkjI         −I
                  I − Dℓ(xj, µkj)   Dℓ(xj, µkj) )⁻¹ ‖ = +∞.      (6.29)

By (6.26), the sequence {Dℓ(xj, µkj)} is bounded. Hence, passing to a further subsequence if necessary, we may assume that

    lim_{j→∞} Dℓ(xj, µkj) = D̄ℓ := diag(d̄ℓ[1], · · · , d̄ℓ[m]).

Note that, by (6.25),

    d̄ℓ[i] = lim_{j→∞} ( yℓ(xj, µkj)[i] + yjℓ[i] ) / ( yℓ(xj, µkj)[i] + wℓ(xj, µkj)[i] + yjℓ[i] + wjℓ[i] ).

By Assumption A2, the sequence {yℓ(xj, µkj)} is bounded and then so is the sequence {wℓ(xj, µkj)}. If d̄ℓ[i] = 1, we have

    lim_{j→∞} ( wℓ(xj, µkj)[i] + wjℓ[i] ) / ( yℓ(xj, µkj)[i] + wℓ(xj, µkj)[i] + yjℓ[i] + wjℓ[i] ) = 0.      (6.30)

We claim that

    lim_{j→∞} ( wℓ(xj, µkj)[i] + wjℓ[i] ) = 0.      (6.31)

In fact, suppose that (6.31) does not hold and, without loss of generality, assume

    lim_{j→∞} ( wℓ(xj, µkj)[i] + wjℓ[i] ) = τ > 0.

Since all numbers involved are nonnegative and bounded, it follows that

    lim_{j→∞} ( wℓ(xj, µkj)[i] + wjℓ[i] ) / ( yℓ(xj, µkj)[i] + wℓ(xj, µkj)[i] + yjℓ[i] + wjℓ[i] ) ≠ 0,


which contradicts (6.30). Therefore, we must have (6.31) and hence

    w∗ℓ[i] = lim_{j→∞} wjℓ[i] = 0.

This implies that

    K∗ℓ ⊇ K̄ℓ = { i | d̄ℓ[i] = 1 }.

Taking into account the assumption that the submatrix Mℓ[K∗ℓ] is nondegenerate, we deduce that the submatrix Mℓ[K̄ℓ] is nonsingular and then, by Lemma 6.1, the limit matrix

    ( Mℓ        −I
      I − D̄ℓ    D̄ℓ )

is nonsingular. We therefore have

    lim_{j→∞} ( Mℓ + µkjI         −I
                I − Dℓ(xj, µkj)   Dℓ(xj, µkj) )⁻¹ = ( Mℓ        −I
                                                      I − D̄ℓ    D̄ℓ )⁻¹.

However, this contradicts (6.29) and hence we obtain (6.27). In addition, we have from (6.13) that, for every k, ℓ = 1, · · · , L, and i = 1, · · · , m,

    2aℓ^k[i]µk ≤ 2µk / √((yℓ(x, µk)[i])² + (wℓ(x, µk)[i])² + 2µk²)
               ≤ 2µk / √(2 yℓ(x, µk)[i] wℓ(x, µk)[i] + 2µk²)
               = 1.      (6.32)

Thus, it follows from (6.24), (6.27), and (6.32) that

    ‖yℓ(x, µk) − yℓ‖ ≤ ‖ ( yℓ(x, µk) − yℓ
                           wℓ(x, µk) − wℓ ) ‖

                     = µk ‖ ( Mℓ + µkI        −I
                              I − Dℓ(x, µk)   Dℓ(x, µk) )⁻¹ ( −yℓ ; 2aℓ^k[1]µk ; · · · ; 2aℓ^k[m]µk ) ‖

                     ≤ µkπ∗(‖yℓ‖ + √m).      (6.33)

Note that the above inequalities hold for all ℓ, and the proof is completed.

Under Assumptions A1 and A2, the sequence {(x(k), y(k)1, · · · , y(k)L)} generated by SIP-I is bounded. In the rest of this subsection, we further make the following assumptions:


A3: The sequence {(x(k), y(k)1, · · · , y(k)L)} generated by SIP-I is convergent to a point (x∗, y∗1, · · · , y∗L).

A4: There exists a neighborhood V∗ of (x∗, y∗1, · · · , y∗L) such that x(k) minimizes θµk over V∗|X for all k large enough, where V∗|X = { x ∈ X | (x, y1, · · · , yL) ∈ V∗ }.

Theorem 6.2 Suppose that all matrices Mℓ, ℓ = 1, · · · , L, are P0-matrices in problem (6.7). Let {(x(k), y(k)1, · · · , y(k)L)} and (x∗, y∗1, · · · , y∗L) be a sequence generated by SIP-I and its limit point, respectively. If the submatrix Mℓ[K∗ℓ] is nondegenerate for each ℓ, where K∗ℓ is the same as in Lemma 6.2, then (x∗, y∗1, · · · , y∗L) is a local optimal solution of problem (6.7).

Proof: First, we have (x∗, y∗1, · · · , y∗L) ∈ F1 immediately from (6.15). By Lemma 6.2, there exist a neighborhood U∗ ⊆ V∗ of the point (x∗, y∗1, · · · , y∗L) and a positive number π∗ such that (6.22) holds for any (x, y1, · · · , yL) ∈ U∗ ∩ F1 and every k.

Choose a positive number η and let

    F1,η = { (x, y1, · · · , yL) ∈ U∗ ∩ F1 | ‖(x, y1, · · · , yL) − (x∗, y∗1, · · · , y∗L)‖ ≤ η }.

Since F1,η is a nonempty compact set, the continuity of f ensures that the problem

    minimize   Σ_{ℓ=1}^{L} pℓ f(x, yℓ)
    subject to (x, y1, · · · , yL) ∈ F1,η      (6.34)

has a nonempty solution set. Let (x̄, ȳ1, · · · , ȳL) be an arbitrary solution of (6.34).

For any (x, y1, · · · , yL) ∈ F1,η, by the mean-value theorem, we have

    θµk(x) = Σ_{ℓ=1}^{L} pℓ f(x, yℓ(x, µk))
           = Σ_{ℓ=1}^{L} pℓ ( f(x, yℓ) + (yℓ(x, µk) − yℓ)ᵀ ∇y f(x, αℓyℓ + (1 − αℓ)yℓ(x, µk)) ),      (6.35)

where αℓ ∈ [0, 1] for each ℓ. By (6.22), we have

    ‖αℓyℓ + (1 − αℓ)yℓ(x, µk)‖ = ‖(1 − αℓ)(yℓ(x, µk) − yℓ) + yℓ‖
                               ≤ ‖yℓ(x, µk) − yℓ‖ + ‖yℓ‖
                               ≤ µkπ∗(‖yℓ‖ + √m) + ‖yℓ‖.      (6.36)

Since F1,η is bounded, it follows from (6.36) that the set

    { (x, αℓyℓ + (1 − αℓ)yℓ(x, µk)) | (x, y1, · · · , yL) ∈ F1,η, k = 1, 2, · · · , αℓ ∈ [0, 1] }


is bounded for each ℓ = 1, · · · , L, and so there exists a constant τ > 0 such that

    ‖∇y f(x, αℓyℓ + (1 − αℓ)yℓ(x, µk))‖ ≤ τ,  ℓ = 1, · · · , L

holds for any (x, y1, · · · , yL) ∈ F1,η, αℓ ∈ [0, 1], and every k. It then follows from (6.35) and (6.22) that

    | θµk(x) − Σ_{ℓ=1}^{L} pℓ f(x, yℓ) | ≤ τ Σ_{ℓ=1}^{L} ‖yℓ(x, µk) − yℓ‖ ≤ τµkπ∗ Σ_{ℓ=1}^{L} ( ‖yℓ‖ + √m )

for any (x, y1, · · · , yL) ∈ F1,η and k. In particular,

    | θµk(x̄) − Σ_{ℓ=1}^{L} pℓ f(x̄, ȳℓ) | ≤ τµkπ∗ Σ_{ℓ=1}^{L} ( ‖ȳℓ‖ + √m ),  ∀k.      (6.37)

On the one hand, the continuity of f yields

    lim_{k→∞} θµk(x(k)) = Σ_{ℓ=1}^{L} pℓ lim_{k→∞} f(x(k), y(k)ℓ) = Σ_{ℓ=1}^{L} pℓ f(x∗, y∗ℓ).      (6.38)

Note that, by Assumption A4 and the fact that U∗ ⊆ V∗, x(k) is a global optimal solution of problem (6.19) when k is large enough, and x̄ is a feasible point of (6.19). We then have from (6.37) that, for every sufficiently large k,

    θµk(x(k)) ≤ θµk(x̄) ≤ Σ_{ℓ=1}^{L} pℓ f(x̄, ȳℓ) + τµkπ∗ Σ_{ℓ=1}^{L} ( ‖ȳℓ‖ + √m ).      (6.39)

Therefore, taking into account the equality (6.38) and the fact that µk → 0 as k → ∞, we have by letting k → ∞ in (6.39) that

    Σ_{ℓ=1}^{L} pℓ f(x∗, y∗ℓ) ≤ Σ_{ℓ=1}^{L} pℓ f(x̄, ȳℓ).

On the other hand, since (x̄, ȳ1, · · · , ȳL) is a solution of problem (6.34), we have

    Σ_{ℓ=1}^{L} pℓ f(x∗, y∗ℓ) ≥ Σ_{ℓ=1}^{L} pℓ f(x̄, ȳℓ).

It then follows that

    Σ_{ℓ=1}^{L} pℓ f(x∗, y∗ℓ) = Σ_{ℓ=1}^{L} pℓ f(x̄, ȳℓ).

This means that (x∗, y∗1, · · · , y∗L) is a global optimal solution of problem (6.34); in other words, (x∗, y∗1, · · · , y∗L) is a local optimal solution of problem (6.7). This completes the proof.


Theorem 6.3 Suppose that all Mℓ, ℓ = 1, · · · , L, are P-matrices in problem (6.7) and, for each k, (x(k), y(k)1, · · · , y(k)L) is a global optimal solution of problem (6.19). Then the limit point (x∗, y∗1, · · · , y∗L) is a global optimal solution of problem (6.7).

Proof: Recall that (x∗, y∗1, · · · , y∗L) ∈ F1. Since all Mℓ are P-matrices (and hence nondegenerate), the conditions of Lemma 6.2 are satisfied at every point (x, y1, · · · , yL) ∈ F1. Thus, for any point (x, y1, · · · , yL) ∈ F1, there exist a neighborhood U = U(x, y1, · · · , yL) of (x, y1, · · · , yL) and a positive constant π = π(x, y1, · · · , yL) such that, for any (x, y1, · · · , yL) ∈ U ∩ F1,

    ‖yℓ(x, µk) − yℓ‖ ≤ µkπ(‖yℓ‖ + √m)

holds for each ℓ and k.

For an arbitrary positive number η, we define the set F1,η by

    F1,η = { (x, y1, · · · , yL) ∈ F1 | ‖(x, y1, · · · , yL) − (x∗, y∗1, · · · , y∗L)‖ ≤ η }.      (6.40)

It is clear that F1,η is a nonempty compact set. Since the set of neighborhoods

    U = { U(x, y1, · · · , yL) | (x, y1, · · · , yL) ∈ F1,η }

is an open covering of F1,η, there is a finite number of neighborhoods, say U1, U2, · · · , Us, in U such that {U1, U2, · · · , Us} constitutes a covering of F1,η. By Lemma 6.2, there exist constants π1, · · · , πs corresponding to the sets U1, U2, · · · , Us. Then, by setting

    π∗ = max{π1, π2, · · · , πs},

we have (6.22) for any (x, y1, · · · , yL) ∈ F1,η and every k.

Consider problem (6.34) with F1,η defined by (6.40). In a similar way to the proof of Theorem 6.2, we can show that (x∗, y∗1, · · · , y∗L) is a global optimal solution of (6.34). Noting that η > 0 is arbitrary, we see that (x∗, y∗1, · · · , y∗L) is actually a global optimal solution of problem (6.7), and so the proof is completed.

From a computational viewpoint, it is in general difficult or even impossible to get an exact optimal solution of an optimization problem. For this reason, the following result is more interesting in practice.

Theorem 6.4 Suppose that {εk} ⊆ (0, +∞) is convergent to 0 and, for every k, x(k) ∈ X is an approximate local (or global) optimal solution of the problem (6.19) satisfying

    θµk(x(k)) − εk ≤ θµk(x),  ∀x ∈ X ∩ U(x(k)) (or x ∈ X),

where U(x(k)) is a neighborhood of the point x(k). Let y(k)ℓ = yℓ(x(k), µk) for each ℓ = 1, · · · , L. Then, under the same conditions as in Theorem 6.2 or Theorem 6.3, the corresponding conclusion remains true.


6.2.4 Limiting behavior of stationary points

In the last subsection, we have discussed the convergence of optimal solutions of problems (6.19). In practice, it may not be easy to obtain an optimal solution, or even an approximate optimal solution, whereas the computation of stationary points may be relatively easy. Therefore, it is necessary to study the limiting behavior of stationary points of the subproblems (6.19).

For problem (6.19), we will use the standard definition of stationarity. Moreover, recalling that problem (6.7) is equivalent to the ordinary MPEC (6.10), we employ the same terminologies for (6.7) as in the study of MPECs. Suppose (x∗, y∗) is a feasible point of (6.10).

Definition 6.3 Assume that the MPEC-LICQ holds at (x∗,y∗) in problem (6.10). Wesay (x∗,y∗) is a C-stationary point if there exist multiplier vectors u∗, v∗, λ∗, and γ∗

such that

∇f(x∗,y∗)−∑

i∈IG(x∗,y∗)

u∗[i]∇Gi(x∗,y∗)−∑

i∈IF (x∗,y∗)

v∗[i]∇Fi(x∗,y∗)

+∑

i∈I∗gλ∗[i]

(∇gi(x∗)0

)+

s2∑

i=1

γ∗[i]

(∇hi(x∗)0

)= 0, (6.41)

λ∗ ≥ 0, u∗[i]v∗[i] ≥ 0 for i ∈ IG(x∗,y∗) ∩ IF (x∗,y∗). (6.42)

If furthermore, there holds

u∗[i] ≥ 0, v∗[i] ≥ 0, ∀i ∈ IG(x∗,y∗) ∩ IF (x∗,y∗), (6.43)

we call (x∗,y∗) a B-stationary point.

Definition 6.4 A solution y∗` of problem LCP(x∗;N`,M`, q`) is said to satisfy the strictcomplementarity condition if I∗Y`

∩ I∗W`= ∅, where

I∗Y`= i | y∗` [i] = 0,

I∗W`= i | (N`x

∗ + M`y∗` + q`)[i] = 0.

Theorem 6.5 Suppose that all M`, ` = 1, · · · , L, are P0-matrices in problem (6.7) andfor each k, x(k) is a stationary point of (6.19). Let (x∗, y∗1, · · · , y∗L) be an accumulationpoint of the sequence

(x(k), y

(k)1 , · · · , y(k)

L )

generated by SIP-I. If the MPEC-LICQ

Page 136: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

130 6. Smoothing Implicit Programming Approach for SMPECs

is satisfied at (x∗, y∗1, · · · , y∗L), then (x∗, y∗1, · · · , y∗L) is a C-stationary point of problem(6.7). In particular, if y∗` satisfies the strict complementarity condition for each `, then(x∗, y∗1, · · · , y∗L) is B-stationary.

Proof: Assume without loss of generality that the sequence(x(k), y

(k)1 , · · · , y(k)

L )

converges to (x∗, y∗1, · · · , y∗L). Since the MPEC-LICQ holds at (x∗, y∗1, · · · , y∗L), it isobvious that, for every k sufficiently large, problem (6.19) satisfies the standard LICQat x(k) and then, by the stationarity of x(k), there exist unique Lagrange multipliervectors λk and γk such that

∇θµk(x(k)) +∇g(x(k))λk +∇h(x(k))γk = 0, (6.44)

g(x(k)) ≤ 0, h(x(k)) = 0, (6.45)

λk ≥ 0, g(x(k))T λk = 0. (6.46)

In the remainder of the proof, we suppose k is large enough so that (6.44)–(6.46) holdand, in addition,

Ig(x(k)) ⊆ I∗g , (6.47)

which follows from the continuity of g.

Note that Φµk(x, y`(x, µk), w`(x, µk);N`,M`, q`) = 0 is satisfied for each `. By the

implicit function theorem [64], we have(∇y`(x(k), µk)T

∇w`(x(k), µk)T

)

= −(∇y`

Φµk(x(k), y(k), w(k); N`,M`, q`)

∇w`Φµk

(x(k), y(k), w(k); N`,M`, q`)

)−T

∇xΦµk(x(k), y(k), w(k); N`,M`, q`)T

= −(

M` + µkI −I

I −Dk` Dk

`

)−1 (N`

O

), (6.48)

where Dk` = diag

(y(k)`

[1]

y(k)`

[1]+w(k)`

[1], · · · , y

(k)`

[m]

y(k)`

[m]+w(k)`

[m]

)and the existence of the inverse

matrix follows from Theorem 6.1. Furthermore, since

(M` + µkI −I

I −Dk` Dk

`

)−1

=

(Ek

` Dk` Ek

`

−I + (M` + µkI)Ek` Dk

` (M` + µkI)Ek`

)(6.49)

with

Ek` =

(Dk

` M` + I − (1− µk)Dk`

)−1(6.50)

Page 137: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.2 Method for Wait-And-See Problems 131

(see [29]), it follows from (6.48) that

∇y`(x(k), µk) = −NT` Dk

` (Ek` )T , ` = 1, · · · , L. (6.51)

Thus, (6.44) becomes

0 = ∇θµk(x(k)) +∇g(x(k))λk +∇h(x(k))γk

=L∑

`=1

p`

(∇xf(x(k), y

(k)` ) +∇y`(x(k), µk)∇yf(x(k), y

(k)` )

)

+∇g(x(k))λk +∇h(x(k))γk

=L∑

`=1

p`∇xf(x(k), y(k)` )−

L∑

`=1

p`NT` Dk

` (Ek` )T∇yf(x(k), y

(k)` )

+∇g(x(k))λk +∇h(x(k))γk

=L∑

`=1

p`∇xf(x(k), y(k)` )−

L∑

`=1

NT` vk

` +∇g(x(k))λk +∇h(x(k))γk, (6.52)

where vk` is defined by

vk` = p`D

k` (Ek

` )T∇yf(x(k), y(k)` ), (6.53)

that is,

vk` [i] =

p` y(k)` [i]

y(k)` [i] + w

(k)` [i]

eTi (Ek

` )T∇yf(x(k), y(k)` ), i = 1, · · · ,m. (6.54)

For each `, we let

uk` = p`∇yf(x(k), y

(k)` )−MT

` vk` ,

that is,

p`∇yf(x(k), y(k)` )− uk

` −MT` vk

` = 0. (6.55)

It then follows from (6.53) and (6.50) that, for every ` = 1, · · · , L,

uk` = p`∇yf(x(k), y

(k)` )−MT

` vk`

= p`∇yf(x(k), y(k)` )− p`M

T` Dk

` (Ek` )T∇yf(x(k), y

(k)` )

= p`

((Ek

` )−T −MT` Dk

`

)(Ek

` )T∇yf(x(k), y(k)` )

= p`

(I − (1− µk)Dk

`

)(Ek

` )T∇yf(x(k), y(k)` )

and hence

uk` [i] =

p`(µky(k)` [i] + w

(k)` [i])

y(k)` [i] + w

(k)` [i]

eTi (Ek

` )T∇yf(x(k), y(k)` ), i = 1, · · · ,m. (6.56)

Page 138: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

132 6. Smoothing Implicit Programming Approach for SMPECs

Taking into account (6.45)–(6.47), we obtain from (6.52) and (6.55) that

0 =

∑L`=1 p`∇xf(x(k), y

(k)` )

p1∇yf(x(k), y(k)1 )

...pL∇yf(x(k), y

(k)L )

O · · · O

I · · · O...

. . ....

O · · · I

uk −

NT1 · · · NT

L

MT1 · · · O...

. . ....

O · · · MTL

vk

+∑

i∈I∗gλk[i]

∇gi(x(k))0...0

+

∇h(x(k))O...O

γk, (6.57)

where

uk =

uk1...

ukL

, vk =

vk1...

vkL

.

We can further rewrite (6.57) as

∑L`=1 p`∇xf(x(k), y

(k)` )

p1∇yf(x(k), y(k)1 )

...pL∇yf(x(k), y

(k)L )

L∑

`=1

i/∈I∗Y`

uk` [i]

0...ei...0

−L∑

`=1

i/∈I∗W`

vk` [i]

N`[i]...

M`[i]...0

=L∑

`=1

i∈I∗Y`

uk` [i]

0...ei...0

+L∑

`=1

i∈I∗W`

vk` [i]

N`[i]...

M`[i]...0

−∑

i∈I∗gλk[i]

∇gi(x(k))0...0

s2∑

i=1

γk[i]

∇hi(x(k))0...0

. (6.58)

We next prove that, for each `,

i /∈ I∗Y`⇒ lim

k→∞uk

` [i] = 0, (6.59)

i /∈ I∗W`⇒ lim

k→∞vk` [i] = 0. (6.60)

Page 139: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.2 Method for Wait-And-See Problems 133

To this end, it is enough to show that ‖(Ek` )T∇yf(x(k), y

(k)` )‖ is bounded for every

`. Otherwise, taking a further subsequence if necessary, there is an index ˆ satisfying∥∥∥(Ek

ˆ)T∇yf(x(k), y

(k)ˆ )

∥∥∥ = max1≤`≤L

∥∥∥(Ek` )T∇yf(x(k), y

(k)` )

∥∥∥, ∀k (6.61)

and

limk→∞

∥∥∥(Ekˆ)

T∇yf(x(k), y(k)ˆ )

∥∥∥ = +∞. (6.62)

Note that, from (6.56) and (6.61), we have

|uk` [i]| ≤ p`(µky

(k)` [i] + w

(k)` [i])

y(k)` [i] + w

(k)` [i]

‖(Ek` )T∇yf(x(k), y

(k)` )‖

≤ p`(µky(k)` [i] + w

(k)` [i])

y(k)` [i] + w

(k)` [i]

‖(Ekˆ)

T∇yf(x(k), y(k)ˆ )‖

for each `, i, and k. In consequence, for each `, we have

i /∈ I∗Y`⇒ µky

(k)` [i] + w

(k)` [i]

y(k)` [i] + w

(k)` [i]

→ 0 ⇒ |uk` [i]|

‖(Ekˆ)T∇yf(x(k), y

(k)ˆ )‖

→ 0. (6.63)

Similarly, we can show that, for all ` = 1, · · · , L,

i /∈ I∗W`⇒ |vk

` [i]|‖(Ek

ˆ)T∇yf(x(k), y(k)ˆ )‖

→ 0. (6.64)

Let dk denote the vector on the right-hand side of equality (6.58). It then follows from(6.58) and (6.62)–(6.64) that

limk→∞

dk

‖(Ekˆ)T∇yf(x(k), y

(k)ˆ )‖

= 0. (6.65)

Since the MPEC-LICQ holds at (x∗, y∗1, · · · , y∗L), the vectors on the right-hand side of(6.58) are linearly independent when k is sufficiently large and so, by (6.65), all thesequences generated by dividing the multipliers that appear on the right-hand side of(6.58) by the number ‖(Ek

ˆ)T∇yf(x(k), y

(k)ˆ )‖ are convergent to 0 as k →∞. This fact,

together with (6.63) and (6.64), implies that, for any i,

limk→∞

ukˆ[i]

‖(Ekˆ)T∇yf(x(k), y

(k)ˆ )‖

= 0, (6.66)

limk→∞

vkˆ[i]

‖(Ekˆ)T∇yf(x(k), y

(k)ˆ )‖

= 0. (6.67)

Page 140: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

134 6. Smoothing Implicit Programming Approach for SMPECs

However, noticing that

ukˆ[i] + vk

ˆ[i]

‖(Ekˆ)T∇yf(x(k), y

(k)ˆ )‖

= pˆ

(1 +

µky(k)ˆ [i]

y(k)ˆ [i] + w

(k)ˆ [i]

) (Ekˆ)

T∇yf(x(k), y(k)ˆ )[i]

‖(Ekˆ)T∇yf(x(k), y

(k)ˆ )‖

holds for any i and k, there exists an index i such that

limk→∞

|ukˆ[i] + uk

ˆ[i]|‖(Ek

ˆ)T∇yf(x(k), y(k)ˆ )‖

≥ 1√m

limk→∞

(1 +

µky(k)ˆ [i]

y(k)ˆ [i] + w

(k)ˆ [i]

)=

pˆ√m

> 0.

This contradicts (6.66) and (6.67). As a result, the implications (6.59) and (6.60) aretrue.

Consider equality (6.58) again. By (6.59) and (6.60), the left-hand side of (6.58)is convergent as k → ∞. Recall that the vectors on the right-hand side of (6.58)are linearly independent when k is sufficiently large. These facts imply that all thesequences of the multipliers that appear on the right-hand side of (6.58) are convergent.In consequence, by letting k → ∞ in (6.58), we obtain the equality corresponding to(6.41).

In addition, since both y(k)` [i] and w

(k)` [i] are positive, we have from (6.56) and (6.54)

that

uk` [i]v

k` [i] ≥ 0, i = 1, · · · ,m.

This together with (6.46) yields (6.42). Therefore, (x∗, y∗1, · · · , y∗L) is a C-stationarypoint of problem (6.7). This completes the proof of the first part. The second half of thetheorem follows from the definitions of C-stationarity and B-stationarity immediately.

6.3 Smoothing Implicit Programming Method for Here-

And-Now Problems

In this section, we consider the following discrete here-and-now problem:

minimizeL∑

`=1

p`

(f(x, y, ω`) + dT z`

)

subject to g(x) ≤ 0, h(x) = 0,

y ≥ 0, N`x + M`y + q` + z` ≥ 0, (6.68)

yT (N`x + M`y + q` + z`) = 0,

z` ≥ 0, ` = 1, · · · , L,

Page 141: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.3 Method for Here-And-Now Problems 135

where d is a vector with positive constants. As mentioned in Section 6.1, x and y

represent the upper-level and the lower-level decisions, respectively, that we have tomake at once, before ω`, ` = 1, · · · , L, are observed, whereas z` is the recourse variablecorresponding to ω`.

It is easy to see that problem (6.68) can be rewritten as

minimizeL∑

l=1

pl

(f(x, y, ωl) + dT zl

)

subject to g(x) ≤ 0, h(x) = 0,

N`x + M`y + q` + z` ≥ 0,

z` ≥ 0, ` = 1, · · · , L, (6.69)

y ≥ 0, Nx + My + q +∑L

l=1zl ≥ 0,

yT (Nx + My + q +∑L

l=1zl) = 0,

where N =∑L

l=1 Nl,M =∑L

l=1 Ml, and q =∑L

l=1 ql, or equivalently,

minimizeL∑

`=1

p`f(x, y, ω`) + dTz

subject to g(x) ≤ 0, h(x) = 0,

y −Dy = 0, z ≥ 0, (6.70)

y ≥ 0, Nx + My + q + z ≥ 0,

yT (Nx + My + q + z) = 0

with y,N,M,q defined by (6.8) and

z :=

z1...

zL

, d :=

p1d...

pLd

, D :=

I...I

. (6.71)

Both problems (6.69) and (6.70) are actually ordinary MPECs, which are large scaleproblems in practice. In this section, we propose a smoothing implicit programmingmethod akin to SIP-I for solving problem (6.68) with the help of a penalty technique.Note that SIP-I cannot be applied to (6.69) or (6.70) directly because of the existenceof some non-complementarity constraints involving the variable y.

On the one hand, for any feasible point (x, y, z1, · · · , zL) of problem (6.69), (Nx +My+q+

∑Ll=1zl)[i] = 0 implies that (N`x+M`y+q` +z`)[i] = 0 holds for every `. This

indicates that the MPEC-LICQ does not hold for problem (6.69) in general. Therefore,in this section, the MPEC-LICQ means the one for problem (6.70). On the other hand,

Page 142: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

136 6. Smoothing Implicit Programming Approach for SMPECs

because the complementarity constraints in problem (6.69) are lower dimensional, weuse them to generate the subproblems.

Suppose that M is a P0-matrix. For each (x, z1, · · · , zL) and µk > 0, let y(x,∑L

l=1 zl, µk)and w(x,

∑Ll=1 zl, µk) solve

Φµk

(x, y(x,

∑Ll=1zl, µk), w(x,

∑Ll=1zl, µk);N, M, q +

∑Ll=1zl

)= 0. (6.72)

The existence and differentiability of the above implicit functions follow from Theorem6.1. Note that the implicit functions are denoted by y(x,

∑Ll=1 zl, µk) and w(x,

∑Ll=1 zl, µk),

rather than y(x, z1, · · · , zL, µk) and w(x, z1, · · · , zL, µk), respectively. In the following,we use ∇z to denote the derivative with respect to the second argument. As mentionedin Section 6.1, the smooth optimization problem

minimizeL∑

`=1

p`

(f(x, y(x,

∑Ll=1zl, µk), ω`) + dT z`

)

subject to g(x) ≤ 0, h(x) = 0, (6.73)

N`x + M`y(x,∑L

l=1zl, µk) + q` + z` ≥ 0,

z` ≥ 0, ` = 1, · · · , L

is an approximation of problem (6.69). Since the feasible region of problem (6.73)is dependent on µk, (6.73) may not be easy to solve. Therefore, we apply a penaltytechnique to this problem and obtain the following approximation of problem (6.69):

minimize θk(x, z1, · · · , zL)

subject to g(x) ≤ 0, h(x) = 0, (6.74)

z` ≥ 0, ` = 1, · · · , L,

where

θk(x, z1, · · · , zL) =L∑

`=1

p`

(f(x, y(x,

∑Ll=1zl, µk), ω`) + dT z`

)

+ρk

L∑

`=1

ψ(− (N`x + M`y(x,

∑Ll=1zl, µk) + q` + z`)

),

ρk is a positive parameter, and ψ : <m → [0, +∞) is a smooth penalty function. Somespecific penalty functions will be given later. Note that the feasible region of problem(6.74) is common for all k.

Now we present our method, called SIP-II, for problem (6.68): Choose two sequencesµk and ρk of positive numbers. We then solve the problems (6.74) to get a sequence

Page 143: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.3 Method for Here-And-Now Problems 137

(x(k), z(k)1 , · · · , z(k)

L ) and let

y(k) = y(x(k),∑L

l=1z(k)l , µk).

In what follows, we let F2 and Z stand for the feasible regions of problems (6.69)and (6.74), respectively. Also, we use (6.8) and (6.71) to generate some related vectorssuch as y(k),y∗, z(k), z∗, and so on. In addition, we make the following assumption onSIP-II throughout this section:

A5: In SIP-II, the parameters µk and ρk are selected to satisfy

limk→∞

µk = 0, limk→∞

ρk = +∞, limk→∞

µkρk = 0. (6.75)

The following lemma is helpful in establishing convergence theory for SIP-II. Weomit its proof because it is similar to that of Lemma 6.2.

Lemma 6.3 Suppose that M is a P0-matrix in problem (6.68) and, for any bounded se-quence (xk, zk

1 , · · · , zkL) in Z, y(xk,

∑Ll=1z

kl , µk) is bounded. If (x∗, y∗, z∗1 , · · · , z∗L) ∈

F2 and the submatrix M [K∗] is nondegenerate, where K∗ = i | (Nx∗ + My∗ + q +∑Ll=1z

∗l )[i] = 0, there exist a neighborhood U∗ of (x∗, y∗, z∗1 , · · · , z∗L) and a positive

constant π∗ such that

‖y(x,∑L

l=1zl, µk)− y‖ ≤ µkπ∗(‖y‖+

√m) (6.76)

for any (x, y, z1, · · · , zL) ∈ U∗ ∩ F2 and any k.

We next investigate the limiting behavior of the sequence generated by SIP-II. Wewill show that SIP-II possesses similar convergence properties to SIP-I.

Theorem 6.6 Let M be a P0-matrix and ψ : <m → [0, +∞) be a continuously differ-entiable function satisfying ψ(0) = 0 and

ψ(t) ≤ ψ(t′), ∀t′ ≥ t ∈ <m, (6.77)

and, for any bounded sequence (xk, zk1 , · · · , zk

L) in Z, y(xk,∑L

l=1zkl , µk) be bounded.

Suppose that the sequence(x(k), y(k), z

(k)1 , · · · , z(k)

L )

generated by SIP-II with (x(k), z(k)1 ,

· · · , z(k)L ) being a local optimal solution of (6.74) is convergent to (x∗, y∗, z∗1 , · · · , z∗L) ∈

F2. If there exists a neighborhood V ∗ of (x∗, y∗, z∗1 , · · · , z∗L) such that (x(k), z(k)1 , · · · , z(k)

L )minimizes θk over V ∗|Z for all k large enough and the submatrix M [K∗] is nondegener-ate, where K∗ is the same as in Lemma 6.3, then (x∗, y∗, z∗1 , · · · , z∗L) is a local optimalsolution of (6.68).

Page 144: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

138 6. Smoothing Implicit Programming Approach for SMPECs

Proof: By Lemma 6.3, there exist a neighborhood U∗ ⊆ V ∗ of (x∗, y∗, z∗1 , · · · , z∗L)and a positive number π∗ such that (6.76) holds for any (x, y, z1, · · · , zL) ∈ U∗∩F2 andevery k. Let

F2,η =(x, y, z1, · · · , zL) ∈ U∗ ∩ F2

∣∣∣ ‖(x, y, z1, · · · , zL)− (x∗, y∗, z∗1 , · · · , z∗L)‖ ≤ η,

where η > 0 is a constant. Since F2,η is a nonempty compact set, the problem

minimizeL∑

`=1

p`

(f(x, y, ω`) + dT z`

)

subject to (x, y, z1, · · · , zL) ∈ F2,η (6.78)

has a global optimal solution, say (x, y, z1, · · · , zL).

Suppose (x, y, z1, · · · , zL) ∈ F2,η. We then have from the mean-value theorem that

θk(x, z1, · · · , zL) =L∑

`=1

p`

(f(x, y, ω`) + dT z`

+(y(x,∑L

l=1zl, µk)− y)T∇yf(x, (1− α)y(x,∑L

l=1zl, µk) + αy, ω`))

+ρk

L∑

`=1

ψ(− (N`x + M`y(x,

∑Ll=1zl, µk) + q` + z`)

), (6.79)

where α ∈ [0, 1]. Note that, by (6.76),

‖(1− α)y(x,∑L

l=1zl, µk) + αy‖ = ‖(1− α)(y(x,∑L

l=1zl, µk)− y) + y‖≤ ‖y(x,

∑Ll=1zl, µk)− y‖+ ‖y‖

≤ µkπ∗(‖y‖+

√m) + ‖y‖.

This indicates that the set(x, (1− α)y(x,

∑Ll=1zl, µk) + αy)

∣∣∣ (x, y, z1, · · · , zL) ∈ F2,η, α ∈ [0, 1], k = 1, 2, · · ·

is bounded. Similarly, we see(x, αM`(y − y(x,

∑Ll=1zl, µk)))

∣∣∣

(x, y, z1, · · · , zL) ∈ F2,η, ` = 1, · · · , L, α ∈ [0, 1], k = 1, 2, · · ·

is also bounded. Then, by the continuous differentiability of both f and ψ, there existsa constant τ > 0 such that

‖∇yf(x, (1− α)y(x,∑L

l=1zl, µk) + αy, ω`)‖ ≤ τ (6.80)

Page 145: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.3 Method for Here-And-Now Problems 139

and

‖∇ψ(αM`(y − y(x,

∑Ll=1zl, µk))

)‖ ≤ τ, ` = 1, · · · , L (6.81)

hold for any (x, y, z1, · · · , zL) ∈ F2,η, α ∈ [0, 1], and every k. Noticing that (x, y, z1, · · · ,zL) ∈ F2,η implies N`x + M`y + q` + z` ≥ 0 for each `, we have from (6.77) and (6.81)that

ψ(− (N`x + M`y(x,

∑Ll=1zl, µk) + q` + z`)

)

≤ ψ(M`(y − y(x,

∑Ll=1zl, µk))

)

= ψ(M`(y − y(x,

∑Ll=1zl, µk))

)− ψ(0)

= ∇ψ(α′M`(y − y(x,

∑Ll=1zl, µk))

)TM`

(y − y(x,

∑Ll=1zl, µk)

)

≤ τ‖M`‖ ‖y − y(x,∑L

l=1zl, µk)‖, (6.82)

where α′ ∈ [0, 1] and the second equality follows from the mean-value theorem. This,together with (6.79)–(6.80) and (6.76), yields

∣∣∣θk(x, z1, · · · , zL)−L∑

`=1

p`

(f(x, y, ω`) + dT z`

)∣∣∣

≤ τ‖y(x,∑L

l=1zl, µk)− y‖+(τρk

L∑

`=1

‖M`‖)‖y − y(x,

∑Ll=1zl, µk)‖

≤ π∗τ(µk + µkρk

L∑

`=1

‖M`‖)(‖y‖+

√m)

for any (x, y, z1, · · · , zL) ∈ F2,η and k. In particular,

∣∣∣θk(x, z1, · · · , zL)−L∑

`=1

p`

(f(x, y, ω`) + eT z`

)∣∣∣

≤ π∗τ(µk + µkρk

L∑

`=1

‖M`‖)(‖y‖+

√m). (6.83)

Moreover, since ψ is always nonnegative, we have from the continuity of f that

limk→∞

θk(x(k), z(k)1 , · · · , z(k)

L ) ≥L∑

`=1

p` limk→∞

(f(x(k), y(k), ω`) + dT z(k)

)

=L∑

`=1

p`

(f(x∗, y∗, ω`) + dT z∗

). (6.84)

Page 146: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

140 6. Smoothing Implicit Programming Approach for SMPECs

Note that, by the fact that U∗ ⊆ V ∗, (x(k), z(k)1 , · · · , z(k)

L ) is a global optimal solution ofproblem (6.74) when k is large enough, and (x, z1, · · · , zL) is a feasible point of (6.74).We then have from (6.83) that, for every k sufficiently large,

θk(x(k), z(k)1 , · · · , z(k)

L )

≤ θk(x, z1, · · · , zL)

≤L∑

`=1

p`

(f(x, y, ω`) + eT z`

)+ π∗τ

(µk + µkρk

L∑

`=1

‖M`‖)(‖y‖+

√m). (6.85)

Therefore, taking into account the equality (6.84) and Assumption A5, we have byletting k →∞ in (6.85) that

L∑

`=1

p`

(f(x∗, y∗, ω`) + dT z∗`

)≤

L∑

`=1

p`

(f(x, y, ω`) + eT z`

).

On the other hand, since (x, y, z1, · · · , zL) is a solution of problem (6.78), we have

L∑

`=1

p`

(f(x∗, y∗, ω`) + dT z∗`

)≥

L∑

`=1

p`

(f(x, y, ω`) + eT z`

).

It then follows thatL∑

`=1

p`

(f(x∗, y∗, ω`) + dT z∗`

)=

L∑

`=1

p`

(f(x, y, ω`) + eT z`

),

namely, (x∗, y∗, z∗1 , · · · , z∗L) is a global optimal solution of problem (6.78) and hence itis a local optimal solution of problem (6.68). This completes the proof.

It is not difficult to see that the function

ψ(y) :=m∑

i=1

(maxy[i], 0

)ν, y ∈ <m,

where ν ≥ 2 is a positive integer, satisfies the conditions assumed in Theorem 6.6. Thisfunction is often employed for solving constrained optimization problems. For moredetails, see [1].

Theorem 6.7 Let M be a P-matrix, the function ψ be the same as in Theorem 6.6 and,for any bounded sequence (xk, zk

1 , · · · , zkL) in Z, y(xk,

∑Ll=1z

kl , µk) be bounded. As-

sume that (x(k), z(k)1 , · · · , z(k)

L ) is a global optimal solution of (6.74) for each k. Then anyaccumulation point (x∗, y∗, z∗1 , · · · , z∗L) ∈ F2 of the sequence

(x(k), y(k), z

(k)1 , · · · , z(k)

L )

generated by SIP-II is a global optimal solution of problem (6.68).

We omit the proof of the above theorem since it is similar to that of Theorem 6.3.Next, we discuss the limiting behavior of stationary points of problems (6.74).

Page 147: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.3 Method for Here-And-Now Problems 141

Theorem 6.8 Suppose that M is a P0-matrix in problem (6.68), the function ψ :<m → [0, +∞) is given by ψ(y) := ‖max(y, 0)‖2, and (x(k), z

(k)1 , · · · , z(k)

L ) is a sta-tionary point of (6.74) for each k. Let (x∗, y∗, z∗1 , · · · , z∗L) ∈ F2 be an accumulationpoint of the sequence

(x(k), y(k), z

(k)1 , · · · , z(k)

L )

generated by SIP-II. If the MPEC-LICQ is satisfied at (x∗, y∗,y∗, z∗), then (x∗, y∗, z∗1 , · · · , z∗L) is a C-stationary point ofproblem (6.68). In particular, if y∗ satisfies the strict complementarity condition, then(x∗, y∗, z∗1 , · · · , z∗L) is S-stationary to (6.68).

Proof: Assume without loss of generality that the sequence(x(k), y(k), z

(k)1 , · · · ,

z(k)L )

converges to (x∗, y∗, z∗1 , · · · , z∗L). By the MPEC-LICQ assumption, problem

(6.74) satisfies the standard LICQ at (x(k), z(k)1 , · · · , z(k)

L ) for all k sufficiently large andso, by the stationarity of (x(k), z

(k)1 , · · · , z

(k)L ), there exist unique Lagrange multiplier

vectors λk, γk, and

αk :=

αk1...

αkL

such that

∇θk(x(k), z(k)1 , · · · , z(k)

L )

+

∇g(x(k))O...O

λk +

∇h(x(k))O...O

γk −

O · · · O

I · · · O...

. . ....

O · · · I

αk = 0, (6.86)

h(x(k)) = 0, 0 ≤ λk ⊥ g(x(k)) ≥ 0, (6.87)

0 ≤ αk` ⊥ z

(k)` ≥ 0, ` = 1, · · · , L. (6.88)

In the rest of the proof, we suppose k is large enough so that (6.86)–(6.88) and (6.47)are satisfied and furthermore, for each ` = 1, · · · , L, there hold

IkZ`

:=i

∣∣∣ z(k)` [i] = 0

⊆ I∗Z`

:=i

∣∣∣ z∗` [i] = 0, (6.89)

IkW`

:=i

∣∣∣(N`x

(k) + M`y(k) + q` + z

(k)`

)[i] = 0

⊆ I∗W`:=

i

∣∣∣ (N`x∗ + M`y

∗ + q` + z∗` )[i] = 0, (6.90)

IkY :=

i

∣∣∣ y(k)[i] = 0⊆ I∗Y :=

i

∣∣∣ y∗[i] = 0, (6.91)

IkW :=

i

∣∣∣(Nx(k) + My(k) + q +

∑Ll=1z

(k)l

)[i] = 0

⊆ I∗W :=i

∣∣∣ (Nx∗ + My∗ + q +∑L

l=1z∗l )[i] = 0

. (6.92)

Page 148: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

142 6. Smoothing Implicit Programming Approach for SMPECs

It is clear that

I∗W = ∩L`=1I∗W`

. (6.93)

Analogous to (6.48), we have from the implicit function theorem that

(∇(x,z)y(x(k),

∑Ll=1z

(k)l , µk)T

∇(x,z)w(x(k),∑L

l=1z(k)l , µk)T

)= −

(M + µkI −I

I −Dk Dk

)−1 (N I

O O

), (6.94)

where Dk := diag

(y(k)[1]

y(k)[1]+w(k)[1], · · · , y(k)[m]

y(k)[m]+w(k)[m]

). Since

(M + µkI −I

I −Dk Dk

)−1

=

(EkDk Ek

−I + (M + µkI)EkDk (M + µkI)Ek

)

with Ek :=(DkM + I − (1− µk)Dk

)−1, it follows from (6.94) that

∇(x,z)y(x(k),∑L

l=1z(k)l , µk) = −

(NT Dk(Ek)T

Dk(Ek)T

).

As a result, we have

∇xy(x(k),∑L

l=1z(k)l , µk) = −NT Dk(Ek)T , (6.95)

∇zy(x(k),∑L

l=1z(k)l , µk) = −Dk(Ek)T . (6.96)

Thus, from the definition of θk, (6.95)–(6.96), and by a straightforward calculus, (6.86)becomes

0 =

∑L`=1 p`(∇xf(x(k), y(k), ω`)−NT Dk(Ek)T∇yf(x(k), y(k), ω`))

p1d−∑L

`=1 p`Dk(Ek)T∇yf(x(k), y(k), ω`)

...pLd−∑L

`=1 p`Dk(Ek)T∇yf(x(k), y(k), ω`)

NT1 −NT Dk(Ek)T MT

1 · · · NTL −NT Dk(Ek)T MT

L

I −Dk(Ek)T MT1 · · · −Dk(Ek)T MT

L

−Dk(Ek)T MT1 · · · −Dk(Ek)T MT

L...

......

−Dk(Ek)T MT1 · · · I −Dk(Ek)T MT

L

βk

+

∇g(x(k))O...O

λk +

∇h(x(k))O...O

γk −

O · · · O

I · · · O...

. . ....

O · · · I

αk.

Page 149: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.3 Method for Here-And-Now Problems 143

Here, βk ∈ <n+mL is given by

βk :=

βk1...

βkL

with

βk` := 2ρk max

(− (N`x

(k) + M`y(k) + q` + z

(k)` ), 0

)(6.97)

for each `. We then have

0 =

∑L`=1 p`∇xf(x(k), y(k), ω`)

p1d...

pLd

+

∇g(x(k))O...O

λk +

∇h(x(k))O...O

γk

O · · · O

I · · · O...

. . ....

O · · · I

αk −

NT1 · · · NT

L

I · · · O...

. . ....

O · · · I

βk −

NT

I...I

vk, (6.98)

where vk is defined by

vk := Dk(Ek)TL∑

`=1

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

). (6.99)

Furthermore, by letting

L∑

`=1

p`∇yf(x(k), y(k), ω`)−L∑

`=1

MT` βk

` − uk −MT vk = 0, (6.100)

we have

uk =L∑

`=1

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

)−MT vk

=((Ek)−T −MT Dk

)(Ek)T

L∑

`=1

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

)

=(I − (1− µk)Dk

)(Ek)T

L∑

`=1

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

), (6.101)

Page 150: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

144 6. Smoothing Implicit Programming Approach for SMPECs

where the second equality follows from (6.99). Combining (6.98) and (6.100) yields

0 =

∑L`=1 p`∇xf(x(k), y(k), ω`)∑L`=1 p`∇yf(x(k), y(k), ω`)

p1d...

pLd

+

∇g(x(k))O

O...O

λk +

∇h(x(k))O

O...O

γk

O · · · O

O · · · O

I · · · O...

. . ....

O · · · I

αk −

NT1 · · · NT

L

MT1 · · · MT

L

I · · · O...

. . ....

O · · · I

βk −

O

I

O...O

uk −

NT

MT

I...I

vk.(6.102)

We can show that, when k is large sufficiently,

βk` [i] = 0 as long as i /∈ I∗W`

. (6.103)

In fact, if i /∈ I∗W`, namely, (N`x

∗+M`y∗+q` +z∗` )[i] > 0, then, when k is large enough,

there must hold (N`x(k) + M`y

(k) + q` + z(k)` )[i] > 0 and hence

βk` [i] = 2ρk max

− (N`x

(k) + M`y(k) + q` + z

(k)` )[i], 0

= 0.

Taking into account (6.47) and (6.87)–(6.92), we can rewrite (6.102) as

∑L`=1 p`∇xf(x(k), y(k), ω`)∑L`=1 p`∇yf(x(k), y(k), ω`)

p1d...

pLd

i/∈I∗Yuk[i]

0ei

0...0

i/∈I∗Wvk[i]

N [i]M [i]ei...ei

= −∑

i∈I∗gλk[i]

∇gi(x(k))0...0

s2∑

i=1

γk[i]

∇hi(x(k))0...0

+L∑

`=1

i∈I∗Z`

αk` [i]

0...ei...0

+L∑

`=1

i∈I∗W`

βk` [i]

N`[i]M`[i]

...ei...0

+∑

i∈I∗Yuk[i]

0ei

0...0

+∑

i∈I∗Wvk[i]

N [i]M [i]ei...ei

. (6.104)

Page 151: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

6.3 Method for Here-And-Now Problems 145

We next prove

i /∈ I∗Y ⇒ limk→∞

uk[i] = 0, (6.105)

i /∈ I∗W ⇒ limk→∞

vk[i] = 0. (6.106)

Note that, by (6.101) and (6.99),

uk[i] =µky

(k)[i] + w(k)[i]y(k)[i] + w(k)[i]

eTi (Ek)T

L∑

`=1

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

),(6.107)

vk[i] =y(k)[i]

y(k)[i] + w(k)[i]eTi (Ek)T

L∑

`=1

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

). (6.108)

In a similar way to the proof of Theorem 6.5, we can show that

∥∥∥(Ek)TL∑

`=1

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

)∥∥∥

is bounded. Then, it follows from (6.107) and (6.108) that

i /∈ I∗Y ⇒ limk→∞

µky(k)[i] + w(k)[i]

y(k)[i] + w(k)[i]= 0 ⇒ lim

k→∞uk[i] = 0,

i /∈ I∗W ⇒ limk→∞

y(k)[i]y(k)[i] + w(k)[i]

= 0 ⇒ limk→∞

vk[i] = 0.

By (6.105) and (6.106), the left-hand side of equality (6.104) is convergent. Moreover,from the assumption that the MPEC-LICQ holds at (x∗, y∗,y∗, z∗), we can prove thatall the sequences of the multipliers that appear on the right-hand side of (6.104) arebounded. In face, by letting uk

` :=(I−(1−µk)Dk

)(Ek)T

(p`∇yf(x(k), y(k), ω`)−MT

` βk`

)

for ` = 1, · · · , L,

vk :=

vk

...vk

∈ <mL, uk :=

uk1...

ukL

,

and

ak := uk + MT (βk + vk),

bk := uk,

ck := βk + vk,

we can obtain from (6.102) that

0 =

∑L`=1 p`∇xf(x(k), y(k), ω`)∑L`=1 p`∇yf(x(k), y(k), ω`)

0d

+

∇g(x(k))O

O

O

λk +

∇h(x(k))O

O

O

γk

Page 152: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

146 6. Smoothing Implicit Programming Approach for SMPECs

O

O

O

I

αk +

O

−DT

I

O

ak −

O

O

I

O

bk −

NT

O

MT

I

ck. (6.109)

Applying (6.105) and (6.106), it is not difficult to show that

y∗[i] > 0 ⇒ limk→∞

uk[i] = 0,

(Nx∗ + My∗ + q + z∗)[i] > 0 ⇒ limk→∞

vk[i] = 0.

From (6.93), (6.103), and the MPEC-LICQ assumption, we can see that all the se-quences of the multiplier vectors in (6.109) are bounded, which implies the bound-edness of the multiplier sequences that appear on the right-hand side of (6.104). Inconsequence, assuming these vector sequences are all convergent without loss of gen-erality and letting k → ∞ in (6.104), we may obtain the equality corresponding to(6.41).

In addition, since both y(k)[i] and w(k)[i] are positive, we have from (6.107) and(6.108) that

uk[i]vk[i] ≥ 0, i = 1, · · · ,m.

This together with (6.87)–(6.88) yields the results corresponding to (6.42). Therefore,(x∗, y∗1, · · · , y∗L) is a C-stationary point of problem (6.68). If, in addition, y∗ satisfiesthe strict complementarity condition, then (x∗, y∗, z∗1 , · · · , z∗L) is a S-stationary pointby the definitions of C-stationarity and S-stationarity immediately. This completes theproof of Theorem 6.8.

Note that similar discussion is valid for the following generalized problem:

minimizeL∑

`=1

p`

(f(x, y, ω`) + dT

` z`

)

subject to g(x) ≤ 0, h(x) = 0,

y ≥ 0, N`x + M`y + q` + A`z` ≥ 0, (6.110)

yT (N`x + M`y + q` + A`z`) = 0,

z` ≥ 0, ` = 1, · · · , L,

where, for each `, d` is a positive vector and A` ∈ <m×m is a recourse matrix corre-sponding to the recourse variable z`.

Page 153: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.1 Examples 147

6.4 Conclusions

In this chapter, we have presented smoothing implicit programming methods for stochas-tic mathematical programs with linear complementarity constraints, including both thelower-level wait-and-see and here-and-now cases. Comprehensive convergence theoryhas also been established. Recall that, as we mentioned in the first section, SMPECscontain the ordinary MPECs as a special subclass. As a result, all the conclusionsremain true for MPECs.

Page 154: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

148 7. Some Reformulations and Algorithms for SMPECs

Page 155: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

Chapter 7

Some Reformulations and

Algorithms for SMPECs

In this chapter, we consider the following here-and-now problem:

minimize f(x, y) + Eω[dT z(ω)]

subject to g(x, y) ≤ 0, h(x, y) = 0,

y ≥ 0, N(ω)x + M(ω)y + q(ω) + z(ω) ≥ 0, (7.1)

yT (N(ω)x + M(ω)y + q(ω) + z(ω)) = 0,

z(ω) ≥ 0, ω ∈ Ω,

where f : <n+m → <, g : <n+m → <s1 , and h : <n+m → <s2 are all continuouslydifferentiable, the data N(ω) ∈ <m×n, M(ω) ∈ <m×m, and q(ω) ∈ <m are randomvariables, z(ω) is the corresponding recourse variable, and the vector d ∈ <m is aconstant vector with positive elements. Note that g and h are functions of both theupper-level and the lower-level variables, unlike the problem dealt with in Chapter6.3 where g and h are functions of the variable x only. We will give some equivalentreformulations and propose some penalty methods for solving problem (7.1). Thenotations employed in this chapter are the same as the previous chapter.

7.1 Examples

Many problems can be formulated as this kind of models: The first example can befound in Section 6.1. Now we describe another example.

149

Page 156: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

150 7. Some Reformulations and Algorithms for SMPECs

Example 7.1 An ordinary linear complementarity problem is to find a vector y ∈ <m

such that

y ≥ 0, My + q ≥ 0, yT (My + q) = 0,

where M ∈ <m×m and q ∈ <m. In many practical problems, some elements mayinvolve uncertain data, which can be characterized by random variables. Therefore, itis meaningful to consider the following stochastic linear complementarity problem:

y ≥ 0, M(ω)y + q(ω) ≥ 0, yT (M(ω)y + q(ω)) = 0, ∀ω ∈ Ω. (7.2)

In general, there may not exist a vector y satisfying these complementarity conditionsfor all ω ∈ Ω. In order to get a reasonable resolution, we may introduce recoursevariables z(ω) ≥ 0 to the inequality M(ω)y + q(ω) ≥ 0 and try to find a vector y ≥ 0that minimizes the total recourse. Thus, we obtain the following problem

minimize Eω[dT z(ω)]

subject to 0 ≤ y ⊥ (M(ω)y + q(ω) + z(ω)) ≥ 0, (7.3)

z(ω) ≥ 0, ω ∈ Ω,

where d is a constant vector with positive elements. Problem (7.3) is obviously a specialcase of problem (7.1).

7.2 Reformulations

Because of the existence of the complementarity constraints, problem (7.1) does notsatisfy a standard constraint qualification such as the linear independence constraintqualification or the Mangasarian-Fromovitz constraint qualification at any feasible point[17]. Therefore, the conventional theory and algorithms in nonlinear programmingcannot be applied to this problem directly. In order to develop some effective methodsfor solving problem (7.1), we give some reformulations of (7.1) in the following.

At first, we define the function Q : <n ×<m × Ω → [0, +∞] by

Q(x, y, ω) := sup− (u + ty)T (N(ω)x + M(ω)y + q(ω))

∣∣∣ u + ty ≤ d, u ≥ 0

and consider the problem

minimize f(x, y) + Eω[Q(x, y, ω)] (7.4)

subject to g(x, y) ≤ 0, h(x, y) = 0, y ≥ 0,

Page 157: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.2 Reformulations 151

which will be shown to be equivalent to problem (7.1). In what follows, we let F1 andF2 denote the feasible regions of problems (7.1) and (7.4), respectively. We assumeboth problems (7.1) and (7.4) have an optimal solution. Moreover, we suppose f isbounded from below on F2.

7.2.1 Properties of the function Q

In order to show the equivalence between problems (7.4) and (7.1), we first give someproperties of the function Q.

Theorem 7.1 For any x ∈ <n, y ∈ <m, and ω ∈ Ω, Q(x, y, ω) < +∞ if and only ifthe set

Z(x, y, ω) :=

z(ω)

∣∣∣∣∣yT (N(ω)x + M(ω)y + q(ω) + z(ω)) = 0N(ω)x + M(ω)y + q(ω) + z(ω) ≥ 0, z(ω) ≥ 0

is nonempty; moreover, we have

Q(x, y, ω) = mindT z(ω)

∣∣∣ z(ω) ∈ Z(x, y, ω). (7.5)

We can prove the above theorem using the duality theorem in linear programmingimmediately. On the other hand, we may write

Z(x, y, ω) =

z(ω)

∣∣∣∣∣yT (N(ω)x + M(ω)y + q(ω) + z(ω)) ≤ 0N(ω)x + M(ω)y + q(ω) + z(ω) ≥ 0, z(ω) ≥ 0

=

z(ω)

∣∣∣∣∣yT z(ω) ≤ −yT (N(ω)x + M(ω)y + q(ω))z(ω) ≥ −(N(ω)x + M(ω)y + q(ω)), z(ω) ≥ 0

.

Therefore, when Q(x, y, ω) < +∞, we have from Theorem 7.1 and the duality theoremthat

Q(x, y, ω) = max− (u + ty)T (N(ω)x + M(ω)y + q(ω))

∣∣∣ u + ty ≤ d, u ≥ 0, t ≤ 0.

Furthermore, we have the the following result, which will be used in the subsequentanalysis.

Theorem 7.2 Let g(x, y) ≤ 0, h(x, y) = 0, y ≥ 0, and ω ∈ Ω. Then Q(x, y, ω) = +∞if and only if there exists an index i such that

(N(ω)x + M(ω)y + q(ω)

)[i] > 0, y[i] > 0. (7.6)

Page 158: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

152 7. Some Reformulations and Algorithms for SMPECs

Proof: “ if ”: Suppose that there exists an index i satisfying (7.6). Let t be a realnumber and u(t) ∈ <m be defined by

u(t) := ty[i]ei − ty.

Then, for any t ≤ 0, we have

u(t) ≥ 0, u(t) + ty ≤ d.

It follows from the definition of Q and (7.6) that

Q(x, y, ω) ≥ sup− (u(t) + ty)T (N(ω)x + M(ω)y + q(ω))

∣∣∣ t ≤ 0

= sup− ty[i](N(ω)x + M(ω)y + q(ω))[i]

∣∣∣ t ≤ 0

= +∞,

from which we obtain Q(x, y, ω) = +∞ immediately.

“ only if ”: First of all, we define

I1 :=

i∣∣∣ y[i] = 0

,

I2 :=

i∣∣∣ y[i] > 0, (N(ω)x + M(ω)y + q(ω))[i] = 0

,

I3 :=

i∣∣∣ y[i] > 0, (N(ω)x + M(ω)y + q(ω))[i] < 0

,

I4 :=

i∣∣∣ y[i] > 0, (N(ω)x + M(ω)y + q(ω))[i] > 0

.

These index sets are mutually disjoint and, since y ≥ 0, we see that

I1 ∪ I2 ∪ I3 ∪ I4 = 1, 2, · · · , n. (7.7)

Denote C := (u, t) | u + ty ≤ d, u ≥ 0. Then, we have

Q(x, y, ω) = sup−

m∑

i=1

(u + ty)[i](N(ω)x + M(ω)y + q(ω))[i]∣∣∣ (u, t) ∈ C

. (7.8)

(a) For any i ∈ I1 and any (u, t) ∈ C, we have 0 ≤ u[i] ≤ d[i]. It follows that the term

−∑

i∈I1

(u + ty)[i](N(ω)x + M(ω)y + q(ω))[i]

= −∑

i∈I1

u[i](N(ω)x + M(ω)y + q(ω))[i]

is bounded on C.

Page 159: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.2 Reformulations 153

(b) It is obvious that

−∑

i∈I2

(u + ty)[i](N(ω)x + M(ω)y + q(ω))[i] = 0.

(c) Let i ∈ I3. For any (u, t) ∈ C, we have

−∑

i∈I3

(u + ty)[i](N(ω)x + M(ω)y + q(ω))[i] ≤ −∑

i∈I3

d[i](N(ω)x + M(ω)y + q(ω))[i]

and so the term on the left-hand side is bounded above on C.

The facts (a)–(c) together with (7.7) and (7.8) indicate that, if Q(x, y, ω) = +∞, wemust have I4 6= ∅, from which we see that (7.6) must hold for some index i and hencethe proof is completed.

From Theorem 7.2, we have the following result immediately.

Corollary 7.1 Let g(x, y) ≤ 0, h(x, y) = 0, y ≥ 0, and ω ∈ Ω. Then Q(x, y, ω) < +∞if and only if

y[i](N(ω)x + M(ω)y + q(ω))[i] ≤ 0 (7.9)

holds for every i.

This corollary, together with Theorem 7.1, yields the next result.

Corollary 7.2 Let g(x, y) ≤ 0, h(x, y) = 0, y ≥ 0, and ω ∈ Ω. If Q(x, y, ω) < +∞,then

Q(x, y, ω) = dT z(x, y, ω), (7.10)

where

z(x, y, ω) := max−(N(ω)x + M(ω)y + q(ω)), 0. (7.11)

Proof: Noticing that (7.9) together with y ≥ 0 holds for each i, we can easily provethat z(x, y, ω) ∈ Z(x, y, ω). On the other hand, for any z(ω) ∈ Z(x, y, ω), since

z(ω) ≥ −(N(ω)x + M(ω)y + q(ω)), z(ω) ≥ 0,

we have from the definition (7.11) that

z(ω)− z(x, y, ω) ≥ 0,

Page 160: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

154 7. Some Reformulations and Algorithms for SMPECs

which implies that

dT z(ω) ≥ dT z(x, y, ω).

Since z(ω) ∈ Z(x, y, ω) is arbitrary, (7.10) follows form (7.5) at once.

We next show the equivalence between problems (7.1) and (7.4).

7.2.2 Continuous case

Let ω be a continuous random variable and p(ω) represent the probability densityfunction of ω. Suppose that the probability of any nonempty open set in Ω is positiveand (N(ω), M(ω), q(ω)) is continuous with respect to ω. Analogous to the discretecase, we have the following result:

Theorem 7.3 If (x∗, y∗) solves problem (7.4), then there exist z∗(ω) ∈ Z(x∗, y∗, ω),ω ∈ Ω, such that (x∗, y∗, z∗(ω))ω∈Ω solves problem (7.1). Conversely, if (x∗, y∗, z∗(ω))ω∈Ω

solves problem (7.1), then (x∗, y∗) solves (7.4).

Proof: (a) Suppose that (x∗, y∗) solves (7.4). Then we claim that

Q(x∗, y∗, ω) < +∞, ∀ω ∈ Ω. (7.12)

In fact, if Q(x∗, y∗, ω) = +∞ for some ω ∈ Ω, then, by Theorem 7.2, there exists anindex i such that

(N(ω)x∗ + M(ω)y∗ + q(ω)

)[i] > 0, y∗[i] > 0.

It follows from the continuity of (N(ω), M(ω), q(ω)) that there is a neighborhood U(ω)of ω such that

(N(ω)x∗ + M(ω)y∗ + q(ω)

)[i] > 0, y∗[i] > 0

hold for any ω ∈ U(ω). We then have from Theorem 7.2 that

Q(x∗, y∗, ω) = +∞, ∀ω ∈ U(ω).

Therefore,

f(x∗, y∗) + Eω[Q(x∗, y∗, ω)] ≥ f(x∗, y∗) +∫

U(ω)Q(x∗, y∗, ω)p(ω)dω = +∞.

This is a contradiction and hence (7.12) must hold. As a result, we have from Theorem7.1 and Corollary 7.2 that there exists z∗(ω) ∈ Z(x∗, y∗, ω) such that

Q(x∗, y∗, ω) = mindT z(ω)

∣∣∣ z(ω) ∈ Z(x∗, y∗, ω)

= dT z∗(ω).

Page 161: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.2 Reformulations 155

It then follows that, for any (x, y, z(ω))ω∈Ω ∈ F1,

f(x∗, y∗) + Eω[dT z∗(ω)] = f(x∗, y∗) + Eω[Q(x∗, y∗, ω)]

≤ f(x, y) + Eω[Q(x, y, ω)]

≤ f(x, y) + Eω[dT z(ω)], (7.13)

where the first inequality follows from the optimality of (x∗, y∗) and the last inequalityfollows from Theorem 7.1 and the fact that (x, y, z(ω))ω∈Ω ∈ F1 implies Z(x, y, ω) isnonempty (and hence there holds (7.5)) for any ω ∈ Ω. The inequality (7.13) meansthat (x∗, y∗, z∗(ω))ω∈Ω is an optimal solution of problem (7.1).

(b) Let (x∗, y∗, z∗(ω))ω∈Ω be an optimal solution of problem (7.1). Note that z∗(ω) ∈Z(x∗, y∗, ω) for any ω ∈ Ω. It then follows from Theorem 7.1 that

Q(x∗, y∗, ω) ≤ dT z∗(ω), ω ∈ Ω. (7.14)

We next show that (x∗, y∗) is a global optimal solution of problem (7.4), namely, forany (x, y) ∈ F2,

f(x∗, y∗) + Eω[Q(x∗, y∗, ω)] ≤ f(x, y) + Eω[Q(x, y, ω)]. (7.15)

Let (x, y) ∈ F2.

(b1) Suppose that Q(x, y, ω) < +∞ for every ω ∈ Ω and let z(x, y, ω) be the vectordefined in Corollary 7.2. By the same corollary, we see that z(x, y, ω) ∈ Z(x, y, ω`) andQ(x, y, ω) = dT z(x, y, ω). It is not difficult to see that (x, y, z(x, y, ω))ω∈Ω ∈ F1 and

f(x∗, y∗) + Eω[Q(x∗, y∗, ω)] ≤ f(x∗, y∗) + Eω[dT z∗(ω)]

≤ f(x, y) + Eω[dT z(x, y, ω)]

= f(x, y) + Eω[Q(x, y, ω)],

where the first inequality follows from (7.14) and the second inequality follows fromthe optimality of (x∗, y∗, z∗(ω))ω∈Ω to problem (7.1). So, (7.15) is valid in this case.

(b2) If Q(x, y, ω) = +∞ for some ω ∈ Ω, in a similar way to (a), we can show thatthere exists a neighborhood U(ω) of ω such that

Q(x, y, ω) = +∞, ∀ω ∈ U(ω).

It follows that Eω[Q(x, y, ω)] = +∞, which implies that (7.15) remains true.

Therefore, (x∗, y∗) is a global optimal solution of problem (7.4) and hence the proofof the theorem is completed.

Page 162: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

156 7. Some Reformulations and Algorithms for SMPECs

7.2.3 Discrete case

Suppose that Ω = ω1, ω2, · · · , ωL and, for each ` = 1, 2, · · · , L, the probability p` ofthe random event ω` is positive. Then, problems (7.1) and (7.4) reduce to

minimize f(x, y) +L∑

`=1

p`dT z`

subject to g(x, y) ≤ 0, h(x, y) = 0,

y ≥ 0, N`x + M`y + q` + z` ≥ 0, (7.16)

yT (N`x + M`y + q` + z`) = 0,

z` ≥ 0, ` = 1, 2, · · · , L

and

minimize f(x, y) +L∑

`=1

p`Q(x, y, ω`)

subject to g(x, y) ≤ 0, h(x, y) = 0, y ≥ 0, (7.17)

respectively. The following result shows that problem (7.17) is equivalent to the MPEC(7.16).

Theorem 7.4 If (x∗, y∗) solves problem (7.17), then there exist z∗` ∈ Z(x∗, y∗, ω`), ` =1, 2, · · · , L, such that (x∗, y∗, z∗1 , · · · , z∗L) solves the MPEC (7.16). Conversely, if (x∗, y∗,z∗1 , · · · , z∗L) solves the MPEC (7.16), then (x∗, y∗) solves problem (7.17).

Proof: (a) Suppose that (x∗, y∗) solves (7.17). By assumption, problem (7.17) hasa finite optimal value and furthermore, taking into account that each p` is positive, wesee that

Q(x∗, y∗, ω`) < +∞, ` = 1, 2, · · · , L.

Then, by Corollary 7.2, there exist z∗` ∈ Z(x∗, y∗, ω`), ` = 1, 2, · · · , L, such that (x∗, y∗,z∗1 , · · · , z∗L) is feasible to problem (7.16) and

Q(x∗, y∗, ω`) = dT z∗` , ` = 1, 2, · · · , L.

Here, we use the lower boundedness of the function dT z` on Z(x∗, y∗, ω`) and thepolyhedral convexity of the set Z(x∗, y∗, ω`). We next prove that (x∗, y∗, z∗1 , · · · , z∗L)solves problem (7.16).

Let (x, y, z1, · · · , zL) ∈ F1. This implies that z` ∈ Z(x, y, ω`) for each ` and hence,by Theorem 7.1, we have

Q(x, y, ω`) ≤ dT z`, ` = 1, 2, · · · , L.

Page 163: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.3 Further Discussions on Discrete Problems 157

It then follows that

f(x∗, y∗) +L∑

`=1

p`dT z∗` = f(x∗, y∗) +

L∑

`=1

p`Q(x∗, y∗, ω`)

≤ f(x, y) +L∑

`=1

p`Q(x, y, ω`)

≤ f(x, y) +L∑

`=1

p`dT z`,

which means that (x∗, y∗, z∗1 , · · · , z∗L) is an optimal solution of problem (7.16).

(b) Suppose that (x∗, y∗, z∗1 , · · · , z∗L) solves problem (7.16). Note that, z∗` ∈ Z(x∗, y∗, ω`),` = 1, · · · , L. It then follows from Theorem 7.1 that

Q(x∗, y∗, ω`) ≤ dT z∗` , ` = 1, 2, · · · , L. (7.18)

Let (x, y) ∈ F2. First we suppose that Q(x, y, ω`) < +∞ for every `. This meansthat Z(x, y, ω`) is nonempty for each `. For any z` ∈ Z(x, y, ω`), ` = 1, 2, · · · , L, we have(x, y, z1, · · · , zL) ∈ F1 and then, by (7.18) and the optimality of (x∗, y∗, z∗1 , · · · , z∗L),

f(x∗, y∗) +L∑

`=1

p`Q(x∗, y∗, ω`) ≤ f(x∗, y∗) +L∑

`=1

p`dT z∗`

≤ f(x, y) +L∑

`=1

p`dT z`.

Noticing that z` ∈ Z(x, y, ω`) is arbitrary for every ` and taking (7.5) into account, weobtain

f(x∗, y∗) +L∑

`=1

p`Q(x∗, y∗, ω`) ≤ f(x, y) +L∑

`=1

p`Q(x, y, ω`) (7.19)

for any (x, y) ∈ F2. If Q(x, y, ω`) = +∞ for some index `, (7.19) holds immediatelyfrom the positivity of pl. This means that (x∗, y∗) is an optimal solution of problem(7.17).

7.3 Further Discussions on Discrete Problems

We continue to discuss the discrete problem (7.16) in this section. Theorem 7.4 indicatesthat problem (7.16) is equivalent to problem (7.17) and furthermore, by Corollary 7.1

Page 164: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

158 7. Some Reformulations and Algorithms for SMPECs

and the positivity of every p`, problem (7.17) is equivalent to the problem

minimize f(x, y) +L∑

`=1

p`Q(x, y, ω`)

subject to g(x, y) ≤ 0, h(x, y) = 0, y ≥ 0, (7.20)

y[i](N`x + M`y + q`)[i] ≤ 0,

i = 1, · · · ,m, ` = 1, · · · , L.

Let F3 denote the feasible region of problem (7.20). We then have from Corollary 7.1that

Q(x, y, ω`) < +∞, ∀(x, y) ∈ F3, ∀`.

By Corollary 7.2, we see that, for any (x, y) ∈ F3 and any `,

Q(x, y, ω`) = dT z`(x, y)

with z`(x, y) = max(−(N`x + M`y + q`), 0). Let

θ(x, y) :=L∑

`=1

p`dT max

(− (N`x + M`y + q`), 0

). (7.21)

Then problem (7.20) may be rewritten as

minimize f(x, y) + θ(x, y)

subject to g(x, y) ≤ 0, h(x, y) = 0, y ≥ 0, (7.22)

y[i](N`x + M`y + q`)[i] ≤ 0,

i = 1, · · · ,m, ` = 1, · · · , L,

which is no longer an SMPEC.

Example 7.2 Consider the SMPEC

minimize p(z1[1] + z1[2]) + (1− p)(z2[1] + z2[2])

subject to z1[1] ≥ 0, z1[2] ≥ 0, z2[1] ≥ 0, z2[2] ≥ 0,

0 ≤(

y[1]y[2]

)⊥

(2y[1]− 3 + z1[1]y[2]− 5 + z1[2]

)≥ 0, (7.23)

0 ≤(

y[1]y[2]

)⊥

(2y[2]− 7 + z2[1]y[1]− 2 + z2[2]

)≥ 0,

Page 165: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.3 Further Discussions on Discrete Problems 159

where p is a constant such that 0 < p < 1. For this problem, problem (7.22) becomes

minimize p(

max3− 2y[1], 0+ max5− y[2], 0)

+(1− p)(

max7− 2y[2], 0+ max2− y[1], 0)

subject to y[1] ≥ 0, y[2] ≥ 0, (7.24)

y[1](2y[1]− 3) ≤ 0, y[2](y[2]− 5) ≤ 0,

y[1](2y[2]− 7) ≤ 0, y[2](y[1]− 2) ≤ 0.

Problem (7.24) has a unique solution y∗ = (32 , 7

2) for any p ∈ (0, 1) and it is easy toverify that the linear independence constraint qualification holds at y∗. This indicatesthat problem (7.24) has ordinary constraints, unlike the original SMPEC (7.23) thatdoes not satisfy any standard constraint qualification at any feasible point.

Problem (7.22) possesses the same optimal solution set as problem (7.16) in thesense of Theorem 7.4. However, it may not be easy to obtain a global optimal solutionof an optimization problem in practice, whereas computation of stationary points maybe relatively easy. Therefore, it is necessary to study the relation between the stationarypoints of problems (7.22) and (7.16). Recall that, for each `, N`[i] and M`[i] denotethe column vectors whose elements comprise the ith row of the matrices N` and M`,respectively.

Since θ is a nonsmooth convex function, we will use the following standard definitionof stationarity for problem (7.22):

Definition 7.1 A point (x∗, y∗) ∈ <n+m is said to be stationary to problem (7.22)if it is feasible to (7.22) and there exist Lagrange multiplier vectors λ∗, µ∗, ν∗, andξ∗` , ` = 1, · · · , L, such that

0 ∈ ∇f(x∗, y∗) + ∂θ(x∗, y∗) +∇g(x∗, y∗)λ∗ +∇h(x∗, y∗)µ∗ −(

O

I

)ν∗

+L∑

`=1

m∑

i=1

ξ∗` [i]((N`x

∗ + M`y∗ + q`)[i]

(0ei

)+ y∗[i]

(N`[i]M`[i]

) ), (7.25)

0 ≤ λ∗ ⊥ (−g(x∗, y∗)) ≥ 0, (7.26)

0 ≤ ν∗ ⊥ y∗ ≥ 0, (7.27)

0 ≤ ξ∗` [i] ⊥ (−y∗[i](N`x∗ + M`y

∗ + q`)[i]) ≥ 0. (7.28)

Here, ∂θ(x∗, y∗) stands for the subdifferential [74] of θ at the point (x∗, y∗).

For each ` and i, we let θ`,i(x, y) := max−(N`x + M`y + q`)[i], 0. It then follows

Page 166: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

160 7. Some Reformulations and Algorithms for SMPECs

that

∂θ`,i(x∗, y∗) =

co ( −N`[i]

−M`[i]

), 0

, (N`x

∗ + M`y∗ + q`)[i] = 0

( −N`[i]−M`[i]

) , (N`x

∗ + M`y∗ + q`)[i] < 0

0

, (N`x

∗ + M`y∗ + q`)[i] > 0

(7.29)

and

∂θ(x∗, y∗) =L∑

`=1

m∑

i=1

p` d[i] ∂θ`,i(x∗, y∗), (7.30)

where co stands for the convex hull.

On the other hand, it is easy to see that problem (7.16) is equivalent to the followingordinary MPEC:

minimize f(x, y) + dTz

subject to g(x, y) ≤ 0, h(x, y) = 0,

y −Dy = 0, z ≥ 0, (7.31)

y ≥ 0, Nx + My + q + z ≥ 0,

yT (Nx + My + q + z) = 0,

where y ∈ <mL, z ∈ <mL, and

d =

p1d...

pLd

, D =

I...I

, N =

N1...

NL

, M =

M1 O. . .

O ML

, q =

q1...

qL

.

Therefore, we may employ the same terminologies as in the literature on MPECs.Suppose that (x∗, y∗,y∗, z∗) is a feasible point of problem (7.31).

Definition 7.2 We say the MPEC-linear independence constraint qualification (MPEC-LICQ) to hold at (x∗, y∗,y∗, z∗) if the system

g(x, y) ≤ 0, h(x, y) = 0,

y −Dy = 0, z ≥ 0,

y ≥ 0, Nx + My + q + z ≥ 0

satisfies the linear independence constraint qualification (LICQ) at (x∗, y∗,y∗, z∗).

Page 167: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.3 Further Discussions on Discrete Problems 161

Definition 7.3 We say (x∗, y∗,y∗, z∗) to be a Bouligand or B-stationary point of theMPEC (7.31) if

vT

∇f(x∗, y∗)

0d

≥ 0, ∀v ∈ T (x∗, y∗,y∗, z∗),

where T (x∗, y∗,y∗, z∗) stands for the tangent cone of the feasible region of problem(7.31) at (x∗, y∗,y∗, z∗).

It is well-known [75] that, under the MPEC-LICQ, (x∗, y∗,y∗, z∗) is a B-stationarypoint of (7.31) if and only if there exist multiplier vectors λ, µ, ν, α, β, and γ such that

∇xf(x∗, y∗)∇yf(x∗, y∗)

0d

+

∇xg(x∗, y∗)∇yg(x∗, y∗)

O

O

λ +

∇xh(x∗, y∗)∇yh(x∗, y∗)

O

O

µ

+

O

−DT

I

O

ν −

O

O

O

I

α−

O

O

I

O

β −

NT

O

MT

I

γ = 0, (7.32)

0 ≤ λ ⊥ (−g(x∗, y∗)) ≥ 0, (7.33)

0 ≤ α ⊥ z∗ ≥ 0, (7.34)

y∗ ≥ 0 and y∗[i] > 0 ⇒ β[i] = 0, (7.35)

(Nx∗ + My∗ + q + z∗) ≥ 0 and (Nx∗ + My∗ + q + z∗)[i] > 0 ⇒ γ[i] = 0, (7.36)

β[i] ≥ 0, γ[i] ≥ 0, ∀i ∈ I∗ := i | y∗[i] = (Nx∗ + My∗ + q + z∗)[i] = 0. (7.37)

Moreover, for any feasible point (x, y) of problem (7.22), the point (x, y, z1, · · · , zL)with

z` = max(−(N`x + M`y + q`), 0), ` = 1, · · · , L

is a feasible point of problem (7.31) and this point is eligible in the sense that, if(x, y, z1, · · · , zL) is another feasible point of (7.31), then

f(x, y) +L∑

`=1

p`dT z` ≤ f(x, y) +

L∑

`=1

p`dT z`.

We further have the following theorem.

Theorem 7.5 Suppose that (x∗, y∗) is a feasible point of problem (7.22). Let

z∗` = max(−(N`x∗ + M`y

∗ + q`), 0), ` = 1, · · · , L (7.38)

Page 168: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

162 7. Some Reformulations and Algorithms for SMPECs

and

y∗ = ((y∗)T , · · · , (y∗)T )T , z∗ = ((z∗1)T , · · · , (z∗L)T )T . (7.39)

If (x∗, y∗,y∗, z∗) is a B-stationary point of the MPEC (7.31) and the MPEC-LICQholds at (x∗, y∗,y∗, z∗), then (x∗, y∗) is a stationary point of problem (7.22). Con-versely, if (x∗, y∗) is a stationary point of problem (7.22) and the MPEC-LICQ holdsat (x∗, y∗,y∗, z∗), then (x∗, y∗,y∗, z∗) is a B-stationary point of the MPEC (7.31).

Proof: (a) Suppose that (x∗, y∗,y∗, z∗) is a B-stationary point of problem (7.31)and the MPEC-LICQ holds at (x∗, y∗,y∗, z∗). Then there exist multiplier vectorsλ, µ, ν, α, β, and γ satisfying (7.32)–(7.37). We will show that there exist multipliervectors λ∗, µ∗, ν∗, and ξ∗` , ` = 1, · · · , L, such that conditions (7.25)–(7.28) hold. To thisend, we denote

ν :=

ν1...

νL

, α :=

α1...

αL

, β :=

β1...

βL

, γ :=

γ1...

γL

,

and let

λ∗ := λ, µ∗ := µ, (7.40)

and, for each ` and i,

ξ∗` [i] :=

α`[i]y∗[i] , y∗[i] > 0, (N`x

∗ + M`y∗ + q`)[i] = 0,

|∑L

`=1β`[i]|

(N`x∗+M`y∗+q`)[i], y∗[i] = 0, (N`x

∗ + M`y∗ + q`)[i] > 0,

0, otherwise,

(7.41)

σ∗` [i] :=

1− α`[i]

p`d[i] , y∗[i] = 0,1, y∗[i] > 0,

(7.42)

ν∗ :=L∑

`=1

β` +

∑L`=1 ξ∗` [1](N`x

∗ + M`y∗ + q`)[1]

...∑L

`=1 ξ∗` [m](N`x∗ + M`y

∗ + q`)[m]

. (7.43)

Note that (7.32) can be rewritten as

∇f(x∗, y∗) +∇g(x∗, y∗)λ +∇h(x∗, y∗)µ−L∑

`=1

(O

I

)ν` −

L∑

`=1

(NT

`

O

)γ` = 0, (7.44)

ν` − β` −MT` γ` = 0, ` = 1, · · · , L, (7.45)

p`d− α` − γ` = 0, ` = 1, · · · , L. (7.46)

Page 169: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

7.3 Further Discussions on Discrete Problems 163

Substituting (7.45) into (7.44), we obtain

∇f(x∗, y∗) +∇g(x∗, y∗)λ +∇h(x∗, y∗)µ−(

O

I

)L∑

`=1

β` −L∑

`=1

(NT

`

MT`

)γ` = 0,

or equivalently,

∇f(x∗, y∗) +∇g(x∗, y∗)λ +∇h(x∗, y∗)µ

−(

O

I

)L∑

`=1

β` −L∑

`=1

m∑

i=1

γ`[i]

(N`[i]M`[i]

)= 0. (7.47)

We next prove that

γ`[i] = p`d[i]σ∗` [i]− ξ∗` [i]y∗[i], ∀`, ∀i. (7.48)

Since (x∗, y∗) is feasible to problem (7.22), we only need to consider three cases (a1)–(a3):

(a1) Suppose y∗[i] > 0 and (N`x∗ + M`y

∗ + q`)[i] = 0. It then follows from (7.41),(7.42), and (7.46) that

p`d[i]σ∗` [i]− ξ∗` [i]y∗[i] = p`d[i]− α`[i] = γ`[i].

Namely, (7.48) holds in this case.

(a2) Suppose that y∗[i] > 0 and (N`x∗ + M`y

∗ + q`)[i] < 0. We then have from(7.38) that z∗` [i] ≥ −(N`x

∗ + M`y∗ + q`)[i] > 0. Since z∗[(` − 1)m + i] = z∗` [i] > 0, by

(7.34), we see that α`[i] = α[(` − 1)m + i] = 0. This, together with (7.41)–(7.42) and(7.46) yields

p`d[i]σ∗` [i]− ξ∗` [i]y∗[i] = p`d[i] = p`d[i]− α`[i] = γ`[i].

(a3) Suppose that y∗[i] = 0. It then follows from (7.41), (7.42), and (7.46) that

p`d[i]σ∗` [i]− ξ∗` [i]y∗[i] = p`d[i]− α`[i] = γ`[i].

Therefore, (7.48) holds in any case. Substituting (7.40), (7.43), and (7.48) into(7.47), we obtain

0 = ∇f(x∗, y∗) +∇g(x∗, y∗)λ∗ +∇h(x∗, y∗)µ∗ −(

O

I

)ν∗

+L∑

`=1

m∑

i=1

ξ∗` [i]((N`x

∗ + M`y∗ + q`)[i]

(0ei

)+ y∗[i]

(N`[i]M`[i]

) )

+L∑

`=1

m∑

i=1

p`d[i]σ∗` [i]

( −N`[i]−M`[i]

).

Page 170: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

164 7. Some Reformulations and Algorithms for SMPECs

So, in order to prove (7.25), we only need to show

L∑

`=1

m∑

i=1

p`d[i]σ∗` [i]

( −N`[i]−M`[i]

)∈ ∂θ(x∗, y∗),

which, by (7.29) and (7.30), is equivalent to

∀`, ∀i,

σ∗` [i] ∈ [0, 1], (N`x∗ + M`y

∗ + q`)[i] = 0,σ∗` [i] = 1 , (N`x

∗ + M`y∗ + q`)[i] < 0,

σ∗` [i] = 0 , (N`x∗ + M`y

∗ + q`)[i] > 0.

(7.49)

(a4) Suppose (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] = 0. Then, we have from (7.38) that
$$(N_\ell x^* + M_\ell y^* + q_\ell + z^*_\ell)[i] = 0.$$
If y∗[i] > 0, then σ∗_ℓ[i] = 1 ∈ [0, 1] by (7.42). If y∗[i] = 0, we have from (7.39) that
$$\mathbf{y}^*[(\ell-1)m+i] = y^*[i] = 0$$
and
$$(\mathbf{N}x^* + \mathbf{M}\mathbf{y}^* + \mathbf{q} + \mathbf{z}^*)[(\ell-1)m+i] = (N_\ell x^* + M_\ell y^* + q_\ell + z^*_\ell)[i] = 0.$$
Thus, by (7.37), γ_ℓ[i] = γ[(ℓ−1)m+i] ≥ 0 and hence, by (7.34) and (7.46),
$$0 \le \alpha_\ell[i] = p_\ell d[i] - \gamma_\ell[i] \le p_\ell d[i].$$
As a result, by (7.42), we obtain
$$\sigma^*_\ell[i] = 1 - \frac{\alpha_\ell[i]}{p_\ell d[i]} \in [0, 1].$$

(a5) Suppose (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] < 0. In a similar way to (a2), we can show that α_ℓ[i] = 0 and hence, by the definition (7.42) of σ∗_ℓ[i], we see that σ∗_ℓ[i] = 1.

(a6) Suppose (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] > 0. It then follows from the feasibility of (x∗, y∗) to problem (7.22) that y∗[i] = 0 and furthermore, by (7.38),
$$(\mathbf{N}x^* + \mathbf{M}\mathbf{y}^* + \mathbf{q} + \mathbf{z}^*)[(\ell-1)m+i] = (N_\ell x^* + M_\ell y^* + q_\ell + z^*_\ell)[i] = (N_\ell x^* + M_\ell y^* + q_\ell)[i] > 0.$$
From (7.36), we see that γ_ℓ[i] = γ[(ℓ−1)m+i] = 0. Thus, by (7.46), we obtain α_ℓ[i] = p_ℓd[i] and hence, by the definition (7.42) of σ∗_ℓ[i],
$$\sigma^*_\ell[i] = 1 - \frac{\alpha_\ell[i]}{p_\ell d[i]} = 0.$$


This completes the proof of (7.49) and hence (7.25) holds. Note that condition (7.26) follows from (7.33) and (7.40) immediately. In addition, the definition of ξ∗_ℓ implies
$$\xi^*_\ell[i] \ge 0, \qquad \xi^*_\ell[i]\,y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = 0, \quad \forall \ell,\ \forall i. \tag{7.50}$$
Since (x∗, y∗, z∗_1, ···, z∗_L) is feasible to problem (7.16), it then follows that, for any ℓ and i,
$$-y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = y^*[i]\,z^*_\ell[i] \ge 0,$$
which, together with (7.50), indicates that (7.28) holds. We next show (7.27).

At first, for every i, we have from (7.35), (7.50), and the definition (7.43) of ν∗ that
$$\nu^*[i]\,y^*[i] = \sum_{\ell=1}^L \beta[(\ell-1)m+i]\,\mathbf{y}^*[(\ell-1)m+i] + \sum_{\ell=1}^L \xi^*_\ell[i]\,y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = 0. \tag{7.51}$$

The rest is to prove ν∗ ≥ 0. If y∗[i] > 0, we see from (7.51) that ν∗[i] = 0. If y∗[i] = 0, we let
$$I(i) := \{\ell \mid (N_\ell x^* + M_\ell y^* + q_\ell)[i] > 0\}.$$
It follows from (7.41) and (7.43) that
$$\nu^*[i] = \sum_{\ell=1}^L \beta_\ell[i] + \sum_{\ell \in I(i)} \xi^*_\ell[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = \sum_{\ell=1}^L \beta_\ell[i] + n_1 \left|\sum_{\ell=1}^L \beta_\ell[i]\right|, \tag{7.52}$$

where n_1 denotes the cardinality of the index set I(i). If I(i) ≠ ∅, then ν∗[i] ≥ 0 by (7.52). If I(i) = ∅, it follows that
$$(N_\ell x^* + M_\ell y^* + q_\ell)[i] \le 0, \quad \forall \ell,$$
and then, by (7.38) and (7.39), there hold
$$\mathbf{y}^*[(\ell-1)m+i] = y^*[i] = 0$$
and
$$(\mathbf{N}x^* + \mathbf{M}\mathbf{y}^* + \mathbf{q} + \mathbf{z}^*)[(\ell-1)m+i] = (N_\ell x^* + M_\ell y^* + q_\ell + z^*_\ell)[i] = 0$$
for every ℓ. From (7.37) and (7.52), we obtain ν∗[i] = Σ_{ℓ=1}^L β_ℓ[i] ≥ 0. Thus, we have ν∗ ≥ 0 in any case. This completes the proof of (7.27) and hence (x∗, y∗) is a stationary point of problem (7.22).

(b) Suppose that (x∗, y∗) is a stationary point of problem (7.22) and the MPEC-LICQ holds at the point (x∗, y∗, y∗, z∗). By the stationarity of (x∗, y∗), there exist multiplier vectors λ∗, µ∗, ν∗, and ξ∗_ℓ, ℓ = 1, ···, L, satisfying conditions (7.25)–(7.28). We will prove that (x∗, y∗, y∗, z∗) is a B-stationary point of problem (7.31), i.e., there exist multiplier vectors λ, µ, ν, α, β, and γ such that (7.32)–(7.37) hold.

By (7.29) and (7.30), condition (7.25) means that there exist multiplier vectors σ∗_ℓ, ℓ = 1, ···, L, satisfying (7.49) and
$$\begin{aligned} 0 = {}& \nabla f(x^*, y^*) + \nabla g(x^*, y^*)\lambda^* + \nabla h(x^*, y^*)\mu^* - \begin{pmatrix} O \\ I \end{pmatrix}\nu^* \\ & + \sum_{\ell=1}^L \sum_{i=1}^m \xi^*_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) \\ & + \sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\sigma^*_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix}. \end{aligned} \tag{7.53}$$

Let
$$\lambda := \lambda^*, \tag{7.54}$$
$$\mu := \mu^*, \tag{7.55}$$
$$\alpha_\ell[i] := p_\ell d[i]\,(1 - \sigma^*_\ell[i]) + \xi^*_\ell[i]\,y^*[i], \tag{7.56}$$
$$\beta_\ell[i] := \frac{1}{L}\nu^*[i] - \xi^*_\ell[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i], \tag{7.57}$$
$$\gamma_\ell[i] := p_\ell d[i]\,\sigma^*_\ell[i] - \xi^*_\ell[i]\,y^*[i], \tag{7.58}$$
$$\nu_\ell := \beta_\ell + M_\ell^T\gamma_\ell, \tag{7.59}$$
and
$$\alpha := \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_L \end{pmatrix}, \qquad \beta := \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_L \end{pmatrix}, \qquad \gamma := \begin{pmatrix} \gamma_1 \\ \vdots \\ \gamma_L \end{pmatrix}, \qquad \nu := \begin{pmatrix} \nu_1 \\ \vdots \\ \nu_L \end{pmatrix}.$$

It is easy to verify (7.45)–(7.47) from (7.53)–(7.58) and so we obtain (7.32). In addition, condition (7.33) follows from (7.26) and (7.54) immediately. We next show (7.34)–(7.37). Note that, by (7.28) and (7.49), we have α ≥ 0. Moreover, by (7.38) and (7.39),


we have y∗ ≥ 0, z∗ ≥ 0, and N_ℓx∗ + M_ℓy∗ + q_ℓ + z∗_ℓ ≥ 0 for any ℓ, which in turn implies Nx∗ + My∗ + q + z∗ ≥ 0. Take an arbitrary index j with 1 ≤ j ≤ mL. Then, there exist ℓ and i such that
$$1 \le \ell \le L, \qquad 1 \le i \le m, \qquad j = (\ell-1)m + i.$$

(b1) Suppose z∗[j] > 0. This means z∗_ℓ[i] > 0. It then follows from (7.38) that (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] < 0 and hence we have
$$\sigma^*_\ell[i] = 1, \qquad \xi^*_\ell[i]\,y^*[i] = 0$$
from (7.49) and (7.28), respectively. This indicates α[j] = α_ℓ[i] = 0 and so (7.34) holds.

(b2) It is easy to see from (7.27) and (7.28) that
$$\beta[j]\,\mathbf{y}^*[j] = \beta_\ell[i]\,y^*[i] = \frac{1}{L}\nu^*[i]\,y^*[i] - \xi^*_\ell[i]\,y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = 0,$$
which indicates that (7.35) holds.

(b3) Suppose (Nx∗ + My∗ + q + z∗)[j] > 0. This means that (N_ℓx∗ + M_ℓy∗ + q_ℓ + z∗_ℓ)[i] > 0. It then follows from (7.38) that (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] > 0 and hence y∗[i] = 0. This indicates γ[j] = γ_ℓ[i] = 0 and therefore (7.36) holds.

(b4) Let I∗ be defined as in (7.37) and suppose j ∈ I∗. It is obvious that
$$\gamma[j] = p_\ell d[i]\,\sigma^*_\ell[i] \ge 0.$$
On the other hand, since (Nx∗ + My∗ + q + z∗)[j] = 0 implies that (N_ℓx∗ + M_ℓy∗ + q_ℓ + z∗_ℓ)[i] = 0, we see from (7.38) that (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] ≤ 0. Therefore,
$$\beta[j] = \beta_\ell[i] = \frac{1}{L}\nu^*[i] - \xi^*_\ell[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] \ge \frac{1}{L}\nu^*[i] \ge 0.$$
This indicates that (7.37) holds.

Thus, the multiplier vectors defined by (7.54)–(7.58) satisfy conditions (7.32)–(7.37) and hence (x∗, y∗, y∗, z∗) is a B-stationary point of problem (7.31).

7.4 Smoothed Penalty Methods for Discrete Problems

We have established the equivalence between problems (7.16) and (7.22). Note that, although (7.22) is no longer an SMPEC and the function θ is convex, this problem may not be easy to deal with because, firstly, the objective function is not differentiable everywhere and, secondly, since L is usually very large in practice, problem (7.22) has a great many constraints. We will propose a smoothed penalty method for solving problem (7.22) in this section.

The following functions will be used later on. Let ε be a nonnegative constant. The functions φ_ε : ℜ → [0, +∞) and ψ_ε : ℜ → [0, +∞) are defined by
$$\phi_\varepsilon(t) := \frac{\sqrt{t^2 + \varepsilon^2} + t}{2}$$
and
$$\psi_\varepsilon(t) := \sqrt{t^2 + \varepsilon^2},$$
respectively. It is obvious that, for any fixed t,
$$\lim_{\varepsilon \downarrow 0} \phi_\varepsilon(t) = \phi_0(t) = \max\{t, 0\}, \qquad \lim_{\varepsilon \downarrow 0} \psi_\varepsilon(t) = \psi_0(t) = |t|.$$
However, unlike the limit functions φ_0 and ψ_0, both φ_ε and ψ_ε are differentiable everywhere for each ε > 0.
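For concreteness, these smoothing functions and their derivatives, which appear later in (7.79) and (7.94), are immediate to implement. The following Python sketch is purely illustrative (the numerical experiments of Section 7.4.3 were carried out in MATLAB):

\begin{verbatim}
import numpy as np

def phi(t, eps):
    # phi_eps(t) = (sqrt(t^2 + eps^2) + t)/2, a smooth approximation of max{t, 0}
    return 0.5 * (np.sqrt(t**2 + eps**2) + t)

def psi(t, eps):
    # psi_eps(t) = sqrt(t^2 + eps^2), a smooth approximation of |t|
    return np.sqrt(t**2 + eps**2)

def phi_prime(t, eps):
    # derivative of phi_eps, cf. (7.79); takes values in (0, 1)
    return 0.5 * (t / np.sqrt(t**2 + eps**2) + 1.0)

def psi_prime(t, eps):
    # derivative of psi_eps, cf. (7.94); takes values in (-1, 1)
    return t / np.sqrt(t**2 + eps**2)
\end{verbatim}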

7.4.1 Smoothed penalty method (I)

By means of the differentiable function φ_ε and with the help of a smoothed penalty technique, we obtain the following smooth approximation of problem (7.22):
$$\begin{array}{ll} \text{minimize} & f(x, y) + \vartheta_\varepsilon(x, y) + \rho\,\delta_\varepsilon(x, y) \\ \text{subject to} & g(x, y) \le 0, \quad h(x, y) = 0, \quad y \ge 0, \end{array} \tag{7.60}$$
where ρ > 0 is a penalty parameter and
$$\vartheta_\varepsilon(x, y) := \sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\phi_\varepsilon(-(N_\ell x + M_\ell y + q_\ell)[i]), \tag{7.61}$$
$$\delta_\varepsilon(x, y) := \sum_{\ell=1}^L \sum_{i=1}^m \phi_\varepsilon(y[i]\,(N_\ell x + M_\ell y + q_\ell)[i]). \tag{7.62}$$
Note that, for any (x, y),
$$\vartheta_0(x, y) = \theta(x, y), \qquad \delta_0(x, y) = \sum_{\ell=1}^L \sum_{i=1}^m \max\{y[i]\,(N_\ell x + M_\ell y + q_\ell)[i],\, 0\}.$$
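Continuing the illustrative sketch above, the two penalty terms (7.61) and (7.62) can be evaluated directly from the scenario data; here the array arguments p, d, N, M, q are assumed to hold the probabilities p_ℓ, the vector d, and the scenario data N_ℓ, M_ℓ, q_ℓ:

\begin{verbatim}
def vartheta_eps(x, y, eps, p, d, N, M, q):
    # vartheta_eps(x, y) of (7.61): smoothed expected recourse cost
    val = 0.0
    for l in range(len(p)):
        w = N[l] @ x + M[l] @ y + q[l]       # (N_l x + M_l y + q_l)
        val += p[l] * np.dot(d, phi(-w, eps))
    return val

def delta_eps(x, y, eps, N, M, q):
    # delta_eps(x, y) of (7.62): smoothed penalty forcing
    # y[i]*(N_l x + M_l y + q_l)[i] <= 0 for every l and i
    val = 0.0
    for l in range(len(N)):
        w = N[l] @ x + M[l] @ y + q[l]
        val += np.sum(phi(y * w, eps))
    return val
\end{verbatim}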


Hence, when ε = 0 and ρ = ρ̄ > 0, problem (7.60) reduces to
$$\begin{array}{ll} \text{minimize} & f(x, y) + \theta(x, y) + \bar{\rho}\,\delta_0(x, y) \\ \text{subject to} & g(x, y) \le 0, \quad h(x, y) = 0, \quad y \ge 0. \end{array} \tag{7.63}$$
For problem (7.63), we will use a definition of stationarity similar to that for problem (7.22): We say that (x∗, y∗) ∈ ℜ^{n+m} is stationary to problem (7.63) if it is feasible and there exist Lagrange multiplier vectors λ, µ, and ν such that
$$0 \in \nabla f(x^*, y^*) + \partial\theta(x^*, y^*) + \bar{\rho}\,\partial\delta_0(x^*, y^*) + \nabla g(x^*, y^*)\lambda + \nabla h(x^*, y^*)\mu - \begin{pmatrix} O \\ I \end{pmatrix}\nu, \tag{7.64}$$
$$0 \le \lambda \perp (-g(x^*, y^*)) \ge 0, \tag{7.65}$$
$$0 \le \nu \perp y^* \ge 0, \tag{7.66}$$

where ∂δ_0 denotes the Clarke subdifferential operator [18]. Let
$$\delta_{\ell,i}(x, y) := \max\{y[i]\,(N_\ell x + M_\ell y + q_\ell)[i],\, 0\}, \quad \forall \ell,\ \forall i.$$

Then, since the functions δ_{ℓ,i} are Clarke regular [18], we have
$$\partial\delta_{\ell,i}(x, y) = \begin{cases} \operatorname{co}\left\{ (N_\ell x + M_\ell y + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix},\ 0 \right\}, & y[i]\,(N_\ell x + M_\ell y + q_\ell)[i] = 0, \\[2mm] \left\{ (N_\ell x + M_\ell y + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right\}, & y[i]\,(N_\ell x + M_\ell y + q_\ell)[i] > 0, \\[2mm] \{0\}, & y[i]\,(N_\ell x + M_\ell y + q_\ell)[i] < 0, \end{cases} \tag{7.67}$$
and
$$\partial\delta_0(x, y) = \sum_{\ell=1}^L \sum_{i=1}^m \partial\delta_{\ell,i}(x, y), \tag{7.68}$$
where ∂δ_{ℓ,i} denotes the Clarke subdifferential operator of δ_{ℓ,i}.

Theorem 7.6 Let (x∗, y∗) be a stationary point of problem (7.22). Then, (x∗, y∗) is a stationary point of problem (7.63) for any ρ̄ sufficiently large. Conversely, if (x∗, y∗) is a stationary point of problem (7.63) and δ_0(x∗, y∗) = 0, i.e., (x∗, y∗) ∈ F_3, then (x∗, y∗) is stationary to (7.22).


Proof: (a) Suppose (x∗, y∗) is a stationary point of problem (7.22). We will show that, when ρ̄ is sufficiently large, (x∗, y∗) is a stationary point of problem (7.63). By the stationarity of (x∗, y∗) to problem (7.22), there exist multiplier vectors λ∗, µ∗, ν∗, and ξ∗_ℓ, ℓ = 1, ···, L, satisfying conditions (7.25)–(7.28). Let
$$\lambda := \lambda^*, \qquad \mu := \mu^*, \qquad \nu := \nu^*.$$
Then (7.65) and (7.66) follow from (7.26) and (7.27) immediately. Comparing (7.64) with (7.25), in order to finish the proof, we only need to show that, when ρ̄ is sufficiently large,

$$\sum_{\ell=1}^L \sum_{i=1}^m \xi^*_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) \in \bar{\rho}\,\partial\delta_0(x^*, y^*).$$
By (7.67) and (7.68), it is sufficient to show that, when ρ̄ is sufficiently large,
$$\forall \ell,\ \forall i, \qquad \begin{cases} \xi^*_\ell[i] \in [0, \bar{\rho}], & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = 0, \\ \xi^*_\ell[i] = \bar{\rho}, & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] > 0, \\ \xi^*_\ell[i] = 0, & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] < 0. \end{cases}$$
Indeed, this follows from (7.28) and the fact that y∗[i] (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] ≤ 0 for each ℓ and i.

(b) Suppose (x∗, y∗) is a stationary point of problem (7.63) and δ_0(x∗, y∗) = 0. Then, there exist multiplier vectors λ, µ, and ν satisfying (7.64)–(7.66). Note that (x∗, y∗) ∈ F_3 implies
$$y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] \le 0, \quad \forall \ell,\ \forall i. \tag{7.69}$$

It then follows that
$$\partial\delta_{\ell,i}(x^*, y^*) = \begin{cases} \operatorname{co}\left\{ (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix},\ 0 \right\}, & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = 0, \\[2mm] \{0\}, & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] < 0, \end{cases}$$
for any ℓ and i, and
$$\partial\delta_0(x^*, y^*) = \sum_{\ell=1}^L \sum_{i=1}^m \partial\delta_{\ell,i}(x^*, y^*).$$

Condition (7.64) means that there exist multiplier vectors ξ_ℓ, ℓ = 1, ···, L, such that
$$\forall \ell,\ \forall i, \qquad \begin{cases} \xi_\ell[i] \in [0, \bar{\rho}], & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = 0, \\ \xi_\ell[i] = 0, & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] < 0, \end{cases} \tag{7.70}$$


and
$$\begin{aligned} 0 \in {}& \nabla f(x^*, y^*) + \partial\theta(x^*, y^*) + \nabla g(x^*, y^*)\lambda + \nabla h(x^*, y^*)\mu - \begin{pmatrix} O \\ I \end{pmatrix}\nu \\ & + \sum_{\ell=1}^L \sum_{i=1}^m \xi_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right). \end{aligned} \tag{7.71}$$
Let
$$\lambda^* := \lambda, \qquad \mu^* := \mu, \qquad \nu^* := \nu$$
and
$$\xi^*_\ell := \xi_\ell, \quad \ell = 1, \cdots, L.$$
Then (7.25)–(7.27) follow from (7.71) and (7.65)–(7.66), and (7.28) follows from (7.69)–(7.70). Therefore, (x∗, y∗) is a stationary point of problem (7.22).

We then have the following algorithm for problem (7.22).

Algorithm SP-I:

Step 1: Choose ε0 > 0 and ρ0 > 0. Set k := 0.

Step 2: Solve problem (7.60) with ε = εk and ρ = ρk to get a stationary point (xk, yk) and go to Step 3.

Step 3: If a stopping rule is satisfied, then terminate. Otherwise, choose εk+1 ∈ (0, εk) and ρk+1 ≥ ρk. Go to Step 2 with k := k + 1.
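The outer loop of Algorithm SP-I is straightforward to realize on top of a standard constrained solver. Below is a minimal, illustrative Python sketch in which scipy.optimize.minimize plays the role of the MATLAB solver fmincon used in Section 7.4.3; the helper obj_eps and the constraint maps g and h are hypothetical names assumed to be supplied by the caller, and the parameter update anticipates the rule used in the numerical experiments:

\begin{verbatim}
import numpy as np
from scipy.optimize import minimize

def sp_one(obj_eps, g, h, w0, eps0=1e-2, rho0=1e3, rho_bar=1e5, eps_min=1e-7):
    # obj_eps(w, eps, rho) evaluates f + vartheta_eps + rho*delta_eps at
    # w = (x, y); the bound y >= 0 can additionally be imposed through box
    # bounds on the y-part of w.
    eps, rho, w = eps0, rho0, np.asarray(w0, dtype=float)
    cons = ({'type': 'ineq', 'fun': lambda v: -g(v)},   # g(x, y) <= 0
            {'type': 'eq',   'fun': h})                 # h(x, y) = 0
    while eps > eps_min:                                # simple stopping rule
        res = minimize(lambda v: obj_eps(v, eps, rho), w,
                       method='SLSQP', constraints=cons)
        w = res.x                                       # warm start next solve
        eps, rho = 0.1 * eps, min(10.0 * rho, rho_bar)  # update of Sec. 7.4.3
    return w
\end{verbatim}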

In what follows, we suppose that the sequences {εk} and {ρk} satisfy
$$\lim_{k \to \infty} \varepsilon_k = 0, \qquad \lim_{k \to \infty} \rho_k = \bar{\rho}, \tag{7.72}$$
where ρ̄ is a sufficiently large constant. Recall that F_3 denotes the feasible region of problem (7.20), which is the same as that of problem (7.22).

We next investigate the limiting behavior of the sequence {(xk, yk)} generated by Algorithm SP-I. The convergence result can be stated as follows.

Theorem 7.7 Suppose that Algorithm SP-I generates a sequence {(xk, yk)} of stationary points of (7.60) with ε = εk and ρ = ρk. For any accumulation point (x∗, y∗) of the sequence {(xk, yk)}, if the system
$$g(x, y) \le 0, \qquad h(x, y) = 0, \qquad y \ge 0 \tag{7.73}$$
satisfies the Mangasarian-Fromovitz constraint qualification (MFCQ) at (x∗, y∗), then (x∗, y∗) is a stationary point of problem (7.63). Furthermore, if δ_0(x∗, y∗) = 0, then (x∗, y∗) is a stationary point of problem (7.22).

Proof: Assume without loss of generality that lim_{k→∞}(xk, yk) = (x∗, y∗). We will show that (x∗, y∗) is stationary to problem (7.63), i.e., there exist multiplier vectors λ, µ, and ν such that (7.64)–(7.66) hold.

First of all, by the stationarity of (xk, yk) for problem (7.60) with ε = εk and ρ = ρk, there exist Lagrange multiplier vectors λk, µk, and νk such that
$$\nabla f(x^k, y^k) + \nabla\vartheta_{\varepsilon_k}(x^k, y^k) + \rho_k \nabla\delta_{\varepsilon_k}(x^k, y^k) + \nabla g(x^k, y^k)\lambda^k + \nabla h(x^k, y^k)\mu^k - \begin{pmatrix} O \\ I \end{pmatrix}\nu^k = 0, \tag{7.74}$$
$$0 \le \lambda^k \perp (-g(x^k, y^k)) \ge 0, \tag{7.75}$$
$$0 \le \nu^k \perp y^k \ge 0. \tag{7.76}$$

Note that, by (7.61) and (7.62),
$$\nabla\vartheta_{\varepsilon_k}(x^k, y^k) = \sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\phi'_{\varepsilon_k}(-(N_\ell x^k + M_\ell y^k + q_\ell)[i]) \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix}, \tag{7.77}$$
$$\nabla\delta_{\varepsilon_k}(x^k, y^k) = \sum_{\ell=1}^L \sum_{i=1}^m \phi'_{\varepsilon_k}(y^k[i]\,(N_\ell x^k + M_\ell y^k + q_\ell)[i]) \left( (N_\ell x^k + M_\ell y^k + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^k[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right), \tag{7.78}$$
where
$$\phi'_{\varepsilon_k}(t) = \frac{1}{2}\left( \frac{t}{\sqrt{t^2 + \varepsilon_k^2}} + 1 \right), \quad \forall t \in \Re. \tag{7.79}$$

We can then rewrite (7.74) as
$$\begin{aligned} & -\nabla f(x^k, y^k) - \nabla g(x^k, y^k)\lambda^k - \nabla h(x^k, y^k)\mu^k + \begin{pmatrix} O \\ I \end{pmatrix}\nu^k \\ & \quad = \sum_{\ell=1}^L \sum_{i=1}^m \xi^k_\ell[i] \left( (N_\ell x^k + M_\ell y^k + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^k[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) + \sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\sigma^k_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix}, \end{aligned} \tag{7.80}$$
where
$$\xi^k_\ell[i] := \rho_k\,\phi'_{\varepsilon_k}(y^k[i]\,(N_\ell x^k + M_\ell y^k + q_\ell)[i]), \tag{7.81}$$
$$\sigma^k_\ell[i] := \phi'_{\varepsilon_k}(-(N_\ell x^k + M_\ell y^k + q_\ell)[i]). \tag{7.82}$$


We next prove that the sequences {λk}, {µk}, and {νk} are bounded. For the purpose of deriving a contradiction, let
$$\tau_k := \sum_{i=1}^{s_1} \lambda^k[i] + \sum_{i=1}^{s_2} |\mu^k[i]| + \sum_{i=1}^m \nu^k[i], \tag{7.83}$$
and suppose lim_{k→∞} τk = +∞. Taking a subsequence if necessary, we may assume that the limits
$$\lambda'[i] := \lim_{k \to \infty} \frac{\lambda^k[i]}{\tau_k}, \qquad \mu'[i] := \lim_{k \to \infty} \frac{\mu^k[i]}{\tau_k}, \qquad \nu'[i] := \lim_{k \to \infty} \frac{\nu^k[i]}{\tau_k}$$
exist for every i. It is clear from (7.83) that
$$\sum_{i=1}^{s_1} \lambda'[i] + \sum_{i=1}^{s_2} |\mu'[i]| + \sum_{i=1}^m \nu'[i] = 1.$$

It follows from (7.72), (7.79), and (7.81)–(7.82) that both {ξ^k_ℓ[i]} and {σ^k_ℓ[i]} are bounded for each ℓ and each i. Thus, dividing (7.80) by τk and taking a limit, we get
$$-\nabla g(x^*, y^*)\lambda' - \nabla h(x^*, y^*)\mu' + \begin{pmatrix} O \\ I \end{pmatrix}\nu' = 0.$$
Furthermore, taking (7.75) and (7.76) into account, we obtain λ' ≥ 0, ν' ≥ 0, and
$$\lambda'[i] = 0,\ i \notin I_g(x^*, y^*), \qquad \nu'[i] = 0,\ i \notin I_\pi(x^*, y^*),$$
where π : ℜ^{n+m} → ℜ^m is given by π(x, y) = y. Thus we have
$$\sum_{i \in I_g(x^*, y^*)} \lambda'[i] + \sum_{i=1}^{s_2} |\mu'[i]| + \sum_{i \in I_\pi(x^*, y^*)} \nu'[i] = 1,$$
$$\lambda'[i] \ge 0,\ i \in I_g(x^*, y^*), \qquad \nu'[i] \ge 0,\ i \in I_\pi(x^*, y^*),$$
and
$$-\sum_{i \in I_g(x^*, y^*)} \lambda'[i]\,\nabla g_i(x^*, y^*) - \sum_{i=1}^{s_2} \mu'[i]\,\nabla h_i(x^*, y^*) + \sum_{i \in I_\pi(x^*, y^*)} \nu'[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} = 0.$$

This contradicts the assumption that the system (7.73) satisfies the MFCQ at (x∗, y∗). Hence all the sequences {λk}, {µk}, and {νk} are bounded. Since {ξ^k_ℓ[i]} and {σ^k_ℓ[i]} are bounded for any ℓ and i, we may assume, without loss of generality, that the following limits exist:
$$\lambda := \lim_{k \to \infty} \lambda^k, \qquad \mu := \lim_{k \to \infty} \mu^k, \qquad \nu := \lim_{k \to \infty} \nu^k,$$
and
$$\xi_\ell := \lim_{k \to \infty} \xi^k_\ell, \qquad \sigma_\ell := \lim_{k \to \infty} \sigma^k_\ell, \quad \forall \ell.$$

Taking a limit in (7.75), (7.76), and (7.80), we obtain (7.65), (7.66), and
$$\begin{aligned} & -\nabla f(x^*, y^*) - \nabla g(x^*, y^*)\lambda - \nabla h(x^*, y^*)\mu + \begin{pmatrix} O \\ I \end{pmatrix}\nu \\ & \quad = \sum_{\ell=1}^L \sum_{i=1}^m \xi_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) + \sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix}. \end{aligned} \tag{7.84}$$
Thus, in order to show that (x∗, y∗) is a stationary point of problem (7.63), we only need to prove that the vector on the right-hand side of (7.84) belongs to the set ρ̄∂δ_0(x∗, y∗) + ∂θ(x∗, y∗).

(i) We first prove that
$$\sum_{\ell=1}^L \sum_{i=1}^m \xi_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) \in \bar{\rho}\,\partial\delta_0(x^*, y^*). \tag{7.85}$$
By (7.68), it is sufficient to show that, for any ℓ and i,
$$\xi_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) \in \bar{\rho}\,\partial\delta_{\ell,i}(x^*, y^*),$$
which, by (7.67), is equivalent to
$$\begin{cases} \xi_\ell[i] \in [0, \bar{\rho}], & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] = 0, \\ \xi_\ell[i] = \bar{\rho}, & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] > 0, \\ \xi_\ell[i] = 0, & y^*[i]\,(N_\ell x^* + M_\ell y^* + q_\ell)[i] < 0. \end{cases} \tag{7.86}$$

In fact, we can obtain (7.86) immediately from (7.72) and the facts that
$$\lim_{k \to \infty}(x^k, y^k) = (x^*, y^*)$$
and
$$\xi^k_\ell[i] = \frac{\rho_k}{2}\left( \frac{y^k[i]\,(N_\ell x^k + M_\ell y^k + q_\ell)[i]}{\sqrt{(y^k[i]\,(N_\ell x^k + M_\ell y^k + q_\ell)[i])^2 + \varepsilon_k^2}} + 1 \right),$$
and hence (7.85) must hold.

(ii) We next prove that
$$\sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix} \in \partial\theta(x^*, y^*).$$
By (7.30), it is enough to show
$$\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix} \in \partial\theta_{\ell,i}(x^*, y^*), \quad \forall \ell,\ \forall i. \tag{7.87}$$

There are three cases:

(iia) Suppose (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] = 0. We then have from (7.79) and (7.82) that 0 ≤ σ^k_ℓ[i] ≤ 1 for all k. Passing to the limit yields 0 ≤ σ_ℓ[i] ≤ 1 and hence
$$\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix} \in \operatorname{co}\left\{ \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix},\ 0 \right\} = \partial\theta_{\ell,i}(x^*, y^*),$$
where the equality follows from (7.29).

(iib) Suppose (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] < 0. Note that
$$\sigma^k_\ell[i] = \frac{1}{2}\left( \frac{-(N_\ell x^k + M_\ell y^k + q_\ell)[i]}{\sqrt{((N_\ell x^k + M_\ell y^k + q_\ell)[i])^2 + \varepsilon_k^2}} + 1 \right), \quad \forall k.$$
Taking a limit in the above equality, we obtain σ_ℓ[i] = 1 immediately and so
$$\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix} = \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix} \in \partial\theta_{\ell,i}(x^*, y^*).$$

(iic) Suppose (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] > 0. It is easy to show that, for any k,
$$0 \le \sigma^k_\ell[i] \le \frac{\varepsilon_k}{2\left( \sqrt{((N_\ell x^k + M_\ell y^k + q_\ell)[i])^2 + \varepsilon_k^2} + (N_\ell x^k + M_\ell y^k + q_\ell)[i] \right)}.$$
Letting k → ∞, we see that σ_ℓ[i] = 0 and so
$$\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix} = 0 \in \partial\theta_{\ell,i}(x^*, y^*).$$

Consequently, (7.87) holds in each case. Together, (i) and (ii) indicate that the vector on the right-hand side of (7.84) belongs to the set ρ̄∂δ_0(x∗, y∗) + ∂θ(x∗, y∗). This completes the proof of the first part of the theorem. The second half readily follows from Theorem 7.6.


7.4.2 Smoothed penalty method (II)

In the last subsection, making use of the function φ_ε, we obtained the smooth problem (7.60). By applying a similar smoothed penalty technique to the constraints of problem (7.60), we may further get the following unconstrained smooth problem:
$$\text{minimize} \quad f(x, y) + \vartheta_{\varepsilon_k}(x, y) + \rho_k\,\delta_{\varepsilon_k}(x, y) + \rho_k\,r_{\varepsilon_k}(x, y), \tag{7.88}$$
where
$$r_{\varepsilon_k}(x, y) := \sum_{i=1}^{s_1} \phi_{\varepsilon_k}(g_i(x, y)) + \sum_{i=1}^{s_2} \psi_{\varepsilon_k}(h_i(x, y)) + \sum_{i=1}^m \phi_{\varepsilon_k}(-y[i]). \tag{7.89}$$
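In the same illustrative Python style as before (reusing the functions phi and psi from the earlier sketch), the additional penalty term (7.89) could be evaluated as follows, where g and h are assumed to return the constraint values as arrays:

\begin{verbatim}
def r_eps(x, y, eps, g, h):
    # r_eps(x, y) of (7.89): smoothed penalty for the remaining constraints
    # g(x, y) <= 0, h(x, y) = 0, and y >= 0
    return (np.sum(phi(g(x, y), eps))
            + np.sum(psi(h(x, y), eps))
            + np.sum(phi(-y, eps)))
\end{verbatim}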

Corresponding to problem (7.63), we have a nonsmooth approximation of problem (7.22):
$$\text{minimize} \quad f(x, y) + \theta(x, y) + \bar{\rho}\,\delta_0(x, y) + \bar{\rho}\,r_0(x, y), \tag{7.90}$$
where
$$r_0(x, y) := \sum_{i=1}^{s_1} \max\{g_i(x, y), 0\} + \sum_{i=1}^{s_2} |h_i(x, y)| + \sum_{i=1}^m \max\{-y[i], 0\}.$$

We say that (x∗, y∗) ∈ ℜ^{n+m} is stationary to problem (7.90) if
$$0 \in \nabla f(x^*, y^*) + \partial\theta(x^*, y^*) + \bar{\rho}\,\partial\delta_0(x^*, y^*) + \bar{\rho}\,\partial r_0(x^*, y^*). \tag{7.91}$$

Theorem 7.8 Let (x∗, y∗) be a stationary point of problem (7.22). Then, (x∗, y∗) is a stationary point of problem (7.90) for any ρ̄ sufficiently large. Conversely, if (x∗, y∗) is a stationary point of problem (7.90) and δ_0(x∗, y∗) + r_0(x∗, y∗) = 0, i.e., (x∗, y∗) ∈ F_3, then (x∗, y∗) is stationary to problem (7.22).

The proof of this theorem is similar to that of Theorem 7.6 and so it is omitted here.

A new algorithm for solving problem (7.22) can be stated as follows.

Algorithm SP-II:

Step 1: Choose ε0 > 0 and ρ0 > 0. Set k := 0.

Step 2: Solve problem (7.88) to get a stationary point, say (xk, yk), and go to Step 3.

Step 3: If a stopping rule is satisfied, then terminate. Otherwise, choose εk+1 ∈ (0, εk) and ρk+1 ≥ ρk. Go to Step 2 with k := k + 1.
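Algorithm SP-II differs from SP-I only in that each subproblem is unconstrained. In the illustrative Python setting above, an unconstrained quasi-Newton solver (here BFGS, playing the role of the MATLAB solver fminunc used in Section 7.4.3) suffices; obj_eps is again a hypothetical helper supplied by the caller:

\begin{verbatim}
def sp_two(obj_eps, w0, eps0=1e-2, rho0=1e3, rho_bar=1e5, eps_min=1e-7):
    # obj_eps(w, eps, rho) evaluates the objective of (7.88),
    # f + vartheta_eps + rho*(delta_eps + r_eps), at w = (x, y)
    eps, rho, w = eps0, rho0, np.asarray(w0, dtype=float)
    while eps > eps_min:                                # simple stopping rule
        res = minimize(lambda v: obj_eps(v, eps, rho), w, method='BFGS')
        w = res.x
        eps, rho = 0.1 * eps, min(10.0 * rho, rho_bar)
    return w
\end{verbatim}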


We suppose that the sequences {εk} and {ρk} also satisfy condition (7.72). Then, we have the following convergence result for Algorithm SP-II.

Theorem 7.9 Suppose that Algorithm SP-II generates a sequence {(xk, yk)} of stationary points of (7.88). Then any accumulation point (x∗, y∗) of the sequence {(xk, yk)} is a stationary point of problem (7.90). Furthermore, if δ_0(x∗, y∗) + r_0(x∗, y∗) = 0, then (x∗, y∗) is a stationary point of problem (7.22).

Proof: Assume without loss of generality that lim_{k→∞}(xk, yk) = (x∗, y∗). Since the second part of the theorem follows from the first part and Theorem 7.8 directly, we only prove the first part of the theorem, namely, we will show (7.91).

Firstly, it follows from the stationarity of (xk, yk) that
$$\nabla f(x^k, y^k) + \nabla\vartheta_{\varepsilon_k}(x^k, y^k) + \rho_k \nabla\delta_{\varepsilon_k}(x^k, y^k) + \rho_k \nabla r_{\varepsilon_k}(x^k, y^k) = 0. \tag{7.92}$$
Note that, by (7.89),
$$\nabla r_{\varepsilon_k}(x^k, y^k) = \sum_{i=1}^{s_1} \phi'_{\varepsilon_k}(g_i(x^k, y^k))\,\nabla g_i(x^k, y^k) + \sum_{i=1}^{s_2} \psi'_{\varepsilon_k}(h_i(x^k, y^k))\,\nabla h_i(x^k, y^k) - \sum_{i=1}^m \phi'_{\varepsilon_k}(-y^k[i]) \begin{pmatrix} 0 \\ e_i \end{pmatrix}, \tag{7.93}$$
where φ'_{εk} is given by (7.79) and
$$\psi'_{\varepsilon_k}(t) = \frac{t}{\sqrt{t^2 + \varepsilon_k^2}}, \quad \forall t \in \Re. \tag{7.94}$$

Substituting (7.77), (7.78), and (7.93) into (7.92), we obtain
$$\begin{aligned} -\nabla f(x^k, y^k) = {}& \nabla g(x^k, y^k)\lambda^k + \nabla h(x^k, y^k)\mu^k - \begin{pmatrix} O \\ I \end{pmatrix}\nu^k \\ & + \sum_{\ell=1}^L \sum_{i=1}^m \xi^k_\ell[i] \left( (N_\ell x^k + M_\ell y^k + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^k[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) \\ & + \sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\sigma^k_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix}, \end{aligned} \tag{7.95}$$
where the multiplier vectors λk, µk, νk, ξ^k_ℓ, and σ^k_ℓ are given by
$$\lambda^k[i] := \rho_k\,\phi'_{\varepsilon_k}(g_i(x^k, y^k)), \quad i = 1, \cdots, s_1,$$
$$\mu^k[i] := \rho_k\,\psi'_{\varepsilon_k}(h_i(x^k, y^k)), \quad i = 1, \cdots, s_2,$$
$$\nu^k[i] := \rho_k\,\phi'_{\varepsilon_k}(-y^k[i]), \quad i = 1, \cdots, m,$$
$$\xi^k_\ell[i] := \rho_k\,\phi'_{\varepsilon_k}(y^k[i]\,(N_\ell x^k + M_\ell y^k + q_\ell)[i]), \quad i = 1, \cdots, m,\ \ell = 1, \cdots, L,$$
$$\sigma^k_\ell[i] := \phi'_{\varepsilon_k}(-(N_\ell x^k + M_\ell y^k + q_\ell)[i]), \quad i = 1, \cdots, m,\ \ell = 1, \cdots, L,$$
respectively. Since {ρk} is bounded, it follows from (7.79) and (7.94) that all the multiplier vectors are bounded. Therefore, we may assume without loss of generality that the following limits exist:

$$\lambda := \lim_{k \to \infty} \lambda^k, \qquad \mu := \lim_{k \to \infty} \mu^k, \qquad \nu := \lim_{k \to \infty} \nu^k,$$
and
$$\xi_\ell := \lim_{k \to \infty} \xi^k_\ell, \qquad \sigma_\ell := \lim_{k \to \infty} \sigma^k_\ell, \quad \forall \ell.$$

Taking a limit in (7.95) yields
$$\begin{aligned} -\nabla f(x^*, y^*) = {}& \nabla g(x^*, y^*)\lambda + \nabla h(x^*, y^*)\mu - \begin{pmatrix} O \\ I \end{pmatrix}\nu \\ & + \sum_{\ell=1}^L \sum_{i=1}^m \xi_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) \\ & + \sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix}. \end{aligned} \tag{7.96}$$
Thus, in order to show that (x∗, y∗) is a stationary point of problem (7.90), we only need to prove that the vector on the right-hand side of (7.96) belongs to the set
$$\bar{\rho}\,\partial r_0(x^*, y^*) + \bar{\rho}\,\partial\delta_0(x^*, y^*) + \partial\theta(x^*, y^*).$$

In a similar way to the proof of Theorem 7.7, we can show that
$$\sum_{\ell=1}^L \sum_{i=1}^m \xi_\ell[i] \left( (N_\ell x^* + M_\ell y^* + q_\ell)[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} + y^*[i] \begin{pmatrix} N_\ell[i] \\ M_\ell[i] \end{pmatrix} \right) \in \bar{\rho}\,\partial\delta_0(x^*, y^*)$$
and
$$\sum_{\ell=1}^L \sum_{i=1}^m p_\ell d[i]\,\sigma_\ell[i] \begin{pmatrix} -N_\ell[i] \\ -M_\ell[i] \end{pmatrix} \in \partial\theta(x^*, y^*).$$
Now let us prove that
$$\nabla g(x^*, y^*)\lambda + \nabla h(x^*, y^*)\mu - \begin{pmatrix} O \\ I \end{pmatrix}\nu \in \bar{\rho}\,\partial r_0(x^*, y^*). \tag{7.97}$$
To this end, it is enough to show
$$i = 1, \cdots, s_1, \qquad \begin{cases} \lambda[i] \in [0, \bar{\rho}], & g_i(x^*, y^*) = 0, \\ \lambda[i] = \bar{\rho}, & g_i(x^*, y^*) > 0, \\ \lambda[i] = 0, & g_i(x^*, y^*) < 0, \end{cases} \tag{7.98}$$
$$i = 1, \cdots, s_2, \qquad \begin{cases} \mu[i] \in [-\bar{\rho}, \bar{\rho}], & h_i(x^*, y^*) = 0, \\ \mu[i] = \bar{\rho}, & h_i(x^*, y^*) > 0, \\ \mu[i] = -\bar{\rho}, & h_i(x^*, y^*) < 0, \end{cases} \tag{7.99}$$
and
$$i = 1, \cdots, m, \qquad \begin{cases} \nu[i] \in [0, \bar{\rho}], & y^*[i] = 0, \\ \nu[i] = \bar{\rho}, & y^*[i] < 0, \\ \nu[i] = 0, & y^*[i] > 0. \end{cases} \tag{7.100}$$

In fact, (7.98)–(7.100) follow from (7.72) and the facts that
$$\lim_{k \to \infty}(x^k, y^k) = (x^*, y^*)$$
and
$$\lambda^k[i] = \frac{\rho_k}{2}\left( \frac{g_i(x^k, y^k)}{\sqrt{g_i(x^k, y^k)^2 + \varepsilon_k^2}} + 1 \right), \qquad \mu^k[i] = \frac{\rho_k\,h_i(x^k, y^k)}{\sqrt{h_i(x^k, y^k)^2 + \varepsilon_k^2}}, \qquad \nu^k[i] = \frac{\rho_k}{2}\left( \frac{-y^k[i]}{\sqrt{y^k[i]^2 + \varepsilon_k^2}} + 1 \right),$$
and hence (7.97) must hold. This completes the proof.

7.4.3 Numerical results

We have tested the proposed methods on Example 7.2. In our experiments, we employed the MATLAB 6.5 built-in solver fmincon to solve the constrained subproblems (7.60) and used fminunc to solve the unconstrained subproblems (7.88). For both SP-I and SP-II, we set ε_0 = 10^{-2} and ρ_0 = 10^3, and updated these parameters by ε_{k+1} = 10^{-1}ε_k and ρ_{k+1} = min{10ρ_k, ρ̄}, respectively. Moreover, the constants p and ρ̄ were set to 0.25 and 10^5, respectively. In addition, the initial point was chosen to be y^0 = (0, 0), and the computed solution y^k at the kth iteration was used as the starting point for the next iteration.

The computational results for Example 7.2 obtained by SP-I and SP-II are reported in Tables 7.1 and 7.2, respectively. In the tables, Ite stands for the number of iterations spent by fmincon or fminunc to solve the subproblems. The results shown in the tables reveal that the proposed methods are able to solve Example 7.2 successfully.

Page 186: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

180 8. Regularization Method for SMPECs

Table 7.1: Computational Results by SP-I

εk       ρk      yk                  Ite
10^{-2}  10^3    (1.4206, 3.4292)    22
10^{-3}  10^4    (1.4744, 3.4780)     6
10^{-4}  10^5    (1.4919, 3.4931)     7
10^{-5}  10^5    (1.4992, 3.4993)    12
10^{-6}  10^5    (1.4999, 3.4999)     6
10^{-7}  10^5    (1.5000, 3.5000)     6

Table 7.2: Computational Results by SP-II

εk       ρk      yk                  Ite
10^{-2}  10^3    (1.4171, 3.4339)    34
10^{-3}  10^4    (1.4744, 3.4780)     9
10^{-4}  10^5    (1.4919, 3.4931)    11
10^{-5}  10^5    (1.4992, 3.4993)    12
10^{-6}  10^5    (1.4999, 3.4999)     7
10^{-7}  10^5    (1.5000, 3.5000)    20

7.5 Conclusions

A class of stochastic mathematical programs with linear complementarity constraints, called the here-and-now model, has been dealt with in this chapter. We have presented a number of reformulations of the problem and then, based on these reformulations, proposed two smoothed penalty methods. A comprehensive convergence theory for the two methods has been established.


Chapter 8

Regularization Method for SMPECs

In this chapter, assuming that the underlying sample space Ω is discrete and finite, i.e., Ω = {ω_1, ω_2, ···, ω_L} for some integer L > 0, we consider the here-and-now model [52, 53] of SMPECs:
$$\begin{array}{ll} \text{minimize} & f(x, y) + \displaystyle\sum_{\ell=1}^L p_\ell\,d^T z_\ell \\ \text{subject to} & g(x, y) \le 0, \quad h(x, y) = 0, \\ & y \ge 0, \quad N_\ell x + M_\ell y + q_\ell + z_\ell \ge 0, \\ & y^T(N_\ell x + M_\ell y + q_\ell + z_\ell) = 0, \\ & z_\ell \ge 0, \quad \ell = 1, 2, \cdots, L. \end{array} \tag{8.1}$$
Here, the functions f : ℜ^{n+m} → ℜ, g : ℜ^{n+m} → ℜ^{s_1}, and h : ℜ^{n+m} → ℜ^{s_2} are all continuously differentiable, d ∈ ℜ^m is a constant vector with positive elements and, for each ℓ, N_ℓ := N(ω_ℓ) ∈ ℜ^{m×n}, M_ℓ := M(ω_ℓ) ∈ ℜ^{m×m}, and q_ℓ := q(ω_ℓ) ∈ ℜ^m are given matrices and vectors associated with the random event ω_ℓ, z_ℓ ∈ ℜ^m is a recourse variable, and p_ℓ denotes the probability of ω_ℓ, which is assumed to be positive throughout.

Based on some reformulations, two penalty methods have been proposed for solving problem (8.1) in Chapter 7. In addition, a smoothing implicit programming method incorporating a penalty technique has been suggested for solving a similar problem in Chapter 6. However, like the penalty methods in standard nonlinear programming, the methods suggested in the previous chapters cannot ensure the feasibility of a limit point of a generated sequence in general. In this chapter, we will present a regularization method for problem (8.1) and show that, under a quite weak condition, an accumulation point of the generated sequence is a feasible point of the original problem. We will also establish global convergence to an S-stationary point of the problem under additional assumptions.

8.1 Preliminaries and Regularization Method

In this section, we propose a regularization method for problem (8.1). We first recall some basic concepts. Since problem (8.1) is equivalent to the following ordinary MPEC (8.2), we will employ the same stationarity concepts as in the literature on MPECs:
$$\begin{array}{ll} \text{minimize} & f(x, y) + \mathbf{d}^T\mathbf{z} \\ \text{subject to} & g(x, y) \le 0, \quad h(x, y) = 0, \\ & \mathbf{y} - \mathbf{D}y = 0, \quad \mathbf{z} \ge 0, \\ & \mathbf{y} \ge 0, \quad \mathbf{N}x + \mathbf{M}\mathbf{y} + \mathbf{q} + \mathbf{z} \ge 0, \\ & \mathbf{y}^T(\mathbf{N}x + \mathbf{M}\mathbf{y} + \mathbf{q} + \mathbf{z}) = 0, \end{array} \tag{8.2}$$
where $\mathbf{y} := (y^T, \cdots, y^T)^T \in \Re^{mL}$, $\mathbf{z} := (z_1^T, \cdots, z_L^T)^T \in \Re^{mL}$, and
$$\mathbf{d} := \begin{pmatrix} p_1 d \\ \vdots \\ p_L d \end{pmatrix}, \quad \mathbf{D} := \begin{pmatrix} I \\ \vdots \\ I \end{pmatrix}, \quad \mathbf{N} := \begin{pmatrix} N_1 \\ \vdots \\ N_L \end{pmatrix}, \quad \mathbf{M} := \begin{pmatrix} M_1 & & O \\ & \ddots & \\ O & & M_L \end{pmatrix}, \quad \mathbf{q} := \begin{pmatrix} q_1 \\ \vdots \\ q_L \end{pmatrix}. \tag{8.3}$$
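The stacked data in (8.3) are purely mechanical to assemble. As an illustration in the same sketch style as in Chapter 7 (an assumption-laden aside, not code from the original experiments), given lists of per-scenario arrays N_ℓ, M_ℓ, q_ℓ and probabilities p_ℓ, one could build them with NumPy as follows:

\begin{verbatim}
import numpy as np

def stack_scenarios(p, d, N, M, q):
    # Build the stacked data of (8.3) from the scenario data N_l, M_l, q_l
    # and the probabilities p_l; d is assumed to have positive entries.
    L, m = len(p), len(d)
    d_big = np.concatenate([p_l * d for p_l in p])  # (p_1 d; ...; p_L d)
    D_big = np.vstack([np.eye(m)] * L)              # (I; ...; I)
    N_big = np.vstack(N)                            # (N_1; ...; N_L)
    M_big = np.zeros((m * L, m * L))                # block diagonal of M_l
    for l in range(L):
        M_big[l*m:(l+1)*m, l*m:(l+1)*m] = M[l]
    q_big = np.concatenate(q)                       # (q_1; ...; q_L)
    return d_big, D_big, N_big, M_big, q_big
\end{verbatim}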

Suppose that (x∗, y∗, y∗, z∗) is a feasible point of problem (8.2).

Definition 8.1 We say that (x∗, y∗, y∗, z∗) is a Bouligand, or B-stationary, point of the MPEC (8.2) if
$$v^T \begin{pmatrix} \nabla f(x^*, y^*) \\ 0 \\ \mathbf{d} \end{pmatrix} \ge 0, \quad \forall v \in \mathcal{T}(x^*, y^*, \mathbf{y}^*, \mathbf{z}^*),$$
where T(x∗, y∗, y∗, z∗) stands for the tangent cone of the feasible region of problem (8.2) at (x∗, y∗, y∗, z∗).

Definition 8.2 We say that (x∗, y∗, y∗, z∗) is a strongly, or S-stationary, point of (8.2) if there exist multiplier vectors λ, µ, ν, α, β, and γ such that
$$\begin{aligned} & \begin{pmatrix} \nabla_x f(x^*, y^*) \\ \nabla_y f(x^*, y^*) \\ 0 \\ \mathbf{d} \end{pmatrix} + \begin{pmatrix} \nabla_x g(x^*, y^*) \\ \nabla_y g(x^*, y^*) \\ O \\ O \end{pmatrix}\lambda + \begin{pmatrix} \nabla_x h(x^*, y^*) \\ \nabla_y h(x^*, y^*) \\ O \\ O \end{pmatrix}\mu \\ & \quad + \begin{pmatrix} O \\ -\mathbf{D}^T \\ I \\ O \end{pmatrix}\nu - \begin{pmatrix} O \\ O \\ O \\ I \end{pmatrix}\alpha - \begin{pmatrix} O \\ O \\ I \\ O \end{pmatrix}\beta - \begin{pmatrix} \mathbf{N}^T \\ O \\ \mathbf{M}^T \\ I \end{pmatrix}\gamma = 0, \end{aligned} \tag{8.4}$$
$$0 \le \lambda \perp (-g(x^*, y^*)) \ge 0, \tag{8.5}$$
$$0 \le \alpha \perp \mathbf{z}^* \ge 0, \tag{8.6}$$
$$\mathbf{y}^* \ge 0, \tag{8.7}$$
$$\mathbf{y}^*[i] > 0 \ \Rightarrow\ \beta[i] = 0, \tag{8.8}$$
$$\mathbf{N}x^* + \mathbf{M}\mathbf{y}^* + \mathbf{q} + \mathbf{z}^* \ge 0, \tag{8.9}$$
$$(\mathbf{N}x^* + \mathbf{M}\mathbf{y}^* + \mathbf{q} + \mathbf{z}^*)[i] > 0 \ \Rightarrow\ \gamma[i] = 0, \tag{8.10}$$
$$\beta[i] \ge 0,\ \gamma[i] \ge 0, \quad \forall i \in I^* := \{i \mid \mathbf{y}^*[i] = (\mathbf{N}x^* + \mathbf{M}\mathbf{y}^* + \mathbf{q} + \mathbf{z}^*)[i] = 0\}. \tag{8.11}$$

It is well known [75] that any S-stationary point of (8.2) must be a B-stationary point of problem (8.2). In order to find effective methods for solving problem (8.1), some equivalent reformulations of problem (8.1) have been introduced recently [53]. In particular, for any x ∈ ℜ^n, y ∈ ℜ^m, and each ℓ, it has been shown that the set
$$Z_\ell(x, y) := \left\{ z_\ell \;\middle|\; y^T(N_\ell x + M_\ell y + q_\ell + z_\ell) = 0,\ N_\ell x + M_\ell y + q_\ell + z_\ell \ge 0,\ z_\ell \ge 0 \right\}$$
is nonempty if and only if
$$Q_\ell(x, y) := \sup\left\{ -(u + ty)^T(N_\ell x + M_\ell y + q_\ell) \;\middle|\; u + ty \le d,\ u \ge 0,\ t \le 0 \right\}$$

is finite. Based on this observation, we obtain the model
$$\begin{array}{ll} \text{minimize} & f(x, y) + \displaystyle\sum_{\ell=1}^L p_\ell\,Q_\ell(x, y) \\ \text{subject to} & g(x, y) \le 0, \quad h(x, y) = 0, \quad y \ge 0. \end{array} \tag{8.12}$$

Furthermore, we have the following result.

Theorem 8.1 [53] If (x∗, y∗) solves problem (8.12), then there exist z∗_ℓ, ℓ = 1, 2, ···, L, such that (x∗, y∗, z∗_1, ···, z∗_L) solves problem (8.1). Conversely, if (x∗, y∗, z∗_1, ···, z∗_L) solves problem (8.1), then the point (x∗, y∗) solves problem (8.12).

In what follows, we denote by F_1 and F_2 the feasible regions of problems (8.1) and (8.12), respectively. The next result will be used later on.


Theorem 8.2 [53] Let g(x, y) ≤ 0, h(x, y) = 0, and y ≥ 0. Then the following statements are equivalent:

(i) Q_ℓ(x, y) < +∞ for every ℓ = 1, 2, ···, L.

(ii) For any i and any ℓ, there holds
$$y[i]\,(N_\ell x + M_\ell y + q_\ell)[i] \le 0. \tag{8.13}$$

(iii) The point (x, y, z_1, ···, z_L) with z_ℓ := max{−(N_ℓx + M_ℓy + q_ℓ), 0}, ℓ = 1, ···, L, is a feasible point of problem (8.1).
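Theorem 8.2 suggests a simple computational test: condition (8.13) can be checked scenario by scenario and, when it holds, the recourse vectors of statement (iii) are available in closed form. A hypothetical Python sketch, with the same array conventions as before:

\begin{verbatim}
def recourse_if_feasible(x, y, N, M, q):
    # Check (8.13) for every scenario l and, if it holds, return the recourse
    # vectors z_l = max{-(N_l x + M_l y + q_l), 0} of Theorem 8.2 (iii).
    z = []
    for l in range(len(N)):
        w = N[l] @ x + M[l] @ y + q[l]
        if np.any(y * w > 0):
            return None          # (8.13) violated: no feasible recourse exists
        z.append(np.maximum(-w, 0.0))
    return z
\end{verbatim}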

On the other hand, we note that, for every ℓ, the function Q_ℓ may be neither finite-valued nor differentiable everywhere in general. We next introduce a smooth approximation of this function. Let ε be a positive parameter. For each ℓ, we define the function Q^ε_ℓ : ℜ^n × ℜ^m → [0, +∞) as follows:
$$Q^\varepsilon_\ell(x, y) := \max\left\{ -(u + ty)^T(N_\ell x + M_\ell y + q_\ell) - \frac{\varepsilon}{2}(t^2 + \|u\|^2) \;\middle|\; u + ty \le d,\ u \ge 0,\ t \le 0 \right\}. \tag{8.14}$$

By the convex programming theory, we see that any Karush-Kuhn-Tucker point of the problem
$$\begin{array}{ll} \text{maximize} & -(u + ty)^T(N_\ell x + M_\ell y + q_\ell) - \dfrac{\varepsilon}{2}(t^2 + \|u\|^2) \\ \text{subject to} & u + ty \le d, \quad u \ge 0, \quad t \le 0 \end{array} \tag{8.15}$$
must be an optimal solution and, since ε > 0, problem (8.15) indeed has a unique optimal solution. This implies that the function Q^ε_ℓ is well-defined for each ℓ. We next show that Q^ε_ℓ is differentiable everywhere. To this end, let
$$g(x, y, u, t) := u + ty - d, \qquad h(x, y, u, t) := -u, \qquad c(x, y, u, t) := t,$$
and
$$L^\varepsilon_\ell(x, y, u, t, \zeta, \eta, \xi) := (u + ty)^T(N_\ell x + M_\ell y + q_\ell) + \frac{\varepsilon}{2}(t^2 + \|u\|^2) + \zeta^T g(x, y, u, t) + \eta^T h(x, y, u, t) + \xi\,c(x, y, u, t).$$
We then have $\nabla^2_{(u,t)} L^\varepsilon_\ell(x, y, u, t, \zeta, \eta, \xi) = \varepsilon I$.


Lemma 8.1 For any (x, y) ∈ ℜ^{n+m}, let u_ℓ := u(x, y) and t_ℓ := t(x, y) be the unique optimal solution of problem (8.15) and ζ_ℓ := ζ(x, y), η_ℓ := η(x, y), ξ_ℓ := ξ(x, y) be the corresponding Lagrange multiplier vectors. Then,

(a) for any (u, t) ∈ ℜ^{m+1}, there holds
$$(u, t)^T \nabla^2_{(u,t)} L^\varepsilon_\ell(x, y, u_\ell, t_\ell, \zeta_\ell, \eta_\ell, \xi_\ell)\,(u, t) \ge \varepsilon\,\|(u, t)\|^2;$$

(b) the linear independence constraint qualification is satisfied at (x, y, u_ℓ, t_ℓ), that is, the set of vectors
$$\left\{ \nabla_{(u,t)} g_i(x, y, u_\ell, t_\ell),\ \nabla_{(u,t)} h_j(x, y, u_\ell, t_\ell),\ \nabla_{(u,t)} c(x, y, u_\ell, t_\ell) \;\middle|\; i \in I_g(x, y, u_\ell, t_\ell),\ j \in I_h(x, y, u_\ell, t_\ell) \right\}$$
when t_ℓ = 0, or
$$\left\{ \nabla_{(u,t)} g_i(x, y, u_\ell, t_\ell),\ \nabla_{(u,t)} h_j(x, y, u_\ell, t_\ell) \;\middle|\; i \in I_g(x, y, u_\ell, t_\ell),\ j \in I_h(x, y, u_\ell, t_\ell) \right\}$$
when t_ℓ ≠ 0, is linearly independent.

Proof: It is obvious that (a) holds. Moreover, since for any index set I ⊆ {1, ···, m}, the set of vectors
$$\left\{ \nabla_{(u,t)} g_i(x, y, u_\ell, t_\ell),\ \nabla_{(u,t)} h_j(x, y, u_\ell, t_\ell),\ \nabla_{(u,t)} c(x, y, u_\ell, t_\ell) \;\middle|\; i \in I,\ j \notin I \right\}$$
must be linearly independent and, in addition, there always holds
$$I_g(x, y, u_\ell, t_\ell) \cap I_h(x, y, u_\ell, t_\ell) = \emptyset,$$
we then see that (b) is true.

Thus, by Theorem 2 of [4, Page 130], we have the following result immediately.

Theorem 8.3 The functions u(x, y), t(x, y), ζ(x, y), η(x, y), and ξ(x, y) given in Lemma 8.1 are well-defined and continuous. Furthermore, the function Q^ε_ℓ defined by (8.14) is differentiable everywhere and
$$\nabla Q^\varepsilon_\ell(x, y) = \begin{pmatrix} -N_\ell^T(u_\ell + t_\ell y) \\ -M_\ell^T(u_\ell + t_\ell y) - t_\ell(N_\ell x + M_\ell y + q_\ell) \end{pmatrix} + \begin{pmatrix} 0 \\ t_\ell \zeta_\ell \end{pmatrix}, \tag{8.16}$$
where u_ℓ, t_ℓ, and ζ_ℓ are the same as in Lemma 8.1.
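Since (8.15) is a strongly concave quadratic program over a polyhedron, evaluating Q^ε_ℓ numerically is routine. A minimal illustrative sketch, again in Python with scipy.optimize (the thesis itself does not prescribe a solver for this inner problem, so the choice of SLSQP here is an assumption):

\begin{verbatim}
from scipy.optimize import minimize

def Q_eps(x, y, eps, N_l, M_l, q_l, d):
    # Evaluate Q_l^eps(x, y) of (8.14) by solving the regularized concave
    # maximization (8.15), posed as a minimization over w = (u, t).
    m = len(d)
    v = N_l @ x + M_l @ y + q_l                     # (N_l x + M_l y + q_l)

    def neg_obj(w):
        u, t = w[:m], w[m]
        return np.dot(u + t * y, v) + 0.5 * eps * (t**2 + np.dot(u, u))

    cons = ({'type': 'ineq',
             'fun': lambda w: d - (w[:m] + w[m] * y)},  # u + t y <= d
            )
    bounds = [(0.0, None)] * m + [(None, 0.0)]          # u >= 0, t <= 0
    res = minimize(neg_obj, np.zeros(m + 1), method='SLSQP',
                   bounds=bounds, constraints=cons)
    return -res.fun
\end{verbatim}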


As a result, the problem
$$\begin{array}{ll} \text{minimize} & f(x, y) + \displaystyle\sum_{\ell=1}^L p_\ell\,Q^\varepsilon_\ell(x, y) \\ \text{subject to} & g(x, y) \le 0, \quad h(x, y) = 0, \quad y \ge 0 \end{array} \tag{8.17}$$
is a smooth approximation of problem (8.12). We then have the following algorithm.

Algorithm RA:

Step 1: Choose ε0 > 0 and set k := 0.

Step 2: Solve problem (8.17) with ε = εk to get a stationary point (xk, yk) and go to Step 3.

Step 3: If a stopping rule is satisfied, then terminate. Otherwise, choose an εk+1 ∈ (0, εk) and return to Step 2 with k := k + 1.
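As with Algorithms SP-I and SP-II, the outer loop of Algorithm RA is a short driver around a constrained solver. A minimal sketch in the same illustrative Python setting, where smooth_obj is a hypothetical helper that sums f and the p_ℓ-weighted values Q_l^eps from the previous sketch:

\begin{verbatim}
def regularization_algorithm(smooth_obj, g, h, w0, eps0=1e-2, eps_min=1e-7):
    # Outer loop of Algorithm RA: solve (8.17) for a decreasing sequence eps_k.
    # smooth_obj(w, eps) evaluates f(x, y) + sum_l p_l * Q_l^eps(x, y)
    # at w = (x, y).
    eps, w = eps0, np.asarray(w0, dtype=float)
    cons = ({'type': 'ineq', 'fun': lambda v: -g(v)},   # g(x, y) <= 0
            {'type': 'eq',   'fun': h})                 # h(x, y) = 0
    while eps > eps_min:                                # simple stopping rule
        res = minimize(lambda v: smooth_obj(v, eps), w,
                       method='SLSQP', constraints=cons)
        w = res.x
        eps *= 0.1
    return w
\end{verbatim}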

In what follows, we suppose that the sequence {εk} converges to 0 and, for simplicity, we denote Q^{εk}_ℓ by Q^k_ℓ for each k and ℓ. Recall that F_2 denotes the feasible region of problem (8.12), which is the same as that of problem (8.17).

8.2 Convergence Analysis

We will investigate the limiting behavior of the sequence generated by Algorithm RA in this section. Our first result is concerned with the feasibility of a limit point of the generated sequence, and can be stated as follows.

Theorem 8.4 Let {(xk, yk)} be a sequence generated by Algorithm RA and suppose that {Q^k_ℓ(xk, yk)} is bounded for each ℓ. Then, for any accumulation point (x∗, y∗) of the sequence {(xk, yk)}, the vector (x∗, y∗, z∗_1, ···, z∗_L) is feasible to problem (8.1), where
$$z^*_\ell := \max\{-(N_\ell x^* + M_\ell y^* + q_\ell),\, 0\}, \quad \ell = 1, \cdots, L.$$

Proof: Assume without loss of generality that lim_{k→∞}(xk, yk) = (x∗, y∗). It is obvious from the continuity of the functions g and h that
$$g(x^*, y^*) \le 0, \qquad h(x^*, y^*) = 0, \qquad y^* \ge 0.$$


Suppose that the assertion of the theorem does not hold. Then, by Theorem 8.2, there exist some ℓ and i such that
$$y^*[i] > 0, \qquad (N_\ell x^* + M_\ell y^* + q_\ell)[i] > 0.$$
Therefore, we can find a constant η > 0 and an integer k_0 > 0 such that
$$y^k[i] > \eta, \qquad (N_\ell x^k + M_\ell y^k + q_\ell)[i] > \eta, \quad \forall k \ge k_0. \tag{8.18}$$
For any t ≤ 0, we define u(t) := t y^k[i] e_i − t y^k. It is easy to see that
$$u(t) \ge 0, \qquad u(t) + t y^k \le d, \quad \forall t \le 0.$$
It then follows from the definition of Q^k_ℓ that
$$\begin{aligned} Q^k_\ell(x^k, y^k) &\ge \sup\left\{ -(u(t) + t y^k)^T(N_\ell x^k + M_\ell y^k + q_\ell) - \frac{\varepsilon_k}{2}(t^2 + \|u(t)\|^2) \;\middle|\; t \le 0 \right\} \\ &= \sup\left\{ -t\,y^k[i]\,(N_\ell x^k + M_\ell y^k + q_\ell)[i] - \frac{\varepsilon_k}{2}\,t^2\,(1 + \|y^k[i] e_i - y^k\|^2) \;\middle|\; t \le 0 \right\}. \end{aligned}$$
By straightforward calculus, we can show that, for any k ≥ k_0,
$$Q^k_\ell(x^k, y^k) \ge \frac{(y^k[i]\,(N_\ell x^k + M_\ell y^k + q_\ell)[i])^2}{2\varepsilon_k (1 + \|y^k[i] e_i - y^k\|^2)} \ge \frac{\eta^4}{2\varepsilon_k (1 + \|y^k[i] e_i - y^k\|^2)}, \tag{8.19}$$
where the second inequality follows from (8.18). Taking into account the fact that
$$\lim_{k \to \infty} \|y^k[i] e_i - y^k\| = \|y^*[i] e_i - y^*\|, \qquad \lim_{k \to \infty} \varepsilon_k = 0,$$
we see from (8.19) that the sequence {Q^k_ℓ(xk, yk)} is unbounded. This is a contradiction and hence there must be some vectors z∗_ℓ, ℓ = 1, 2, ···, L, such that (x∗, y∗, z∗_1, ···, z∗_L) is feasible to problem (8.1). This completes the proof.

The main convergence result can be stated as follows.

Theorem 8.5 Suppose that Algorithm RA generates a sequence {(xk, yk)} of stationary points of problem (8.17) and, for each k, (u^k_ℓ, t^k_ℓ) is the corresponding unique optimal solution of problem (8.15) with ε := εk. Assume that, for each ℓ, both {Q^k_ℓ(xk, yk)} and {t^k_ℓ} are bounded. Moreover, suppose that (x∗, y∗) is an accumulation point of the sequence {(xk, yk)} such that the system
$$g(x, y) \le 0, \qquad h(x, y) = 0, \qquad y \ge 0 \tag{8.20}$$
satisfies the MFCQ at (x∗, y∗), and let
$$z^*_\ell := \max\{-(N_\ell x^* + M_\ell y^* + q_\ell),\, 0\}, \quad \ell = 1, \cdots, L, \tag{8.21}$$
and
$$\mathbf{y}^* := ((y^*)^T, \cdots, (y^*)^T)^T, \qquad \mathbf{z}^* := ((z^*_1)^T, \cdots, (z^*_L)^T)^T. \tag{8.22}$$
Then (x∗, y∗, y∗, z∗) is an S-stationary point of problem (8.2).

Proof: Assume without loss of generality that lim_{k→∞}(xk, yk) = (x∗, y∗). From Theorem 8.4, we see (x∗, y∗, z∗_1, ···, z∗_L) ∈ F_1 and hence (x∗, y∗, y∗, z∗) is a feasible point of (8.2). We next show that (x∗, y∗, y∗, z∗) is S-stationary to problem (8.2), that is, there exist multiplier vectors λ, µ, ν, α, β, and γ such that (8.4)–(8.11) hold.

First of all, by the stationarity of (xk, yk) to (8.17), there exist Lagrange multiplier vectors a^k ∈ ℜ^{s_1}, b^k ∈ ℜ^{s_2}, and c^k ∈ ℜ^m such that
$$\nabla f(x^k, y^k) + \sum_{\ell=1}^L p_\ell \nabla Q^k_\ell(x^k, y^k) + \nabla g(x^k, y^k)a^k + \nabla h(x^k, y^k)b^k - \begin{pmatrix} O \\ I \end{pmatrix}c^k = 0, \tag{8.23}$$
$$0 \le a^k \perp (-g(x^k, y^k)) \ge 0, \tag{8.24}$$
$$0 \le c^k \perp y^k \ge 0. \tag{8.25}$$

From (8.16), we can rewrite (8.23) as
$$\nabla f(x^k, y^k) + \sum_{\ell=1}^L p_\ell \begin{pmatrix} -N_\ell^T(u^k_\ell + t^k_\ell y^k) \\ -M_\ell^T(u^k_\ell + t^k_\ell y^k) - t^k_\ell(N_\ell x^k + M_\ell y^k + q_\ell - \zeta^k_\ell) \end{pmatrix} + \nabla g(x^k, y^k)a^k + \nabla h(x^k, y^k)b^k - \begin{pmatrix} O \\ I \end{pmatrix}c^k = 0,$$

where ζ^k_ℓ := ζ(xk, yk) and the function ζ(x, y) is defined as in Lemma 8.1. This condition is further equivalent to
$$\begin{aligned} 0 = {}& \nabla f(x^k, y^k) + \nabla g(x^k, y^k)a^k + \nabla h(x^k, y^k)b^k - \begin{pmatrix} O \\ I \end{pmatrix}c^k \\ & - \sum_{\ell=1}^L p_\ell\,t^k_\ell \begin{pmatrix} O \\ I \end{pmatrix}(N_\ell x^k + M_\ell y^k + q_\ell - \zeta^k_\ell) - \sum_{\ell=1}^L p_\ell \begin{pmatrix} N_\ell^T \\ M_\ell^T \end{pmatrix}(u^k_\ell + t^k_\ell y^k). \end{aligned} \tag{8.26}$$

We next prove that the sequences {a^k}, {b^k}, and {c^k} are bounded. To this end, let
$$\rho_k := \sum_{i=1}^{s_1} a^k[i] + \sum_{i=1}^{s_2} |b^k[i]| + \sum_{i=1}^m c^k[i]. \tag{8.27}$$


Suppose that at least one of the sequences {a^k}, {b^k}, and {c^k} is unbounded. We then have lim_{k→∞} ρ_k = +∞ and, taking a subsequence if necessary, we may assume that the limits
$$a[i] := \lim_{k \to \infty} \frac{a^k[i]}{\rho_k}, \quad i = 1, \cdots, s_1, \qquad b[i] := \lim_{k \to \infty} \frac{b^k[i]}{\rho_k}, \quad i = 1, \cdots, s_2, \qquad c[i] := \lim_{k \to \infty} \frac{c^k[i]}{\rho_k}, \quad i = 1, \cdots, m$$
exist. It is clear from (8.27) that
$$\sum_{i=1}^{s_1} a[i] + \sum_{i=1}^{s_2} |b[i]| + \sum_{i=1}^m c[i] = 1.$$

For each ℓ, since {t^k_ℓ} is bounded and
$$0 \le u^k_\ell \le d - t^k_\ell\,y^k, \quad \forall k,$$
we see that {u^k_ℓ} is bounded. Moreover, by the continuity of the functions given in Theorem 8.3, {ζ^k_ℓ} is also bounded. Thus, dividing (8.26) by ρ_k and taking a limit, we get
$$\nabla g(x^*, y^*)a + \nabla h(x^*, y^*)b - \begin{pmatrix} O \\ I \end{pmatrix}c = 0.$$

Furthermore, taking (8.24) and (8.25) into account, we obtain a ≥ 0, c ≥ 0, and
$$a[i] = 0,\ i \notin I_g(x^*, y^*), \qquad c[i] = 0,\ i \notin I_\pi(x^*, y^*),$$
where π : ℜ^{n+m} → ℜ^m is given by π(x, y) := y. It follows that
$$\sum_{i \in I_g(x^*, y^*)} a[i] + \sum_{i=1}^{s_2} |b[i]| + \sum_{i \in I_\pi(x^*, y^*)} c[i] = 1,$$
$$a[i] \ge 0,\ i \in I_g(x^*, y^*), \qquad c[i] \ge 0,\ i \in I_\pi(x^*, y^*),$$
and
$$\sum_{i \in I_g(x^*, y^*)} a[i]\,\nabla g_i(x^*, y^*) + \sum_{i=1}^{s_2} b[i]\,\nabla h_i(x^*, y^*) - \sum_{i \in I_\pi(x^*, y^*)} c[i] \begin{pmatrix} 0 \\ e_i \end{pmatrix} = 0.$$


This contradicts the assumption that the system (8.20) satisfies the MFCQ at (x∗, y∗), and hence all the sequences {a^k}, {b^k}, and {c^k} are bounded. Recall that {t^k_ℓ}, {u^k_ℓ}, and {ζ^k_ℓ} are also bounded for each ℓ.

Now let us proceed to show (8.4)–(8.11) step by step. First we show (8.4) and (8.5). Let
$$\lambda^k := a^k, \tag{8.28}$$
$$\mu^k := b^k, \tag{8.29}$$
$$\alpha^k_\ell[i] := p_\ell\,(d[i] - u^k_\ell[i] - t^k_\ell\,y^k[i]), \tag{8.30}$$
$$\beta^k_\ell[i] := \frac{1}{L}c^k[i] + p_\ell\,t^k_\ell\,(N_\ell x^k + M_\ell y^k + q_\ell - \zeta^k_\ell)[i], \tag{8.31}$$
$$\gamma^k_\ell[i] := p_\ell\,(u^k_\ell[i] + t^k_\ell\,y^k[i]), \tag{8.32}$$
$$\nu^k_\ell := \beta^k_\ell + M_\ell^T\gamma^k_\ell, \tag{8.33}$$
and
$$\alpha^k := \begin{pmatrix} \alpha^k_1 \\ \vdots \\ \alpha^k_L \end{pmatrix}, \qquad \beta^k := \begin{pmatrix} \beta^k_1 \\ \vdots \\ \beta^k_L \end{pmatrix}, \qquad \gamma^k := \begin{pmatrix} \gamma^k_1 \\ \vdots \\ \gamma^k_L \end{pmatrix}, \qquad \nu^k := \begin{pmatrix} \nu^k_1 \\ \vdots \\ \nu^k_L \end{pmatrix}.$$

Then (8.26) can be rewritten as
$$\begin{aligned} 0 = {}& \begin{pmatrix} \nabla_x f(x^k, y^k) \\ \nabla_y f(x^k, y^k) \\ 0 \\ \mathbf{d} \end{pmatrix} + \begin{pmatrix} \nabla_x g(x^k, y^k) \\ \nabla_y g(x^k, y^k) \\ O \\ O \end{pmatrix}\lambda^k + \begin{pmatrix} \nabla_x h(x^k, y^k) \\ \nabla_y h(x^k, y^k) \\ O \\ O \end{pmatrix}\mu^k \\ & + \begin{pmatrix} O \\ -\mathbf{D}^T \\ I \\ O \end{pmatrix}\nu^k - \begin{pmatrix} O \\ O \\ O \\ I \end{pmatrix}\alpha^k - \begin{pmatrix} O \\ O \\ I \\ O \end{pmatrix}\beta^k - \begin{pmatrix} \mathbf{N}^T \\ O \\ \mathbf{M}^T \\ I \end{pmatrix}\gamma^k, \end{aligned} \tag{8.34}$$

where d, D, N, and M are defined as in (8.3). Since all the multiplier vectors are bounded, without loss of generality, we may assume that
$$\lambda := \lim_{k \to \infty} \lambda^k, \qquad \mu := \lim_{k \to \infty} \mu^k, \qquad \nu := \lim_{k \to \infty} \nu^k,$$
and
$$\alpha := \lim_{k \to \infty} \alpha^k, \qquad \beta := \lim_{k \to \infty} \beta^k, \qquad \gamma := \lim_{k \to \infty} \gamma^k.$$
Taking a limit in (8.34), we obtain (8.4) immediately. Moreover, we have (8.5) from (8.24) and (8.28) by letting k → ∞.


We next prove (8.6)–(8.11). To this end, we let
$$z^k_\ell := \max\{-(N_\ell x^k + M_\ell y^k + q_\ell),\, 0\}, \quad \ell = 1, \cdots, L, \tag{8.35}$$
and
$$\mathbf{y}^k := ((y^k)^T, \cdots, (y^k)^T)^T, \qquad \mathbf{z}^k := ((z^k_1)^T, \cdots, (z^k_L)^T)^T. \tag{8.36}$$
Recall that, for each ℓ and each k, (u^k_ℓ, t^k_ℓ) is a Karush-Kuhn-Tucker point of problem (8.15) with ε = εk. Thus, the Lagrange multiplier vectors ζ^k_ℓ, η^k_ℓ, and ξ^k_ℓ satisfy
$$(N_\ell x^k + M_\ell y^k + q_\ell) + \varepsilon_k u^k_\ell + \zeta^k_\ell - \eta^k_\ell = 0, \tag{8.37}$$
$$(y^k)^T(N_\ell x^k + M_\ell y^k + q_\ell) + \varepsilon_k t^k_\ell + (y^k)^T\zeta^k_\ell + \xi^k_\ell = 0, \tag{8.38}$$
$$0 \le \zeta^k_\ell \perp (d - u^k_\ell - t^k_\ell\,y^k) \ge 0, \tag{8.39}$$
$$0 \le \eta^k_\ell \perp u^k_\ell \ge 0, \tag{8.40}$$
$$0 \le \xi^k_\ell \perp (-t^k_\ell) \ge 0. \tag{8.41}$$
Moreover, for any index j with 1 ≤ j ≤ mL, there exist ℓ and i such that
$$1 \le \ell \le L, \qquad 1 \le i \le m, \qquad j = (\ell-1)m + i. \tag{8.42}$$

It is obvious from the definitions (8.30), (8.35), and (8.36) that αk ≥ 0 and zk ≥ 0 for every k. Taking a limit, we obtain α ≥ 0 and z∗ ≥ 0. Suppose z∗[j] > 0 and let ℓ and i satisfy (8.42). Then, since z∗_ℓ[i] > 0, it follows from (8.21) that (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] < 0. We then have from (8.37) and (8.40) that
$$\zeta^k_\ell[i] \ge \zeta^k_\ell[i] - \eta^k_\ell[i] = -(N_\ell x^k + M_\ell y^k + q_\ell)[i] - \varepsilon_k u^k_\ell[i] \to -(N_\ell x^* + M_\ell y^* + q_\ell)[i] > 0.$$
This implies that ζ^k_ℓ[i] > 0 when k is sufficiently large and so, by (8.39),
$$u^k_\ell[i] + t^k_\ell\,y^k[i] = d[i].$$
Therefore, we have from the definition (8.30) that αk[j] = αk_ℓ[i] = 0 for all k sufficiently large. By taking a limit, we have α[j] = 0 and hence (8.6) holds.

It is easy to see that y∗ ≥ 0. Suppose y∗[j] > 0 and let ℓ and i satisfy (8.42). Then, since y∗[j] = y∗[i] > 0, by Theorem 8.2, there must hold (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] = 0. Moreover, we have yk[i] > 0 when k is sufficiently large. Thus, it follows from (8.25) that ck[i] = 0 for all k sufficiently large. In addition, it follows from Theorem 8.3 that the sequences {ζ^k_ℓ[i]} and {η^k_ℓ[i]} are bounded. Taking a subsequence if necessary, we may assume that both sequences are convergent. We claim that {ζ^k_ℓ[i]} converges to 0. Otherwise, we have from (8.37) that the limit of {η^k_ℓ[i]} must be positive. This means that, when k is sufficiently large, there holds η^k_ℓ[i] > 0 and so, by (8.40), u^k_ℓ[i] = 0. Therefore, we have that, when k is sufficiently large,
$$d[i] - u^k_\ell[i] - t^k_\ell\,y^k[i] = d[i] - t^k_\ell\,y^k[i] \ge d[i] > 0$$
and hence, by (8.39), ζ^k_ℓ[i] = 0. This is a contradiction. As a result, the sequence {ζ^k_ℓ[i]} must converge to 0. Thus, taking a limit in (8.31) and noting that {t^k_ℓ} is bounded, we obtain
$$\beta[j] = \lim_{k \to \infty} \beta^k[j] = \lim_{k \to \infty} \beta^k_\ell[i] = 0.$$
This shows (8.7) and (8.8).

It is easy to see that Nx∗ + My∗ + q + z∗ ≥ 0. Suppose (Nx∗ + My∗ + q + z∗)[j] > 0 and let ℓ and i satisfy (8.42). Then, since (N_ℓx∗ + M_ℓy∗ + q_ℓ + z∗_ℓ)[i] > 0, it follows from (8.21) that (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] > 0 and hence, by Theorem 8.2, y∗[i] = 0. Moreover, we have from (8.37) and (8.39)–(8.40) that
$$\eta^k_\ell[i] = (N_\ell x^k + M_\ell y^k + q_\ell)[i] + \varepsilon_k u^k_\ell[i] + \zeta^k_\ell[i] \ge (N_\ell x^k + M_\ell y^k + q_\ell)[i] \to (N_\ell x^* + M_\ell y^* + q_\ell)[i] > 0.$$
In consequence, η^k_ℓ[i] > 0 when k is sufficiently large and then, by (8.40), we have u^k_ℓ[i] = 0. Taking a limit in (8.32) and noting that {t^k_ℓ} is bounded, we obtain
$$\gamma[j] = \lim_{k \to \infty} \gamma^k[j] = \lim_{k \to \infty} \gamma^k_\ell[i] = 0,$$
and hence (8.9) and (8.10) hold.

Let I∗ be defined as in (8.11) and suppose j ∈ I∗. Note that
$$u^k_\ell[i] \ge 0, \qquad t^k_\ell \le 0, \qquad c^k[i] \ge 0, \quad \forall k,$$
and that (Nx∗ + My∗ + q + z∗)[j] = 0 implies (N_ℓx∗ + M_ℓy∗ + q_ℓ + z∗_ℓ)[i] = 0, where ℓ and i satisfy (8.42), and hence (N_ℓx∗ + M_ℓy∗ + q_ℓ)[i] ≤ 0 by (8.21). We then have from (8.31)–(8.32) that
$$\beta[j] = \lim_{k \to \infty} \beta^k_\ell[i] \ge \lim_{k \to \infty} p_\ell\,t^k_\ell\,(N_\ell x^k + M_\ell y^k + q_\ell)[i] \ge 0$$
and
$$\gamma[j] = \lim_{k \to \infty} \gamma^k_\ell[i] \ge \lim_{k \to \infty} p_\ell\,t^k_\ell\,y^k[i] = 0.$$
This indicates that (8.11) holds.

Therefore, the multiplier vectors λ, µ, ν, α, β, and γ indeed satisfy conditions (8.4)–(8.11) and so (x∗, y∗, y∗, z∗) is an S-stationary point of problem (8.2). This completes the proof of the theorem.

8.3 Conclusions

The SMPEC (8.1) has been discussed in the previous chapters, and two penalty methods have been proposed there. The main difficulty with the two methods lies in the feasibility of a limit point of the generated sequence, which has not been addressed completely. In this chapter, based on a reformulation given in Chapter 7, we have proposed a regularization method for solving the SMPEC (8.1). It has been shown that, under a weak condition, an accumulation point of the generated sequence is a feasible point of the original problem. Global convergence to an S-stationary point of the problem has also been established.


Bibliography

[1] D.P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods,Academic Press, New York, 1982.

[2] E. Bierstone and P.D. Milman, Semianalytic and subanalytic sets, Institut desHautes Etudes Scientifiques, Publications Mathematiques, 67 (1988), 5–42.

[3] J.R. Birge and F. Louveaux, Introduction to Stochastic Programming, Springer,New York, 1997.

[4] J.F. Bonnans and A. Shapiro, Optimization problems with perturbations: A guidedtour, SIAM Review, 40 (1998), 228–264.

[5] J.V. Burke, An exact penalization viewpoint of constrained optimization, SIAMJournal on Control and Optimization, 29 (1991), 968–998.

[6] J.V. Burke, On the identification of active constraints II: The nonconvex case,SIAM Journal on Numerical Analysis, 27 (1990), 1081–1102.

[7] J.V. Burke and M.C. Ferris, Weak sharp minima in mathematical programming,SIAM Journal on Control and Optimization, 31 (1993), 1340–1359.

[8] J.V. Burke and J.J. More, On the identification of active constraints, SIAM Jour-nal on Numerical Analysis, 25 (1988), 1197–1211.

[9] B. Chen and P.T. Harker, Smooth approximations to nonlinear complementarityproblems, SIAM Journal on Optimization, 7 (1997), 403–420.

[10] C. Chen and O.L. Mangasarian, A class of smoothing functions for nonlinear andmixed complementarity problems, omputational Optimization and Applications,5 (1996), 97–138.

[11] X. Chen, Newton-type methods for stochastic programming, Mathematical andComputer Modelling, 31 (2000), 89–98.

195

Page 202: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

196 Bibliography

[12] X. Chen and M. Fukushima, A smoothing method for a mathematical programwith P-matrix linear complementarity constraints, Computational Optimizationand Applications, to appear.

[13] X. Chen and M. Fukushima, Approximation and convergence in stochastic linearcomplementarity problems, Technical Report 2003-009, Department of AppliedMathematics and Physics, Kyoto University (July, 2003).

[14] X. Chen, L. Qi, and D. Sun, Global and superlinear convergence of the smoothingNewton method and its application to general box constrained variational inequal-ities, Mathematics of Computation, 67 (1998), 519–540.

[15] X. Chen, L. Qi, and R.S. Womersley, Newton’s method for quadratic stochasticprograms with recourse, Journal of Computational and Applied Mathematics, 60(1995), 29–46.

[16] X. Chen and R.S. Womersley, A parallel Newton method for quadratic stochasticprograms with recourse, Annals of Operations Research, 64 (1996), 113–141.

[17] Y. Chen and M. Florian, The Nonlinear Bilevel Programming Problem: Formu-lations, Regularity and Optimality Conditions, Optimization, 32 (1995), 193–209.

[18] F.H. Clarke, Optimization and Nonsmooth Analysis, SIAM, Philadelphia, 1990.

[19] R.W. Cottle, J.S. Pang, and R.E. Stone, The Linear Complementarity Problem,Academic Press, New York, NY, 1992.

[20] J.P. Dedieu, Penalty functions in subanalytic optimization, Optimization, 26(1992), 27–32.

[21] E.D. Dolan, R. Fourer, J.J. More, and T.S. Munson, The NEOSserver for optimization: Version 4 and beyond, Technical Report,http://www-neos.mcs.anl.gov/neos/ftp/v4.pdf, 2001.

[22] F. Facchinei, A. Fischer, and C. Kanzow, On the accurate identification of activeconstraints, SIAM Journal on Optimization, 9 (1998), 14–32.

[23] F. Facchinei, A. Fischer, and C. Kanzow, On the identification of zero variablesin an interior-point framework, SIAM Journal on Optimization, 10 (2000), 1058–1078.

[24] F. Facchinei, H. Jiang, and L. Qi, A smoothing method for mathematical programswith equilibrium constraints, Mathematical Programming, 85 (1999), 107–134.

Page 203: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

Bibliography 197

[25] A. Fischer, A special Newton-type optimization method, Optimization, 24 (1992),269–284.

[26] M.L. Flegel and C. Kanzow, An Abadie-type constraint qualification formathematical programs with equilibrium constraints, manuscript, University ofWurzburg, Wurzburg, 2002.

[27] R. Fletcher, S. Leyffer, D. Ralph, and S. Scholtes, Local Convergence of SQPMethods for Mathematical Programs with Equilibrium Constraints, NumericalAnalysis Report, Department of Mathematics, University of Dundee, Dundee,Scotland, 2001.

[28] C.A. Floudas, P.M. Pardalos, C.S. Adjiman, W.R. Esposito, Z.H. Gumus, S.T.Harding, J.L. Klepeis, C.A. Meyer, C.A. Schweiger, Handbook of Test Problemsin Local and Global Optimization, Nonconvex Optimization and Its Applications,Volume 33, Dordrecht, Kluwer Academic Publishers, 1999.

[29] M. Fukushima, Z.Q. Luo, and J.S. Pang, A globally convergent sequentialquadratic programming algorithm for mathematical programs with linear comple-mentarity constraints, Computational Optimization and Applications, 10 (1998),5–34.

[30] M. Fukushima, Z.Q. Luo, and P. Tseng, A sequential quadratically constrainedquadratic programming method for differentiable convex minimization, SIAMJournal on Optimization, 13 (2003), 1098–1119.

[31] M. Fukushima and J.S. Pang, Convergence of a smoothing continuation methodfor mathematical problems with complementarity constraints, Ill-posed VariationalProblems and Regularization Techniques, Lecture Notes in Economics and Math-ematical Systems, Vol. 477, M. Thera and R. Tichatschke (eds.), Springer-Verlag,Berlin/Heidelberg, 1999, 105–116.

[32] M. Fukushima and J.S. Pang, Some feasibility issues in mathematical programswith equilibrium constraints, SIAM Journal on Optimization, 8 (1998), 673–681.

[33] M. Fukushima and P. Tseng, An implementable active-set algorithm for comput-ing a B-stationary point of the mathematical program with linear complementarityconstraints, SIAM Journal on Optimization, 12 (2002), 724–739.

[34] S.A. Gabriel and J.J. More, Smoothing of mixed complementarity problems, Com-plementarity and Variational Problems: State of the Art, M.C. Ferris and J.S.Pang (eds.), SIAM, Philadelphia, Pennsylvania (1997), 105–116.

Page 204: Studies on Methods for Mathematical Programs …The other methods in the literature on MPECs include the sequential quadratic pro-gramming methods [29, 44, 62], implicit programming

198 Bibliography

[35] G. Gurkan, A.Y. Ozge and S.M. Robinson, Sample-path solution of stochasticvariational inequalities, Mathematical Programming, 84 (1999), 313–333.

[36] P.T. Harker and J.S. Pang, Finite-dimensional variational inequality and nonlin-ear complementarity problems: A survey of theory, algorithms and applications,Mathematical Programming, 48 (1990), 161–220.

[37] A. Haurie and F. Moresino, S-adapted oligopoly equilibria and approximations instochastic variational inequalities, with application to option pricing, Annals ofOperations Research, 114 (2002), 183–201.

[38] X. Hu and D. Ralph, Convergence of a penalty method for mathematical program-ming with equilibrium constraints, Journal of Optimization Theory and Applica-tions, to appear.

[39] X.X. Huang, X.Q. Yang, and K.L. Teo, A smoothed l1 penalty method for math-ematical programs with complementarity constraints, manuscript, Department ofApplied Mathematics, Hong Kong Polytechnic University, Hong Kong, 2003.

[40] X.X. Huang, X.Q. Yang, and K.L. Teo, Partial augmented Lagrangian method andmathematical programs with complementarity constraints, manuscript, Depart-ment of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong,2002.

[41] X.X. Huang, X.Q. Yang, and D.L. Zhu, A sequential smooth penalization ap-proach to mathematical programs with complementarity constraints, manuscript,Department of Applied Mathematics, Hong Kong Polytechnic University, HongKong, 2001.

[42] H. Jiang and D. Ralph, Extension of quasi-Newton methods to mathematical pro-grams with complementarity constraints, Computational Optimization and Ap-plications, 25 (2003), 123–150.

[43] H. Jiang and D. Ralph, QPECgen, a MATLAB generator for mathematical pro-grams with quadratic objectives and affine variational inequality constraints, Com-putational Optimization and Applications, 13 (1999), 25–59.

[44] H. Jiang and D. Ralph, Smooth SQP methods for mathematical programswith nonlinear complementarity constraints, SIAM Journal on Optimization, 10(2000), 779–808.

[45] K. Jittorntrum, Solution point differentiability without strict complementarity innonlinear programming, Mathematical Programming Study, 21 (1984), 127–138.

[46] P. Kall and S.W. Wallace, Stochastic Programming, John Wiley & Sons, Chichester, 1994.

[47] C. Kanzow, Some noninterior continuation methods for linear complementarity problems, SIAM Journal on Matrix Analysis and Applications, 17 (1996), 851–868.

[48] C. Kanzow and H. Jiang, A continuation method for (strongly) monotone variational inequalities, Mathematical Programming, 81 (1998), 103–125.

[49] S. Karamardian and S. Schaible, Seven kinds of monotone maps, Journal of Optimization Theory and Applications, 66 (1990), 37–46.

[50] S. Leyffer, MacMPEC: AMPL collection of MPECs, Technical Report, http://www-unix.mcs.anl.gov/~leyffer/MacMPEC/, 2000.

[51] S. Leyffer, The penalty interior point method fails to converge for mathematical programs with equilibrium constraints, Numerical Analysis Report NA/208, Department of Mathematics, University of Dundee, Dundee, Scotland, 2002.

[52] G.H. Lin, X. Chen, and M. Fukushima, Smoothing implicit programming approaches for stochastic mathematical programs with linear complementarity constraints, Technical Report 2003-006, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, Japan, 2003.

[53] G.H. Lin and M. Fukushima, A class of stochastic mathematical programs with complementarity constraints: Reformulations and algorithms, Technical Report 2003-010, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, Japan, 2003.

[54] G.H. Lin and M. Fukushima, A modified relaxation scheme for mathematical programs with complementarity constraints, Annals of Operations Research, to appear.

[55] G.H. Lin and M. Fukushima, Hybrid algorithms with active set identification for mathematical programs with complementarity constraints, Technical Report 2002-008, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, Japan, 2002.

[56] G.H. Lin and M. Fukushima, Hybrid algorithms with index addition and subtraction strategies for solving mathematical programs with complementarity constraints, Technical Report 2003-003, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, Japan, 2003.

[57] G.H. Lin and M. Fukushima, New relaxation method for mathematical programs with complementarity constraints, Journal of Optimization Theory and Applications, 118 (2003), 81–116.

[58] G.H. Lin and M. Fukushima, Regularization method for stochastic mathematical programs with complementarity constraints, Technical Report 2003-012, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, Japan, 2003.

[59] G.H. Lin and M. Fukushima, Some exact penalty results for nonlinear programs and their applications to mathematical programs with equilibrium constraints, Journal of Optimization Theory and Applications, 118 (2003), 67–80.

[60] X. Liu and J. Sun, Generalized stationary points and an interior point method for mathematical programs with equilibrium constraints, Preprint, National University of Singapore, Singapore, 2002.

[61] M.S. Lojasiewicz, Ensembles semi-analytiques, Institut des Hautes Etudes Scientifiques, Bures-sur-Yvette, 1964.

[62] Z.Q. Luo, J.S. Pang, and D. Ralph, Mathematical Programs with Equilibrium Constraints, Cambridge University Press, Cambridge, United Kingdom, 1996.

[63] Z.Q. Luo, J.S. Pang, D. Ralph, and S.Q. Wu, Exact penalization and stationary conditions of mathematical programs with equilibrium constraints, Mathematical Programming, 75 (1996), 19–76.

[64] J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.

[65] J.V. Outrata, On optimization problems with variational inequality constraints, SIAM Journal on Optimization, 4 (1994), 340–357.

[66] J.V. Outrata, M. Kocvara, and J. Zowe, Nonsmooth Approach to Optimization Problems with Equilibrium Constraints: Theory, Applications and Numerical Results, Kluwer Academic Publishers, Boston, MA, 1998.

[67] J.V. Outrata and J. Zowe, A numerical approach to optimization problems with variational inequality constraints, Mathematical Programming, 68 (1995), 105–130.

[68] J.S. Pang, Complementarity problems, Handbook on Global Optimization, R. Horst and P. Pardalos (eds.), Kluwer Academic Publishers B.V., Dordrecht, 1994, 271–338.

[69] J.S. Pang, Error bounds in mathematical programming, Mathematical Programming, 79 (1997), 299–332.

[70] J.S. Pang and M. Fukushima, Complementarity constraint qualifications and simplified B-stationarity conditions for mathematical programs with equilibrium constraints, Computational Optimization and Applications, 13 (1999), 111–136.

[71] J.S. Pang and M. Fukushima, Quasi-variational inequalities, generalized Nash equilibria, and multi-leader-follower games, Technical Report 2002-009, Department of Applied Mathematics and Physics, Kyoto University, 2002.

[72] M. Patriksson and L. Wynter, Stochastic mathematical programs with equilibrium constraints, Operations Research Letters, 25 (1999), 159–167.

[73] L. Qi and J. Sun, A nonsmooth version of Newton's method, Mathematical Programming, 58 (1993), 353–367.

[74] R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970.

[75] H.S. Scheel and S. Scholtes, Mathematical programs with complementarity constraints: Stationarity, optimality, and sensitivity, Mathematics of Operations Research, 25 (2000), 1–22.

[76] S. Scholtes, Convergence properties of a regularization scheme for mathematical programs with complementarity constraints, SIAM Journal on Optimization, 11 (2001), 918–936.

[77] S. Scholtes and M. Stohr, Exact penalization of mathematical programs with complementarity constraints, SIAM Journal on Control and Optimization, 37 (1999), 617–652.

[78] S. Scholtes and M. Stohr, How stringent is the linear independence assumption for mathematical programs with complementarity constraints, Mathematics of Operations Research, 26 (2001), 851–863.

[79] R. Schultz, L. Stougie, and M.H. van der Vlerk, Two-stage stochastic integer programming: A survey, Statistica Neerlandica, 50 (1996), 404–416.

[80] K. Shimizu, Y. Ishizuka, and J. Bard, Nondifferentiable and Two-Level Mathematical Programming, Kluwer Academic Publishers, Boston, 1997.

[81] P. Tseng, Growth behavior of a class of merit functions for the nonlinear complementarity problem, Journal of Optimization Theory and Applications, 89 (1996), 17–37.

[82] N. Yamashita, H. Dan, and M. Fukushima, On the identification of degenerate indices in the nonlinear complementarity problem with the proximal point algorithm, Mathematical Programming, to appear.

[83] S. Vajda, Probabilistic Programming, Academic Press, New York, 1972.

[84] H. Xu, An MPCC approach for stochastic Stackelberg-Nash-Cournot equilibrium, manuscript, University of Southampton, Highfield, Southampton, 2003.

Acknowledgements

First of all, I would like to express my sincere appreciation to Professor Masao Fukushima of Kyoto University for supervising this thesis. He kindly gave me many suggestions and continual guidance. His excellent work and profound knowledge of optimization theory and many other fields have been invaluable to my research. Without his help, I could not have made any progress in my research.

I am deeply grateful to Professor Xiaojun Chen of Hirosaki University and Professor Paul Tseng of the University of Washington for their earnest guidance and helpful suggestions. I am also thankful to Professor Tetsuya Takine and Professor Nobuo Yamashita of Kyoto University for their friendly advice. Professors Zun-Quan Xia and Sining Zheng of Dalian University of Technology gave me continual support and encouragement. I would like to thank all of them very much.

I am particularly thankful to the Ministry of Education, Science, Sports and Culture of Japan and the Ministry of Education of China for their financial support of my study and research at Kyoto University.

In addition, I would like to thank all the members of the Fukushima research group. While I studied at Kyoto University, I received much help from them, which enriched my life in Kyoto.

Finally, I would like to express my particular thanks to my parents, my wife, and my daughter.
