
Applying Some Modern Developments to Choosing Your Own Lagrange Multipliers



SIAM REVIEW Vol. 25, No. 2, April 1983

(C) 1983 Society for Industrial and Applied Mathematics

0036-1445/83/2502-0002 $01.25/0

APPLYING SOME MODERN DEVELOPMENTS TO CHOOSING YOUR OWN LAGRANGE MULTIPLIERS*

J. PONSTEIN†

Abstract. This paper has primarily been written for those applying Lagrange multipliers in practice. In it it is shown how these multipliers can be used for solving optimization problems in a much more general fashion than when using the "classical" method. In general it is even possible to apply the generalized method in more than only one way, depending either on which variation is the most attractive from a computational point of view, or on the kind of equilibrium prices or the kind of sensitivity analysis one is interested in. Sensitivity analysis is closely related to shadow prices and shadow costs, which terms are also used for equilibrium prices. The paper discusses the difference between the two, and in this connection argues that Lagrange multipliers should not be interpreted as something like imputed costs.

The paper stresses the fact that the basic ideas regarding Lagrange multipliers can be developed without using differentiability, convexity or continuity. Duality, too, is not a necessary ingredient. Yet, all these notions become extremely important when it comes to actually solving problems.

In the final section remarks are made regarding certain generalizations, variations, and the like, of the theory.

Key words. duality, equilibrium price, imputed cost, Lagrange multiplier, optimization, perturbation, sensitivity analysis, shadow cost, shadow price

1. Some introductory remarks. Mathematically, the material covered in this paper is not new. A comprehensive account of the underlying theory is given by e.g. Rockafellar [8], and is more fully developed in Craven [2], Dempster [3], Luenberger [5], Mangasarian [6], Ponstein [7], and other books. One of the aims of the present paper is to show what is the minimum set of tools required to set up the theory. On the other hand the paper does not go into the details of the underlying existence theorems, where notions such as lower semi-continuity, closed functions, conjugation, etc. are so extremely important.

The intention of the present paper is quite akin to that of Geoffrion [4]. As in [4] and in some of the literature quoted before, we strongly emphasize the use of perturbations. In [4], this is limited to perturbing right-hand sides of (in)equality constraints, whereas we present a more general point of view, as in [8] and elsewhere. Whereas in many presentations, including [4], duality is introduced at an early stage, we postpone its treatment until it can be fitted in the theory in a natural kind of way. This adds to the simplicity of the presentation, at least initially. In [4] notions like stability and subdifferentiability play an important part when showing some vital existence theorems (which we treat only superficially, but for good practical reasons). Further, [4] gives a survey of solution techniques, which is not one of our aims.

Perhaps the present paper may be seen as an applications-oriented version of the modern theory, adding some new considerations to Geoffrion's paper, thereby stressing the variety and the options one has when constructing the so-called Lagrange problem, as well as the practical meaning of notions like equilibrium price (which is nothing other than a "suitable" or "optimal" Lagrange multiplier) and shadow price.

2. The "classical" Lagrange multiplier rule applied to a geometric example. Originally, when applying Lagrange multipliers, one had to assume that all constraints of the optimization problem to be solved were given as equalities, such as

(x₁ − 4)² + (x₂ − 3)² = 1.

*Received by the editors July 15, 1981, and in final revised form July 12, 1982.
†Department of Econometrics, University of Groningen, P.O.B. 800, 9700 AV Groningen, the Netherlands.

183

Downloaded 11/22/14 to 129.120.242.61. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php


Letting x = (x₁, x₂) be a point of 2-dimensional Euclidean space, this equation says that x must be on the circumference of the circle with radius 1 and with center (4, 3).

The next step is that these equality constraints must be written in such a way that the right-hand side is zero:

(x₁ − 4)² + (x₂ − 3)² − 1 = 0

(obviously, variations such as 1 − (x₁ − 4)² − (x₂ − 3)² = 0 are also allowed, so that the result is not unique).

Then, assuming that there are m constraints, their left-hand sides are multiplied by numbers λ₁, λ₂, · · · , λₘ, and all products are added to the objective function. If, continuing the example, the objective function is the square of the distance of x to the origin (0, 0), we obtain, with m = 1 and λ = λ₁,

L(x, λ) = x₁² + x₂² + λ{(x₁ − 4)² + (x₂ − 3)² − 1},

which is the Lagrange function, depending on the multipliers λ₁, · · · , λₘ, but on the variables x₁, · · · , xₙ (if there are n of them) as well.

Then, assuming differentiability, the Lagrange function is partially differentiated with respect to x₁, · · · , xₙ, and the derivatives are set equal to zero:

2x₁ + 2λ(x₁ − 4) = 0  and  2x₂ + 2λ(x₂ − 3) = 0.

These are the so-called first-order conditions, to which the constraints are added:

2x₁ + 2λ(x₁ − 4) = 0,  2x₂ + 2λ(x₂ − 3) = 0  and  (x₁ − 4)² + (x₂ − 3)² = 1.

Finally, one tries to solve for the n + m unknowns x₁, · · · , xₙ, λ₁, · · · , λₘ from the resulting n + m equations.

Some, perhaps for reasons of mathematical beauty, prefer to partially differentiate the Lagrange function with respect to λ₁, · · · , λₘ as well. It is obvious that this will lead to the constraints. In view of the possible generalizations, however, this habit should be abandoned.

Solving for (x₁, x₂, λ) in the example leads to

x₁ = 16/5, x₂ = 12/5, λ = 4,  or  x₁ = 24/5, x₂ = 18/5, λ = −6.

As we want to minimize x₁² + x₂² subject to (x₁ − 4)² + (x₂ − 3)² = 1, only the first answer determines the solution of the given problem. That the method developed so far may not only lead to the correct answer but to false results as well is due to the fact that so far we have only specified first-order conditions (and constraints). Additional second-order (and, maybe, even higher order) conditions are required to find out which solution is a minimum (or a maximum, as the case may be). In the generalized theory, however, differentiability is, in principle, not required at all, let alone the distinction between first-order and higher order conditions. For this reason we will not pursue the classical method any further here.
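As a quick numerical check (ours, not part of the original text), both candidate triples can be substituted into the first-order conditions and the constraint, and their objective values compared:

```python
# Check the two stationary points of the classical Lagrange conditions
# for min x1^2 + x2^2 subject to (x1 - 4)^2 + (x2 - 3)^2 = 1.

def first_order(x1, x2, lam):
    """Left-hand sides of the two first-order conditions and the constraint."""
    return (2*x1 + 2*lam*(x1 - 4),
            2*x2 + 2*lam*(x2 - 3),
            (x1 - 4)**2 + (x2 - 3)**2 - 1)

candidates = [(16/5, 12/5, 4), (24/5, 18/5, -6)]
for x1, x2, lam in candidates:
    assert all(abs(v) < 1e-12 for v in first_order(x1, x2, lam))

# Only the first candidate minimizes the objective:
f = lambda x1, x2: x1**2 + x2**2
assert f(16/5, 12/5) < f(24/5, 18/5)   # 16 < 36
```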

Often, the classical method only indicates local optima. In the generalized theory, however, one wants global optima, so that one not only wants to compare an optimal solution x⁰ with solutions x close to x⁰, but with any x that satisfies the constraints.

3. Other L-functions for the shortest distance problem. Clearly, the rules (or the recipe) given in §2 for applying Lagrange multipliers are entirely based on equality constraints. What is required is a recipe that also works for inequality constraints, and even for such general constraints as

x ∈ g,


where g is a given region (the feasible region) of the space of x's. In this latter formulation there is not a trace of (in)equalities. In order to generalize the method we apply a procedure that is used more often in mathematics: reformulate the recipe for the given, restricted, class of problems, but in such a way that it can readily be applied to a wider class of problems. (Think, e.g., of n! = n(n − 1) · · · 2 · 1 and n! = ∫₀^∞ e⁻ᵗ tⁿ dt.) Here is such a reformulation for the shortest distance example of §2.

(α) Replace the square of the radius of the circle (which was equal to 1) by 1 + y.
(β) Add λy to the objective function f(x) = x₁² + x₂².
(γ) Find the Lagrange function L(x, λ) for each x and λ from

L(x, λ) = min_y {x₁² + x₂² + λy : (x₁ − 4)² + (x₂ − 3)² = 1 + y}.

In fact, the minimization problem involved here is trivial, because y can take on only one value, i.e. y = (x₁ − 4)² + (x₂ − 3)² − 1, so that we obtain what we had before.

Reformulating this again, but slightly now, we obtain the following recipe.
(a) Perturb (rather than "replace") the constraints (whatever be their form) by changing some parameters. Denote the changes, or perturbations, by y₁, y₂, · · · and combine them into a single vector y = (y₁, y₂, · · ·).
(b) Introduce as many Lagrange multipliers λ₁, λ₂, · · · as there are yᵢ's, and combine them into a single vector λ = (λ₁, λ₂, · · ·). Add λy = λ₁y₁ + λ₂y₂ + · · · to the objective function f(x).
(c) Find L(x, λ) for each pair (x, λ) by finding the minimum (or maximum, as the case may be) of f(x) + λy, subject to the perturbed constraints.

Obviously, we are no longer bound now to perturbing right-hand sides of equality constraints. First of all we can now also perturb right-hand sides of inequality constraints. But there are more new possibilities:

Example 1. Suppose we do not perturb the radius of the circle, but the position of its center instead. This requires two perturbations y₁ and y₂, hence two multipliers λ₁ and λ₂. The center was (4, 3) and now becomes (4 + y₁, 3 + y₂), or, if that is to be preferred, (4 + y₁, 3 − y₂), and so on. According to (a) to (c) we obtain

L(x, λ) = min_y {x₁² + x₂² + λy : (x₁ − 4 − y₁)² + (x₂ − 3 − y₂)² = 1}

= x₁² + x₂² + λ₁(x₁ − 4) + λ₂(x₂ − 3) − √(λ₁² + λ₂²),  with λy = λ₁y₁ + λ₂y₂,

which is quite different from what we had before. (A simple way to find this result is to put x₁ − 4 − y₁ = cos φ and x₂ − 3 − y₂ = sin φ. Clearly, in this way the given problem can be solved without using Lagrange multipliers at all, but this is, of course, not the point here.)
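The construction of L by minimizing over the perturbations can be imitated numerically. The following sketch (function names are ours) scans the feasible y's via the angle φ mentioned above and compares the result with the closed form:

```python
import math

# Example 1: L(x, lam) = min over y of { x1^2 + x2^2 + lam.y }
# subject to (x1 - 4 - y1)^2 + (x2 - 3 - y2)^2 = 1.
# Feasible y's: y1 = x1 - 4 - cos(phi), y2 = x2 - 3 - sin(phi).

def L_numeric(x1, x2, l1, l2, steps=200000):
    best = float("inf")
    for k in range(steps):
        phi = 2*math.pi*k/steps
        y1 = x1 - 4 - math.cos(phi)
        y2 = x2 - 3 - math.sin(phi)
        best = min(best, x1**2 + x2**2 + l1*y1 + l2*y2)
    return best

def L_closed(x1, x2, l1, l2):
    return (x1**2 + x2**2 + l1*(x1 - 4) + l2*(x2 - 3)
            - math.sqrt(l1**2 + l2**2))

x1, x2, l1, l2 = 1.0, 2.0, -3.0, 5.0   # arbitrary test point
assert abs(L_numeric(x1, x2, l1, l2) - L_closed(x1, x2, l1, l2)) < 1e-6
```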

Let us go one step further, and not only consider perturbing constraints, but consider perturbing the objective function as well. In other words, let us consider perturbing the problem, wherever we want.

(A) Perturb the constraints and/or the objective function by changing some parameters. Denote these changes, or perturbations, by y₁, y₂, · · · as far as the constraints are concerned, and by z₁, z₂, · · · as far as the objective function is concerned. Let y = (y₁, y₂, · · ·) and let z = (z₁, z₂, · · ·).
(B) Introduce as many Lagrange multipliers λ₁, λ₂, · · · as there are yᵢ's and as many Lagrange multipliers μ₁, μ₂, · · · as there are zᵢ's. Let λ = (λ₁, λ₂, · · ·) and let μ = (μ₁, μ₂, · · ·). Add λy = λ₁y₁ + λ₂y₂ + · · · as well as μz = μ₁z₁ + μ₂z₂ + · · · to the perturbed objective function.
(C) Find L(x, λ, μ) for each triple (x, λ, μ) by finding the minimum (or maximum) of the resulting sum, subject to the perturbed constraints.


In §15(c) we indicate how one can get rid of all constraints by allowing the objective function to take on infinite values. Then perturbing the problem means perturbing only the objective function, and the presentation is simpler. But given any practical constrained optimization problem one will have to go back to finite objective function values. Moreover, when discussing the dual problem later on, we will show how convenient it is to introduce extra (dual) constraints in order to avoid infinite dual objective function values.

Example 2. A third Lagrange function for the shortest distance problem is obtained if we leave the circle untouched, but replace the origin (0, 0) by (z₁, z₂), so that now we minimize (x₁ − z₁)² + (x₂ − z₂)² instead of x₁² + x₂². Applying (A) to (C) we get

L(x, μ) = −μ₁²/4 − μ₂²/4 + μ₁x₁ + μ₂x₂  if (x₁ − 4)² + (x₂ − 3)² = 1,
        = +∞  if (x₁ − 4)² + (x₂ − 3)² ≠ 1,

which is different from the two Lagrange functions obtained before. Yet, perturbing the point from which the square of the distance to the circle has to be taken comes very close to perturbing the center of the circle, that is, to translating the entire circle. This example also shows that the Lagrange function can take on infinite values.

What good is all this? So far the only (but not unimportant) conclusion is that there may be many ways to define the Lagrange function, given a single problem.

4. Constructing L-functions for any optimization problem; a general and useful theorem. In this section we derive a very general and useful result, i.e., Theorem 1. It has nothing to do with differentiability, convexity or continuity, and is very easy to show. Moreover, its conditions should be very appealing to the problem solver. These conditions are, however, only sufficient conditions. They can be turned into sufficient as well as necessary conditions by narrowing down the class of optimization problems considerably, say to the class of convex optimization problems that satisfy suitable regularity conditions (see e.g. [6]). These regularity conditions are responsible for a considerable fraction of the underlying theory on the one hand, and may be a stumbling block for the problem solver on the other, because she or he may not be able to verify them in advance.

Let x be some variable, say x = (x₁, · · · , xₙ) with xⱼ real numbers, and let f(x) be the objective function of some optimization problem. Let the constraints be given by x ∈ g, where g is a given subset of the space of x's. Hence g is the constraint set, or the feasible region. Assume that f(x) must be minimized subject to x ∈ g (maximizations are treated similarly). Hence the unperturbed problem amounts to finding

min {f(x) : x ∈ g}.

Let the perturbed problem be that of finding

min {F(x, z) : x ∈ G(y)},

where F and G must be such that

F(x, 0) = f(x)  and  G(0) = g,

so that we require that (y, z) = (0, 0) means "do not disturb." This is in accordance with the definitions of y and z before.

Remark. For simplicity we assume most of the time that x, y and z are finite-dimensional vectors. Generalizations exist, however, where these vectors are (partly) replaced by (vector) functions; see the last section.


Applying rules (A) to (C) of §3, we obtain

(*)  L(x, λ, μ) = min_{y,z} {F(x, z) + λy + μz : x ∈ G(y)},

which is the generalized Lagrange function. In general we cannot give it a more explicit form, but in many special cases this is possible, as has been shown above and will be shown below. Here are some special cases, each leading to all kinds of subcases.

(1) min {f(x) : h(x) = y, x ∈ C} leads to

L(x, λ) = f(x) + λh(x)  if x ∈ C,
        = +∞  if x ∉ C;

(2) min {f(x) : h(x) ≤ y, x ∈ C} leads to

L(x, λ) = f(x) + λh(x)  if x ∈ C and λ ≥ 0,
        = −∞  if x ∈ C and λ ≱ 0,
        = +∞  if x ∉ C.

As we shall see, multiplier values leading to L(x, λ) = −∞ or to L(x, λ, μ) = −∞ can be ignored, so that in this example we need only consider λ such that λ ≥ 0. But even if λ ≥ 0 it can happen that L(x, λ) turns out to be equal to −∞. Then one should look for extra conditions on λ (and μ) such that this will not happen if these conditions are satisfied (compare this with the remarks made above Example 2 in §3).

(3) min {f(x) : x + y ∈ g, x ∈ C} leads to

L(x, λ) = f(x) − λx + min {λx′ : x′ ∈ g}  if x ∈ C,
        = +∞  if x ∉ C,

where x′ = x + y. For certain details about these cases, see the following sections.

For practical purposes we have the following important theorem, not involving differentiability, convexity or continuity.

THEOREM 1. If we can find (x⁰, λ⁰, μ⁰) such that

(**)  min_x L(x, λ⁰, μ⁰) = L(x⁰, λ⁰, μ⁰) = f(x⁰)  and  x⁰ ∈ g,

then x⁰ is an optimal solution of the given problem (in the global sense).

Proof. If (**) is satisfied then according to (*) we have

f(x⁰) = min_{y,z} {F(x⁰, z) + λ⁰y + μ⁰z : x⁰ ∈ G(y)} = min_{x,y,z} {F(x, z) + λ⁰y + μ⁰z : x ∈ G(y)}
     ≤ min_x {F(x, 0) + λ⁰0 + μ⁰0 : x ∈ G(0)} = min {f(x) : x ∈ g},

but x⁰ ∈ g, hence x⁰ is optimal.

For a problem solver theorems like this one are attractive, because there are no conditions to verify in advance. Moreover, the risk that harmful conclusions can be drawn from using the theorem illegally seems to be very small, because in that case no (x⁰, λ⁰, μ⁰) will be found. The latter is only true, however, if the minimization over x of L(x, λ⁰, μ⁰) is done properly, which in this paper we take for granted! This minimization problem is often called the Lagrange problem.
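Theorem 1 can be illustrated on the shortest distance example: taking λ⁰ = 4 (μ plays no part here), the minimization over x of L(x, λ⁰) can be done in closed form, and (**) is then seen to hold. A small sketch of ours:

```python
import math

# Theorem 1 on the shortest distance example: with lam0 = 4,
# min over x of L(x, lam0) equals f(x0) = 16 with x0 in g.

def L(x1, x2, lam):
    return x1**2 + x2**2 + lam*((x1 - 4)**2 + (x2 - 3)**2 - 1)

lam0 = 4
# Setting the gradient of L(., lam0) to zero gives
# x1 = 4*lam0/(1 + lam0), x2 = 3*lam0/(1 + lam0), i.e. (16/5, 12/5).
x1, x2 = 4*lam0/(1 + lam0), 3*lam0/(1 + lam0)

assert abs(L(x1, x2, lam0) - 16) < 1e-12           # min_x L(x, lam0) = 16
assert abs((x1 - 4)**2 + (x2 - 3)**2 - 1) < 1e-12  # x0 lies in g
assert abs(x1**2 + x2**2 - 16) < 1e-12             # f(x0) = 16 as well

# Every feasible x (a point of the circle) does no better, so x0 is
# globally optimal, exactly as the theorem asserts:
for k in range(3600):
    t = 2*math.pi*k/3600
    xa, xb = 4 + math.cos(t), 3 + math.sin(t)
    assert xa**2 + xb**2 >= 16 - 1e-9
```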

Ideally, one would want to be able to tell in advance whether (**) is solvable or not. The following conditions together guarantee the solvability of (**).


1. The perturbation function p(y, z) defined by

p(y, z) = min {F(x, z) : x ∈ G(y)}

is convex.
2. There exists at least one optimal solution x⁰ of the given, unperturbed, problem, implying that f(x⁰) = p(0, 0).
3. Some regularity condition is satisfied (such as Slater's constraint qualification, see e.g. [6], [7]).

Often, it is not difficult to verify Condition 2, and in practice it is satisfied if the problem has been "properly" defined. Regarding Condition 3 we have already made some remarks at the beginning of this section. To these we add that practical experience shows that the insolvability of (**) is seldom due to only the fact that this condition is not satisfied. So we are left with Condition 1. When is p convex? This is the case if, for example, f is convex and for any x′, x″, y′, y″, ρ such that x′ ∈ G(y′), x″ ∈ G(y″), 0 ≤ ρ ≤ 1, it follows that ρx′ + (1 − ρ)x″ ∈ G(ρy′ + (1 − ρ)y″). This condition can easily be specialized to cases such as (1) to (3) above.

It should be noted, however, that conditions guaranteeing the solvability of (**) are again sufficient conditions, but this time to the effect that they limit the use of the theorem, because they may (and usually do) exclude problems for which (**) is solvable. On the other hand, if one tries to minimize a concave function subject to certain constraints then he or she is warned.

Remarks. 1. For all kinds of theorems regarding the solvability of (**), see e.g. [2], [3], [5], [6], [7], [9] and other literature (the proofs of these theorems often involve separation theorems).

2. It is easily shown that (**) is equivalent to

min_x L(x, λ⁰, μ⁰) = f(x⁰),  x⁰ ∈ g.

3. It is also easily shown that if only equality constraints are present and only their right-hand sides are perturbed, then (**) is equivalent to

min_x L(x, λ⁰, μ⁰) = L(x⁰, λ⁰, μ⁰),  x⁰ ∈ g.

Comparing the classical approach with the generalized one, we see that the former is related to differentiability in an essential way, in that first and higher order conditions play an important part. These conditions, if at all relevant, are neatly packed in (**). When applying the generalized approach, considering these conditions is postponed to a later phase in the process of solving the given problem, whereas they are part of a single phase when applying the classical approach. But even if differentiability cannot be established, the generalized approach might lead to solving the given problem. Generally speaking, convexity is a more basic notion than is differentiability, although even convexity is not necessary in order to be able to apply (**) successfully.

In the next section we will add a condition to (**) that in many cases will be very helpful when solving (**), even though this condition is a direct consequence of (**). This involves duality.

5. Lagrange multipliers and duality. Along with the given, primal, problem of finding

min {f(x) : x ∈ g}

or, after perturbing it,

min {F(x, z) : x ∈ G(y)},


consider the following dual objective function, which is dependent on the chosen perturbation,

fᵈ(λ, μ) = min_x L(x, λ, μ)

and the dual problem of finding

max_{λ,μ} fᵈ(λ, μ).

Because fᵈ(λ, μ) is the result of a minimization, its value might be +∞ or −∞. In case fᵈ(λ, μ) = −∞ we should try to find extra conditions to be imposed on (λ, μ) in order to avoid this; see the remark made below case (2) of §4, and what is ahead, in particular about linear programming in §7.

A direct consequence of (**) is the following duality theorem.

THEOREM 2. If (x⁰, λ⁰, μ⁰) satisfies

(**)  min_x L(x, λ⁰, μ⁰) = L(x⁰, λ⁰, μ⁰) = f(x⁰),  x⁰ ∈ g,

then (λ⁰, μ⁰) is an optimal solution of the dual problem, and min = max, that is,

(***)  min {f(x) : x ∈ g} = f(x⁰) = fᵈ(λ⁰, μ⁰) = max_{λ,μ} fᵈ(λ, μ).

Proof. Take any x′ ∈ g = G(0). Then

fᵈ(λ, μ) = min_x L(x, λ, μ) = min_{x,y,z} {F(x, z) + λy + μz : x ∈ G(y)}
        ≤ F(x′, 0) + λ0 + μ0 = f(x′),

so that fᵈ(λ, μ) ≤ f(x′) for all x′ ∈ g. In particular, fᵈ(λ, μ) ≤ f(x⁰), so that max_{λ,μ} fᵈ(λ, μ) ≤ f(x⁰), but fᵈ(λ⁰, μ⁰) = L(x⁰, λ⁰, μ⁰) = f(x⁰), hence fᵈ(λ⁰, μ⁰) = max_{λ,μ} fᵈ(λ, μ) and min = max.

Again, neither differentiability, nor convexity, nor continuity is involved. In the next section we will see how this result can be used for actually solving (**).

Remarks. 1. It follows from the above proof that fᵈ(λ, μ) ≤ f(x) for all x ∈ g and all (λ, μ). This is sometimes called weak duality.

2. Apparently, it also follows from (**) that

L(x⁰, λ, μ) ≤ L(x⁰, λ⁰, μ⁰) ≤ L(x, λ⁰, μ⁰)  for all x ∈ g and all (λ, μ),

which is why (x⁰, λ⁰, μ⁰) is called a saddle-point of the Lagrange function.

Let us summarize: Define L by

(*)  L(x, λ, μ) = min_{y,z} {F(x, z) + λy + μz : x ∈ G(y)},

and try to solve for (x⁰, λ⁰, μ⁰) from

(**)  min_x L(x, λ⁰, μ⁰) = L(x⁰, λ⁰, μ⁰) = f(x⁰),  x⁰ ∈ g,

and

(***)  max_{λ,μ} min_x L(x, λ, μ) = min_x L(x, λ⁰, μ⁰).
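For the shortest distance problem with the radius perturbation, the dual objective can be worked out in closed form (our computation, valid for λ > −1): the Lagrange problem is solved by x = (4λ/(1 + λ), 3λ/(1 + λ)), giving fᵈ(λ) = 25λ/(1 + λ) − λ. The following sketch checks weak duality and min = max numerically:

```python
import math

# Dual of the shortest distance problem under the radius perturbation
# (our closed-form computation, valid for lam > -1):
#   f_d(lam) = 25*lam/(1 + lam) - lam.

def f_dual(lam):
    assert lam > -1
    return 25*lam/(1 + lam) - lam

# Weak duality: f_d(lam) <= f(x) for every feasible x and every lam > -1.
for k in range(360):
    t = 2*math.pi*k/360
    fx = (4 + math.cos(t))**2 + (3 + math.sin(t))**2   # f at a point of g
    for lam in (0.0, 1.0, 4.0, 10.0):
        assert f_dual(lam) <= fx + 1e-9

# min = max: the dual maximum is attained at lam0 = 4 with value 16.
assert abs(f_dual(4.0) - 16) < 1e-12
for lam in (0.5, 2.0, 3.9, 4.1, 8.0):
    assert f_dual(lam) <= 16
```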

6. Applying the generalized rules to some examples.
Example 3. Applying the "classical" method to the shortest distance problem gives

L(x, λ) = x₁² + x₂² + λ{(x₁ − 4)² + (x₂ − 3)² − 1}.


The coefficient of x₁² is 1 + λ, as is that of x₂². Hence we must require that 1 + λ > 0, as otherwise min_x L(x, λ) = −∞. But then we may set equal to zero the partial derivatives with respect to x₁ and x₂. Together with x ∈ g, or (x₁ − 4)² + (x₂ − 3)² = 1, this gives three equations in x₁, x₂ and λ, giving λ = −6 or λ = 4. Because of the requirement that 1 + λ > 0, only λ = 4 remains, leading to x₁ = 16/5, x₂ = 12/5, as can easily be verified. Notice that this solution is a global minimum.

Example 4. Continuing Example 1 we must start from

L(x, λ₁, λ₂) = x₁² + x₂² + λ₁(x₁ − 4) + λ₂(x₂ − 3) − √(λ₁² + λ₂²).

A straightforward application of (**) gives 2x₁ + λ₁ = 0, 2x₂ + λ₂ = 0, and

λ₁⁰(x₁⁰ − 4) + λ₂⁰(x₂⁰ − 3) − √((λ₁⁰)² + (λ₂⁰)²) = 0,  (x₁⁰ − 4)² + (x₂⁰ − 3)² = 1.

To solve these four equations we can set p = 4x₁⁰ + 3x₂⁰, q² = (x₁⁰)² + (x₂⁰)², solve for p and q and find x₁⁰ and x₂⁰ from p and q. This gives

x₁⁰ = 16/5,  x₂⁰ = 12/5,  λ₁⁰ = −32/5,  λ₂⁰ = −24/5.

Example 5. Continuing Example 2 we have

L(x, μ₁, μ₂) = −μ₁²/4 − μ₂²/4 + μ₁x₁ + μ₂x₂  if (x₁ − 4)² + (x₂ − 3)² = 1,
            = +∞  if (x₁ − 4)² + (x₂ − 3)² ≠ 1,

Working out (**) leads to only three equations in the four unknowns x₁⁰, x₂⁰, μ₁⁰, μ₂⁰. With some trickery we can solve these. Put

x₁⁰ − 4 = −cos φ⁰,  x₂⁰ − 3 = −sin φ⁰,  μ₁⁰ = s cos ψ⁰,  μ₂⁰ = s sin ψ⁰.

Then after some elementary computations we obtain that φ⁰ = ψ⁰ and that

s = 2(4 cos ψ⁰ + 3 sin ψ⁰ − 1) ± 2√((4 cos ψ⁰ + 3 sin ψ⁰)² − 25),

where s = √((μ₁⁰)² + (μ₂⁰)²), so that we are left with one equation in two unknowns. For s to be real, however, we need (4 cos ψ⁰ + 3 sin ψ⁰)² ≥ 25, whereas always (4 cos ψ⁰ + 3 sin ψ⁰)² ≤ 25; hence cos ψ⁰ = 4/5, sin ψ⁰ = 3/5, giving

x₁⁰ = 16/5,  x₂⁰ = 12/5,  μ₁⁰ = 32/5,  μ₂⁰ = 24/5.

Applying the duality relation (***), however, this result is found in a much more straightforward manner, since all we have to do is maximize min_x L(x, μ₁, μ₂), which is a concave function of μ₁ and μ₂, over these variables.
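Working out min_x L(x, μ₁, μ₂) over the circle gives (our computation) fᵈ(μ) = 4μ₁ + 3μ₂ − √(μ₁² + μ₂²) − (μ₁² + μ₂²)/4, a concave function that can indeed be maximized directly:

```python
import math

# Example 5's dual objective, obtained (our computation) by minimizing
# L(x, mu) over the circle:
#   f_d(mu) = 4*mu1 + 3*mu2 - sqrt(mu1^2 + mu2^2) - (mu1^2 + mu2^2)/4.

def f_dual(m1, m2):
    r = math.hypot(m1, m2)
    return 4*m1 + 3*m2 - r - r*r/4

# The maximum is attained at mu0 = (32/5, 24/5) with value 16 = f(x0).
assert abs(f_dual(32/5, 24/5) - 16) < 1e-12

# A coarse grid search finds nothing better.
best = max(f_dual(m1/10, m2/10)
           for m1 in range(0, 121) for m2 in range(0, 121))
assert best <= 16 + 1e-9
```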

Remarks. 1. The dual objective function fᵈ(λ, μ) = min_x L(x, λ, μ) is always concave. This is, of course, an important fact in view of solving (**) and (***).

2. Notice that up to a sign the multipliers of Examples 4 and 5 are pairwise equal. We come back to this when discussing shadow prices.


7. Linear programming. Let

p(y) = min {cx : Ax ≥ b − y, x ≥ 0}.

Notice the greater-than-or-equal-to signs here. In the preceding sections we have always combined minimization with ≤, as is also done in many theoretical treatments on the subject. In LP, however, it is customary to combine minimization with ≥. This is done in order to get a nicer form of the dual problem. In order to be consistent with the preceding sections we now have −y instead of y. It follows that

L(x, λ) = cx + λ(b − Ax)  if λ ≥ 0,
        = −∞  if λ ≱ 0

(for x ≥ 0; L(x, λ) = +∞ if x ≱ 0),

so that fᵈ(λ) = min_x L(x, λ) is only > −∞ if λ ≥ 0 and c − λA ≥ 0. If these conditions are met we have that fᵈ(λ) = λb, hence

fᵈ(λ) = λb  if λ ≥ 0 and λA ≤ c,
      = −∞  otherwise,

so that the dual problem is that of finding

max {λb : λA ≤ c, λ ≥ 0}.

We see from this example that although the abstract formulation of the dual problem, i.e. find max fᵈ(λ), may suggest that no constraints are involved in the dual problem, such constraints may appear in a natural way by requiring that fᵈ(λ) > −∞ (see earlier remarks).
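Weak duality for this primal-dual LP pair is easy to check numerically on a small instance (the data below are ours, chosen for illustration):

```python
# A small numerical instance of the LP dual pair
#   min {cx : Ax >= b, x >= 0}   and   max {lam.b : lam.A <= c, lam >= 0}.
# Data chosen by us for illustration.

c = [3.0, 2.0]
A = [[1.0, 1.0],
     [2.0, 1.0]]
b = [4.0, 5.0]

def primal_feasible(x):
    return all(xi >= 0 for xi in x) and all(
        sum(A[i][j]*x[j] for j in range(2)) >= b[i] for i in range(2))

def dual_feasible(lam):
    return all(li >= 0 for li in lam) and all(
        sum(lam[i]*A[i][j] for i in range(2)) <= c[j] for j in range(2))

x = [1.0, 3.0]        # primal feasible: Ax = (4, 5) >= b
lam = [2.0, 0.0]      # dual feasible: lam.A = (2, 2) <= c
assert primal_feasible(x) and dual_feasible(lam)

# Weak duality: lam.b <= c.x for any such pair (here 8 <= 9).
assert sum(lam[i]*b[i] for i in range(2)) <= sum(c[j]*x[j] for j in range(2))
```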

8. Quadratic programming. Compared with LP we now have an extra term in the objective function, i.e. ½x′Qx, where the prime denotes transposition, and Q is a symmetric, positive (semi)definite matrix. For convenience we not only perturb Ax ≥ b, but x ≥ 0 as well (and use z and μ now for the constraints, not for the objective function). This gives

p(y, z) = min {cx + ½x′Qx : Ax ≥ b − y, x ≥ −z},

and

L(x, λ, μ) = cx + ½x′Qx + λ(b − Ax) − μx  if (λ, μ) ≥ 0,
           = −∞  otherwise.

In view of the definiteness of Q we must have that

c + x′Q − λA − μ = 0.

If Q⁻¹ exists, we can express x in terms of the multipliers,

x′ = (−c + λA + μ)Q⁻¹,

from which we obtain that

fᵈ(λ, μ) = −½(c − λA − μ)Q⁻¹(c − λA − μ)′ + λb  if (λ, μ) ≥ 0,
         = −∞  otherwise.


The dual problem becomes that of finding

max {−½(c − λA − μ)Q⁻¹(c − λA − μ)′ + λb : (λ, μ) ≥ 0}

or

max {−½(sR)(sR)′ + λb : s = c − λA − μ, (λ, μ) ≥ 0},

where Q⁻¹ = RR′, for a suitable R.

The fact that x can be expressed in terms of λ and μ is a distinct advantage of quadratic programming over linear programming.
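The closed-form expression for fᵈ can be verified on a tiny instance (our data; Q = 2I so that Q⁻¹ = ½I): evaluating L at the minimizer x′ = (−c + λA + μ)Q⁻¹ must reproduce −½(c − λA − μ)Q⁻¹(c − λA − μ)′ + λb:

```python
# Sketch (our data) checking the QP dual expression.
# One constraint, two variables, Q = 2I (so Q^{-1} = I/2).

c = [1.0, 1.0]
A = [1.0, 2.0]          # single row
b = 2.0
Qinv = 0.5              # Q^{-1} = Qinv * I

lam, mu = 1.0, [0.0, 0.5]                         # any (lam, mu) >= 0
s = [c[j] - lam*A[j] - mu[j] for j in range(2)]   # s = c - lam*A - mu
x = [-Qinv*s[j] for j in range(2)]                # minimizer of L(., lam, mu)

# f_d two ways: via the closed form, and by evaluating L at its minimizer.
fd_closed = -0.5*Qinv*sum(sj*sj for sj in s) + lam*b
L_at_x = (sum(c[j]*x[j] for j in range(2))
          + 0.5*2.0*sum(xj*xj for xj in x)        # (1/2) x'Qx with Q = 2I
          + lam*(b - sum(A[j]*x[j] for j in range(2)))
          - sum(mu[j]*x[j] for j in range(2)))
assert abs(fd_closed - L_at_x) < 1e-12
```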

9. Convex programming. Let

p(y, z) = min {f(x) : g(x) ≤ y, h(x) = z, x ∈ C}

(again using z for constraints). Let f and g be convex functions, with f(x) scalar but g(x), say, m-dimensional, let h be an affine function, say, m′-dimensional, and let C be a convex set. Problems of this type are usually called convex programming problems. We have that

L(x, λ, μ) = f(x) + λg(x) + μh(x)  if λ ≥ 0 and x ∈ C,
           = −∞  if λ ≱ 0 and x ∈ C,
           = +∞  if x ∉ C.

Working out (**) gives the familiar Karush-Kuhn-Tucker conditions,

(KKT)  min {f(x) + λ⁰g(x) + μ⁰h(x) : x ∈ C} = f(x⁰),
       λ⁰g(x⁰) = 0,  g(x⁰) ≤ 0,  h(x⁰) = 0,  λ⁰ ≥ 0.

Because of the assumptions made, if in addition primal optimal solutions exist, and if a suitable regularity condition is satisfied, the KKT-conditions have a solution; see the discussion following Theorem 1.
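The KKT conditions can be checked on a minimal convex program (our example, not from the paper): minimize x₁² + x₂² subject to 1 − x₁ − x₂ ≤ 0, with solution x⁰ = (½, ½) and multiplier λ⁰ = 1:

```python
# KKT conditions on a small convex program (our example):
# min x1^2 + x2^2  subject to  g(x) = 1 - x1 - x2 <= 0.

x0 = (0.5, 0.5)
lam0 = 1.0

g = lambda x: 1 - x[0] - x[1]
f = lambda x: x[0]**2 + x[1]**2
Lag = lambda x: f(x) + lam0*g(x)

# Stationarity of the Lagrange problem: grad f + lam0 * grad g = 0.
grad = (2*x0[0] - lam0, 2*x0[1] - lam0)
assert grad == (0.0, 0.0)

# Complementary slackness, primal and dual feasibility.
assert lam0*g(x0) == 0.0 and g(x0) <= 0 and lam0 >= 0

# min_x L(x, lam0) = f(x0): check on a coarse grid around x0.
vals = [Lag((i/100, j/100)) for i in range(-100, 201) for j in range(-100, 201)]
assert min(vals) >= f(x0) - 1e-9 and abs(Lag(x0) - f(x0)) < 1e-12
```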

10. Nonlinear programming. This is the same as convex programming, except that f, g and C need not be convex, and that h need not be affine. Again we obtain (KKT), but in general we cannot guarantee its solvability. Yet, if a solution is found (accidentally, so to speak), it immediately yields an optimal solution x° for the primal problem, as well as an optimal solution (λ°, μ°) for the dual problem.

Much research on nonlinear programming is going on, but much of it is directed to finding only local solutions.

11. Fenchel duality. An interesting relation is found if we apply the generalized theory to

p(y) = min_x {f(x) − g(x + y) : x ∈ C},

with f a convex function, g a concave function, and C a convex set. Then

L(x, λ) = min_y {f(x) − g(x + y) + λy : x ∈ C},


or, putting y = z − x,

L(x, λ) = min_z {f(x) − g(z) + λz − λx : x ∈ C},

i.e.

L(x, λ) = f(x) − λx + min_z {λz − g(z)}   if x ∈ C,
        = +∞                              if x ∉ C.

Assuming that f(x) = +∞ if x ∉ C, we can simply write

L(x, λ) = f(x) − λx + min_z {λz − g(z)},

so that

f^d(λ) = min_x L(x, λ) = min_z {λz − g(z)} − max_x {λx − f(x)}.

Now, if (**) holds, so f(x°) = f^d(λ°), then it follows from this that

min_x {f(x) − g(x)} = max_λ [min_z {λz − g(z)} − max_x {λx − f(x)}].

Defining the following, so-called conjugate functions,

f*(λ) = max_x {λx − f(x)}   and   g*(λ) = min_x {λx − g(x)},

this becomes

min_x {f(x) − g(x)} = max_λ {g*(λ) − f*(λ)},

which is known as Fenchel duality. It says that the minimum of the difference of f and g is equal to the maximum of the difference of the conjugates of g and f. It holds if, apart from the conditions already mentioned, p(0) is finite and a suitable regularity condition is satisfied (see e.g. [7], [8]).

Notice that f^d(λ) consists of two terms, the first one involving g but not f, and the second one involving f but not g. The success of introducing conjugate functions is, of course, based on this.
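Fenchel duality is easy to check numerically on a one-dimensional example; the functions below are illustrative, not from the text.

```python
import numpy as np

# Fenchel duality check: f(x) = x^2 (convex), g(x) = -|x - 1| (concave), C = R.
xs = np.linspace(-5.0, 5.0, 100001)
primal = np.min(xs**2 + np.abs(xs - 1.0))          # min {f(x) - g(x)}

f_star = lambda lam: lam**2 / 4.0                        # max_x {lam*x - f(x)}
g_star = lambda lam: lam if abs(lam) <= 1 else -np.inf   # min_x {lam*x - g(x)}

lams = np.linspace(-1.0, 1.0, 2001)
dual = max(g_star(l) - f_star(l) for l in lams)    # max {g*(lam) - f*(lam)}

print(primal, dual)   # both 0.75
```

Note how the dual search ranges only over the effective domain of g*, exactly as the two-term structure of f^d suggests.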

Example 6. As an example consider the minimum norm problem as it arises in statistics, and elsewhere. In fact our shortest distance problem too is a (very simple) minimum norm problem, and Examples 1 and 4 deal with applying Fenchel duality to it. Let the norm be

|x|_p = (|x₁|^p + ⋯ + |x_n|^p)^{1/p},   where p > 1,

and let us right away introduce

|λ|_q = (|λ₁|^q + ⋯ + |λ_n|^q)^{1/q},   where q is such that 1/p + 1/q = 1.

For simplicity we drop the indices p and q. The problem is to find

min {|x| : x = s + z, z ∈ M},

where s is a fixed vector and M is a fixed linear subspace of the space of x's. Put C = {x : x = s + z, z ∈ M}, and let

f(x) = 0    if x ∈ C,
     = +∞   if x ∉ C,

g(x) = −|x|.


Then a not very difficult calculation leads to

f*(λ) = λs    if λ ⊥ M,
      = +∞   if this is not so,

g*(λ) = 0    if |λ| ≤ 1,
      = −∞   if |λ| > 1.

It follows that the dual problem is that of finding

max {−λs : λ ⊥ M and |λ|_q ≤ 1}.

Since f, −g and C are convex and a suitable regularity condition happens to be satisfied, it follows that min = max, so that the related Lagrange multiplier rule will work.
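For p = q = 2 the primal and dual values can be compared directly; the subspace M and vector s below are illustrative choices in n = 2 dimensions.

```python
import numpy as np

# Minimum norm sketch with p = q = 2, n = 2: M = span{(1, -1)}, s = (1, 0).
s = np.array([1.0, 0.0])
m = np.array([1.0, -1.0])

# Primal: min |x|_2 over x = s + t*m, t real (searched on a grid).
ts = np.linspace(-5.0, 5.0, 100001)
primal = np.linalg.norm(s + ts[:, None] * m, axis=1).min()

# Dual: max {-lam.s : lam ⟂ M, |lam|_2 <= 1}; here lam ⟂ M means lam = a*(1, 1).
avals = np.linspace(-2**-0.5, 2**-0.5, 20001)        # ensures |lam|_2 <= 1
dual = np.max(-avals * (np.array([1.0, 1.0]) @ s))

print(primal, dual)   # both 1/sqrt(2), so min = max as claimed
```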

12. Optimal Lagrange multipliers as equilibrium prices. Perturbing a given optimization problem means defining a certain perturbation function

p(y, z) = min_x {F(x, z) : x ∈ G(y)}.

If

p(0, 0) ≤ p(y, z) + λ'y + μ'z   for all y and all z, whatsoever,

then we say that the components of (λ', μ') are equilibrium prices. The terms λ'y and μ'z may be considered as correction terms (sometimes called penalty terms, although they might be the opposite of real penalty terms; the very same objection can, however, be raised against the term equilibrium price, and against the term shadow price later on). The inequality says that it does not "pay" to perturb the problem, no matter how small or how large y and z are. In other words, the unperturbed state is one of equilibrium.
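The defining inequality is easy to test on a one-variable LP; the problem below is an illustrative sketch, not from the text.

```python
# Illustrative LP sketch: p(y) = min {x : x >= 1 - y, x >= 0}, so p(y) = max(1 - y, 0).
p = lambda y: max(1.0 - y, 0.0)

ys = [k / 100.0 for k in range(-1000, 1001)]
lam = 1.0
ok = all(p(0.0) <= p(y) + lam * y + 1e-12 for y in ys)
print(ok)    # True: with lam = 1 no perturbation "pays"

bad = all(p(0.0) <= p(y) + 2.0 * y + 1e-12 for y in ys)
print(bad)   # False: lam = 2 is not an equilibrium price
```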

Notice that this definition depends on the way the given problem has been perturbed and that it does not depend on the existence of (x°, λ°, μ°) satisfying (**),

(**)   min_x L(x, λ°, μ°) = L(x°, λ°, μ°) = f(x°),   x° ∈ G(0).

The close relationship that happens to exist between optimal Lagrange multipliers and equilibrium prices appears from the following result.

THEOREM 3. If (x°, λ°, μ°) satisfies (**), then x° is a primal optimal solution and (λ°, μ°) consists of equilibrium prices. Conversely, if x° is a primal optimal solution and (λ°, μ°) consists of equilibrium prices, then (x°, λ°, μ°) satisfies (**).

The proof of this result is simple and may be omitted.

A more general result can be obtained by not assuming that optimal primal solutions exist. Then p(0, 0) is an infimum rather than a minimum. As in practical situations one is not very much interested in this case, we will not go into the details of it, and satisfy ourselves with referring the reader to the literature quoted. We may say, therefore, that equilibrium prices are optimal Lagrange multipliers, and conversely.

Remark. It should be noticed that equilibrium prices have a global character, whereas shadow prices (and shadow costs) have a local character; see the next section. Unfortunately, both types of prices are usually termed shadow prices (or costs)!

13. Optimal Lagrange multipliers and shadow prices; sensitivity analysis. When p is differentiable at (y, z) = (0, 0), which even if we are dealing with LP need not always be the case, then (**) implies that

∂p(0, 0)/∂y = −λ°   and   ∂p(0, 0)/∂z = −μ°,

as is easily shown. It follows that if these derivatives are continuous at (0, 0), we have that


p(y, z) ≈ p(0, 0) − λ°y − μ°z,   for y and z small.

So, if optimal Lagrange multipliers exist and if p is differentiable, the multipliers tell us how fast the unperturbed minimum will change when small perturbations are applied. This property is used in sensitivity analysis (in particular in LP), which is nothing other than the analysis of the behavior of the perturbation function p. Often, one is not satisfied with small changes of the parameters, and this leads to the well-known method of parametric programming, where from time to time one has to compute new optimal multipliers (in LP this corresponds to changes of the basis matrix).

When the optimal Lagrange multipliers are interpreted in a marginal sense as we are doing in this section, they are called shadow prices or shadow costs, depending on the type of constraint that is being perturbed. This terminology is, however, only justified if p is differentiable. For suppose that not only λ° = 5 is optimal, but also that λ° = 7 is optimal, and assume that we are perturbing only one parameter. Then

p(0) ≤ p(y) + 5y   for all y

and

p(0) ≤ p(y) + 7y   for all y,

so that

p(0) ≤ p(y) + λy   for all y

for any λ such that 5 ≤ λ ≤ 7. (Incidentally, this argument shows that the set of optimal Lagrange multiplier vectors is convex.) In particular,

p(0) ≤ p(y) + 6y   for all y.

It follows from Theorem 3, assuming the existence of optimal x°, that λ° = 6 too is an optimal Lagrange multiplier. But it cannot be interpreted as a shadow price (or cost). For if it could, that is if all optimal Lagrange multipliers could be interpreted as shadow prices, then the conclusion would be that the minimum would change at three (and even more) different rates, i.e. at the rates 5, 6 and 7. At most two rates are possible in this example; one in the direction of positive y, and one in the direction of negative y. Assuming that the λ with λ < 5 or λ > 7 are not optimal, and that p is convex and p(0) is finite, it can be shown that λ° = 7 can be interpreted as the shadow price in the positive direction, hence taking y > 0, and that λ° = 5 can be interpreted as the shadow price in the negative direction, hence taking y < 0. But λ° = 6 is no shadow price (or cost) at all!
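The situation can be reproduced with a kinked perturbation function; the particular p below is an illustrative construction whose optimal multipliers fill exactly the interval [5, 7].

```python
# Illustrative convex perturbation function, kinked at y = 0:
# p(y) = max(-5*y, -7*y).  Its optimal multipliers are exactly 5 <= lam <= 7.
p = lambda y: max(-5.0 * y, -7.0 * y)

ys = [k / 100.0 for k in range(-400, 401)]
is_equilibrium = lambda lam: all(p(0.0) <= p(y) + lam * y + 1e-12 for y in ys)

print(is_equilibrium(5.0), is_equilibrium(6.0), is_equilibrium(7.0))  # True True True
print(is_equilibrium(4.9), is_equilibrium(7.1))                       # False False

# Yet p changes at only two one-sided rates at 0, so the interior value 6 is
# an optimal multiplier (an equilibrium price) but no shadow price:
h = 1e-6
print(-(p(h) - p(0.0)) / h, -(p(-h) - p(0.0)) / (-h))   # about 5 and 7
```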

The conclusion is that it is not legitimate to regard any optimal Lagrange multiplier as a shadow price (or cost), hence that equilibrium prices and shadow prices (and costs) are different notions. The former are related to optimality and duality, the latter to optimality and sensitivity. Often, i.e. when the perturbation function is (continuously) differentiable at zero, there is no numerical difference between the two, at least if equilibrium prices exist.

Even though there is an important difference between the two, in most of the applied literature both are indicated by shadow price (or cost) and the term equilibrium price is not used at all. It seems that this has caused much confusion, in particular for those whose mathematical background is not too strong.

Example 7. Here is an example where the shadow price is 2 (in both directions), but no equilibrium prices exist:

p(y) = min {f(x) : x ≤ 1 + y, x ≥ 0},   with f(x) = −2x    if 0 ≤ x ≤ 10,
                                                  = −100   if x ≥ 10.
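A numerical sketch of this example follows. The objective on [0, 10] is garbled in the source; the code assumes f(x) = −2x there, an assumption chosen to be consistent with the stated shadow price of 2.

```python
import numpy as np

# Sketch of Example 7 (assumed objective f(x) = -2x on [0, 10]):
# p(y) = min {f(x) : x <= 1 + y, x >= 0}.
def f(x):
    return -100.0 if x >= 10.0 else -2.0 * x

def p(y):
    xs = np.linspace(0.0, max(1.0 + y, 0.0), 20001)
    return min(f(x) for x in xs)

# Shadow price 2 in both directions:
h = 1e-3
print(-(p(h) - p(0.0)) / h, -(p(-h) - p(0.0)) / (-h))   # both about 2

# No equilibrium price: near y = 0 the inequality forces lam <= 2, but the
# drop to -100 at y = 9 forces lam >= 98/9.  In particular lam = 2 fails:
print(p(0.0) <= p(9.0) + 2.0 * 9.0)   # False
```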


Example 8. This example is somewhat trivial, but the situation it represents could well occur in some practical LP problem. Let

p(y) = min {x : x ≥ 1 − y, 2x ≥ 2, x ≥ 0}.

Then any λ such that 0 ≤ λ ≤ 1 is an equilibrium price, in particular λ = ½, but this latter value cannot be a shadow price.
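This can be checked directly. The constraint x ≥ 1 − y is a reconstruction of a garbled line in the source; it makes the example work because 2x ≥ 2 duplicates x ≥ 1, so relaxing one copy is absorbed by the other.

```python
# Sketch of Example 8 with the reconstructed perturbation x >= 1 - y:
# p(y) = min {x : x >= 1 - y, 2x >= 2, x >= 0} = max(1 - y, 1).
p = lambda y: max(1.0 - y, 1.0)

ys = [k / 100.0 for k in range(-500, 501)]
is_equilibrium = lambda lam: all(p(0.0) <= p(y) + lam * y + 1e-12 for y in ys)

print(is_equilibrium(0.0), is_equilibrium(0.5), is_equilibrium(1.0))  # True True True
# But lam = 1/2 matches neither one-sided rate of change of p at 0:
h = 1e-6
print(-(p(h) - p(0.0)) / h, -(p(-h) - p(0.0)) / (-h))   # about 0 and 1
```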

Let us finally return to Examples 4 and 5. Why are the optimal multipliers pairwise equal, up to a sign? This is because they may be interpreted as shadow prices (in both directions) and because perturbing the position of the center of the circle amounts to the same as perturbing the position of the point (the origin) to which the square of the distance to the circle is computed, if only the perturbations (y₁, y₂) and (−y₁, −y₂) are applied, respectively.

14. Optimal Lagrange multipliers and imputation?? In several texts on economics, and even in some on operations research, it is suggested that optimal λ° can be used to impute, say, the maximum profit to the scarce resources. Would this mean that there is a third interpretation of optimal Lagrange multipliers? It seems that this is not so, because it seems that such an imputation is not properly defined. For suppose that λᵢ° = 0 for some i, that the effect of the ith resource is described by gᵢ(x) ≤ bᵢ, where bᵢ is the amount of the ith resource that is available, and that the right-hand side of this constraint is perturbed, leading to the multiplier λᵢ. Assume further that gᵢ(x°) = 5, so that 5 units of the ith resource are used in order to realize the optimal solution, and that the maximum profit would be strictly lower if only 3 units would have been available, hence if bᵢ would have been replaced by 3. Because λᵢ° = 0 nothing would be imputed to the ith resource, which sounds like: the ith resource is of no value at all! Yet, more than 3 units of it are required in order to realize the maximum profit. Hence it would seem that imputation by means of optimal Lagrange multipliers is not correct in general. The point of the example is, of course, that any additional unit of the ith resource is of no value at all (starting from the situation that there are at least 5 of them). Or, that the shadow price of the ith resource in the direction of additional units is equal to zero (whether it is also zero in the opposite direction remains to be seen).

Another objection against the indicated imputation can be raised when for some i all λᵢ between, say, 5 and 7 are optimal, whereas all λⱼ°, for j ≠ i, are unique. What would be the "value" of the ith resource in that case?

It would seem, therefore, that attempts to use optimal Lagrange multipliers to impute something like maximum profit to available resources cannot be successful.

15. Final remarks. (a) Because of Theorem 3 one can also start the development of the theory from the existence of equilibrium prices, hence from the inequality

p(0, 0) ≤ p(y, z) + λ'y + μ'z   for all (y, z),

which is a special case of

p(y', z') ≤ p(y, z) + λ'(y − y') + μ'(z − z')   for all (y', z') and all (y, z).

Obviously, the latter inequality is much stronger than the former. It is sufficient for equilibrium prices to exist, but not at all necessary. It expresses the fact that at each point (y', z', p(y', z')) of the graph of p there exists a supporting hyperplane that nowhere is "above" the graph of p, but on the other hand has (y', z', p(y', z')) in common with that graph. Since the components of (λ', μ') are finite numbers, this supporting hyperplane is "nonvertical." It follows from this that p must be a convex function!


Conversely, if p is a convex function then such hyperplanes exist at any point (y', z', p(y', z')) of the graph of p, except perhaps for some boundary points of the set of points (y, z) where p(y, z) < +∞ (this is the effective domain of p). If, for example, p(y) = −√y for y ≥ 0, and p(y) = +∞ for y < 0, then p is convex, but no λ' exists such that p(0) ≤ p(y) + λ'y for all y. In fact, only a "vertical" supporting hyperplane exists at (y' = 0, p(0)). It follows that convexity and the finiteness of p(0, 0) are not sufficient to guarantee the existence of equilibrium prices. Some regularity condition (such as a constraint qualification) is required to exclude the possibility of "vertical" supporting hyperplanes.

(b) Still another way to develop the theory is to start from a function that will serve as the Lagrange function. For let us compute max_{λ,μ} L(x, λ, μ); then we obtain

max_{λ,μ} [min_{y,z} {F(x, z) + λy + μz : x ∈ G(y)}].

Assume that we may interchange max and min here. Then the result is

f(x)   if x ∈ G(0) = G,
+∞     if x ∉ G   (if F(x, z) > −∞ for all (x, z)).

In other words, if max-min = min-max, we can recover G from L, and then f(x) for x ∈ G. We could even recover the perturbed primal problem formulation, namely by computing max_{λ,μ} {L(x, λ, μ) − λy − μz}, again assuming that max and min may be interchanged.

(c) It is possible to perturb the objective function only, because the constraints can be absorbed into it if one is willing to accept infinite objective function values. Instead of min_x {F(x, z) : x ∈ G(y)}, we could then take

F̄(x, y, z) = F(x, z)   if x ∈ G(y),
           = +∞        if x ∉ G(y),

leading to

min_x F̄(x, y, z),

where we can just as well replace (y, z) by y, so that we get

min_x F̄(x, y).

Conversely, if such a bifunction F̄(x, y) is given, define G(y) by

G(y) = {x : F̄(x, y) < +∞}.

Then we may write, with z = y and F(x, z) = F̄(x, y) if x ∈ G(y),

min_x {F(x, y) : x ∈ G(y)}.

For more details the reader may consult [7] or [8].

(d) In the theory of converse duality one regards the dual problem as a new primal problem and by dualizing it one tries to return to the original primal problem. This means that the components of x will serve as the Lagrange multipliers of the dual problem, which requires the introduction of a vector of perturbations into the dual problem. Since f^d in general will take on infinite values, it is now more convenient to use a bifunction F(x, y) (see remark (c) above, omitting the bar in F̄(x, y)). Then

L(x, λ) = min_y {F(x, y) + λy}


and the perturbed dual problem is defined as that of finding

F^d(λ, ν) = min_x {L(x, λ) − νx} = min_{x,y} {F(x, y) − νx + λy}.

Similarly, the bidual problem is defined as that of finding

F^dd(x, y) = max_{λ,ν} {F^d(λ, ν) + νx − λy},

and the aim is to show that F^dd(x, y) = F(x, y), so that the dual of the dual is the (original) primal again.

(e) Conjugation not only plays its part in Fenchel duality (section 11), but also when defining the Lagrange function, the dual objective function and the bidual objective function. In particular, F^dd is nothing other than the biconjugate of F with respect to (x, y). Similarly, F^d is the conjugate of F with respect to (x, y), except for some signs, and L is the conjugate of F with respect to only y, again except for some signs.

(f) Up to now we have restricted ourselves to finite dimensional x, λ, and μ, but everything can equally well be applied when, say,

x: t → x(t) ∈ Rⁿ,   with t in some interval of the real axis.

Something similar can be done for λ and μ (and for y and z).

As an example, suppose that ẋ(t) = φ(x(t), u(t), t) is the state equation of an optimal control problem, where u(t) is the control and x(t) is the state at time t, with 0 ≤ t ≤ 1. Instead of x we now regard (x, u) as the variable of this optimization problem, and we let the perturbed constraint be ẋ(t) = φ(x(t), u(t), t) + y(t), so that y too becomes a function of the continuously varying t, and the same holds for λ. Instead of the linear functional λy = λ₁y₁ + λ₂y₂ + ⋯ we have been using hitherto, we must now introduce something else, say

y /"J0 X(t)y(t) dt.

One has to be careful here, because the functional to be introduced not only has to be linear, but also bounded (the latter is, of course, implied in the finite dimensional case). This means that one must carefully select the space of y's, as well as that of λ's, and similarly for the space of z's and that of μ's. In fact one should select the right pair of spaces; see [7] and in particular [8].

As it turns out, the multiplier λ is nothing other than the adjoint variable of the optimal control problem. Further, results like the maximum (or minimum) principle and transversality conditions can be derived in a straightforward manner. Thus one obtains an interesting variant of the calculus of variations. For further reading, see [2], [5], [7].

(g) Lagrange multiplier methods can also be applied to stochastic optimization problems, such as multi-period stochastic inventory problems.

(h) As said before, the terms λy and μz may be regarded as correction (or penalty) terms. Instead of these linear terms, one may also consider arbitrary terms. In the literature they occur in two (equivalent) formulations. Either they are given the form λ(y) or μ(z), and then one speaks of price functions (see [9]), or they are given the form c(λ, y), where c is a fixed given functional called coupling functional (see [1]), "coupling" perturbations with what no longer need be multipliers.

It is well known that linear correction terms (as we have been using) when applied to linear integer programming usually do not result in a solvable set of conditions (**). With more general price functions or coupling functionals one can make (**) again solvable (see e.g. [9]), but it remains to be seen whether this will lead to computationally attractive procedures.
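The failure of linear correction terms can be seen on the smallest possible integer program; the problem below is an illustrative sketch.

```python
# A two-point integer program with a duality gap, showing why linear
# correction terms fail:  min {-x : 2x <= 1, x in {0, 1}}.
primal = min(-x for x in (0, 1) if 2 * x <= 1)      # only x = 0 is feasible

# Lagrangian dual with linear term lam*(2x - 1), lam >= 0:
dual_obj = lambda lam: min(-x + lam * (2 * x - 1) for x in (0, 1))
dual = max(dual_obj(k / 1000.0) for k in range(5001))

print(primal, dual)   # 0 and -0.5: the positive gap means (**) has no solution
```

A nonlinear price function can close such a gap, at the cost of a more complicated Lagrange problem.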


It is also conceivable to go one step further, and to replace F(x, z) + λy + μz by something like F(x, y, z, λ, μ).

REFERENCES

[1] E. J. BALDER, An extension of duality-stability relations to nonconvex optimization, SIAM J. Control Optim., 15 (1977), pp. 329-343.
[2] B. D. CRAVEN, Mathematical Programming and Optimal Control Theory, Chapman and Hall, London, 1978.
[3] M. A. H. DEMPSTER, Elements of Optimization, Chapman and Hall, London, 1975.
[4] A. M. GEOFFRION, Duality in nonlinear programming: a simplified applications-oriented development, this Review, 13 (1971), pp. 1-37.
[5] D. G. LUENBERGER, Optimization by Vector Space Methods, John Wiley, New York, 1969.
[6] O. L. MANGASARIAN, Nonlinear Programming, McGraw-Hill, New York, 1969.
[7] J. PONSTEIN, Approaches to the Theory of Optimization, Cambridge Univ. Press, Cambridge, 1980.
[8] R. T. ROCKAFELLAR, Conjugate Duality and Optimization, CBMS Regional Conference Series in Applied Mathematics 16, Society for Industrial and Applied Mathematics, Philadelphia, 1974.
[9] L. A. WOLSEY, Integer Programming Duality: Price Functions and Sensitivity Analysis, CORE, Leuven, 1978.
