
3.6 Lagrangian relaxation

Consider a generic ILP problem

$$\min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{Z}^n \}$$

with integer coefficients.

Suppose $Dx \ge d$ are the "complicating" constraints, in the sense that the ILP without them is "easy".

Often the linear relaxation and the relaxation by elimination of $Dx \ge d$ yield weak bounds (e.g., TSP/UFL deleting cut-set/demand constraints).

More general setting:

$$\min \{ c^T x : Dx \ge d,\ x \in X \subseteq \mathbb{R}^n \} \qquad (1)$$

Idea: Delete the complicating constraints $Dx \ge d$ and, for each one of them, add to the objective function a term with a multiplier $u_i$ which penalizes its violation and is $\le 0$ for all feasible solutions of problem (1).

Edoardo Amaldi (PoliMI) Optimization Academic year 2015-16 1 / 43


Lagrangian subproblem

Definition: Given a problem

$$z^* = \min \{ c^T x : Dx \ge d,\ x \in X \subseteq \mathbb{R}^n \} \qquad (2)$$

For each vector of Lagrange multipliers $u \ge 0$, the Lagrangian subproblem is

$$w(u) = \min \{ c^T x + u^T (d - Dx) : x \in X \subseteq \mathbb{R}^n \} \qquad (3)$$

where

$L(x, u) = c^T x + u^T (d - Dx)$ is the Lagrangian function of the primal problem (2),

$w(u) = \min \{ L(x, u) : x \in X \subseteq \mathbb{R}^n \}$ is the dual function.

Proposition: For any $u \ge 0$, the Lagrangian subproblem (3) is a relaxation of problem (2).

Proof: Clearly $\{x \in X : Dx \ge d\} \subseteq X$. Moreover, for each $u \ge 0$ and each feasible $x$ for problem (2), we have $w(u) \le c^T x$. Indeed $w(u) \le c^T x + u^T (d - Dx) \le c^T x$, since $w(u)$ is the minimum value of $c^T x + u^T (d - Dx)$ for $x \in X$ and $u^T (d - Dx) \le 0$.

Corollary: If $z^* = \min \{ c^T x : Dx \ge d,\ x \in X \}$ is finite, then $w(u) \le z^*$ for all $u \ge 0$.


Lagrangian dual

To determine the tightest lower bound, we maximize the dual function over the multipliers.

Definition: The Lagrangian dual of the primal problem (2) is

$$w^* = \max_{u \ge 0} w(u) \qquad (4)$$

The only constraints are the nonnegativity constraints on $u$.

N.B.: By relaxing (dualizing) linear constraints, the objective function remains linear.

The other constraints can be of any type, provided subproblem (3) is "sufficiently easy".

For Linear Programs the Lagrangian dual coincides with the LP dual (cf. exercise).

Corollary (Weak Duality): For each pair of feasible solutions $x \in \{x \in X : Dx \ge d\}$ of the primal problem (2) and $u \ge 0$ of the Lagrangian dual (4), we have

$$w(u) \le c^T x. \qquad (5)$$


Consequences:

i) If $x$ is feasible for the primal problem (2), $u$ is feasible for the Lagrangian dual (4) and $w(u) = c^T x$, then $x$ is optimal for the primal (2) and $u$ is optimal for the dual (4).

ii) In particular $w^* = \max_{u \ge 0} w(u) \le z^* = \min \{ c^T x : Dx \ge d,\ x \in X \}$. If either the primal or the dual is unbounded (admits no finite optimal solution), then the other one is infeasible.

Recall: For any pair of primal-dual Linear Programs (LPs) that are bounded we have strong duality, namely $w^* = z^*$.

Observation: Unlike for LPs, discrete optimization problems can have a duality gap, that is $w^* < z^*$.

Lagrangian relaxation of equality constraints:

Only difference: the Lagrange multipliers associated with equality constraints are unrestricted in sign, namely $u_i \in \mathbb{R}$.

If all the $m$ relaxed/dualized constraints are equality constraints, the Lagrangian dual is:

$$\max_{u \in \mathbb{R}^m} w(u)$$


Example 1: Binary knapsack

Consider

$$\begin{aligned}
\max\ & z = 10x_1 + 4x_2 + 14x_3 \\
\text{s.t.}\ & 3x_1 + x_2 + 4x_3 \le 4 \\
& x_1, x_2, x_3 \in \{0, 1\}
\end{aligned}$$

Relaxing the capacity constraint, we obtain the Lagrangian function:

$$L(x, u) = 10x_1 + 4x_2 + 14x_3 + u(4 - 3x_1 - x_2 - 4x_3)$$

Dual function: $w(u) = \max_{x \in \{0,1\}^3} 10x_1 + 4x_2 + 14x_3 + u(4 - 3x_1 - x_2 - 4x_3)$ for all $u \ge 0$.

Thus the Lagrangian subproblem

$$w(u) = \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u$$

can be solved in linear time by setting to 1 (0) the variables with positive (negative) coefficient and choosing an arbitrary value for those with zero coefficient.

Lagrangian dual (a minimization, since the primal is a maximization problem and $w(u)$ is an upper bound on its optimal value):

$$\min_{u \ge 0} w(u) = \min_{u \ge 0} \Big( \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \Big)$$


Dual function: $w(u) = \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u$

Values of $u$ for which the coefficients of $x_1, x_2, x_3$ are nonpositive: $u \ge 10/3$ for $x_1$, $u \ge 14/4$ for $x_3$ and $u \ge 4$ for $x_2$.

Optimal solution of the Lagrangian subproblem as a function of $u$:

$x = (1, 1, 1)$ for $u \in [0, 10/3]$,

$x = (0, 1, 1)$ for $u \in [10/3, 14/4]$,

$x = (0, 1, 0)$ for $u \in [14/4, 4]$,

$x = (0, 0, 0)$ for $u \in [4, \infty)$.

Thus

$$w(u) = \begin{cases}
28 - 4u & \text{for } u \in [0, 10/3] \\
18 - u & \text{for } u \in [10/3, 14/4] \\
4 + 3u & \text{for } u \in [14/4, 4] \\
4u & \text{for } u \in [4, \infty)
\end{cases}$$
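The inspection rule for the knapsack subproblem can be sketched in a few lines of Python. This is only an illustrative check, not part of the original slides: the names `dual_function` and `piecewise` are hypothetical, exact rational arithmetic is used to avoid rounding at the breakpoints, and ties are broken by setting zero-coefficient variables to 0.

```python
from fractions import Fraction

PROFITS = [10, 4, 14]
WEIGHTS = [3, 1, 4]
CAPACITY = 4


def dual_function(u):
    """Solve the Lagrangian subproblem by inspection; return (w(u), x(u))."""
    # Item i is selected only if its reduced profit p_i - u * a_i is positive
    # (zero-coefficient variables are set to 0).
    x = [1 if p - u * a > 0 else 0 for p, a in zip(PROFITS, WEIGHTS)]
    value = sum((p - u * a) * xi for p, a, xi in zip(PROFITS, WEIGHTS, x))
    return value + CAPACITY * u, x


def piecewise(u):
    """Closed-form expression of w(u) for this instance."""
    if u <= Fraction(10, 3):
        return 28 - 4 * u
    if u <= Fraction(14, 4):
        return 18 - u
    if u <= 4:
        return 4 + 3 * u
    return 4 * u


# Compare the two expressions on the grid u = 0, 1/12, 2/12, ..., 6,
# which contains the breakpoints 10/3, 14/4 and 4.
grid = [Fraction(k, 12) for k in range(73)]
assert all(dual_function(u)[0] == piecewise(u) for u in grid)
```

Evaluating `dual_function` at each breakpoint gives the values at which the linear pieces meet.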


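[Figure: graph of the dual function $w(u)$ against $u$, a piecewise linear curve with $w(0) = 28$, $w(10/3) = 44/3$, $w(14/4) = 29/2$ and $w(4) = 16$.]

The Lagrangian dual

$$\min_{u \ge 0} w(u) = \min_{u \ge 0} \Big( \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \Big)$$

has optimal solution $u^* = 14/4$ with $w^* = w(u^*) = 29/2$.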


Example 2: Uncapacitated Facility Location (UFL)

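Consider the variant with profits $p_{ij}$ and fixed costs $f_j$ for opening the depots in the candidate sites, where we wish to maximize the total profit.

MILP formulation:

$$\begin{aligned}
z^* = \max\ & \sum_{i \in M} \sum_{j \in N} p_{ij} x_{ij} - \sum_{j \in N} f_j y_j \\
\text{s.t.}\ & \sum_{j \in N} x_{ij} = 1 && \forall i \in M && (6) \\
& x_{ij} \le y_j && \forall i \in M,\ j \in N \\
& y_j \in \{0, 1\} && \forall j \in N \\
& 0 \le x_{ij} \le 1 && \forall i \in M,\ j \in N
\end{aligned}$$

Relaxing the demand constraints (6), we obtain the Lagrangian subproblem:

$$\begin{aligned}
w(u) = \max\ & \sum_{i \in M} \sum_{j \in N} (p_{ij} - u_i) x_{ij} - \sum_{j \in N} f_j y_j + \sum_{i \in M} u_i \\
\text{s.t.}\ & x_{ij} \le y_j && \forall i \in M,\ j \in N && (7) \\
& y_j \in \{0, 1\} && \forall j \in N && (8) \\
& 0 \le x_{ij} \le 1 && \forall i \in M,\ j \in N && (9)
\end{aligned}$$

which decomposes into $|N|$ independent subproblems, one for each candidate site $j$.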


Indeed $w(u) = \sum_{j \in N} w_j(u) + \sum_{i \in M} u_i$, where

$$\begin{aligned}
w_j(u) = \max\ & \sum_{i \in M} (p_{ij} - u_i) x_{ij} - f_j y_j && && (10) \\
\text{s.t.}\ & x_{ij} \le y_j && \forall i \in M \\
& y_j \in \{0, 1\} \\
& 0 \le x_{ij} \le 1 && \forall i \in M
\end{aligned}$$

For each $j \in N$, the subproblem (10) can be solved by inspection:

If $y_j = 0$, then $x_{ij} = 0$ for each $i$ and the objective function value is 0.

If $y_j = 1$, it is optimal to serve all clients with positive reduced profit, namely $x_{ij} = 1$ for all indices $i$ such that $p_{ij} - u_i > 0$, with an objective function value of $\sum_{i \in M} \max\{p_{ij} - u_i, 0\} - f_j$.

Thus $w_j(u) = \max\{0,\ \sum_{i \in M} \max\{p_{ij} - u_i, 0\} - f_j\}$.
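The closed form for $w_j(u)$ translates directly into code. The sketch below is a minimal illustration; the function name `ufl_dual_function` and the small data set (3 clients, 2 sites) are hypothetical and not taken from the slides.

```python
def ufl_dual_function(profit, fixed_cost, u):
    """Evaluate w(u) for the UFL Lagrangian subproblem by inspection.

    profit[i][j] is p_ij, fixed_cost[j] is f_j and u[i] is the multiplier
    of the demand constraint of client i. Returns (w(u), y, x).
    """
    m, n = len(profit), len(fixed_cost)
    total = sum(u)
    y = [0] * n
    x = [[0] * n for _ in range(m)]
    for j in range(n):
        # Value of opening site j and serving every client with positive reduced profit.
        gain = sum(max(profit[i][j] - u[i], 0) for i in range(m)) - fixed_cost[j]
        if gain > 0:
            y[j] = 1
            for i in range(m):
                if profit[i][j] - u[i] > 0:
                    x[i][j] = 1
            total += gain
    return total, y, x


# Hypothetical instance, for illustration only.
profit = [[6, 2], [4, 5], [3, 7]]
fixed_cost = [3, 4]
u = [2, 3, 4]
w, y, x = ufl_dual_function(profit, fixed_cost, u)
```

In this instance the second client is served by both sites in the subproblem solution, which shows that the dualized demand constraints (6) may be violated by $x(u)$.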

Example: see Chapter 10 of L. Wolsey, Integer Programming, p. 169-170


Optimal solutions of the Lagrangian subproblem and of the primal problem

Solving the Lagrangian subproblem (3) may sometimes yield an optimal solution of the primal problem (2).

Proposition: If $u \ge 0$ and

i) $x(u)$ is an optimal solution of the Lagrangian subproblem (3),

ii) $Dx(u) \ge d$,

iii) $(Dx(u))_i = d_i$ for each Lagrange multiplier $u_i > 0$ (complementary slackness conditions),

then $x(u)$ is an optimal solution of the primal problem (2).

Proof: Due to (i) we have $w^* \ge w(u) = c^T x(u) + u^T (d - Dx(u))$ and due to (iii) we have $c^T x(u) + u^T (d - Dx(u)) = c^T x(u)$.

According to (ii), $x(u)$ is a feasible solution of the primal (2) and hence $c^T x(u) \ge z^*$.

Therefore $w^* \ge c^T x(u) + u^T (d - Dx(u)) = c^T x(u) \ge z^*$ and, since $w^* \le z^*$, everything holds with equality and $x(u)$ is an optimal solution of the primal (2).

Observation: If Lagrangian relaxation is applied to equality constraints, conditions (iii) are automatically satisfied, and an optimal solution of the Lagrangian subproblem is optimal for the primal problem (2) if it is feasible.


Property of the Lagrangian dual:

Proposition: The dual function w(u) is concave.

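Proof: Consider any pair $u_1 \ge 0$ and $u_2 \ge 0$.

For any $\alpha$ such that $0 \le \alpha \le 1$, let $\bar{x}$ be an optimal solution of the Lagrangian subproblem (3) for $\bar{u} = \alpha u_1 + (1 - \alpha) u_2$, namely $w(\bar{u}) = c^T \bar{x} + \bar{u}^T (d - D\bar{x})$.

By definition of $w(u)$, we have $w(u_1) \le c^T \bar{x} + u_1^T (d - D\bar{x})$ and $w(u_2) \le c^T \bar{x} + u_2^T (d - D\bar{x})$.

Multiplying the first inequality by $\alpha$ and the second one by $1 - \alpha$, and adding them, we obtain

$$\alpha w(u_1) + (1 - \alpha) w(u_2) \le c^T \bar{x} + (\alpha u_1 + (1 - \alpha) u_2)^T (d - D\bar{x}) = w(\alpha u_1 + (1 - \alpha) u_2).$$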


3.6.1 LP characterization of the Lagrangian dual

How strong is the lower bound on z∗ we obtain by solving the Lagrangian dual?

Theorem: Consider a generic ILP problem

$$\min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{Z}^n \}$$

with integer coefficients.

Let $w(u) = \min \{ c^T x + u^T (d - Dx) : Ax \ge b,\ x \in \mathbb{Z}^n \}$ be the dual function, $w^* = \max_{u \ge 0} w(u)$ be the optimal value of the Lagrangian dual and $X = \{ x \in \mathbb{Z}^n : Ax \ge b \}$. Then

$$w^* = \min \{ c^T x : Dx \ge d,\ x \in \mathrm{conv}(X) \}. \qquad (11)$$

The problem is "convexified": the Lagrangian dual is characterized in terms of a Linear Program.

Corollary 1: Since $\mathrm{conv}(X) \subseteq \{ x \in \mathbb{R}^n : Ax \ge b \}$,

$$z_{LP} = \min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{R}^n \} \le w^* \le z^*.$$

Depending on the objective function, these inequalities can be strict, i.e., $z_{LP} < w^* < z^*$.


Illustration: from D. Bertsimas, R. Weismantel, Optimization over integers, Dynamic Ideas, 2005, p. 144-146

Consider the ILP problem

$$\begin{aligned}
\min\ & 3x_1 - x_2 \\
\text{s.t.}\ & x_1 - x_2 \ge -1 \\
& -x_1 + 2x_2 \le 5 \\
& 3x_1 + 2x_2 \ge 3 \\
& 6x_1 + x_2 \le 15 \\
& x_1, x_2 \ge 0 \text{ integer}
\end{aligned}$$

- Represent graphically the feasible region and the optimal solutions of the ILP and of its linear relaxation: $x_{ILP} = (1, 2)$ with $z_{ILP} = 1$ and $x_{LP} = (1/5, 6/5)$ with $z_{LP} = -3/5$.

- Apply Lagrangian relaxation to the first constraint.

For every $u \ge 0$, the Lagrangian subproblem is

$$w(u) = \min_{(x_1, x_2) \in X} 3x_1 - x_2 + u(-1 - x_1 + x_2)$$

where $X$ is the set of all integer solutions that satisfy all the other constraints.


- Use the theorem to find the optimal value of the Lagrangian dual

$$w^* = \max_{u \ge 0} w(u)$$

and the corresponding optimal solution $x_D$ of the convexified problem (11).

Represent $\mathrm{conv}(X)$ and the polyhedron $\mathrm{conv}(X) \cap \{ (x_1, x_2) \in \mathbb{R}^2 : x_1 - x_2 \ge -1 \}$.

We obtain $x_D = (1/3, 4/3)$ with $w^* = -1/3$.

Thus, we have: $z_{LP} = -3/5 < w^* = -1/3 < z_{ILP} = 1$.

Drawing the dual function $w(u)$, it is possible to verify that the optimal solution of the Lagrangian dual is $u^* = 5/3$ with $w^* = -1/3$.
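Since $X$ is tiny in this illustration, the values above can be checked by brute-force enumeration. The following Python sketch is only a verification aid under that assumption; the grid of multipliers and the helper name `w` are illustrative choices, and enumeration is of course not how the dual is solved in general.

```python
from fractions import Fraction

# X: integer points satisfying every constraint except the dualized one
# x1 - x2 >= -1. The constraints imply x1 <= 2 and x2 <= 3, so the ranges
# below cover all candidates.
X = [(x1, x2) for x1 in range(3) for x2 in range(4)
     if -x1 + 2 * x2 <= 5 and 3 * x1 + 2 * x2 >= 3 and 6 * x1 + x2 <= 15]


def w(u):
    """Dual function: minimum of the Lagrangian over the finite set X."""
    return min(3 * x1 - x2 + u * (-1 - x1 + x2) for x1, x2 in X)


# Optimal value of the ILP: minimize over the points of X that also
# satisfy the dualized constraint.
z_ilp = min(3 * x1 - x2 for x1, x2 in X if x1 - x2 >= -1)

# Maximize w(u) over the grid u = 0, 1/12, ..., 5 (which contains 5/3).
grid = [Fraction(k, 12) for k in range(61)]
u_star = max(grid, key=w)
w_star = w(u_star)
```

On this grid the maximizer is $u = 5/3$ with value $-1/3$, in agreement with the bound obtained from the LP characterization.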


Proof: Consider the case where $X = \{x^1, \ldots, x^k\}$ contains a finite (even though huge) number of feasible solutions.

The Lagrangian dual amounts to maximizing a nondifferentiable piecewise linear concave function:

$$w^* = \max_{u \ge 0} w(u) = \max_{u \ge 0} \Big\{ \min_{x \in X} [c^T x + u^T (d - Dx)] \Big\} = \max_{u \ge 0} \Big\{ \min_{1 \le l \le k} [c^T x^l + u^T (d - Dx^l)] \Big\}$$

and it can be expressed as the following Linear Program:

$$\begin{aligned}
w^* = \max\ & y \\
\text{s.t.}\ & c^T x^l + u^T (d - Dx^l) \ge y && \forall l \\
& u \ge 0,\ y \in \mathbb{R}
\end{aligned}$$

which contains a huge number of constraints.

Taking the dual of this LP and applying strong duality, we obtain:

$$\begin{aligned}
w^* = \min\ & \sum_{l=1}^{k} \mu_l (c^T x^l) \\
\text{s.t.}\ & \sum_{l=1}^{k} \mu_l (Dx^l - d) \ge 0 \\
& \sum_{l=1}^{k} \mu_l = 1 \\
& \mu_l \ge 0 && \forall l.
\end{aligned}$$


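Setting $x = \sum_{l=1}^{k} \mu_l x^l$ with $\sum_{l=1}^{k} \mu_l = 1$ and $\mu_l \ge 0$ for each $l$, we obtain:

$$\begin{aligned}
w^* = \min\ & c^T x \\
\text{s.t.}\ & Dx \ge d \\
& x \in \mathrm{conv}(X).
\end{aligned}$$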

The result can be extended to the case where X is the feasible region of any ILP.

Example of a nondifferentiable piecewise linear concave dual function w(u).


In some cases, the lower bound on $z^*$ obtained by Lagrangian relaxation is not stronger than the linear relaxation bound.

Corollary 2: If $X = \{ x \in \mathbb{Z}^n : Ax \ge b \}$ and $\mathrm{conv}(X) = \{ x \in \mathbb{R}^n : Ax \ge b \}$, then

$$w^* = \max_{u \ge 0} w(u) = z_{LP} = \min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{R}^n \}.$$

Example: Given a generic binary knapsack problem

$$\begin{aligned}
\max\ & z = \sum_{i=1}^{n} p_i x_i \\
\text{s.t.}\ & \sum_{i=1}^{n} a_i x_i \le b \\
& x_i \in \{0, 1\} && \forall i
\end{aligned}$$

and its linear relaxation

$$z_{LP-KP} = \max_{x \in [0,1]^n} \Big\{ \sum_{i=1}^{n} p_i x_i : \sum_{i=1}^{n} a_i x_i \le b \Big\},$$

the Lagrangian relaxation of the capacity constraint is as weak as the linear relaxation.

Indeed, $X = \{0, 1\}^n$ and obviously $\mathrm{conv}(X) = [0, 1]^n$, whose constraints are already contained in the linear relaxation.

According to Corollary 2, we have $w^* = z_{LP-KP}$.
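The equality $w^* = z_{LP-KP}$ can be checked numerically on the instance of Example 1. The sketch below is illustrative and not from the slides: it assumes positive weights and capacity, solves the LP relaxation with the standard greedy rule (items sorted by profit/weight ratio), and evaluates the dual function at $u = 0$ and at the breakpoints $u = p_i / a_i$, where a piecewise linear convex function with positive final slope attains its minimum. The function names are hypothetical.

```python
from fractions import Fraction


def knapsack_lp_bound(profits, weights, capacity):
    """Optimal value of the LP relaxation via the greedy ratio rule."""
    items = sorted(zip(profits, weights),
                   key=lambda pw: Fraction(pw[0], pw[1]), reverse=True)
    remaining = Fraction(capacity)
    value = Fraction(0)
    for p, a in items:
        take = min(Fraction(1), remaining / a)  # fraction of the item packed
        value += take * p
        remaining -= take * a
        if remaining == 0:
            break
    return value


def knapsack_lagrangian_bound(profits, weights, capacity):
    """Minimum of w(u) = sum_i max(p_i - u a_i, 0) + b u over u >= 0."""
    def w(u):
        return sum(max(p - u * a, 0) for p, a in zip(profits, weights)) + capacity * u

    # Candidate minimizers: u = 0 and the breakpoints of w.
    candidates = [Fraction(0)] + [Fraction(p, a) for p, a in zip(profits, weights)]
    return min(w(u) for u in candidates)


z_lp = knapsack_lp_bound([10, 4, 14], [3, 1, 4], 4)
w_star = knapsack_lagrangian_bound([10, 4, 14], [3, 1, 4], 4)
```

Both bounds evaluate to $29/2$ on this instance, the value of $w^*$ found in Example 1.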


3.6.2 Solution of the Lagrangian duals

Generalization of the gradient method for functions of class $C^1$ to functions that are piecewise $C^1$ (not everywhere differentiable).

Definition: Let $C \subseteq \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a convex function on $C$.

• a vector $\gamma \in \mathbb{R}^n$ is a subgradient of $f$ at $\bar{x} \in C$ if

$$f(x) \ge f(\bar{x}) + \gamma^T (x - \bar{x}) \quad \forall x \in C$$

• the subdifferential, denoted by $\partial f(\bar{x})$, is the set of all subgradients of $f$ at $\bar{x}$.

Examples:

- For $f(x) = x^2$, at $\bar{x} = 3$ the only subgradient is $\gamma = 6$. Indeed $0 \le (x - 3)^2 = x^2 - 6x + 9$ implies that for each $x$:

$$f(x) = x^2 \ge 6x - 9 = 9 + 6(x - 3) = f(\bar{x}) + 6(x - \bar{x})$$

- For $f(x) = |x|$ it is clear that: $\gamma = 1$ if $\bar{x} > 0$, $\gamma = -1$ if $\bar{x} < 0$, and $\partial f(\bar{x}) = [-1, 1]$ if $\bar{x} = 0$.


Properties:

1) A convex function $f : C \to \mathbb{R}$ has at least one subgradient at each interior point $x$ of $C$.

N.B.: The existence of (at least) one subgradient at any point of $\mathrm{int}(C)$, with $C$ convex, is a necessary and sufficient condition for $f$ to be convex on $\mathrm{int}(C)$.

2) If f is convex and x ∈ C , ∂f (x) is a nonempty, convex, closed and bounded set.

3) x∗ is a global minimum of f if and only if 0 ∈ ∂f (x∗).


Subgradient method

Consider the problem $\min_{x \in \mathbb{R}^n} f(x)$ with $f(x)$ convex.

Start from an arbitrary $x_0$.

At the $k$-th iteration: consider a $\gamma_k \in \partial f(x_k)$ and set

$$x_{k+1} := x_k - \alpha_k \gamma_k$$

with $\alpha_k > 0$.

Observation: We do not perform a one-dimensional search because for nondifferentiable functions a subgradient $\gamma \in \partial f(x)$ is not necessarily a descent direction!

Example: see next page

But for sufficiently small stepsizes, one gets closer to an optimal solution:

Lemma: If $x_k$ is a non-optimal solution and $x^*$ any optimal solution, then

$$0 < \alpha_k < 2\, \frac{f(x_k) - f(x^*)}{\|\gamma_k\|^2}$$

implies that $\|x_{k+1} - x^*\| < \|x_k - x^*\|$.


Example: $\min_{-1 \le x_1, x_2 \le 1} f(x_1, x_2)$ with $f(x_1, x_2) = \max\{-x_1,\ x_1 + x_2,\ x_1 - 2x_2\}$, which is not everywhere differentiable.

[Figure: level curves in brown, points of nondifferentiability in green (of type $(t, 0)$, $(-t, 2t)$ and $(-t, -t)$ for $t \ge 0$), global minimum $x^* = (0, 0)$.]

If $x_k = (1, 0)$ and we consider the subgradient $\gamma_k = (1, 1) \in \partial f(x_k)$, $f(x)$ increases along the half-line $\{ x \in \mathbb{R}^2 : x = x_k - \alpha_k \gamma_k,\ \alpha_k \ge 0 \}$, but if the step $\alpha_k$ is sufficiently small then $x_{k+1} = x_k - \alpha_k \gamma_k$ is closer to $x^*$.

From Chapter 8, Bazaraa et al., Nonlinear Programming, Wiley, 2006, p. 436-437
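The behaviour described in this example is easy to verify numerically. The short Python sketch below is only an illustration; the helper names `step` and `dist` and the sampled stepsizes are hypothetical choices.

```python
import math


def f(x1, x2):
    return max(-x1, x1 + x2, x1 - 2 * x2)


x_k = (1.0, 0.0)
gamma_k = (1.0, 1.0)   # the subgradient considered in the example
x_star = (0.0, 0.0)    # global minimum


def step(alpha):
    """Point reached from x_k with stepsize alpha along -gamma_k."""
    return (x_k[0] - alpha * gamma_k[0], x_k[1] - alpha * gamma_k[1])


def dist(x):
    """Euclidean distance to the minimizer x_star."""
    return math.hypot(x[0] - x_star[0], x[1] - x_star[1])


for alpha in (0.1, 0.25, 0.5):
    x_next = step(alpha)
    # The objective value gets worse ...
    assert f(*x_next) > f(*x_k)
    # ... but the iterate gets closer to the minimizer.
    assert dist(x_next) < dist(x_k)
```

For this point, the bound of the Lemma is $2 (f(x_k) - f(x^*)) / \|\gamma_k\|^2 = 2 \cdot 1 / 2 = 1$, so every stepsize in $(0, 1)$ reduces the distance to $x^*$.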


Theorem:

If $f$ is convex, $\lim_{\|x\| \to \infty} f(x) = +\infty$, $\lim_{k \to \infty} \alpha_k = 0$ and $\sum_{k=0}^{\infty} \alpha_k = \infty$, then the subgradient method terminates after a finite number of iterations with an optimal solution $x^*$ or it generates an infinite sequence $\{x_k\}$ which admits a subsequence converging to $x^*$.

Choice of the stepsize:

In practice, sequences $\{\alpha_k\}$ such that $\lim_{k \to \infty} \alpha_k = 0$ and $\sum_{k=0}^{\infty} \alpha_k = \infty$ (e.g., $\alpha_k = 1/k$) lead to slow convergence.

An alternative is to choose $\alpha_k = \alpha_0 \rho^k$, for a given $\rho < 1$. A more sophisticated and popular rule is

$$\alpha_k = \varepsilon_k\, \frac{f(x_k) - \hat{f}}{\|\gamma_k\|^2},$$

where $0 < \varepsilon_k < 2$ and $\hat{f}$ is either the optimal value (minimum) $f(x^*)$ or an estimate.

Stopping criterion: prescribed maximum number of iterations because, even if $0 \in \partial f(x_k)$, that subgradient may not be the one considered at $x_k$.

N.B.: Since it is not a monotone method, one needs to store the best solution $x_k$ found so far.

The method can be easily extended to the case with bound constraints by including a simple projection step at each iteration.
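A minimal sketch of the basic (unconstrained) method is given below, assuming the diminishing stepsize $\alpha_k = 1/(k+1)$, a fixed iteration limit as stopping criterion, and a one-dimensional test function $f(x) = |x - 3|$ chosen purely for illustration. The function name `subgradient_method` is hypothetical.

```python
def subgradient_method(f, subgradient, x0, max_iter):
    """Subgradient method with stepsize 1/(k+1); returns the best point found."""
    x = x0
    best_x, best_f = x0, f(x0)
    for k in range(max_iter):
        alpha = 1.0 / (k + 1)          # alpha_k -> 0 and the series diverges
        x = x - alpha * subgradient(x)
        fx = f(x)
        # The method is not monotone, so the incumbent is stored separately.
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


def f(x):
    return abs(x - 3.0)


def subgradient(x):
    """One element of the subdifferential of |x - 3| at x."""
    if x > 3.0:
        return 1.0
    if x < 3.0:
        return -1.0
    return 0.0


best_x, best_f = subgradient_method(f, subgradient, 0.0, 200)
```

The iterates overshoot the minimizer and oscillate around it with shrinking amplitude, which is why only the stored incumbent, not the last iterate, is returned.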


Subgradient method for solving the Lagrangian dual

Lagrangian dual:

$$\max_{u \ge 0} w(u)$$

where $w(u) = \min \{ c^T x + u^T (d - Dx) : x \in X \subseteq \mathbb{R}^n \}$ is concave and piecewise linear.

Simple characterization of the subgradients of w(u):

Proposition:

Consider $\bar{u} \ge 0$ and $X(\bar{u}) = \{ x \in X : w(\bar{u}) = c^T x + \bar{u}^T (d - Dx) \}$, the set of optimal solutions of the Lagrangian subproblem (3) for $u = \bar{u}$. Then

• For each $x(\bar{u}) \in X(\bar{u})$, the vector $(d - Dx(\bar{u})) \in \partial w(\bar{u})$.

• Each subgradient of $w(u)$ at $\bar{u}$ can be expressed as a convex combination of subgradients $(d - Dx(\bar{u}))$ with $x(\bar{u}) \in X(\bar{u})$.

The first point is a straightforward consequence of the definitions of dual function and of subgradient.

Proof:

By definition of $w(u)$:

$$w(u) \le c^T x + u^T (d - Dx) \quad \forall x \in X,\ \forall u \ge 0.$$

In particular, for any $x(\bar{u}) \in X(\bar{u})$ we have

$$w(u) \le c^T x(\bar{u}) + u^T (d - Dx(\bar{u})) \quad \forall u \ge 0 \qquad (12)$$

and clearly

$$w(\bar{u}) = c^T x(\bar{u}) + \bar{u}^T (d - Dx(\bar{u})). \qquad (13)$$

Subtracting (13) from (12) we obtain

$$w(u) - w(\bar{u}) \le (u - \bar{u})^T (d - Dx(\bar{u})) \quad \forall u \ge 0$$

which is equivalent to

$$w(u) \le w(\bar{u}) + (u - \bar{u})^T (d - Dx(\bar{u})) \quad \forall u \ge 0.$$

Since $w$ is concave, this is the subgradient inequality with the sign reversed with respect to the convex case. Thus $(d - Dx(\bar{u})) \in \partial w(\bar{u})$.


Procedure:

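1) Select an initial $u_0$ and set $k := 0$.

2) Solve the Lagrangian subproblem

$$\min \{ c^T x + u_k^T (d - Dx) : x \in X \}.$$

Let $x(u_k)$ be the optimal solution found; then $(d - Dx(u_k))$ is a subgradient of $w(u)$ at $u_k$.

3) Update the Lagrange multipliers (the maximum is taken componentwise):

$$u_{k+1} = \max\{0,\ u_k + \alpha_k (d - Dx(u_k))\}$$

with, for instance, $\alpha_k = \varepsilon_k\, \dfrac{\hat{w} - w(u_k)}{\|d - Dx(u_k)\|^2}$, where $\hat{w}$ is an estimate of the optimal value $w^*$.

4) Set $k := k + 1$ and return to step 2, until a stopping criterion (e.g., a maximum number of iterations) is met.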
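The procedure can be sketched as follows for the special case of a single dualized constraint, so that $u$ is a scalar. This is a hedged illustration rather than a reference implementation: the function name `lagrangian_subgradient`, the stepsize rule $\alpha_k = 1/(k+1)$ and the enumerated set `X` (taken from the two-variable illustration of Section 3.6.1, with $x_1 - x_2 \ge -1$ dualized) are assumptions made for the example.

```python
from fractions import Fraction


def lagrangian_subgradient(solve_subproblem, u0, step, iterations):
    """Projected subgradient ascent for max_{u >= 0} w(u), scalar multiplier."""
    u = u0
    best_u, best_w = None, None
    for k in range(iterations):
        w_u, g = solve_subproblem(u)          # g = d - D x(u), a subgradient at u
        if best_w is None or w_u > best_w:    # keep the best lower bound so far
            best_u, best_w = u, w_u
        u = max(Fraction(0), u + step(k) * g)  # ascent step, then projection
    return best_u, best_w


# Integer points satisfying all constraints except the dualized one.
X = [(x1, x2) for x1 in range(3) for x2 in range(4)
     if -x1 + 2 * x2 <= 5 and 3 * x1 + 2 * x2 >= 3 and 6 * x1 + x2 <= 15]


def solve_subproblem(u):
    """Minimize 3 x1 - x2 + u (-1 - x1 + x2) over X by enumeration."""
    x1, x2 = min(X, key=lambda p: 3 * p[0] - p[1] + u * (-1 - p[0] + p[1]))
    g = -1 - x1 + x2
    return 3 * x1 - x2 + u * g, g


best_u, best_w = lagrangian_subgradient(
    solve_subproblem, Fraction(0), lambda k: Fraction(1, k + 1), 4)
```

After four iterations the multipliers visited are $0, 1, 3/2, 11/6$ and the best bound is $w(3/2) = -1/2$, still below $w^* = -1/3$; more iterations or a better stepsize rule would be needed to close the gap.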


Example: Lagrangian relaxation for the binary knapsack problem

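Consider

$$\begin{aligned}
\max\ & z = 10x_1 + 4x_2 + 14x_3 \\
\text{s.t.}\ & 3x_1 + x_2 + 4x_3 \le 4 \\
& x_1, x_2, x_3 \in \{0, 1\}
\end{aligned}$$

Lagrangian dual:

$$\min_{u \ge 0} w(u) = \min_{u \ge 0} \Big( \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \Big)$$

with the dual function

$$w(u) = \begin{cases}
28 - 4u & \text{for } u \in [0, 10/3] \\
18 - u & \text{for } u \in [10/3, 14/4] \\
4 + 3u & \text{for } u \in [14/4, 4] \\
4u & \text{for } u \in [4, \infty)
\end{cases}$$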

[Figure: graph of the dual function $w(u)$ against $u$, with $w(0) = 28$, $w(10/3) = 44/3$, $w(14/4) = 29/2$ and $w(4) = 16$.]

Optimal solution $u^* = 14/4$ with $w^* = w(u^*) = 29/2$.


Lagrangian dual:

$$\min_{u \ge 0} w(u) = \min_{u \ge 0} \Big( \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \Big)$$

Subgradient $\gamma_k = 4 - 3x_1^k - x_2^k - 4x_3^k$, where $x^k = x(u_k)$ is any optimal solution of the Lagrangian subproblem at the $k$-th iteration.

Lagrange multiplier update: $u_{k+1} = \max\{0,\ u_k - \alpha_k \gamma_k\}$ with $\alpha_k = 1/2^k$.

Subgradient method:

| k | u_k | α_k | w(u_k) | x^k | γ_k |
|---|-----|--------|------------------------------------------|------------|-----|
| 0 | 0 | 1 | max (10, 4, 14)^T x + 4u = 28 | (1, 1, 1) | -4 |
| 1 | 4 | 0.5 | max (-2, 0, -2)^T x + 4u = 0 + 16 | (0, 0, 0)* | 4 |
| 2 | 2 | 0.25 | max (4, 2, 6)^T x + 4u = 12 + 8 | (1, 1, 1) | -4 |
| 3 | 3 | 0.125 | max (1, 1, 2)^T x + 4u = 4 + 12 | (1, 1, 1) | -4 |
| 4 | 3.5 | 0.0625 | max (-0.5, 0.5, 0)^T x + 4u = 0.5 + 14 | (0, 1, 0)* | 3 |

Symbol *: there are several optimal solutions $x^k$ and we choose the lexicographically smallest one (set to 0 each variable $x_i^k$ with zero coefficient).

N.B.: The optimal value of the multiplier $u^* = 14/4$ is reached at iteration $k = 4$ but the considered subgradient is nonzero.
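The iterations of the table can be reproduced with a few lines of Python. The sketch below follows the rules stated above (stepsize $\alpha_k = 1/2^k$, zero-coefficient variables set to 0); the function names `solve_subproblem` and `run` are hypothetical, and exact fractions are used so that ties are detected reliably.

```python
from fractions import Fraction

PROFITS = [10, 4, 14]
WEIGHTS = [3, 1, 4]
CAPACITY = 4


def solve_subproblem(u):
    """Return (w(u), x(u), gamma) for the knapsack Lagrangian subproblem."""
    # Select item i only if its coefficient p_i - u a_i is strictly positive.
    x = [1 if p - u * a > 0 else 0 for p, a in zip(PROFITS, WEIGHTS)]
    w = sum((p - u * a) * xi for p, a, xi in zip(PROFITS, WEIGHTS, x)) + CAPACITY * u
    gamma = CAPACITY - sum(a * xi for a, xi in zip(WEIGHTS, x))
    return w, x, gamma


def run(iterations=5):
    """Subgradient method for min_{u >= 0} w(u) with stepsize 1 / 2^k."""
    u = Fraction(0)
    rows = []
    for k in range(iterations):
        alpha = Fraction(1, 2 ** k)
        w, x, gamma = solve_subproblem(u)
        rows.append((k, u, alpha, w, x, gamma))
        # Descent step (the dual is a minimization here) followed by projection.
        u = max(Fraction(0), u - alpha * gamma)
    return rows


rows = run()
for row in rows:
    print(row)
```

The sequence of multipliers is not monotone in $w(u_k)$: the bound goes 28, 16, 20, 16, 29/2, which illustrates why the best value found so far has to be stored.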


3.6.3 Lagrangian relaxation for the STSP (Held & Karp)

Symmetric TSP: Given an undirected graph $G = (V, E)$ with a cost $c_e \in \mathbb{Z}_+$ for each edge $e \in E$, determine a Hamiltonian cycle of minimum total cost.

ILP formulation:

$$\begin{aligned}
\min\ & \sum_{e \in E} c_e x_e \\
\text{s.t.}\ & \sum_{e \in \delta(i)} x_e = 2 && \forall i \in V && (14) \\
& \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1 && (15) \\
& x_e \in \{0, 1\} && \forall e \in E
\end{aligned}$$

where $E(S) = \{ \{i, j\} \in E : i \in S,\ j \in S \}$.

Observation:

i) Due to the presence of constraints (14), half of the subtour-elimination ones (15) are redundant:

$\sum_{e \in E(S)} x_e \le |S| - 1$ iff $\sum_{e \in E(\bar{S})} x_e \le |\bar{S}| - 1$, where $\bar{S} = V \setminus S$.

Thus all the constraints (15) with $1 \in S$ can be deleted.

ii) Summing over all the degree constraints (14) and dividing by 2, we obtain $\sum_{e \in E} x_e = n$, which can be added to the formulation.


Recall that a Hamiltonian cycle is a 1-tree (i.e., a spanning tree on nodes $\{2, \ldots, n\}$ plus two edges incident to node 1) in which all nodes have exactly two incident edges.

Since

$$\sum_{e \in E} c_e x_e + \sum_{i \in V} u_i \Big( 2 - \sum_{e \in \delta(i)} x_e \Big) = \sum_{e = \{i,j\} \in E} (c_e - u_i - u_j) x_e + 2 \sum_{i \in V} u_i,$$

relaxing/dualizing the degree constraints (14) for all nodes except for node 1, we obtain the Lagrangian subproblem:

$$\begin{aligned}
w(u) = \min\ & \sum_{e = \{i,j\} \in E} (c_e - u_i - u_j) x_e + 2 \sum_{i \in V} u_i \\
\text{s.t.}\ & \sum_{e \in \delta(1)} x_e = 2 \\
& \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1,\ 1 \notin S \\
& \sum_{e \in E} x_e = n \\
& x_e \in \{0, 1\} && \forall e \in E
\end{aligned}$$

where $u_1 = 0$ and $E(S) = \{ \{i, j\} \in E : i \in S,\ j \in S \}$.

N.B.: The set of feasible solutions of this problem coincides with the set of all 1-trees.

To find a minimum cost 1-tree: determine a minimum cost spanning tree on nodes $\{2, \ldots, n\}$ (Kruskal or Prim) and select two smallest cost edges incident to node 1.


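Observation: Since the constraints

$$\begin{aligned}
& \sum_{e \in \delta(1)} x_e = 2 \\
& \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1,\ 1 \notin S \\
& \sum_{e \in E} x_e = n
\end{aligned}$$

with $x \ge 0$ describe the convex hull of the (binary) incidence vectors of 1-trees, Corollary 2 implies that $w^* = z_{LP}$.

The linear relaxation

$$\begin{aligned}
\min\ & \sum_{e \in E} c_e x_e \\
\text{s.t.}\ & \sum_{e \in \delta(i)} x_e = 2 && \forall i \in V \\
& \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1 \\
& x_e \ge 0 && \forall e \in E
\end{aligned}$$

with an exponential number of constraints can thus be solved without considering them explicitly.

Since the dualized constraints are equations, the Lagrangian dual is:

$$\max_{u \in \mathbb{R}^{|V|} :\ u_1 = 0} w(u)$$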


Example: taken from L. Wolsey, Integer Programming, p. 175-177

Consider the undirected graph $G = (V, E)$ with 5 nodes and the cost matrix:

$$C = \begin{pmatrix}
- & 30 & 26 & 50 & 40 \\
30 & - & 24 & 40 & 50 \\
26 & 24 & - & 24 & 26 \\
50 & 40 & 24 & - & 30 \\
40 & 50 & 26 & 30 & -
\end{pmatrix}$$

Dual function:

$$w(u^k) = \min \Big\{ \sum_{e = \{i,j\} \in E} (c_e - u_i^k - u_j^k) x_e^k + 2 \sum_{i \in V} u_i^k \ :\ x^k \text{ incidence vector of a 1-tree} \Big\}$$

Notation: $c_{ij}^k = c_e - u_i^k - u_j^k$ for $e = \{i, j\} \in E$.

Subgradient $\gamma^k$ with $\gamma_i^k = 2 - \sum_{e \in \delta(i)} x_e^k$, where $x^k = x(u^k)$ is an optimal solution of the Lagrangian subproblem at the $k$-th iteration.

Since we did not relax the constraint $\sum_{e \in \delta(1)} x_e = 2$, we will always have $\gamma_1^k = 0$.

Starting from $u_1^k = 0$ for $k = 0$, this implies that $u_1^k = 0$ for each $k \ge 1$.


A feasible solution of cost 148 found with a primal heuristic:

$x_{12} = x_{23} = x_{34} = x_{45} = x_{51} = 1$ and $x_{ij} = 0$ for all other $\{i, j\} \in E$.

Solution of the Lagrangian dual starting from $u^0 = 0$ with $\varepsilon = 1$:

Solving the Lagrangian subproblem with costs

$$C^0 = C = \begin{pmatrix}
- & 30 & 26 & 50 & 40 \\
30 & - & 24 & 40 & 50 \\
26 & 24 & - & 24 & 26 \\
50 & 40 & 24 & - & 30 \\
40 & 50 & 26 & 30 & -
\end{pmatrix}$$

($c_e^0 = c_e$ for each $e \in E$ since $u^0 = 0$), we find $x(u^0)$ that corresponds to the 1-tree of cost 130:

$x_{12} = x_{13} = x_{23} = x_{34} = x_{35} = 1$ and $x_{ij} = 0$ for all other $\{i, j\} \in E$.


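Knowing $x(u^0)$, we can compute the value of the dual function: $w(u^0) = 130 + 0$, that is, the 1-tree cost plus $2 \sum_{i \in V} u_i^0$.

Subgradient

$$\gamma^0 = (0,\ 0,\ -2,\ 1,\ 1)^T$$

Update the Lagrange multipliers, where $\hat{w} = 148$ is the value of the heuristic solution:

$$u^1 = u^0 + \frac{\hat{w} - w(u^0)}{\|\gamma^0\|^2}\, \gamma^0 = 0 + \frac{148 - 130}{6}\, (0,\ 0,\ -2,\ 1,\ 1)^T = (0,\ 0,\ -6,\ 3,\ 3)^T$$

Thus

$$C^1 = \begin{pmatrix}
- & 30 & 32 & 47 & 37 \\
30 & - & 30 & 37 & 47 \\
32 & 30 & - & 27 & 29 \\
47 & 37 & 27 & - & 24 \\
37 & 47 & 29 & 24 & -
\end{pmatrix}$$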


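As optimal solution $x(u^1)$ of the Lagrangian subproblem with cost matrix $C^1$ we find the 1-tree of cost 143:

$x_{12} = x_{13} = x_{23} = x_{34} = x_{45} = 1$ and $x_{ij} = 0$ for all other $\{i, j\} \in E$

and $w(u^1) = 143 + 2 \sum_{i \in V} u_i^1 = 143$.

Since

$$\gamma^1 = (0,\ 0,\ -1,\ 0,\ 1)^T,$$

we have, with $\hat{w} = 148$,

$$u^2 = u^1 + \frac{\hat{w} - w(u^1)}{\|\gamma^1\|^2}\, \gamma^1 = (0,\ 0,\ -6,\ 3,\ 3)^T + \frac{148 - 143}{2}\, (0,\ 0,\ -1,\ 0,\ 1)^T = (0,\ 0,\ -17/2,\ 3,\ 11/2)^T$$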


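Therefore

$$C^2 = \begin{pmatrix}
- & 30 & 34.5 & 47 & 34.5 \\
30 & - & 32.5 & 37 & 44.5 \\
34.5 & 32.5 & - & 29.5 & 29 \\
47 & 37 & 29.5 & - & 21.5 \\
34.5 & 44.5 & 29 & 21.5 & -
\end{pmatrix}$$

and we obtain $x(u^2)$ that corresponds to the 1-tree of cost 147.5:

$x_{12} = x_{15} = x_{23} = x_{35} = x_{45} = 1$ and $x_{ij} = 0$ for all other $\{i, j\} \in E$

and $w(u^2) = 147.5 + 0$.

Since all costs $c_e$ are integer, the optimal value satisfies $z^* \ge \lceil w(u^2) \rceil = \lceil 147.5 \rceil = 148$, so the feasible solution of cost 148 found by the heuristic is optimal.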
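The three iterations above can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the original authors' code: nodes are indexed from 0 (so node 1 of the slides is index 0), the spanning tree is built with a simple Prim-style search, and the names `min_one_tree` and `held_karp_bounds` are hypothetical.

```python
COST = [
    [0, 30, 26, 50, 40],
    [30, 0, 24, 40, 50],
    [26, 24, 0, 24, 26],
    [50, 40, 24, 0, 30],
    [40, 50, 26, 30, 0],
]


def min_one_tree(cost):
    """Minimum 1-tree: spanning tree on nodes 1..n-1 plus two cheapest edges at node 0."""
    n = len(cost)
    in_tree = {1}
    edges = []
    while len(in_tree) < n - 1:
        # Cheapest edge leaving the current tree (node 0 is excluded).
        i, j = min(((i, j) for i in in_tree for j in range(1, n) if j not in in_tree),
                   key=lambda e: cost[e[0]][e[1]])
        in_tree.add(j)
        edges.append((i, j))
    nearest = sorted(range(1, n), key=lambda j: cost[0][j])[:2]
    return edges + [(0, j) for j in nearest]


def held_karp_bounds(cost, upper_bound, iterations, eps=1.0):
    """Return the list of dual values w(u^k) produced by the subgradient method."""
    n = len(cost)
    u = [0.0] * n
    history = []
    for _ in range(iterations):
        reduced = [[cost[i][j] - u[i] - u[j] for j in range(n)] for i in range(n)]
        tree = min_one_tree(reduced)
        w = sum(reduced[i][j] for i, j in tree) + 2 * sum(u)
        history.append(w)
        degree = [0] * n
        for i, j in tree:
            degree[i] += 1
            degree[j] += 1
        gamma = [2 - d for d in degree]        # gamma[0] is always 0
        norm2 = sum(g * g for g in gamma)
        if norm2 == 0:                         # the 1-tree is a tour
            break
        step = eps * (upper_bound - w) / norm2
        u = [ui + step * g for ui, g in zip(u, gamma)]
    return history


bounds = held_karp_bounds(COST, upper_bound=148, iterations=3)
```

With the heuristic value 148 as estimate, the bounds obtained are 130, 143 and 147.5, matching the hand computation.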

JAVA Applet: http://itp.nat.uni-magdeburg.de/mertens/TSP/index.html


3.6.4 Choice of the Lagrangian dual

For problems containing different groups of constraints, we need to decide whichconstraints to relax.

Choice criteria:

i) Strength of the bound w∗ obtained by solving the Lagrangian dual,

ii) Difficulty of solving the Lagrangian subproblems

w(u) = min {c tx + ut(d − Dx) : x ∈ X ⊆ Rn},

iii) Difficulty of solving the Lagrangian dual: w∗ = maxu≥0 w(u).

For (i) we have the characterization of the Lagrangian dual bound w∗ in terms of LP.

The difficulty of the Lagrangian subproblems depends on the specific problem.

The difficulty of the Lagrangian dual depends, among other things, on the number of dual variables.

We look for a reasonable trade-off.


Example: Generalized assignment problem

Given a set $I$ of processes and a set $J$ of machines with

$c_{ij}$: the cost of executing process $i \in I$ on machine $j \in J$,

$w_{ij}$: the amount of resource required to execute process $i \in I$ on machine $j \in J$,

$b_j$: the total amount of resource available on machine $j \in J$,

assign the processes to the machines so as to minimize the total cost while respecting the resource constraints.

Assumption: Once started, a process cannot be interrupted.

ILP formulation

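\[
\begin{array}{rll}
\min & z = \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} & \\
\text{s.t.} & \sum_{j \in J} x_{ij} = 1 & \forall i \in I \\
& \sum_{i \in I} w_{ij} x_{ij} \le b_j & \forall j \in J \\
& x_{ij} \in \{0, 1\} & \forall i \in I,\ \forall j \in J
\end{array}
\]

where $x_{ij} = 1$ if process $i \in I$ is assigned to machine $j \in J$, and $x_{ij} = 0$ otherwise.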


Three possible Lagrangian relaxations:

1) Relaxing the capacity constraints:

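\[
\begin{array}{rll}
w_1(u) = \min & \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} - \sum_{j \in J} u_j \left( b_j - \sum_{i \in I} w_{ij} x_{ij} \right) & \\
\text{s.t.} & \sum_{j \in J} x_{ij} = 1 & \forall i \in I \\
& x_{ij} \in \{0, 1\} & \forall i \in I,\ \forall j \in J
\end{array}
\]

Trivial solution: assign each process $i \in I$ to a machine $j \in J$ that minimizes $c_{ij} + u_j w_{ij}$.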

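N.B.: If the integrality conditions were relaxed, the solution would not change, because for each $i \in I$

\[
\mathrm{conv}\Big(\Big\{ x^i \in \{0,1\}^{|J|} : \sum_{j \in J} x_{ij} = 1 \Big\}\Big) = \Big\{ x^i \in \mathbb{R}^{|J|} : \sum_{j \in J} x_{ij} = 1,\ 0 \le x^i \le 1 \Big\}.
\]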
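The trivial solution of the first relaxation can be sketched in a few lines. This is an illustrative sketch only: the function name `solve_w1` and the small instance (3 processes, 2 machines) are made up for the example and do not come from the slides.

```python
def solve_w1(c, w, b, u):
    """Lagrangian subproblem with the capacity constraints relaxed.

    c[i][j], w[i][j]: cost and resource use of process i on machine j;
    b[j]: capacity of machine j; u[j] >= 0: multiplier of machine j.
    Returns (assignment, w1(u)), where assignment[i] is the chosen machine.
    """
    machines = range(len(b))
    # Each process independently picks the machine with least penalized cost.
    assignment = [
        min(machines, key=lambda j: c[i][j] + u[j] * w[i][j])
        for i in range(len(c))
    ]
    value = sum(c[i][j] + u[j] * w[i][j] for i, j in enumerate(assignment))
    value -= sum(u[j] * b[j] for j in machines)  # constant term -sum_j u_j b_j
    return assignment, value

# Hypothetical instance, chosen only to exercise the function.
c = [[4, 2], [3, 5], [1, 6]]
w = [[2, 3], [2, 2], [3, 1]]
b = [4, 3]
assignment, value = solve_w1(c, w, b, u=[1.0, 0.5])
print(assignment, value)
```

The returned assignment may violate the capacities; only the value is guaranteed to be a lower bound on the optimal cost.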

2) Relaxing the assignment constraints:

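\[
\begin{array}{rll}
w_2(v) = \min & \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} - \sum_{i \in I} v_i \left( \sum_{j \in J} x_{ij} - 1 \right) & \\
\text{s.t.} & \sum_{i \in I} w_{ij} x_{ij} \le b_j & \forall j \in J \\
& x_{ij} \in \{0, 1\} & \forall i \in I,\ \forall j \in J
\end{array}
\]

Since the variables $x_{ij}$ corresponding to different machines $j$ are not linked, the subproblem decomposes into $|J|$ independent binary knapsack subproblems, with profit $-c_{ij} + v_i$ (max version) and weight $w_{ij}$ for item $i \in I$.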
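The decomposition into knapsack problems can be sketched as follows. The code assumes nonnegative integer weights and capacities so that a textbook dynamic program applies; the function names and the small instance are hypothetical. It uses the identity $w_2(v) = \sum_{i \in I} v_i - \sum_{j \in J} \max \{ \sum_{i \in I} (v_i - c_{ij}) x_{ij} \}$, which follows from regrouping the objective by machine.

```python
def knapsack_max(profits, weights, capacity):
    """0/1 knapsack by dynamic programming over integer capacities."""
    best = [0.0] * (capacity + 1)
    for p, wt in zip(profits, weights):
        if p <= 0:
            continue  # an item with non-positive profit is never worth packing
        for cap in range(capacity, wt - 1, -1):
            best[cap] = max(best[cap], best[cap - wt] + p)
    return best[capacity]

def solve_w2(c, w, b, v):
    """w2(v) = sum_i v_i - sum_j (maximum knapsack profit on machine j)."""
    n_proc = len(c)
    total = sum(v)
    for j in range(len(b)):
        profits = [v[i] - c[i][j] for i in range(n_proc)]
        weights = [w[i][j] for i in range(n_proc)]
        total -= knapsack_max(profits, weights, b[j])
    return total

# Hypothetical instance, chosen only to exercise the function.
c = [[4, 2], [3, 5], [1, 6]]
w = [[2, 3], [2, 2], [3, 1]]
b = [4, 3]
bound = solve_w2(c, w, b, v=[3, 4, 2])
print(bound)
```

Because the assignment constraints are equalities, the multipliers `v` are not sign-restricted.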


3) Relaxing all the constraints, we obtain the Lagrangian subproblem:

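\[
\begin{array}{rll}
w_3(u, v) = \min & \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} - \sum_{i \in I} v_i \left( \sum_{j \in J} x_{ij} - 1 \right) - \sum_{j \in J} u_j \left( b_j - \sum_{i \in I} w_{ij} x_{ij} \right) & \\
\text{s.t.} & x_{ij} \in \{0, 1\} & \forall i \in I,\ \forall j \in J
\end{array}
\]

Trivial solution: set $x_{ij} = 1$ if $c_{ij} - v_i + u_j w_{ij} < 0$, $x_{ij} = 0$ if $c_{ij} - v_i + u_j w_{ij} > 0$, and arbitrarily $x_{ij} = 0$ or $x_{ij} = 1$ if $c_{ij} - v_i + u_j w_{ij} = 0$.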
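When all constraints are relaxed, each variable is set by the sign of its coefficient. The sketch below illustrates this rule; the function name and the instance are hypothetical, and ties (coefficient equal to 0) are broken by setting $x_{ij} = 0$, which is one of the two admissible choices.

```python
def solve_w3(c, w, b, u, v):
    """Lagrangian subproblem with all constraints relaxed.

    Returns (x, w3(u, v)); x[i][j] = 1 exactly when the coefficient
    c[i][j] - v[i] + u[j] * w[i][j] is negative (ties set to 0).
    """
    n_proc, n_mach = len(c), len(b)
    coeff = [
        [c[i][j] - v[i] + u[j] * w[i][j] for j in range(n_mach)]
        for i in range(n_proc)
    ]
    x = [[1 if r < 0 else 0 for r in row] for row in coeff]
    value = sum(min(r, 0) for row in coeff for r in row)
    value += sum(v) - sum(u[j] * b[j] for j in range(n_mach))  # constant terms
    return x, value

# Hypothetical instance, chosen only to exercise the function.
c = [[4, 2], [3, 5], [1, 6]]
w = [[2, 3], [2, 2], [3, 1]]
b = [4, 3]
x, value = solve_w3(c, w, b, u=[1.0, 0.5], v=[8, 6, 5])
print(x, value)
```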

Observations:

i) According to Corollary 2, the first and third Lagrangian relaxation bounds are as weak as (not stronger than) the linear relaxation one.

ii) Since the ideal formulation for a binary knapsack problem contains many other inequalities (e.g., cover inequalities), namely for each $j \in J$

\[
\mathrm{conv}\Big(\Big\{ x^j \in \{0,1\}^{|I|} : \sum_{i \in I} w_{ij} x_{ij} \le b_j \Big\}\Big) \subset \Big\{ x^j \in \mathbb{R}^{|I|} : \sum_{i \in I} w_{ij} x_{ij} \le b_j,\ 0 \le x^j \le 1 \Big\},
\]

the second Lagrangian relaxation provides a potentially stronger bound.
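A small instance, made up here for illustration, shows that the inclusion can be strict. Take a single machine with two processes, weights $w_1 = w_2 = 2$ and capacity $b = 3$:

\[
\begin{aligned}
&\text{binary points with } 2x_1 + 2x_2 \le 3: \quad (0,0),\ (1,0),\ (0,1), \\
&\mathrm{conv}\{(0,0), (1,0), (0,1)\} = \{ x \ge 0 : x_1 + x_2 \le 1 \}, \\
&\bar{x} = (1, \tfrac{1}{2}): \quad 2 \cdot 1 + 2 \cdot \tfrac{1}{2} = 3 \le 3 \quad \text{but} \quad 1 + \tfrac{1}{2} > 1 .
\end{aligned}
\]

The point $\bar{x}$ satisfies the linear relaxation of the knapsack constraint but violates the cover inequality $x_1 + x_2 \le 1$, so it lies outside the convex hull.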


3.6.5 Lagrangian heuristics

When $u^k$ gets closer to $u^*$, the optimal solutions $x(u^k)$ of the Lagrangian subproblems tend to be feasible for the primal problem.

Example: For Symmetric TSP, many nodes of the minimum cost 1-tree have degree 2.

Simple heuristics can often convert $x(u^k)$ into a feasible solution without worsening the objective function value too much.

Example: Set covering problem (SCP)

\[
\min \Big\{ \sum_{j=1}^{n} c_j x_j : \sum_{j=1}^{n} a_{ij} x_j \ge 1 \ \ \forall i \in \{1, \ldots, m\},\ x \in \{0,1\}^n \Big\}
\]

with $a_{ij} \in \{0, 1\}$ for $1 \le i \le m$, $1 \le j \le n$.

Lagrangian subproblem obtained by relaxing all covering constraints:

\[
\min \Big\{ \sum_{j=1}^{n} \Big( c_j - \sum_{i=1}^{m} u_i a_{ij} \Big) x_j : x \in \{0,1\}^n \Big\} + \sum_{i=1}^{m} u_i
\]

with $u \ge 0$.


Let $\tilde{c}_j = c_j - \sum_{i=1}^{m} u_i a_{ij}$, for $j = 1, \ldots, n$; then

\[
\min \Big\{ \sum_{j=1}^{n} \tilde{c}_j x_j : x \in \{0,1\}^n \Big\} + \sum_{i=1}^{m} u_i
\]

can be solved by setting $x_j = 1$ if $\tilde{c}_j < 0$, and $x_j = 0$ otherwise.
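The sign rule for the SCP subproblem can be sketched as follows. The function name and the 3-row, 4-column instance are hypothetical; `reduced[j]` stands for the coefficient $c_j - \sum_{i=1}^{m} u_i a_{ij}$.

```python
def solve_scp_subproblem(c, a, u):
    """Lagrangian subproblem of set covering, all covering rows relaxed.

    c[j]: column costs, a[i][j] in {0, 1}, u[i] >= 0: row multipliers.
    Returns (x, w(u)).
    """
    m, n = len(a), len(c)
    reduced = [c[j] - sum(u[i] * a[i][j] for i in range(m)) for j in range(n)]
    x = [1 if r < 0 else 0 for r in reduced]           # pick negative columns
    value = sum(r for r in reduced if r < 0) + sum(u)  # add the constant term
    return x, value

# Hypothetical instance: rows are elements, columns are candidate subsets.
c = [3, 3, 2, 2]
a = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
x, value = solve_scp_subproblem(c, a, u=[2, 1.5, 1.5])
print(x, value)
```

On this instance the subproblem solution leaves the third row uncovered, which is the typical situation a Lagrangian heuristic has to repair.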

To derive a feasible solution of SCP, it suffices to:

- Consider an optimal solution $x(u^k)$ of the Lagrangian subproblem.

- Delete the rows (elements) covered by x(uk).

- "Cover" the remaining rows (elements) greedily and let $y^*$ be the resulting partial greedy solution.

- Verify whether some of the components of the SCP feasible solution $x(u^k) + y^*$ can be set to 0.

- Return the resulting SCP solution.
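The steps above can be sketched as follows. The slides do not fix the greedy rule or the order of the redundancy check, so two assumptions are made here: the greedy step picks the column with the smallest cost per newly covered row, and redundant columns are tested from the most expensive to the cheapest. The function name and the instance are hypothetical.

```python
def lagrangian_heuristic(c, a, x):
    """Turn a subproblem solution x into a feasible cover (sketch).

    c[j]: column costs, a[i][j] in {0, 1}, x[j] in {0, 1}.
    Assumes every row can be covered by at least one column.
    """
    m, n = len(a), len(c)
    sol = list(x)

    def covered(i):
        return any(a[i][j] and sol[j] for j in range(n))

    # Rows not covered by the subproblem solution.
    uncovered = {i for i in range(m) if not covered(i)}

    # Greedy completion: smallest cost per newly covered row (assumed rule).
    while uncovered:
        def ratio(j):
            gain = sum(1 for i in uncovered if a[i][j])
            return c[j] / gain if gain else float("inf")
        j = min((j for j in range(n) if not sol[j]), key=ratio)
        sol[j] = 1
        uncovered = {i for i in uncovered if not a[i][j]}

    # Redundancy check: try to drop columns, most expensive first.
    for j in sorted((j for j in range(n) if sol[j]), key=lambda j: -c[j]):
        sol[j] = 0
        if not all(covered(i) for i in range(m)):
            sol[j] = 1  # column j is needed, restore it
    return sol, sum(c[j] for j in range(n) if sol[j])

# Hypothetical instance: the subproblem solution covers rows 1 and 2 only.
c = [3, 3, 2, 2]
a = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
cover, cost = lagrangian_heuristic(c, a, x=[1, 0, 0, 0])
print(cover, cost)
```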


Fixing the value of some variables:

Consider a feasible SCP solution of value $z$; then any better solution $x$ satisfies

\[
\sum_{i=1}^{m} u_i + \min_{x \in \{0,1\}^n} \Big\{ \sum_{j=1}^{n} \Big( c_j - \sum_{i=1}^{m} u_i a_{ij} \Big) x_j \Big\} \le c^t x < z
\]

Property:

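Let $N^+ = \{ j \in N : (c_j - \sum_{i=1}^{m} u_i a_{ij}) > 0 \}$ and $N^- = \{ j \in N : (c_j - \sum_{i=1}^{m} u_i a_{ij}) < 0 \}$, where $N = \{1, \ldots, n\}$.

- If $k \in N^+$ and $\sum_{i=1}^{m} u_i + \sum_{j \in N^-} (c_j - \sum_{i=1}^{m} u_i a_{ij}) + (c_k - \sum_{i=1}^{m} u_i a_{ik}) \ge z$, then $x_k = 0$ in any better (feasible) solution.

- If $k \in N^-$ and $\sum_{i=1}^{m} u_i + \sum_{j \in N^- \setminus \{k\}} (c_j - \sum_{i=1}^{m} u_i a_{ij}) \ge z$, then $x_k = 1$ in any better (feasible) solution.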
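The two fixing tests can be sketched as follows. The function name and the instance are hypothetical; `base` is the Lagrangian bound $\sum_{i=1}^{m} u_i + \sum_{j \in N^-} (c_j - \sum_{i=1}^{m} u_i a_{ij})$, and removing $k$ from $N^-$ amounts to subtracting its (negative) coefficient.

```python
def fix_variables(c, a, u, z):
    """Apply the two fixing tests for a known feasible value z.

    Returns (fix_to_0, fix_to_1): column indices that must take the
    value 0, respectively 1, in any solution of cost strictly below z.
    """
    m, n = len(a), len(c)
    reduced = [c[j] - sum(u[i] * a[i][j] for i in range(m)) for j in range(n)]
    base = sum(u) + sum(r for r in reduced if r < 0)  # Lagrangian bound w(u)
    # k in N+: forcing x_k = 1 raises the bound by reduced[k].
    fix_to_0 = [k for k in range(n)
                if reduced[k] > 0 and base + reduced[k] >= z]
    # k in N-: forcing x_k = 0 raises the bound by -reduced[k].
    fix_to_1 = [k for k in range(n)
                if reduced[k] < 0 and base - reduced[k] >= z]
    return fix_to_0, fix_to_1

# Hypothetical instance with a known feasible cover of value 5.
c = [3, 3, 2, 2]
a = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
fix_to_0, fix_to_1 = fix_variables(c, a, u=[2, 1.5, 1.5], z=5)
print(fix_to_0, fix_to_1)
```

Fixed variables can be removed from the problem before the next subgradient iterations, which shrinks the subproblem.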

Example: see Chapter 10 of L. Wolsey, Integer Programming, p.178.
