3.6 Lagrangian relaxation - Politecnico di Milanohome.deib.polimi.it/amaldi/SlidesOPT-15-16/IP-lagrangian-relaxation-15-16.pdf · Observation: If Lagrangian relaxation is applied

3.6 Lagrangian relaxation

Consider a generic ILP problem

\[ \min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{Z}^n \} \]

with integer coefficients.

Suppose $Dx \ge d$ are the "complicating" constraints, in the sense that the ILP without them is "easy".

Often the linear relaxation and the relaxation obtained by eliminating $Dx \ge d$ yield weak bounds (e.g., TSP/UFL when deleting the cut-set/demand constraints).

More general setting:

\[ \min \{ c^T x : Dx \ge d,\ x \in X \subseteq \mathbb{R}^n \} \qquad (1) \]

Idea: Delete the complicating constraints $Dx \ge d$ and, for each one of them, add to the objective function a term with a multiplier $u_i$, which penalizes its violation and is $\le 0$ for all feasible solutions of problem (1).

Edoardo Amaldi (PoliMI) Optimization Academic year 2015-16 1 / 43

Lagrangian subproblem

Definition: Given a problem

\[ z^* = \min \{ c^T x : Dx \ge d,\ x \in X \subseteq \mathbb{R}^n \} \qquad (2) \]

for each vector of Lagrange multipliers $u \ge 0$, the Lagrangian subproblem is

\[ w(u) = \min \{ c^T x + u^T (d - Dx) : x \in X \subseteq \mathbb{R}^n \} \qquad (3) \]

where

$L(x, u) = c^T x + u^T (d - Dx)$ is the Lagrangian function of the primal problem (2),

$w(u) = \min \{ L(x, u) : x \in X \subseteq \mathbb{R}^n \}$ is the dual function.

Proposition: For any u ≥ 0, the Lagrangian subproblem (3) is a relaxation of problem (2).

Proof: Clearly $\{x \in X : Dx \ge d\} \subseteq X$. Moreover, for each $u \ge 0$ and each feasible $x$ for problem (2), we have $w(u) \le c^T x$. Indeed, $w(u) \le c^T x + u^T (d - Dx) \le c^T x$, since $u^T (d - Dx) \le 0$ and $w(u)$ is the minimum value of $c^T x + u^T (d - Dx)$ over $x \in X$.

Corollary: If $z^* = \min \{ c^T x : Dx \ge d,\ x \in X \}$ is finite, then $w(u) \le z^*$ for all $u \ge 0$.


Lagrangian dual

To determine the tightest lower bound:

Definition: The Lagrangian dual of the primal problem (2) is

\[ w^* = \max_{u \ge 0} w(u) \qquad (4) \]

Only nonnegativity constraints!

N.B.: By relaxing (dualizing) linear constraints, the objective function remains linear.

The other constraints can be of any type, provided subproblem (3) is ”sufficiently easy”.

For Linear Programs the Lagrangian dual coincides with the LP dual (cf. exercise).

Corollary (Weak Duality): For each pair of feasible solutions $x \in \{x \in X : Dx \ge d\}$ of the primal problem (2) and $u \ge 0$ of the Lagrangian dual (4), we have

\[ w(u) \le c^T x. \qquad (5) \]


Consequences:

i) If $x$ is feasible for the primal problem (2), $u$ is feasible for the Lagrangian dual (4) and $w(u) = c^T x$, then $x$ is optimal for the primal (2) and $u$ is optimal for the dual (4).

ii) In particular $w^* = \max_{u \ge 0} w(u) \le z^* = \min \{ c^T x : Dx \ge d,\ x \in X \}$. If either the primal or the dual is unbounded (admits no finite optimal solution), then the other one is infeasible.

Recall: For any pair of primal-dual Linear Programs (LPs) that are bounded we have strong duality, namely $w^* = z^*$.

Observation: Unlike LPs, discrete optimization problems can have a duality gap, that is, $w^* < z^*$.

Lagrangian relaxation of equality constraints:

Only difference: the Lagrange multipliers associated with equality constraints are unrestricted in sign, namely $u_i \in \mathbb{R}$.

If all the $m$ relaxed/dualized constraints are equality constraints, the Lagrangian dual is:

\[ \max_{u \in \mathbb{R}^m} w(u) \]


Example 1: Binary knapsack

Consider

\[ \begin{aligned} \max\ & z = 10x_1 + 4x_2 + 14x_3 \\ \text{s.t.}\ & 3x_1 + x_2 + 4x_3 \le 4 \\ & x_1, x_2, x_3 \in \{0, 1\} \end{aligned} \]

Relaxing the capacity constraint, we obtain the Lagrangian function:

\[ L(x, u) = 10x_1 + 4x_2 + 14x_3 + u(4 - 3x_1 - x_2 - 4x_3) \]

Dual function, for all $u \ge 0$:

\[ w(u) = \max_{x \in \{0,1\}^3} 10x_1 + 4x_2 + 14x_3 + u(4 - 3x_1 - x_2 - 4x_3) \]

Thus the Lagrangian subproblem

\[ w(u) = \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \]

can be solved in linear time by setting to 1 (0) the variables with positive (negative) coefficient and choosing an arbitrary value for those with zero coefficient.

Lagrangian dual (since the primal is a maximization problem, $w(u)$ is an upper bound on $z^*$ and the dual is a minimization problem):

\[ \min_{u \ge 0} w(u) = \min_{u \ge 0} \Big( \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \Big) \]


Dual function: $w(u) = \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u$

Values of $u$ for which the coefficients of $x_1, x_2, x_3$ are nonpositive: $u \ge 10/3$ for $x_1$, $u \ge 14/4$ for $x_3$ and $u \ge 4$ for $x_2$.

Optimal solution of the Lagrangian subproblem as a function of $u$:

x = (1, 1, 1) for $u \in [0, 10/3]$,

x = (0, 1, 1) for $u \in [10/3, 14/4]$,

x = (0, 1, 0) for $u \in [14/4, 4]$,

x = (0, 0, 0) for $u \in [4, \infty)$.

Thus

\[ w(u) = \begin{cases} 28 - 4u & \text{for } u \in [0, 10/3] \\ 18 - u & \text{for } u \in [10/3, 14/4] \\ 4 + 3u & \text{for } u \in [14/4, 4] \\ 4u & \text{for } u \in [4, \infty) \end{cases} \]


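Dual function:

[Figure: plot of the piecewise linear dual function $w(u)$ against $u$, with $w(0) = 28$, $w(10/3) = 44/3$, $w(14/4) = 29/2$ and $w(4) = 16$.]

Lagrangian dual:

\[ \min_{u \ge 0} w(u) = \min_{u \ge 0} \Big( \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \Big) \]

Optimal solution $u^* = 14/4$ with $w^* = w(u^*) = 29/2$.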
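The dual function of Example 1 can be checked numerically. The following is a minimal sketch, not part of the original slides: it evaluates $w(u)$ by brute-force enumeration of $x \in \{0,1\}^3$ at the breakpoints of the piecewise linear function. The names `dual_function` and `breakpoints` are illustrative choices, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

PROFITS = (10, 4, 14)
WEIGHTS = (3, 1, 4)
CAPACITY = 4


def dual_function(u):
    """w(u) = max over x in {0,1}^3 of p^T x + u * (capacity - a^T x)."""
    return max(
        sum(p * xi for p, xi in zip(PROFITS, x))
        + u * (CAPACITY - sum(a * xi for a, xi in zip(WEIGHTS, x)))
        for x in product((0, 1), repeat=3)
    )


# w is convex and piecewise linear with positive final slope,
# so its minimum over u >= 0 is attained at one of the breakpoints.
breakpoints = [Fraction(0), Fraction(10, 3), Fraction(14, 4), Fraction(4)]
values = [dual_function(u) for u in breakpoints]
u_star = min(breakpoints, key=dual_function)
```

Enumeration is only viable here because there are 8 candidate vectors; the coefficient-sign rule described in the example gives the same values in linear time.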


Example 2: Uncapacitated Facility Location (UFL)

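Consider the variant with profits $p_{ij}$ and fixed costs $f_j$ for opening the depots in the candidate sites, where we wish to maximize the total profit.

MILP formulation:

\[ \begin{aligned} z^* = \max\ & \sum_{i \in M} \sum_{j \in N} p_{ij} x_{ij} - \sum_{j \in N} f_j y_j \\ \text{s.t.}\ & \sum_{j \in N} x_{ij} = 1 && \forall i \in M \qquad (6) \\ & x_{ij} \le y_j && \forall i \in M,\ j \in N \\ & y_j \in \{0, 1\} && \forall j \in N \\ & 0 \le x_{ij} \le 1 && \forall i \in M,\ j \in N \end{aligned} \]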

Relaxing the demand constraints (6), we obtain the Lagrangian subproblem:

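\[ \begin{aligned} w(u) = \max\ & \sum_{i \in M} \sum_{j \in N} (p_{ij} - u_i) x_{ij} - \sum_{j \in N} f_j y_j + \sum_{i \in M} u_i \\ \text{s.t.}\ & x_{ij} \le y_j && \forall i \in M,\ j \in N \qquad (7) \\ & y_j \in \{0, 1\} && \forall j \in N \qquad (8) \\ & 0 \le x_{ij} \le 1 && \forall i \in M,\ j \in N \qquad (9) \end{aligned} \]

which decomposes into $|N|$ independent subproblems, one for each candidate site $j$.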


Indeed $w(u) = \sum_{j \in N} w_j(u) + \sum_{i \in M} u_i$ where

\[ \begin{aligned} w_j(u) = \max\ & \sum_{i \in M} (p_{ij} - u_i) x_{ij} - f_j y_j \qquad (10) \\ \text{s.t.}\ & x_{ij} \le y_j && \forall i \in M \\ & y_j \in \{0, 1\} \\ & 0 \le x_{ij} \le 1 && \forall i \in M \end{aligned} \]

For each $j \in N$, the subproblem (10) can be solved by inspection:

If $y_j = 0$, then $x_{ij} = 0$ for each $i$ and the objective function value is 0.

If $y_j = 1$, it is convenient to serve all clients with positive profit, namely $x_{ij} = 1$ for all indices $i$ such that $p_{ij} - u_i > 0$, with an objective function value of $\sum_{i \in M} \max\{p_{ij} - u_i, 0\} - f_j$.

Thus $w_j(u) = \max\{0, \sum_{i \in M} \max\{p_{ij} - u_i, 0\} - f_j\}$.
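The closed form for $w_j(u)$ translates directly into code. The sketch below is illustrative only: the function name `ufl_dual_function` and the small instance (3 clients, 2 candidate sites) are hypothetical and do not come from the slides or from Wolsey's example.

```python
def ufl_dual_function(profit, fixed_cost, u):
    """Evaluate w(u) = sum_j w_j(u) + sum_i u_i for the UFL relaxation.

    profit[i][j] is p_ij, fixed_cost[j] is f_j, u[i] is the multiplier of
    client i's demand constraint. Returns (w(u), list of opened sites).
    """
    total = sum(u)
    open_sites = []
    for j, f_j in enumerate(fixed_cost):
        # Value of opening site j and serving every client with positive
        # reduced profit p_ij - u_i.
        gain = sum(max(profit[i][j] - u[i], 0) for i in range(len(u))) - f_j
        if gain > 0:  # otherwise y_j = 0 and w_j(u) = 0
            open_sites.append(j)
            total += gain
    return total, open_sites


# Hypothetical data, chosen only for illustration.
profit = [[6, 4], [5, 7], [3, 2]]
fixed_cost = [4, 6]
w_value, opened = ufl_dual_function(profit, fixed_cost, [3, 4, 2])
```

For this instance the best primal solution opens site 0 only, with profit 6 + 5 + 3 - 4 = 10, so the upper bound returned for these multipliers happens to be tight.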

Example: see Chapter 10 of L. Wolsey, Integer Programming, p. 169-170


Optimal solutions of the Lagrangian subproblem and of the primal problem

Solving the Lagrangian subproblem (3) may sometimes yield an optimal solution of the primal problem (2).

Proposition: If $u \ge 0$ and

i) $x(u)$ is an optimal solution of the Lagrangian subproblem (3)

ii) $Dx(u) \ge d$

iii) $(Dx(u))_i = d_i$ for each Lagrange multiplier $u_i > 0$ (complementary slackness conditions),

then $x(u)$ is an optimal solution of the primal problem (2).

Proof: Due to (i) we have $w^* \ge w(u) = c^T x(u) + u^T (d - Dx(u))$, and due to (iii) we have $c^T x(u) + u^T (d - Dx(u)) = c^T x(u)$.

According to (ii), $x(u)$ is a feasible solution of the primal (2) and hence $c^T x(u) \ge z^*$.

Therefore $w^* \ge c^T x(u) + u^T (d - Dx(u)) = c^T x(u) \ge z^*$ and, since $w^* \le z^*$, everything holds with equality and $x(u)$ is an optimal solution of the primal (2).

Observation: If Lagrangian relaxation is applied to equality constraints, conditions (iii) are automatically satisfied, and an optimal solution of the Lagrangian subproblem is optimal for the primal problem (2) if it is feasible.


Property of the Lagrangian dual:

Proposition: The dual function $w(u)$ is concave.

Proof: Consider any pair $u_1 \ge 0$ and $u_2 \ge 0$.

For any $\alpha$ such that $0 \le \alpha \le 1$, let $\bar{x}$ be an optimal solution of the Lagrangian subproblem (3) for $u = \alpha u_1 + (1 - \alpha) u_2$, namely $w(u) = c^T \bar{x} + u^T (d - D\bar{x})$.

By definition of $w$, we have $w(u_1) \le c^T \bar{x} + u_1^T (d - D\bar{x})$ and $w(u_2) \le c^T \bar{x} + u_2^T (d - D\bar{x})$.

Multiplying the first inequality by $\alpha$ and the second one by $1 - \alpha$ and adding them, we obtain

\[ \alpha w(u_1) + (1 - \alpha) w(u_2) \le c^T \bar{x} + (\alpha u_1 + (1 - \alpha) u_2)^T (d - D\bar{x}) = w(\alpha u_1 + (1 - \alpha) u_2). \]


3.6.1 LP characterization of the Lagrangian dual

How strong is the lower bound on z∗ we obtain by solving the Lagrangian dual?

Theorem: Consider a generic ILP problem

\[ \min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{Z}^n \} \]

with integer coefficients.

Let $w(u) = \min \{ c^T x + u^T (d - Dx) : Ax \ge b,\ x \in \mathbb{Z}^n \}$ be the dual function,

$w^* = \max_{u \ge 0} w(u)$ be the optimal value of the Lagrangian dual and

$X = \{x \in \mathbb{Z}^n : Ax \ge b\}$, then

\[ w^* = \min \{ c^T x : Dx \ge d,\ x \in \mathrm{conv}(X) \}. \qquad (11) \]

The problem is "convexified": the Lagrangian dual is characterized in terms of a Linear Program.

Corollary 1: Since $\mathrm{conv}(X) \subseteq \{x \in \mathbb{R}^n : Ax \ge b\}$,

\[ z_{LP} = \min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{R}^n \} \le w^* \le z^*. \]

Depending on the objective function, these inequalities can be strict, i.e., $z_{LP} < w^* < z^*$.


Illustration: from D. Bertsimas, R. Weismantel, Optimization over Integers, Dynamic Ideas, 2005, p. 144-146

Consider the ILP problem

\[ \begin{aligned} \min\ & 3x_1 - x_2 \\ \text{s.t.}\ & x_1 - x_2 \ge -1 \\ & -x_1 + 2x_2 \le 5 \\ & 3x_1 + 2x_2 \ge 3 \\ & 6x_1 + x_2 \le 15 \\ & x_1, x_2 \ge 0 \text{ integer} \end{aligned} \]

- Represent graphically the feasible region and the optimal solutions of the ILP and of its linear relaxation: $x_{ILP} = (1, 2)$ with $z_{ILP} = 1$ and $x_{LP} = (1/5, 6/5)$ with $z_{LP} = -3/5$.

- Apply Lagrangian relaxation to the first constraint.

For every $u \ge 0$, the Lagrangian subproblem is

\[ w(u) = \min_{(x_1, x_2) \in X} 3x_1 - x_2 + u(-1 - x_1 + x_2) \]

where $X$ is the set of all integer solutions that satisfy all the other constraints.


- Use the theorem to find the optimal value of the Lagrangian dual

\[ w^* = \max_{u \ge 0} w(u) \]

and the corresponding optimal solution $x_D$ of the linear program (11).

Represent $\mathrm{conv}(X)$ and the polyhedron $\mathrm{conv}(X) \cap \{(x_1, x_2) \in \mathbb{R}^2 : x_1 - x_2 \ge -1\}$.

We obtain $x_D = (1/3, 4/3)$ with $w^* = -1/3$.

Thus, we have: $z_{LP} = -3/5 < w^* = -1/3 < z_{ILP} = 1$.

By drawing the dual function $w(u)$ it is possible to verify that the optimal solution of the Lagrangian dual is $u^* = 5/3$ with $w^* = -1/3$.
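The values quoted in this illustration can be verified by enumerating the finite set $X$. The sketch below is an illustrative check under the assumption that brute-force enumeration is acceptable for such a tiny instance; the names `X` and `dual_function` are chosen here and are not from the source.

```python
from fractions import Fraction

# X: nonnegative integer points satisfying the three constraints that are
# NOT dualized. 6*x1 + x2 <= 15 bounds both coordinates by 15.
X = [
    (x1, x2)
    for x1 in range(16)
    for x2 in range(16)
    if -x1 + 2 * x2 <= 5 and 3 * x1 + 2 * x2 >= 3 and 6 * x1 + x2 <= 15
]


def dual_function(u):
    """w(u) = min over X of 3*x1 - x2 + u * (-1 - x1 + x2)."""
    return min(3 * x1 - x2 + u * (-1 - x1 + x2) for x1, x2 in X)


w_star = dual_function(Fraction(5, 3))
# Sample w on a grid of multipliers in [0, 6) to compare against w_star.
grid_values = [dual_function(Fraction(k, 10)) for k in range(60)]
```

At $u = 5/3$ the minimum is attained by both (0, 2) and (1, 0), and $x_D = (1/3, 4/3)$ is the convex combination $\tfrac{2}{3}(0, 2) + \tfrac{1}{3}(1, 0)$ of these two points.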


Proof: Consider the case where $X = \{x^1, \dots, x^k\}$ contains a finite (even though huge) number of feasible solutions.

The Lagrangian dual is equivalent to maximizing a nondifferentiable piecewise linear concave function:

\[ w^* = \max_{u \ge 0} w(u) = \max_{u \ge 0} \Big\{ \min_{x \in X} [c^T x + u^T (d - Dx)] \Big\} = \max_{u \ge 0} \Big\{ \min_{1 \le l \le k} [c^T x^l + u^T (d - Dx^l)] \Big\} \]

and it can be expressed as the following Linear Program:

\[ \begin{aligned} w^* = \max\ & y \\ \text{s.t.}\ & c^T x^l + u^T (d - Dx^l) \ge y && \forall l \\ & u \ge 0,\ y \in \mathbb{R} \end{aligned} \]

which contains a huge number of constraints.

Taking the dual of this LP and applying strong duality, we obtain:

\[ \begin{aligned} w^* = \min\ & \sum_{l=1}^{k} \mu_l (c^T x^l) \\ \text{s.t.}\ & \sum_{l=1}^{k} \mu_l (Dx^l - d) \ge 0 \\ & \sum_{l=1}^{k} \mu_l = 1 \\ & \mu_l \ge 0 && \forall l. \end{aligned} \]

Setting $x = \sum_{l=1}^{k} \mu_l x^l$ with $\sum_{l=1}^{k} \mu_l = 1$ and $\mu_l \ge 0$ for each $l$, we obtain:

\[ \begin{aligned} w^* = \min\ & c^T x \\ \text{s.t.}\ & Dx \ge d \\ & x \in \mathrm{conv}(X). \end{aligned} \]

The result can be extended to the case where X is the feasible region of any ILP.

Example of a nondifferentiable piecewise linear concave dual function w(u).


In some cases, the lower bound on $z^*$ obtained by Lagrangian relaxation is not stronger than the linear relaxation bound.

Corollary 2: If $X = \{x \in \mathbb{Z}^n : Ax \ge b\}$ and $\mathrm{conv}(X) = \{x \in \mathbb{R}^n : Ax \ge b\}$, then

\[ w^* = \max_{u \ge 0} w(u) = z_{LP} = \min \{ c^T x : Ax \ge b,\ Dx \ge d,\ x \in \mathbb{R}^n \}. \]

Example: Given a generic binary knapsack problem

\[ \begin{aligned} \max\ & z = \sum_{i=1}^{n} p_i x_i \\ \text{s.t.}\ & \sum_{i=1}^{n} a_i x_i \le b \\ & x_i \in \{0, 1\} && \forall i \end{aligned} \]

and its linear relaxation

\[ z_{LP-KP} = \max_{x \in [0,1]^n} \Big\{ \sum_{i=1}^{n} p_i x_i : \sum_{i=1}^{n} a_i x_i \le b \Big\}. \]

The Lagrangian relaxation is as weak as the linear relaxation.

Indeed, $X = \{0, 1\}^n$ and obviously $\mathrm{conv}(X) = [0, 1]^n$, whose constraints are already contained in the linear relaxation.

According to Corollary 2, we have $w^* = z_{LP-KP}$.


3.6.2 Solution of the Lagrangian duals

Generalization of the gradient method for functions of class $C^1$ to piecewise $C^1$ functions (not everywhere differentiable).

Definition: Let $C \subseteq \mathbb{R}^n$ be a convex set and $f : C \to \mathbb{R}$ a convex function on $C$

• a vector $\gamma \in \mathbb{R}^n$ is a subgradient of $f$ at $\bar{x} \in C$ if

\[ f(x) \ge f(\bar{x}) + \gamma^T (x - \bar{x}) \quad \forall x \in C \]

• the subdifferential, denoted by $\partial f(\bar{x})$, is the set of all subgradients of $f$ at $\bar{x}$.

Examples:

- For $f(x) = x^2$, at $\bar{x} = 3$ the only subgradient is $\gamma = 6$. Indeed $0 \le (x - 3)^2 = x^2 - 6x + 9$ implies that for each $x$:

\[ f(x) = x^2 \ge 6x - 9 = 9 + 6(x - 3) = f(\bar{x}) + 6(x - \bar{x}) \]

- For $f(x) = |x|$ it is clear that: $\gamma = 1$ if $x > 0$, $\gamma = -1$ if $x < 0$, and $\partial f(x) = [-1, 1]$ if $x = 0$.


Properties:

1) A convex function $f : C \to \mathbb{R}$ has at least one subgradient at each interior point $x$ of $C$.

N.B.: The existence of (at least) one subgradient at any point of $\mathrm{int}(C)$, with $C$ convex, is a necessary and sufficient condition for $f$ to be convex on $\mathrm{int}(C)$.

2) If f is convex and x ∈ C , ∂f (x) is a nonempty, convex, closed and bounded set.

3) x∗ is a global minimum of f if and only if 0 ∈ ∂f (x∗).


Subgradient method

Consider the problem $\min_{x \in \mathbb{R}^n} f(x)$ with $f(x)$ convex.

Start from an arbitrary $x_0$.

At the $k$-th iteration: consider a $\gamma_k \in \partial f(x_k)$ and set

\[ x_{k+1} := x_k - \alpha_k \gamma_k \]

with $\alpha_k > 0$.

Observation: We do not perform a one-dimensional search because for nondifferentiable functions a subgradient $\gamma \in \partial f(x)$ does not necessarily give a descent direction!

Example: see below.

But for sufficiently small stepsizes, one gets closer to an optimal solution:

Lemma: If $x_k$ is a non-optimal solution and $x^*$ any optimal solution, then

\[ 0 < \alpha_k < 2\, \frac{f(x_k) - f(x^*)}{\|\gamma_k\|^2} \]

implies that $\|x_{k+1} - x^*\| < \|x_k - x^*\|$.
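The lemma follows from a short calculation that the slides do not spell out. The derivation below is a standard argument, added here as a sketch; it uses only the update rule $x_{k+1} = x_k - \alpha_k \gamma_k$ and the subgradient inequality $f(x^*) \ge f(x_k) + \gamma_k^T (x^* - x_k)$.

```latex
\begin{aligned}
\|x_{k+1} - x^*\|^2
  &= \|x_k - \alpha_k \gamma_k - x^*\|^2 \\
  &= \|x_k - x^*\|^2 - 2\alpha_k \gamma_k^T (x_k - x^*) + \alpha_k^2 \|\gamma_k\|^2 \\
  &\le \|x_k - x^*\|^2 - 2\alpha_k \big(f(x_k) - f(x^*)\big) + \alpha_k^2 \|\gamma_k\|^2 \\
  &= \|x_k - x^*\|^2 - \alpha_k \Big( 2\big(f(x_k) - f(x^*)\big) - \alpha_k \|\gamma_k\|^2 \Big).
\end{aligned}
```

The last bracket is positive exactly when $\alpha_k < 2\,(f(x_k) - f(x^*)) / \|\gamma_k\|^2$, so for such stepsizes the squared distance to $x^*$ strictly decreases, even though $f(x_{k+1})$ itself may increase.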


Example: $\min_{-1 \le x_1, x_2 \le 1} f(x_1, x_2)$ with $f(x_1, x_2) = \max\{-x_1,\ x_1 + x_2,\ x_1 - 2x_2\}$, which is not everywhere differentiable.

[Figure: level curves in brown, points of nondifferentiability in green (of type $(t, 0)$, $(-t, 2t)$ and $(-t, -t)$ for $t \ge 0$), global minimum $x^* = (0, 0)$.]

If $x_k = (1, 0)$ and we consider the subgradient $\gamma_k = (1, 1) \in \partial f(x_k)$, $f(x)$ increases along the half-line $\{x \in \mathbb{R}^2 : x = x_k - \alpha_k \gamma_k,\ \alpha_k \ge 0\}$, but if the step $\alpha_k$ is sufficiently small then $x_{k+1} = x_k - \alpha_k \gamma_k$ is closer to $x^*$.

From Chapter 8, Bazaraa et al., Nonlinear Programming, Wiley, 2006, p. 436-437
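The behaviour described in this example is easy to confirm numerically. The following sketch, with illustrative names and an arbitrary choice of three stepsizes, evaluates $f$ and the distance to $x^* = (0, 0)$ after one step from $x_k = (1, 0)$ along $-\gamma_k = -(1, 1)$.

```python
import math


def f(x1, x2):
    """The piecewise linear function of the example."""
    return max(-x1, x1 + x2, x1 - 2 * x2)


xk = (1.0, 0.0)
gamma = (1.0, 1.0)  # subgradient of f at xk (gradient of the piece x1 + x2)

# For each stepsize: (alpha, f at the new point, distance of new point to x*)
rows = []
for alpha in (0.1, 0.25, 0.5):
    nxt = (xk[0] - alpha * gamma[0], xk[1] - alpha * gamma[1])
    rows.append((alpha, f(*nxt), math.hypot(*nxt)))
```

For these stepsizes the objective value grows above $f(x_k) = 1$ while the distance to the minimizer drops below $\|x_k - x^*\| = 1$, in line with the lemma's bound $\alpha_k < 2 \cdot 1 / 2 = 1$.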

Theorem:

If $f$ is convex, $\lim_{\|x\| \to \infty} f(x) = +\infty$, $\lim_{k \to \infty} \alpha_k = 0$ and $\sum_{k=0}^{\infty} \alpha_k = \infty$, then the subgradient method terminates after a finite number of iterations with an optimal solution $x^*$ or it generates an infinite sequence $\{x_k\}$ which admits a subsequence converging to $x^*$.

Choice of the stepsize:

In practice, sequences $\{\alpha_k\}$ such that $\lim_{k \to \infty} \alpha_k = 0$, $\sum_{k=0}^{\infty} \alpha_k = \infty$ (e.g., $\alpha_k = 1/k$) are too slow.

An alternative is to choose $\alpha_k = \alpha_0 \rho^k$, for a given $\rho < 1$. A more sophisticated and popular rule is

\[ \alpha_k = \varepsilon_k\, \frac{f(x_k) - \hat{f}}{\|\gamma_k\|^2}, \]

where $0 < \varepsilon_k < 2$ and $\hat{f}$ is either the optimal value (minimum) $f(x^*)$ or an estimate.

Stopping criterion: prescribed maximum number of iterations because, even if $0 \in \partial f(x_k)$, that subgradient may not be the one considered at $x_k$.

N.B.: Since it is not a monotone method, one needs to store the best solution $x_k$ found so far.

The method can be easily extended to the case with bound constraints by including a simple projection step at each iteration.
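The ingredients above (diminishing stepsizes, storing the best iterate, projection onto bound constraints) can be combined in a few lines. The sketch below applies them to the example function $f(x_1, x_2) = \max\{-x_1, x_1 + x_2, x_1 - 2x_2\}$ on the box $[-1, 1]^2$. The stepsize $\alpha_k = 1/(k+1)$, the iteration limit and all function names are assumptions made for illustration.

```python
def f(x):
    return max(-x[0], x[0] + x[1], x[0] - 2 * x[1])


def subgradient(x):
    """Gradient of one linear piece that is active at x."""
    pieces = [
        (-x[0], (-1.0, 0.0)),
        (x[0] + x[1], (1.0, 1.0)),
        (x[0] - 2 * x[1], (1.0, -2.0)),
    ]
    return max(pieces, key=lambda piece: piece[0])[1]


def projected_subgradient(x0, iterations):
    """Subgradient method with alpha_k = 1/(k+1) and projection on [-1, 1]^2."""
    x = x0
    best_x, best_f = x0, f(x0)
    for k in range(iterations):
        g = subgradient(x)
        alpha = 1.0 / (k + 1)
        # Subgradient step followed by componentwise projection onto the box.
        x = tuple(min(1.0, max(-1.0, xi - alpha * gi)) for xi, gi in zip(x, g))
        if f(x) < best_f:  # the method is not monotone: keep the best point
            best_x, best_f = x, f(x)
    return best_x, best_f


best_x, best_f = projected_subgradient((1.0, 0.0), 5000)
```

The stopping rule is simply the iteration limit, as suggested in the text. With the rule $\alpha_k = \varepsilon_k (f(x_k) - \hat{f}) / \|\gamma_k\|^2$ and a good estimate $\hat{f}$, progress is usually faster.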


Subgradient method for solving the Lagrangian dual

Lagrangian dual:

\[ \max_{u \ge 0} w(u) \]

where $w(u) = \min \{ c^T x + u^T (d - Dx) : x \in X \subseteq \mathbb{R}^n \}$ is concave and piecewise linear.

Simple characterization of the subgradients of $w(u)$:

Proposition:

Consider $\bar{u} \ge 0$ and let $X(\bar{u}) = \{x \in X : w(\bar{u}) = c^T x + \bar{u}^T (d - Dx)\}$ be the set of optimal solutions of the Lagrangian subproblem (3) for $u = \bar{u}$. Then

• For each $x(\bar{u}) \in X(\bar{u})$, the vector $(d - Dx(\bar{u})) \in \partial w(\bar{u})$.

• Each subgradient of $w(u)$ at $\bar{u}$ can be expressed as a convex combination of subgradients $(d - Dx(\bar{u}))$ with $x(\bar{u}) \in X(\bar{u})$.

The first point is a straightforward consequence of the definitions of dual function and of subgradient.


Proof:

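By definition of $w(u)$:

\[ w(u) \le c^T x + u^T (d - Dx) \quad \forall x \in X,\ \forall u \ge 0. \]

In particular, for any $x(\bar{u}) \in X(\bar{u})$ we have

\[ w(u) \le c^T x(\bar{u}) + u^T (d - Dx(\bar{u})) \quad \forall u \ge 0 \qquad (12) \]

and clearly

\[ w(\bar{u}) = c^T x(\bar{u}) + \bar{u}^T (d - Dx(\bar{u})). \qquad (13) \]

Subtracting (13) from (12) we obtain

\[ w(u) - w(\bar{u}) \le (u - \bar{u})^T (d - Dx(\bar{u})) \quad \forall u \ge 0 \]

which is equivalent to

\[ w(u) \le w(\bar{u}) + (u - \bar{u})^T (d - Dx(\bar{u})) \quad \forall u \ge 0. \]

Thus $(d - Dx(\bar{u})) \in \partial w(\bar{u})$ (subgradient in the sense of concave functions).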


Procedure:

1) Select an initial $u_0$ and set $k := 0$.

2) Solve the Lagrangian subproblem

\[ \min \{ c^T x + u_k^T (d - Dx) : x \in X \}. \]

Let $x(u_k)$ be the optimal solution found; then $(d - Dx(u_k))$ is a subgradient of $w(u)$ at $u_k$.

3) Update the Lagrange multipliers:

\[ u_{k+1} = \max\{0,\ u_k + \alpha_k (d - Dx(u_k))\} \]

with, for instance, $\alpha_k = \varepsilon_k \dfrac{\hat{w} - w(u_k)}{\|d - Dx(u_k)\|^2}$, where $\hat{w}$ is an estimate of the optimal value $w^*$.

4) Set $k := k + 1$ and return to step 2.


Example: Lagrangian relaxation for the binary knapsack problem

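Consider

\[ \begin{aligned} \max\ & z = 10x_1 + 4x_2 + 14x_3 \\ \text{s.t.}\ & 3x_1 + x_2 + 4x_3 \le 4 \\ & x_1, x_2, x_3 \in \{0, 1\} \end{aligned} \]

Lagrangian dual:

\[ \min_{u \ge 0} w(u) = \min_{u \ge 0} \Big( \max_{x \in \{0,1\}^3} (10 - 3u)x_1 + (4 - u)x_2 + (14 - 4u)x_3 + 4u \Big) \]

with the dual function

\[ w(u) = \begin{cases} 28 - 4u & \text{for } u \in [0, 10/3] \\ 18 - u & \text{for } u \in [10/3, 14/4] \\ 4 + 3u & \text{for } u \in [14/4, 4] \\ 4u & \text{for } u \in [4, \infty) \end{cases} \]

[Figure: plot of $w(u)$ against $u$, with $w(0) = 28$, $w(10/3) = 44/3$, $w(14/4) = 29/2$ and $w(4) = 16$.]

Optimal solution $u^* = 14/4$ with $w^* = w(u^*) = 29/2$.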


Subgradient $\gamma_k = 4 - 3x_1^k - x_2^k - 4x_3^k$, where $x^k = x(u_k)$ is any optimal solution of the Lagrangian subproblem at the $k$-th iteration.

Lagrange multiplier update: $u_{k+1} = \max\{0,\ u_k - \alpha_k \gamma_k\}$ with $\alpha_k = 1/2^k$.

Subgradient method:

k   u_k   α_k      w(u_k)                                     x^k          γ_k
0   0     1        max (10, 4, 14)^T x + 4u = 28              (1, 1, 1)    -4
1   4     0.5      max (-2, 0, -2)^T x + 4u = 0 + 16          (0, 0, 0)*    4
2   2     0.25     max (4, 2, 6)^T x + 4u = 12 + 8            (1, 1, 1)    -4
3   3     0.125    max (1, 1, 2)^T x + 4u = 4 + 12            (1, 1, 1)    -4
4   3.5   0.0625   max (-0.5, 0.5, 0)^T x + 4u = 0.5 + 14     (0, 1, 0)*    3

Symbol *: there are several optimal solutions $x^k$ and we choose the lexicographically smallest one (set to 0 each variable $x_i^k$ with zero coefficient).

N.B.: The optimal value of the multiplier $u^* = 14/4$ is reached at iteration $k = 4$ but the considered subgradient is nonzero.
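The iterations of this example can be reproduced with a few lines of code. The sketch below follows the stated rules (stepsize $\alpha_k = 1/2^k$, variables with zero coefficient set to 0); the function names and the `history` tuple layout are illustrative choices, not part of the slides.

```python
PROFITS = (10, 4, 14)
WEIGHTS = (3, 1, 4)
CAPACITY = 4


def solve_subproblem(u):
    """Maximize sum_i (p_i - u*a_i) x_i + 4u over x in {0,1}^3.

    Variables with zero coefficient are set to 0, as in the table.
    """
    x = tuple(1 if p - u * a > 0 else 0 for p, a in zip(PROFITS, WEIGHTS))
    value = CAPACITY * u + sum(
        (p - u * a) * xi for p, a, xi in zip(PROFITS, WEIGHTS, x)
    )
    return x, value


def run(iterations):
    u, history = 0.0, []
    for k in range(iterations):
        x, w = solve_subproblem(u)
        gamma = CAPACITY - sum(a * xi for a, xi in zip(WEIGHTS, x))
        history.append((k, u, w, x, gamma))
        # Minimization of w(u): step against the subgradient, project on u >= 0.
        u = max(0.0, u - gamma / 2 ** k)
    return history


history = run(5)
```

All quantities are multiples of powers of 2, so the floating-point values coincide exactly with the entries of the table.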


3.6.3 Lagrangian relaxation for the STSP (Held & Karp)

Symmetric TSP: Given an undirected graph $G = (V, E)$ with a cost $c_e \in \mathbb{Z}_+$ for each edge $e \in E$, determine a Hamiltonian cycle of minimum total cost.

ILP formulation:

\[ \begin{aligned} \min\ & \sum_{e \in E} c_e x_e \\ \text{s.t.}\ & \sum_{e \in \delta(i)} x_e = 2 && \forall i \in V \qquad (14) \\ & \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1 \qquad (15) \\ & x_e \in \{0, 1\} && \forall e \in E \end{aligned} \]

where $E(S) = \{\{i, j\} \in E : i \in S,\ j \in S\}$.

Observation:

i) Due to the presence of constraints (14), half of the subtour-elimination constraints (15) are redundant:

$\sum_{e \in E(S)} x_e \le |S| - 1$ iff $\sum_{e \in E(\bar{S})} x_e \le |\bar{S}| - 1$, where $\bar{S} = V \setminus S$.

Thus all the constraints (15) with $1 \in S$ can be deleted.

ii) Summing over all the degree constraints (14) and dividing by 2, we obtain $\sum_{e \in E} x_e = n$, which can be added to the formulation.


Recall that a Hamiltonian cycle is a 1-tree (i.e., a spanning tree on nodes $\{2, \dots, n\}$ plus two edges incident to node 1) in which all nodes have exactly two incident edges.

Since

\[ \sum_{e \in E} c_e x_e + \sum_{i \in V} u_i \Big(2 - \sum_{e \in \delta(i)} x_e\Big) = \sum_{e = \{i,j\} \in E} (c_e - u_i - u_j) x_e + 2 \sum_{i \in V} u_i, \]

relaxing/dualizing the degree constraints (14) for all nodes except for node 1, we obtain the Lagrangian subproblem:

\[ \begin{aligned} w(u) = \min\ & \sum_{e = \{i,j\} \in E} (c_e - u_i - u_j) x_e + 2 \sum_{i \in V} u_i \\ \text{s.t.}\ & \sum_{e \in \delta(1)} x_e = 2 \\ & \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1,\ 1 \notin S \\ & \sum_{e \in E} x_e = n \\ & x_e \in \{0, 1\} && \forall e \in E \end{aligned} \]

where $u_1 = 0$ and $E(S) = \{\{i, j\} \in E : i \in S,\ j \in S\}$.

N.B.: The set of feasible solutions of this problem coincides with the set of all 1-trees.

To find a minimum cost 1-tree: determine a minimum cost spanning tree on nodes $\{2, \dots, n\}$ (Kruskal or Prim) and select two smallest cost edges incident to node 1.


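Observation: Since constraints

\[ \begin{aligned} & \sum_{e \in \delta(1)} x_e = 2 \\ & \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1,\ 1 \notin S \\ & \sum_{e \in E} x_e = n \end{aligned} \]

with $x \ge 0$ describe the convex hull of the (binary) incidence vectors of 1-trees, Corollary 2 implies that $w^* = z_{LP}$.

The linear relaxation

\[ \begin{aligned} \min\ & \sum_{e \in E} c_e x_e \\ \text{s.t.}\ & \sum_{e \in \delta(i)} x_e = 2 && \forall i \in V \\ & \sum_{e \in E(S)} x_e \le |S| - 1 && \forall S \subseteq V,\ 2 \le |S| \le n - 1 \\ & x_e \ge 0 && \forall e \in E \end{aligned} \]

with an exponential number of constraints can thus be solved without considering them explicitly.

Since the dualized constraints are equations, the Lagrangian dual is:

\[ \max_{u \in \mathbb{R}^{|V|} :\ u_1 = 0} w(u) \]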


Example: taken from L. Wolsey, Integer Programming, p. 175-177

Consider the undirected graph $G = (V, E)$ with 5 nodes and the cost matrix:

\[ C = \begin{pmatrix} - & 30 & 26 & 50 & 40 \\ 30 & - & 24 & 40 & 50 \\ 26 & 24 & - & 24 & 26 \\ 50 & 40 & 24 & - & 30 \\ 40 & 50 & 26 & 30 & - \end{pmatrix} \]

Dual function:

\[ w(u^k) = \min \Big\{ \sum_{e = \{i,j\} \in E} (c_e - u_i^k - u_j^k) x_e^k + 2 \sum_{i \in V} u_i^k \ :\ x^k \text{ incidence vector of a 1-tree} \Big\} \]

Notation: $c_{ij}^k = c_e - u_i^k - u_j^k$ for $e = \{i, j\} \in E$.

Subgradient $\gamma^k$ with $\gamma_i^k = 2 - \sum_{e \in \delta(i)} x_e^k$, where $x^k = x(u^k)$ is an optimal solution of the Lagrangian subproblem at the $k$-th iteration.

Since we did not relax the constraint $\sum_{e \in \delta(1)} x_e = 2$, we will always have $\gamma_1^k = 0$.

Starting from $u_1^k = 0$ for $k = 0$, this implies that $u_1^k = 0$ for each $k \ge 1$.


A feasible solution of cost 148 found with a primal heuristic:

$x_{12} = x_{23} = x_{34} = x_{45} = x_{51} = 1$ and $x_{ij} = 0$ for all other $\{i, j\} \in E$.

Solution of the Lagrangian dual starting from $u^0 = 0$ with $\varepsilon = 1$:

Solving the Lagrangian subproblem with costs

\[ C^0 = C = \begin{pmatrix} - & 30 & 26 & 50 & 40 \\ 30 & - & 24 & 40 & 50 \\ 26 & 24 & - & 24 & 26 \\ 50 & 40 & 24 & - & 30 \\ 40 & 50 & 26 & 30 & - \end{pmatrix} \]

($c_e^0 = c_e$ for each $e \in E$ since $u^0 = 0$),

we find $x(u^0)$ that corresponds to the 1-tree of cost 130 with edges $\{1,2\}, \{1,3\}, \{2,3\}, \{3,4\}, \{3,5\}$.

Knowing $x(u^0)$, we can compute the value of the dual function: $w(u^0) = 130 + 0$, that is, the 1-tree cost plus $2 \sum_{i \in V} u_i^0$.

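Subgradient

\[ \gamma^0 = (0,\ 0,\ -2,\ 1,\ 1)^T \]

Update the Lagrange multipliers, using the heuristic value $\hat{w} = 148$ as estimate of $w^*$:

\[ u^1 = u^0 + \frac{\hat{w} - w(u^0)}{\|\gamma^0\|^2}\, \gamma^0 = 0 + \frac{148 - 130}{6}\, (0,\ 0,\ -2,\ 1,\ 1)^T = (0,\ 0,\ -6,\ 3,\ 3)^T \]

Thus

\[ C^1 = \begin{pmatrix} - & 30 & 32 & 47 & 37 \\ 30 & - & 30 & 37 & 47 \\ 32 & 30 & - & 27 & 29 \\ 47 & 37 & 27 & - & 24 \\ 37 & 47 & 29 & 24 & - \end{pmatrix} \]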


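As optimal solution $x(u^1)$ of the Lagrangian subproblem with cost matrix $C^1$ we find the 1-tree of cost 143 with edges $\{1,2\}, \{1,3\}, \{2,3\}, \{3,4\}, \{4,5\}$,

and $w(u^1) = 143 + 2 \sum_{i \in V} u_i^1 = 143$.

Since

\[ \gamma^1 = (0,\ 0,\ -1,\ 0,\ 1)^T, \]

we have

\[ u^2 = u^1 + \frac{\hat{w} - w(u^1)}{\|\gamma^1\|^2}\, \gamma^1 = (0,\ 0,\ -6,\ 3,\ 3)^T + \frac{148 - 143}{2}\, (0,\ 0,\ -1,\ 0,\ 1)^T = (0,\ 0,\ -17/2,\ 3,\ 11/2)^T \]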


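Therefore

\[ C^2 = \begin{pmatrix} - & 30 & 34.5 & 47 & 34.5 \\ 30 & - & 32.5 & 37 & 44.5 \\ 34.5 & 32.5 & - & 29.5 & 29 \\ 47 & 37 & 29.5 & - & 21.5 \\ 34.5 & 44.5 & 29 & 21.5 & - \end{pmatrix} \]

and we obtain $x(u^2)$ that corresponds to a 1-tree of cost 147.5,

and $w(u^2) = 147.5 + 0$.

Since $w(u^2) = 147.5$ is a lower bound on $z^*$ and all costs $c_e$ are integer, we have $z^* \ge \lceil 147.5 \rceil = 148$; hence the feasible solution of cost 148 found by the heuristic is optimal.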
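The three iterations of this example can be replayed in code. The sketch below is one possible implementation under stated assumptions: nodes are renumbered 0..4 (node 0 plays the role of node 1), the minimum spanning tree is computed with Kruskal's algorithm and a small union-find, and the names `min_one_tree` and `held_karp_bounds` are hypothetical. The stepsize is the one used above, with $\varepsilon = 1$ and target value 148.

```python
def min_one_tree(cost, u):
    """Edges of a minimum 1-tree w.r.t. reduced costs c_ij - u_i - u_j."""
    n = len(cost)

    def reduced(i, j):
        return cost[i][j] - u[i] - u[j]

    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Kruskal on the nodes 1..n-1 (i.e. all nodes except the special one).
    tree = []
    for _, i, j in sorted((reduced(i, j), i, j)
                          for i in range(1, n) for j in range(i + 1, n)):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    # Add the two cheapest edges incident to the special node 0.
    cheapest = sorted((reduced(0, j), 0, j) for j in range(1, n))[:2]
    return tree + [(i, j) for _, i, j in cheapest]


def held_karp_bounds(cost, target, iterations):
    n = len(cost)
    u, bounds = [0.0] * n, []
    for _ in range(iterations):
        tree = min_one_tree(cost, u)
        degree = [sum(1 for e in tree if i in e) for i in range(n)]
        w = sum(cost[i][j] - u[i] - u[j] for i, j in tree) + 2 * sum(u)
        bounds.append(w)
        gamma = [2 - d for d in degree]
        norm2 = sum(g * g for g in gamma)
        if norm2 == 0:  # the 1-tree is a Hamiltonian cycle
            break
        step = (target - w) / norm2
        u = [ui + step * g for ui, g in zip(u, gamma)]
    return bounds


COST = [
    [0, 30, 26, 50, 40],
    [30, 0, 24, 40, 50],
    [26, 24, 0, 24, 26],
    [50, 40, 24, 0, 30],
    [40, 50, 26, 30, 0],
]
bounds = held_karp_bounds(COST, 148, 3)
```

The diagonal zeros in `COST` are placeholders that are never read, since only edges between distinct nodes are generated.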

JAVA Applet: http://itp.nat.uni-magdeburg.de/mertens/TSP/index.html


3.6.4 Choice of the Lagrangian dual

For problems containing different groups of constraints, we need to decide which constraints to relax.

Choice criteria:

i) Strength of the bound $w^*$ obtained by solving the Lagrangian dual,

ii) Difficulty of solving the Lagrangian subproblems

\[ w(u) = \min \{ c^T x + u^T (d - Dx) : x \in X \subseteq \mathbb{R}^n \}, \]

iii) Difficulty of solving the Lagrangian dual: $w^* = \max_{u \ge 0} w(u)$.

For (i) we have the characterization of the Lagrangian dual bound $w^*$ in terms of LP.

The difficulty of the Lagrangian subproblems depends on the specific problem.

The difficulty of the Lagrangian dual depends, among other things, on the number of dual variables.

We look for a reasonable trade-off.


Example: Generalized assignment problem

Given a set $I$ of processes and a set $J$ of machines with

$c_{ij}$ the cost for executing process $i \in I$ on machine $j \in J$,

$w_{ij}$ the amount of resource required to execute process $i \in I$ on machine $j \in J$,

$b_j$ the total amount of resource available on machine $j \in J$,

assign the processes to the machines so as to minimize the total cost, while respecting the resource constraints.

Assumption: Once started, processes cannot be interrupted.

ILP formulation

\[ \begin{aligned} \min\ & z = \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} \\ \text{s.t.}\ & \sum_{j \in J} x_{ij} = 1 && \forall i \in I \\ & \sum_{i \in I} w_{ij} x_{ij} \le b_j && \forall j \in J \\ & x_{ij} \in \{0, 1\} && \forall i \in I,\ \forall j \in J \end{aligned} \]

where $x_{ij} = 1$ if process $i \in I$ is assigned to machine $j \in J$, and $x_{ij} = 0$ otherwise.


Three possible Lagrangian relaxations:

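1) Relaxing the capacity constraints:

\[ \begin{aligned} w_1(u) = \min\ & \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} - \sum_{j \in J} u_j \Big(b_j - \sum_{i \in I} w_{ij} x_{ij}\Big) \\ \text{s.t.}\ & \sum_{j \in J} x_{ij} = 1 && \forall i \in I \\ & x_{ij} \in \{0, 1\} && \forall i \in I,\ \forall j \in J \end{aligned} \]

Trivial solution: assign each process $i \in I$ to a machine $j \in J$ that minimizes $c_{ij} + u_j w_{ij}$.

N.B.: If the integrality conditions were relaxed, the solution would not change because for each $i$

\[ \mathrm{conv}\Big(\Big\{x_i \in \{0, 1\}^{|J|} : \sum_{j \in J} x_{ij} = 1\Big\}\Big) = \Big\{x_i \in \mathbb{R}^{|J|} : \sum_{j \in J} x_{ij} = 1,\ 0 \le x_i \le 1\Big\}. \]

2) Relaxing the assignment constraints:

\[ \begin{aligned} w_2(v) = \min\ & \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} - \sum_{i \in I} v_i \Big(\sum_{j \in J} x_{ij} - 1\Big) \\ \text{s.t.}\ & \sum_{i \in I} w_{ij} x_{ij} \le b_j && \forall j \in J \\ & x_{ij} \in \{0, 1\} && \forall i \in I,\ \forall j \in J \end{aligned} \]

Since the variables $x_{ij}$ corresponding to different machines $j$ are not linked, the subproblem decomposes into $|J|$ independent binary knapsack subproblems, with profit $-c_{ij} + v_i$ (max version) and weight $w_{ij}$ for item $i \in I$.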


3) Relaxing all the constraints, we obtain the Lagrangian subproblem:

\[ \begin{aligned} w_3(u, v) = \min\ & \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} - \sum_{i \in I} v_i \Big(\sum_{j \in J} x_{ij} - 1\Big) - \sum_{j \in J} u_j \Big(b_j - \sum_{i \in I} w_{ij} x_{ij}\Big) \\ \text{s.t.}\ & x_{ij} \in \{0, 1\} && \forall i \in I,\ \forall j \in J \end{aligned} \]

Trivial solution: set $x_{ij} = 1$ if $c_{ij} - v_i + u_j w_{ij} < 0$, $x_{ij} = 0$ if $c_{ij} - v_i + u_j w_{ij} > 0$, and arbitrarily $x_{ij} = 0$ or $x_{ij} = 1$ if $c_{ij} - v_i + u_j w_{ij} = 0$.

Observations:

i) According to Corollary 2, the first and third Lagrangian relaxation bounds are as weak as (not stronger than) the linear relaxation one.

ii) Since the ideal formulation for a binary knapsack problem contains many other inequalities (e.g., cover inequalities), namely for each $j \in J$

\[ \mathrm{conv}\Big(\Big\{x_j \in \{0, 1\}^{|I|} : \sum_{i \in I} w_{ij} x_{ij} \le b_j\Big\}\Big) \subset \Big\{x_j \in \mathbb{R}^{|I|} : \sum_{i \in I} w_{ij} x_{ij} \le b_j,\ 0 \le x_j \le 1\Big\}, \]

the second Lagrangian relaxation provides a potentially stronger bound.
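To make the first relaxation concrete, the sketch below evaluates $w_1(u)$ with the "trivial solution" rule: each process goes to a machine minimizing $c_{ij} + u_j w_{ij}$. The function name and the small instance (3 processes, 2 machines) are hypothetical, chosen only for illustration.

```python
def gap_capacity_relaxation(cost, weight, capacity, u):
    """Evaluate w_1(u) for the GAP with dualized capacity constraints.

    cost[i][j] = c_ij, weight[i][j] = w_ij, capacity[j] = b_j, u[j] >= 0.
    Returns (w_1(u), machine chosen for each process).
    """
    value = -sum(u_j * b_j for u_j, b_j in zip(u, capacity))
    assignment = []
    for c_i, w_i in zip(cost, weight):
        penalized = [c + u_j * w for c, w, u_j in zip(c_i, w_i, u)]
        j = min(range(len(u)), key=penalized.__getitem__)
        assignment.append(j)
        value += penalized[j]
    return value, assignment


# Hypothetical instance.
cost = [[4, 6], [5, 3], [7, 2]]
weight = [[2, 3], [3, 2], [4, 1]]
capacity = [4, 3]
bound, assignment = gap_capacity_relaxation(cost, weight, capacity, [1.0, 0.5])
```

For this instance the cheapest assignment (process 0 on machine 0, processes 1 and 2 on machine 1) is feasible with cost 9, so the value returned for these multipliers is a valid but not tight lower bound.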


3.6.5 Lagrangian heuristics

When $u_k$ gets closer to $u^*$, the optimal solutions $x(u_k)$ of the Lagrangian subproblems tend to be feasible for the primal problem.

Example: For Symmetric TSP, many nodes of the minimum cost 1-tree have degree 2.

Often simple heuristics make it possible to convert $x(u_k)$ into a feasible solution without worsening the objective function value too much.

Example: Set covering problem (SCP)

\[ \min \Big\{ \sum_{j=1}^{n} c_j x_j : \sum_{j=1}^{n} a_{ij} x_j \ge 1\ \ \forall i \in \{1, \dots, m\},\ x \in \{0, 1\}^n \Big\} \]

with $a_{ij} \in \{0, 1\}$ for $1 \le i \le m$, $1 \le j \le n$.

Lagrangian subproblem obtained by relaxing all covering constraints:

\[ \min \Big\{ \sum_{j=1}^{n} \Big(c_j - \sum_{i=1}^{m} u_i a_{ij}\Big) x_j : x \in \{0, 1\}^n \Big\} + \sum_{i=1}^{m} u_i \]

with $u \ge 0$.


Let $\tilde{c}_j = c_j - \sum_{i=1}^{m} u_i a_{ij}$, for $j = 1, \dots, n$, then

\[ \min \Big\{ \sum_{j=1}^{n} \tilde{c}_j x_j : x \in \{0, 1\}^n \Big\} + \sum_{i=1}^{m} u_i \]

can be solved by setting $x_j = 1$ if $\tilde{c}_j < 0$, and $x_j = 0$ otherwise.

To derive a feasible solution of SCP, it suffices to:

- Consider an optimal solution $x(u_k)$ of the Lagrangian subproblem.

- Delete the rows (elements) covered by $x(u_k)$.

- "Cover" the remaining rows (elements) greedily and let $y^*$ be the resulting partial greedy solution.

- Verify whether some of the components of the SCP feasible solution $x(u_k) + y^*$ can be set to 0.

- Return the resulting SCP solution.
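The steps above can be sketched as follows. The slides do not specify the greedy rule or the order in which redundant columns are tested, so two assumptions are made here: the greedy step picks the column with the smallest cost per newly covered row, and redundancy is tested from the most expensive column down. The function name and the instance are hypothetical, and the instance is assumed to be feasible (every row is covered by at least one column).

```python
def lagrangian_heuristic(cost, a, u):
    """Turn the Lagrangian subproblem solution into a feasible cover.

    cost[j] = c_j, a[i][j] in {0, 1}, u[i] >= 0. Returns sorted column indices.
    """
    m, n = len(a), len(cost)
    reduced = [cost[j] - sum(u[i] * a[i][j] for i in range(m)) for j in range(n)]
    # Step 1: subproblem solution, x_j = 1 iff the reduced cost is negative.
    chosen = {j for j in range(n) if reduced[j] < 0}
    # Step 2: rows not yet covered.
    uncovered = {i for i in range(m) if not any(a[i][j] for j in chosen)}
    # Step 3: greedy completion (assumed rule: cost per newly covered row).
    while uncovered:
        candidates = [j for j in range(n) if any(a[i][j] for i in uncovered)]
        best = min(candidates,
                   key=lambda j: cost[j] / sum(a[i][j] for i in uncovered))
        chosen.add(best)
        uncovered -= {i for i in uncovered if a[i][best]}
    # Step 4: drop columns that are not needed, most expensive first.
    for j in sorted(chosen, key=lambda j: (-cost[j], j)):
        rest = chosen - {j}
        if all(any(a[i][k] for k in rest) for i in range(m)):
            chosen = rest
    return sorted(chosen)


# Hypothetical instance: 3 rows, 4 columns.
A = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
COSTS = [2, 3, 3, 10]
cover = lagrangian_heuristic(COSTS, A, [3, 1, 2])
```

Here the subproblem selects columns 0 and 1, the greedy step adds column 2 to cover the last row, and the final pass removes column 1, which has become redundant.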


Fixing the value of some variables:

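Consider a feasible SCP solution of value $\bar{z}$; then any better solution $x$ satisfies

\[ \sum_{i=1}^{m} u_i + \min_{x \in \{0,1\}^n} \Big\{ \sum_{j=1}^{n} \Big(c_j - \sum_{i=1}^{m} u_i a_{ij}\Big) x_j \Big\} \le c^T x < \bar{z} \]

Property:

Let $N^+ = \{j \in N : (c_j - \sum_{i=1}^{m} u_i a_{ij}) > 0\}$ and $N^- = \{j \in N : (c_j - \sum_{i=1}^{m} u_i a_{ij}) < 0\}$, where $N = \{1, \dots, n\}$.

- If $k \in N^+$ and $\sum_{i=1}^{m} u_i + \sum_{j \in N^-} (c_j - \sum_{i=1}^{m} u_i a_{ij}) + (c_k - \sum_{i=1}^{m} u_i a_{ik}) \ge \bar{z}$, then $x_k = 0$ in any better (feasible) solution.

- If $k \in N^-$ and $\sum_{i=1}^{m} u_i + \sum_{j \in N^- \setminus \{k\}} (c_j - \sum_{i=1}^{m} u_i a_{ij}) \ge \bar{z}$, then $x_k = 1$ in any better (feasible) solution.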
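Both fixing tests amount to comparing a modified Lagrangian bound with the incumbent value. The sketch below implements them directly from the property; the function name `fix_variables` and the instance (2 rows, 4 columns, incumbent of value 7) are hypothetical.

```python
def fix_variables(cost, a, u, z_bar):
    """Apply the two reduced-cost fixing tests for the set covering problem.

    Returns (indices that can be fixed to 0, indices that can be fixed to 1).
    """
    m, n = len(a), len(cost)
    reduced = [cost[j] - sum(u[i] * a[i][j] for i in range(m)) for j in range(n)]
    # Lagrangian bound w(u): sum of multipliers plus all negative reduced costs.
    bound = sum(u) + sum(r for r in reduced if r < 0)
    # Forcing x_k = 1 for k in N+ raises the bound by its reduced cost.
    fixed_to_zero = [k for k in range(n)
                     if reduced[k] > 0 and bound + reduced[k] >= z_bar]
    # Forcing x_k = 0 for k in N- removes its negative contribution.
    fixed_to_one = [k for k in range(n)
                    if reduced[k] < 0 and bound - reduced[k] >= z_bar]
    return fixed_to_zero, fixed_to_one


# Hypothetical instance: column 2 alone is a feasible cover of value 7.
A = [[1, 0, 1, 0],
     [0, 1, 1, 1]]
COSTS = [4, 1, 7, 9]
to_zero, to_one = fix_variables(COSTS, A, [6, 1], 7)
```

In this instance the only cover cheaper than 7 uses columns 0 and 1 (cost 5), which indeed has $x_0 = 1$ and $x_3 = 0$, in agreement with the fixings returned.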

Example: see Chapter 10 of L. Wolsey, Integer Programming, p.178.

