Lecture 7

Convex Problems, Separation Theorems

September 17, 2008

Outline

• Preliminary for Duality Theory

• Separation Theorems (Ch. 2.5 of Boyd and Vandenberghe’s book)

  ◦ Supporting Hyperplane Theorem

  ◦ Separating Hyperplane Theorems

• Duality

  ◦ Motivation

  ◦ Visualization of Primal-Dual Framework

  ◦ Primal-Dual Constrained Optimization Problems

  ◦ Dual Function Properties

  ◦ Weak and Strong Duality

  ◦ Examples

Some Terminology

Given a hyperplane $H = \{x \in \mathbb{R}^n \mid a^T x = b\}$, we say that

• The hyperplane $H$ passes through a vector $x_0$ when

$x_0 \in H \iff a^T x_0 = b$

• The hyperplane $H$ contains a set $C$ in one of its halfspaces when

either $a^T z \le b$ for all $z \in C$ or $a^T z \ge b$ for all $z \in C$


Supporting Hyperplane Theorem

Th. Let $C \subseteq \mathbb{R}^n$ be a nonempty convex set. Let $x_0$ be such that

either $x_0 \in \mathrm{bd}\, C$ or $x_0 \notin C$

Then, there exists a hyperplane passing through $x_0$ and containing the set $C$ in one of its halfspaces, i.e., there is a vector $a \in \mathbb{R}^n$, $a \ne 0$, such that

$\sup_{z \in C} a^T z \le a^T x_0$

• A hyperplane with this property is referred to as a supporting hyperplane


Proof of the Supporting Hyperplane Theorem

Proof for the case when C is closed:

Let $\{x_k\}$ be a sequence with $x_k \notin C$ for all $k$ and $x_k \to x_0$. (Why does it exist?)

Let $z_k^*$ be the projection of $x_k$ on the set $C$ for each $k$. Consider

$a_k = \dfrac{x_k - z_k^*}{\|x_k - z_k^*\|}$ for $k \ge 1$

This sequence is bounded (each $a_k$ has unit norm) and, therefore, it has a limit point, say $a \in \mathbb{R}^n$. Since $\|a\| = 1$, we have $a \ne 0$.

Let $\{a_k\}_K$ be a subsequence of $\{a_k\}$ converging to $a$. Since $z_k^*$ is the projection of $x_k$ for each $k$, by the Projection Theorem (b), it follows that for each $k$

$(z_k^* - x_k)^T (z - z_k^*) \ge 0$ for all $z \in C$

implying that for all $k$

$a_k^T z \le a_k^T z_k^*$ for all $z \in C$

Because $a_k^T z_k^* = a_k^T (z_k^* - x_k) + a_k^T x_k < a_k^T x_k$ for all $k$, we obtain

$a_k^T z < a_k^T x_k$ for all $z \in C$

Since $x_k \to x_0$ and $a_k \to a$ over $k \in K$, it follows that

$a^T z \le a^T x_0$ for all $z \in C$
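
The construction in the proof can be traced numerically. The following is a minimal sketch, assuming a concrete choice that is not in the lecture: $C$ is the unit square $[0,1]^2$, $x_0 = (1, 0.5)$ is a boundary point, and $x_k = x_0 + (1/k, 0)$. The helper names are hypothetical. It forms $a_k$ from the projection and checks the strict inequality $a_k^T z < a_k^T x_k$, then the limiting inequality at $x_0$.

```python
import math

def dot(p, q):
    """Inner product p^T q of two equal-length vectors."""
    return sum(pi * qi for pi, qi in zip(p, q))

def project_onto_unit_box(x):
    """Euclidean projection onto C = [0, 1]^n, a closed convex set."""
    return [min(max(xi, 0.0), 1.0) for xi in x]

def unit_normal(xk):
    """a_k = (x_k - z_k*) / ||x_k - z_k*|| for a point x_k outside C."""
    zk = project_onto_unit_box(xk)
    diff = [xi - zi for xi, zi in zip(xk, zk)]
    norm = math.sqrt(dot(diff, diff))
    return [d / norm for d in diff]

x0 = [1.0, 0.5]  # assumed boundary point of the unit square
corners = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

normals = []
for k in (1, 10, 100, 1000):
    xk = [x0[0] + 1.0 / k, x0[1]]  # x_k is outside C and x_k -> x0
    ak = unit_normal(xk)
    # A linear function attains its maximum over the box at a corner.
    sup_over_c = max(dot(ak, z) for z in corners)
    assert sup_over_c < dot(ak, xk)  # strict inequality from the proof
    normals.append(ak)

a = normals[-1]  # in this example every a_k is (1, 0), so the limit is (1, 0)
sup_limit = max(dot(a, z) for z in corners)
assert sup_limit <= dot(a, x0)  # supporting hyperplane inequality at x0
```

In this example the limit direction is $a = (1, 0)$ and the supporting hyperplane is the line $\{x \mid x_1 = 1\}$, the right edge of the square.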


Separating Hyperplane Theorems

Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets, i.e., $C \cap D = \emptyset$.

Then, there exists a hyperplane separating these sets, i.e., there is $a \in \mathbb{R}^n$, $a \ne 0$, such that

$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$


Proof of the Separating Hyperplane Theorem

Consider the set $Y = C - D = \{x - z \mid x \in C,\ z \in D\}$. This is a (nonempty) convex set.

Since $C \cap D = \emptyset$, it follows that $0 \notin Y$.

By the Supporting Hyperplane Theorem applied with $x_0 = 0$, it follows that there exists $a \in \mathbb{R}^n$, $a \ne 0$, such that

$\sup_{y \in Y} a^T y \le 0.$

Hence, $a^T y \le 0$ for all $y \in Y$. Because $Y = C - D$, we have

$a^T x \le a^T z$ for all $x \in C$ and all $z \in D$

Taking the supremum over $x \in C$ and the infimum over $z \in D$, we obtain

$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$
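
As a small numeric illustration of the separation inequality, the sketch below assumes two disjoint closed discs in $\mathbb{R}^2$ (the centers and radii are made up) and takes $a$ to be the difference of the centers. That choice of $a$ is an assumption that happens to work for balls; it is not the construction used in the proof. For a ball with center $c$ and radius $r$, the supremum of $a^T x$ is $a^T c + r\|a\|$ and the infimum is $a^T c - r\|a\|$.

```python
import math

def dot(p, q):
    """Inner product p^T q of two equal-length vectors."""
    return sum(pi * qi for pi, qi in zip(p, q))

def sup_over_ball(a, center, radius):
    """sup of a^T x over the closed ball: a^T c + r * ||a||."""
    return dot(a, center) + radius * math.sqrt(dot(a, a))

def inf_over_ball(a, center, radius):
    """inf of a^T z over the closed ball: a^T c - r * ||a||."""
    return dot(a, center) - radius * math.sqrt(dot(a, a))

# Assumed data: the centers are 5 apart and the radii sum to 3, so C and D are disjoint.
c_center, c_radius = [0.0, 0.0], 1.0
d_center, d_radius = [3.0, 4.0], 2.0

# Candidate normal: points from C toward D (an illustrative choice).
a = [dc - cc for cc, dc in zip(c_center, d_center)]

sup_c = sup_over_ball(a, c_center, c_radius)  # 0 + 1 * 5 = 5
inf_d = inf_over_ball(a, d_center, d_radius)  # 25 - 2 * 5 = 15
assert sup_c <= inf_d
```

Any hyperplane $\{x \mid a^T x = b\}$ with $b$ between the two computed values separates the discs.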


Strictly Separating Hyperplane Theorem

Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets.

Assume that $C - D$ is closed. Then, there exists a hyperplane strictly separating the sets, i.e., there is $a \in \mathbb{R}^n$, $a \ne 0$, such that

$\sup_{x \in C} a^T x < \inf_{z \in D} a^T z$

Proof: Homework assignment.

• When is $C - D$ closed?

• One sufficient condition: $C$ is closed and $D$ is compact
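
To see why closedness of $C - D$ cannot simply be dropped, here is a standard example (not from the lecture) of two closed convex disjoint sets whose difference is not closed and that cannot be strictly separated:

```latex
C = \{(x_1, x_2) \mid x_1 > 0,\ x_1 x_2 \ge 1\}, \qquad
D = \{(x_1, 0) \mid x_1 \in \mathbb{R}\}
% C and D are closed, convex, and disjoint, but
C - D = \{(u_1, u_2) \mid u_2 > 0\}
% is an open halfplane, hence not closed.
% Any separating a must have a_1 = 0 (otherwise \inf_{z \in D} a^T z = -\infty)
% and a_2 < 0 (otherwise \sup_{x \in C} a^T x = +\infty), and then
\sup_{x \in C} a^T x = 0 = \inf_{z \in D} a^T z
```

The two sets are separated by the $x_1$-axis, but not strictly, because points of $C$ come arbitrarily close to $D$.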


Duality Theory

• An important part of optimization theory

• Its implications are far reaching both in theory and practice

• A powerful tool providing:

• A basis for the development of a rich class of optimization algorithms

• A general systematic way for developing bounding strategies (both in

continuous and discrete optimization)

• A basis for sensitivity analysis


Main Idea and Issues in Duality Theory

• Associate an “equivalent dual problem” with a given (primal) problem

• Methodology applicable to a general constrained optimization problem

• Investigate:

• Is there a general relation between the primal and its associated dual problem?

• Under which conditions do the primal and the dual problems have the same optimal value?

• Under which conditions do the primal and dual optimal solutions exist?

• What are the relations between primal and dual optimal solutions?

• What kind of information do the dual optimal solutions provide about the primal problem?


Geometric Visualization of Duality

We illustrate duality using an abstract “geometric framework”

• This framework provides insights into:

• Weak duality

• Strong duality (zero duality gap)

• Existence of duality gap

Within this setting, we define:

• A “geometric primal problem” using an abstract set $V \subseteq \mathbb{R}^m \times \mathbb{R}$

• A corresponding “geometric dual problem” using the hyperplanes that support the set $V$


Geometric Primal

Consider an abstract (nonempty) set $V$ of vectors $(u, w) \in \mathbb{R}^m \times \mathbb{R}$

The set $V$ intersects the $w$-axis, i.e.,

$(0, w) \in V$ for some $w \in \mathbb{R}$

The set $V$ extends “north” and “east” (with $w$ on the vertical axis):

[East] For any $(u, w) \in V$ and $\bar{u} \in \mathbb{R}^m$ with $u \preceq \bar{u}$, we have $(\bar{u}, w) \in V$

[North] For any $(u, w) \in V$ and $\bar{w} \in \mathbb{R}$ with $w \le \bar{w}$, we have $(u, \bar{w}) \in V$

• Geometric Primal Problem

Determine the minimum intercept of the set $V$ and the $w$-axis:

minimize $w$

subject to $(0, w) \in V$

The minimum intercept value is denoted by $f^*$, i.e., $f^* = \inf_{(0, w) \in V} w$.


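Nonvertical Hyperplanes

A hyperplane in $\mathbb{R}^m \times \mathbb{R}$: $\{(u, w) \mid \mu^T u + \mu_0 w = \xi\}$, $\mu \in \mathbb{R}^m$, $\mu_0, \xi \in \mathbb{R}$

• We say that a hyperplane is nonvertical when $\mu_0 \ne 0$

• Let $H_{\mu,\xi}$ denote a nonvertical hyperplane in $\mathbb{R}^m \times \mathbb{R}$, i.e.,

$H_{\mu,\xi} = \{(u, w) \mid \mu^T u + w = \xi\}$ with $\mu \in \mathbb{R}^m$, $\xi \in \mathbb{R}$

• Let $q(\mu)$ be the minimum value of $\mu^T u + w$ for $(u, w) \in V$, i.e.,

$q(\mu) = \inf_{(u, w) \in V} \{\mu^T u + w\}$

• A nonvertical hyperplane $H_{\mu,\xi}$ supports a set $V$ when $\xi = q(\mu)$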


Geometric Dual Problem

• A hyperplane supporting the set $V$ intersects the $w$-axis at $(0, q(\mu))$

• Geometric Dual Problem: Determine the maximum intercept with the $w$-axis for the nonvertical hyperplanes that support the set $V$:

maximize $q(\mu)$

subject to $\mu \in \mathbb{R}^m$

• Note: $q(\mu) = \inf_{(u, w) \in V} \{\mu^T u + w\}$, and $q(\mu) = -\infty$ for $\mu \not\succeq 0$


Observations

Primal: minimize $w$

subject to $(0, w) \in V$

Dual: maximize $q(\mu)$

subject to $\mu \succeq 0$

• Every dual value $q(\mu)$ is at most $f^*$, and at most any $w$ with $(0, w) \in V$

• Dual optimal value $q^*$ never exceeds the primal optimal value $f^*$:

$q^* \le f^*$ (Weak Duality)

• The inequality may be strict, i.e., $q^* < f^*$, in which case there is a Duality Gap
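
The weak duality relation follows in one line from the definitions of $q(\mu)$ and $f^*$ in the geometric framework; a short derivation, written out as a sketch:

```latex
% Fix any \mu \succeq 0 and any w with (0, w) \in V. Since (0, w) is one point of V,
q(\mu) = \inf_{(u, w') \in V} \{\mu^T u + w'\} \;\le\; \mu^T 0 + w = w .
% Taking the infimum over all such w, and then the supremum over \mu \succeq 0:
q(\mu) \le \inf_{(0, w) \in V} w = f^*
\quad\Longrightarrow\quad
q^* = \sup_{\mu \succeq 0} q(\mu) \le f^* .
```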


Duality Gap Illustrations

• A duality gap may exist even for a convex set V


Strong Duality

• We may have $q^* = f^*$ (Strong Duality)

• However, a nonvertical hyperplane achieving the maximum intercept may not exist

• With or without convexity of $V$, only one relation is sure: $q^* \le f^*$


Constrained Optimization Duality

Primal Problem (not necessarily convex)

minimize $f(x)$

subject to $g_j(x) \le 0$, $j = 1, \ldots, m$

$x \in X$

variable $x \in \mathbb{R}^n$, feasible, optimal value $f^* > -\infty$

Geometric Framework:

• Define the set $V$ as follows:

$V = \{(u, w) \mid \text{there is } x \in X \text{ such that } g(x) \preceq u,\ f(x) \le w\}$

• Dual function is:

$q(\mu) = \inf_{(u, w) \in V} \{w + \mu^T u\} = \inf_{x \in X} \{f(x) + \mu^T g(x)\}$, $\mu \succeq 0$

• Dual Problem:

maximize $q(\mu)$

subject to $\mu \succeq 0$


General Case

Primal Problem (not necessarily convex)

minimize $f(x)$

subject to $g_j(x) \le 0$, $j = 1, \ldots, m$

$h_j(x) = 0$, $j = 1, \ldots, p$

$x \in X$

variable $x \in \mathbb{R}^n$, feasible, optimal value $f^* > -\infty$

Lagrangian Function: $L : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}$ given by

$L(x, \mu, \lambda) = f(x) + \sum_{j=1}^{m} \mu_j g_j(x) + \sum_{j=1}^{p} \lambda_j h_j(x) = f(x) + \mu^T g(x) + \lambda^T h(x)$

• Weighted sum of the objective and constraint functions

• $\mu \in \mathbb{R}^m$ is the Lagrange multiplier associated with $g = (g_1, \ldots, g_m)$

• $\lambda \in \mathbb{R}^p$ is the Lagrange multiplier associated with $h = (h_1, \ldots, h_p)$


Dual Problem

Dual Function:

$q(\mu, \lambda) = \inf_{x \in X} L(x, \mu, \lambda) = \inf_{x \in X} \{f(x) + \mu^T g(x) + \lambda^T h(x)\}$

The infimum above is implicitly restricted to the domain of the primal problem

• Dual Problem:

maximize $q(\mu, \lambda)$

subject to $\mu \succeq 0$, $\lambda \in \mathbb{R}^p$

• Important properties (they hold without any assumptions on the primal problem):

• Concave Dual: $q(\mu, \lambda)$ is concave, the constraint set is convex

• Lower Bound: For any $\mu \succeq 0$ and $\lambda \in \mathbb{R}^p$, we have $q(\mu, \lambda) \le f^*$
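
Both properties can be checked by hand on a one-dimensional problem. The sketch below assumes a toy primal that is not in the lecture: minimize $x^2$ subject to $1 - x \le 0$ with $X = \mathbb{R}$, so $f^* = 1$. The Lagrangian $x^2 + \mu(1 - x)$ is minimized at $x = \mu/2$, giving $q(\mu) = \mu - \mu^2/4$. The code evaluates $q$ on a grid and checks the lower bound and midpoint concavity.

```python
def dual_value(mu):
    """q(mu) = inf_x { x^2 + mu * (1 - x) }, attained at x = mu / 2."""
    x_min = mu / 2.0
    return x_min ** 2 + mu * (1.0 - x_min)

f_star = 1.0  # primal optimum of the assumed toy problem, attained at x = 1

grid = [i / 10 for i in range(0, 61)]  # mu in [0, 6]
values = [dual_value(mu) for mu in grid]

# Lower bound property: q(mu) <= f* for every mu >= 0.
assert all(q <= f_star + 1e-12 for q in values)

# Concavity, checked at midpoints of an evenly spaced grid.
for i in range(1, len(grid) - 1):
    assert values[i] >= 0.5 * (values[i - 1] + values[i + 1]) - 1e-12

best_mu = max(grid, key=dual_value)  # grid maximizer of the dual function
q_star = dual_value(best_mu)
```

For this toy problem the dual maximum is attained at $\mu = 2$ with $q^* = 1 = f^*$, so there is no duality gap.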


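Least-Norm Solution of Linear Equations

minimize $x^T x$

subject to $Ax = b$

Dual Function:

• Lagrangian is: $L(x, \lambda) = x^T x + \lambda^T (Ax - b)$

• To minimize $L$ over $x \in \mathbb{R}^n$, set the gradient $\nabla_x L$ equal to zero:

$\nabla_x L(x, \lambda) = 2x + A^T \lambda = 0 \implies x_\lambda = -\tfrac{1}{2} A^T \lambda$

• Plug $x_\lambda$ in $L$ to obtain $q(\lambda)$:

$q(\lambda) = L(x_\lambda, \lambda) = -\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda$

a concave function of $\lambda$

Lower Bound Property: $-\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda \le f^*$ for all $\lambda$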
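
A tiny instance makes the lower bound concrete. The sketch below assumes a single equation $a^T x = b$ with $a = (1, 2)$ and $b = 5$ (made-up data), so $\lambda$ is a scalar and $A A^T = a^T a$. It compares the dual function with the primal optimum, using the least-norm solution $x^* = a\,b / (a^T a)$ and the dual maximizer $\lambda^* = -2b / (a^T a)$, which comes from setting the derivative of $q$ to zero.

```python
def least_norm_dual(lam, a_row, b):
    """q(lam) = -(1/4) * lam^2 * (a^T a) - b * lam for one equation a^T x = b."""
    aat = sum(ai * ai for ai in a_row)
    return -0.25 * lam * lam * aat - b * lam

a_row, b = [1.0, 2.0], 5.0  # assumed example data
aat = sum(ai * ai for ai in a_row)  # A A^T, here the scalar 5

# Primal optimum: least-norm solution x* = a * b / (a^T a) = (1, 2).
x_star = [ai * b / aat for ai in a_row]
f_star = sum(xi * xi for xi in x_star)  # x*^T x* = 5

# Dual maximizer from dq/dlam = 0: lam* = -2 b / (a^T a) = -2.
lam_star = -2.0 * b / aat
q_star = least_norm_dual(lam_star, a_row, b)

# Lower bound property on a grid of lam values in [-5, 5].
grid = [i / 10 for i in range(-50, 51)]
assert all(least_norm_dual(lam, a_row, b) <= f_star + 1e-12 for lam in grid)
```

Here $q(\lambda^*) = f^* = 5$, which is consistent with the lower bound being tight for this convex problem.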


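Standard Form LP

minimize $c^T x$

subject to $Ax = b$, $x \succeq 0$

Dual Function:

• Lagrangian is:

$L(x, \mu, \lambda) = c^T x + \lambda^T (Ax - b) - \mu^T x = -b^T \lambda + (c + A^T \lambda - \mu)^T x$

• $L$ is linear in $x$, hence

$q(\mu, \lambda) = \inf_x L(x, \mu, \lambda) = \begin{cases} -b^T \lambda & \text{when } A^T \lambda - \mu + c = 0 \\ -\infty & \text{otherwise} \end{cases}$

$q$ is linear on the affine domain $\{(\mu, \lambda) \mid A^T \lambda - \mu + c = 0\}$, hence concave

Lower Bound Property: $-b^T \lambda \le f^*$ when $A^T \lambda + c \succeq 0$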
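
A two-variable instance, chosen here for illustration and not taken from the lecture, shows the bound at work:

```latex
% Assumed data: c = (1, 2), A = [1 \;\; 1], b = 1.
\text{minimize } x_1 + 2 x_2 \quad \text{subject to } x_1 + x_2 = 1,\ x \succeq 0
\qquad\Longrightarrow\qquad f^* = 1 \text{ at } x = (1, 0).
% Dual feasibility A^T \lambda + c \succeq 0 reads
\lambda + 1 \ge 0, \quad \lambda + 2 \ge 0 \quad\Longleftrightarrow\quad \lambda \ge -1 ,
% so the lower bound -b^T \lambda = -\lambda satisfies
-\lambda \le 1 = f^* \quad \text{for all } \lambda \ge -1, \text{ with equality at } \lambda = -1 .
```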


Two-Way Partitioning

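minimize $x^T W x$

subject to $x_i^2 = 1$, $i = 1, \ldots, n$

• A nonconvex problem: feasible set contains $2^n$ discrete points

• Interpretation: partition $\{1, \ldots, n\}$ in two sets; $W_{ij}$ is the cost of assigning $i, j$ to the same set; $-W_{ij}$ is the cost of assigning them to different sets

Dual Function:

$q(\lambda) = \inf_x \left\{ x^T W x + \sum_i \lambda_i (x_i^2 - 1) \right\} = \inf_x x^T [W + \mathrm{diag}(\lambda)] x - \mathbf{1}^T \lambda = \begin{cases} -\mathbf{1}^T \lambda & \text{when } W + \mathrm{diag}(\lambda) \succeq 0 \\ -\infty & \text{otherwise} \end{cases}$

Lower Bound Property: $-\mathbf{1}^T \lambda \le f^*$ when $W + \mathrm{diag}(\lambda) \succeq 0$

example: $\lambda = -\lambda_{\min}(W) \mathbf{1}$ gives bound $n \lambda_{\min}(W) \le f^*$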
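
The eigenvalue bound can be compared with the true optimum by brute force when $n$ is tiny. The sketch below assumes a made-up symmetric $2 \times 2$ matrix $W$ and uses the closed-form smallest eigenvalue of a symmetric $2 \times 2$ matrix, so it needs only the standard library. The function names are hypothetical.

```python
import itertools
import math

def quad_form(W, x):
    """Evaluate x^T W x for a square matrix W given as nested lists."""
    n = len(x)
    return sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def brute_force_optimum(W):
    """Minimize x^T W x over all 2^n sign vectors x in {-1, +1}^n."""
    n = len(W)
    return min(quad_form(W, x) for x in itertools.product((-1, 1), repeat=n))

def lambda_min_2x2(W):
    """Smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = W[0][0], W[0][1], W[1][1]
    return 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)

W = [[1.0, 2.0], [2.0, -1.0]]  # assumed example cost matrix

f_star = brute_force_optimum(W)        # x^T W x = 4 * x1 * x2, so f* = -4
bound = len(W) * lambda_min_2x2(W)     # n * lambda_min(W) = -2 * sqrt(5)
assert bound <= f_star
```

For this $W$ the bound is about $-4.47$ while $f^* = -4$, so the eigenvalue bound is valid but not tight.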


Weak and Strong Duality

Weak Duality: $q^* \le f^*$

• Holds always without any assumptions on the primal problem

• Can be used to compute nontrivial lower bounds for difficult problems

For example, a lower bound for the two-way partitioning problem can be obtained by solving the SDP

maximize $-\mathbf{1}^T \lambda$

subject to $W + \mathrm{diag}(\lambda) \succeq 0$

Nonzero Duality Gap: $q^* < f^*$

• May hold even for a convex primal problem

Zero Duality Gap: $q^* = f^*$ (Strong Duality)

• Does not hold in general

• Even for convex primal problems additional conditions are needed

• The conditions that guarantee strong duality in convex problems are referred to as constraint qualifications


Examples with Zero Duality Gap

Examples show that, in general, the relation $q^* = f^*$ provides no information about the existence of dual optimal solutions

• Unique Dual Optimal:

minimize $x_1 - x_2$

subject to $x_1 + x_2 \le 1$

$x_1 \ge 0$, $x_2 \ge 0$

minimize $\tfrac{1}{2}(x_1^2 + x_2^2)$

subject to $x_1 \le 1$

• Multiple Dual Optimal:

minimize $|x_1| + x_2$

subject to $x_1 \le 0$, $x_2 \ge 0$

• No Dual Optimal:

minimize $x$

subject to $x^2 \le 0$

Similarly, we can construct examples when $q^* = f^*$ and there is no information about the existence of the primal optimal solutions
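
The “No Dual Optimal” example can be worked out explicitly; the following derivation is a sketch that assumes $X = \mathbb{R}$ and a single multiplier $\mu \ge 0$ for the constraint $x^2 \le 0$:

```latex
% The only feasible point is x = 0, so f^* = 0.
q(\mu) = \inf_{x \in \mathbb{R}} \{x + \mu x^2\} =
\begin{cases}
-\dfrac{1}{4\mu} & \mu > 0 \quad (\text{attained at } x = -\tfrac{1}{2\mu}) \\[4pt]
-\infty & \mu = 0
\end{cases}
% Hence
q^* = \sup_{\mu \ge 0} q(\mu) = 0 = f^* ,
% but q(\mu) < 0 for every \mu \ge 0, so the supremum is not attained.
```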


Examples with Duality Gap

• Discrete Optimization:

minimize $-x$

subject to $x \le 1$, $x \in \{0, 2\}$

• Convex Optimization:

minimize $e^{-\sqrt{x_1 x_2}}$

subject to $x_1^2 \le 0$, $x_1 \ge 0$, $x_2 \ge 0$
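
The gap in the discrete example can be computed directly. The sketch below assumes the constraint is dualized as $g(x) = x - 1 \le 0$ with $X = \{0, 2\}$, so that $q(\mu) = \min_{x \in \{0, 2\}} \{-x + \mu (x - 1)\}$, and evaluates $q$ on a grid of multipliers.

```python
def dual_value(mu):
    """q(mu) = min over x in {0, 2} of -x + mu * (x - 1)."""
    return min(-x + mu * (x - 1) for x in (0, 2))

# Primal optimum: x = 0 is the only point of {0, 2} satisfying x <= 1.
f_star = min(-x for x in (0, 2) if x <= 1)

grid = [i / 100 for i in range(0, 501)]  # mu in [0, 5]
q_star = max(dual_value(mu) for mu in grid)  # maximum on the grid

gap = f_star - q_star
assert q_star < f_star  # strict inequality: a nonzero duality gap
```

Here $q(\mu) = \min\{-\mu,\ \mu - 2\}$, which peaks at $\mu = 1$ with $q^* = -1$, while $f^* = 0$, so the duality gap equals 1.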

