Lecture 7 Convex Problems, Separation Theorems September 17, 2008



Outline

• Preliminary for Duality Theory

• Separation Theorems (Ch. 2.5 of Boyd and Vandenberghe’s book)
  ◦ Supporting Hyperplane Theorem
  ◦ Separating Hyperplane Theorems

• Duality
  ◦ Motivation
  ◦ Visualization of Primal-Dual Framework
  ◦ Primal-Dual Constrained Optimization Problems
  ◦ Dual Function Properties
  ◦ Weak and Strong Duality
  ◦ Examples

Convex Optimization 1


Some Terminology

Given a hyperplane $H = \{x \in \mathbb{R}^n \mid a^T x = b\}$, we say that

• The hyperplane $H$ passes through a vector $x_0$ when

$$x_0 \in H \iff a^T x_0 = b$$

• The hyperplane $H$ contains a set $C$ in one of its halfspaces when

$$\text{either } a^T z \le b \text{ for all } z \in C \quad \text{or} \quad a^T z \ge b \text{ for all } z \in C$$
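As a small illustration of these two definitions, the following Python sketch checks them for a finite set of points. The function names, the tolerance `tol`, and the example data are assumptions made for this illustration and do not come from the lecture.

```python
def dot(a, x):
    """Inner product a^T x of two equal-length sequences."""
    return sum(ai * xi for ai, xi in zip(a, x))

def passes_through(a, b, x0, tol=1e-12):
    """True when a^T x0 = b (up to tol), i.e. x0 lies on H."""
    return abs(dot(a, x0) - b) <= tol

def contains_in_halfspace(a, b, points, tol=1e-12):
    """True when all points satisfy a^T z <= b, or all satisfy a^T z >= b."""
    values = [dot(a, z) - b for z in points]
    return all(v <= tol for v in values) or all(v >= -tol for v in values)

# Hypothetical data: H = {x | x1 + x2 = 1} in the plane.
a, b = (1.0, 1.0), 1.0
triangle = [(0, 0), (1, 0), (0, 1)]
square = [(0, 0), (1, 0), (0, 1), (1, 1)]

print(passes_through(a, b, (1, 0)))            # (1, 0) lies on H
print(contains_in_halfspace(a, b, triangle))   # all points on one side
print(contains_in_halfspace(a, b, square))     # (1, 1) falls on the other side
```

Only finite point sets are handled here; for a general set $C$ one would need its supremum and infimum of $a^T z$.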


Supporting Hyperplane Theorem

Th. Let $C \subseteq \mathbb{R}^n$ be a nonempty convex set. Let $x_0$ be such that

$$\text{either } x_0 \in \mathrm{bd}\, C \quad \text{or} \quad x_0 \notin C$$

Then, there exists a hyperplane passing through $x_0$ and containing the set $C$ in one of its halfspaces, i.e., there is a vector $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{z \in C} a^T z \le a^T x_0$$

• A hyperplane with this property is referred to as a supporting hyperplane


Proof of the Supporting Hyperplane Theorem

Proof for the case when $C$ is closed:

Let $\{x_k\}$ be a sequence with $x_k \notin C$ for all $k$ and $x_k \to x_0$. (Why does it exist?)

Let $z_k^*$ be the projection of $x_k$ on the set $C$ for each $k$. Consider

$$a_k = \frac{x_k - z_k^*}{\|x_k - z_k^*\|} \quad \text{for } k \ge 1$$

This sequence is bounded and, therefore, it has a limit point, say $a \in \mathbb{R}^n$. Since $\|a_k\| = 1$ for all $k$, we have $\|a\| = 1$, so $a \ne 0$.

Let $\{a_k\}_{k \in K}$ be a subsequence of $\{a_k\}$ converging to $a$. Since $z_k^*$ is the projection of $x_k$ for each $k$, by the Projection Theorem (b), it follows that for each $k$

$$(z_k^* - x_k)^T (z - z_k^*) \ge 0 \quad \text{for all } z \in C$$

implying that for all $k$

$$a_k^T z \le a_k^T z_k^* \quad \text{for all } z \in C$$

Because $a_k^T z_k^* = a_k^T (z_k^* - x_k) + a_k^T x_k < a_k^T x_k$ for all $k$, we obtain

$$a_k^T z < a_k^T x_k \quad \text{for all } z \in C$$

Since $x_k \to x_0$ and $a_k \to a$ over $k \in K$, it follows that

$$a^T z \le a^T x_0 \quad \text{for all } z \in C$$
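The construction in the proof can be traced numerically in a simple case. The sketch below assumes that $C$ is the closed unit disk in $\mathbb{R}^2$ (so the projection has a closed form) and that $x_0 = (0.6, 0.8)$ is a boundary point. These choices and the helper names are illustrative assumptions, not part of the lecture.

```python
import math

def project_onto_unit_disk(x):
    """Euclidean projection of x onto C = {z | ||z|| <= 1}."""
    norm = math.hypot(*x)
    return x if norm <= 1.0 else tuple(xi / norm for xi in x)

def normal_from_projection(x):
    """a = (x - z*) / ||x - z*|| for a point x outside C, z* its projection."""
    z = project_onto_unit_disk(x)
    d = tuple(xi - zi for xi, zi in zip(x, z))
    n = math.hypot(*d)
    return tuple(di / n for di in d)

x0 = (0.6, 0.8)                                   # boundary point, ||x0|| = 1
for k in (1, 10, 100, 1000):
    xk = tuple((1.0 + 1.0 / k) * c for c in x0)   # x_k outside C, x_k -> x0
    ak = normal_from_projection(xk)
    print(k, ak)
```

In this example every $a_k$ equals $x_0$ up to rounding, and the limit $a = x_0$ satisfies $\sup_{z \in C} a^T z = \|a\| = 1 = a^T x_0$.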


Separating Hyperplane Theorems

Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets, i.e., $C \cap D = \emptyset$. Then, there exists a hyperplane separating these sets, i.e., there is $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$$


Proof of the Separating Hyperplane Theorem

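Consider the set $Y = C - D = \{x - z \mid x \in C,\ z \in D\}$. This is a (nonempty) convex set.

Since $C \cap D = \emptyset$, it follows that $0 \notin Y$.

By the Supporting Hyperplane Theorem applied to $Y$ and $x_0 = 0$, there exists $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{y \in Y} a^T y \le 0.$$

Hence, $a^T y \le 0$ for all $y \in Y$. Because $Y = C - D$, we have

$$a^T x \le a^T z \quad \text{for all } x \in C \text{ and all } z \in D$$

Taking the supremum over $x \in C$ and the infimum over $z \in D$, we obtain

$$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$$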
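The final inequality can be checked on a concrete pair of sets. The sketch below assumes two disjoint Euclidean disks, for which the supremum and infimum of a linear function have closed forms. The choice $a = c_D - c_C$ (difference of the centers) is an assumption that works for disks; the proof itself does not construct $a$ this way.

```python
import math

def sup_over_ball(a, center, radius):
    """sup of a^T x over a Euclidean ball: a^T c + r * ||a||."""
    return sum(ai * ci for ai, ci in zip(a, center)) + radius * math.hypot(*a)

def inf_over_ball(a, center, radius):
    """inf of a^T z over a Euclidean ball: a^T c - r * ||a||."""
    return sum(ai * ci for ai, ci in zip(a, center)) - radius * math.hypot(*a)

# Hypothetical data: two unit disks with centers (0, 0) and (3, 0).
c_C, r_C = (0.0, 0.0), 1.0
c_D, r_D = (3.0, 0.0), 1.0
a = tuple(d - c for c, d in zip(c_C, c_D))        # a = c_D - c_C

sup_C = sup_over_ball(a, c_C, r_C)
inf_D = inf_over_ball(a, c_D, r_D)
print(sup_C, inf_D, sup_C <= inf_D)
```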


Strictly Separating Hyperplane Theorem

Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets.

Assume that $C - D$ is closed. Then, there exists a hyperplane strictly separating the sets, i.e., there is $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{x \in C} a^T x < \inf_{z \in D} a^T z$$

Proof: Homework assignment.

• When is $C - D$ closed?

• One sufficient condition: $C$ is closed and $D$ is compact
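To see why some condition is needed, consider a standard example that is not taken from the lecture: $C = \{(x, y) \mid y \le 0\}$ and $D = \{(x, y) \mid x > 0,\ xy \ge 1\}$. Both sets are closed, convex, and disjoint, but neither is compact. The sketch below evaluates the distance from points of $D$ to $C$ and shows that it shrinks toward zero, so $0$ is a limit point of $C - D$ that does not belong to it, and strict separation is impossible.

```python
def dist_to_lower_halfplane(point):
    """Distance from `point` to C = {(x, y) | y <= 0}."""
    return max(point[1], 0.0)

# Points (t, 1/t) lie on the boundary of D = {(x, y) | x > 0, x*y >= 1}.
ts = (1.0, 10.0, 100.0, 1000.0)
gaps = [dist_to_lower_halfplane((t, 1.0 / t)) for t in ts]
print(gaps)   # positive, but decreasing toward 0
```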


Duality Theory

• An important part of optimization theory

• Its implications are far reaching both in theory and practice

• A powerful tool providing:
  ◦ A basis for the development of a rich class of optimization algorithms
  ◦ A general systematic way for developing bounding strategies (both in continuous and discrete optimization)
  ◦ A basis for sensitivity analysis


Main Idea and Issues in Duality Theory

• Associate an “equivalent dual problem” with a given (primal) problem

• Methodology applicable to a general constrained optimization problem

• Investigate:
  ◦ Is there a general relation between the primal and its associated dual problem?
  ◦ Under which conditions do the primal and the dual problems have the same optimal value?
  ◦ Under which conditions do primal and dual optimal solutions exist?
  ◦ What are the relations between primal and dual optimal solutions?
  ◦ What kind of information do the dual optimal solutions provide about the primal problem?


Geometric Visualization of Duality

We illustrate duality using an abstract “geometric framework”

• This framework provides insights into:
  ◦ Weak duality
  ◦ Strong duality (zero duality gap)
  ◦ Existence of a duality gap

Within this setting, we define:

• A “geometric primal problem” using an abstract set $V \subseteq \mathbb{R}^m \times \mathbb{R}$

• A corresponding “geometric dual problem” using the hyperplanes that support the set $V$


Geometric Primal

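Consider an abstract (nonempty) set $V$ of vectors $(u, w) \in \mathbb{R}^m \times \mathbb{R}$

The set $V$ intersects the $w$-axis, i.e., $(0, w) \in V$ for some $w \in \mathbb{R}$

The set $V$ extends “north” and “east” (with $w$ on the vertical axis):

[North] For any $(u, w) \in V$ and $\tilde{w} \in \mathbb{R}$ with $w \le \tilde{w}$, we have $(u, \tilde{w}) \in V$

[East] For any $(u, w) \in V$ and $\tilde{u} \in \mathbb{R}^m$ with $u \preceq \tilde{u}$, we have $(\tilde{u}, w) \in V$

• Geometric Primal Problem

Determine the minimum intercept of the set $V$ and the $w$-axis:

$$\begin{array}{ll} \text{minimize} & w \\ \text{subject to} & (0, w) \in V \end{array}$$

The minimum intercept value is denoted by $f^*$, i.e., $f^* = \inf_{(0,w) \in V} w$.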


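Nonvertical Hyperplanes

A hyperplane in $\mathbb{R}^m \times \mathbb{R}$: $\{(u, w) \mid \mu^T u + \mu_0 w = \xi\}$, with $\mu \in \mathbb{R}^m$ and $\mu_0, \xi \in \mathbb{R}$

• We say that a hyperplane is nonvertical when $\mu_0 \ne 0$

• Let $H_{\mu,\xi}$ denote a nonvertical hyperplane in $\mathbb{R}^m \times \mathbb{R}$ (normalized so that $\mu_0 = 1$), i.e.,

$$H_{\mu,\xi} = \{(u, w) \mid \mu^T u + w = \xi\} \quad \text{with } \mu \in \mathbb{R}^m,\ \xi \in \mathbb{R}$$

• Let $q(\mu)$ be the minimum value of $\mu^T u + w$ for $(u, w) \in V$, i.e.,

$$q(\mu) = \inf_{(u,w) \in V} \{\mu^T u + w\}$$

• A nonvertical hyperplane $H_{\mu,\xi}$ supports a set $V$ when $\xi = q(\mu)$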


Geometric Dual Problem

• A hyperplane supporting the set $V$ intersects the $w$-axis at $(0, q(\mu))$

• Geometric Dual Problem: Determine the maximum intercept with the $w$-axis for the nonvertical hyperplanes that support the set $V$:

$$\begin{array}{ll} \text{maximize} & q(\mu) \\ \text{subject to} & \mu \in \mathbb{R}^m \end{array}$$

• Note: $q(\mu) = \inf_{(u,w) \in V} \{\mu^T u + w\}$, and $q(\mu) = -\infty$ for $\mu \not\succeq 0$


Observations

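$$\text{Primal:} \quad \begin{array}{ll} \text{minimize} & w \\ \text{subject to} & (0, w) \in V \end{array}$$

$$\text{Dual:} \quad \begin{array}{ll} \text{maximize} & q(\mu) \\ \text{subject to} & \mu \succeq 0 \end{array}$$

• Dual values $q(\mu)$ are always at or below $f^*$ and at or below any $w$ with $(0, w) \in V$

• Dual optimal value $q^*$ never exceeds the primal optimal value $f^*$:

$$q^* \le f^* \qquad \text{(Weak Duality)}$$

• Weak duality may hold strictly, i.e., $q^* < f^*$, in which case there is a duality gap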
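The geometric primal and dual values can be computed directly when $V$ is generated by finitely many points. The sketch below assumes $m = 1$ and takes $V$ to be the set of all $(u, w)$ lying north-east of at least one generator $(u_i, w_i)$. The generators, the grid over $\mu$, and the function name `q` as code are assumptions chosen for illustration.

```python
def q(mu, generators):
    """Geometric dual function for V = {(u, w) | u >= u_i, w >= w_i for some i}.

    For mu >= 0 the infimum of mu*u + w over V is attained at a generator;
    for mu < 0 it is -infinity because V is unbounded in the +u direction.
    """
    if mu < 0:
        return float("-inf")
    return min(mu * u + w for u, w in generators)

# Hypothetical generators (u_i, w_i); the resulting set V is not convex.
generators = [(-1.0, 3.0), (1.0, 0.0)]

f_star = min(w for u, w in generators if u <= 0)     # minimum w-axis intercept
grid = [k / 100.0 for k in range(0, 501)]            # mu in [0, 5]
q_star = max(q(mu, generators) for mu in grid)       # dual optimum on the grid
print(f_star, q_star, q(-1.0, generators))
```

Here $q(\mu) = \min\{3 - \mu,\ \mu\}$ for $\mu \ge 0$, so the dual optimum is strictly below $f^*$ and this nonconvex $V$ exhibits a duality gap.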


Duality Gap Illustrations

• A duality gap may exist even for a convex set V


Strong Duality

• We may have $q^* = f^*$ (Strong Duality)

• However, a nonvertical hyperplane achieving the maximum intercept may not exist

• With or without convexity of $V$, only one relation is guaranteed: $q^* \le f^*$


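Constrained Optimization Duality

Primal Problem (not necessarily convex)

$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & g_j(x) \le 0, \quad j = 1, \ldots, m \\ & x \in X \end{array}$$

The variable is $x \in \mathbb{R}^n$; the problem is feasible, with optimal value $f^* > -\infty$

Geometric Framework:

• Define the set $V$ as follows:

$$V = \{(u, w) \mid \text{there is } x \in X \text{ such that } g(x) \preceq u,\ f(x) \le w\}$$

• Dual function is:

$$q(\mu) = \inf_{(u,w) \in V} \{w + \mu^T u\} = \inf_{x \in X} \{f(x) + \mu^T g(x)\}, \quad \mu \succeq 0$$

• Dual Problem:

$$\begin{array}{ll} \text{maximize} & q(\mu) \\ \text{subject to} & \mu \succeq 0 \end{array}$$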
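A one-dimensional problem makes the dual function concrete. The sketch below assumes the hypothetical problem of minimizing $x^2$ subject to $1 - x \le 0$ with $X = \mathbb{R}$, for which the Lagrangian $x^2 + \mu(1 - x)$ is minimized at $x = \mu/2$ and $f^* = 1$. The grid search over $\mu$ is only an illustration, not a recommended solution method.

```python
def q(mu):
    """Dual function of: minimize x^2 subject to 1 - x <= 0, X = R.

    The Lagrangian x^2 + mu*(1 - x) is minimized over x at x = mu/2.
    """
    x = mu / 2.0
    return x * x + mu * (1.0 - x)

f_star = 1.0                                   # attained at x = 1
grid = [k / 100.0 for k in range(0, 501)]      # mu in [0, 5]
best_mu = max(grid, key=q)                     # maximizer of q on the grid
print(best_mu, q(best_mu), f_star)
```

For this convex example the dual optimal value matches $f^*$, and every grid value of $q$ stays at or below it.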


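General Case

Primal Problem (not necessarily convex)

$$\begin{array}{ll} \text{minimize} & f(x) \\ \text{subject to} & g_j(x) \le 0, \quad j = 1, \ldots, m \\ & h_j(x) = 0, \quad j = 1, \ldots, p \\ & x \in X \end{array}$$

The variable is $x \in \mathbb{R}^n$; the problem is feasible, with optimal value $f^* > -\infty$

Lagrangian Function: $L : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}$ given by

$$L(x, \mu, \lambda) = f(x) + \sum_{j=1}^m \mu_j g_j(x) + \sum_{j=1}^p \lambda_j h_j(x) = f(x) + \mu^T g(x) + \lambda^T h(x)$$

• Weighted sum of the objective and constraint functions

• $\mu \in \mathbb{R}^m$ is the Lagrange multiplier associated with $g = (g_1, \ldots, g_m)$

• $\lambda \in \mathbb{R}^p$ is the Lagrange multiplier associated with $h = (h_1, \ldots, h_p)$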


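Dual Problem

Dual Function:

$$q(\mu, \lambda) = \inf_{x \in X} L(x, \mu, \lambda) = \inf_{x \in X} \{f(x) + \mu^T g(x) + \lambda^T h(x)\}$$

The infimum above is subject to the implicit constraint that $x$ lies in the primal problem domain

• Dual Problem:

$$\begin{array}{ll} \text{maximize} & q(\mu, \lambda) \\ \text{subject to} & \mu \succeq 0,\ \lambda \in \mathbb{R}^p \end{array}$$

• Important properties (these hold without any assumptions on the primal problem):
  ◦ Concave Dual: $q(\mu, \lambda)$ is concave and the dual constraint set is convex
  ◦ Lower Bound: For any $\mu \succeq 0$ and $\lambda \in \mathbb{R}^p$, we have $q(\mu, \lambda) \le f^*$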
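Both properties can be observed numerically even when the primal problem is not convex. The sketch below assumes a hypothetical problem with a finite set $X$, a single inequality constraint, and no equality constraints, so that $q$ is a minimum of finitely many affine functions of $\mu$. The data and the midpoint test are illustrative choices.

```python
X = [-1.0, 0.0, 1.0, 2.0]            # hypothetical finite set X (nonconvex)

def f(x):
    return x * x                      # objective

def g(x):
    return 1.0 - x                    # constraint g(x) <= 0, i.e. x >= 1

def q(mu):
    """Dual function: minimum over X of the Lagrangian f(x) + mu * g(x)."""
    return min(f(x) + mu * g(x) for x in X)

f_star = min(f(x) for x in X if g(x) <= 0)
grid = [k / 10.0 for k in range(0, 61)]          # mu in [0, 6]

# Lower bound: q(mu) <= f* for every mu >= 0 on the grid.
lower_bound_ok = all(q(mu) <= f_star for mu in grid)
# Midpoint concavity: q((s + t)/2) >= (q(s) + q(t))/2 for all grid pairs.
concave_ok = all(
    q(0.5 * (s + t)) >= 0.5 * (q(s) + q(t)) - 1e-12
    for s in grid for t in grid
)
print(f_star, lower_bound_ok, concave_ok)
```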


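Least-Norm Solution of Linear Equations

$$\begin{array}{ll} \text{minimize} & x^T x \\ \text{subject to} & Ax = b \end{array}$$

Dual Function:

• Lagrangian is: $L(x, \lambda) = x^T x + \lambda^T (Ax - b)$

• To minimize $L$ over $x \in \mathbb{R}^n$, set the gradient $\nabla_x L$ equal to zero:

$$\nabla_x L(x, \lambda) = 2x + A^T \lambda = 0 \implies x_\lambda = -\tfrac{1}{2} A^T \lambda$$

• Plug $x_\lambda$ in $L$ to obtain $q(\lambda)$:

$$q(\lambda) = L(x_\lambda, \lambda) = -\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda$$

a concave function of $\lambda$

Lower Bound Property: $-\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda \le f^*$ for all $\lambda$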
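The formulas above can be checked on the smallest possible instance. The sketch below assumes a single equation $a^T x = \beta$ (so $A = a^T$ and $\lambda$ is a scalar) with the hypothetical data $a = (1, 1)$, $\beta = 2$. In that case $f^* = \beta^2 / \|a\|^2$ and $q$ is maximized at $\lambda = -2\beta / \|a\|^2$.

```python
def q(lam, a, beta):
    """Dual function -(1/4) lam^2 ||a||^2 - beta*lam for a single row A = a^T."""
    return -0.25 * lam * lam * sum(ai * ai for ai in a) - beta * lam

def x_of(lam, a):
    """Minimizer of the Lagrangian: x_lam = -(1/2) A^T lam."""
    return [-0.5 * lam * ai for ai in a]

a, beta = [1.0, 1.0], 2.0                       # hypothetical data
norm_sq = sum(ai * ai for ai in a)
f_star = beta * beta / norm_sq                  # least-norm value for one equation
lam_star = -2.0 * beta / norm_sq                # stationary point of q

x_star = x_of(lam_star, a)
print(f_star, q(lam_star, a, beta), x_star)
```

At `lam_star` the dual value equals $f^*$ and `x_star` satisfies the equation, while other values of $\lambda$ give strictly smaller lower bounds.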


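Standard Form LP

$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & Ax = b,\ x \succeq 0 \end{array}$$

Dual Function:

• Lagrangian is:

$$\begin{aligned} L(x, \mu, \lambda) &= c^T x + \lambda^T (Ax - b) - \mu^T x \\ &= -b^T \lambda + (c + A^T \lambda - \mu)^T x \end{aligned}$$

• $L$ is linear in $x$, hence

$$q(\mu, \lambda) = \inf_x L(x, \mu, \lambda) = \begin{cases} -b^T \lambda & \text{when } A^T \lambda - \mu + c = 0 \\ -\infty & \text{otherwise} \end{cases}$$

$q$ is linear on the affine domain $\{(\mu, \lambda) \mid A^T \lambda - \mu + c = 0\}$, hence concave

Lower Bound Property: $-b^T \lambda \le f^*$ when $A^T \lambda + c \succeq 0$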
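The lower bound property can be tried on a tiny LP. The sketch below assumes the hypothetical data $c = (1, 2)$, $A = [1\ 1]$, $b = 1$, whose feasible set is the segment between $(1, 0)$ and $(0, 1)$. The grid over $\lambda$ and the helper name `dual_value` are illustrative assumptions.

```python
c = [1.0, 2.0]
A = [[1.0, 1.0]]      # one equality constraint: x1 + x2 = 1
b = [1.0]

def dual_value(lam):
    """-b^T lam when A^T lam + c >= 0 (take mu = A^T lam + c), else -infinity."""
    reduced = [
        c[j] + sum(A[i][j] * lam[i] for i in range(len(A)))
        for j in range(len(c))
    ]
    if all(r >= 0 for r in reduced):
        return -sum(bi * li for bi, li in zip(b, lam))
    return float("-inf")

# Primal optimum by comparing the two vertices (1, 0) and (0, 1).
f_star = min(c[0] * 1.0 + c[1] * 0.0, c[0] * 0.0 + c[1] * 1.0)

grid = [[k / 10.0] for k in range(-30, 31)]       # lam in [-3, 3]
best = max(dual_value(lam) for lam in grid)
print(f_star, best)
```

Every dual-feasible $\lambda$ gives a value at or below $f^*$, and in this example the best bound on the grid is attained at $\lambda = -1$.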


Two-Way Partitioning

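$$\begin{array}{ll} \text{minimize} & x^T W x \\ \text{subject to} & x_i^2 = 1, \quad i = 1, \ldots, n \end{array}$$

• A nonconvex problem: feasible set contains $2^n$ discrete points

• Interpretation: partition $\{1, \ldots, n\}$ in two sets; $W_{ij}$ is the cost of assigning $i, j$ to the same set; $-W_{ij}$ is the cost of assigning them to different sets

Dual Function:

$$\begin{aligned} q(\lambda) &= \inf_x \Big\{ x^T W x + \sum_i \lambda_i (x_i^2 - 1) \Big\} \\ &= \inf_x\, x^T [W + \mathrm{diag}(\lambda)]\, x - \mathbf{1}^T \lambda \\ &= \begin{cases} -\mathbf{1}^T \lambda & \text{when } W + \mathrm{diag}(\lambda) \succeq 0 \\ -\infty & \text{otherwise} \end{cases} \end{aligned}$$

Lower Bound Property: $-\mathbf{1}^T \lambda \le f^*$ when $W + \mathrm{diag}(\lambda) \succeq 0$

Example: $\lambda = -\lambda_{\min}(W) \mathbf{1}$ gives the bound $n \lambda_{\min}(W) \le f^*$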
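For small $n$ the eigenvalue bound can be compared with the exact optimum found by enumeration. The sketch below assumes a hypothetical $3 \times 3$ matrix $W$ with zeros on the diagonal and ones elsewhere, and estimates $\lambda_{\min}(W)$ with a plain power iteration on $sI - W$. That routine is a stand-in chosen to keep the example dependency-free, not a method from the lecture.

```python
import itertools
import math

def quad(W, x):
    """x^T W x for a square matrix W given as a list of rows."""
    n = len(W)
    return sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def lambda_min(W, iters=200):
    """Smallest eigenvalue of symmetric W via power iteration on s*I - W."""
    n = len(W)
    s = max(sum(abs(v) for v in row) for row in W)   # Gershgorin: |eigenvalues| <= s
    v = [float(i + 1) for i in range(n)]             # arbitrary start vector
    for _ in range(iters):
        w = [s * v[i] - sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return quad(W, v)                                # Rayleigh quotient, ||v|| = 1

W = [[0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 0.0]]
n = len(W)

f_star = min(quad(W, x) for x in itertools.product((-1, 1), repeat=n))
lam_min = lambda_min(W)
bound = n * lam_min
print(f_star, lam_min, bound)
```

For this $W$ the bound is valid but not tight: the enumeration gives $f^* = -2$, while $n \lambda_{\min}(W)$ is approximately $-3$.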


Weak and Strong Duality

Weak Duality: $q^* \le f^*$

• Always holds, without any assumptions on the primal problem

• Can be used to compute nontrivial lower bounds for difficult problems

For example, a lower bound for the two-way partitioning problem can be obtained by solving the SDP

$$\begin{array}{ll} \text{maximize} & -\mathbf{1}^T \lambda \\ \text{subject to} & W + \mathrm{diag}(\lambda) \succeq 0 \end{array}$$

Nonzero Duality Gap: $q^* < f^*$

• May hold even for a convex primal problem

Zero Duality Gap: $q^* = f^*$ (Strong Duality)

• Does not hold in general

• Even for convex primal problems additional conditions are needed

• The conditions that guarantee strong duality in convex problems are referred to as constraint qualifications


Examples with Zero Duality Gap

Examples show that, in general, the relation q∗ = f ∗ provides no information

about the existence of dual optimal solutions

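• Unique Dual Optimal:

$$\begin{array}{ll} \text{minimize} & x_1 - x_2 \\ \text{subject to} & x_1 + x_2 \le 1 \\ & x_1 \ge 0,\ x_2 \ge 0 \end{array}$$

$$\begin{array}{ll} \text{minimize} & \tfrac{1}{2}\left(x_1^2 + x_2^2\right) \\ \text{subject to} & x_1 \le 1 \end{array}$$

• Multiple Dual Optimal:

$$\begin{array}{ll} \text{minimize} & |x_1| - x_2 \\ \text{subject to} & x_1 \le 0,\ x_2 \ge 0 \end{array}$$

• No Dual Optimal:

$$\begin{array}{ll} \text{minimize} & x \\ \text{subject to} & x^2 \le 0 \end{array}$$

Similarly, one can construct examples in which $q^* = f^*$ gives no information about the existence of primal optimal solutions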
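The "No Dual Optimal" example can be worked out explicitly. For the problem of minimizing $x$ subject to $x^2 \le 0$ with $X = \mathbb{R}$ (the choice of $X$ is an assumption here), the Lagrangian $x + \mu x^2$ is minimized at $x = -1/(2\mu)$ when $\mu > 0$, giving $q(\mu) = -1/(4\mu)$, and is unbounded below when $\mu = 0$. The sketch below evaluates $q$ for increasing $\mu$.

```python
def q(mu):
    """Dual function of: minimize x subject to x^2 <= 0, with X = R."""
    if mu <= 0:
        return float("-inf")        # at mu = 0 the Lagrangian x is unbounded below
    return -1.0 / (4.0 * mu)        # attained at x = -1/(2*mu)

f_star = 0.0                        # the only feasible point is x = 0
values = [q(10.0 ** k) for k in range(0, 7)]
print(values)
```

The values increase toward $f^* = 0$ without reaching it, so $q^* = f^*$ holds while no dual optimal solution exists.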


Examples with Duality Gap

• Discrete Optimization:

minimize −x

subject to x ≤ 1, x ∈ {0,2}

• Convex Optimization:

$$\begin{array}{ll} \text{minimize} & e^{-\sqrt{x_1 x_2}} \\ \text{subject to} & x_1^2 \le 0,\ x_1 \ge 0,\ x_2 \ge 0 \end{array}$$
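The gap in the discrete example can be computed by hand or with a few lines of code. The sketch below dualizes the constraint $x \le 1$ (written as $g(x) = x - 1 \le 0$) and keeps $x \in \{0, 2\}$ as the set $X$. The grid over $\mu$ is an illustrative assumption.

```python
X = [0, 2]                      # the discrete set from the example

def f(x):
    return -x                   # objective

def g(x):
    return x - 1                # constraint x <= 1 written as g(x) <= 0

def q(mu):
    """Dual function: minimum over X of f(x) + mu * g(x)."""
    return min(f(x) + mu * g(x) for x in X)

f_star = min(f(x) for x in X if g(x) <= 0)       # only x = 0 is feasible
grid = [k / 100.0 for k in range(0, 501)]        # mu in [0, 5]
q_star = max(q(mu) for mu in grid)
print(f_star, q_star, f_star - q_star)
```

Here $q(\mu) = \min\{-\mu,\ \mu - 2\}$, which is maximized at $\mu = 1$, so the dual optimal value lies strictly below $f^* = 0$.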
