MATH353: Optimisation
Stephen Marsland ([email protected])
School of Mathematics and Statistics, Victoria University of Wellington
Stephen Marsland (VUW) 1
Constrained Optimisation
For x = (x₁, x₂, …, xₙ):

x* = argmin f(x)

subject to: gⱼ(x) = 0, j = 1, …, p and hₖ(x) ≤ 0, k = 1, …, m.
These are called the primal constraints.
Constrained Optimisation

[Figures: the level sets of f, its unconstrained maximum, and the constrained maximum of f subject to g(x) ≤ 0.]
Constrained Optimisation
• The constraints change the problem a lot.
• Without constraints, we find the stationary points of the function and choose the optimal one.
• With constraints, we have to follow a path of admissible solutions to find the optimal value on that path.
• There is no generally applicable method to find constrained optimal solutions.
Constrained Stationary Points
The solutions are still stationary points, but on the intersection of the constraints and the function surface: constrained stationary points. They are points where the constraint curves touch the contours (level sets) of the function.
Constrained Stationary Points
Example
Find the constrained stationary points of

max z = −x² − y² subject to (x − 2)² + y² = 1.
Hint: the level sets of z are circles centred at the origin.
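A quick numerical check of this example (my addition, not from the slides): parameterise the constraint circle and scan for the best value of z.

```python
import math

def z_on_constraint(theta):
    # Point on the constraint circle (x - 2)^2 + y^2 = 1
    x, y = 2 + math.cos(theta), math.sin(theta)
    return -x**2 - y**2

n = 100000
k_best = max(range(n), key=lambda k: z_on_constraint(2 * math.pi * k / n))
theta = 2 * math.pi * k_best / n
x_best, y_best = 2 + math.cos(theta), math.sin(theta)
# The constrained maximum is z = -1, at (x, y) = (1, 0).
```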
Equivalence of Constraints
• An equality constraint can be made into two inequality constraints:
g(x) = 0 ≡ g(x) ≤ 0 and −g(x) ≤ 0
• An inequality constraint can be made into one equality constraint:
h(x) ≤ 0 ≡ h(x) + b² = 0
by using a new slack variable. This version will be very useful later.
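A small sketch of the slack-variable conversion (the function h and the test point are illustrative, not from the slides):

```python
import math

def slack(h_val):
    # For h(x) <= 0, choose b = sqrt(-h(x)) so that h(x) + b^2 = 0.
    return math.sqrt(-h_val)

h = lambda x: x**2 - 4      # feasible iff -2 <= x <= 2
b = slack(h(1.0))           # h(1) = -3, so b = sqrt(3)
residual = h(1.0) + b**2    # the equality form is satisfied: residual = 0
```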
Definitions

Open Ball
An open ball with centre x* and radius r > 0 is:

B(x*, r) = {x ∈ ℝⁿ : ‖x − x*‖ < r}

Feasible Set X
The set of x̃ ∈ ℝⁿ such that g(x̃) = 0 and h(x̃) ≤ 0, i.e., the values of x that satisfy the constraints.

Constrained Local Minimum
x* ∈ X is a constrained local minimum if f(y) ≥ f(x*) for all y ∈ X ∩ B(x*, r), for some r > 0.
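These definitions translate directly into code; here is a minimal feasible-set membership test (the constraints used are illustrative, not from the slides):

```python
def feasible(x, y, tol=1e-9):
    # X = { (x, y) : g(x, y) = x + y - 1 = 0 and h(x, y) = -x <= 0 }
    return abs(x + y - 1) <= tol and -x <= tol

inside = feasible(0.25, 0.75)    # on the line, with x >= 0: feasible
outside = feasible(-1.0, 2.0)    # on the line, but x < 0: infeasible
```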
Definitions
Active Constraints
An inequality constraint is active, or binding, at x̃ if h(x̃) = 0, and inactive otherwise.

Regular Points
A point x̃ is regular for the constraints if the gradients at x̃ of the active constraints (i.e., ∇g(x̃), ∇h(x̃)) are linearly independent.

Remember that the gradient vector ∇f(x) is normal to the (relevant) level set of the function at each point.
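The gradient-normal fact can be checked numerically; this sketch (mine) uses f(x, y) = x² + y², whose level sets are circles:

```python
import math

def grad_f(x, y):
    # f(x, y) = x^2 + y^2
    return (2 * x, 2 * y)

theta = 0.7
p = (math.cos(theta), math.sin(theta))          # point on the level set f = 1
tangent = (-math.sin(theta), math.cos(theta))   # tangent to that circle at p
g = grad_f(*p)
dot = g[0] * tangent[0] + g[1] * tangent[1]     # gradient is normal to the level set, so ~0
```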
Regularisers
• We can add extra terms to an optimisation problem:

min f(x) + λR(x)

• This is normally done to make the solutions simpler or easier to find, especially for ill-posed problems
• It’s unclear how to choose λ
• Idea: add the constraints as regularisers
• Problem: might fail to satisfy any part of the solution well
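A one-dimensional sketch (mine) of the effect of λ, with the ridge-style penalty R(x) = x²; the numbers are illustrative:

```python
def regularised_min(a, b, lam):
    # Minimise (a*x - b)^2 + lam * x^2.
    # Setting the derivative to zero: 2a(ax - b) + 2*lam*x = 0,
    # so x = a*b / (a^2 + lam).
    return a * b / (a**2 + lam)

x_unreg = regularised_min(2.0, 4.0, 0.0)   # lam = 0: exact solution x = 2
x_reg = regularised_min(2.0, 4.0, 1.0)     # lam > 0 shrinks x towards 0
```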
(Relaxation: start with λ big and shrink it; this idea is used for inverse problems.)
Lagrange’s Theorem
Theorem
Consider the problem min f(x) such that g(x) = 0. Suppose that f and g are continuously differentiable functions of two variables and ∇g(x) ≠ 0 ∀x. If f has a local optimum at a point x̂ then there exists some λ ∈ ℝ such that:

∇f(x̂) = λ∇g(x̂).

This equation gives first-order necessary conditions for a constrained local optimum.
The λ is known as a Lagrange multiplier, and the function

L(x, λ) = f(x) − λg(x)

is known as the Lagrangian function.
Lagrange’s Theorem

[Figure: at a constrained optimum on the curve g(x) = 0, the vectors ∇f and ∇g point in the same direction.]
Lagrange’s Theorem
Proof.
• The graph of g(x) = 0 is a curve in ℝ².
• We can parameterise this as a function of some other variable t: x₁ = h(t), x₂ = k(t), and it will be smooth for nice functions. Let r(t) = (h(t), k(t))ᵀ.
• Now F(t) = f(r(t)) describes f along the curve g(x) = 0.
• Let x̂ = (x̂₁, x̂₂) be a point at which f has an extremum, with corresponding parameter point t̂. Then F′(t̂) = 0, since F(t̂) is an extremum of F(t).
Lagrange’s Theorem
Proof.

F′(t) = f_{x₁} dx₁/dt + f_{x₂} dx₂/dt = f_{x₁} h′(t) + f_{x₂} k′(t)

So at t = t̂:

0 = F′(t̂) = f_{x₁}(x̂) h′(t̂) + f_{x₂}(x̂) k′(t̂) = ∇f(x̂) · r′(t̂)  ⇒  ∇f(x̂) ⊥ r′(t̂),

which is a tangent vector to the curve.
∇g is also ⊥ r′(t̂), as the curve is a level set of g.
Hence ∇f and ∇g are both orthogonal to the same vector. So in 2D, they must be parallel, hence ∇f = λ∇g.
It’s still true in higher dimensions, but the proof is a bit more subtle.
Constrained Optimisation
Example
Find the constrained stationary points of:

f(x, y) = xy subject to x² + y² = 1.

Example
max −x² − y² subject to (x − 2)² + y² = 1.
Worked solution for f(x, y) = xy subject to g(x, y) = x² + y² − 1 = 0:
∇f = λ∇g gives y = 2λx and x = 2λy, so y = 4λ²y, giving λ = ±1/2 and y = ±x.
The constraint x² + y² = 1 then gives 2x² = 1, so the constrained stationary points are (±1/√2, ±1/√2), with f = 1/2 when y = x and f = −1/2 when y = −x.

Worked solution for max −x² − y² subject to (x − 2)² + y² = 1:
∇f = λ∇g gives −2x = 2λ(x − 2) and −2y = 2λy, so either λ = −1 or y = 0. Trying λ = −1 gives −2x + 2x − 4 = 0, a contradiction. So y = 0, and (x − 2)² = 1 gives x = 1 or x = 3. The maximum is f(1, 0) = −1.
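The constrained stationary points of f(x, y) = xy on the unit circle can be verified mechanically by checking the residuals of ∇f = λ∇g and the constraint (my check, not from the slides):

```python
import math

def residuals(x, y, lam):
    # f = x*y, g = x^2 + y^2 - 1: grad f = lam * grad g, and g = 0
    return (y - 2 * lam * x, x - 2 * lam * y, x**2 + y**2 - 1)

s = 1 / math.sqrt(2)
points = [(s, s, 0.5), (-s, -s, 0.5), (s, -s, -0.5), (-s, s, -0.5)]
all_stationary = all(max(abs(r) for r in residuals(*p)) < 1e-12 for p in points)
values = sorted({round(x * y, 6) for x, y, _ in points})   # f = -1/2 and 1/2
```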
Using the Lagrangian function
• It doesn’t matter how many (equality) constraints we have:

L(x, λ) = f(x) + Σᵢ₌₁ᵐ λᵢ (bᵢ − gᵢ(x))

• We can differentiate the Lagrangian with respect to each of its variables:

∂L/∂xᵢ = ∂f/∂xᵢ − Σⱼ₌₁ᵐ λⱼ ∂gⱼ/∂xᵢ

∂L/∂λᵢ = bᵢ − gᵢ(x)

• At the optimum ∇L = 0, by Lagrange’s theorem and the fact that the constraints are satisfied:

{ ∇f = Σᵢ λᵢ ∇gᵢ(x), gᵢ(x) = bᵢ } ≡ ∇L = 0
Constrained Optimisation
To solve a problem with equality constraints, turn it into an unconstrained optimisation problem with Lagrange multipliers.

Example

min 6x₁² + 4x₂² + x₃² subject to 24x₁ + 24x₂ = 360, x₃ = 1.
Worked solution: form

L = 6x₁² + 4x₂² + x₃² + λ₁(360 − 24x₁ − 24x₂) + λ₂(1 − x₃).

Stationarity gives 12x₁ = 24λ₁, 8x₂ = 24λ₁ and 2x₃ = λ₂, so x₁ = 2λ₁ and x₂ = 3λ₁.
The primal constraints give x₁ + x₂ = 15 and x₃ = 1, so 5λ₁ = 15 and λ₁ = 3, giving x₁ = 6, x₂ = 9, x₃ = 1, with f = 6·36 + 4·81 + 1 = 541.
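The solution to this example works out to (x₁, x₂, x₃) = (6, 9, 1) with f = 541 (my computation); a quick check that perturbing along the constraint surface never decreases f:

```python
def f(x1, x2, x3):
    return 6 * x1**2 + 4 * x2**2 + x3**2

base = f(6, 9, 1)   # candidate optimum value: 541
# Moves that keep 24*x1 + 24*x2 = 360 and x3 = 1:
no_better = all(f(6 + t, 9 - t, 1) >= base for t in (-2, -0.5, 0.5, 2))
```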
Summary
• The Lagrangian function matches the original objective function at feasible points (since there the constraints are satisfied). This is known as the relaxed form.
• For fixed Lagrange multipliers, an unconstrained optimum of the relaxed model L must be a stationary point.
• Stationary points of L satisfy the constraints of the original function.
• Hence, if (x̂, λ̂) is a stationary point of L(x, λ) and x̂ is an unconstrained optimum of L(x, λ̂), then x̂ is an optimum of the original equality-constrained function.
• This gives a sufficient condition for a solution.
Constrained Optimisation
Example

max z = −x₁² − x₂² − x₃² such that x₁ + x₂ + x₃ = 0, x₁ + 2x₂ + 3x₃ = 1.

In general, finding solutions is hard. But this method sets up a system for a numerical solver.
Worked solution: form

L = −x₁² − x₂² − x₃² + λ₁(x₁ + x₂ + x₃) + λ₂(1 − x₁ − 2x₂ − 3x₃).

Stationarity gives −2xᵢ + λ₁ − iλ₂ = 0, i.e. xᵢ = (λ₁ − iλ₂)/2.
The first constraint gives 3λ₁ = 6λ₂ and the second then gives λ₂ = −1, so λ₁ = −2, and the solution is x = (−1/2, 0, 1/2)ᵀ with z = −1/2.
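Solving the stationary system for this example gives x = (−1/2, 0, 1/2) with λ₁ = −2, λ₂ = −1 (my computation); the sketch below checks stationarity and feasibility:

```python
x = (-0.5, 0.0, 0.5)
lam1, lam2 = -2.0, -1.0

# dL/dx_i = -2*x_i + lam1 - i*lam2 for
# L = -(x1^2 + x2^2 + x3^2) + lam1*(x1 + x2 + x3) + lam2*(1 - x1 - 2*x2 - 3*x3)
stationary = [-2 * x[i] + lam1 - (i + 1) * lam2 for i in range(3)]
eq1 = sum(x)                           # constraint x1 + x2 + x3 = 0
eq2 = x[0] + 2 * x[1] + 3 * x[2] - 1   # constraint x1 + 2*x2 + 3*x3 = 1
```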
Inequality Constraints
• In 1D it’s easy:

max z = f(x) such that a ≤ x ≤ b

• Three possible solutions:
• a stationary point inside the interval
• a
• b
• So solve f′(x) = 0 and evaluate f(x) at each of these points, together with f(a) and f(b)
• Same in higher dimensions: constraints define a boundary between feasible solutions and infeasible ones. The optimum can be:
• On the boundary
• Inside the feasible region
• At infinity
• Equivalently, ∂L/∂xᵢ = 0 and λᵢ(bᵢ − gᵢ(x)) = 0 (so λᵢ = 0 if gᵢ(x) ≠ bᵢ).
• The second condition is called complementary slackness.
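The 1D recipe as code (the function and interval are illustrative, not from the slides):

```python
def f(x):
    # f(x) = x * (4 - x); f'(x) = 4 - 2*x, so the stationary point is x = 2
    return x * (4 - x)

a, b = 0.0, 3.0
candidates = [a, b, 2.0]          # endpoints plus the interior stationary point
x_star = max(candidates, key=f)   # x = 2, where f = 4
```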
Complementary Slackness
Example

max f(x, y) = x − y² − 1 subject to x² + y² − 1 ≤ 0

Example

max f(x, y) = −x² − y² subject to 1 ≤ x + y
Worked solution (first example): in the interior, ∇f = (1, −2y)ᵀ is never zero, so there is no interior stationary point. On the boundary, form L = x − y² − 1 + λ(1 − x² − y²):
∂L/∂x = 1 − 2λx = 0, ∂L/∂y = −2y − 2λy = 0, so y(1 + λ) = 0: either y = 0 or λ = −1.
y = 0 gives x = ±1, with f(1, 0) = 0 and f(−1, 0) = −2; λ = −1 gives x = −1/2, y² = 3/4, and f = −9/4. So the maximum is f(1, 0) = 0.

Worked solution (second example): form L = −x² − y² + λ(x + y − 1):
∂L/∂x = −2x + λ = 0, ∂L/∂y = −2y + λ = 0, ∂L/∂λ = x + y − 1 = 0.
In the interior (x + y > 1), complementary slackness forces λ = 0, so we would need ∇f = 0, i.e. (0, 0), which is infeasible: no interior stationary point. On the boundary, x = y = λ/2, so x = y = 1/2, and the maximum is f(1/2, 1/2) = −1/2.
More Inequality Constraints
• Nothing really changes with more inequality constraints, but you need to check that the other constraints hold when looking at the boundary of one.
• Inactive inequalities can be ignored by setting the corresponding λᵢ = 0; the others have to be included.
• But which ones are active? There are 2ᵐ partitions of m constraints into active and inactive!
Example

max x² − y² subject to y − 1 ≤ 0, x² − y − 1 ≤ 0
Worked solution: form L = x² − y² + λ₁(1 − y) + λ₂(1 + y − x²):
∂L/∂x = 2x − 2λ₂x = 0, ∂L/∂y = −2y − λ₁ + λ₂ = 0.
Interior (λ₁ = λ₂ = 0): the only stationary point is (0, 0), with f(0, 0) = 0.
First constraint active (y = 1): f = x² − 1 with x² ≤ 2; this increases towards the vertices.
Second constraint active (y = x² − 1): f = x² − (x² − 1)², and df/dx = 2x(3 − 2x²) = 0 gives x = 0 (where f = −1) or x² = 3/2, y = 1/2, where f = 3/2 − 1/4 = 5/4 (and y = 1/2 ≤ 1 holds).
Both constraints active (vertices): y = 1, x = ±√2, f = 1.
So the maximum is 5/4, at (±√(3/2), 1/2).
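The maximum for this example works out to 5/4 (my computation); since the constraints force x² ≤ 2, a brute-force grid over [−2, 2]² checks it from below:

```python
# Brute-force check of max x^2 - y^2 s.t. y <= 1 and x^2 - y <= 1.
best = -float("inf")
n = 400
for i in range(n + 1):
    for j in range(n + 1):
        x = -2 + 4 * i / n
        y = -2 + 4 * j / n
        if y <= 1 and x**2 - y <= 1:          # both constraints hold
            best = max(best, x**2 - y**2)
# best approaches the true maximum 5/4 from below
```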
Duality
• Every optimisation problem has a corresponding one called the Lagrange dual.
• The solution to the dual problem is a lower bound (for a minimisation) on the solution to the primal one.
• And the dual function is always concave.
• The difference between the solutions to the primal and dual problems is the duality gap.
• For convex primal problems the duality gap is 0.
Duality
• Earlier we wrote down the Lagrangian and then:
• solved for the (non-negative) Lagrange multipliers
• used them to find the values of the primal variables
• If instead we solve for the primal variables as functions of the Lagrange multipliers, we get the dual variables.
• You used this a lot when you looked at linear programming.
Duality
• To solve:

min f(x) subject to gᵢ(x) = 0, hⱼ(x) ≤ 0

• we formed the Lagrangian:

L(x, λ, ν) = f(x) + Σᵢ λᵢ gᵢ(x) + Σⱼ νⱼ hⱼ(x).

• Now the dual function is:

D(λ, ν) = inf_x L(x, λ, ν) = inf_x ( f(x) + Σᵢ λᵢ gᵢ(x) + Σⱼ νⱼ hⱼ(x) )
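A one-variable sketch of the dual function (my example): for min x² subject to 1 − x ≤ 0, the primal optimum is x = 1 with value 1.

```python
def dual(nu):
    # L(x, nu) = x^2 + nu * (1 - x); the inf over x is attained at x = nu / 2
    x = nu / 2.0
    return x**2 + nu * (1 - x)

# D(nu) = nu - nu^2 / 4 is concave, maximised at nu = 2 where D = 1:
# the duality gap is zero for this convex problem.
vals = {nu: dual(nu) for nu in (0.0, 1.0, 2.0, 3.0)}
```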
Linear Programming

[Handwritten worked example: a primal linear program and its dual.]
The Karush-Kuhn-Tucker Conditions
First-order necessary conditions for a constrained optimisation problem:

min (or max) f(x)
such that gᵢ(x) ≥ bᵢ, i ∈ G
gᵢ(x) ≤ bᵢ, i ∈ L
gᵢ(x) = bᵢ, i ∈ E

The conditions are:
1. Complementary slackness for inequality constraints
2. Sign restrictions
3. Lagrange’s gradient equation
4. The primal constraints

The sign restrictions deal with the two sets of inequalities: for minimising, the λᵢ for the L inequalities are negative and those for G are positive, and vice versa for maximising.
KKT Conditions
Example

max 2x + 7y subject to (x − 2)² + (y − 2)² = 10, x ≤ 2, 0 ≤ y ≤ 2
Improving Directions
• If we are at a point x⁽ᵏ⁾ we want to travel in an improving direction to get closer to a local optimum:

f(x⁽ᵏ⁾ + hΔx) ≈ f(x⁽ᵏ⁾) + h∇f(x⁽ᵏ⁾)ᵀΔx

• The step Δx needs to improve the current solution while remaining in the feasible set for some small h
• The direction is improving if:

∇f(x⁽ᵏ⁾)ᵀΔx < 0 for min, > 0 for max
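The improving-direction test as code, for a minimisation (the function and point are illustrative, not from the slides):

```python
def improving_for_min(grad, dx):
    # dx is an improving direction at x_k if grad_f(x_k) . dx < 0
    return sum(g * d for g, d in zip(grad, dx)) < 0

# f(x, y) = (x - 1)^2 + y^2 at the point (0, 0): grad = (-2, 0)
grad = (-2.0, 0.0)
good = improving_for_min(grad, (1.0, 0.0))    # heads towards the minimum
bad = improving_for_min(grad, (-1.0, 0.0))    # heads away from it
```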
Improving Directions
• How do we check if the step is feasible?
• Linear constraints:
Equality: aᵀx = b. x⁽ᵏ⁾ is feasible ⇒ aᵀx⁽ᵏ⁾ = b. Then aᵀ(x⁽ᵏ⁾ + hΔx) = b iff aᵀΔx = 0.
Inequality (≤; symmetric for ≥): inactive constraints are automatically satisfied for small h. For active constraints, aᵀx⁽ᵏ⁾ = b. Then aᵀ(x⁽ᵏ⁾ + hΔx) ≤ b iff aᵀΔx ≤ 0.
• Nonlinear constraints: by Taylor’s theorem,

gᵢ(x⁽ᵏ⁾ + hΔx) ≈ gᵢ(x⁽ᵏ⁾) + h∇gᵢ(x⁽ᵏ⁾)ᵀΔx

Then Δx is feasible at x⁽ᵏ⁾ to 1st order if:

∇gᵢ(x⁽ᵏ⁾)ᵀΔx ≥ 0 for active ≥ constraints, ≤ 0 for active ≤ constraints, = 0 for active equality constraints.
Improving Directions
Theorem
A solution x* is a KKT point if there are no improving feasible directions at x*.

In other words, the KKT conditions are a first-order working test of the lack of improving feasible directions.
The KKT conditions are sufficient if all the constraints are linear, or the gradients of all the active constraints are linearly independent.
Last Examples
Example

min x² + y² subject to x + y = 1, x, y ≥ 0

State the KKT conditions and show that they hold at the global optimum.

Show that Δx = (1, −1)ᵀ is an improving direction at x⁽ᵏ⁾ = (0, 1)ᵀ and that the KKT conditions have no solution there.

Example

min x² − 2x + y² + 1 subject to x + y ≤ 0, x² − 4 ≤ 0
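A check (mine) of the first example's claims at x⁽ᵏ⁾ = (0, 1): the step Δx = (1, −1) is both improving and first-order feasible.

```python
xk = (0.0, 1.0)
dx = (1.0, -1.0)

grad_f = (2 * xk[0], 2 * xk[1])                        # grad of x^2 + y^2: (0, 2)
improving = grad_f[0] * dx[0] + grad_f[1] * dx[1] < 0  # (0,2).(1,-1) = -2 < 0
# First-order feasibility: keep x + y = 1 (a = (1, 1), need a . dx = 0)
# and respect the active bound x >= 0 (need dx_1 >= 0).
feasible = (dx[0] + dx[1] == 0.0) and (dx[0] >= 0.0)
```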
Worked solution (first example): the gradient equation is (2x, 2y)ᵀ = λ(1, 1)ᵀ + μ₁(1, 0)ᵀ + μ₂(0, 1)ᵀ, together with the primal constraints x + y = 1, x ≥ 0, y ≥ 0, complementary slackness μ₁x = 0 and μ₂y = 0, and the sign restrictions μ₁, μ₂ ≥ 0 (minimising, with ≥ constraints). At the global optimum (1/2, 1/2) both bounds are inactive, so μ₁ = μ₂ = 0 and λ = 1: the KKT conditions hold.
At x⁽ᵏ⁾ = (0, 1)ᵀ, the direction Δx = (1, −1)ᵀ keeps x + y = 1 and moves into x ≥ 0, so it is feasible, and ∇f(0, 1)ᵀΔx = (0, 2)·(1, −1) = −2 < 0, so it is improving. Correspondingly the KKT system fails there: y = 1 > 0 forces μ₂ = 0, so the gradient equation gives λ = 2 and μ₁ = −2, violating the sign restriction μ₁ ≥ 0.

Worked solution (second example): f = x² − 2x + y² + 1 = (x − 1)² + y², so the unconstrained minimum (1, 0) is infeasible (1 + 0 > 0). With x + y ≤ 0 active and x² ≤ 4 inactive, ∇f = λ∇g gives 2x − 2 = λ and 2y = λ, so y = x − 1; combined with x + y = 0 this gives x = 1/2, y = −1/2, λ = −1 (negative, as required for an active ≤ constraint when minimising, and x² = 1/4 ≤ 4 holds). Taking x² = 4 active instead gives no valid KKT point. So the minimum is f(1/2, −1/2) = 1/2.