     # concave + convex

• View
215

1

Tags:

Embed Size (px)

DESCRIPTION

concave

### Text of concave + convex

• Lecture 4

B1 Optimization Michaelmas 2013 A. Zisserman

Convexity

Robust cost functions

Optimizing non-convex functions grid search

branch and bound

multiple coverings

simulated annealing

• The Optimization Tree

• Unconstrained optimization

down-hill search (gradient descent) algorithms can find local minima

which of the minima is found depends on the starting point

such minima often occur in real applications

minxf(x)

f(x)

local minimum

global minimum

function of one variable

• The answer is: see if the optimization problem is convex.

If it is, then a local optimum is the global optimum.

First, we need to introduce

Convex Sets, and

Convex Functions

Note sketch introduction only

How can you tell if an optimization has a single optimum?

• convex Not convex

• Convex functions

• Convex function examples

convex Not convex

A non-negative sum of convex functions is convex

• Convex Optimization Problem

Minimize:

a convex function

over a convex set

Then locally optimal points are globally optimal

Also, such problems can be solved both in theory and practice

• Why do we need the domain to be convex?

f(x)

cant get from here to here by downhill search

domain (not a convex set)

local optimum

• Examples of convex optimization problems

1. Linear programming

2. Least squares

f(x) = (Ax b)2 , for any A

f(x) = x>Px+q>x+r, provided that P is positive definite

Many more useful examples, see Boyd & Vandenberghe

• The Hessian of a function f(x1, x2, . . . , xn) is the matrix of partial derivatives

H =

2fx1x1

2fx1x2

. . . 2f

x1xn2f

x2x12f

x2x2. . .

2fx2xn... ... . . . ...

2fxnx1

2fxnx2

. . . 2f

xnxn

Diagonalize the Hessian by an orthogonal change of coordinates.

Diagonals are the eigenvalues.

If the eigenvalues are all positive, then the Hessian is positive definite, and

f is convex.

Second order condition

• Strictly convex

A function f(x) is strictly convex if

f((1 )x0 + x1) < (1 )f(x0) + f(x1) .

strictly convex

one global optimum

Not strictly convex

multiple local optima (but all are global)

• Robust Cost Functions

In formulating an optimization problem there is often some room for design and choice

The cost function can be chosen to be: convex robust to noise (outliers) in the data/measurements

• f(x) =Xi

(x ai)2Consider minimizing the cost function

the data { ai } may be thought of as repeated measurements of a fixed value (at 0), subject to Gaussian noise and some outliers

it has 10% of outliers biased towards the right of the true value

the minimum of f(x) does not correspond to the true value

{ ai }

• Examine the behaviour of various cost functions f(x) =Xi

C(|x ai|)

L1 huber

squared error the usual default cost function

arises in Maximum Likelihood Estimation for Gaussian noise

convex

C()

C() = 2

for inliers behaves as a quadratic

truncated so that outliers only incur a fixed cost

non-convex

C()

C() = min(2,)

=

(2 if || < otherwise.

• L1 cost function

absolute error

called total variation

convex

non-differentiable at origin

finds the median of { ai }

C()

C() = ||

• Huber cost function

continuous first derivative

for larger values becomes linear

thus has the outlier stability of L1

convex

C()

C() =

(2 if || < 2|| 2 otherwise.

• { ai }

f(x) =Xi

C(|x ai|)

zoom

Example 1: measurements with outliers

• { ai }

f(x) =Xi

C(|x ai|)

Example 2: bimodal measurements

70% in principal mode

30% in outlier mode

• Summary

Squared cost function very susceptible to outliers

Truncated quadratic has a stable minimum, but is non-convex and also has other local minima. Also basin of attraction of global minimum limited

Huber has stable minimum and is convex

• Optimizing non-convex functions

1. grid search: uniform grid space covering

2. branch and bound

3. multiple coverings: Newton like methods within regions

4. simulated annealing: stochastic optimization

minxf(x)

f(x)

local minimum

global minimum

function of one variable

Sketch four methods:

• Branch and bound

Split region into sub-regions and compute bounds

Consider two regions A and C

If lower bound of A is greater than upper bound of C then A can be discarded

divide (branch) regions and repeat

minxf(x) f(x)

Key idea:A C

LB UB

B D

• Multiple coverings

Key idea is to cover the parameter space with overlapping regions to deal with local optima, and then take advantage of efficient continuous optimization for each region.

Example from Matlab Global Optimization toolbox

x

y

Pattern Contours with Constraint Boundaries

-3 -2 -1 0 1 2 3 4 5-4

-3

-2

-1

0

1

2

3

4

5

• Multiple coverings ctd

Use multiple starting points

Continuous optimization method for each

Record optimum for each starting point

Sort values to find global optimum

• Simulated Annealing

The algorithm has a mechanism to jump out of local minima

It is a stochastic search method, i.e. it uses randomness in the search

f(x)

local minimum

global minimum

1 2

• Simulated annealing algorithm

At each iteration propose a move in the parameter space

If the move decreases the cost, then accept it

If the move increases the cost by E, then accept it with a probability exp(-E/T), Otherwise, dont move

Note probability depends on temperature T

Decrease the temperature according to a schedule so that at the start cost increases are likely to accepted, and at the end they are not

• 0

0.2

0.4

0.6

0.8

1

0 20 40 60 80 100

exp(-x)exp(-x/10)

exp(-x/100)

Boltzmann distribution and the cooling schedule

Boltzmann distribution exp(-E/T) start with T high, then exp(-E/T) is approx. 1, and all moves are accepted

many cooling schedules are possible, but the simplest is

where k is the iteration number

The algorithm can be very slow to converge

Tk+1 = Tk, 0 < < 1

• Simulated annealing

The name and inspiration come from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects.

The heat causes the atoms to become unstuck from their initial positions (a local minimum of the internal energy) and wander randomly throughstates of higher energy; the slow cooling gives them more chances offinding configurations with lower internal energy than the initial one.

Algorithms due to: Kirkpatrick et al. 1982; Metropolis et al.1953.

• Example: Convergence of simulated annealing

HILL CLIMBING

HILL CLIMBING

HILL CLIMBINGCO

S

T

F

U

N

C

T

I

O

N

,

C

NUMBER OF ITERATIONS

AT INIT_TEMP

AT FINAL_TEMP

move accepted with probability exp(-E/T)

unconditional Acceptance

• Steepest descent on a graph

Slide from Adam Kalai and Santosh Vempala

• Random Search on a graph

• Simulated Annealing on a graph

Phase 1: Hot (Random)Phase 2: Warm (Bias down)Phase 3: Cold (Descend)

Phase 1: Hot (Random)Phase 2: Warm (Bias down)Phase 3: Cold (Descent)

• There is more

There are many other classes of optimization problem, and also many efficient optimization algorithms developed for problems with special structure. Examples include:

Combinatorial and discrete optimization

Dynamic programming

Max-flow/Min-cut graph cuts

See the links on the web page

http://www.robots.ox.ac.uk/~az/lectures/b1/index.html

and come to the C Optimization lectures next year

Recommended ##### SOL 9 Review Chuck Higy. 1. The image below is a ____ A. Convex mirror B. Concave mirror C. Convex lens D. Concave lens 123456789101112131415161718192021222324252627
Documents Documents ##### Services - lltes.· ... Concave mirror as well as convex lens (b) Convex mirror as well as ... Refractive
Documents ##### Curved Mirrors: Locating Images in Concave & Convex Mirrors
Documents ##### Concave Vs Convex. Focal length Focal length (shown in red) is the distance between the center of a convex lens or a concave mirror and the focal point
Documents ##### Convex lens Concave ilrhee/lecture/modern/chap6.pdfآ  2017-11-13آ  Convex lens Concave lens Optical
Documents ##### TIPOS DE ANTENAS - Radio DE ANTENAS Antenas lineales ... Lens antennas with index of refraction n I Concave—plane Concave—concave Concave—convex (b) Lens antennas
Documents ##### Drawing Ray Diagrams Concave and Convex Mirrors and Double Concave and Double Convex Lenses By Vincent Sapone
Documents ##### Polygons - Polygons Concave vs. Convex A convex figure has no indentations, or if every line segment
Documents ##### Disciplined Convex-Concave Programming - arXiv Disciplined convex-concave programming We refer to a problem as a disciplined convex-concave program if it has the form (2), with o,
Documents Documents