
Prof. Dr.–Ing. habil. Thomas Meurer

Optimization and Optimal Control

Lecture Notes

As of: Winter term 2013/14

© Lehrstuhl für Regelungstechnik, Christian-Albrechts-Universität zu Kiel


Preface

These lecture notes are partly based on the set of notes "Methoden der Optimierung und Optimalen Steuerung" by Prof. Dr.-Ing. Knut Graichen from Ulm University, Germany, and those of Prof. Dr.techn. Andreas Kugi from Vienna University of Technology, Austria, on "Optimierung". Their contributions are gratefully acknowledged.

The lecture notes are compiled using LaTeX. MetaPost with the additional packages makecirc.mp for electrical networks and blockdraw.mp for block diagrams is used together with my personal extensions to create most of the arising graphs. The presented numerical results are obtained using Matlab and Octave, the latter available from http://www.gnu.org/software/octave, as well as the ACADO toolkit, which is available at http://sourceforge.net/p/acado/wiki/Home/.

Thomas Meurer


Contents

1 Introduction
   1.1 Static optimization
   1.2 Dynamic optimization
   1.3 Mathematical fundamentals
      1.3.1 Infimum and supremum
      1.3.2 Local and global minimum and maximum
      1.3.3 Gradient and Hessian matrix
      1.3.4 Differentiation of vector-valued functions
      1.3.5 Definiteness and semi-definiteness
      1.3.6 Mean value theorems and Taylor's formula
      1.3.7 Convexity
   References

2 Static unconstrained optimization
   2.1 Optimality conditions
   2.2 Numerical minimization algorithms
      2.2.1 Preliminaries
      2.2.2 Line search methods
      2.2.3 Trust-region methods
      2.2.4 Direct search methods
   2.3 Benchmark example
   References

3 Static constrained optimization
   3.1 Optimality conditions
      3.1.1 Equality constraints
      3.1.2 Inequality constraints
   3.2 Numerical optimization algorithms
      3.2.1 Active set methods
      3.2.2 Gradient projection methods
      3.2.3 Penalty and barrier methods
      3.2.4 Sequential quadratic programming
   3.3 Benchmark example
   3.4 Optimization software
   References

4 Dynamic optimization
   4.1 Problem statement and preliminaries
   4.2 Calculus of variations
      4.2.1 Preliminaries
      4.2.2 Problems with constraints
   4.3 Unconstrained optimal control
      4.3.1 Existence of an optimal control
      4.3.2 Application of variational calculus
   4.4 Input constrained optimal control
      4.4.1 Pontryagin maximum principle
      4.4.2 Application to nonlinear affine input systems
   4.5 Numerical solution of optimal control problems
      4.5.1 Indirect methods
      4.5.2 Direct methods
   4.6 Benchmark example
   References


Chapter 1

Introduction

Problems of decision making in physical and technical systems, economics, or organizations have received increasing attention, mainly inspired by the benefits that may result from a proper (optimal in some sense) decision concerning the distribution of expensive resources. This development is further fostered by the evolution of computing and computational hardware, enabling an efficient and fast solution of even highly complex decision problems.

It seems natural that the concept of optimal decisions has emerged as the fundamental approach for the formulation of decision problems. Herein, a single quantity describing performance or value is either minimized or maximized by proper selection among the available alternatives. The resulting optimal decision is taken as the solution of the decision problem. For the determination of an optimal decision or optimal solution, respectively, it is necessary to provide a mathematical formulation of the decision or optimization problem (OP). Thereby one in general distinguishes between

• static optimization problems addressing the minimization of a function of optimization variables (also called decision variables) from the Euclidean space and

• dynamic optimization problems with the decision variables being elements of an infinite–dimensionalfunction space, e.g. functions of time.

Some properties and differences between static and dynamic optimization problems are presented in this introductory section based on selected example problems. In addition, mathematical concepts are addressed, which are used throughout the text.

Notation. In what follows, scalars are typically denoted by lower case letters, e.g. x(t), while vectors are denoted by lower case bold face letters, e.g.

x = [x1, x2, ..., xn]^T.

If not stated otherwise, the vector space R^n is the working space, equipped with the inner, scalar or dot product, respectively, i.e. ⟨x, y⟩ = x · y = x^T y = Σ_{j=1}^{n} xj yj. The associated norm is hence given by ‖x‖ = √⟨x, x⟩. The unit matrix is denoted by E. Other (equivalent) norms are introduced whenever they are needed.


1.1 Static optimization

The standard formulation of a static optimization problem is given by

min_{x ∈ R^n} f(x)        (objective function)        (1.1a)

subject to

g_j(x) = 0,  j = 1, ..., p        (equality constraints)        (1.1b)
h_j(x) ≤ 0,  j = 1, ..., q        (inequality constraints).        (1.1c)

The optimization problem (1.1) is called unconstrained if it consists only of (1.1a). If equality constraints (1.1b) and (or) inequality constraints (1.1c) arise, then the optimization problem (1.1) is called constrained.

An equivalent problem formulation is given by

min_{x ∈ R^n} f(x)        (1.2a)

subject to

x ∈ Xad        (1.2b)

with Xad the set of admissible or feasible decisions. Here, equality and inequality constraints are included in the definition of Xad, i.e.

Xad = {x ∈ R^n : g_j(x) = 0, j = 1, ..., p  and  h_j(x) ≤ 0, j = 1, ..., q}.        (1.2c)

For unconstrained problems it hence follows that Xad = R^n. Obviously, Xad ≠ ∅ is required, since otherwise there does not exist a solution of (1.2). Moreover, taking into account the equality constraints g_j(x) = 0, j = 1, ..., p, the number of free decision variables reduces to n − p. Hence, p must not be larger than the number of decision variables n, since otherwise Xad is in general the empty set.

Remark 1.1. It is common to consider the formulation as a minimization problem. Note that any maximization problem can be converted to a minimization problem following

max_{x ∈ R^n} f(x) = − min_{x ∈ R^n} (−f(x)).        (1.3)

Besides the terminology static optimization, the term mathematical programming is used and is reflected in the different problem categories typically distinguished within static optimization:

• linear programming, i.e. objective function and constraints are linear in x;

• quadratic programming, i.e. the objective function is quadratic with constraints linear in x;

• nonlinear programming, i.e. the objective function or at least one constraint function is nonlinear in x;

• integer programming, i.e. the decision variables x are discrete integers;

• mixed-integer programming, i.e. the decision variables x are partly continuous and partly discrete.

In order to illustrate the previous terms, the example of portfolio optimization is subsequently considered as a linear program [6].


Example 1.1 (Portfolio optimization). An investor aims at a profit-making investment of 10,000 €. The available investment options comprise 3 investment funds of different earning and risk, summarized in Table 1.1 below.

Fund   Estimated annual yield   Risk
A      10 %                     4
B      7 %                      2
C      4 %                      1

Table 1.1: Portfolio optimization.

The investor's goal is to realize a yield of 600 € at the end of the first year. However, the investor is conservative and wants to invest at least 40 % into fund C to reduce risk. Determine the optimal distribution of the 10,000 € to meet these requirements.

In order to solve this problem, introduce the decision variables x1, x2 and x3 expressing the percentages of the investments into funds A, B and C. This implies

x3 = 1 − x1 − x2.        (1.4)

The desired minimum gain of 600 € requires

10,000 (0.1 x1 + 0.07 x2 + 0.04 x3) ≥ 600

or equivalently

6 x1 + 3 x2 ≥ 2.        (1.5)

The minimum investment in fund C can be expressed according to

10,000 x3 = 10,000 (1 − x1 − x2) ≥ 4,000,

which yields

x1 + x2 ≤ 0.6.        (1.6)

In addition it is required that x1, x2, x3 ≥ 0. Minimizing risk can be expressed, e.g., by the linear objective function

f(x) = 4 x1 + 2 x2 + x3 = 3 x1 + x2 + 1.        (1.7)

With these preliminaries the static optimization problem is given by

min_{x ∈ R²} (3 x1 + x2 + 1)        (1.8a)

subject to

h1(x) = 2 − (6 x1 + 3 x2) ≤ 0
h2(x) = x1 + x2 − 0.6 ≤ 0
h3(x) = x1 + x2 − 1 ≤ 0        (1.8b)
h4(x) = −x1 ≤ 0
h5(x) = −x2 ≤ 0.

Figure 1.1 shows the solution of the static optimization problem. Here, isoclines of the objective function f(x) = const. are depicted together with the individual constraints to cut out the admissible region. From these it follows that the optimal value x∗ of the decision variable x is located at the intersection of the lines h1(x) = 0 and h2(x) = 0, which yields x∗ = [1/15, 8/15, 6/15]^T.
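The linear program (1.8) can also be cross-checked numerically. The following minimal sketch (not part of the original notes) uses scipy.optimize.linprog; the constant offset 1 in (1.8a) is dropped since it does not change the minimizer, h4 and h5 are handled as variable bounds, and h3 is implied by h2.

```python
import numpy as np
from scipy.optimize import linprog

# Objective (1.8a): minimize 3*x1 + x2 (+1 is a constant and is dropped).
c = np.array([3.0, 1.0])

# Inequality constraints from (1.8b) in the form A_ub @ x <= b_ub:
# h1: 2 - (6*x1 + 3*x2) <= 0  ->  -6*x1 - 3*x2 <= -2
# h2: x1 + x2 - 0.6     <= 0  ->   x1 + x2     <= 0.6
A_ub = np.array([[-6.0, -3.0],
                 [ 1.0,  1.0]])
b_ub = np.array([-2.0, 0.6])

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None)], method="highs")
x1, x2 = res.x
print(x1, x2, 1 - x1 - x2)   # expected: 1/15, 8/15, 6/15
```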


Fig. 1.1: Illustration of portfolio optimization (isoclines f = const., the constraint lines h1 = 0, h2 = 0, h3 = 0, and the admissible set in the (x1, x2)-plane).

Example 1.2 (Unconstrained and constrained quadratic programming). Subsequently, the quadratic programming problem defined by

min_{x ∈ R²} f(x) = (x1/2 − 1)² + (x2 − 2)²        (1.9a)

is considered without and with equality as well as inequality constraints. Figure 1.2(a) shows a contour plot of f(x) defined by (1.9a) in the (x1, x2)-plane. The minimum f(x∗) = 0 is obviously located at x∗ = [2, 2]^T.

Amending (1.9a) by the equality constraint

g(x) = x2 + x1/2 − 2 = 0        (1.9b)

yields the modified picture shown in Figure 1.2(b). Recalling that (1.9b) imposes an algebraic constraint, the number of free decision variables is reduced to one. In geometric terms this implies that the optimal solution resides on the line defined by (1.9b) and is determined as the tangential contact point of the line g(x) = 0 with the respective isocline f(x) = 0.5.

The equality constraint is replaced by the inequality constraint

h1(x) = x2 − p0 x1 + p0 ≤ 0,  p0 = 0.5354.        (1.9c)

With this, the set of admissible decisions is restricted to the region bounded by h1(x) = 0 as depicted in Figure 1.2(c). The optimal value is hence given by the contact point of the isocline of f(x) with the line h1(x) = 0 tangential to the isocline. Taking into account an additional inequality constraint

h2(x) = x2 − 0.15 x1² + p1 ≤ 0,  p1 = 0.468        (1.9d)

yields the picture shown in Figure 1.2(d). In this case the admissible set is reduced and the optimal value x∗ = [3, 0.882]^T of the decision variable x is located at the intersection of the curves h1(x) = 0 and h2(x) = 0 so that both constraints are active.


Fig. 1.2: Geometric illustration of the quadratic programming problem (1.9) in the (x1, x2)-plane: (a) unconstrained problem (1.9a); (b) equality constrained problem (1.9a), (1.9b); (c) inequality constrained problem (1.9a), (1.9c); (d) inequality constrained problem (1.9a), (1.9c), (1.9d). The optimum is highlighted by • for the unconstrained and the three constrained cases. Admissible sets for inequality constraints are shown in gray.
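As a numerical cross-check of the equality constrained case (b) (this sketch is not part of the original notes), the tangency point of (1.9a), (1.9b) can be computed with SciPy's SLSQP solver:

```python
import numpy as np
from scipy.optimize import minimize

# Objective (1.9a)
f = lambda x: (x[0] / 2 - 1) ** 2 + (x[1] - 2) ** 2

# Equality constraint (1.9b): g(x) = x2 + x1/2 - 2 = 0
cons = {"type": "eq", "fun": lambda x: x[1] + x[0] / 2 - 2}

res = minimize(f, x0=np.array([0.0, 0.0]), method="SLSQP", constraints=cons)
print(res.x, res.fun)   # tangency point [1, 1.5] on the isocline f = 0.5
```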

1.2 Dynamic optimization

The general mathematical formulation of a dynamic optimization problem is given by

min_{u(·)} J(u) = φ(te, x(te)) + ∫_{t0}^{te} l(t, x(t), u(t)) dt        (objective function)        (1.10a)

subject to

ẋ = f(t, x, u),  x(t0) = x0        (system dynamics)        (1.10b)
g(te, x(te)) = 0        (terminal constraints)        (1.10c)
h_j(x, u) ≤ 0,  j = 1, ..., q        (inequality constraints).        (1.10d)

Herein, the optimization variable u(·) : [t0, te] → R^m is a function of an independent coordinate t, which typically corresponds to time and hence motivates the label dynamic optimization. Note that u(t) denotes the input to the nonlinear system (1.10b) with state x(t) ∈ R^n. Besides initial conditions x(t0) = x0, often terminal constraints on x(te) are given in the form (1.10c), e.g. to achieve a desired state x∗_e at te by defining g(te, x(te)) = x(te) − x∗_e.

Fig. 1.3: Solution of the dynamic optimization problem (1.11) for the Goddard rocket: (a) altitude h(t); (b) velocity v(t); (c) mass m(t); (d) control input u(t).

Moreover, inequality constraints arise, which are e.g. imposed by actuator limitations or safety considerations. The objective function (1.10a) is also called performance index or Bolza functional.

Dynamic optimization deals with the determination of the time history of the control vector u(t), t ∈ [t0, te], minimizing the objective function (1.10a) while ensuring that the evolution of the state variable x(t) governed by (1.10b) satisfies (1.10c) and (1.10d). Depending on whether the end time te is known or unknown, the problem is referred to as a dynamic optimization problem with fixed end time or free end time, respectively. Since u(t) is an element of an infinite-dimensional function space due to t ∈ [t0, te] ⊆ R, problems of the form (1.10) are also referred to by the notions optimal control, dynamic programming or infinite-dimensional optimization.
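Before turning to the examples, the following minimal sketch (not part of the original notes; the double-integrator problem data is purely illustrative) indicates how a problem of the form (1.10) can be attacked numerically by direct single shooting, anticipating Section 4.5.2: the control is parametrized as piecewise constant, the dynamics (1.10b) are integrated by an explicit Euler scheme, and the resulting finite-dimensional static problem is handed to an NLP solver.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative problem (not from the notes): double integrator
# dx1/dt = x2, dx2/dt = u on [0, te], x(0) = 0, terminal constraint
# x(te) = [1, 0], objective J(u) = int_0^te u(t)^2 dt.
N, te = 50, 1.0
dt = te / N

def simulate(u):
    """Explicit Euler integration for piecewise constant control u."""
    x = np.zeros(2)
    for k in range(N):
        x = x + dt * np.array([x[1], u[k]])
    return x

J = lambda u: dt * np.sum(u ** 2)                         # discretized (1.10a)
cons = {"type": "eq",                                     # discretized (1.10c)
        "fun": lambda u: simulate(u) - np.array([1.0, 0.0])}

res = minimize(J, np.zeros(N), method="SLSQP", constraints=cons)
print(res.x[0], res.x[-1])   # approximately recovers the analytic optimum
                             # u(t) = 6 - 12 t of this illustrative problem
```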

Example 1.3 (Rocket flight). One of the classic examples for dynamic optimization addresses the maximization of the altitude of a flying rocket under the influence of the earth's gravity. This problem traces back to the pioneering work of R.H. Goddard in 1919 [5] and can be formulated in normalized form as follows

min_{u(·)} J(u) = −h(te)        (objective function)        (1.11a)

subject to

d/dt [h, v, m]^T = [v, (u − Af(h, v))/m − 1/h², −u/c]^T,  [h(0), v(0), m(0)]^T = [1, 0, 1]^T        (system dynamics)        (1.11b)
m(te) = 0.6        (terminal constraint)        (1.11c)
0 ≤ u ≤ 3.5        (inequality constraint).        (1.11d)

Here, h(t) refers to the flight altitude, v(t) is the rocket velocity and m(t) denotes the rocket mass. Drag is described by the function

Af(h, v) = A0 v² e^{β(1−h)}.

The terminal constraint (1.11c) refers to the dead weight of the rocket without fuel. The input is given by the thrust u(t), which has to fulfill (1.11d). Subsequently, the parameters are assigned as c = 0.5, A0 = 310


and β = 500. Results for the numerical solution of (1.11) are shown in Figure 1.3. The rocket is initially driven at maximum thrust. Then thrust is dropped and is again increased along a parabolic curve until all fuel is consumed. This behavior is a result of drag, which decreases with increasing altitude, so that it is advantageous to reduce thrust in areas of high drag (see, e.g., [3]).

Example 1.4 (Optimal ship trajectory). The trajectory of a ship moving in the (x1, x2)-plane is considered under the influence of a known flow rate acting in x1-direction [8]. The scenario is shown schematically in Figure 1.4.

Fig. 1.4: Schematics of the ship movement in the (x1, x2)-plane under the influence of the flow d(t).

The simplified equations of motion of the ship are given by

d/dt [x1, x2]^T = [v cos(u) + d, v sin(u)]^T,  [x1(0), x2(0)]^T = [0, 0]^T        (1.12a)

with x1(t) and x2(t) the spatial coordinates of the ship, v(t) = 1 + σ(t − 1/2) the forward velocity of the ship (with σ(·) the unit step function), which hence jumps from 1 to the value 2 at t = 1/2, and d(t) = 2 + 2 sin(2πt) the known flow rate, assumed to change periodically. The optimization objective is herein given by

min_{u(·)} J(u) = ∫_0^{te} (1 + c u²) dt        (1.12b)

depending on the parameter c. For c = 0, the problem reduces to minimizing the end time te, at which the desired final state

[x1(te), x2(te)]^T = [10, 5]^T        (1.12c)

is to be reached. For c > 0, the weighting of the input u(t) in the objective function is increased, which allows addressing the issue of minimizing energetic utilization during the ship movement. Optimal trajectories are depicted in Figure 1.5 for c ∈ {100, 1, 0}. For c = 0, the input u(t) remains constant for all t, while for c > 0 a discontinuity arises. Moreover, a close inspection confirms that te increases for increasing values of c.

Example 1.5 (Optimization of the personal income distribution). Economics is an important application domain for optimization. This is subsequently illustrated for a process describing the behavior of a consumer whose desire is to maximize his/her personal income distribution [7]. The human capital H(t) can be interpreted as the stock of competencies, knowledge, social and personality attributes;


Fig. 1.5: Solution of the dynamic optimization problem for the ship motion: (a) optimal trajectories in the (x1, x2)-plane; (b) input u(t) for c = 100, c = 1 and c = 0.

we will shortly refer to this as education and professional training. The human capital H(t) and the capital K(t) of an average consumer can be mathematically described by the model equations

d/dt [H, K]^T = [H^ε u2 u3 − δH, iK + H u2 g(u3) − u1]^T,  [H(0), K(0)]^T = [H0, K0]^T.        (1.13a)

Inputs to the system are the consumption u1(t), the rate of total time used for working u2(t) and the rate of working time used for education and professional training u3(t). These are subject to the constraints

u1 > 0,  u2 ∈ [0, 1],  u3 ∈ [0, 1).        (1.13b)

The optimization problem refers to the maximization of the personal income distribution over a period of te = 75 years, reflected by the objective function

max_{u(·)} J(u) = K^κ(te) + ∫_0^{te} e^{−ρt} U(t, u1, u2, u3) dt        (1.13c)

with the utility function

U(t, u1, u2, u3) = α0 u1^α + β0 (1 − u2)^β + γ0 t H^γ.        (1.13d)

The utility function contains three parts to take into account the influence of consumption, of leisure time and of human capital (educational level). The term K^κ(te) allows to also consider the bequest of the consumer. Note that the maximization problem is by (1.3) equivalent to the minimization problem

min_{u(·)} −J(u) = −K^κ(te) − ∫_0^{te} e^{−ρt} U(t, u1, u2, u3) dt,        (1.13e)

which is utilized for the numerical solution of the optimization problem. The function g(u3) in (1.13a) is assigned as g(u3) = 1 − (1 − a) u3 − a u3², and the parameter values are chosen as a = 0.3, α = −1, α0 = −1, β = −0.5, β0 = −0.8, γ = 0.2, γ0 = 0.0075, κ = 0.2, ρ = 0.01, ε = 0.35, δ = 0.01, i = 0.04, H0 = 1, and K0 = 30 [7].

Solutions to the optimization problem are shown in Figure 1.6. The first 17 years are characterized by learning with u3(t) = 1. This phase is followed by a work period of about 34 years with still a high level of professional education. The years 52 to 61 are characterized by suspending professional education and focusing on work at a higher level of consumption. Retirement starts in about year 62.

Fig. 1.6: Solution of the dynamic optimization problem for the personal income distribution: (a) human capital H(t); (b) capital K(t); (c) consumption u1(t); (d) rate of total time used for working u2(t); (e) rate of working time used for education u3(t).

The highest level of human capital, i.e. education, is reached during the period 30 to 60 years. Negative capital refers to financial debt, which is later compensated by the increasing income.

Dynamic optimization can be integrated into the operation of many technical and non-technical systems. Selected examples include

• process control in chemical and process industry,

• robotics, guidance and trajectory generation,

• automotive and engine control,

• energy production and distribution,

• cooling and reheating processes,

• economics and organization.


1.3 Mathematical fundamentals

In the following, mathematical notions and fundamentals are provided, which are required for the subsequent chapters. For this, recall the standard formulation of a static optimization problem given by (1.1), i.e.

min_{x ∈ R^n} f(x)        (1.14a)
s.t.  g_j(x) = 0,  j = 1, ..., p        (1.14b)
      h_j(x) ≤ 0,  j = 1, ..., q        (1.14c)

or equivalently by (1.2), i.e.

min_{x ∈ R^n} f(x)        (1.15a)
s.t.  x ∈ Xad        (1.15b)

with Xad the set of admissible or feasible decisions.

1.3.1 Infimum and supremum

The infimum (inf) and supremum (sup) refer to the largest lower and the smallest upper bound of a non-empty set.

Definition 1.1 (Infimum and supremum). Let X ⊂ R be a non-empty set. The infimum inf X of X denotes the largest lower bound of X, i.e. there exists α ∈ R such that

(i) x ≥ α for all x ∈ X and
(ii) for all ᾱ > α there exists x ∈ X with x < ᾱ.

The supremum sup X of X denotes the smallest upper bound of X, i.e. there exists β ∈ R such that

(i) x ≤ β for all x ∈ X and
(ii) for all β̄ < β there exists x ∈ X with x > β̄.

It should be pointed out that neither inf X nor sup X needs to be an element of X, provided inf X and/or sup X exist. As an example consider the set X = {x ∈ R : x > 0} = (0, ∞), for which inf X = 0 ∉ X.

1.3.2 Local and global minimum and maximum

Let Xad denote the set of admissible or feasible decisions of the static optimization problem (1.15).

Definition 1.2 (Global and local minimum). The function f(x) has at x∗ ∈ Xad a

(i) local minimum, if there exists an ε > 0 such that f(x∗) ≤ f(x) for all x ∈ Uε ∩ Xad with Uε a sufficiently small ε-neighborhood of x∗;
(ii) strict local minimum, if there exists an ε > 0 such that f(x∗) < f(x) for all x ∈ (Uε \ {x∗}) ∩ Xad;
(iii) global (absolute) minimum, if f(x∗) ≤ f(x) for all x ∈ Xad;
(iv) strict (unique) global minimum, if f(x∗) < f(x) for all x ∈ Xad \ {x∗}.

Figure 1.7 provides a graphical illustration of the different notions of a minimum. Note that Definition 1.2 can be directly transferred to a local and global maximum. Furthermore, recall from Definition 1.2 that

Fig. 1.7: Graphical illustration of local and global minima (local, strictly local and strictly global minima of f on Xad).

a minimum (maximum) is always attained at a point of Xad, while inf{f(x) : x ∈ Xad} or sup{f(x) : x ∈ Xad}, respectively, need not be attained. The set of all minimizers is often referred to by

G = arg min{f(x) : x ∈ Xad} := {x ∈ Xad : f(x) = inf{f(x̃) : x̃ ∈ Xad}}.        (1.16)

The set G can be empty or can contain a finite or even an infinite number of elements. For a strict global minimum in Xad, the term x = arg min{f(x) : x ∈ Xad} typically refers to the unique point x ∈ Xad minimizing the function f(x).

Fig. 1.8: Non-existence of minima: (a) f on the open interval (α, β); (b) f with a discontinuity at γ ∈ [α, β]; (c) f unbounded from below on [α, ∞).

For the existence or non-existence of minima and maxima consider the three examples in Figure 1.8:

• In Figure 1.8(a) the infimum of f(x) on the set X := (α, β) is given by f(β). Since X is open and hence β ∉ X, no minimum exists in this case.

• In Figure 1.8(b) the limit lim_{x→γ} f(x) is the infimum of f(x) on X := [α, β]. However, the minimum does not exist due to the discontinuity of f(x) at x = γ.

• Similarly, in the case depicted in Figure 1.8(c) the function f(x) does not have a minimum since f(x) is not bounded from below on the set X := {x ∈ R : x ≥ α}.

These examples clearly illustrate that criteria are required to conclude the existence of minima and hence the existence of solutions to optimization problems. One of these is given below.

Theorem 1.1. Let X be a non-empty and compact (closed and bounded) set and let f : X → R be continuous on X. Then the set of minimizers G = arg min{f(x) : x ∈ X} is non-empty and compact.

Theorem 1.1 is also called the Weierstrass extreme value theorem and yields only a sufficient existence condition. To illustrate this fact, consider the optimization problem min_{x ∈ (−1, 1)} x², whose solution is given by x = 0 although X := {x ∈ R : x ∈ (−1, 1)} is open and hence not compact.

The point(s) x∗ where f(x) attains a local or global minimum are called local or global minimizer(s), as is formally summarized in the definition below.

Definition 1.3 (Global and local minimizer). Let x∗ ∈ Xad, then x∗ is a

(i) local minimizer if f(x) has a local minimum at x∗;

(ii) strict local minimizer if f(x) has a strict local minimum at x∗;

(iii) global (absolute) minimizer if f(x) has a global (absolute) minimum at x∗;

(iv) strict (unique) global minimizer if f(x) has a strict (unique) global minimum at x∗.

For the constant function f(x) = 1 every point x is a local minimizer. The function f(x) = (x − 1)⁴ has a strict local minimizer at x∗ = 1.

1.3.3 Gradient and Hessian matrix

As will be seen in subsequent chapters, the computation of first and second order derivatives of an objective function f(x) is of fundamental importance for the solution of optimization problems. Since difficulties arise when f(x) or its derivatives are not continuous, it is typically assumed that all functions defining an optimization problem (cf. (1.14), (1.15)) are continuous and sufficiently often continuously differentiable.

Definition 1.4 (Gradient). Let f : X → R be a continuously differentiable function, i.e. f(x) ∈ C¹(X). Then the gradient of f(x) at x is given by

(∇f)(x) = (∂f/∂x)^T = [∂f/∂x1, ∂f/∂x2, ..., ∂f/∂xn]^T.        (1.17)

Definition 1.5 (Hessian matrix). Let f : X → R be a twice continuously differentiable function, i.e. f(x) ∈ C²(X). Then the Hessian (or Hessian matrix) of f(x) at x is given by

             [ ∂²f/∂x1²       ···  ∂²f/(∂x1∂xn) ]
(∇²f)(x) =   [      ⋮                   ⋮       ].        (1.18)
             [ ∂²f/(∂xn∂x1)   ···  ∂²f/∂xn²     ]

It should be pointed out that both the gradient and the Hessian matrix play an important role in optimization and are utilized in essentially any optimization algorithm.

Remark 1.2. In the scalar case n = 1, the notation is typically simplified to f′(x) and f′′(x).

The assumption that f ∈ C²(X) implies that the mixed partial derivatives commute, i.e.

∂²f/(∂xi∂xj) = ∂²f/(∂xj∂xi).

As a result, the Hessian matrix is always symmetric, satisfying (∇²f)(x) = (∇²f)^T(x), and as such exhibits only real eigenvalues.

1.3.4 Differentiation of vector–valued functions

The differentiation of vector-valued functions is encountered in different variations throughout the remaining text. Subsequently, certain rules are summarized using

x = [x1, ..., xn]^T,  f(x) = [f1(x), ..., fm(x)]^T,  g(x) = [g1(x), ..., gm(x)]^T

with n, m ∈ N. The differentiation with respect to x of

(i) a column vector f(x) yields the Jacobian matrix, i.e.

∂f/∂x = [∂f1/∂x; ...; ∂fm/∂x] = [ ∂f1/∂x1 ··· ∂f1/∂xn ; ⋮ ; ∂fm/∂x1 ··· ∂fm/∂xn ],        (1.19)

(ii) the product of a scalar function f(x) and a column vector g(x) yields

∂/∂x [f(x) g(x)] = g(x) ∂f/∂x + f(x) ∂g/∂x,        (1.20)

(iii) the scalar product of two vectors f(x), g(x) yields

∂/∂x [f(x) · g(x)] = ∂/∂x [f^T(x) g(x)] = g^T(x) ∂f/∂x + f^T(x) ∂g/∂x,        (1.21)

(iv) the product of a (p × m)-matrix A(x) and an m-column vector f(x) yields

∂/∂x [A(x) f(x)] = [∂A/∂x1 f(x) ··· ∂A/∂xn f(x)] + A(x) ∂f/∂x.        (1.22)


1.3.5 Definiteness and semi–definiteness

The property of definiteness or semi-definiteness of the Hessian matrix may be analyzed by making use of the following result.

Theorem 1.2 (Definiteness of matrices). Let A ∈ R^{n×n} be symmetric. Then the matrix A is

                         (a) ∀p ∈ R^n, p ≠ 0:   (b) ∀j = 1, ..., n:      (c) ∀j = 1, ..., n:
positive semi-definite   p^T A p ≥ 0            λj(A) ∈ R, λj(A) ≥ 0     —
positive definite        p^T A p > 0            λj(A) ∈ R, λj(A) > 0     Dj > 0
negative semi-definite   p^T A p ≤ 0            λj(A) ∈ R, λj(A) ≤ 0     —
negative definite        p^T A p < 0            λj(A) ∈ R, λj(A) < 0     (−1)^{j+1} Dj < 0

Herein, recall that the eigenvalues λj of the matrix A are the solutions of the equation

det(λE − A) = 0

with E the (n × n) identity matrix. The minors Dj are given as the determinants of the left upper sub-matrices of A, i.e.

D1 = det(A1,1),  D2 = det([A1,1 A1,2; A2,1 A2,2]),  ...,  Dn = det(A)

with Ai,j referring to the element of A in row i and column j.

For the sake of brevity, A > 0 (A ≥ 0) and A < 0 (A ≤ 0) are used to refer to positive and negative (semi-)definiteness, respectively. Note that for the analysis of definiteness only one of the conditions (a)-(c) has to be considered, since each is necessary and sufficient. Criterion (c) is also known as the Sylvester criterion and does not apply to semi-definite matrices.
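Theorem 1.2 translates directly into a numerical test. The following sketch (not part of the original notes) classifies a symmetric matrix via the eigenvalue criterion (b) and additionally returns the minors Dj of criterion (c):

```python
import numpy as np

def definiteness(A, tol=1e-10):
    """Classify a symmetric matrix via criterion (b) of Theorem 1.2."""
    lam = np.linalg.eigvalsh(A)          # real eigenvalues of a symmetric matrix
    if np.all(lam > tol):   return "positive definite"
    if np.all(lam >= -tol): return "positive semi-definite"
    if np.all(lam < -tol):  return "negative definite"
    if np.all(lam <= tol):  return "negative semi-definite"
    return "indefinite"

def minors(A):
    """Leading principal minors D_j used in criterion (c)."""
    return [np.linalg.det(A[:j, :j]) for j in range(1, A.shape[0] + 1)]

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(definiteness(A), minors(A))   # positive definite, D1 = 2 > 0, D2 = 5 > 0
```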

1.3.6 Mean value theorems and Taylor’s formula

Gradient and Hessian matrix are exploited in the formulation of mean value theorems, which allow to estimate the difference of two function values by their derivatives. For this, the notion of a line segment between two points x1, x2 ∈ X is required, which refers to

[x1, x2] := x2 + r(x1 − x2) = r x1 + (1 − r) x2        (1.23)

when the parameter r varies between 0 and 1.

Theorem 1.3 (Mean value theorem). Let f(x) be continuously differentiable, i.e. f(x) ∈ C¹(X), and assume that the line segment [x1, x2] ⊂ X. Then there exists a real number r ∈ [0, 1] such that

f(x2) = f(x1) + (x2 − x1)^T (∇f)(x2 + r(x1 − x2)).        (1.24)

Moreover, if f(x) ∈ C²(X), then there exists a real number r ∈ [0, 1] such that

f(x2) = f(x1) + (x2 − x1)^T (∇f)(x1) + ½ (x2 − x1)^T (∇²f)(x2 + r(x1 − x2)) (x2 − x1).        (1.25)

In addition, an integral mean value theorem is available.


Theorem 1.4 (Integral mean value theorem). Let f(x) ∈ C¹(X) and assume that the line segment [x1, x2] ⊂ X. Then

f(x2) − f(x1) = ∫_0^1 (x2 − x1)^T (∇f)(x1 + r(x2 − x1)) dr.        (1.26)

For the sake of completeness, Taylor's formula is provided, i.e.

f(x + h) = Σ_{k=0}^{q} (h · ∇)^k f(x) / k! + Rq(f, x; h) = Σ_{|α| ≤ q} (∂^α f)(x) / α! · h^α + Rq(f, x; h)        (1.27)

assuming that f(x) ∈ C^{q+1}(X) and [x, x + h] ⊂ X [1, 4]. The latter equality follows from the use of multi-index notation with α = (α1, ..., αn), α! = α1! ··· αn! and |α| = α1 + ··· + αn, so that h^α = h1^{α1} ··· hn^{αn}. The remainder Rq(f, x; h) can be formulated either in Lagrange form

Rq(f, x; h) = Σ_{|α| = q+1} (∂^α f)(x + r h) / α! · h^α  for some r ∈ (0, 1)

or in integral form

Rq(f, x; h) = (q + 1) Σ_{|α| = q+1} h^α / α! ∫_0^1 (1 − r)^q (∂^α f)(x + r h) dr.

1.3.7 Convexity

The property of convexity of sets and functions is of fundamental importance for the analysis of optimization problems and eventually results in a simpler solution [2].

Convex sets. In geometrical terms, a set X ⊆ R^n is convex if the line segment connecting two arbitrary points x, y ∈ X is completely included in X.

Definition 1.6 (Convex set). A set X ⊂ R^n is called convex if for all x, y ∈ X and all r ∈ (0, 1) the point

z = y + r(x − y) = r x + (1 − r) y        (1.28)

satisfies z ∈ X.

Examples of convex and non-convex sets are shown in Figure 1.9. A point r1 x1 + ··· + rk xk, where r1 + ··· + rk = 1 and rj > 0, j = 1, ..., k, is called a convex combination of the points xj, j = 1, ..., k. It can be shown that a set is convex if and only if it contains every convex combination of its points.

Definition 1.7 (Convex hull). The convex hull of a set X, denoted conv X, is the set of all convex combinations of points in X, i.e.

conv X = {r1 x1 + ··· + rk xk : xj ∈ X, r1 + ··· + rk = 1, rj > 0, j = 1, ..., k}.        (1.29)

The convex hull conv X is always convex and is the smallest convex set that contains X, where X itself does not need to be convex. Figure 1.10 illustrates this fact.
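Numerically, the convex hull of a finite point set can be computed, e.g., with scipy.spatial.ConvexHull; the following sketch (not part of the original notes) mimics the left part of Figure 1.10:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Convex hull of 10 random points in R^2, cf. Figure 1.10 (left).
rng = np.random.default_rng(0)
X = rng.random((10, 2))
hull = ConvexHull(X)
print(hull.vertices)   # indices of the points spanning conv X
print(hull.volume)     # in 2-D, "volume" is the enclosed area
```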

For convex sets the following useful properties hold true. These include operations that preserve the convexity of sets or allow to construct convex sets from others:


Fig. 1.9: Examples of (a) convex and (b) non-convex sets in R².

(i) If the sets X and Y are convex, then the intersection X ∩ Y is convex.

(ii) Scaling and translation preserve convexity. In particular, if X is a convex set and r ∈ R, then the set {y : y = r x, x ∈ X} is convex. Similarly, let X ⊆ R^n be a convex set and h ∈ R^n, then the set {y : y = x + h, x ∈ X} is convex.

(iii) If X and Y are convex sets, then the sum of X and Y, i.e. the set {z : z = x + y, x ∈ X, y ∈ Y}, is convex.

(iv) The image of a convex set under a linear transformation is convex.

For a detailed discussion about convex sets in the context of optimization, the interested reader is,e.g., referred to the exposition in [2].

Fig. 1.10: Examples of convex hulls in R². Left: convex hull of 10 points. Right: convex hull of the non-convex set of Figure 1.9(b), left.


Convex functions. Similar to convex sets, convex functions play a crucial role in optimization. Their definition is given below.

Definition 1.8 (Convex and concave function). Let X ⊂ R^n be a convex set. The function f : X → R is called convex if for all x, y ∈ X and all r ∈ [0, 1] the inequality

f(z) ≤ r f(x) + (1 − r) f(y),  z = r x + (1 − r) y        (1.30)

is satisfied. The function f is called strictly convex if for all r ∈ (0, 1) and all x ≠ y the condition

f(z) < r f(x) + (1 − r) f(y),  z = r x + (1 − r) y        (1.31)

is fulfilled. The function f is called (strictly) concave if −f is (strictly) convex.

In geometrical terms, this definition implies that the function f is convex (concave) if the line segment between (x, f(x)) and (y, f(y)), which is the chord from x to y, lies above (below) the graph of f. The line segment thereby refers to the value of f(z) for z = r x + (1 − r) y with r ∈ [0, 1]. Figure 1.11 illustrates this geometric interpretation. This moreover implies that any affine function f(x) = a^T x + b with a^T = [a1, ..., an] is both convex and concave.

Fig. 1.11: Examples of convex and concave functions: (a) strictly convex; (b) concave; (c) neither concave nor convex.

It can be shown that a function f : X → R is convex if and only if it is convex when restricted to any line that intersects its domain. Hence, it suffices to show that for all x ∈ X and all v ∈ R^n the function g(t) = f(x + t v) is convex, which restricts the convexity analysis to a line. Convex functions show certain interesting properties:

(i) Let fj(x), j = 1, ..., k, be convex on the convex set X and let rj ≥ 0, j = 1, ..., k. Then the weighted sum

f(x) = Σ_{j=1}^{k} rj fj(x)        (1.32)

is convex on X.

(ii) If f(x) is convex on the convex set X, then the set S = {x ∈ X : f(x) ≤ c} is convex for all real numbers c. This property is visualized in Figure 1.12(a).

(iii) Let f ∈ C¹, i.e. ∇f exists at each point in the open set X. Then f is convex if and only if X is convex and the condition

f(y) ≥ f(x) + (y − x)^T (∇f)(x)        (1.33)

is fulfilled for all x, y ∈ X. This inequality, also referred to as a first-order condition, admits the geometric interpretation that at any point x of a convex function f(x) there exists a supporting hyperplane above which f(x) is restricted to lie (cf. Figure 1.12(b)). Taking into account either the mean value theorem (1.24) or Taylor's formula (1.27) up to order one for h = y − x sufficiently small, (1.33) states that the first order Taylor approximation of a convex function is a global underestimator of the function. Conversely, if the first order Taylor approximation of a function is always a global underestimator, then the function is convex [2]. Hence, given a convex function, it is possible by making use of (1.33) to deduce global information from local information (value and derivative at a point). Note that strict convexity can be similarly characterized in terms of

f(y) > f(x) + (y − x)^T (∇f)(x),  x ≠ y.        (1.34)

(iv) Let f ∈ C², i.e. ∇²f exists at each point in the open set X. Then f is convex if and only if X is convex and the Hessian matrix (∇²f)(x) is positive semi-definite on X. This second-order condition implies geometrically that the curvature of f(x) at any x is non-negative. If (∇²f)(x) is positive definite, then f is strictly convex in X provided that X is convex. However, the converse does not hold. This can be easily seen for the example of the strictly convex function f(x) = x⁴, which satisfies f′′(0) = 0.

Fig. 1.12: Illustration of properties (ii) and (iii) of convex functions: (a) convex set S obtained from the intersection of a convex function f(x) with the plane f(x) = c; (b) supporting tangent f(x) + (y − x) f′(x) for a convex function.

Example 1.6 (Convexity of quadratic functions). Consider the quadratic function f : R^n → R defined by

f(x) = ½ x^T P x + q^T x + r

with P = P^T ∈ R^{n×n}, q ∈ R^n and r ∈ R. Gradient and Hessian matrix evaluate to

(∇f)(x) = P x + q,  (∇²f)(x) = P.

By making use of property (iv) it follows that f(x) is convex if and only if P is positive semi-definite (it is concave if and only if P is negative semi-definite). Strict convexity (concavity) follows if P is positive (negative) definite.
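The chord inequality (1.30) can also be checked by sampling. The following sketch (not part of the original notes; the matrix P is constructed to be positive definite) performs a randomized test of Definition 1.8 for the quadratic function of Example 1.6:

```python
import numpy as np

# P = B^T B + E is positive definite by construction, so f must be convex
# and pass the chord test (1.30) on all samples.
rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
P = B.T @ B + np.eye(3)
q, r0 = rng.standard_normal(3), 0.7
f = lambda x: 0.5 * x @ P @ x + q @ x + r0

for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    r = rng.random()
    z = r * x + (1 - r) * y
    assert f(z) <= r * f(x) + (1 - r) * f(y) + 1e-12   # inequality (1.30)
print("chord inequality (1.30) holds on all samples")
```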

Exercise 1.1. Prove properties (i) to (iv).


References

1. Amann H, Escher J (2006) Analysis II. Birkhäuser Verlag, Basel, Boston, Berlin

2. Boyd S, Vandenberghe L (2004) Convex Optimization. Cambridge University Press, Cambridge

3. Bryson A (1999) Dynamic Optimization. Addison Wesley Longman, Inc., Menlo Park (CA), USA

4. Evans L (2002) Partial Differential Equations, Graduate Studies in Mathematics, vol 19. American Mathematical Society, Providence, Rhode Island

5. Goddard R (1919) A method of reaching extreme altitudes. Smithsonian Miscellaneous Collections 71

6. Graichen K (2013) Methoden der Optimierung und optimalen Steuerung. Skriptum zur Vorlesung, Universität Ulm

7. Oberle H, Rosendahl R (2006) Numerical computation of a singular-state subarc in an economic optimal control model. Optim Contr Appl Met 27:211–235

8. Papageorgiou M (1991) Optimierung. R. Oldenbourg Verlag, München, Wien


Chapter 2

Static unconstrained optimization

In unconstrained optimization an objective function is minimized without any additional restriction on the decision variables, i.e.

min_{x ∈ Xad} f(x)        (2.1)

with Xad ⊂ R^n the set of admissible decisions.

2.1 Optimality conditions

In the following, conditions are derived to determine (local) minimizers x∗ of f(x) in Xad so that

f(x) ≥ f(x∗)        (2.2)

for all x ∈ Xad or x ∈ Uε ∩ Xad with Uε a sufficiently small ε-neighborhood of x∗ (cf. Definition 1.2). For this, recall the mean value theorem (Theorem 1.3) and assume that f(x) ∈ C¹(Xad) and that the line segment [x∗, x∗ + δx] ⊂ Xad for sufficiently small δx. Then, taking into account (1.24), we have

f(x∗ + δx) = f(x∗) + (δx)^T (∇f)(x∗ + (1 − r)δx)

for some r ∈ [0, 1]. In case of a minimizer x∗ the inequality (2.2) with x = x∗ + δx has to be fulfilled for all δx sufficiently small. This yields

f(x∗) + (δx)^T (∇f)(x∗ + (1 − r)δx) ≥ f(x∗)

or equivalently

(δx)^T (∇f)(x∗ + (1 − r)δx) ≥ 0.

It will be shown by contradiction that the latter inequality implies (∇f)(x∗) = 0. Hence, assume first (∇f)(x∗) ≠ 0 and define δx = −(∇f)(x∗). Then (δx)^T (∇f)(x∗) = −‖(∇f)(x∗)‖² < 0. Since ∇f is by assumption continuous in a neighborhood of x∗, there exists a scalar τ > 0 such that

(δx)^T (∇f)(x∗ + t δx) < 0        (2.3)

for all t ∈ [0, τ]. Hence, for any t̄ ∈ (0, τ] the mean value theorem implies

f(x∗ + t̄ δx) = f(x∗) + t̄ (δx)^T (∇f)(x∗ + (1 − r) t̄ δx)        (2.4)

for some r ∈ [0, 1]. Noting that 1 − r ∈ [0, 1] and hence t := (1 − r) t̄ ∈ [0, τ], substitution of (2.3) into (2.4) yields

f(x∗ + t̄ δx) < f(x∗)

for all t̄ ∈ (0, τ]. With this, a direction away from x∗ is obtained along which f(x) decreases. Thus x∗ is not a local minimizer, and a contradiction is obtained, which implies the following result.

Theorem 2.1 (First order necessary optimality condition). Let Xad ⊂ R^n be the set of admissible decisions and assume f(x) ∈ C¹(Xad). If x∗ ∈ Xad is a local minimizer, then

(∇f)(x∗) = 0.        (2.5)

Example 2.1. Consider the minimization problem

min_{x ∈ Xad} f(x) = sin(x1 x2) − tan(x1)        (2.6)

with Xad = [−1, 1] × [−1, 1]. Evaluation of (2.5) provides

(∇f)(x∗) = [x2 cos(x1 x2) − cos⁻²(x1), x1 cos(x1 x2)]^T = 0,

which results in x∗ = [0, 1]^T. In order to ensure that x∗ is a local minimizer, it is required to show that 0 = f(x∗) ≤ f(x) for all x in a neighborhood of x∗. Therefore, let x = x∗ + [ε, 0]^T for |ε| ≪ 1 and evaluate

f(x) = f(x∗ + [ε, 0]^T) = sin(ε) − tan(ε).

Thus, f(x) < f(x∗) for x1 ∈ (0, ε] and f(x) > f(x∗) for x1 ∈ [−ε, 0) given x2 = 1, which shows that x∗ satisfying (2.5) is not a local minimizer.

This example illustrates that the conditions of Theorem 2.1 are only necessary but not sufficient. In addition, (2.5) only implies that x∗ is an extremum, subsequently often called a stationary point, and is fulfilled for a minimum but also for a maximum and a saddle point (cf. Figure 2.1).

Fig. 2.1: Examples of (a) minimum, (b) maximum and (c) saddle point.
Fig. 2.1: Examples of minimum, maximum and saddle point.

higher order terms, assuming that f(x) is at least twice continuously differentiable in Xad, the higherorder mean value theorem (1.25) can be considered to improve Theorem 2.1.

Theorem 2.2 (Second order necessary optimality conditions). Let Xad ⊂ Rn be the set of ad-missible decisions and assume f(x) ∈ C2(Xad). If x∗ ∈ Xad is a local minimizer, then

(∇f)(x∗) = 0 and (∇2f)(x∗) ≥ 0. (2.7)


The second condition in (2.7) refers to the Hessian of f(x) being positive semi-definite at the point x∗. The proof of Theorem 2.2 is left as an exercise to the reader.

Example 2.2. Consider the minimization problem

min_{x ∈ Xad} f(x) = x1² − 4 x1 + 3 x2² − 6 x2 − 10        (2.8)

with Xad = {x ∈ R² : x1 ≥ −1, x2 ≥ 0}. Evaluation of (2.7) provides

(∇f)(x∗) = [2 x1∗ − 4, 6 x2∗ − 6]^T = 0,

which is satisfied for x∗ = [2, 1]^T, and

(∇²f)(x∗) = [2 0; 0 6].

Since the Hessian matrix is positive definite (eigenvalues λ1 = 2 and λ2 = 6), the objective function f(x) does fulfill the necessary optimality conditions of Theorem 2.2.

Example 2.3. Consider the minimization problem

min_{x ∈ Xad} f(x) = x1² + x1 x2 − 2 x2²        (2.9)

with Xad = R². Evaluation of (2.7) results in

(∇f)(x∗) = [2 x1∗ + x2∗, x1∗ − 4 x2∗]^T = 0,

which yields x∗ = 0 and hence

(∇²f)(x∗) = [2 1; 1 −4].

The Hessian matrix is indefinite (one positive and one negative real eigenvalue) so that neither the necessary optimality conditions for a minimum nor for a maximum are fulfilled. It is left to the reader to show that x∗ = 0 is a saddle point as depicted in Figure 2.1(c).

The following result provides sufficient conditions that guarantee that a point x∗ interior to Xad is a strict local minimizer [9].

Theorem 2.3 (Second order sufficient optimality condition). Let Xad ⊂ R^n be the set of admissible decisions and let f(x) ∈ C²(Xad). If x∗ ∈ Xad and the conditions

(∇f)(x∗) = 0  and  (∇²f)(x∗) > 0        (2.10)

are fulfilled, then x∗ is a strict local minimizer of f(x).

If the objective function f(x) in (2.1) is a convex function, local and global minimizers can be easily characterized, as summarized below [9].

Theorem 2.4. If f(x) is a convex function on the convex set Xad, then any local minimizer x∗ is a global minimizer and the set of minimizers G = arg min{f(x) : x ∈ Xad} is convex. If in addition f(x) ∈ C¹(Xad), then any stationary point x∗ ∈ Xad is a global minimizer.

The proof of Theorem 2.4 is left to the reader and can be, e.g., found in [2].


Example 2.4. Consider a minimization problem involving a quadratic form, i.e.

min_{x ∈ Xad} f(x) = ½ x^T P x + q^T x + r

for Xad = R^n with P = P^T ∈ R^{n×n}, q ∈ R^n and r ∈ R. From Example 1.6 the gradient and Hessian matrix follow as

(∇f)(x) = P x + q,  (∇²f)(x) = P.

We note that f(x) is strictly convex if P is positive definite. In this case there is a unique stationary point x∗ = −P⁻¹ q, which, due to the differentiability of f(x), is a global minimizer by Theorem 2.4.
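Numerically, the global minimizer of Example 2.4 is obtained from a single linear solve. A minimal sketch (not part of the original notes; P and q are illustrative data):

```python
import numpy as np

# Quadratic objective of Example 2.4 with illustrative data.
P = np.array([[4.0, 1.0], [1.0, 3.0]])    # symmetric, positive definite
q = np.array([-1.0, 2.0])

assert np.all(np.linalg.eigvalsh(P) > 0)  # strict convexity (Example 1.6)
x_star = np.linalg.solve(P, -q)           # unique stationary point x* = -P^{-1} q
print(x_star)                             # global minimizer by Theorem 2.4
```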

2.2 Numerical minimization algorithms

The necessary optimality conditions require the determination of stationary points x∗ as solutions to an in general nonlinear system of n coupled equations given by (∇f)(x∗) = 0. As a result, an analytical solution can be expected only in special cases, so that numerical techniques are needed to accurately approximate stationary points x∗.

For this, various algorithms are available, which in principle are based on the computation of a sequence of iterates (xk)k∈N0 starting at an initial point x0 such that f(x) is decreased in each iteration step, i.e.

f(xk+1) < f(xk),  k = 0, 1, ...,        (2.11)

with the desire to achieve convergence of the sequence to the (local) minimizer, i.e.

lim_{k→∞} xk = x∗.        (2.12)

Such algorithms are often referred to as iterative descent algorithms.

Remark 2.1. It should be mentioned that also nonmonotone algorithms exist that do not require the decrease of f(x) in every iteration but only after a certain prescribed number of iterations. Also, information from earlier iterates x0, x1, ..., xk can be used to determine xk+1.

In the following, some preliminaries from numerical analysis are summarized, which are required to properly analyze so-called line search and trust-region methods. Finally, so-called direct search strategies are briefly introduced.

2.2.1 Preliminaries

Convergence is the essential question and preliminary in any iterative technique. For a proper definition, the contraction property of a mapping in a suitable complete space has to be taken into account by defining a suitable metric, i.e. a measure of distance from the iterate to the fixed point of the mapping. The reader is referred to [13, 11] for further details. Subsequently, only the notion of convergence order is introduced as a measure of convergence speed.

Definition 2.1 (Order of convergence). Let (xk)k∈N0 be a sequence converging towards the limit x∗. The order of convergence of the sequence (xk)k∈N0 is the supremum of all nonnegative numbers p for which

0 ≤ lim_{k→∞} |xk+1 − x∗| / |xk − x∗|^p = µ < ∞.        (2.13)


The constant µ is called the asymptotic error constant.

It is obvious from (2.13) that larger values of p correspond to a higher speed of convergence, since the distance of the iterate xk+1 to x∗ is for large k reduced by the p-th power.

Example 2.5. The sequence ((k + 1)^{1/4} − k^{1/4})k∈N0 converges to 0 with the order of convergence p = 1 since

lim_{k→∞} ((k + 2)^{1/4} − (k + 1)^{1/4}) / ((k + 1)^{1/4} − k^{1/4}) = lim_{k→∞} (1 − ((k + 1)/(k + 2))^{1/4}) / (((k + 1)/(k + 2))^{1/4} − (k/(k + 2))^{1/4}) = 1.

One typically distinguishes between the two major cases

p = 1, µ ∈ (0, 1):  linear convergence
p = 2, µ < ∞:       quadratic convergence

and

p = 1, µ = 0:  superlinear convergence
p = 1, µ = 1:  sublinear convergence.

This in particular illustrates that any algorithm with convergence order p > 1 is superlinear.

Exercise 2.1. Determine the convergence order p and the asymptotic error constant µ of the sequence (k^{−k})k∈N0.

Solution. The sequence converges superlinearly to zero.
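A short numerical illustration (not part of the original notes): the ratios |xk+1|/|xk| of the sequence from Exercise 2.1 indeed tend to zero, confirming superlinear convergence.

```python
import numpy as np

# Ratios |x_{k+1}| / |x_k| for x_k = k^{-k}: they tend to 0, i.e. the
# sequence converges superlinearly (p = 1, mu = 0).
k = np.arange(1, 15, dtype=float)
x = k ** (-k)
print(x[1:] / x[:-1])
```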

When analyzing a sequence of vectors (xk)k∈N0 converging to a limit x∗, as is the case in the considered minimization algorithms, the determination of the rate of convergence requires a proper mapping of this sequence into a sequence of scalars. If f(x) is the objective function according to (2.1), then typically the convergence of the sequence (f(xk))k∈N0 to f(x∗) is analyzed. In this context f(x) is also referred to as error function. Alternatively, the norm ‖xk − x∗‖ can be considered or another suitable map from R^n to R. However, the rate of convergence of a vector-valued sequence is in general independent of the choice of the error function.

2.2.2 Line search methods

The principle of operation of line search methods is illustrated in Figure 2.2. In each iteration of a line search method a search direction sk is computed and the algorithm decides how far to move into this direction by determining a suitable step length αk > 0, i.e.

xk+1 = xk + αk sk.        (2.14)

Most line search algorithms require sk to be a descent direction, i.e. one for which sk^T (∇f)(xk) < 0, since this property guarantees that f(x) can be reduced along this direction such that

f(xk+1) = f(xk + αk sk) < f(xk).        (2.15)

To illustrate this, the following proposition is proved subsequently.


Fig. 2.2: Illustration of a line search method.

Proposition 2.1 (Direction of steepest descent). The search direction sk = −(∇f)(xk) is the direction of steepest descent, i.e. among all directions at xk it is the one along which f(x) decreases most rapidly.

Proof. Let f ∈ C²(Xad). Then the mean value theorem (1.25) implies that there exists an r ∈ [0, 1] such that

f(xk+1) = f(xk + αk sk) = f(xk) + αk sk^T (∇f)(xk) + ½ αk² sk^T (∇²f)(xk + tk sk) sk,  tk := (1 − r) αk.

Herein, tk ∈ [0, αk] since r ∈ [0, 1]. The rate of change of f is the coefficient of αk, i.e. sk^T (∇f)(xk). Hence, the unit direction sk of most rapid decrease is the solution to the minimization problem

min_{sk ∈ R^n} sk^T (∇f)(xk)  subject to  ‖sk‖ = 1.

Evaluation of the scalar product yields

sk^T (∇f)(xk) = ‖sk‖ ‖(∇f)(xk)‖ cos θ = ‖(∇f)(xk)‖ cos θ

with θ the angle between sk and (∇f)(xk). The desired minimum is obviously attained for cos θ = −1, so that

sk = −(∇f)(xk) / ‖(∇f)(xk)‖

is the (unit) direction of steepest descent starting at xk. □

This also illustrates that f(x) can be reduced along any direction sk fulfilling the property sk^T (∇f)(xk) < 0. Depending on the selection of the search direction sk, different algorithms can be distinguished, which are summarized below. These, moreover, depend on the suitable determination of the second degree of freedom, namely the step length αk > 0. For this, it would be ideal to find the global minimizer of the scalar minimization problem

min_{αk > 0} g(αk) = f(xk + αk sk)        (2.16)


for fixed xk and sk. However, this is in general computationally too expensive, so that other techniques have to be taken into account to locally address (2.16). The schematic realization of line search methods is summarized in Algorithm 1 below.

Algorithm 1: Schematic line search method.

input: x_0 (starting value), ε (stopping criterion)
initialize: k = 0
repeat
    Compute search direction s_k
    Find an appropriate step length α_k
    Compute x_{k+1} = x_k + α_k s_k
    Update k = k + 1
until ‖(∇f)(x_{k+1})‖ ≤ ε;

2.2.2.1 Determination of the step length

It should be mentioned that simply asking for (2.15), i.e. f(x_k + α_k s_k) < f(x_k), is not enough to achieve convergence to the minimizer x*. As the following example illustrates, sufficient decrease conditions are required to solve (2.16).

Example 2.6. Let f(x) = (x − 1)² − 1 and consider the sequence (x_k)_{k∈N} with x_k = 1 + (−1)^k √(2/k + 1). Then f(x_k) = 2/k, so that f(x_{k+1}) < f(x_k), but as k → ∞ the sequence f(x_k) approaches 0 since x_k starts alternating between 0 and 2. However, the minimum f(x*) = −1 at x* = 1 is not reached.

Subsequently, different conditions and related algorithms are provided, which enable one to determine an appropriate step length α_k in the line search method, assuming that the starting point x_k of the line search and a search direction (descent direction) s_k are given.

Armijo conditions — The Taylor series of g(α_k) = f(x_k + α_k s_k) around α_k = 0 results in

g(α_k) = g(0) + g′(0)α_k + O(α_k²) = f(x_k) + α_k s_k^T (∇f)(x_k) + O(α_k²).

In the Armijo condition the step length α_k, the directional derivative s_k^T (∇f)(x_k), and the reduction in f are connected by the inequality

f(x_k + α_k s_k) ≤ f(x_k) + ε_0 α_k s_k^T (∇f)(x_k) (2.17)

for some constant ε_0 ∈ (0, 1), typically chosen small, e.g. ε_0 ≤ 0.01. With this, an upper bound on the step length is imposed. To ensure that α_k does not become too small, an additional inequality is introduced,

f(x_k + α_k s_k) ≥ f(x_k) + ε_0 ε_1 α_k s_k^T (∇f)(x_k) (2.18)

with the parameter ε_1 > 1. Figure 2.3(a) shows a graphical illustration of (2.17) and (2.18). Herein recall that s_k is by assumption a descent direction with s_k^T (∇f)(x_k) < 0.

In practice one starts for fixed x_k and s_k with an initial choice α_k = α_k^(0):

(i) If the initial value satisfies (2.17), then α_k is successively increased by a factor ε_1 > 1 until condition (2.17) is violated at, say, α_k^(j+1).

(ii) If the initial value does not satisfy (2.18), then α_k is successively decreased by the factor ε_1 > 1 until α_k^(j) = α_k^(j−1)/ε_1 fulfills (2.18).

(iii) Finally assign the determined α_k = α_k^(j) as step length for the line search algorithm.

Fig. 2.3: Illustration of the Armijo conditions (2.17), (2.18) (panel (a)) and the Wolfe conditions (2.19) (panel (b)). Admissible areas are marked by the double arrows.

Wolfe conditions — A slight modification of the Armijo conditions leads to the so-called Wolfe conditions. Besides (2.17), a curvature condition different from (2.18) is introduced to exclude unacceptably small values of α_k, i.e.

g′(α_k) ≥ ε_2 g′(0)

or equivalently

s_k^T (∇f)(x_k + α_k s_k) ≥ ε_2 s_k^T (∇f)(x_k)

for some constant ε_2 ∈ (ε_1, 1). This condition ensures that the slope of g(·) at α_k exceeds ε_2 times the initial slope at α_k = 0. Figure 2.3(b) provides a graphical illustration and confirms that this selection is useful: if the slope g′(α_k) = s_k^T (∇f)(x_k + α_k s_k) is strongly negative, then f can be further reduced by moving along the search direction s_k with a larger α_k. If, on the other hand, g′(α_k) is only slightly negative or even positive, then one can in general no longer expect that f can be further reduced in this search direction, so that the line search can be terminated with this s_k.

In summary, the two introduced sufficient decrease conditions are known as the Wolfe conditions and read

f(x_k + α_k s_k) ≤ f(x_k) + ε_1 α_k s_k^T (∇f)(x_k) (2.19a)
s_k^T (∇f)(x_k + α_k s_k) ≥ ε_2 s_k^T (∇f)(x_k) (2.19b)

for constants ε_1 ∈ (0, 1) and ε_2 ∈ (ε_1, 1). Typical values are ε_2 = 0.9 when the search direction s_k is determined by a Newton or quasi-Newton method and ε_2 = 0.1 if a nonlinear conjugate gradient method is chosen to obtain s_k.

The so–called strong Wolfe conditions are obtained by modifying the curvature condition, i.e.

f(x_k + α_k s_k) ≤ f(x_k) + ε_1 α_k s_k^T (∇f)(x_k) (2.20a)
|s_k^T (∇f)(x_k + α_k s_k)| ≤ ε_2 |s_k^T (∇f)(x_k)| (2.20b)

for constants ε_1 ∈ (0, 1) and ε_2 ∈ (ε_1, 1). This more restrictive formulation enforces that α_k attains a value such that x_{k+1} = x_k + α_k s_k lies in (at least) a large neighborhood of a local minimizer or stationary point.


Remark 2.2. It can be shown, under the assumption of continuous differentiability of f(x), that there always exist step lengths α_k satisfying the Wolfe and the strong Wolfe conditions. For further details the reader is referred to, e.g., [9].

Goldstein conditions — The so-called Goldstein conditions are rather similar to the Wolfe conditions and read

f(x_k) + (1 − ε)α_k s_k^T (∇f)(x_k) ≤ f(x_k + α_k s_k) ≤ f(x_k) + ε α_k s_k^T (∇f)(x_k) (2.21)

for a constant ε ∈ (0, 1/2). The Goldstein conditions are often used in Newton-type methods but, compared to the Wolfe conditions, have the disadvantage that the first inequality may exclude all minimizers of g(α_k).

Backtracking — As argued above, the decrease condition (2.19a) alone is not sufficient to guarantee that the algorithm makes reasonable progress in the considered search direction. Nevertheless, if the candidate step lengths are chosen appropriately using a so-called backtracking approach, then the curvature condition (2.19b) can be neglected and (2.19a) alone may be used to terminate the line search procedure. The most basic form of this technique is summarized in Algorithm 2.

Algorithm 2: Backtracking algorithm.

input: α_k^(0) > 0 (starting value), ρ ∈ (0, 1) (backtracking parameter), ε_0 ∈ (0, 1) (descent parameter)
initialize: α_k = α_k^(0)
repeat
    α_k ← ρ α_k
until f(x_k + α_k s_k) ≤ f(x_k) + ε_0 α_k s_k^T (∇f)(x_k);

The initial step α_k^(0) is chosen to be 1 in Newton and quasi-Newton methods but can have different values in other algorithms such as steepest descent or conjugate gradient. On the one hand, the backtracking algorithm ensures that α_k becomes sufficiently small within a finite number of trials so that the decrease condition (2.19a) is fulfilled. On the other hand, α_k will not become so small as to prevent progress of the algorithm, due to the successive reduction by ρ ∈ (0, 1). Applications illustrate that backtracking is well suited for Newton's method but less appropriate for quasi-Newton and conjugate gradient methods.

Nested intervals — A less heuristic technique for the determination of the step length α_k minimizing (2.16) is provided by nested intervals. The underlying idea is illustrated in Figure 2.4. For this, it is assumed that g(α_k) is unimodal¹ on an interval α_k ∈ [l_0, r_0], so that g(α_k) has a unique local minimum in the open interval (l_0, r_0). To determine the interval [l_0, r_0], start from a sufficiently small l_0 and increase the value of the right interval boundary r until g(r) starts increasing for some r = r_0.

Interval nesting is an iterative procedure that successively shrinks the interval [l_j, r_j] containing the local minimum of g(α_k) as j increases. Consider the j-th iteration step. Based on l_j and r_j, new interval boundaries l_j^+, r_j^+ with l_j < l_j^+ < r_j^+ < r_j are computed using

l_j^+ = l_j + (1 − ε)(r_j − l_j) (2.22a)
r_j^+ = l_j + ε(r_j − l_j) (2.22b)

with the parameter ε ∈ (1/2, 1). The remaining procedure is based on the following lemma.

1 The function f(x) is called unimodal for x ∈ X if it has a unique local minimum in X.


Fig. 2.4: Example of nested intervals: (a) step j; (b) step j + 1.

Lemma 2.1. Let l_j < l_j^+ < r_j^+ < r_j and let g(α_k) be a unimodal function on the interval [l_j, r_j]. Let α_k^* denote the local minimizer of g(α_k) in (l_j, r_j). Then α_k^* ∈ [l_j, r_j^+] if g(l_j^+) ≤ g(r_j^+), and α_k^* ∈ [l_j^+, r_j] if g(l_j^+) ≥ g(r_j^+).

Proof. Consider the case g(l_j^+) ≤ g(r_j^+). We follow a contradiction argument assuming that the local minimizer satisfies α_k^* > r_j^+, which implies l_j^+ < α_k^*. Since g(l_j^+) ≤ g(r_j^+) there exists a point α_k^0 ∈ (l_j^+, α_k^*) such that g(α_k^0) = max_{α_k∈[l_j^+, α_k^*]} g(α_k). Hence α_k^0 is a local maximizer in the interval [l_j, r_j], which contradicts the assumption that g(α_k) is unimodal on [l_j, r_j]. The case g(l_j^+) ≥ g(r_j^+) follows analogously. □

Lemma 2.1 implies that r_j^+ is dropped in iteration step j + 1 if g(l_j^+) ≤ g(r_j^+), so that the new interval [l_{j+1}, r_{j+1}] is given by l_{j+1} = l_j and r_{j+1} = r_j^+. This case is shown in Figure 2.4. If g(l_j^+) ≥ g(r_j^+), then the new interval [l_{j+1}, r_{j+1}] is obtained as l_{j+1} = l_j^+ and r_{j+1} = r_j. For the scenario of Figure 2.4, evaluating (2.22) at j + 1 yields

l_{j+1}^+ = l_j + ε(1 − ε)(r_j − l_j),   r_{j+1}^+ = l_j + ε²(r_j − l_j). (2.23)

By imposing the constraint ε² = 1 − ε, i.e.

ε = (√5 − 1)/2 ≈ 0.618, (2.24)

the equality r_{j+1}^+ = l_j^+ is obtained, so that in each iteration only one new boundary has to be computed. Note that the reciprocal 1/ε ≈ 1.618 is also known as the golden ratio. If g(l_j^+) ≥ g(r_j^+), then (2.24) similarly ensures l_{j+1}^+ = r_j^+, again reducing the number of computational steps.

The local minimizer α_k^* is finally obtained by averaging the final iteration results, i.e. α_k^* = (l_j + r_j)/2, or by quadratic interpolation (see below) using the three smallest of the four values of g at l_j, r_j, l_j^+, and r_j^+. The method of nested intervals is an easily implementable and numerically robust procedure to compute α_k, at the cost of a typically larger number of iteration steps.

Quadratic interpolation — A very efficient method to solve the minimization problem (2.16) is given by quadratic interpolation. For this, choose three pairwise distinct values α_k^1, α_k^2 and α_k^3 and evaluate g_j = g(α_k^j). The quadratic interpolation function passing through these three points is given by


q(α_k) = Σ_{j=1}^{3} g_j ∏_{i≠j}(α_k − α_k^i) / ∏_{i≠j}(α_k^j − α_k^i). (2.25)

The minimizer α_k^* of q(α_k) follows as

α_k^* = ½ [g_1((α_k^2)² − (α_k^3)²) + g_2((α_k^3)² − (α_k^1)²) + g_3((α_k^1)² − (α_k^2)²)] / [g_1(α_k^2 − α_k^3) + g_2(α_k^3 − α_k^1) + g_3(α_k^1 − α_k^2)]. (2.26)

2.2.2.2 Determination of the search direction

The convergence of line search methods not only depends on the selection of the step length α_k but also on the chosen search direction s_k, which has to be a descent direction such that s_k^T (∇f)(x_k) < 0. In the following, different approaches for the proper choice of s_k are presented together with the resulting convergence rates.

Steepest descent or gradient method — As shown in Proposition 2.1, the search direction

s_k = −(∇f)(x_k) (2.27)

is the direction of steepest descent, i.e. among all directions at x_k it is the direction along which f(x) decreases most rapidly. For the convergence analysis of the steepest descent method

x_{k+1} = x_k − α_k (∇f)(x_k) (2.28)

with (2.27), consider first the quadratic minimization problem

min_{x∈R^n} f(x) = ½ x^T P x − b^T x (2.29)

for P symmetric and positive definite. It was shown in Example 1.6 that f(x) is strictly convex, since (∇²f)(x) = P is positive definite, so that Property (iv) of convex functions applies. Taking into account Theorems 2.2 and 2.4, it follows from (∇f)(x) = Px − b = 0 that x* = P^{-1}b is the global minimizer of (2.29).

Given (2.29), the method of steepest descent (2.28) evaluates to

x_{k+1} = x_k − α_k (Px_k − b). (2.30)

The minimizer of (2.16), i.e. min_{α_k>0} g(α_k) = f(x_k + α_k s_k), and hence the optimal step length α_k can be computed explicitly. With s_k = −(∇f)(x_k) = −(Px_k − b),

f(x_k + α_k s_k) = ½ (x_k − α_k (Px_k − b))^T P (x_k − α_k (Px_k − b)) − b^T (x_k − α_k (Px_k − b)),

and setting the derivative of f(x_k + α_k s_k) with respect to α_k to zero yields

α_k = [(∇f)^T(x_k)(∇f)(x_k)] / [(∇f)^T(x_k) P (∇f)(x_k)]. (2.31)

Exercise 2.2. Verify (2.31).

With α_k as above, the steepest descent method for the quadratic minimization problem reads

x_{k+1} = x_k − [(∇f)^T(x_k)(∇f)(x_k)] / [(∇f)^T(x_k) P (∇f)(x_k)] · (∇f)(x_k). (2.32)


For the convergence analysis, introduce a suitably weighted norm by defining ‖x‖_P² = x^T P x. With x* = P^{-1}b this in particular implies

½ ‖x − x*‖_P² = f(x) − f(x*). (2.33)

The introduced norm is hence a measure of the difference between the current value of the objective function and its minimal value.

Exercise 2.3. Verify (2.33).

Consider the weighted distance between x_{k+1} defined in (2.32) and the minimizer, i.e. ‖x_{k+1} − x*‖_P, which evaluates to

‖x_{k+1} − x*‖_P² = ‖x_k − [(∇f)^T(x_k)(∇f)(x_k)]/[(∇f)^T(x_k)P(∇f)(x_k)] · (∇f)(x_k) − x*‖_P²
  = ‖x_k − x*‖_P² − [(∇f)^T(x_k)(∇f)(x_k)]² / [(∇f)^T(x_k)P(∇f)(x_k)]
  = (1 − [(∇f)^T(x_k)(∇f)(x_k)]² / ([(∇f)^T(x_k)P(∇f)(x_k)] ‖x_k − x*‖_P²)) ‖x_k − x*‖_P²
  = (1 − [(∇f)^T(x_k)(∇f)(x_k)]² / ([(∇f)^T(x_k)P(∇f)(x_k)][(∇f)^T(x_k)P^{-1}(∇f)(x_k)])) ‖x_k − x*‖_P² =: (★) ‖x_k − x*‖_P². (2.34)

Herein, (∇f)(x_k) = P(x_k − x*) is used, which follows from x* = P^{-1}b and hence b = Px*. The factor (★) describes the decrease in each iteration step, so that the convergence properties of the steepest descent method can be deduced from this expression. For its interpretation, Kantorovich's inequality is used.

Lemma 2.2 (Kantorovich's inequality). Let P ∈ R^{n×n} be a symmetric positive definite matrix. For every x ∈ R^n the inequality

(x^T x)² / [(x^T P x)(x^T P^{-1} x)] ≥ 4 λ_min λ_max / (λ_min + λ_max)² (2.35)

holds, with λ_min and λ_max referring to the smallest and largest eigenvalue of P.

Note that the eigenvalues of a symmetric and positive definite matrix are real and positive.

Exercise 2.4. Prove Lemma 2.2.

These preliminaries allow one to conclude the following theorem [7].

Theorem 2.5 (Convergence of steepest descent for a quadratic objective function). For any initial value x_0 ∈ R^n the steepest descent method (2.32) converges linearly to the global minimum of the strictly convex objective function (2.29), with the error norm satisfying

‖x_{k+1} − x*‖_P² ≤ ((κ − 1)/(κ + 1))² ‖x_k − x*‖_P² (2.36)

with κ = λmax/λmin the spectral condition number of P .

Proof. The result is a direct consequence of (2.35) applied to (2.34), i.e.

‖x_{k+1} − x*‖_P² / ‖x_k − x*‖_P² = 1 − [(∇f)^T(x_k)(∇f)(x_k)]² / ([(∇f)^T(x_k)P(∇f)(x_k)][(∇f)^T(x_k)P^{-1}(∇f)(x_k)]) ≤ 1 − 4 λ_min λ_max/(λ_min + λ_max)²

with λ_min and λ_max the smallest and largest eigenvalue of P. Hence,

‖x_{k+1} − x*‖_P² / ‖x_k − x*‖_P² ≤ (λ_max − λ_min)²/(λ_min + λ_max)² = ((κ − 1)/(κ + 1))²,

which equals (2.36). □

This result admits a geometric interpretation. At first, it is obvious that convergence is achieved in a single step if κ = 1, i.e. if all eigenvalues λ_j = λ of P coincide so that P = λE. In this case the contours of the objective function f(x) = ½x^T Px − b^T x are circles and the steepest descent direction always points at the global minimizer. This case is visualized in Figure 2.5(a). If κ increases, then the contours approach elongated ellipsoids and convergence degrades due to a zigzagging behavior of the line search algorithm with steepest descent, as shown in Figure 2.5(b). Note that the zigzagging becomes more pronounced as the spectral condition number κ grows.

Fig. 2.5: Line search with steepest descent for a quadratic strictly convex objective function: (a) ideal conditioning with κ = 1; (b) conditioning with κ ≫ 1.

The rate of convergence remains in principle unchanged if the minimization problem (2.1) is considered with a general objective function f(x) [9].

Theorem 2.6 (Convergence of steepest descent for a general objective function). Let f(x) ∈ C²(R^n) and let x* denote a local minimizer of (2.1). Moreover, assume that (∇²f)(x*) is positive definite and let λ_min and λ_max denote its smallest and largest (positive real) eigenvalue. Assume that the sequence of iterates (x_k)_{k∈N_0} generated by the steepest descent method

x_{k+1} = x_k − α_k (∇f)(x_k)

converges to the local minimizer x* for suitable step lengths α_k. Then the sequence (f(x_k))_{k∈N_0} converges linearly to f(x*), with a convergence ratio that can be bounded by any constant exceeding (κ − 1)²/(κ + 1)², where κ = λ_max/λ_min is the spectral condition number of the Hessian matrix.

For poorly conditioned problems with large κ, an appropriate scaling might be used to improve the iterations. This approach exploits the fact that the determination of the minimum of the objective function f(x) is equivalent to the determination of the minimum of the objective function g(z) = f(Vz) with x = Vz and V regular. With this, the minimizer x* is mapped according to z* = V^{-1}x*. Hence, in the new variable z the gradient and Hessian of g(z) are related to those of f(x) by


(∇g)(z) = V^T (∇f)(Vz),   (∇²g)(z) = V^T (∇²f)(Vz) V, (2.37)

which in particular implies (∇g)(z*) = V^T (∇f)(x*) and (∇²g)(z*) = V^T (∇²f)(x*) V. The proper selection of the transformation matrix V may lead to an improvement of the spectral condition number of the Hessian matrix (∇²g)(z) compared to (∇²f)(x). Nevertheless, these so-called pre-conditioning techniques should be applied only with caution, as is remarked, e.g., in [1, p. 34f].

Pros and cons of line search with the steepest descent or gradient method can be summarized as follows:

(+) Simple with low computational burden, since the explicit evaluation of the Hessian matrix (∇²f)(x_k) is not needed;
(+) Convergence can be achieved also for starting values x_0 not close to the local minimizer x*;
(−) Slow convergence depending on the conditioning (and scaling);
(−) Linear convergence only.

Conjugate gradient method — The conjugate gradient (CG) method aims at combining fast convergence (as in Newton's method below) with the low computational burden of the steepest descent method. Herein, information from the present and the previous iteration is used to appropriately determine the search direction, i.e.

s_k = −(∇f)(x_k) + β_k s_{k−1},  k ≥ 1,
s_0 = −(∇f)(x_0). (2.38)

Different formulas exist for the determination of the parameter β_k. One version is given by the Fletcher–Reeves formula,

β_k^{FR} = [(∇f)^T(x_k)(∇f)(x_k)] / [(∇f)^T(x_{k−1})(∇f)(x_{k−1})]. (2.39)

Moreover, the Polak–Ribière formula should be mentioned in this context, where

β_k^{PR} = [(∇f)^T(x_k)((∇f)(x_k) − (∇f)(x_{k−1}))] / [(∇f)^T(x_{k−1})(∇f)(x_{k−1})]. (2.40)

While the convergence properties of CG methods are well understood for linear and quadratic problems, surprising convergence properties can be observed in the general nonlinear setting, as is, e.g., pointed out in [9]. The reader is referred to this reference or [1] for further details and analysis.

Newton’s method Newton’s iterative method is based on the analysis of f(xk+1) for xk+1 = xk + sk,i.e. unit step length αk = 1. Evaluation of the Taylor series at xk neglecting terms of order 3 and largeryields

f(xk+1) = f(xk) + sTk (∇f)(xk) +1

2sTk (∇2f)(xk)sk. (2.41)

The search direction, also called the Newton direction, is obtained by minimizing the right-hand side of (2.41) with respect to s_k. Taking into account Theorem 2.1 and noting that the right-hand side is a quadratic form in s_k implies

(∇_{s_k} f)(x_{k+1}) = (∇f)(x_k) + (∇²f)(x_k) s_k = 0,

so that

s_k = −(∇²f)^{-1}(x_k)(∇f)(x_k). (2.42)


Hence, Newton’s method can be interpreted as minimizing the quadratic function approximation of theobjective function f(x). For xk in a sufficiently small neighborhood of a strict local minimizer x∗ itfollows from Theorem 2.3 that the Hessian matrix (∇2f)(xk) is positive definite and hence invertible. Inthis case, Newton’s method is well–defined and (2.42) defines a descent direction.

Theorem 2.7 (Convergence of Newton's method). Let f ∈ C²(R^n) and let (∇²f)(x) be locally Lipschitz continuous in a neighborhood of a point x* at which the second order sufficient optimality conditions (2.10) are satisfied. If the starting point x_0 is sufficiently close to the minimizer x*, then the Newton iteration

x_{k+1} = x_k − (∇²f)^{-1}(x_k)(∇f)(x_k) (2.43)

converges to x* with an order of convergence p of at least 2. In addition, the sequence of gradient norms (‖(∇f)(x_k)‖)_{k∈N_0} converges quadratically to zero.

The proof of this theorem is omitted here but can be found, e.g., in [9, Chap. 3.3].

Remark 2.3. Let f: R^n → R^m satisfy the inequality

‖f(x_1) − f(x_2)‖ ≤ L‖x_1 − x_2‖,  L ∈ (0, ∞) (2.44)

for all x_1, x_2 ∈ B_r(y) = {x ∈ R^n : ‖x − y‖ ≤ r}. Then f(x) is called locally Lipschitz continuous on B_r(y) ⊂ R^n. If the inequality holds for all x_1, x_2 ∈ R^n, then f(x) is called globally Lipschitz continuous. Note that if f(x) and (∇f)(x) are continuous on B_r(y) ⊂ R^n, then f(x) is locally Lipschitz continuous there. In view of Theorem 2.7, the local Lipschitz continuity of (∇²f)(x) in a neighborhood of x* is in particular guaranteed if f(x) ∈ C³(R^n).

For the practical implementation, typically a step length α_k ≤ 1 is introduced so that (2.43) is replaced by

x_{k+1} = x_k − α_k (∇²f)^{-1}(x_k)(∇f)(x_k). (2.45)

Herein α_k is also referred to as the damping coefficient, and the damped Newton method is often called the Newton–Raphson method. Strategies for the suitable determination of α_k are discussed in Section 2.2.2.1 above.

It is crucial to observe that the positive definiteness of the Hessian matrix (∇²f)(x_k) might be lost if x_k is not sufficiently close to x*. In this case, s_k defined in (2.42) is no longer a descent direction and (∇²f)(x_k) is not necessarily invertible. To address this issue, the search direction is modified so that the iteration rule reads

x_{k+1} = x_k − α_k N_k^{-1}(∇f)(x_k),  N_k = (∇²f)(x_k) + ε_k E (2.46)

with the unit matrix E ∈ R^{n×n} and a suitable ε_k ≥ 0. For ε_k = 0 Newton's method is recovered, while for large ε_k the iteration (2.46) approaches the method of steepest descent. The proper selection of ε_k is not trivial. One typically begins with a starting value and successively increases ε_k until N_k is positive definite. According to Theorem 1.2, definiteness can be checked, e.g., by computing the eigenvalues of N_k. Numerically more efficient techniques such as the Cholesky factorization can be used: the matrix is positive definite if and only if it can be factorized into N_k = D_k D_k^T with D_k a lower triangular matrix with strictly positive entries on its diagonal [13].

Exercise 2.5. Verify that line search with Newton's method converges in a single step, independent of the starting point x_0, for the quadratic minimization problem

min_{x∈R^n} f(x) = ½ x^T P x − b^T x

with P positive definite.


Pros and cons of line search with Newton's method can be summarized as follows:

(+) Quadratic convergence if the Hessian matrix (∇²f)(x_k) is positive definite;
(−) Loss of positive definiteness of the Hessian matrix (∇²f)(x_k) if x_k is not in a sufficiently small neighborhood of the minimizer x*;
(−) Requires the evaluation of the Hessian matrix (∇²f)(x_k) and the computation of its inverse (not explicitly, but by solving a linear system of equations at each x_k).

Quasi-Newton methods — In quasi-Newton methods the evaluation and, in particular, the inversion of the Hessian matrix (∇²f)(x_k) is replaced by an iterative procedure, which makes the approach suitable also for medium and large scale problems with n ≫ 1. The underlying idea makes use of (2.41), i.e. a quadratic model of the objective function given by

f(x_{k+1}) = f(x_k) + s_k^T (∇f)(x_k) + ½ s_k^T B_k s_k, (2.47)

with the difference that (∇²f)(x_k) is replaced by the (n × n)-matrix B_k, which is assumed symmetric and positive definite. Proceeding as in Newton's method, the search direction is chosen as

s_k = −B_k^{-1}(∇f)(x_k) (2.48)

and minimizes the quadratic (convex) approximation (2.47). With this, the next iterate is

x_{k+1} = x_k − α_k B_k^{-1}(∇f)(x_k) (2.49)

with the step length chosen to satisfy the Wolfe conditions (2.19). The crucial point is now to determine B_k from the knowledge of B_{k−1}, (∇f)(x_{k−1}) and (∇f)(x_k). For this, let f(x) ∈ C²(R^n) and recall the integral mean value theorem (1.26), which implies

(∇f)(x_{k+1}) − (∇f)(x_k) = ∫₀¹ (∇²f)(x_k + r(x_{k+1} − x_k))(x_{k+1} − x_k) dr ≈ (∇²f)(x_{k+1})(x_{k+1} − x_k).

In view of the approximation of the Hessian matrix (∇²f)(x_{k+1}) by B_{k+1}, this furthermore motivates

(∇f)(x_{k+1}) − (∇f)(x_k) = B_{k+1}(x_{k+1} − x_k).

From a numerical point of view it is advantageous to select the approximation of the Hessian matrix so that rank(B_{k+1} − B_k) is small [3]. Quasi-Newton methods can hence be characterized by the following three properties,

B_k(x_{k+1} − x_k) = −(∇f)(x_k) (2.50a)
B_{k+1}(x_{k+1} − x_k) = (∇f)(x_{k+1}) − (∇f)(x_k) (2.50b)
B_{k+1} = B_k + ΔB_k,  rank ΔB_k = m ≥ 1 (2.50c)

for k ∈ N ∪ {0}. Equation (2.50b) is also known as the secant condition. The idea behind (2.50c) is to minimize the distance between B_{k+1} and B_k in some suitable norm. Typically m = 1 or m = 2 is chosen, leading to so-called rank 1 and rank 2 corrections ΔB_k. Introducing

p_k = x_{k+1} − x_k   and   q_k = (∇f)(x_{k+1}) − (∇f)(x_k),

properties (2.50) imply the frequently used relations

B_{k+1} p_k = q_k (2.51a)
(B_{k+1} − B_k) p_k = (∇f)(x_{k+1}) (2.51b)
(∇f)(x_{k+1}) = q_k − B_k p_k. (2.51c)


Since B_k is assumed positive definite and as such invertible for any k = 0, 1, ..., it is reasonable to impose that the rank 1 perturbation ΔB_k does not interfere with this assumption (for a detailed discussion of this topic the reader is referred to the analysis of matrix perturbations). Hence, instead of determining B_{k+1} = B_k + ΔB_k we seek H_{k+1} = H_k + ΔH_k approximating the inverse of B_{k+1}.

A straightforward rank 1 correction is obtained using ΔH_k = γ_k z_k z_k^T, since the dyadic product z_k z_k^T is at most of rank 1. Substitution into (2.51a) results in

p_k = H_{k+1} q_k = H_k q_k + γ_k z_k z_k^T q_k. (2.52)

From this one obtains

(p_k − H_k q_k)(p_k − H_k q_k)^T = γ_k² z_k z_k^T q_k q_k^T z_k z_k^T = γ_k² (z_k^T q_k)² z_k z_k^T = γ_k (z_k^T q_k)² ΔH_k.

Solving for ΔH_k hence yields

H_{k+1} = H_k + (p_k − H_k q_k)(p_k − H_k q_k)^T / (γ_k (z_k^T q_k)²). (2.53)

This expression can be further simplified by taking the scalar product of (2.52) with q_k^T, i.e.

q_k^T p_k = q_k^T H_k q_k + γ_k q_k^T z_k z_k^T q_k = q_k^T H_k q_k + γ_k (z_k^T q_k)².

Solving for the latter term and substituting into (2.53) results in the so-called good Broyden method

H_{k+1} = H_k + (p_k − H_k q_k)(p_k − H_k q_k)^T / (q_k^T (p_k − H_k q_k)). (2.54)

Various convergence results are available for Broyden's method, proving superlinear convergence under certain conditions. For details, the reader is referred to, e.g., [3, 5, 6].

The main problem with (2.54) is that positive definiteness of H_{k+1} is only preserved if q_k^T(p_k − H_k q_k) > 0. One of the most elegant techniques to ensure this property is provided by the Davidon–Fletcher–Powell (DFP) method. The technique is summarized in Algorithm 3 below and essentially relies on the initialization of the algorithm with a positive definite matrix H_0.

Algorithm 3: Quasi-Newton method with DFP update.

input: H_0 (symmetric, positive definite matrix), x_0 (starting value), ε_x, ε_f (stopping criteria)
initialize: k = 0
repeat
    Compute search direction s_k = −H_k(∇f)(x_k)
    Apply line search to solve min_{α_k} f(x_k + α_k s_k) (taking into account the Wolfe conditions (2.19))
    Compute x_{k+1} = x_k + α_k s_k, p_k = x_{k+1} − x_k and q_k = (∇f)(x_{k+1}) − (∇f)(x_k)
    Update using
        H_{k+1} = H_k + p_k p_k^T/(p_k^T q_k) − H_k q_k q_k^T H_k/(q_k^T H_k q_k) (2.55)
    Update k = k + 1
until ‖x_{k+1} − x_k‖ ≤ ε_x ∨ |f(x_{k+1}) − f(x_k)| ≤ ε_f;

It can be shown that H_k remains positive definite as long as H_0 is positive definite and the condition q_k^T p_k > 0 is satisfied. Since the approximation of the inverse Hessian matrix is corrected in each step by two rank 1 matrices, the DFP update is also referred to as a rank 2 correction.


An alternative to the DFP update is given by the so-called Broyden–Fletcher–Goldfarb–Shanno (BFGS) method. Herein, the iterative determination of the inverse Hessian matrix in Algorithm 3 is replaced by

H_{k+1} = (E − p_k q_k^T/(q_k^T p_k)) H_k (E − q_k p_k^T/(q_k^T p_k)) + p_k p_k^T/(q_k^T p_k). (2.56)

In general, superlinear convergence is achieved by quasi-Newton methods involving the DFP or the BFGS update laws. While convergence of Newton's method is faster, its cost per iteration is higher due to the need for second order derivatives and the inversion of the Hessian matrix. For further analysis and information regarding the implementation of quasi-Newton methods the reader is referred to, e.g., [9].

2.2.3 Trust–region methods

Trust-region methods are somewhat similar to line search methods in the sense that both generate steps based on a quadratic model of the objective function. They differ, however, in the way the model is exploited. While line search methods rely on the determination of a search (descent) direction and a suitable step length to move along this direction, trust-region methods define a region around the current iterate in which the quadratic model is trusted to be an adequate approximation of the objective function, i.e.

m(s_k) = f(x_k) + s_k^T (∇f)(x_k) + ½ s_k^T B_k s_k ≈ f(x_k + s_k) (2.57)

with B_k an appropriate symmetric and uniformly bounded matrix. Application of Taylor's formula (1.27) reveals that the approximation error is of the order ‖s_k‖², or even ‖s_k‖³ if B_k = (∇²f)(x_k). The trust region around the iterate x_k, subsequently characterized by the parameter Δ_k, can be interpreted as the region in which f(x_k + s_k) is supposed to be represented sufficiently accurately by m(s_k). In trust-region methods, the minimization problem

min_{s_k∈R^n} m(s_k) = f(x_k) + s_k^T (∇f)(x_k) + ½ s_k^T B_k s_k
s.t. ‖s_k‖ ≤ Δ_k (2.58)

is solved in each iteration k for a suitable trust-region radius Δ_k. The solution s_k^* of (2.58) is hence the minimizer of m(s_k) in the ball of radius Δ_k. Contrary to line search, both search direction and step length are determined simultaneously.

The proper choice of the degree of freedom Δ_k is crucial in a trust-region method. For this, the agreement between the model function m(s_k) and the objective function f at the current iteration is considered in terms of the ratio

ρ(s_k) = [f(x_k) − f(x_k + s_k)] / [m(0) − m(s_k)]. (2.59)

Herein, the numerator is called the actual reduction and the denominator the predicted reduction. Note that the predicted reduction is always nonnegative, since s_k minimizes m(s_k) inside the trust region, which includes s_k = 0. As a result, if ρ(s_k) < 0, then the new value f(x_k + s_k) of the objective function is greater than the current value f(x_k), so that the step must be rejected and the trust region must be shrunk. For ρ(s_k) ≈ 1 the agreement between model and objective function is good, so that the trust region may be expanded in the next iteration. If 0 < ρ(s_k) ≪ 1, then the trust region is shrunk in the next iteration by reducing Δ_k.


Algorithm 4: Trust-region method.

input: Δ > 0 (overall bound on the radius), Δ_0 ∈ (0, Δ) (starting trust-region radius), η ∈ [0, 1/4), ε_x, ε_f (stopping criteria)
initialize: k = 0
repeat
    Determine s_k by (approximately) solving (2.58)
    Evaluate ρ(s_k) from (2.59)
    if ρ(s_k) < 1/4 then
        Δ_{k+1} = Δ_k/4
    else if ρ(s_k) > 3/4 and ‖s_k‖ = Δ_k then
        Δ_{k+1} = min{2Δ_k, Δ}
    else
        Δ_{k+1} = Δ_k
    end
    if ρ(s_k) > η then
        x_{k+1} = x_k + s_k (next iterate)
        B_{k+1} = B_k + . . . (update Hessian matrix)
    else
        x_{k+1} = x_k (repeat iteration with Δ_{k+1} < Δ_k)
    end
    k = k + 1
until ‖x_{k+1} − x_k‖ ≤ ε_x ∨ |f(x_{k+1}) − f(x_k)| ≤ ε_f;

The principal process is summarized in Algorithm 4 [9]. Thereby, Δ refers to the overall bound on the trust-region radius. The radius is increased only if ‖s_k‖ reaches the boundary of the trust region, i.e. when ‖s_k‖ = Δ_k. For the implementation of trust-region methods and the update of the Hessian matrix B_{k+1} the reader is referred to [9, 1].

2.2.4 Direct search methods

Direct (derivative-free) methods are characterized by the fact that no explicit knowledge of the gradient or the Hessian matrix of the objective function f(x) is needed to compute the minimum. Herein, a series of function values is computed for a set of samples to determine the subsequent iteration point. One of the most famous methods in this context is the so-called simplex method of Nelder and Mead [8].

In the case of two decision variables x ∈ R², a simplex is a triangle and the method makes use of the comparison of function values at the triangle's three vertices. The worst vertex, characterized by the largest value of the objective function f(x), is rejected and replaced with a new vertex. With this, a new triangle is formed to continue the search. In the course of the process a sequence of triangles, in general of different shape, is generated with decreasing function values at the vertices. Since the size of the triangles is reduced in each step, the coordinates of the minimizer can be located.

Remark 2.4. The simplex algorithm of Nelder and Mead should not be confused with the conceptually different simplex method introduced by G.B. Dantzig in linear programming [4].

In the n-dimensional setting a simplex is the convex hull² (cf. Definition 1.7) spanned by n + 1 points x_{k,j}, j = 0, ..., n in the k-th iteration. Denote by x_{k,min} and x_{k,max} those points x_{k,j}, j = 0, ..., n, at which the objective function attains its minimum and maximum, respectively, i.e.

f(x_{k,min}) = min_{j=0,...,n} f(x_{k,j}),   f(x_{k,max}) = max_{j=0,...,n} f(x_{k,j}). (2.60)

2 It reduces to a straight line segment if n = 1, a triangle for n = 2, a tetrahedron for n = 3, etc.


Fig. 2.6: Operations involved in the simplex algorithm of Nelder and Mead: (a) reflection; (b) expansion; (c) outer contraction; (d) inner contraction; (e) shrinkage.

The centroid x̄_k of the simplex (excluding the worst vertex) is defined by

x̄_k = (1/n)(Σ_{j=0}^{n} x_{k,j} − x_{k,max}). (2.61)

The algorithm replaces the point x_{k,max} in the simplex by another point with a lower value of the objective function. In particular, x_{k,max} is replaced by a new point on the line

x_k^{ref} = x̄_k + α(x̄_k − x_{k,max}) (2.62)

depending on α. For this, various operations on the simplex are defined, which are summarized in Figure 2.6. During the iteration the simplex moves in the direction of the minimizer and is thereby successively contracted. Algorithm 5 summarizes the general procedure. Implementations of the Nelder–Mead simplex algorithm are available, e.g., in Matlab and Octave in terms of the function fminsearch.

Convergence of the simplex algorithm of Nelder and Mead cannot be guaranteed in general, and the algorithm might even approach a non-minimizer. In practical applications, however, the simplex algorithm yields good results at the cost of rather slow convergence.


Algorithm 5: Simplex algorithm of Nelder and Mead.

input: x_{0,j}, j = 0, ..., n (initial simplex), α_ref > 0 (reflection coefficient [α_ref = 1]), α_exp > 0 (expansion coefficient [α_exp = 1]), α_con ∈ (0, 1) (contraction coefficient [α_con = 1/2]), ε_x, ε_f (stopping criteria)
initialize: k = 0
repeat
    Compute x_{k,min}, x_{k,max}
    Compute centroid x̄_k
    Reflection step x_{k,ref} = x̄_k + α_ref(x̄_k − x_{k,max})
    if f(x_{k,ref}) < f(x_{k,min}) then
        Expansion step x_{k,exp} = x_{k,ref} + α_exp(x_{k,ref} − x̄_k)
        if f(x_{k,exp}) < f(x_{k,ref}) then
            x_{k,new} = x_{k,exp}
        else
            x_{k,new} = x_{k,ref}
        end
    else if f(x_{k,ref}) > max_{j=0,...,n, x_{k,j}≠x_{k,max}} f(x_{k,j}) then
        if f(x_{k,max}) ≤ f(x_{k,ref}) then
            Inner contraction x_{k,new} = α_con x_{k,max} + (1 − α_con) x̄_k
        else
            Outer contraction x_{k,new} = α_con x_{k,ref} + (1 − α_con) x̄_k
        end
    else
        Preserve reflection point x_{k,new} = x_{k,ref}
    end
    if f(x_{k,new}) ≥ f(x_{k,max}) then
        Shrinkage step x_{k+1,j} = ½(x_{k,j} + x_{k,min}), j = 0, ..., n
    else
        x_{k,max} = x_{k,new},  x_{k+1,j} = x_{k,j}, j = 0, ..., n
    end
    k = k + 1
until ‖x_{k+1} − x_k‖ ≤ ε_x ∨ |f(x_{k+1}) − f(x_k)| ≤ ε_f;

2.3 Benchmark example

For the evaluation of the different techniques, Rosenbrock's problem is subsequently considered as a benchmark example [10]. Herein, the minimization problem is considered for the objective function

min_{x∈R²} f(x) = 100(x_2 − x_1²)² + (x_1 − 1)². (2.63)

Figure 2.7 shows the profile of f(x) and the corresponding isoclines.

Exercise 2.6. Verify that x* = [1, 1]^T is a local minimizer of (2.63). Analyze whether this minimizer is global and unique. Is f(x) a convex function?

In the following, it is desired to evaluate the properties and convergence behavior of the line search, trust-region and direct search methods introduced in the paragraphs above. For this, the 'Optimization Toolbox' of Matlab provides the two functions

• fminunc, implementing a quasi-Newton line search method as well as a trust-region method;

• fminsearch, implementing the simplex method of Nelder and Mead.


Fig. 2.7: Rosenbrock's banana or valley function: profile and isoclines.

Similarly, the 'Optim Package' of Octave enables, e.g., the use of the functions

• d2_min, implementing Newton's method;

• minimize, implementing Newton's method as well as the BFGS method as an example of quasi-Newton methods;

• fminsearch and nelder_mead_min, implementing the simplex method of Nelder and Mead.

The reader is also referred to the user-supplied function minFunc, which can be obtained from [12] and provides a large selection of line search methods, including those discussed in the previous sections. This function is used subsequently to evaluate the different line search methods for the Rosenbrock problem. Herein, the strong Wolfe conditions (2.20) are used by default for the determination of the step length, provided the user does not manually set a different option.

Item  Method                               Iter.  f(x*)          ‖(∇f)(x*)‖_2   #eval(f)
1     Line search: steepest descent        500    0.24033        0.41951        506
2     Line search: conjugate gradient      28     7.3648e-21     7.7219e-11     65
3     Line search: Newton                  20     3.8289e-16     7.3242e-07     32
4     Line search: quasi-Newton with BFGS  27     9.6395e-15     2.5508e-06     34
5     Trust region                         25     2.1627e-18     2.0607e-08     26
6     Direct method: Nelder-Mead           67     5.3093e-10     0.00103        125

Table 2.1: Comparison of line search, trust-region and direct search methods for the Rosenbrock problem (2.63). Line search methods are evaluated using the function minFunc [12]; fminunc is used for the trust-region approach and fminsearch for the simplex algorithm of Nelder and Mead.

Table 2.1 summarizes the results of a comparison of the different algorithms using the functions minFunc, fminunc and fminsearch. The initial value is always set to x_0 = [−1, −1]^T. The corresponding behavior of the iterates is depicted in Figure 2.8. The weak performance of steepest descent is directly visible: the local minimizer x* = [1, 1]^T is not even closely approached after 500 iterations. This behavior is further illustrated in Figure 2.9, where the progress of the successive iterations is depicted for 2500 iterations. The steepest descent direction is orthogonal to the respective isocline, with the gradient (∇f)(x_k) still attaining a considerable magnitude. In order to investigate this further, it is left to the reader to analyze the convergence rate of the method.

Exercise 2.7. Determine the rate of convergence to the minimizer x* = [1, 1]^T of the sequence of iterates (x_k)_k obtained by the steepest descent method for the Rosenbrock problem.


Fig. 2.8: Comparison of line search, trust-region and direct search methods for the Rosenbrock problem (2.63): (a) steepest descent; (b) conjugate gradient; (c) Newton; (d) quasi-Newton with BFGS; (e) trust region; (f) Nelder–Mead simplex method. Here, × refers to the minimum f(x*) = 0 and + denotes the final value of the individual methods.

Convergence is significantly improved when applying the conjugate gradient method. For Newton's method the iteration at first moves in the direction of [−1, 1]^T, i.e. towards the minimizer of the local quadratic model next to the starting value. This is particularly due to the second order approximation of the objective function that underlies the Newton approach. Due to the explicit utilization of the Hessian matrix, the number of iterates necessary to approach the minimizer is the smallest among the line search methods. The quasi-Newton method taking into account the BFGS formula (2.56) directly moves in the direction of the minimizer x* = [1, 1]^T. The behavior of the trust-region method is almost identical to that of Newton's method, however at the cost of a slightly larger number of iterations. The simplex method of Nelder and Mead similarly approaches the local minimum. Here, a weak zigzagging behavior emerges, which is due to the application of the individual operations on the simplex to achieve the desired shrinking.

Fig. 2.9: Progress of the steepest descent method for Rosenbrock's problem.


The results of Table 2.1 can be generated using the function file below. This assumes that the function minFunc and certain sub-folders of its installation are available on the Matlab or Octave search path.

function x = rosenbrock_optim(x0,method)
%
options = [];
options.Display = 'iter';
if method==1
    % Steepest descent
    options.Method = 'sd';
    options.MaxIter = 1000;
elseif method==2
    % Conjugate gradient
    options.Method = 'cg';
elseif method==3
    % Newton
    options.Method = 'newton';
elseif method==4
    % Quasi-Newton with BFGS
    options.Method = 'lbfgs';
elseif method==5
    % Trust region using fminunc
    options.Method = 'tr';
    opt = optimoptions('fminunc','Display','iter','GradObj','on','Hessian','on');
elseif method==6
    % Nelder-Mead simplex
    options.Method = 'nm';
    opt = optimset('LargeScale','on','Hessian','on','GradObj','on','Display','iter');
end
if method<5,
    [x,f,exitflag,output] = minFunc(@rosenbrock,x0,options);
    [hf,hg] = rosenbrock(x);
elseif method==5,
    [x,f,exitflag,output] = fminunc(@rosenbrock,x0,opt);
    [hf,hg] = rosenbrock(x);
elseif method==6,
    [x,f,exitflag,output] = fminsearch(@rosenbrock,x0,opt);
    [hf,hg] = rosenbrock(x);
end
fprintf(strcat('\n STATISTICS \n Method %s: iterations %1.0f, funcCount ', ...
    '= %1.0f, f = %0.5g, norm(nabla f) = %0.5g\n'), ...
    options.Method,output.iterations,output.funcCount,f,norm(hg,2));

Thereby, the following implementation of Rosenbrock's function (2.63) and the (analytic) computation of its gradient (∇f)(x) and Hessian matrix (∇²f)(x) is used for Matlab and Octave.

function [varargout] = rosenbrock(varargin)
%
if nargin==1
    v = varargin{1};
    x1 = v(1);
    x2 = v(2);
elseif nargin==2
    x1 = varargin{1};
    x2 = varargin{2};
else
    error('Input arguments not properly defined!');
end
%
f = 100.0*(x2-x1.^2).^2+(x1-1.0).^2;
gradf = [-400.0*(x2-x1.^2).*x1+2.0*(x1-1.0);
         200.0*(x2-x1.^2)];
hessf = [2.0+1200.0*x1.^2-400.0*x2, -400.0*x1;
         -400.0*x1, 200.0];
%
if nargout==1
    varargout{1} = f;
elseif nargout==2
    varargout{1} = f;
    varargout{2} = gradf;
elseif nargout==3
    varargout{1} = f;
    varargout{2} = gradf;
    varargout{3} = hessf;
else
    error('Output arguments not properly defined!');
end


Exercise 2.8 (Computer exercise). The general scheme of a line search method is introduced in Algorithm 1.

(i) Proceed along these lines to implement the introduced line search methods steepest descent, conjugate gradient, Newton, and quasi-Newton with the BFGS formula in Matlab or Octave.

(ii) For the determination of the step length make use of the Armijo and Wolfe conditions as well as backtracking. Herein use (or determine) proper values for the coefficients ε_0, ε_1 and ε_2. In addition, utilize quadratic interpolation.

(iii) Prepare your implementation in such a way, e.g., by using global variables or additional return values of functions, that the number of function, gradient and Hessian evaluations can be accessed.

(iv) Apply your realizations to the Rosenbrock problem. Thereby also vary the starting value x_0, and analyze and compare convergence as well as the number of function, gradient and Hessian evaluations.

Exercise 2.9. Use your line search implementations of Exercise 2.8 to determine the global minimum of the so-called Schwefel function, i.e.

min_{x∈X_ad} f(x) = −x_1 sin(√|x_1|) − x_2 sin(√|x_2|) (2.64)

for X_ad = [−500, 500] × [−500, 500]. The global minimum is f(x*) = −837.9658 at x* = [420.9687, 420.9687]^T.

References

1. Bonnans J, Gilbert J, Lemaréchal C, Sagastizábal C (2006) Numerical Optimization, 2nd edn. Springer-Verlag, Berlin, Heidelberg
2. Boyd S, Vandenberghe L (2004) Convex Optimization. Cambridge University Press, Cambridge
3. Broyden C (1967) Quasi-Newton methods and their application to function minimisation. Math Comp 21:368–381
4. Dantzig G (1998) Linear Programming and Extensions. Princeton Landmarks in Mathematics and Physics, Princeton University Press
5. Dennis J, Schnabel R (1998) Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Classics in Applied Mathematics, vol 16. SIAM Publications, Philadelphia (PA)
6. Deuflhard P (2004) Newton Methods for Nonlinear Problems. Springer-Verlag, Berlin, Heidelberg
7. Luenberger D, Ye Y (2008) Linear and Nonlinear Programming, 3rd edn. Springer
8. Nelder J, Mead R (1965) A simplex method for function minimization. The Computer Journal 7:308–313
9. Nocedal J, Wright S (2006) Numerical Optimization, 2nd edn. Springer, New York (NY)
10. Rosenbrock H (1960) An automatic method for finding the greatest or least value of a function. The Computer Journal 3:175–184
11. Schaback R, Wendland H (2005) Numerische Mathematik, 4th edn. Springer-Verlag, Berlin, Heidelberg
12. Schmidt M (2013) minFunc. URL http://www.di.ens.fr/~mschmidt/Software/minFunc.html
13. Schwarz H (1997) Numerische Mathematik. Teubner-Verlag, Stuttgart


Chapter 3

Static constrained optimization

In the following, we focus on static optimization problems in n decision variables x satisfying equality and inequality constraints, i.e.

min_{x∈R^n} f(x) (3.1a)

subject to

g_j(x) = 0, j = 1, ..., p (3.1b)
h_l(x) ≤ 0, l = 1, ..., q. (3.1c)

It is assumed that p ≤ n and that f(x), g_j(x) and h_l(x) are at least continuous but may be required to possess continuous second partial derivatives. Proceeding as in Section 1.1, the introduction of the admissible set

X_ad = {x ∈ R^n : g_j(x) = 0, j = 1, ..., p and h_l(x) ≤ 0, l = 1, ..., q} (3.2)

implies the equivalent problem formulation

min_{x∈X_ad} f(x). (3.3)

The constraints g(x) = 0 and h(x) ≤ 0 are called functional constraints. A point x that satisfies allfunctional constraints is called feasible.

Solutions to the static constrained optimization problem are characterized in the definition below [7].

Definition 3.1. The feasible point x∗ ∈ Xad is a

(i) local solution to (3.3) if there exists a neighborhood U_ε of x* such that f(x) ≥ f(x*) for all x ∈ U_ε ∩ X_ad;

(ii) strict local solution to (3.3) if there exists a neighborhood U_ε of x* such that f(x) > f(x*) for all x ∈ (U_ε \ {x*}) ∩ X_ad;

(iii) isolated local solution to (3.3) if there exists a neighborhood U_ε of x* such that x* is the only local solution in U_ε ∩ X_ad.

Isolated local solutions are also strict local solutions but the converse is not true.



3.1 Optimality conditions

In order to determine necessary and sufficient optimality conditions taking into account functional constraints, a fundamental concept is that of an active constraint. An inequality constraint is said to be active at a feasible point x if h_l(x) = 0 and inactive if h_l(x) < 0. Any equality constraint g_j(x) = 0 is by convention an active constraint. Hence, it is obvious that constraints inactive at a feasible point x have no influence on the solution of the optimization problem in a sufficiently small neighborhood of x. If the set of active constraints were known a priori, then one would be able to reduce the optimization problem by neglecting the inactive constraints and by replacing any active inequality constraint by an equality constraint. In view of this, at first only optimization problems with equality constraints are considered before extending the exposition to more general settings. The exposition thereby mainly follows the treatises in [5] and [7].

3.1.1 Equality constraints

The set of equality constraints M := {x ∈ R^n : g_j(x) = 0, j = 1, ..., p} with g_j(x) ∈ C¹(R^n), j = 1, ..., p forms a subset of R^n, which can be viewed as an (n − p)-dimensional (smooth) manifold. This can geometrically be thought of as a hypersurface. Associated with this manifold is the tangent space or tangent plane at a point¹. For its definition, note that a curve on M is a family of points x(t) ∈ M continuously parametrized by t ∈ [t_0, t_1]. The curve is differentiable if ẋ(t) = (d/dt)x(t) exists and twice differentiable if ẍ(t) exists. The curve x(t) is said to pass through the point x* if x(τ) = x* for some τ ∈ [t_0, t_1]. The derivative of the curve at x* is defined as ẋ(τ) and is a vector in R^n. For our purposes the tangent space at the point x* can be seen as the collection of the derivatives at x* of all curves on M passing through x*.

Fig. 3.1: Examples of tangent spaces T_{x*}M.

1 The reader is, e.g., referred to [6] for definitions and further information concerning these differential geometric concepts.


To obtain an explicit expression for the tangent space T_{x*}M of the surface defined by M, the notion of a regular point is needed [5].

Definition 3.2 (Regular point (equality constraints)). A point x* satisfying the constraint

g(x*) = 0 (3.4)

is said to be a regular point of the constraint (3.4) if the gradient vectors (∇g_1)(x*), (∇g_2)(x*), ..., (∇g_p)(x*) are linearly independent, i.e.

rank (∇g)(x*) = rank [(∇g_1), (∇g_2), ..., (∇g_p)](x*) = p. (3.5)

It should be noted that condition (3.5) does not impose a condition on the constraint surface M itself but on its representation in terms of g(x).

Example 3.1. Let x ∈ R² and consider g(x) = x_1. The constraint g(x) = 0 yields that M is the x_2-axis, and any point along this line is regular by (3.5). Let now g(x) = x_1², so that M is again identical to the x_2-axis. Since (∇g)(x*) = 0, no point on this line is regular.

Example 3.2. Let x ∈ R³ and consider g_1(x) = sin(x_1), g_2(x) = cos(x_3). The constraints g_j(x) = 0, j = 1, 2 yield the set M = {x ∈ R³ : x_1 = kπ, x_3 = (2l − 1)π/2, k, l ∈ Z}, which corresponds to the set of straight lines parallel to the x_2-axis passing through the points x_1 = kπ, x_3 = (2l − 1)π/2 for k, l ∈ Z. In addition, we have (∇g_1)(x*) = [cos(x_1*), 0, 0]^T and (∇g_2)(x*) = [0, 0, −sin(x_3*)]^T, so that any point on M is regular by (3.5).

Remark 3.1. In view of Definition 1.4 of the gradient of a scalar function, given g(x) ∈ R^p the vector gradient (∇g)(x) ∈ R^{n×p} is defined as

(∇g)(x) = [(∇g_1), (∇g_2), ..., (∇g_p)](x) =
    [ ∂g_1/∂x_1 (x)  ···  ∂g_p/∂x_1 (x) ]
    [       ⋮                  ⋮        ]
    [ ∂g_1/∂x_n (x)  ···  ∂g_p/∂x_n (x) ]. (3.6)

With these preliminaries, the following theorem can be proven as presented in [5].

Theorem 3.1. Let x* be a regular point of M defined by g(x*) = 0. Then the tangent space at x* is equal to

T_{x*}M = {d ∈ R^n : (∇g_j)^T(x*) d = 0, j = 1, ..., p}. (3.7)

To be precise, in the setting of Theorem 3.1 the vector d represents a so-called vector field.

The formulation of necessary, and later also sufficient, conditions for a point to be a local minimizer of (3.1a) subject only to (3.1b) with p ≤ n can be deduced using the introduced concept of a tangent space.

Lemma 3.1. Let x* be a regular point of M = {x ∈ R^n : g_j(x) = 0, j = 1, ..., p} and a local extremum (minimum or maximum) of f(x) subject to the equality constraints g(x) = 0, with f(x), g_1(x), ..., g_p(x) ∈ C¹(R^n). Then all d ∈ R^n fulfilling the condition

(∇g)^T(x*) d = 0 (3.8)

must also satisfy

(∇f)^T(x*) d = 0. (3.9)


Proof. The proof of this result can be sketched as follows. Let d ∈ T_{x*}M and let x(t) be a smooth curve on the constraint surface M passing through x* with derivative d at x*, i.e. x(0) = x*, ẋ(0) = d and g(x(t)) = 0 for t ∈ [−ε, ε] with some ε > 0. By assumption x* is a regular point, so that Theorem 3.1 yields the tangent space as the set of vectors d satisfying (∇g)^T(x*)d = 0. Observing that x* is by assumption a local extremum of f(x) fulfilling the equality constraints g(x*) = 0, we have

(d/dt) f(x(t))|_{t=0} = (∇f)^T(x(t)) ẋ(t)|_{t=0} = (∇f)^T(x*) d = 0,

which implies the conclusion. □

In particular, Lemma 3.1 states in view of Theorem 3.1 that (∇f)(x*) is orthogonal to the tangent space T_{x*}M. This implies that (∇f)(x*) can be represented as a linear combination of the (∇g_j)(x*), j = 1, ..., p, which leads to the definition of so-called Lagrange multipliers.

Theorem 3.2 (First order necessary optimality condition). Let x* be a regular point of M = {x ∈ R^n : g_j(x) = 0, j = 1, ..., p} and a local extremum (minimum or maximum) of f(x) subject to the equality constraints g(x) = 0, with f(x), g_1(x), ..., g_p(x) ∈ C¹(R^n). Then there exists a λ* ∈ R^p such that

(∇f)(x*) + (∇g)(x*) λ* = 0. (3.10)

The vector λ* in (3.10) is called a Lagrange multiplier. The first order necessary conditions

(∇f)(x*) + (∇g)(x*) λ* = 0

and the equality constraints

g(x*) = 0

form a (nonlinear) system of n + p equations in the n + p unknowns x* and λ*. These considerations motivate the introduction of the Lagrangian associated with the equality constrained optimization problem in the form

l(x, λ) = f(x) + λ^T g(x). (3.11)

The necessary conditions then read

(∇_x l)(x*, λ*) = 0 (3.12a)
(∇_λ l)(x*, λ*) = 0 (3.12b)

in terms of l(x, λ). The application of the first order necessary condition is subsequently analyzed in examples.

Example 3.3. Consider the optimization problem

min_{x∈R²} f(x) = (x_1 − 1)² + (x_2 − 2)² (3.13a)

subject to

g(x) = x_2 − x_1 = 0. (3.13b)

It becomes apparent from Figure 3.2 that the intersections of the line g(x) = 0 with the isoclines of the objective function f(x) shrink to a single point, i.e. the minimum, determined as


x* = [3/2, 3/2]^T,   f(x*) = 0.5. (3.13c)

Fig. 3.2: Isoclines and equality constraint for Example 3.3.

The first order necessary optimality condition (3.10) for f(x) and g(x) reads

(∇f)(x) + (∇g)(x) λ = [2(x_1 − 1) − λ; 2(x_2 − 2) + λ] = [0; 0]

with g(x) = x_2 − x_1 = 0. Solving this set of linear equations in x_1, x_2 and λ yields the optimal solution x* = [3/2, 3/2]^T and λ* = 1.

Example 3.4 (Hanging chain). Consider the hanging chain of Figure 3.3, clamped at x = 0 and x = L and consisting of n stiff segments of length l and mass m.

Fig. 3.3: Hanging chain.

The j-th segment spans a distance of x_j in the x-direction and of y_j in the y-direction, so that x_j² + y_j² = l². For the determination of the equilibrium of the hanging chain we make use of the potential energy. Assuming that the mass of each segment is concentrated in its center, the potential energy is given by

W_pot = mg(½y_1 + [y_1 + ½y_2] + [y_1 + y_2 + ½y_3] + ··· + [y_1 + ··· + y_{n−1} + ½y_n]) = mg Σ_{j=1}^{n} (n − j + ½) y_j.

The equilibrium shape of the hanging chain can thus be determined from


min_{y_j, j=1,...,n} mg Σ_{j=1}^{n} (n − j + ½) y_j (3.14)

subject to the two equality constraints

Σ_{j=1}^{n} y_j = 0 (3.15a)
Σ_{j=1}^{n} √(l² − y_j²) − L = 0. (3.15b)

Taking into account the first order optimality condition (3.10) yields that if y∗j, j = 1, . . . , n is a regular point of the constraints and a local minimizer, then

mg(n − j + 1/2) + λ1 − λ2 y∗j/√(l² − (y∗j)²) = 0 (3.16)

for each j = 1, . . . , n and y∗j has to fulfill (3.15). Equation (3.16) implies

y∗j = −l (mg(n − j + 1/2) + λ1)/√(λ2² + (mg(n − j + 1/2) + λ1)²) (3.17)

depending on the Lagrange multipliers λ1 and λ2. These can be obtained by solving (3.15) with y∗j as above. For this, numerical techniques such as those introduced in Section 2.2 to find zeros of a system of nonlinear algebraic equations need to be incorporated.
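A possible realization of this procedure is sketched below in Octave/Matlab; the parameter values n, l, L, m, g and the starting guess for (λ1, λ2) are illustrative assumptions, and fsolve (built into Octave, Optimization Toolbox in Matlab) serves as the root-finding routine:

  % Solve (3.15) for the multipliers lambda1, lambda2 with y_j from (3.17).
  n = 20; l = 1; L = 15; m = 1; g = 9.81;   % assumed example data
  c = m*g*((n - (1:n)') + 1/2);             % c_j = m*g*(n - j + 1/2)
  yopt = @(lam) -l*(c + lam(1)) ./ sqrt(lam(2)^2 + (c + lam(1)).^2); % (3.17)
  res  = @(lam) [ sum(yopt(lam));                       % residual of (3.15a)
                  sum(sqrt(l^2 - yopt(lam).^2)) - L ];  % residual of (3.15b)
  lam = fsolve(res, [-m*g*n/2; m*g*n]);     % heuristic starting guess
  y = yopt(lam);                            % equilibrium shape of the chain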

For the formulation of the second order optimality condition it is in the following assumed that f(x), gj(x) ∈ C2(Rn) for j = 1, . . . , p.

Theorem 3.3 (Second order necessary optimality condition). Let x∗ be a regular point of M = {x ∈ Rn : gj(x) = 0, j = 1, . . . , p} and a local extremum (minimum or maximum) of f(x) subject to the equality constraints g(x) = 0 for f(x), g1(x), . . . , gp(x) ∈ C2(Rn). Then there exists a λ∗ ∈ Rp such that

(∇f)(x∗) + (∇g)(x∗)λ∗ = 0 (3.18a)

and the Hessian matrix of the associated Lagrangian is positive semi–definite on the tangent space Tx∗M, i.e.

dT(∇²l)(x∗,λ∗)d = dT((∇²f)(x∗) + ∑_{j=1}^{p} λ∗j (∇²gj)(x∗))d ≥ 0 (3.18b)

for all d ∈ Tx∗M.

The proof of this theorem follows rather standard arguments along the lines of the proof of Lemma 3.1 and can be found, e.g., in [5].

Sufficient optimality conditions for the optimization problem (3.1a) with only equality constraints (3.1b) are provided subsequently.

Theorem 3.4 (Second order sufficient optimality condition). If there exists (x∗,λ∗) such that

∇xl(x∗,λ∗) = 0 (3.19a)

∇λl(x∗,λ∗) = 0 (3.19b)

and


dT (∇2l)(x∗,λ∗)d > 0 (3.19c)

for all d ∈ Tx∗M with l(x∗,λ∗) the Lagrangian according to (3.11), then x∗ is a strict local minimum of (3.1a) with only equality constraints (3.1b).

The proof of this result can be found, e.g., in [5]. Moreover, the analysis reveals that the matrix

L = (∇2l)(x∗,λ∗) (3.20)

plays the role of the Hessian matrix (∇²f)(x∗) in unconstrained optimization. As such, the structure of the matrix L restricted to the tangent space Tx∗M determines the rate of convergence of algorithms designed for the numerical solution of equality constrained optimization problems as does the structure of the Hessian of the objective function in unconstrained optimization. In particular it can be shown that the eigenvalues of L restricted to Tx∗M, subsequently referred to by LTx∗M, determine the rate of convergence.

In general, given d ∈ Tx∗M the vector y = Ld is not an element of Tx∗M, i.e. y ∉ Tx∗M. Let the column vectors of the matrix V ∈ Rn×(n−p) denote an orthonormal basis of Tx∗M, then for any d ∈ Tx∗M there exists a z ∈ Rn−p so that d = Vz. In view of (3.19c) this yields

dT(∇²l)(x∗,λ∗)d = dTLd = zTVTLVz > 0.

This implies that the projection of L into the tangent space Tx∗M is given by

LTx∗M = V TLV. (3.21)

Moreover, the orthonormality of the column vectors of V, i.e. VTV = E, provides

(λE − LTx∗M)v = (λVTV − VTLV)v = VT(λE − L)Vv, (3.22)

which shows that the eigenvalues of LTx∗M are independent of the particular orthonormal basis chosen for Tx∗M.

Example 3.5. Consider the optimization problem

min_{x∈R3} f(x) = x1 + x2² + x2x3 + 2x3²

subject to

g(x) = x1² + x2² + x3² − 4 = 0.

Given the Lagrangian

l(x, λ) = x1 + x2² + x2x3 + 2x3² + λ(x1² + x2² + x3² − 4)

the necessary first order optimality conditions of Theorem 3.2 or Eqn. (3.12), respectively, read

∇x l(x∗, λ∗) = [1 + 2λ∗x∗1; 2x∗2 + x∗3 + 2λ∗x∗2; x∗2 + 4x∗3 + 2λ∗x∗3] = 0

∇λ l(x∗, λ∗) = (x∗1)² + (x∗2)² + (x∗3)² − 4 = 0. (3.23)

A solution to this set of equations is given by

x∗ = [2, 0, 0]T, λ∗ = −1/4.


For the second order sufficient optimality condition consider the matrix

L = (∇²l)(x∗, λ∗) = [2λ∗, 0, 0; 0, 2(1 + λ∗), 1; 0, 1, 2(2 + λ∗)] = [−1/2, 0, 0; 0, 3/2, 1; 0, 1, 7/2].

The tangent space Tx∗M of the manifold M = {x ∈ R3 : x1² + x2² + x3² = 4} follows by (3.7) from the computation of

(∇g)T(x∗) = [2x∗1, 2x∗2, 2x∗3] = [4, 0, 0]

as

Tx∗M = {d ∈ R3 : (∇g)T(x∗)d = 4d1 = 0}.

Hence, the first component of each element of Tx∗M has to evaluate to zero. This yields

dTLd = [d2, d3] L[{2,3},{2,3}] [d2; d3] = [d2, d3] [3/2, 1; 1, 7/2] [d2; d3] > 0

since the 2 × 2 submatrix L[{2,3},{2,3}] is positive definite. As a result, x∗ = [2, 0, 0]T is a strict local minimum.

Alternatively, the projection of L into the tangent space Tx∗M can be considered. An orthonormal basis of Tx∗M can be easily deduced and summarized in the column vectors of the matrix

V = [0, 0; 1, 0; 0, 1].

With this the projection can be evaluated using (3.21), which implies

LTx∗M = VTLV = [3/2, 1; 1, 7/2].

Since LTx∗M is positive definite, we conclude by Theorem 3.4 that x∗ = [2, 0, 0]T is a strict local minimum.
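The positive definiteness of the projected Hessian can also be confirmed numerically; the following small Octave/Matlab check (an illustration, not part of the original example) evaluates (3.21) and its eigenvalues:

  % Check the projection (3.21) for Example 3.5.
  lamstar = -1/4;
  L = [ 2*lamstar, 0,             0;
        0,         2*(1+lamstar), 1;
        0,         1,             2*(2+lamstar) ];
  V = [ 0 0; 1 0; 0 1 ];     % orthonormal basis of the tangent space
  LT = V' * L * V;           % projected Hessian, cf. (3.21)
  eig(LT)                    % both eigenvalues positive -> strict local minimum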

Exercise 3.1. Analyze which of the remaining solutions (x∗, λ∗) to (3.23) satisfy the second order sufficient optimality conditions of Theorem 3.4.

To conclude this section the so–called sensitivity theorem for Lagrange multipliers is presented, which implies an interpretation of the Lagrange multipliers [5]. For this, let x∗ denote a solution to the optimization problem

min_{x∈Rn} f(x) (3.24a)

subject to

gj(x) = 0, j = 1, . . . , p (3.24b)

with the associated Lagrange multiplier λ∗ ∈ Rp.

Theorem 3.5 (Sensitivity (equality constraints)). Let f(x), gj(x) ∈ C2(Rn), j = 1, . . . , p and consider the family of equality constrained optimization problems

min_{x∈Rn} f(x) (3.25a)


subject to

g(x) = c (3.25b)

for c ∈ Rp. Suppose that for c = 0 the vector x∗ is a regular point and (x∗,λ∗) satisfies the second order sufficient optimality conditions of Theorem 3.4 for a strict local minimum. Then for any c ∈ Rp in a neighborhood of 0 (containing 0) there is an x(c) depending continuously on c such that x(0) = x∗ and x(c) is a local minimum of (3.25). In addition,

∂/∂c f(x(c))|c=0 = −(λ∗)T. (3.26)

Proof. To verify the claim recall the first order optimality conditions (3.10), which for (3.25) read

(∇f)(x) + (∇g)(x)λ = 0 (3.27a)

g(x) = c. (3.27b)

By assumption there is a solution (x∗,λ∗) if c = 0. In view of this, the Jacobian of (3.27) at (x∗,λ∗), i.e., at c = 0, is

[(∇²l)(x∗,λ∗), (∇g)(x∗); (∇g)T(x∗), 0].

This matrix is regular since x∗ is by assumption a regular point and (∇2l)(x∗,λ∗) is positive definite.

Exercise 3.2. Prove this claim.

Thus, the Implicit Function Theorem² implies that there is a solution (x(c), λ(c)) to (3.27), which is continuously differentiable in c. Taking into account the chain rule, we hence have

∂/∂c f(x(c))|c=0 = (∇f)T(x∗) ∂/∂c x(c)|c=0

and

∂/∂c g(x(c))|c=0 = (∇g)T(x∗) ∂/∂c x(c)|c=0 = E,

where the last equality follows from (3.27b).

Multiplication of (3.27a) from the left by (∂/∂c x(c)|c=0)T yields (3.26) in view of the latter equations above. □

Example 3.6 (Optimal control). Consider the scalar sampled–data system

xk+1 = φ(xk, uk), x0 = x(0) (3.28)

in the state xk with input uk. We seek the input sequence (uk)k and the associated states (xk)k minimizing the objective function

J = ∑_{k=0}^{N} ψ(xk, uk) (3.29)

² Let U be an open subset in Rm × Rn and let f : U → Rn be a Ck function with k ≥ 1. Consider a point (x, y) ∈ U with f(x, y) = c, where x ∈ Rm and y ∈ Rn. If the (n × n) matrix ∂/∂y f(x, y) is invertible, then there are open sets Vm ⊂ Rm and Vn ⊂ Rn with (x, y) ∈ Vm × Vn ⊂ U and a unique Ck function ψ : Vm → Vn such that f(x, ψ(x)) = c for all x ∈ Vm. In addition, f(x, y) ≠ c if (x, y) ∈ Vm × Vn and y ≠ ψ(x) [4].


subject to the terminal constraint

α(xN+1) = 0. (3.30)

It is assumed that φ, ψ and α have continuous first partial derivatives and that the sequences (uk)k, (xk)k fulfill the regularity condition of Definition 3.4.
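To make the connection to static constrained optimization concrete, the following Octave/Matlab sketch assembles the objective (3.29) and the equality constraints formed by the dynamics (3.28) and the terminal condition (3.30) from a stacked decision vector; the function name ocp_parts and the handles phi, psi, alpha are hypothetical placeholders, and the two outputs can be passed to a solver such as fmincon as objective and nonlinear equality constraints:

  % Sketch: cast (3.28)-(3.30) as a static program in w = [x_1..x_{N+1}, u_0..u_N].
  function [J, g] = ocp_parts(w, phi, psi, alpha, N, x0)
    x = [x0; w(1:N+1)];           % x_{k} stored in x(k+1), k = 0,...,N+1
    u = w(N+2:end);               % u_{k} stored in u(k+1), k = 0,...,N
    J = 0; g = zeros(N+2, 1);
    for k = 0:N
      J = J + psi(x(k+1), u(k+1));            % objective (3.29)
      g(k+1) = x(k+2) - phi(x(k+1), u(k+1));  % dynamics (3.28) as equality
    end
    g(N+2) = alpha(x(N+2));                   % terminal constraint (3.30)
  end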

Exercise 3.3. Extend Example 3.6 to the case where xk ∈ Rn and uk ∈ Rm with

xk+1 = φ(xk,uk), x0 = x(0) (3.31)

and terminal constraint α(xN+1) = 0.


3.1.2 Inequality constraints

In the following, we focus on the optimization problem

min_{x∈Rn} f(x) (3.32a)

subject to

gj(x) = 0, j = 1, . . . , p (3.32b)

hl(x) ≤ 0, l = 1, . . . , q (3.32c)

with equality and inequality constraints. As before, it is assumed that p ≤ n and that f(x), gj(x) and hl(x) are at least continuous but may be required to possess continuous second partial derivatives. Recall also that an inequality constraint is said to be active at a feasible point x if hl(x) = 0 and inactive if hl(x) < 0.

Definition 3.3 (Active inequality constraints set). The set of all active inequality constraints of the optimization problem (3.1) at a point x ∈ Rn is

Aieq(x) = {l ∈ {1, . . . , q} : hl(x) = 0}. (3.33)

With this, Definition 3.2 of a regular point of the constraints can be extended to the case of active inequality constraints.

Definition 3.4 (Regular point (inequality constraints)). A point x∗ satisfying the constraints

g(x∗) = 0, h(x∗) ≤ 0 (3.34)

is said to be a regular point of the constraints (3.34) if the gradient vectors (∇gj)(x∗), j = 1, . . . , p and (∇hl)(x∗), l ∈ Aieq(x∗) are linearly independent.

Due to the restriction to active inequality constraints the definition is consistent with the case of equality constraints only. In view of these preparations first order necessary optimality conditions can be formulated, which are also referred to as Karush–Kuhn–Tucker (KKT) conditions.

Theorem 3.6 (Karush–Kuhn–Tucker first order necessary optimality conditions). Let x∗ be a local minimum of the optimization problem (3.32) with f(x), gj(x), hl(x) ∈ C1(Rn) for j = 1, . . . , p, l = 1, . . . , q and suppose x∗ is a regular point of the constraints g(x∗) = 0, h(x∗) ≤ 0. Then there exists a unique Lagrange multiplier ((λ∗)T, (µ∗)T) with λ∗ ∈ Rp and µ∗ ∈ Rq such that

(∇f)(x∗) + (∇g)(x∗)λ∗ + (∇h)(x∗)µ∗ = 0 (3.35a)

hT (x∗)µ∗ = 0 (3.35b)

µ∗ ≥ 0 (3.35c)

hold true.

Since µ∗ ≥ 0 and h(x∗) ≤ 0 the condition (3.35b) is equivalent to the statement that hl(x∗) < 0 implies µ∗l = 0 and µ∗l > 0 implies hl(x∗) = 0. This condition is also called the complementary slackness condition.

Subsequently the proof of Theorem 3.6 is briefly sketched [5].

Proof. Observing that by assumption x∗ is a local minimum of the constrained optimization problem (3.32) it is a local minimum of the optimization problem defined by setting the active constraints to zero. Then (3.35a) is identical to the first order optimality condition (3.10) for the only equality constrained problem addressed in Theorem 3.2 with µ∗l = 0 if hl(x∗) < 0.


To verify that µ∗ ≥ 0 we proceed by contradiction. Suppose there is a µ∗k < 0 for some k ∈ Aieq(x∗). Define by

M = {x ∈ Rn : gj(x) = 0, j = 1, . . . , p, hl(x) = 0, ∀l ∈ Aieq(x∗) \ {k}}

the manifold constructed by all active constraints excluding hk(x) and by Tx∗M the respective tangent space at x∗. Since x∗ is by assumption a regular point of the constraints, i.e., (∇hk)(x∗) cannot be represented as a linear combination of (∇gj)(x∗), j = 1, . . . , p and (∇hl)(x∗), l ∈ Aieq(x∗) \ {k}, there is a d ∈ Tx∗M such that (∇hk)T(x∗)d < 0. Let x(t) denote a C1 curve on M parametrized by t passing through x∗ with derivative d, i.e., x(0) = x∗ and ẋ(0) = d, then

d/dt f(x(t))|t=0 = (∇f)T(x∗)d = −µ∗k (∇hk)T(x∗)d < 0,

where the second equality follows from (3.35a) and x∗ ∈ M.

This, however, contradicts the assumption that x∗ is a local minimum. □

Remark 3.2. It is important to note that not every local minimum fulfills the KKT conditions (3.35). This holds only if certain assumptions are imposed on the constraints g(x) = 0 and h(x) ≤ 0 in (3.32), which are referred to as constraint qualification (CQ). The requirement that x∗ in Theorem 3.6 is a regular point of the constraints according to Definition 3.4 is in this context denoted linear independence constraint qualification (LICQ). It is in particular guaranteed that the Lagrange multiplier is unique whenever the LICQ condition is fulfilled. In general, this does not hold for other CQ conditions.

One of the main drawbacks of Theorem 3.6 is that it is in principle necessary to know the set Aieq(x), i.e., which inequality constraints are active or inactive, so that any possible combination needs to be analyzed to determine all local minima.

Example 3.7. Consider the optimization problem

min_{x∈R3} (1/2)(x1² + x2² + x3²)

subject to

h1(x) = x1 + x2 + x3 + 3 ≤ 0
h2(x) = x1 ≤ 0.

In view of Definition 3.4 and

(∇h1)(x) = [1, 1, 1]T, (∇h2)(x) = [1, 0, 0]T

any admissible point x∗ fulfilling

x∗1 + x∗2 + x∗3 + 3 ≤ 0
x∗1 ≤ 0

is a regular point. The KKT conditions (3.35) read

(∇f)(x∗) + (∇h)(x∗)µ∗ = [x∗1; x∗2; x∗3] + [1, 1; 1, 0; 1, 0][µ∗1; µ∗2] = 0
µ∗1(x∗1 + x∗2 + x∗3 + 3) + µ∗2 x∗1 = 0
µ∗1 ≥ 0
µ∗2 ≥ 0.


We need to distinguish between the following four cases.

(i) Aieq(x) = ∅: Since both inequalities are inactive we have by the complementary slackness condition that µ∗1 = µ∗2 = 0 so that the KKT conditions imply x∗ = 0. However, this solution is not feasible since h1(x∗) = 3 ≤ 0 is not satisfied.

(ii) Aieq(x) = {1}: The complementary slackness condition implies µ∗2 = 0 while µ∗1 > 0. The KKT conditions provide x∗j = −µ∗1, j = 1, 2, 3 and x∗1 + x∗2 + x∗3 + 3 = 0. Hence, x∗j = −1, j = 1, 2, 3 and µ∗1 = 1, µ∗2 = 0 is a candidate for a local minimum satisfying the inactive constraint h2(x∗).

(iii) Aieq(x) = {2}: Here h2(x∗) is active so that x∗1 = 0 and the complementary slackness condition yields µ∗1 = 0, µ∗2 > 0. With this, the KKT conditions result in x∗1 = −µ∗2, x∗2 = x∗3 = 0, which is not feasible.

(iv) Aieq(x) = {1, 2}: When both constraints are active we have µ∗1, µ∗2 > 0 as well as h1(x∗) = 0 and h2(x∗) = x∗1 = 0, and the KKT conditions impose x∗1 = −(µ∗1 + µ∗2), x∗2 = x∗3 = −µ∗1. Hence, no feasible solution x∗ can be found.

Whether x∗j = −1, j = 1, 2, 3 with µ∗1 = 1, µ∗2 = 0 is indeed a local minimum for Aieq(x) = {1} is clarified below when formulating sufficient optimality conditions.

Exercise 3.4. Find the minima of the function f(x) = x1x2 inside the unit circle x1² + x2² ≤ 1. Provide a graphical illustration.

Exercise 3.5. Consider the constrained optimization problem

min_{x∈R2} f(x) = (x1 − 3/2)² + (x2 − r)⁴

subject to

h(x) = [x1 + x2 − 1; x1 − x2 − 1; −x1 + x2 − 1; −x1 − x2 − 1] ≤ 0.

a) Determine r so that x∗ = [1, 0]T satisfies the KKT first order necessary optimality conditions (3.35).

b) For r = 1 show that only h1(x) is active at the solution x∗. Determine x∗ explicitly.

Similar to the consideration of optimization problems with equality constraints only, necessary and sufficient second order optimality conditions can be determined for (3.32). These are essentially based on the analysis of the optimization problem induced by the active constraints.

Theorem 3.7 (Karush–Kuhn–Tucker second order necessary optimality conditions). Let x∗ be a local minimum of the optimization problem (3.32) with f(x), gj(x), hl(x) ∈ C2(Rn) for j = 1, . . . , p, l = 1, . . . , q and suppose x∗ is a regular point of the constraints g(x∗) = 0, h(x∗) ≤ 0. Then there exists a unique Lagrange multiplier ((λ∗)T, (µ∗)T) with λ∗ ∈ Rp and µ∗ ∈ Rq such that

(∇f)(x∗) + (∇g)(x∗)λ∗ + (∇h)(x∗)µ∗ = 0 (3.36a)

hT (x∗)µ∗ = 0 (3.36b)

µ∗ ≥ 0 (3.36c)

and

dT(∇²l)(x∗,λ∗,µ∗)d = dT((∇²f)(x∗) + ∑_{j=1}^{p} λ∗j (∇²gj)(x∗) + ∑_{l∈Aieq(x∗)} µ∗l (∇²hl)(x∗))d ≥ 0 (3.36d)


for all d ∈ Tx∗M with M = {x ∈ Rn : gj(x) = 0, j = 1, . . . , p, hl(x) = 0, ∀l ∈ Aieq(x∗)} and the Lagrangian

l(x∗,λ∗,µ∗) = f(x∗) + (λ∗)Tg(x∗) + (µ∗)Th(x∗). (3.37)

Herein, µ∗l = 0 if hl(x∗) < 0.

The tangent space Tx∗M is thereby defined as

Tx∗M = {d ∈ Rn : (∇gj)T(x∗)d = 0, j = 1, . . . , p, (∇hl)T(x∗)d = 0, ∀l ∈ Aieq(x∗)}. (3.38)

The proof of Theorem 3.7 makes use of the fact that if x∗ is a local minimum of the constrained optimization problem (3.32), then it is also a local minimum of the optimization problem defined by setting the active constraints to zero.

Theorem 3.8 (Karush–Kuhn–Tucker second order sufficient optimality conditions). Consider the constrained optimization problem (3.32) and let f(x), gj(x), hl(x) ∈ C2(Rn) for j = 1, . . . , p, l = 1, . . . , q. The regular point x∗ of the constraints g(x∗) = 0, h(x∗) ≤ 0 is a strict (local) minimum if there exist ((λ∗)T, (µ∗)T) with λ∗ ∈ Rp and µ∗ ∈ Rq such that

(∇f)(x∗) + (∇g)(x∗)λ∗ + (∇h)(x∗)µ∗ = 0 (3.39a)

hT (x∗)µ∗ = 0 (3.39b)

µ∗ ≥ 0 (3.39c)

and

dT(∇²l)(x∗,λ∗,µ∗)d = dT((∇²f)(x∗) + ∑_{j=1}^{p} λ∗j (∇²gj)(x∗) + ∑_{l∈Aieq(x∗)} µ∗l (∇²hl)(x∗))d > 0 (3.39d)

for all d ∈ Tx∗M with M = {x ∈ Rn : gj(x) = 0, j = 1, . . . , p, hl(x) = 0, ∀l ∈ Aieq(x∗)} and the Lagrangian

l(x∗,λ∗,µ∗) = f(x∗) + (λ∗)Tg(x∗) + (µ∗)Th(x∗). (3.40)

Herein, µ∗l = 0 if hl(x∗) < 0.

The proof of this result can be found, e.g., in [5]. It should be noted that the positive definiteness of L = (∇²l)(x∗,λ∗,µ∗) is required to hold on a subspace larger than Tx∗M if so–called degenerate inequality constraints arise, i.e., active inequality constraints having zero as associated Lagrange multiplier.

Exercise 3.6. Show that the feasible point x∗j = −1, j = 1, 2, 3 in Example 3.7 is a global minimum.

Example 3.8 (Linear program). The standard form of a linear optimization problem (also called linear program) is

min_{x∈Rn} cTx (3.41a)

subject to

Ax = b (3.41b)

x ≥ 0 (3.41c)

for A ∈ Rp×n and b ∈ Rp. Let x∗ be a regular point of the constraints Ax∗ − b = 0 and −x∗ ≤ 0. The KKT first order optimality conditions (3.35) for (3.41) reduce to


c + ATλ∗ − µ∗ = 0 (3.42a)
−(x∗)Tµ∗ = 0 (3.42b)
µ∗ ≥ 0. (3.42c)

Taking into account the KKT second order conditions (3.36) or (3.39), respectively, with the Lagrangian

l(x∗,λ∗,µ∗) = cTx∗ + (λ∗)T(Ax∗ − b) + (µ∗)T(−x∗) (3.43)

the computation of the Hessian L = (∇²l)(x∗,λ∗,µ∗) results in L = 0. In order to interpret this note the following result [2].

Theorem 3.9. The static constrained optimization problem (3.32) is convex if f(x) is convex on Xad ⊂ Rn, gj(x), j = 1, . . . , p are linear and hl(x), l = 1, . . . , q are convex.

Since the linear problem (3.41) is convex, we know from Theorem 2.4 that any local minimum is a global minimum. Hence, the evaluation of the second order optimality condition is not required and, as seen above, does not provide further information. These considerations lead to the so–called simplex method of Dantzig to determine a numerical solution to (3.42). For further details the reader is referred to [7, 2].

Remark 3.3. Any linear program can be transformed into the form (3.41). Given the problem min_{x∈Rn} cTx subject to Ax ≤ b, then the inequality constraints can be reformulated into equality and additional inequality constraints by introducing so–called slack variables so that Ax ≤ b implies Ax + z = b, z ≥ 0 [7]. However, not all variables are now constrained to be non–negative. This can be achieved by splitting x into its non–negative and non–positive parts x = x+ − x−, where x+ = max(x, 0) ≥ 0 and x− = max(−x, 0) ≥ 0. With this, the problem can be re–written as

min [cT, −cT, 0T] [x+; x−; z]

subject to

[A, −A, E] [x+; x−; z] = b, [x+; x−; z] ≥ 0,

which is of the form (3.41).
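A minimal Octave/Matlab sketch of this transformation is given below; the data c, A, b are invented for illustration and linprog (Optimization Toolbox in Matlab, optim package in Octave) is assumed as the solver:

  % Transform min c'x s.t. Ax <= b (x free) into standard form (3.41) and solve.
  c = [-1; -2];
  A = [1 0; -1 0; 0 1; 0 -1];  b = [3; 1; 2; 1];  % box -1<=x1<=3, -1<=x2<=2
  [p, n] = size(A);
  cbar = [c; -c; zeros(p,1)];       % cost for w = [x+; x-; z]
  Abar = [A, -A, eye(p)];           % equality constraints Abar*w = b
  w = linprog(cbar, [], [], Abar, b, zeros(2*n+p,1), []);  % w >= 0
  x = w(1:n) - w(n+1:2*n)           % recover x = x+ - x-, here [3; 2]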

Exercise 3.7. Consider the constrained optimization problem

min_{x∈R2} f(x) = −2x1 + x2 (3.44a)

subject to

h1(x) = x2 − (1 − x1³) ≤ 0 (3.44b)

h2(x) = 1 − (1/4)x1² − x2 ≤ 0. (3.44c)

The optimal solution is x∗ = [0, 1]T with both constraints active [7].

a) Analyze if the LICQ condition holds at x∗.

b) Explicitly determine the tangent space Tx∗M.

c) Are the KKT first order necessary optimality conditions (3.35) satisfied?

d) Do the KKT second order necessary optimality conditions (3.36) and the second order sufficient optimality conditions (3.39) hold?

We conclude this section by providing a sensitivity theorem for the Lagrange multipliers for the optimization problem (3.32) with inequality constraints by directly extending Theorem 3.5 for the case of equality constraints only.


Theorem 3.10 (Sensitivity (inequality constraints)). Let f(x), gj(x), hl(x) ∈ C2(Rn) for j = 1, . . . , p and l = 1, . . . , q and consider the family of constrained optimization problems

min_{x∈Rn} f(x) (3.45a)

subject to

g(x) = c (3.45b)

h(x) ≤ d. (3.45c)

Suppose that for c = 0, d = 0 there is a local solution x∗ that is a regular point of the constraints g(x∗) = 0, h(x∗) ≤ 0 and that together with the associated Lagrange multipliers λ∗, µ∗ ≥ 0 satisfies the second order sufficient optimality conditions for a strict (local) minimum. Assume that there are no degenerate inequality constraints. Then for every c ∈ Rp, d ∈ Rq in a neighborhood of 0 (containing 0) there is a solution x(c,d) depending continuously on c and d such that x(0,0) = x∗ and x(c,d) is a local minimum of (3.45). In addition,

∂/∂c f(x(c,d))|c=0,d=0 = −(λ∗)T, ∂/∂d f(x(c,d))|c=0,d=0 = −(µ∗)T. (3.46)

3.2 Numerical optimization algorithms

For the development of numerical algorithms for the solution of static optimization problems, we subsequently focus on the setting

min_{x∈Rn} f(x) (3.47a)

subject to

gj(x) = 0, j = 1, . . . , p (3.47b)

hl(x) ≤ 0, l = 1, . . . , q (3.47c)

in n decision variables x ∈ Rn, p ≤ n equality constraints g(x) and q inequality constraints h(x). In general, f(x), gj(x), j = 1, . . . , p and hl(x), l = 1, . . . , q are assumed to be at least continuous in x. Numerical optimization algorithms for (3.47) are typically classified into primal, penalty and barrier, dual, or primal–dual methods depending in particular on the dimension of the spaces the methods work in, i.e. n − p, n, p or n + p, respectively.

Primal methods correspond to search methods working on the original problem (3.47) by searching for a solution in the feasible space. The feasible space is determined by the active constraints so that primal methods work in the space of dimension determined by n minus the number of equality constraints. Primal methods possess the advantages that (i) each point generated by the primal method is feasible so that even premature algorithm termination yields a feasible result, (ii) in many situations convergence of the search sequence to at least a local minimum can be guaranteed, (iii) they do not rely on special properties of the optimization problem such as convexity. Disadvantages comprise that (i) they require an initialization phase to determine a feasible starting point, (ii) computational issues may arise due to the need to restrict the search to the feasible space, (iii) some primal algorithms may fail for problems with inequality constraints. Subsequently, so–called active set methods and gradient projection methods are introduced as representatives of primal methods.


Penalty and barrier methods approximate the constrained optimization problem by an unconstrained problem using penalty or barrier functions. These methods work directly in the n–dimensional space of decision variables. Penalty and barrier methods are rather easy to implement but still allow convergence to be ensured (though often slowly). In addition, although the problems are solved in a space of dimension n, the Lagrange multipliers can be recovered during the process. In the following, penalty function and barrier function techniques are introduced to achieve the desired approximation by an unconstrained optimization problem.

In dual methods the Lagrange multipliers are considered as the fundamental unknowns. Once these are obtained the determination of the minimizer is rather straightforward, at least in certain situations. Hence, instead of working with the original constrained optimization problem dual methods work with the so–called dual problem in the p–dimensional space of Lagrange multipliers. Primal–dual methods are designed to satisfy the first–order optimality conditions and as such combine some of the techniques addressed before including active set methods and penalty and barrier methods. As such they work in the space of dimension n + p for n decision variables and p equality constraints. For further details the reader is referred to, e.g., [7, 1, 5] and the references therein.

3.2.1 Active set methods

In active set methods inequality constraints are partitioned into those treated as active and those treated as inactive by the algorithm. For the sake of simplicity let p = 0 so that only inequality constraints arise in (3.47). The KKT first order necessary optimality conditions for a local minimum x∗ in this case imply the existence of µ∗ ∈ Rq such that

(∇f)(x∗) + ∑_{l∈Aieq(x∗)} µ∗l (∇hl)(x∗) = 0 (3.48a)
hl(x∗) = 0, l ∈ Aieq(x∗) (3.48b)
hl(x∗) < 0, l ∉ Aieq(x∗) (3.48c)
µ∗l > 0, l ∈ Aieq(x∗) (3.48d)
µ∗l = 0, l ∉ Aieq(x∗). (3.48e)

At each step active set methods in principle proceed by

(i) determining a current working set that is a subset of the currently active inequality constraints³ at the iterate xk and by

(ii) moving on the working surface, i.e. the manifold defined by the working set, to a new point xk+1 of improved value of the objective function.

The various active set methods mainly differ in the way the movement on the working surface is determined, which also induces their convergence properties.

Let Wact ⊆ Aieq(xk) denote the index set of active constraints in the working set at the current iteration step. Then the optimization problem reduces to finding a solution x∗Wact to

min_{x∈Rn} f(x) (3.49a)

subject to

hl(x) = 0, l ∈ Wact, (3.49b)

which satisfies hl(x∗Wact) < 0, l ∉ Wact. By Theorem 3.7 this point fulfills the necessary conditions

3 If there are also equality constraints, then these can be similarly included in the current working set.


(∇f)(x∗Wact) + ∑_{l∈Wact} µ∗l (∇hl)(x∗Wact) = 0 (3.50a)
hl(x∗Wact) = 0, l ∈ Wact. (3.50b)

If µ∗l > 0 for all l ∈ Wact, then the point x∗Wact is a local solution to the original problem (3.47a), (3.47c). If there is an m ∈ Wact such that µ∗m < 0, then the objective function f(x) can be decreased by removing the constraint m from Wact, i.e., by setting hm(x) inactive. To illustrate this fact we make use of Theorem 3.10 addressing the sensitivity of the Lagrange multipliers. For this, start from the active constraint hm(x) = 0 and move by a sufficiently small c < 0 into the domain of the constraint, i.e., set hm(x) = c. Then for µ∗m < 0 and x(0) = x∗Wact the objective function changes according to

the objective function changes according to

f(x(c)) ≈ f(x(0))︸ ︷︷ ︸=f(x∗Wact

)

+∂

∂cf(x(c))

∣∣∣∣c=0︸ ︷︷ ︸

=−µ∗m>0

c = f(x∗Wact)−µ∗mc︸ ︷︷ ︸

<0

< f(x∗Wact),

which confirms the further decrease of the objective function when moving into the domain of the inequality constraint hm(x) < 0. Hence, the Lagrange multipliers serve as indicators of which constraint should be dropped from the working set Wact. A graphical illustration is shown in Figure 3.4.

[Fig. 3.4: Deactivation of inequality constraints. Let x denote the minimum of f(x) on the surface h1(x) = 0. Since the Lagrange multiplier µ∗1 < 0 the constraint h1(x) = 0 should be dropped. As (∇f)(x) points to the exterior (recall that the gradient points in the direction of increasing values of the level surfaces) a movement towards the interior of the feasible region will decrease f(x).]

On the other hand it is possible that an inequality constraint considered inactive in the working set is violated, i.e. ∃m ∉ Wact : hm(x∗Wact) ≥ 0. In this case the working set Wact has to be enlarged by m.

Active set methods rely on the presented ideas to systematically drop and add inequality constraints:

(i) Start with a working set generated from a feasible starting point and solve (3.49) over the corresponding working surface;

(ii) Add newly encountered inequality constraints to the working set but do not yet drop constraints from the working set;

(iii) Once a point is obtained minimizing f(x) with respect to the current working set determine the corresponding Lagrange multipliers. If these are all nonnegative, then the solution is (locally) optimal.

(iv) If negative Lagrange multipliers arise, then drop the corresponding inequality constraints from the working set and restart the procedure with the new working set.


Convergence of this basic active set procedure follows from the theorem below, whose proof can be found in [5].

Theorem 3.11 (Active set theorem). Suppose that for every subset Wact of the constraint indices the constrained problem

min_{x∈Rn} f(x)

subject to

hl(x) = 0, l ∈ Wact,

is well–defined with a unique non–degenerate solution, i.e., ∀j ∈ Wact : µ∗j ≠ 0. Then the sequence of points generated by the basic active set procedure converges to the solution of the constrained optimization problem.

Difficulties arise from the fact that each iterate has to be an exact solution to the intermediate optimization problem to ensure that the signs of the Lagrange multipliers are correct. In practice, constraints are in some cases dropped before an exact minimum is reached on the current working surface using various criteria. Hence, convergence cannot be verified for many algorithms and zigzagging may arise when the working set changes an infinite number of times. Nevertheless, experience shows that zigzagging is rather rare and that active set methods are often very effective solution techniques for static constrained optimization problems.

Example 3.9 (Active set method for convex quadratic program). Consider the (convex) optimization problem

min_{x∈R2} f(x) = (x1 − 1)² + (x2 − 5/2)² (3.51a)

subject to

h1(x) = −x1 + 2x2 − 2 ≤ 0 (3.51b)
h2(x) = x1 + 2x2 − 6 ≤ 0 (3.51c)
h3(x) = x1 − 2x2 − 2 ≤ 0 (3.51d)
h4(x) = −x1 ≤ 0 (3.51e)
h5(x) = −x2 ≤ 0. (3.51f)

To proceed, we exploit the fact that (3.51) is a constrained quadratic program, which can be reformulated as

min_{x∈R2} f(x) = (1/2)xTPx + qTx + r with P = [2, 0; 0, 2], qT = [−2, −5], r = 29/4 (3.52a)

subject to

aTl x − bl ≤ 0 (3.52b)

for

aT1 =[−1 2

], aT2 =

[1 2], aT3 =

[1 −2

], aT4 =

[−1 0

], aT5 =

[0 −1

]b1 = 2, b2 = 6, b3 = 2, b4 = 0, b5 = 0.


It is subsequently assumed that P is positive semi–definite so that the problem is convex and we proceed as in [7]. Given an iterate xk and the working set Wact,k one first checks if xk minimizes f(x) over the corresponding working surface. If not, then a step sk is determined by solving the equality constrained quadratic problem

min_{x∈R2} f(x) = (1/2)xTPx + qTx + r

subject to

aTl x − bl = 0, l ∈ Wact,k.

To deduce the step sk from this problem substitute x = xk + sk, which yields

min_{sk∈R2} (1/2)sTk Psk + (Pxk + q)Tsk + (1/2)xTk Pxk + qTxk + r

subject to

aTl sk + aTl xk − bl = 0, l ∈ Wact,k,

where the constant part (?) = (1/2)xTk Pxk + qTxk + r of the objective function and the term (??) = aTl xk − bl in the constraints are marked for the subsequent discussion.

Since (?) is independent of sk the term can be dropped without changing the solution of the problem. By construction xk has to comply with the active constraints so that (??) = 0. As a result, the quadratic program

min_{sk∈R2} (1/2)sTk Psk + (Pxk + q)Tsk (3.53a)

subject to

aTl sk = 0, l ∈ Wact,k (3.53b)

has to be solved at the k–th iteration. If sk solving (3.53) is non–zero, then the next iterate is chosen according to

xk+1 = xk + sk if xk + sk is feasible with respect to all constraints, and xk+1 = xk + αk sk otherwise. (3.54)

The maximal step length αk ∈ [0, 1] is computed as

αk = min{1, min_{l∉Wact,k : aTl sk>0} (bl − aTl xk)/(aTl sk)}. (3.55)

This expression is obtained when considering what happens to the constraints aTl x − bl outside the working set for l ∉ Wact,k. If αk < 1, i.e., there is some constraint not in Wact,k blocking the maximal step length αk = 1, then the working set is enlarged by adding one of the blocking constraints to Wact,k+1.

The iteration is continued until a point x∗Wact is found minimizing the quadratic objective function over the current working set Wact. This point can be recognized by observing that (3.53) has the solution sk = 0, which satisfies the first order optimality conditions (3.10) for (3.53), i.e.,

(∇f)(x∗Wact) + ∑_{l∈Wact} µ∗l (∇hl)(x∗Wact) = Psk + Px∗Wact + q + ∑_{l∈Wact} µ∗l al = 0


and hence

∑_{l∈Wact} µ∗l al = −(Px∗Wact + q) (3.56a)

aTl sk = 0, l ∈ Wact. (3.56b)

As mentioned above, if all µ∗l are non–negative, then x∗Wact is the local optimizer x∗. For P positive semi–definite x∗ is even the global minimizer. If there is an m ∈ Wact for which µ∗m < 0, then f(x) can be further decreased by dropping this constraint from the working set. With this, a new subproblem of the form (3.53) has to be solved.

Consider now the original example (3.52) and proceed with the active set method for quadratic programs as outlined above:

• Start: A feasible starting point of the iteration is given by x0 = [2, 0]T. At this point the constraints aT3 x and aT5 x are active so that the active set is Wact,0 = {3, 5}. By construction x0 = x∗Wact,0 is a minimizer of f(x) subject to the constraints, which implies s0 = 0. Hence, the respective Lagrange multipliers follow from (3.56a), i.e.,

µ∗3 [1, −2]T + µ∗5 [0, −1]T = −[2, −5]T,

as µ∗3 = −2 and µ∗5 = −1. Note that (3.56a) is fulfilled here directly.

• Iteration 1: Since µ∗3 < µ∗5 the constraint 3 is removed from the working set so that Wact,1 = {5}. With x1 = x0 the solution to (3.53) is s1 = [−1, 0]T. The corresponding step length follows from (3.55) as α1 = 1 so that x2 = [1, 0]T.

• Iteration 2: For α1 = 1 no blocking constraints arise so that Wact,2 = Wact,1 = {5}. Solving (3.53) yields s2 = 0. Evaluation of (3.56a) yields the Lagrange multiplier µ∗5 = −5. Hence, the constraint 5 is dropped from the working set so that Wact,3 = ∅ with x3 = x2 = [1, 0]T.

• Iteration 3: Solution of the unconstrained problem (3.53) provides s3 = [0, 5/2]T. Evaluation of (3.55) yields α3 = 0.6 and hence x4 = [1, 3/2]T. Since a blocking constraint arises the new working set Wact,4 = {1} is obtained.

• Iteration 4: From this, s4 = [0.4, 0.2]T and α4 = 1 follow, which implies Wact,5 = Wact,4 = {1} with the next iterate x5 = [1.4, 1.7]T.

• Iteration 5: With x5 solving (3.53) gives s5 = 0 with the Lagrange multiplier µ∗1 = 0.8. Hence, the procedure terminates and x∗ = [1.4, 1.7]T is the global minimizer for the quadratic program (3.51).
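The result can be cross-checked with a generic QP solver; the following sketch assumes quadprog (Optimization Toolbox in Matlab, optim package in Octave):

  % Cross-check of Example 3.9: min (1/2)x'Px + q'x s.t. A*x <= b.
  P = [2 0; 0 2];  q = [-2; -5];
  A = [-1 2; 1 2; 1 -2; -1 0; 0 -1];
  b = [ 2; 6; 2; 0; 0 ];
  xstar = quadprog(P, q, A, b)   % returns [1.4; 1.7]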

Several active set algorithms are suggested in the literature addressing linear and quadratic programs as well as nonlinear problems. The 'Optimization Toolbox' of Matlab provides the function fmincon, which, among other techniques, also implements an active set strategy by setting options = optimoptions('fmincon','Algorithm','active-set');.

3.2.2 Gradient projection methods

In gradient projection methods given a feasible point the negative gradient of the objective function is projected onto the subspace tangent to the working surface, i.e., onto the tangent space of the manifold defined by the working set, to define the direction of movement. This is obviously similar to the method of steepest descent introduced in Section 2.2.2 for static unconstrained optimization problems. Subsequently, only the basic ideas are summarized for the case of nonlinear constraints. Detailed derivations are possible if the equality and inequality constraints are linear. For these, the reader is referred to, e.g., [5].


[Fig. 3.5: Principle of gradient projection methods: at xk ∈ M the negative gradient −(∇f)(xk) is projected onto the tangent space TxkM yielding the direction sk; moving along sk to yk and restoring feasibility orthogonally to TxkM yields xk+1.]

Recall the static constrained problem (3.47), i.e.,

min_{x∈Rn} f(x) (3.57a)

subject to

gj(x) = 0, j = 1, . . . , p (3.57b)

hl(x) ≤ 0, l = 1, . . . , q (3.57c)

and consider Figure 3.5, which shows the principle procedure. Since the manifold M is curved in general the direction obtained from the projection of the negative gradient −(∇f)(xk) at the point xk onto the tangent space TxkM might not be a feasible direction. Thus it is not directly possible to move into this direction to obtain the next point. To address this, the procedure continues by moving along the projected negative gradient to a point yk followed by a move in a direction perpendicular to the tangent plane TxkM to a nearby feasible point on M. These two steps are repeated with various points yk until a feasible point xk+1 is found that fulfills one of the descent criteria for improvement relative to the starting point xk.

The resulting moving away and onto the manifold M of feasible points induces certain difficulties, which make an implementation of projected gradient methods rather complex. For the sake of simplicity all equality and active inequality constraints are subsequently summarized in the vector h(x) ∈ Rq. The projection of −(∇f)(xk) onto the tangent space TxkM is achieved using the projection matrix

Pk = E − (∇h)(xk)[(∇h)T(xk)(∇h)(xk)]−1(∇h)T(xk). (3.58)

When moving from xk to xk+1 the gradient (∇h)(x) will change so that Pk must be recomputed in each step. Returning to the manifold M from points

yk = xk − αkPk(∇f)(xk) (3.59)

outside M is the most important feature. For this, as is indicated in Figure 3.5, one moves back to the constraint surface in a direction orthogonal to the tangent plane. Starting from a point yk this means seeking a vector ηk ∈ Rq such that

xk+1 = yk + (∇h)(xk)ηk with h(xk+1) = 0. (3.60)

It is worth noting that a vector ηk need not exist, as is shown in Figure 3.6(a). To find a first approximation to ηk the equation h(xk+1) = 0 is linearized with respect to ηk to obtain


0 = h(xk+1) = h(yk + (∇h)(xk)ηk) ≈ h(yk) + (∇h)T(xk)(∇h)(xk)ηk.

Hence, we have

ηk = −[(∇h)T(xk)(∇h)(xk)]−1 h(yk) (3.61a)

xk+1 = yk − (∇h)(xk)[(∇h)T(xk)(∇h)(xk)]−1 h(yk), (3.61b)

which converges for sufficiently small αk in (3.59).

A further difficulty of gradient projection methods with nonlinear constraints arises from the fact that previously inactive inequality constraints may be violated when moving into the direction of the projected negative gradient at xk. In this case an interpolation scheme must be used to find a new point yk along −αkPk(∇f)(xk) so that when returning to the active constraint manifold M no originally inactive constraint is violated. Figure 3.6(b) provides a graphical illustration. This is partly a trial and error process so that in practice one often relaxes the equality and inequality constraints so that they only need to be fulfilled within a certain tolerance.

[Fig. 3.6: Issues in projected gradient methods: (a) non–existence of the vector ηk; (b) interpolation to obtain a feasible point.]
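The core of one projected-gradient step with the restoration (3.58)–(3.61) can be sketched as follows in Octave/Matlab; this is a simplified illustration assuming that h collects all equality and active inequality constraints and that the user supplies the handles gradf, h and gradh (with gradh returning the n × q matrix (∇h)(xk)):

  % One projected-gradient step with iterated restoration onto h = 0.
  function xk1 = gradient_projection_step(xk, alpha, gradf, h, gradh, tol)
    N = gradh(xk);                          % (nabla h)(xk), n x q
    P = eye(length(xk)) - N*((N'*N)\N');    % projection matrix (3.58)
    y = xk - alpha*P*gradf(xk);             % move in tangent plane (3.59)
    for i = 1:50                            % restoration, cf. (3.61b)
      y = y - N*((N'*N)\h(y));
      if norm(h(y)) < tol, break; end
    end
    xk1 = y;
  end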

3.2.3 Penalty and barrier methods

Penalty and barrier methods represent procedures for approximating constrained problems by unconstrained problems. In the first case a term is added to the objective function to penalize the constraints by inducing a large cost in case of constraint violation. In the second case a barrier is imposed on the boundary of the feasible region to prevent search algorithms from leaving this region. Herein, two main issues arise concerning the quality of approximation of the constrained problem by the unconstrained one and the question of how unconstrained problems are solved when the objective function includes a penalty or barrier term. Penalty and barrier methods are very important in view of applications since they in general provide rather simple methods to address constrained optimization problems. Moreover, these techniques are also theoretically appealing in terms of their convergence properties and convergence rates.

3.2.3.1 Penalty methods Penalty methods are based on the approximation of the constrained optimization problem


min_{x∈Xad} f(x) (3.62a)

with (see also (3.1)–(3.3)) admissible set

Xad = {x ∈ Rn : gj(x) = 0, j = 1, . . . , p and hl(x) ≤ 0, l = 1, . . . , q} (3.62b)

by an unconstrained optimization problem of the form

min_{x∈Rn} f(x) + cP(x) (3.63)

with the positive constant c and the continuous penalty function P(x). The term f(x) + cP(x) is also called auxiliary function. Here, P(x) ≥ 0 for all x ∈ Rn and P(x) = 0 if and only if x ∈ Xad.

If only inequality constraints arise so that

Xad = {x ∈ Rn : hl(x) ≤ 0, l = 1, . . . , q}, (3.64)

then a possible penalty function is given by

P(x) = (1/2) ∑_{l=1}^{q} (max{0, hl(x)})². (3.65)

Figure 3.7 shows the behavior of the penalty function cP(x) with P(x) defined in (3.65) for h1(x) = x − 20 and h2(x) = 10 − x.

[Fig. 3.7: Penalty function cP(x) defined in (3.65) for c ∈ {1, 10, 100}.]

If both equality and inequality constraints arise so that Xad is given by (3.62b), then the penalty function can be chosen as

P(x) = ∑_{j=1}^{p} φ(gj(x)) + ∑_{l=1}^{q} ψ(hl(x)), (3.66)

where φ(·) and ψ(·) are continuous functions satisfying

φ(z) = 0 for z = 0 and φ(z) > 0 otherwise, ψ(z) = 0 for z ≤ 0 and ψ(z) > 0 otherwise. (3.67)

Often φ( · ) and ψ( · ) are assigned in the form


φ(z) = |z|^r, ψ(z) = (max{0, z})^r (3.68)

with some positive integer r [3].

For large values of c the minimizer x∗ of (3.63) will be in a region where P(x) is small, i.e. close to Xad. Ideally as c → ∞ the solution of (3.63) will converge to the solution of (3.62). To address this issue the penalty method makes use of a sequence (ck)k, ck+1 > ck > 0, approaching infinity and the sequential solution x∗k of the unconstrained penalized optimization problem (3.63). In each subsequent solution step k + 1 for ck+1 the previous value x∗k is used as starting value. The proof of convergence of this procedure makes use of the following lemmas [5].

Lemma 3.2. The solutions x∗k and x∗k+1 of the unconstrained penalized optimization problem (3.63) for ck+1 > ck > 0 fulfill the inequalities

f(x∗k) + ck P(x∗k) ≤ f(x∗k+1) + ck+1 P(x∗k+1)
P(x∗k) ≥ P(x∗k+1)
f(x∗k) ≤ f(x∗k+1).

Lemma 3.3. Let x∗ be a solution to the constrained optimization problem (3.62). Then for each k we have

f(x∗) ≥ f(x∗k) + ck P(x∗k) ≥ f(x∗k).

Exercise 3.8. Prove Lemmas 3.2 and 3.3.

With these intermediate results, the following theorem can be verified, which implies that any limit point of the sequence (x∗k)k generated by the penalty method is a solution to (3.62).

Theorem 3.12 (Convergence of the penalty method). Let (x∗k)k be a sequence generated as solutions to the unconstrained penalized optimization problem (3.63) for a sequence (ck)k approaching infinity with ck+1 > ck > 0. Then any limit point of the sequence (x∗k)k is a solution of the constrained optimization problem (3.62).

The proof of this result can be found, e.g., in [1, 5]. Note that under the conditions (i) f(x), gj(x), hl(x) ∈ C1(Rn) for j = 1, . . . , p, l = 1, . . . , q, (ii) P(x) ∈ C1(Rn) and (iii) the limit point x∗ of the sequence (x∗k)k is a regular point it is possible to recover the Lagrange multipliers λ and µ associated with the equality and inequality constraints at x∗ from the penalized unconstrained problem [3, 5]. In particular it can be shown that λk → λ∗ and µk → µ∗ as ck → ∞.

The generic schematics of a penalty method is summarized in Algorithm 6. Although superlinear convergence can be achieved for certain problems, numerical problems might occur due to ill–conditioning for large values of c in the penalized problem. The larger ck, the more emphasis is placed on feasibility of the iterate xk, and procedures for unconstrained numerical optimization may quickly approach a feasible point. However, this point may be far from being optimal. In addition, round–off errors may yield slow convergence and premature termination of the algorithm.

Example 3.10. Consider the constrained static optimization problem

min_x f(x) = x

subject to the inequality constraint

h(x) = 2 − x ≤ 0.

Obviously, x∗ = 2 with f(x∗) = 2. Consider now the penalized unconstrained problem (3.63) with P(x) = (max{0, 2 − x})², i.e., f(x) + cP(x) = x + c(max{0, 2 − x})². It can be shown that this auxiliary function is


Algorithm 6: Schematic penalty method.

input:      x0 (starting value)
            ε (stopping criterion)
            c0 > 0 (starting value)
            β > 1
initialize: k = 0
repeat
    Determine xk+1 ∈ arg min_{x∈Rn} {f(x) + ck P(x)}
    Compute ck+1 = β ck
    Update k = k + 1
until ck P(xk+1) ≤ ε;

convex for any c [3]. Hence, a necessary and sufficient condition for optimality is that f′(x) + cP′(x) = 0. The latter implies 1 − 2c(2 − x) = 0 for x < 2 so that x = 2 − 1/(2c) and f(x) + cP(x) = 2 − 1/(4c). Thus x∗ and f(x∗) can be approximated arbitrarily closely for sufficiently large c as is illustrated in Figure 3.8.

[Fig. 3.8: Penalization and auxiliary function for Example 3.10 for c ∈ {0.5, 1, 2}: (a) penalization cP(x); (b) auxiliary function f(x) + cP(x).]

The Lagrange multiplier µ associated with the inequality constraint can be recovered from the penalization as µ = cP′(x) if h(x) is active. Thus, µ = 2c max{0, 2 − x} = 2c max{0, 1/(2c)} = 1 for any c > 0. The property that µ is independent of c in this example follows from the fact that h(x) is linear.

3.2.3.2 Barrier or interior–point methods Barrier methods are applicable to constrained optimization problems

min_{x∈Xad} f(x) (3.69a)

with admissible set⁴

Xad = {x ∈ Rn : gj(x) = 0, j = 1, . . . , p and hl(x) ≤ 0, l = 1, . . . , q} (3.69b)

⁴ Often Xad is split into the set of inequality constraints h(x) ≤ 0 and a set X ⊂ Rn into which also equality constraints (if there are any) should be included [3].


if Xad ≠ ∅ is a so–called robust set, i.e., any point on the border of Xad can be reached from the interior of Xad. Here, barrier functions are used to transform the constrained problem (3.69) into an unconstrained problem or into a sequence of unconstrained problems. These functions are used to generate a barrier so that the iterates cannot leave the admissible set. This also motivates the alternative terminology interior–point methods.

Let intXad denote the interior of Xad. Then a barrier function B(x) is defined on intXad such that

(i) B(x) is continuous;

(ii) B(x) ≥ 0;

(iii) B(x)→∞ as x approaches the boundary of Xad.

Selected examples of barrier functions depending, of course, on Xad are provided subsequently. Let hl(x) ≤ 0, l = 1, . . . , q be continuous on Rn and let Xad = {x ∈ Rn : hl(x) ≤ 0, l = 1, . . . , q} be a robust set with intXad = {x ∈ Rn : hl(x) < 0, l = 1, . . . , q}. Then the function

B(x) = −∑_{l=1}^{q} 1/hl(x) (3.70)

is a barrier function fulfilling the conditions formulated above. The behavior of B(x)c is depicted in Figure

3.9(a) for different values of c given h1(x) = x − 20 and h2(x) = 10− x. With identical preliminaries asabove the function

B(x) = −∑_{l=1}^{q} ln(min{1, −hl(x)}) (3.71)

is a barrier function fulfilling the conditions formulated above. The function B(x)/c is shown in Figure 3.9(b) for different values of c given h1(x) = x − 20 and h2(x) = 10 − x. This function is quite commonly used in interior–point methods and is referred to as Frisch's logarithmic barrier function [3]. The procedure used

[Fig. 3.9: Barrier functions B(x)/c for c ∈ {0.5, 1, 2}: (a) function (3.70); (b) function (3.71).]

in interior–point methods is in principle identical to the set–up of penalty methods. The unconstrained optimization problem


min_{x∈intXad} f(x) + (1/c)B(x) (3.72)

is successively solved for a sequence (ck)k, ck+1 > ck > 0, of coefficients. Let x∗k denote the respective solution in the k–th step; the convergence of the interior–point method then follows in a way similar to the convergence of the penalty method.

Theorem 3.13 (Convergence of the interior–point method). Let (x∗k)k be a sequence generated as solutions to the unconstrained optimization problem (3.72) for a sequence (ck)k approaching infinity with ck+1 > ck > 0. Then any limit point of the sequence (x∗k)k is a solution of the constrained optimization problem (3.69).

Consult, e.g., [1] for a proof of this theorem. Similar to penalty methods the associated Lagrange multipliers λ and µ can be recovered from the unconstrained problem (3.72) under the conditions (i) f(x), gj(x), hl(x) ∈ C1(Rn) for j = 1, . . . , p, l = 1, . . . , q, (ii) B(x) ∈ C1(Rn) and (iii) the limit point x∗ of the sequence (x∗k)k is a regular point [3, 5]. In particular it can be shown that λk → λ∗ and µk → µ∗ as ck → ∞.

A generic algorithm for the interior–point method is summarized in Algorithm 7.

Algorithm 7: Schematic interior–point method.

input:      x0 fulfilling h(x0) < 0 (feasible starting value)
            ε (stopping criterion)
            c0 > 0 (starting value)
            β > 1
initialize: k = 0
repeat
    Determine xk+1 ∈ arg min_{x∈intXad} {f(x) + (1/ck)B(x)}
    Compute ck+1 = β ck
    Update k = k + 1
until (1/ck)B(xk+1) ≤ ε;

Interior–point methods for constrained nonlinear optimization problems (3.69) have to compete with several computational difficulties. Finding a feasible starting point x0 satisfying h(x0) < 0 might be difficult for certain problems. Due to the structure of the barrier function B(x) iterative techniques may face ill–conditioning and difficulties with round–off errors when ck is large, i.e., when the boundary of Xad is approached. Nevertheless, interior–point methods are nowadays considered competitive algorithms. Some free and commercial implementations of interior–point methods are summarized in Table 3.1.

Solver   | License                      | Webpage/Option
Ipopt    | Eclipse Public License (EPL) | https://projects.coin-or.org/Ipopt
LOQO     | Commercial                   | http://www.princeton.edu/~rvdb/loqomenu.html
KNITRO   | Commercial                   | http://www.ziena.com/knitro.htm
fmincon  | Commercial (Matlab)          | optimoptions('fmincon','Algorithm','interior-point');

Table 3.1: Free and commercial interior–point solvers.

Example 3.11. Consider Example 3.10, i.e.,


min_x f(x) = x

subject to the inequality constraint

h(x) = 2 − x ≤ 0.

Taking into account the barrier function (3.70) yields the auxiliary function f(x) + B(x)/c = x − 1/(c(2 − x)). Since this function is convex a necessary and sufficient condition for x to be a minimizer is that f′(x) + B′(x)/c = 0. This implies 1 − 1/(c(2 − x)²) = 0 and thus x = 2 + 1/√c such that f(x) + B(x)/c = 2 + 2/√c. Thus we can approach the minimizer x∗ = 2 and the minimal value f(x∗) = 2 arbitrarily closely for sufficiently large c. Figure 3.10 shows the respective numerical results for different values of c.

[Fig. 3.10: Barrier function and auxiliary function for Example 3.11 for c ∈ {1, 5, 20}: (a) barrier function B(x)/c; (b) auxiliary function f(x) + B(x)/c.]

To recover the Lagrange multiplier µ associated with h(x) ≤ 0 consider µ = 1/(c(h(x))²) = 1/(c(2 − x)²) = 1 for x = 2 + 1/√c. The value is constant since h(x) is linear and equals the exact Lagrange multiplier associated with the inequality constraint.

3.2.4 Sequential quadratic programming

In this section, a numerical approach based on Newton's method is introduced to solve static constrained optimization problems, namely sequential quadratic programming (SQP). SQP can be considered rather a methodology than a single algorithm. The underlying idea is to successively linearize the optimality conditions and to solve the resulting sequence of quadratic programs. One thereby distinguishes between local and global SQP methods with the latter enabling convergence starting at points remote from the minimizer.

3.2.4.1 Local SQP method To motivate the principal idea consider the static equality constrained optimization problem

min_{x∈Rn} f(x) (3.73a)

subject to


gj(x) = 0, j = 1, . . . , p (3.73b)

with f(x), gj(x) ∈ C2(Rn) for j = 1, . . . , p with p < n. Given a local minimum x∗, which is assumed to be a regular point of the equality constraints, the KKT first order necessary optimality conditions of Theorem 3.2 imply the existence of a unique Lagrange multiplier λ∗ ∈ Rp such that

[(∇f)(x∗) + (∇g)(x∗)λ∗; g(x∗)] = 0. (3.74)

This system of n + p (nonlinear) equations in n + p unknowns (x∗,λ∗) can be, e.g., solved numerically using Newton's method. Taking into account the exposition in Section 2.2.2.2 the iteration rule (2.43) yields

[xk+1; λk+1] = [xk; λk] + [rxk; rλk] (3.75a)

with

[L(xk,λk), (∇g)(xk); (∇g)T(xk), 0] [rxk; rλk] = −[(∇f)(xk) + (∇g)(xk)λk; g(xk)], (3.75b)

where the coefficient matrix on the left is abbreviated by Mk,

with L(xk,λk) = (∇²l)(xk,λk) for the Lagrangian l(x,λ) = f(x) + λTg(x). The iteration is well–defined when the matrix Mk, also named KKT matrix, in (3.75b) has full rank. This condition is fulfilled when (i) the matrix (∇g)(xk) has linearly independent column vectors, i.e., when the LICQ condition is satisfied, and when (ii) for all d ∈ TxkM we have dTL(xk,λk)d > 0. The latter requirement is satisfied by Theorem 3.4 for a strict local minimum (x∗,λ∗) so that by the assumption of continuous differentiability of f(x) and gj(x) positive definiteness extends to a sufficiently small neighborhood of (x∗,λ∗). With this, Newton's method converges quadratically provided that the starting point of the iteration is close enough to the minimizer.

The iteration (3.75) can be alternatively viewed as the successive solution to the quadratic program

min_{pk∈Rn} f(xk) + (∇f)T(xk)pk + (1/2)pTk L(xk,λk)pk (3.76a)

subject to

(∇g)T(xk)pk + g(xk) = 0. (3.76b)

The KKT first order necessary optimality conditions for (3.76) read

[L(xk,λk), (∇g)(xk); (∇g)T(xk), 0] [p∗k; λ∗,pk] = −[(∇f)(xk); g(xk)] (3.77)

with the Lagrange multiplier λ∗,pk. However, (3.75b) with rλk = λk+1 − λk results in

[L(xk,λk), (∇g)(xk); (∇g)T(xk), 0] [rxk; λk+1] = −[(∇f)(xk); g(xk)]. (3.78)

Comparing (3.77) with (3.78) yields that the iterate (xk+1,λk+1) can be either interpreted as the iterate generated by Newton's method from (3.75) or as the solution of the quadratic program (3.76). The latter directly follows from the substitution rxk = p∗k and

xk+1 = xk + p∗k, λk+1 = λ∗,pk. (3.79)


Algorithm 8: Schematic SQP method for solving (3.73).

input:      x0 (feasible starting point)
            λ0 (starting point)
            ε (stopping criterion)
initialize: k = 0
repeat
    Evaluate f(xk), (∇f)(xk), L(xk,λk), g(xk), (∇g)(xk)
    Solve the quadratic program (3.76) for p∗k, λ∗,pk
    Update xk+1 = xk + p∗k, λk+1 = λ∗,pk, k = k + 1
until ‖p∗k‖ ≤ ε;

If p∗k = 0, then (3.77) yields that a point x∗ = xk, λ∗ = λ∗,pk is found that satisfies the KKT conditions (3.74) of the original problem (3.73). The name sequential quadratic programming hence refers to the fact that a sequence of quadratic programs is solved in the procedure. The resulting simplest form of the SQP algorithm is summarized in Algorithm 8.
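A single iteration of this local SQP scheme amounts to one Newton step on the KKT system; the following Octave/Matlab sketch (with user-supplied handles gradf, hessL, g, gradg as assumptions) implements (3.75) with the update (3.79):

  % One local SQP step for (3.73): solve the KKT system (3.75b) and update.
  function [x1, lam1] = sqp_step(xk, lamk, gradf, hessL, g, gradg)
    Gk = gradg(xk);                          % (nabla g)(xk), n x p
    Lk = hessL(xk, lamk);                    % Hessian of the Lagrangian
    Mk = [Lk, Gk; Gk', zeros(size(Gk,2))];   % KKT matrix
    r  = Mk \ (-[gradf(xk) + Gk*lamk; g(xk)]);
    x1   = xk + r(1:length(xk));             % xk+1 = xk + p*
    lam1 = lamk + r(length(xk)+1:end);       % lambda_{k+1} = lambda_k + r_lambda
  end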

The transfer to static optimization problems with both equality and inequality constraints follows rather similarly to the previous exposition. For this, consider

min_{x∈Rn} f(x) (3.80a)

subject to

gj(x) = 0, j = 1, . . . , p (3.80b)

hl(x) ≤ 0, l = 1, . . . , q (3.80c)

with f(x), gj(x), hl(x) ∈ C2(Rn) for j = 1, . . . , p with p < n and l = 1, . . . , q. The previous results imply that the solution to (3.80) can be approximated in every iteration step by the quadratic program

min_{pk∈Rn} f(xk) + (∇f)T(xk)pk + (1/2)pTk L(xk,λk,µk)pk (3.81a)

subject to

(∇g)T(xk)pk + g(xk) = 0 (3.81b)

(∇h)T(xk)pk + h(xk) ≤ 0 (3.81c)

with L(xk,λk,µk) = (∇²l)(xk,λk,µk) for the Lagrangian l(x,λ,µ) = f(x) + λTg(x) + µTh(x). The KKT first order necessary conditions of Theorem 3.6 for (3.81) read

(∇f)(xk) + L(xk,λk,µk)p∗k + (∇g)(xk)λ∗,pk + (∇h)(xk)µ∗,pk = 0 (3.82a)

(∇g)T (xk)p∗k + g(xk) = 0 (3.82b)

(∇h)T (xk)p∗k + h(xk) ≤ 0 (3.82c)((p∗k)T (∇h)(xk) + hT (xk)

)µ∗,pk = 0 (3.82d)

µ∗,pk ≥ 0 (3.82e)

with the Lagrange multipliers λ∗,pk ∈ Rp and µ∗,pk ∈ Rq. The iteration rule for the SQP procedure followsdirectly from above as


xk+1 = xk + p∗k
λk+1 = λ∗,pk
µk+1 = µ∗,pk.    (3.83)

Algorithm 9: Schematic SQP method for solving (3.80).

input: x0 (feasible starting point), λ0, µ0 (starting points), ε (stopping criterion)
initialize: k = 0
repeat
    Evaluate f(xk), (∇f)(xk), L(xk,λk,µk), g(xk), (∇g)(xk), h(xk), (∇h)(xk)
    Solve the quadratic program (3.81) for p∗k, λ∗,pk, µ∗,pk
    Update xk+1 = xk + p∗k, λk+1 = λ∗,pk, µk+1 = µ∗,pk, k = k + 1
until ‖p∗k‖ ≤ ε

If an iteration step is reached with p∗k = 0, then (3.82) implies that the point x∗ = xk, λ∗ = λ∗,pk, µ∗ = µ∗,pk fulfills the KKT conditions of the original optimization problem (3.80). Algorithm 9 summarizes the procedure. Herein, e.g., the active set method introduced in Section 3.2.1 can be used to solve the quadratic program. Convergence of the SQP method can be ensured assuming that certain conditions are fulfilled and that the starting point is sufficiently close to the local minimizer [7].

Theorem 3.14. Let x∗ be a local solution of (3.80) fulfilling the KKT second order necessary optimality conditions (3.35). Assume that the LICQ condition and the complementary slackness condition hold and that (x∗,λ∗,µ∗) satisfies the KKT second order sufficient optimality conditions. If (xk,λk,µk) is sufficiently close to (x∗,λ∗,µ∗), then there is a local solution of the quadratic program (3.81) whose active set Aieq(xk) is the same as the active set Aieq(x∗) of (3.80).

The proof of this result can, e.g., be found in [8]. In principle a quadratic convergence rate of the SQP method can be verified, which, as does the theorem above, assumes that the starting values (x0,λ0,µ0) lie close to (x∗,λ∗,µ∗). This explains the notion of a local SQP method.

Remark 3.4. The formulation of the quadratic program relies on the availability of the Hessian L(xk,λk,µk) of the Lagrangian l(xk,λk,µk). Depending on the particular application, the computation of the second order derivatives can become burdensome since the terms might not be available or computable analytically and finite difference approximations might induce additional errors. Moreover, the Hessian could become indefinite if the current iterate (xk,λk,µk) is not in a sufficiently small neighborhood of (x∗,λ∗,µ∗). In order to address these issues, it is useful to replace the exact Hessian L(xk,λk,µk) by a positive definite quasi–Newton approximation Hk (cf. the quasi–Newton methods in Chapter 2). For this, a modified (damped) BFGS method (see, e.g., [7]) can be used according to

Hk+1 = Hk + (rk rTk)/(rTk dk) − (Hk dk dTk Hk)/(dTk Hk dk)    (3.84a)

with

dk = xk+1 − xk    (3.84b)
rk = θk yk + (1 − θk) Hk dk    (3.84c)
yk = (∂l/∂x)T(xk+1,λk,µk) − (∂l/∂x)T(xk,λk,µk)    (3.84d)
θk = 1 if dTk yk ≥ 0.2 dTk Hk dk, and θk = 0.8 dTk Hk dk/(dTk Hk dk − dTk yk) else.    (3.84e)

It can be shown that these equations guarantee that Hk+1 is symmetric and positive definite if Hk is symmetric and positive definite. Hence, Algorithm 9 can be modified by adding a symmetric, positive definite matrix H0 as additional input and replacing the evaluation of L(xk,λk,µk) by Hk determined from (3.84). While this in general deteriorates the quadratic convergence rate, it is still possible to guarantee superlinear convergence provided that the starting value (x0,λ0,µ0) is close to (x∗,λ∗,µ∗).
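A minimal Matlab sketch of the damped update (3.84) reads as follows; the vectors dk and yk are assumed to be computed from (3.84b) and (3.84d) outside of the function.

function H = bfgs_damped(H,dk,yk)
% Damped BFGS update (3.84); if Hk is symmetric and positive definite,
% the returned Hk+1 is again symmetric and positive definite.
Hd  = H*dk;
dHd = dk.'*Hd;
if dk.'*yk >= 0.2*dHd
    theta = 1;                             % first case in (3.84e)
else
    theta = 0.8*dHd/(dHd - dk.'*yk);       % second case in (3.84e)
end
r = theta*yk + (1-theta)*Hd;               % (3.84c)
H = H + (r*r.')/(r.'*dk) - (Hd*Hd.')/dHd;  % (3.84a)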

For details on the various options and approaches for the realization of SQP methods, the reader is in particular referred to [7, Chapter 18].

3.2.4.2 Globalized SQP method

Although convergence of SQP methods can only be guaranteed under the assumption that the starting point of the iteration is sufficiently close to the minimizer, numerical experience shows that SQP methods often converge to a solution also from remote starting points. To achieve a globalization of the SQP method a step length αk can be introduced so that the update of xk+1 in Algorithm 9 is replaced by

xk+1 = xk + αkp∗k. (3.85)

Herein, αk is computed using a line search approach. In view of the equality and inequality constraints it is necessary to provide a mechanism during the line search that allows to decide whether a trial step can be accepted or has to be rejected. In the unconstrained case this is directly possible by simply comparing the objective function values at xk and xk+1. In the constrained case this no longer holds since the SQP method in general produces iterates xk ∉ Xad. Thus the situation may arise that the value of the objective function f(x) is decreased at the iterate xk+1 but at the cost of an increased violation of the constraints. To address this issue SQP methods often employ a so–called merit function as a measure of progress to evaluate the choice of αk. Different techniques such as penalty functions or augmented Lagrangians are available, of which subsequently only the non-smooth ℓ1 (penalty) merit function is presented. It is defined as

l1(x;σ) = f(x) + σ Σ_{j=1}^{p} |gj(x)| + σ Σ_{l=1}^{q} max{0, hl(x)},  σ > 0.    (3.86)

The optimal step length then follows from the line search problem

αk = arg min_α l1(xk + α p∗k; σ).    (3.87)

In practice αk is chosen to achieve a sufficient improvement of l1(xk + α p∗k; σ) compared to l1(xk; σ). For this, one of the procedures addressed in Section 2.2.2.1 can be applied, e.g., a backtracking approach with termination criterion as suggested in [7]. Herein, one starts with the unit step length and terminates when the condition

l1(xk + αk p∗k; σ) ≤ l1(xk; σ) − ρ αk (qσ(0) − qσ(p∗k))    (3.88)

is fulfilled, where ρ ∈ (0, 1) and

qσ(p) = f(xk) + (∇f)T(xk) p + ½ pT Hk p + σ Σ_{j=1}^{p} |gj(xk) + (∇gj)T(xk) p| + σ Σ_{l=1}^{q} max{0, hl(xk) + (∇hl)T(xk) p}.    (3.89)

Note that in general an adjustment of σ is necessary in each iteration step. Under somewhat restrictive assumptions, global convergence of the globalized SQP method applied to (3.80) can be verified.
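The backtracking procedure based on (3.88) can be sketched in Matlab as follows; l1merit and qsigma are assumed to be function handles evaluating (3.86) and (3.89) for fixed xk and σ (a hypothetical interface, chosen here only for illustration).

function alpha = backtrack_l1(l1merit,qsigma,xk,pk,rho)
% Backtracking line search with termination criterion (3.88):
% start with the unit step length and halve until (3.88) holds.
alpha = 1;
pred  = qsigma(zeros(size(pk))) - qsigma(pk);   % predicted decrease
while l1merit(xk + alpha*pk) > l1merit(xk) - rho*alpha*pred
    alpha = 0.5*alpha;
    if alpha < 1e-10, break; end                % safeguard
end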


Moreover, extensions employing trust region techniques allow to weaken the assumptions (see, e.g., the discussion in [7] and the references therein).

If each local minimum of the nonlinear constrained optimization problem (3.80) is also a local minimum of the merit function, then we refer to the merit function as an exact merit function. The ℓ1 merit function (3.86) is exact if σ is larger than the largest modulus of any Lagrange multiplier at a local minimum (x∗,λ∗,µ∗) satisfying the KKT conditions. In this case, x∗ is also a local minimum of l1(x;σ).

Finally it should be pointed out that a different and computationally more efficient approach avoiding some disadvantages of merit functions is given by so–called filter methods. For further details the reader is again referred to, e.g., [7] and the references therein.

Some free and commercial implementations of SQP methods are summarized in Table 3.2.

Solver    | License                          | Webpage/Option
SNOPT     | Commercial, free student version | http://www.sbsi-sol-optimize.com/asp/sol_product_snopt.htm
FilterSQP | –                                | http://www.mcs.anl.gov/~leyffer/solvers.html
sqp       | GNU (Octave)                     | http://www.gnu.org/software/octave/
fmincon   | Commercial (Matlab)              | optimoptions('fmincon','Algorithm','sqp');

Table 3.2: Free and commercial SQP solvers.

3.3 Benchmark example

Similar to Section 2.3 we make use of the Rosenbrock function to evaluate the properties and convergence behavior of numerical techniques for solving static constrained optimization problems. For this consider

min_{x∈R2}  f(x) = 100(x2 − x1²)² + (x1 − 1)²    (3.90)

subject to

h(x) = r² − (x1² + x2²) ≤ 0    (3.91)

for r = 0.5. For the solution of (3.90) subsequently the different solvers implemented in the function fmincon of the Matlab 'Optimization Toolbox' are considered⁵. This comprises:

• active-set, using sequential quadratic programming with the solution of the arising quadratic program by an active set strategy;
• interior-point, implementing logarithmic barrier functions;
• trust-region-reflective, implementing a trust region approach (see Section 2.2.3) extended to constrained nonlinear objective functions with linear equality constraints Ax = b and so–called box constraints x− ≤ x ≤ x+;
• sqp, implementing an SQP method similar to the active-set algorithm but with a different iteration strategy.

Since the trust-region-reflective algorithm does not allow nonlinear inequality constraints, the analysis of this option is omitted. In addition, Ipopt is used as an alternative implementation of interior point methods, which also provides an interface to Matlab [9].

5 See, e.g., http://www.mathworks.de/de/help/optim/ug/constrained-nonlinear-optimization-algorithms.html for further information regarding constrained nonlinear optimization using Matlab.


Item | Method                         | Iter. | f(x∗)          | #eval(f)
1    | Matlab fmincon: active set     | 49    | 1.1245 × 10⁻⁷  | 464
2    | Matlab fmincon: interior point | 33    | 2.1151         | 117
3    | Matlab fmincon: SQP            | 40    | 4.5598 × 10⁻¹⁶ | 782
4    | Ipopt: interior point          | 18    | 0.1558         | 38 (+19 (∇f))

Table 3.3: Comparison of algorithms for static constrained optimization applied to (3.90).

Table 3.3 summarizes the results of a comparison of the different algorithms when applied to the Rosenbrock problem with initial value x0 = [−1,−1]T. The corresponding behavior of the iterates is depicted in Figure 3.11. It is obvious that both interior point algorithms converge to a local minimum with the minimizer being feasible, i.e., the inequality constraint is fulfilled. The active set and SQP strategies provided by fmincon achieve convergence to the global minimum x∗ = [1, 1]T. Comparing the computational burden to the results of Section 2.3 for the unconstrained case directly illustrates the increased number of function evaluations that is necessary to converge to the solution in the constrained setting.

[Figure 3.11: surface plots of f over x1 ∈ [−1, 1] and x2 ∈ [−1, 3] with the iterates of the individual methods: (a) active set method using fmincon, (b) interior point method using fmincon, (c) SQP method using fmincon, (d) interior point method using Ipopt.]

Fig. 3.11: Comparison of algorithms for static constrained optimization applied to the Rosenbrock problem (3.90). Here, × refers to the minimum f(x∗) = 0 and + denotes the final value of the individual methods.

Using the implementation of the Rosenbrock function provided in Section 2.3, the results of Table 3.3 and Figure 3.11 are obtained with the Matlab function below. It is thereby assumed that Ipopt and the Ipopt Matlab interface are properly installed.


function Xopt = rosenbrock_optim(x0,method)
%
options = [];
options.Display = 'iter';
if method==1
    % fmincon - interior point
    options.Method = 'fmincon_ip';
    opt = optimoptions('fmincon','Algorithm','interior-point','GradObj','on',...
        'Hessian','bfgs','MaxFunEvals',2000,'TolX',1e-12,'Display','iter');
elseif method==2
    % fmincon - active set
    options.Method = 'fmincon_as';
    opt = optimoptions('fmincon','Algorithm','active-set','GradObj','on',...
        'MaxFunEvals',2000,'TolX',1e-12,'Display','iter');
elseif method==3
    % fmincon - trust region
    options.Method = 'fmincon_tr';
    opt = optimoptions('fmincon','Algorithm','trust-region-reflective',...
        'GradObj','on','MaxFunEvals',2000,'TolX',1e-12,'Display','iter');
    error('Do not use the trust-region-reflective method since the constraints are nonlinear!');
elseif method==4
    % fmincon - sqp
    options.Method = 'fmincon_sqp';
    opt = optimoptions('fmincon','Algorithm','sqp','GradObj','on',...
        'MaxFunEvals',2000,'TolX',1e-12,'Display','iter');
elseif method==5
    % ipopt
    options.Method = 'ipopt';
    options.cl = [-inf]; % Lower bounds on constraints.
    options.cu = [0.5];  % Upper bounds on constraints.
    options.ipopt.jac_c_constant = 'no';
    options.ipopt.hessian_approximation = 'limited-memory';
    options.ipopt.mu_strategy = 'adaptive';
    options.ipopt.max_iter = 2000;
    options.ipopt.tol = 1e-18;
    options.ipopt.print_level = 5;
    options.ipopt.file_print_level = 10;
    options.ipopt.output_file = 'ipopt_output';
    funcs.objective = @ipopt_objective;
    funcs.constraints = @ipopt_constraints;
    funcs.gradient = @ipopt_gradient;
    funcs.jacobian = @ipopt_jacobian;
    funcs.jacobianstructure = @() sparse(ones(1,2));
end

if method<5
    [Xopt,fopt,exitflag,output] = fmincon(@rosenbrock,x0,[],[],[],[],[],[],@constraints,opt);
elseif method==5
    [Xopt,info] = ipopt(x0,funcs,options);
end

% Begin subfunctions
function [Cieq,Ceq] = constraints(x)
Cieq = 0.5^2-(x(1)^2+x(2)^2);


Ceq = [];

function out = ipopt_constraints(x)

out = x(1)^2+x(2)^2;

function out = ipopt_objective(x)

out = rosenbrock(x);

function out = ipopt_gradient(x)

[dummy,out] = rosenbrock(x);

function out = ipopt_jacobian(x)

out = sparse(2*[x(1),x(2)]);

% End subfunctions

3.4 Optimization software

There is an extensive set of software for the solution of static unconstrained and constrained optimization problems addressing different aspects and problem settings.

A very detailed overview is, e.g., provided on the website http://www.mat.univie.ac.at/~neum/glopt/software_l.html. Moreover, the NEOS (Network–Enabled Optimization System) server should be mentioned, which provides a free Internet-based service to be reached at http://www.neos-server.org/neos/ for solving optimization problems, including interfaces to various free and commercial optimization routines.

While only a few free and commercial optimization tools have an interface to Matlab, most utilize certain modeling languages such as AMPL (http://www.ampl.com) or GAMS (http://www.gams.com). These can also be accessed on the NEOS server.

References

1. Bazaraa M, Sherali H, Shetty C (2006) Nonlinear Programming: Theory and Algorithms, 3rd edn. John Wiley & Sons, New York
2. Boyd S, Vandenberghe L (2004) Convex Optimization. Cambridge University Press, Cambridge
3. Chachuat B (2007) Nonlinear and Dynamic Optimization: From Theory to Practice. Tech. rep., Laboratoire d'Automatique, EPFL Lausanne, http://infoscience.epfl.ch/record/111939/files/Chachuat_07(IC32).pdf
4. Hale J, Kocak H (1991) Dynamics and Bifurcations, Texts in Applied Mathematics, vol 3. Springer–Verlag, New York
5. Luenberger D, Ye Y (2008) Linear and Nonlinear Programming, 3rd edn. Springer
6. Meurer T (2013) Regelung nichtlinearer Systeme. Skriptum zur Vorlesung, Lehrstuhl für Regelungstechnik, Christian–Albrechts–Universität Kiel, http://www.control.tf.uni-kiel.de/en/teaching/summer-term/nonlinear-control-systems-etit-501
7. Nocedal J, Wright S (2006) Numerical Optimization, 2nd edn. Springer, New York (NY)
8. Robinson S (1976) Perturbed Kuhn–Tucker points and rates of convergence for a class of nonlinear programming problems. Mathematical Programming 7:1–16
9. Wächter A, Biegler L (2006) On the Implementation of a Primal–Dual Interior Point Filter Line Search Algorithm for Large-Scale Nonlinear Programming. Mathematical Programming 106(1):25–57


Chapter 4

Dynamic optimization

In the following, dynamic optimization problems are considered, where the decision variables x(t) are no longer elements of the Euclidean space Rn but are elements of an infinite-dimensional (normed) function space (X, ‖·‖X). Herein, the goal is to minimize (or to maximize) an objective functional, also referred to as cost functional or performance index, J(·) : X → R with respect to x(t).

4.1 Problem statement and preliminaries

The general formulation of the cost functional is given in terms of the so–called Bolza form

J(u) = ϕ(te, x(te)) + ∫_{t0}^{te} l(t, x(t), u(t)) dt.    (4.1)

Herein, te and x(te) are the terminal time (or end-time) and the terminal state, respectively, l : [t0,te] × Rn × Rm → R is the running cost (also Lagrangian or Lagrangian density), and ϕ : R × Rn → R is the terminal cost. The trajectory x(t) is governed by the dynamic system

ẋ = f(t, x, u),  t > t0,  x(t0) = x0 ∈ Rn    (4.2)

with the vector u(t) ∈ U representing the control (or input or manipulated variable). The space U refers to the control space, which is made precise below. It is assumed that there exists at least a locally unique solution to (4.2), which can be addressed by exploiting the local (or global) Lipschitz continuity of f(t,x,u). There are two important special cases of the Bolza form. If ϕ(te,x(te)) = 0 the so–called Lagrange form arises, i.e.,

J(u) = ∫_{t0}^{te} l(t, x(t), u(t)) dt.    (4.3)

The so–called Mayer form is obtained when l(t,x,u) = 0, i.e.,

J(u) = ϕ(te, x(te)).    (4.4)

Remark 4.1. It is possible to convert these two special cases into each other by simple transformations. Starting with the Mayer form (4.4) we obtain

ϕ(te, x(te)) = ϕ(t0, x(t0)) + ∫_{t0}^{te} d/dt ϕ(t, x(t)) dt
             = ϕ(t0, x(t0)) + ∫_{t0}^{te} { ∂/∂t ϕ(t, x(t)) + (∂/∂x ϕ)(t, x(t)) f(t, x(t), u(t)) } dt,    (4.5)

which corresponds to the Lagrange form. Conversely, starting with the Lagrange form (4.3), the introduction of the new state variable

ẋn+1 = l(t, x, u),  t > t0,  xn+1(t0) = 0    (4.6a)

implies

∫_{t0}^{te} l(t, x(t), u(t)) dt = xn+1(te)    (4.6b)

and thus the Mayer form.
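The transformation (4.6) is also the standard way to evaluate a Lagrange cost numerically. The following Matlab lines are a minimal sketch for a double integrator with a hypothetical fixed input u(t) = sin(t) and running cost l = u²; the augmented state z = [x; xn+1] is integrated with ode45 and the cost is read off as xn+1(te).

u    = @(t) sin(t);                        % hypothetical fixed input
f    = @(t,x) [x(2); u(t)];                % double integrator, cf. (4.2)
l    = @(t,x) u(t)^2;                      % running cost
faug = @(t,z) [f(t,z(1:2)); l(t,z(1:2))];  % augmentation (4.6a)
[t,z] = ode45(faug,[0 1],[0;0;0]);         % x(0) = 0, xn+1(0) = 0
J = z(end,3);                              % cost (4.6b) in Mayer form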

Additional constraints may be imposed. It is thereby assumed that the initial time t0 and the initial state x0 are fixed. The trajectory x(t) is called admissible if it fulfills all constraints in the interval t ∈ [t0, te]; the set of all admissible trajectories is, as before, called Xad. For a proper classification let Xta ⊂ [t0,∞) × Rn denote the so–called target set and let te be the smallest time such that (te, x(te)) ∈ Xta. If Xta = [t0,∞) × {xe} with xe a fixed point in Rn, then this refers to a free-time, fixed-endpoint problem. If Xta = {te} × Rn, then this defines a fixed-time, free-endpoint problem. The most restrictive case is obviously given by the fixed-time, fixed-endpoint problem with Xta = {te} × {xe}. The least restrictive free-time, free-endpoint problem is defined by means of Xta = [t0,∞) × Rn. All other cases may be similarly included in the formulation. In addition, path constraints

ψ(t, x(t), u(t)) = 0,  t ∈ [t0, te],    (4.7)

and inequality constraints can be specified, i.e.,

ψ(t, x(t), u(t)) ≤ 0,  t ∈ [t0, te].    (4.8)

The latter can often be reduced to input constraints

u−j ≤ uj(t) ≤ u+j,  j ∈ Iu ⊆ {1, . . . , m}    (4.9)

or state constraints

x−j ≤ xj(t) ≤ x+j,  j ∈ Ix ⊆ {1, . . . , n}.    (4.10)

Moreover, so–called isoperimetric constraints may arise, which are given in the form

∫_{t0}^{te} ψk(t, x(t), u(t)) dt = ak,  k = 1, . . . , r < n.    (4.11)

During the subsequent analysis we make use of the space Ck([t0,te],Rn) of k-times continuously differentiable functions mapping the interval [t0,te] to Rn as well as of the space Ĉk([t0,te],Rn) of piecewise k-times continuously differentiable functions. Note that a function f(t), t ∈ [t0,te], is called piecewise k-times continuously differentiable if there is a finite (irreducible) partition t0 = τ0 < τ1 < · · · < τN < τN+1 = te such that f(t) ∈ Ck((τj, τj+1),Rn) for each j = 0, 1, . . . , N. If n = 1, then we often write Ck([t0,te]) and Ĉk([t0,te]) to refer to Ck([t0,te],R) and Ĉk([t0,te],R).

Similar to static constrained optimization problems, dynamic constrained optimization problems lead to the definition of a feasible control and a feasible pair [1].

Definition 4.1 (Feasible control and feasible pair). An admissible control u(t) ∈ U is said to be feasible if (i) the corresponding solution x(t) = x(t; x0, u(t)) to (4.2) is defined on t ∈ [t0, te] and (ii) u(t) and x(t; x0, u(t)) satisfy all constraints for t ∈ [t0, te]. In this case the pair (u(t), x(t)) is called a feasible pair. The set of feasible controls Ufe is defined as the set {u ∈ U : u(t) is feasible}.

It is furthermore required to define what is meant when referring to a global and local minimum of a cost functional J(u).


Definition 4.2 (Global and local minimizer). Let u∗(t) ∈ Ufe. Then u∗(t) is a global minimizer of the cost functional J(u) if

J(u∗) ≤ J(u)  ∀u ∈ Ufe.    (4.12)

Moreover, u∗(t) is a local minimizer of the cost functional J(u) if

∃ρ > 0 such that J(u∗) ≤ J(u)  ∀u ∈ Ufe ∩ Bρ(u∗)    (4.13)

with Bρ(u∗) = {u ∈ U : ‖u − u∗‖ ≤ ρ}.

This implies that the analysis of the local behavior in the neighborhood of u∗(t) requires the definition of a norm ‖·‖. Given the space C0([t0,te],Rm), a commonly used norm is the maximum norm defined as

‖u‖∞ = max_{t∈[t0,te]} ‖u(t)‖Rm    (4.14)

with ‖·‖Rm denoting the standard Euclidean norm on Rm. For the space C1([t0,te],Rm) this extends to

‖u‖1,∞ = max_{t∈[t0,te]} ‖u(t)‖Rm + max_{t∈[t0,te]} ‖u̇(t)‖Rm.    (4.15)

Local minimizers according to Definition 4.2 can be further subdivided depending on the choice of the norm ‖·‖ in (4.13). If ‖·‖ = ‖·‖∞, then one also refers to the local minimizer as a strong local minimizer. If ‖·‖ = ‖·‖1,∞, then the local minimizer is also called a weak local minimizer.

Remark 4.2. While all norms defined on finite-dimensional spaces such as Rm are equivalent, this no longer holds for infinite-dimensional function spaces. Hence, according to (4.13), u∗(t) may be a local minimizer with respect to one norm but not with respect to another norm on Ufe.

4.2 Calculus of variations

To deduce optimality conditions we subsequently apply calculus of variations. Contrary to the determination of extrema of functions, calculus of variations enables the determination of extrema of functionals, which map an element of a function space to the underlying field (R or C). We have already seen the scalar product as an example of a functional. For the present considerations we focus on integrals with respect to a single independent coordinate t with integral kernel in x(t) and ẋ(t), i.e.,

J(x) = ∫_{t0}^{te} l(t, x(t), ẋ(t)) dt.    (4.16)

For the sake of simplicity x(t) and its derivative ẋ(t) are used as dependent coordinates in the definition of the functional. We now seek the function x(t), t ∈ [t0, te], for which J(x) attains an extremum.

4.2.1 Preliminaries

In the following, selected results from calculus of variations are summarized, which are utilized throughout this chapter.


4.2.1.1 First variation or Gateaux derivative

Searching for the zeros of the so–called (first) variation of a functional, also called Gateaux derivative, generalizes the necessary condition (∇f)(x) = 0 for an extremum of a function f(x) to the case of functionals [4, 5, 1, 3].

Definition 4.3 (Variation of a functional (Gateaux derivative)). Let J(x) be a functional defined on a linear space X. The first variation of J at x ∈ X in the direction ξ ∈ X, also called Gateaux derivative with respect to ξ at x, is defined as

δJ(x, ξ) = lim_{η→0} [J(x + ηξ) − J(x)]/η = ∂/∂η J(x + ηξ) |_{η=0}.    (4.17)

If δJ(x, ξ) exists for all ξ ∈ X, then J(x) is said to be Gateaux differentiable at x.

Note that this implies the existence of the Gateaux derivative provided that J(x) is defined and J(x + ηξ) is differentiable with respect to η at η = 0. The first variation or Gateaux derivative is a linear operation on the functional J fulfilling

(i) the additivity property, i.e.,

δ(J1 + J2)(x, ξ) = δJ1(x, ξ) + δJ2(x, ξ),    (4.18a)

(ii) and the homogeneity property, i.e.,

δJ(x, αξ) = α δJ(x, ξ).    (4.18b)

In addition, the Gateaux derivative satisfies

(iii) the product rule, i.e.,

δ(J1 J2)(x, ξ) = δJ1(x, ξ) J2(x) + J1(x) δJ2(x, ξ),    (4.18c)

(iv) and the quotient rule

δ(J1/J2)(x, ξ) = [δJ1(x, ξ) J2(x) − J1(x) δJ2(x, ξ)] / (J2(x))².    (4.18d)

Since δJ(x, ξ1) + δJ(x, ξ2) is not necessarily equal to δJ(x, ξ1 + ξ2), the Gateaux derivative in general does not define a linear operator on X [1]. It is moreover important to note that the Gateaux derivative, when it exists, is independent of the norm on X and hence holds for any norm on X.

Example 4.1. Consider the functional (4.16) with x(t) ∈ C1([t0,te],Rn). With (4.17) the Gateaux derivative of J(x) follows as

δJ(x, ξ) = ∂/∂η ∫_{t0}^{te} l(t, x(t) + ηξ(t), ẋ(t) + ηξ̇(t)) dt |_{η=0}
         = ∫_{t0}^{te} { (∇x l)T(t, x(t), ẋ(t)) ξ(t) + (∇ẋ l)T(t, x(t), ẋ(t)) ξ̇(t) } dt
         = ∫_{t0}^{te} { (∇x l)T(t, x(t), ẋ(t)) − d/dt (∇ẋ l)T(t, x(t), ẋ(t)) } ξ(t) dt + [ (∇ẋ l)T(t, x(t), ẋ(t)) ξ(t) ]_{t=t0}^{t=te}

for all ξ(t) ∈ C1([t0,te],Rn), where the last line follows from integration by parts. Hence, J(x) is Gateaux differentiable at any x(t) ∈ C1([t0,te],Rn).
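The result of Example 4.1 can be checked numerically: for a scalar example with l(t,x,ẋ) = ẋ², x(t) = t², and ξ(t) = sin(πt) on [0, 1], the difference quotient in (4.17) has to approach the first variation ∫ 2ẋ ξ̇ dt as η → 0. The following Matlab lines sketch this comparison (data chosen for illustration only).

t  = linspace(0,1,2001);
J  = @(eta) trapz(t,(2*t + eta*pi*cos(pi*t)).^2);  % J(x + eta*xi)
dJ = trapz(t,2*(2*t).*(pi*cos(pi*t)));             % int 2*xdot*xidot dt
for eta = [1e-2 1e-4 1e-6]
    fprintf('eta = %g: %g (first variation: %g)\n',eta,(J(eta)-J(0))/eta,dJ);
end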

The following example illustrates that the Gateaux derivative need not exist.


Example 4.2. Consider the functional

J(x) = ∫_{t0}^{te} |x(t)| dt

for x(t) ∈ C1([t0,te],R). The value of J(x) is finite on the interval t ∈ [t0,te] with 0 ≤ t0 < te < ∞. From (4.17), the Gateaux derivative reads

δJ(x, ξ) = lim_{η→0} (1/η) ( ∫_{t0}^{te} |x(t) + ηξ(t)| dt − ∫_{t0}^{te} |x(t)| dt ).

Consider now the particular point x(t) = x0(t) = 0 and the direction ξ(t) = ξ0(t) = t, which implies

δJ(x0, ξ0) = lim_{η→0} sign(η) ∫_{t0}^{te} t dt = (te² − t0²)/2 for η → 0+ and −(te² − t0²)/2 for η → 0−.

Obviously, in the direction ξ0(t) = t the Gateaux derivative at x0(t) = 0 does not exist.

The introduction of the first variation or Gateaux derivative, respectively, enables the formulation of a geometric first order necessary optimality condition. For this, we first provide an exclusion criterion.

Lemma 4.1 (Exclusion of minima). Let J be a functional defined on a normed linear space (X, ‖·‖X). Suppose that at a point x ∈ X there exists a direction ξ ∈ X such that δJ(x, ξ) < 0. Then x ∈ X cannot be a local minimizer for J (in the sense of the norm ‖·‖X).

The proof of Lemma 4.1 can be found in [1] and in principle makes use of the following remark.

Remark 4.3. A direction ξ ∈ X such that δJ(x, ξ) < 0 defines a descent direction for J at x ∈ X. In other words, δJ(x, ξ) < 0 generalizes the algebraic condition (∇f)T(x) ξ < 0 for ξ to be a descent direction at x given f : Rn → R.

To properly address the minimization of a functional over a subset of a normed linear space, so–called admissible directions need to be defined [1].

Definition 4.4 (Admissible directions). Let J be a functional defined on a subset Xad of a normed linear space (X, ‖·‖X) and let x ∈ Xad. Then, a direction ξ ∈ Xad, ξ ≠ 0, is said to be admissible (or Xad-admissible) for J at x if

(i) δJ(x, ξ) exists and
(ii) x + ηξ ∈ Xad for all sufficiently small η, i.e., ∃ρ > 0 such that ∀η ∈ Bρ(0) we have x + ηξ ∈ Xad.

With these preparations, a first order necessary optimality condition is summarized in the theorem below.

Theorem 4.1 (First order necessary optimality condition). Let J be a functional defined on a subset Xad of a normed linear space (X, ‖·‖X) and suppose that x∗ ∈ Xad is a local minimizer of J. Then

δJ(x∗, ξ) = 0    (4.19)

for all Xad-admissible directions ξ at the point x∗.

Proof. The proof of Theorem 4.1 follows by contradiction. Suppose there is an Xad-admissible direction ξ at x∗ with δJ(x∗, ξ) < 0. Then by Lemma 4.1 the point x∗ cannot be a local minimizer for J. To exclude δJ(x∗, ξ) > 0, let −ξ be an Xad-admissible direction. Then δJ(x∗, −ξ) = −δJ(x∗, ξ) < 0, so that Lemma 4.1 excludes x∗ from the set of local minimizers. Hence, we must have δJ(x∗, ξ) = 0 for any Xad-admissible direction ξ at x∗. □

4.2.1.2 Euler–Lagrange equations

Consider the Lagrange form (4.16) with fixed initial and end point, i.e.,

J(x) = ∫_{t0}^{te} l(t, x(t), ẋ(t)) dt,  x(t0) = x0,  x(te) = xe.    (4.20)

We subsequently seek a local minimum x∗(t) ∈ C1([t0,te],Rn) of (4.20) such that J(x) ≥ J(x∗) with x∗(t0) = x0 and x∗(te) = xe. Starting from the unknown x∗(t), introduce so–called admissible functions defined as

x(t) = x∗(t) + ηξ(t),  ξ(t0) = 0,  ξ(te) = 0.    (4.21)

The function ξ(t) fulfilling ξ(t0) = 0 and ξ(te) = 0 is also called an admissible variation. With this, J(x) ≥ J(x∗) implies

J(x) = J(x∗ + ηξ) = J(x∗) + η δJ(x∗, ξ) + O(η²) ≥ J(x∗).

By assumption J(x) attains a local extremum at x(t) = x∗(t), i.e., at η = 0, so that the necessary optimality condition η δJ(x∗, ξ) = 0 for arbitrary η yields

δJ(x∗, ξ) = 0.

In other words, the first variation of J at x∗(t) has to vanish. The explicit evaluation of δJ(x∗, ξ) is already presented in Example 4.1, i.e.,

δJ(x∗, ξ) = ∫_{t0}^{te} { (∇x l)T(t, x∗(t), ẋ∗(t)) − d/dt (∇ẋ l)T(t, x∗(t), ẋ∗(t)) } ξ(t) dt + [ (∇ẋ l)T(t, x∗(t), ẋ∗(t)) ξ(t) ]_{t=t0}^{t=te},

where the boundary term vanishes since ξ(t0) = ξ(te) = 0.

For the final conclusion it is necessary to take into account the following lemma.

Lemma 4.2 (Fundamental lemma of variational calculus). Let f(t) ∈ C0([t0,te],Rn) be such that

∫_{t0}^{te} fT(t) ξ(t) dt = 0    (4.22)

for all ξ(t) ∈ C0([t0,te],Rn). Then f(t) = 0 for t ∈ [t0,te] (almost everywhere, possibly excluding a set of measure zero).

Thus, for arbitrary ξ(t) ∈ C1([t0,te],Rn) vanishing at t0 and te, the equation

δJ(x∗, ξ) = ∫_{t0}^{te} { (∇x l)T(t, x∗(t), ẋ∗(t)) − d/dt (∇ẋ l)T(t, x∗(t), ẋ∗(t)) } ξ(t) dt = 0

yields

(∇x l)(t, x∗(t), ẋ∗(t)) − d/dt (∇ẋ l)(t, x∗(t), ẋ∗(t)) = 0.

These equations are the so–called Euler–Lagrange equations, which impose a necessary optimality condition for x∗(t) to be a local minimizer of (4.20).

Theorem 4.2 (Euler–Lagrange equations). Consider the functional

J(x) = ∫_{t0}^{te} l(t, x(t), ẋ(t)) dt,  x(t0) = x0,  x(te) = xe    (4.23)

for x(t) ∈ C1([t0,te],Rn) and a continuously differentiable Lagrangian density l : R × Rn × Rn → R. Suppose that x∗(t) is a local minimizer of J(x) with x∗(t0) = x0 and x∗(te) = xe. Then x∗(t) fulfills the Euler–Lagrange equations

d/dt (∇ẋ l)(t, x∗(t), ẋ∗(t)) − (∇x l)(t, x∗(t), ẋ∗(t)) = 0    (4.24)

for all t ∈ [t0, te].

The solution of the Euler–Lagrange equations can be formulated using so–called first integrals in some special cases:

(i) The Lagrangian density is independent of t, i.e., l = l(x(t), ẋ(t)). With the so–called Hamilton function

H(x, ẋ) = (∇ẋ l)T(x, ẋ) ẋ − l(x, ẋ)    (4.25)

the Euler–Lagrange equations (4.24) imply

d/dt H(x, ẋ) = (d/dt (∇ẋ l)(x, ẋ))T ẋ + (∇ẋ l)T(x, ẋ) ẍ − (∇x l)T(x, ẋ) ẋ − (∇ẋ l)T(x, ẋ) ẍ
             = ( d/dt (∇ẋ l)(x, ẋ) − (∇x l)(x, ẋ) )T ẋ.    (4.26)

Thus, either ẋ(t) = 0, i.e., x(t) = c, or the Hamiltonian H(x, ẋ) must remain constant along a local minimizer x∗(t), so that H(x, ẋ) is an invariant of the Euler–Lagrange equations.

(ii) The Lagrangian density is independent of x(t), i.e., l = l(t, ẋ(t)). Then the evaluation of the Euler–Lagrange equations (4.24) yields

d/dt (∇ẋ l)(t, ẋ) = 0.    (4.27)

Thus, ∂l/∂ẋj (t, ẋ), j = 1, . . . , n, is an invariant of the Euler–Lagrange equations.

Example 4.3 (Brachistochrone problem). The so–called brachistochrone problem traces back to Johann Bernoulli and refers to finding the path between two points in a vertical plane such that a particle sliding without friction along this path, starting with initial speed v0, takes minimal time to travel from the initial to the end point (cf. Figure 4.1).

Since the particle slides without friction, energy is conserved, so that the change of the sum of kinetic and potential energy equals zero at any instant of time. With the mass m of the particle this implies

½ m (v²(x) − v0²) + m g (y(x) − y0) = 0

and hence

v(x) = √(v0² − 2g(y(x) − y0)).


[Figure 4.1 sketches the path y(x) in the (x, y) plane between the initial point (x0, y0) and the end point (x1, y1).]

Fig. 4.1: Brachistochrone problem.

The objective functional is the traveling time from the initial to the final point

J(y) = ∫_{x0}^{x1} dt = ∫_{x0}^{x1} ds/v(x)

with s denoting the Jordan length of y(x). Taking into account an infinitesimal element along the path, it follows that (ds)² = (dy)² + (dx)², so that

ds = √((dy)² + (dx)²) = dx √(1 + (y′(x))²).

Substitution into J(y) results in

J(y) = ∫_{x0}^{x1} √(1 + (y′(x))²) / √(v0² − 2g(y(x) − y0)) dx =: ∫_{x0}^{x1} l(y(x), y′(x)) dx

with the conditions

y(x0) = y0,  y(x1) = y1.

Obviously, since the Lagrangian density l(y, y′) does not explicitly depend on the independent coordinate x, we know from the special cases discussed above that the Hamilton function

H(y, y′) = (∂l/∂y′)(y, y′) y′ − l(y, y′)
         = (y′(x))² / ( √(v0² − 2g(y(x) − y0)) √(1 + (y′(x))²) ) − √(1 + (y′(x))²) / √(v0² − 2g(y(x) − y0))

is an invariant of the respective Euler–Lagrange equations (4.24), so that H(y, y′) = c with a constant c. In order to solve this (differential) equation for y(x) it is convenient to introduce the substitution

ỹ(x) = y0 − y(x) + v0²/(2g)  ⇒  ỹ′(x) = −y′(x)  ∧  ỹ(x0) = v0²/(2g)

so that

H(ỹ, ỹ′) = (1/√(2g)) ( (ỹ′(x))² / ( √(ỹ(x)) √(1 + (ỹ′(x))²) ) − √(1 + (ỹ′(x))²) / √(ỹ(x)) ) = c.

The latter equation implies


(ỹ′(x))² / ( √(ỹ(x)) √(1 + (ỹ′(x))²) ) − √(1 + (ỹ′(x))²) / √(ỹ(x)) = √(2g) c

and hence

ỹ(x) (1 + (ỹ′(x))²) = 1/(2gc²).

With a = 1/(2gc²) the differential equation allows for a formal integration in x to obtain

x − x0 = ∫_{ỹ(x0)}^{ỹ} √( s/(a − s) ) ds.

The integral on the right-hand side can be solved by making use of the substitution s = (a/2)(1 − cos(θ)) = a sin²(θ/2), which provides

x − x0 = ∫_{θ0}^{θ} √( sin²(ϑ/2)/(1 − sin²(ϑ/2)) ) a sin(ϑ/2) cos(ϑ/2) dϑ = a ∫_{θ0}^{θ} sin²(ϑ/2) dϑ = (a/2)( θ − θ0 + sin(θ0) − sin(θ) ).

As a result, the solution to the brachistochrone problem follows in the parametrized form

x(θ) = x0 + (a/2)( θ − θ0 + sin(θ0) − sin(θ) )
y(θ) = y0 + v0²/(2g) − (a/2)(1 − cos(θ))    (4.28a)

for θ ∈ [θ0, θ1], where θ0 and θ1 have to be determined from the algebraic equations

(a/2)(1 − cos(θ0)) = ỹ(x0) = v0²/(2g)
(a/2)(1 − cos(θ1)) = ỹ(x1) = y0 − y1 + v0²/(2g).    (4.28b)

It should be pointed out that, since Theorem 4.2 only provides a first order necessary optimality condition, the solution determined above is only a candidate for a local minimizer and the actual verification requires further analysis.
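To evaluate (4.28) numerically, the three unknowns a, θ0, and θ1 can be determined from (4.28b) together with the requirement x(θ1) = x1. The following Matlab lines are a minimal sketch using fsolve with hypothetical data (the values of p and the initial guess are assumptions for illustration).

p = struct('g',9.81,'v0',0.5,'x0',0,'y0',0,'x1',2,'y1',-1);  % hypothetical data
F = @(z) [z(1)/2*(1-cos(z(2))) - p.v0^2/(2*p.g);             % (4.28b), theta0
          z(1)/2*(1-cos(z(3))) - (p.y0-p.y1+p.v0^2/(2*p.g)); % (4.28b), theta1
          p.x0 + z(1)/2*(z(3)-z(2)+sin(z(2))-sin(z(3))) - p.x1];
z  = fsolve(F,[1;0.5;2]);                  % z = [a; theta0; theta1]
th = linspace(z(2),z(3),200);
x  = p.x0 + z(1)/2*(th - z(2) + sin(z(2)) - sin(th));        % (4.28a)
y  = p.y0 + p.v0^2/(2*p.g) - z(1)/2*(1 - cos(th));
plot(x,y);                                 % cycloid candidate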

This problem is moreover suitable to deduce a numerical solution procedure by evaluating the Euler–Lagrange equations for the brachistochrone problem, which yields

[ y″(x)(v0² − 2g(y(x) − y0)) − g((y′(x))² + 1) ] / [ ((y′(x))² + 1)^{3/2} (v0² − 2g(y(x) − y0))^{3/2} ] = 0,  x ∈ (x0, x1)    (4.29a)

subject to

y(x0) = y0,  y(x1) = y1.    (4.29b)

Equations (4.29) obviously define a boundary-value problem, which can be solved either analytically or numerically. Subsequently, a numerical solution approach using the function bvp4c of Matlab is briefly summarized. This numerical solver relies on the formulation of the boundary-value problem as a system of coupled first-order ordinary differential equations. For this, introduce the state variables z1(x) = y(x) and z2(x) = y′(x) so that (4.29) reads

d/dx [z1 ; z2] = [ z2 ; g(z2² + 1)/(v0² − 2g(z1 − y0)) ],  x ∈ (x0, x1)
z1(x0) = y0,  z1(x1) = y1.    (4.30)


[Figure 4.2 shows the computed paths y(x) for x1 = 1.0, 2.0, 5.0, and 7.0.]

Fig. 4.2: Numerical solution using the function bvp4c from Matlab.

The following Matlab function computes a solution to this boundary-value problem when varying x1 ∈ {1, 2, 5, 7}. The respective numerical results are depicted in Figure 4.2.

function brachistochrone_main()
%System parameters
p.g = 9.81;
p.v0 = 0.5;
p.x0 = 0.0;
p.y0 = 0.0;
p.y1 = -1.0;
%Vary x1
x1 = [1.0,2.0,5.0,7.0];
col = {'b','r','g','k'};
figure(1); hold on;
for j=1:length(x1)
    p.x1 = x1(j);
    if j==1
        %Two runs with the solution of the first run as initial value
        %for the second run to improve the result
        disp('-> Run 1');
        sol = brachistochrone_bvp4c(p,1e-2);
        disp('-> Run 2');
        sol = brachistochrone_bvp4c(p,1e-5,sol);
    else
        %Use the previous solution as initial value mapped to the new
        %interval [x0,x1]
        ini.x = linspace(p.x0,p.x1,length(sol.x));
        ini.z = sol.z;
        disp('-> Run 1');
        sol = brachistochrone_bvp4c(p,1e-3,ini);
        disp('-> Run 2');
        sol = brachistochrone_bvp4c(p,1e-5,sol);
    end
    figure(1);
    plot(sol.x,sol.z(1,:),strcat(col{j},'-'));
end

% --------------------------------------------------------
% SUBFUNCTIONS
function out = brachistochrone_bvp4c(p,reltol,varargin)
%
%Initialize
if nargin==2
    xini = linspace(p.x0,p.x1,21);
    a1 = (p.y0-p.y1)/(p.x0-p.x1);
    a0 = p.y0 - a1*p.x0;
    yini = a1*xini+a0;
    ypini = a1*ones(size(xini));
    solinit.x = xini;
    solinit.y = [yini;ypini];
    figure(11); plot(xini,yini); drawnow;
elseif nargin==3
    solinit.x = varargin{1}.x;
    solinit.y = varargin{1}.z;
end
%Solve BVP
options = bvpset('RelTol',reltol,'Stats','on');
options = bvpset(options,'FJacobian',@(x,z)jacelsys(x,z,p),...
    'BCJacobian',@(za,zb)jacbcs(za,zb,p));
sol = bvp4c(@(x,z)elsys(x,z,p),@(za,zb)bcs(za,zb,p),solinit,options);
out.sol = sol;
out.x = sol.x;
out.z = sol.y;

function dzdx = elsys(x,z,p)
dzdx = [z(2);
    p.g*(1+z(2)^2)/(p.v0^2-2.0*p.g*(z(1)-p.y0))];

function out = bcs(za,zb,p)
out = [za(1) - p.y0;
    zb(1) - p.y1];

function out = jacelsys(x,z,p)
out = [0.0,1.0;
    2.0*p.g^2*(1+z(2)^2)/(p.v0^2-2.0*p.g*(z(1)-p.y0))^2,...
    2.0*p.g*z(2)/(p.v0^2-2.0*p.g*(z(1)-p.y0))];

function [dbcdza,dbcdzb] = jacbcs(za,zb,p)
dbcdza = zeros(2,2);
dbcdza(1,1) = 1.0;
dbcdzb = zeros(2,2);
dbcdzb(2,1) = 1.0;

It can furthermore be shown, similar to the analysis of extrema of functions, that if x∗(t) is a local minimizer, then the second variation of J(x) at x∗(t) needs to be positive semi-definite, i.e.,

δ²J(x∗, ξ) = d²/dη² J(x∗ + ηξ) |_{η=0} ≥ 0.


This leads to the so–called Legendre condition [2].

Theorem 4.3 (Legendre condition). Consider the functional

J(x) = ∫_{t0}^{te} l(t, x(t), ẋ(t)) dt,  x(t0) = x0,  x(te) = xe    (4.31)

for x(t) ∈ C1([t0,te],Rn) and a twice continuously differentiable Lagrangian density l : R × Rn × Rn → R. Suppose that x∗(t) is a local minimizer of J(x) with x∗(t0) = x0 and x∗(te) = xe. Then x∗(t) solves the Euler–Lagrange equations and satisfies the so–called Legendre condition

(∇²ẋ l)(t, x∗(t), ẋ∗(t)) ≥ 0    (4.32)

for all t ∈ [t0, te].
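For the brachistochrone problem of Example 4.3 the Legendre condition can be verified explicitly: a direct computation gives ∂²l/∂(y′)² = 1/((1 + (y′)²)^{3/2} √(v0² − 2g(y − y0))) > 0. The following Matlab lines sketch a numerical check of this expression on a sample grid (the data and sample points are hypothetical).

g = 9.81; v0 = 0.5; y0 = 0;                 % hypothetical data
d2l = @(y,yp) 1./((1+yp.^2).^(3/2).*sqrt(v0^2-2*g*(y-y0)));
yp = linspace(-5,5,11);
y  = -ones(size(yp));                       % sample points with y < y0
assert(all(d2l(y,yp) > 0))                  % Legendre condition (4.32)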

4.2.1.3 Euler–Lagrange equations for problems with free end-point

We conclude this section by addressing a Bolza problem with a free end-point using calculus of variations.

Theorem 4.4 (Euler–Lagrange equations for problems with free end-point and terminal cost). Consider the functional

J(te, x) = ϕ(te, x(te)) + ∫_{t0}^{te} l(t, x(t), ẋ(t)) dt    (4.33a)

on the subset

Xad = {(te, x(t)) : te ∈ (t0, t̄), x(t) ∈ C1([t0, t̄],Rn), x(t0) = x0}    (4.33b)

for sufficiently large t̄ > te with l : R × Rn × Rn → R and ϕ : R × Rn → R being continuously differentiable. Suppose that (t∗e, x∗(t)) denotes a local minimum of J(te, x) on Xad. Then x∗(t) solves the Euler–Lagrange equations (4.24) on the interval t ∈ [t0, t∗e] and satisfies both the initial condition x∗(t0) = x0 and the transversality conditions

[ (∇ẋ l)(t, x(t), ẋ(t)) + (∇x ϕ)(t, x(t)) ]_{t=t∗e, x=x∗} = 0    (4.34a)
[ l(t, x(t), ẋ(t)) − (ẋ)T (∇ẋ l)(t, x(t), ẋ(t)) + ∂/∂t ϕ(t, x(t)) ]_{t=t∗e, x=x∗} = 0.    (4.34b)

For the proof of Theorem 4.4 the definition of the Gateaux derivative (4.17) can be directly modified according to

δJ(te, x, τ, ξ) = lim_{η→0} [ J(te + ητ, x + ηξ) − J(te, x) ] / η = ∂/∂η J(te + ητ, x + ηξ) |_{η=0}.    (4.35)

Proof. If the end-time t∗e is fixed, then Theorem 4.2 implies that x∗(t) ∈ Xad must be a solution to the Euler–Lagrange equations (4.24) in the interval t ∈ [t0, t∗e] provided that the admissible directions ξ(t) satisfy ξ(t0) = ξ(t∗e) = 0.

Taking into account the first order necessary optimality condition of Theorem 4.1, the application of (4.35) to (4.33) results in

∂/∂η J(t∗e + ητ, x∗ + ηξ)
  = ∂/∂η ϕ(t∗e + ητ, x∗(t∗e + ητ) + ηξ(t∗e + ητ)) + ∂/∂η ∫_{t0}^{t∗e+ητ} l(t, x∗(t) + ηξ(t), ẋ∗(t) + ηξ̇(t)) dt
  = [ τ ∂ϕ/∂t + (∂ϕ/∂x)(ẋ∗ + ηξ̇)τ + (∂ϕ/∂x)ξ ]_{t=t∗e+ητ, x=x∗+ηξ} + τ [ l(t, x∗(t) + ηξ(t), ẋ∗(t) + ηξ̇(t)) ]_{t=t∗e+ητ}
    + ∫_{t0}^{t∗e+ητ} [ (∂l/∂x)ξ + (∂l/∂ẋ)ξ̇ ]_{x=x∗+ηξ, ẋ=ẋ∗+ηξ̇} dt
  = [ τ ∂ϕ/∂t + (∂ϕ/∂x)(ẋ∗ + ηξ̇)τ + (∂ϕ/∂x)ξ ]_{t=t∗e+ητ, x=x∗+ηξ} + τ [ l(t, x∗(t) + ηξ(t), ẋ∗(t) + ηξ̇(t)) ]_{t=t∗e+ητ}
    + [ (∂l/∂ẋ)(t, x∗ + ηξ, ẋ∗ + ηξ̇) ξ ]_{t=t0}^{t=t∗e+ητ} + ∫_{t0}^{t∗e+ητ} [ (∂l/∂x) − d/dt (∂l/∂ẋ) ]_{x=x∗+ηξ, ẋ=ẋ∗+ηξ̇} ξ dt,

where the last step uses integration by parts. Evaluation of the limit as η → 0 yields

δJ(te, x, τ, ξ) = ∂/∂η J(t∗e + ητ, x∗ + ηξ) |_{η=0}
  = [ τ ∂ϕ/∂t + (∂ϕ/∂x)(τẋ + ξ) ]_{t=t∗e, x=x∗} + τ l(t∗e, x∗(t∗e), ẋ∗(t∗e)) + (∂l/∂ẋ)(t∗e, x∗(t∗e), ẋ∗(t∗e)) ξ(t∗e)
    − (∂l/∂ẋ)(t0, x∗(t0), ẋ∗(t0)) ξ(t0) + ∫_{t0}^{t∗e} [ (∂l/∂x) − d/dt (∂l/∂ẋ) ]_{x=x∗, ẋ=ẋ∗} ξ dt.

Since the initial value x(t0) = x0 is fixed by assumption, any admissible direction ξ(t) has to satisfy ξ(t0) = 0. Recalling also that the optimal solution x∗(t) has to fulfill the Euler–Lagrange equations (4.24) in the interval t ∈ [t0, te], the Gateaux derivative δJ(te, x, τ, ξ) reduces to

δJ(te, x, τ, ξ) = τ [ ∂ϕ/∂t + (∂ϕ/∂x) ẋ + l ]_{t=t∗e, x=x∗} + [ ∂ϕ/∂x + ∂l/∂ẋ ]_{t=t∗e, x=x∗} ξ(te)
                = τ [ ∂ϕ/∂t − (∂l/∂ẋ) ẋ + l ]_{t=t∗e, x=x∗} + [ ∂ϕ/∂x + ∂l/∂ẋ ]_{t=t∗e, x=x∗} ( τ ẋ∗(te) + ξ(te) ),    (4.36)

where in the following the first bracketed term is referred to as (?), the second as (??), and the last factor as (???). If both the end-time te and the end-point x(te) are free, then τ and ξ(t∗e) can be chosen independently. Thus, δJ(te, x, τ, ξ) = 0 if the transversality conditions (4.34) are fulfilled. □

This result can be further generalized according to the listing below.

(i) For fixed end-time te = t∗e we have τ = 0, such that the term (?) vanishes in (4.36) and the transversality conditions reduce to (4.34a).

  (a) If there is a component xj(t), j ∈ {1, . . . , n}, of x(t) with fixed end-value xj(t∗e) = x∗j(t∗e) = xj,e, then ξj(t∗e) = 0 and the transversality condition (4.34a) vanishes for the j-th component.

  (b) If there is a component xj(t), j ∈ {1, . . . , n}, of x(t) with free end-value xj(t∗e), then ξj(t∗e) ≠ 0 and the transversality condition (4.34a) for the j-th component is

      [ ∂ϕ/∂xj + ∂l/∂ẋj ]_{t=t∗e, x=x∗} = 0.

(ii) For free end-time te we have τ ≠ 0 and the term (?) in (4.36) remains, so that the transversality condition (4.34b) has to hold.

  (a) If there is a component xj(t), j ∈ {1, . . . , n}, of x(t) with fixed end-value xj(t∗e) = x∗j(t∗e) = xj,e, then an admissible direction (τ, ξj(t)) has to fulfill

      xj,e = x∗j(t∗e + ητ) + ηξj(t∗e + ητ)  ⇒  ∂/∂η xj,e |_{η=0} = τ ẋ∗j(t∗e) + ξj(t∗e) = 0.

      Hence, the corresponding entry in (???) vanishes and there is no transversality condition (4.34a) for this component.

  (b) If there is a component xj(t), j ∈ {1, . . . , n}, of x(t) with free end-value xj(t∗e), then the transversality condition for this component is

      [ ∂ϕ/∂xj + ∂l/∂ẋj ]_{t=t∗e, x=x∗} = 0.

Remark 4.4 (Natural boundary conditions). If te is free and the terminal cost ϕ(te, x(te)) = 0, then the transversality conditions yield the so–called natural boundary conditions [5, 1]:

(i) If te is free, then (4.34b) reduces to

[ (∂l/∂ẋ) ẋ − l ]_{t=t∗e, x=x∗} = H(te, x∗(te), ẋ∗(te)) = 0.

(ii) If x(te) is free, then

(∂l/∂ẋ)_{t=t∗e, x=x∗} = 0.

Example 4.4. Consider the functional

J = p (x(te) − a)² + ∫_{0}^{te} (ẋ(t))² dt

for x(t) ∈ Xad = {x : x(t) ∈ C1([0, te],R), x(0) = x0, te fixed}. According to Theorem 4.4 a candidate x(t) for a local minimizer has to solve the Euler–Lagrange equations (4.24), which for l(ẋ(t)) = (ẋ(t))² read

2ẍ(t) = 0.

This implies the solution

x(t) = c1 t + c0

with c0 = x0 to fulfill the initial condition x(0) = x0. For the determination of c1, the transversality conditions (4.34) have to be taken into account, which in view of case (i)(b) discussed above, ϕ(x(te)) = p(x(te) − a)² and n = 1, reduce to

2p(x(te) − a) + 2ẋ(te) = 0 = 2p(c1 te + x0 − a) + 2c1,

so that c1 = p(a − x0)/(p te + 1). The unique solution to the Euler–Lagrange equations fulfilling the transversality conditions is given by

x(t) = [ p(a − x0)/(p te + 1) ] t + x0.

If p ≫ 1, more weight is put on the terminal cost than on the integral term and x(te) approaches a. In the limit p → ∞ we obtain x(te) = a.
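The convergence x(te) → a for p → ∞ can be illustrated with a few Matlab lines evaluating the solution of Example 4.4 (the values of x0, a, and te are hypothetical and chosen for illustration only).

x0 = 0; a = 1; te = 2;                      % hypothetical data
for p = [1 10 100 1000]
    c1 = p*(a-x0)/(p*te+1);                 % from the transversality condition
    fprintf('p = %6g: x(te) = %.4f\n',p,c1*te+x0);  % approaches a
end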


4.2.1.4 Piecewise continuous functions

The theory developed above can be extended and refined by including piecewise C1-functions, i.e., searching for x∗(t) ∈ Ĉ1([t0,te],Rn), as possible extrema. With this, one addresses the question whether these so–called cornered trajectories might yield improved results. Moreover, one might wonder if problems not admitting a solution in the class of C1-functions have an extremum in the extended class of Ĉ1-functions. In the following, these results, in particular the Weierstrass–Erdmann (corner) conditions, are not developed further; the reader is referred to [1, 3, 2] and the references therein.

4.2.2 Problems with constraints

The Euler–Lagrange equations introduced in Theorem 4.2 provide a necessary optimality condition when the initial and terminal points are fixed but the curves are unconstrained otherwise. In the presence of constraints it is convenient to introduce Lagrange multipliers to derive the corresponding necessary optimality conditions.

4.2.2.1 Equality constraints

Making use of Theorem 4.1, it follows that, given a functional J defined on a subset Xad of a normed linear space (X, ‖·‖X), a local minimizer x∗ ∈ Xad of J can be characterized by

δJ(x∗, ξ) = 0

for all Xad-admissible directions ξ at the point x∗. It should be noted that subsets Xad may exist such that the set of Xad-admissible directions ξ is empty, possibly at every point in Xad. An example is given below [1].

Example 4.5. Consider the set

Xad = { x(t) ∈ C1([t0,te],R²) : √((x1(t))² + (x2(t))²) − √2 = 0  ∀t ∈ [t0,te] }.

Obviously, Xad is the set of continuously differentiable curves lying on a cylinder of radius √2 whose axis is time t centered at x1(t) = x2(t) = 0. Let x∗(t) ∈ C1([t0,te],R²) with x∗1(t) = x∗2(t) = 1, so that x∗(t) ∈ Xad. However, for every non-zero direction ξ(t) ∈ C1([t0,te],R²) and for every η ≠ 0 we have x∗(t) + ηξ(t) ∉ Xad. Thus, the set of Xad-admissible directions is empty for any functional J : C1([t0,te],R²) → R.

The idea behind the introduction of Lagrange multipliers is to characterize the local minimizers or extremals of a functional J defined in a normed linear space (X, ‖·‖X) when it is restricted to one or more level sets of other such functionals.

Example 4.6. Consider the previous example. Then Xad can also be considered as the intersection of the 0-level sets of the family of functionals

Gt(x) = √((x1(t))² + (x2(t))²) − √2

for t ∈ [t0,te], i.e.,

Xad = ⋂_{t∈[t0,te]} Γt(0)

for Γθ(s) = {x(t) ∈ C1([t0,te],R²) : Gθ(x) = s}. This, however, implies an uncountable number of functionals, which also illustrates why problems having path constraints are rather hard to solve in general.

The existence of a Lagrange multiplier is guaranteed by the following theorem, whose proof can be deduced from the exposition, e.g., in [6, 1].


Theorem 4.5 (Existence of Lagrange multipliers (single equality constraint)). Let J and G be functionals defined in a neighborhood of x∗ in a normed linear space (X, ‖·‖X) having continuous Gateaux derivatives in this neighborhood. Let G(x∗) = s and suppose that x∗ is a (local) extremum for J constrained to Γ(s) = {x ∈ X : G(x) = s}. Suppose further that δG(x∗, ξ) ≠ 0 for some direction ξ ∈ X. Then there exists a scalar λ ∈ R such that

δJ(x∗, ξ) + λ δG(x∗, ξ) = 0  ∀ξ ∈ X.    (4.37)

As in Chapter 3, the parameter λ is called a Lagrange multiplier. Condition (4.37) implies that the directional derivatives of J are proportional to those of G. In other words, the level sets of both J and G share a common tangent plane Tx∗M at x∗, i.e., they meet tangentially.

The extension to the case of multiple equality constraints is given below.

Theorem 4.6 (Existence of Lagrange multipliers (multiple equality constraints)). Let J and Gj, j = 1, . . . , p, be functionals defined in a neighborhood of x∗ in a normed linear space (X, ‖·‖X) having continuous Gateaux derivatives in this neighborhood. Let Gj(x∗) = sj and suppose that x∗ is a (local) extremum for J constrained to Γ(s) = {x ∈ X : Gj(x) = sj, j = 1, . . . , p}. Suppose further that

det ( [ δGi(x∗, ξj) ]_{i,j=1,...,p} ) ≠ 0    (4.38)

for p independent directions ξj ∈ X, j = 1, . . . , p. Then there exists a vector λ ∈ Rp such that

δJ(x∗, ξ) + [ δG1(x∗, ξ) · · · δGp(x∗, ξ) ] λ = 0  ∀ξ ∈ X.    (4.39)

Remark 4.5. If x∗ ∈ Xad with Xad a subset of a normed linear space (X, ‖·‖X) and the Xad-admissible directions form a linear subspace of X, i.e., for all η1, η2 ∈ R and given ξ1, ξ2 ∈ Xad we have η1ξ1 + η2ξ2 ∈ Xad, then the conclusions of Theorems 4.5 and 4.6 remain valid when restricting the continuity of J to Xad and considering Xad-admissible directions only.

4.2.2.2 Inequality constraints

Similar to the previous section, Lagrange multipliers can be used to treat variational problems involving inequality constraints or mixed equality and inequality constraints.

Theorem 4.7 (Existence of Lagrange multipliers (multiple inequality constraints)). Let J and Gj, j = 1, . . . , p, be functionals defined in a neighborhood of x∗ in a normed linear space (X, ‖·‖X) having continuous Gateaux derivatives in this neighborhood. Suppose that x∗ is a (local) minimizer for J constrained to Γ(s) = {x ∈ X : Gj(x) ≤ sj, j = 1, . . . , p}. Suppose further that q ≤ p constraints are active, say Gj, j = 1, . . . , q, for simplicity, and satisfy

det ( [ δGi(x∗, ξj) ]_{i,j=1,...,q} ) ≠ 0    (4.40)

for q independent directions ξj ∈ X, j = 1, . . . , q. Then there exists a vector µ ∈ Rp such that

δJ(x∗, ξ) + [ δG1(x∗, ξ) · · · δGp(x∗, ξ) ] µ = 0  ∀ξ ∈ X    (4.41a)
(Gj(x∗) − sj) µj = 0    (4.41b)
µj ≥ 0    (4.41c)

for j = 1, . . . , p.


For a proof of this result consult [1]. Herein, conditions (4.41b) and (4.41c) have to be interpreted in the sense of the complementary slackness condition, i.e., (Gj(x∗) − sj) < 0 implies µj = 0 and µj > 0 implies (Gj(x∗) − sj) = 0.

4.2.2.3 Isoperimetric constraints

Recalling from (4.11), isoperimetric constraints defined according to

∫_{t0}^{te} ψk(t, x(t), ẋ(t)) dt = ak,  k = 1, . . . , r < n

involve constraints in terms of integrals of a functional over parts or all of the horizon t ∈ [t0, te]. The theorem provided below gives a characterization of the (local) minimizer based on the method of Lagrange multipliers [1].

Theorem 4.8 (First-order necessary optimality condition for problems with isoperimetric constraints). Consider the functional

J(x) = ∫_{t0}^{te} l(t, x(t), ẋ(t)) dt    (4.42a)

on Xad = {x(t) ∈ C1([t0,te],Rn) : x(t0) = x0, x(te) = xe} subject to the isoperimetric constraints

Gk(x) = ∫_{t0}^{te} ψk(t, x(t), ẋ(t)) dt = ak,  k = 1, . . . , r < n    (4.42b)

with Lagrangian density l : R × Rn × Rn → R and ψk : R × Rn × Rn → R, k = 1, . . . , r, being continuously differentiable. Suppose that x∗(t) ∈ Xad is a (local) minimizer for this problem and

det ( [ δGi(x∗, ξj) ]_{i,j=1,...,r} ) ≠ 0    (4.43)

for r independent directions ξj ∈ X, j = 1, . . . , r. Then there exists a vector λ∗ ∈ Rr such that x∗(t) is a solution to the Euler–Lagrange equations

d/dt (∇ẋ L)(t, x∗(t), ẋ∗(t), λ∗) − (∇x L)(t, x∗(t), ẋ∗(t), λ∗) = 0,    (4.44a)

where

L(t, x(t), ẋ(t), λ) = l(t, x(t), ẋ(t)) + λT ψ(t, x(t), ẋ(t))    (4.44b)

with ψ = [ψ1, . . . , ψr]T.

It can be similarly shown that if L does not depend on t, then the Hamilton function

H(x, ẋ) = (∇ẋ L)T(x, ẋ, λ) ẋ − L(x, ẋ, λ)    (4.45)

is constant along any (local) minimizer x∗(t). It is hence an invariant of the Euler–Lagrange equations introduced in the theorem above involving the Lagrange multiplier.

These results provide a fundamental tool for the analysis of optimal control problems by making use of variational calculus.


4.3 Unconstrained optimal control

Subsequently, unconstrained dynamic optimization problems in Bolza form are considered, given by

min_u J(u) = ϕ(te, x(te)) + ∫_{t0}^{te} l(t, x(t), u(t)) dt    (4.46a)

subject to

ẋ = f(t, x, u),  t > t0,  x(t0) = x0 ∈ Rn    (4.46b)

with free or fixed end-time te and endpoint xe, respectively.

We will thereby primarily focus on optimal open-loop control, where we seek u(t) = u∗(t) as a function of time for a specified initial state and, where applicable, final state.

4.3.1 Existence of an optimal control

It can be shown that if g(t, x) = f(t, x, u(t)) is piecewise continuous in t and locally Lipschitz continuous in x according to Remark 2.3, then there exists a δ > 0 such that the initial value problem

ẋ = g(t, x),  x(t0) = x0    (4.47)

has a unique solution for t ∈ [t0, t0 + δ] (see, e.g., [7, 8]). Global existence can be achieved either by restricting g(t, x) to the very restrictive set of globally Lipschitz continuous functions or by imposing further information about the solution of the system. In particular, assume that g(t, x) is piecewise continuous in t and locally Lipschitz continuous in x for all x ∈ V ⊂ Rn and let W be a compact subset of V with x0 ∈ W. Suppose that the solution x(t) to (4.47) lies entirely in W; then there is a unique solution for all t ≥ t0. Due to the assumption of piecewise continuity in t, input trajectories u(t) ∈ Ĉ0([t0,te],Rm) are admissible, so that x(t) ∈ Ĉ1([t0,te],Rn) with corner points at the discontinuities of u(t).

Besides the situations where there does not exist a feasible control or feasible pair, respectively, according to Definition 4.1, the non-existence of a solution to an optimal control problem results from the failure of the set Ufe of feasible controls to be compact.

Remark 4.6 (Compact set). A set V in a normed linear space (X, ‖·‖) is said to be compact if every sequence in V contains a convergent subsequence with its limit point in V. The set V is called relatively compact if its closure V̄ (add to V all limit points of sequences in V) is compact.

In view of the previous discussion, issues arise when the solution to (4.46b) becomes unbounded in the interval t ∈ [t0, te], te < ∞, so that the cost functional J(u) approaches infinity. This corresponds to a so–called finite escape time. Hence, one typically requires the solutions to (4.46b) to be bounded, i.e.,

‖x(t; x0, u(t))‖ ≤ α,  t ≥ t0

for finite α > 0. One particular class of systems not exhibiting finite escape behavior is given by systems that are affine in x, i.e., ẋ = A(t, u)x + b(t, u), t > t0, with x(t0) = x0. In addition, if the optimization interval [t0, te] is unbounded, i.e., te = ∞, then the set of feasible controls is itself unbounded and hence not compact. Hence, operations should be restricted to a compact finite interval t ∈ [t0, T] with T sufficiently large so that the set of feasible controls Ufe = Ufe([t0, te]) ≠ ∅ for some te ∈ (t0, T].

Example 4.7. Consider a point mass m that is accelerated by a force u(t) with 0 ≤ u(t) ≤ 1 in the interval t ∈ [t0, te], i.e.,

ẋ = [0, 1 ; 0, 0] x + [0 ; 1/m] u,  t > t0,  x(t0) = [x0 ; 0],

where x0 denotes the initial position. It is desired to determine the input u(t) so that, starting from x1(t0) = x0, the mass reaches the point x1(te) = xe at time t = te while minimizing the cost functional

J(u) = ∫_{t0}^{te} u²(t) dt.

Given xe > x0 it follows immediately that u(t) ≡ 0 is infeasible, so that J(u) > 0 for any feasible control u(t). For the sequence of constant admissible controls uk(t) = 1/k, k ≥ 1 and t ≥ t0, the solution to the differential equation is obtained as

x1(t) = (1/(2mk)) (t − t0)² + x0
x2(t) = (1/(mk)) (t − t0).

The value x1(te) = xe is hence reached at time

te,k = t0 + √(2mk(xe − x0)).

With this, the value of the cost functional in the interval t ∈ [t0, te,k] follows as

J(uk) = ∫_{t0}^{te,k} (1/k²) dt = √( 2m(xe − x0)/k³ ).

Thus, for k → ∞ we have J(uk) → 0 while te,k → ∞. This implies inf_k J(uk) = 0, so that the problem does not have a minimum.
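The behavior derived in Example 4.7 can be reproduced numerically; the following Matlab lines evaluate te,k and J(uk) for growing k, where the data m, x0, and xe are hypothetical values chosen for illustration.

m = 1; x0 = 0; xe = 1; t0 = 0;              % hypothetical data
for k = [1 10 100 1000]
    tek = t0 + sqrt(2*m*k*(xe-x0));         % time to reach xe
    Jk  = (tek-t0)/k^2;                     % = sqrt(2*m*(xe-x0)/k^3)
    fprintf('k = %5d: te = %8.2f, J = %.3e\n',k,tek,Jk);
end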

Similar examples can be constructed where an optimal control does not exist although the time horizon is finite and the solution to the dynamic system remains bounded.

The discussion reveals that additional conditions have to be imposed on the class of admissible controls. On the one hand, the input may be required to fulfill an additional Lipschitz condition, i.e.,

‖u(t) − u(s)‖ ≤ Lu |t − s|,  Lu ∈ (0, ∞)    (4.48)

for all t, s ∈ [t0, te]. On the other hand, the class of admissible controls may be restricted to the set of piecewise constant inputs with at most a finite number of points of discontinuity.

4.3.2 Application of variational calculus

In the following, necessary optimality conditions are derived for optimal control problems of the form (4.46) making use of variational calculus as introduced in the sections before.

4.3.2.1 Fixed–time, free–endpoint problems. In the optimal control problem with fixed time and free endpoint we seek the input u(t) ∈ C([t0, te],Rm) minimizing the cost functional

J(u) = ∫_{t0}^{te} l(t,x(t),u(t)) dt (4.49a)

subject to the equality constraint given by the dynamic system

ẋ = f(t,x,u), t > t0, x(t0) = x0 ∈ Rn (4.49b)

for fixed endtime te. Here it is obvious that a variation of the state trajectory x(t) in terms of x(t) = x∗(t) + ηξ(t) does not explicitly relate to a variation in the input u(t) due to the implicit coupling given by the differential equation (4.49b). Hence, to deduce first order optimality conditions we proceed by introducing a one–parameter family of comparison trajectories in u(t), i.e., u(t) = u∗(t) + ηω(t) with ω(t) ∈ C([t0, te],Rm). These considerations allow to prove the following result.

Theorem 4.9 (First order necessary condition (fixed–time, free–endpoint problem)). Consider the minimization problem

min_u J(u) = ∫_{t0}^{te} l(t,x(t),u(t)) dt (4.50a)

subject to

ẋ = f(t,x,u), t > t0, x(t0) = x0 ∈ Rn (4.50b)

for fixed terminal time te > t0. Assume that l and f are continuous in (t,x,u) and continuously differentiable with respect to x, u for all (t,x,u) ∈ [t0, te] × Rn × Rm.

Suppose that u∗(t) ∈ C([t0, te],Rm) is a (local) minimizer of (4.50) with x∗(t) ∈ C1([t0, te],Rn) denoting the corresponding solution of the initial–value problem (4.50b). Then there is a λ∗(t) ∈ C1([t0, te],Rn) such that the triple (u∗(t),x∗(t),λ∗(t)) satisfies

ẋ∗ = f(t,x∗,u∗), x∗(t0) = x0 (4.51a)
λ̇∗ = −(∇xl)(t,x∗,u∗) − (∇xf)T(t,x∗,u∗)λ∗(t), λ∗(te) = 0 (4.51b)
0 = (∇ul)(t,x∗,u∗) + (∇uf)T(t,x∗,u∗)λ∗(t) (4.51c)

for t ∈ [t0, te]. These equations are called Euler–Lagrange equations of the optimal control problem (4.50) and λ∗(t) is referred to as the adjoint state or co–state.

Proof. Consider the one–parameter family of comparison functions u(t) = u(t; η) = u∗(t) + ηω(t) with ω(t) ∈ C([t0, te],Rm) and the scalar parameter η. Due to the assumption of continuity and differentiability of f there exists an η̄ > 0 such that the solution x(t; η) of (4.50b) associated with u(t; η) exists, is unique and is differentiable in η for all η ∈ Bη̄(0) for all t ∈ [t0, te] (see the discussion in Section 4.3.1). In addition, η = 0 implies x(t; 0) = x∗(t) for t ∈ [t0, te].

To proceed, recall from Section 4.2.2 that equality constraints can be included in the formalism by introducing a Lagrange multiplier λ(t). Hence, substitution of u(t; η) into the cost functional J(u) resulting from (4.50) yields

J(u(t; η)) = ∫_{t0}^{te} { l(t,x(t; η),u(t; η)) + λT(t)[ f(t,x(t; η),u(t; η)) − ẋ(t; η) ] } dt
 = ∫_{t0}^{te} { l(t,x(t; η),u(t; η)) + λT(t)f(t,x(t; η),u(t; η)) + λ̇T(t)x(t; η) } dt − [ λT(t)x(t; η) ]_{t=t0}^{t=te}

for any λ(t) ∈ C1([t0, te],Rn) and each η ∈ Bη̄(0). The first order necessary optimality condition introduced in Theorem 4.1 hence imposes

δJ(u∗,ω) = ∂/∂η J(u(t; η))|_{η=0} = 0

so that

0 = ∫_{t0}^{te} { (∇xl)T(t,x∗,u∗) + λT(t)(∇xf)(t,x∗,u∗) + λ̇T(t) } ξ(t) dt
 + ∫_{t0}^{te} { (∇ul)T(t,x∗,u∗) + λT(t)(∇uf)(t,x∗,u∗) } ω(t) dt − λT(te)ξ(te) + λT(t0)ξ(t0)
 (4.52)

for any ω(t) ∈ C([t0, te],Rm) and any λ(t) ∈ C1([t0, te],Rn). Herein,

ξ(t) = (∂/∂η x)(t; 0).

The integral kernels are continuous so that δJ(u∗,ω) exists.

Since the effect of the variation of u∗(t) on the solution in terms of ξ(t) is difficult to determine, λ(t) = λ∗(t) is chosen so that

λ̇∗(t) = −(∇xl)(t,x∗,u∗) − (∇xf)T(t,x∗,u∗)λ∗(t).

Taking into account the Taylor series, x(t; η) approximately satisfies

x(t; η) ≈ x(t; 0) + (∂/∂η x)(t; 0) η = x∗(t) + ξ(t)η.

The initial condition x∗(t0) = x0 = x(t0; 0) hence implies ξ(t0) = 0. Recalling that x(te) is free we have

λ∗(te) = 0

in (4.52). The adjoint differential equation is linear in λ∗(t) and its solution exists and is unique in the interval t ∈ [t0, te] by the continuity and differentiability assumptions imposed on l and f.

With this, (4.52) reduces to

0 = ∫_{t0}^{te} { (∇ul)T(t,x∗,u∗) + λT(t)(∇uf)(t,x∗,u∗) } ω(t) dt

so that the fundamental lemma of variational calculus provided in Lemma 4.2 yields

(∇ul)T(t,x∗,u∗) + λT(t)(∇uf)(t,x∗,u∗) = 0

for all t ∈ [t0, te]. □

Some comments are in order [6, 1, 3, 2].

(i) The optimality conditions (4.51) consist of 2n differential equations (4.51a), (4.51b) in x∗(t), λ∗(t) and m algebraic equations (4.51c). Since the initial state at t = t0 in x∗(t) and the final state at t = te in λ∗(t) are provided, (4.51) corresponds to a so–called two–point boundary–value problem.

(ii) The Euler–Lagrange equations (4.51) can be re–written using the Hamiltonian function

H(t,x,u,λ) = l(t,x,u) + λTf(t,x,u) (4.53)

as

ẋ∗ = (∇λH)(t,x∗,u∗,λ∗), x∗(t0) = x0 (4.54a)
λ̇∗ = −(∇xH)(t,x∗,u∗,λ∗), λ∗(te) = 0 (4.54b)
0 = (∇uH)(t,x∗,u∗,λ∗) (4.54c)

for t ∈ [t0, te]. The last condition (4.54c) illustrates that for the triple (u∗(t),x∗(t),λ∗(t)) to be a local minimizer of J the input u∗(t) must necessarily be a stationary point of the Hamiltonian function for each t ∈ [t0, te].

(iii) The variation of the Hamiltonian function along an optimal trajectory in view of (4.54) results in

d/dt H = ∂/∂t H + (∇xH)T ẋ∗ + (∇uH)T u̇∗ + fT λ̇∗
 = ∂/∂t H + (∇uH)T u̇∗ + fT [ (∇xH) + λ̇∗ ] = ∂/∂t H.

If neither f nor l depend explicitly on time, then the Hamiltonian function H is constant along an optimal trajectory and is an invariant of the two–point boundary–value problem (4.51). The Hamiltonian function is in this case also called a first integral of (4.51).

(iv) The Euler–Lagrange equations (4.51) or (4.54), respectively, are necessary for both a minimization and a maximization problem. For a (local) minimizer u∗(t) the Legendre condition introduced in Theorem 4.3 implies that necessarily

(∇²uH)(t,x∗,u∗,λ∗) ≥ 0, (4.55)

i.e., the Hessian matrix of the Hamiltonian function with respect to u must be positive semi–definite. For a (local) maximizer u∗(t) it is obvious that (∇²uH)(t,x∗,u∗,λ∗) must be negative semi–definite.

(v) If the cost functional includes a term involving a terminal cost, i.e.,

J(u) = ϕ(te,x(te)) + ∫_{t0}^{te} l(t,x(t),u(t)) dt,

it is an easy exercise to show that the optimal solution (u∗(t),x∗(t),λ∗(t)) must still satisfy (4.51) or (4.54), respectively, but with the terminal condition λ∗(te) = 0 replaced by

λ∗(te) = (∇xϕ)(te,x∗(te)). (4.56)

(vi) In some situations it is convenient to express u∗(t) in terms of x∗(t) and λ∗(t) using (4.54c) and then substitute this expression into (4.54a), (4.54b) to obtain a two–point boundary–value problem in x∗(t) and λ∗(t) alone.

(vii) The adjoint state λ∗(t) can be interpreted in the sense that λ∗(t0) corresponds to the sensitivity of the cost functional (4.50a) to changes in the initial condition x0.

Moreover, note the following remark referring to the case where u∗(t) is sought in the class of piecewise continuous functions.

Remark 4.7. Theorem 4.9 relies on the assumption that u∗(t) is continuous, i.e., u∗(t) ∈ C([t0, te],Rm). There are, however, examples where no solution of the Euler–Lagrange equations (4.51) can be found in this function class. Hence, one seeks minimizers in the extended class of piecewise continuous functions, i.e., u∗(t) ∈ Ĉ0([t0, te],Rm). As noted in Section 4.3.1, given u(t) ∈ Ĉ0([t0, te],Rm) the corresponding solutions to the differential equation (4.49b) are piecewise continuously differentiable, i.e., x(t) ∈ Ĉ1([t0, te],Rn) with cornering points at the discontinuities of u(t). Referring by u∗(t) ∈ Ĉ0([t0, te],Rm) to the optimal input with x∗(t) and λ∗(t) the corresponding state and adjoint state of the optimization problem (4.49a), at each cornering point c ∈ [t0, te] the following conditions have to hold:

x∗(c−) = x∗(c+) (4.57a)
λ∗(c−) = λ∗(c+) (4.57b)
H(c−,x∗(c−),u∗(c−),λ∗(c−)) = H(c+,x∗(c+),u∗(c+),λ∗(c+)), (4.57c)

where c− and c+ denote the left– and right–hand limits.


4.3.2.2 Free–time, fixed–endpoint problems. Differing from the previous section, we subsequently seek the input u(t) ∈ C([t0, te],Rm) solving an optimal control problem with free terminal time and fixed endpoint.

Theorem 4.10 (First order necessary optimality condition (free–time, fixed–endpoint problem)). Consider the minimization problem

min_u J(u) = ϕ(te,x(te)) + ∫_{t0}^{te} l(t,x(t),u(t)) dt (4.58a)

subject to

ẋ = f(t,x,u), t > t0, x(t0) = x0 ∈ Rn (4.58b)
Gk(te,u) = ψk(te,x(te)) = 0, k = 1, . . . , p (4.58c)

for fixed initial time t0 and free terminal time te ≤ T. Assume that l and f are continuous in (t,x,u) and continuously differentiable with respect to x, u for all (t,x,u) ∈ [t0, T] × Rn × Rm. In addition, assume that ϕ and ψk, k = 1, . . . , p are continuously differentiable with respect to te and x(te) = xe for all (te,xe) ∈ [t0, T] × Rn.

Suppose that (u∗(t), t∗e) ∈ C([t0, te],Rm) × [t0, T) is a (local) minimizer of (4.58) with x∗(t) ∈ C1([t0, T],Rn) denoting the corresponding solution of the initial–value problem (4.58b). Suppose further that the regularity condition

det [ δG1(t∗e,u∗(t); τ1,ω1(t)) · · · δG1(t∗e,u∗(t); τp,ωp(t)) ; ⋮ ; δGp(t∗e,u∗(t); τ1,ω1(t)) · · · δGp(t∗e,u∗(t); τp,ωp(t)) ] ≠ 0 (4.59)

holds for p independent directions (ωk(t), τk) ∈ C([t0, te],Rm) × [t0, T), k = 1, . . . , p. Then there is a λ∗(t) ∈ C1([t0, t∗e],Rn) and a µ∗ ∈ Rp such that the tuple (u∗(t),x∗(t), t∗e,λ∗(t),µ∗) satisfies the Euler–Lagrange equations

ẋ∗ = (∇λH)(t,x∗,u∗,λ∗), x∗(t0) = x0 (4.60a)
λ̇∗ = −(∇xH)(t,x∗,u∗,λ∗), λ∗(t∗e) = (∇xeφ)(t∗e,x∗(t∗e),µ∗) (4.60b)
0 = (∇uH)(t,x∗,u∗,λ∗) (4.60c)

for t ∈ [t0, t∗e] and the transversality conditions

ψ(t∗e,x∗(t∗e)) = [ψ1(t∗e,x∗(t∗e)) · · · ψp(t∗e,x∗(t∗e))]T = 0 (4.61a)
(∂/∂te φ)(t∗e,x∗(t∗e),µ∗) + H(t∗e,x∗(t∗e),u∗(t∗e),λ∗(t∗e)) = 0 (4.61b)

with the Hamiltonian function

H(t,x,u,λ) = l(t,x,u) + λTf(t,x,u) (4.62)

and

φ(te,x(te),µ) = ϕ(te,x(te)) + µTψ(te,x(te)). (4.63)

The proof of this result in principle combines the procedures used to verify Theorem 4.4 and Theorem 4.9 above.

Proof. Consider the one–parameter family of comparison functions u(t) = u(t; η) = u∗(t) + ηω(t) with ω(t) ∈ C([t0, T],Rm) and the scalar parameter η. Due to the assumption of continuity and differentiability of f there exists an η̄ > 0 such that the solution x(t; η) of (4.58b) associated with u(t; η) exists, is unique and is differentiable in η for all η ∈ Bη̄(0) for all t ∈ [t0, T] (see the discussion in Section 4.3.1). In addition, η = 0 implies u(t; 0) = u∗(t) and x(t; 0) = x∗(t) for t ∈ [t0, t∗e].

The cost functional (4.58a) extended by the equality constraints (4.58b), (4.58c) and with the terminal time te = t∗e + ητ reads

J(te,u(t; η)) = ϕ(te,x(te; η)) + µTψ(te,x(te; η))
 + ∫_{t0}^{te} { l(t,x(t; η),u(t; η)) + λT(t)[ f(t,x(t; η),u(t; η)) − ẋ(t; η) ] } dt
 = ϕ(te,x(te; η)) + µTψ(te,x(te; η)) − [ λT(t)x(t; η) ]_{t=t0}^{t=te}
 + ∫_{t0}^{te} { l(t,x(t; η),u(t; η)) + λT(t)f(t,x(t; η),u(t; η)) + λ̇T(t)x(t; η) } dt.

The first order necessary optimality condition of Theorem 4.1 implies for the local minimizer that

δJ(t∗e,u∗; τ,ω) = ∂/∂η J(te,u(t; η))|_{η=0} = 0.

The Gateaux derivative at (u∗(t), t∗e) in any direction (ω(t), τ) ∈ C([t0, T],Rm) × [t0, T) evaluates¹ to

0 = δJ(t∗e,u∗; τ,ω)
 = { −λ̇T(t∗e)x(t∗e; 0) − λT(t∗e)ẋ(t∗e; 0) + (∂/∂te ϕ)(t∗e,x(t∗e; 0)) + (∇xeϕ)T(t∗e,x(t∗e; 0))ẋ(t∗e; 0)
  + µT[ (∂/∂te ψ)(t∗e,x(t∗e; 0)) + (∇xeψ)(t∗e,x(t∗e; 0))ẋ(t∗e; 0) ] + l(t∗e,x(t∗e; 0),u(t∗e; 0))
  + λT(t∗e)f(t∗e,x(t∗e; 0),u(t∗e; 0)) + λ̇T(t∗e)x(t∗e; 0) } τ
 + { (∇xeϕ)T(t∗e,x(t∗e; 0)) + µT(∇xeψ)(t∗e,x(t∗e; 0)) − λT(t∗e) } ξ(t∗e)
 + λT(t0)ξ(t0)
 + ∫_{t0}^{t∗e} { (∇xl)T(t,x(t; 0),u(t; 0)) + λT(t)(∇xf)(t,x(t; 0),u(t; 0)) + λ̇T(t) } ξ(t) dt
 + ∫_{t0}^{t∗e} { (∇ul)T(t,x(t; 0),u(t; 0)) + λT(t)(∇uf)(t,x(t; 0),u(t; 0)) } ω(t) dt

with ξ(t) = (∂/∂η x)(t; 0). Since the initial value is fixed we have, as in the proof of Theorem 4.9, that x(t0; η) = x0 and ξ(t0) = 0. Noting that x(t∗e; 0) = x∗(t∗e) and u(t∗e; 0) = u∗(t∗e) the expression reduces to

0 = { (∂/∂te ϕ)(t∗e,x∗(t∗e)) + µT(∂/∂te ψ)(t∗e,x∗(t∗e)) + l(t∗e,x∗(t∗e),u∗(t∗e)) + λT(t∗e)f(t∗e,x∗(t∗e),u∗(t∗e)) } τ
 + { (∇xeϕ)T(t∗e,x∗(t∗e)) + µT(∇xeψ)(t∗e,x∗(t∗e)) − λT(t∗e) } ( ξ(t∗e) + ẋ∗(t∗e)τ )
 + ∫_{t0}^{t∗e} { (∇xl)T(t,x∗,u∗) + λT(t)(∇xf)(t,x∗,u∗) + λ̇T(t) } ξ(t) dt
 + ∫_{t0}^{t∗e} { (∇ul)T(t,x∗,u∗) + λT(t)(∇uf)(t,x∗,u∗) } ω(t) dt
 (4.64)

Since the effect of the variation of u∗(t) on the solution in terms of ξ(t) is hard to determine, the adjoint state λ∗(t) is chosen so that the first integral arising in (4.64) vanishes, i.e.,

λ̇∗(t) = −(∇xl)(t,x∗,u∗) − (∇xf)T(t,x∗,u∗)λ∗(t)

with the terminal condition obtained from setting the coefficient of (ξ(t∗e) + ẋ∗(t∗e)τ) to zero, i.e.,

λ∗(t∗e) = (∇xeϕ)(t∗e,x∗(t∗e)) + (∇xeψ)T(t∗e,x∗(t∗e))µ∗. (4.65)

The transversality condition (4.61b) follows from lines 1 and 2 of (4.64) and the remaining equations are obtained by making use of the fundamental lemma of variational calculus. □

¹ Remember that the upper integration limit depends on η since te = t∗e + ητ.

Some comments on Theorem 4.10 are in order.

(i) The optimality conditions (4.60) consist of 2n differential equations (4.60a), (4.60b) in x∗(t), λ∗(t) and m algebraic equations (4.60c). Since the initial state at t = t0 in x∗(t) and the final state at t = te in λ∗(t) are provided, we have to deal with a two–point boundary–value problem. The Lagrange multiplier µ∗ and, in the free–time case, the terminal time t∗e follow from the transversality conditions (4.61) so that a complete set of equations for the 2n + m + p + 1 unknowns is available.

(ii) For fixed end–time te = t∗e we have τ = 0 such that the term { · }τ vanishes in (4.64) and the transversality conditions reduce to (4.61a).

(a) If there is a component xj(t), j ∈ {1, . . . , n} of x(t) with fixed end–value xj(t∗e) = x∗j(t∗e) = xj,e, then ξj(t∗e) = 0 and there is no terminal condition for the corresponding adjoint state λ∗j(t∗e).

(b) If there is a component xj(t), j ∈ {1, . . . , n} of x(t) with free end–value xj(t∗e), then ξj(t∗e) ≠ 0 and the terminal condition for the corresponding adjoint state is λ∗j(t∗e) = (∂/∂xj,e ϕ)(t∗e,x∗(t∗e)).

(iii) For free end–time te we have τ ≠ 0 and the transversality condition (4.61b) has to hold.

(a) If there is a component xj(t), j ∈ {1, . . . , n} of x(t) with fixed end–value xj(t∗e) = x∗j(t∗e) = xj,e, then an admissible direction (τ, ξj(t)) has to fulfill

xj,e = x∗j(t∗e + ητ) + ηξj(t∗e + ητ)  ⇒  ∂/∂η xj,e |_{η=0} = τ ẋ∗j(t∗e) + ξj(t∗e) = 0.

Hence, there is no terminal condition for the corresponding adjoint state λ∗j(t∗e).

(b) If there is a component xj(t), j ∈ {1, . . . , n} of x(t) with free end–value xj(t∗e), then the terminal condition is determined from (4.65).

(iv) If the equality constraints in Theorem 4.10 are replaced by inequality constraints

Gk(te,u) = ψk(te,x(te)) ≤ 0, k = 1, . . . , p, (4.66)

then only (4.61a) has to be replaced by

ψk(te,x(te)) ≤ 0, k = 1, . . . , p (4.67a)
µ∗ ≥ 0 (4.67b)
ψT(te,x(te))µ∗ = 0, (4.67c)

which needs to be interpreted in the sense of the complementary slackness condition.

Remark 4.8 (Reachability condition). Theorem 4.10 relies on the verification of the regularity condition (4.59), which is difficult in general. As is elaborated, e.g., in [1], this condition can be interpreted as a reachability condition. Hence, if it does not hold, then it may not be possible to find a control u∗(t) such that the terminal conditions are fulfilled at the terminal time.


Example 4.8. Consider a particle of mass m moving in the (x, y)–plane subject to a thrust force of magnitude ma(t) with the thrust acceleration a(t) a known function of time [9, 10]. The goal is to steer the point mass in minimal time t ∈ [0, te] from the initial position (x0, y0) to a prescribed final position (xe, ye). Let u(t) denote the input corresponding to the angle between the direction of thrust and the x–axis. The optimization problem hence reads

min_{u(·)} te (4.68)

subject to

d/dt [x; v; y; w] = [v; a cos(u); w; a sin(u)],  t > 0,  [x(0); v(0); y(0); w(0)] = [x0; v0; y0; w0] (4.69)

with the terminal constraints

ψ1(te, x(te)) = x(te) − xe = 0
ψ2(te, y(te)) = y(te) − ye = 0. (4.70)

Note that no conditions are imposed on the terminal velocities. Let x(t) = [x(t), v(t), y(t), w(t)]T; then the Hamiltonian function H and the function φ follow as

H(x, u,λ) = λ1 v + λ2 a cos(u) + λ3 w + λ4 a sin(u)
φ(te,x(te),µ) = te + µ1(x(te) − xe) + µ2(y(te) − ye).

With these preparations the Euler–Lagrange equations (4.60) for a minimizer candidate (te, u(t)) are obtained as (4.69) and

λ̇ = −(∇xH)(x, u,λ) = −[0; λ1; 0; λ3],  λ(te) = [µ1; 0; µ2; 0]
0 = (∇uH)(x, u,λ) = −λ2 a sin(u) + λ4 a cos(u).

The differential equations for the adjoint state λ(t) directly admit the solution

λ(t) = [µ1; µ1(te − t); µ2; µ2(te − t)].

The stationarity condition 0 = (∇uH) moreover implies

tan(u) = λ4 a / (λ2 a) = µ2/µ1

so that

u(t) = u = arctan(µ2/µ1), |u| < π/2.

Obviously, the candidate solution is constant on the whole interval [0, te]. For constant thrust acceleration a(t) = a the corresponding states are obtained analytically by solving (4.69), which results in

x(t) = x0 + t v0 + (t²/2) a cos(u)
v(t) = v0 + t a cos(u)
y(t) = y0 + t w0 + (t²/2) a sin(u)
w(t) = w0 + t a sin(u).

With the previous results we have herein

cos(u) = 1/√(1 + (µ2/µ1)²), sin(u) = (µ2/µ1)/√(1 + (µ2/µ1)²).

Since the terminal time is free the transversality condition (4.61b) has to be fulfilled, i.e.,

0 = (∂/∂te φ)(te,x(te),µ) + H(x(te), u,λ(te)) = 1 + µ1 v(te) + µ2 w(te). (4.71)

The remaining three unknowns te, µ1 and µ2 can thus be determined as the solution of the nonlinear system of algebraic equations

[ x0 + te v0 + (t²e/2) a / √(1 + (µ2/µ1)²) − xe ;
  y0 + te w0 + (t²e/2) a (µ2/µ1) / √(1 + (µ2/µ1)²) − ye ;
  1 + µ1 ( v0 + te a / √(1 + (µ2/µ1)²) ) + µ2 ( w0 + te a (µ2/µ1) / √(1 + (µ2/µ1)²) ) ] = 0 (4.72)

comprised of the two terminal constraints (4.70) and the transversality condition (4.71). Numerical techniques for the determination of the zeros of (4.72) have already been introduced in Section 2.2. For the numerical results depicted in Figure 4.3 the function fsolve of Matlab is used.
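A minimal sketch of the corresponding fsolve setup is given below. The initial data, the target position and the initial guess are assumptions for illustration and do not reproduce a specific panel of Figure 4.3.

  % Example 4.8: solve (4.72) for p = [te; mu1; mu2] with fsolve
  a = 1; x0 = 0; v0 = 0; y0 = 0; w0 = 0; xe = 1; ye = 1;   % assumed data
  cu  = @(p) 1/sqrt(1 + (p(3)/p(2))^2);                    % cos(u)
  su  = @(p) (p(3)/p(2))/sqrt(1 + (p(3)/p(2))^2);          % sin(u)
  res = @(p) [x0 + p(1)*v0 + p(1)^2/2*a*cu(p) - xe;
              y0 + p(1)*w0 + p(1)^2/2*a*su(p) - ye;
              1 + p(2)*(v0 + p(1)*a*cu(p)) + p(3)*(w0 + p(1)*a*su(p))];
  p = fsolve(res, [2; -1; -1]);          % initial guess is an assumption
  u = atan(p(3)/p(2));                   % constant thrust angle u*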

Example 4.9 (Linear–quadratic regulation). Consider the linear time–varying MIMO system

ẋ = A(t)x + B(t)u(t), t > 0, x(t0) = x0

with x(t) ∈ Rn and u(t) ∈ Rm. Determine the input u∗(t) to minimize the quadratic cost functional

J(u) = ½ xT(te)Se x(te) + ½ ∫_{t0}^{te} [ uT(t)R(t)u(t) + xT(t)Q(t)x(t) ] dt

for given endtime te. Herein, Se and Q(t) are assumed positive semi–definite and R(t) is assumed positive definite for all t ∈ [t0, te]. With the Hamiltonian function

H(t,x,u,λ) = ½ [ uTR(t)u + xTQ(t)x ] + λT [ A(t)x + B(t)u ]

and

φ(te,x(te)) = ½ xT(te)Se x(te)

the Euler–Lagrange equations (4.60) read

ẋ = (∇λH)(t,x,u,λ) = A(t)x + B(t)u, x(t0) = x0
λ̇ = −(∇xH)(t,x,u,λ) = −Q(t)x − AT(t)λ, λ(te) = Se x(te)
0 = (∇uH)(t,x,u,λ) = R(t)u + BT(t)λ.

The candidate for the optimal control follows from the last equation, i.e.,

u = −R−1(t)BT (t)λ, (4.73)

[Fig. 4.3: Time optimal control of the point mass for different initial velocities v0 and w0 with a = 1. The four panels show the trajectories in the (x, y)–plane for (a) (v0, w0) = (0, 0), (b) (v0, w0) = (0, 1), (c) (v0, w0) = (−1, 0) and (d) (v0, w0) = (−1, 1).]

and relies on the computation of the adjoint state λ(t). Substitution of u(t) into the Euler–Lagrange equations yields the two–point boundary–value problem

[ẋ; λ̇] = [ A(t), −B(t)R−1(t)BT(t) ; −Q(t), −AT(t) ] [x; λ],  [x(t0); λ(te)] = [x0; Se x(te)]. (4.74)

To solve this problem, we use a sweep method by assuming that λ(t) = P(t)x(t) so that λ(t0) = P(t0)x0, i.e., the terminal condition λ(te) = Se x(te) is swept back in time. Substitution of this relation provides

Ṗ(t) = −P(t)A(t) − AT(t)P(t) + P(t)B(t)R−1(t)BT(t)P(t) − Q(t), P(te) = Se. (4.75)

Hence, P(t) has to solve this matrix Riccati differential equation. Obviously, the differential equation is nonlinear in P(t) and has to be solved backward in time with the terminal condition P(te) = Se. Once P(t) is computed, the optimal trajectories x∗(t) and λ∗(t) can be determined forward in time by solving (4.74) with the initial value λ(t0) = P(t0)x0. The optimal control hence follows from (4.73) in the form

u∗(t) = −R−1(t)BT(t)P(t)x∗(t). (4.76)

Moreover note that since

(∇²uH)(t,x∗,u∗,λ∗) = R(t)

with R(t) positive definite by assumption, the determined optimal control u∗(t) is a minimizer.

The corresponding optimal feedback controller minimizing the quadratic cost functional is similarly obtained as

u = −K(t)x, K(t) = R−1(t)BT(t)P(t) (4.77)

with P(t) solving (4.75) and x(t) the system state at time t.
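For illustration, the following Matlab/Octave sketch integrates the Riccati differential equation (4.75) backward in time and simulates the closed loop (4.77) for a time–invariant double integrator; all numerical values are assumptions.

  function demo_riccati_sweep
    A = [0 1; 0 0]; B = [0; 1];               % assumed system
    Q = eye(2); R = 1; Se = zeros(2);         % assumed weights
    te = 5; x0 = [1; -1];
    % integrate (4.75) backward via s = te - t: dP/ds = PA + A'P - PBR^{-1}B'P + Q
    [s, Ps] = ode45(@(s, p) rhs(p, A, B, Q, R), [0 te], Se(:));
    Pfun = @(t) reshape(interp1(s, Ps, te - t), 2, 2);
    % closed-loop simulation with u = -R^{-1}B'P(t)x
    [t, x] = ode45(@(t, x) (A - B*(R\(B'*Pfun(t))))*x, [0 te], x0);
    plot(t, x)
  end

  function dp = rhs(p, A, B, Q, R)
    P  = reshape(p, 2, 2);
    dP = P*A + A'*P - P*B*(R\(B'*P)) + Q;     % backward Riccati right-hand side
    dp = dP(:);
  end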

For the special case of an infinite optimization horizon with te → ∞, the terminal cost is meaningless since limt→∞ x(t) = 0 needs to hold to ensure the existence of the integral term in J(u). Hence, we have to impose Se = 0. For the sake of simplicity consider the time–invariant setting with A(t) = A, B(t) = B, Q(t) = Q, and R(t) = R given Q positive semi–definite and R positive definite as before. Let the linear time–invariant system be stabilizable (there exists a matrix K such that the eigenvalues of A − BK have negative real part) and let the pair (A,C) be detectable, where C originates from the Cholesky decomposition of Q, i.e., Q = CTC. Then the matrix Riccati differential equation (4.75) reduces to the algebraic Riccati equation

PA + ATP − PBR−1BTP + Q = 0. (4.78)

In view of the assumptions above it can be shown that there exists a unique positive semi–definite solution P to (4.78). In addition, it can be shown that the static feedback law

u = −Kx, K = R−1BTP (4.79)

with P the unique positive semi–definite solution to (4.78) under the imposed assumptions implies asymptotic stability of the closed loop, i.e., the eigenvalues of the matrix A − BR−1BTP have strictly negative real part. For further details the reader is, e.g., referred to [11, 12].
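In the time–invariant case, the stabilizing solution of (4.78) and the gain (4.79) can be computed directly, e.g., with the lqr command of Matlab (or of the Octave control package); a minimal sketch for the assumed double integrator from above:

  A = [0 1; 0 0]; B = [0; 1]; Q = eye(2); R = 1;
  [K, P] = lqr(A, B, Q, R);   % K = R^{-1}B'P with P the stabilizing solution of (4.78)
  eig(A - B*K)                % eigenvalues with strictly negative real part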

Exercise 4.1. Determine the solution of the optimal control problem

min_u J(u) = ∫_0^1 ( ½ u²(t) + (a/2) x²(t) ) dt

subject to

ẋ = u, t > 0, x(0) = 1

and x(1) = 0.

Solution. The solution is given by

x∗(t) = −sinh(√a (t − 1)) / sinh(√a),  u∗(t) = −√a cosh(√a (t − 1)) / sinh(√a),  λ∗(t) = −u∗(t).

A graphical illustration when varying the parameter a is given below.

[Figure: x∗(t), u∗(t) and λ∗(t) over t ∈ [0, 1] for a = 1, a = 10 and a = 500.]
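The closed–form solution can be evaluated directly; a short plotting sketch reproducing the qualitative behavior:

  t = linspace(0, 1, 200);
  for a = [1 10 500]
    xs = -sinh(sqrt(a)*(t - 1))/sinh(sqrt(a));           % x*(t)
    us = -sqrt(a)*cosh(sqrt(a)*(t - 1))/sinh(sqrt(a));   % u*(t), lambda* = -u*
    subplot(1, 3, 1), plot(t, xs), hold on
    subplot(1, 3, 2), plot(t, us), hold on
    subplot(1, 3, 3), plot(t, -us), hold on
  end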


4.4 Input constrained optimal control

In the previous section it was assumed that no constraints are imposed on the optimization problem in addition to a possibly fixed endtime and/or endpoint. This assumption will be weakened by introducing Pontryagin's maximum principle.

4.4.1 Pontryagin maximum principle

Subsequently, we consider in a first step dynamic optimization problems comprised of the cost functional

J(u) = ∫_{t0}^{te} l(x(t),u(t)) dt

with free endtime te and fixed terminal value x(te) = xe subject to the time–invariant or autonomous² dynamic system

ẋ = f(x,u), t > t0, x(t0) = x0.

Differing from the previous section, admissible controls are taken in the class of piecewise continuous functions

u ∈ U = { u ∈ Ĉ0([t0, T],Rm) : u(t) ∈ U, t0 ≤ t ≤ te }

for T ≥ te sufficiently large and U the non–empty set of input constraints.

Some ideas underlying Pontryagin's maximum principle³ can be motivated by properly reformulating the optimization problem in an extended state space. For this, consider x̄T(t) = [xT(t), xn+1(t)] with

xn+1(t) = ∫_{t0}^{t} l(x(s),u(s)) ds.

Hence, the optimization problem can be considered as finding an admissible control u(t) ∈ U and an endtime te so that the solution of the extended system

d/dt x̄ = [ẋ; ẋn+1] = [f(x,u); l(x,u)] = f̄(x̄,u),  t > t0,  x̄(t0) = [x0; 0] (4.80)

terminates at the point x̄T(te) = [xTe, xn+1(te)] with xn+1(te) taking the smallest possible value. To illustrate this fact consider Figure 4.4. Along the line passing through the point (xe, 0) one finds all terminal points of solution trajectories of the extended system for t = te with different values of the cost functional xn+1(te). No other trajectory can intersect this line at a point below (xe, x∗n+1(te)). These geometric observations form the basis for the derivation of the Pontryagin maximum principle.

Due to the necessary technical efforts, in the following only results are stated without providing proofs. For these, the reader is referred to the provided literature [13, 14].

Theorem 4.11 (Pontryagin maximum principle for autonomous systems). Consider the optimal control problem

² Both f and l do not explicitly depend on time t.
³ As the name suggests, the maximum principle was originally formulated for maximization problems [13]. In view of the consideration of minimization problems one should rather refer to it as a minimum principle. However, the name maximum principle is so commonly used in dynamic optimization and optimal control that we will not make this distinction.

[Fig. 4.4: Geometric interpretation of the reformulated optimization problem. In the (x1, x2, xn+1) space, feasible trajectories x̄(t) of the extended system start at (x0, 0) and terminate on the vertical line through (xe, 0) at the cost values J(u); the optimal trajectory x̄∗(t) attains the lowest value J(u∗).]

min_{u∈U} J(u) = ∫_{t0}^{te} l(x(t),u(t)) dt (4.81a)

subject to

ẋ = f(x,u), t > t0, x(t0) = x0, x(te) = xe, (4.81b)

with x0, xe ∈ Rn for fixed initial time t0 and free terminal time te ≤ T. Assume that l and f are continuous in (x,u) and continuously differentiable with respect to x for all (x,u) ∈ Rn × Rm.

Suppose that (u∗(t), t∗e) ∈ U × [t0, T) is a minimizer of (4.81) with x∗(t) the corresponding solution to (4.81b). Then there exists a non–zero λ̄∗(t) ∈ C1([t0, t∗e],Rn+1), λ̄∗(t) = [λ∗1(t), . . . , λ∗n+1(t)]T such that the canonical equations

d/dt x̄∗ = (∇λ̄H̄)(x̄∗,u∗,λ̄∗) = [f(x∗,u∗); l(x∗,u∗)],  x̄∗(t0) = [x0; 0],  x∗(t∗e) = xe (4.82a)
d/dt λ̄∗ = −(∇x̄H̄)(x̄∗,u∗,λ̄∗) = [−(∇xH̄)(x̄∗,u∗,λ̄∗); 0] (4.82b)

are fulfilled with the Hamiltonian function

H̄(x̄,u,λ̄) = λ̄T f̄(x̄,u) = [λ1 . . . λn] f(x,u) + λn+1 l(x,u) (4.82c)

of the extended system and:

(i) The optimal control u∗(t) minimizes the function H̄(x̄∗(t),u(t),λ̄∗(t)) for all t ∈ [t0, t∗e] over the set of input constraints U, i.e.,

H̄(x̄∗,v,λ̄∗) ≥ H̄(x̄∗,u∗,λ̄∗), ∀v ∈ U. (4.83)

(ii) For any t ∈ [t0, t∗e] the relations

λ∗n+1 = const. ≥ 0 (4.84a)
H̄(x̄∗,u∗,λ̄∗) = const. ≥ 0 (4.84b)

hold.

(iii) For free endtime te the transversality condition

H̄(x̄∗(t∗e),u∗(t∗e),λ̄∗(t∗e)) = 0 (4.85)

holds.

According to Theorem 4.11, 2n + m + 3 equations are available for the computation of the 2n + m + 3 unknowns (x̄∗,u∗,λ̄∗, t∗e). These are given by the 2n + 2 differential equations (4.82a) and (4.82b) for the extended state x̄∗(t) and the adjoint state λ̄∗(t), the m algebraic equations arising from (4.83), namely that for all t ∈ [t0, t∗e] the input u(t) = u∗(t) minimizes the function H̄(x̄∗(t),u(t),λ̄∗(t)) over the set of input constraints U, and the transversality condition (4.85). These are completed by the n + 1 initial conditions x̄(t0) = [x0; 0], the n terminal constraints x∗(t∗e) = xe and the adjoint terminal condition λ∗n+1(t∗e) ≥ 0. For the latter, we have to distinguish between two cases:

(i) If λ∗n+1(t) > 0, t ∈ [t0, t∗e], the entries λ∗j(t), j = 1, . . . , n are defined up to a common multiplier since H̄ is homogeneous with respect to λ̄∗(t). This is the so–called normal case and commonly the adjoint variables are normalized by imposing λ∗n+1(t) = 1, t ∈ [t0, t∗e].

(ii) If λ∗n+1(t) = 0, t ∈ [t0, t∗e], the so–called abnormal case arises, where H̄ is independent of l and hence of the cost functional so that the optimization problem is obviously ill–posed. Here, the adjoint variables λ∗j(t), j = 1, . . . , n are uniquely determined.

Remark 4.9 (Connection to first and second order necessary optimality conditions). The necessary conditions for u(t) being a minimizer of H̄(x̄(t),u(t),λ̄(t)) with λn+1(t) = 1 according to (4.83) directly coincide with the necessary first and second order conditions (4.54c) and (4.55), i.e.,

(∇uH̄)(x̄∗,u∗,λ̄∗) = 0, (∇²uH̄)(x̄∗,u∗,λ̄∗) ≥ 0, ∀t ∈ [t0, t∗e]. (4.86)

Nevertheless, Pontryagin's maximum principle is more general since the first condition (∇uH̄)(x̄∗,u∗,λ̄∗) = 0 in general does not hold if the minimum is located at the boundary of the set U of input constraints. Moreover, Theorem 4.11 only requires f and l to be continuous in u while the derivation of the Euler–Lagrange equations relies on their continuous differentiability in u.

With these considerations, the general procedure for the application of the maximum principle in the normal case, where λn+1(t) = 1 and λ(t) is written in place of λ̄(t), is given by the following steps:

(i) Set up the Hamiltonian function H(x,u,λ) = Σ_{j=1}^{n} λj fj(x,u) + l(x,u).

(ii) Solve the minimization problem

H(x,v,λ) ≥ H(x,u,λ), ∀v ∈ U

or equivalently

u = arg min_v { H(x,v,λ) : v ∈ U, t ∈ [t0, te] }

depending on (x,λ). This yields u = k(x,λ).

(iii) Substitution of u = k(x,λ) into (4.82a), (4.82b) results in the boundary–value problem

ẋ = (∇λH)(x,k(x,λ),λ), x(t0) = x0, x(te) = xe
λ̇ = −(∇xH)(x,k(x,λ),λ)

with the transversality condition (4.85) if te is free.

(iv) The (numerical) solution of the boundary–value problem yields (x∗(t),λ∗(t)) and the optimal control u∗(t) = k(x∗(t),λ∗(t)).


Example 4.10 (Normal case). Consider the optimal control problem

min_{u∈U} J(u) = ∫_{t0}^{te} ½ u²(t) dt

subject to

ẋ = u − x, t > 0, x(0) = 1, x(1) = 0

with the input constrained to u(t) ∈ [−0.6, 0] for t ∈ [0, 1]. The Hamiltonian function of the extended system reads

H̄(x, u,λ̄) = λ1(u − x) + ½ λ2 u²

so that (4.82a), (4.82b) evaluate to

ẋ = u − x, x(0) = 1, x(te) = 0
λ̇1 = −(∂/∂x H̄)(x, u,λ̄) = λ1
λ̇2 = 0.

This implies the solutions for the optimal adjoint states in the form

λ∗1 = C1 e^t, λ∗2 = C2

for constants C1 and C2 ≥ 0. Since the problem is normal, choose C2 = 1. By (4.83) the optimal solution u∗(t) has to satisfy the inequality

H̄(x∗, v,λ̄∗) ≥ H̄(x∗, u∗,λ̄∗), ∀v ∈ [−0.6, 0], ∀t ∈ [0, 1]

or in other words

u∗ = arg min_v { H̄(x∗, v,λ̄∗) : v ∈ [−0.6, 0], t ∈ [0, 1] }.

With

(∂/∂u H̄)(x∗, u∗,λ̄∗) = λ∗1 + λ∗2 u∗ = 0

the optimal control follows as

u∗ = { 0, λ∗1 ≤ 0 ; −λ∗1, λ∗1 ∈ (0, 0.6) ; −0.6, λ∗1 ≥ 0.6 }

in view of the input constraint. Taking C1 ≤ 0 implies λ∗1(t) = C1 e^t ≤ 0 so that u∗(t) = 0 for t ∈ [0, 1] and thus x∗(t) = e^{−t}. This solution is infeasible since x∗(1) = e^{−1} ≠ 0. Hence, we have to consider C1 > 0.

For C1 > 0 every optimal control is piecewise continuous and takes the values −C1 e^t and −0.6 with at most one corner point. To see this, note that λ∗1(t) = C1 e^t is strictly monotonically increasing such that u∗(t) decreases monotonically. Starting with u∗(1)(t) = −C1 e^t for t ∈ [0, c], the solution of the differential equation

ẋ∗(1) = u∗(1) − x∗(1), x∗(1)(0) = 1

in this time interval is given by

x∗(1)(t) = e^{−t} (1 + C1/2) − (C1/2) e^t.

[Fig. 4.5: Optimal control u∗(t) for Example 4.10.]

Similarly, for t ∈ (c, 1] with u∗(2)(t) = −0.6 the solution is obtained from

ẋ∗(2) = u∗(2) − x∗(2), x∗(2)(1) = 0

as

x∗(2)(t) = (3/5) (e^{1−t} − 1).

For the determination of the constant C1 recall from (4.84b) that the Hamiltonian function H̄(x∗, u∗,λ̄∗) has to be constant for all t ∈ [0, 1]. Hence we have

λ∗1 (u∗(1) − x∗(1)) + ½ λ∗2 (u∗(1))² = λ∗1 (u∗(2) − x∗(2)) + ½ λ∗2 (u∗(2))²  ⇒  −C1 (1 + C1/2) = −(3/5) e C1 + 9/50,

which yields the two solutions

C1,1 = 0.435721, C1,2 = 0.826218.

The switching time ts is deduced from the continuity condition

x∗(1)(ts) = x∗(2)(ts)  ⇒  −(C1/2) e^{2ts} + (3/5) e^{ts} + 1 − (3/5) e + C1/2 = 0.

Depending on the determined values of C1 the solution of this quadratic equation in e^{ts} yields the switching times

ts,1 = 0.319929, ts,2 = −0.319929.

Since t ∈ [0, 1] the only possible value is ts,1 = 0.319929 so that C1 = C1,1 = 0.435721. The optimal control hence follows as

u∗(t) = { −0.435721 e^t, t ∈ [0, 0.319929] ; −0.6, t > 0.319929 }

and is shown in Figure 4.5.
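The two scalar equations for C1 and ts are readily solved numerically, e.g., with fzero; the initial guesses in the following sketch are assumptions chosen near the reported values.

  % Example 4.10: constant C1 from the constancy of H, then the switching time ts
  g  = @(C1) -C1.*(1 + C1/2) + (3/5)*exp(1)*C1 - 9/50;
  C1 = fzero(g, 0.4);                                   % -> 0.435721
  h  = @(ts) -C1/2*exp(2*ts) + (3/5)*exp(ts) + 1 - (3/5)*exp(1) + C1/2;
  ts = fzero(h, 0.3);                                   % -> 0.319929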


Example 4.11 (Abnormal case). Consider the optimal control problem

min_{u∈U} J(u) = ∫_{t0}^{te} l(x(t), u(t)) dt

subject to

ẋ = u, t > 0, x(0) = 0, x(1) = 1

with the input constrained to u(t) ∈ [0, 1] for t ∈ [0, 1].

There is only the single admissible control u∗(t) = 1, which transfers the state along x(t) = t from the initial state x(0) = 0 to the terminal state x(1) = 1. This optimal control is, however, independent of the integrand l(x(t), u(t)) of the cost functional.

In the following, Theorem 4.11 is extended by replacing the terminal constraint x(te) = xe with the target set condition x(te) ∈ Xta ⊂ Rn. Herein, Xta is assumed to be a smooth manifold of dimension n − p ≤ n. Recall from Section 3.1.1 that an (n − p)–dimensional manifold is typically defined in terms of the set

Xta = {x ∈ Rn : gj(x) = 0, j = 1, . . . , p}.

The corresponding tangent space Tx∗Xta at the point x = x∗ ∈ Xta is hence defined as

Tx∗Xta = { d ∈ Rn : (∇xgj)T(x∗)d = 0, j = 1, . . . , p }. (4.87)

If the gradients (∇xgj)(x∗), j = 1, . . . , p are linearly independent, then the set of functions satisfies the constraint qualification condition (cf. also Remark 3.2)

rank (∇xg)(x∗) = rank [(∇xg1), . . . , (∇xgp)](x∗) = p. (4.88)

With this, the following extension of Pontryagin’s maximum principle can be verified.

Theorem 4.12 (Pontryagin maximum principle for autonomous systems with target set condition). Consider the optimal control problem

min_{u∈U} J(u) = ∫_{t0}^{te} l(x(t),u(t)) dt (4.89a)

subject to

ẋ = f(x,u), t > t0, x(t0) = x0, x(te) ∈ Xta, (4.89b)

with x0 ∈ Rn for fixed initial time t0, free terminal time te ≤ T and Xta a smooth manifold of dimension (n − p). Assume that l and f are continuous in (x,u) and continuously differentiable with respect to x for all (x,u) ∈ Rn × Rm.

Suppose that (u∗(t), t∗e) ∈ U × [t0, T) is a minimizer of (4.89) with x∗(t) the corresponding solution to (4.89b). Then there exists a non–zero λ̄∗(t) ∈ C1([t0, t∗e],Rn+1), λ̄∗(t) = [λ∗1(t), . . . , λ∗n+1(t)]T such that conditions (4.82a)–(4.85) of Theorem 4.11 are fulfilled. Moreover, λ∗(t∗e) = [λ∗1(t∗e), . . . , λ∗n(t∗e)]T is orthogonal to the tangent space Tx∗(t∗e)Xta, i.e., the transversality conditions

(λ∗(t∗e))T d = 0, ∀d ∈ Tx∗(t∗e)Xta (4.90)

hold.

In view of (4.87), (4.88) this obviously implies that λ∗(t∗e) is a linear combination of the individual gradients (∇xgj)(x∗(t∗e)) and hence admits a representation in the form

λ∗(t∗e) = Σ_{j=1}^{p} µj (∇xgj)(x∗(t∗e)) (4.91)

with the Lagrange multiplier µ = [µ1, . . . , µp]T ∈ Rp.

Finally we turn our attention to the case of non–autonomous nonlinear dynamic systems depending explicitly on time. The procedure is basically identical to the time–invariant case and again relies on the introduction of an extended system as in (4.80).

Theorem 4.13 (Pontryagin maximum principle for non–autonomous systems). Consider the optimal control problem

min_{u∈U} J(u) = ∫_{t0}^{te} l(t,x(t),u(t)) dt (4.92a)

subject to

ẋ = f(t,x,u), t > t0, x(t0) = x0, x(te) = xe, (4.92b)

with x0, xe ∈ Rn for fixed initial time t0 and free terminal time te ≤ T. Assume that l and f are continuous in (t,x,u) and continuously differentiable with respect to (t,x) for all (t,x,u) ∈ [t0, T] × Rn × Rm.

Suppose that (u∗(t), t∗e) ∈ U × [t0, T) is a minimizer of (4.92) with x∗(t) the corresponding solution to (4.92b). Then there exists a non–zero λ̄∗(t) ∈ C1([t0, t∗e],Rn+1), λ̄∗(t) = [λ∗1(t), . . . , λ∗n+1(t)]T such that the canonical equations

d/dt x̄∗ = (∇λ̄H̄)(t,x̄∗,u∗,λ̄∗) = [f(t,x∗,u∗); l(t,x∗,u∗)],  x̄∗(t0) = [x0; 0],  x∗(t∗e) = xe (4.93a)
d/dt λ̄∗ = −(∇x̄H̄)(t,x̄∗,u∗,λ̄∗) = [−(∇xH̄)(t,x̄∗,u∗,λ̄∗); 0] (4.93b)

are fulfilled with the Hamiltonian function

H̄(t,x̄,u,λ̄) = λ̄T f̄(t,x̄,u) = [λ1 . . . λn] f(t,x,u) + λn+1 l(t,x,u) (4.93c)

of the extended system and:

(i) The optimal control u∗(t) minimizes the function H̄(t,x̄∗(t),u(t),λ̄∗(t)) for all t ∈ [t0, t∗e] over the set of input constraints U, i.e.,

H̄(t,x̄∗,v,λ̄∗) ≥ H̄(t,x̄∗,u∗,λ̄∗), ∀v ∈ U. (4.94)

(ii) For any t ∈ [t0, t∗e] the relations

λ∗n+1 = const. ≥ 0 (4.95a)
d/dt H̄(t,x̄∗,u∗,λ̄∗) = (λ̄∗)T (∂/∂t f̄)(t,x̄∗,u∗) (4.95b)

hold.

(iii) For free endtime te the transversality condition

H̄(t∗e,x̄∗(t∗e),u∗(t∗e),λ̄∗(t∗e)) = 0 (4.96)

holds.


This result can be deduced from Theorem 4.11 by first introducing the auxiliary variable xn+1 with ẋn+1(t) = 1, xn+1(t0) = t0 so that (4.92b) reads

d/dt [x; xn+1] = [f(xn+1,x,u); 1] = fex(xex,u),  t > t0,  [x(t0); xn+1(t0)] = [x0; t0]

with xex = [x; xn+1], and secondly applying Theorem 4.11 to this extended but autonomous system. This also explains the assumption that f and l have to be continuously differentiable in t. Theorem 4.12 including a target set condition can be similarly extended to the non–autonomous case.

Further extensions to problems involving inequality constraints can be, e.g., found in [14, 1] and thereferences therein.

4.4.2 Application to nonlinear affine input systems

Subsequently, Pontryagin’s maximum principle is applied to autonomous nonlinear systems that are affinein the input, i.e.

x = f(x,u) = f0(x) +

m∑j=1

f j(x)uj , t > t0, x(t0) = x0 (4.97a)

with the input constraints

u ∈ U =[u−,u+

](4.97b)

or equivalently uj ∈ [u−j , u+j ], j = 1, . . . ,m. For this particular class of nonlinear systems different cost

functionals are studied that rather commonly arise in applications.

4.4.2.1 Cost functionals minimizing consumption. Optimal control in view of minimizing consumption can be addressed by cost functionals of the form

J(u) = ∫_{t0}^{te} ( l0(x(t)) + Σ_{j=1}^{m} rj |uj(t)| ) dt, rj > 0. (4.98)

The corresponding Hamiltonian function for the normal case with λn+1(t) = 1 reads

H(x,u,λ) = l0(x) + Σ_{j=1}^{m} rj |uj| + λT ( f0(x) + Σ_{j=1}^{m} fj(x) uj ). (4.99)

Since the term l0(x) + λTf0(x) is independent of u(t) it can be neglected when addressing the minimization problem H(x∗,v,λ∗) ≥ H(x∗,u∗,λ∗), ∀v ∈ U arising in the theorems introduced above. In view of the affine input structure the problem can be split into m independent problems of the form

min_{uj∈[u−j, u+j]} Hj(uj) = rj |uj| + qj(x,λ) uj,  qj(x,λ) = λT fj(x). (4.100)

Noting that the minimization has to be carried out for fixed (x,λ), i.e., actually for (x∗(t),λ∗(t)) at each t ∈ [t0, t∗e] according to Pontryagin's maximum principle, we have to distinguish between the 4 cases shown in Figure 4.6. The optimal control can hence be deduced as

[Fig. 4.6: Illustration of (4.100): the graphs of rj|uj|, qjuj and Hj(uj) over uj ∈ [u−j, u+j] together with the location of min Hj(uj) for the four cases (a) qj < −rj, (b) qj ∈ (−rj, 0), (c) qj ∈ (0, rj) and (d) qj > rj.]

u∗j = { u−j, qj(x,λ) > rj ; 0, qj(x,λ) ∈ (−rj, rj) ; u+j, qj(x,λ) < −rj },  j = 1, . . . , m. (4.101)

When qj(x(t),λ(t)) = ±rj on a subset Is ⊂ [t0, te] the optimal control u∗j(t) can no longer be determined uniquely from (4.100). This case is referred to as the singular case and is considered in Section 4.4.2.4.

4.4.2.2 Cost functionals addressing energy optimality. Optimal control in view of minimizing energy can be addressed by cost functionals of the form

J(u) = ∫_{t0}^{te} ( l0(x(t)) + ½ Σ_{j=1}^{m} rj (uj(t))² ) dt, rj > 0. (4.102)

The corresponding Hamiltonian function for the normal case with λn+1(t) = 1 reads

H(x,u,λ) = l0(x) + ½ Σ_{j=1}^{m} rj (uj)² + λT ( f0(x) + Σ_{j=1}^{m} fj(x) uj ). (4.103)


Proceeding as before, the optimal control can be determined for each component uj(t) by solving the minimization problem

min_{uj∈[u−j, u+j]} Hj(uj) = ½ rj (uj)² + qj(x,λ) uj,  qj(x,λ) = λT fj(x). (4.104)

Without constraints the minimizer follows as

u0j = −qj(x,λ)/rj. (4.105)

Hence, if in the constrained case u0j ∈ [u−j, u+j], then u∗j = u0j. Moreover, if u0j ∉ [u−j, u+j], then the minimizer is given by the corresponding boundary value of the admissible interval. As a result, the optimal control is given by

u∗j = { u−j, u0j ≤ u−j ; u0j, u0j ∈ (u−j, u+j) ; u+j, u0j ≥ u+j },  j = 1, . . . , m. (4.106)
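Evaluated component–wise, (4.105), (4.106) amount to a simple clipping operation. A small Matlab/Octave helper illustrating this (the function name is hypothetical; qj = λTfj(x), rj and the bounds are supplied by the caller):

  function uj = energy_optimal_input(qj, rj, ujmin, ujmax)
    % unconstrained minimizer (4.105), saturated to [u_j^-, u_j^+] per (4.106)
    u0 = -qj/rj;
    uj = min(max(u0, ujmin), ujmax);
  end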

4.4.2.3 Cost functionals addressing time optimality. In order to impose time optimality the cost functional reads

J(u) = ∫_{t0}^{te} 1 dt = te − t0 (4.107)

so that the Hamiltonian function (for the normal case) evaluates to

H(x,u,λ) = 1 + λT ( f0(x) + Σ_{j=1}^{m} fj(x) uj ). (4.108)

Neglecting the terms unaffected by uj reduces the respective minimization problem to

min_{uj∈[u−j, u+j]} Hj(uj) = qj(x,λ) uj,  qj(x,λ) = λT fj(x), (4.109)

whose solution can be immediately determined as

u∗j = { u−j, qj(x,λ) > 0 ; u+j, qj(x,λ) < 0 },  j = 1, . . . , m. (4.110)

Since u∗j(t) only switches between the minimal and maximal values this type of optimal control is often referred to as bang–bang control.

Similar to the situation in Section 4.4.2.1, the singular case arises when qj(x(t),λ(t)) = 0 for t ∈ Is ⊂ [t0, te]. Here, the Hamiltonian function is independent of uj(t) so that minimality is trivially ensured without providing information about the minimizer u∗j(t). To avoid the singular case, the cost functional (4.107) is often extended by a so–called regularization term

J(u) = ∫_{t0}^{te} ( 1 + ½ Σ_{j=1}^{m} rj (uj(t))² ) dt (4.111)

with rj > 0 but sufficiently small to achieve near time–optimality. Obviously, the cost functional (4.111) corresponds to (4.102) so that the optimal control u∗j(t), j = 1, . . . , m is determined by (4.106).


Example 4.12. Consider the linear time–optimal control problem (4.107) for the double integrator

d/dt [x1; x2] = [x2; u],  t > 0,  x(0) = x0,  x(te) = 0 (4.112)

with te free and the input constraint

u ∈ [−1, 1] ∀t ∈ [0, te].

The Hamiltonian function is given by H(x, u,λ) = 1 + λ1 x2 + λ2 u so that the adjoint equations follow from (4.82b) as

λ̇∗1 = 0
λ̇∗2 = −λ∗1.

These result in

λ∗1 = C1, λ∗2 = −C1 t + C2

with the constants of integration C1, C2. The solution of the minimization problem (4.109) hence yields that the optimal control u∗(t) must satisfy

u∗(t) = { −1, λ∗2(t) > 0 ; +1, λ∗2(t) < 0 }.

Note that the singular case cannot arise since λ∗2(t) = 0 on some subinterval Is ⊂ [0, te] implies λ̇∗2(t) = 0 and λ∗1(t) = 0 for all t ∈ [0, te]. This contradicts the transversality condition (4.85) for the free terminal time

H(x∗(t∗e), u∗(t∗e),λ∗(t∗e)) = 1 + λ∗1(t∗e) x∗2(t∗e) + λ∗2(t∗e) u∗(t∗e) = 0.

In particular, due to λ∗2(t) being affine in t, the situation λ∗2(t) = 0 can only arise at a single discrete instance of time t = ts referring to a switch in u∗(t) between −1 and +1. Thus, at most 4 switching sequences may arise, i.e., {+1}, {−1}, {−1,+1}, and {+1,−1}. Taking into account (4.112) with piecewise constant u(t) = u∗(t) yields that the trajectories x∗(t) describe parabolas in the (x1, x2)–plane. Indeed it is an easy exercise to show that given u(t) = u = ±1 we have

x1(t) = ½ u t² + x2,0 t + x1,0,  x2(t) = u t + x2,0 (4.113)

and moreover

x1 = x2²/(2u) + c,  c = x1,0 − x2,0²/(2u). (4.114)

The optimal control problem is set up to ensure the transfer from an arbitrary initial state x(0) = x0 to the zero state x(te) = 0 in minimal time. Taking into account (4.114) it becomes apparent that the origin can only be reached along the switching curve defined by

x1 = +x2²/2 for u = +1,
x1 = −x2²/2 for u = −1.

Introducing the curve S(x2) = −½ x2 |x2|, the switching curve is given by

x1 = S(x2). (4.115)

[Fig. 4.7: Switching curve and optimal response for the double integrator of Example 4.12 in the (x1, x2)–plane: (a) case (i) with (x1,0, x2,0) on the switching curve x1 = S(x2); (b) case (ii) with an initial arc u = +1 towards the switching curve.]

In view of the possible switching sequences we have to distinguish between the following situations:

(i) If x0 lies on the switching curve, i.e., x1,0 = S(x2,0), then the origin x(te) = 0 is directly reached without any switching along the switching curve (4.115) with u(t) = +1 or u(t) = −1. This scenario is shown in Figure 4.7(a).

(ii) If x0 does not lie on the switching curve, i.e., x1,0 > S(x2,0) or x1,0 < S(x2,0), then a single switch in u(t) is required to reach the switching curve (4.115) and travel along this path to the origin.

Figure 4.7(b) provides a graphical illustration of these two cases. As a result, the optimal control is

u∗(t) = { +1 if x1 < S(x2) ; +1 if x1 = S(x2) and x1 > 0 ; −1 if x1 > S(x2) ; −1 if x1 = S(x2) and x1 < 0 }.

With this, the optimal switching time ts and the minimal endtime t∗e can be determined taking into account (4.113), which provides

ts = { x2,0 + √(½ x2,0² + x1,0), x1,0 > S(x2,0) ; −x2,0 + √(½ x2,0² − x1,0), x1,0 < S(x2,0) } (4.116)

and

t∗e = { x2,0 + √(2 x2,0² + 4 x1,0), x1,0 > S(x2,0) ; |x2,0|, x1,0 = S(x2,0) ; −x2,0 + √(2 x2,0² − 4 x1,0), x1,0 < S(x2,0) }. (4.117)

Solutions for the double integrator example when varying the initial condition x(0) = x0 are shown in Figure 4.8.

Exercise 4.2. Verify (4.116) and (4.117) for the switching time and minimal endtime.

[Fig. 4.8: Time optimal state trajectories x∗1(t), x∗2(t) and optimal control u∗(t) for the double integrator of Example 4.12. The initial conditions are assigned as x0 = [−1,−1]T (blue), x0 = [−1, 0]T (green) and x0 = [−1, 1]T (red).]
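A compact sketch evaluating the switching curve (4.115) and the formulas (4.116), (4.117); the initial condition is an arbitrary assumption.

  % Example 4.12: switching time and minimal endtime for the double integrator
  x10 = -1; x20 = 1;                         % assumed initial condition
  S = @(x2) -0.5*x2.*abs(x2);                % switching curve (4.115)
  if x10 > S(x20)                            % first arc with u = -1
    ts = x20 + sqrt(0.5*x20^2 + x10);
    te = x20 + sqrt(2*x20^2 + 4*x10);
  elseif x10 < S(x20)                        % first arc with u = +1
    ts = -x20 + sqrt(0.5*x20^2 - x10);
    te = -x20 + sqrt(2*x20^2 - 4*x10);
  else                                       % already on the switching curve
    ts = 0; te = abs(x20);
  end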

4.4.2.4 Singular problems. As pointed out in Section 4.4.2.1, it may happen that the optimal control u∗(t) cannot be determined from the minimization problem (4.83) for t ∈ Is ⊂ [t0, te]. In order to illustrate this and to discuss some measures to proceed in this situation we consider the particular scalar case given by

min_{u∈U} J(u) = ∫_{t0}^{te} ( l0(x(t)) + l1(x(t)) u(t) ) dt (4.118a)

subject to

ẋ = f0(t,x) + f1(t,x) u,  t > t0,  x(t0) = x0,  x(te) = xe. (4.118b)

The set U thereby comprises all piecewise continuous functions u(t) on the interval [t0, te] bounded according to u(t) ∈ [u−, u+]. Due to the affine input structure the corresponding Hamiltonian function is affine in u(t), i.e.,

H(t,x, u,λ) = l0(x) + λTf0(t,x) + ( λTf1(t,x) + l1(x) ) u. (4.119)

Finding u(t) minimizing H(t,x, u,λ) for given (x(t),λ(t)) hence requires analyzing the sign of (∇uH)(t,x, u,λ). To this end, introduce the switching function

ζ(t) = (∇uH)(t,x, u,λ) = λTf1(t,x) + l1(x) (4.120)

with ζ∗(t) = ζ(t)|x(t)=x∗(t), λ(t)=λ∗(t). If ζ∗(t) = 0 on a finite time interval⁴ Is ⊂ [t0, te], then the minimization requirement (4.83) does not provide any information about u∗(t) for t ∈ Is. In other words, the control does not affect the Hamiltonian function on Is. The corresponding part of the solution is called a singular path or singular arc [12, 1].

For the determination of an optimal control along a singular arc one proceeds by adding the requirement that in the interval Is also the successive time derivatives of ζ∗(t) must vanish. In particular, the smallest positive integer k such that

d^k/dt^k ζ(t) = 0,  (∂/∂u)[ d^k/dt^k ζ(t) ] ≠ 0 (4.121)

⁴ If this situation arises only at isolated points, switching between u− and u+ occurs whenever ζ∗(t) crosses zero and the resulting control induces a bang–bang behavior as discussed in Section 4.4.2.3.


can be shown to be even, i.e., k = 2r, provided it exists. The integer k is often called the degree (or the order) of the singularity. Along a singular arc the state x∗(t) and the adjoint state λ∗(t) are restricted to the manifold defined by

ζ∗(t) = d/dt ζ∗(t) = · · · = d^{k−1}/dt^{k−1} ζ∗(t) = 0 (4.122)

together with the condition (4.95b) in the non–autonomous case governed by Theorem 4.13. The resulting manifold is also referred to as the singular surface. In order to ensure a minimum along the singular arc the generalized Legendre (or Legendre–Clebsch) condition has to hold, which imposes

(−1)^{k/2} (∂/∂u)[ d^k/dt^k ζ∗(t) ] ≥ 0. (4.123)

Similar to non–singular problems, both the adjoint state λ(t) and the Hamiltonian function H must be continuous along an optimal trajectory. It should be noted that in general the solution to an optimal control problem may involve a mixture of singular and non–singular arcs. Finding their proper sequence is a difficult task and may even be impossible for certain problems.

Example 4.13. Consider the minimization problem

min_{u∈U} J(u) = ∫_0^2 ½ x1²(t) dt (4.124a)

subject to

d/dt [x1; x2] = [x2 + u; −u],  t > 0,  x(0) = [1; 1],  x(2) = 0 (4.124b)

with the input constraint

u ∈ [−10, 10] ∀t ∈ [0, 2].

The Hamiltonian function with λ3 = 1 is given by

H(x, u,λ) = ½ x1² + λ1 (x2 + u) − λ2 u.

The adjoint states are hence governed by

λ̇∗1 = −(∂/∂x1 H)(x∗, u∗,λ∗) = −x∗1
λ̇∗2 = −(∂/∂x2 H)(x∗, u∗,λ∗) = −λ∗1.

From Theorem 4.11 the optimal control must fulfill condition (4.83), i.e.,

H(x∗, v,λ∗) ≥ H(x∗, u∗,λ∗), ∀v ∈ [−10, 10]. (4.125)

Taking into account the input constraint, this implies⁵

u∗ = { +10, λ∗1 < λ∗2 ; −10, λ∗1 > λ∗2 ; undefined, λ∗1 = λ∗2 }.

⁵ The minimization problem (4.125) reduces to the analysis of (λ∗1(t) − λ∗2(t)) u∗(t).


Thus, a singular arc occurs if λ∗1(t) = λ∗2(t) on some finite time interval Is ⊂ [0, 2]. To determine the order of the singular arc consider the following sequence for the switching function ζ∗(t) = (∇uH)(x∗(t), u∗(t),λ∗(t)) = λ∗1(t) − λ∗2(t):

ζ∗(t) = λ∗1(t) − λ∗2(t) = 0
d/dt ζ∗(t) = λ̇∗1(t) − λ̇∗2(t) = λ∗1(t) − x∗1(t)  ⇒  (∂/∂u)[ d/dt ζ∗(t) ] = 0
d²/dt² ζ∗(t) = λ̇∗1(t) − ẋ∗1(t) = −x∗1(t) − x∗2(t) − u∗(t)  ⇒  (∂/∂u)[ d²/dt² ζ∗(t) ] = −1.

This yields k = 2. The final equation provides the optimal control along the singular arc

u∗(t) = −x∗1(t) − x∗2(t)

and the generalized Legendre–Clebsch condition (4.123), i.e.,

(−1)¹ (∂/∂u)[ d²/dt² ζ∗(t) ] = 1 ≥ 0,

is fulfilled, implying a minimum along the singular arc.

To proceed further, note that by (4.84b) the Hamiltonian function must be constant for all t ∈ [0, 2] along an optimal trajectory so that

H(x∗, u∗,λ∗) = ½ (x∗1)² + λ∗1 x∗2 + (λ∗1 − λ∗2) u∗ = C (4.126)

for some constant C. In addition, along the singular arc x∗(t) and λ∗(t) are restricted to the singular manifold defined by (4.122), which imposes λ∗1 = λ∗2 = x∗1. In view of (4.126) this yields

½ (x∗1)² + x∗1 x∗2 = C.

The remaining steps can be summarized as follows:

(i) Starting with u∗(t) = 10, solve the differential equations for x∗(t) and λ∗(t) with the initial condition x∗(0) = [1, 1]T. Denote the arising 4 constants of integration by Pj, j = 1, . . . , 4 and refer to the solution as x∗(1)(t).

(ii) Since we know that there is a singular arc, solve the differential equations for x∗(t) and λ∗(t) for u∗(t) = −10 taking into account the terminal condition x∗(2) = 0. Denote the arising 4 constants of integration by Rj, j = 1, . . . , 4 and refer to the solution as x∗(3)(t).

(iii) On the singular arc take into account u∗(t) = −x∗1(t) − x∗2(t) and solve the differential equations for x∗(t) and λ∗(t). Denote the arising 4 constants of integration by Qj, j = 1, . . . , 4 and refer to the solution as x∗(2)(t).

(iv) The conditions determining the singular manifold provide 2 equations to determine 2 of the 4 constants Qj, j = 1, . . . , 4.

(v) At the unknown switching times ts,1 onto the singular arc and ts,2 from the singular arc, continuity implies x∗(1)(ts,1) = x∗(2)(ts,1) and x∗(2)(ts,2) = x∗(3)(ts,2) and hence 4 equations for 4 unknowns including ts,1 and ts,2. Numerically solving this system of coupled nonlinear algebraic equations provides ts,1 ≈ 0.299, ts,2 ≈ 1.927 and thus the optimal control

u∗ = { +10, 0 ≤ t ≤ 0.299 ; −(x∗1 + x∗2), 0.299 < t < 1.927 ; −10, 1.927 ≤ t ≤ 2 }.


4.5 Numerical solution of optimal control problems

In the following, a brief introduction is given to techniques for the numerical solution of dynamic optimization problems. We thereby distinguish between

(i) indirect methods, where the optimal trajectory is determined by solving the necessary optimality conditions introduced in Sections 4.3 and 4.4, and

(ii) direct methods, where the infinite–dimensional optimization problem in the input u(t), t ∈ [t0, te] is discretized in t to obtain a finite–dimensional static optimization problem as considered in Chapters 2 and 3.

Dynamic programming and stochastic optimization are not addressed; the reader is referred to the respective literature. It is also assumed that the reader is familiar with numerical techniques for the solution of systems of differential equations (see, e.g., [15, 16]).

4.5.1 Indirect methods

As outlined above, indirect methods directly approach the two–point boundary–value problem defined by the necessary optimality conditions. Techniques involve, e.g., discretization methods, shooting methods or collocation methods. In order to introduce the individual concepts the optimization problem

min_{u∈U} J(u) = ϕ(te,x(te)) + ∫_{t0}^{te} l(t,x(t),u(t)) dt (4.127a)

subject to

ẋ = f(t,x,u(t)), t > t0, x(t0) = x0 (4.127b)
ψk(te,x(te)) = 0, k = 1, . . . , p (4.127c)

is considered for fixed endtime te. Following Theorem 4.10 the Euler–Lagrange equations are given by

ẋ = (∇λH)(t,x,u,λ), x(t0) = x0 (4.128a)
λ̇ = −(∇xH)(t,x,u,λ), λ(te) = (∇xeφ)(te,x(te),µ) (4.128b)
0 = (∇uH)(t,x,u,λ) (4.128c)

for t ∈ [t0, te], and the transversality condition yields

ψ(te,x(te)) = [ψ1(te,x(te)) · · · ψp(te,x(te))]T = 0 (4.129)

with the Hamiltonian function H(t,x,u,λ) = l(t,x,u) + λTf(t,x,u) and φ(te,x(te),µ) = ϕ(te,x(te)) + µTψ(te,x(te)) for µ ∈ Rp. Under the assumption that (4.128c) can be solved (at least locally) for u(t) so that u(t) = k(x(t),λ(t)), the substitution of this expression into (4.128a), (4.128b) reduces the problem formulation to the two–point boundary–value problem

ẋ = (∇λH)(t,x,k(x,λ),λ), x(t0) = x0, ψ(te,x(te)) = 0 (4.130a)
λ̇ = −(∇xH)(t,x,k(x,λ),λ), λ(te) = (∇xeφ)(te,x(te),µ) (4.130b)

for t ∈ [t0, te] with a vector µ of free parameters. Since the entries of µ are constant they can be integrated into the formulation in terms of µ̇ = 0 so that (4.130) reads

ẋ = (∇λH)(t,x,k(x,λ),λ), x(t0) = x0, ψ(te,x(te)) = 0 (4.131a)
λ̇ = −(∇xH)(t,x,k(x,λ),λ), λ(te) = (∇xeφ)(te,x(te),µ) (4.131b)
µ̇ = 0. (4.131c)

Introducing

z = [x; λ; µ],  F(t,z) = [ (∇λH)(t,x,k(x,λ),λ) ; −(∇xH)(t,x,k(x,λ),λ) ; 0 ],  G(te,z(te)) = [ ψ(te,x(te)) ; λ(te) − (∇xeφ)(te,x(te),µ) ]

with x(t) = Mz(t), then (4.131) can be re–formulated according to

ż = F(t,z), t ∈ (t0, te), Mz(t0) = x0, G(te,z(te)) = 0. (4.132)

Both problem formulations (4.130) and (4.132) are subsequently exploited depending on the used solution approach.

4.5.1.1 Discretization methods Discretization or relaxation methods in principle make use of anappropriate finite difference approximation of the arising differentials in (4.131). For this, discretize thetime–interval [t0, te] in N + 1 steps

tj = t0 + jδt, j = 0, 1, . . . , N, δt = (te − t0)/N

so that the solution can be approximated at the discretization points by zj ≈ z(tj), j = 0, . . . , N. Making, e.g., use of the trapezoidal rule the respective discretization of (4.132) is obtained as

(zj+1 − zj)/δt = (1/2)[F(tj+1, zj+1) + F(tj, zj)], j = 0, . . . , N − 1   (4.133a)
Mz0 = x0, G(tN, zN) = 0.   (4.133b)

This algebraic system is implicit in the (2n + p)(N + 1) unknowns {zj}_{j=0}^{N} and is comprised of (2n + p)(N + 1) nonlinear equations. Note that since µ ∈ Rp is constant the last p rows in (4.133a) reduce to µj+1 = µj. The numerical solution of (4.133) is equivalent to the computation of the zeros of the nonlinear algebraic system

⎡  M    0    0   · · ·   0    0 ⎤ ⎡  z0  ⎤        ⎡  0                          ⎤   ⎡ x0 ⎤
⎢ −E    E    0   · · ·   0    0 ⎥ ⎢  z1  ⎥        ⎢  F(t1, z1) + F(t0, z0)      ⎥   ⎢ 0  ⎥
⎢  0   −E    E   · · ·   0    0 ⎥ ⎢  z2  ⎥   δt   ⎢  F(t2, z2) + F(t1, z1)      ⎥   ⎢ 0  ⎥
⎢  ⋮                     ⋮      ⎥ ⎢  ⋮   ⎥ − ——   ⎢  ⋮                          ⎥ − ⎢ ⋮  ⎥ = 0   (4.134)
⎢  0   · · ·       −E    E      ⎥ ⎢ zN−1 ⎥   2    ⎢  F(tN, zN) + F(tN−1, zN−1)  ⎥   ⎢ 0  ⎥
⎣  0   · · ·        0    0      ⎦ ⎣  zN  ⎦        ⎣  G(tN, zN)                  ⎦   ⎣ 0  ⎦

making use of, e.g., Newton’s method (see also Section 2.2.2.2). Note that in order to obtain a finite difference approximation of (4.132) other techniques can be used with different error orders such as backward, forward or central differences

ż(tj) = (zj − zj−1)/δt + O(δt),   ż(tj) = (zj+1 − zj)/δt + O(δt),   ż(tj) = (zj+1 − zj−1)/(2δt) + O((δt)²)

or higher–order discretizations involving Runge–Kutta techniques. Some remarks are in order:

• Discretization methods in general lead to a numerically robust solution due to the simultaneous consideration of the differential equations and the boundary conditions.

• Convergence essentially relies on a proper selection of the initial guess of the adjoint variables.

• The special structure of the arising matrices, cf. (4.134), can be exploited for the numerical solution.


• The number of discretization steps N + 1 inherently influences accuracy and computational burden.
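As an illustration, the following minimal Matlab sketch sets up the trapezoidal discretization (4.133) and solves the resulting algebraic system with fsolve (Optimization Toolbox); the function handles F and G, the matrix M and all names are placeholders to be supplied for a concrete problem.

function Z = trapezoid_bvp(F, G, M, x0, tgrid, Zinit)
%TRAPEZOID_BVP Solve the discretized TPBVP (4.133) for Z = [z0, ..., zN]
%F(t,z): right-hand side of (4.132), G(te,z): boundary residual,
%M: selection matrix with x = M*z, Zinit: nz x (N+1) initial guess
[nz, Np1] = size(Zinit);
N = Np1 - 1;
res = @(Zv) residual(Zv, F, G, M, x0, tgrid, nz, N);
opts = optimset('Display', 'off');
Zv = fsolve(res, Zinit(:), opts);
Z = reshape(Zv, nz, Np1);

function r = residual(Zv, F, G, M, x0, t, nz, N)
Z = reshape(Zv, nz, N+1);
r = M*Z(:,1) - x0;                    %initial condition M z0 = x0
for j = 1:N                           %trapezoidal equations (4.133a)
  dt = t(j+1) - t(j);
  r = [r; Z(:,j+1) - Z(:,j) - dt/2*(F(t(j+1), Z(:,j+1)) + F(t(j), Z(:,j)))];
end
r = [r; G(t(N+1), Z(:,N+1))];         %boundary conditions (4.133b)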

4.5.1.2 Shooting method The shooting method traces the solution of the boundary value problem (4.130) back to the solution of an initial value problem by guessing the values λ(t0) = λ0 and µ, i.e.,

ẋ = (∇λH)(t,x,k(x,λ),λ), x(t0) = x0   (4.135a)
λ̇ = −(∇xH)(t,x,k(x,λ),λ), λ(t0) = λ0.   (4.135b)

Let x(t) = x(t;λ0) and λ(t) = λ(t;λ0) denote the solution to (4.135). Then the (n + p) residual terms at the boundary t = te given by

B(λ0, µ) = [ ψ(te, x(te;λ0)); λ(te;λ0) − { (∇xeϕ)(te, x(te;λ0)) + (∇xeψ)^T(te, x(te;λ0)) µ } ] = 0   (4.135c)

need to be fulfilled by properly determining λ0 ∈ Rn and µ ∈ Rp. For this, Newton’s method can be utilized to obtain an iterative numerical solution of (4.135c). Let η = [λ0^T, µ^T]^T denote the stacked vector of unknowns so that B(η) = B(λ0, µ) and let ηj denote the j–th Newton iterate, then

(∇ηB)(ηj)(ηj+1 − ηj) = −B(ηj)   (4.136)

or equivalently

ηj+1 = ηj − (∇ηB)^{−1}(ηj) B(ηj)   (4.137)

for a suitable starting value η0. The iteration is stopped at j = j∗, e.g., when ‖ηj∗+1 − ηj∗‖ < ε max{1, ‖ηj∗‖} for some ε ≪ 1. In addition note:

• The implementation effort for the shooting method is rather small.

• The shooting method essentially relies on a proper choice of the initial state λ0.

• Since the canonical equations tend to be only weakly stable, numerical issues may arise in the forward integration for large intervals [t0, te]. This issue can be resolved by considering so–called multiple shooting methods.

Remark 4.10 (Multiple shooting). In the multiple shooting method the interval [t0, te] is subdivided as in discretization methods and the (single) shooting method is applied in each interval. The solution of the boundary value problem is hence obtained by collecting the individual solutions for each interval and ensuring continuity at the borders between subintervals. With this, the residual terms also cover the conditions arising when connecting the intervals.
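To illustrate the (single) shooting approach, the following minimal Matlab sketch evaluates the residual (4.135c) by forward integration with ode45 and solves B(η) = 0 with fsolve instead of the hand-coded Newton iteration (4.137); f_canon and resid are hypothetical handles for the canonical equations and the boundary residual.

function eta = single_shooting(f_canon, resid, x0, n, t0, te, eta0)
%SINGLE_SHOOTING Determine eta = [lambda0; mu] such that B(eta) = 0
%f_canon(t, x, lambda): stacked right-hand side of (4.135a), (4.135b);
%resid(te, xe, lambdae, mu): boundary residual B in (4.135c)
eta = fsolve(@(e) shoot_residual(e, f_canon, resid, x0, n, t0, te), eta0);

function r = shoot_residual(eta, f_canon, resid, x0, n, t0, te)
lambda0 = eta(1:n);
mu = eta(n+1:end);
%Forward integration of the canonical equations (4.135a), (4.135b)
[~, Z] = ode45(@(t, z) f_canon(t, z(1:n), z(n+1:end)), [t0, te], [x0; lambda0]);
ze = Z(end, :).';
r = resid(te, ze(1:n), ze(n+1:end), mu);   %evaluate B(lambda0, mu)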

4.5.1.3 Collocation methods Collocation methods make use of a solution ansatz in terms of a linear combination of suitable basis functions ϑk(t), k = 0, . . . , K fulfilling the boundary conditions in (4.130) at t = t0 and t = te, i.e.,

x(t) ≈ x̂(t) = Σ_{k=0}^{K} a_k^x ϑk(t),   λ(t) ≈ λ̂(t) = Σ_{k=0}^{K} a_k^λ ϑk(t),   a_k^x, a_k^λ ∈ Rn.   (4.138)

The basis functions are thereby assumed to be linearly independent and are often chosen as polynomials, e.g., Legendre polynomials, trigonometric functions or other families of functions. For the determination of the coefficients a_k^x, a_k^λ, k = 0, . . . , K collocation conditions are imposed by requiring that the differential equations and boundary conditions (4.130) are satisfied pointwise at K + 1 distinct collocation points t̄j ∈ [t0, te], j = 0, . . . , K with t̄0 = t0, t̄K = te that need to be fixed properly. Let x̂j = x̂(t̄j) and λ̂j = λ̂(t̄j), then (4.130) evaluates to


˙x̂j = (∇λH)(t̄j, x̂j, k(x̂j, λ̂j), λ̂j)   (4.139a)
˙λ̂j = −(∇xH)(t̄j, x̂j, k(x̂j, λ̂j), λ̂j)   (4.139b)
x̂0 − x0 = 0   (4.139c)
ψ(te, x̂K) = 0   (4.139d)
λ̂K − { (∇xeϕ)(te, x̂K) + (∇xeψ)^T(te, x̂K) µ } = 0.   (4.139e)

This nonlinear system of algebraic equations can be solved, e.g., by making use of Newton’s method.

Besides a global setup also local ansatz functions can be used in each subinterval [t̄j, t̄j+1]. As an example consider the Matlab function bvp4c [17]. Here third–order polynomials are used in each subinterval [t̄j, t̄j+1] of a mesh t0 = t̄0 < t̄1 < · · · < t̄K = te and the collocation conditions are determined by making the approximate solution fulfill the boundary conditions and the differential equations at both ends and the midpoint of each subinterval. This again results in a nonlinear system of algebraic equations, which can be solved iteratively. It is thereby crucial to recall that boundary value problems can have more than one solution so that a suitable guess for the solution has to be supplied for the iteration.
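The basic calling structure of bvp4c for a two–point boundary value problem of the form (4.130) is sketched below; the handles canon_rhs and bc_residual as well as the initial guess zguess are placeholders that have to be supplied for the concrete problem (a complete example follows in Section 4.6).

%Mesh and initial guess for z = [x; lambda] on [t0, te]
solinit = bvpinit(linspace(t0, te, 20), zguess);
%canon_rhs(t, z) returns the stacked canonical equations (4.130),
%bc_residual(za, zb) the boundary residuals at t0 and te
sol = bvp4c(canon_rhs, bc_residual, solinit);
%Evaluate the continuous collocation solution on a fine grid
z = deval(sol, linspace(t0, te, 200));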

4.5.1.4 Extensions to free endtime problems If the endtime te is free, then the necessary optimality conditions of Theorem 4.10 include the transversality condition (4.61b), i.e.,

(∂te φ)(t∗e, x∗(t∗e), µ∗) + H(t∗e, x∗(t∗e), u∗(t∗e), λ∗(t∗e)) = 0.

By introducing the time scaling

t = ντ, τ ∈ [0, 1], ν > 0 (4.140)

the Euler–Lagrange equations (and similarly the canonical equations introduced by making use of Pontryagin’s maximum principle in Section 4.4) can be transformed to the fixed time interval [0, 1] in the new independent coordinate τ. For this, note that differentiation with respect to t is related to differentiation with respect to τ by

d/dt = (1/ν) d/dτ.   (4.141)

As a result, the boundary value problem consisting of the differential equations for the state x(t) and the adjoint state λ(t), the algebraic equation (∇uH) = 0, and the transversality conditions can be easily re–formulated on the fixed time interval τ ∈ [0, 1]. Thereby the determination of the unknown optimal endtime t∗e reduces to the computation of the constant scaling factor ν > 0.
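For instance, with x̃(τ) = x(ντ) and ũ(τ) = u(ντ) the state equation transforms according to (4.140), (4.141) into

d x̃/dτ = ν f(ντ, x̃(τ), ũ(τ)), τ ∈ (0, 1), x̃(0) = x0,

and analogously for the adjoint equations. The unknown constant ν can be appended to the boundary value problem as an additional state with dν/dτ = 0, whose additional boundary condition is provided by the transversality condition above.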

4.5.2 Direct methods

Direct methods are based on the direct discretization of the (infinite–dimensional) optimal control problem so that methods of static optimization can be applied to the resulting finite–dimensional problem. Differing from indirect methods, which

• provide insight into the structure of the optimal solution,

• allow one to determine a highly accurate or even exact solution,

• enable one to utilize the adjoint variables for sensitivity analysis and controller design,

direct methods allow


• to avoid the determination of the canonical equations,

• a simpler incorporation of, in particular, state and path constraints,

• to compute the Lagrange multiplier in a post–processing step,

• often improved convergence behavior,

• the solution of optimal control problems for systems governed by ordinary differential equations, differential–algebraic equations and partial differential equations

at the cost of providing only a suboptimal solution due to discretization.

In order to illustrate so–called direct sequential methods and direct simultaneous methods we focus on the optimal control problem

min_{u∈U} J(u) = ϕ(te, x(te)) + ∫_{t0}^{te} l(t, x(t), u(t)) dt   (4.142a)

subject to

ẋ = f(t, x, u), t > 0, x(t0) = x0   (4.142b)
gk(te, x(te)) = 0, k = 1, . . . , p   (4.142c)
hl(t, x(t), u(t)) ≤ 0, t ∈ [t0, te], l = 1, . . . , q   (4.142d)

involving both equality and inequality constraints. The endtime te is assumed fixed since the procedure introduced in Section 4.5.1.4 can be used to reduce the case of free endtime to the determination of a single constant scaling parameter.

4.5.2.1 Control parametrization In direct methods the time interval [t0, te] is discretized into N + 1 stages

t0 = t0 < t1 < · · · < tN = te (4.143)

and the control inputs u(t) are parametrized on each individual subinterval [tj , tj+1] to obtain

u(t) = rj(t, vj), t ∈ [tj, tj+1]   (4.144)

with vj ∈ R^{mK}, where K refers to the order of approximation by functions that are piecewise constant, piecewise linear, piecewise cubic, etc. Examples are shown in Figure 4.9. In practice Lagrange polynomials are often employed for control parametrization. For further details the reader is referred to, e.g., [18, 1].
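For the simplest case of a piecewise constant parametrization, rj(t, vj) = vj with K = 1, the evaluation of (4.144) reduces to a lookup of the active stage. A minimal Matlab sketch (with hypothetical names tgrid for the stage boundaries (4.143) and V for the stacked stage values) reads:

function u = u_pwc(t, tgrid, V)
%U_PWC Piecewise constant control u(t) = v_j for t in [t_j, t_{j+1})
%tgrid: 1 x (N+1) stage boundaries, V: m x N matrix with columns v_j
j = find(t >= tgrid, 1, 'last');   %index of the active subinterval
j = min(j, size(V, 2));            %clamp to the last stage at t = te
u = V(:, j);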

4.5.2.2 Direct sequential methods In direct sequential methods⁶ the control variables are parametrized according to (4.144) so that (4.142) reduces to the static (finite–dimensional) optimization problem

min_v J(v) = ϕ(te, x(te)) + Σ_{j=0}^{N−1} ∫_{tj}^{tj+1} l(t, x(t), rj(t, vj)) dt   (4.145a)

with v^T = [(v0)^T, . . . , (vN−1)^T] ∈ R^{mKN} subject to

ẋ = f(t, x, rj(t, vj)), t ∈ (tj, tj+1), j = 0, 1, . . . , N − 1   (4.145b)
x(t0) = x0   (4.145c)
x(t_{j+1}^−) = x(t_{j+1}^+), j = 0, 1, . . . , N − 1   (4.145d)

6 Direct sequential methods are also referred to as semi–discretization methods or control vector parametrization methods.


Fig. 4.9: Examples of control parametrizations (piecewise constant and piecewise linear continuous) on the grid tj−2, . . . , tj+2.

g(tN ,x(tN )) = 0 (4.145e)

h(t, x(t), rj(t, vj)) ≤ 0, t ∈ [tj, tj+1], j = 0, 1, . . . , N − 1.   (4.145f)

The differential equations (4.145b) need to be solved (numerically) in each subinterval taking into account the initial condition (4.145c), the terminal condition (4.145e) and the conditions (4.145d) ensuring continuity of the solution on the interval [t0, te].

The key issue in this setting is the consideration of the path constraint (4.145f), which has to be satisfied for all t ∈ [tj, tj+1]. Since this would imply an infinite number of constraints one often replaces this condition by interior–point constraints

h(t_{ji}, x(t_{ji}), rj(t_{ji}, vj)) ≤ 0, i = 0, . . . , Ij, j = 0, 1, . . . , N − 1   (4.146)

so that (4.145f) has to hold only at a finite number of points interior to each subinterval [tj, tj+1]. Note also the following remarks:

• The control parametrization and the approximation of the path constraints yield N static optimal control problems in either the Mayer or the Lagrange form, which are subject to an initial value problem. For an efficient numerical solution of (4.145) with (4.146) methods of nonlinear optimization such as the SQP method can be applied to determine the decision variables.

• The accuracy of the solution of the differential equations depends only on the numerical solver used and is hence independent of the grid induced by (4.143). However, problems may arise when the system is unstable or the differential equation does not have a solution for certain decision variables.
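The following Matlab sketch illustrates the direct sequential approach for a piecewise constant parametrization: the cost (4.145a) is evaluated by stagewise forward integration and minimized with fmincon subject to the input bounds. The handles f, l and phi for the dynamics, the integral cost and the terminal cost are assumptions, as are all names; terminal and path constraints would additionally enter via the nonlcon argument of fmincon.

function [V, Jopt] = direct_sequential(f, l, phi, x0, tgrid, Vinit, umin, umax)
%DIRECT_SEQUENTIAL Control vector parametrization with constant stages
%Vinit: m x N initial guess for the stage values v_j, umin/umax: bounds
[m, N] = size(Vinit);
lb = repmat(umin(:), N, 1);
ub = repmat(umax(:), N, 1);
cost = @(Vv) seq_cost(Vv, f, l, phi, x0, tgrid, m, N);
[Vv, Jopt] = fmincon(cost, Vinit(:), [], [], [], [], lb, ub);
V = reshape(Vv, m, N);

function J = seq_cost(Vv, f, l, phi, x0, tgrid, m, N)
V = reshape(Vv, m, N);
x = x0;
J = 0;
for j = 1:N   %stagewise integration of the states and the running cost
  rhs = @(t, xa) [f(t, xa(1:end-1), V(:,j)); l(t, xa(1:end-1), V(:,j))];
  [~, XA] = ode45(rhs, [tgrid(j), tgrid(j+1)], [x; 0]);
  x = XA(end, 1:end-1).';
  J = J + XA(end, end);
end
J = J + phi(tgrid(end), x);   %add the Mayer term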

4.5.2.3 Direct simultaneous methods In direct simultaneous methods⁷ the infinite–dimensional optimal control problem (4.142) is discretized both in the control and the state variables. Different methods such as Lagrange polynomials or monomial basis functions are available for the parametrization of the state variables in a way similar to (4.144); subsequently, numerical routines for the solution of ordinary differential equations are incorporated into the setup. Examples include explicit and implicit Euler or Runge–Kutta methods as well as the trapezoidal and Simpson’s rule. These schematically lead to an approximation of the solution to ẋ(t) = f(t, x(t), u(t)) by

αj(xj+1, uj+1, xj, uj) = βj(xj, uj), j = 0, . . . , N − 1   (4.147)

7 In the literature direct simultaneous methods are also referred to as full discretization methods.


with αj and βj depending on the used approach. With this, the optimal control problem (4.142) is reduced to the static optimization problem

min_{x,u} J(·) = ϕ(te, xN) + Σ_{j=0}^{N−1} (tj+1 − tj)/2 [ l(tj+1, xj+1, uj+1) + l(tj, xj, uj) ]   (4.148a)

subject to

αj(xj+1, uj+1, xj, uj) = βj(xj, uj), j = 0, 1, . . . , N − 1   (4.148b)
x0 = x0   (4.148c)
g(tN, xN) = 0   (4.148d)
h(tj, xj, uj) ≤ 0, j = 0, 1, . . . , N − 1   (4.148e)

on the grid imposed by (4.143). For its solution, the methods introduced in Chapter 3 can be applied to determine the (N + 1)(n + m) decision variables

    ⎡ x0 ⎤       ⎡ u0 ⎤
x = ⎢ ⋮  ⎥,  u = ⎢ ⋮  ⎥.   (4.149)
    ⎣ xN ⎦       ⎣ uN ⎦

It becomes apparent that typically a large–scale static optimization problem is obtained, whose numerical solution might require tailored techniques exploiting, e.g., certain sparsity or block structure properties. Some further remarks are in order:

• The differential equations are fulfilled at the converged solution x∗, u∗ only.

• The inequality constraints are satisfied at the discretization points tj , j = 0, . . . , N only.

• The number of time intervals N + 1 not only influences the approximation of the optimal control but also the accuracy of approximation of the solution to the differential equations.
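A minimal Matlab sketch of the full discretization approach with the trapezoidal rule as defect constraints (4.147), (4.148b) is given below; the handles f, l, phi and all names are assumptions, and the inequality constraints h are omitted for brevity.

function [X, U] = direct_simultaneous(f, l, phi, x0, tgrid, n, m, Zinit)
%DIRECT_SIMULTANEOUS Full discretization with trapezoidal defects
%Decision vector Z stacks all grid values of x and u as in (4.149)
N = numel(tgrid) - 1;
cost = @(Z) fd_cost(Z, l, phi, tgrid, n, m, N);
defect = @(Z) fd_defects(Z, f, x0, tgrid, n, m, N);
Z = fmincon(cost, Zinit, [], [], [], [], [], [], defect);
[X, U] = fd_unpack(Z, n, m, N);

function [X, U] = fd_unpack(Z, n, m, N)
X = reshape(Z(1:n*(N+1)), n, N+1);
U = reshape(Z(n*(N+1)+1:end), m, N+1);

function J = fd_cost(Z, l, phi, t, n, m, N)
[X, U] = fd_unpack(Z, n, m, N);
J = phi(t(end), X(:,end));   %Mayer term
for j = 1:N                  %trapezoidal quadrature, cf. (4.148a)
  J = J + (t(j+1)-t(j))/2*(l(t(j+1),X(:,j+1),U(:,j+1)) + l(t(j),X(:,j),U(:,j)));
end

function [c, ceq] = fd_defects(Z, f, x0, t, n, m, N)
[X, U] = fd_unpack(Z, n, m, N);
c = [];                      %inequality constraints h would be stacked here
ceq = X(:,1) - x0;           %initial condition (4.148c)
for j = 1:N                  %defect constraints (4.148b), trapezoidal rule
  ceq = [ceq; X(:,j+1) - X(:,j) - (t(j+1)-t(j))/2* ...
         (f(t(j+1),X(:,j+1),U(:,j+1)) + f(t(j),X(:,j),U(:,j)))];
end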

4.6 Benchmark example

For the illustration of the discussed numerical approaches for the solution of optimal control problems, we consider the example of an evasive maneuver of a ship. Parts of this example are borrowed from [19] and the equations of motion are derived in [20]. For further details the reader is also referred to [21].

The motion of the ship is governed by

    ⎡ x2                    ⎤   ⎡ 0  ⎤
    ⎢ c1 x2                 ⎥   ⎢ c2 ⎥
ẋ = ⎢ c3 v |x3| x3 + c4 x2  ⎥ + ⎢ 0  ⎥ u,   t > 0,   x(0) = 0   (4.150)
    ⎢ v sin(x1 − x3)        ⎥   ⎢ 0  ⎥
    ⎣ v cos(x1 − x3)        ⎦   ⎣ 0  ⎦

where the first term defines the drift vector field f0(x) and the second the constant input vector f1,

with x1(t) the heading angle, x2(t) the yaw rate, x3(t) the drift angle, (x4(t), x5(t)) the ship position, and u(t) the rudder angle. The ship velocity v is assumed constant with v = 4 m/s. The remaining parameters are identified for a real ship and are given by c1 = −0.26 1/s, c2 = 0.2 1/s², c3 = −1.87 1/(rad m), and c4 = 0.6 1/s. The rudder angle is assumed to be bounded according to

u ∈ [u−, u+]. (4.151)


The considered cost functional is given in Bolza form

min_{u∈[u−,u+]} J(u) = (1/2) ∆x^T(te) Se ∆x(te) + (1/2) ∫_0^{te} { ∆x^T(t) Q ∆x(t) + r u²(t) } dt   (4.152)

with ∆x(t) = x(t) − xe denoting the distance to a desired final ship position xe. The terminal time te is fixed to te = 20 s.

Subsequently, a sidestep of the ship is considered with

xe = [0 0 0 10 m free]^T.   (4.153)

For the sake of simplicity the weighting matrices are chosen as Se = Q = E.

Remark 4.11. The desired final position xe is set up with the final value for the position x5(te) to be free. For the numerical evaluation this is included by neglecting the state x5(t) throughout the solution of the optimal control problem. Note that this is reasonable since x5(t) does not couple into the ordinary differential equations of the state variables x1(t) to x4(t).

In view of (4.150) the Hamiltonian function reads

H(x, u, λ) = (1/2) ∆x^T Q ∆x + (r/2) u² + λ^T [ f0(x) + f1 u ].

This implies the adjoint system

λ̇ = −(∇xH)(x, u, λ) = −Q(x − xe) − (∇xf0)^T(x) λ, λ(te) = Se(x(te) − xe).   (4.154)

Since the system (4.150) is an affine input system and the cost functional contains a term related to energy optimality the results of Section 4.4.2.2 apply so that the optimal control follows from (4.106) in the form

u∗ = ⎧ u−,  u0 ≤ u−
     ⎨ u0,  u0 ∈ (u−, u+)
     ⎩ u+,  u0 ≥ u+
,   u0 = −(1/r) λ^T f1 = −(c2/r) λ2.   (4.155)

Collocation method using bvp4c

Indirect methods directly act on the canonical equations given by (4.150) and (4.154) with (4.155). Proceeding as in Section 4.5.1 a collocation approach is applied by making use of the Matlab function bvp4c. The resulting code is split into two subparts with the first one initializing the problem, defining the parameters of the optimal control problem, calling the numerical solver, and finally simulating the original system (recall Remark 4.11).

function ship()

%System parameters

p.c1 = -0.26;

p.c2 = 0.2;

p.c3 = -1.87;

p.c4 = 0.6;

p.v = 3.5;

%Initial state

p.x0 = zeros(4,1);


%Optimization parameters

p.te = 20.0;

p.r = 10.0;

p.Q = diag([1,1,1,1]);

p.S = diag([1,1,1,1]);

p.xe = [0,0,0,10]’;

%Scenarios

umax = [15,10,5];

umin = -umax;

for j=1:length(umin)

%Run 1

p.r=10.0;

bvp=ship_bvp4c(umin(j),umax(j),p,1e-4);

%Run 2 with result of run 1 as initial condition, adjusted RelTol

%and reduced weight r

ini.x = bvp.t;

ini.y = [bvp.x;bvp.l];

p.r=1.0;

bvp=ship_bvp4c(umin(j),umax(j),p,1e-6,ini);

%Run 3 with result of run 2 as initial condition, adjusted RelTol

%and reduced weight r

ini.x = bvp.t;

ini.y = [bvp.x;bvp.l];

p.r=0.1;

bvp=ship_bvp4c(umin(j),umax(j),p,1e-6,ini);

%Simulate extended system

tspan = linspace(0.0,p.te,201);

[t,x] = ode45(@(t,x)ship_ode(t,x,bvp,p),tspan,zeros(5,1));

end

%=================================================================

% SUBFUNCTIONS

function out = ship_ode(t,x,inp,p);

%ODEs including x5 governing ship motion

u = interp1(inp.t,inp.u,t);

out = zeros(5,1);

out(1) = x(2);

out(2) = p.c1*x(2)+p.c2*u;

out(3) = p.c3*p.v*abs(x(3))*x(3)+p.c4*x(2);

out(4) = p.v*sin(x(1)-x(3));

out(5) = p.v*cos(x(1)-x(3));

It has to be pointed out that for this example bvp4c does not allow one to start directly with the small parameter r = 0.1 but requires warm–up steps by starting with a larger value of r (here r = 10) and successively reducing r. Thereby, it is crucial to properly assign starting conditions, which are herein chosen as the solution of the previous warm–up step. The problem setup for bvp4c is shown below.


function out = ship_bvp4c(umin,umax,p,reltol,varargin)

%Input constraints

p.umax = umax*pi/180;

p.umin = umin*pi/180;

%Calling bvp4c

opts = bvpset(’Stats’,’on’,’RelTol’,reltol);

if nargin==4

ic = bvpinit(linspace(0,p.te,40),[p.x0;zeros(size(p.x0))]);

else

ic =varargin{1};

end

sol = bvp4c(@ship_canon,@ship_bcs,ic,opts,p);

t = sol.x;

x = sol.y(1:4,:);

l = sol.y(5:end,:);

%Determine input

for j=1:length(sol.x)

u(j) = ship_oc(x(:,j),l(:,j),p);

end

out.t=t;

out.x=x;

out.l=l;

out.u=u;

%=================================================================

%SUBEQUATIONS

function out = ship_ode(t,x,u,p);

%ODEs governing ship motion

out = zeros(4,1);

out(1) = x(2);

out(2) = p.c1*x(2)+p.c2*u;

out(3) = p.c3*p.v*abs(x(3))*x(3)+p.c4*x(2);

out(4) = p.v*sin(x(1)-x(3));

%out(5) = p.v*cos(x(1)-x(3));

%=================================================================

function out = ship_fjac(x,p);

%Jacobian of the ship ODEs

out = zeros(4,4);

out = [0,1,0,0;

0,p.c1,0,0;

0,p.c4,2*p.c3*p.v*x(3)*sign(x(3)),0;

p.v*cos(x(1)-x(3)),0,-p.v*cos(x(1)-x(3)),0];

%=================================================================

function out = ship_canon(t,z,p)

%Canonical equations

x = z(1:4);

l = z(5:end);


u = ship_oc(x,l,p);

xdot = ship_ode(t,x,u,p);

ldot = -(p.Q*(x-p.xe) + ship_fjac(x,p)’*l);

out = [xdot;ldot];

%=================================================================

function out = ship_bcs(z0,ze,p)

%Boundary conditions

x0 = z0(1:4);

xe = ze(1:4);

le = ze(5:end);

out = [x0-p.x0;

le-p.S*(xe-p.xe)];

%=================================================================

function out = ship_oc(x,l,p)

%Optimal control

f1 = [0,p.c2,0,0]’;

u = -l’*f1/p.r;

%Take into account input constraints

if u<p.umin

u=p.umin;

elseif u>p.umax

u=p.umax;

end

out = u;

Numerical results when varying the input constraints are shown in Figure 4.10. The maneuver is driven rather aggressively as can be seen from the rudder angle, which almost approaches a bang–bang behavior switching between the maximal and minimal allowed input values. A relaxation is achieved by either increasing r or the terminal time te.

Fig. 4.10: Sidestep of the ship when varying the input constraints u ∈ [−15, 15]◦, u ∈ [−10, 10]◦ and u ∈ [−5, 5]◦: Collocation method using bvp4c (left: rudder angle u over t; right: ship trajectory x5 over x4).


Multiple shooting using ACADO

As a second approach, the open–source code ACADO [22, 23] is utilized to compute the optimal solution of the maneuvering problem. Besides a C++ interface ACADO provides a Matlab interface [24] that is used to compute the following results. Provided that the necessary libraries are available in the Matlab search path the problem can be set up as follows.

clear;

BEGIN_ACADO;

acadoSet(’problemname’, ’ship’);

%Optimization parameters

te = 20.0;

xe = [0,0,0,10];

%Initialize

DifferentialState x1 x2 x3 x4 L;

Control u;

% Set default objects

f = acado.DifferentialEquation();

f.linkMatlabODE(’ship_ode’);

%Optimal control problem

ocp = acado.OCP(0.0, te, 200);

ocp.minimizeMayerTerm(L + (x1-xe(1))*(x1-xe(1)) + (x2-xe(2))*(x2-xe(2)) + ...

(x3-xe(3))*(x3-xe(3)) + (x4-xe(4))*(x4-xe(4)));

ocp.subjectTo( f );

ocp.subjectTo( ’AT_START’, x1 == 0.0 );

ocp.subjectTo( ’AT_START’, x2 == 0.0);

ocp.subjectTo( ’AT_START’, x3 == 0.0 );

ocp.subjectTo( ’AT_START’, x4 == 0.0 );

ocp.subjectTo( ’AT_START’, L == 0.0 );

%[-15 deg,+15 deg]

%ocp.subjectTo( -0.261799 <= u <= 0.261799 );

%[-10 deg,+10 deg]

ocp.subjectTo( -0.174533 <= u <= 0.174533 );

%[-5 deg,+5 deg]

%ocp.subjectTo( -0.087266 <= u <= 0.087266 );

algo = acado.OptimizationAlgorithm(ocp);

algo.set(’INTEGRATOR_TOLERANCE’,1e-5 );

algo.set(’KKT_TOLERANCE’, 1e-6);

algo.initializeControls([0 0]);

END_ACADO; % Always end with "END_ACADO".

clear;

out = ship_RUN();

draw; % Graphical output needs to be supplied by the user


In the setup above, the user has to specify the system (4.150) as a Matlab function provided below. Note that the system equations can also be provided inline. However, due to the unavailability of a function realizing the absolute value | · | in the ACADO Matlab interface, linking to the external function was chosen to overcome this restriction.

function [ dx ] = ship_ode( t,x,u,p,w )

%System parameters

c1 = -0.26;

c2 = 0.2;

c3 = -1.87;

c4 = 0.6;

v = 3.5;

%Optimization parameters

r = 0.1;

xe = [0,0,0,10]’;

de = (x(1)-xe(1))*(x(1)-xe(1)) + (x(2)-xe(2))*(x(2)-xe(2)) + ...

(x(3)-xe(3))*(x(3)-xe(3)) + (x(4)-xe(4))*(x(4)-xe(4));

%abs() seems not to work in ACADO

if x(3)<0

h=-x(3)*x(3);

else

h=x(3)*x(3);

end

dx(1) = x(2);

dx(2) = c1*x(2)+c2*u;

dx(3) = c3*v*h+c4*x(2);

dx(4) = v*sin(x(1)-x(3));

dx(5) = 0.5*(de +r*u*u);

end

Numerical results achieved using ACADO are shown in Figure 4.11. Both the optimal control and the ship trajectory are almost identical to those obtained in Figure 4.10 when taking into account the different input constraints.

Fig. 4.11: Sidestep of the ship when varying the input constraints u ∈ [−15, 15]◦, u ∈ [−10, 10]◦ and u ∈ [−5, 5]◦: Multiple shooting using ACADO (left: rudder angle u over t; right: ship trajectory x5 over x4).


References

1. Chachuat B (2007) Nonlinear and Dynamic Optimization: From Theory to Practice. Tech. rep., Laboratoire d’Automatique, EPFL Lausanne, http://infoscience.epfl.ch/record/111939/files/Chachuat_07(IC32).pdf
2. Clarke F (2013) Functional Analysis, Calculus of Variations and Optimal Control. No. 264 in Graduate Texts in Mathematics, Springer, London
3. Liberzon D (2011) Calculus of Variations and Optimal Control Theory: A Concise Introduction. Princeton University Press, Princeton (NJ)
4. Luenberger D (1969) Optimization by Vector Space Methods, 3rd edn. John Wiley & Sons, New York
5. Reddy J (1984) Energy and Variational Methods in Applied Mechanics. Wiley–Interscience, New York
6. Troutman J (1996) Variational Calculus and Optimal Control: Optimization with Elementary Convexity, 2nd edn. Springer–Verlag, New York
7. Heuser H (2006) Gewöhnliche Differentialgleichungen, 5th edn. B.G. Teubner, Wiesbaden
8. Taylor M (1996) Partial Differential Equations I. Basic Theory. Springer–Verlag, New York
9. Bryson A, Ho YC (1969) Applied Optimal Control. Ginn and Company, Waltham (MA)
10. Bryson A (1999) Dynamic Optimization. Addison Wesley Longman, Inc., Menlo Park (CA), USA
11. Kwakernaak H, Sivan R (1972) Linear Optimal Control Systems. Wiley–Interscience, New York
12. Papageorgiou M (1991) Optimierung. R. Oldenbourg Verlag, München, Wien
13. Pontryagin L, Boltyanskii V, Gamkrelidze R, Mishchenko E (1964) The Mathematical Theory of Optimal Processes. Pergamon Press, New York
14. Hartl R, Sethi S, Vickson R (1995) A survey of the Maximum Principles for optimal control problems with state constraints. SIAM Review 37(2):181–218
15. Schwarz H, Köckler N (2009) Numerische Mathematik, 7th edn. Vieweg+Teubner, Wiesbaden
16. Deuflhard P, Bornemann F (2008) Numerische Mathematik 2: Gewöhnliche Differentialgleichungen. De Gruyter, Berlin, Boston
17. Shampine L, Kierzenka J, Reichelt M (2000) Solving boundary value problems for ordinary differential equations in MATLAB with bvp4c. http://www.mathworks.com/matlabcentral/answers/uploaded_files/bvp_paper.pdf
18. Betts J (2001) Practical Methods for Optimal Control Using Nonlinear Programming. SIAM, Philadelphia
19. Graichen K (2013) Methoden der Optimierung und optimalen Steuerung. Skriptum zur Vorlesung, Universität Ulm
20. Bittner R, Driescher A, Gilles E (2003) Drift dynamics modeling for automatic track–keeping of inland vessels. In: Proc. 10th St. Petersburg Int. Conference on Integrated Navigation Systems, St. Petersburg (RUS)
21. Lutz A (2011) Kollisionserkennung und –vermeidung auf Binnenwasserstraßen. PhD thesis, Institut für Systemdynamik, Universität Stuttgart
22. Houska B, Ferreau H, Diehl M (2011) ACADO Toolkit – An Open Source Framework for Automatic Control and Dynamic Optimization. Optimal Control Applications and Methods 32(3):298–312
23. Houska B, Ferreau H (2009–2011) ACADO Toolkit User’s Manual. http://www.acadotoolkit.org
24. Ariens D, Houska B, Ferreau H (2010–2011) ACADO for Matlab User’s Manual. http://www.acadotoolkit.org