Lecture: Quadratic optimization
1. Positive definite and semidefinite matrices
2. LDL^T factorization
3. Quadratic optimization without constraints
4. Quadratic optimization with constraints
5. Least-squares problems
Quadratic optimization without constraints
minimize (1/2) x^T H x + c^T x + c_0
where H = H^T ∈ R^{n×n}, c ∈ R^n, c_0 ∈ R.
• Common in applications, e.g.,
– linear regression (fitting of models)
– minimization of physically motivated objective functions such as
minimization of energy, variance etc.
– quadratic approximation of nonlinear optimization problems.
The quadratic term
Let
f(x) = (1/2) x^T H x + c^T x + c_0
where H = H^T ∈ R^{n×n}, c ∈ R^n, c_0 ∈ R.
We can assume that the matrix H is symmetric. If H is not symmetric it holds that
x^T H x = (1/2) x^T H x + (1/2)(x^T H x)^T = (1/2) x^T (H + H^T) x,
i.e., x^T H x = x^T H̄ x, where H̄ = (1/2)(H + H^T) is a symmetric matrix.
Positive definite and positive semidefinite matrices
Definition 1 (Positive semidefinite). A symmetric matrix H = H^T ∈ R^{n×n} is called positive semidefinite if
x^T H x ≥ 0 for all x ∈ R^n
Definition 2 (Positive definite). A symmetric matrix H = H^T ∈ R^{n×n} is called positive definite if
x^T H x > 0 for all x ∈ R^n \ {0}
How do you determine whether a symmetric matrix H = H^T ∈ R^{n×n} is positive (semi)definite?
• LDL^T factorization (suitable for calculations by hand). Chapter 27.6 in the book.
• Use the spectral factorization of the matrix H = QΛQ^T, where Q^T Q = I and Λ = diag(λ_1, . . . , λ_n), with λ_k the eigenvalues of the matrix (a scripted version of this test is sketched after this list). From this it follows that
– H is positive semidefinite if and only if λ_k ≥ 0, k = 1, . . . , n
– H is positive definite if and only if λ_k > 0, k = 1, . . . , n
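As a numerical complement to the two hand methods, the spectral test is easy to script. Below is a minimal sketch assuming NumPy; the helper name definiteness and the tolerance are our own choices, and the test matrix is the one factorized by hand later in this lecture.

```python
import numpy as np

def definiteness(H, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    lam = np.linalg.eigvalsh(H)        # eigenvalues of a symmetric matrix
    if np.all(lam > tol):
        return "positive definite"
    if np.all(lam >= -tol):
        return "positive semidefinite"
    return "indefinite"

H = np.array([[2.0, 4, 6, 8],
              [4, 9, 17, 22],
              [6, 17, 44, 61],
              [8, 22, 61, 118]])
print(definiteness(H))                 # -> positive definite
```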
LDL^T factorization
Theorem 1. A symmetric matrix H = H^T ∈ R^{n×n} is positive semidefinite if and only if there is a factorization H = LDL^T where

L = [ 1    0    0    . . .  0 ]        D = [ d1  0   0   . . .  0  ]
    [ l21  1    0    . . .  0 ]            [ 0   d2  0   . . .  0  ]
    [ l31  l32  1    . . .  0 ]            [ 0   0   d3  . . .  0  ]
    [ ...             ...     ]            [ ...            ...    ]
    [ ln1  ln2  ln3  . . .  1 ]            [ 0   0   0   . . .  dn ]

and d_k ≥ 0, k = 1, . . . , n.
H is positive definite if and only if d_k > 0, k = 1, . . . , n, in the LDL^T factorization.
Proof: See Chapter 27.9 in the book.
The LDL^T factorization can be determined by one of the following methods:
• elementary row and column operations,
• completing the square.
LDL^T factorization - Example
Determine the LDL^T factorization of

H = [ 2  4   6   8  ]
    [ 4  9  17  22  ]
    [ 6  17 44  61  ]
    [ 8  22 61  118 ]

The idea is to perform symmetric row and column operations from the left and right in order to form the diagonal matrix D.
The row operations can be performed with lower triangular invertible matrices; the product of lower triangular matrices is lower triangular, and so is the inverse of a lower triangular matrix.
Perform row operations to get zeros in the first column:

       [  1 0 0 0 ] [ 2  4   6   8  ]   [ 2 0  0  0 ]
ℓ1 H = [ -2 1 0 0 ] [ 4  9  17  22  ] = [ 0 1  5  6 ]
       [ -3 0 1 0 ] [ 6  17 44  61  ]   [ 0 5 26 37 ]
       [ -4 0 0 1 ] [ 8  22 61  118 ]   [ 0 6 37 86 ]

and the corresponding column operations:

            [ 2 4  6  8 ] [ 1 -2 -3 -4 ]   [ 2 0  0  0 ]
ℓ1 H ℓ1^T = [ 0 1  5  6 ] [ 0  1  0  0 ] = [ 0 1  5  6 ]
            [ 0 5 26 37 ] [ 0  0  1  0 ]   [ 0 5 26 37 ]
            [ 0 6 37 86 ] [ 0  0  0  1 ]   [ 0 6 37 86 ]
Perform row operations to get zeros in the second column:

               [ 1  0 0 0 ] [ 2 0  0  0 ]   [ 2 0 0  0 ]
ℓ2 ℓ1 H ℓ1^T = [ 0  1 0 0 ] [ 0 1  5  6 ] = [ 0 1 5  6 ]
               [ 0 -5 1 0 ] [ 0 5 26 37 ]   [ 0 0 1  7 ]
               [ 0 -6 0 1 ] [ 0 6 37 86 ]   [ 0 0 7 50 ]

and the corresponding column operations:

                    [ 2 0 0  0 ] [ 1 0  0  0 ]   [ 2 0 0  0 ]
ℓ2 ℓ1 H ℓ1^T ℓ2^T = [ 0 1 5  6 ] [ 0 1 -5 -6 ] = [ 0 1 0  0 ]
                    [ 0 0 1  7 ] [ 0 0  1  0 ]   [ 0 0 1  7 ]
                    [ 0 0 7 50 ] [ 0 0  0  1 ]   [ 0 0 7 50 ]
Perform row operations to get zeros in the third column:

                  [ 1 0  0 0 ] [ 2 0 0  0 ]   [ 2 0 0 0 ]
ℓ3 ℓ2 ℓ1 H ℓ1^T ℓ2^T = [ 0 1  0 0 ] [ 0 1 0  0 ] = [ 0 1 0 0 ]
                  [ 0 0  1 0 ] [ 0 0 1  7 ]   [ 0 0 1 7 ]
                  [ 0 0 -7 1 ] [ 0 0 7 50 ]   [ 0 0 0 1 ]

and the corresponding column operations:

                       [ 2 0 0 0 ] [ 1 0 0  0 ]   [ 2 0 0 0 ]
ℓ3 ℓ2 ℓ1 H ℓ1^T ℓ2^T ℓ3^T = [ 0 1 0 0 ] [ 0 1 0  0 ] = [ 0 1 0 0 ] = D
                       [ 0 0 1 7 ] [ 0 0 1 -7 ]   [ 0 0 1 0 ]
                       [ 0 0 0 1 ] [ 0 0 0  1 ]   [ 0 0 0 1 ]
We finally got the diagonal matrix

D = [ 2 0 0 0 ]
    [ 0 1 0 0 ]
    [ 0 0 1 0 ]
    [ 0 0 0 1 ]
Since all elements on the diagonal of D are positive the matrix H is
positive definite.
The values on the diagonal are not the eigenvalues of H, but they have
the same sign as the eigenvalues, and that is all we need to know.
How do we determine the L matrix?
ℓ3 ℓ2 ℓ1 H ℓ1^T ℓ2^T ℓ3^T = D, i.e., H = LDL^T if L = (ℓ3 ℓ2 ℓ1)^{-1} = ℓ1^{-1} ℓ2^{-1} ℓ3^{-1}:

    [  1 0 0 0 ]^{-1} [ 1  0 0 0 ]^{-1} [ 1 0  0 0 ]^{-1}
L = [ -2 1 0 0 ]      [ 0  1 0 0 ]      [ 0 1  0 0 ]
    [ -3 0 1 0 ]      [ 0 -5 1 0 ]      [ 0 0  1 0 ]
    [ -4 0 0 1 ]      [ 0 -6 0 1 ]      [ 0 0 -7 1 ]

    [ 1 0 0 0 ] [ 1 0 0 0 ] [ 1 0 0 0 ]   [ 1 0 0 0 ]
  = [ 2 1 0 0 ] [ 0 1 0 0 ] [ 0 1 0 0 ] = [ 2 1 0 0 ]
    [ 3 0 1 0 ] [ 0 5 1 0 ] [ 0 0 1 0 ]   [ 3 5 1 0 ]
    [ 4 0 0 1 ] [ 0 6 0 1 ] [ 0 0 7 1 ]   [ 4 6 7 1 ]
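The same elimination can be scripted. The following is a minimal sketch assuming NumPy; the function name ldlt and its interface are our own, and no pivoting is attempted, so it assumes the pivots d_k stay nonzero (as for a positive definite H).

```python
import numpy as np

def ldlt(H):
    """Return (L, d) with H = L @ diag(d) @ L.T for symmetric H."""
    H = np.array(H, dtype=float)
    n = H.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for k in range(n):
        # pivot: what remains of H[k, k] after the first k columns
        d[k] = H[k, k] - (L[k, :k] ** 2) @ d[:k]
        for i in range(k + 1, n):
            L[i, k] = (H[i, k] - (L[i, :k] * L[k, :k]) @ d[:k]) / d[k]
    return L, d

H = np.array([[2.0, 4, 6, 8],
              [4, 9, 17, 22],
              [6, 17, 44, 61],
              [8, 22, 61, 118]])
L, d = ldlt(H)
print(d)   # -> [2. 1. 1. 1.], matching the hand calculation
print(L)   # -> the matrix L above
```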
Unbounded problems
Let f(x) = (1/2) x^T H x + c^T x + c_0.
Assume that H is not positive semidefinite. Then there is a d ∈ R^n such that d^T H d < 0. It follows that
f(td) = (1/2) t^2 d^T H d + t c^T d + c_0 → −∞ when t → ∞
Conclusion: A necessary condition for f(x) to have a minimum is that
H is positive semidefinite.
Assume that H is positive semidefinite. Then we can use the factorization H = LDL^T, which gives

minimize (1/2) x^T H x + c^T x + c_0
  = minimize (1/2) x^T L D L^T x + c^T (L^T)^{-1} L^T x + c_0
  = minimize Σ_{k=1}^{n} [ (1/2) d_k y_k^2 + c̄_k y_k ] + c_0
  = c_0 + Σ_{k=1}^{n} minimize [ (1/2) d_k y_k^2 + c̄_k y_k ]

where we have introduced the new coordinates y = L^T x and c̄ = L^{-1} c.
The optimization problem separates into n independent scalar quadratic optimization problems, which are easy to solve.
Scalar quadratic optimization without constraints
The scalar quadratic optimization problem
minimize_{x ∈ R} (1/2) h x^2 + c x + c_0
has a finite solution if and only if
• h = 0 and c = 0, or
• h > 0.
In the first case there is no unique optimum.
In the second case (1/2) h x^2 + c x + c_0 = (1/2) h (x + c/h)^2 + c_0 − c^2/(2h), and the optimum is achieved for x = −c/h.
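These two cases translate directly into a few lines of code. A minimal sketch, assuming plain Python; the function name and the (x, fmin) return convention are our own.

```python
import math

def minimize_scalar_quadratic(h, c, c0):
    """Minimize (1/2) h x^2 + c x + c0 over x in R.

    Returns (x, fmin); x is None when no unique minimizer exists, and
    fmin is -inf when the problem is unbounded from below."""
    if h > 0:
        # completing the square: (1/2)h(x + c/h)^2 + c0 - c^2/(2h)
        return -c / h, c0 - c**2 / (2 * h)
    if h == 0 and c == 0:
        return None, c0          # every x is optimal, no unique optimum
    return None, -math.inf       # h < 0, or h == 0 with c != 0

print(minimize_scalar_quadratic(2.0, -4.0, 1.0))   # -> (2.0, -3.0)
```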
Solving the minimization problems separately shows that y = [y_1 . . . y_n]^T is a minimizer if
(i)′ d_k ≥ 0, k = 1, . . . , n
(ii)′ d_k y_k = −c̄_k
which is equivalent to
(i) H is positive semidefinite
(ii) D L^T x = −L^{-1} c
Constraint (ii) can be replaced with L D L^T x = −c, i.e., Hx = −c.
Furthermore, f(x) = (1/2) x^T H x + c^T x + c_0 is unbounded (from below) if
(i)′ some d_k < 0, or
(ii)′ one of the equations d_k y_k = −c̄_k has no solution, i.e., d_k = 0 and c̄_k ≠ 0,
which is equivalent to
(i) H is not positive semidefinite, or
(ii) the equation Hx = −c has no solution.
The above arguments can be formalized as the following theorem.
Theorem 2. Let f(x) = (1/2) x^T H x + c^T x + c_0. Then x is a minimizer if
(i) H is positive semidefinite
(ii) Hx = −c
If H is positive definite, then x = −H^{-1} c.
If H is not positive semidefinite, or Hx = −c has no solution, then f is not bounded from below.
Comment 1. The condition that Hx = −c has a solution is equivalent to c ∈ R(H). Therefore, f has a minimizer if and only if H is positive semidefinite and c ∈ R(H).
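A numerical sketch of Theorem 2 for the positive definite case, assuming NumPy; the helper name is our own and the example data are made up. The minimizer is the unique solution of Hx = −c.

```python
import numpy as np

def solve_unconstrained_qp(H, c):
    """Minimizer of (1/2) x^T H x + c^T x for positive definite H."""
    if np.any(np.linalg.eigvalsh(H) <= 0):
        raise ValueError("H must be positive definite for a unique minimizer")
    return np.linalg.solve(H, -c)      # solves Hx = -c

H = np.array([[2.0, 0.0],
              [0.0, 4.0]])
c = np.array([-2.0, -8.0])
print(solve_unconstrained_qp(H, c))    # -> [1. 2.]
```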
Quadratic optimization with linear constraints
minimize (1/2) x^T H x + c^T x + c_0
s.t. Ax = b                                   (1)
where H = H^T ∈ R^{n×n}, A ∈ R^{m×n}, c ∈ R^n, b ∈ R^m, and c_0 ∈ R.
• We assume that b ∈ R(A) and n > m, i.e., N(A) ≠ {0}.
• Assume that N(A) = span{z_1, . . . , z_k} and define the nullspace matrix Z = [z_1 . . . z_k].
• If Ax̄ = b, it holds that an arbitrary solution to the linear constraint has the form x = x̄ + Zv, for some v ∈ R^k. This follows since
A(x̄ + Zv) = Ax̄ + AZv = b + 0 = b
With x = x̄ + Zv inserted in f(x) = (1/2) x^T H x + c^T x + c_0 we get
f(x) = (1/2) v^T Z^T H Z v + (Z^T(Hx̄ + c))^T v + f(x̄).
The minimization problem (1) is thus equivalent to
minimize (1/2) v^T Z^T H Z v + (Z^T(Hx̄ + c))^T v + f(x̄).
We can now apply Theorem 2 to obtain the following result:
Theorem 3. x is an optimal solution to (1) if
(i) Z^T H Z is positive semidefinite,
(ii) x = x̄ + Zv where Z^T H Z v = −Z^T(Hx̄ + c).
Comment 2. The second condition in the theorem is equivalent to the existence of a u ∈ R^m such that

[ H  −A^T ] [ x ]   [ −c ]
[ A   0   ] [ u ] = [  b ]

Proof: The second condition in the theorem can be written
Z^T(H(x̄ + Zv) + c) = 0
Since N(A) = R(Z) this is equivalent to
H(x̄ + Zv) + c ∈ N(A)^⊥ = R(A^T),   where x = x̄ + Zv,
⇔ Hx + c = A^T u, for some u ∈ R^m
Linear constrained QP: Example
Let f(x) = (1/2) x^T H x + c^T x + c_0, where

H = [ −2 0 ]      c = [ 1 ]      c_0 = 0.
    [  0 1 ] ,        [ 1 ] ,

Consider minimization of f over the feasible region F = {x | Ax = b} where
A = [ 2 1 ] ,   b = 2.
I.e., the minimization problem
min  −x_1^2 + (1/2) x_2^2 + x_1 + x_2
s.t. 2x_1 + x_2 = 2
Linear constrained QP: Example - Nullspace method
The nullspace of A is spanned by Z and the reduced Hessian is given by

Z = [ −1 ]      H̄ = Z^T H Z = 2
    [  2 ] ,

Since the reduced Hessian is positive definite the problem is strictly convex. The unique solution is then given by taking an arbitrary x̄ ∈ F, e.g. x̄ = (1 0)^T, and solving
H̄ v = −c̄ = −Z^T(Hx̄ + c) = −3   ⇒   v = −3/2
Then
x = x̄ + Zv = [ 1 ] + [ −1 ] (−3/2) = [ 5/2 ]
             [ 0 ]   [  2 ]          [ −3  ]
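The same steps in code: a sketch assuming NumPy, with names following the slides (Z, and x̄ as x_bar); nothing here is library-specific beyond basic linear algebra.

```python
import numpy as np

H = np.array([[-2.0, 0.0], [0.0, 1.0]])
c = np.array([1.0, 1.0])
A = np.array([[2.0, 1.0]])

Z = np.array([[-1.0], [2.0]])      # basis for N(A): A @ Z == 0
x_bar = np.array([1.0, 0.0])       # a particular solution of Ax = b

H_red = Z.T @ H @ Z                # reduced Hessian, here [[2.0]]
rhs = -Z.T @ (H @ x_bar + c)       # here [-3.0]
v = np.linalg.solve(H_red, rhs)    # v = [-1.5]
x = x_bar + Z @ v                  # x = [2.5, -3.0]
print(x)
```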
Linear constrained QP: Example - Geometric illustration
[Figure: the feasible line F in the (x_1, x_2) plane (axes from −4 to 4), a particular solution x̄, the nullspace direction Z, and the step (−3/2)Z from x̄ to the optimal x.]
Linear constrained QP: Example - Lagrange method
Assume that we know the problem is convex; then

[ H  −A^T ] [ x ]   [ −c ]          [ −2 0 −2 ] [ x_1 ]   [ −1 ]
[ A   0   ] [ u ] = [  b ] ,  i.e.,  [  0 1 −1 ] [ x_2 ] = [ −1 ]
                                    [  2 1  0 ] [  u  ]   [  2 ]

Solving
−2x_1 − 2u = −1        x_1 = 1/2 − u
 x_2 −  u  = −1   ⇒    x_2 = −1 + u,

then 2x_1 + x_2 = 2 implies 2(1/2 − u) + (−1 + u) = −u = 2, so u = −2 and then x_1 = 5/2 and x_2 = −3.
This agrees with the nullspace method.
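The same example via the block system, as a sketch assuming NumPy; np.block assembles the KKT matrix from Comment 2.

```python
import numpy as np

H = np.array([[-2.0, 0.0], [0.0, 1.0]])
c = np.array([1.0, 1.0])
A = np.array([[2.0, 1.0]])
b = np.array([2.0])

n, m = H.shape[0], A.shape[0]
KKT = np.block([[H, -A.T],
                [A, np.zeros((m, m))]])   # [[H, -A^T], [A, 0]]
sol = np.linalg.solve(KKT, np.concatenate([-c, b]))
x, u = sol[:n], sol[n:]
print(x, u)                               # -> [ 2.5 -3. ] [-2.]
```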
Linear constrained QP: Example - Sensitivity analysis
From the Lagrange method we know that u = −2. To first order, the multiplier tells us how the objective function changes due to variations in the right-hand side b of the linear constraint.
Assume that b := b + δ. Then, from the previous calculations, u = −(2 + δ), and then x_1 = 5/2 + δ and x_2 = −3 − δ. Then

f(x) = (1/2) [ 5/2 + δ ]^T [ −2 0 ] [ 5/2 + δ ]   [ 1 ]^T [ 5/2 + δ ]
             [ −3 − δ ]    [  0 1 ] [ −3 − δ ] +  [ 1 ]   [ −3 − δ ]

     = −9/4 − 2δ − (1/2)δ^2
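A quick numerical check of this first-order interpretation, assuming NumPy; f_opt below reuses the optimizer expression derived above. The central difference of the optimal value at δ = 0 should reproduce the multiplier u = −2.

```python
import numpy as np

H = np.array([[-2.0, 0.0], [0.0, 1.0]])
c = np.array([1.0, 1.0])

def f_opt(delta):
    """Optimal objective value when the constraint is 2x1 + x2 = 2 + delta."""
    x = np.array([5/2 + delta, -3.0 - delta])   # optimizer from above
    return 0.5 * x @ H @ x + c @ x

eps = 1e-6
print((f_opt(eps) - f_opt(-eps)) / (2 * eps))   # -> approx -2.0 = u
```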
Example 2: Analysis of a resistance circuit
[Figure: a resistor network with nodes 1-5; the resistor r_ij between nodes i and j carries current x_ij with voltage drop v_ij; the node potentials are u_1, . . . , u_5, and an external current I enters at node 1 and leaves at node 5. Edges: (1,2), (1,3), (2,3), (2,4), (3,4), (3,5), (4,5).]
• Let a current I go through the circuit from node 1 to node 5.
• What is the solution to the circuit equations?
The current x_ij goes from node i to node j if x_ij > 0. Kirchhoff's current law gives (with the sign convention "current out of the node minus current into the node")
 x12 + x13 = I
−x12 + x23 + x24 = 0
−x13 − x23 + x34 + x35 = 0
−x24 − x34 + x45 = 0
−x35 − x45 = −I
This can be written Ax = b where x = [x12 x13 x23 x24 x34 x35 x45]^T,

A = [  1  1  0  0  0  0  0 ]        [  I ]
    [ −1  0  1  1  0  0  0 ]        [  0 ]
    [  0 −1 −1  0  1  1  0 ]    b = [  0 ]
    [  0  0  0 −1 −1  0  1 ]        [  0 ]
    [  0  0  0  0  0 −1 −1 ] ,      [ −I ]
• The node potentials are denoted u_j, j = 1, . . . , 5.
• The voltage drop over the resistor r_ij is denoted v_ij and is given by the equations
v_ij = u_i − u_j
v_ij = r_ij x_ij   (Ohm's law)
This can also be written
v = A^T u
v = Dx
where
u = [u1 u2 u3 u4 u5]^T
v = [v12 v13 v23 v24 v34 v35 v45]^T
D = diag(r12, r13, r23, r24, r34, r35, r45)
The circuit equations:
Ax = b            [ D  −A^T ] [ x ]   [ 0 ]
v = A^T u    ⇔    [ A   0   ] [ u ] = [ b ]
v = Dx
Since D is positive definite (r_ij > 0) this is, see Comment 2, equivalent to the optimality conditions for the following optimization problem
minimize (1/2) x^T D x
s.t. Ax = b
Comment
In the lecture on network optimization we saw that N(A) ≠ {0} (there the matrix was called A) and that the nullspace corresponds to cycles in the circuit.
The laws of electricity (Ohm's law + Kirchhoff's law) determine the current that minimizes the loss of power. This follows since the objective function can be written
(1/2) x^T D x = (1/2) Σ_{ij} r_ij x_ij^2
Comment 3. It is common to connect one of the nodes to earth. Let us, e.g., connect node 5 to earth; then u5 = 0.
[Figure: the same resistor network as before, now with node 5 connected to earth so that u5 = 0.]
With the potential in node 5 fixed at u5 = 0, the last column in the voltage equation v = A^T u is eliminated. The voltage equation and the current balance can be replaced with v = A^T u and Ax = b where

A = [  1  1  0  0  0  0  0 ]        [ I ]        [ u1 ]
    [ −1  0  1  1  0  0  0 ]    b = [ 0 ]    u = [ u2 ]
    [  0 −1 −1  0  1  1  0 ]        [ 0 ]        [ u3 ]
    [  0  0  0 −1 −1  0  1 ] ,      [ 0 ] ,      [ u4 ]

The advantage of this is that the rows of A are linearly independent. This facilitates the solution of the optimization problem.
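A numerical sketch of the whole example, assuming NumPy; the resistance values and the current I below are made-up illustration numbers. It assembles the grounded system and solves the KKT system from Comment 2 for the QP minimize (1/2) x^T D x s.t. Ax = b.

```python
import numpy as np

# reduced (grounded) current-balance matrix and right-hand side
A = np.array([[ 1,  1,  0,  0,  0,  0,  0],
              [-1,  0,  1,  1,  0,  0,  0],
              [ 0, -1, -1,  0,  1,  1,  0],
              [ 0,  0,  0, -1, -1,  0,  1]], dtype=float)
I = 1.0
b = np.array([I, 0.0, 0.0, 0.0])
r = np.array([1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0])  # assumed resistances
D = np.diag(r)

# KKT system [[D, -A^T], [A, 0]] [x; u] = [0; b]
m, n = A.shape
KKT = np.block([[D, -A.T],
                [A, np.zeros((m, m))]])
sol = np.linalg.solve(KKT, np.concatenate([np.zeros(n), b]))
x, u = sol[:n], sol[n:]   # branch currents and node potentials (u5 = 0)
print(x)                  # currents satisfying Kirchhoff's current law
print(u)                  # potentials relative to the earthed node
```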
Least-Squares problems
Application: Linear regression (model fitting)
[Figure: a system with output y(t), measurement noise e(t), and measured observations s(t).]
Problem: Fit a linear regression model to measured data.
Regression model: y(t) = Σ_{j=1}^{n} α_j ψ_j(t)
• ψ_k(t), k = 1, . . . , n are the regressors (known functions)
• α_k, k = 1, . . . , n are the model parameters (to be determined)
• e(t) is the measurement noise (not known)
• s(t) are the observations.
Idea for fitting the model: minimize the quadratic sum of the prediction errors

minimize (1/2) Σ_{i=1}^{m} ( Σ_{j=1}^{n} α_j ψ_j(t_i) − s(t_i) )^2
 = minimize (1/2) Σ_{i=1}^{m} ( ψ(t_i)^T x − s(t_i) )^2
 = minimize (1/2) (Ax − b)^T (Ax − b)                  (2)

where

ψ(t) = [ ψ_1(t) ]     x = [ α_1 ]     A = [ ψ(t_1)^T ]     b = [ s(t_1) ]
       [  ...   ]         [ ... ]         [   ...    ]         [  ...   ]
       [ ψ_n(t) ] ,       [ α_n ] ,       [ ψ(t_m)^T ] ,       [ s(t_m) ]
The solution of the Least-Squares problem (LSQ)
The Least-Squares problem (2) can equivalently be written
minimize (1/2) x^T H x + c^T x + c_0
where H = A^T A, c = −A^T b, c_0 = (1/2) b^T b. We note that
• H = A^T A is positive semidefinite since x^T A^T A x = ‖Ax‖^2 ≥ 0.
• c ∈ R(H) since c = −A^T b ∈ R(A^T) = R(A^T A) = R(H).
The conditions in Theorem 2 and Comment 1 are satisfied and it follows that the LSQ estimate is given by
Hx = −c ⇔ A^T A x = A^T b   (the normal equations)
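A small end-to-end sketch, assuming NumPy; the regressors ψ_1(t) = 1, ψ_2(t) = t and the synthetic data are made-up illustration choices.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)     # sample times t_1, ..., t_m
s = 1.0 + 2.0 * t + 0.05 * rng.standard_normal(t.size)  # observations s(t_i)

# rows of A are psi(t_i)^T with psi_1(t) = 1 and psi_2(t) = t
A = np.column_stack([np.ones_like(t), t])
b = s

# solve the normal equations A^T A x = A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)   # -> approximately [1.0, 2.0]

# in practice np.linalg.lstsq(A, b, rcond=None) is numerically preferable
```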
Linear algebra interpretation
[Figure: the four fundamental subspaces of A. In R^n: R(A^T) and N(A); in R^m: R(A) and N(A^T). Ax lies in R(A), the residual b − Ax lies in N(A^T), and the minimum-norm solution has the form x = A^T u in R(A^T).]
• The normal equations can be written
A^T(b − Ax) = 0 ⇔ b − Ax ∈ N(A^T)
This orthogonality property has a geometric interpretation which is illustrated in the figure above.
• The Fundamental Theorem of Linear Algebra explains the LSQ solution geometrically.
Uniqueness of the LSQ solution
Case 1: If the columns of A are linearly independent, i.e., N(A) = {0}, it holds that H = A^T A is positive definite and hence invertible. The normal equations then have the unique solution
x = (A^T A)^{-1} A^T b
Case 2: If A has linearly dependent columns, i.e., N(A) ≠ {0}, then it is natural to choose the solution of smallest length. From the figure on the previous slide it is clear that we should let x = A^T u, where A A^T u = Ax̄, and where x̄ is an arbitrary solution to the LSQ problem.
Our choice of x in Case 2 can equivalently be interpreted as the solution to the quadratic optimization problem
minimize (1/2) x^T x
s.t. Ax = Ax̄
According to Comment 2 the solution to this optimization problem is given by the solution to

[ I  −A^T ] [ x ]   [  0  ]
[ A   0   ] [ u ] = [ Ax̄ ]

i.e., x = A^T u, where A A^T u = Ax̄.
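A sketch of Case 2, assuming NumPy; the rank-deficient A and the data b are made up. It computes x̄ with lstsq, then the minimum-norm solution x = A^T u from A A^T u = Ax̄ (here A A^T is also singular, so lstsq is used for that consistent system as well).

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 1.0, 2.0]])      # column 3 = column 1 + column 2
b = np.array([1.0, 2.0, 1.0])

# x_bar: an arbitrary least-squares solution of min ||Ax - b||
x_bar, *_ = np.linalg.lstsq(A, b, rcond=None)

# minimum-norm solution: x = A^T u where A A^T u = A x_bar
u, *_ = np.linalg.lstsq(A @ A.T, A @ x_bar, rcond=None)
x = A.T @ u
print(x)                             # same as np.linalg.pinv(A) @ b
```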
Reading instructions
• Quadratic optimization without constraints: Chapters 9 and 27 in the book.
• Quadratic optimization with constraints: Chapter 10 in the book.
• Least-Squares method: Chapter 11 in the book.