A T-algebraic approach to primal-dual interior-point algorithms

CHEK BENG CHUA

Abstract. Three primal-dual interior-point algorithms for homogeneous cone programming are presented. They are a short-step algorithm, a large-update algorithm, and a predictor-corrector algorithm. These algorithms are described and analyzed based on a characterization of homogeneous cones via T-algebras. The analysis shows that the algorithms have polynomial iteration complexity.

Key words and phrases. Homogeneous cone programming; T-algebra; Primal-dual interior-point algorithm.

1. Introduction

Primal-dual interior-point algorithms, first designed for linear programming (see, e.g., [19]) and subsequently extended to semidefinite programming (see, e.g., [18, Part II]) and symmetric cone programming (see, e.g., [13]), are the most widely used interior-point algorithms. At the same time, they achieve the best iteration complexity bound known to date. This strongly motivates further development of primal-dual interior-point algorithms for wider classes of optimization problems.

Various studies that extend primal-dual algorithms beyond symmetric cone programming involve the v-space approach; i.e., the use of scaling points. Tuncel [15] showed that primal-dual symmetric algorithms that use the v-space approach can only be designed for symmetric cone programming. By dropping primal-dual symmetry, Tuncel [16] designed primal-dual algorithms based on the v-space approach. However, polynomial iteration complexity bounds were established only for symmetric cone programming.

This paper focuses on primal-dual interior-point algorithms for homogeneous cone programming.

Homogeneous cone programming, which will be formally defined in the next section, is a class of convex programming problems. It includes, as a proper sub-class, symmetric cone programming. While there is a finite number of non-isomorphic symmetric cones of each dimension, this number is uncountable for homogeneous cones when the dimension is at least eleven [17]. Thus the collection of homogeneous cones is significantly larger than the subclass of symmetric cones.

On the other hand, the author [4] proved that every homogeneous cone can be represented as the intersection of some cone of symmetric positive definite matrices with a suitable linear subspace. Thus all homogeneous cone programming problems can be formulated as semidefinite programming problems, which are special cases of symmetric cone programming problems. However, it is argued in the same paper that there are advantages in designing algorithms specifically for homogeneous cone programming, as the ranks of these cones (which determine the iteration complexity of interior-point algorithms [9]) are never more than the ranks of the representing positive definite cones.

The primal-dual algorithms in this paper rely on a characterization of homogeneous cones via T-algebras.

Algebraic structures of convex cones have been successfully exploited in the design and analysis of interior-point algorithms for convex conic programming. This is most notable for symmetric cone programming [1, 3, 8, 14], where Jordan algebraic structures of symmetric cones were used. The primal-dual algorithms and their analyses in this paper can be applied to symmetric cone programming as a special case of homogeneous cone programming. These algorithms differ from those in [1] in the choice of search direction. The search direction in this paper is a generalization of the S-Ch-MT search direction for semidefinite programming described in [11], which was motivated via theoretical complexity and computational trials. On the other hand, as a search direction for homogeneous cone programming, this generalization is a natural consequence of the T-algebraic structure of homogeneous cones.

This paper is organized as follows. Section 2 begins with a formal definition of homogeneous cone programming, which is followed by a brief introduction to T-algebras. The quadratic representations in Euclidean Jordan algebras are generalized to T-algebras. Section 2 then concludes with a discussion on dual homogeneous cones. Section 3 defines the primal-dual central path of a pair of primal-dual homogeneous cone programming problems that is based on optimal barriers for the problems, and establishes various technical results needed for the design and analysis of path-following algorithms. Section 4 presents a general framework for primal-dual path-following algorithms, and describes three specializations (a short-step algorithm, a large-update algorithm, and a predictor-corrector algorithm) with proofs of their respective iteration complexity.

2. Homogeneous Cone Programming

The main object of study in this paper is a class of convex programming problems known as homogeneous cone programming or convex conic programming over homogeneous cones. We begin with a description of the programming problems in this class.

Let E denote a Euclidean space with inner product $\langle\cdot,\cdot\rangle$.

Definition 1 (Homogeneous cone). A homogeneous cone is a pointed, open, convex cone $K \subset E$ whose automorphism group Aut(K) acts transitively on it; i.e., for every pair $(x, y) \in K \oplus K$, there exists a linear map $A : E \to E$ satisfying $A(K) = K$ and $A(x) = y$.

Henceforth, K denotes a homogeneous cone in E, $\hat{x}$ and $\hat{s}$ denote vectors in E, and L denotes a linear subspace of E.

We consider the following homogeneous cone programming problem

\[
\inf_{x}\ \langle \hat{s}, x\rangle \quad \text{subject to} \quad x \in L + \hat{x},\quad x \in \mathrm{cl}(K) \tag{HCP}
\]

and its Lagrangian dual problem

\[
\inf_{s}\ \langle \hat{x}, s\rangle \quad \text{subject to} \quad s \in L^{\perp} + \hat{s},\quad s \in \mathrm{cl}(K^{\sharp}) \tag{HCD}
\]

where $L^{\perp}$ denotes the orthogonal complement of L in E and $K^{\sharp}$ denotes the (open) dual cone $\{s \in E : \langle x, s\rangle > 0\ \ \forall x \in \mathrm{cl}(K)\setminus\{0\}\}$. Since K is pointed, the dual cone $K^{\sharp}$ is nonempty.

2.1. T-algebras. We begin with some basics of T-algebras; see [17] and [2, Chapter 2] for more details.

Definition 2 (Matrix algebra). A matrix algebra A is a bi-graded algebra $\bigoplus_{i,j=1}^{r} A_{ij}$ over the reals satisfying

\[
A_{ij}A_{kl} \subseteq
\begin{cases}
A_{il} & \text{if } j = k,\\
\{0\} & \text{if } j \neq k.
\end{cases}
\]

The positive integer r is called the rank of the matrix algebra. For each $a \in A$, we denote by $a_{ij}$ its component in $A_{ij}$.

Definition 3 (Involution). An involution $(\cdot)^*$ of a matrix algebra A of rank r is a linear automorphism on A that is

(1) involutory (i.e., $(a^*)^* = a$ for all $a \in A$),
(2) anti-homomorphic (i.e., $(ab)^* = b^*a^*$ for all $a, b \in A$),

and further satisfies

\[
A_{ij}^* = A_{ji} \quad (1 \le i, j \le r).
\]

Definition 4 (T-algebra). A T-algebra of rank r is a matrix algebra A of rank r with involution $(\cdot)^*$ satisfying the following seven axioms:

I. For each $i \in \{1, \dots, r\}$, the subalgebra $A_{ii}$ is isomorphic to the reals.
II. For each $a \in A$ and each $i, j \in \{1, \dots, r\}$,
\[ a_{ji}e_i = a_{ji} \quad\text{and}\quad e_i a_{ij} = a_{ij}, \]
where $e_i$ denotes the unit of $A_{ii}$.
III. For each $a, b \in A$ and each $i, j \in \{1, \dots, r\}$,
\[ a_{ij}b_{ji} = b_{ji}a_{ij}. \]
IV. For each $a, b, c \in A$ and each $i, j, k \in \{1, \dots, r\}$,
\[ a_{ij}(b_{jk}c_{ki}) = (a_{ij}b_{jk})c_{ki}. \]
V. For each $a \in A$ and each $i, j \in \{1, \dots, r\}$,
\[ a_{ij}^*a_{ij} \ge 0 \]
with equality when and only when $a_{ij} = 0$.
VI. For each $a, b, c \in A$ and each $i, j, k, l \in \{1, \dots, r\}$ with $i \le j \le k \le l$,
\[ a_{ij}(b_{jk}c_{kl}) = (a_{ij}b_{jk})c_{kl}. \]
VII. For each $a, b \in A$ and each $i, j, k, l \in \{1, \dots, r\}$ with $i \le j \le k$ and $l \le k$,
\[ a_{ij}(b_{jk}b_{lk}^*) = (a_{ij}b_{jk})b_{lk}^*. \]

Definition 5 (Inessential change of grading). A change in the grading of a T-algebra A is the replacement of each subspace $A_{ij}$ by $A_{\pi(i),\pi(j)}$, where $\pi$ is a permutation of $\{1, \dots, r\}$. An inessential change in the grading of a T-algebra A is one that preserves the subspace of upper triangular (or equivalently, lower triangular) elements. In other words, the permutation $\pi$ in an inessential change satisfies

\[
(i < j) \wedge (\pi(i) > \pi(j)) \implies A_{ij} = \{0\}.
\]

Definition 6 (Isomorphic T-algebras). A T-algebra A of rank r is said to be isomorphic to another T-algebra of the same rank if, after an inessential change of grading of A, there is an isomorphism of the bi-graded algebras with involution.

Definition 7 (Associated cone). The cone associated with a T-algebra A of rank r is

\[
\{tt^* : t \in A,\ t_{ij} = 0\ \ \forall\, 1 \le j < i \le r,\ t_{ii} > 0\ \ \forall\, 1 \le i \le r\}.
\]

It is denoted by K(A).¹
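To make Definition 7 concrete, the following sketch assumes the familiar special case in which the T-algebra consists of real r × r matrices and the involution is the transpose, so that the associated cone is the cone of symmetric positive definite matrices. The helper names (`matmul`, `transpose`, `associated_cone_element`, `is_positive_definite`) are illustrative and do not come from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose, which plays the role of the involution here."""
    return [list(col) for col in zip(*a)]


def associated_cone_element(t):
    """Return t t^T after checking t is upper triangular with positive diagonal."""
    r = len(t)
    for i in range(r):
        if t[i][i] <= 0:
            raise ValueError("diagonal entries must be positive")
        if any(t[i][j] != 0 for j in range(i)):
            raise ValueError("t must be upper triangular")
    return matmul(t, transpose(t))


def is_positive_definite(x):
    """Test a symmetric matrix by attempting a (lower) Cholesky factorization."""
    r = len(x)
    low = [[0.0] * r for _ in range(r)]
    for i in range(r):
        for j in range(i + 1):
            partial = x[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if partial <= 0:
                    return False
                low[i][i] = partial ** 0.5
            else:
                low[i][j] = partial / low[j][j]
    return True


t = [[2, 1], [0, 3]]
x = associated_cone_element(t)  # the matrix t t^T
```

In this matrix setting every element produced by `associated_cone_element` passes `is_positive_definite`, which is the sense in which K(A) reduces to the positive definite cone.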

Example 1. Every (n + 2)-dimensional T-algebra of rank 2 is isomorphic to the T-algebra $A = A_{11} \oplus A_{12} \oplus A_{21} \oplus A_{22} = \mathbb{R} \oplus \mathbb{R}^n \oplus \mathbb{R}^n \oplus \mathbb{R}$ defined by

\[
(ab)_{ii} = a_{ij}^{T} b_{ji} \quad (i, j \in \{1, 2\},\ i \neq j).
\]

By Corollary IV.1.5 of [7], the subset of elements of A satisfying $a^* = a$, when equipped with the Jordan product

\[
(a, b) \mapsto \frac{ab + ba}{2},
\]

is a Euclidean Jordan algebra of rank 2.

Example 2. Let H be a Euclidean Hurwitz algebra (i.e., the algebra of real numbers, complex numbers, quaternions or octonions). We shall use $\Re(x)$ and $\bar{x}$ to denote, respectively, the real part and the conjugate of $x \in H$.

Let $A_{ij} = H$ for each $i, j \in \{1, \dots, r\}$ with $i \neq j$, $A_{ii} = \mathbb{R}$ for $i \in \{1, \dots, r\}$, and define the matrix algebra $A = \bigoplus_{i,j=1}^{r} A_{ij}$ by

\[
(ab)_{ij} =
\begin{cases}
\sum_{k=1}^{r} \Re(a_{ik}b_{kj}) & \text{if } i = j,\\
\sum_{k=1}^{r} a_{ik}b_{kj} & \text{if } i \neq j,
\end{cases}
\]

so that it satisfies Axioms I and II in the definition of a T-algebra. It is straightforward to check that the unary operator $(\cdot)^* : A \to A$ defined by

\[
(a^*)_{ij} = \overline{a_{ji}}
\]

is an involution. Moreover, Proposition V.1.2 of [7] shows that Axioms III–V are also satisfied.

For r = 3, Axiom VI holds since at least one of $a_{ij}$, $b_{jk}$ and $c_{kl}$ is a real number. Finally, Axiom VII holds since Euclidean Hurwitz algebras are alternative (i.e., $x(xy) = (xx)y$ and $(yx)x = y(xx)$, or equivalently, the sub-algebra generated by any two elements is associative), and both $x$ and $\bar{x}$ are in the sub-algebra generated by $x - \Re(x)$ for each $x \in H$. Hence A is a T-algebra. By Theorem V.3.7 of [7], the subset of elements of A satisfying $a^* = a$, when equipped with the Jordan product, is a Euclidean Jordan algebra of rank 3.

For r > 3, suppose that H is associative. Then A is clearly a T-algebra. As before, by Theorem V.3.7 of [7], the subset of elements of A satisfying $a^* = a$, when equipped with the Jordan product, is a Euclidean Jordan algebra of rank r.

This example, together with the preceding example, covers all simple Euclidean Jordan algebras (see [7, Chapter V]).

¹ When we write $tt^* \in K(A)$, we always mean that t satisfies the conditions in the definition of K(A). We shall see later that such t is uniquely determined by the product $tt^*$.


Theorem 1 (Characterization of homogeneous cones). A cone is homogeneous if and only if it is linearly isomorphic to the cone associated with some T-algebra. Moreover, the T-algebra is uniquely determined, up to isomorphism, by the homogeneous cone.

Proof. See Proposition 1 and Theorem 4 of [17]. □

Henceforth,

(1) A shall denote a T-algebra of rank r with involution $(\cdot)^*$ such that K(A) = K;
(2) $e_i$ shall denote the unit of the subalgebra $A_{ii}$ and $\rho_i$ shall denote the linear function on A satisfying
\[ a_{ii} = \rho_i(a)e_i \quad (a \in A) \]
for each $i \in \{1, \dots, r\}$;
(3) e shall denote the element in A satisfying
\[ e_{ii} = e_i\ \ (1 \le i \le r) \quad\text{and}\quad e_{ij} = e_{ji} = 0\ \ (1 \le i < j \le r); \]
(4) tr(·) shall denote the linear function
\[ a \in A \mapsto \sum_{i=1}^{r} \rho_i(a); \]
(5) $\langle\cdot,\cdot\rangle$ shall denote the bilinear function
\[ (a, b) \in A \oplus A \mapsto \mathrm{tr}(a^*b); \]
(6) T shall denote the subalgebra
\[ \{a \in A : a_{ij} = 0\ \ (1 \le j < i \le r)\} \]
of upper triangular elements (so that $K = K(A) = \{tt^* : t \in T,\ \rho_i(t) > 0\ \forall i\}$); and
(7) H shall denote the subspace
\[ \{a \in A : a_{ij} = a_{ji}^*\ \ (1 \le j < i \le r)\} \]
of "hermitian" elements (so that the linear span of K is H).

Remark 1. For each $i \in \{1, \dots, r\}$, since $(\cdot)^*$ is an involutory linear automorphism on $A_{ii}$, which is isomorphic to the reals, it must be the identity map. Thus the linear function tr(·) is invariant under involution. Consequently $\langle\cdot,\cdot\rangle$ is symmetric.

Remark 2. Axiom II is equivalent to e being the unit of the T-algebra A.

Remark 3. Axiom III is equivalent to $\mathrm{tr}(ab) = \mathrm{tr}(ba)$ for all $a, b \in A$. This is further equivalent to $\langle a, b\rangle = \langle a^*, b^*\rangle$.

Remark 4. Axiom V is equivalent to $\mathrm{tr}(aa^*) \ge 0$ for all $a \in A$ with equality when and only when a = 0. This is further equivalent to $\langle\cdot,\cdot\rangle$ being positive definite. Thus $\langle\cdot,\cdot\rangle$ is an inner product of A. We shall denote by $\|\cdot\|$ the induced Euclidean norm $a \mapsto \sqrt{\langle a, a\rangle}$. We shall view K as a cone in the Euclidean space $(H, \langle\cdot,\cdot\rangle)$.

Remark 5. Axiom IV is equivalent to $\mathrm{tr}(a(bc)) = \mathrm{tr}((ab)c)$ for all $a, b, c \in A$. This is further equivalent to $\langle ab, c\rangle = \langle b, a^*c\rangle$ for all $a, b, c \in A$. Together with Axiom V, we have that the adjoint of the linear map $b \in A \mapsto ab$ under the inner product $\langle\cdot,\cdot\rangle$ is the linear map $b \in A \mapsto a^*b$ for each $a \in A$. Together with Remark 3, we deduce

\[
\langle ab, c\rangle = \langle b^*a^*, c^*\rangle = \langle a^*, bc^*\rangle = \langle a, cb^*\rangle \quad (a, b, c \in A);
\]

i.e., the adjoint of the linear map $b \in A \mapsto ba$ under the inner product $\langle\cdot,\cdot\rangle$ is the linear map $b \in A \mapsto ba^*$ for each $a \in A$.

Remark 6. Axiom VI is equivalent to $t(uw) = (tu)w$ for all $t, u, w \in T$. Taking involution gives the equivalent statement: $t(uw) = (tu)w$ for all $t, u, w \in T^*$.

Remark 7. Axiom VII is equivalent to $t(uu^*) = (tu)u^*$ for all $t, u \in T$. Taking involution gives the equivalent statement: $(u^*u)t = u^*(ut)$ for all $t, u \in T^*$. Polarization gives the equivalent statements:

(1) $t(uw^*) + t(wu^*) = (tu)w^* + (tw)u^*$ for all $t, u, w \in T$, and
(2) $(u^*w)t + (w^*u)t = u^*(wt) + w^*(ut)$ for all $t, u, w \in T^*$.

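Theorem 2 (Sub-multiplicativity of Euclidean norm; cf. Proposition 2.10 of [2]). The norm $\|\cdot\|$ is sub-multiplicative.

Proof. We first show that for all $a, b \in A$,

\[
\|a_{ij}b_{jk}\| \le \|a_{ij}\|\,\|b_{jk}\| \quad \forall\, 1 \le i, j, k \le r. \tag{2.1}
\]

Since the norm is invariant under involution, it suffices to prove the above inequality for $i \le k$. Thus we consider three cases.

Case 1: $1 \le i \le j \le k \le r$. In this case, equality holds as

\[
\begin{aligned}
\|a_{ij}b_{jk}\|^2 &= \langle a_{ij}b_{jk},\, a_{ij}b_{jk}\rangle \\
&= \langle a_{ij},\, (a_{ij}b_{jk})b_{jk}^*\rangle && \text{(Remark 5)} \\
&= \langle a_{ij},\, a_{ij}(b_{jk}b_{jk}^*)\rangle && \text{(Axiom VII)} \\
&= \langle a_{ij}^*a_{ij},\, b_{jk}b_{jk}^*\rangle && \text{(Axiom IV)} \\
&= \rho_j(a_{ij}^*a_{ij})\,\rho_j(b_{jk}b_{jk}^*) \\
&= \|a_{ij}\|^2\,\|b_{jk}\|^2 .
\end{aligned}
\]

Case 2: $1 \le i \le k < j \le r$. Using the Cauchy-Schwarz inequality and the first case, we deduce

\[
\begin{aligned}
\|a_{ij}b_{jk}\|^2 &= \langle a_{ij},\, (a_{ij}b_{jk})b_{jk}^*\rangle && \text{(Remark 5)} \\
&\le \|a_{ij}\|\,\|(a_{ij}b_{jk})b_{jk}^*\| \\
&= \|a_{ij}\|\,\|a_{ij}b_{jk}\|\,\|b_{jk}^*\| ,
\end{aligned}
\]

which shows that $\|a_{ij}b_{jk}\| \le \|a_{ij}\|\,\|b_{jk}^*\| = \|a_{ij}\|\,\|b_{jk}\|$.

Case 3: $1 \le j < i \le k \le r$. Once again, we use the Cauchy-Schwarz inequality and the first case to deduce

\[
\begin{aligned}
\|a_{ij}b_{jk}\|^2 &= \langle a_{ij}^*(a_{ij}b_{jk}),\, b_{jk}\rangle && \text{(Axiom IV)} \\
&\le \|a_{ij}^*(a_{ij}b_{jk})\|\,\|b_{jk}\| \\
&= \|a_{ij}^*\|\,\|a_{ij}b_{jk}\|\,\|b_{jk}\| .
\end{aligned}
\]

By the triangle inequality, we have

\[
\|ab\|^2 = \sum_{i,j=1}^{r} \Bigl\| \sum_{k=1}^{r} a_{ik}b_{kj} \Bigr\|^2 \le \sum_{i,j=1}^{r} \Bigl( \sum_{k=1}^{r} \|a_{ik}b_{kj}\| \Bigr)^2 .
\]

Using (2.1) and Cauchy's inequality, respectively, we further bound

\[
\begin{aligned}
\sum_{i,j=1}^{r} \Bigl( \sum_{k=1}^{r} \|a_{ik}b_{kj}\| \Bigr)^2
&\le \sum_{i,j=1}^{r} \Bigl( \sum_{k=1}^{r} \|a_{ik}\|\,\|b_{kj}\| \Bigr)^2 \\
&\le \sum_{i,j=1}^{r} \Bigl( \sum_{k=1}^{r} \|a_{ik}\|^2 \sum_{k=1}^{r} \|b_{kj}\|^2 \Bigr) \\
&= \|a\|^2\,\|b\|^2 ,
\end{aligned}
\]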

hence proving the theorem. □

Let $T_\star$ (resp., $T_+$ and $T_{++}$) denote the set of elements of T with nonzero (resp., nonnegative and positive) diagonal components.

The homogeneous cone K can be described as $\{tt^* : t \in T_{++}\}$. The map $t \mapsto tt^*$ is continuous, hence the closure cl(K) of the cone K contains the set $\{tt^* : t \in \mathrm{cl}(T_{++}) = T_+\}$. Conversely every $x \in \mathrm{cl}(K)$ has a sequence $\{t_k\} \subset T_{++}$ with $t_kt_k^* \to x$. The sequence $\{t_k\}$ is bounded since $\|t_k\|^2 = \mathrm{tr}(t_kt_k^*)$ is bounded, hence it has a converging subsequence with limit $t \in T_+$ satisfying $tt^* = x$. Consequently

\[
\mathrm{cl}(K) = \{tt^* : t \in T_+\}.
\]

Proposition 1. The sets $T_\star$, $T_{++}$, $T^*_\star$ and $T^*_{++}$ are multiplicative groups.

Proof. We shall prove that $T_\star$ and $T_{++}$ are multiplicative groups. A similar result then holds for $T^*_\star$ and $T^*_{++}$ since involution is anti-homomorphic and $e^* = e$. Products of upper triangular elements are upper triangular with diagonal components the products of the respective diagonal components. Let $t \in T_\star$ be arbitrary. Let u be the upper triangular element whose components are defined recursively by

\[
u_{i,i+j} =
\begin{cases}
\dfrac{1}{t_{ii}} & \text{for } j = 0,\\[2ex]
-\dfrac{1}{t_{i+j,i+j}} \displaystyle\sum_{k=0}^{j-1} u_{i,i+k}\,t_{i+k,i+j} & \text{for } j = 1, \dots, r - i.
\end{cases}
\]

It is straightforward to check that $u \in T_\star$ and $ut = tu = e$. Moreover if $t \in T_{++}$, then $u \in T_{++}$. Axiom VI states that multiplication of upper triangular elements is associative. □

We shall denote by $t^{-1}$ the multiplicative inverse of each element $t \in T_\star$. Similarly for elements $l \in T^*_\star$.

Involution is anti-homomorphic, hence the inverse of the involution of an element $t \in T_\star$ is the involution of its inverse, which we shall denote by $t^{-*}$. Similarly for elements $l \in T^*_\star$.
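The recursion in the proof of Proposition 1 can be tried out directly in the special case of real upper triangular matrices (an assumption made for this sketch; the paper works in a general T-algebra). Indices are shifted to start at 0, exact rational arithmetic is used so that the identity ut = tu = e can be checked without rounding, and the function names are illustrative.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def upper_triangular_inverse(t):
    """Invert an upper triangular matrix with nonzero diagonal, row by row.

    Row i is filled from the diagonal outwards:
      u[i][i]   = 1 / t[i][i]
      u[i][i+j] = -(1 / t[i+j][i+j]) * sum_{k<j} u[i][i+k] * t[i+k][i+j]
    """
    r = len(t)
    u = [[Fraction(0)] * r for _ in range(r)]
    for i in range(r):
        u[i][i] = 1 / Fraction(t[i][i])
        for j in range(1, r - i):
            acc = sum(u[i][i + k] * t[i + k][i + j] for k in range(j))
            u[i][i + j] = -acc / t[i + j][i + j]
    return u


t = [[2, 1, 4], [0, 3, 5], [0, 0, 6]]
u = upper_triangular_inverse(t)
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
```

When t has positive diagonal entries, so does u, which mirrors the statement that $T_{++}$ is closed under inversion.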

2.2. Quadratic representations.

Definition 8 (Quadratic representation). For each $a \in A$, the quadratic representation of a is the linear operator

\[
b \in H \mapsto \tfrac{1}{2}\bigl(a(ba^*) + a(ab) - (aa)b + (ab)a^* + (ba^*)a^* - b(a^*a^*)\bigr).
\]

It is denoted by $Q_a$.
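As a sanity check on Definition 8, the sketch below evaluates the six-term formula in the associative special case of real square matrices with the transpose as involution (an assumption of this example). There the terms a(ab) and (aa)b cancel, as do (ba*)a* and b(a*a*), so the operator should reduce to b ↦ a b aᵀ. Function names are illustrative.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Transpose, standing in for the involution."""
    return [list(col) for col in zip(*a)]


def quadratic_representation(a, b):
    """Evaluate the six-term formula of Definition 8 for integer matrices."""
    a_star = transpose(a)
    signed_terms = [
        (1, matmul(a, matmul(b, a_star))),        # a(ba*)
        (1, matmul(a, matmul(a, b))),             # a(ab)
        (-1, matmul(matmul(a, a), b)),            # (aa)b
        (1, matmul(matmul(a, b), a_star)),        # (ab)a*
        (1, matmul(matmul(b, a_star), a_star)),   # (ba*)a*
        (-1, matmul(b, matmul(a_star, a_star))),  # b(a*a*)
    ]
    n = len(b)
    return [[Fraction(sum(sign * term[i][j] for sign, term in signed_terms), 2)
             for j in range(n)] for i in range(n)]


a = [[1, 2], [3, 4]]
b = [[2, 1], [1, 3]]  # a symmetric ("hermitian") element
q = quadratic_representation(a, b)
expected = matmul(a, matmul(b, transpose(a)))
```

In a non-associative T-algebra the bracketing matters, which is why the definition keeps all six terms.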

Remark 8. Let J be a Jordan algebra with binary operator $\circ$. The quadratic representation (see [7, Chapter II, §3]) of an element $x \in J$ is defined to be the map

\[
y \in J \mapsto 2x \circ (x \circ y) - (x \circ x) \circ y.
\]

Recall Examples 1 and 2, where we constructed T-algebras A whose spaces of "hermitian" matrices are simple Euclidean Jordan algebras J when equipped with the Jordan product

\[
(a, b) \mapsto \frac{ab + ba}{2}.
\]

Not surprisingly, the quadratic representations of "hermitian" elements of A coincide with their respective quadratic representations as elements of J.

Proposition 2. For each $t \in T$, the map $Q_t$ simplifies to

\[
a \mapsto t(\langle\langle a\rangle\rangle\, t^*) + (t\, \langle\langle a\rangle\rangle^*)t^*.
\]

If, in addition, $a = uu^*$ for some $u \in T$, then

\[
Q_t(uu^*) = (tu)(tu)^*.
\]

Proof. Let $t, u \in T$ and $a \in H$ be arbitrary. Let w denote the upper triangular element $\langle\langle a\rangle\rangle^*$. The first equation follows from

\[
\begin{aligned}
t(at^*) + t(ta) - (tt)a
&= t(wt^*) + t(w^*t^*) + t(tw) + t(tw^*) - (tt)w - (tt)w^* \\
&= t(wt^*) + t(w^*t^*) + t(tw^*) - (tt)w^* && \text{(Axiom VI)} \\
&= (tw)t^* + t(w^*t^*) && \text{(Remark 7)}
\end{aligned}
\]

while

\[
\begin{aligned}
t((uu^*)t^*) + t(t(uu^*)) - (tt)(uu^*)
&= t(u(u^*t^*)) + t((tu)u^*) - ((tt)u)u^* && \text{(Axiom VII)} \\
&= (tu)(tu)^* + (t(tu))u^* - ((tt)u)u^* && \text{(Remark 7)} \\
&= (tu)(tu)^* && \text{(Axiom VI)}
\end{aligned}
\]

proves the second. □

Corollary 1. The map $t \mapsto Q_t$ is a group homomorphism on $T_{++}$.

Proof. Let $t, u \in T$ be arbitrary upper triangular elements with positive diagonal. By the preceding proposition, for any $w \in T$,

\[
\begin{aligned}
Q_t(Q_u(ww^*)) &= Q_t((uw)(uw)^*) = (t(uw))(t(uw))^* \\
&= ((tu)w)((tu)w)^* && \text{(Axiom VI)} \\
&= Q_{tu}(ww^*)
\end{aligned}
\]

shows that $Q_tQ_u = Q_{tu}$ on K. Since $Q_t$ is linear for each t, and the linear span of K is the space H, the corollary follows. □

The author [2] proved that the map $t \mapsto tt^* = Q_t(e)$ is a bijection between $T_{++}$ and K(A); see also Proposition 2 of [17, Chapter III]. The following proposition shows that it is in fact a diffeomorphism.


Proposition 3. For each $u \in T_{++}$, the map

\[
t \in T_{++} \mapsto Q_t(uu^*)
\]

is a diffeomorphism. Moreover its derivative at $t \in T_{++}$ is

\[
w \in T \mapsto (tu)(wu)^* + (wu)(tu)^*
\]

and the derivative of its inverse map at $Q_t(uu^*) \in K$ is

\[
a \in H \mapsto tu\, \langle\langle Q_{(tu)^{-1}}(a)\rangle\rangle^*\, u^{-1}.
\]

Proof. By Proposition 2, the derivative of $t \in T_{++} \mapsto Q_t(uu^*)$ is the map

\[
w \in T \mapsto (tu)(wu)^* + (wu)(tu)^*.
\]

By Axiom VI and Proposition 2,

\[
\begin{aligned}
(tu)(wu)^* + (wu)(tu)^* &= (tu)\bigl(((wu)^*(tu)^{-*})(tu)^*\bigr) + \bigl((tu)((tu)^{-1}(wu))\bigr)(tu)^* \\
&= Q_{tu}\bigl(((tu)^{-1}(wu))_H\bigr),
\end{aligned}
\]

hence by Corollary 1 the derivative has inverse

\[
a \in H \mapsto tu\, \langle\langle Q_{(tu)^{-1}}(a)\rangle\rangle^*\, u^{-1}.
\]

The proposition follows from the Inverse Function Theorem. □

An alternative description of the cone K(A) based on quadratic representations is

\[
K(A) = \{Q_t(e) : t \in T_{++}\}.
\]

Corollary 1 shows that we may replace e in the above description with any $x \in K(A)$, since

\[
Q_t(uu^*) = Q_t(Q_u(e)) = Q_{tu}(e)
\]

for any $t, u \in T_{++}$.

The linear map $Q_t$ and its inverse $Q_{t^{-1}}$ take the cone K into itself, hence $Q_t$ is an automorphism of the cone K. Moreover $Q_{ut^{-1}}$ maps $tt^*$ to $uu^*$ for any $t, u \in T_{++}$. This gives a quick proof that $K = \{tt^* : t \in T_{++}\}$ is a homogeneous cone. Furthermore $\{Q_t : t \in T_{++}\}$ acts transitively on K.²

2.3. Dual homogeneous cones. Using Axiom IV and Remark 5, it is straightforward to check that the adjoint map $Q_a^H$ of the quadratic representation of an element $a \in A$ is the quadratic representation $Q_{a^*}$ of its involution.

In general, if A is an automorphism of an open cone C, then $A^H$ is an automorphism of $C^\sharp$. Moreover if G is a group of transitive automorphisms of C, then the group $\{A^H : A \in G\}$ also acts transitively on $C^\sharp$; see Proposition 9 of [17]. Recall that $\{Q_t : t \in T_{++}\}$ is a transitive subgroup of automorphisms of K. Hence the dual cone $K^\sharp$ is

\[
\{Q_{t^*}(x) : t \in T_{++}\} = \{Q_l(x) : l \in T^*_{++}\},
\]

where $x \in K^\sharp$ is arbitrary. The identity e is a member of $K^\sharp$ as

\[
\langle e, tt^*\rangle = \|t\|^2 > 0
\]

for all $t \in T_+ \setminus \{0\}$. We have thus proved the following proposition.

Proposition 4. The dual cone $K^\sharp$ of K is the homogeneous cone $\{ll^* : l \in T^*_{++}\}$. The group of automorphisms $\{Q_t : t \in T^*_{++}\}$ acts transitively on $K^\sharp$.

² It is in fact a simply transitive subgroup; i.e., for every $t, u \in T_{++}$, there exists a unique $Q_w$ with $w \in T_{++}$ that maps $tt^*$ to $uu^*$.

We now turn our attention to identifying the T-algebra B with $K(B) = K^\sharp$.

It is easy to see that a matrix algebra B with involution satisfying $K(B) = K^\sharp$ can be obtained from A by the change in grading

\[
B_{ij} = A_{r+1-i,\,r+1-j} \quad (1 \le i, j \le r).
\]

It was pointed out by Vinberg [17, Chapter III, §6] that the matrix algebra B is also a T-algebra. Indeed Axioms I–V translate directly from A to B, Axiom VI for B is equivalent to that for A after taking involution, and Axiom VII for B is proved in the following proposition.³

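Proposition 5. For each $t, u \in T$,

\[
t^*(u^*u) = (t^*u^*)u.
\]

Equivalently, taking involution and/or polarization gives

(1) $(uu^*)t^* = u(u^*t^*)$ for all $t, u \in T^*$,
(2) $t^*(u^*w) + t^*(w^*u) = (t^*u^*)w + (t^*w^*)u$ for all $t, u, w \in T$, and
(3) $(uw^*)t^* + (wu^*)t^* = u(w^*t^*) + w(u^*t^*)$ for all $t, u, w \in T^*$.

Proof. Let $t, u \in T$ be arbitrary. Let a denote the difference $t^*(u^*u) - (t^*u^*)u$. We shall prove the proposition by showing that $\|a\| = 0$, whence a = 0 by Axiom V. We evaluate $\|a\|^2$ as the sum of $\langle w, a\rangle$ and $\langle w^*, a\rangle$, where w denotes the upper triangular element $\langle\langle a\rangle\rangle^*$. Now

\[
\begin{aligned}
\langle w, t^*(u^*u)\rangle &= \langle u(tw), u\rangle && \text{(Axiom IV)} \\
&= \langle (ut)w, u\rangle && \text{(Axiom VI)} \\
&= \langle w, (t^*u^*)u\rangle && \text{(Axiom IV)}
\end{aligned}
\]

implies $\langle w, a\rangle = 0$. Also

\[
\begin{aligned}
\langle u, u(wt^*)\rangle &= \langle w^*(u^*u), t^*\rangle && \text{(Axiom IV)} \\
&= \langle w^*, t^*(u^*u)\rangle && \text{(Remark 5)} \\
&= \langle u(tw^*), u\rangle && \text{(Axiom IV)}
\end{aligned}
\]

and

\[
\begin{aligned}
\langle u, (uw)t^*\rangle &= \langle (w^*u^*)u, t^*\rangle && \text{(Axiom IV)} \\
&= \langle w^*, (t^*u^*)u\rangle && \text{(Remark 5)} \\
&= \langle (ut)w^*, u\rangle , && \text{(Axiom IV)}
\end{aligned}
\]

whence

\[
\begin{aligned}
2\,\langle w^*, t^*(u^*u)\rangle &= \langle u(wt^*) + u(tw^*), u\rangle \\
&= \langle (uw)t^* + (ut)w^*, u\rangle && \text{(Remark 7)} \\
&= 2\,\langle w^*, (t^*u^*)u\rangle
\end{aligned}
\]

shows that $\langle w^*, a\rangle = 0$. □

The following propositions are the counterparts of Propositions 2 and 3 for B, stated in terms of the quadratic representations in A instead of those in B.

³ It was noted by Vinberg that by observing that the subspace of upper triangular elements of B is precisely $T^*$, both Axioms VI and VII for B follow from those for A by taking involution. While this is certainly true for Axiom VI, it does not hold for Axiom VII.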


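Proposition 6. For each $l \in T^*$, the map $Q_l$ simplifies to

\[
a \mapsto l(\langle\langle a\rangle\rangle^*\, l^*) + (l\, \langle\langle a\rangle\rangle)l^*.
\]

If, in addition, $a = kk^*$ for some $k \in T^*$, then

\[
Q_l(kk^*) = (lk)(lk)^*.
\]

Proposition 7. For each $k \in T^*_{++}$, the map

\[
l \in T^*_{++} \mapsto Q_l(kk^*)
\]

is a diffeomorphism. Moreover its derivative at $l \in T^*_{++}$ is

\[
m \in T^* \mapsto (lk)(mk)^* + (mk)(lk)^*
\]

and the derivative of its inverse map at $Q_l(kk^*) \in K^\sharp$ is

\[
a \in H \mapsto lk\, \langle\langle Q_{(lk)^{-1}}(a)\rangle\rangle\, k^{-1}.
\]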

3. Primal-Dual Central Paths

In this section, we define a primal-dual central path for the pair of primal-dual homogeneous cone programming problems (HCP) and (HCD) based on a pair of optimal logarithmically homogeneous self-concordant barriers.

3.1. Optimal barriers. We begin by recalling the definition of ϑ-logarithmically homogeneous self-concordant barriers.

Definition 9 (Self-concordant barrier). A function $f : C \to \mathbb{R}$ on an open convex cone C is said to be a (non-degenerate, strongly) ϑ-self-concordant barrier (for C) if it is a strictly convex, three-times continuously differentiable function that diverges to infinity as its argument approaches a point on the boundary ∂C of C, and satisfies

\[
f'''(x; h) \le 2\,(f''(x; h))^{3/2}
\]

and

\[
f'(x; h) \le (\vartheta f''(x; h))^{1/2}
\]

for all $x \in C$ and $h \in E$. Here, $f^{(k)}(x; h)$ denotes the k-th directional derivative of f along h (i.e., $\frac{d^k}{dt^k} f(x + th)\big|_{t=0}$). The parameter ϑ is called the complexity parameter.

Definition 10 (Logarithmically homogeneous function). A function $f : C \to \mathbb{R}$ on an open convex cone C is said to be ϑ-logarithmically homogeneous if for every $t \in \mathbb{R}_{++}$ and every $x \in C$, it holds that

\[
f(tx) = f(x) - \vartheta \log t.
\]

When a self-concordant barrier on C is ϑ-logarithmically homogeneous, it has complexity parameter ϑ. We thus call such a function a ϑ-logarithmically homogeneous self-concordant barrier on C. Every open convex cone C admits logarithmically homogeneous self-concordant barriers; see [12, Section 2.5]. Among all logarithmically homogeneous self-concordant barriers for C, a barrier with the least complexity parameter is called an optimal barrier.

Optimal barriers for a homogeneous cone K of rank r have complexity parameter r. Moreover the function

\[
f : x \in K \mapsto -\sum_{i=1}^{r} \log(\rho_i(u_x)^2) \tag{3.1}
\]

is an r-logarithmically homogeneous self-concordant barrier for K, where $u_x$ denotes the unique element in $T_{++}$ satisfying $u_xu_x^* = x$; see [9]. To the best of the author's knowledge, this is the only known optimal barrier for the homogeneous cone. This barrier is a member of the family of logarithmically homogeneous self-concordant barriers

\[
\Bigl\{ x \in K \mapsto -\sum_{i=1}^{r} w_i \log(\rho_i(u_x)^2) : w_i \ge 1 \Bigr\};
\]

see [2, 5] for more details.

By writing the barrier f as the composition of $x \in K \mapsto u_x$ and

\[
t \in T_{++} \mapsto -\sum_{i=1}^{r} \log(\rho_i(t)^2),
\]

it is straightforward to compute the gradient of f using Proposition 3. This gives the gradient at each $x \in K$ as

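\[
\nabla f(x) = -u_x^{-*}u_x^{-1},
\]

so that $\langle \nabla f(x), x\rangle = -r$, as logarithmic homogeneity requires.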

A simple modification of the Fenchel conjugate function of f gives the function

\[
f^\sharp : s \in K^\sharp \mapsto -\sum_{i=1}^{r} \log(\rho_i(l_s)^2)
\]

on the dual cone, where $l_s$ denotes the unique element in $T^*_{++}$ satisfying $l_sl_s^* = s$. This is in fact the optimal barrier given by (3.1) if we replace K with the homogeneous cone $K^\sharp$. The gradient of this barrier at each $s \in K^\sharp$ is

\[
\nabla f^\sharp(s) = -l_s^{-*}l_s^{-1}.
\]
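For intuition, the barrier (3.1) can be evaluated in the special case of real symmetric positive definite matrices (an assumption of this sketch), where $u_x$ is the upper triangular factor with $u_xu_x^T = x$ and the barrier equals −log det x. The routine `upper_factor` below is an illustrative "reverse Cholesky" written for this example; it is not taken from the paper.

```python
import math


def upper_factor(x):
    """Return upper triangular u with positive diagonal such that u u^T = x.

    Columns are processed from last to first, using
    x[i][j] = sum_{k >= j} u[i][k] * u[j][k] for i <= j.
    """
    r = len(x)
    u = [[0.0] * r for _ in range(r)]
    for j in range(r - 1, -1, -1):
        u[j][j] = math.sqrt(x[j][j] - sum(u[j][k] ** 2 for k in range(j + 1, r)))
        for i in range(j):
            tail = sum(u[i][k] * u[j][k] for k in range(j + 1, r))
            u[i][j] = (x[i][j] - tail) / u[j][j]
    return u


def barrier(x):
    """Evaluate f(x) = -sum_i log(rho_i(u_x)^2) for a positive definite matrix."""
    u = upper_factor(x)
    return -sum(math.log(u[i][i] ** 2) for i in range(len(x)))


x = [[5.0, 3.0], [3.0, 9.0]]
fx = barrier(x)
# Scaling x by 2 should lower the barrier by r * log(2) with r = 2.
f2x = barrier([[2 * v for v in row] for row in x])
```

Here `fx` agrees with −log det x = −log 36, and the pair `fx`, `f2x` illustrates the logarithmic homogeneity of Definition 10 with ϑ = r.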

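3.2. The central path. The central path of the primal-dual pair of homogeneous cone programming problems (HCP) and (HCD) is the set $\{(x(\mu), s(\mu)) : \mu > 0\}$ of pairs of solutions to the primal and dual barrier problems

\[
\inf_{x} \Bigl\{ \langle \hat{s}, x\rangle - \mu \sum_{i=1}^{r} \log(\rho_i(u_x)^2) : x \in L + \hat{x} \Bigr\}
\]

and

\[
\inf_{s} \Bigl\{ \langle \hat{x}, s\rangle - \mu \sum_{i=1}^{r} \log(\rho_i(l_s)^2) : s \in L^\perp + \hat{s} \Bigr\},
\]

respectively. We deduce from the optimality conditions for the barrier problems that the pair $(x(\mu), s(\mu))$ solves the central path equations

\[
\begin{aligned}
& x \in L + \hat{x}, \quad x \in K, \\
& s \in L^\perp + \hat{s}, \quad s \in K^\sharp, \\
& Q_{l_s^*}(x) = \mu e.
\end{aligned}
\tag{CP$_\mu$}
\]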


3.3. Tracing the central path. We now consider the problem of estimating the pair $(x(\mu_{++}), s(\mu_{++}))$ that solves the central path equations

\[
\begin{aligned}
& x(\mu_{++}) \in L + \hat{x}, \quad x(\mu_{++}) \in K, \\
& s(\mu_{++}) \in L^\perp + \hat{s}, \quad s(\mu_{++}) \in K^\sharp, \\
& Q_{l^*_{s(\mu_{++})}}(x(\mu_{++})) = \mu_{++}e,
\end{aligned}
\tag{CP$_{\mu_{++}}$}
\]

where $\mu_{++} \in \mathbb{R}_+$ is given. Note that we allow $\mu_{++}$ to be zero, in which case we are solving for optimal solutions. Given a primal-dual strictly feasible pair, we measure its proximity to the pair $(x(\mu), s(\mu))$, where $\mu \in \mathbb{R}_{++}$, with the proximity measure $d : K \oplus K^\sharp \oplus \mathbb{R}_{++} \to \mathbb{R}_+$ defined by

\[
(x, s; \mu) \mapsto \frac{1}{\mu}\, \bigl\| Q_{l_s^*}(x) - \mu e \bigr\| .
\]

Suppose we have a pair of estimates $(x_+, s_+)$ satisfying $d(x_+, s_+; \mu_+) \le \beta$ for some $\mu_+ := \frac{1}{r}\langle x_+, s_+\rangle > \mu_{++}$ and some $\beta < 1$. We shall estimate $(x(\mu_{++}), s(\mu_{++}))$ with Newton's Method. This calls for the linearization of (CP$_\mu$).
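The proximity measure can be illustrated in the special case of real symmetric positive definite matrices (an assumption of this sketch), where $l_s$ is the lower Cholesky factor of s, the operator $Q_{l_s^*}$ acts as $x \mapsto l_s^T x\, l_s$, and the norm is the Frobenius norm. The function names are illustrative and not from the paper.

```python
import math


def cholesky_lower(s):
    """Return lower triangular l with positive diagonal such that l l^T = s."""
    r = len(s)
    l = [[0.0] * r for _ in range(r)]
    for i in range(r):
        for j in range(i + 1):
            partial = s[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(partial) if i == j else partial / l[j][j]
    return l


def duality_measure(x, s):
    """Return <x, s> / r = trace(x s) / r."""
    r = len(x)
    return sum(x[i][j] * s[j][i] for i in range(r) for j in range(r)) / r


def proximity(x, s, mu):
    """Return (1/mu) * || l^T x l - mu I ||_F, where s = l l^T."""
    r = len(s)
    l = cholesky_lower(s)
    xl = [[sum(x[i][k] * l[k][j] for k in range(r)) for j in range(r)]
          for i in range(r)]
    scaled = [[sum(l[k][i] * xl[k][j] for k in range(r)) for j in range(r)]
              for i in range(r)]
    deviation = sum((scaled[i][j] - (mu if i == j else 0.0)) ** 2
                    for i in range(r) for j in range(r))
    return math.sqrt(deviation) / mu


# A pair on the central path: x = mu * s^{-1} with mu = 2.
s_central = [[4.0, 0.0], [0.0, 9.0]]
x_central = [[0.5, 0.0], [0.0, 2.0 / 9.0]]
d_central = proximity(x_central, s_central, duality_measure(x_central, s_central))

# A pair away from the central path.
s_off = [[4.0, 2.0], [2.0, 5.0]]
x_off = [[1.0, 0.0], [0.0, 1.0]]
d_off = proximity(x_off, s_off, duality_measure(x_off, s_off))
```

The first pair gives a proximity of zero up to rounding, while the second gives a value above the threshold $1/\sqrt{2}$ that appears later in the analysis only for smaller neighbourhoods, so it would not be an admissible starting point there.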

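Proposition 8. Suppose $x \in K$, $s \in K^\sharp$ and $a, b \in H$. Then the directional derivative of

\[
(tt^*, ll^*) \in K \oplus K^\sharp \mapsto Q_{l^*}(tt^*) = (l^*t)(l^*t)^*
\]

at (x, s) along (a, b) is

\[
Q_{l_s^*}(a) + \bigl( Q_{l_s^*}(x)\, \langle\langle Q_{l_s^{-1}}(b)\rangle\rangle \bigr)_H .
\]

Proof. Write the map

\[
(tt^*, ll^*) \in K \oplus K^\sharp \mapsto Q_{l^*}(tt^*) = (l^*t)(l^*t)^*
\]

as the composition of

\[
(tt^*, ll^*) \in K \oplus K^\sharp \mapsto (tt^*, l^*) \quad\text{and}\quad (tt^*, l^*) \in K \oplus T_{++} \mapsto Q_{l^*}(tt^*).
\]

By Proposition 7, the derivative of the first map at $(tt^*, ll^*)$ is

\[
(c, d) \in H \oplus H \mapsto \bigl(c,\ \langle\langle Q_{l^{-1}}(d)\rangle\rangle^*\, l^*\bigr)
\]

and by Proposition 3, the derivative of the second map at $(tt^*, l^*)$ is

\[
(c, k^*) \in H \oplus T \mapsto Q_{l^*}(c) + \bigl((l^*t)(k^*t)^*\bigr)_H .
\]

Hence the Chain Rule gives the required directional derivative as

\[
Q_{l_s^*}(a) + \Bigl( (l_s^*u_x)\bigl(\langle\langle Q_{l_s^{-1}}(b)\rangle\rangle^*\, l_s^*u_x\bigr)^* \Bigr)_H
= Q_{l_s^*}(a) + \Bigl( (l_s^*u_x)\bigl((l_s^*u_x)^*\, \langle\langle Q_{l_s^{-1}}(b)\rangle\rangle\bigr) \Bigr)_H .
\]

By Axiom VII,

\[
(l_s^*u_x)\bigl((l_s^*u_x)^*\, \langle\langle Q_{l_s^{-1}}(b)\rangle\rangle\bigr) = \bigl((l_s^*u_x)(l_s^*u_x)^*\bigr)\, \langle\langle Q_{l_s^{-1}}(b)\rangle\rangle ,
\]

hence the proposition follows from Proposition 2. □

The above proposition gives the linearization of (CP$_\mu$) at $(x_+, s_+)$ as

\[
\begin{aligned}
& \Delta x \in L, \quad \Delta s \in L^\perp, \\
& Q_{l^*_{s_+}}(\Delta x) + \bigl( Q_{l^*_{s_+}}(x_+)\, \langle\langle Q_{l^{-1}_{s_+}}(\Delta s)\rangle\rangle \bigr)_H = \mu_{++}e - Q_{l^*_{s_+}}(x_+).
\end{aligned}
\tag{3.2}
\]

One way to ensure that $(x_+ + \Delta x, s_+ + \Delta s)$ is a good estimate of $(x(\mu_{++}), s(\mu_{++}))$ is to get a good upper bound on the sizes of $\Delta x$ and $\Delta s$.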

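The analysis can be greatly simplified by applying the primal-dual transformations

\[
(x, s) \mapsto \bigl(Q_{l^*_{s_+}}(x),\ Q_{l^{-1}_{s_+}}(s)\bigr).
\]

Let $v$, $(\widetilde{\Delta x}, \widetilde{\Delta s})$ and $\tilde{L}$ denote, respectively, the images of $x_+$, $(\Delta x, \Delta s)$ and L under these transformations. Then $(\widetilde{\Delta x}, \widetilde{\Delta s})$ solves

\[
\begin{aligned}
& \widetilde{\Delta x} \in \tilde{L}, \quad \widetilde{\Delta s} \in \tilde{L}^\perp, \\
& \widetilde{\Delta x} + \bigl(v\, \langle\langle \widetilde{\Delta s}\rangle\rangle\bigr)_H = \mu_{++}e - v.
\end{aligned}
\tag{3.3}
\]

Note that the above linear system has a unique pair of solutions if and only if the same holds for (3.2). Since the linear system is square, the following proposition shows that this holds when $\beta < 1/\sqrt{2}$.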

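Proposition 9. Suppose $v \in K$ satisfies

\[
\|v - \mu e\| \le \beta\mu \tag{3.4}
\]

for some $\beta \in (0, 1/\sqrt{2})$ and some $\mu > 0$. If $(\Delta x, \Delta s)$ satisfies $\langle \Delta x, \Delta s\rangle \ge 0$ and

\[
\Delta x + (v\, \langle\langle \Delta s\rangle\rangle)_H = h \tag{3.5}
\]

for some $h \in H$, then

\[
\max\{\|\Delta x\|,\ \mu\|\Delta s\|\} \le \frac{\|h\|}{1 - \sqrt{2}\beta} .
\]

Proof. If $\langle \Delta x, \Delta s\rangle \ge 0$, then

\[
\max\{\|\Delta x\|,\ \mu\|\Delta s\|\} \le \sqrt{\|\Delta x\|^2 + \mu^2\|\Delta s\|^2} \le \|\Delta x + \mu\Delta s\| .
\]

By the triangle inequality, hypotheses (3.4) and (3.5), and the sub-multiplicativity of the norm $\|\cdot\|$, we have

\[
\begin{aligned}
\|\Delta x + \mu\Delta s\| &\le \|\Delta x + (v\, \langle\langle \Delta s\rangle\rangle)_H\| + \|((v - \mu e)\, \langle\langle \Delta s\rangle\rangle)_H\| \\
&\le \|h\| + 2\,\|(v - \mu e)\, \langle\langle \Delta s\rangle\rangle\| \\
&\le \|h\| + 2\beta\mu\, \|\langle\langle \Delta s\rangle\rangle\| \\
&\le \|h\| + \sqrt{2}\beta\mu\, \|\Delta s\| \\
&\le \|h\| + \sqrt{2}\beta \max\{\|\Delta x\|,\ \mu\|\Delta s\|\}.
\end{aligned}
\]

Consequently, under the hypotheses of the proposition,

\[
(1 - \sqrt{2}\beta) \max\{\|\Delta x\|,\ \mu\|\Delta s\|\} \le \|h\|
\]

as required. □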

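Let $x_\alpha$ and $s_\alpha$ denote, respectively, $x_+ + \alpha\Delta x$ and $s_+ + \alpha\Delta s$. Similarly, let $\tilde{x}_\alpha$ and $\tilde{s}_\alpha$ denote, respectively, $v + \alpha\widetilde{\Delta x} = Q_{l^*_{s_+}}(x_\alpha)$ and $e + \alpha\widetilde{\Delta s} = Q_{l^{-1}_{s_+}}(s_\alpha)$. Let $\mu_\alpha$ denote the scaled duality gap $\frac{1}{r}\langle x_\alpha, s_\alpha\rangle = \frac{1}{r}\langle \tilde{x}_\alpha, \tilde{s}_\alpha\rangle$. It follows straightforwardly from the linearization (3.2) that

\[
\mu_\alpha = (1 - \alpha + \alpha\sigma)\mu_+,
\]

where $\sigma = \mu_{++}/\mu_+ < 1$.

We shall give an upper bound on $d(x_\alpha, s_\alpha; \mu_\alpha)$ using the following lemma, which generalizes the local Lipschitz constant of Cholesky factorization of real symmetric matrices near the identity matrix.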
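The Cholesky statement mentioned here can be checked numerically. The sketch below assumes real symmetric matrices with the Frobenius norm and tests the bound $\|l - I\| \le \sqrt{2}\,\|h\|$ for the lower Cholesky factor l of I + h when $\|h\| \le 1/2$; the function names and the sample perturbation are illustrative choices, not taken from the paper.

```python
import math


def cholesky_lower(s):
    """Return lower triangular l with positive diagonal such that l l^T = s."""
    r = len(s)
    l = [[0.0] * r for _ in range(r)]
    for i in range(r):
        for j in range(i + 1):
            partial = s[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(partial) if i == j else partial / l[j][j]
    return l


def frobenius(a):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(v * v for row in a for v in row))


def cholesky_deviation(h):
    """Return || l - I ||_F, where l is the lower Cholesky factor of I + h."""
    r = len(h)
    shifted = [[h[i][j] + (1.0 if i == j else 0.0) for j in range(r)]
               for i in range(r)]
    l = cholesky_lower(shifted)
    return frobenius([[l[i][j] - (1.0 if i == j else 0.0) for j in range(r)]
                      for i in range(r)])


h = [[0.2, 0.1], [0.1, -0.3]]   # symmetric perturbation with norm below 1/2
lhs = cholesky_deviation(h)
rhs = math.sqrt(2.0) * frobenius(h)
```

A single sample does not prove anything, but it shows how the quantities in the bound are formed.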


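Lemma 1 (cf. Lemma 11 of [6]). If $h \in H$ satisfies $\|h\| \le 1/2$, then

\[
\|l_{e+h} - e\| \le \sqrt{2}\,\|h\| .
\]

Proof. Let $\Delta l(t)$ denote the lower triangular element $l_{e+th} - e$. Note that for all $t \in \mathbb{R}$,

\[
\Delta l(t)_H + \Delta l(t)\Delta l(t)^* = th.
\]

For $t \in [0, 1]$, we have

\[
\begin{aligned}
t\,\|h\| &= \|\Delta l(t)_H + \Delta l(t)\Delta l(t)^*\| \\
&\ge \|\Delta l(t)_H\| - \|\Delta l(t)\Delta l(t)^*\| \\
&\ge \sqrt{2}\,\|\Delta l(t)\| - \|\Delta l(t)\|^2 .
\end{aligned}
\tag{3.6}
\]

Solving this quadratic in $\|\Delta l(t)\|$ gives

\[
\|\Delta l(t)\| \le \frac{1}{\sqrt{2}} - \sqrt{\frac{1}{2} - t\,\|h\|} \quad\text{or}\quad \|\Delta l(t)\| \ge \frac{1}{\sqrt{2}} + \sqrt{\frac{1}{2} - t\,\|h\|} .
\]

Since $\Delta l(t)$, whence $\|\Delta l(t)\|$, is continuous in t, it follows that

\[
\|\Delta l(t)\| \le \frac{1}{\sqrt{2}} - \sqrt{\frac{1}{2} - t\,\|h\|}
\]

whenever $t\,\|h\| \le 1/2$. Under the hypothesis $\|h\| \le 1/2$, this indeed holds for t = 1, thus

\[
\|l_{e+h} - e\| \le \frac{1}{\sqrt{2}} - \sqrt{\frac{1}{2} - \|h\|} \le \frac{1}{\sqrt{2}} .
\]

Finally, applying this upper bound in (3.6) with t = 1 gives

\[
\|h\| \ge \sqrt{2}\,\|l_{e+h} - e\| - \frac{1}{\sqrt{2}}\,\|l_{e+h} - e\| = \frac{1}{\sqrt{2}}\,\|l_{e+h} - e\|
\]

as required. □

Lemma 2. Suppose $\beta < 1/\sqrt{2}$ and $\chi = \beta + (1 - \sigma)\sqrt{r}$. Then $s_\alpha \in K^\sharp$ and

\[
\mu_+^{-1}\, \bigl\| Q_{l^*_{s_\alpha}}(x_\alpha) - \mu_\alpha e \bigr\| \le (1 - \alpha)\beta + \alpha^2\, \frac{\chi^2(7 + 6\beta)}{(1 - \sqrt{2}\beta)^2} + \alpha^3\, \frac{3\chi^3}{(1 - \sqrt{2}\beta)^3} \tag{3.7}
\]

whenever $0 \le \alpha \le \min\{1,\ (1 - \sqrt{2}\beta)/(2\chi)\}$.

Moreover, if $\beta \le \bar{\beta} < 1$ with $\sigma(\bar{\beta} - \beta) \neq 0$, then there exists $\bar{\alpha} > 0$ such that

(i) the cubic polynomial
\[
p : \alpha \mapsto (1 - \alpha)\beta + \alpha^2\, \frac{\chi^2(7 + 6\beta)}{(1 - \sqrt{2}\beta)^2} + \alpha^3\, \frac{3\chi^3}{(1 - \sqrt{2}\beta)^3} - (1 - \alpha + \sigma\alpha)\bar{\beta} \tag{3.8}
\]
satisfies $p(\alpha) \le 0$ for all $\alpha \in [0, \bar{\alpha}]$; and
(ii) for all $0 \le \alpha \le \min\{1, \bar{\alpha}\}$, the pair $(x_\alpha, s_\alpha)$ lies in $K \oplus K^\sharp$ and
\[
d(x_\alpha, s_\alpha; \mu_\alpha) \le \frac{1}{1 - \alpha + \alpha\sigma} \Bigl( (1 - \alpha)\beta + \alpha^2\, \frac{\chi^2(7 + 6\beta)}{(1 - \sqrt{2}\beta)^2} + \alpha^3\, \frac{3\chi^3}{(1 - \sqrt{2}\beta)^3} \Bigr) \le \bar{\beta}. \tag{3.9}
\]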

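Proof. According to Proposition 9, if $\beta < 1/\sqrt{2}$, then

\[
\max\{\|\Delta x\|, \mu_+\|\Delta s\|\} \le \frac{\|\mu_{++}e - v\|}{1-\sqrt{2}\beta} \le \mu_+\frac{\beta + (1-\sigma)\sqrt{r}}{1-\sqrt{2}\beta}
\tag{3.10}
\]

by the triangle inequality. Thus for every $u \in T_+ \setminus \{0\}$,

\[
\langle \tilde s_\alpha, uu^* \rangle = \langle e, uu^* \rangle + \alpha \langle \Delta s, uu^* \rangle \ge \|u\|^2 - |\alpha|\,\|\Delta s\|\,\|u\|^2 = (1 - |\alpha|\,\|\Delta s\|)\|u\|^2 > 0
\]

whenever $|\alpha| < (1-\sqrt{2}\beta)/\chi$. Thus $\tilde s_\alpha \in K^\sharp$, whence $s_\alpha = Q_{l_{s_+}}(\tilde s_\alpha) \in K^\sharp$, for every $\alpha \in [0, (1-\sqrt{2}\beta)/(2\chi)]$.

Note that whenever $s_\alpha \in K^\sharp$, we have

\[
\tilde s_\alpha = Q_{l_{s_+}^{-1}}(s_\alpha) = (l_{s_+}^{-1} l_{s_\alpha})(l_{s_+}^{-1} l_{s_\alpha})^*,
\]

hence $l_{\tilde s_\alpha} = l_{s_+}^{-1} l_{s_\alpha}$. In this case we deduce, using Corollary 1, that

\[
Q_{l_{s_\alpha}^*}(x_\alpha) = Q_{l_{s_\alpha}^* l_{s_+}^{-*}}\bigl(Q_{l_{s_+}^*}(x_\alpha)\bigr) = Q_{l_{\tilde s_\alpha}^*}(\tilde x_\alpha).
\]

Thus we may equivalently prove (3.7) for the pair $(\tilde x_\alpha, \tilde s_\alpha)$ instead.

Let $l_\alpha$ denote the lower triangular element $l_{\tilde s_\alpha} - e = l_{e+\alpha\Delta s} - e$, which is well-defined when $\alpha \in [0, (1-\sqrt{2}\beta)/(2\chi)]$. By Proposition 2,

\begin{align*}
Q_{l_{\tilde s_\alpha}^*}(\tilde x_\alpha) &= Q_{e+l_\alpha^*}(\tilde x_\alpha) \\
&= \bigl((e + l_\alpha^*)(\langle\!\langle \tilde x_\alpha \rangle\!\rangle (e + l_\alpha))\bigr)_H \\
&= \bigl(\langle\!\langle \tilde x_\alpha \rangle\!\rangle + \langle\!\langle \tilde x_\alpha \rangle\!\rangle l_\alpha + l_\alpha^* \langle\!\langle \tilde x_\alpha \rangle\!\rangle + l_\alpha^* (\langle\!\langle \tilde x_\alpha \rangle\!\rangle l_\alpha)\bigr)_H \\
&= \tilde x_\alpha + (\tilde x_\alpha l_\alpha)_H + Q_{l_\alpha^*}(\tilde x_\alpha) \\
&= v + \alpha\Delta x + (v l_\alpha)_H + \alpha(\Delta x\, l_\alpha)_H + Q_{l_\alpha^*}(v) + \alpha Q_{l_\alpha^*}(\Delta x),
\end{align*}

whence the difference $Q_{l_{\tilde s_\alpha}^*}(\tilde x_\alpha) - \mu_\alpha e$ is

\[
v - (1-\alpha)\mu_+ e - \alpha\sigma\mu_+ e + \alpha\Delta x + (v l_\alpha)_H + \alpha(\Delta x\, l_\alpha)_H + Q_{l_\alpha^*}(v) + \alpha Q_{l_\alpha^*}(\Delta x).
\]

Using (3.3), we re-express this as

\begin{align*}
&(1-\alpha)(v - \mu_+ e) + \bigl(v(l_\alpha - \alpha \langle\!\langle \Delta s \rangle\!\rangle)\bigr)_H + \alpha(\Delta x\, l_\alpha)_H \\
&\qquad + \mu_+ l_\alpha^* l_\alpha + Q_{l_\alpha^*}(v - \mu_+ e) + \alpha Q_{l_\alpha^*}(\Delta x) \\
&\quad = (1-\alpha)(v - \mu_+ e) - \mu_+ l_\alpha l_\alpha^* - \bigl((v - \mu_+ e)\langle\!\langle l_\alpha l_\alpha^* \rangle\!\rangle\bigr)_H + \alpha(\Delta x\, l_\alpha)_H \\
&\qquad + \mu_+ l_\alpha^* l_\alpha + Q_{l_\alpha^*}(v - \mu_+ e) + \alpha Q_{l_\alpha^*}(\Delta x).
\end{align*}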

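Using the sub-multiplicativity of $\|\cdot\|$, Lemma 1, and (3.10), we bound for each $\alpha \in [0, (1-\sqrt{2}\beta)/(2\chi)]$,

\begin{align*}
\|l_\alpha l_\alpha^*\|,\ \|l_\alpha^* l_\alpha\| &\le \|l_\alpha\|^2 \le 2\alpha^2\|\Delta s\|^2 \le 2\alpha^2\frac{\chi^2}{(1-\sqrt{2}\beta)^2}, \\
\bigl\|\bigl((v - \mu_+ e)\langle\!\langle l_\alpha l_\alpha^* \rangle\!\rangle\bigr)_H\bigr\| &\le 2\|v - \mu_+ e\|\,\|\langle\!\langle l_\alpha l_\alpha^* \rangle\!\rangle\| \le \sqrt{2}\,\|v - \mu_+ e\|\,\|l_\alpha l_\alpha^*\| \\
&\le 2\sqrt{2}\,\alpha^2\mu_+\beta\frac{\chi^2}{(1-\sqrt{2}\beta)^2} \le 3\alpha^2\mu_+\beta\frac{\chi^2}{(1-\sqrt{2}\beta)^2}, \\
\|\alpha(\Delta x\, l_\alpha)_H\| &\le 2\alpha\|\Delta x\|\,\|l_\alpha\| \le 2\sqrt{2}\,\alpha^2\mu_+\frac{\chi^2}{(1-\sqrt{2}\beta)^2} \le 3\alpha^2\mu_+\frac{\chi^2}{(1-\sqrt{2}\beta)^2}, \\
\bigl\|Q_{l_\alpha^*}(v - \mu_+ e)\bigr\| &\le 2\|\langle\!\langle v - \mu_+ e \rangle\!\rangle\|\,\|l_\alpha\|^2 \\
&\le 2\sqrt{2}\,\alpha^2\mu_+\beta\frac{\chi^2}{(1-\sqrt{2}\beta)^2} \le 3\alpha^2\mu_+\beta\frac{\chi^2}{(1-\sqrt{2}\beta)^2},
\end{align*}

and

\[
\bigl\|\alpha Q_{l_\alpha^*}(\Delta x)\bigr\| \le 2\alpha\|\langle\!\langle \Delta x \rangle\!\rangle\|\,\|l_\alpha\|^2 \le 2\sqrt{2}\,\alpha^3\mu_+\frac{\chi^3}{(1-\sqrt{2}\beta)^3} \le 3\alpha^3\mu_+\frac{\chi^3}{(1-\sqrt{2}\beta)^3}.
\]

The required inequality (3.7) then follows from the triangle inequality.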
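To make this last step explicit, the individual bounds can be added term by term. The following worked sum is a sketch in which the shorthand $D = 1 - \sqrt{2}\beta$ is introduced only for brevity; it uses $\|v - \mu_+ e\| \le \mu_+\beta$ for the first term and counts the two terms $\mu_+ l_\alpha l_\alpha^*$ and $\mu_+ l_\alpha^* l_\alpha$ separately.

```latex
\begin{align*}
\mu_+^{-1}\bigl\|Q_{l_{s_\alpha}^*}(x_\alpha) - \mu_\alpha e\bigr\|
&\le (1-\alpha)\beta
   + \frac{\alpha^2\chi^2}{D^2}\bigl(\underbrace{2 + 2}_{l_\alpha l_\alpha^*,\ l_\alpha^* l_\alpha}
   + \underbrace{3\beta}_{((v-\mu_+e)\langle\!\langle l_\alpha l_\alpha^*\rangle\!\rangle)_H}
   + \underbrace{3}_{\alpha(\Delta x\, l_\alpha)_H}
   + \underbrace{3\beta}_{Q_{l_\alpha^*}(v-\mu_+e)}\bigr)
   + \frac{3\alpha^3\chi^3}{D^3} \\
&= (1-\alpha)\beta + \alpha^2\frac{\chi^2(7+6\beta)}{D^2} + \alpha^3\frac{3\chi^3}{D^3}.
\end{align*}
```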

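(i) If $\hat\beta > \beta$, then $p(0) = \beta - \hat\beta < 0$ shows that $p$ has at least one positive real root, since $p(\alpha) \to +\infty$ as $\alpha \to +\infty$. If not, then $\sigma > 0$, $p(0) = 0$ and $p'(0) = \hat\beta - \sigma\hat\beta - \beta = -\sigma\beta < 0$ also shows the same. Hence $p$ has a smallest positive real root $\bar\alpha$ and $p(\alpha) \le 0$ for all $\alpha \in [0, \bar\alpha]$.

(ii) If $(1-\sqrt{2}\beta)/(2\chi) < 1$, then for $\alpha = (1-\sqrt{2}\beta)/(2\chi) \in (0, 1)$, we have

\[
p(\alpha) > 7\alpha^2\frac{\chi^2}{(1-\sqrt{2}\beta)^2} - \hat\beta = \frac{7}{4} - \hat\beta > 0.
\]

Hence $0 \le \alpha \le \min\{1, (1-\sqrt{2}\beta)/(2\chi)\}$ whenever $0 \le \alpha \le \min\{1, \bar\alpha\}$. Subsequently $s_\alpha \in K^\sharp$ and (3.7) holds whenever $0 \le \alpha \le \min\{1, \bar\alpha\}$. This leads to

\begin{align*}
\bigl\|Q_{l_{s_\alpha}^*}(x_\alpha) - \mu_\alpha e\bigr\| &\le \mu_+ p(\alpha) + (1-\alpha+\alpha\sigma)\mu_+\hat\beta \\
&\le (1-\alpha+\alpha\sigma)\mu_+\hat\beta = \mu_\alpha\hat\beta,
\end{align*}

whence

\begin{align*}
\bigl\langle Q_{l_{s_\alpha}^*}(x_\alpha), ll^* \bigr\rangle &= \mu_\alpha \langle e, ll^* \rangle + \bigl\langle Q_{l_{s_\alpha}^*}(x_\alpha) - \mu_\alpha e, ll^* \bigr\rangle \\
&\ge \mu_\alpha\|l\|^2 - \bigl\|Q_{l_{s_\alpha}^*}(x_\alpha) - \mu_\alpha e\bigr\|\,\|l\|^2 \\
&\ge \mu_\alpha(1 - \hat\beta)\|l\|^2 > 0
\end{align*}

for all $l \in T_+^* \setminus \{0\}$. We then conclude that $Q_{l_{s_\alpha}^*}(x_\alpha) \in K$, whence $x_\alpha \in K$.

Finally (3.9) follows directly from (3.7). □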

4. Primal-Dual Path-Following Algorithms

The three algorithms in this paper are based on the following generic primal-dual path-following framework.

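Algorithm 1. (Primal-dual path-following framework)
Given a pair of primal-dual strictly feasible solutions $(x_{in}, s_{in}) \in K \oplus K^\sharp$ and $\mu_{in} \in \mathbb{R}_{++}$ with $d(x_{in}, s_{in}; \mu_{in}) \le \beta$ for some $\beta \in (0, 1/\sqrt{2})$, and the required accuracy $\varepsilon > 0$.

(1) Set $(x_+, s_+) = (x_{in}, s_{in})$ and $\mu_+ = \mu_{in}$.
(2) While $\langle x_+, s_+ \rangle > \varepsilon \langle x_{in}, s_{in} \rangle$,
    (a) Pick $\hat\beta \in (0, 1/\sqrt{2})$ and $\sigma \in [0, 1]$.
    (b) Solve (3.2) with $\mu_{++}$ replaced by $\sigma\mu_+$. For each $\alpha \in [0, 1]$, let $(x_\alpha, s_\alpha) = (x_+ + \alpha\Delta x, s_+ + \alpha\Delta s)$, and let $\mu_\alpha = (1 - \alpha + \alpha\sigma)\mu_+$. Pick $\alpha \in [0, 1]$ such that $(x_\alpha, s_\alpha) \in K \oplus K^\sharp$ and $d(x_\alpha, s_\alpha; \mu_\alpha) \le \hat\beta$.
    (c) Update $(x_+, s_+) \leftarrow (x_\alpha, s_\alpha)$ and $\mu_+ \leftarrow \mu_\alpha$.
(3) Output $(x_{out}, s_{out}) = (x_+, s_+)$.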
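The loop structure of Algorithm 1 can be illustrated on the simplest homogeneous cone, the nonnegative orthant. The sketch below is not the T-algebraic implementation. It assumes that for this cone the system (3.2) reduces to the standard linear programming Newton system and that the proximity measure reduces to $\|x \circ s - \mu e\|/\mu$. The problem (minimising a linear function over the unit simplex), the backtracking rule for the step length, and all function names are choices made here for illustration; the sketch also assumes $\sigma > 0$.

```python
import math

def newton_direction(x, s, target_mu):
    """Newton step for min c.x s.t. sum(x) = 1, x >= 0 (assumed LP special case).

    Solves s_i*dx_i + x_i*ds_i = target_mu - x_i*s_i with sum(dx) = 0 and
    ds_i = -dy for every i; the single constraint gives a closed form.
    """
    rhs = [target_mu - xi * si for xi, si in zip(x, s)]
    dy = (-sum(ri / si for ri, si in zip(rhs, s))
          / sum(xi / si for xi, si in zip(x, s)))
    dx = [(ri + xi * dy) / si for ri, xi, si in zip(rhs, x, s)]
    ds = [-dy] * len(x)
    return dx, ds

def proximity(x, s, mu):
    """Assumed orthant form of d(x, s; mu): ||x*s - mu*e|| / mu."""
    return math.sqrt(sum((xi * si - mu) ** 2 for xi, si in zip(x, s))) / mu

def gap(x, s):
    return sum(xi * si for xi, si in zip(x, s))

def path_following(x, s, mu, eps, beta_hat, sigma, max_iter=10_000):
    """Loop of Algorithm 1 with a simple backtracking choice of alpha."""
    gap0, iters = gap(x, s), 0
    while gap(x, s) > eps * gap0 and iters < max_iter:
        dx, ds = newton_direction(x, s, sigma * mu)
        alpha = 1.0
        while alpha > 1e-12:
            xa = [xi + alpha * d for xi, d in zip(x, dx)]
            sa = [si + alpha * d for si, d in zip(s, ds)]
            mua = (1.0 - alpha + alpha * sigma) * mu
            if min(xa) > 0 and min(sa) > 0 and proximity(xa, sa, mua) <= beta_hat:
                break
            alpha *= 0.5  # halve the step until the iterate is acceptable
        else:
            raise RuntimeError("no acceptable step length found")
        x, s, mu = xa, sa, mua
        iters += 1
    return x, s, mu, iters

# Start exactly on the central path: choose x0, then s0 = mu0 / x0 (so c = s0).
x0, mu0 = [0.5, 0.3, 0.2], 1.0
s0 = [mu0 / xi for xi in x0]
sigma = 1.0 - 0.25 / math.sqrt(len(x0))  # conservative update of mu
x, s, mu, iters = path_following(x0, s0, mu0, 1e-6, 0.25, sigma)
```

The general algorithm replaces `newton_direction`, the positivity test, and `proximity` by their T-algebraic counterparts; the outer loop is unchanged.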

4.1. Short-step algorithm. In our short-step algorithm, conservative updates of the parameter $\mu$ are used and full Newton steps are taken. The target parameter $\mu_{++}$ is chosen so that the ratio $\sigma = \mu_{++}/\mu_+$ is $(1 - \delta/\sqrt{r})$, where $\delta \in (0, 1)$ is a constant. We shall show that with appropriate choices of $\beta$ and $\delta$, Algorithm 1 requires at most $O(\sqrt{r})$ iterations to reduce the duality gap by a constant factor.

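Theorem 3. If $\beta \in (0, 1/\sqrt{2})$ and $\delta \in (0, 1)$ satisfy

\[
\frac{(\beta+\delta)^2(7+6\beta)}{(1-\sqrt{2}\beta)^2} + \frac{3(\beta+\delta)^3}{(1-\sqrt{2}\beta)^3} < \left(1 - \frac{\delta}{\sqrt{r}}\right)\beta,
\tag{4.1}
\]

then we may use $\alpha = 1$ in each iteration of Algorithm 1 with $\sigma = 1 - \delta/\sqrt{r}$ and $\hat\beta = \beta$. Moreover the algorithm terminates after $O(\sqrt{r}\log\frac{1}{\varepsilon})$ iterations.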
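Whether a given pair $(\beta, \delta)$ meets hypothesis (4.1) is easy to check numerically. The helper below is a minimal sketch; the name `satisfies_4_1` and the sample values are illustrative and are not taken from the paper.

```python
import math

def satisfies_4_1(beta, delta, r):
    """Return True if (beta, delta) meets hypothesis (4.1) for rank r."""
    d = 1.0 - math.sqrt(2.0) * beta
    lhs = ((beta + delta) ** 2 * (7.0 + 6.0 * beta) / d**2
           + 3.0 * (beta + delta) ** 3 / d**3)
    return lhs < (1.0 - delta / math.sqrt(r)) * beta

# beta = 0.05 with delta = 0.02 is admissible; beta = 0.1 with delta = 0.05 is not.
ok_small = satisfies_4_1(0.05, 0.02, 1)
ok_large = satisfies_4_1(0.1, 0.05, 100)
```

Since the right-hand side of (4.1) increases with $r$ while the left-hand side does not depend on $r$, a pair that is admissible for $r = 1$ is admissible for every rank.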

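Proof. Consider an iteration of the algorithm. By the choice of $\hat\beta$, we have $d(x_+, s_+; \mu_+) \le \beta$ at the beginning of the iteration. Let $\bar\alpha$ be the smallest positive root of the cubic polynomial (3.8) with $\sigma = 1 - \delta/\sqrt{r}$, $\chi = \beta + (1-\sigma)\sqrt{r} = \beta + \delta$, and $\hat\beta = \beta$. Under the hypothesis (4.1), we have

\begin{align*}
&(1-\alpha)\beta + \alpha^2\frac{\chi^2(7+6\beta)}{(1-\sqrt{2}\beta)^2} + \alpha^3\frac{3\chi^3}{(1-\sqrt{2}\beta)^3} - (1-\alpha+\sigma\alpha)\hat\beta \\
&\quad \le \alpha\left[\frac{(\beta+\delta)^2(7+6\beta)}{(1-\sqrt{2}\beta)^2} + \frac{3(\beta+\delta)^3}{(1-\sqrt{2}\beta)^3} - \left(1 - \frac{\delta}{\sqrt{r}}\right)\beta\right] < 0
\end{align*}

for all $0 < \alpha \le 1$, whence $\bar\alpha > 1$. Thus by Lemma 2, we have $(x_\alpha, s_\alpha) \in K \oplus K^\sharp$ and $d(x_\alpha, s_\alpha; \mu_\alpha) \le \beta$ for all $\alpha \in [0, 1]$. This means we may take $\alpha = 1$ in the iteration. Finally, the duality gap reduces by the factor $(1 - \delta/\sqrt{r})$ in each iteration, whence by a factor of $\varepsilon$ in $O(\sqrt{r}\log\frac{1}{\varepsilon})$ iterations. □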
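The iteration bound can be made concrete: with a full step the gap shrinks by $1 - \delta/\sqrt{r}$ per iteration, so the number of iterations is the smallest $k$ with $(1 - \delta/\sqrt{r})^k \le \varepsilon$. The sketch below computes this $k$; the function name and the sample values are illustrative.

```python
import math

def short_step_iterations(r, delta, eps):
    """Smallest k with (1 - delta/sqrt(r))**k <= eps."""
    sigma = 1.0 - delta / math.sqrt(r)
    return math.ceil(math.log(eps) / math.log(sigma))

k100 = short_step_iterations(100, 0.02, 1e-6)
k400 = short_step_iterations(400, 0.02, 1e-6)
```

Quadrupling the rank roughly doubles the count, in line with the $\sqrt{r}$ dependence.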

4.2. Large-update algorithm. For the large-update algorithm, large updates $\mu_{++} = \sigma\mu_+$, where $\sigma \in (0, 1)$ is a constant independent of $r$, are taken. We shall show that as long as $\beta \in (0, 1/(2\sqrt{2}))$, Algorithm 1 requires at most $O(r)$ iterations to reduce the duality gap by a constant factor.

Theorem 4. If $\beta \in (0, 1/(2\sqrt{2}))$, then we may use $\alpha = \Omega(1/r)$ in Algorithm 1 with $\hat\beta = \beta$ and $\sigma \in (0, 1)$ arbitrary but fixed. Moreover the algorithm terminates after $O(r\log\frac{1}{\varepsilon})$ iterations.

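Proof. Consider an iteration of the algorithm. By the choice of $\hat\beta$, we have $d(x_+, s_+; \mu_+) \le \beta$ at the beginning of the iteration. Let $\bar\alpha$ be the smallest positive root of the cubic polynomial (3.8) with $\chi = \beta + (1-\sigma)\sqrt{r}$ and $\hat\beta = \beta$; i.e., $\bar\alpha$ is the smallest positive root of

\begin{align*}
\alpha \mapsto{} & -\sigma\alpha\beta + \alpha^2\frac{(\beta + (1-\sigma)\sqrt{r})^2(7+6\beta)}{(1-\sqrt{2}\beta)^2} + \alpha^3\frac{3(\beta + (1-\sigma)\sqrt{r})^3}{(1-\sqrt{2}\beta)^3} \\
={} & \left(-\sigma\beta + \alpha\frac{(\beta + (1-\sigma)\sqrt{r})^2(7+6\beta)}{(1-\sqrt{2}\beta)^2} + \alpha^2\frac{3(\beta + (1-\sigma)\sqrt{r})^3}{(1-\sqrt{2}\beta)^3}\right)\alpha.
\end{align*}

By Lemma 2, we have $(x_\alpha, s_\alpha) \in K \oplus K^\sharp$ and

\[
d(x_\alpha, s_\alpha; \mu_\alpha) \le \hat\beta = \beta
\]

whenever $0 \le \alpha \le \min\{1, \bar\alpha\}$. Since $\bar\alpha = \Omega(1/r)$, we may use $\alpha = \min\{1, \bar\alpha\} = \Omega(1/r)$. Finally, the duality gap reduces by the factor $(1 - \Omega(1/r))$ in each iteration, whence by a factor of $\varepsilon$ in $O(r\log\frac{1}{\varepsilon})$ iterations. □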
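The estimate $\bar\alpha = \Omega(1/r)$ can be observed numerically. After the factor $\alpha$ is removed, $\bar\alpha$ is the positive root of a quadratic, which the sketch below evaluates in closed form. The function name and the sample values $\beta = 0.1$, $\sigma = 0.5$ are assumptions for illustration only.

```python
import math

def large_update_alpha_bar(beta, sigma, r):
    """Positive root of -sigma*beta + a*alpha + b*alpha**2 = 0.

    Here a and b are the alpha^2 and alpha^3 coefficients of (3.8)
    with beta_hat = beta (the large-update setting).
    """
    chi = beta + (1.0 - sigma) * math.sqrt(r)
    d = 1.0 - math.sqrt(2.0) * beta
    a = chi**2 * (7.0 + 6.0 * beta) / d**2
    b = 3.0 * chi**3 / d**3
    # Stable form of (-a + sqrt(a*a + 4*b*sigma*beta)) / (2*b).
    return 2.0 * sigma * beta / (a + math.sqrt(a * a + 4.0 * b * sigma * beta))

# r * alpha_bar stays bounded away from zero as the rank grows.
scaled = [r * large_update_alpha_bar(0.1, 0.5, r) for r in (1, 100, 10**4, 10**6)]
```

For these sample values the products $r\bar\alpha$ remain between two positive constants, which is the behaviour that $\bar\alpha = \Omega(1/r)$ describes.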

4.3. Predictor-corrector algorithm. In our predictor-corrector algorithm, the most aggressive updates $\mu_{++} = 0$ are taken, each of which is immediately followed by a zero update step $\mu_{++} = \mu_+$. This is a direct extension of the Mizuno-Todd-Ye predictor-corrector algorithm for linear programming [10]. The aggressive update steps are called predictor steps and the zero update steps are called corrector steps. We shall show that with appropriate choices of $\beta$ and $\hat\beta$, Algorithm 1 requires at most $O(\sqrt{r})$ iterations to reduce the duality gap by a constant factor.

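Theorem 5. If $\beta \in (0, 1/(2\sqrt{2}))$ satisfies

\[
\frac{4\beta^2(7+12\beta)}{(1-2\sqrt{2}\beta)^2} + \frac{24\beta^3}{(1-2\sqrt{2}\beta)^3} \le \beta,
\tag{4.2}
\]

then by alternating between $(\sigma, \hat\beta) = (0, 2\beta)$ and $(\sigma, \hat\beta) = (1, \beta)$ in Algorithm 1, we may use $\alpha = \Omega(1/\sqrt{r})$ and $\alpha = 1$, respectively. Moreover the algorithm terminates after $O(\sqrt{r}\log\frac{1}{\varepsilon})$ iterations.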
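Hypothesis (4.2) restricts $\beta$ to a small interval near zero. The sketch below tests the inequality and brackets the largest admissible $\beta$ by bisection; it relies on the observation that, after dividing by $\beta$, the left-hand side of (4.2) is increasing on $(0, 1/(2\sqrt{2}))$. The function names are illustrative.

```python
import math

def satisfies_4_2(beta):
    """Return True if beta in (0, 1/(2*sqrt(2))) meets hypothesis (4.2)."""
    e = 1.0 - 2.0 * math.sqrt(2.0) * beta
    if beta <= 0.0 or e <= 0.0:
        return False
    lhs = 4.0 * beta**2 * (7.0 + 12.0 * beta) / e**2 + 24.0 * beta**3 / e**3
    return lhs <= beta

def largest_admissible_beta(iterations=60):
    """Bisection on (0, 1/(2*sqrt(2))) for the threshold of (4.2)."""
    lo, hi = 0.0, 1.0 / (2.0 * math.sqrt(2.0))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if satisfies_4_2(mid):
            lo = mid
        else:
            hi = mid
    return lo

beta_max = largest_admissible_beta()
```

For example, $\beta = 0.025$ is admissible while $\beta = 0.03$ is not, so the threshold lies between these two values.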

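Proof. Consider an iteration of the algorithm where a predictor step is to be taken. By the choice of $\hat\beta$ in the previous iteration, we have $d(x_+, s_+; \mu_+) \le \beta$ at the beginning of the iteration. Let $\bar\alpha$ be the smallest positive root of the cubic polynomial (3.8) with $\sigma = 0$, $\chi = \beta + (1-\sigma)\sqrt{r} = \beta + \sqrt{r}$, and $\hat\beta = 2\beta \in (\beta, 1)$; i.e., $\bar\alpha$ is the smallest positive root of

\[
\alpha \mapsto -\beta + \alpha\beta + \alpha^2\frac{(\beta + \sqrt{r})^2(7+6\beta)}{(1-\sqrt{2}\beta)^2} + \alpha^3\frac{3(\beta + \sqrt{r})^3}{(1-\sqrt{2}\beta)^3}.
\]

By Lemma 2, we have $(x_\alpha, s_\alpha) \in K \oplus K^\sharp$ and

\[
d(x_\alpha, s_\alpha; \mu_\alpha) \le 2\beta = \hat\beta
\]

whenever $0 \le \alpha \le \min\{1, \bar\alpha\}$. Since $\bar\alpha = \Omega(1/\sqrt{r})$, we may use $\alpha = \min\{1, \bar\alpha\} = \Omega(1/\sqrt{r})$.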
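The predictor step length can likewise be examined numerically. The sketch below finds the smallest positive root of the predictor cubic by bisection on $[0, 1]$, where the polynomial is negative at 0, positive at 1, and convex in between. The function name and the sample value $\beta = 0.025$ are illustrative assumptions.

```python
import math

def predictor_alpha_bar(beta, r, iterations=100):
    """Smallest positive root of the predictor cubic (sigma = 0, beta_hat = 2*beta)."""
    chi = beta + math.sqrt(r)
    d = 1.0 - math.sqrt(2.0) * beta
    a = chi**2 * (7.0 + 6.0 * beta) / d**2
    b = 3.0 * chi**3 / d**3

    def p(alpha):
        return -beta + alpha * beta + a * alpha**2 + b * alpha**3

    lo, hi = 0.0, 1.0  # p(0) = -beta < 0 and p(1) = a + b > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if p(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return lo

# sqrt(r) * alpha_bar stays bounded away from zero as the rank grows.
scaled = [math.sqrt(r) * predictor_alpha_bar(0.025, r) for r in (1, 100, 10**4, 10**6)]
```

Since a predictor step with $\sigma = 0$ reduces $\mu$ by the factor $1 - \alpha$, a step length of order $1/\sqrt{r}$ gives the reduction used at the end of the proof.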

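Now consider an iteration of the algorithm where a corrector step is to be taken. By the choice of $\hat\beta$ in the previous iteration, we have $d(x_+, s_+; \mu_+) \le 2\beta$ at the beginning of the iteration. Under the hypothesis (4.2), applying Lemma 2 with $2\beta$ in place of $\beta$, $\sigma = 1$, $\chi = (2\beta) + (1-\sigma)\sqrt{r} = 2\beta$ and $\hat\beta = 2\beta$, we have

\begin{align*}
&(1-\alpha)(2\beta) + \alpha^2\frac{\chi^2(7+6(2\beta))}{(1-\sqrt{2}(2\beta))^2} + \alpha^3\frac{3\chi^3}{(1-\sqrt{2}(2\beta))^3} - (1-\alpha+\sigma\alpha)\hat\beta \\
&\quad \le \alpha\left[\frac{4\beta^2(7+12\beta)}{(1-2\sqrt{2}\beta)^2} + \frac{24\beta^3}{(1-2\sqrt{2}\beta)^3} - 2\beta\right] < 0
\end{align*}

for all $0 < \alpha \le 1$, whence by Lemma 2, we have $(x_\alpha, s_\alpha) \in K \oplus K^\sharp$ and

\begin{align*}
d(x_\alpha, s_\alpha; \mu_\alpha) &\le \frac{1}{1-\alpha+\alpha\sigma}\left((1-\alpha)(2\beta) + \alpha^2\frac{\chi^2(7+6(2\beta))}{(1-\sqrt{2}(2\beta))^2} + \alpha^3\frac{3\chi^3}{(1-\sqrt{2}(2\beta))^3}\right) \\
&\le 2(1-\alpha)\beta + \alpha\left(\frac{4\beta^2(7+12\beta)}{(1-2\sqrt{2}\beta)^2} + \frac{24\beta^3}{(1-2\sqrt{2}\beta)^3}\right) \\
&\le 2\beta - \alpha\beta
\end{align*}

for all $\alpha \in [0, 1]$, in particular $d(x_1, s_1; \mu_1) \le \beta$. This means we may take $\alpha = 1$ in the iteration.

Finally, the duality gap reduces by the factor $(1 - \Omega(1/\sqrt{r}))$ every two iterations, whence by a factor of $\varepsilon$ in $O(\sqrt{r}\log\frac{1}{\varepsilon})$ iterations. □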

5. Conclusion

In this paper, a generic primal-dual path-following framework for homogeneous cone programming is presented. With it, we can design a short-step algorithm, a large-update algorithm, and a predictor-corrector algorithm. Each algorithm is shown to have a polynomial iteration complexity bound that matches the existing bound for its counterpart in semidefinite programming.

There are two rather unsatisfactory features of this generic primal-dual framework that are worth further investigation:

(1) There is a lack of primal-dual symmetry. In other words, when we swap the primal problem with the dual problem, together with the initial primal-dual pair of solutions, and apply the same algorithm using the same respective T-algebras for the primal and dual homogeneous cones, we may not get the same iterates.

(2) In order for the primal-dual search direction to be uniquely defined, it is necessary for the primal-dual framework to use the narrow ℓ2-neighbourhood. A first step towards designing a wide-neighbourhood primal-dual algorithm for homogeneous cone programming is to find a search direction that is defined at all pairs of primal-dual strictly feasible solutions.

References

[1] F. Alizadeh and S. H. Schmieta, Extension of primal-dual interior point algorithms to symmetric cones, Math. Program. 96 (2003), 409–438.

[2] C. B. Chua, An algebraic perspective on homogeneous cone programming, and the primal-dual second order cone approximations algorithm for symmetric cone programming, Ph.D. thesis, Cornell University, Ithaca, NY, USA, May 2003.

[3] C. B. Chua, The primal-dual second-order cone approximations algorithm for symmetric cone programming, manuscript, Cornell University, February 2003, accepted by Found. Comput. Math.

[4] C. B. Chua, Relating homogeneous cones and positive definite cones via T-algebras, SIAM J. Optim. 14 (2003), no. 2, 500–506.

[5] C. B. Chua, A new notion of weighted centers for semidefinite programming, SIAM J. Optim. 16 (2006), no. 4, 1092–1109.

[6] C. B. Chua, Target following algorithms for semidefinite programming, Research Report CORR 2006-10, Department of Combinatorics and Optimization, Faculty of Mathematics, University of Waterloo, Canada, May 2006, 40 pages.

[7] J. Faraut and A. Koranyi, Analysis on symmetric cones, Oxford Press, New York, NY, USA, 1994.

[8] L. Faybusovich, Linear systems in Jordan algebras and primal-dual interior-point algorithms, J. Comput. Appl. Math. 86 (1997), 149–175.

[9] O. Guler and L. Tuncel, Characterization of the barrier parameter of homogeneous convex cones, Math. Program. 81 (1998), 55–76.

[10] S. Mizuno, M. J. Todd, and Y. Ye, An O(√nL)-iteration homogeneous and self-dual linear programming algorithm, Math. Oper. Res. 19 (1994), 53–67.

[11] R. D. C. Monteiro and P. Zanjacomo, Implementation of primal-dual methods for semidefinite programming based on Monteiro and Tsuchiya Newton directions and their variants, Optim. Methods Softw. 11/12 (1999), 91–140.

[12] Yu. E. Nesterov and A. S. Nemirovski, Interior point polynomial algorithms in convex programming, SIAM Stud. Appl. Math., vol. 13, SIAM Publication, Philadelphia, PA, USA, 1994.

[13] Yu. E. Nesterov and M. J. Todd, Primal-dual interior-point methods for self-scaled cones, SIAM J. Optim. 8 (1998), 324–364.

[14] B. K. Rangarajan, Polynomial convergence of infeasible-interior-point methods over symmetric cones, SIAM J. Optim. 16 (2006), 1211–1229.

[15] L. Tuncel, Primal-dual symmetry and scale invariance of interior-point algorithms for convex optimization, Math. Oper. Res. 23 (1998), 708–718.

[16] L. Tuncel, Generalization of primal-dual interior-point methods to convex optimization problems in conic form, Found. Comput. Math. 1 (2001), 229–254.

[17] E. B. Vinberg, The theory of convex homogeneous cones, Tr. Mosk. Mat. O.-va 12 (1963), 303–358.

[18] H. Wolkowicz, R. Saigal, and L. Vandenberghe (eds.), Handbook of semidefinite programming: Theory, algorithms, and applications, Springer-Verlag, Berlin-Heidelberg-New York, 2000.

[19] S. J. Wright, Primal-dual interior-point methods, SIAM Publication, Philadelphia, PA, USA, 1997.

Division of Mathematical Sciences, Nanyang Technological University, 1 Nanyang Walk, Blk 5 Level 3, Singapore 637616, Singapore

