51
Separating Doubly Nonnegative and Completely Positive Matrices Kurt M. Anstreicher Dept. of Management Sciences University of Iowa (joint work with Hongbo Dong) INFORMS National Meeting, Austin, November 2010 2

Separating Doubly Nonnegative and Completely Positive Matrices · Separating Doubly Nonnegative and Completely Positive Matrices Kurt M. Anstreicher Dept. of Management Sciences University

  • Upload
    others

  • View
    19

  • Download
    0

Embed Size (px)

Citation preview

Separating Doubly Nonnegative andCompletely Positive Matrices

Kurt M. AnstreicherDept. of Management Sciences

University of Iowa(joint work with Hongbo Dong)

INFORMS National Meeting, Austin, November 2010

2

0

CP and DNN matrices

Let Sn denote the set of n×n real symmetric matrices, S+n denote

the cone of n×n real symmetric positive semidefinite matrices andNn denote the cone of symmetric nonnegative n× n matrices.

CP and DNN matrices

Let Sn denote the set of n×n real symmetric matrices, S+n denote

the cone of n×n real symmetric positive semidefinite matrices andNn denote the cone of symmetric nonnegative n× n matrices.

• The cone of n×n doubly nonnegative (DNN) matrices is thenDn = S+

n ∩Nn.

CP and DNN matrices

Let Sn denote the set of n×n real symmetric matrices, S+n denote

the cone of n×n real symmetric positive semidefinite matrices andNn denote the cone of symmetric nonnegative n× n matrices.

• The cone of n×n doubly nonnegative (DNN) matrices is thenDn = S+

n ∩Nn.

• The cone of n× n completely positive (CP) matrices isCn = {X |X = AAT for some n× k nonnegative matrix A}.

CP and DNN matrices

Let Sn denote the set of n×n real symmetric matrices, S+n denote

the cone of n×n real symmetric positive semidefinite matrices andNn denote the cone of symmetric nonnegative n× n matrices.

• The cone of n×n doubly nonnegative (DNN) matrices is thenDn = S+

n ∩Nn.

• The cone of n× n completely positive (CP) matrices isCn = {X |X = AAT for some n× k nonnegative matrix A}.• Dual of Cn is the cone of n× n copositive matrices,C∗n = {X ∈ Sn | yTXy ≥ 0 ∀ y ∈ <+

n}.

CP and DNN matrices

Let Sn denote the set of n×n real symmetric matrices, S+n denote

the cone of n×n real symmetric positive semidefinite matrices andNn denote the cone of symmetric nonnegative n× n matrices.

• The cone of n×n doubly nonnegative (DNN) matrices is thenDn = S+

n ∩Nn.

• The cone of n× n completely positive (CP) matrices isCn = {X |X = AAT for some n× k nonnegative matrix A}.• Dual of Cn is the cone of n× n copositive matrices,C∗n = {X ∈ Sn | yTXy ≥ 0 ∀ y ∈ <+

n}.

Clear that

Cn ⊆ Dn, D∗n = S+n +Nn ⊆ C∗n,

and these inclusions are in fact strict for n > 4.

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using amatrix V ∈ C∗n having V •X < 0.

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using amatrix V ∈ C∗n having V •X < 0.

Why Bother?

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using amatrix V ∈ C∗n having V •X < 0.

Why Bother?

• [Bur09] shows that broad class of NP-hard problems can beposed as linear optimization problems over Cn.

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using amatrix V ∈ C∗n having V •X < 0.

Why Bother?

• [Bur09] shows that broad class of NP-hard problems can beposed as linear optimization problems over Cn.

• Dn is a tractable relaxation of Cn. Expect that solution ofrelaxed problem will be X ∈ Dn \ Cn.

Goal: Given a matrix X ∈ Dn \ Cn, separate X from Cn using amatrix V ∈ C∗n having V •X < 0.

Why Bother?

• [Bur09] shows that broad class of NP-hard problems can beposed as linear optimization problems over Cn.

• Dn is a tractable relaxation of Cn. Expect that solution ofrelaxed problem will be X ∈ Dn \ Cn.

• Note that least n where problem occurs is n = 5.

For X ∈ Sn let G(X) denote the undirected graph on vertices{1, . . . , n} with edges {{i 6= j} |Xij 6= 0}.

For X ∈ Sn let G(X) denote the undirected graph on vertices{1, . . . , n} with edges {{i 6= j} |Xij 6= 0}.Definition 1. Let G be an undirected graph on n vertices.Then G is called a CP graph if any matrix X ∈ Dn withG(X) = G also has X ∈ Cn.

For X ∈ Sn let G(X) denote the undirected graph on vertices{1, . . . , n} with edges {{i 6= j} |Xij 6= 0}.Definition 1. Let G be an undirected graph on n vertices.Then G is called a CP graph if any matrix X ∈ Dn withG(X) = G also has X ∈ Cn.

The main result on CP graphs is the following:

Proposition 1. [KB93] An undirected graph on n vertices isa CP graph if and only if it contains no odd cycle of length 5or greater.

In [BAD09] it is shown that:

In [BAD09] it is shown that:

• Extreme rays of D5 are either rank-one matrices in C5, or rank-three “extremely bad” matrices where G(X) is a 5-cycle (everyvertex in G(X) has degree two).

In [BAD09] it is shown that:

• Extreme rays of D5 are either rank-one matrices in C5, or rank-three “extremely bad” matrices where G(X) is a 5-cycle (everyvertex in G(X) has degree two).

• Any such extremely bad matrix can be separated from C5 by atransformation of the Horn matrix

H :=

1 −1 1 1 −1−1 1 −1 1 1

1 −1 1 −1 11 1 −1 1 −1−1 1 1 −1 1

∈ C∗5 \ D∗5 .

In [DA10] show that:

In [DA10] show that:

• Separation procedure based on transformed Horn matrix ap-plies to X ∈ D5 \ C5 where X has rank three and G(X) hasat least one vertex of degree 2.

In [DA10] show that:

• Separation procedure based on transformed Horn matrix ap-plies to X ∈ D5 \ C5 where X has rank three and G(X) hasat least one vertex of degree 2.

•More general separation procedure applies to any X ∈ D5 \ C5that is not componentwise strictly positive.

In [DA10] show that:

• Separation procedure based on transformed Horn matrix ap-plies to X ∈ D5 \ C5 where X has rank three and G(X) hasat least one vertex of degree 2.

•More general separation procedure applies to any X ∈ D5 \ C5that is not componentwise strictly positive.

An even more general separation procedure that applies to anyX ∈ D5 \ C5 is described in [BD10]. In this talk we will describethe procedure from [DA10] for X ∈ D5 \ C5, X 6> 0, and itsgeneralization to larger matrices having block structure.

A separation procedure for the 5× 5 case

Assume that X ∈ D5, X 6> 0. After a symmetric permutationand diagonal scaling, X may be assumed to have the form

X =

X11 α1 α2

αT1 1 0

αT2 0 1

, (1)

where X11 ∈ D3.

A separation procedure for the 5× 5 case

Assume that X ∈ D5, X 6> 0. After a symmetric permutationand diagonal scaling, X may be assumed to have the form

X =

X11 α1 α2

αT1 1 0

αT2 0 1

, (1)

where X11 ∈ D3.

Theorem 1. [BX04, Theorem 2.1] Let X ∈ D5 have the form(1). Then X ∈ C5 if and only if there are matrices A11 andA22 such that X11 = A11 + A22, and(

Aii αiαTi 1

)∈ D4, i = 1, 2.

A separation procedure for the 5× 5 case

Assume that X ∈ D5, X 6> 0. After a symmetric permutationand diagonal scaling, X may be assumed to have the form

X =

X11 α1 α2

αT1 1 0

αT2 0 1

, (1)

where X11 ∈ D3.

Theorem 1. [BX04, Theorem 2.1] Let X ∈ D5 have the form(1). Then X ∈ C5 if and only if there are matrices A11 andA22 such that X11 = A11 + A22, and(

Aii αiαTi 1

)∈ D4, i = 1, 2.

In [BX04], Theorem 1 is utilized only as a proof mechanism, butwe now show that it has algorithmic consequences as well.

Theorem 2. Assume that X ∈ D5 has the form (1). ThenX ∈ D5 \ C5 if and only if there is a matrix

V =

V11 β1 β2

βT1 γ1 0

βT2 0 γ2

such that

(V11 βiβTi γi

)∈ D∗4 , i = 1, 2,

and V •X < 0.

Theorem 2. Assume that X ∈ D5 has the form (1). ThenX ∈ D5 \ C5 if and only if there is a matrix

V =

V11 β1 β2

βT1 γ1 0

βT2 0 γ2

such that

(V11 βiβTi γi

)∈ D∗4 , i = 1, 2,

and V •X < 0.

• Suppose that X ∈ D5 \ C5, and V are as in Theorem 2. IfX ∈ C5 is another matrix of the form (1), then Theorem 1implies that V • X ≥ 0.

Theorem 2. Assume that X ∈ D5 has the form (1). ThenX ∈ D5 \ C5 if and only if there is a matrix

V =

V11 β1 β2

βT1 γ1 0

βT2 0 γ2

such that

(V11 βiβTi γi

)∈ D∗4 , i = 1, 2,

and V •X < 0.

• Suppose that X ∈ D5 \ C5, and V are as in Theorem 2. IfX ∈ C5 is another matrix of the form (1), then Theorem 1implies that V • X ≥ 0.

• However cannot conclude that V ∈ C∗5 because V •X ≥ 0 only

holds for X of the form (1), in particular, x45 = 0.

Theorem 2. Assume that X ∈ D5 has the form (1). ThenX ∈ D5 \ C5 if and only if there is a matrix

V =

V11 β1 β2

βT1 γ1 0

βT2 0 γ2

such that

(V11 βiβTi γi

)∈ D∗4 , i = 1, 2,

and V •X < 0.

• Suppose that X ∈ D5 \ C5, and V are as in Theorem 2. IfX ∈ C5 is another matrix of the form (1), then Theorem 1implies that V • X ≥ 0.

• However cannot conclude that V ∈ C∗5 because V •X ≥ 0 only

holds for X of the form (1), in particular, x45 = 0.

• Fortunately, by [HJR05, Theorem 1], V can easily be “com-pleted” to obtain a copositive matrix that still separates Xfrom C5.

Theorem 3. Suppose that X ∈ D5 \ C5 has the form (1), andV satisfies the conditions of Theorem 2. Define

V (s) =

V11 β1 β2

βT1 γ1 s

βT2 s γ2

.

Then V (s) •X < 0 for any s, and V (s) ∈ C∗5 for s ≥ √γ1γ2.

Separation for larger matrices with block structure

Procedure for 5 case where X 6> 0 can be generalized to largermatrices with block structure. Assume X has the form

X =

X11 X12 X13 . . . X1k

XT12 X22 0 . . . 0

XT13 0 . . . . . . ...... ... . . . . . . 0

XT1k 0 . . . 0 Xkk

, (2)

where k ≥ 3, each Xii is an ni × ni matrix, and∑ki=1 ni = n.

Lemma 1. Suppose that X ∈ Dn has the form (2), k ≥ 3,and let

Xi =

(X11 X1i

XT1i Xii

), i = 2, . . . , k.

Then X ∈ Cn if and only if there are matrices Aii, i = 2, . . . , ksuch that

∑ki=2Aii = X11, and(Aii X1i

XT1i Xii

)∈ Cn1+ni, i = 2, . . . , k.

Moreover, if G(Xi) is a CP graph for each i = 2, . . . , k,then the above statement remains true with Cn1+ni replacedby Dn1+ni.

Theorem 4. Suppose that X ∈ Dn \ Cn has the form (2),where G(Xi) is a CP graph, i = 2, . . . , k. Then there is amatrix V , also of the form (2), such that(

V11 V1i

V T1i Vii

)∈ D∗n1+ni, i = 2, . . . , k,

and V • X < 0. Moreover, if γi = diag(Diag(Vii).5), then the

matrix

V =

V11 . . . V1k... . . . ...

V T1k . . . Vkk

,

where Vij = γiγTj , 2 ≤ i 6= j ≤ k, has V ∈ C∗n and V • X =

V •X < 0.

Note that:

Note that:

•Matrix X may have numerical entries that are small but notexactly zero. Can then apply Lemma 1 to perturbed matrix Xwhere entries of X below a specified tolerance are set to zero. Ifa cut V separating X from Cn is found, then V •X ≈ V •X <0, and V is very likely to also separate X from Cn.

Note that:

•Matrix X may have numerical entries that are small but notexactly zero. Can then apply Lemma 1 to perturbed matrix Xwhere entries of X below a specified tolerance are set to zero. Ifa cut V separating X from Cn is found, then V •X ≈ V •X <0, and V is very likely to also separate X from Cn.

• Theorem 4 may provide a cut separating a given X ∈ Dn \ Cneven when the sufficient conditions for generating such a cutare not satisfied. In particular, a cut may be found even whenthe condition that Xi is a CP graph for each i is not satisfied.

A second case where block structure can be used to generate cutsfor a matrix X ∈ Dn \ Cn is when X has the form

X =

I X12 X13 . . . X1k

XT12 I X23 . . . X2k

XT13 XT

23. . . . . . ...

... ... . . . . . . X(k−1)k

XT1k X

T2k . . . XT

(k−1)kI

, (3)

where k ≥ 2, each Xij is an ni × nj matrix, and∑ki=1 ni = n.

The structure in (3) corresponds to a partitioning of the vertices{1, 2, . . . , n} into k stable sets in G(X), of size n1, . . . , nk (notethat ni = 1 is allowed).

Applications

Example 1: Consider the box-constrained QP problem

(QPB) max xTQx + cTx

s.t. 0 ≤ x ≤ e.

Applications

Example 1: Consider the box-constrained QP problem

(QPB) max xTQx + cTx

s.t. 0 ≤ x ≤ e.

To describe a CP-formulation of QPB, define matrices

Y =

(1 xT

x X

), Y + =

1 xT sT

x X Z

s ZT S

.

Applications

Example 1: Consider the box-constrained QP problem

(QPB) max xTQx + cTx

s.t. 0 ≤ x ≤ e.

To describe a CP-formulation of QPB, define matrices

Y =

(1 xT

x X

), Y + =

1 xT sT

x X Z

s ZT S

.

By the result of [Bur09], QPB can then be written in the form

(QPBCP) max Q •X + cTx

s.t. x + s = e, Diag(X + 2Z + S) = e,

Y + ∈ C2n+1.

Replacing C2n+1 with D2n+1 gives tractable DNN relaxation.

Replacing C2n+1 with D2n+1 gives tractable DNN relaxation.

• Can be shown that DNN relaxation is equivalent to “SDP+RLT”relaxation, and for n = 2 problem is equivalent to QPB [AB10].

Replacing C2n+1 with D2n+1 gives tractable DNN relaxation.

• Can be shown that DNN relaxation is equivalent to “SDP+RLT”relaxation, and for n = 2 problem is equivalent to QPB [AB10].

• Constraints from Boolean Quadric Polytope (BQP) are valid foroff-diagonal components of X [BL09]. For n = 3, BQP is com-pletely determined by triangle inequalities and RLT constraints.However, can still be a gap when using SDP+RLT+TRI relax-ation.

Replacing C2n+1 with D2n+1 gives tractable DNN relaxation.

• Can be shown that DNN relaxation is equivalent to “SDP+RLT”relaxation, and for n = 2 problem is equivalent to QPB [AB10].

• Constraints from Boolean Quadric Polytope (BQP) are valid foroff-diagonal components of X [BL09]. For n = 3, BQP is com-pletely determined by triangle inequalities and RLT constraints.However, can still be a gap when using SDP+RLT+TRI relax-ation.

• Example from [BL09] with n = 3 has optimal value for QPBof 1.0, value for SDP+RLT+TRI relaxation of 1.093. Solutionmatrix Y + has 5 × 5 principal submatrix that is not strictlypositive, and is not CP. Can obtain cut from Theorem 3, re-solve problem, and repeat.

1.00E‐07

1.00E‐06

1.00E‐05

1.00E‐04

1.00E‐03

1.00E‐02

1.00E‐01

1.00E+00

Gap

 to optim

al value

1.00E‐09

1.00E‐08

1.00E‐07

1.00E‐06

1.00E‐05

1.00E‐04

1.00E‐03

1.00E‐02

1.00E‐01

1.00E+00

0 5 10 15 20 25

Gap

 to optim

al value

Number of CP cuts added

Figure 1: Gap to optimal value for Burer-Letchford QPB problem (n = 3)

Example 2: Let A be the adjacency matrix of a graph G onn vertices, and let α be the maximum size of a stable set in G.Known [dKP02] that

Example 2: Let A be the adjacency matrix of a graph G onn vertices, and let α be the maximum size of a stable set in G.Known [dKP02] that

α−1 = min{

(I + A) •X : eeT •X = 1, X ∈ Cn}. (4)

Example 2: Let A be the adjacency matrix of a graph G onn vertices, and let α be the maximum size of a stable set in G.Known [dKP02] that

α−1 = min{

(I + A) •X : eeT •X = 1, X ∈ Cn}. (4)

Relaxing Cn to Dn results in the Lovasz-Schrijver bound

(ϑ′)−1 = min{

(I + A) •X : eeT •X = 1, X ∈ Dn}. (5)

Let G12 be the complement of the graph corresponding to thevertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈3.24.

Let G12 be the complement of the graph corresponding to thevertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈3.24.

• Using the cone K112 to approximate the dual of (4) provides no

improvement [BdK02].

Let G12 be the complement of the graph corresponding to thevertices of a regular icosahedron [BdK02]. Then α = 3 and ϑ′ ≈3.24.

• Using the cone K112 to approximate the dual of (4) provides no

improvement [BdK02].

• For the solution matrix X ∈ D12 from (5), cannot find cutbased on first block structure (2). However can find a cutbased on (3). Adding this cut and re-solving, gap to 1/α = 1

3is approximately 2× 10−8.

References

[AB10] Kurt M. Anstreicher and Samuel Burer, Computable representations for convex hulls of low-dimensional quadratic forms, Math. Prog. B 124 (2010), 33–43.

[BAD09] Samuel Burer, Kurt M. Anstreicher, and Mirjam Dur, The difference between 5 × 5 doublynonnegative and completely positive matrices, Linear Algebra Appl. 431 (2009), 1539–1552.

[BD10] Samuel Burer and Hongbo Dong, Separation and relaxation for cones of quadratic forms, Dept.of Management Sciences, University of Iowa (2010).

[BdK02] Immanuel M. Bomze and Etienne de Klerk, Solving standard quadratic optimization problems vialinear, semidefinite and copositive programming, J. Global Optim. 24 (2002), 163–185.

[BL09] Samuel Burer and Adam N. Letchford, On nonconvex quadratic programming with box constri-ants, SIAM J. Optim. 20 (2009), 1073–1089.

[Bur09] Samuel Burer, On the copositive representation of binary and continuous nonconvex quadraticprograms, Math. Prog. 120 (2009), 479–495.

[BX04] Abraham Berman and Changqing Xu, 5× 5 Completely positive matrices, Linear Algebra Appl.393 (2004), 55–71.

[DA10] Hongbo Dong and Kurt Anstreicher, Separating doubly nonnegative and completely positive ma-trices, Dept. of Management Sciences, University of Iowa (2010).

[dKP02] Etienne de Klerk and Dmitrii V. Pasechnik, Approximation of the stability number of a graph viacopositive programming, SIAM J. Optim. 12 (2002), 875–892.

[HJR05] Leslie Hogben, Charles R. Johnson, and Robert Reams, The copositive completion problem, LinearAlgebra Appl. 408 (2005), 207–211.

[KB93] Natalia Kogan and Abraham Berman, Characterization of completely positive graphs, DiscreteMath. 114 (1993), 297–304.