
Conjugate gradient algorithms for best rank-1 approximation of tensors

O. Curtef∗ 1, G. Dirr1, and U. Helmke1

1 University of Würzburg, Institute of Mathematics, 97074 Würzburg, Germany.

Motivated by considerations of pure state entanglement in quantum information, we consider the problem of finding the best rank-1 approximation to an arbitrary r-th order tensor. Reformulating the problem as an optimization problem on the Lie group SU(n1) ⊗ ... ⊗ SU(nr) of so-called local unitary transformations and exploiting its intrinsic geometry yields a new approach, which finally leads to a Riemannian variant of the conjugate gradient algorithm. Numerical simulations support that our method offers an alternative to the higher-order power method for computing the best rank-1 approximation to a tensor.

Keywords: Tensor SVD, matrix approximation, Riemannian optimization, conjugate gradient method, Lie groups.

1 Introduction

A classical result from linear algebra, the Eckart-Young theorem, asserts that the best rank-k approximation to a rectangular matrix is given by its truncated singular value decomposition (SVD). A similar, more general approximation result for tensors is needed in quantum information in order to compute entanglement measures [6]. However, extending the Eckart-Young theorem to higher order tensors is by no means trivial. First, one has to define an appropriate notion of the rank of a tensor. This is easily done by noting that any r-th order tensor X decomposes as a finite sum of elementary r-fold tensor products x1 ⊗ · · · ⊗ xr; the rank of X is then defined as the minimal number k of tensor product terms occurring in such a decomposition. Similarly, the definition of the SVD carries over to the tensor case; we will not need this concept of a tensor SVD here and refer instead to the literature [2]. Nevertheless, with these definitions of rank and SVD of a tensor in hand, the Eckart-Young theorem fails for higher order tensors.

Thus, in this paper, we consider the problem of developing numerical methods for finding the best rank-1 approximation to an arbitrary r-th order tensor. Our focus on the approximation by rank-1 tensors is motivated by questions from quantum computing, where pure state entanglement problems lead exactly to this problem. After reformulating the rank-1 approximation problem as an equivalent optimization problem on the Lie group SUloc(n1, ..., nr) := SU(n1) ⊗ ... ⊗ SU(nr) of so-called local unitary transformations, we introduce an intrinsic variant of the conjugate gradient algorithm on SUloc(n1, ..., nr). Our method offers an alternative approach to the higher-order power method (HOPM) [1]. Numerical simulations support this and show that each of these algorithms has its own benefits.

2 Best rank-1 tensor approximation

Higher-order tensors or multiway arrays are the main objects of multilinear algebra, being the higher-order equivalents of vectors (first order tensors) and matrices (second order tensors). A tensor X of order r ∈ N is an array of complex numbers (X_{i1...ir})_{i1≤n1, ..., ir≤nr}; let C^{n1×n2×...×nr} denote the vector space of all such tensors. By arranging the components X_{i1...ir} of a tensor in lexicographical order, we can identify the tensor space C^{n1×n2×...×nr} with the complex space C^N, N = n1 · · · nr. In this way, any tensor X can be identified with its vector representation vec(X) ∈ C^N. Any tensor X of order r ∈ N is a complex linear combination of finitely many tensors of rank one. Here a tensor X ∈ C^{n1×n2×...×nr} is called a rank-1 tensor if it is the outer product of vectors x^k ∈ C^{nk}, k = 1, ..., r, i.e. X = x^1 ⊗ x^2 ⊗ ... ⊗ x^r, with the tensor product defined by (x^1 ⊗ x^2 ⊗ ... ⊗ x^r)_{i1...ir} := x^1_{i1} · x^2_{i2} · · · x^r_{ir}. The Hermitian inner product of two tensors X, Y ∈ C^{n1×n2×...×nr} is given by ⟨X, Y⟩ := Σ_{i1,...,ir} X̄_{i1...ir} Y_{i1...ir}, with associated norm ‖X‖. The best rank-1 approximation problem then asks for a solution to the minimization problem

min_{c∈C, ‖y^k‖=1} ‖X − c y^1 ⊗ y^2 ⊗ ... ⊗ y^r‖².
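As a quick numerical illustration (a sketch in Python/NumPy rather than the paper's MATLAB; the tensor shape and all variable names are chosen for illustration only), note that for fixed unit vectors y^k the optimal scalar is c = ⟨Y, X⟩ with Y = y^1 ⊗ ... ⊗ y^r, giving the residual ‖X‖² − |⟨Y, X⟩|²:

```python
import numpy as np

# Illustrative order-3 complex tensor and unit vectors y^k.
rng = np.random.default_rng(1)
shape = (2, 3, 2)
X = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

ys = [rng.standard_normal(n) + 1j * rng.standard_normal(n) for n in shape]
ys = [y / np.linalg.norm(y) for y in ys]
Yr1 = np.einsum('a,b,c->abc', *ys)          # rank-1 tensor y1 (x) y2 (x) y3

# Hermitian inner product <A,B> = sum conj(A)*B; np.vdot conjugates its
# first argument, matching the convention in the text.
c = np.vdot(Yr1, X)                          # optimal c for these fixed y^k

# For fixed unit y^k, ||X - c*Y||^2 is minimized at c = <Y,X>,
# giving residual ||X||^2 - |<Y,X>|^2.
res = np.linalg.norm(X - c * Yr1) ** 2
assert np.isclose(res, np.linalg.norm(X) ** 2 - abs(c) ** 2)
```

Minimizing over the y^k as well then amounts to maximizing |⟨Y, X⟩| over unit rank-1 tensors, which is the reformulation used below.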

A particular case appears in quantum entanglement of spin-1/2 systems, where n1 = ... = nr = 2, [5]. For connections to the C-numerical range see also [4]. We find it convenient to reformulate this minimization problem as an equivalent maximization problem for a trace function on the Lie group SUloc(n1, ..., nr) := SU(n1) ⊗ ... ⊗ SU(nr) of local unitary transformations.

Theorem 2.1 Let X ∈ C^{n1×···×nr}, x := vec(X), be arbitrary and let Z = z^1 ⊗ · · · ⊗ z^r, z := vec(Z), be any rank-1 tensor with ‖z^k‖ = 1. Then

max_{rk Y = 1, ‖Y‖ = 1} Re ⟨X, Y⟩ = max_{U ∈ SUloc(n1,...,nr)} Re (x†Uz).

∗ Corresponding author: e-mail: [email protected], Phone: +00 931 888 5009, Fax: +00 931 888 4611

PAMM · Proc. Appl. Math. Mech. 7, 1062201–1062202 (2007) / DOI 10.1002/pamm.200700706

© 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim



Moreover, any such maximizing Y = y^1 ⊗ ... ⊗ y^r yields a best rank-1 approximation cY of X with c := Re ⟨X, Y⟩.

That such a maximizing Y yields a best rank-1 approximation of X is well-known in the literature, e.g. [1].
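For second-order tensors (r = 2) the statement reduces to the familiar matrix picture: the maximal value of Re ⟨X, Y⟩ over unit rank-1 tensors Y is the largest singular value of X, attained at y^1 = u_1 and y^2 the conjugate of the top right singular vector. A small NumPy check (illustrative, not from the paper):

```python
import numpy as np

# Random 3x3 complex matrix, viewed as a second-order tensor.
rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

U, s, Vh = np.linalg.svd(X)                 # X = U @ diag(s) @ Vh
u = U[:, 0]                                  # top left singular vector
v = Vh[0, :].conj()                          # top right singular vector

# Maximizer Y = y1 (x) y2 with y1 = u, y2 = conj(v); under lexicographic
# vec, vec(y1 (x) y2) = kron(y1, y2), and <X,Y> = vec(X)^H vec(Y).
score = np.vdot(X.reshape(-1), np.kron(u, v.conj()))
assert np.isclose(score.real, s[0])          # equals the largest singular value
assert abs(score.imag) < 1e-10
```

This is exactly the Eckart-Young case that the tensor statement generalizes.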

2.1 Riemannian Conjugate Gradient Method

In order to solve the best rank-1 approximation problem, we use the conjugate gradient algorithm (CG) on the Lie group SUloc(n1, ..., nr). Here we endow SUloc(n1, ..., nr) with its canonical bi-invariant Riemannian metric. Note that the Lie algebra of SUloc(n1, ..., nr) is

suloc(n1, ..., nr) = { Σ_{k=1}^{r} I_{n1} ⊗ I_{n2} ⊗ ... ⊗ Ω ⊗ ... ⊗ I_{nr} | Ω ∈ su(nk) at the k-th position },

where su(nk) denotes the Lie algebra of all skew-Hermitian traceless matrices.

Setting C := diag(1, 0, ..., 0) and A := xx†, we note that the maximization task in Theorem 2.1 is equivalent to maximizing the trace function f(U) = Re tr(C†UAU†) on SUloc(n1, ..., nr). The Riemannian gradient of f is given by grad f(U) = π([C†, UAU†]) U, where π : C^{N×N} → suloc(n1, ..., nr) denotes the orthogonal projection. The Riemannian exponential map exp_U and the parallel transport τ_{U, e^Ω U} along the geodesic through U in direction ΩU are given by exp_U(ΩU) = e^Ω U and τ_{U, e^Ω U}(ΞU) = e^{Ω/2} Ξ e^{Ω/2} U.
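For the smallest interesting case r = 2, n1 = n2 = 2 (so N = 4, the group SU(2) ⊗ SU(2)), the projection π and the gradient can be sketched as follows. This is a Python/NumPy sketch: the helper names proj_suloc and grad_direction, and the use of partial traces to realize the orthogonal projection, are our own illustrative choices, not the paper's implementation. A short geodesic step along the gradient should then increase f.

```python
import numpy as np

def proj_suloc(M, n1, n2):
    """Orthogonal projection of a skew-Hermitian M (n1*n2 x n1*n2) onto
    suloc(n1,n2) = {W (x) I + I (x) V : W in su(n1), V in su(n2)}."""
    Mt = M.reshape(n1, n2, n1, n2)
    T2 = np.einsum('ikjk->ij', Mt)                 # partial trace over factor 2
    T1 = np.einsum('aiaj->ij', Mt)                 # partial trace over factor 1
    W = (T2 - np.trace(M) / n1 * np.eye(n1)) / n2  # traceless components
    V = (T1 - np.trace(M) / n2 * np.eye(n2)) / n1
    return np.kron(W, np.eye(n2)) + np.kron(np.eye(n1), V)

def grad_direction(U, A, C, n1, n2):
    """Omega with grad f(U) = Omega @ U, for f(U) = Re tr(C^H U A U^H)."""
    M = U @ A @ U.conj().T
    K = C.conj().T @ M - M @ C.conj().T            # [C^H, U A U^H], skew-Hermitian
    return proj_suloc(K, n1, n2)

rng = np.random.default_rng(3)
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
A = np.outer(x, x.conj())                          # A = x x^H
C = np.diag([1.0, 0.0, 0.0, 0.0]).astype(complex)
f = lambda U: np.real(np.trace(C.conj().T @ U @ A @ U.conj().T))

U0 = np.eye(4, dtype=complex)
Om = grad_direction(U0, A, C, 2, 2)
assert np.allclose(Om, -Om.conj().T)               # skew-Hermitian
assert abs(np.trace(Om)) < 1e-12                   # in suloc (traceless parts)
assert np.linalg.norm(Om) > 1e-8                   # nonzero for generic x

# geodesic step U -> exp(t*Om) U, with expm computed via eigh of -i*Om
w, Vv = np.linalg.eigh(-1j * Om)
step = lambda t: (Vv * np.exp(1j * t * w)) @ Vv.conj().T @ U0
assert f(step(1e-3)) > f(U0)                       # ascent along the gradient
```

The partial-trace construction is one standard way to realize the orthogonal projection onto a sum of the factor Lie algebras; for r > 2 the same idea applies factor by factor.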

Riemannian Conjugate Gradient Algorithm (CG)
Step 1: Choose U_0 ∈ SUloc(n1, ..., nr) randomly, set X_0 := Y_0 := grad f(U_0), k = 0.
Step 2: Set U_{k+1} = exp_{U_k}(α_k Y_k) with α_k = (2 ‖[A, Y_k U_k†]‖ · ‖[U_k C U_k†, Y_k U_k†]‖)^{−1} Re tr(A [Y_k X_k†, U_k C U_k†]).
Step 3: Set X_{k+1} = grad f(U_{k+1}) and n := dim SUloc(n1, ..., nr). If (k mod n) = n − 1, then Y_{k+1} = X_{k+1}; else Y_{k+1} = X_{k+1} + γ_k τ_{U_k, U_{k+1}}(Y_k), with γ_k = ‖Y_k X_k‖^{−1} ‖(X_{k+1} − τ_{U_k, U_{k+1}}(X_k)) X_{k+1}‖. Set k = k + 1 and go to Step 2.

In numerical experiments we compared CG with HOPM [1] for the best rank-1 tensor approximation problem. Both methods were implemented in MATLAB. We considered two examples of (2 × 2 × 2 × 2) tensors, where one is of the form

X = √s (e1 ⊗ e1 ⊗ e2 ⊗ e2 + e2 ⊗ e2 ⊗ e1 ⊗ e1) + √(1 − s) (e2 ⊗ e1 ⊗ e2 ⊗ e1 + e1 ⊗ e2 ⊗ e1 ⊗ e2),

with e1 = (1 0)⊤, e2 = (0 1)⊤ and parameter s ∈ [0, 1]. The other one is given by

Y = 25.1 · e1 ⊗ e1 ⊗ e1 ⊗ e1 + 24.8 · e2 ⊗ e1 ⊗ e2 ⊗ e1 + 25.6 · e1 ⊗ e2 ⊗ e1 ⊗ e2 + 23 · e2 ⊗ e2 ⊗ e2 ⊗ e2.

For the particular choice of X an analytical solution is known [3]. While the initial value U_0 ∈ SUloc(2, 2, 2, 2) for the CG was chosen arbitrarily, the corresponding initial value for the HOPM was computed.
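For reference, the HOPM baseline is an alternating power iteration: each vector y^k is updated to the (normalized) contraction of the tensor with the remaining vectors. A minimal real-valued sketch (Python/NumPy, with random initialization; this is illustrative and does not reproduce the paper's MATLAB implementation or its computed initial value), applied to the tensor Y above:

```python
import numpy as np

def hopm(X, iters=300, seed=0):
    """Alternating higher-order power iteration for a real order-4 tensor:
    a sketch of HOPM with random initialization (not the paper's code)."""
    rng = np.random.default_rng(seed)
    ys = [rng.standard_normal(n) for n in X.shape]
    ys = [y / np.linalg.norm(y) for y in ys]
    for _ in range(iters):
        for k in range(4):
            # contract X with all vectors except the k-th, then normalize
            rest = [c for i, c in enumerate('abcd') if i != k]
            sub = 'abcd,' + ','.join(rest) + '->' + 'abcd'[k]
            y = np.einsum(sub, X, *[ys[i] for i in range(4) if i != k])
            ys[k] = y / np.linalg.norm(y)
    c = np.einsum('abcd,a,b,c,d->', X, *ys)
    return c, ys

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
o4 = lambda a, b, c, d: np.einsum('a,b,c,d->abcd', a, b, c, d)
Y = (25.1 * o4(e1, e1, e1, e1) + 24.8 * o4(e2, e1, e2, e1)
     + 25.6 * o4(e1, e2, e1, e2) + 23.0 * o4(e2, e2, e2, e2))

# The four terms are pairwise orthogonal rank-1 tensors, so the iteration
# converges to one of them; restarts make the 25.6-term likely to appear.
cs = [abs(hopm(Y, seed=s)[0]) for s in range(20)]
best = max(cs)
assert any(abs(best - v) < 1e-6 for v in (25.1, 24.8, 25.6, 23.0))
assert best > 24.0
```

Since individual random starts can converge to different local maximizers, the check deliberately asserts only that the iteration lands on one of the orthogonal components; with enough restarts the largest coefficient, 25.6, is the expected best value.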

[Figures 1–3: numerical results. Figure 1: 1 − max (local) versus s, with curves "CG and HOPM" and "Analytical results"; average CPU time CG = 0.226307, HOPM = 0.102072. Figures 2 and 3: log(maximum − function value) versus iterations for CG and HOPM.]

Figure 1 plots the function f : [0, 1] → R, f(s) = 1 − max_s, where max_s is the value of the best rank-1 approximation of the tensor X, for different values of s ∈ [0, 1]. The continuous line gives the analytical solution and the dotted line presents the numerical results. Both algorithms converged to the analytical solution. The last two figures show the speed of convergence of the two methods (Figure 2 for X with parameter s = 0.5 and Figure 3 for Y).

References

[1] L. De Lathauwer, B. De Moor, and J. Vandewalle. On the Best Rank-1 and Rank-(R1, R2, ..., RN) Approximation of Higher-Order Tensors. SIAM J. Matrix Anal. Appl., 21, 1324-1342 (2000).

[2] L. De Lathauwer, B. De Moor, and J. Vandewalle. A Multilinear Singular Value Decomposition. SIAM J. Matrix Anal. Appl., 21, 1253-1278 (2000).

[3] T. Wei and P. Goldbart. Geometric measure of entanglement and applications to bipartite and multipartite quantum states. Phys. Rev. A, 68, 042307 (2003).

[4] G. Dirr, U. Helmke, M. Kleinsteuber, and Th. Schulte-Herbrüggen. Relative C-numerical Ranges for Applications in Quantum Information. To appear in Linear and Multilinear Algebra.

[5] G. Dirr and U. Helmke. Lie Theory for Quantum Control. To appear in GAMM Mitteilungen (2008).

[6] M. Nielsen and I. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, Cambridge, 2000.


ICIAM07 Minisymposia – 06 Optimization 1062202