
Page 1

Low-Rank Tensor Techniques for High-Dimensional Problems

Daniel Kressner
CADMOS Chair for Numerical Algorithms and HPC
MATHICSE, EPFL

Page 2

Contents
• What is a tensor?
• Applications
• Matrices and low rank
• CP and Tucker
• Hierarchical Tucker
• Algorithms based on low-rank tensors
• Conclusions

Page 3

What is a tensor?
• Vectors, matrices, and tensors
• Basic calculus with tensors
• Vectorization and matricization
• µ-mode matrix products
• Two classes of tensor problems

Page 4

Vectors, matrices, and tensors

[Figure: a vector, a matrix, and a third-order tensor]

• scalar = tensor of order 0
• (column) vector = tensor of order 1
• matrix = tensor of order 2
• tensor of order 3 = n1·n2·n3 numbers arranged in an n1 × n2 × n3 array

Page 5

Tensors of arbitrary order
A d-th order tensor X of size n1 × n2 × ··· × nd is a d-dimensional array with entries

X_{i1,i2,...,id},  iµ ∈ {1, ..., nµ} for µ = 1, ..., d.

In the following, the entries of X are real (for simplicity):

X ∈ R^{n1×n2×···×nd}.

Multi-index notation:

I = {1, ..., n1} × {1, ..., n2} × ··· × {1, ..., nd}.

Then i ∈ I is a tuple of d indices:

i = (i1, i2, ..., id).

This allows us to write the entries of X as X_i for i ∈ I.

Page 6

Two important points
1. A matrix A ∈ R^{m×n} has a natural interpretation as a linear operator in terms of matrix-vector multiplications:

   A : R^n → R^m,  A : x ↦ A·x.

   There is no such (unique and natural) interpretation for tensors! ⇒ fundamental difficulty to define a meaningful general notion of eigenvalues and singular values of tensors.

2. The number of entries in a tensor grows exponentially with d ⇒ curse of dimensionality.

   Example: A tensor of order 30 with n1 = n2 = ··· = nd = 10 has 10^30 entries = 8 × 10^12 exabytes of storage!¹

   For d ≫ 1: Cannot afford to store the tensor explicitly (in terms of its entries).

¹Global data storage calculated at 295 exabytes, see http://www.bbc.co.uk/news/technology-12419672.

Page 7

Basic calculus
• Addition of two equal-sized tensors X, Y:

  Z = X + Y  ⇔  Z_i = X_i + Y_i  ∀ i ∈ I.

• Scalar multiplication with α ∈ R:

  Z = αX  ⇔  Z_i = αX_i  ∀ i ∈ I.

  ⇒ vector space structure.

• Inner product of two equal-sized tensors X, Y:

  〈X, Y〉 := Σ_{i∈I} X_i Y_i.

  Induced norm:

  ‖X‖ := ( Σ_{i∈I} X_i² )^{1/2}.

  For a 2nd order tensor (= matrix) this corresponds to the Frobenius norm.

Page 8

Vectorization
A tensor X of size n1 × n2 × ··· × nd has n1·n2···nd entries ⇒ many ways to stack the entries into a (loooong) column vector.
One possible choice:
The vectorization of X is denoted by vec(X), where

vec : R^{n1×n2×···×nd} → R^{n1·n2···nd}

stacks the entries of a tensor in reverse lexicographical order into a long column vector.

Remark: For d = 2, this is the usual way matrices are vectorized:

A = [a11 a12; a21 a22; a31 a32]  ⇒  vec(A) = (a11, a21, a31, a12, a22, a32)^T.

Page 9

Vectorization
Example: d = 3, n1 = 3, n2 = 2, n3 = 3 (first index runs fastest, as in the matrix case):

vec(X) = (x111, x211, x311, x121, x221, x321, x112, ..., x321, x322, x323... wait, ending with x321? No: ..., x123, x223, x323)^T,

i.e., vec(X) = (x111, x211, x311, x121, x221, x321, ..., x123, x223, x323)^T.
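A minimal MATLAB sketch (not part of the slides; sizes chosen for illustration) showing that vec coincides with a column-major reshape:

% vec(X) as a column-major reshape (MATLAB stores the first index fastest).
X = reshape(1:18, [3 2 3]);    % a 3 x 2 x 3 tensor with entries 1,...,18
v = reshape(X, [], 1);         % vec(X), a column vector of length 18
% For d = 2 this is the usual vectorization of a matrix:
A = [1 4; 2 5; 3 6];
isequal(reshape(A, [], 1), (1:6)')   % returns true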

Page 10

Matricization
• A matrix has two modes (column mode and row mode).
• A d-th order tensor X has d modes (µ = 1, µ = 2, ..., µ = d).

Let us fix all but one mode, e.g., µ = 1: Then

X(:, i2, i3, ..., id)   (abuse of MATLAB notation)

is a vector of length n1 for each choice of i2, ..., id.

View the tensor X as a bunch of column vectors:

Page 11

Matricization
Stack the vectors into an n1 × (n2···nd) matrix:

X ∈ R^{n1×n2×···×nd}  ⇒  X^(1) ∈ R^{n1×(n2·n3···nd)}

For µ = 1, ..., d, the µ-mode matricization of X is a matrix

X^(µ) ∈ R^{nµ×(n1···nµ−1·nµ+1···nd)}

with entries

(X^(µ))_{iµ, (i1,...,iµ−1,iµ+1,...,id)} = X_i  ∀ i ∈ I.

Page 12

Matricization
In MATLAB: a = rand(2,3,4,5);

• 1-mode matricization:
  reshape(a,2,3*4*5)
• 2-mode matricization:
  b = permute(a,[2 1 3 4]); reshape(b,3,2*4*5)
• 3-mode matricization:
  b = permute(a,[3 1 2 4]); reshape(b,4,2*3*5)
• 4-mode matricization:
  b = permute(a,[4 1 2 3]); reshape(b,5,2*3*4)

For a matrix A ∈ R^{n1×n2}:

A^(1) = A,  A^(2) = A^T.
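The four cases above generalize to the following small helper (a sketch, not part of any toolbox; save as matricize.m):

function Xmu = matricize(X, mu)
% mu-mode matricization X^(mu) via permute/reshape.
  sz = size(X);
  d  = ndims(X);
  Xmu = reshape(permute(X, [mu, 1:mu-1, mu+1:d]), sz(mu), []);
end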

Page 13

µ-mode matrix products
Consider the 1-mode matricization X^(1) ∈ R^{n1×(n2···nd)}:

It seems natural to multiply an m × n1 matrix A from the left:

Y^(1) := A X^(1) ∈ R^{m×(n2···nd)}.

We can rearrange Y^(1) back into an m × n2 × ··· × nd tensor Y.
This is called the 1-mode matrix product:

Y = A ∘1 X  ⇔  Y^(1) = A X^(1).

More formally (and more ugly):

Y_{i1,i2,...,id} = Σ_{k=1}^{n1} a_{i1,k} X_{k,i2,...,id}.

Page 14

µ-mode matrix products
General definition of the µ-mode matrix product with A ∈ R^{m×nµ}:

Y = A ∘µ X  ⇔  Y^(µ) = A X^(µ).

More formally (and more ugly):

Y_{i1,i2,...,id} = Σ_{k=1}^{nµ} a_{iµ,k} X_{i1,...,iµ−1,k,iµ+1,...,id}.

For matrices:
• 1-mode multiplication = multiplication from the left:
  Y = A ∘1 X = A X.
• 2-mode multiplication = transposed multiplication from the right:
  Y = A ∘2 X = X A^T.
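Using the matricize helper from above, the µ-mode matrix product can be sketched in a few lines of MATLAB (save as mmode_mult.m; all names are made up):

function Y = mmode_mult(A, X, mu)
% Apply A in mode mu of X, i.e., Y^(mu) = A * X^(mu).
  sz = size(X);
  d  = ndims(X);
  Ymu = A * matricize(X, mu);
  sz(mu) = size(A, 1);
  perm = [mu, 1:mu-1, mu+1:d];
  Y = ipermute(reshape(Ymu, sz(perm)), perm);
end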

Page 15

Kronecker product
For an m × n matrix A and a k × ℓ matrix B, the Kronecker product is defined as

B ⊗ A := [b11·A ··· b1ℓ·A; ...; bk1·A ··· bkℓ·A] ∈ R^{km×ℓn}.

Most important properties (for our purposes):
1. vec(A X) = (I ⊗ A) vec(X).
2. vec(X A^T) = (A ⊗ I) vec(X).
3. (B ⊗ A)(D ⊗ C) = (BD ⊗ AC).
4. I_m ⊗ I_n = I_{mn}.
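A quick numerical check of property 1 in MATLAB (a sketch with random data):

A = randn(3, 3); X = randn(3, 4);
norm(reshape(A*X, [], 1) - kron(eye(4), A) * reshape(X, [], 1))
% returns ~1e-15: vec(A*X) = (I (x) A) vec(X)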

Page 16

µ-mode matrix products and vectorization
By definition,

vec(X) = vec(X^(1)).

Consequently, also

vec(A ∘1 X) = vec(A X^(1)).

Vectorized version of the 1-mode matrix product:

vec(A ∘1 X) = (I_{n2···nd} ⊗ A) vec(X)
            = (I_{nd} ⊗ ··· ⊗ I_{n2} ⊗ A) vec(X).

Relation between µ-mode matrix product and matrix-vector product:

vec(A ∘µ X) = (I_{nd} ⊗ ··· ⊗ I_{nµ+1} ⊗ A ⊗ I_{nµ−1} ⊗ ··· ⊗ I_{n1}) vec(X).
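A check of this general relation, reusing mmode_mult from above (sizes chosen arbitrarily):

X = randn(3, 4, 5); A = randn(6, 4);            % apply A in mode mu = 2
lhs = reshape(mmode_mult(A, X, 2), [], 1);
rhs = kron(speye(5), kron(A, speye(3))) * reshape(X, [], 1);
norm(lhs - rhs)                                 % ~1e-15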

Page 17

Two classes of tensor problems
Class 1: function-related tensors
Consider a function u(ξ1, ..., ξd) ∈ R in d variables ξ1, ..., ξd.
A tensor U ∈ R^{n1×···×nd} represents a discretization of u:
• U contains function values of u evaluated on a grid; or
• U contains coefficients of a truncated expansion in tensorized basis functions:

  u(ξ1, ..., ξd) ≈ Σ_{i∈I} U_i φ_{i1}(ξ1) φ_{i2}(ξ2) ··· φ_{id}(ξd).

Typical setting:
• U only given implicitly, e.g., as the solution of a discretized PDE;
• approximations to U are sought with very low storage and tolerable accuracy;
• d may become very large.

The focus of this lecture is on function-related tensors!

Page 18

Discretization of a function in d variables
ξ1, ..., ξd ∈ [0, 1] ⇒ the number of function values grows exponentially with d.

Page 19

Separability helps
Ideal situation: the function f is separable:

f(ξ1, ξ2, ..., ξd) = f1(ξ1) f2(ξ2) ··· fd(ξd)

⇒ the discretized f is a Kronecker product of the discretized factors fj:
O(n^d) memory reduces to O(dn) memory.

Of course: Exact separability is rarely satisfied in practice.

Page 20

Two classes of tensor problems
Class 2: data-related tensors
A tensor U ∈ R^{n1×···×nd} contains multi-dimensional data.

Example 1: U_{2011,3,2} denotes the number of papers published in 2011 by author 3 in mathematical journal 2.

Example 2: A video of 1000 frames with resolution 640 × 480 can be viewed as a 640 × 480 × 1000 tensor.

Typical setting:
• entries of U given explicitly (at least partially);
• extraction of dominant features from U;
• usually moderate values of d.

Page 21

Summary
• A tensor X ∈ R^{n1×···×nd} is a d-dimensional array.
• There are various ways of reshaping the entries of a tensor X into a vector or matrix.
• µ-mode matrix multiplication can be expressed with Kronecker products.

Further reading:
• T. Kolda and B. W. Bader. Tensor decompositions and applications. SIAM Rev. 51 (2009), no. 3, 455–500.

Software:
• MATLAB offers basic functionality to work with d-dimensional arrays.
• MATLAB Tensor Toolbox: http://www.csmr.ca.sandia.gov/~tgkolda/TensorToolbox/

Page 22

Applications in scientific computing
• High-dimensional elliptic PDEs
• High-dimensional PDE-eigenvalue problems
• Quantum many-body problems
• Stochastic Automata Networks
• further applications

Page 23

High-dimensional elliptic PDEs: 3D model problem
• Consider

  −∆u = f in Ω,  u|_∂Ω = 0,

  on the unit cube Ω = [0,1]³.
• Discretize on a tensor grid. Uniform grid for simplicity:

  ξµ^(j) = jh,  h = 1/(n+1),  for µ = 1, 2, 3.

• Approximate solution tensor U ∈ R^{n×n×n}:

  U_{i1,i2,i3} ≈ u(ξ1^(i1), ξ2^(i2), ξ3^(i3)).

Page 24

High-dimensional elliptic PDEs: 3D model problem
• Discretization of the 1D Laplace operator:

  −∂xx ≈ tridiag(−1, 2, −1) =: A,

  the n × n tridiagonal matrix with 2 on the diagonal and −1 on the sub- and superdiagonals.
• Application in each coordinate direction:

  −∂²u/∂ξ1² ≈ A ∘1 U,  −∂²u/∂ξ2² ≈ A ∘2 U,  −∂²u/∂ξ3² ≈ A ∘3 U.

• Hence,

  −∆u ≈ A ∘1 U + A ∘2 U + A ∘3 U,

  or in vectorized form with u = vec(U):

  −∆u ≈ (I ⊗ I ⊗ A + I ⊗ A ⊗ I + A ⊗ I ⊗ I) u.

Page 25

High-dimensional elliptic PDEs: 3D model problem
Finite difference discretization of the model problem

−∆u = f in Ω,  u|_∂Ω = 0

for Ω = [0,1]³ takes the form

(I ⊗ I ⊗ A + I ⊗ A ⊗ I + A ⊗ I ⊗ I) u = f.

Similar structure for a finite element discretization with tensorized finite elements:

V ⊗ W ⊗ Z = { Σ_{ijk} α_{ijk} v_i(ξ1) w_j(ξ2) z_k(ξ3) : α_{ijk} ∈ R }

with

V = span{v1(ξ1), ..., vn(ξ1)},  W = span{w1(ξ2), ..., wn(ξ2)},  Z = span{z1(ξ3), ..., zn(ξ3)}.

Galerkin discretization:

(K_V ⊗ M_W ⊗ M_Z + M_V ⊗ K_W ⊗ M_Z + M_V ⊗ M_W ⊗ K_Z) u = f,

with 1D mass/stiffness matrices M_V, M_W, M_Z, K_V, K_W, K_Z.

Page 26

High-dimensional elliptic PDEs: Arbitrary dimensions
Finite difference discretization of the model problem

−∆u = f in Ω,  u|_∂Ω = 0

for Ω = [0,1]^d takes the form

( Σ_{j=1}^{d} I ⊗ ··· ⊗ I ⊗ A ⊗ I ⊗ ··· ⊗ I ) u = f.

To obtain such a Kronecker structure in general, one needs:
• a tensorized domain;
• a highly structured grid;
• coefficients that can be written/approximated as a sum of separable functions.

A minimal MATLAB sketch assembling this Kronecker sum is given below.
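The following sketch forms the matrix explicitly for small n and d only; in practice it must never be assembled for large d:

n = 10; d = 3;
e = ones(n, 1);
A = spdiags([-e 2*e -e], -1:1, n, n);   % 1D Laplace (scaling by 1/h^2 omitted)
L = sparse(n^d, n^d);
for j = 1:d
  L = L + kron(kron(speye(n^(d-j)), A), speye(n^(j-1)));
end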

Page 27

High-dimensional PDE-eigenvalue problems
PDE-eigenvalue problem:

∆u(ξ) + V(ξ) u(ξ) = λ u(ξ) in Ω = [0,1]^d,  u(ξ) = 0 on ∂Ω.

Assumption: The potential is represented as

V(ξ) = Σ_{j=1}^{s} V_j^(1)(ξ1) V_j^(2)(ξ2) ··· V_j^(d)(ξd).

⇒ finite difference discretization

A u = (A_L + A_V) u = λ u,

with

A_L = Σ_{j=1}^{d} I ⊗ ··· ⊗ I (d−j times) ⊗ A_L ⊗ I ⊗ ··· ⊗ I (j−1 times),

A_V = Σ_{j=1}^{s} A_{V,j}^(d) ⊗ ··· ⊗ A_{V,j}^(2) ⊗ A_{V,j}^(1).

Page 28

Quantum many-body problems
• spin-1/2 particles: proton, neutron, electron, and quark
• two states: spin-up, spin-down
• quantum state of each spin represented by a vector in C² (spinor)
• quantum state of a system of d spins represented by a vector in C^{2^d}
• quantum mechanical operators expressed in terms of Pauli matrices

  P_x = [0 1; 1 0],  P_y = [0 −i; i 0],  P_z = [1 0; 0 −1].

• spin Hamiltonian: sum of Kronecker products of Pauli matrices and identities; each term describes a physical (inter)action of spins
• interaction of spins described by a graph
• Goal: Compute the ground state of the spin Hamiltonian.

Page 29

Quantum many-body problems
Example: 1D chain of 5 spins with periodic boundary conditions:

1 — 2 — 3 — 4 — 5 (— 1)

Hamiltonian describing pairwise interaction between nearest neighbors:

H = P_z ⊗ P_z ⊗ I ⊗ I ⊗ I
  + I ⊗ P_z ⊗ P_z ⊗ I ⊗ I
  + I ⊗ I ⊗ P_z ⊗ P_z ⊗ I
  + I ⊗ I ⊗ I ⊗ P_z ⊗ P_z
  + P_z ⊗ I ⊗ I ⊗ I ⊗ P_z

Page 30

Quantum many-body problems
• Ising (ZZ) model for a 1D chain of d spins with open boundary conditions:

  H = Σ_{k=1}^{d−1} I ⊗ ··· ⊗ I ⊗ P_z ⊗ P_z ⊗ I ⊗ ··· ⊗ I
    + λ Σ_{k=1}^{d} I ⊗ ··· ⊗ I ⊗ P_x ⊗ I ⊗ ··· ⊗ I,

  where λ = ratio between the strength of the magnetic field and the pairwise interactions.
• 1D Heisenberg (XY) model
• Current research: 2D models.
• More details in:
  Huckle/Waldherr/Schulte-Herbrüggen: Computations in Quantum Tensor Networks.
  Schollwöck: The density-matrix renormalization group in the age of matrix product states.

Page 31

Stochastic Automata Networks (SANs)
• 3 stochastic automata A1, A2, A3 having 3 states each.
• Vector x_t^(i) ∈ R³ describes the probabilities of states (1), (2), (3) in A_i at time t.
• No coupling between automata ⇒ local transition x_t^(i) ↦ x_{t+1}^(i) described by a Markov chain:

  x_{t+1}^(i) = E_i x_t^(i),

  with a stochastic matrix E_i.
• Stationary distribution of A_i = Perron vector of E_i (eigenvector for eigenvalue 1).

Page 32

Stochastic Automata Networks (SANs)
• 3 stochastic automata A1, A2, A3 having 3 states each.
• Coupling between automata ⇒ local transition x_t^(i) ↦ x_{t+1}^(i) no longer described by a Markov chain on each automaton alone.
• Need to consider all possible combinations of states in (A1, A2, A3):

  (1,1,1), (1,1,2), (1,1,3), (1,2,1), (1,2,2), ....

• Vector x_t ∈ R^{3³} (or tensor X^(t) ∈ R^{3×3×3}) describes the probabilities of the combined states.

Page 33

Stochastic Automata Networks (SANs)
• Transition x_t ↦ x_{t+1} described by a Markov chain:

  x_{t+1} = E x_t,

  with a large stochastic matrix E.
• Oversimplified example:

  E = I ⊗ I ⊗ E1 + I ⊗ E2 ⊗ I + E3 ⊗ I ⊗ I   (local transitions)
    + I ⊗ E21 ⊗ E12   (interaction between A1, A2)
    + E32 ⊗ E23 ⊗ I   (interaction between A2, A3).

• Goal: Compute the stationary distribution = Perron vector of E (see the sketch below).
• More details in:
  Stewart: Introduction to the Numerical Solution of Markov Chains. Chapter 9.
  Buchholz: Product Form Approximations for Communicating Markov Processes.
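As a sketch, the Perron vector can be computed with eigs; the matrix E below is a random column-stochastic stand-in, not an actual SAN transition matrix:

E = rand(27); E = E ./ sum(E, 1);   % column-stochastic stand-in
[v, ~] = eigs(E, 1);                % eigenvector for the dominant eigenvalue 1
v = v / sum(v);                     % normalized stationary distribution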

Page 34

Further applications
Other applications in scientific computing featuring low-rank tensor concepts:
• Boltzmann equation [Ibragimov/Rjasanow'2009]
• Dynamical systems [Koch/Lubich'2009]
• Parabolic PDEs [Andreev/Tobler'2011], [Khoromskij'2009]
• Stochastic PDEs [Khoromskij/Schwab'2010], [Matthies/Zander'2011], [Kressner/Tobler'2011], [Ballani/Grasedyck/Kluge'2011], ...
• Electronic structure calculation [Chinnamsetty et al.'2007], [Flad et al.'2009], [Khoromskij/Khoromskaja'2009], [Limpanuparb/Gill'2009], [Benedikt et al.'2011], [Mohlenkamp'2011], ...
• Evaluation of boundary integrals (in BEM): [Grasedyck], [Khoromskij/Sauter/Veit'2011]
• ...

Page 35

Summary
• Large diversity of applications leading to linear systems / eigenvalue problems with Kronecker product structure.
• For many problems of practical interest: explicit storage / computation of the solution is infeasible.
• Increasing use of low-rank tensor techniques. Heaviest use currently: DMRG for quantum many-body problems.
• Remark: For PDE-related applications, high dimensionality can also be addressed during the discretization phase (sparse grids, adaptive sparse discretization, ...). This has advantages and disadvantages.

Page 36

Approximate low-rank matrices
• Singular value decomposition
• Separability and low rank
• Separability by polynomial interpolation
• Separability by exponential sums
• Low rank of snapshot matrices

Page 37

Low-rank approximation
Setting: Matrix X ∈ R^{n×m}, with m and n too large to compute/store X explicitly.
Idea: Replace X by R S^T with R ∈ R^{n×r}, S ∈ R^{m×r} and r ≪ m, n.

          X            R S^T
Memory    nm           nr + rm
Cost      ops(m,n)     ops(m,n) × r/min{m,n}  (?)

Best approximation error:

min{ ‖X − R S^T‖₂ : R ∈ R^{n×r}, S ∈ R^{m×r} } = σ_{r+1},

with the singular values σ1 ≥ σ2 ≥ ··· ≥ σ_{min{m,n}} of X.

Page 38

Construction from the singular value decomposition
SVD: Let X ∈ R^{n×m} and k = min{m,n}. Then there exist orthonormal matrices

U = [u1, u2, ..., uk] ∈ R^{n×k},  V = [v1, v2, ..., vk] ∈ R^{m×k},

such that

X = U Σ V^T,  Σ = diag(σ1, σ2, ..., σk).

Choose r ≤ k and partition

X = [U1, U2] [Σ1 0; 0 Σ2] [V1, V2]^T = U1 Σ1 V1^T + U2 Σ2 V2^T =: R S^T + U2 Σ2 V2^T,

with R := U1 Σ1 and S := V1. Then

‖X − R S^T‖₂ = ‖Σ2‖₂ = σ_{r+1}.

⇒ Good low-rank approximation if the singular values decay sufficiently fast.

Also: span(X) ≈ span(R), span(X^T) ≈ span(S).
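The truncation and the error identity are easy to check in MATLAB (a sketch with random data):

X = randn(100, 80); r = 10;
[U, S, V] = svd(X, 'econ');
R  = U(:, 1:r) * S(1:r, 1:r);      % R = U1 * Sigma1
St = V(:, 1:r);                    % S, so that R*St' is the rank-r approximation
norm(X - R*St') - S(r+1, r+1)      % ~0: the 2-norm error equals sigma_{r+1}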

Page 39

Discretization of a bivariate function
• Bivariate function f(x, y) : [xmin, xmax] × [ymin, ymax] → R.
• Function values on the tensor grid [x1, ..., xm] × [y1, ..., yn]:

  F = [f(x1,y1) ··· f(x1,yn); ...; f(xm,y1) ··· f(xm,yn)] ∈ R^{m×n}.

Basic but crucial observation: If f(x, y) = g(x) h(y), then

F = [g(x1)h(y1) ··· g(x1)h(yn); ...; g(xm)h(y1) ··· g(xm)h(yn)]
  = [g(x1); ...; g(xm)] · [h(y1) ··· h(yn)].

⇒ Separability implies rank 1.
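A three-line MATLAB illustration (the separable f is chosen arbitrarily):

x = linspace(0, 1, 50)'; y = linspace(0, 1, 60)';
F = exp(x) * sin(pi*y)';   % f(x,y) = e^x * sin(pi*y) is separable
rank(F)                    % returns 1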

Page 40

Separability and low rank
Approximation by a sum of separable functions:

f(x, y) = g1(x)h1(y) + ··· + gr(x)hr(y) + error,  with fr(x, y) := g1(x)h1(y) + ··· + gr(x)hr(y).

Define

Fr = [fr(x1,y1) ··· fr(x1,yn); ...; fr(xm,y1) ··· fr(xm,yn)].

Then Fr has rank ≤ r and ‖F − Fr‖_F ≤ √(mn) × error. Hence

σ_{r+1}(F) ≤ ‖F − Fr‖₂ ≤ ‖F − Fr‖_F ≤ √(mn) × error.

⇒ Semi-separable approximation implies low-rank approximation.

Page 41

Semi-separable approximation by polynomials
Solving the approximation problem

f(x, y) = g1(x)h1(y) + ··· + gr(x)hr(y) + error

is not trivial; gj, hj can be chosen arbitrarily!

General construction by polynomial interpolation:
1. Lagrange interpolation of f(x, y) in the y-coordinate:

   I_y[f](x, y) = Σ_{j=1}^{r} f(x, θj) L_j(y)

   with Lagrange polynomials L_j of degree r − 1 on [ymin, ymax].

2. Interpolation of I_y[f] in the x-coordinate:

   I_x[I_y[f]](x, y) = Σ_{i,j=1}^{r} f(ξi, θj) L_i(x) L_j(y) = Σ_{i=1}^{r} L̃_{i,x}(x) L̃_{i,y}(y),

   where the matrix [f(ξi, θj)]_{i,j} is "diagonalized" by an SVD.

Page 42

Semi-separable approximation by polynomials

error ≤ ‖f − I_x[I_y[f]]‖_∞
      = ‖f − I_x[f] + I_x[f] − I_x[I_y[f]]‖_∞
      ≤ ‖f − I_x[f]‖_∞ + ‖I_x‖_∞ ‖f − I_y[f]‖_∞,

with Lebesgue constant ‖I_x‖_∞ ∼ log r when using Chebyshev interpolation nodes.

⇒ This polynomial interpolation bound is typically much too pessimistic.

• Lebesgue constants hit hard in high dimensions: (log r)^{d−1}.
• Severe theoretical barriers for general smooth multivariate functions:
  E. Novak and H. Wozniakowski: Tractability of Multivariate Problems, Volume I and II. EMS.

Page 43

Semi-separable approximation of 1/(x + y)
Consider

f(x, y) = 1/(x + y),  x, y ∈ [α, β],  0 < α < β.

Apply numerical quadrature to the integral representation:

1/z = ∫₀^∞ e^{−tz} dt = Σ_{j=1}^{r} ωj e^{−γj z} + error.

Inserting z = x + y:

1/(x + y) = Σ_{j=1}^{r} ωj e^{−γj(x+y)} + error = Σ_{j=1}^{r} ωj e^{−γj x} e^{−γj y} + error.

Choosing the nodes γj > 0 and weights ωj > 0 as in [Stenger'93, Braess'86, Braess/Hackbusch'05] yields

error ≤ (8/|α|) exp( −r π² / log(8β/α) ).
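The implied singular value decay is easy to observe numerically (a sketch; the grid is arbitrary):

x = linspace(1, 10, 200); y = linspace(1, 10, 200);
F = 1 ./ (x' + y);        % samples of f(x,y) = 1/(x+y) on a tensor grid
semilogy(svd(F), '.')     % near-straight line: exponential decay, as the bound predicts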

Page 44

Semi-separable approximation by exponential sums
• Consider the more general case of a function f(x, y) := g(x + y).
• Approximation of g(z) with z := x + y by an exponential sum

  g(z) ≈ Σ_{j=1}^{r} ωj exp(γj z)   (1)

  for some coefficients γj, ωj ∈ R.
• (1) gives a semi-separable approximation for f:

  f(x, y) = g(x + y) ≈ Σ_{j=1}^{r} ωj exp(γj(x + y)) = Σ_{j=1}^{r} ωj exp(γj x) exp(γj y).

• Naturally extends to arbitrarily many variables.
• Problem: (1) is a nontrivial approximation problem [Braess'1986], [Hackbusch'2006], ...

Page 45

Low-rank approximation of snapshot matrices
Vector-valued function

x(α) : [αmin, αmax] → R^n.

Sampling at α1, ..., αm ∈ [αmin, αmax] gives the snapshot matrix

X = [x(α1), x(α2), ..., x(αm)].

Page 46

Example: Baking 1 cookie
Stationary heat equation with piecewise constant heat conductivity σ(x, α):

−∇·(σ(x, α)∇u) = f in Ω = [−1,1]²,  u = 0 on ∂Ω,

• σ(baking tray) = 1
• σ(cookie) = 1 + α
• Undetermined parameter α ∈ [αmin, αmax].

[Figure: FE mesh — # Vertices: 455, # Elements: 825, # Edges: 1279]

Standard FE discretization results in the linearly parameter-dependent linear system

(A0 + α A1) x(α) = b.

Page 47

Singular value decay – observation
• 1 cookie: n = 371, m = 101.

[Figure: log10 of the singular values of the snapshot matrix; rapid decay over many orders of magnitude]

• Foundation of Proper Orthogonal Decomposition and Reduced Basis Methods.

Page 48

Singular value decay – explanation
Polynomial approximation:

x(α) = x0 + α x1 + α² x2 + ··· + α^{k−1} x_{k−1} + error.

Approximation error:
• Assume b(·), A(·) analytic ⇒ x(·) analytic.
• Then

  error ≲ ρ^{−k},

  where ρ > 1 depends on the domain of analyticity of A, b.
  (Proof: Direct extension of a classical result for scalar-valued functions.)

Page 49

Singular value decay – explanation
Polynomial approximation:

x(α) = x0 + α x1 + α² x2 + ··· + α^{k−1} x_{k−1} + error.

Snapshot matrix:

X = [x(α1), x(α2), ..., x(αm)]
  = [x0, x1, ..., x_{k−1}] · [1 1 ··· 1; α1 α2 ··· αm; ...; α1^{k−1} α2^{k−1} ··· αm^{k−1}] + error
  = matrix of rank k + error

⇒ σ_{k+1}(X) ≤ error ≲ ρ^{−k}.

Remark: Trivially extends to the piecewise analytic case.

Page 50

Singular value decay – piecewise analytic case
Example: Consider the smallest singular value σ(z) and the corresponding right singular vector v(z) of B(z) = A − izI for z ∈ [−1, 1].

• σ(z) is only Lipschitz continuous, but piecewise analytic.
• v(z) is discontinuous, but piecewise analytic.
• A = 2 × 2 block diagonal randn, n = 400.
• Snapshot matrix of singular vectors:

  X = [v(z1), v(z2), ..., v(z100)]

  for equidistant samples zj ∈ [−1, 1].

[Figures: σ(z) over z ∈ [−1, 1]; singular values of X, again showing rapid decay]

Page 51

Summary

Need strong singular value decay for good low-rank approximations.

For function-related matrices/tensors: strong link to semi-separable approximations.

Smoothness seems to be important... at least somehow.
• Fortunately, smoothness is not necessary. Piecewise smoothness can be enough.
• Unfortunately, smoothness is not sufficient for higher-order tensors.
• Need to impose stronger regularity as the dimension/order d increases, based, e.g., on mixed weak derivatives [Yserentant: Regularity and approximability of electronic wave functions. 2010].

Page 52

Low-rank tensors: CP and Tucker
• CP
• Tucker
• Higher-order SVD
• Tensor networks

Page 53

CP decomposition
• Aim: Generalize the concept of low rank from matrices to tensors.
• One possibility, motivated by

  X = [a1, a2, ..., aR][b1, b2, ..., bR]^T = a1 b1^T + a2 b2^T + ··· + aR bR^T,

  and its vectorization

  vec(X) = b1 ⊗ a1 + b2 ⊗ a2 + ··· + bR ⊗ aR.

The Canonical Polyadic (CP) decomposition of a tensor X ∈ R^{n1×n2×n3} is defined via

vec(X) = c1 ⊗ b1 ⊗ a1 + c2 ⊗ b2 ⊗ a2 + ··· + cR ⊗ bR ⊗ aR

for vectors aj ∈ R^{n1}, bj ∈ R^{n2}, cj ∈ R^{n3}.

CP directly corresponds to semi-separable approximation.
Tensor rank of X = minimal possible R. (A small MATLAB sketch follows below.)
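A sketch forming a rank-R CP tensor from random factor vectors (all names made up; small sizes only, since the full array is formed):

n = [4 5 6]; R = 3;
Afac = randn(n(1), R); Bfac = randn(n(2), R); Cfac = randn(n(3), R);
vecX = zeros(prod(n), 1);
for j = 1:R
  vecX = vecX + kron(Cfac(:, j), kron(Bfac(:, j), Afac(:, j)));
end
X = reshape(vecX, n);   % the CP tensor as a full array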

Page 54

CP decomposition
Illustration of the CP decomposition

vec(X) = c1 ⊗ b1 ⊗ a1 + c2 ⊗ b2 ⊗ a2 + ··· + cR ⊗ bR ⊗ aR.

[Figure: X as a sum of R rank-1 tensors with factor vectors a_r, b_r, c_r]

Page 55

CP decomposition
• The CP decomposition offers low data-complexity; for constant R: linear complexity in d.
• For matrices:
  • rank ≤ r is upper semi-continuous ⇒ closedness property: a sequence of rank-r matrices can only converge to a matrix of rank ≤ r.
  • best low-rank approximation possible by successive rank-1 approximations.
  • Robust black-box algorithms/software available (svd, Lanczos).

For tensors of order d ≥ 3:
• tensor rank R is not upper semi-continuous ⇒ lack of closedness
• successive rank-1 approximations fail
• all algorithms based on optimization techniques (ALS, Gauss-Newton)

[Picture taken from [Kolda/Bader'2009].]

Page 56

Tucker decomposition
• Aim: Generalize the concept of low rank from matrices to tensors.
• Alternative possibility, motivated by

  A = U · Σ · V^T,  U ∈ R^{n1×r}, V ∈ R^{n2×r}, Σ ∈ R^{r×r},

  and its vectorization

  vec(X) = (V ⊗ U) · vec(Σ).

  Ignore the diagonal structure of Σ and call it C.

The Tucker decomposition of a tensor X ∈ R^{n1×n2×n3} is defined via

vec(X) = (W ⊗ V ⊗ U) · vec(C)

with U ∈ R^{n1×r1}, V ∈ R^{n2×r2}, W ∈ R^{n3×r3}, and core tensor C ∈ R^{r1×r2×r3}.

In terms of µ-mode matrix products (see the sketch below):

X = U ∘1 V ∘2 W ∘3 C =: (U, V, W) ∘ C.
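A sketch forming a Tucker tensor from random factors via the vectorized formula (small sizes only):

n = [10 12 14]; r = [3 4 5];
U = randn(n(1), r(1)); V = randn(n(2), r(2)); W = randn(n(3), r(3));
C = randn(r);                                        % random core tensor
X = reshape(kron(W, kron(V, U)) * reshape(C, [], 1), n);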

Page 57

Tucker decomposition
Illustration of the Tucker decomposition X = (U, V, W) ∘ C:

[Figure: core tensor C multiplied by the basis matrices U, V, W in modes 1, 2, 3]

Page 58

Tucker decomposition
Consider all three matricizations:

X^(1) = U · C^(1) · (W ⊗ V)^T,
X^(2) = V · C^(2) · (W ⊗ U)^T,
X^(3) = W · C^(3) · (V ⊗ U)^T.

These are low-rank decompositions:

rank(X^(1)) ≤ r1,  rank(X^(2)) ≤ r2,  rank(X^(3)) ≤ r3.

The multilinear rank of a tensor X ∈ R^{n1×n2×n3} is the tuple

(r1, r2, r3), with rµ = rank(X^(µ)).

Page 59

Higher-order SVD (HOSVD)
Goal: Approximate a given tensor X by a Tucker decomposition with prescribed multilinear rank (r1, r2, r3).

1. Calculate the SVDs of the matricizations:

   X^(µ) = Uµ Σµ Vµ^T for µ = 1, 2, 3.

2. Truncate the basis matrices:

   Ũµ := Uµ(:, 1:rµ) for µ = 1, 2, 3.

3. Form the core tensor:

   vec(C) := (Ũ3^T ⊗ Ũ2^T ⊗ Ũ1^T) · vec(X).

Truncated tensor produced by the HOSVD [Lathauwer/De Moor/Vandewalle'2000]:

vec(X̃) := (Ũ3 ⊗ Ũ2 ⊗ Ũ1) · vec(C).

Remark: This is an orthogonal projection, X̃ := (π1 π2 π3) X with πµ X := Ũµ Ũµ^T ∘µ X. (A MATLAB sketch is given below.)
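A compact HOSVD sketch for a third-order tensor, reusing the matricize and mmode_mult helpers from earlier (save as hosvd.m; the truncation ranks r(1:3) are prescribed):

function [C, U] = hosvd(X, r)
% Tucker approximation of X with multilinear rank r via the HOSVD.
  U = cell(1, 3);
  C = X;
  for mu = 1:3
    [Umu, ~, ~] = svd(matricize(X, mu), 'econ');
    U{mu} = Umu(:, 1:r(mu));
    C = mmode_mult(U{mu}', C, mu);   % project mode mu onto the truncated basis
  end
end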

Page 60

Higher-order SVD (HOSVD)
The tensor X̃ resulting from the HOSVD satisfies the quasi-optimality condition

‖X − X̃‖ ≤ √d ‖X − X_best‖,

where X_best is the best approximation of X with multilinear ranks (r1, ..., rd).

Proof:

‖X − X̃‖² = ‖X − (π1 π2 π3) X‖²
          = ‖X − π1 X‖² + ‖π1 X − (π1 π2) X‖² + ‖(π1 π2) X − (π1 π2 π3) X‖²
          ≤ ‖X − π1 X‖² + ‖X − π2 X‖² + ‖X − π3 X‖².

Using

‖X − πµ X‖ ≤ ‖X − X_best‖ for µ = 1, 2, 3

leads to

‖X − X̃‖² ≤ 3 · ‖X − X_best‖².

Best approximation: See [Kolda/Bader'09].

Page 61

Tucker decomposition – Summary
For general tensors:
• multilinear rank is upper semi-continuous ⇒ closedness property.
• HOSVD – simple and robust algorithm to obtain a quasi-optimal low-rank approximation.
• quasi-optimality good enough for most applications in scientific computing.
• robust black-box algorithms/software available (e.g., Tensor Toolbox).

Drawback: storage of the core tensor ∼ r^d ⇒ curse of dimensionality.

Page 62

Tensor network diagrams
Tensor network = undirected graph where:
• each node is a tensor;
• each outgoing (free) edge is a mode;
• each connecting edge represents a contraction; example:

  Z_{i1,i2,i3,i4} = Σ_{j=1}^{r} X_{i1,i2,j} Y_{j,i3,i4}.

• number of free edges = order of the tensor represented by the entire network.

Researchers on quantum many-body problems think² in terms of tensor networks!

²and dream

Page 63

Tensor network diagrams
Examples:

[Figure: five tensor network diagrams]
(i) vector;
(ii) matrix;
(iii) matrix-matrix multiplication;
(iv) Tucker decomposition;
(v) hierarchical Tucker decomposition.

Page 64

Low-rank tensors: Hierarchical Tucker
• Introduction to the Hierarchical Tucker Decomposition (HTD)
• MATLAB toolbox htucker
• Basic operations: µ-mode matrix multiplication, addition, ...
• Advanced operations: inner product, elementwise multiplication, ...

Page 65

Introduction
• CP offers low data complexity but difficult truncation;
• Tucker offers simple truncation but high data complexity.

Recently developed formats:
• Matrix Product States (MPS),
• TT decomposition,
• Hierarchical Tucker decomposition (HTD).

These aim to offer a compromise between CP and Tucker.

Focus in this lecture: HTD.
• L. Grasedyck. Hierarchical singular value decomposition of tensors. SIAM J. Matrix Anal. Appl., 31(4):2029–2054, 2010.
• W. Hackbusch and S. Kühn. A new scheme for the tensor representation. J. Fourier Anal. Appl., 15(5):706–722, 2009.
• D. Kressner and C. Tobler. htucker – A MATLAB toolbox for the hierarchical Tucker decomposition. In preparation. See http://www.math.ethz.ch/~ctobler.

Page 66

More general matricizations
Recall: µ-mode matricization of a tensor X:

X^(µ) ∈ R^{nµ×(n1···nµ−1·nµ+1···nd)},  µ = 1, ..., d.

It is getting ugly...

General matricization for a mode decomposition {1, ..., d} = t ∪ s:

X^(t) ∈ R^{(n_{t1}···n_{tk}) × (n_{s1}···n_{s_{d−k}})}

with

(X^(t))_{(i_{t1},...,i_{tk}), (i_{s1},...,i_{s_{d−k}})} := X_{i1,...,id}.

[Figure: X and its matricizations X^(1) and X^({1,2})]

Page 67

Hierarchical construction
Singular value decomposition: X^(t) = U_t Σ_t U_s^T.

Column spaces are nested:

t = t1 ∪ t2  ⇒  span(U_t) ⊂ span(U_{t2} ⊗ U_{t1})
             ⇒  ∃ B_t : U_t = (U_{t2} ⊗ U_{t1}) B_t.

Size of U_t: U_t ∈ R^{n_{t1}···n_{tk} × r_t} with r_t = rank(X^(t)).

For d = 4:

U_{12} = (U_2 ⊗ U_1) B_{12},
U_{34} = (U_4 ⊗ U_3) B_{34},
vec(X) = X^({1,2,3,4}) = (U_{34} ⊗ U_{12}) B_{1234}

⇒ vec(X) = (U_4 ⊗ U_3 ⊗ U_2 ⊗ U_1)(B_{34} ⊗ B_{12}) B_{1234}.

Page 68

Dimension tree
Tree structure for d = 4, with node sizes:

B_{1234} (r12·r34 × 1)
├─ B_{12} (r1·r2 × r12)
│  ├─ U_1 (n1 × r1)
│  └─ U_2 (n2 × r2)
└─ B_{34} (r3·r4 × r34)
   ├─ U_3 (n3 × r3)
   └─ U_4 (n4 × r4)

Reshape the transfer matrices into transfer tensors:

B_{12} ∈ R^{r1·r2 × r12}   ⇒  B_{12} ∈ R^{r1×r2×r12},
B_{34} ∈ R^{r3·r4 × r34}   ⇒  B_{34} ∈ R^{r3×r4×r34},
B_{1234} ∈ R^{r12·r34 × 1} ⇒  B_{1234} ∈ R^{r12×r34}.

Page 69

Dimension tree

[Figure: dimension tree with leaf matrices U_1, U_2, U_3, U_4 and transfer tensors B_{12}, B_{34}, B_{1234}]

• Often, U_1, U_2, U_3, U_4 are orthonormal. This is advantageous but not required.
• Storage requirements for general d:

  O(dnr) + O(dr³),

  where r = max{r_t}, n = max{nµ}.

Page 70

Constructors for the MATLAB class htensor

x = htensor([4 5 6 7]) constructs a zero htensor of size 4 × 5 × 6 × 7, with a balanced dimension tree.

x = htensor([4 5 6 7], 'TT') constructs a zero htensor of size 4 × 5 × 6 × 7, with a TT-style dimension tree.

x = htensor(U1, U2, U3) constructs an htensor from a tensor in CP decomposition, X(i1,i2,i3) = Σ_j U1(i1,j) U2(i2,j) U3(i3,j).

x = htenrandn([4 5 6 7]) constructs an htensor of size 4 × 5 × 6 × 7, with random ranks and random entries.

x = htenones([4 5 6 7]) constructs an htensor of size 4 × 5 × 6 × 7, with all entries one.

...

Page 71

Basic functionality for the MATLAB class htensor
Example: x is an htensor of order 4.

x(1, 3, 4, 2) returns an entry of X.
x(1, 3, :, :) returns a slice of X as an htensor.
full(x) returns the full tensor represented by X (use with care).

disp_tree(htenrandn([5 4 6 3])) returns the tree structure/ranks:

ans is an htensor of size 5 x 4 x 6 x 3
1-4  1; 6 3 1
1-2  2; 3 4 6
1    4; 5 3
2    5; 4 4
3-4  3; 3 3 3
3    6; 6 3
4    7; 3 3

spy(x) displays spy plots of U_t, B_t on the dimension tree.
change_root(x, i) switches the root node.

Page 72

Singular value tree
plot_sv(x) plots the singular values of the corresponding matricizations in the dimension tree of a tensor X.

Example: Singular value tree of the solution to an elliptic PDE with 4 parameters.

[Figure: singular value plots at the tree nodes Dim. 1,2 / Dim. 3,4,5 / Dim. 1 / Dim. 2 / Dim. 3 / Dim. 4,5 / Dim. 4 / Dim. 5]

Remark: The singular values are computed from Gramians.

Page 73

Basic ops: µ-mode matrix multiplication
Application of a matrix A ∈ R^{m×nµ} to mode µ of X ∈ R^{n1×···×nd}:

Y = A ∘µ X  ⇔  Y^(µ) = A X^(µ).

Nearly trivial if X is in H-Tucker format:

A ∘µ X = A ∘µ ((U1, ..., Ud) ∘ C) = (U1, ..., Uµ−1, A Uµ, Uµ+1, ..., Ud) ∘ C.

• Almost no operations required.
• Ranks stay the same.
• Orthogonality destroyed.

ttm(x, A, 2) applies the matrix A to the htensor X in mode 2.
y = ttm(x, A, B, C, [2, 3, 4]) applies A, B, C in modes 2, 3, 4.
y = ttm(x, @(x)(fft(x)), 2) applies the FFT in mode 2.
y = ttm(x, A, B, C, [2, 3, 4], 'h') successively applies the matrices A^T, B^T, C^T in modes 2, 3, 4.

Page 74

Addition of low-rank matrices
Addition of two matrices in low-rank format:

A = U1 ΣA U2^T,  B = V1 ΣB V2^T

⇒ A + B = [U1 V1] [ΣA 0; 0 ΣB] [U2 V2]^T.

• No operations required.
• Rank increases.
• Orthogonality destroyed.

Page 75

Addition of low-rank tensors
Addition of four tensors X1, X2, X3, X4 in H-Tucker format:

X1 + X2 + X3 + X4.

Proceed as in the matrix case by embedding the factors in larger matrices.
• No operations required.
• H-Tucker rank increases.
• Orthogonality destroyed.

Command in htucker: x1 + x2 + x3 + x4

Page 76

[Figure: after the addition X1 + X2 + X3 + X4, each leaf matrix is the concatenation [Uµ^[1], Uµ^[2], Uµ^[3], Uµ^[4]] and the transfer tensors B_{12}, B_{34}, B_{1234} have block-diagonal structure]

Page 77

Orthogonalization
Any tensor X in H-Tucker format can be orthogonalized in the sense that all factors in the dimension tree, except for the root node, contain orthonormal columns.

Example: vec(X) = (U4 ⊗ U3 ⊗ U2 ⊗ U1)(B34 ⊗ B12) B1234.

Step 1: QR decompositions U_t = Q_t R_t ⇒

vec(X) = (Q4 ⊗ Q3 ⊗ Q2 ⊗ Q1)(B̃34 ⊗ B̃12) B1234

with B̃34 := (R4 ⊗ R3) B34, B̃12 := (R2 ⊗ R1) B12.

Step 2: QR decompositions B̃34 = Q34 R34, B̃12 = Q12 R12 ⇒

vec(X) = (Q4 ⊗ Q3 ⊗ Q2 ⊗ Q1)(Q34 ⊗ Q12) B̃1234

with B̃1234 := (R34 ⊗ R12) B1234.

Computational requirements for general d: O(dnr²) + O(dr⁴).

Command in htucker: x = orthog(x)

Page 78

Norms and inner products
Inner product of two tensors X, Y ∈ R^{n1×···×nd}:

〈X, Y〉 = 〈vec(X), vec(Y)〉 = Σ_{i1=1}^{n1} ··· Σ_{id=1}^{nd} X_{i1,...,id} Y_{i1,...,id}.

Can be performed efficiently in H-Tucker, provided that X, Y have compatible dimension trees.

Example: Two tensors of order 4:

〈X, Y〉 = (B^x_{1234})^T (B^x_{34} ⊗ B^x_{12})^T (U^x_4 ⊗ U^x_3 ⊗ U^x_2 ⊗ U^x_1)^T (U^y_4 ⊗ U^y_3 ⊗ U^y_2 ⊗ U^y_1)(B^y_{34} ⊗ B^y_{12}) B^y_{1234}.

Norm: After X has been orthogonalized,

‖X‖ = √〈X, X〉 = ‖B_{12···d}‖_F.

Possibly the most accurate way to compute the norm. Used in norm(x).

Page 79

Computation of inner products

〈X, Y〉 = Σ_{i1=1}^{n1} ··· Σ_{id=1}^{nd} X_{i1,...,id} Y_{i1,...,id}.

[Figures (Pages 80–83): the contraction proceeds step by step from the leaves to the root of the two dimension trees]

Page 84

Computation of inner products – contraction step
At an inner node t with sons t1, t2:

(U^x_t)^T U^y_t = (B^x_t)^T ((U^x_{t2})^T U^y_{t2} ⊗ (U^x_{t1})^T U^y_{t1}) B^y_t.

• htucker command: innerprod(x,y) (usage sketch below)
• Overall cost: O(dnr²) + O(dr⁴).
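Usage sketch with the htucker toolbox (assuming it is on the MATLAB path; sizes are arbitrary):

x = htenrandn([8 8 8 8]); y = htenrandn([8 8 8 8]);
ip  = innerprod(x, y);   % inner product via tree contractions
nrm = norm(x);           % norm; internally orthogonalizes first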

Page 85

Reduced Gramians in H-Tucker

[Figure: matricization X^(t) = U_t V_t^T at a node t]

X^(t) = U_t V_t^T  ⇒  X^(t) (X^(t))^T = U_t V_t^T V_t U_t^T = U_t G_t U_t^T,  with G_t := V_t^T V_t.

If U_t is orthonormal ⇒ sv(X^(t)) = √eig(G_t) (used in plot_sv).

Reduced Gramians in H-Tucker

[Figures (Pages 86–90): recursive computation of the reduced Gramians G_t, proceeding from the root to the leaves of the dimension tree]

Page 91

Reduced Gramians in H-Tucker

Implemented in the htucker command gramians(x).

Page 92

Advanced operations
• Truncation
• Combined addition + truncation
• Elementwise multiplication
• Elementwise reciprocal

Page 93

Truncation of an explicit tensor
Let X ∈ R^{n1×n2×···×nd} be explicitly given.

• For each tree node t, let W_t contain the r_t dominant left singular vectors of X^(t) and define the projection

  πt X = W_t W_t^T ∘t X  ⇔  πt X^(t) = W_t W_t^T X^(t).

• Truncated tensor:

  X̃ := ( Π_{t∈T_L} πt ) ··· ( Π_{t∈T_1} πt ) X,

  where T_ℓ contains all nodes on level ℓ.

• [Grasedyck'2010]: ‖X − X̃‖ ≤ √(2d − 3) ‖X − X_best‖.
  Proof similar to that for the HOSVD.

Page 94

Truncation of an explicit tensor
Example:

vec(X̃) = (W4 W4^T ⊗ W3 W3^T ⊗ W2 W2^T ⊗ W1 W1^T)(W34 W34^T ⊗ W12 W12^T) vec(X)
        = (W4 ⊗ W3 ⊗ W2 ⊗ W1) ( [W4^T ⊗ W3^T] W34 ⊗ [W2^T ⊗ W1^T] W12 ) ( [W34^T ⊗ W12^T] vec(X) ),

with B34 := [W4^T ⊗ W3^T] W34,  B12 := [W2^T ⊗ W1^T] W12,  B1234 := [W34^T ⊗ W12^T] vec(X).

opts.max_rank = 10 sets the maximal rank at truncation.
opts.rel_eps = 1e-6 sets the maximal relative truncation error.
opts.abs_eps = 1e-6 sets the maximal absolute truncation error.
The condition max_rank takes precedence over rel_eps and abs_eps.
xt = htensor.truncate_rtl(x, opts) returns the truncated tensor X̃ of a multidimensional array.

Remark: There is also a significantly faster htensor.truncate_ltr (which proceeds successively from the leaves to the root), for which the same error bound holds [Tobler'10].

Page 95

Truncation of an H-Tucker tensor
Let X ∈ R^{n1×n2×···×nd} be in H-Tucker format and orthogonalized.

• Compute the left singular vectors of X^(t) = U_t V_t^T from the eigenvectors of

  X^(t) (X^(t))^T = U_t G_t U_t^T,  G_t = V_t^T V_t,

  with the reduced Gramian G_t. If S_t contains the r_t dominant eigenvectors of G_t ⇒ W_t = U_t S_t.

• Traverse the tree from the root to the leaves. In each step, the projection S_t S_t^T is inserted between node t and its parent: S_t^T compresses the transfer tensor B_t, while S_t is absorbed into the parent transfer tensor B_{tp}.

• In htucker: truncate(x, opts). Complexity O(dnr² + dr⁴).

Page 96

Combined addition + truncation
Sum of more than two tensors:

Y = X1 + X2 + ··· + Xs.

Two possibilities to incorporate the truncation operator T:
1. Y ≈ T(X1 + X2 + X3 + ··· + Xs)
2. Y ≈ T(··· T(T(X1 + X2) + X3) ··· + Xs)

Option 2 is usually significantly cheaper but may suffer from severe cancellation.

Artificial example: X1, X2, X3 ∈ R^{101×101×101} truncated tensor grid discretizations of the summands of

f(x1, x2, x3) = tan(x1 + x2 + x3) + (x1 + x2 + x3)^{−1} − tan(x1 + x2 + x3).

Error(Option 1) ≈ 10^{−7}. Error(Option 2) ≈ 1.3.

What is wrong with Option 1?

Page 97

Combined addition + truncation

[Figure: block structure of the factors for the sum of four H-Tucker tensors, as on Page 76]

• Orthogonalization (needed before truncation) destroys the block-diagonal structure.
• Complexity O(dns²r² + ds⁴r⁴) for s summands.

Page 98

Combined addition + truncation
Idea: A new variant delays orthogonalization to keep the block-diagonal structure in the transfer tensors as long as possible.

Reduces O(dns²r² + ds⁴r⁴) to O(dns²r² + ds²r⁴ + ds³r³).

[Figure: runtime vs. number of summands for standard, combined, and successive truncation, with reference slopes O(t⁴), O(t²), O(t)]

• htucker command: add_truncate({x1, x2, x3, x4}, opts).

Page 99

Elementwise multiplication
Elementwise multiplication (also called Hadamard or Schur product) of two low-rank matrices A = U1 ΣA U2^T, B = V1 ΣB V2^T:

A ⋆ B = (U1 ⊙ V1)(ΣA ⊗ ΣB)(U2 ⊙ V2)^T,

with the row-wise Khatri-Rao product

C ⊙ D = [c1^T; ...; cn^T] ⊙ [d1^T; ...; dn^T] = [c1^T ⊗ d1^T; ...; cn^T ⊗ dn^T].

• Orthogonality destroyed.
• Rank increases significantly.

But: the singular value decay of ΣA ⊗ ΣB may become significantly stronger ⇒ additional opportunities for truncation.

Page 100

Elementwise multiplication
Elementwise multiplication of two tensors X, Y in H-Tucker format:

• Row-wise Khatri-Rao product of the leaf matrices.
• "Kronecker product" of the non-leaf transfer tensors.
• Optional: Products are only formed after suitable truncation to avoid excessive memory requirements.

Commands in htucker:
x.*y (without truncation)
x.^2 (without truncation)
elem_mult(x, y, opt) (with truncation)

Page 101

Elementwise reciprocal
Goal: Compute the reciprocal of each entry of a tensor X.

Basic idea: The Newton-Schulz iteration

y0 = 1,  y_{i+1} = y_i + y_i (1 − x y_i),   (2)

converges to 1/x for 0 < x < 2.

Apply (2) simultaneously to all entries.

Code snippet of elem_reciprocal(x, opt) in htucker:

all_ones = htenones(size(x));
y = all_ones;
for it = 1:maxit
  xy = elem_mult(x, y);
  xy = truncate(all_ones - xy, opts);
  xy = elem_mult(xy, y);
  y = truncate(y + xy, opts);
end

See also [Oseledets et al. 2009].

Page 102

Elementwise reciprocal
Example: (x1 + x2 + x3 + x4)^{−1} with xi ∈ [10^{−3}, 1].

c = laplace_core(4);
U = [ones(100, 1), linspace(1e-3, 1, 100)'];
x = ttm(c, U, U, U, U);
inv_x = elem_reciprocal(x, opts);

[Figures: convergence of ‖X ⋆ Y_k − 1‖/‖1‖ over the iterations; singular value tree upon convergence]

Page 103

Summary
• HTD offers a good compromise between CP and Tucker.
• Algorithms often quite technical but conceptually simple.
• Computational complexity ∼ d but often ∼ r⁴:
  curse of dimensionality ⇒ curse of rank?
• Important to keep in mind:
  Unless d is tiny, the tensor X can/should never be formed explicitly.
  All operations need to be performed implicitly in HTD.
  This can pose severe problems even for seemingly simple operations: min(X), max(X), abs(X), 1./X, ...

Page 105

Algorithms based on low-rank tensors
• Inexact LOBPCG
• ALS / MALS

105

Page 106: Low-Rank Tensor Techniques for High-Dimensional …Low-Rank Tensor Techniques for High-Dimensional Problems Daniel Kressner CADMOS Chair for Numerical Algorithms and HPC MATHICSE,

Strategies for solving tensor equations

- In many practical situations, the tensor X is given implicitly, as the solution to a linear system A(X) = B, an eigenvalue problem A(X) = λX, a nonlinear system, an ODE, ...

Two main strategies to use low-rank tensor techniques:

1. Combine an existing iterative solver (e.g., CG, LOBPCG, GMRES) with repeated low-rank truncation of the iterates (→ inexact CG); see the sketch after this list.
   - Straightforward to derive and implement (based, e.g., on htucker).
   - Hard to analyze the impact of nonnegligible truncations on accuracy and convergence.
   - Intermediate rank growth may result in excessive computing times and/or harm accuracy and convergence.
2. Formulate an optimization problem, constrain it to low-rank tensors, and iteratively optimize with respect to the individual factors of the low-rank format.
   - Works well in practice.
   - Convergence theory not well understood.
   - Not straightforward to implement.
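
A truncated-Richardson sketch of strategy 1, assuming hypothetical operator handles apply_A and apply_Binv (operator and preconditioner, possibly with internal truncation) and htucker's truncate and norm for htensor objects:

opt.max_rank = 50;  opt.rel_eps = 1e-10;  % truncation options (assumed fields)
X = apply_Binv(b);                        % b: right-hand side htensor
for it = 1:100
    R = truncate(b - apply_A(X), opt);    % truncated residual
    X = truncate(X + apply_Binv(R), opt); % truncated update
    if norm(R) <= 1e-6 * norm(b), break; end
end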


Example: PDE-eigenvalue problem

Goal: compute the smallest eigenvalue for

    −∆u(ξ) + V(ξ)u(ξ) = λu(ξ)   in Ω = [0,1]^d,
            u(ξ) = 0             on ∂Ω.

Assumption: the potential is represented as

    V(ξ) = Σ_{j=1}^s V_j^(1)(ξ1) V_j^(2)(ξ2) ··· V_j^(d)(ξd).

→ finite difference discretization

    A u = (A_L + A_V) u = λ u,

with

    A_L = Σ_{j=1}^d I ⊗ ··· ⊗ I ⊗ A_L^(j) ⊗ I ⊗ ··· ⊗ I   (d−j identity factors to the left, j−1 to the right),
    A_V = Σ_{j=1}^s A_{V,j}^(d) ⊗ ··· ⊗ A_{V,j}^(2) ⊗ A_{V,j}^(1),

where A_L^(j) denotes the one-dimensional discrete Laplacian acting on dimension j.
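
A plain-MATLAB illustration of the Kronecker-sum structure of A_L, assembled explicitly only for small d (for large d the sum is never formed; it is applied implicitly):

n = 10;  d = 3;  h = 1/(n+1);
e  = ones(n,1);
A1 = spdiags([-e 2*e -e], -1:1, n, n) / h^2;   % 1D discrete Laplacian
AL = sparse(n^d, n^d);
for j = 1:d
    AL = AL + kron(speye(n^(d-j)), kron(A1, speye(n^(j-1))));
end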


LOBPCG method

LOBPCG with block size 1 [Knyazev 2001] for computing the smallest eigenvalue of

    A x = λ x,   A symmetric.

    λ0 = ⟨x0, x0⟩_A,  p0 = 0
    for k = 0, 1, ... (until converged) do
        rk = B^{-1}(A xk − λk xk)
        U = [xk, rk, pk]
        Â = U^T A U,  M̂ = U^T U
        Find the eigenpair (λ_{k+1}, y), with ‖y‖2 = 1, for the smallest eigenvalue of the matrix pencil Â − λ M̂.
        p_{k+1} = y2 · rk + y3 · pk
        x_{k+1} = y1 · xk + p_{k+1}
        x_{k+1} ← x_{k+1} / ‖x_{k+1}‖2
    end for
    Return (λmin, x) = (λ_{k+1}, x_{k+1}).


Tensor low-rank LOBPCG

Truncated LOBPCG with block size 1 for computing the smallest eigenvalue of

    A(X) = λ X,   A symmetric, X a tensor.

    λ0 = ⟨X0, X0⟩_A,  P0 = 0 · X0
    for k = 0, 1, ... (until converged) do
        Rk = B^{-1}(A(Xk) − λk Xk),   Rk ← T(Rk)
        U1 = Xk, U2 = Rk, U3 = Pk
        Âij = ⟨Ui, Uj⟩_A,  M̂ij = ⟨Ui, Uj⟩
        Find the eigenpair (λ_{k+1}, y), with ‖y‖2 = 1, for the smallest eigenvalue of the matrix pencil Â − λ M̂.
        P_{k+1} = y2 · Rk + y3 · Pk,   P_{k+1} ← T(P_{k+1})
        X_{k+1} = y1 · Xk + P_{k+1},   X_{k+1} ← T(X_{k+1})
        X_{k+1} ← X_{k+1} / √⟨X_{k+1}, X_{k+1}⟩
    end for
    Return (λmin, X) = (λ_{k+1}, X_{k+1}).

T = truncation to hierarchical low rank
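
One iteration of the truncated LOBPCG, sketched in MATLAB with htucker primitives (innerprod, truncate); apply_A and apply_Binv are hypothetical operator handles, opt holds the truncation options:

R = truncate(apply_Binv(apply_A(X) - lambda*X), opt);
U = {X, R, P};
Ah = zeros(3);  Mh = zeros(3);
for i = 1:3
    for j = 1:3
        Ah(i,j) = innerprod(U{i}, apply_A(U{j}));  % exact: A(U_j) not truncated
        Mh(i,j) = innerprod(U{i}, U{j});
    end
end
[Y, D] = eig(Ah, Mh);                  % 3-by-3 pencil Ah - lambda*Mh
[lambda, k] = min(diag(D));
y = Y(:,k) / norm(Y(:,k));
P = truncate(y(2)*R + y(3)*P, opt);
X = truncate(y(1)*X + P, opt);
X = (1/sqrt(innerprod(X, X))) * X;     % normalize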


Implementation details

Orthogonalization
In standard LOBPCG, orthogonalization of U is recommended [Knyazev 2010]. This is not practical with low-rank tensors, as the ranks would grow and subsequent truncation would destroy orthogonality again.

Truncation
Xk, Rk, Pk are truncated in every step. Moreover, the application of A(·) and of the preconditioner B^{-1}(·) may itself involve truncation.

Inner product
The reduced matrix Â is very sensitive to truncation in A(·). The computation of Âij = ⟨Ui, Uj⟩_A must be exact.


Numerical Experiments – Sine potential

PDE-eigenvalue problem with Ω = [0, π]^d and sine potential

    V(ξ) = q · ∏_{i=1}^d sin(ξi)

for some constant q > 0. We choose d = 10, n = 128.

Preconditioner [Grasedyck 2004]:

    A_L^{-1} = ∫_0^∞ exp(−t A_L) dt ≈ Σ_{j=−M}^{M} ωj exp(−αj A_L^(d)) ⊗ ··· ⊗ exp(−αj A_L^(1)) =: B^{-1},

for a certain, optimized and tabulated choice of coefficients αj, ωj > 0. We choose M = 10.
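
A sketch of how this preconditioner can be applied to an htensor R, assuming the tabulated coefficients alpha, omega are given and AL1 is the 1D discrete Laplacian (ttm and truncate as in htucker):

function Y = apply_Binv(R, AL1, alpha, omega, opt)
% Apply B^{-1} = sum_j omega_j exp(-alpha_j*AL^(d)) ⊗ ... ⊗ exp(-alpha_j*AL^(1)).
d = ndims(R);
for j = 1:numel(alpha)
    E = expm(-alpha(j) * AL1);          % small n-by-n matrix exponential
    T = omega(j) * R;
    for mu = 1:d
        T = ttm(T, E, mu);              % apply E in mode mu
    end
    if j == 1, Y = T; else, Y = truncate(Y + T, opt); end
end
end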


Numerical Experiments – Sine potential

[Figure: residual norm and maximal hierarchical rank over the LOBPCG iterations, for q = 1 (truncation tolerances eps = 1e−2, 1e−4, 1e−8) and q = 1000 (eps = 1e−2, 1e−4).]


ALS

Originally from computational quantum physics [Schollwöck 2011]; recently investigated by [Huckle et al. 2010; Oseledets, Khoromskij 2010; Holtz et al. 2010; Dolgov, Oseledets 2011].

Goal:

    min { ⟨X, A(X)⟩ / ⟨X, X⟩ : X ∈ H-Tucker((rt)_{t∈T}), X ≠ 0 }

Method: choose one node t, fix all other nodes, and set the new tensor at node t so that the Rayleigh quotient ⟨X, A(X)⟩ / ⟨X, X⟩ is minimized. This is done for all nodes (a sweep), and sweeps are continued until convergence.

Sketch:

    X^(t) = Ut Vt^T = (Utr ⊗ Utl) Bt Vt^T,
    vec(X) = (Vt ⊗ Utr ⊗ Utl) vec(Bt) =: 𝒰t vec(Bt).

    ⇒ min { y^T (𝒰t^T A 𝒰t) y / (y^T (𝒰t^T 𝒰t) y) : y ∈ R^{rtl·rtr·rt}, y ≠ 0 }.
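
Once the reduced matrices At = 𝒰t^T A 𝒰t and Mt = 𝒰t^T 𝒰t have been assembled (see the next slide), the node update is a small generalized eigenproblem; a plain-MATLAB sketch, with At, Mt and the ranks rtl, rtr, rt assumed from the context above:

[Y, D] = eig(At, Mt);            % pencil At - lambda*Mt, size rtl*rtr*rt
[~, k] = min(diag(D));
y  = Y(:,k) / norm(Y(:,k));
Bt = reshape(y, [rtl, rtr, rt]); % updated transfer tensor at node t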


Computation of reduced matrices

Consider A = Ad ⊗ ··· ⊗ A1 (other operators can be treated similarly).

Compute

    𝒜t := 𝒰t^T A 𝒰t = (Vt ⊗ Utr ⊗ Utl)^T A (Vt ⊗ Utr ⊗ Utl) = Ât ⊗ Atr ⊗ Atl,

where

    Atl = Utl^T (⊗_{i∈tl} Ai) Utl,   Atr = Utr^T (⊗_{i∈tr} Ai) Utr,   Ât = Vt^T (⊗_{i∉t} Ai) Vt.

Additionally,

    Mt := 𝒰t^T 𝒰t = Vt^T Vt ⊗ Utr^T Utr ⊗ Utl^T Utl = M̂t ⊗ Mtr ⊗ Mtl.
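
A quick numeric check of this Kronecker factorization in plain MATLAB (random data; d = 3, node t = {1, 2} with left child {1}, right child {2}, complement {3}):

n = 6;  r = 2;
A1 = randn(n); A2 = randn(n); A3 = randn(n);
Utl = orth(randn(n,r)); Utr = orth(randn(n,r)); Vt = orth(randn(n,r));
Ut  = kron(Vt, kron(Utr, Utl));
lhs = Ut' * kron(A3, kron(A2, A1)) * Ut;
rhs = kron(Vt'*A3*Vt, kron(Utr'*A2*Utr, Utl'*A1*Utl));
norm(lhs - rhs, 'fro')           % ~1e-14: mixed-product property of kron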


Computation of reduced matrices

[Figure: dimension tree with leaf matrices A1, ..., A8; the reduced matrices A12, A34, A1234 at the inner nodes are accumulated level by level from the leaves.]


MALS

Method:
- Select an edge of the tensor network.
- Combine the tensors at the two adjacent nodes to form a higher-order tensor.
- Set this tensor to minimize the Rayleigh quotient.
- Use a low-rank approximation to split the combined tensor again into two tensors at the adjacent nodes of the selected edge (see the sketch below).
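
A plain-MATLAB sketch of the split step, with W the optimized combined tensor matricized so that its rows belong to the left node of the selected edge (W and the tolerance tol are assumed from context):

[Uw, Sw, Vw] = svd(W, 'econ');
s = diag(Sw);
r = find(s/s(1) > tol, 1, 'last');     % adaptive rank from singular values
left  = Uw(:, 1:r);                    % factor assigned to the left node
right = Sw(1:r, 1:r) * Vw(:, 1:r)';    % factor assigned to the right node

This split is where MALS adapts the rank of the selected edge on the fly, in contrast to ALS, where all ranks stay fixed.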


MALS – Illustration

[Figure: illustration of a MALS step on the tensor network.]


Numerical Experiments – Sine potential

PDE-eigenvalue problem with Ω = [0, π]^d and sine potential

    V(ξ) = q · ∏_{i=1}^d sin(ξi)

for some constant q > 0. Choose d = 10, n = 128, q = 1000. Preconditioner as before [Grasedyck 2004], with an optimized choice of coefficients αj, ωj > 0 and M = 10.


Numerical Experiments – Sine potential

[Figure: error in λ, residual norm, and iteration count over the execution time (up to 500 s).
ALS (left): hierarchical ranks fixed to 40; shown: err_lambda, res, nr_iter.
MALS (right): maximal hierarchical rank 30; shown: err_lambda, res, eps, rank, nr_iter.]



Conclusions and Outlook

- Scientific computing with low-rank tensors is a rapidly evolving and highly technical field.
- The precise scope of applications is far from clear; many applications remain to be explored. More analysis, and comparison to alternative techniques (sparse grids, adaptive tensor discretization, Monte Carlo, ...), is needed.

Some current trends:
- Tensorization of vectors + low rank (a discrete Chebfun?) by Hackbusch, Khoromskij, Oseledets, Tyrtyshnikov, ...
- Computational differential geometry on low-rank tensor manifolds by Koch, Lubich, Schneider, Uschmajew, Vandereycken, ...
- Robust low rank (Candès et al.) for tensors → a suitable way of dealing with singularities?
- ...

Acknowledgments: This presentation heavily benefited from joint work with Christine Tobler (ETH Zurich).
