Math 671: Tensor Train decomposition methods II
Eduardo Corona, University of Michigan at Ann Arbor
December 13, 2016


Page 1: Title slide

Page 2: Table of Contents

1 What we’ve talked about so far:

2 The Tensor Train decomposition

3 Fast algorithms for TT arithmetic

Page 3: What we’ve talked about so far

We talked about tensors, and how we can vectorize a tensor or tensorize a vector or matrix. This creates relationships between tensor and vector/matrix algebra.

Our goal: to use the Tensor Train decomposition to compress functions (vectors) and linear operators (matrices).

We discussed unfolding matrices for tensorized vectors and matrices, when they come from function and kernel evaluations, respectively.

Page 4: Tensorized vector index (domain hierarchy)

[Figure: binary subdivision hierarchy of the domain. Each subinterval Ωi is labeled by binary digits i1, i2, i3, and the samples f(xi) are arranged into the corresponding unfolding matrix.]

FT(i1 i2, i3) = f(xi)
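The tensorization sketched in the figure can be illustrated with a few lines of numpy (an illustration of the idea, not the course's Matlab code; the sample function 1/(1 + x) is an assumed example): a vector of 2^d function samples is reshaped into a d-dimensional binary tensor, and the k-th unfolding matrix groups the first k binary digits of the index as rows.

```python
import numpy as np

# Sample f(x) = 1/(1 + x) at N = 2^d equispaced points on [0, 1].
d = 10
N = 2 ** d
x = np.linspace(0.0, 1.0, N)
f = 1.0 / (1.0 + x)

# Tensorize: view the length-2^d vector as a d-dimensional 2 x 2 x ... x 2 tensor.
T = f.reshape([2] * d)

# The k-th unfolding matrix groups the first k binary digits of the index
# as rows and the remaining d - k digits as columns.
k = 3
Ak = T.reshape(2 ** k, 2 ** (d - k))

# For a smooth f, the unfolding matrices are numerically low rank.
rank = np.linalg.matrix_rank(Ak, tol=1e-10)
print(Ak.shape, rank)
```

Each row of `Ak` holds the samples of f on one of the 2^k subintervals of the hierarchy, which is why smoothness of f translates into low numerical rank.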

Page 5: Model problem: Integral Equations for PDEs

Many boundary value problems from classical physics, when cast as boundary or volume integral equations, take the form

A[σ](x) = a(x)σ(x) + ∫Γ K(x, y)σ(y) dS(y) = f(x), ∀x ∈ Γ

K(x, y) is a kernel function related to the PDE’s fundamental solution.

It is typically singular near the diagonal (y = x) but otherwise smooth.

We prefer Fredholm equations of the 2nd kind (identity + compact).

Page 6: Tensorized matrix index (interaction / block hierarchy)

[Figure: sequential factorization of a 16 × 16 matrix. The first unfolding A1 (16 × 16) is factored as M1 = U1 V1, with U1 (16 × r1) reshaped into core G1; V1 (r1 × 16) is reshaped into M2 (4r1 × 4) and factored as U2 (4r1 × r2) times V2 (r2 × 4), with U2 giving core G2; finally M3 (4r2 × 1) = U3 gives core G3.]

The unfolding matrix Aℓ collects all interactions between source and target nodes at a given level.

Page 7: What is the TT decomposition?

For a d-dimensional tensor A sampled at N = ∏i ni points indexed by (i1, i2, . . . , id), this decomposition can be written as:

A(i1, i2, . . . , id) ≈ ∑α1,...,αd−1 G1(i1, α1) G2(α1, i2, α2) · · · Gd(αd−1, id)

Each Gk is known as a tensor core. The auxiliary indices αk determine the number of terms in the decomposition and run from 1 to rk; rk is known as the k-th TT rank.
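The formula above collapses to a short chain of matrix products when one entry is evaluated. A minimal numpy sketch (the helper name `tt_entry` is hypothetical, not the TT-Toolbox API):

```python
import numpy as np

def tt_entry(cores, idx):
    # cores[k] has shape (r_{k-1}, n_k, r_k), with r_0 = r_d = 1.
    v = cores[0][:, idx[0], :]            # 1 x r_1 slice of the first core
    for G, i in zip(cores[1:], idx[1:]):
        v = v @ G[:, i, :]                # absorb one core per tensor index
    return v[0, 0]

# Exact rank-1 train for the separable tensor A(i1, i2, i3) = a[i1] b[i2] c[i3].
a, b, c = np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])
cores = [a.reshape(1, 2, 1), b.reshape(1, 2, 1), c.reshape(1, 2, 1)]
print(tt_entry(cores, (0, 1, 1)))   # 1 * 4 * 6 = 24.0
```

Fixing the index ik turns each 3-tensor core into an rk−1 × rk matrix, so the sum over α1, . . . , αd−1 is exactly a product of d small matrices.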

Page 8: Intuition: extension of the SVD

For a matrix A of size N × N and rank k, using the SVD we find:

A(i, j) = ∑α U(i, α) S(α, α) V(α, j)

where U(i, α) and V(α, j) are members of orthonormal bases for the column and row spaces, respectively.

You only need to store two matrices of size N × k (2kN numbers instead of N²).
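The storage claim can be checked numerically. A minimal numpy illustration (the sizes N = 64, k = 3 are assumed toy values):

```python
import numpy as np

# Build an exactly rank-k matrix, recover the factors with a truncated SVD,
# and compare storage: 2kN numbers instead of N^2.
N, k = 64, 3
rng = np.random.default_rng(0)
A = rng.standard_normal((N, k)) @ rng.standard_normal((k, N))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk = U[:, :k] * s[:k]   # N x k (singular values folded into U)
Vk = Vt[:k, :]          # k x N

err = np.linalg.norm(A - Uk @ Vk) / np.linalg.norm(A)
print(Uk.size + Vk.size, N * N, err)
```

Here 2kN = 384 numbers reproduce all N² = 4096 entries to machine precision.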

Page 9: Observations about the TT

Note that each tensor core Gk(αk−1, ik, αk) is a 3-tensor that depends on only one tensor index ik.

You can think of it as an extension of the SVD to tensors.

Quasi-optimal analogue to the optimal low rankapproximation.

We need to store d tensors of size rk−1 × nk × rk .

If all ranks rk are bounded by r and all mode sizes are equal to n, the total storage is

∑k nk rk−1 rk ≤ d n r²

If the ranks are small, this can be much smaller than N = n^d.
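A quick back-of-envelope comparison of the two storage counts (the values n = 2, d = 50, r = 10 are assumed for illustration):

```python
# TT storage bound d*n*r^2 from the slide vs full storage N = n^d.
n, d, r = 2, 50, 10
tt_storage = d * n * r * r    # 50 * 2 * 100 = 10,000 numbers
full_storage = n ** d         # 2^50, roughly 1.1e15 numbers
print(tt_storage, full_storage)
```

Ten thousand numbers versus about 10^15: this gap is why bounded TT ranks make otherwise intractable tensors usable.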

Page 10: How to compute TT cores and ranks?

The k-th TT rank is the rank of the k-th unfolding matrix Ak .

A TT decomposition may be obtained by a series of low rank approximations (using a truncated SVD, interpolative decompositions, or others).

Page 11: TT decomposition algorithm (1)

If we compute our favorite low rank approximation of A1, we obtain:

A1(i1, i2..id) = U1(i1, α1) V1(α1, i2..id)

The first core G1(i1, α1) is a reshaping of U1.

Page 12: TT decomposition algorithm (2)

We can iterate this procedure for V1(α1 i2, i3, . . . , id):

V1(α1 i2, i3..id) = U2(α1 i2, α2) V2(α2, i3..id)

The second core G2(α1, i2, α2) is a reshaping of U2.

Page 13: TT decomposition algorithm (k)

We continue this low rank approximation process until we run out of indices.

We end up with d tensor cores Gk(αk−1, ik, αk).

From this construction, a low rank approximation for any unfolding matrix Ak can be obtained by contracting all indices except αk.
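The construction on the last three slides can be sketched in a few lines of numpy (an illustration of the sequential-SVD idea, not the TT-Toolbox implementation; `tt_svd` and `tt_full` are hypothetical helper names):

```python
import numpy as np

def tt_svd(T, tol=1e-12):
    # Sequential truncated SVDs of the unfolding matrices, as described above.
    dims = T.shape
    cores, r_prev = [], 1
    M = T.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = max(1, int(np.sum(s > tol * s[0])))             # truncation rank r_k
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))  # core G_k
        M = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))            # last core G_d
    return cores

def tt_full(cores):
    # Contract the train back into a full tensor, for verification.
    v = cores[0].reshape(cores[0].shape[1], -1)
    for G in cores[1:]:
        r_prev, n, r = G.shape
        v = (v @ G.reshape(r_prev, n * r)).reshape(-1, r)
    return v.reshape([G.shape[1] for G in cores])

T = np.arange(16, dtype=float).reshape(2, 2, 2, 2)  # samples of a linear function
cores = tt_svd(T)
err = np.max(np.abs(tt_full(cores) - T))
print([G.shape for G in cores], err)
```

At each step the remainder Vk is reshaped so that the next tensor index joins the row multi-index, exactly mirroring the A1 → V1 → V2 progression on pages 11 and 12.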

Page 14: TT for vectors (function samples)

Along with View_function_unfoldingmat.m, I’ve also added get_ttf.m to obtain the TT decomposition of the tensorized form of f(x) sampled at N = 2^d points in [a, b].

This uses the tt_tensor, full, and round routines from the TT-Toolbox.

Page 15: TT for matrices (kernel samples)

Along with View_matrix_unfoldingmat.m, I’ve also added get_tta.m to obtain the TT decomposition of the tensorized form of K(x, y) sampled at N × N = 2^d × 2^d points in [a, b] × [c, d].

This uses the tt_tensor, tt_matrix, full, and round routines from the TT-Toolbox.

Page 16: TT for matrices

[Figure: the sequential factorization diagram from Page 6, repeated.]

Page 17: Why does it achieve better compression?

TT rank is low if there exists a small basis of interactions

For many examples from differential and integral equations, the ranks are bounded or grow very slowly (like log N).

Other examples show faster growth (Toeplitz matrices have ranks growing like N^(1/2)).

Symmetries, particularly translation or rotation invariance, reduce TT ranks significantly.

Page 18: TT Toolbox

All the routines discussed in this lecture can be found in the Matlab TT-Toolbox, by Ivan Oseledets and his research group.

The two families of objects we’ll be working with are tt_tensor and tt_matrix.

Besides basic routines to manipulate these, we’ll be looking at fast algorithms for TT matrices in the cross and solve subdirectories.

Page 19: Fast algorithms for TT

Given a function f(x), kernel K(x, y), or tensor whose entries we can sample:

amen_cross or dmrg_cross: produce TT decompositions faster (O(r³d)) than the standard TT algorithm.

Given a matrix A in the tt_matrix format, we can compute:

mtimes: matrix-vector product y = Ab for a dense vector b.

mtimes: if b is also compressed in the tt_tensor format, it uses a different algorithm that computes y in tt_tensor form.

amen_solve2 or dmrg_solve3: solve the linear system Ax = b or produce a TT decomposition for A⁻¹.

Page 20: Fast compression example: amen_cross

Inputs: a tensor entry evaluator routine.

How does it work: starting from an initial guess, it iterates over all cores, trying to find the best approximation for given ranks.

Each core is updated while the others stay fixed.

After each pass, it uses an approximation of the residual to increase ranks / tensor evaluations until convergence.

Function call: amen_cross(Size, Eval, acc, params);

Page 21: Fast matrix-vector apply

Inputs: matrix in tt_matrix form, vector b in dense form.

How does it work: applying Gk, it contracts one column index jk at a time, producing results for the corresponding row index ik.

Function call: y = mtimes(TTA,b);
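The core-by-core contraction described above can be sketched in numpy (the helper name `tt_matvec` is hypothetical; the toolbox’s mtimes is more general). Matrix cores Gk(αk−1, ik, jk, αk) pair one row digit with one column digit, and the tensorized vector b is contracted one column digit per core:

```python
import numpy as np

def tt_matvec(cores, b):
    # cores[k] has shape (r_{k-1}, n_k, m_k, r_k); b has length m_1*...*m_d.
    J = b.size
    v = b.reshape(1, J, 1)  # (alpha, remaining column digits, accumulated row digits)
    for G in cores:
        rA, n, m, rB = G.shape
        I = v.shape[2]
        v = v.reshape(rA, m, J // m, I)          # expose j_k, the leading remaining digit
        v = np.einsum('amji,anmb->bjin', v, G)   # contract alpha_{k-1} and j_k
        J //= m
        v = v.reshape(rB, J, I * n)              # append i_k as the newest row digit
    return v.reshape(-1)

# Check against a dense Kronecker-product matrix, whose TT-matrix ranks are all 1.
rng = np.random.default_rng(1)
A1, A2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
cores = [A1.reshape(1, 2, 2, 1), A2.reshape(1, 2, 2, 1)]
b = rng.standard_normal(4)
err = np.max(np.abs(tt_matvec(cores, b) - np.kron(A1, A2) @ b))
print(err)
```

Because only one column digit is contracted per core, the cost stays proportional to the core sizes rather than to the full N × N matrix.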

Page 22: Fast solve and inversion example: amen_solve

Inputs: matrix in tt_matrix form, vector b in tt_tensor form.

How does it work: starting from an initial guess, it iterates over all cores, solving local linear systems.

Each linear system holds all cores but one fixed.

After each pass, it uses an approximation of the residual to increase ranks / tensor evaluations until convergence.

Function call: amen_solve2(A, b, acc);