NOTES ON THE LOW-RANK MATRIX APPROXIMATION OF KERNEL MATRICES Hiroshi Tsukahara Denso IT Laboratory, Inc. Aug. 23 (Fri) 2013



Page 2: Notes on the low rank matrix approximation of kernel

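KERNEL METHOD

Supervised Learning Problem

Given data $D_n = \{(x_i, y_i) \in X \times Y \mid i = 1, 2, \dots, n\}$, find $f : X \to Y$ s.t. $y_i = f(x_i)$.

Ill-defined problem!!

Solving in Reproducing Kernel Hilbert Spaces

$$
\min_{f \in F} \; \frac{1}{n} \sum_{i=1}^{n} l(y_i, f(x_i)) + \frac{\lambda}{2} \|f\|^2 \tag{1}
$$

Assuming $\phi : X \to F$ s.t. $f = \sum_{i=1}^{n} \alpha_i \phi(x_i)$ (cf. representer theorem), this becomes an optimization in $\mathbb{R}^n$.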

Page 3: Notes on the low rank matrix approximation of kernel

Kernel method

If the loss function is given by

$$
l(y, f(x)) = \frac{1}{2} \left( y - f(x) \right)^2,
$$

the explicit form of $\phi$ is not necessary, only its inner products:

$$
\langle \phi(x), \phi(x') \rangle =: k(x, x').
$$

The RHS is called a kernel function.

Define the mapping $\phi$ implicitly by a kernel function:

$$
\phi(x) := k(x, \cdot)
$$
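As a concrete illustration of a kernel function and the matrix of pairwise inner products it induces, here is a minimal sketch. The slides do not name a particular kernel; the Gaussian (RBF) kernel, the function name `rbf_kernel`, and the parameter `gamma` are assumptions made for this example only.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Pairwise Gaussian kernel k(x, z) = exp(-gamma * ||x - z||^2).

    The kernel choice and `gamma` are illustrative assumptions.
    """
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # five sample points in R^2
K = rbf_kernel(X, X)          # K[i, j] = k(x_i, x_j) = <phi(x_i), phi(x_j)>
min_eig = np.linalg.eigvalsh(K).min()
```

The feature map never appears explicitly: only `K`, the matrix of kernel evaluations, is computed. For a valid kernel this matrix is symmetric and positive semidefinite.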

Page 4: Notes on the low rank matrix approximation of kernel

The solution can formally be written as:

$$
\alpha_i = \left[ (K + \lambda I_n)^{-1} y \right]_i, \quad \text{where } K_{ij} = k(x_i, x_j) \text{ and } y = (y_1, y_2, \dots, y_n)^T.
$$

However, the complexity of computing the solution is very high: $O(n^3)$.
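The direct solution can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the Gaussian kernel, the helper names (`rbf_kernel`, `fit_kernel_ridge`, `predict`), the toy data, and the values of `lam` and `gamma` are all assumptions. The code follows the formula as stated, treating any normalisation factor from the loss as absorbed into `lam`.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Pairwise Gaussian kernel (an assumed choice of k)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def fit_kernel_ridge(X, y, lam, gamma=1.0):
    """Solve (K + lam * I_n) alpha = y with a dense O(n^3) solver."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(n), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    """Evaluate f(x) = sum_i alpha_i k(x_i, x) at the rows of X_new."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy 1-D regression problem (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

lam = 0.1
alpha = fit_kernel_ridge(X, y, lam)
y_hat = predict(X, alpha, X)
```

The call to `np.linalg.solve` on the dense n x n system is the step whose cost grows cubically with n.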

Page 5: Notes on the low rank matrix approximation of kernel

LOW-RANK APPROXIMATION

Low-rank approximation of kernel matrices

The effective rank of kernel matrices is usually very low compared to n. Making use of this property, assume that the kernel matrix can be written as

$$
K \approx R R^T, \quad \text{where } R \text{ is an } n \times r \text{ matrix with } r \ll n.
$$

Then the complexity of calculating the solution can be reduced considerably, to $O(r^2 n)$, due to the formula:

$$
(R R^T + \lambda I_n)^{-1} = \frac{1}{\lambda} \left[ I_n - R \left( R^T R + \lambda I_r \right)^{-1} R^T \right].
$$
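A small numerical check of the formula, written as a sketch under stated assumptions: the function name `solve_low_rank`, the random test matrix `R`, and the sizes `n`, `r`, `lam` are all made up for illustration. The fast path only ever solves an r x r system.

```python
import numpy as np

def solve_low_rank(R, y, lam):
    """Return (R R^T + lam * I_n)^{-1} y without forming any n x n matrix."""
    r = R.shape[1]
    small = R.T @ R + lam * np.eye(r)            # r x r, costs O(r^2 n) to form
    correction = R @ np.linalg.solve(small, R.T @ y)
    return (y - correction) / lam

rng = np.random.default_rng(0)
n, r, lam = 500, 10, 0.5
R = rng.normal(size=(n, r))                      # hypothetical low-rank factor
y = rng.normal(size=n)

fast = solve_low_rank(R, y, lam)
# Reference: dense O(n^3) solve of the same system.
direct = np.linalg.solve(R @ R.T + lam * np.eye(n), y)
max_abs_diff = np.max(np.abs(fast - direct))
```

Applying the inverse to a vector, rather than forming the inverse matrix, keeps memory at O(rn) as well.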

Page 6: Notes on the low rank matrix approximation of kernel

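Rough sketch for the derivation of the formula

$$
\begin{aligned}
(R R^T + \lambda I_n)^{-1}
&= \frac{1}{\lambda} \left( I_n + \frac{1}{\lambda} R R^T \right)^{-1}, \\
&= \frac{1}{\lambda} \left[ I_n - \frac{1}{\lambda} R R^T + \frac{1}{\lambda^2} R R^T R R^T - \cdots \right], \\
&= \frac{1}{\lambda} \left[ I_n - \frac{1}{\lambda} R \left( I_r - \frac{1}{\lambda} R^T R + \cdots \right) R^T \right], \\
&= \frac{1}{\lambda} \left[ I_n - \frac{1}{\lambda} R \left( I_r + \frac{1}{\lambda} R^T R \right)^{-1} R^T \right], \\
&= \frac{1}{\lambda} \left[ I_n - R \left( R^T R + \lambda I_r \right)^{-1} R^T \right].
\end{aligned}
$$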
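The formula can also be confirmed without relying on the intermediate steps, by multiplying the claimed inverse by $R R^T + \lambda I_n$. The shorthand $M = R^T R + \lambda I_r$ is introduced here only for brevity; $M$ is invertible for any $\lambda > 0$, so the identity holds even when a series expansion in $1/\lambda$ would not converge.

```latex
\begin{aligned}
(R R^T + \lambda I_n) \, \frac{1}{\lambda} \left[ I_n - R M^{-1} R^T \right]
&= \frac{1}{\lambda} \left[ R R^T + \lambda I_n - R R^T R M^{-1} R^T - \lambda R M^{-1} R^T \right] \\
&= \frac{1}{\lambda} \left[ R R^T + \lambda I_n - R \left( R^T R + \lambda I_r \right) M^{-1} R^T \right] \\
&= \frac{1}{\lambda} \left[ R R^T + \lambda I_n - R R^T \right] = I_n .
\end{aligned}
```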

Page 7: Notes on the low rank matrix approximation of kernel

There are several algorithms for deriving the low-rank approximation:

Nyström approximation

Incomplete Cholesky decomposition
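The slides only name the two algorithms, so the following is a minimal sketch of common textbook variants rather than the author's method. The Gaussian kernel, uniform random landmark sampling, the greedy diagonal pivoting rule, and all names and parameters (`nystrom_factor`, `incomplete_cholesky`, `m`, `r`, `tol`, `rel_tol`) are assumptions. Both functions return a factor `R` with `K ≈ R @ R.T`.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Pairwise Gaussian kernel (an assumed choice of k)."""
    sq = (np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :]
          - 2.0 * X @ Z.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_factor(X, m, gamma=1.0, rng=None, rel_tol=1e-8):
    """Nystrom factor from m uniformly sampled landmark points."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(X.shape[0], size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)        # n x m block of K
    W = C[idx]                              # m x m block K[idx, idx]
    evals, evecs = np.linalg.eigh(W)
    keep = evals > rel_tol * evals.max()    # drop near-null directions of W
    # R = C U diag(evals)^{-1/2}, so that R R^T = C W^+ C^T.
    return C @ (evecs[:, keep] / np.sqrt(evals[keep]))

def incomplete_cholesky(K, r, tol=1e-10):
    """Pivoted Cholesky factor of rank at most r.

    Takes the full K for simplicity, but reads only its diagonal
    and one column per iteration.
    """
    n = K.shape[0]
    d = np.diag(K).astype(float).copy()     # diagonal of the residual K - R R^T
    R = np.zeros((n, r))
    for j in range(r):
        p = int(np.argmax(d))               # pivot on the largest residual
        if d[p] <= tol:
            return R[:, :j]                 # residual already negligible
        R[:, j] = (K[:, p] - R[:, :j] @ R[p, :j]) / np.sqrt(d[p])
        d -= R[:, j] ** 2
    return R

def rel_err(K, R):
    """Relative Frobenius error of the approximation K ~ R R^T."""
    return np.linalg.norm(K - R @ R.T) / np.linalg.norm(K)

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 1))   # hypothetical 1-D inputs
K = rbf_kernel(X, X)

R_nys = nystrom_factor(X, m=50, rng=rng)
R_chol = incomplete_cholesky(K, r=30)
err_nys, err_chol = rel_err(K, R_nys), rel_err(K, R_chol)
```

Either factor can be passed to the low-rank inversion formula in place of the full kernel matrix. The full `K` is built here only to measure the approximation error.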