Kernel Discriminant Analysis Based on Canonical Difference for Face Recognition in Image Sets
Wen-Sheng Chu (朱文生), Ju-Chin Chen (陳洳瑾), Jenn-Jier James Lien (連震杰)
Robotics Lab, CSIE, NCKU (http://robotics.csie.ncku.edu.tw)
CVGIP 2007
Motivation
• Challenges of face recognition
  – Facial variations: illumination, pose, facial expression
• Face recognition using image sets
  – Surveillance
  – Video retrieval
Why Multi-view Image Sets?
• Multiple facial images contain more information than a single image.
[Figure: deciding between Person A and Person B from a single input pattern (single-to-many matching) versus multiple input patterns (many-to-many matching)]
Training/Testing Data: Facial Expression
[Figure: for subject i, five image sets of facial-expression images, split into training and testing sets]
More Training/Testing Data: Illumination (Yale B)
[Figure: for subject j, five image sets of illumination-varying images from the Yale B database, split into training and testing sets]
System Overview
[Diagram: training and testing pipelines]
• Training process: each of the m training image sets {X1,…,Xm} (covering Subject 1 through Subject N) passes through Kernel Subspace Generation, giving m kernel subspaces Pi = {e1^i,…,ed^i}. The Kernel Discriminant Transformation (KDT) matrix T is then learned, and each reference subspace is computed as Refi = T^T Pi.
• Testing process: the testing image set Xtest passes through Kernel Subspace Generation to obtain Ptest, its reference subspace Reftest is formed, and comparison with the training reference subspaces yields the identification result.
Training Process
[Diagram: training path of the system overview]
• Each training image set Xi = {x1^i, x2^i, …, x_ni^i} consists of ni (≈ 100) normalized 32 × 32 face images.
• Kernel Subspace Generation produces one kernel subspace Pi = {e1^i,…,ed^i} per image set (m subspaces in total).
• The KDT matrix T is learned from all m subspaces, and each reference subspace is computed as Refi = T^T Pi.
Kernel Subspace Generation (KSG)
[Diagram: from an image set to its kernel subspace]
• Each image set Xi = {x1^i,…,x_ni^i} of ni normalized 32 × 32 images is mapped by a nonlinear function φ: x → φ(x) ∈ R^h.
• From the mapped images, the ni × nj kernel matrix Kij is computed.
• The kernel subspace Pi = {e1^i,…,ed^i} of Xi is then obtained, with d < ni.
KSG: Kernel PCA (KPCA)
• From the theory of reproducing kernels, each basis vector of the kernel subspace is a linear combination of the mapped images:
  e_p^i = Σ_{s=1}^{ni} a_sp^i φ(x_s^i),   p = 1,…,d,  d < ni,
  where x_s^i is the s-th image of the i-th image set and the coefficients a_sp^i come from the eigendecomposition (SVD) of the kernel matrix, Kii = Ai Λi Ai^T.
• The mapped feature space may be infinite-dimensional, so e_p^i is never formed explicitly; only the kernel matrix Kii of the i-th image set is needed.
[Diagram: each image set Xi = {x1,…,x_ni} yields a kernel matrix Kii and a kernel subspace Pi = {e1^i,…,e_p^i,…,e_d^i}]
KPCA: B. Schölkopf, A. Smola, and K.-R. Müller, Advances in Kernel Methods: Support Vector Learning, 1999.
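As a concrete illustration of this step, the following is a minimal NumPy sketch of KPCA on a precomputed kernel matrix. The function name and the unit-norm scaling of the coefficients are our own choices; the slides only give the expansion of e_p^i and the eigendecomposition of Kii.

```python
import numpy as np

def kpca_coefficients(K_ii, d):
    """Return A_i (n_i x d) so that e_p^i = sum_s A_i[s, p] * phi(x_s^i).

    K_ii : n_i x n_i kernel matrix of one image set.
    d    : number of kernel-subspace dimensions (d < n_i).
    """
    evals, evecs = np.linalg.eigh(K_ii)          # eigendecomposition of the symmetric kernel matrix
    evals, evecs = evals[::-1], evecs[:, ::-1]   # sort eigenvalues in descending order
    evals = np.maximum(evals[:d], 1e-12)         # guard against tiny or negative eigenvalues
    return evecs[:, :d] / np.sqrt(evals)         # scale so each e_p^i has unit norm in feature space
```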
Training Process: Kernel Discriminant Transformation (KDT)
[Diagram: the KDT step of the system overview; the m kernel subspaces Pi = {e1^i,…,ed^i} are fed into the Kernel Discriminant Transformation to learn T, which gives the reference subspaces Refi = T^T Pi]
KDT: Main Idea
• Based on the concept of LDA, KDT is derived to find a transformation matrix T (the KDT matrix).
• We propose an iterative process to optimize T; the dimensionality of T is assumed to be w.
• The objective has the LDA form
  T = argmax_T trace(T^T S_B T) / trace(T^T S_W T),
  where S_B is the between-subject scatter and S_W the within-subject scatter.
[Diagram: two image sets of 32 × 32 images → KPCA → d-dimensional kernel subspaces → T → w-dimensional subspaces. How should the similarity of two subspaces be measured?]
KDT: Canonical Difference (CD) – Similarity Measurement
• Canonical vectors capture more of the views and illumination common to two image sets than eigenvectors do.
[Figure: kernel subspaces P1 and P2 are turned into canonical subspaces C1 and C2; the canonical vector pairs (u1, v1) and (u2, v2) give difference vectors d1 and d2]
KDT: CD – Canonical Vector vs. Eigenvector (cont.)
[Figure: canonical vectors C1 and C2 and their difference C1 - C2, compared with eigenvectors B1 and B2 and their difference B1 - B2; the difference of canonical vectors provides a similarity measurement of two subspaces]
KDT: CD – Canonical Subspace (cont.)
• Consider the SVD of the product of two d-dimensional orthonormal basis matrices B1 and B2:
  B1^T B2 = Q12 Λ Q21^T   (equivalently, B2^T B1 = Q21 Λ Q12^T),
  C1 = B1 Q12,   C2 = B2 Q21,
  where C1 and C2 are the canonical subspaces (also orthonormal) and each eigenvalue satisfies 0 ≤ cos²θi ≤ 1.
• Similarity measurement of the two subspaces via the canonical difference:
  CanonicalDiff(i, j) = Σ_{r=1}^{d} ||u_r - v_r||² = trace[(Ci - Cj)(Ci - Cj)^T],
  where u_r and v_r are the r-th canonical vectors (columns of Ci and Cj).
T.-K. Kim, J. Kittler, and R. Cipolla, “Discriminative Learning and Recognition of Image Set Classes Using Canonical Correlations,” IEEE Trans. on PAMI, 2007.
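The canonical difference above requires only one SVD. The following NumPy sketch (function name ours; orthonormal input bases assumed) illustrates it.

```python
import numpy as np

def canonical_difference(B1, B2):
    """Canonical difference between two subspaces with orthonormal basis matrices B1, B2 (h x d)."""
    Q12, cos_theta, Q21T = np.linalg.svd(B1.T @ B2)   # B1^T B2 = Q12 diag(cos theta) Q21^T
    C1 = B1 @ Q12                                     # canonical subspace of B1
    C2 = B2 @ Q21T.T                                  # canonical subspace of B2
    D = C1 - C2                                       # columns are the differences u_r - v_r
    return float(np.sum(D ** 2)), cos_theta           # CanonicalDiff(i, j) and the canonical correlations
```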
KDT: KDT Matrix Optimization
• Orthonormal basis matrices are required to obtain the canonical subspaces Ci.
• Is the reference subspace Refi = T^T Pi orthonormal? Usually not!
[Diagram: iterative learning loop; kernel subspace Pi = {e1^i,…,ed^i} → (KDT matrix T) → reference subspace Refi = T^T Pi → canonical subspace Ci → canonical difference CanonicalDiff(i, j) → update of T based on LDA]
KDT: Kernel Subspace Normalization
• QR-decomposition is performed to obtain two orthonormal basis matrices:
  T^T Pi = Qi Ri,   i.e.,   Qi = T^T Pi Ri^{-1},
  where Qi is a w × d orthonormal matrix and Ri is a d × d invertible upper-triangular matrix.
• The canonical subspaces then follow from the SVD of Qi^T Qj:
  Qi^T Qj = Qij Λ Qji^T,   Ci = Qi Qij,   Cj = Qj Qji.
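A minimal sketch of this normalization step (function name ours; it pairs with the canonical_difference sketch above): QR-factorize each reference subspace, then apply the SVD alignment.

```python
import numpy as np

def normalized_canonical_subspaces(Ref_i, Ref_j):
    """QR-orthonormalize two reference subspaces (w x d), then align them by SVD."""
    Qi, Ri = np.linalg.qr(Ref_i)                 # T^T Pi = Qi Ri, with Qi a w x d orthonormal matrix
    Qj, Rj = np.linalg.qr(Ref_j)
    Qij, lam, QjiT = np.linalg.svd(Qi.T @ Qj)    # Qi^T Qj = Qij diag(lam) Qji^T
    return Qi @ Qij, Qj @ QjiT.T                 # canonical subspaces Ci, Cj
```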
KDT: Formulation
• With Qi = T^T Pi Ri^{-1} and canonical subspaces Ci = Qi Qij, Cj = Qj Qji, the canonical difference can be written in terms of T:
  CanonicalDiff(i, j) = trace[(Ci - Cj)(Ci - Cj)^T]
                      = trace[(Qi Qij - Qj Qji)(Qi Qij - Qj Qji)^T]
                      = trace[T^T (P̃ij - P̃ji)(P̃ij - P̃ji)^T T],
  where P̃ij = Pi Ri^{-1} Qij.
• Maximizing the canonical differences between subjects while minimizing those within subjects gives the form of LDA:
  T = argmax_T [Σ_{i=1}^{m} Σ_{j∈Bi} CanonicalDiff(i, j)] / [Σ_{i=1}^{m} Σ_{k∈Wi} CanonicalDiff(i, k)]
    = argmax_T trace(T^T S_B T) / trace(T^T S_W T),
  with
  S_B = Σ_{i=1}^{m} Σ_{j∈Bi} (P̃ij - P̃ji)(P̃ij - P̃ji)^T,   Bi = {j | Xj not of the same subject as Xi},
  S_W = Σ_{i=1}^{m} Σ_{k∈Wi} (P̃ik - P̃ki)(P̃ik - P̃ki)^T,   Wi = {k | Xk of the same subject as Xi}.
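To make the trace-ratio concrete, here is a small sketch that evaluates the numerator trace(T^T S_B T) and denominator trace(T^T S_W T) directly from the current reference subspaces Refi = T^T Pi, by summing canonical differences over between-subject and within-subject pairs. Function and variable names are ours.

```python
import numpy as np

def kdt_objective(refs, labels):
    """refs[i]: w x d reference subspace T^T Pi; labels[i]: subject id of image set i."""
    between, within = 0.0, 0.0
    ortho = [np.linalg.qr(R)[0] for R in refs]               # Qi from T^T Pi = Qi Ri
    for i in range(len(refs)):
        for j in range(len(refs)):
            if i == j:
                continue
            Qij, _, QjiT = np.linalg.svd(ortho[i].T @ ortho[j])
            D = ortho[i] @ Qij - ortho[j] @ QjiT.T            # Ci - Cj
            diff = np.sum(D ** 2)                             # CanonicalDiff(i, j)
            if labels[i] == labels[j]:
                within += diff                                # contributes to trace(T^T S_W T)
            else:
                between += diff                               # contributes to trace(T^T S_B T)
    return between / max(within, 1e-12)                      # trace-ratio objective J(T)
```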
KDT: Solution
• The scatter matrices
  S_B = Σ_{i=1}^{m} Σ_{j∈Bi} (P̃ij - P̃ji)(P̃ij - P̃ji)^T,   S_W = Σ_{i=1}^{m} Σ_{k∈Wi} (P̃ik - P̃ki)(P̃ik - P̃ki)^T
  live in the mapped feature space, whose dimensionality may be infinite, so T cannot be formed directly.
• Each column of T = {t1,…,tq,…,tw} is therefore expanded over the M mapped training images, so that tq contains the information of all training images:
  tq = Σ_{u=1}^{M} αqu φ(xu).
• The objective then becomes
  J(α) = trace(α^T V α) / trace(α^T U α).
KDT: Solution (cont.)
• Using the theory of reproducing kernels again,
  T^T S_W T = Σ_{i=1}^{m} Σ_{k∈Wi} T^T (P̃ik - P̃ki)(P̃ik - P̃ki)^T T.
• Replacing T through the kernel trick (tq = Σ_{u=1}^{M} αqu φ(xu)) gives
  T^T S_W T = α^T U α,   U = Σ_{i=1}^{m} Σ_{k∈Wi} (Zik - Zki)(Zik - Zki)^T,
  where (Zij)_{up} = Σ_{s=1}^{ni} ã_sp^{ij} k(xu, x_s^i), u = 1,…,M, p = 1,…,d, and ã_sp^{ij} are the expansion coefficients of P̃ij over the mapped images of Xi.
• Following similar steps, we obtain
  T^T S_B T = α^T V α,   V = Σ_{i=1}^{m} Σ_{j∈Bi} (Zij - Zji)(Zij - Zji)^T.
• Hence J(α) = trace(α^T V α) / trace(α^T U α), with T = {t1,…,tq,…,tw}.
KDT: Numerical Issues
• α is solved by simply computing the leading eigenvectors of U^{-1} V, which maximize J(α) = trace(α^T V α) / trace(α^T U α).
• To make sure U is positive definite, we regularize it as Uμ = U + μI with μ = 0.001.
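A minimal sketch of this solution step under the stated regularization (function name and the use of a general eigensolver are our own choices):

```python
import numpy as np

def solve_kdt_coefficients(U, V, w, mu=1e-3):
    """Leading eigenvectors of (U + mu*I)^{-1} V give the M x w coefficient matrix alpha."""
    U_reg = U + mu * np.eye(U.shape[0])                  # regularized U, positive definite
    evals, evecs = np.linalg.eig(np.linalg.solve(U_reg, V))
    order = np.argsort(-evals.real)                      # sort eigenvalues in descending order
    return evecs[:, order[:w]].real                      # alpha: one column per KDT dimension
```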
Training Process
[Diagram: training path of the system overview]
• Each reference subspace is Refi = T^T Pi, where each element is given by
  (T^T Pi)_{qp} = Σ_{u=1}^{M} Σ_{s=1}^{ni} αqu a_sp^i k(xu, x_s^i),
  so the reference subspaces are obtained entirely through kernel evaluations.
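Stacking the double sum above as matrix products gives a compact implementation; the sketch below assumes a precomputed M × ni block of kernel evaluations (names are ours).

```python
import numpy as np

def reference_subspace(alpha, K_cross, A_i):
    """Refi = T^T Pi via kernel evaluations only.

    alpha   : M x w expansion coefficients of the KDT matrix T.
    K_cross : M x n_i matrix with entries k(x_u, x_s^i) over all M training images x_u.
    A_i     : n_i x d KPCA coefficients of the kernel subspace Pi.
    """
    # (q, p) element = sum_u sum_s alpha_qu * a_sp^i * k(x_u, x_s^i)
    return alpha.T @ K_cross @ A_i      # w x d reference subspace
```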
Testing Process
[Diagram: the testing image set Xtest passes through Kernel Subspace Generation to obtain Ptest; its reference subspace Reftest = T^T Ptest is then compared with the training reference subspaces {Refi} to produce the identification result]
Training List
• Number of individuals (N): 32
• Image sets per individual: 3
• Images per set (ni): ~100
• Size of normalized template: 32 × 32
• Dimensionality: KMSM 30, KCMSM 30, DCC 20, KDT 30
• σ of the Gaussian kernel function: 0.05
• μ for regularization: 10^-3
Training: Convergence of the Objective Value J(α)
• J(α) tends to converge to the same value under different initializations.
Testing: Comparison with Other Methods
• The proposed KDT is compared with three related methods over 10 randomly chosen experiments (average identification rates):
  – KMSM: 0.837
  – KCMSM: 0.862
  – DCC: 0.889
  – KDT (proposed): 0.911
Conclusions
• The canonical difference is proposed as a similarity measurement between two subspaces.
• Based on the canonical difference, we derived the kernel discriminant transformation (KDT) and applied it to the proposed face recognition system.
• Our system recognizes faces from image sets and is robust to facial variations.
Thanks for your attention
Related Works
• Mutual subspace method (MSM)
• Constrained MSM (CMSM)
[Figure: subspaces U and V are projected onto a constrained subspace, giving Uc and Vc; the canonical angle changes from θ to θc]
• Discriminative canonical correlations (DCC)
• Kernel MSM (KMSM) and kernel CMSM (KCMSM)
Mutual Subspace Method (MSM)
• MSM utilizes the canonical angles between two subspaces as a similarity measure:
  similarity(u, v) = cos²θ,
  similarity(U, V) = (1/n) Σ_{i=1}^{n} cos²θi.
[Figure: subspaces B1 and B2 spanned by eigenvectors, with canonical vector pairs (u1, v1), (u2, v2) and canonical angles θ1, θ2]
K. Fukui and O. Yamaguchi, “Face Recognition Using Multi-viewpoint Patterns for Robot Vision,” ISRR 2003.
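For reference, the MSM similarity reduces to the mean squared singular value of U^T V. A minimal sketch (orthonormal bases assumed, function name ours):

```python
import numpy as np

def msm_similarity(U, V):
    """Mean squared cosine of the canonical angles between two orthonormal subspaces (h x n)."""
    cos_theta = np.linalg.svd(U.T @ V, compute_uv=False)   # singular values are cos(theta_i)
    return float(np.mean(cos_theta ** 2))
```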
Perform KDT on Subspace?
• By KPCA, we can obtain Pi ∈ R^{h×d} such that
  X̃i X̃i^T Pi = Pi Λi,
  where X̃i denotes the mapped image set Φ(Xi).
• Multiplying both sides by T^T gives
  (T^T X̃i)(T^T X̃i)^T (T^T Pi) = (T^T Pi) Λi.
• It can be observed that the kernel subspace of the transformed mapped image set T^T X̃i is equivalent to applying T^T to the original kernel subspace Pi.
KDT Optimization
• Each column of T = {t1,…,tq,…,tw} is expanded as tq = Σ_{u=1}^{M} αqu φ(xu).
• Write P̃ij = Pi Ri^{-1} Qij = {ẽ1^ij,…,ẽd^ij}, where
  ẽp^ij = Σ_{r=1}^{d} Σ_{s=1}^{ni} a_sr^i (Ri^{-1} Qij)_{rp} φ(x_s^i) = Σ_{s=1}^{ni} ã_sp^{ij} φ(x_s^i).
• Then T^T P̃ij = α^T Zij, where (Zij)_{up} = Σ_{s=1}^{ni} ã_sp^{ij} k(xu, x_s^i), u = 1,…,M, p = 1,…,d.
• Using the theory of reproducing kernels again, with S_W = Σ_{i=1}^{m} Σ_{k∈Wi} (P̃ik - P̃ki)(P̃ik - P̃ki)^T, we obtain
  T^T S_W T = α^T U α,   where U = Σ_{i=1}^{m} Σ_{k∈Wi} (Zik - Zki)(Zik - Zki)^T.
• Following similar steps with S_B = Σ_{i=1}^{m} Σ_{j∈Bi} (P̃ij - P̃ji)(P̃ij - P̃ji)^T, we obtain T^T S_B T = α^T V α.
• That is, J(T) = trace(T^T S_B T) / trace(T^T S_W T) becomes
  J(α) = trace(α^T V α) / trace(α^T U α).
Training: Dimensionality w of KDT vs. Identification Rate
• The identification rate stays above 90% once w exceeds 2,200.
Training: Similarity Matrix
• The similarity matrix becomes more discriminative after 10 rounds of iterative learning.
[Figure: 32 × 32 similarity matrices over subject ID numbers (similarity measured from C1^T C2) at the 1st and 10th iterations, with a similarity color bar]
KSG: Kernel Matrix
• The kernel matrix Kij describes the correlation between the i-th and the j-th image sets; by the kernel trick, its (s, r)-th element is
  (Kij)_{sr} = φ(x_s^i)^T φ(x_r^j) = k(x_s^i, x_r^j),   s = 1,…,ni,  r = 1,…,nj.
• Gaussian kernel function:
  k(x_s^i, x_r^j) = exp(-||x_s^i - x_r^j||² / σ²).
[Diagram: the ni × nj kernel matrix Kij between the images of the i-th and j-th image sets]
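A minimal NumPy sketch of this kernel matrix (function name ours; images are assumed to be flattened into row vectors, and σ follows the value reported in the training list):

```python
import numpy as np

def gaussian_kernel_matrix(X_i, X_j, sigma=0.05):
    """K_ij with entries exp(-||x_s^i - x_r^j||^2 / sigma^2); X_i is n_i x 1024, X_j is n_j x 1024."""
    sq_dists = ((X_i[:, None, :] - X_j[None, :, :]) ** 2).sum(axis=-1)   # n_i x n_j squared distances
    return np.exp(-sq_dists / sigma ** 2)
```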