Kernel Discriminant Analysis Based on Canonical Difference for Face Recognition in Image Sets
Wen-Sheng Chu (朱文生), Ju-Chin Chen (陳洳瑾), Jenn-Jier James Lien (連震杰)
Robotics Lab, CSIE, NCKU (http://robotics.csie.ncku.edu.tw)
CVGIP 2007
Motivation
• Challenges of face recognition
  – Facial variations: illumination, pose, facial expression
• Face recognition using image sets
  – Surveillance
  – Video retrieval
Why Multi-view Image Sets?
• Multiple facial images contain more information than a single image.
[Figure: deciding between Person A and Person B from a single input pattern (single-to-many matching) versus multiple input patterns (many-to-many matching)]
Training/Testing Data: Facial Expression
[Figure: for subject i, five image sets of facial-expression images, split into training and testing sets]
More Training/Testing Data: Illumination (Yale B)
[Figure: for subject j, five image sets of illumination-varying images from the Yale B database, split into training and testing sets]
System Overview
[Diagram: training and testing pipelines]
• Training process: each of the m training image sets {X1,…,Xm} (covering Subject 1 through Subject N) passes through Kernel Subspace Generation, giving m kernel subspaces Pi = {e1^i,…,ed^i}. The Kernel Discriminant Transformation (KDT) matrix T is then learned, and each reference subspace is computed as Refi = T^T Pi.
• Testing process: the testing image set Xtest passes through Kernel Subspace Generation to obtain Ptest, its reference subspace Reftest is formed, and comparison with the training reference subspaces yields the identification result.
Training Process
[Diagram: training path of the system overview]
• Each training image set Xi = {x1^i, x2^i, …, x_ni^i} consists of ni (≈ 100) normalized 32 × 32 face images.
• Kernel Subspace Generation produces one kernel subspace Pi = {e1^i,…,ed^i} per image set (m subspaces in total).
• The KDT matrix T is learned from all m subspaces, and each reference subspace is computed as Refi = T^T Pi.
Kernel Subspace Generation (KSG)
[Diagram: from an image set to its kernel subspace]
• Each image set Xi = {x1^i,…,x_ni^i} of ni normalized 32 × 32 images is mapped by a nonlinear function φ: x → φ(x) ∈ R^h.
• From the mapped images, the ni × nj kernel matrix Kij is computed.
• The kernel subspace Pi = {e1^i,…,ed^i} of Xi is then obtained, with d < ni.
KSG: Kernel PCA (KPCA)
• From the theory of reproducing kernels, each basis vector of the kernel subspace is a linear combination of the mapped images:
  e_p^i = Σ_{s=1}^{ni} a_sp^i φ(x_s^i),   p = 1,…,d,  d < ni,
  where x_s^i is the s-th image of the i-th image set and the coefficients a_sp^i come from the eigendecomposition (SVD) of the kernel matrix, Kii = Ai Λi Ai^T.
• The mapped feature space may be infinite-dimensional, so e_p^i is never formed explicitly; only the kernel matrix Kii of the i-th image set is needed.
[Diagram: each image set Xi = {x1,…,x_ni} yields a kernel matrix Kii and a kernel subspace Pi = {e1^i,…,e_p^i,…,e_d^i}]
KPCA: B. Schölkopf, A. Smola, and K.-R. Müller, Advances in Kernel Methods: Support Vector Learning, 1999.
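As a concrete illustration of this step, the following is a minimal NumPy sketch of KPCA on a precomputed kernel matrix. The function name and the unit-norm scaling of the coefficients are our own choices; the slides only give the expansion of e_p^i and the eigendecomposition of Kii.

```python
import numpy as np

def kpca_coefficients(K_ii, d):
    """Return A_i (n_i x d) so that e_p^i = sum_s A_i[s, p] * phi(x_s^i).

    K_ii : n_i x n_i kernel matrix of one image set.
    d    : number of kernel-subspace dimensions (d < n_i).
    """
    evals, evecs = np.linalg.eigh(K_ii)          # eigendecomposition of the symmetric kernel matrix
    evals, evecs = evals[::-1], evecs[:, ::-1]   # sort eigenvalues in descending order
    evals = np.maximum(evals[:d], 1e-12)         # guard against tiny or negative eigenvalues
    return evecs[:, :d] / np.sqrt(evals)         # scale so each e_p^i has unit norm in feature space
```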
Training Process: Kernel Discriminant Transformation (KDT)
[Diagram: the KDT step of the system overview; the m kernel subspaces Pi = {e1^i,…,ed^i} are fed into the Kernel Discriminant Transformation to learn T, which gives the reference subspaces Refi = T^T Pi]
KDT: Main Idea
• Based on the concept of LDA, KDT is derived to find a transformation matrix T (the KDT matrix).
• We propose an iterative process to optimize T; the dimensionality of T is assumed to be w.
• The objective has the LDA form
  T = argmax_T trace(T^T S_B T) / trace(T^T S_W T),
  where S_B is the between-subject scatter and S_W the within-subject scatter.
[Diagram: two image sets of 32 × 32 images → KPCA → d-dimensional kernel subspaces → T → w-dimensional subspaces. How should the similarity of two subspaces be measured?]
KDT: Canonical Difference (CD) – Similarity Measurement
• Canonical vectors capture more of the views and illumination common to two image sets than eigenvectors do.
[Figure: kernel subspaces P1 and P2 are turned into canonical subspaces C1 and C2; the canonical vector pairs (u1, v1) and (u2, v2) give difference vectors d1 and d2]
KDT: CD – Canonical Vector vs. Eigenvector (cont.)
[Figure: canonical vectors C1 and C2 and their difference C1 - C2, compared with eigenvectors B1 and B2 and their difference B1 - B2; the difference of canonical vectors provides a similarity measurement of two subspaces]
KDT: CD – Canonical Subspace (cont.)
• Consider the SVD of the product of two d-dimensional orthonormal basis matrices B1 and B2:
  B1^T B2 = Q12 Λ Q21^T   (equivalently, B2^T B1 = Q21 Λ Q12^T),
  C1 = B1 Q12,   C2 = B2 Q21,
  where C1 and C2 are the canonical subspaces (also orthonormal) and each eigenvalue satisfies 0 ≤ cos²θi ≤ 1.
• Similarity measurement of the two subspaces via the canonical difference:
  CanonicalDiff(i, j) = Σ_{r=1}^{d} ||u_r - v_r||² = trace[(Ci - Cj)(Ci - Cj)^T],
  where u_r and v_r are the r-th canonical vectors (columns of Ci and Cj).
T.-K. Kim, J. Kittler, and R. Cipolla, “Discriminative Learning and Recognition of Image Set Classes Using Canonical Correlations,” IEEE Trans. on PAMI, 2007.
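The canonical difference above requires only one SVD. The following NumPy sketch (function name ours; orthonormal input bases assumed) illustrates it.

```python
import numpy as np

def canonical_difference(B1, B2):
    """Canonical difference between two subspaces with orthonormal basis matrices B1, B2 (h x d)."""
    Q12, cos_theta, Q21T = np.linalg.svd(B1.T @ B2)   # B1^T B2 = Q12 diag(cos theta) Q21^T
    C1 = B1 @ Q12                                     # canonical subspace of B1
    C2 = B2 @ Q21T.T                                  # canonical subspace of B2
    D = C1 - C2                                       # columns are the differences u_r - v_r
    return float(np.sum(D ** 2)), cos_theta           # CanonicalDiff(i, j) and the canonical correlations
```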
KDT: KDT Matrix Optimization
• Orthonormal basis matrices are required to obtain the canonical subspaces Ci.
• Is the reference subspace Refi = T^T Pi orthonormal? Usually not!
[Diagram: iterative learning loop; kernel subspace Pi = {e1^i,…,ed^i} → (KDT matrix T) → reference subspace Refi = T^T Pi → canonical subspace Ci → canonical difference CanonicalDiff(i, j) → update of T based on LDA]
KDT: Kernel Subspace Normalization
• QR-decomposition is performed to obtain two orthonormal basis matrices:
  T^T Pi = Qi Ri,   i.e.,   Qi = T^T Pi Ri^{-1},
  where Qi is a w × d orthonormal matrix and Ri is a d × d invertible upper-triangular matrix.
• The canonical subspaces then follow from the SVD of Qi^T Qj:
  Qi^T Qj = Qij Λ Qji^T,   Ci = Qi Qij,   Cj = Qj Qji.
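A minimal sketch of this normalization step (function name ours; it pairs with the canonical_difference sketch above): QR-factorize each reference subspace, then apply the SVD alignment.

```python
import numpy as np

def normalized_canonical_subspaces(Ref_i, Ref_j):
    """QR-orthonormalize two reference subspaces (w x d), then align them by SVD."""
    Qi, Ri = np.linalg.qr(Ref_i)                 # T^T Pi = Qi Ri, with Qi a w x d orthonormal matrix
    Qj, Rj = np.linalg.qr(Ref_j)
    Qij, lam, QjiT = np.linalg.svd(Qi.T @ Qj)    # Qi^T Qj = Qij diag(lam) Qji^T
    return Qi @ Qij, Qj @ QjiT.T                 # canonical subspaces Ci, Cj
```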
KDT: Formulation
• With Qi = T^T Pi Ri^{-1} and canonical subspaces Ci = Qi Qij, Cj = Qj Qji, the canonical difference can be written in terms of T:
  CanonicalDiff(i, j) = trace[(Ci - Cj)(Ci - Cj)^T]
                      = trace[(Qi Qij - Qj Qji)(Qi Qij - Qj Qji)^T]
                      = trace[T^T (P̃ij - P̃ji)(P̃ij - P̃ji)^T T],
  where P̃ij = Pi Ri^{-1} Qij.
• Maximizing the canonical differences between subjects while minimizing those within subjects gives the form of LDA:
  T = argmax_T [Σ_{i=1}^{m} Σ_{j∈Bi} CanonicalDiff(i, j)] / [Σ_{i=1}^{m} Σ_{k∈Wi} CanonicalDiff(i, k)]
    = argmax_T trace(T^T S_B T) / trace(T^T S_W T),
  with
  S_B = Σ_{i=1}^{m} Σ_{j∈Bi} (P̃ij - P̃ji)(P̃ij - P̃ji)^T,   Bi = {j | Xj not of the same subject as Xi},
  S_W = Σ_{i=1}^{m} Σ_{k∈Wi} (P̃ik - P̃ki)(P̃ik - P̃ki)^T,   Wi = {k | Xk of the same subject as Xi}.
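To make the trace-ratio concrete, here is a small sketch that evaluates the numerator trace(T^T S_B T) and denominator trace(T^T S_W T) directly from the current reference subspaces Refi = T^T Pi, by summing canonical differences over between-subject and within-subject pairs. Function and variable names are ours.

```python
import numpy as np

def kdt_objective(refs, labels):
    """refs[i]: w x d reference subspace T^T Pi; labels[i]: subject id of image set i."""
    between, within = 0.0, 0.0
    ortho = [np.linalg.qr(R)[0] for R in refs]               # Qi from T^T Pi = Qi Ri
    for i in range(len(refs)):
        for j in range(len(refs)):
            if i == j:
                continue
            Qij, _, QjiT = np.linalg.svd(ortho[i].T @ ortho[j])
            D = ortho[i] @ Qij - ortho[j] @ QjiT.T            # Ci - Cj
            diff = np.sum(D ** 2)                             # CanonicalDiff(i, j)
            if labels[i] == labels[j]:
                within += diff                                # contributes to trace(T^T S_W T)
            else:
                between += diff                               # contributes to trace(T^T S_B T)
    return between / max(within, 1e-12)                      # trace-ratio objective J(T)
```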
KDT: Solution
• The scatter matrices
  S_B = Σ_{i=1}^{m} Σ_{j∈Bi} (P̃ij - P̃ji)(P̃ij - P̃ji)^T,   S_W = Σ_{i=1}^{m} Σ_{k∈Wi} (P̃ik - P̃ki)(P̃ik - P̃ki)^T
  live in the mapped feature space, whose dimensionality may be infinite, so T cannot be formed directly.
• Each column of T = {t1,…,tq,…,tw} is therefore expanded over the M mapped training images, so that tq contains the information of all training images:
  tq = Σ_{u=1}^{M} αqu φ(xu).
• The objective then becomes
  J(α) = trace(α^T V α) / trace(α^T U α).
KDT: Solution (cont.)
• Using the theory of reproducing kernels again,
  T^T S_W T = Σ_{i=1}^{m} Σ_{k∈Wi} T^T (P̃ik - P̃ki)(P̃ik - P̃ki)^T T.
• Replacing T through the kernel trick (tq = Σ_{u=1}^{M} αqu φ(xu)) gives
  T^T S_W T = α^T U α,   U = Σ_{i=1}^{m} Σ_{k∈Wi} (Zik - Zki)(Zik - Zki)^T,
  where (Zij)_{up} = Σ_{s=1}^{ni} ã_sp^{ij} k(xu, x_s^i), u = 1,…,M, p = 1,…,d, and ã_sp^{ij} are the expansion coefficients of P̃ij over the mapped images of Xi.
• Following similar steps, we obtain
  T^T S_B T = α^T V α,   V = Σ_{i=1}^{m} Σ_{j∈Bi} (Zij - Zji)(Zij - Zji)^T.
• Hence J(α) = trace(α^T V α) / trace(α^T U α), with T = {t1,…,tq,…,tw}.
KDT: Numerical Issues
• α is solved by simply computing the leading eigenvectors of U^{-1} V, which maximize J(α) = trace(α^T V α) / trace(α^T U α).
• To make sure U is positive definite, we regularize it as Uμ = U + μI with μ = 0.001.
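A minimal sketch of this solution step under the stated regularization (function name and the use of a general eigensolver are our own choices):

```python
import numpy as np

def solve_kdt_coefficients(U, V, w, mu=1e-3):
    """Leading eigenvectors of (U + mu*I)^{-1} V give the M x w coefficient matrix alpha."""
    U_reg = U + mu * np.eye(U.shape[0])                  # regularized U, positive definite
    evals, evecs = np.linalg.eig(np.linalg.solve(U_reg, V))
    order = np.argsort(-evals.real)                      # sort eigenvalues in descending order
    return evecs[:, order[:w]].real                      # alpha: one column per KDT dimension
```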
Training Process
[Diagram: training path of the system overview]
• Each reference subspace is Refi = T^T Pi, where each element is given by
  (T^T Pi)_{qp} = Σ_{u=1}^{M} Σ_{s=1}^{ni} αqu a_sp^i k(xu, x_s^i),
  so the reference subspaces are obtained entirely through kernel evaluations.
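Stacking the double sum above as matrix products gives a compact implementation; the sketch below assumes a precomputed M × ni block of kernel evaluations (names are ours).

```python
import numpy as np

def reference_subspace(alpha, K_cross, A_i):
    """Refi = T^T Pi via kernel evaluations only.

    alpha   : M x w expansion coefficients of the KDT matrix T.
    K_cross : M x n_i matrix with entries k(x_u, x_s^i) over all M training images x_u.
    A_i     : n_i x d KPCA coefficients of the kernel subspace Pi.
    """
    # (q, p) element = sum_u sum_s alpha_qu * a_sp^i * k(x_u, x_s^i)
    return alpha.T @ K_cross @ A_i      # w x d reference subspace
```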
Testing Process
[Diagram: the testing image set Xtest passes through Kernel Subspace Generation to obtain Ptest; its reference subspace Reftest = T^T Ptest is then compared with the training reference subspaces {Refi} to produce the identification result]
Training List
• Number of individuals (N): 32
• Image sets per individual: 3
• Images per set (ni): ~100
• Size of normalized template: 32 × 32
• Dimensionality: KMSM 30, KCMSM 30, DCC 20, KDT 30
• σ of the Gaussian kernel function: 0.05
• μ for regularization: 10^-3
Training: Convergence of the Objective Value J(α)
• J(α) tends to converge to the same value under different initializations.
Testing: Comparison with Other Methods
• The proposed KDT is compared with three related methods over 10 randomly chosen experiments (average identification rates):
  – KMSM: 0.837
  – KCMSM: 0.862
  – DCC: 0.889
  – KDT (proposed): 0.911
Conclusions
• The canonical difference is proposed as a similarity measurement between two subspaces.
• Based on the canonical difference, we derived the kernel discriminant transformation (KDT) and applied it to the proposed face recognition system.
• Our system recognizes faces from image sets and is robust to facial variations.
Thanks for your attention
Related Works
• Mutual subspace method (MSM)
• Constrained MSM (CMSM)
[Figure: subspaces U and V are projected onto a constrained subspace, giving Uc and Vc; the canonical angle changes from θ to θc]
• Discriminative canonical correlations (DCC)
• Kernel MSM (KMSM) and kernel CMSM (KCMSM)
Mutual Subspace Method (MSM)
• MSM utilizes the canonical angles between two subspaces as a similarity measure:
  similarity(u, v) = cos²θ,
  similarity(U, V) = (1/n) Σ_{i=1}^{n} cos²θi.
[Figure: subspaces B1 and B2 spanned by eigenvectors, with canonical vector pairs (u1, v1), (u2, v2) and canonical angles θ1, θ2]
K. Fukui and O. Yamaguchi, “Face Recognition Using Multi-viewpoint Patterns for Robot Vision,” ISRR 2003.
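For reference, the MSM similarity reduces to the mean squared singular value of U^T V. A minimal sketch (orthonormal bases assumed, function name ours):

```python
import numpy as np

def msm_similarity(U, V):
    """Mean squared cosine of the canonical angles between two orthonormal subspaces (h x n)."""
    cos_theta = np.linalg.svd(U.T @ V, compute_uv=False)   # singular values are cos(theta_i)
    return float(np.mean(cos_theta ** 2))
```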
Perform KDT on Subspace?
• By KPCA, we can obtain Pi ∈ R^{h×d} such that
  X̃i X̃i^T Pi = Pi Λi,
  where X̃i denotes the mapped image set Φ(Xi).
• Multiplying both sides by T^T gives
  (T^T X̃i)(T^T X̃i)^T (T^T Pi) = (T^T Pi) Λi.
• It can be observed that the kernel subspace of the transformed mapped image set T^T X̃i is equivalent to applying T^T to the original kernel subspace Pi.
KDT Optimization
• Each column of T = {t1,…,tq,…,tw} is expanded as tq = Σ_{u=1}^{M} αqu φ(xu).
• Write P̃ij = Pi Ri^{-1} Qij = {ẽ1^ij,…,ẽd^ij}, where
  ẽp^ij = Σ_{r=1}^{d} Σ_{s=1}^{ni} a_sr^i (Ri^{-1} Qij)_{rp} φ(x_s^i) = Σ_{s=1}^{ni} ã_sp^{ij} φ(x_s^i).
• Then T^T P̃ij = α^T Zij, where (Zij)_{up} = Σ_{s=1}^{ni} ã_sp^{ij} k(xu, x_s^i), u = 1,…,M, p = 1,…,d.
• Using the theory of reproducing kernels again, with S_W = Σ_{i=1}^{m} Σ_{k∈Wi} (P̃ik - P̃ki)(P̃ik - P̃ki)^T, we obtain
  T^T S_W T = α^T U α,   where U = Σ_{i=1}^{m} Σ_{k∈Wi} (Zik - Zki)(Zik - Zki)^T.
• Following similar steps with S_B = Σ_{i=1}^{m} Σ_{j∈Bi} (P̃ij - P̃ji)(P̃ij - P̃ji)^T, we obtain T^T S_B T = α^T V α.
• That is, J(T) = trace(T^T S_B T) / trace(T^T S_W T) becomes
  J(α) = trace(α^T V α) / trace(α^T U α).
Training: Dimensionality w of KDT vs. Identification Rate
• The identification rate stays above 90% once w exceeds 2,200.
Training: Similarity Matrix
• The similarity matrix becomes more discriminative after 10 rounds of iterative learning.
[Figure: 32 × 32 similarity matrices over subject ID numbers (similarity measured from C1^T C2) at the 1st and 10th iterations, with a similarity color bar]
KSG: Kernel Matrix
• The kernel matrix Kij describes the correlation between the i-th and the j-th image sets; by the kernel trick, its (s, r)-th element is
  (Kij)_{sr} = φ(x_s^i)^T φ(x_r^j) = k(x_s^i, x_r^j),   s = 1,…,ni,  r = 1,…,nj.
• Gaussian kernel function:
  k(x_s^i, x_r^j) = exp(-||x_s^i - x_r^j||² / σ²).
[Diagram: the ni × nj kernel matrix Kij between the images of the i-th and j-th image sets]
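A minimal NumPy sketch of this kernel matrix (function name ours; images are assumed to be flattened into row vectors, and σ follows the value reported in the training list):

```python
import numpy as np

def gaussian_kernel_matrix(X_i, X_j, sigma=0.05):
    """K_ij with entries exp(-||x_s^i - x_r^j||^2 / sigma^2); X_i is n_i x 1024, X_j is n_j x 1024."""
    sq_dists = ((X_i[:, None, :] - X_j[None, :, :]) ** 2).sum(axis=-1)   # n_i x n_j squared distances
    return np.exp(-sq_dists / sigma ** 2)
```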