Face Recognition
Eigen-faces with 99 PCA coefficients

Project Presentation
Ashish Tiwari
MAS622J/1.126J Pattern Classification and Analysis
Problem Statement

• Given:
  • 2000 training data points
  • 2000 test data points
  • faceDR and faceDS: description files for both the training and testing data points
  • faceR: Matlab data file containing the 99 PCA coefficients for the training images
  • faceS: Matlab data file containing the 99 PCA coefficients for the test images
• Required to classify:

  Gender:              Male / Female
  Age:                 Child / Teen / Adult / Senior
  Race:                White / Hispanic / Asian / Black / Other
  Facial Expressions:  Serious / Funny / Smiling
  Hat:                 Yes / No
  Moustache:           Yes / No
  Glasses:             Yes / No
  Bandana:             Yes / No
Objective

• The objective is to implement, test, and compare the performance of various pattern classification algorithms on different facial features, and to come up with a quantitative analysis of the effectiveness of the algorithms in different scenarios.
• To develop insight into a designer's perspective.
Summary of the Work

• Parzen window
• K-nearest neighbor
• Generalized linear discriminant
• Neural network
• Boosting (neural nets as component classifiers)

I have implemented all five algorithms for almost all feature classifications, excluding facial expressions.
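As an illustration of the first of these, a Parzen-window classifier can be sketched in a few lines. This is a generic Gaussian-kernel sketch, not the project's actual Matlab implementation; the window width h is a placeholder, and the toy training data below are invented for the example.

```python
import math

def parzen_classify(x, train, h=1.0):
    """Classify x by the Gaussian Parzen-window density estimate per class.

    train: dict mapping class label -> list of feature vectors.
    h: window width (a free parameter; the project's actual value is not
       stated, so 1.0 here is just a placeholder).
    """
    def kernel(a, b):
        d2 = sum((u - v) ** 2 for u, v in zip(a, b))
        return math.exp(-d2 / (2 * h * h))

    # Average kernel response per class ~ class-conditional density at x.
    scores = {c: sum(kernel(x, xi) for xi in pts) / len(pts)
              for c, pts in train.items()}
    return max(scores, key=scores.get)

# Toy usage: two well-separated 2-D classes.
train = {"male": [(0.0, 0.0), (0.2, 0.1)],
         "female": [(3.0, 3.0), (3.1, 2.9)]}
print(parzen_classify((0.1, 0.0), train))   # -> male
```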
Bad Data Issues

As invariably happens with any real-world problem database, this problem also contains a few bad faces, missing faces, and outliers.

In the implementation here, the faces with missing descriptors are removed from the database, whereas the outliers and bad faces are kept as they are.
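The filtering step amounts to a simple scan over the description records. The record layout below is an assumption for illustration only (the actual faceDR/faceDS format is not reproduced here); the point is just that records flagged as missing are dropped while everything else, outliers included, is kept.

```python
# Hypothetical descriptor records: one entry per face, where a bad
# entry is marked "missing" (assumed format, for illustration only).
records = [
    {"id": 1, "desc": "(sex male) (age adult)"},
    {"id": 2, "desc": "missing"},
    {"id": 3, "desc": "(sex female) (age teen)"},
]

# Keep every face whose descriptor is present; outliers are NOT removed.
clean = [r for r in records if "missing" not in r["desc"]]
print([r["id"] for r in clean])   # -> [1, 3]
```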
Eigen-faces: An Introduction

• Developed in 1991 by M. Turk
• Based on PCA
• Relatively simple
• Fast
• Robust
Eigenfaces

• PCA seeks directions that are efficient for representing the data
• PCA reduces the dimension of the data
• Speeds up the computation

[Figure: scatter plots of Class A and Class B illustrating efficient vs. not efficient projection directions]
Eigenfaces, the algorithm

• Original images: each N x N face is unrolled into an N²-dimensional column vector

  a = (a_1, a_2, ..., a_{N²})ᵀ,  b = (b_1, b_2, ..., b_{N²})ᵀ,  ...,  h = (h_1, h_2, ..., h_{N²})ᵀ
Eigenfaces, the algorithm

• The mean face can be computed as:

  m = (1/M) (a + b + ... + h),  where M = 8

  i.e. component-wise, m_i = (a_i + b_i + ... + h_i) / M for i = 1, ..., N².
Eigenfaces, the algorithm

• Then subtract it from the training faces:

  a_m = a − m,  b_m = b − m,  c_m = c − m,  d_m = d − m,
  e_m = e − m,  f_m = f − m,  g_m = g − m,  h_m = h − m
Eigenfaces, the algorithm

• Now we build the matrix A, which is N² by M:

  A = [a_m  b_m  c_m  d_m  e_m  f_m  g_m  h_m]

• The covariance matrix, which is N² by N²:

  Cov = A Aᵀ
Eigenfaces, the algorithm

• Find the eigenvalues of the covariance matrix
  – The matrix is very large
  – The computational effort is very big
• We are interested in at most M eigenvalues
  – In the given problem we have taken the 99 most significant eigen-directions to represent the data.
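The steps on the last few slides can be sketched end-to-end. This toy version uses the standard small-matrix trick (eigenvectors of the M x M matrix AᵀA instead of the huge N² x N² covariance A Aᵀ, since both have the same nonzero eigenvalues); random vectors stand in for real face images, and the sizes (M = 4, D = 9) are made up for the example.

```python
import random

# Toy eigenfaces pipeline. M = 4 "faces", each flattened to D = 9
# "pixels" (D plays the role of N^2 in the slides).
random.seed(0)
M, D = 4, 9
faces = [[random.random() for _ in range(D)] for _ in range(M)]

# 1. Mean face: m_i = (a_i + b_i + ...) / M, component-wise.
mean = [sum(f[i] for f in faces) / M for i in range(D)]

# 2. Subtract the mean face from every training face.
centered = [[f[i] - mean[i] for i in range(D)] for f in faces]

# 3. Small-matrix trick: the M x M matrix L = A^T A shares its nonzero
#    eigenvalues with the D x D covariance A A^T.
L = [[sum(centered[p][i] * centered[q][i] for i in range(D))
      for q in range(M)] for p in range(M)]

# 4. Power iteration for the dominant eigenvector of L.
v = [1.0] * M
for _ in range(500):
    w = [sum(L[p][q] * v[q] for q in range(M)) for p in range(M)]
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]

# 5. Map back to image space: the eigenface is A v, a D-dim direction.
#    Keeping the top 99 such directions gives 99 PCA coefficients per face.
eigenface = [sum(centered[p][i] * v[p] for p in range(M)) for i in range(D)]
```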
Results – Gender Classification

KNN       Male   Female   Accuracy
Male       917      360      71.8%
Female     297      422      58.6%
Overall Accuracy: 66.95%
NNET      Male        Female     Accuracy
Male       959/1277    318/719    75.09%
Female     304/1277    415/719    57.71%
Overall Accuracy: 63.15%

BOOSTING  Male   Female   Accuracy
Male       920      357     72.04%
Female     279      440     61.19%
Overall Accuracy: 68%
Parzen    Male   Female   Accuracy
Male      1276        1     99.92%
Female       0      719       100%
Overall Accuracy: 99.75%
Results – Gender Classification

Generalized Linear Discriminant
          Male   Female   Accuracy
Male      1024      253     80.18%
Female     267      452     62.86%
Overall Accuracy: 73.94%
Discussion – Gender Classification
Male / Female

• Very well-behaved data set.
• Both class examples are in good proportion.
• Good results are obtained using the Generalized linear discriminant.
• Even comparable results for the simplest 'Minimum distance classifier'.
• Neural net, GLD, and KNN results are almost the same.
• Wonderful performance by the Parzen window classifier.
Boosting – comments

Boosting does not seem to improve the performance any further for this class:
• Very few training points.
• Samples are not in equal proportion in the second dataset (D2), so the entire training set is not properly utilized.
Results – Race Classification

KNN        Correctly Classified   Accuracy
White      1657/1699              97.52%
Hispanic   0/13                   0%
Asian      1/26                   3.84%
Black      9/249                  3.61%
Other      0/9                    0%
Overall Accuracy: 83.35%
Parzen     Correctly Classified   Accuracy
White      1699/1699              100%
Hispanic   0/13                   0%
Asian      0/26                   0%
Black      0/249                  0%
Other      0/9                    0%
Overall Accuracy: 84.95%
Results – Race Classification

NNET       Correctly Classified   Accuracy
White      1476/1699              86.87%
Hispanic   0/13                   0%
Asian      1/26                   3.84%
Black      15/249                 5.79%
Other      1/9                    7.69%
Overall Accuracy: 74.65%
BOOSTING   Correct      Accuracy
White      1322/1699    77.81%
Hispanic   3/13         23.07%
Asian      9/26         34.61%
Black      12/249       4.63%
Other      8/9          88.83%
Overall Accuracy: 67.95%
Discussion – Race Classification
White / Hispanic / Asian / Black / Other

• Huge-dimensional space, few training points
• Very few examples of the categories other than White
• Classifiers tend to be biased towards the White class
• Caution: even a trivial classifier (classify everything as White) can attain an accuracy of 85%

What is the solution?
• Gather more training data for the different classes.
• Synthesize data from the existing samples.
• Boosting
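The trivial-baseline caution above is just arithmetic: 1699 of the 2000 test faces are labelled White (counts taken from the results tables), so a classifier that always answers "White" already scores:

```python
# Majority-class baseline on the race test set.
white, total = 1699, 2000
baseline = white / total
print(f"{baseline:.2%}")   # -> 84.95%
```

This is exactly the overall accuracy the Parzen window reaches by classifying every face as White, which is why the per-class accuracies, not the overall number, are the meaningful figures here.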
Race – Discussion

Let us try to scrutinize the reasons for the fairly poor performance of the Parzen window and K-nearest neighbor methods.

Reasons:
1. Very few training points are given for the classes other than the White class (85%), so by the inherent structure of the dataset, White points are the most likely to be present in any chosen volume in the space.
2. It is also clear from the projection of the data that the White data form a sphere in the space and the other classes are wrapped around that sphere like ribbons (this also happens when the data inherently have fewer dimensions than the space into which they are projected). Owing to the above-mentioned reason, whenever we grow a volume around a point, it is more likely to contain White points than points of the point's own class, because of its manifold character.
3. Had these faces been projected onto a lower-dimensional space, we should have obtained better results for KNN and Parzen. (Intuition)
[Figure: Fisher-projected race data in 3D. Courtesy – Larissa]
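The first reason can be made concrete with a toy simulation: when 85% of the training points come from one class and the classes overlap heavily in feature space, a KNN neighbourhood around a minority point is usually dominated by the majority class. This is a synthetic illustration with invented Gaussian data, not the project's dataset.

```python
import random

rng = random.Random(42)
# 85 "white" and 15 "other" points drawn from the SAME distribution,
# mimicking heavy class overlap in the PCA space.
train = [((rng.gauss(0, 1), rng.gauss(0, 1)), "white") for _ in range(85)]
train += [((rng.gauss(0, 1), rng.gauss(0, 1)), "other") for _ in range(15)]

def knn(x, k=5):
    """Majority vote among the k nearest training points."""
    by_dist = sorted(train,
                     key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2)
    votes = [lab for _, lab in by_dist[:k]]
    return max(set(votes), key=votes.count)

# Query at the minority training points themselves: even there, the
# surrounding volume is usually filled with majority-class points.
hits = sum(knn(pt) == "other" for pt, lab in train if lab == "other")
print(hits, "of the 15 minority points recovered by 5-NN")
```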
Results – Age Classification

KNN      Correctly Classified   Accuracy
Child    47/68                  69.11%
Teen     0/83                   0%
Adult    1688/1730              97.57%
Senior   1/115                  0.86%
Overall Accuracy: 86.97%
Parzen   Correctly Classified   Accuracy
Child    23/68                  33.82%
Teen     0/83                   0%
Adult    1707/1730              98.67%
Senior   0/115                  0%
Overall Accuracy: 86.67%
Results – Results – Age ClassificationAge Classification
NNETNNET Correctly Correctly ClassifiedClassified
AccuraAccuracycy
ChildChild 12/6812/68
(3)(3)69.1169.11
%%
TeenTeen 10/8310/83
(3)(3)0%0%
AdultAdult 1390/17301390/1730
(1695)(1695)97.5797.57
%%
SenioSeniorr
19/11519/115
(1)(1)0.860.86
%%
Overall Accuracy: Overall Accuracy: 71.69% 71.69%
Boosting  Correctly Classified   Accuracy
Child     16/68                  23.52%
Teen      27/83                  32.53%
Adult     1219/1730              70.46%
Senior    33/115                 28.69%
Overall Accuracy: 64.87%
Discussion – Age Classification

Q: Although it looks like children and teens are in almost equal proportion (68 vs. 83), why do both KNN and Parzen always favor child over teen?

A: This proportion actually reflects the test set; in the training set the picture is much clearer, where the number of child points (240) dominates the number of teen data points significantly.

NNET: Although the evidence curve suggested the highest accuracy could be achieved with n = 23 neurons in the hidden layer, I deliberately chose 8 neurons. The idea was to not allow the net to build complex boundaries, so that it may generalize well on unseen data, and the results support the argument too. Bracketed results are from n = 23 neurons.
Boosting – Comments

In almost all of the implementations, boosting helped improve the performance of the weak classes, but it has not shown very encouraging results. The reasons I sort out may be:
1. The amount of data needed to actually implement boosting was not sufficient.
2. In constructing the second dataset, we sometimes cannot find the proportion that maximizes disorder while at the same time utilizing the full training resource (a part of the expensive dataset is left unutilized).
3. More often than not, the condition used to construct the third dataset further excludes most of the remaining data points, hence reducing the number of samples available for training.
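The second point describes the classic "boosting by filtering" construction of D2: a set on which the first classifier is right about half the time. A minimal sketch of that step follows, with a hypothetical stand-in classifier instead of the project's component neural nets (the D3 step, points on which the first two classifiers disagree, is analogous and omitted).

```python
import random

def build_d2(data, labels, c1, seed=0):
    """Build boosting's second dataset D2: equal numbers of points the
    first classifier c1 gets right and wrong, so D2 is maximally
    'disordered' for c1. Returns (D2 indices, count of unused points)."""
    rng = random.Random(seed)
    right = [i for i, (x, y) in enumerate(zip(data, labels)) if c1(x) == y]
    wrong = [i for i, (x, y) in enumerate(zip(data, labels)) if c1(x) != y]
    n = min(len(right), len(wrong))      # balance caps D2 at the smaller pool
    d2 = rng.sample(right, n) + rng.sample(wrong, n)
    return d2, len(data) - len(d2)       # the part "left unutilized"

# Toy run: 8 points of class 0, 2 of class 1, and a weak c1 that always
# answers 0 (right on 8 points, wrong on 2).
data = list(range(10))
labels = [0] * 8 + [1] * 2
d2, unused = build_d2(data, labels, lambda x: 0)
print(len(d2), unused)   # -> 4 6
```

The toy run shows the problem the slide describes: with imbalanced correctness, D2 can use only 4 of the 10 training points, leaving 6 expensive samples on the table.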
Results – Bandana Classification

KNN          YES    NO     Accuracy
BANDANA        1      7     12.5%
NO BANDANA     0   1988     100%
Overall Accuracy: 99.64%

Parzen       YES    NO     Accuracy
BANDANA        8      0     100%
NO BANDANA     0   1988     100%
Overall Accuracy: 100%

NNET         YES    NO     Accuracy
BANDANA        0      8     0%
NO BANDANA     0   1988     100%
Overall Accuracy: 99.49%

BOOSTING     YES    NO     Accuracy
BANDANA        3      5     37.5%
NO BANDANA     1   1987     99.94%
Overall Accuracy: 99.69%
Results – Moustache Classification

KNN            NO    YES    Accuracy
NO MOUSTACHE  1562     0     100%
MOUSTACHE      434     0     0%
Overall Accuracy: 78.25%

Parzen         NO    YES    Accuracy
NO MOUSTACHE  1561     1     99.93%
MOUSTACHE        7   427     98.38%
Overall Accuracy: 99.59%

NNET           NO    YES    Accuracy
NO MOUSTACHE   382    52     88.01%
MOUSTACHE      122  1440     92.18%
Overall Accuracy: 91.28%

BOOSTING       NO    YES    Accuracy
NO MOUSTACHE   366    68     84.33%
MOUSTACHE      240  1322     84.63%
Overall Accuracy: 84.56%
Results – Glasses Classification

KNN          NO    YES    Accuracy
NO GLASSES  1987     1     99.94%
GLASSES        8     0     0%
Overall Accuracy: 99.54%

Parzen       NO    YES    Accuracy
NO GLASSES  1988     0     100%
GLASSES        0     8     100%
Overall Accuracy: 100%

NNET         NO    YES    Accuracy
NO GLASSES  1960    28     98.59%
GLASSES        7     1     12.5%
Overall Accuracy: 98.24%

BOOSTING     NO    YES    Accuracy
NO GLASSES  1955    33     98.34%
GLASSES        5     3     37.5%
Overall Accuracy: 98.09%
Results – Beard Classification

KNN        NO    YES    Accuracy
NO BEARD  1701    19     98.89%
BEARD      275     1     0.36%
Overall Accuracy: 85.27%

Parzen     NO    YES    Accuracy
NO BEARD  1720     0     100%
BEARD        0   276     100%
Overall Accuracy: 100%

NNET       NO    YES    Accuracy
NO BEARD  1696    24     98.60%
BEARD      271     5     1.84%
Overall Accuracy: 85.22%

BOOSTING   NO    YES    Accuracy
NO BEARD  1688    32     98.13%
BEARD       61   215     77.89%
Overall Accuracy: 95.34%
Conclusion

In this project I have made an effort to compare various algorithms on various classification tasks such as gender, age, race, and other properties like hat, moustache, etc., and wherever possible I have also tried to compare the results with those obtained from the simplest 'Minimum distance classifier'.

Gender:                                   PW, GLD, KNN
Race:                                     NNET
Age:                                      NNET with Boosting
Moustache, Hat, Beard, Bandana, Glasses:  PW