FACE RECOGNITION USING GABOR WAVELET TRANSFORM

A THESIS SUBMITTED TO
THE GRADUATE SCHOOL OF NATURAL SCIENCES
OF
THE MIDDLE EAST TECHNICAL UNIVERSITY

BY

BURCU KEPENEKCI

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE
OF
MASTER OF SCIENCE
IN
THE DEPARTMENT OF ELECTRICAL AND ELECTRONICS ENGINEERING

SEPTEMBER 2001
ABSTRACT
FACE RECOGNITION USING GABOR WAVELET TRANSFORM

Kepenekci, Burcu
M.S., Department of Electrical and Electronics Engineering
Supervisor: A. Aydın Alatan
Co-Supervisor: Gözde Bozdağı Akar
September 2001, 118 pages

Face recognition is emerging as an active research area with numerous commercial and law enforcement applications. Although existing methods perform well under certain conditions, illumination changes, out-of-plane rotations, and occlusions remain challenging problems. The proposed algorithm deals with two of these problems, namely occlusion and illumination changes. In our method, the Gabor wavelet transform is used for facial feature vector construction due to its powerful representation of the behavior of receptive fields
in the human visual system (HVS). The method is based on selecting peaks (high-energized points) of the Gabor wavelet responses as feature points. Compared to the predefined graph nodes of elastic graph matching, our approach makes better use of the representative capability of Gabor wavelets. The feature points are automatically extracted using the local characteristics of each individual face in order to decrease the effect of occluded features. Since there is no training as in neural network approaches, a single frontal face for each individual is enough as a reference. The experimental results with standard image libraries show that the proposed method performs better than the graph matching and eigenface based methods.

Keywords: Automatic face recognition, Gabor wavelet transform, human face perception.
ÖZ

FACE RECOGNITION USING GABOR WAVELETS

Kepenekci, Burcu
M.S., Department of Electrical and Electronics Engineering
Supervisor: A. Aydın Alatan
Co-Supervisor: Gözde Bozdağı Akar
September 2001, 118 pages

Face recognition is a problem with a growing number of applications in both the commercial and legal domains. Although existing face recognition methods give successful results in controlled environments, occlusion, orientation, and illumination changes are still unsolved problems in face recognition. The proposed method addresses two of these problems: illumination changes and the effect of occlusion. In this work, both the facial feature points and the feature vectors are obtained using the Gabor wavelet transform.
The Gabor wavelet transform is used because it models the behavior of the receptive fields in the human visual system. The proposed method is based on selecting the peaks (high-energy points) of the Gabor wavelet responses as feature points, instead of predefined graph nodes; in this way the Gabor wavelets are used in the most efficient manner. The feature points are found automatically using the different local characteristics of each face, and as a result the effect of occluded features is reduced. Since there is no learning phase, as in neural network approaches, a single frontal face image of each person is sufficient for recognition. The experiments show that the proposed method gives more successful results than the existing graph matching and eigenface methods.

Keywords: Face recognition, Gabor wavelet transform, human face perception, feature extraction, feature matching, pattern recognition.
ACKNOWLEDGEMENTS
I would like to express my gratitude to Assoc. Prof. Dr. Gözde Bozdağı Akar and Assist. Prof. Dr. A. Aydın Alatan for their guidance, suggestions, and insight throughout this research. I also thank F. Boray Tek for his support, useful discussions, and great friendship. To my family, I offer sincere thanks for their unshakable faith in me. Special thanks to my friends in the Signal Processing and Remote Sensing Laboratory for their reassurance during thesis writing. I would like to acknowledge Dr. Uğur Murat Leloğlu for his comments. Most importantly, I express my appreciation to Assoc. Prof. Dr. Mustafa Karaman for his guidance during my undergraduate studies and for encouraging me to pursue research.
TABLE OF CONTENTS

ABSTRACT
ÖZ
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES

CHAPTER
1. INTRODUCTION
1.1. Why Face Recognition?
1.2. Problem Definition
1.3. Organization of the Thesis
2. PAST RESEARCH ON FACE RECOGNITION
2.1. Human Face Recognition
2.1.1. Discussion
2.2. Automatic Face Recognition
2.2.1. Representation, Matching and Statistical Decision
2.2.2. Early Face Recognition Methods
2.2.3. Statistical Approaches to Face Recognition
2.2.3.1. Karhunen-Loeve Expansion Based Methods
2.2.3.1.1. Eigenfaces
2.2.3.1.2. Face Recognition Using Eigenfaces
2.2.3.1.3. Eigenfeatures
2.2.3.1.4. Karhunen-Loeve Transform of the Fourier Transform
2.2.3.2. Linear Discriminant Methods - Fisherfaces
2.2.3.2.1. Fisher's Linear Discriminant
2.2.3.2.2. Face Recognition Using Linear Discriminant Analysis
2.2.3.3. Singular Value Decomposition Methods
2.2.3.3.1. Singular Value Decomposition
2.2.3.3.2. Face Recognition Using Singular Value Decomposition
2.2.4. Hidden Markov Model Based Methods
2.2.5. Neural Networks Approach
2.2.6. Template Based Matching
2.2.7. Feature Based Matching
2.2.8. Current State of the Art
3. FACE COMPARISON AND MATCHING USING GABOR WAVELETS
3.1. Face Representation Using Gabor Wavelets
3.1.1. Gabor Wavelets
3.1.2. 2D Gabor Wavelet Representation of Faces
3.1.3. Feature Extraction
3.1.3.1. Feature-point Localization
3.1.3.2. Feature-vector Extraction
3.2. Matching Procedure
3.2.1. Similarity Calculation
3.2.2. Face Comparison
4. RESULTS
4.1. Simulation Setup
4.1.1. Results for University of Stirling Face Database
4.1.2. Results for Purdue University Face Database
4.1.3. Results for The Olivetti and Oracle Research Laboratory (ORL) Face Database
4.1.4. Results for FERET Face Database
5. CONCLUSIONS AND FUTURE WORK
REFERENCES
LIST OF TABLES

4.1 Recognition performances of eigenface, eigenhills and the proposed method on the Purdue face database
4.2 Performance results of well-known algorithms on the ORL database
4.3 Probe sets and their goal of evaluation
4.4 Probe sets for FERET performance evaluation
4.5 FERET performance evaluation results for various face recognition algorithms
LIST OF FIGURES

2.1 Appearance model of the Eigenface Algorithm
2.2 Discriminative model of the Eigenface Algorithm
2.3 Image sampling technique for HMM recognition
2.4 HMM training scheme
2.5 HMM recognition scheme
2.6 Auto-association and classification networks
2.7 The diagram of the Convolutional Neural Network System
2.8 A 2D image lattice (grid graph) on Marilyn Monroe's face
2.9 A bunch graph
2.10 Bunch graph matched to a face
3.1 An ensemble of Gabor wavelets
3.2 Small set of features can recognize faces uniquely, and receptive fields that are matched to the local features on the face
3.3 Gabor filters corresponding to 5 spatial frequencies and 8 orientations
3.4 Example of a facial image response to Gabor filters
3.5 Facial feature points found as the high-energized points of Gabor wavelet responses
3.6 Flowchart of the feature extraction stage of the facial images
3.7 Test faces vs. matching face from gallery
4.1 Examples of different facial expressions from the Stirling database
4.2 Example of different facial images for a person from the Purdue database
4.3 Whole set of face images of 40 individuals, 10 images per person
4.4 Example of misclassified faces of the ORL database
4.5 Example of different facial images for a person from the ORL database that are placed at training or probe sets by different …
4.6 FERET identification performances against fb probes
4.7 FERET identification performances against duplicate I probes
4.8 FERET identification performances against fc probes
4.9 FERET identification performances against duplicate II probes
4.10 FERET average identification performances
4.11 FERET current upper bound identification performances
4.12 FERET identification performance of proposed method against fb probes
4.13 FERET identification performance of proposed method against fc probes
4.14 FERET identification performance of proposed method against duplicate I probes
4.15 FERET identification performance of proposed method against duplicate II probes
CHAPTER 1

INTRODUCTION

Machine recognition of faces is emerging as an active research area spanning several disciplines such as image processing, pattern recognition, computer vision, and neural networks. Face recognition technology has numerous commercial and law enforcement applications. These applications range from static matching of controlled format photographs such as passports, credit cards, photo IDs, driver's licenses, and mug shots, to real-time matching of surveillance video images [82].

Humans seem to recognize faces in cluttered scenes with relative ease, having the ability to identify distorted images, coarsely quantized images, and faces with occluded details. Machine recognition is a much more daunting task. Understanding the human mechanisms employed to recognize faces constitutes a challenge for psychologists and neural scientists. In addition to the cognitive aspects, understanding face recognition is important, since the same underlying
mechanisms could be used to build a system for the automatic identification of faces by machine.

A formal method of classifying faces was first proposed by Francis Galton in 1888 [53, 54]. During the 1980s, work on face recognition remained largely dormant. Since the 1990s, research interest in face recognition has grown significantly as a result of the following facts:
1. The increase in emphasis on civilian/commercial research projects,
2. The re-emergence of neural network classifiers with emphasis on real-time computation and adaptation,
3. The availability of real-time hardware,
4. The increasing need for surveillance-related applications due to drug trafficking, terrorist activities, etc.

Still, most access control methods, with all their legitimate applications in an expanding society, have a bothersome drawback. Except for human and voice recognition, these methods require the user to remember a password, to enter a PIN code, to carry a badge, or, in general, require a human action in the course of identification or authentication. In addition, the corresponding means (keys, badges, passwords, PIN codes) are prone to being lost or forgotten, whereas fingerprints and retina scans suffer from low user acceptance. Modern face recognition has reached an identification rate greater than 90% with well-controlled pose and illumination conditions. While this is a high rate for face recognition, it is not comparable to methods using keys, passwords, or badges.
1.1. Why Face Recognition?

Within today's environment of increased importance of security and organization, identification and authentication methods have developed into a key technology in various areas: entrance control in buildings; access control for computers in general or for automatic teller machines in particular; day-to-day affairs like withdrawing money from a bank account or dealing with the post office; or in the prominent field of criminal investigation. Such requirements for reliable personal identification in computerized access control have resulted in an increased interest in biometrics.

Biometric identification is the technique of automatically identifying or verifying an individual by a physical characteristic or personal trait. The term "automatically" means the biometric identification system must identify or verify a human characteristic or trait quickly with little or no intervention from the user. Biometric technology was developed for use in high-level security systems and law enforcement markets. The key element of biometric technology is its ability to identify a human being and enforce security [83].

Biometric characteristics and traits are divided into behavioral and physical categories. Behavioral biometrics encompasses such behaviors as signature and typing rhythms. Physical biometric systems use the eye, finger, hand, voice, and face for identification.

A biometric-based system was developed by Recognition Systems Inc., Campbell, California, as reported by Sidlauskas [73]. The system was called ID3D Handkey and used the three-dimensional shape of a person's hand to
distinguish people. The side and top views of a hand positioned in a controlled capture box were used to generate a set of geometric features. Capturing takes less than two seconds, and the data could be stored efficiently in a 9-byte feature vector. This system could store up to 20,000 different hands.

Another well-known biometric measure is that of fingerprints. Various institutions around the world have carried out research in the field. Fingerprint systems are unobtrusive and relatively cheap to buy. They are used in banks and to control entrance to restricted access areas. Fowler [51] has produced a short summary of the available systems.

Fingerprints are unique to each human being. It has been observed that the iris of the eye, like fingerprints, displays patterns and textures unique to each human and that it remains stable over decades of life, as detailed by Siedlarz [74]. Daugman designed a robust pattern recognition method based on 2-D Gabor transforms to classify human irises.

Speech recognition also offers one of the most natural and least obtrusive biometric measures, where a user is identified through his or her spoken words. AT&T have produced a prototype that stores a person's voice on a memory card, details of which are described by Mandelbaum [67].

While appropriate for bank transactions and entry into secure areas, such technologies have the disadvantage that they are intrusive both physically and socially. They require the user to position their body relative to the sensor, then pause for a second to declare himself or herself. This pause-and-declare interaction is unlikely to change because of the fine-grain spatial sensing required. Moreover,
since people cannot recognize other people using this sort of data, these types of identification do not have a place in normal human interactions and social structures.

While pause-and-present interactions are useful in high-security applications, they are exactly the opposite of what is required when building a store that recognizes its best customers, or an information kiosk that remembers you, or a house that knows the people who live there.

A face recognition system would allow a user to be identified by simply walking past a surveillance camera. Human beings often recognize one another by unique facial characteristics. One of the newest biometric technologies, automatic facial recognition, is based on this phenomenon. Facial recognition is the most successful form of human surveillance. Facial recognition technology, which is being used to improve human efficiency when recognizing faces, is one of the fastest growing fields in the biometric industry. Interest in facial recognition is being fueled by the availability and low cost of video hardware, the ever-increasing number of video cameras being placed in the workspace, and the noninvasive aspect of facial recognition systems.

Although facial recognition is still in the research and development phase, several commercial systems are currently available, and research organizations, such as Harvard University and the MIT Media Lab, are working on the development of more accurate and reliable systems.
1.2. Problem Definition

A general statement of the problem can be formulated as follows: given still or video images of a scene, identify one or more persons in the scene using a stored database of faces.

The environment surrounding a face recognition application can cover a wide spectrum, from a well-controlled environment to an uncontrolled one. In a controlled environment, frontal and profile photographs are taken, complete with uniform background and identical poses among the participants. These face images are commonly called mug shots. Each mug shot can be manually or automatically cropped to extract a normalized subpart called a canonical face image. In a canonical face image, the size and position of the face are normalized approximately to predefined values, and the background region is minimal.

General face recognition, a task that is done by humans in daily activities, takes place in a virtually uncontrolled environment. Systems which automatically recognize faces in an uncontrolled environment must first detect faces in images. The face detection task is to report the location, and typically also the size, of all the faces in a given image; it is a completely different problem from face recognition.

Face recognition is a difficult problem due to the generally similar shape of faces combined with the numerous variations between images of the same face. Recognition of faces in an uncontrolled environment is a very complex task: lighting conditions may vary tremendously; facial expressions also vary from time to time; a face may appear at different orientations; and a face can be partially
occluded. Further, depending on the application, handling facial features over time (aging) may also be required.

Although existing methods perform well under constrained conditions, the problems of illumination changes, out-of-plane rotations, and occlusions remain unsolved. The proposed algorithm deals with two of these three important problems, namely occlusion and illumination changes.

Since the techniques used in the best face recognition systems may depend on the application of the system, one can identify at least two broad categories of face recognition systems [19]:
1. Finding a person within a large database of faces (e.g. in a police database). (Often only one image is available per person. It is usually not necessary for recognition to be done in real time.)
2. Identifying particular people in real time (e.g. a location tracking system). (Multiple images per person are often available for training, and real-time recognition is required.)

In this thesis, we are primarily interested in the first case. Detection of the face is assumed to be done beforehand. We aim to provide the correct label (e.g. a name label) associated with a face from all the individuals in the database, in the presence of occlusions and illumination changes. The database of faces stored in a system is called the gallery. In the gallery, there exists only one frontal view of each individual. We do not consider cases with high degrees of rotation, i.e. we assume that a minimal preprocessing stage is available if required.
1.3. Organization of the Thesis

Over the past 20 years, extensive research has been conducted by psychophysicists, neuroscientists, and engineers on various aspects of face recognition by humans and machines. In Chapter 2, we summarize the literature on both human and machine recognition of faces.

Chapter 3 introduces the proposed approach based on the Gabor wavelet representation of face images. The algorithm is explained explicitly.

The performance of our method is examined on four different standard face databases with different characteristics. Simulation results and their comparisons to well-known face recognition methods are presented in Chapter 4.

In Chapter 5, concluding remarks are stated. Future work, which may follow this study, is also presented.
CHAPTER 2

PAST RESEARCH ON FACE RECOGNITION

The task of recognizing faces has attracted much attention both from neuroscientists and from computer vision scientists. This chapter reviews some of the well-known approaches from both of these fields.

2.1. Human Face Recognition: Perceptual and Cognitive Aspects

The major research issues of interest to neuroscientists include the human capacity for face recognition, the modeling of this capability, and the apparent modularity of face recognition. In this section, some findings reached as the result of experiments on the human face recognition system, and potentially relevant to the design of face recognition systems, will be summarized.

One of the basic issues that has been argued by several scientists is the existence of a dedicated face processing system [82, 3]. Physiological evidence
indicates that the brain possesses specialized face recognition hardware in the form of face detector cells in the inferotemporal cortex and regions in the frontal right hemisphere; impairment in these areas leads to a syndrome known as prosopagnosia. Interestingly, prosopagnosics, although unable to recognize familiar faces, retain their ability to visually recognize non-face objects. As a result of many studies, scientists have come to the conclusion that face recognition is not like other object recognition [42].

Hence, the question is what features humans use for face recognition. The results of the related studies are very valuable in the algorithm design of some face recognition systems. It is interesting that when all facial features like the nose, mouth, eyes, etc. are contained in an image, but in a different order than the ordinary one, recognition is not possible for humans. Explaining face perception as the result of holistic or feature analysis alone is not possible, since both are true. In humans, both global and local features are used in a hierarchical manner [82]. Local features provide a finer classification system for face recognition. Simulations show that the most difficult faces for humans to recognize are those faces which are neither attractive nor unattractive [4]. Distinctive faces are more easily recognized than typical ones. Information contained in low frequency bands is used in order to determine the sex of the individual, while the higher frequency components are used in recognition. The low frequency components contribute to the global description, while the high frequency components contribute to the finer details required in the identification task [8, 11, 13]. It has
also been found that the upper part of the face is more useful for recognition than the lower part [82].

In [42], Bruce explains an experiment realized by superimposing the low spatial frequency components of Margaret Thatcher's face on the high spatial frequency components of Tony Blair's face. Although when viewed close up only Tony Blair was seen, when viewed from a distance Blair disappears and Margaret Thatcher becomes visible. This demonstrates that the important information for recognizing familiar faces is contained within a particular range of spatial frequencies.

Another important finding is that the human face recognition system is disrupted by changes in lighting direction and also by changes of viewpoint. Although some scientists tend to explain the human face recognition system based on the derivation of 3D models of faces using shape-from-shading derivatives, it is difficult to understand why face recognition appears so viewpoint dependent [1]. The effects of lighting change on face identification and matching suggest that representations for face recognition are crucially affected by changes in low-level image features.

Bruce and Langton found that negation (inverting both hue and luminance values of an image) badly affects the identification of familiar faces [124]. They also observed that negation has no significant effect on the identification and matching of surface images that lack any pigmented and textured features; this led them to attribute the negation effect to the alteration of the brightness information of pigmented areas. A negative image of a dark-haired Caucasian, for example, will appear to be a blonde with dark skin. Kemp et al. [125] showed that the hue
values of these pigmented regions do not themselves matter for face identification. Familiar faces presented in hue-negated versions, with preserved luminance values, were recognized as well as those with the original hue values maintained, though there was a decrement in recognition memory for pictures of faces when hue was altered in this way [126]. This suggests that episodic memory for pictures of unfamiliar faces can be sensitive to hue, though the representations of familiar faces seem not to be. This distinction between memory for pictures and faces is important. It is clear that recognition of familiar and unfamiliar faces is not the same for humans. It is likely that unfamiliar faces are processed in order to recognize a picture, whereas familiar faces are fed into the face recognition system of the human brain. A detailed discussion of recognizing familiar and unfamiliar faces can be found in [41].

Young children typically recognize unfamiliar faces using unrelated cues such as glasses, clothes, hats, and hairstyle. By the age of twelve, these paraphernalia are usually reliably ignored. Curiously, when children as young as five years are asked to recognize familiar faces, they do pretty well in ignoring paraphernalia. Several other interesting studies related to how children perceive inverted faces are summarized in [6, 7].

Humans recognize people from their own race better than people from another race. Humans may encode an average face; these averages may be different for different races, and recognition may suffer from prejudice and unfamiliarity with the class of faces from another race or gender [82]. The poor identification of other races is not a psychophysical problem but more likely a
psychosocial one. One of the interesting results of the studies to quantify the role of gender in face recognition is that in the Japanese population, the majority of the women's facial features are more heterogeneous than the men's features. It has also been found that white women's faces are slightly more variable than men's, but that the overall variation is small [9, 10].

2.1.1. Discussion

The recognition of familiar faces plays a fundamental role in our social interactions. Humans are able to identify reliably a large number of faces, and psychologists are interested in understanding the perceptual and cognitive mechanisms at the base of the face recognition process. This research informs the studies of computer vision scientists.

We can summarize the findings of the studies on the human face recognition system as follows:
1. The human capacity for face recognition is a dedicated process, not merely an application of the general object recognition process. Thus artificial face recognition systems should also be face specific.
2. Distinctive faces are more easily recognized than typical ones.
3. Both global and local features are used for representing and recognizing faces.
4. Humans recognize people from their own race better than people from another race. Humans may encode an average face.
5. Certain image transformations, such as intensity negation, strange viewpoint changes, and changes in lighting direction can severely disrupt human face recognition.

Using present technology it is impossible to completely model the human recognition system and reach its performance. However, the human brain has its shortcomings in the total number of persons that it can accurately remember. The benefit of a computer system would be its capacity to handle large datasets of face images.

The observations and findings about the human face recognition system will be a good starting point for automatic face recognition methods. As mentioned above, an automated face recognition system should be face specific. It should effectively use features that discriminate a face from others, and, as in caricatures, it should preferably amplify such distinctive characteristics of a face [5, 13].

The difference between recognition of familiar and unfamiliar faces must also be noticed. First of all, we should find out what makes us familiar with a face. Does seeing a face in many different conditions (different illuminations, rotations, expressions, etc.) make us familiar with that face, or can we become familiar with a face just by frequently looking at the same face image? Seeing a face in many different conditions is something related to training; however, the interesting point is how, using only the same 2D information, we can pass from unfamiliarity to familiarity. Methods which recognize faces from a single view should pay attention to this familiarity subject.
Some of the early scientists were inspired by watching bird flight and built their vehicles with mobile wings. Although a single underlying principle, the Bernoulli effect, explains both biological and man-made flight, we note that no modern aircraft has flapping wings. Designers of face recognition algorithms and systems should be aware of relevant psychophysics and neurophysiological studies, but should be prudent in using only those that are applicable or relevant from a practical/implementation point of view.

2.2. Automatic Face Recognition

Although humans perform face recognition in an effortless manner, the underlying computations within the human visual system are of tremendous complexity. The seemingly trivial task of finding and recognizing faces is the result of millions of years of evolution, and we are far away from fully understanding how the brain performs this task.

To date, no complete solution has been proposed that allows the automatic recognition of faces in real images. In this section we will review existing face recognition systems in five categories: early methods, neural network approaches, statistical approaches, template based approaches, and feature based methods. Finally, the current state of the art of face recognition technology will be presented.
2.2.1. Representation, Matching and Statistical Decision

The performance of face recognition depends on the solution of two problems: representation and matching.

At an elementary level, the image of a face is a two-dimensional (2-D) array of pixel gray levels,

x = { x_{i,j} : i, j ∈ S },   (2.1)

where S is a square lattice. However, in some cases it is more convenient to express the face image x as a one-dimensional (1-D) column vector of concatenated rows of pixels,

x = [x_1, x_2, ..., x_n]^T,   (2.2)

where n = |S| is the total number of pixels in the image. Therefore x ∈ R^n, the n-dimensional Euclidean space.

For a given representation, two properties are important: discriminating power and efficiency, i.e. how far apart the faces are under the representation and how compact the representation is.

While many previous techniques represent faces in their most elementary forms of (2.1) or (2.2), many others use a feature vector F(x) = [f_1(x), f_2(x), ..., f_m(x)]^T, where f_1(·), f_2(·), ..., f_m(·) are linear or nonlinear functionals. Feature-based representations are usually more efficient, since generally m is much smaller than n.
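The raster-to-vector rearrangement of (2.1)-(2.2) and a feature mapping F(x) can be sketched as follows; the 8×8 image size and the particular functionals f_1, f_2, f_3 are illustrative assumptions, not choices made in this thesis:

```python
import numpy as np

# A face image as a 2-D array of gray levels on a square lattice S (cf. (2.1)).
# The 8x8 size is an illustrative assumption.
face = np.random.default_rng(0).integers(0, 256, size=(8, 8))

# Concatenate the rows into a 1-D column vector x in R^n (cf. (2.2)).
x = face.flatten().astype(float)
n = x.size  # n = |S|, the total number of pixels

# A feature vector F(x) = [f1(x), ..., fm(x)]^T built from hypothetical
# linear and nonlinear functionals; m << n gives a representation far more
# compact than the raw pixel vector.
def F(x):
    return np.array([
        x.mean(),                    # f1: a linear functional
        x.std(),                     # f2: a nonlinear functional
        np.abs(np.diff(x)).mean(),   # f3: nonlinear, mean local contrast
    ])

features = F(x)
print(n, features.size)  # 64 pixels compressed to m = 3 features
```

Whether such a compact F(x) retains discriminating power is exactly the caveat raised below: efficiency alone does not guarantee that different faces stay far apart.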
A simple way to achieve good efficiency is to use an alternative orthonormal basis of R^n. Specifically, suppose e_1, e_2, ..., e_n form an orthonormal basis. Then x can be expressed as

x = Σ_{i=1}^{n} x̃_i e_i,   (2.3)

where x̃_i = ⟨x, e_i⟩ (the inner product), and x can be equivalently represented by x̃ = [x̃_1, x̃_2, ..., x̃_n]^T. Two examples of orthonormal bases are the natural basis used in (2.2), with e_i = [0, ..., 0, 1, 0, ..., 0]^T, where the one is in the i-th position, and the Fourier basis,

e_i = n^{-1/2} [1, e^{j2πi/n}, e^{j2π(2i)/n}, ..., e^{j2π(n-1)i/n}]^T.

If, for a given orthonormal basis, the coefficients x̃_i are small when i ≥ m, then the face vector x̃ can be compressed into an m-dimensional vector, [x̃_1, x̃_2, ..., x̃_m]^T.

It is important to notice that an efficient representation does not necessarily have good discriminating power.

In the matching problem, an incoming face is recognized by identifying it with a prestored face. For example, suppose the input face is x and there are K prestored faces c_k, k = 1, 2, ..., K. One possibility is to assign x to c_{k_0} if

k_0 = arg min_{1 ≤ k ≤ K} ||x − c_k||,   (2.4)
where ||·|| represents the Euclidean distance in R^n. If ||c_k|| is normalized so that ||c_k|| = c for all k, the minimum distance matching in (2.4) simplifies to correlation matching,

k_0 = arg max_{1 ≤ k ≤ K} ⟨x, c_k⟩.   (2.5)

Since distance and inner product are invariant to a change of orthonormal basis, minimum distance and correlation matching can be performed using any orthonormal basis, and the recognition performance will be the same. To do this, simply replace x and c_k in (2.4) or (2.5) by x̃ and c̃_k. Similarly, (2.4) and (2.5) can also be used with feature vectors.

Due to such factors as viewing angle, illumination, facial expression, distortion, and noise, the face images for a given person can have random variations and are therefore better modeled as a random vector. In this case, maximum likelihood (ML) matching is often used,

k_0 = arg max_{1 ≤ k ≤ K} log p(x | c_k),   (2.6)

where p(x | c_k) is the density of x conditioned on its being the k-th person. The ML criterion minimizes the probability of recognition error when, a priori, the incoming face is equally likely to be that of any of the K persons. Furthermore, if we assume that variations in face vectors are caused by additive white Gaussian noise (AWGN),

x_k = c_k + w_k,   (2.7)
where w_k is zero-mean AWGN with power σ², then the ML matching becomes the minimum distance matching of (2.4).
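A minimal numerical sketch of the matching rules above, using a hypothetical random gallery rather than real face data: when every gallery vector c_k is normalized to the same length, minimum-distance matching (2.4) and correlation matching (2.5) select the same identity, and under the AWGN model (2.7) this is also the ML decision (2.6):

```python
import numpy as np

rng = np.random.default_rng(1)
K, n = 5, 64                       # toy gallery: K prestored faces in R^n
gallery = rng.normal(size=(K, n))
# Normalize so that ||c_k|| = c for all k, the condition under which
# (2.4) and (2.5) agree.
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)

# Incoming face: gallery face 3 corrupted by AWGN, as in (2.7).
x = gallery[3] + 0.05 * rng.normal(size=n)

# Minimum-distance matching (2.4): k0 = argmin_k ||x - c_k||.
k_dist = int(np.argmin(np.linalg.norm(gallery - x, axis=1)))

# Correlation matching (2.5): k0 = argmax_k <x, c_k>.
k_corr = int(np.argmax(gallery @ x))

print(k_dist, k_corr)  # both rules recover identity 3
```

Because both rules depend only on distances and inner products, applying any orthogonal change of basis to x and the gallery before matching leaves the selected identity unchanged, as noted above.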
2.2.2. Early face recognition methods

The initial work in automatic face processing dates back to the end of the 19th century, as reported by Benson and Perrett [39]. In his lecture on personal identification at the Royal Institution on 25 May 1888, Sir Francis Galton [53], an English scientist, explorer and a cousin of Charles Darwin, explained that he had frequently chafed under the sense of inability to verbally explain hereditary resemblance and types of features. In order to relieve himself from this embarrassment, he took considerable trouble and made many experiments. He described how French prisoners were identified using four primary measures (head length, head breadth, foot length and middle digit length of the foot and hand, respectively). Each measure could take one of three possible values (large, medium, or small), giving a total of 81 possible primary classes. Galton felt it would be advantageous to have an automatic method of classification. For this purpose, he devised an apparatus, which he called a mechanical selector, that could be used to compare measurements of face profiles. Galton reported that most of the measures he had tried were fairly efficient.
Early face recognition methods were mostly feature based. Galton's proposed method, and a lot of work to follow, focused on detecting important facial features such as eye corners, mouth corners, nose tip, etc. By measuring the relative distances between these facial features, a feature vector can be constructed to describe each face. By comparing the feature vector of an unknown face to the feature vectors of known faces from a database, the closest match can be determined.

One of the earliest works is reported by Bledsoe [84]. In this system, a human operator located the feature points on the face and entered their positions into the computer. Given a set of feature point distances of an unknown person, nearest neighbor or other classification rules were used for identifying the test face. Since feature extraction is done manually, this system could accommodate wide variations in head rotation, tilt, image quality, and contrast.
In Kanade's work [62], a series of fiducial points is detected using relatively simple image processing tools (edge maps, signatures, etc.) and the Euclidean distances between them are then used as a feature vector to perform recognition. The face feature points are located in two stages. The coarse-grain stage simplified the succeeding differential operation and feature finding algorithms. Once the eyes, nose and mouth are approximately located, more accurate information is extracted by confining the processing to four smaller groups, scanning at higher resolution, and using the best beam intensity for the region. The four regions are the left and right eye, nose, and mouth. The beam intensity is based on the local area histogram obtained in the coarse-grain stage. A set of 16 facial parameters, which are ratios of distances, areas, and angles to compensate for the varying size of the pictures, is extracted. To eliminate scale and dimension differences, the components of the resulting vector are normalized. A simple distance measure is used to check the similarity between two face images.
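The idea of such feature-based matching can be sketched as below. The fiducial coordinates, subjects, and normalization are invented for illustration (Kanade's actual 16 parameters also include areas and angles); the point is that ratios of inter-point distances are scale invariant, so a nearest-neighbour rule on them tolerates size differences.

```python
import numpy as np

# Hypothetical sketch of feature-based matching in the spirit of early
# systems: fiducial points -> distance ratios -> nearest neighbour.
def features(points):
    pts = np.asarray(points, dtype=float)
    # pairwise Euclidean distances between fiducial points
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    iu = np.triu_indices(len(pts), k=1)
    vec = d[iu]
    return vec / vec.sum()     # normalizing makes the ratios scale invariant

gallery = {
    "A": features([(30, 40), (70, 40), (50, 60), (50, 80)]),  # eyes, nose, mouth
    "B": features([(28, 42), (66, 42), (47, 66), (47, 84)]),
}

# the test face is subject A imaged at twice the size
test = features([(60, 80), (140, 80), (100, 120), (100, 160)])
best = min(gallery, key=lambda k: np.linalg.norm(gallery[k] - test))
print(best)
```

Because every inter-point distance doubles, the normalized feature vector is unchanged and the match is exact.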
2.2.3. Statistical approaches to face recognition

2.2.3.1. Karhunen-Loeve Expansion Based Methods

2.2.3.1.1. Eigenfaces
A face image, I(x, y), of size NxN is simply a matrix with each element representing the intensity at that particular pixel. I(x, y) may also be considered as a vector of length N^2, or a single point in an N^2-dimensional space. So a 128x128 pixel image can be represented as a point in a 16,384-dimensional space. Facial images in general will occupy only a small sub-region of this high dimensional image space and thus are not optimally represented in this coordinate system.

As mentioned in Section 2.2.1, alternative orthonormal bases are often used to compress face vectors. One such basis is the Karhunen-Loeve (KL) basis.
The Eigenfaces method proposed by Turk and Pentland [20] is based on the Karhunen-Loeve expansion and is motivated by the earlier work of Sirovitch and Kirby [63] for efficiently representing pictures of faces. Eigenface recognition derives its name from the German prefix "eigen", meaning "own" or "individual". The Eigenface method of facial recognition is considered the first working facial recognition technology.

The eigenfaces method presented by Turk and Pentland finds the principal components (Karhunen-Loeve expansion) of the face image distribution, or the eigenvectors of the covariance matrix of the set of face images. These eigenvectors can be thought of as a set of features which together characterize the variation between face images.
Let a face image I(x, y) be a two-dimensional array of intensity values, or a vector of dimension n. Let the training set of images be I_1, I_2, ..., I_N. The average face image of the set is defined by Psi = (1/N) sum_{i=1}^{N} I_i. Each face differs from the average by the vector Phi_i = I_i - Psi. This set of very large vectors is subject to principal component analysis, which seeks a set of K orthonormal vectors v_k, k = 1, ..., K, and their associated eigenvalues lambda_k, which best describe the distribution of the data.

Vectors v_k and scalars lambda_k are the eigenvectors and eigenvalues of the covariance matrix:

C = (1/N) sum_{i=1}^{N} Phi_i Phi_i^T = A A^T,    (2.9)

where the matrix A = [Phi_1, Phi_2, ..., Phi_N]. Finding the eigenvectors of the nxn matrix C is computationally intensive. However, the eigenvectors of C can be determined by first finding the eigenvectors of a much smaller matrix of size NxN and taking a linear combination of the resulting vectors.

C v_k = lambda_k v_k    (2.10)

v_k^T C v_k = lambda_k v_k^T v_k    (2.11)

Since the eigenvectors v_k are orthogonal and normalized, v_k^T v_k = 1, so

v_k^T C v_k = lambda_k.    (2.12)

Expanding C,

lambda_k = v_k^T ( (1/N) sum_{i=1}^{N} Phi_i Phi_i^T ) v_k    (2.13)
         = (1/N) sum_{i=1}^{N} (v_k^T Phi_i)(Phi_i^T v_k)
         = (1/N) sum_{i=1}^{N} (v_k^T I_i - v_k^T Psi)^2
         = var(v_k^T I)

Thus eigenvalue lambda_k represents the variance of the representative facial image set along the axis described by eigenvector v_k.

The space spanned by the eigenvectors v_k, k = 1, ..., K, corresponding to the largest K eigenvalues of the covariance matrix C, is called the face space. The eigenvectors of matrix C, which are called eigenfaces, form a basis set for the face images. A new face image I is transformed into its eigenface components (projected onto the face space) by:

w_k = < v_k, I - Psi > = v_k^T (I - Psi)    (2.14)

for k = 1, ..., K. The projections w_k form the feature vector Omega = [w_1, w_2, ..., w_K], which describes the contribution of each eigenface in representing the input image.
Given a set of face classes E_q and the corresponding feature vectors Omega_q, the simplest method for determining which face class provides the best description of an input face image is to find the face class that minimizes the Euclidean distance in the feature space:

eps_q = || Omega - Omega_q ||    (2.15)

A face is classified as belonging to class E_q when the minimum eps_q is below some threshold and also

E_q = arg min_q { eps_q }.    (2.16)

Otherwise, the face is classified as unknown.
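The whole pipeline, including the small-matrix trick mentioned after (2.9), can be sketched in a few lines. The data here are synthetic random "images"; sizes are invented for illustration and this is not the thesis's implementation.

```python
import numpy as np

# Minimal eigenfaces sketch (in the style of Turk and Pentland) on
# synthetic data; image size and training set sizes are illustrative.
rng = np.random.default_rng(1)
n, N, K = 256, 10, 5                 # pixels per image, images, eigenfaces
I = rng.normal(size=(N, n))          # training images as row vectors

psi = I.mean(axis=0)                 # average face Psi
A = (I - psi).T                      # n x N matrix of difference vectors Phi_i

# Trick from the text: eigenvectors of the small N x N matrix A^T A ...
L = A.T @ A
lam, w = np.linalg.eigh(L)
order = np.argsort(lam)[::-1][:K]    # K largest eigenvalues
V = A @ w[:, order]                  # ... lifted to eigenvectors of A A^T
V /= np.linalg.norm(V, axis=0)       # normalized eigenfaces

def project(img):
    # Eq. (2.14): eigenface components of a new image
    return V.T @ (img - psi)

omegas = np.array([project(img) for img in I])
test = I[3] + 0.05 * rng.normal(size=n)          # noisy view of image 3
eps = np.linalg.norm(omegas - project(test), axis=1)   # Eq. (2.15)
print(int(np.argmin(eps)))                       # Eq. (2.16)
```

A threshold on the minimum distance, as in the text, would turn this closed-set classifier into one that can also reject unknown faces.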
Turk and Pentland [20] tested how their algorithm performs under changing conditions by varying the illumination, size and orientation of the faces. They found that their system had the most trouble with faces scaled larger or smaller than the original data set. To overcome this problem they suggest using a multi-resolution method in which faces are compared to eigenfaces of varying sizes to compute the best match. They also note that the image background can have a significant effect on performance, which they minimize by multiplying input images with a 2-D Gaussian to diminish the contribution of the background and highlight the central facial features. The system performs face recognition in real time. Turk and Pentland's paper was seminal in the field of face recognition, and their method is still quite popular due to its ease of implementation.
Murase and Nayar [85] extended the capabilities of the eigenface method to general 3D-object recognition under different illumination and viewing conditions. Given N object images taken under P views and L different illumination conditions, a universal image set is built which contains all the available data. In this way a single parametric space describes the object identity as well as the viewing or illumination conditions. The eigenface decomposition of this space was used for feature extraction and classification. However, in order to ensure discrimination between different objects, the number of eigenvectors used in this method was increased compared to the classical eigenface method.

Later, based on the eigenface decomposition, Pentland et al. [86] developed a view-based eigenspace approach for human face recognition under general viewing conditions. Given N individuals under P different views, recognition is performed over P separate eigenspaces, each capturing the variation of the individuals in a common view. The view-based approach is essentially an extension of the eigenface technique to multiple sets of eigenvectors, one for each face orientation. In order to deal with multiple views, in the first stage of this approach, the orientation of the test face is determined and the eigenspace which best describes the input image is selected. This is accomplished by calculating the residual description error (distance from feature space: DFFS) for each view space. Once the proper view is determined, the image is projected onto the appropriate view space and then recognized. The view-based approach is computationally more intensive than the parametric approach because P different sets of V projections are required (V is the number of eigenfaces selected to represent each eigenspace). Naturally, the view-based representation can yield a more accurate representation of the underlying geometry.
2.2.3.1.2. Face Recognition using Eigenfaces

There are two main approaches to recognizing faces by using eigenfaces.

Appearance model:

1- A database of face images is collected.

2- A set of eigenfaces is generated by performing principal component analysis (PCA) on the face images. Approximately 100 eigenvectors are enough to code a large database of faces.

3- Each face image is represented as a linear combination of the eigenfaces.

4- A given test image is approximated by a combination of eigenfaces. A distance measure is used to compare the similarity between two images.
Figure 2.1: Appearance model

Figure 2.2: Discriminative model

Discriminative model:

1- Two data sets Omega_I and Omega_E are obtained: one by computing intrapersonal differences (by matching two views of each individual in the data set), and the other by computing extrapersonal differences (by matching different individuals in the data set), respectively.

2- Two data sets of eigenfaces are generated by performing PCA on each class.

3- The similarity score between two images is derived by calculating S = P(Omega_I | Delta), where Delta is the difference between a pair of images. Two images are determined to be of the same individual if S > 0.5.
Although the recognition performance is lower than that of the correlation method, the substantial reduction in computational complexity makes the eigenface method very attractive. The recognition rate increases with the number of principal components (eigenfaces) used and, in the limit, as more principal components are used, performance approaches that of correlation. In [20] and [86], the authors reported that performance levels off at about 45 principal components.

It has been shown that removing the first three principal components results in better recognition performance (the authors reported an error rate of 20% when using the eigenface method with 30 principal components on a database strongly affected by illumination variations, and only a 10% error rate after removing the first three components). The recognition rates in this case were better than those obtained using the correlation method. This was argued based on the fact that the first components are more influenced by variations in lighting conditions.
2.2.3.1.3. Eigenfeatures

Pentland et al. [86] discussed the use of facial features for face recognition. This can be either a modular or a layered representation of the face, where a coarse (low-resolution) description of the whole head is augmented by additional (high-resolution) details in terms of salient facial features. The eigenface technique was extended to detect facial features. For each of the facial features, a feature space is built by selecting the most significant "eigenfeatures" (eigenvectors corresponding to the largest eigenvalues of the feature correlation matrix).

After the facial features in a test image are extracted, a score of similarity between the detected features and the features corresponding to the model images is computed. A simple approach for recognition is to compute a cumulative score in terms of equal contribution by each of the facial feature scores. More elaborate weighting schemes can also be used for classification. Once the cumulative score is determined, a new face is classified such that this score is maximized.

The performance of the eigenfeatures method is close to that of eigenfaces; however, a combined representation of eigenfaces and eigenfeatures shows higher recognition rates.
2.2.3.1.4. The Karhunen-Loeve Transform of the Fourier Spectrum

Akamatsu et al. [87] illustrated the effectiveness of the Karhunen-Loeve Transform of the Fourier Spectrum in the Affine Transformed Target Image (KL-FSAT) for face recognition. First, the original images were standardized with respect to position, size, and orientation using an affine transform, so that three reference points satisfy a specific spatial arrangement. The position of these points is related to the position of some significant facial features. The eigenface method discussed in Section 2.2.3.1.1 is then applied to the magnitude of the Fourier spectrum of the standardized images (KL-FSAT). Due to the shift invariance property of the magnitude of the Fourier spectrum, KL-FSAT performed better than the classical eigenfaces method under variations in head orientation and shifting. However, the computational complexity of the KL-FSAT method is significantly greater than that of the eigenface method due to the computation of the Fourier spectrum.
2.2.3.2. Linear Discriminant Methods - Fisherfaces

In [88], [89], the authors proposed a new method for reducing the dimensionality of the feature space by using Fisher's Linear Discriminant (FLD) [90]. The FLD uses the class membership information and develops a set of feature vectors in which variations of different faces are emphasized, while different instances of faces due to illumination conditions, facial expression and orientation are de-emphasized.
2.2.3.2.1. Fisher's Linear Discriminant

Given c classes with a priori probabilities P_i, let N_i be the number of samples of class i, i = 1, ..., c. Then the following positive semi-definite scatter matrices are defined:

S_B = sum_{i=1}^{c} P_i (mu_i - mu)(mu_i - mu)^T    (2.17)

S_w = sum_{i=1}^{c} P_i sum_{j=1}^{N_i} (x_j^i - mu_i)(x_j^i - mu_i)^T    (2.18)

where x_j^i denotes the jth n-dimensional sample vector belonging to class i, mu_i is the mean of class i:

mu_i = (1/N_i) sum_{j=1}^{N_i} x_j^i,    (2.19)

and mu is the overall mean of the sample vectors:

mu = (1 / sum_{i=1}^{c} N_i) sum_{i=1}^{c} sum_{j=1}^{N_i} x_j^i.    (2.20)

S_w is the within-class scatter matrix and represents the average scatter of the sample vectors of class i; S_B is the between-class scatter matrix and represents the scatter of the means mu_i of the classes around the overall mean vector mu. If S_w is nonsingular, Linear Discriminant Analysis (LDA) selects a matrix V_opt in R^{nxk} with orthonormal columns which maximizes the ratio of the determinant of the between-class scatter matrix of the projected samples to the determinant of the within-class scatter matrix of the projected samples,

V_opt = arg max_V ( |V^T S_B V| / |V^T S_w V| ) = [v_1, v_2, ..., v_k],    (2.21)

where {v_i | i = 1, ..., k} is the set of generalized eigenvectors of S_B and S_w corresponding to the set of decreasing eigenvalues {lambda_i | i = 1, ..., k}, i.e.

S_B v_i = lambda_i S_w v_i.    (2.22)

As shown in [91], the upper bound of k is c - 1. The matrix V_opt describes the Optimal Linear Discriminant Transform, or the Foley-Sammon Transform. While the Karhunen-Loeve Transform performs a rotation onto a set of axes along which the projections of the sample vectors differ most in the autocorrelation sense, the Linear Discriminant Transform performs a rotation onto a set of axes [v_1, v_2, ..., v_k] along which the projections of the sample vectors show maximum discrimination.
2.2.3.2.2. Face Recognition Using Linear Discriminant Analysis

Let a training set of N face images represent c different subjects. The face images in the training set are two-dimensional arrays of intensity values, represented as vectors of dimension n. Different instances of a person's face (variations in lighting, pose or facial expression) are defined to be in the same class, and faces of different subjects are defined to be from different classes.

The scatter matrices S_B and S_w are defined in Equations (2.17), (2.18). However, the matrix V_opt cannot be found directly from Equation (2.21), because in general the matrix S_w is singular. This stems from the fact that the rank of S_w is at most N - c and, in general, the number of pixels in each image, n, is much larger than the number of images in the learning set, N. Many solutions have been presented in the literature to overcome this problem [92, 93]. In [88], the authors propose a method which is called Fisherfaces. The problem of S_w being singular is avoided by projecting the image set onto a lower dimensional space, so that the resulting within-class scatter matrix is nonsingular. This is achieved by using Principal Component Analysis (PCA) to reduce the dimension of the feature space to N - c, and then applying the standard linear discriminant defined in Equation (2.21) to reduce the dimension to c - 1. More formally, V_opt is given by:

V_opt = V_fld V_pca,    (2.23)

where

V_pca = arg max_V |V^T C V|,    (2.24)

and

V_fld = arg max_V ( |V^T V_pca^T S_B V_pca V| / |V^T V_pca^T S_w V_pca V| ),    (2.25)

where C is the covariance matrix of the set of training images, computed from Equation (2.9). The columns of V_opt are orthogonal vectors which are called Fisherfaces. Unlike the eigenfaces, the Fisherfaces do not correspond to face-like patterns. All example face images E_q, q = 1, ..., Q, in the example set S are projected onto the vectors corresponding to the columns of V_fld, and a set of features is extracted for each example face image. These feature vectors are used directly for classification.
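The PCA-then-LDA pipeline above can be sketched numerically. The data below are synthetic well-separated Gaussian classes (sizes invented for illustration); the generalized eigenproblem (2.22) is solved via S_w^{-1} S_B, which is valid here because the PCA step makes S_w nonsingular.

```python
import numpy as np

# Minimal Fisherfaces sketch on synthetic data: PCA to N - c dimensions
# so that S_w becomes nonsingular, then LDA (Eq. 2.21) to c - 1 dimensions.
rng = np.random.default_rng(2)
c, per, n = 3, 4, 100                # classes, samples per class, "pixels"
N = c * per
means = rng.normal(scale=5.0, size=(c, n))
X = np.vstack([m + rng.normal(size=(per, n)) for m in means])   # N x n
y = np.repeat(np.arange(c), per)

# Step 1: PCA to N - c dimensions (Eq. 2.24)
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
Vpca = Vt[: N - c].T                 # n x (N - c)
Z = (X - mu) @ Vpca

# Step 2: scatter matrices of the projected data (Eqs. 2.17, 2.18)
mu_z = Z.mean(axis=0)
Sb = np.zeros((N - c, N - c))
Sw = np.zeros((N - c, N - c))
for i in range(c):
    Zi = Z[y == i]
    mi = Zi.mean(axis=0)
    Sb += len(Zi) * np.outer(mi - mu_z, mi - mu_z)
    Sw += (Zi - mi).T @ (Zi - mi)

# Generalized eigenproblem S_B v = lambda S_w v (Eq. 2.22)
lam, V = np.linalg.eig(np.linalg.solve(Sw, Sb))
Vfld = np.real(V[:, np.argsort(np.real(lam))[::-1][: c - 1]])

F = Z @ Vfld                         # (c - 1)-dimensional Fisher features
# nearest class mean in the Fisher space classifies a new sample of class 1
test = (means[1] + rng.normal(size=n) - mu) @ Vpca @ Vfld
cls = int(np.argmin([np.linalg.norm(F[y == i].mean(axis=0) - test)
                     for i in range(c)]))
print(cls)
```

With well-separated classes, the two-dimensional Fisher features suffice for correct classification.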
Having extracted a compact and efficient feature set, the recognition task can be performed by using the Euclidean distance in the feature space. However, in [89] a weighted mean absolute/square distance, with weights obtained based on the reliability of each decision axis, is proposed as the measure in the feature space:

D(Omega, Omega_q) = sum_{v=1}^{K} s_v (w_v - w_v^q)^2,    (2.26)

where the weight s_v reflects the reliability of decision axis v. Therefore, for a given face image, the best match E_0 is given by

E_0 = arg min_{Omega_q in S} D(Omega, Omega_q).    (2.27)

The confidence measure is defined as:

Conf(Omega, Omega_0) = 1 - D(Omega, Omega_0) / D(Omega, Omega_1),    (2.28)

where E_1 is the second best candidate.

In [87], Akamatsu et al. applied LDA to the Fourier spectrum of the intensity image. The results reported by the authors showed that LDA in the Fourier domain is significantly more robust to variations in lighting than LDA applied directly to the intensity images. However, the computational complexity of this method is significantly greater than that of the classical Fisherface method due to the computation of the Fourier spectrum.
2.2.3.3. Singular Value Decomposition Methods

2.2.3.3.1. Singular value decomposition

Methods based on the Singular Value Decomposition (SVD) for face recognition use the general result stated by the following theorem:

Theorem: Let I_{pxq} be a real rectangular matrix with Rank(I) = r. Then there exist two orthonormal matrices U_{pxp}, V_{qxq} and a diagonal matrix Sigma_{pxq} such that

I = U Sigma V^T = sum_{i=1}^{r} lambda_i u_i v_i^T,    (2.29)

where

U = (u_1, u_2, ..., u_r, u_{r+1}, ..., u_p),
V = (v_1, v_2, ..., v_r, v_{r+1}, ..., v_q),
Sigma = diag(lambda_1, lambda_2, ..., lambda_r, 0, ..., 0),
lambda_1 > lambda_2 > ... > lambda_r > 0,

and lambda_i^2, i = 1, ..., r, are the eigenvalues of I I^T and I^T I, while u_i, v_j, i = 1, ..., p, j = 1, ..., q, are the eigenvectors of I I^T and I^T I corresponding to these eigenvalues.
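The theorem can be verified numerically on an arbitrary matrix; this short check (on an invented random matrix) confirms both the rank-one expansion in (2.29) and the relation between the singular values and the eigenvalues of I^T I.

```python
import numpy as np

# Numerical check of (2.29): I = sum_i lambda_i u_i v_i^T, and the
# lambda_i^2 are the nonzero eigenvalues of I^T I (and of I I^T).
rng = np.random.default_rng(3)
I = rng.normal(size=(6, 4))                    # a p x q real matrix
U, s, Vt = np.linalg.svd(I, full_matrices=False)

# reconstruct I from the rank-one terms lambda_i u_i v_i^T
recon = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))
print(np.allclose(recon, I))

# lambda_i^2 equal the eigenvalues of I^T I
eig = np.sort(np.linalg.eigvalsh(I.T @ I))[::-1]
print(np.allclose(np.sort(s**2)[::-1], eig))
```

Both checks print True, which is exactly the content of the theorem.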
2.2.3.3.2. Face Recognition Using Singular Value Decomposition

Let a face image I(x, y) be a two-dimensional (mxn) array of intensity values and [lambda_1, lambda_2, ..., lambda_r] be its singular value (SV) vector. In [93], Hong revealed the importance of using the SVD for human face recognition by proving several important properties of the SV vector, such as: the stability of the SV vector to small perturbations caused by stochastic variation in the intensity image, the proportional variance of the SV vector with proportional variance of the pixels in the intensity image, and the invariance of the SV feature vector to rotation, translation and mirror transforms. The above properties of the SV vector provide the theoretical basis for using singular values as image features. However, it has been shown that compressing the original SV vector into a low dimensional space, by means of various mathematical transforms, leads to higher recognition performance. Among the various transformations for compressing dimensionality, the Foley-Sammon transform based on the Fisher criterion, i.e. optimal discriminant vectors, is the most popular one. Given N face images representing c different subjects, the SV vectors are extracted from each image. According to Equations (2.17) and (2.18), the scatter matrices S_B and S_w of the SV vectors are constructed. It has been shown that it is difficult to obtain the optimal discriminant vectors in the case of a small number of samples, i.e. when the number of samples is less than the dimensionality of the SV vector, because the scatter matrix S_w is singular in this case. Many solutions have been proposed to overcome this problem. Hong [93] circumvented the problem by adding a small singular value perturbation to S_w, resulting in S_w(t), such that S_w(t) becomes nonsingular. However, the perturbation of S_w introduces an arbitrary parameter, and the range to which the authors restricted the perturbation is not appropriate to ensure that the inversion of S_w(t) is numerically stable. Cheng et al. [92] solved the problem by rank decomposition of S_w. This is a generalization of Tian's method [94], who substituted S_w^{-1} by the positive pseudo-inverse S_w^{+}.
After the set of optimal discriminant vectors {v_1, v_2, ..., v_k} has been extracted, the feature vectors are obtained by projecting the SV vectors onto the space spanned by {v_1, v_2, ..., v_k}.

When a test image is acquired, its SV vector is projected onto the space spanned by {v_1, v_2, ..., v_k}, and classification is performed in the feature space by measuring the Euclidean distance in this space and assigning the test image to the class of images for which the minimum distance is achieved.

Another method to reduce the feature space of the SV feature vectors was described by Cheng et al. [95]. The training set used consisted of a small sample of face images of the same person. If I_j^i represents the jth face image of person i, then the average image of person i is given by (1/N) sum_{j=1}^{N} I_j^i. Eigenvalues and eigenvectors are determined for this average image using the SVD. The eigenvalues are thresholded to disregard the values close to zero. Average eigenvectors (called feature vectors) for all the average face images are calculated. A test image is then projected onto the space spanned by the eigenvectors. The Frobenius norm is used as a criterion to determine which person the test image belongs to.
2.2.4. Hidden Markov Model Based Methods

Hidden Markov Models (HMM) are a set of statistical models used to characterize the statistical properties of a signal. Rabiner [69][96] provides an extensive and complete tutorial on HMMs. An HMM is made of two interrelated processes:

- an underlying, unobservable Markov chain with a finite number of states, a state transition probability matrix and an initial state probability distribution;

- a set of probability density functions associated with each state.

The elements of an HMM are:

N, the number of states in the model. If S is the set of states, then S = {S_1, S_2, ..., S_N}. The state of the model at time t is given by q_t in S, 1 <= t <= T, where T is the length of the observation sequence (number of frames).

M, the number of different observation symbols. If V is the set of all possible observation symbols (also called the codebook of the model), then V = {V_1, V_2, ..., V_M}.

A, the state transition probability matrix, A = {a_ij}, where

a_ij = P[q_t = S_j | q_{t-1} = S_i],  1 <= i, j <= N    (2.30)

0 <= a_ij <= 1,  sum_{j=1}^{N} a_ij = 1,  1 <= i <= N    (2.31)

B, the observation symbol probability matrix, B = {b_j(k)}, where

b_j(k) = P[O_t = v_k | q_t = S_j],  1 <= j <= N, 1 <= k <= M    (2.32)

and O_t is the observation symbol at time t.

pi, the initial state distribution, pi = {pi_i}, where

pi_i = P[q_1 = S_i],  1 <= i <= N    (2.33)

Using a shorthand notation, an HMM is defined as:

lambda = (A, B, pi).    (2.34)

The above characterization corresponds to a discrete HMM, where the observations are characterized as discrete symbols chosen from a finite alphabet V = {v_1, v_2, ..., v_M}. In a continuous density HMM, the states are characterized by continuous observation density functions. The most general representation of the model probability density function (pdf) is a finite mixture of the form:

b_i(O) = sum_{k=1}^{M} c_ik N(O, mu_ik, U_ik),  1 <= i <= N    (2.35)

where c_ik is the mixture coefficient for the kth mixture in state i. Without loss of generality, N(O, mu_ik, U_ik) is assumed to be a Gaussian pdf with mean vector mu_ik and covariance matrix U_ik.

HMMs have been used extensively for speech recognition, where data is naturally one-dimensional (1-D) along the time axis. However, an equivalent fully connected two-dimensional HMM would lead to very high computational complexity [97]. Attempts have been made to use multi-model representations that lead to pseudo 2-D HMMs [98]. These models are currently used in character recognition [99][100].
Figure 2.3: Image sampling technique for HMM recognition

In [101], Samaria et al. proposed the use of the 1-D continuous HMM for face recognition. Assuming that each face is in an upright, frontal position, features will occur in a predictable order. This ordering suggests the use of a top-bottom model, where only transitions between adjacent states in a top-to-bottom manner are allowed [102]. The states of the model correspond to the facial features forehead, eyes, nose, mouth and chin [103]. The observation sequence O is generated from an XxY image using an XxL sampling window with XxM pixels of overlap (Figure 2.3). Each observation vector is a block of L lines, and there is an M-line overlap between successive observations. The overlapping allows the features to be captured in a manner which is independent of vertical position, while a disjoint partitioning of the image could result in the truncation of features occurring across block boundaries. In [104], the effect of different sampling parameters has been discussed. With no overlap, if a small height of the sampling window is used, the segmented data do not correspond to significant facial features. However, as the window height increases, there is a higher probability of cutting across the features.
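The top-to-bottom sampling of Figure 2.3 can be sketched as below; the toy image size and window parameters are invented for illustration.

```python
import numpy as np

# Sketch of the sampling of Figure 2.3: an X x Y image is scanned top to
# bottom with a window of L lines and an overlap of M lines; each block
# of L lines, flattened, is one observation vector.
def observation_sequence(image, L, M):
    Y, X = image.shape
    step = L - M                        # the window advances L - M lines
    blocks = []
    top = 0
    while top + L <= Y:
        blocks.append(image[top:top + L].ravel())
        top += step
    return np.array(blocks)

img = np.arange(12 * 4).reshape(12, 4)  # toy 12-line, 4-pixel-wide "face"
O = observation_sequence(img, L=4, M=2)
print(O.shape)                          # (5, 16): five overlapping blocks
```

With M = 0 the blocks would be disjoint, which, as the text notes, risks truncating features that fall across block boundaries.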
Given c face images for each subject of the training set, the goal of the training stage is to optimize the parameters lambda_i = (A, B, pi) to best describe the observations O = {o_1, o_2, ..., o_T}, in the sense of maximizing P(O | lambda). The general HMM training scheme is illustrated in Figure 2.4 and is a variant of the K-means iterative procedure for clustering data:

1. The training images are collected for each subject in the database and are sampled to generate the observation sequences.

2. A common prototype (state) model is constructed with the purpose of specifying the number of states in the HMM and the state transitions allowed, A (model initialization).

3. A set of initial parameter values is computed iteratively using the training data and the prototype model. The goal of this stage is to find a good estimate for the observation model probability matrix B. In [96], it has been shown that a good initial estimate of the parameters is essential for rapid and proper convergence (to the global maximum of the likelihood function) of the re-estimation formulas. On the first cycle, the data is uniformly segmented, matched with each model state, and the initial model parameters are extracted. On successive cycles, the set of training observation sequences is segmented into states via the Viterbi algorithm [50]. The result of segmenting each of the training sequences is, for each of the N states, a maximum likelihood estimate of the set of observations that occur within each state according to the current model.

4. Following the Viterbi segmentation, the model parameters are re-estimated using the Baum-Welch re-estimation procedure. This procedure adjusts the model parameters so as to maximize the probability of observing the training data, given each corresponding model.

5. The resulting model is then compared to the previous model (by computing a distance score that reflects the statistical similarity of the HMMs). If the model distance score exceeds a threshold, then the old model lambda is replaced by the new model lambda~, and the overall training loop is repeated. If the model distance score falls below the threshold, then model convergence is assumed and the final parameters are saved.

Recognition is carried out by matching the test image against each of the trained models (Figure 2.5). In order to achieve this, the image is converted to an observation sequence, and the model likelihoods P(O_test | lambda_i) are computed for each lambda_i, i = 1, ..., c. The model with the highest likelihood reveals the identity of the unknown face, as

identity = arg max_{1<=i<=c} P(O_test | lambda_i).    (2.36)
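For discrete HMMs, the likelihoods in (2.36) can be computed with the standard forward algorithm; the sketch below scores a symbol sequence against two toy models whose parameters are entirely invented (they are not trained face models).

```python
import numpy as np

# Sketch of Eq. (2.36) for discrete HMMs: the forward algorithm yields
# P(O | lambda) for each model; the largest likelihood gives the identity.
def forward_likelihood(O, A, B, pi):
    alpha = pi * B[:, O[0]]                   # initialization
    for o in O[1:]:
        alpha = (alpha @ A) * B[:, o]         # induction step
    return alpha.sum()                        # termination: P(O | lambda)

# two invented 2-state top-to-bottom models lambda_1, lambda_2
A1 = np.array([[0.9, 0.1], [0.0, 1.0]])       # transitions of model 1
A2 = np.array([[0.5, 0.5], [0.0, 1.0]])       # transitions of model 2
B1 = np.array([[0.8, 0.2], [0.2, 0.8]])       # symbol probabilities
B2 = np.array([[0.3, 0.7], [0.7, 0.3]])
pi = np.array([1.0, 0.0])                     # always start in state S1

O = [0, 0, 0, 1, 1]                           # observed symbol sequence
scores = [forward_likelihood(O, A, B, pi) for A, B in [(A1, B1), (A2, B2)]]
print(int(np.argmax(scores)))                 # model 0 explains O best
```

In practice, log-likelihoods (or scaled forward variables) are used to avoid numerical underflow on long observation sequences.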
Figure 2.4: HMM training scheme
The HMM based method showed significantly better performance for face recognition compared to the eigenface method. This is due to the fact that the HMM based method offers a solution to facial feature detection as well as face recognition.

However, the 1-D continuous HMM is computationally more complex than the eigenface method. A solution for reducing the running time of this method is the use of discrete HMMs. Extremely encouraging preliminary results (error rates below 5%) were reported in [105] when pseudo 2-D HMMs are used. Furthermore, the authors suggested that a Fourier representation of the images can lead to better recognition performance, as frequency and frequency-space representations can lead to better data separation.
Figure 2.5: HMM recognition scheme
2.2.5. Neural Networks Approach

In principle, the popular back-propagation (BP) neural network [106] can be trained to recognize face images directly. However, such a network can be very complex and difficult to train. A typical image recognition network requires N = mxn input neurons, one for each of the pixels in an mxn image. For example, if the images are 128x128, the number of inputs of the network would be 16,384. In order to reduce the complexity, Cottrell and Fleming [107] used two BP nets (Figure 2.6). The first net operates in the auto-association mode [108] and extracts
features for the second net, which operates in the more common classification
mode.
The auto-association net has n inputs, n outputs and p hidden layer nodes, where p is usually much smaller than n. The network takes a face vector x as input and is trained to produce an output y that is the best approximation of x. In this way, the hidden layer output h constitutes a compressed version of x, or a feature vector, and can be used as the input to the classification net.
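The compression performed by a linear auto-association net can be sketched directly, setting the weights analytically from the SVD rather than training by back-propagation (the connection between the two is made precise in the discussion of Bourland and Kamp's result that follows). Sizes and data below are invented for illustration.

```python
import numpy as np

# Sketch of a LINEAR auto-association "net": the encoder/decoder weights
# are set from the SVD of the data matrix instead of being learned by BP.
rng = np.random.default_rng(4)
n, N, p = 64, 20, 5                   # input dim, training faces, hidden units
X = rng.normal(size=(n, N))           # columns are face vectors x_k

U, s, Vt = np.linalg.svd(X, full_matrices=False)
W1 = U[:, :p].T                       # p x n hidden-layer weights (encoder)
W2 = W1.T                             # n x p output-layer weights (decoder)

h = W1 @ X                            # hidden outputs: p-dim feature vectors
Y = W2 @ h                            # reconstructed faces
err = np.linalg.norm(X - Y)           # rank-p reconstruction error
print(h.shape)                        # (5, 20)
```

For this choice of weights the reconstruction error equals the energy in the discarded singular values, i.e. the best any rank-p linear map can do.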
Figure 2.6: Auto-association and classification networks
Bourland and Kamp [108] showed that under the best circumstances,
when the sigmoidal functions at the network nodes are replaced by linear
functions (when the network is linear), the feature vector is the same as that
produced by the Karhunen-Loeve basis, or the eigenfaces. When the network is nonlinear, the feature vector could deviate from the best. The problem here turns out to be an application of the singular value decomposition.
Specifically, suppose that for each training face vector x_k (n-dimensional), k = 1, 2, ..., N, the outputs of the hidden layer and the output layer of the auto-association net are h_k (p-dimensional, usually p much smaller than n) and y_k, respectively. For a linear network, the total reconstruction error over the training set is minimized when the weights project the data onto its first p singular vectors, where

U_p = [u_1, u_2, ..., u_p]^T,
V_p = [v_1, v_2, ..., v_p]^T

are the first p left and right singular vectors in the SVD of the data matrix X = [x_1, x_2, ..., x_N], respectively, which also are the first p eigenvectors of X X^T and X^T X.

One way to achieve this optimum is to have a linear F(.) and to set the weights to

W_1 = W_2^T = U_p.    (2.41)

Since U_p contains the first p eigenvectors of X X^T, we have for any input x

h = W_1 x = U_p x,    (2.42)

which is the same as the feature vector in the eigenface approach. However, it must be noted that the auto-association net, when it is trained by the BP algorithm with a nonlinear F(.), generally cannot achieve this optimal performance.
In [109], the first 50 principal components of the images are extracted and reduced to 5 dimensions using an auto-associative neural network. The resulting representation is classified using a standard multi-layer perceptron.

In a different approach, a hierarchical neural network, which is grown automatically and not trained with gradient descent, was used for face recognition by Weng and Huang [110].
The most successive face recognition with neural networks is a recent
work of Lawrence et. al. [19] which combines local image sampling, a self
organizing map neural network, and a convolutional neural network. In the
8/3/2019 Face Gabor
60/130
48
corresponding work, two different methods of representing local image samples have been evaluated. In each method, a window is scanned over the image. The first method simply creates a vector from a local window on the image using the intensity values at each point in the window. If the local window is a square with sides 2W+1 long, centered on x_ij, then the vector associated with this window is simply [x_{i-W,j-W}, x_{i-W,j-W+1}, ..., x_ij, ..., x_{i+W,j+W-1}, x_{i+W,j+W}]. The second method creates a representation of the local sample by forming a vector out of the intensity of the center pixel and the differences in intensity between the center pixel and all other pixels within the square window. The vector is then given by [x_ij − x_{i-W,j-W}, x_ij − x_{i-W,j-W+1}, ..., w_ij x_ij, ..., x_ij − x_{i+W,j+W-1}, x_ij − x_{i+W,j+W}], where w_ij is the weight of the center pixel value x_ij. The resulting representation becomes partially invariant to variations in the intensity of the complete sample, and the degree of invariance can be modified by adjusting the weight w_ij.
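The two local-sample representations can be sketched as follows (NumPy; the helper names `window_vector` and `center_diff_vector` are ours, not from [19]):

```python
import numpy as np

def window_vector(img, i, j, W):
    """First representation: raw intensities of the (2W+1)x(2W+1) window at (i, j)."""
    return img[i - W:i + W + 1, j - W:j + W + 1].ravel()

def center_diff_vector(img, i, j, W, w_ij=1.0):
    """Second representation: differences between the centre intensity and every
    other pixel in the window; the centre position holds w_ij * x_ij instead."""
    win = img[i - W:i + W + 1, j - W:j + W + 1].astype(float)
    c = img[i, j]
    vec = c - win.ravel()            # x_ij - x_kl for every pixel in the window
    vec[vec.size // 2] = w_ij * c    # centre entry is the weighted centre value
    return vec
```

Adding a constant to the whole image leaves the off-centre entries of the second representation unchanged, which is exactly the partial intensity invariance mentioned above; w_ij controls how much the absolute centre intensity still contributes.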
Figure 2.7: The diagram of the Convolutional Neural Network System
[Pipeline: Image Sampling → Self-Organizing Map (dimensionality reduction) → Convolutional Neural Network (feature extraction layers) → Multi-layer Perceptron Style Classifier → Classification]
The self-organizing map (SOM), introduced by Teuvo Kohonen [111, 112], is an unsupervised learning process which learns the distribution of a set of patterns without any class information. The SOM defines a mapping from an input space R^n onto a topologically ordered set of nodes, usually in a lower dimensional space. For classification a convolutional network is used, as it achieves some degree of shift and deformation invariance using three ideas: local receptive fields, shared weights, and spatial subsampling. The diagram of the system is shown in Figure 2.7.
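A minimal SOM training loop makes the unsupervised learning of the pattern distribution concrete. This is a generic sketch (NumPy; grid size, learning-rate and neighbourhood schedules are illustrative assumptions, not the configuration used in [19]):

```python
import numpy as np

def som_train(data, grid=(5, 5), dim=3, iters=500, lr0=0.5, sigma0=2.0, seed=0):
    """Minimal SOM: nodes on a 2-D grid learn the input distribution unsupervised."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((grid[0], grid[1], dim))
    coords = np.stack(np.meshgrid(np.arange(grid[0]), np.arange(grid[1]),
                                  indexing="ij"), axis=-1)
    for t in range(iters):
        x = data[rng.integers(len(data))]
        # best-matching unit: node whose weight vector is closest to x
        d = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(d), grid)
        # learning rate and neighbourhood radius shrink over time,
        # which keeps the map topologically ordered
        lr = lr0 * (1 - t / iters)
        sigma = sigma0 * (1 - t / iters) + 1e-3
        g = np.exp(-np.linalg.norm(coords - np.array(bmu), axis=-1) ** 2
                   / (2 * sigma ** 2))
        weights += lr * g[..., None] * (x - weights)
    return weights
```

After training, each input can be represented by the grid coordinates of its best-matching unit, which is how the SOM acts as the dimensionality-reduction stage in Figure 2.7.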
2.2.6. Template Based Methods
The most direct of the procedures used for face recognition is the matching between the test images and a set of training images based on measuring the correlation. The matching technique is based on the computation of the normalized cross-correlation coefficient C_N defined by

C_N = ( E{I_G I_T} − E{I_G} E{I_T} ) / ( σ(I_G) σ(I_T) )   (2.43)

where I_G is the gallery image which must be matched to the test image I_T, I_G I_T is the pixel-by-pixel product, E is the expectation operator, and σ is the standard deviation over the area being matched. This normalization rescales the energy distributions of the test and gallery images so that their variances and averages match.
However, correlation-based methods are highly sensitive to illumination, rotation and scale changes. The best results for the reduction of the illumination changes
were obtained using the intensity of the gradient (I_GX + I_GY). The correlation method is computationally expensive, so the dependency of the recognition on the resolution of the image has been investigated.
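Eq. (2.43) translates directly into code. A sketch (NumPy; the helper name `ncc` is ours):

```python
import numpy as np

def ncc(gallery, test):
    """Normalised cross-correlation coefficient of eq. (2.43):
    C_N = (E{I_G I_T} - E{I_G} E{I_T}) / (sigma(I_G) sigma(I_T))."""
    g = gallery.astype(float).ravel()
    t = test.astype(float).ravel()
    num = (g * t).mean() - g.mean() * t.mean()
    return num / (g.std() * t.std())
```

Because of the normalization, `ncc(img, a * img + b)` equals 1 for any a > 0: the coefficient is invariant to affine rescaling of intensity, which is the variance-and-average matching described above, though not to illumination gradients, rotation or scale.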
In [16], Brunelli and Poggio describe a correlation-based method for face recognition in which templates corresponding to facial features of relevant significance, such as the eyes, nose and mouth, are matched. In order to reduce complexity, in this method the positions of those features are detected first. Detection of facial features has also been the subject of many studies [36, 81, 113, 114]. The method proposed by Brunelli and Poggio uses a set of templates to detect the eye position in a new image, by looking for the maximum absolute values of the normalized correlation coefficient of these templates at each point in the test image. In order to handle scale variations, five eye templates at different scales were used. However, this method is computationally expensive, and it must also be noted that the eyes of different people can be markedly different. Such difficulties can be reduced by using a hierarchical correlation [115].
After facial features are detected for a test face, they are compared to those of gallery faces, returning a vector of matching scores (one per feature) computed through normalized cross-correlation.

The similarity scores of different features can be integrated to obtain a global score. This cumulative score can be computed in several ways: choose the score of the most similar feature, sum the feature scores, or sum the feature scores using weights. After cumulative scores are computed, a test face is assigned to the face class for which this score is maximized.
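The three combination rules and the final class assignment can be sketched in a few lines (plain Python; the function names are illustrative):

```python
def cumulative_score(feature_scores, rule="sum", weights=None):
    """Combine per-feature similarity scores into one global score using one of
    the three rules above: most similar feature, plain sum, or weighted sum."""
    if rule == "max":
        return max(feature_scores)
    if rule == "wsum":
        return sum(w * s for w, s in zip(weights, feature_scores))
    return sum(feature_scores)

def classify(scores_per_class, rule="sum", weights=None):
    """Assign the test face to the class whose cumulative score is maximal."""
    return max(scores_per_class,
               key=lambda c: cumulative_score(scores_per_class[c], rule, weights))
```

The weighted sum lets reliable features (e.g. eyes) dominate less stable ones; the choice of weights is a design decision not fixed by [16].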
The recognition rate reported in [16] is higher than 96%. The correlation method as described above requires a feature detection algorithm that is robust with respect to variations in scale, illumination and rotation. Moreover, the computational complexity of this method is quite high.
Beymer [116] extended the correlation-based approach to a view-based approach for recognizing faces under varying orientations, including rotations in depth. In the first stage, three facial features (the eyes and nose lobe) are detected to determine the face pose. Although feature detection is similar to the previously described correlation method, templates from different views and different people are used to handle rotations. After the face pose is determined, the matching procedure takes place with the corresponding view of the gallery faces. In this case, as the number of model views for each person in the database is increased, the computational complexity also increases.
2.2.7. Feature Based Methods
Since most face recognition algorithms are minimum distance classifiers in some sense, it is important to consider more carefully how a distance should be defined. In the previous examples (eigenface, neural nets, etc.) the distance between an observed face x and a gallery face c is the common Euclidean distance d(x,c) = ||x − c||, and this distance is sometimes computed using an alternative orthonormal basis as d(x,c) = ||x̃ − c̃||.
While such an approach is easy to compute, it also has some shortcomings. When there is an affine transformation between two faces (shift and dilation), d(x,c) will not be zero; in fact it can be quite large. As another example, when there are local transformations and deformations (x is a smiling version of c), again d(x,c) will not be zero. Moreover, it is very useful to store information only about the key points of the face. Feature-based approaches can be a solution to the above problems.
Manjunath et al. [36] proposed a method that recognizes faces by using topological graphs that are constructed from feature points obtained from the Gabor wavelet decomposition of the face. This method reduces the storage requirements by storing facial feature points detected using the Gabor wavelet decomposition. Comparison of two faces begins with the alignment of those two graphs by matching the centroids of the features. Hence, this method has some degree of robustness to rotation in depth, but only under restrictively controlled conditions. Moreover, illumination changes and occluded faces are not taken into account.
In the method proposed by Manjunath et al. [36], the identification process utilizes the information present in a topological graph representation of the feature points. The feature points are represented by nodes V_i, i = {1, 2, 3, ...}, in a consistent numbering technique. The information about a feature point is contained in {S, q}, where S represents the spatial location and q is the feature vector defined by

q_i = [Q_i(x, y, 1), ..., Q_i(x, y, N)]   (2.44)
corresponding to the ith feature point. The vector q_i is a set of spatial and angular distances from feature point i to its N nearest neighbors, denoted by Q_i(x, y, j), where j is the jth of the N neighbors. N_i represents the set of neighbors. The neighbors satisfying both the maximum number N and the minimum Euclidean distance d_ij between two points V_i and V_j are said to be of consequence for the ith feature point.
In order to identify an input graph with a stored one, which might differ either in the total number of feature points or in the location of the respective faces, two cost values are evaluated [36]. One is the topological cost and the other is a similarity cost. If i, j refer to nodes in the input graph I, and i', j', m, n refer to nodes in the stored graph O, then the two graphs are matched as follows [36]:
1. The centroids of the feature points of I and O are aligned.
2. Let V_i be the ith feature point of I. Search for the best matching feature point V_i' in O using the criterion

S_ii' = min_{m ∈ N_i'} ||q_i − q_m||   (2.45)
3. After matching, the total cost is computed taking into account the topology of the graphs. Let nodes i and j of the input graph match nodes i' and j' of the stored graph, and let j ∈ N_i (i.e., V_j is a neighbor of V_i). Let

δ_ij,i'j' = min( d_ij / d_i'j' , d_i'j' / d_ij ).

The topology cost is given by

τ_ii'jj' = 1 − δ_ij,i'j'.   (2.46)
4. The total cost is computed as

C_1 = Σ_i ( S_ii' + λ_t Σ_{j ∈ N_i} τ_ii'jj' )   (2.47)

where λ_t is a scaling parameter assigning relative importance to the two cost functions.
5. The total cost is scaled appropriately to reflect the possible difference in the total number of feature points between the input and the stored graph. If n_I, n_O are the numbers of feature points in the input and stored graphs, respectively, then the scaling factor is s_f = max( n_I / n_O , n_O / n_I ) and the scaled cost is C(I,O) = s_f C_1(I,O).
6. The best candidate is the one with the least cost,

C(I, O*) = min_O C(I, O)   (2.48)
The recognized face is the one that has the minimum combined cost value. In this method [36], since the comparison of two face graphs begins with centroid alignment, occluded cases cause a great performance decrease. Moreover, directly using the number of feature points of faces can result in wrong classifications, since the number of feature points can change due to exterior factors (glasses, etc.).
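The cost computation of steps 3-5 can be sketched as follows (plain Python; function names are ours and the per-node neighbor bookkeeping of [36] is simplified):

```python
def topology_cost(d_ij, d_ipjp):
    """tau = 1 - min(d_ij/d_i'j', d_i'j'/d_ij), eq. (2.46): zero when the matched
    edges have equal length, approaching one as their lengths diverge."""
    return 1.0 - min(d_ij / d_ipjp, d_ipjp / d_ij)

def scaled_total_cost(sim_costs, topo_costs, lam_t, n_i, n_o):
    """Eq. (2.47) plus the scaling of step 5:
    C(I,O) = s_f * sum_i (S_ii' + lam_t * sum_j tau), s_f = max(n_I/n_O, n_O/n_I).
    sim_costs: per-node similarity costs S_ii'; topo_costs: per-node lists of
    topology costs over the node's neighbors."""
    c1 = sum(s + lam_t * sum(taus) for s, taus in zip(sim_costs, topo_costs))
    return max(n_i / n_o, n_o / n_i) * c1
```

Note how the scale factor s_f penalises graphs of unequal size, which is exactly the sensitivity to missing feature points (glasses, occlusion) criticised above.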
Another feature-based approach is the elastic matching algorithm proposed by Lades et al. [117], which has roots in aspect-graph matching. Let S be the
original two-dimensional image lattice (Figure 2.8). The face template is a vector field defined by a new type of representation,

c = {c_i, i ∈ S_1}   (2.49)

where S_1 is a lattice embedded in S and c_i is a feature vector at position i. S_1 is much coarser and smaller than S. c should contain only the most critical information about the face, since c_i is composed of the magnitudes of the Gabor filter responses at position i ∈ S_1. The Gabor features c_i provide multi-scale edge strengths at position i.
Figure 2.8: A 2D image lattice (grid graph) on Marilyn Monroe's face.
An observed face image is defined as a vector field on the original image lattice S,

x = {x_j, j ∈ S}   (2.50)

where x_j is the same type of feature vector as c_i but defined on the fine-grid lattice S.
In the elastic graph matching approach [23, 27, 117], the distance d(c,x) is defined through the best match between x and c. A match between x and c can be described uniquely through a mapping between S_1 and S,

M: S_1 → S   (2.51)

However, without restriction, the total number of such mappings is ||S||^||S_1||.
In recognizing a face, the best match should preserve both features and local geometry. If i ∈ S_1 and j = M(i), then feature preserving means that x_j is not much different from c_i. In addition, if i_1 and i_2 are close in S_1, then preserving local geometry means that j_1 = M(i_1) should be close to j_2 = M(i_2). Such a match is called elastic since the preservation can be approximate rather than exact; the lattice S_1 can be stretched unevenly.

Finding the best match is based on minimizing the following energy function [117]:

E(M) = Σ_{i ∈ S_1} ||c_i − x_M(i)||² + λ Σ_{(i_1,i_2)} ||(M(i_1) − M(i_2)) − (i_1 − i_2)||²   (2.52)
Due to the large number of possible matches, this might be difficult to obtain in an acceptable time. Hence, an approximate solution has to be found. This is achieved in two stages: rigid matching and deformable matching [117]. In rigid matching, c is moved around in x like in conventional template matching, and at each position ||c − x̄|| is calculated. Here x̄ is the part of x that, after being matched with c, does not lead to any deformation of S_1. In deformable matching, the lattice S_1 is
stretched through random local perturbations to further reduce the energy function.
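The energy of eq. (2.52) can be sketched directly (NumPy; for brevity the geometry term is summed here over all vertex pairs rather than only neighbouring lattice edges as in [117], and all data-structure choices are ours):

```python
import numpy as np

def elastic_energy(c, x, M, grid, lam):
    """Energy of a match M: S1 -> S in the spirit of eq. (2.52).
    c: (n, d) template feature vectors at the lattice vertices;
    grid: (n, 2) lattice coordinates in S1; M: list of matched image
    positions; x: dict mapping image position -> feature vector;
    lam: rigidity weight trading feature fit against geometric distortion."""
    # feature term: penalises mismatch between template and image features
    feat = sum(np.sum((c[i] - x[M[i]]) ** 2) for i in range(len(M)))
    # geometry term: penalises distortion of the lattice
    geom = 0.0
    for i1 in range(len(M)):
        for i2 in range(i1 + 1, len(M)):
            d_img = np.subtract(M[i1], M[i2])
            d_grid = grid[i1] - grid[i2]
            geom += np.sum((d_img - d_grid) ** 2)
    return feat + lam * geom
```

A rigid translation of all matched positions leaves the geometry term at zero, which is why rigid matching can scan c over x before the deformable stage perturbs individual vertices.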
There is still the question of face template construction. An automated scheme has been shown to work quite well. The procedure is as follows [117]:

1. Pick the face image of an individual and generate a face vector field using Gabor filters. Denote this vector field x.
2. Place a coarse grid lattice S_1 by hand on x such that its vertices are close to important facial features such as the eyes.
3. Collect the feature vectors at the vertices of S_1 to form the template c.
4. Pick another face image (of a different person) and generate a face vector field, denoted y.
5. Perform template matching between c and y using rigid matching.
6. When the best match is found, collect the feature vectors at the vertices of S_1 to form the template of y.
7. Repeat steps 4-6.
Malsburg et al. [23, 27] developed a system based on the graph matching approach with several major modifications. First, they suggested using not only the magnitude but also the phase information of the Gabor filter responses. Due to phase rotation, vectors taken from image points only a few pixels apart from each other have very different coefficients, although they represent almost the same local feature. Therefore, phase should either be ignored or explicitly compensated for its variation.
As mentioned in [23], using phase information has two potential advantages:

1. Phase information is required to discriminate between patterns with similar magnitudes.
2. Since phase varies quickly with location, it provides a means for accurate vector localization in an image.
Malsburg et al. proposed to compensate for phase shifts by estimating small relative displacements between two feature vectors, using a phase-sensitive similarity function [23].
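A phase-compensated jet similarity of this kind might be sketched as follows (NumPy; the exact function and the displacement-estimation procedure of [23] differ in detail, and all variable names here are illustrative):

```python
import numpy as np

def phase_similarity(a1, phi1, a2, phi2, k, disp):
    """Displacement-compensated, phase-sensitive similarity between two Gabor
    jets.  a1, a2: filter magnitudes; phi1, phi2: filter phases; k: wave
    vectors of the filters, shape (n, 2); disp: estimated relative
    displacement (2,) between the two sample points, used to cancel the
    phase rotation caused by that displacement."""
    dphi = phi1 - phi2 - k @ disp          # compensated phase difference
    num = np.sum(a1 * a2 * np.cos(dphi))
    return num / np.sqrt(np.sum(a1 ** 2) * np.sum(a2 ** 2))
```

When the displacement estimate is correct, the cosine terms align and the similarity of two jets of the same local feature approaches its magnitude-only value, which is how phase is used without being defeated by its rapid spatial variation.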
The second modification is the use of bunch graphs (Figures 2.9, 2.10). The face bunch graph is able to represent a wide variety of faces, which allows matching on face images of previously unseen individuals. These improvements make it possible to extract an image graph from a new face image in one matching process. Computational efficiency and the ability to deal with different poses explicitly are the major advantages of the system in [23], compared to [117].
(a) (b)

Figure 2.9: A bunch graph from a) the artistic point of view (``Unfinished Portrait'' by Tullio Pericoli (1985)), b) the scientific point of view.

Figure 2.10: Bunch graph matched to a face.
2.2.8. Current State of the Art
Comparing the recognition results of different face recognition systems is a complex task, because experiments are generally carried out on different datasets. However, the following five questions can help while reviewing different face recognition methods:

1. Were expression, head orientation and lighting conditions controlled?
2. Were subjects allowed to wear glasses and to have beards or other facial masks?
3. Was the subject sample balanced? Were gender, age and ethnic origin spanned evenly?
4. How many subjects were there in the database? How many images were used for training and testing?
5. Were the face features located manually?
Answering the above questions contributes to building a better description of the constraints within which each approach operated. This helps to make a fairer comparison between different sets of experimental results. However, the most direct and reliable comparison between different approaches is obtained by experimenting with the same database.
By 1993, there were several algorithms claiming to have accurate performance in minimally constrained environments. For a better comparison of those algorithms, DARPA and the Army Research Laboratory established the FERET program (see Section 4.1.4) with the goals of both evaluating their performance and encouraging advances in the technology [83].
Today, there are three algorithms that have demonstrated the highest level of recognition accuracy on large databases (1196 people or more) under double-blind testing conditions. These are the algorithms from the University of Southern California (USC), the University of Maryland (UMD), and the MIT Media Lab [83, 128]. The MIT, Rockefeller and UMD algorithms all use a version of the eigenface transform followed by discriminative modeling (Section 2.2.3.1.2). However, the USC system uses a very different approach. It begins by computing the Gabor wavelet transform of the image and does the comparison between images using a graph