Categorization by Learning and Combining Object Parts
B. Heisele, T. Serre, M. Pontil, T. Vetter, T. Poggio
Presented by Manish Jethwa
Overview
Learn discriminatory components of objects with Support Vector Machine (SVM) classifiers.
Background
Global approach
– Attempts to classify the entire object.
– Successful when applied to problems in which the object pose is fixed.
Component-based techniques
– Individual components vary less than the whole object when the object pose changes.
– Usable even when some of the components are occluded.
Linear Support Vector Machines
Linear SVMs are used to discriminate between two classes by determining the separating hyperplane.
[Figure: separating hyperplane and support vectors]
Decision function
The decision function of the SVM has the form:

    f(x) = Σ_{i=1}^{l} α_i y_i ⟨x_i · x⟩ + b

where:
– l is the number of training data points
– x_i are the training data points and x is a new data point
– y_i ∈ {-1, 1} are the class labels
– α_i are adjustable coefficients, obtained as the solution of a quadratic programming problem; they are positive for the support vectors and zero for all other data points
– b is the bias

f(x) defines a hyperplane dividing the data; the sign of f(x) indicates the class of x.
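This decision function can be sketched directly in NumPy (the training points, labels, and coefficients below are toy values for illustration, not from the paper):

```python
import numpy as np

def svm_decision(x, X_train, y_train, alpha, b):
    # f(x) = sum_i alpha_i * y_i * <x_i . x> + b
    return np.sum(alpha * y_train * (X_train @ x)) + b

# Toy linearly separable data (hypothetical values):
X_train = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0]])
y_train = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.25, 0.25, 0.25, 0.25])  # zero for non-support vectors
b = 0.0

f = svm_decision(np.array([1.0, 1.0]), X_train, y_train, alpha, b)
label = 1 if f > 0 else -1  # the sign of f(x) gives the class
```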
Significance of α_i
– The α_i correspond to the weights of the support vectors.
– They are learned from the training data set.
– They are used to compute the margin M of the support vectors to the hyperplane:

    M = (√(Σ_{i=1}^{l} α_i))^{-1}
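Given the α_i of a trained SVM, the margin follows directly; a minimal sketch with hypothetical coefficient values:

```python
import numpy as np

alpha = np.array([0.25, 0.25, 0.25, 0.25])  # hypothetical dual coefficients
M = 1.0 / np.sqrt(alpha.sum())              # M = (sqrt(sum_i alpha_i))^(-1)
```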
Non-separable Data
– The notion of a margin extends to non-separable data as well.
– Misclassified points result in errors.
– The hyperplane is now defined by maximizing the margin while minimizing the summed error.
– The expected error probability of the SVM satisfies the following bound:

    E[P_err] ≤ (1/l) E[D²/M²]

where D is the diameter of the sphere containing all training data.
Measuring Error
The probability of error is proportional to the following ratio:

    ρ = D²/M²

Scaling the data scales D and M by the same factor: if D₁/M₁ = D₂/M₂, then ρ₁ = ρ₂. This renders ρ, and therefore the probability of error, invariant to scale.
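The scale invariance of ρ can be checked numerically (the data set and margin values below are hypothetical):

```python
import numpy as np

def rho(X, M):
    """rho = D^2 / M^2, with D the diameter of the data set."""
    # Diameter: largest pairwise distance between training points.
    D = max(np.linalg.norm(a - b) for a in X for b in X)
    return D**2 / M**2

X = np.array([[0.0, 0.0], [3.0, 4.0]])  # hypothetical 2-point set, D = 5
r1 = rho(X, M=2.0)        # 25 / 4
r2 = rho(10 * X, M=20.0)  # scaling the data by 10 scales D and M by 10
assert r1 == r2           # rho is scale invariant
```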
Learning Components
Components are grown from small seed regions: the region is expanded in one direction at a time (e.g., to the left), an SVM is retrained on the enlarged component, and the error bound ρ is re-estimated. Expansions that decrease ρ are kept.
[Figure: ρ plotted against successive expansions of a component]
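Component growing — expanding a region, retraining, and keeping expansions that lower ρ — can be sketched as a greedy search. Here `train_svm` and `estimate_rho` are hypothetical stand-ins for SVM training and the D²/M² bound estimate:

```python
def grow_component(region, expansions, train_svm, estimate_rho, steps=10):
    """Greedily expand a seed region, keeping the expansion that lowers rho."""
    best_rho = estimate_rho(train_svm(region))
    for _ in range(steps):
        # Try each expansion direction and score the enlarged component.
        candidates = [expand(region) for expand in expansions]
        scored = [(estimate_rho(train_svm(r)), r) for r in candidates]
        new_rho, new_region = min(scored, key=lambda t: t[0])
        if new_rho >= best_rho:  # stop when no expansion improves the bound
            break
        best_rho, region = new_rho, new_region
    return region
```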
Learning Facial Components
Extracting face components is time consuming
– Requires manually extracting each component from all training images.
Use textured head models instead
– Automatically produce a large number of faces under differing illumination and poses.
Seven textured head models were used to generate 2,457 face images of size 58x58.
Negative Training Set
– Extract 58x58 patches from 502 non-face images to give 10,209 negative training points.
– Train an SVM classifier on this data, then add its false positives to the negative training set.
– This enlarges the negative training set with those images that look most like faces.
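This bootstrapping of the negative set can be sketched as a simple retraining loop (`train` and the candidate patches are hypothetical stand-ins, not the paper's actual code):

```python
def bootstrap_negatives(train, positives, negatives, candidate_patches, rounds=3):
    """Retrain, then fold false positives back into the negative set."""
    clf = train(positives, negatives)
    for _ in range(rounds):
        # False positives: non-face patches the current classifier calls faces.
        false_pos = [p for p in candidate_patches if clf(p) == 1]
        if not false_pos:
            break
        negatives = negatives + false_pos  # hardest non-faces join the negatives
        clf = train(positives, negatives)
    return clf, negatives
```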
Learned Components
Starting from fourteen manually selected 5x5 seed regions, the learned components are:
– The eyes (17x17 pixels)
– The nose (15x20 pixels)
– The mouth (31x15 pixels)
– The cheeks (21x20 pixels)
– The lip (13x16 pixels)
– The nostrils (22x12 pixels)
– The corners of the mouth (15x20 pixels)
– The eyebrows (15x20 pixels)
– The bridge of the nose (15x20 pixels)
Combining Components
A 58x58 window is shifted over the input image. Within each window, component experts (a left-eye expert, a nose expert, a mouth expert, ...), each a linear SVM, are shifted over the 58x58 window. For each expert, the maximum output and its location are determined and passed to a combining classifier, itself a linear SVM, which makes the final decision: face / background.
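The two-level architecture can be sketched as follows (the experts and the combining classifier are hypothetical function stand-ins for the linear SVMs described on the slide):

```python
def detect_face(window, experts, combiner):
    """First level: component experts; second level: combining classifier."""
    features = []
    for expert, positions in experts:
        # Shift the expert over the window; keep its max output and location.
        outputs = [(expert(window, pos), pos) for pos in positions]
        best_out, (bx, by) = max(outputs, key=lambda t: t[0])
        features.extend([best_out, bx, by])
    return combiner(features)  # +1: face, -1: background
```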
Experiments