Recognition: A machine learning approach
Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, Kristen Grauman, and Derek Hoiem
The machine learning framework
• Apply a prediction function to a feature representation of the image to get the desired output:
f( ) = “apple”    f( ) = “tomato”    f( ) = “cow”
The machine learning framework
y = f(x)
• y: output
• f: prediction function
• x: image feature
• Training: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
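As an illustrative sketch (not from the slides), the train/test framework can be made concrete with a hypothetical one-dimensional threshold classifier: training picks the parameter that minimizes error on the labeled training set, and testing applies the learned function to unseen examples.

```python
# Toy prediction function f_theta(x) = 1 if x >= theta else 0 (hypothetical example).
def train_error(theta, data):
    """Fraction of labeled examples (x, y) that f_theta gets wrong."""
    return sum(((1 if x >= theta else 0) != y) for x, y in data) / len(data)

def fit(train):
    """Training: choose the threshold minimizing error on the training set."""
    candidates = [x for x, _ in train]  # each training value is a candidate theta
    return min(candidates, key=lambda t: train_error(t, train))

train = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]  # labeled examples (x_i, y_i)
theta = fit(train)                                # estimated prediction function f
test = [(0.2, 0), (0.8, 1)]                       # never-before-seen examples
print(theta, train_error(theta, test))            # → 0.6 0.0
```

The same two-phase pattern, fit on labeled data then evaluate on held-out data, underlies every classifier in the rest of the deck.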
Steps
Training: Training Images + Training Labels → Image Features → Training → Learned model
Testing: Test Image → Image Features → Learned model → Prediction
Slide credit: D. Hoiem
Features
• Raw pixels
• Histograms
• GIST descriptors
• …
Classifiers: Nearest neighbor
[Figure: a test example plotted among training examples from class 1 and training examples from class 2]
f(x) = label of the training example nearest to x
• All we need is a distance function for our inputs
• No training required!
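A minimal sketch of this rule on toy 2-D data (the data and function names are illustrative, not from the slides): with only a Euclidean distance function, classification is a single `min` over the training set.

```python
import math

def nearest_neighbor_classify(train, x):
    """f(x) = label of the training example nearest to x (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    nearest = min(train, key=lambda example: dist(example[0], x))
    return nearest[1]

# Toy 2-D features: class 1 clusters near the origin, class 2 near (5, 5).
train = [((0, 0), 1), ((1, 0), 1), ((0, 1), 1),
         ((5, 5), 2), ((4, 5), 2), ((5, 4), 2)]
print(nearest_neighbor_classify(train, (0.5, 0.5)))  # → 1
print(nearest_neighbor_classify(train, (4.6, 4.8)))  # → 2
```

Note that "no training required" shows up directly in the code: there is no fit step, only a scan of the stored examples at test time.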
Classifiers: Linear
• Find a linear function to separate the classes:
f(x) = sgn(w ⋅ x + b)
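The slide does not say how w and b are learned; one classic choice (an assumption here, not the deck's stated method) is the perceptron rule, which nudges (w, b) toward each misclassified example until the classes are separated:

```python
def sgn(z):
    return 1 if z >= 0 else -1

def perceptron_train(X, y, epochs=100):
    """Perceptron rule: on each mistake, move (w, b) toward the example."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if sgn(sum(wj * xj for wj, xj in zip(w, xi)) + b) != yi:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:   # training set perfectly separated; done
            break
    return w, b

X = [(2, 2), (3, 1), (-2, -1), (-1, -3)]  # toy, linearly separable data
y = [1, 1, -1, -1]
w, b = perceptron_train(X, y)
preds = [sgn(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in X]
print(preds)  # → [1, 1, -1, -1]
```

On linearly separable data the perceptron is guaranteed to converge; max-margin methods such as SVMs pick a particular separating hyperplane rather than just any one.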
Recognition task and supervision
• Images in the training set must be annotated with the “correct answer” that the model is expected to produce
Contains a motorbike
Unsupervised “Weakly” supervised Fully supervised
Definition depends on task
Generalization
• How well does a learned model generalize from the data it was trained on to a new test set?
Training set (labels known) → Test set (labels unknown)
Generalization
• Components of generalization error
– Bias: how much does the average model over all training sets differ from the true model?
• Error due to inaccurate assumptions/simplifications made by the model
– Variance: how much do models estimated from different training sets differ from each other?
• Underfitting: model is too “simple” to represent all the relevant class characteristics
– High bias and low variance
– High training error and high test error
• Overfitting: model is too “complex” and fits irrelevant characteristics (noise) in the data
– Low bias and high variance
– Low training error and high test error
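A toy sketch of this contrast (illustrative data, not from the slides): a model that memorizes the training set, here a nearest-neighbor rule, drives training error to zero but also fits a mislabeled point, while a simpler fixed-threshold model tolerates the noise and generalizes better.

```python
# 1-D data whose true rule is y = 1 if x > 0.5; the point at x = 0.45
# is deliberately mislabeled to play the role of noise.
train = [(0.1, 0), (0.3, 0), (0.45, 1), (0.7, 1), (0.9, 1)]

def nn_predict(x):
    """Memorizing model: label of the nearest training point (overfits)."""
    return min(train, key=lambda ex: abs(ex[0] - x))[1]

def threshold_predict(x):
    """Simple model: fixed threshold at 0.5 (cannot fit the noise)."""
    return 1 if x > 0.5 else 0

def error(predict, data):
    return sum(predict(x) != y for x, y in data) / len(data)

test = [(0.2, 0), (0.48, 0), (0.8, 1)]  # fresh examples from the true rule
print(error(nn_predict, train), error(threshold_predict, train))  # → 0.0 0.2
print(error(nn_predict, test) > error(threshold_predict, test))   # → True
```

The memorizer has the lower training error but the higher test error, exactly the overfitting signature listed above.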
Bias-variance tradeoff
[Figure: training and test error vs. model complexity; training error decreases steadily while test error is U-shaped; underfitting (high bias, low variance) at low complexity, overfitting (low bias, high variance) at high complexity]
Slide credit: D. Hoiem
Bias-variance tradeoff
[Figure: test error vs. model complexity, plotted for few training examples and for many training examples]
Slide credit: D. Hoiem
Effect of Training Size
Fixed prediction model
[Figure: testing and training error vs. number of training examples; the gap between the two curves is the generalization error]
Slide credit: D. Hoiem
Datasets
• Circa 2001: 5 categories, 100s of images per category
• Circa 2004: 101 categories
• Today: up to thousands of categories, millions of images
Caltech 101 & 256http://www.vision.caltech.edu/Image_Datasets/Caltech101/http://www.vision.caltech.edu/Image_Datasets/Caltech256/
Griffin, Holub, Perona, 2007
Fei-Fei, Fergus, Perona, 2004
Caltech-101: Intraclass variability
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
Challenge classes:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
• Main competitions
– Classification: for each of the twenty classes, predicting presence/absence of an example of that class in the test image
– Detection: predicting the bounding box and label of each object from the twenty target classes in the test image
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
• “Taster” challenges
– Segmentation: generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise
– Person layout: predicting the bounding box and label of each part of a person (head, hands, feet)
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
• “Taster” challenges
– Action classification
LabelMehttp://labelme.csail.mit.edu/
Russell, Torralba, Murphy, Freeman, 2008
80 Million Tiny Images
http://people.csail.mit.edu/torralba/tinyimages/
ImageNethttp://www.image-net.org/