Recognition: A machine learning approach
Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, Kristen Grauman, and Derek Hoiem
The machine learning framework
• Apply a prediction function to a feature representation of the image to get the desired output:
f( ) = “apple”    f( ) = “tomato”    f( ) = “cow”
The machine learning framework
y = f(x)
• y: output
• f: prediction function
• x: image feature
• Training: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
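As an illustrative sketch (not from the slides), the train/test framework can be made concrete with a hypothetical one-dimensional threshold classifier: training picks the parameter that minimizes error on the labeled training set, and testing applies the learned function to unseen examples.

```python
# Toy prediction function f_theta(x) = 1 if x >= theta else 0 (hypothetical example).
def train_error(theta, data):
    """Fraction of labeled examples (x, y) that f_theta gets wrong."""
    return sum(((1 if x >= theta else 0) != y) for x, y in data) / len(data)

def fit(train):
    """Training: choose the threshold minimizing error on the training set."""
    candidates = [x for x, _ in train]  # each training value is a candidate theta
    return min(candidates, key=lambda t: train_error(t, train))

train = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]  # labeled examples (x_i, y_i)
theta = fit(train)                                # estimated prediction function f
test = [(0.2, 0), (0.8, 1)]                       # never-before-seen examples
print(theta, train_error(theta, test))            # → 0.6 0.0
```

The same two-phase pattern, fit on labeled data then evaluate on held-out data, underlies every classifier in the rest of the deck.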
Steps
Training: Training Images + Training Labels → Image Features → Training → Learned model
Testing: Test Image → Image Features → Learned model → Prediction
Slide credit: D. Hoiem
Features
• Raw pixels
• Histograms
• GIST descriptors
• …
Classifiers: Nearest neighbor
[Figure: a test example plotted among training examples from class 1 and training examples from class 2]
f(x) = label of the training example nearest to x
• All we need is a distance function for our inputs
• No training required!
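A minimal sketch of this rule on toy 2-D data (the data and function names are illustrative, not from the slides): with only a Euclidean distance function, classification is a single `min` over the training set.

```python
import math

def nearest_neighbor_classify(train, x):
    """f(x) = label of the training example nearest to x (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    nearest = min(train, key=lambda example: dist(example[0], x))
    return nearest[1]

# Toy 2-D features: class 1 clusters near the origin, class 2 near (5, 5).
train = [((0, 0), 1), ((1, 0), 1), ((0, 1), 1),
         ((5, 5), 2), ((4, 5), 2), ((5, 4), 2)]
print(nearest_neighbor_classify(train, (0.5, 0.5)))  # → 1
print(nearest_neighbor_classify(train, (4.6, 4.8)))  # → 2
```

Note that "no training required" shows up directly in the code: there is no fit step, only a scan of the stored examples at test time.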
Classifiers: Linear
• Find a linear function to separate the classes:
f(x) = sgn(w ⋅ x + b)
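The slide does not say how w and b are learned; one classic choice (an assumption here, not the deck's stated method) is the perceptron rule, which nudges (w, b) toward each misclassified example until the classes are separated:

```python
def sgn(z):
    return 1 if z >= 0 else -1

def perceptron_train(X, y, epochs=100):
    """Perceptron rule: on each mistake, move (w, b) toward the example."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if sgn(sum(wj * xj for wj, xj in zip(w, xi)) + b) != yi:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:   # training set perfectly separated; done
            break
    return w, b

X = [(2, 2), (3, 1), (-2, -1), (-1, -3)]  # toy, linearly separable data
y = [1, 1, -1, -1]
w, b = perceptron_train(X, y)
preds = [sgn(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in X]
print(preds)  # → [1, 1, -1, -1]
```

On linearly separable data the perceptron is guaranteed to converge; max-margin methods such as SVMs pick a particular separating hyperplane rather than just any one.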
Recognition task and supervision
• Images in the training set must be annotated with the “correct answer” that the model is expected to produce
Contains a motorbike
Unsupervised “Weakly” supervised Fully supervised
Definition depends on task
Generalization
• How well does a learned model generalize from the data it was trained on to a new test set?
Training set (labels known) → Test set (labels unknown)
Generalization
• Components of generalization error
– Bias: how much does the average model over all training sets differ from the true model?
• Error due to inaccurate assumptions/simplifications made by the model
– Variance: how much do models estimated from different training sets differ from each other?
• Underfitting: model is too “simple” to represent all the relevant class characteristics
– High bias and low variance
– High training error and high test error
• Overfitting: model is too “complex” and fits irrelevant characteristics (noise) in the data
– Low bias and high variance
– Low training error and high test error
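A toy sketch of this contrast (illustrative data, not from the slides): a model that memorizes the training set, here a nearest-neighbor rule, drives training error to zero but also fits a mislabeled point, while a simpler fixed-threshold model tolerates the noise and generalizes better.

```python
# 1-D data whose true rule is y = 1 if x > 0.5; the point at x = 0.45
# is deliberately mislabeled to play the role of noise.
train = [(0.1, 0), (0.3, 0), (0.45, 1), (0.7, 1), (0.9, 1)]

def nn_predict(x):
    """Memorizing model: label of the nearest training point (overfits)."""
    return min(train, key=lambda ex: abs(ex[0] - x))[1]

def threshold_predict(x):
    """Simple model: fixed threshold at 0.5 (cannot fit the noise)."""
    return 1 if x > 0.5 else 0

def error(predict, data):
    return sum(predict(x) != y for x, y in data) / len(data)

test = [(0.2, 0), (0.48, 0), (0.8, 1)]  # fresh examples from the true rule
print(error(nn_predict, train), error(threshold_predict, train))  # → 0.0 0.2
print(error(nn_predict, test) > error(threshold_predict, test))   # → True
```

The memorizer has the lower training error but the higher test error, exactly the overfitting signature listed above.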
Bias-variance tradeoff
[Figure: training and test error vs. model complexity; training error decreases steadily while test error is U-shaped; underfitting (high bias, low variance) at low complexity, overfitting (low bias, high variance) at high complexity]
Slide credit: D. Hoiem
Bias-variance tradeoff
[Figure: test error vs. model complexity, plotted for few training examples and for many training examples]
Slide credit: D. Hoiem
Effect of Training Size
Fixed prediction model
[Figure: testing and training error vs. number of training examples; the gap between the two curves is the generalization error]
Slide credit: D. Hoiem
Datasets
• Circa 2001: 5 categories, 100s of images per category
• Circa 2004: 101 categories
• Today: up to thousands of categories, millions of images
Caltech 101 & 256http://www.vision.caltech.edu/Image_Datasets/Caltech101/http://www.vision.caltech.edu/Image_Datasets/Caltech256/
Griffin, Holub, Perona, 2007
Fei-Fei, Fergus, Perona, 2004
Caltech-101: Intraclass variability
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
Challenge classes:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
• Main competitions
– Classification: for each of the twenty classes, predicting presence/absence of an example of that class in the test image
– Detection: predicting the bounding box and label of each object from the twenty target classes in the test image
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
• “Taster” challenges
– Segmentation: generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise
– Person layout: predicting the bounding box and label of each part of a person (head, hands, feet)
The PASCAL Visual Object Classes Challenge (2005-present)
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
• “Taster” challenges
– Action classification
LabelMehttp://labelme.csail.mit.edu/
Russell, Torralba, Murphy, Freeman, 2008
80 Million Tiny Images
http://people.csail.mit.edu/torralba/tinyimages/
ImageNethttp://www.image-net.org/