47
Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Recognizing and Learning Learning Object Categories Object Categories ICCV 2005 Beijing, Short Course, ICCV 2005 Beijing, Short Course, Oct 15 Oct 15

Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Embed Size (px)

Citation preview

Page 1: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Li Fei-Fei, UIUC

Rob Fergus, MIT

Antonio Torralba, MIT

Recognizing and Learning Recognizing and Learning Object CategoriesObject Categories

ICCV 2005 Beijing, Short Course, Oct 15ICCV 2005 Beijing, Short Course, Oct 15

Page 2: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

AgendaAgenda

• Introduction

• Bag of words models

• Part-based models

• Discriminative methods

• Segmentation and recognition

• Conclusions

Page 3: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15
Page 4: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

perceptibleperceptible visionvision materialmaterialthingthing

Page 5: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15
Page 6: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15
Page 7: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Plato said…• Ordinary objects are classified together if they

`participate' in the same abstract Form, such as the Form of a Human or the Form of Quartz.

• Forms are proper subjects of philosophical investigation, for they have the highest degree of reality.

• Ordinary objects, such as humans, trees, and stones, have a lower degree of reality than the Forms.

• Fictions, shadows, and the like have a still lower degree of reality than ordinary objects and so are not proper subjects of philosophical enquiry.

Page 8: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Bruegel, 1564

Page 9: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

How many object categories are there?

Biederman 1987

Page 14: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Object categorization

sky

building

flag

wallbanner

bus

cars

bus

face

street lamp

Page 15: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Scene and context categorization• outdoor

• city

• traffic

• …

Page 16: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Challenges 1: view point variation

Michelangelo 1475-1564

Page 17: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Challenges 2: illumination

slide credit: S. Ullman

Page 18: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Challenges 3: occlusion

Magritte, 1957

Page 19: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Challenges 4: scale

Page 20: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Challenges 5: deformation

Xu, Beihong 1943

Page 21: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Challenges 6: background clutter

Klimt, 1913

Page 22: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

History: single object recognition

Page 23: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

History: single object recognition

• Lowe, et al. 1999, 2003

• Mahamud and Herbert, 2000• Ferrari, Tuytelaars, and Van Gool, 2004• Rothganger, Lazebnik, and Ponce, 2004• Moreels and Perona, 2005• …

Page 24: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Challenges 7: intra-class variation

Page 25: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

History: early object categorization

Page 26: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

• Turk and Pentland, 1991• Belhumeur et al. 1997• Schneiderman et al. 2004• Viola and Jones, 2000

• Amit and Geman, 1999• LeCun et al. 1998• Belongie and Malik, 2002

• Schneiderman et al. 2004• Argawal and Roth, 2002• Poggio et al. 1993

Page 27: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15
Page 28: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

OBJECTS

ANIMALS INANIMATEPLANTS

MAN-MADENATURALVERTEBRATE …..

MAMMALS BIRDS

GROUSEBOARTAPIR CAMERA

Page 29: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15
Page 30: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Scenes, Objects, and Parts

Features

Parts

Objects

Scene

E. Sudderth, A. Torralba, W. Freeman, A. Willsky. ICCV 2005.

Page 31: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Object categorization: Object categorization: the statistical viewpointthe statistical viewpoint

)|( imagezebrap

)( ezebra|imagnopvs.

• Bayes rule:

)(

)(

)|(

)|(

)|(

)|(

zebranop

zebrap

zebranoimagep

zebraimagep

imagezebranop

imagezebrap

posterior ratio likelihood ratio prior ratio

Page 32: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Object categorization: Object categorization: the statistical viewpointthe statistical viewpoint

)(

)(

)|(

)|(

)|(

)|(

zebranop

zebrap

zebranoimagep

zebraimagep

imagezebranop

imagezebrap

posterior ratio likelihood ratio prior ratio

• Discriminative methods model posterior

• Generative methods model likelihood and prior

Page 33: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Discriminative

• Direct modeling of

Zebra

Non-zebra

Decisionboundary

)|(

)|(

imagezebranop

imagezebrap

Page 34: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

• Model and

Generative)|( zebraimagep ) |( zebranoimagep

Low Middle

High MiddleLow

)|( zebranoimagep)|( zebraimagep

Page 35: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Three main issuesThree main issues

• Representation– How to represent an object category

• Learning– How to form the classifier, given training data

• Recognition– How the classifier is to be used on novel data

Page 36: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Representation

– Generative / discriminative / hybrid

Page 37: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

Page 38: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

– Invariances• View point• Illumination• Occlusion• Scale• Deformation• Clutter• etc.

Page 39: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

– invariances– Part-based or global

w/sub-window

Page 40: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

– invariances– Parts or global w/sub-

window– Use set of features or

each pixel in image

Page 41: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning

Learning

Page 42: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)

– Methods of training: generative vs. discriminative

Learning

Page 43: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)

– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)

– Level of supervision• Manual segmentation; bounding box; image

labels; noisy labels

Learning

Contains a motorbike

Page 44: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)

– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)

– Level of supervision• Manual segmentation; bounding box; image

labels; noisy labels

– Batch/incremental (on category and image level; user-feedback )

Learning

Page 45: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)

– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)

– Level of supervision• Manual segmentation; bounding box; image

labels; noisy labels

– Batch/incremental (on category and image level; user-feedback )

– Training images:• Issue of overfitting• Negative images for discriminative methods

Priors

Learning

Page 46: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning)

– What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)

– Level of supervision• Manual segmentation; bounding box; image

labels; noisy labels

– Batch/incremental (on category and image level; user-feedback )

– Training images:• Issue of overfitting• Negative images for discriminative methods

– Priors

Learning

Page 47: Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15

– Scale / orientation range to search over – Speed

Recognition