
T L -SCALE FINE-GRAINED Ltrevor/public_html/... · 2012-09-28 · t owards l arge-s cale f ine-g rained c ategory l earning: p ose n ormalization, h ierarchical g eneralization, and



TOWARDS LARGE-SCALE FINE-GRAINED CATEGORY LEARNING:

POSE NORMALIZATION, HIERARCHICAL GENERALIZATION, AND TIMELY DETECTION

TREVOR DARRELL UC BERKELEY EECS & ICSI

RYAN FARRELL, NING ZHANG, YANGQING JIA, JOSHUA ABBOTT, JOSEPH AUSTERWEIL, TOM GRIFFITHS, SERGEY KARAYEV, TOBI BAUMGARTNER, MARIO FRITZ

CATEGORIZATION SPECTRUM

• CALTECH 101 (2004): Fei-Fei, Fergus, Perona
• CALTECH 256 (2007): Griffin, Holub, Perona
• Animals with Attributes (2009): Lampert, Nickisch, Harmeling, Weidmann
• Exploration of Methods for Automatic Whale Identification (2001): Heusel, Gunn
• Identifying Individual Salamanders (2007): Gamble, Ravela, McGarigal
• Labeled Faces in the Wild (2007): Huang, Ramesh, Berg, Learned-Miller
• STONEFLY9 (2009): Martínez-Muñoz, Zhang, Payet, Todorovic, Larios, Yamamuro, Lytle, Moldenke, Mortensen, Paasch, Shapiro, Dietterich
• Caltech/UCSD Birds 200 (2010): Welinder, Branson, Mita, Wah, Schroff, Belongie, Perona
• Visual Identification of Plant Species (2008): Belhumeur, Chen, Feiner, Jacobs, Kress, Ling, Lopez, Ramamoorthi, Sheorey, White, Zhang
• Oxford Flowers (2006-09): Nilsback, Zisserman

Fine-grained recognition bridges traditional instance and category-level tasks...

TOWARDS LARGE-SCALE FINE-GRAINED RECOGNITION

Tasks are relatively difficult even for humans: bird subspecies, vehicle brands, etc.

People learn them from small numbers of positive examples

Both local feature geometry and statistical appearance are salient.

Key differences vs. traditional recognition:
– Distinctive fine-grained features are often relative to object configuration
– Degree of generalization varies across category hierarchies
– Finest-grained categories may have few training examples
– Can’t detect everything all the time in a large-scale setting

TOWARDS LARGE-SCALE FINE-GRAINED RECOGNITION

Today:
• Pose Pooling Kernels for Sub-category Recognition
• Generalization in Large-Scale Concept Hierarchies
• Timely Recognition

POSE POOLING KERNELS FOR SUB-CATEGORY RECOGNITION

NING ZHANG, RYAN FARRELL, TREVOR DARRELL CVPR 2012

FINE-GRAINED VISUAL CATEGORIZATION

DISCRIMINATIVE DETAILS MAY BE HIGHLY LOCALIZED AND POSE RELATIVE

Scarlet Tanager Photo by Paul O’Toole

Summer Tanager Photo by Liam Wolff

Summer Tanager Photo by Patti Shoupe

Snowy Egret Photo by Shelley Rutkin

Great Egret Photo by James Hawkins


“BIRDLETS” [ICCV11] APPROACH

“Blue-Headed Vireo”!

DETECTION OF VOLUMETRIC PRIMITIVES

POSE-NORMALIZED APPEARANCE SPACE

2D VS. 3D

3D Representation
Pros
• Most accurate representation for volumetric objects
• Facilitates pose-normalized appearance model
Cons
• Accurate detection is challenging
• Annotations are costly
• Volumetric part model doesn’t capture flat parts such as wings

2D Representation
Pros
• Far easier to train (less complex and less costly annotations)
• Far more tolerant of localization errors
Cons
• Less precise part localization can attenuate key discriminative features

A classic story: recognition via detection

faces → detection / part localization → alignment / normalization → identification / attribute learning

Traditional Spatial Pooling

“POSE DEPENDENT” POOLING

OVERVIEW

“Red-Eyed Vireo”

Pose-Normalized Representation

Annotated images

Poselet training

Poselet activations

SVM classification

POSELET TRAINING

Parts: back, beak, belly, breast, crown, forehead, left eye, left leg, left wing, nape, right eye, right leg, right wing, tail, throat

Collect positive examples from other training images


POSELET TRAINING

Bourdev and Malik, in ICCV 2009.


POSELET ACTIVATION (PREDICTION)

Poselet #90

Poselet #66

Poselet #65

COMPARING POSELETS

• Local image features from poselets which overlap in 3D are pooled together

POSE NORMALIZED REPRESENTATION


• Cluster based on warping distance or keypoint distance

POSELET CLUSTERS


Pose pooling vs. spatial pyramid

Pose pooling
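As a rough illustration of the pooling idea on these slides, the sketch below pools bag-of-words counts per poselet cluster and concatenates the per-cluster histograms. It is not the authors' implementation: the function name, the `cluster_of` mapping, and the L1 normalization are assumptions made for the example.

```python
def pose_pooled_descriptor(activations, cluster_of, num_clusters, vocab_size):
    """Pool visual-word counts per poselet cluster, then concatenate.

    activations: list of (poselet_id, visual_word_ids) pairs, one per fired poselet.
    cluster_of:  dict poselet_id -> cluster index (hypothetical clustering result).
    """
    pooled = [[0.0] * vocab_size for _ in range(num_clusters)]
    for poselet_id, words in activations:
        cluster = cluster_of[poselet_id]
        for word in words:
            pooled[cluster][word] += 1.0

    descriptor = []
    for hist in pooled:
        total = sum(hist)
        # L1-normalize each cluster histogram; empty clusters stay all-zero.
        descriptor.extend(h / total if total else 0.0 for h in hist)
    return descriptor


# Poselets 90 and 65 are assumed to share a cluster; poselet 66 has its own.
example = pose_pooled_descriptor(
    activations=[(90, [0, 0, 1]), (66, [2]), (65, [1])],
    cluster_of={90: 0, 66: 1, 65: 0},
    num_clusters=2,
    vocab_size=3,
)
```

Features from poselets in the same cluster land in the same histogram regardless of where they fired in the image, which is the contrast with a fixed spatial pyramid.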


EXPERIMENTAL RESULTS

P. Welinder, S. Branson, T. Mita, C. Wah, F. Schroff, S. Belongie, P. Perona. CalTech/UCSD, 2010. http://www.vision.caltech.edu/visipedia/CUB-200.html

EXPERIMENTAL RESULTS

• Implementation Details
  – Poselet Activations
    • Activations are generated by comparing each poselet’s keypoint distributions with the annotated keypoint locations on each image.
  – Appearance Descriptor
    • Bag of words on SIFT features / Spatial Pyramid Match on top of SIFT.
  – SVM classifier
    • Linear SVM / fast implementation of χ2 and intersection kernels.
  – Baseline methods
    • VLFeat toolbox SIFT on the bounding-box object, without pose-normalized information.
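For reference, the two histogram kernels named above can be written in a few lines. This is a plain, unoptimized sketch of the standard additive χ2 and intersection kernels, not the fast implementation the slide refers to; the function names are chosen for the example.

```python
def intersection_kernel(x, y):
    """Histogram intersection: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


def chi2_kernel(x, y, eps=1e-12):
    """Additive chi-squared kernel: sum of 2*x_i*y_i / (x_i + y_i).

    Bins where both histograms are (near) zero contribute nothing.
    """
    return sum(2.0 * a * b / (a + b) for a, b in zip(x, y) if a + b > eps)


hist_a = [0.5, 0.5, 0.0]
hist_b = [0.25, 0.25, 0.5]
k_int = intersection_kernel(hist_a, hist_b)
k_chi2 = chi2_kernel(hist_a, hist_b)
```

For L1-normalized histograms both kernels equal 1 when the two histograms are identical.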

CATEGORIZATION RESULTS

Baseline - VLFeat 29.73% (MAP)

Warped Feature 36.33% (MAP)

Pose Pooling Kernel 40.60% (MAP)

Confusion matrices on 14 categories across two bird families {vireos, woodpeckers}.

MORE RESULTS

• Detection
• Better descriptors
– Physically inspired color representations
– Salience cues to reduce false positive poselet responses
• Experiments on larger datasets…

FUTURE WORK

Caltech UCSD Berkeley

For more information contact: Ryan Farrell - [email protected]

“Sweet” Taster-25 (released June 2012)

“Bitter” Sparrows-33 (will be released this Fall with the full 700+ category dataset)

Bounding Box

Segmentation Mask

Part Keypoints

Part Regions

Attributes

Among the Key Innovations
• Photos collected via enthusiast photographer submissions and annotated by citizen scientists
• Expert-curated collection (category for each image is verified by a domain expert)
• Deep Taxonomy and tree-based distance measures

The dataset is being collected with tools designed for re-use in other fine-grained domains

D( , ) << D( , )


Today:
• Pose Pooling Kernels for Sub-category Recognition
• Generalization in Large-Scale Concept Hierarchies
• Timely Recognition

BAYESIAN CONCEPT GENERALIZATION

Yangqing Jia, Joshua Abbott, Joe Austerweil, Tom Griffiths, Trevor Darrell

TOWARDS LARGE-SCALE FINE-GRAINED CATEGORY RECOGNITION

Current ML algorithms:
– hundreds of positive examples
– thousands of negative examples
– same learning method throughout hierarchy

Humans:
– can learn from one, two, or three examples
– often no negative examples
– more generalization for broader concepts (“size principle”)

LEARNING A NOUN

“blicket” “blicket” “blicket”

BAYESIAN INFERENCE

$$p(h \mid d) = \frac{p(d \mid h)\, p(h)}{\sum_{h' \in H} p(d \mid h')\, p(h')}$$

p(h | d): posterior probability
p(d | h): likelihood
p(h): prior probability
The sum in the denominator runs over the space of hypotheses H.

h: hypothesis

d: data

LEARNING NOUNS

• Data – object-word pairs
• Hypotheses – sets of objects
• Likelihood
  – weak sampling
  – strong sampling

(Tenenbaum, 1999; Xu & Tenenbaum, 2007)

[Figure: graphical models over hypothesis h, object x, and word w for weak vs. strong sampling]

“blicket”

p(d|h) = 0

“blicket”

p(d|h) = 1/3

“blicket” “blicket” “blicket”

p(d|h) = (1/3)^3

“blicket”

p(d|h) = 1/12

“blicket” “blicket” “blicket”

p(d|h) = (1/12)^3
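The numbers on these slides follow from the strong-sampling likelihood: each example is assumed to be drawn uniformly from the hypothesis, so n consistent examples have likelihood (1/|h|)^n, and an inconsistent example gives 0. The sketch below reproduces that arithmetic and the resulting posterior. The hypothesis names, the object identifiers, and the uniform prior are assumptions for illustration only.

```python
def strong_sampling_likelihood(examples, hypothesis):
    """p(d|h) under strong sampling: (1/|h|)^n if all examples lie in h, else 0."""
    if not all(x in hypothesis for x in examples):
        return 0.0
    return (1.0 / len(hypothesis)) ** len(examples)


def posterior(examples, hypotheses, prior):
    """Bayes' rule over a finite hypothesis space (assumes some h is consistent)."""
    scores = {
        name: prior[name] * strong_sampling_likelihood(examples, members)
        for name, members in hypotheses.items()
    }
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}


# Hypothetical nested hypotheses with 3 and 12 objects, as in the slide figures.
hypotheses = {"narrow": set(range(3)), "broad": set(range(12))}
prior = {"narrow": 0.5, "broad": 0.5}

after_one = posterior([0], hypotheses, prior)
after_three = posterior([0, 1, 2], hypotheses, prior)
```

With one example the narrow hypothesis is already favored; with three examples that all fall inside it, the broad hypothesis becomes very unlikely. This is the "size principle" mentioned earlier.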


Principles

Hypotheses

Data

Whole-object principle Shape bias Taxonomic principle Contrast principle Basic-level bias

PREVIOUS WORK (XU & TENENBAUM, 2007)

PREVIOUS RESULTS (SMALL-SCALE)

• Small, hand-constructed domains

• Toy stimuli

• Constructing hypothesis space based on pairwise similarity judgments requires O(n^2) judgments

Solutions

• Hypothesis space is automatically derived from WordNet structure

Challenges

LARGE-SCALE WORD LEARNING


LARGE-SCALE WORD LEARNING

Here are three BLICKETS

Here are five FEPS

Here are four ZIVS Is this a BLICKET?

Is this a ZIV?

Is this a FEP?

Here are three FEPS

Can you help Mr. Frog find the other “FEPS”?


LARGE SCALE MODEL RESULTS

[Abbott, Austerweil, and Griffiths, Constructing a hypothesis space from the web for large scale Bayesian word learning, COGSCI 2012]

GENERALIZING TO NEW IMAGES

• Bayesian word learning offers advantages over standard machine learning approaches.

• The first challenge, scaling, has been solved (going from 45 to 14 million objects), but only using ImageNet images and their location in the hierarchy.

• A recent extension of the model incorporates direct pixel observations and models noisy recognition; it supports generalization from arbitrary objects and better predicts human data…

Task

renuzit air freshener

coke-can

pringles

pasta-box

book-robotics

tea-box

leaf-node classes from cropped PR2 camera images…

Image Classification Pipeline

• A convolutional neural network (CNN) pipeline is well suited to this type of data ([Jia et al CVPR12])

local feature coding

spatial pooling

SVM classification

“pasta box”


Densely-coded Local Features

• We extract overlapping local image patches

Encoding Local Features

local features

dictionary (learned in an unsupervised fashion)

Encoded activation maps

Spatial Pooling

• Take max on a regular grid (or learned spatial bins [cvpr12])
• Convert to a feature vector
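A minimal sketch of the regular-grid variant of this step, assuming each encoded activation map is a 2D list of numbers at least `grid` cells wide and tall. The function names and the 2×2 default grid are choices made for the example; learned spatial bins are not covered.

```python
def grid_max_pool(activation_map, grid=2):
    """Max-pool one 2D activation map over a grid x grid partition (row-major)."""
    height, width = len(activation_map), len(activation_map[0])
    pooled = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            pooled.append(max(activation_map[y][x]
                              for y in range(y0, y1)
                              for x in range(x0, x1)))
    return pooled


def pool_to_vector(activation_maps, grid=2):
    """Concatenate the pooled cells of every dictionary entry's map."""
    vector = []
    for activation_map in activation_maps:
        vector.extend(grid_max_pool(activation_map, grid))
    return vector


one_map = [
    [1, 2, 0, 0],
    [3, 4, 0, 5],
    [0, 0, 9, 1],
    [7, 0, 1, 1],
]
pooled_cells = grid_max_pool(one_map)
feature_vector = pool_to_vector([one_map, one_map])
```

With K dictionary entries and a g×g grid the resulting vector has K·g·g entries, which is what the classifier in the next stage would consume.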

The whole pipeline

“pasta box”

Learning from Examples

• “Dear Robot, here are some feps”

• “Now get me more feps / are these also feps?”

(pasta-box) (peanut-butter) (spam)

Learning from Examples

• We assume a hierarchy of hypotheses from WordNet or provided in a robot’s environment.

• Each hypothesis defines a subset of the leaf nodes (instances) that belong to it:
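The mapping from hierarchy nodes to leaf subsets can be sketched as a simple tree traversal. The tiny hierarchy below is invented for illustration: the leaf names echo the object classes shown earlier, but the internal nodes ("object", "box", "can") and their edges are assumptions, not the hierarchy used in the talk.

```python
def leaf_sets(tree, root):
    """Map every node to the frozenset of leaf nodes beneath it.

    tree: dict parent -> list of children; nodes absent from the dict are leaves.
    """
    result = {}

    def visit(node):
        children = tree.get(node, [])
        if not children:
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(visit(child) for child in children))
        result[node] = leaves
        return leaves

    visit(root)
    return result


# Hypothetical hierarchy over a few leaf classes.
toy_tree = {
    "object": ["box", "can"],
    "box": ["pasta-box", "tea-box"],
    "can": ["coke-can", "pringles"],
}
hypothesis_sets = leaf_sets(toy_tree, "object")
```

Each entry of `hypothesis_sets` is one candidate hypothesis: a leaf is the most specific concept, and the root covers every instance.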

These are feps

This is probably not a fep

These are not feps

Visually Grounded Word Learning at a Large Scale

• We tested this model at large scale with ImageNet

[Jia, Abbott, Austerweil, Griffiths, Darrell. 2012]


Human Behavior

More specific concepts More general concepts

More specific queries More general queries


Result Comparisons

Human Behavior

Our Model

Classical Concept Learning (without vision)

Classical Vision (without concept learning)

Video

Today:
• Pose Pooling Kernels for Sub-category Recognition
• Generalization in Large-Scale Concept Hierarchies
• Timely Recognition


Sergey Karayev, Tobi Baumgartner, Mario Fritz, Trevor Darrell NIPS 2012

Timely Object Recognition

Timely Object Recognition

• Lots of classes and images in a large-scale setting…
• Potentially different class values…
• Not enough time to run all object detectors…

Our Solution: Dynamic policy for selecting detectors

Vijayanarasimhan and Kapoor. Visual Recognition and Detection Under Bounded Computational Resources. CVPR 2010.
Gao and Koller. Active Classification based on Value of Classifier. NIPS 2011.

Recent Related Work


New Metric: Performance (AP) vs. Time

Area under the AP vs. T curve between start and deadline times
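One way to compute such an area, assuming AP has been measured at a few strictly increasing time points and is interpolated linearly between them. The function name, the interpolation, and the constant extension outside the measured range are assumptions for this sketch; any normalization used in the actual evaluation is not shown.

```python
def area_under_ap_vs_time(times, aps, t_start, t_deadline):
    """Trapezoidal area under a piecewise-linear AP(t) curve on [t_start, t_deadline].

    times must be strictly increasing; AP is held constant outside the measured range.
    """
    def ap_at(t):
        if t <= times[0]:
            return aps[0]
        if t >= times[-1]:
            return aps[-1]
        segments = zip(zip(times, aps), zip(times[1:], aps[1:]))
        for (t0, a0), (t1, a1) in segments:
            if t0 <= t <= t1:
                return a0 + (a1 - a0) * (t - t0) / (t1 - t0)

    # Integration knots: the window ends plus every measurement inside the window.
    knots = [t_start] + [t for t in times if t_start < t < t_deadline] + [t_deadline]
    values = [ap_at(t) for t in knots]
    return sum(0.5 * (v0 + v1) * (k1 - k0)
               for k0, k1, v0, v1 in zip(knots, knots[1:], values, values[1:]))


times = [0.0, 10.0, 20.0]
aps = [0.0, 0.4, 0.5]
full_area = area_under_ap_vs_time(times, aps, 0.0, 20.0)
window_area = area_under_ap_vs_time(times, aps, 5.0, 15.0)
```

A policy that reaches high AP early accumulates more area than one that reaches the same final AP late, which is the behavior the metric is meant to reward.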

Sequential Detection

[Diagram: Image → Belief State → Action → Observations → Belief State → Action → Observations → etc., unrolled over Time. Action selection: maximize expected value.]

Actions

- scene context or object detector actions
- run on the whole image
- generate observations:
  - list of detections
  - GIST feature, etc.


Sequential Detection

- action selection: maximize expected value
- execute action (“black box”)
- receive observations
- belief state update with observations, leverage context


Class presence probabilities inferred from observations

Belief state update


policy:

action-value function:

assuming linear structure:

reward definition

Action selection maximize expected value
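The equations on this slide did not survive extraction. As a generic illustration of action selection with a linear action-value function, the sketch below scores each untaken action as a dot product between a weight vector and the current features, then picks the maximum. The per-action weight layout, the action names, and the feature values are all hypothetical and are not taken from the talk.

```python
def q_value(theta, features):
    """Linear action-value estimate: dot product of weights and state features."""
    return sum(t * f for t, f in zip(theta, features))


def select_action(theta_by_action, features, already_taken=()):
    """Pick the untaken action with the highest estimated value."""
    candidates = {
        action: q_value(theta, features)
        for action, theta in theta_by_action.items()
        if action not in already_taken
    }
    return max(candidates, key=candidates.get)


# Hypothetical weights for three actions over two belief-state features.
theta_by_action = {
    "gist": [0.1, 0.0],
    "dog_detector": [0.0, 0.5],
    "car_detector": [0.3, 0.1],
}
features = [1.0, 0.8]

first = select_action(theta_by_action, features)
second = select_action(theta_by_action, features, already_taken={first})
```

In a full loop the features would be recomputed from the updated belief state after each action, so the ranking of the remaining actions can change over time.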

Reward definition: derived from the AP vs. Time metric

policy:

action-value function:

assuming linear structure:

learning the policy parameters

Action selection maximize expected value


Learning the policy

• Sample the expectation using Q-Iteration:

• collect (state,action,reward) tuples by running episodes

• solve for weights with L1 regression

• Cross-validate the discount

• When the discount is 0, the learned policy is greedy.

• When the discount is 1, the policy looks ahead to the end of the episode.
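The effect of the discount can be seen in the regression targets built from one episode's rewards. This sketch only computes discounted returns; the regression that fits policy weights to such targets is omitted, and the reward values are made up.

```python
def discounted_returns(rewards, gamma):
    """Return, for each step, the discounted sum of rewards from that step onward."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for i in reversed(range(len(rewards))):
        running = rewards[i] + gamma * running
        returns[i] = running
    return returns


episode_rewards = [1.0, 2.0, 3.0]          # hypothetical per-action rewards
greedy_targets = discounted_returns(episode_rewards, gamma=0.0)
lookahead_targets = discounted_returns(episode_rewards, gamma=1.0)
mixed_targets = discounted_returns(episode_rewards, gamma=0.5)
```

With a discount of 0 each target is just the immediate reward; with a discount of 1 each target is the total reward remaining in the episode.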

Feature Representation & Learned Policy Weights

Features:
- Prior probability of detector class presence
- Class presence probabilities & entropies given observations
- Time features (time to deadline, etc.)

[Figures: learned weight matrices (actions × features) for the greedy policy and the “RL” policy]

Policy Trajectories

Evaluation Results

PASCAL VOC 2007 DPM detectors Mean Detection AP


Today:
• Pose Pooling Kernels for Sub-category Recognition
• Generalization in Large-Scale Concept Hierarchies
• Timely Recognition

• Pose matters! – Distinctive fine-grained features should be relative to object configuration…

• Size matters! – Degree of generalization varies across category hierarchies and enables learning from few training examples…

• Plan ahead! – Learn what to look for, and when, in a large-scale setting to maximize time-constrained value...

TOWARDS LARGE-SCALE FINE-GRAINED RECOGNITION

• Ning Zhang, Ryan Farrell, Trevor Darrell, “Pose Pooling Kernels for Sub-category Recognition”, CVPR 2012

• Yangqing Jia, Joshua Abbott, Joe Austerweil, Tom Griffiths, Trevor Darrell, “Bayesian Concept Generalization”, UCB EECS TR

• Sergey Karayev, Tobi Baumgartner, Mario Fritz, Trevor Darrell, “Timely Object Recognition”, NIPS 2012

FOR MORE INFORMATION…
