46
Outline 1. Bag of visual words model for categorization SVM classifier 2. Adding spatial information for localization 3. Databases and challenges 4. Spatial layout 5. Class based segmentation Pixel level localization 6. Conclusions and the future

Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Outline

1. Bag of visual words model for categorization

• SVM classifier

2. Adding spatial information for localization

3. Databases and challenges

4. Spatial layout

5. Class based segmentation

• Pixel level localization

6. Conclusions and the future

Page 2: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Beyond a bag of visual words

Page 3: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Outline

• Region of Interest (ROI)

• jumping/sliding window for localization

• Spatial tiling

• Histogram of Gradients (HOG)

• Spatial pyramid

• Case study

Page 4: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

• Problem of background clutter • Use a sub-window

– At correct position, no clutter is present

Region of Interest (ROI)

Page 5: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

– Scale / orientation range to search over – Speed– Context

Sliding window detection

Page 6: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

search over scale

SmallestScale

LargerScale

Page 7: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Problems with sliding windows …

• aspect ratio

• granuality (finite grid)

• partial occlusion

• multiple responses

See recent work by

• Christoph Lampert et al CVPR 08, ECCV 08

• Bosch et al BMVC 08

Page 8: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Sliding window• Classifier: SVM with linear kernel

• bag of visual word representation of ROI

• Stronger training: ROI on object instance

Example detections for dog

Lampert et al CVPR 08

Page 9: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

More spatial information - tilingUse spatial grid to define correspondence

If codebook has V visual words, then representation has dimension 4V

Fergus et al ICCV 05

• parameter: number of tiles

Page 10: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Ex: Leibe & Schiele 03/04 : Generalized Hough Transform

• Learning: for every cluster, store possible “occurrences”

• Recognition: for new image, let the matched patches vote for possible object positions

Page 11: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Voting Space(continuous)

Interest Points Matched Codebook Entries

Probabilistic Voting

Backprojectionof Maximum

Ex: Leibe & Schiele 03/04 : Generalized Hough Transform

Page 12: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

More features – histogram of orientations

Counts in orientation bins can be thought of as visual words

imagedominant direction HOG

frequ

ency

orientation

• tiling

• each tile represents HOG

• dense descriptor

Page 13: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Ex 1: Human (Pedestrian) DetectionHistograms of Oriented Gradients for Human DetectionDalal & Triggs, CVPR 2005

Page 14: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

• training: ROI over pedestrian

• classification: linear SVM on HOG

• NB similarity to SIFT, GIST

Page 15: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Dalal and Triggs, CVPR 2005

Page 16: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Learned model

Slide from Deva Ramanan

f(x) = w>x+ b

Page 17: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Slide from Deva Ramanan

Page 18: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Ex 2: Upper body detector – using HOGs

• Ferrari et al CVPR 08

average training data

Page 19: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

More features – dense visual words

Vogel & Schiele 2004, Jurie & Triggs ICCV 05 , Fei-Fei & Perona CVPR 05, Bosch et al ECCV 06

DENSE PATCHES

Row reorder gray valuesand form a vector ofsize N2

Parameters: N – size of patchM – distance between patches

Textons

… …… …… …

128- SIFT descriptor

Parameters: r – radi of patchM – distance between patches

SIFT

Luong & Malik 1999, Varma & Zisserman 2003

Page 20: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

More Spatial information – Pyramid kernels

• Divide image into grids of varying resolution, and give more weight to agreement in finer grids.– 2^l grids at level l

• Intersect histograms, multiply by weight.

Lazebnik et al. [CVPR 2006]

Page 21: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Based on Grauman & Darrell ICCV 05

Spatial Pyramid Kernels for Geometry/Appearance Matching(Lazebnik et al CVPR’06)

More Spatial information – Pyramid kernels

Page 22: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik
Page 23: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

9 matches x 1

= 9

Page 24: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

9 matches x 1

= 9

Page 25: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

9 matches x 1 4 matches x ½

= 9 = 2

Page 26: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

9 matches x 1 4 matches x ½

= 9 = 2

Page 27: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

9 matches x 1 4 matches x ½ 2 matches x ¼

= 9 = 2 = 1/2

Total matching weight (value of spatial pyramid kernel ): 9 + 2 + 0.5 = 11.5

Page 28: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

l = 0 l = 1 l = 2

+ +

Pyramid spatial layout for appearance patches – for images

BoW 4BoW 16BoW

Represent appearance as dense grid of visual words

Page 29: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

l = 1

l = 0

x y

Generalizations• Use chi-squared kernel instead of histogram intersection

Kf(i, j) =Xl∈L

βlfe−μχ2(hlf(i),hlf(j))

Page 30: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

l = 0 l = 1 l = 2

1HOG 4HOG 16HOG

Pyramid HOG – for imagesRepresent local orientated gradients

Page 31: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Pyramid HOG for image regions0 1 LEVEL 2

Page 32: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Pyramid HOG for image regions0 1 LEVEL 2

Page 33: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Case study: scene classification

Suburb Bedroom Kitchen Living room Office

Coast Forest Mountain Open country River Sky/clouds

Vog

el&

Sch

iele

-VS

FeiF

ei&

Per

ona

-FP

Coast Forest Mountain Open country Highway Inside city Tall building Street

Oliv

a &

Tor

ralb

a -O

T

Laze

bnik

et a

l. -L

SP

Store Industrial

Page 34: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Vogel & Schiele DATASET

Coast Forest Mountain

Open country River Sky/clouds

702 images VS dataset6 categories

Page 35: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Oliva & Torralba DATASET

Coast Forest Mountain Open country

Highway Inside city Tall building Street

2688 images8 categories

OT dataset

Page 36: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

3759 images13 categories

FP dataset

Fei-Fei & Perona DATASET

Suburb Bedroom Kitchen Living room Office

Coast Forest Mountain Open country Highway Inside city Tall building Street

Page 37: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Lazebnik, Schmid & Ponce DATASET

Suburb Bedroom Kitchen Living room Office

Coast Forest Mountain Open country Highway Inside city Tall building Street

Store Industrial4385 images15 categories

LSP dataset

Page 38: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Mulit-way Classification

• For each class ‘c’ learn a 1-vs-rest SVM classifier

• Classification of test image I according to:

• where is the distance for the SVM for class c

c∗ = argmaxcDc(I)

Dc(I)

Page 39: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

VisualVocabulary

w1

w2

w3w4

……

……

Features

• bag of visual words

• HOG

• spatial pyramid of visual words

• spatial pyramid HOG

Parameters• vocabulary size V

• level weightings

• feature combination weights

l = 0 l = 1 l = 2

Page 40: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Dataset

Validation set

100

Learn themodels

Others

½ Training ½ Testing

Classification

Methodology for learning parameter values

• optimize classification performance on a validation set

• 1 vs rest SVM classifier

Page 41: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

SVM-BOW

60

65

70

75

80

85

90

200 500 1000 1500 2000 2500 5000

V

Per

form

ance

(%)

SVM-BOW

Optimize vocabulary size V on validation set

2688 images8 categories

OT dataset

Page 42: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Spatial Pyramid with optimizationSpatial Pyramid with optimizationSpatial pyramid with learnt weights

without optimization:

with optimization:

Kf(i, j) =Xl∈L

βlfe−μχ2(hlf(i),hlf(j))

• Learn level weights – linear combination of kernels

level weightbase kernel

dense visual words

• if weights common to all classes:

β0 = 0.25, β1 = 0.25, β2 = 0.5

β0 : β2 = 1.0, β1 : β2 = 0.8

Page 43: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Optimized values for each datasetSame number of Images as authors

SVM one against all – x2 kernel for visual words Up to L = 2 for spatial pyramid

88.0

100

89.2

78.3

93.6

91.1

90.8

97.5

Coast

Forest

Mountain

Op. Country

Highway

Inside City

Street

Tall building

Ope

nco

untry

Mou

ntai

ns

Coa

st

Ope

nco

untry

Hig

hway

Stre

et

Page 44: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Spatial Pyramid with optimizationSpatial Pyramid with optimizationSpatial pyramid with feature combination

Lazebnik et al.dense visual words

dense visual words optimized

dense visual words & HOG optimized

81.1 83.5 90.2

Kopt(i, j) =Xf∈F

dfKf(i, j)

• feature weights – linear combination of kernels

feature weightfeature kernel

• dense visual words

• HOG

Page 45: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

Take home messages

• Lite use of spatial information

• tiling, spatial pyramid

• Combination of features

• visual words, sparse, dense, HOG

• Learn parameters on validation set

Page 46: Pixel level localizationaz/icvss08_az_spatial.pdf · 2009-10-11 · Spatial Pyramid with optimization Spatial pyramid with feature combinationSpatial Pyramid with optimization Lazebnik

More classifiers …• SVM Classifier

• good performance

• convex optimization

• Logistic regression

• Adaboost

• e.g. used by Viola & Jones face detector

• slow to learn, fast to test

• Random forests

• fast to learn, fast to test

• Jamie Shotton tutorial