The Problem
Object categories are often modeled by collections (bag-of-features) or constellations (pictorial structures) of local features.
Many simple, shape-based objects don't have any discriminative local appearance features.
Slide 3
The Multi-Local Feature
A specific spatial constellation of oriented edgels (or other local content).
Captures global shape properties.
Weak detector of shape-based object categories.
Described by a coordinate vector: (x_1, ..., x_12).
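To make the coordinate vector concrete, here is a minimal sketch. It assumes each x_i is the 2D image position of one edgel and leaves the edgel orientation out; the slides do not say how orientation is encoded. The name `coordinate_vector` and the circle data are hypothetical.

```python
import numpy as np

def coordinate_vector(edgels):
    """Stack k edgel positions (x, y) into one vector of length 2k.

    `edgels` must be listed in a fixed order, so that entry i always
    refers to the same part of the object outline (an assumption here).
    """
    pts = np.asarray(edgels, dtype=float)
    if pts.ndim != 2 or pts.shape[1] != 2:
        raise ValueError("expected a sequence of (x, y) pairs")
    return pts.reshape(-1)

# Twelve hypothetical edgel positions sampled along a circular outline.
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
edgels = np.stack([np.cos(angles), np.sin(angles)], axis=1)
v = coordinate_vector(edgels)  # vector of length 24
```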
Slide 4
Modeling Intra-Class Variation
Slide 5
1. Generate coordinate vectors by clicking corresponding edgels in a (small) number of training images.
2. Align the coordinate vectors with respect to a similarity transform.
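The alignment in step 2 could be sketched as a least-squares similarity (Procrustes) fit of each clicked shape to a reference shape. This is one standard way to do it, not necessarily the procedure used for the slides; `align_similarity` and the test data are hypothetical.

```python
import numpy as np

def align_similarity(shape, reference):
    """Align `shape` (k x 2) to `reference` (k x 2) using the least-squares
    similarity transform: translation, rotation and uniform scale."""
    shape = np.asarray(shape, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mu_s, mu_r = shape.mean(axis=0), reference.mean(axis=0)
    s0, r0 = shape - mu_s, reference - mu_r
    # Optimal rotation from the SVD of the 2 x 2 cross-covariance matrix.
    u, sing, vt = np.linalg.svd(s0.T @ r0)
    # Flip the last axis if needed so that the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    fix = np.array([1.0, d])
    rot = u @ np.diag(fix) @ vt
    scale = (sing * fix).sum() / (s0 ** 2).sum()
    return scale * s0 @ rot + mu_r

# Usage: a rotated, scaled and shifted copy is mapped back onto the reference.
rng = np.random.default_rng(0)
ref = rng.normal(size=(12, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
moved = 2.5 * ref @ R + np.array([3.0, -1.0])
aligned = align_similarity(moved, ref)
```

With more than two training shapes, one common choice is to align every shape to the same reference (or to an iteratively re-estimated mean shape).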
Slide 6
Modeling Intra-Class Variation
3. Extend the coordinate vectors into their convex hull.
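As an illustration of step 3, the sketch below draws new exemplars as random convex combinations of the aligned training vectors. The Dirichlet weights are an assumption made for the example; the slides only state that the hull is used. `sample_convex_combinations` and the toy exemplars are hypothetical.

```python
import numpy as np

def sample_convex_combinations(exemplars, n_samples, rng=None):
    """Draw random points from the convex hull of the rows of `exemplars`."""
    rng = np.random.default_rng() if rng is None else rng
    exemplars = np.asarray(exemplars, dtype=float)
    # Dirichlet weights are non-negative and sum to one in every row.
    weights = rng.dirichlet(np.ones(len(exemplars)), size=n_samples)
    return weights @ exemplars, weights

# Three hypothetical aligned coordinate vectors (rows), here of length 4.
exemplars = np.array([[0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.2, 1.0, 0.1],
                      [0.1, 0.0, 0.9, 0.0]])
samples, weights = sample_convex_combinations(
    exemplars, 5, np.random.default_rng(0))
```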
Slide 7
Detection
Slide 8
For each occurrence x_1 of c_1
  For each consistent occurrence x_2 of c_2
    Sample from p(x_4, x_3 | x_2, x_1) to hypothesize image locations of c_3 and c_4
    Sample image edgels
    Compute normalized distance to convex hull of training features
    If distance is below threshold, an instance was detected
  End For
End For
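The distance-to-hull test in the loop can be sketched as a small optimization over convex weights. The version below uses Frank-Wolfe iterations with an exact line search and returns the plain Euclidean distance; how the distance is normalized is not specified in the slides, so that step is left out. `distance_to_hull` and the unit-square exemplars are hypothetical.

```python
import numpy as np

def distance_to_hull(y, exemplars, n_iter=500):
    """Approximate Euclidean distance from `y` to the convex hull of the
    rows of `exemplars`, via Frank-Wolfe over the convex weights."""
    X = np.asarray(exemplars, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.full(len(X), 1.0 / len(X))      # start at the hull's centroid
    for _ in range(n_iter):
        p = w @ X                          # current point inside the hull
        grad = X @ (p - y)                 # gradient of 0.5*||p - y||^2 in w
        j = int(np.argmin(grad))           # exemplar that most reduces the error
        d = X[j] - p                       # direction towards that exemplar
        denom = d @ d
        if denom == 0.0:
            break
        step = np.clip((y - p) @ d / denom, 0.0, 1.0)  # exact line search
        if step == 0.0:                    # no improving direction is left
            break
        w *= 1.0 - step
        w[j] += step
    return float(np.linalg.norm(w @ X - y)), w

# Usage: the hull of the unit square's corners is the square itself.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
d_inside, _ = distance_to_hull([0.5, 0.5], square)
d_outside, w_outside = distance_to_hull([2.0, 0.5], square)
```

A hypothesis would then count as a detection when the (suitably normalized) distance falls below the chosen threshold.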
Slide 16
Experiments
Detection performance was evaluated on a standard database (ETHZ Shape Classes) to investigate two questions:
Is a multi-local feature a good weak detector?
How many local features should be used?
Slide 17
Mugs - Training
[Figure: example training images with the marked edgels numbered 1-14]
25 training images were downloaded from Google Images.
14 edgels constituting a multi-local feature were marked in each training image.
Slide 18
Mugs - Results
Slide 19
Performance decreases when adding more than 9 local features.
[Plot annotation: 0.4, 60.6 %]
Slide 20
Bottles - Training
[Figure: example training images with the marked edgels numbered 1-12]
25 training images were downloaded from Google Images.
12 edgels constituting a multi-local feature were marked in each training image.
Slide 21
Bottles - Results
Slide 22
[Plot annotation: 0.4, 72.7 %]
Slide 23
Apple logos - Training
20 training images were downloaded from Google Images.
12 edgels constituting a multi-local feature were marked in each training image.
Slide 24
Apple logos - Results
Slide 25
Performance decreases when adding more than 11 local features.
[Plot annotation: 0.4, 77.3 %]
Slide 26
Conclusions
A multi-local feature is a good weak detector of shape-based object categories.
The best performance is achieved with multi-local features that have a moderate number of local features.
Convex combinations of valid exemplars are in general also valid exemplars (we can extend a few training examples into their convex hull).
Slide 27
Future Work
Automatic learning of multi-local features.
Combining multi-local feature detectors into an object detection system.
Slide 28
Related Work
Pictorial Structures, e.g. Felzenszwalb, Huttenlocher. Pictorial Structures for Object Recognition, IJCV No. 1, January 2005.
Constellation Models, e.g. Fergus, Perona, Zisserman. Object class recognition by unsupervised scale-invariant learning, CVPR 2003.
Differences: different detection methods; use rich local features.
Slide 29
Thanks!
Slide 30
Representation
The multi-local feature manifold consists of all convex combinations of the training examples.
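Written out, this set could be stated as follows. The notation is an assumption introduced here: x^(1), ..., x^(n) denote the n aligned training coordinate vectors and w_i the convex weights.

```latex
\mathcal{M} \;=\; \Bigl\{\, \sum_{i=1}^{n} w_i \,\mathbf{x}^{(i)}
  \;:\; w_i \ge 0 \ \text{for all } i,\ \ \sum_{i=1}^{n} w_i = 1 \,\Bigr\}
```

Every training example lies in this set (take one weight equal to 1 and the rest 0), and any point of the set is a weighted average of training examples.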