Multi-Local Feature Manifolds for Object Detection Oscar Danielsson ([email protected]) Stefan Carlsson ([email protected]) Josephine Sullivan ([email protected])


  • Slide 1
  • Multi-Local Feature Manifolds for Object Detection Oscar Danielsson ([email protected]) Stefan Carlsson ([email protected]) Josephine Sullivan ([email protected]) DICTA08
  • Slide 2
  • The Problem: Object categories are often modeled by collections (bag-of-features) or constellations (pictorial structures) of local features. Many simple, shape-based objects don't have any discriminative local appearance features.
  • Slide 3
  • The Multi-Local Feature: A specific spatial constellation of oriented edgels (or other local content). Captures global shape properties. Weak detector of shape-based object categories. Described by a coordinate vector: $(x_1, \ldots, x_{12})$.
  • Slide 4
  • Modeling Intra-Class Variation
  • Slide 5
  • 1. Generate coordinate vectors by clicking corresponding edgels in a (small) number of training images. 2. Align the coordinate vectors with respect to similarity transforms.
  • Slide 6
  • Modeling Intra-Class Variation 3. Extend coordinate vectors into their convex hull
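The alignment in step 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each coordinate vector holds 2D edgel positions only (orientations are ignored) and aligns each vector to a chosen reference by a least-squares similarity transform. The function name `align_similarity` and the synthetic data are hypothetical.

```python
import numpy as np

def align_similarity(points, reference):
    """Least-squares similarity alignment of 2D points onto a reference.

    points, reference: arrays of shape (n, 2) with corresponding rows.
    Returns a transformed copy of `points` (translation, rotation, scale).
    """
    # Treat each 2D point as a complex number; a similarity transform
    # is then z -> a * z + b with complex a (rotation and scale) and b.
    z = points[:, 0] + 1j * points[:, 1]
    w = reference[:, 0] + 1j * reference[:, 1]
    zc = z - z.mean()                      # remove translation
    wc = w - w.mean()
    a = np.vdot(zc, wc) / np.vdot(zc, zc)  # optimal rotation and scale
    aligned = a * zc + w.mean()
    return np.column_stack([aligned.real, aligned.imag])

# Synthetic check: a rotated, scaled, shifted copy maps back onto the reference.
rng = np.random.default_rng(0)
reference = rng.normal(size=(12, 2))       # 12 edgels, as on the slide
theta, scale, shift = 0.7, 2.5, np.array([3.0, -1.0])
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = scale * reference @ R.T + shift
aligned = align_similarity(moved, reference)
```

After alignment, each training example can be flattened into one coordinate vector; the convex hull in step 3 is then taken over those vectors.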
  • Slide 7
  • Detection
  • Slide 8
  • For each occurrence $x_1$ of $c_1$:
      For each consistent occurrence $x_2$ of $c_2$:
        Sample from $p(x_4, x_3 \mid x_2, x_1)$ to hypothesize image locations of $c_3$ and $c_4$
        Sample image edgels
        Compute normalized distance to convex hull of training features
        If distance is below threshold, an instance was detected
      End For
    End For
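The distance test in the loop above can be illustrated with a short sketch. The slides do not say how the distance is computed or normalized, so everything here is an assumption: the code returns the plain Euclidean distance from a candidate vector to the convex hull of the exemplars, approximated with Frank-Wolfe iterations, and the function name and threshold value are hypothetical.

```python
import numpy as np

def distance_to_convex_hull(exemplars, query, max_iter=5000, tol=1e-12):
    """Approximate Euclidean distance from `query` to conv(exemplars).

    exemplars: array (m, d), one aligned training coordinate vector per row.
    query:     array (d,), the candidate coordinate vector.
    Uses Frank-Wolfe iterations with exact line search.
    """
    p = exemplars[0].astype(float)               # start at one exemplar
    for _ in range(max_iter):
        r = p - query                            # gradient of 0.5 * ||p - query||^2
        v = exemplars[np.argmin(exemplars @ r)]  # exemplar minimizing the linearization
        d = v - p
        gap = -(r @ d)                           # Frank-Wolfe gap, always >= 0
        if gap <= tol:
            break
        gamma = min(1.0, gap / (d @ d))          # exact line search, clipped to [0, 1]
        p = p + gamma * d
    return float(np.linalg.norm(p - query))

# Toy example in 2D: the hull is a triangle.
triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
inside = distance_to_convex_hull(triangle, np.array([0.25, 0.25]))
outside = distance_to_convex_hull(triangle, np.array([1.0, 1.0]))
threshold = 0.1                                  # hypothetical detection threshold
detected = [dist < threshold for dist in (inside, outside)]
```

In the toy example the first query lies inside the triangle and is accepted, while the second lies outside and is rejected. A quadratic-programming solver would give the same distance more directly; Frank-Wolfe is used here only to keep the sketch dependency-free.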
  • Slide 16
  • Experiments: Detection performance was evaluated on a standard database (ETHZ Shape Classes). Questions investigated: Is a multi-local feature a good weak detector? How many local features should be used?
  • Slide 17
  • Mugs - Training [Figure: example training images with the marked edgels labeled 1 to 14] 25 training images were downloaded from Google images. 14 edgels constituting a multi-local feature were marked in each training image.
  • Slide 18
  • Mugs - Results
  • Slide 19
  • Performance decreases when adding more than 9 local features. (Values shown on the plot: 0.4, 60.6 %)
  • Slide 20
  • Bottles - Training [Figure: example training images with the marked edgels labeled 1 to 12] 25 training images were downloaded from Google images. 12 edgels constituting a multi-local feature were marked in each training image.
  • Slide 21
  • Bottles - Results
  • Slide 22
  • (Values shown on the plot: 0.4, 72.7 %)
  • Slide 23
  • Apple logos - Training: 20 training images were downloaded from Google images. 12 edgels constituting a multi-local feature were marked in each training image.
  • Slide 24
  • Apple logos - Results
  • Slide 25
  • Performance decreases when adding more than 11 local features. (Values shown on the plot: 0.4, 77.3 %)
  • Slide 26
  • Conclusions: A multi-local feature is a good weak detector of shape-based object categories. The best performance is achieved by multi-local features with a moderate number of local features. Convex combinations of valid exemplars are in general also valid exemplars (we can extend a few training examples into their convex hull).
  • Slide 27
  • Future Work: Automatic learning of multi-local features. Building combinations of multi-local feature detectors into an object detection system.
  • Slide 28
  • Related Work: Pictorial Structures, e.g. Felzenszwalb, Huttenlocher. Pictorial Structures for Object Recognition, IJCV, No. 1, January 2005. Constellation Models, e.g. Fergus, Perona, Zisserman. Object class recognition by unsupervised scale-invariant learning, CVPR03. Differences: different detection methods; use rich local features.
  • Slide 29
  • Thanks!
  • Slide 30
  • Representation The multi-local feature manifold consists of all convex combinations of the training examples
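Written out, the representation above amounts to the following set. The notation is an assumption introduced here for illustration: $x^{(1)}, \ldots, x^{(m)}$ denote the $m$ aligned training coordinate vectors, $\alpha_i$ the mixing weights, and $y$ a candidate coordinate vector observed in an image. The second line gives the unnormalized distance from $y$ to the manifold; the slides do not specify the normalization used in detection.

```latex
\mathcal{M} = \Bigl\{ \sum_{i=1}^{m} \alpha_i \, x^{(i)} \;:\; \alpha_i \ge 0,\ \sum_{i=1}^{m} \alpha_i = 1 \Bigr\}

d(y, \mathcal{M}) = \min_{\alpha_i \ge 0,\ \sum_i \alpha_i = 1} \Bigl\| \, y - \sum_{i=1}^{m} \alpha_i \, x^{(i)} \Bigr\|_2
```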