
Towards Learning a Semantically Relevant Dictionary for Visual Category Recognition


In Sparsity, Dictionaries and Projections in Machine Learning and Signal Processing, ICML Workshop, Edinburgh, Scotland, 2012.



Ashish Gupta, Richard Bowden
Centre for Vision, Speech, and Signal Processing, University of Surrey, Guildford, United Kingdom

Objective

Transform the feature space rendered by the local-patch affine-invariant feature descriptor into a semantically relevant space for visual categorisation.

Challenge

- Large intra-category visual appearance variation.
- Training data: insufficient, noisy, background clutter.
- The feature descriptor is high-dimensional, sparsely populated, and renders highly inter-mixed vectors in feature space.

Topic ← Σ Words (a topic aggregates visual words)

- Feature space is assumed to have local semantic integrity.
- Intra-category appearance variance is ameliorated.

Grouping Scattered Clusters

- Analyse image-word co-occurrence statistics.
- Similar occurrence ⇒ semantic equivalence.
- Use co-clustering to discover word groups.
- Group such words into topics (see the co-clustering sketch after this list).
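The grouping of visual words into topics can be illustrated by co-clustering the image-word co-occurrence matrix. The sketch below is a minimal illustration using scikit-learn's SpectralCoclustering as a stand-in for the information-theoretic and sum-squared residue co-clustering methods used here; the random co-occurrence matrix and the sizes (200 images, 1,000 words, 50 topics) are placeholder assumptions.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

n_images, n_words, n_topics = 200, 1000, 50

# Placeholder image-word co-occurrence matrix (rows: images, columns: visual words).
rng = np.random.default_rng(0)
cooccurrence = rng.poisson(0.5, size=(n_images, n_words)).astype(float) + 1e-6

# Co-cluster rows (images) and columns (words); the column labels group words into topics.
model = SpectralCoclustering(n_clusters=n_topics, random_state=0)
model.fit(cooccurrence)
word_to_topic = model.column_labels_      # shape (n_words,): topic index for each word

def to_topic_histogram(word_histogram):
    """Collapse a 1000-bin word histogram into a 50-bin topic histogram."""
    topic_hist = np.zeros(n_topics)
    np.add.at(topic_hist, word_to_topic, word_histogram)
    return topic_hist
```

Collapsing word histograms in this way is what reduces the separation between semantically equivalent descriptors that happened to fall into different scattered word clusters.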

Multiple Sub-Manifolds

Visual category ← Σ object parts

- The visual variance σ²(object part) within a part is small, while the distance d(part₁, part₂) between parts is large.

Disambiguation by projection to Sub-Manifolds

- Separating inter-mixed descriptors.
- A dual objective of inter-vector distance and sub-manifold embedding overcomes the limitation of hard partitioning (see the sketch after this list).
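A minimal sketch of the idea, not the poster's SSRBC method itself: each descriptor is assigned to the sub-manifold that minimises a weighted sum of its distance to that sub-manifold's centre and its embedding (reconstruction) error on that sub-manifold. The PCA sub-manifolds, the dimensionality `dim`, and the weight `alpha` are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_submanifolds(descriptor_groups, dim=8):
    """Fit one low-dimensional PCA sub-manifold per initial descriptor group."""
    return [(grp.mean(axis=0), PCA(n_components=dim).fit(grp))
            for grp in descriptor_groups]

def assign_submanifold(descriptor, submanifolds, alpha=0.5):
    """Dual objective: alpha * distance-to-centre + (1 - alpha) * embedding error."""
    costs = []
    for centre, pca in submanifolds:
        dist = np.linalg.norm(descriptor - centre)
        recon = pca.inverse_transform(pca.transform(descriptor[None, :]))[0]
        embed_err = np.linalg.norm(descriptor - recon)
        costs.append(alpha * dist + (1.0 - alpha) * embed_err)
    return int(np.argmin(costs))
```

The embedding term lets a descriptor lying near one part's sub-manifold, but closer in Euclidean distance to another cluster centre, still be assigned to the correct sub-manifold, which is what a hard partition cannot do.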

Influence of Co-clustering

Co-clustering aids the grouping of semantically equivalent descriptors (those with similar co-occurrence statistics or similar sub-manifold embeddings) by projecting from a higher-dimensional space (words) to a lower-dimensional space (topics). This effectively reduces the separation between equivalent descriptors, verified using a K-NN classifier.
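A sketch of how such a K-NN check might look, assuming hypothetical `word_histograms` and `image_labels` arrays and the `to_topic_histogram` helper from the co-clustering sketch above:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def knn_score(features, labels, k=5):
    """Mean cross-validated accuracy of a K-NN classifier on the given features."""
    return cross_val_score(KNeighborsClassifier(n_neighbors=k),
                           features, labels, cv=5).mean()

# word_score  = knn_score(word_histograms, image_labels)
# topic_score = knn_score(np.array([to_topic_histogram(h) for h in word_histograms]),
#                         image_labels)
```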

Experiment: Grouping Scattered Clusters

Comparative classification performance (F1 score) of the standard clustered dictionary (BoW) vs. the grouping-scattered-clusters dictionary for all categories of the VOC 2010 data set; dictionary size is 1000.
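A sketch of a per-category comparison of this kind; the poster does not specify the classifier, so a linear SVM is used here as a plausible stand-in, and `bow_train`, `topic_train`, `y_train`, etc. are hypothetical splits of VOC 2010 image histograms.

```python
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

def category_f1(train_X, train_y, test_X, test_y):
    """Binary F1 score for one category with a linear SVM (illustrative stand-in)."""
    clf = LinearSVC().fit(train_X, train_y)
    return f1_score(test_y, clf.predict(test_X))

# f1_bow   = category_f1(bow_train, y_train, bow_test, y_test)      # word dictionary
# f1_topic = category_f1(topic_train, y_train, topic_test, y_test)  # topic dictionary
```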

Grouping clusters: different co-clustering methods

Comparison of information-theoretic (i) and sum-squared residue (r) co-clustering methods.

Grouping clusters: influence of dictionary size

Topics (100, 500, 1000, 5000) ← Σ Words (10,000)

Comparative F1 score, averaged over all categories, for various datasets.

Experiment: Multiple Sub-Manifold

Comparative classification performance (F1 score) of the standard clustered dictionary (BoW) vs. the multi-manifold dictionary (SSRBC) for all categories of the VOC 2010 data set; dictionary size is 100.

Multi-Manifolds: different co-clustering methods

Comparison of information-theoretic (i) and sum-squared residue (r) co-clustering methods.

Towards Semantically Relevant Space

- Group semantically similar small clusters.
- Multi-manifolds dictionary.
- Prune non-discriminative space.
- Combine these paradigms.

Summary

The improvement in classification performance supports the hypotheses that the semantic relevance of the feature space can be improved by grouping scattered tiny clusters based on image-word co-occurrence, and by learning a dictionary on multiple sub-manifolds, which disambiguates descriptors by projecting them onto different sub-manifolds. Future work will implement pruning of the non-discriminative space and combine these paradigms to render a semantically relevant space.

Acknowledgement

Supported by the EU project Dicta-Sign (FP7/2007-2013) under Grant No. 231135 and PASCAL 2.

Centre for Vision, Speech, and Signal Processing - University of Surrey - Guildford, United Kingdom Mail: [email protected] WWW: http://www.ee.surrey.ac.uk/cvssp