Semantically Relevant Visual Dictionary

DESCRIPTION

European Conference on Operational Research, 2012, Vilnius, Lithuania

Transcript

  • Semantically Relevant Visual Dictionary
    Ashish Gupta (CVSSP)
    University of Surrey
    a.gupta@surrey.ac.uk
    July 10, 2012
    Ashish Gupta (CVSSP) Semantically Relevant Visual Dictionary
  • Contents
    Introduction: Visual Category Recognition
    Current practice: Visual Dictionary
    Problem: inter-mixed feature vectors
    Approach: Over-partition + Co-cluster image-word matrix
    Solution: Group estimated categorically related partitions
    Experiments
    Summary
  • Visual Category Recognition
    Definition: Detect the presence of an instance of a visual category in an image.
  • Challenges
    Several variations in visual category appearance render category recognition very difficult.
  • Visual Dictionary
    Visual Word: Representative feature vector (generally the centroid) of each cluster.
    Image Histogram: Histogram of assignments of image feature vectors to visual words.
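The image histogram described on this slide can be sketched as a nearest-centroid assignment followed by a count. This is a minimal illustration, not the author's code: the function name `image_histogram`, the toy 2-D descriptors, and the three hand-picked "words" are all assumptions (real SIFT descriptors are 128-dimensional and the words would come from clustering).

```python
import numpy as np

def image_histogram(descriptors, words):
    """Count how many descriptors are assigned to each visual word.

    Each descriptor is assigned to its nearest word (Euclidean distance).
    """
    # Squared distances between every descriptor and every word: shape (n_desc, n_words).
    d2 = ((descriptors[:, None, :] - words[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return np.bincount(nearest, minlength=len(words))

# Hypothetical toy data: three visual words and four descriptors in 2-D.
words = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[0.5, -0.2], [9.0, 11.0], [10.5, 9.5], [1.0, 9.0]])

hist = image_histogram(descriptors, words)
print(hist)
```

One such histogram per image, stacked row by row, gives the image-word matrix that the later slides analyse.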
  • Problems with Visual Dictionary
    Inter-mixed: Categorically dissimilar feature vectors are inter-mixed in feature space.
    Semantic scatter: Feature vectors pertaining to the same category part are scattered in feature space.
  • Inter-mixed Feature Vectors
    Categorically equivalent vectors mapped to naturally occurring clusters
    Easily partitioned to yield discriminative dictionary elements
  • Inter-mixed Feature Vectors
    Categorically dissimilar vectors inter-mixed
    Partitioning yields a non-discriminative dictionary
  • Inter-mixed Feature Vectors
    Over-partition feature space into tiny clusters.
    Build a dictionary using these tiny clusters.
  • Semantic Scatter
    Small variations in instances of an object part cause the associated descriptors to be scattered in feature space.
    Combine visual words which are related to create a visual topic.
  • Hypothesis
    Semantically related words can be discovered by analysing the image-word distribution.
  • [Figure: Visual Topic Dictionary; Visual Word Dictionary]
  • Co-Clustering
    Formulate the image-word matrix as a joint probability distribution.
    $C_X : \{x_1, x_2, \ldots, x_m\} \to \{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_k\}$
    $C_Y : \{y_1, y_2, \ldots, y_n\} \to \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_l\}$
    The tuple $(C_X, C_Y)$ is referred to as a co-clustering.
    Re-ordering the rows and columns of the matrix gives rise to blocks, referred to as co-clusters.
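As a rough sketch of the two steps on this slide, the snippet below normalises a toy image-word count matrix into a joint distribution and then sums it within each (row cluster, column cluster) block. The function names, the toy counts, and the hand-assigned cluster labels are assumptions made for illustration; in practice the labels would be found by a co-clustering algorithm.

```python
import numpy as np

def joint_distribution(counts):
    """Normalise an image-word count matrix so its entries sum to 1."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

def cocluster_distribution(p, row_labels, col_labels, k, l):
    """Sum p(x, y) over each co-cluster block, giving a k x l distribution."""
    R = np.eye(k)[row_labels]  # one-hot row-cluster membership, shape (m, k)
    C = np.eye(l)[col_labels]  # one-hot column-cluster membership, shape (n, l)
    return R.T @ p @ C

# Hypothetical toy matrix: 3 images (rows) by 4 visual words (columns).
counts = np.array([[4, 4, 0, 0],
                   [4, 4, 0, 0],
                   [0, 0, 2, 2]])
p = joint_distribution(counts)

# Assumed labels: images {0, 1} vs {2}; words {0, 1} vs {2, 3}.
row_labels = np.array([0, 0, 1])
col_labels = np.array([0, 0, 1, 1])
p_hat = cocluster_distribution(p, row_labels, col_labels, k=2, l=2)
print(p_hat)
```

With these labels the matrix is already block-diagonal, so all probability mass falls into the two diagonal co-clusters.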
  • Co-clustering contd.
    Optimal co-clustering minimizes the loss in mutual information $I(X; Y) - I(\hat{X}; \hat{Y})$, given the number of row ($k$) and column ($l$) clusters.
    For a given $(C_X, C_Y)$, the loss in mutual information can be expressed as the KL-divergence between $p(X, Y)$ and an approximation $q(X, Y)$:
    $I(X; Y) - I(\hat{X}; \hat{Y}) = D_{KL}(p(X, Y) \,\|\, q(X, Y))$
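The identity on this slide can be checked numerically. The slide does not give the form of $q$; the sketch below assumes the approximation used in information-theoretic co-clustering, $q(x, y) = p(\hat{x}, \hat{y})\, p(x \mid \hat{x})\, p(y \mid \hat{y})$. The toy counts, the cluster labels, and all function names are likewise assumptions for illustration.

```python
import numpy as np

def mutual_information(p):
    """I(X; Y) in nats for a joint distribution given as a 2-D array."""
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / (px @ py)[mask])).sum())

def approximation_q(p, row_labels, col_labels, k, l):
    """Assumed form: q(x, y) = p(x_hat, y_hat) * p(x | x_hat) * p(y | y_hat)."""
    R = np.eye(k)[row_labels]
    C = np.eye(l)[col_labels]
    p_hat = R.T @ p @ C                        # co-cluster distribution p(x_hat, y_hat)
    px, py = p.sum(axis=1), p.sum(axis=0)      # marginals of X and Y
    px_given = px / (R.T @ px)[row_labels]     # p(x | x_hat)
    py_given = py / (C.T @ py)[col_labels]     # p(y | y_hat)
    q = p_hat[np.ix_(row_labels, col_labels)] * np.outer(px_given, py_given)
    return q, p_hat

def kl_divergence(p, q):
    """D_KL(p || q) in nats, summed over entries where p > 0."""
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / q[mask])).sum())

# Hypothetical toy data: 3 images by 4 visual words, with assumed cluster labels.
counts = np.array([[5.0, 1.0, 0.0, 1.0],
                   [1.0, 4.0, 1.0, 0.0],
                   [0.0, 1.0, 6.0, 2.0]])
p = counts / counts.sum()
row_labels = np.array([0, 0, 1])
col_labels = np.array([0, 0, 1, 1])

q, p_hat = approximation_q(p, row_labels, col_labels, k=2, l=2)
loss = mutual_information(p) - mutual_information(p_hat)
kl = kl_divergence(p, q)
print(loss, kl)
```

The two printed numbers agree up to floating-point error, which is the equality stated on the slide; a co-clustering algorithm would search over the labels to make this loss as small as possible.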
  • Conceptual view
    Image histogram feature vectors in the high-dimensional visual word space are projected to a lower-dimensional visual topic space.
    The distance between feature vectors from the same category is reduced.
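A toy illustration of this projection, under the assumption that each visual word belongs to exactly one topic and that a topic's bin is the sum of its words' bins. The mapping `word_to_topic` and the two histograms are made up for the example; they stand for two images of the same category whose descriptors landed on different but related words.

```python
import numpy as np

def to_topic_histogram(word_hist, word_to_topic, n_topics):
    """Project a word histogram to topic space by summing the bins of each topic."""
    return np.bincount(word_to_topic, weights=word_hist, minlength=n_topics)

# Hypothetical mapping of 6 visual words onto 3 visual topics.
word_to_topic = np.array([0, 0, 1, 1, 2, 2])

# Two images whose counts fall on different words of the same topics.
a = np.array([3.0, 0.0, 0.0, 2.0, 1.0, 0.0])
b = np.array([0.0, 3.0, 2.0, 0.0, 0.0, 1.0])

topic_a = to_topic_histogram(a, word_to_topic, n_topics=3)
topic_b = to_topic_histogram(b, word_to_topic, n_topics=3)

dist_words = np.linalg.norm(a - b)
dist_topics = np.linalg.norm(topic_a - topic_b)
print(dist_words, dist_topics)
```

In word space the two images look far apart; once related words are merged into topics they coincide, which is the effect the slide describes.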
  • Experiment
    Feature descriptor: SIFT, an affine co-variant local image patch descriptor.
    Data sets: Scene-15; Pascal VOC 2006; VOC 2007; VOC 2010.
    Classifier: k-NN, to verify whether the mutual distance between categorically equivalent feature vectors is reduced.
    Performance metric: F1-score, the harmonic mean of precision and recall. Popularly used in the classification and retrieval communities.
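For reference, the F1-score named on this slide can be computed from true-positive, false-positive, and false-negative counts as below. The function name and the example counts are illustrative assumptions, not results from the experiments.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one category: 8 hits, 2 false alarms, 4 misses.
example_f1 = f1_score(tp=8, fp=2, fn=4)
print(round(example_f1, 4))
```

Because it is a harmonic mean, the score stays low unless precision and recall are both high.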
  • Scene-15 Dataset
    It has 15 visual categories of natural indoor and outdoor scenes. Each category has about 200 to 400 images and the entire dataset has 4485 images.
  • PASCAL VOC2006 Dataset
    It has 10 visual categories with about 175 to 650 images per category. There are a total of 5304 images.
  • PASCAL VOC2007 Dataset
    It has 20 visual categories. Each category contains between 100 and 2000 images, with 9963 images in all.
  • PASCAL VOC2010 Dataset
    It has 20 visual categories and 300 to 3500 images in each category. Combines data from VOC2008 and VOC2009.
  • Dictionary Size
    10,000 words → n Topics. Appropriate number of Topics?
    Large dictionary becomes category dependent.
  • Summary
    Visual dictionary is limited: unsupervised clustering.
    Significant intra-category appearance variation: semantic scatter.
    Feature vectors from different visual categories are inter-mixed in feature space.
    Visual topics from visual words: grouping the over-partitioned feature space.
    Co-clustering the image-word distribution: discover the optimal grouping of words with minimal loss in mutual information.
    Semantic dimensionality reduction.
  • Thank you.
    Acknowledgement