Li-Jia Li, Yongwhan Lim, Li Fei-Fei, Chong Wang, David M. Blei: Building and Using a Semantivisual I...
- Slide 1
- Li-Jia Li, Yongwhan Lim, Li Fei-Fei, Chong Wang, David M. Blei.
Building and Using a Semantivisual Image Hierarchy. CVPR, 2010.
- Slide 2
- OUTLINE: Introduction. Building the hierarchy: graphical model;
learning. Semantivisual image hierarchy: implementation;
visualizing the semantivisual hierarchy; quantitative evaluation.
Application: annotation; labeling; classification.
- Slide 3
- INTRODUCTION: A meaningful image hierarchy makes organizing,
browsing, and searching images more convenient and effective. Good
image hierarchies can serve as a knowledge ontology for end tasks
such as image retrieval, annotation, or classification. Hierarchy
types listed: language-based; low-level visual feature based.
- Slide 4
- BUILDING THE HIERARCHY: A multi-modal model represents images and
textual tags on the semantivisual hierarchy. Each image is
associated with a path of the hierarchy, and its regions can be
assigned to different nodes of that path. Each image is decomposed
into a set of over-segmented regions R = [R_1, ..., R_r, ..., R_N],
and each of the N regions is characterized by four appearance
features.
- Slide 5
- BUILDING THE HIERARCHY: Graphical model. Each image-text pair
(R, W) is assigned to a path C_c = [C_c1, ..., C_cl, ..., C_cL].
- Slide 6
- BUILDING THE HIERARCHY: Learning the semantivisual image
hierarchy. Input: a set of unorganized images and the user tags
associated with them. Gibbs sampling draws the concept index Z, the
coupling variable S, and the path C. Sampling Z depends on (1) the
likelihood of the region appearance, (2) the likelihood of the tags
associated with this region, and (3) the concept indices of the
other regions in the same image-text pair.
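A minimal sketch of how the three factors listed for sampling Z could be combined into one Gibbs draw. It is illustrative only and does not reproduce the conditional from the paper: the function name, the `alpha` smoothing constant, and the use of smoothed counts for the third factor are assumptions made here.

```python
import numpy as np

def sample_concept_index(log_lik_region, log_lik_tags, level_counts,
                         alpha=1.0, rng=None):
    """Draw one concept index (a level on the image's path) for a region.

    log_lik_region: log-likelihood of the region appearance at each level
    log_lik_tags:   log-likelihood of the region's tags at each level
    level_counts:   number of other regions of the same image-text pair
                    currently assigned to each level
    alpha:          hypothetical smoothing constant (not from the slides)
    """
    rng = np.random.default_rng() if rng is None else rng
    log_p = (np.asarray(log_lik_region, dtype=float)
             + np.asarray(log_lik_tags, dtype=float)
             + np.log(np.asarray(level_counts, dtype=float) + alpha))
    log_p -= log_p.max()      # stabilise before exponentiating
    p = np.exp(log_p)
    p /= p.sum()              # normalise to a categorical distribution
    return int(rng.choice(len(p), p=p)), p
```

Working in log space and subtracting the maximum keeps the draw stable when one level dominates the likelihood.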
- Slide 7
- BUILDING THE HIERARCHY: Sampling S: its conditional distribution
depends solely on the likelihood of the tag. Sampling C: influenced
by the previous arrangement of the hierarchy and the likelihood of
the image-text pair. (Equation labels: prior probability induced by
nCRP; likelihood.)
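The prior term for the path C comes from the nested Chinese restaurant process (nCRP). A minimal sketch of the standard nCRP rule follows, assuming a concentration parameter named `gamma` and a hypothetical encoding of a path as per-level `(chosen_index, child_counts)` pairs. The slides do not give these details, and the likelihood term is omitted.

```python
def crp_child_probs(child_counts, gamma):
    """Standard CRP: pick existing child i with probability
    m_i / (m + gamma), or open a new child with gamma / (m + gamma)."""
    total = sum(child_counts) + gamma
    probs = [m / total for m in child_counts]
    probs.append(gamma / total)   # last entry: a brand-new child node
    return probs

def path_prior(choices, gamma):
    """Prior probability of a path.

    choices: one (chosen_index, child_counts) pair per level below the
             root; child_counts excludes the image being resampled.
    """
    p = 1.0
    for chosen, child_counts in choices:
        p *= crp_child_probs(child_counts, gamma)[chosen]
    return p
```

For example, `crp_child_probs([3, 1], 1.0)` gives probabilities 0.6 and 0.2 for the two existing children and 0.2 for a new one, so well-populated branches attract more images.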
- Slide 8
- A SEMANTIVISUAL IMAGE HIERARCHY: Implementation. The data
comprise 4,000 user-uploaded images and 538 unique user tags. Each
image is divided into small patches of 10x10 pixels. Each patch is
assigned to a codeword in a codebook of 500 visual words obtained
by K-means. Four region codebooks are obtained, for color (HSV
histogram), location, texture, and normalized SIFT histogram. To
speed up learning, the levels in a path are initialized according
to tf-idf score. The result is a hierarchy of 121 nodes, 4 levels,
and 53 paths.
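The patch-to-codeword step can be sketched as a nearest-centroid lookup. This assumes the K-means centroids are already available as an array, and the function names are illustrative rather than taken from the authors' implementation.

```python
import numpy as np

def assign_codewords(patch_descriptors, codebook):
    """Map each patch descriptor to the index of its nearest codeword.

    patch_descriptors: array of shape (n_patches, dim)
    codebook:          array of shape (n_codewords, dim), e.g. K-means centroids
    """
    diffs = patch_descriptors[:, None, :] - codebook[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)   # squared Euclidean distances
    return sq_dists.argmin(axis=1)

def codeword_histogram(assignments, n_codewords):
    """Normalised bag-of-words histogram over codeword indices."""
    hist = np.bincount(assignments, minlength=n_codewords).astype(float)
    return hist / max(hist.sum(), 1.0)
```

With a 500-word codebook as on the slide, `codebook` would have 500 rows, and each region would be summarised by the histogram of the patches it contains.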
- Slide 9
- A SEMANTIVISUAL IMAGE HIERARCHY: Visualizing the semantivisual
hierarchy. The hierarchy shows a general-to-specific relationship.
Purely visual information cannot provide a meaningful image
hierarchy. A purely language-based hierarchy would miss close
connections.
- Slide 10
- A SEMANTIVISUAL IMAGE HIERARCHY: A quantitative evaluation of
image hierarchies. Good clustering of images that share similar
concepts: images along the same path should be annotated with
broadly similar tags. Good hierarchical structure given a path:
images and their associated tags at different levels of the path
should demonstrate good general-to-specific relationships. A path
of L levels is selected from the hierarchy.
- Slide 11
- APPLICATION: Hierarchical annotation of images. Given our learned
image ontology, we can propose a hierarchical annotation of an
unlabeled query image. nCRP cannot perform well on sparse tag
words: its proposed hierarchy has many words assigned to the root
node, resulting in very few paths. A simple clustering algorithm
such as KNN cannot find a good association between the test images
and the training images in our challenging dataset with large
visual diversity. In contrast, our model learns an accurate
association of visual and text data simultaneously.
- Slide 12
- APPLICATION: Image labeling. Serving as an image and text
knowledge ontology, our semantivisual hierarchy and model can be
used for image labeling without a hierarchical relation. Collect
the top 5 predicted words of each image. Our model captures the
hierarchical structure of images and tags.
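Collecting the top 5 predicted words amounts to ranking the vocabulary by a per-word score. A minimal sketch, assuming the model already yields one score per tag for an image; `word_scores` and `vocabulary` are hypothetical names, not from the slides.

```python
import numpy as np

def top_k_words(word_scores, vocabulary, k=5):
    """Return the k highest-scoring words, best first."""
    order = np.argsort(word_scores)[::-1][:k]   # indices by descending score
    return [vocabulary[i] for i in order]
```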
- Slide 13
- APPLICATION: Image classification. Another 4,000 images are held
out as test images. By encoding semantic meaning into the
hierarchy, our semantivisual hierarchy delivers a more descriptive
structure, which could be helpful for classification.
- Slide 14
- CONCLUSION: Images and their tags are used to construct a
meaningful hierarchy that organizes images in a general-to-specific
structure. Our quantitative evaluation by human subjects shows that
our hierarchy is more meaningful and accurate than others.