
Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project


Page 1

Spot the Dog: An overview of semantic retrieval of unannotated images in the Semantic Gap project
Semantic Image Retrieval - The User Perspective

Jonathon Hare
Intelligence, Agents, Multimedia Group
School of Electronics and Computer Science
University of Southampton
{jsh2}@ecs.soton.ac.uk

Page 2

The previous talks described the issues associated with image retrieval from the practitioner's perspective: a problem that has become known as the ‘semantic gap’.

This presentation explores how novel computational and mathematical techniques can help improve content-based multimedia search by enabling textual search of unannotated imagery.

Introduction

Page 3

Unannotated Imagery

Manually constructing metadata in order to index images is expensive.

Perhaps US$1-$5 per image for simple keywording.

More for archival quality metadata (keywords, caption, title, description, dates, times, events).

Meanwhile, the number of images grows every day.

In many domains, manually indexing everything is an impossible task!

Page 4

Unannotated Imagery - An Example

The Kennel Club image collection:

relatively small (~60,000 images)

~7000 of those digitised.

~3000 of those have subject metadata (mostly keywords), remainder have little/no information.

Each year, after the Crufts dog show, they expect to receive additional digital images (of the order of a few thousand) with little, if any, metadata other than date/time (and only then if the camera is set up correctly).

Page 5

An Overview of Our Approach

Conceptually simple idea: Teach a machine to learn the relationship between visual features of images and the metadata that describes them.

So, two stages (sketched in code below):

Use exemplar image/metadata pairs to learn relationships.

Project learnt relationships to images without metadata in order to make them searchable.
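A toy sketch of these two stages. Images are represented as bags of already-extracted "visual terms" (see the following slides); the data and the simple overlap scoring rule here are invented for illustration, not the project's actual method:

```python
from collections import Counter

def train(annotated_images):
    """Stage 1: accumulate, for each keyword, the visual terms that
    co-occur with it in the exemplar (visual terms, keywords) pairs."""
    model = {}
    for visual_terms, keywords in annotated_images:
        for kw in keywords:
            model.setdefault(kw, Counter()).update(visual_terms)
    return model

def score(model, visual_terms, keyword):
    """Stage 2: score an unannotated image against a keyword by the
    overlap between its visual terms and the keyword's learnt profile."""
    profile = model.get(keyword, Counter())
    total = sum(profile.values()) or 1
    return sum(profile[t] * n for t, n in visual_terms.items()) / total

model = train([
    (Counter({"vt1": 4, "vt2": 1}), ["dog"]),
    (Counter({"vt2": 3, "vt3": 2}), ["sun", "sky"]),
])
print(score(model, Counter({"vt1": 2, "vt2": 1}), "dog"))  # higher = more dog-like
```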

Page 6

Modelling Visual Information

In order to model the visual content of an image, we can extract descriptors or feature-vectors.

Feature-vectors can describe many differing aspects of the image content.

Low level features:

Fourier transforms, wavelet decomposition, texture histograms, colour histograms, shape primitives, filter primitives, etc.

Higher-level features:

Faces, objects, etc.

Page 7

Visual Term Representations

A modern approach to modelling the content of an image is to treat it like a textual document.

Model image as a collection of “visual terms”.

Analogous to words in a text document.

Feature-vectors can be transformed into visual terms through some mapping.

Page 8

Visual Term Representations - Bag-of-Terms

For indexing purposes, we often discount the order/arrangement of terms and just count the number of occurrences.

"The quick brown fox jumped over the lazy dog"

brown  dog  fox  jumped  lazy  over  quick  the
  1     1    1     1       1     1     1     2
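In code, the bag-of-terms for this sentence is just an occurrence count (a minimal illustration):

```python
from collections import Counter

# Discard word order; keep only the occurrence count of each term.
text = "the quick brown fox jumped over the lazy dog"
bag = Counter(text.split())
vocab = sorted(bag)
vector = [bag[t] for t in vocab]
print(vocab)   # ['brown', 'dog', 'fox', 'jumped', 'lazy', 'over', 'quick', 'the']
print(vector)  # [1, 1, 1, 1, 1, 1, 1, 2]
```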

Page 9

Visual Term Representations - Example: Global Colour Visual Terms

A common way of indexing the global colours used in an image is the colour histogram.

Each bin of the histogram counts the number of pixels falling within the colour range represented by that bin.

The colour histogram can thus be used directly as a term occurrence vector in which each bin is represented as a visual term.

[Figure: an example image and its 16-bin colour histogram, used directly as a term-occurrence vector: 1569, 3408, 491, 0, 0, 902, 2146, 5026, 0, 0, 56, 3633, 0, 0, 0, 6827]
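A minimal sketch of turning pixels into colour visual terms, assuming a simple uniform quantisation (here 2 levels per RGB channel, giving 8 bins rather than the 16 in the figure; the random pixels stand in for a real decoded image):

```python
import numpy as np

# Random data stands in for a decoded 240x320 RGB image.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(240, 320, 3), dtype=np.uint8)

levels = 2
q = (pixels.astype(int) * levels) // 256          # per-channel bin index, 0..levels-1
bin_index = (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]
histogram = np.bincount(bin_index.ravel(), minlength=levels ** 3)
print(histogram)  # term-occurrence vector: one count per colour "visual term"
```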

Page 10

Visual Term Representations - Example: Local Interest-Point Based Visual Terms

Features based on Lowe’s difference-of-Gaussian region detector and SIFT feature vector.

A vocabulary of exemplar feature-vectors is learnt by applying k-means clustering to a training set of features.

Feature-vectors can then be quantised to discrete visual terms by finding the closest exemplar in the vocabulary.
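A sketch of the vocabulary-learning and quantisation steps using scikit-learn's k-means (random vectors stand in for real 128-dimensional SIFT descriptors):

```python
import numpy as np
from sklearn.cluster import KMeans

# Random vectors stand in for SIFT descriptors from a training set.
rng = np.random.default_rng(0)
training_descriptors = rng.standard_normal((1000, 128))

# Learn a vocabulary of exemplar feature-vectors by k-means clustering.
vocab = KMeans(n_clusters=50, n_init=10, random_state=0).fit(training_descriptors)

# Quantise one image's descriptors to discrete visual terms by finding
# the closest exemplar, then form its bag-of-visual-terms vector.
image_descriptors = rng.standard_normal((200, 128))
terms = vocab.predict(image_descriptors)
bag = np.bincount(terms, minlength=50)
```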

Page 11

Semantic Spaces

Basic idea: Create a large multidimensional space in which images, keywords (or other metadata) and visual terms can be placed.

In the training stage, we learn how keywords are related to visual terms and images:

Related visual terms, images and keywords are placed close together within the space.

In the projection stage, unannotated images can be placed in the space based upon the visual terms they contain:

The placement should be such that they lie near keywords that describe them.
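A minimal LSA-style sketch of such a space, built by truncated SVD of a term-by-image count matrix (the project's actual space construction may differ; all numbers here are invented):

```python
import numpy as np

# Rows of M are terms (keywords followed by visual terms), columns are
# training images; M[i, j] counts term i in image j.
terms = ["sun", "sky", "vt0", "vt1", "vt2"]
M = np.array([[2, 0, 1],
              [1, 1, 0],
              [3, 0, 2],
              [0, 4, 0],
              [1, 1, 3]], dtype=float)

# Truncated SVD places terms and images in a shared k-dimensional space.
k = 2
U, s, Vt = np.linalg.svd(M, full_matrices=False)
term_positions = U[:, :k] * s[:k]       # keyword and visual-term positions
image_positions = Vt[:k].T * s[:k]      # training-image positions

# Projection stage: fold an unannotated image (visual-term counts only,
# zeros in the keyword rows) into the same space.
new_image = np.array([0, 0, 2, 0, 1], dtype=float)
position = new_image @ U[:, :k]         # should land near related keywords
```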

Page 12

Semantic Spaces - Conceptual Overview

[Figure (pages 12-13): conceptual overview of the semantic space.]

Page 14

Semantic Spaces - Uses of the Space

Once constructed, the semantic space has a number of uses (the first of which is sketched in code below):

Finding images (both annotated and unannotated) by keyword(s)/metadata.

Finding images (both annotated and unannotated) by semantically similar images.

Determining likely metadata for an image.

Examining keyword-keyword and keyword-visual term relationships.

Segmenting an image.
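The first of these uses reduces to nearest-neighbour ranking in the space; a minimal sketch using cosine similarity (the positions are invented placeholders for the embeddings produced when the space was built):

```python
import numpy as np

def rank_by_keyword(keyword_pos, image_positions):
    """Rank images (annotated or not) by cosine similarity between
    their positions and the query keyword's position in the space."""
    sims = image_positions @ keyword_pos
    sims /= np.linalg.norm(image_positions, axis=1) * np.linalg.norm(keyword_pos) + 1e-12
    return np.argsort(-sims)            # image indices, best match first

keyword_pos = np.array([0.9, 0.1])      # invented 2-d embedding of "SUN"
image_positions = np.array([[0.8, 0.2], [0.1, 0.9], [0.5, 0.5]])
print(rank_by_keyword(keyword_pos, image_positions))  # -> [0 2 1]
```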

Page 15

Semantic Spaces - Searching by Keyword

[Figure (pages 15-18): a semantic space containing images and the keywords SUN and TRAIN; searching for images about "SUN" returns the images lying nearest that keyword as ranked search results.]

Page 19

Semantic Spaces - Searching by Image

[Figure (pages 19-22): query by example; given a query image ("search for images like this"), the images lying nearest to it in the space are returned as ranked search results.]

Page 23

Semantic Spaces - Suggesting Keywords

[Figure (pages 23-26): an unannotated image is projected into a space containing the keywords SUN, SKY, MOUNTAIN, TREE and CAR; the keywords lying nearest the image are suggested, ranked by proximity: SKY, MOUNTAIN, TREE, SUN, CAR.]

Page 27

Semantic Spaces - Experimental Retrieval Results - Corel Dataset

Colour Histograms used as visual terms (each bin representing a single term).

Standard experimental collection: 500 test images, 4500 training images.

The results are quite impressive: comparable with the Machine Translation auto-annotation technique (but remember we are using much simpler image features).

Works well for query keywords that are easily associated with a particular set of colours, but not so well for other keywords.

Page 28

Semantic Spaces - Experimental Retrieval Results - Corel Dataset

Top 15 images when querying for ‘sun’

Page 29

Semantic Spaces - Experimental Retrieval Results - Corel Dataset

Top 15 images when querying for ‘horse’

Page 30

Semantic Spaces - Experimental Retrieval Results - Corel Dataset

Top 15 images when querying for ‘foals’

Page 31

Demo - The K9 Retrieval System

We have built a demonstration system around the semantic space idea and applied it to images from the Kennel Club picture library (>7000 images, ∼3000 with keywords).

The system allows annotated images to be retrieved by keywords and concepts (keywords with thesaurus expansion).

Both annotated and unannotated images can also be retrieved using the semantic space and regular content-based techniques.

This brief demo will concentrate on retrieval of annotated images using keyword matching, and unannotated images using the semantic space.
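A sketch of how such concept expansion might work using WordNet via NLTK (the demo's actual thesaurus and matching rules may differ; the annotations dictionary is invented):

```python
from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet')

def expand(keyword):
    """Expand a query keyword with WordNet synonyms and
    more-specific terms (hyponyms)."""
    expanded = {keyword}
    for syn in wn.synsets(keyword):
        expanded.update(l.replace("_", " ") for l in syn.lemma_names())
        for hypo in syn.hyponyms():
            expanded.update(l.replace("_", " ") for l in hypo.lemma_names())
    return expanded

# Invented toy annotations: image -> set of keywords.
annotations = {"img1.jpg": {"puppy", "grass"}, "img2.jpg": {"cat", "sofa"}}
query_terms = expand("dog")             # includes e.g. 'puppy', 'hunting dog'
matches = [img for img, kws in annotations.items() if query_terms & kws]
print(matches)                          # ['img1.jpg']
```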

Page 32

Conclusions

Semantic retrieval of unannotated images is hard!

Our semantic space approach takes us some of the way, but there is still a long way to go.

Retrieval is limited by the choice of visual features, and how well those features relate to the keywords.

Page 33

Questions?