Improving web image search results using query-relative classifiers
Josip Krapač, Moray Allan, Jakob Verbeek, Frédéric Jurie
Outline
◦ Introduction
◦ Query-relative features
◦ Experimental evaluation
◦ Conclusion
Introduction
Google's image search engine has a precision of only 39% [16].
Recent research improves image search performance by using visual information, not only text.
This is similar to outlier detection, but in the current setting the majority of retrieved images may be outliers, and the inliers can be visually diverse.
Introduction
Recent methods share the same drawback:
◦ A separate image re-ranking model is learned for each and every query; the large number of possible queries makes this approach computationally wasteful.
Introduction
Key contributions:
◦ Propose an image re-ranking method based on textual and visual features
◦ Does not require learning a separate model for every query
◦ The model parameters are shared across queries and learned once
Introduction
Our image re-ranking approach:
Outline
◦ Introduction
◦ Query-relative features
◦ Experimental evaluation
◦ Conclusion
Query-relative features
Query-relative text features
◦ Binary features
◦ Contextual features
Visual features
Query-relative visual features
Query-relative text features
Our base query-relative text features follow [6,16]:
◦ ContexR
◦ Context10
◦ Filedir
◦ Filename
◦ Imagealt
◦ Imagetitle
◦ Websitetitle
Binary features
Nine binary features indicate the presence or absence of all the query terms in:
◦ Surrounding text
◦ Image's alternative text
◦ Web page's title
◦ Image file URL's hostname, directory, and filename
◦ Web page URL's hostname, directory, and filename
A second set of nine features is active if some of the query terms, but not all, are present in the field (sketched below).
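A minimal sketch of these 9 + 9 binary features; the field names and the tokenize helper are illustrative assumptions, not identifiers from the paper.

# Hypothetical sketch of the binary text features described above.
def tokenize(text):
    return set(text.lower().split())

# Illustrative field names (assumptions, not the paper's own identifiers).
FIELDS = ["context", "alt_text", "page_title",
          "img_host", "img_dir", "img_file",
          "page_host", "page_dir", "page_file"]

def binary_text_features(query, fields):
    """fields: dict mapping each field name to its raw text."""
    terms = tokenize(query)
    feats = []
    for name in FIELDS:
        words = tokenize(fields.get(name, ""))
        n_present = len(terms & words)
        feats.append(n_present == len(terms))      # all query terms present
        feats.append(0 < n_present < len(terms))   # some, but not all, present
    return feats  # 9 "all terms" + 9 "partial match" binary features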
Contextual features
These can be understood as a form of pseudo-relevance feedback.
Divide the image's text annotation into three parts:
◦ Text surrounding the image
◦ Image's alternative text
◦ Words in the web page's title
Contextual features
Define contextual features by computing word histograms using all the images in the query set $A$.
Histogram of word counts, for image $i$ and word indexed $k$:
$h_k = \frac{1}{|A|} \sum_{i \in A} n_{ik}$  (1)
where $n_{ik}$ counts occurrences of word $k$ in the annotation of image $i$.
Contextual features
Use (1) to define a set of additional context features.
The kth binary feature represents the presence or absence of the kth most common word.
We trim these features down to the first N elements per field, so in total we have 9 + 9 + 3N binary features (see the sketch below).
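A minimal sketch of the contextual features for one text field, assuming the N most common words are computed per field over the query set; applied to the three annotation parts this yields the 3N features.

from collections import Counter

def contextual_features(query_set_texts, image_text, N):
    """query_set_texts: this field's text for every image in the query set;
    image_text: the same field for the image being featurised."""
    # Document frequency of each word across the query set.
    counts = Counter(w for text in query_set_texts
                       for w in set(text.lower().split()))
    top_n = [w for w, _ in counts.most_common(N)]
    words = set(image_text.lower().split())
    return [w in words for w in top_n]   # N binary features for this field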
Visual features
Our image representation is based on local appearance and position histograms.
Local appearance:
◦ Hierarchical k-means clustering
◦ 11 levels of quantisation, with k = 2
Position quantisation:
◦ Quad-tree with three levels
The image is represented by an appearance-position histogram (sketched below).
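A hedged sketch of the joint appearance-position histogram. The level-counting convention for the vocabulary tree and counting each descriptor at every quad-tree level are assumptions, not details confirmed by the slides.

import numpy as np

N_WORDS = 2 ** 10     # leaves of the binary vocabulary tree (assuming the
                      # root counts as the first of the 11 levels)
N_CELLS = 1 + 4 + 16  # quad-tree cells over three levels

def appearance_position_histogram(word_ids, xs, ys):
    """word_ids: visual-word leaf index per local descriptor;
    xs, ys: descriptor positions normalised to [0, 1)."""
    hist = np.zeros((N_WORDS, N_CELLS))
    for w, x, y in zip(word_ids, xs, ys):
        offset = 0
        for level in range(3):            # quad-tree levels 0, 1, 2
            side = 2 ** level             # 1, 2, 4 cells per axis
            cell = offset + int(y * side) * side + int(x * side)
            hist[w, cell] += 1
            offset += side * side
    return hist.ravel()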
Query-relative visual features
There is no direct correspondence between query terms and image appearance.
We can find which visual words are strongly associated with the query set, analogously to the contextual text features.
We define a set of visual features to represent their presence or absence in a given image.
Query-relative visual features
Order the visual features by how strongly each visual word is associated with the query (see the sketch below):
◦ A : query set
◦ T : training set
◦ h̄_A, h̄_T : average visual word histograms over A and T
The kth feature relates to the visual word kth most related to this query.
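A sketch of this ordering, assuming visual words are ranked by the ratio of their average frequency in the query set A to that in the training set T; the ratio criterion matches the next slide, but the exact normalisation is an assumption.

import numpy as np

def rank_visual_words(hists_A, hists_T, eps=1e-10):
    """hists_A, hists_T: (n_images, n_words) arrays of L1-normalised
    appearance-position histograms for the query set A and training set T."""
    h_bar_A = hists_A.mean(axis=0)       # average histogram over A
    h_bar_T = hists_T.mean(axis=0)       # average histogram over T
    ratio = h_bar_A / (h_bar_T + eps)
    return np.argsort(-ratio), h_bar_T   # most query-related words first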
Query-relative visual features
We compared three ways of representing each visual word's presence or absence:
◦ The visual word's normalised count for this image
◦ The ratio of this count to the word's average count over the training set
◦ A binary version of this ratio, thresholded at 1
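The three representations above, sketched under the same assumptions and reusing the word ranking from the previous sketch.

def visual_features(hist, h_bar_T, order, n_feats, mode="binary", eps=1e-10):
    """hist: one image's L1-normalised histogram (numpy array);
    h_bar_T, order: training-set average and ranking from rank_visual_words."""
    top = order[:n_feats]                       # n_feats most related words
    if mode == "count":
        return hist[top]                        # normalised count
    ratio = hist[top] / (h_bar_T[top] + eps)    # count / training-set average
    if mode == "ratio":
        return ratio
    return (ratio > 1.0).astype(float)          # binary, thresholded at 1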
Outline
◦ Introduction
◦ Query-relative features
◦ Experimental evaluation
◦ Conclusion
Experimental evaluation
New data set
Model training
Evaluation
Ranking images by textual features
Ranking images by visual features
Combining textual and visual features
Performance on Fergus data set
New data set
Previous data sets contain images for only a few classes, and in most cases without their corresponding meta-data.
In our data set, we provide the top-ranked images with their associated meta-data.
Our data set covers 353 image search queries, with 71478 images in total.
Model training
Train a binary logistic discriminant classifier.
Query-relative features of relevant images are used as positive examples.
Query-relative features of irrelevant images are used as negative examples.
Rank the images for a query by the classifier's predicted probability of relevance.
The classifier only needs to be learnt once (see the sketch below).
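A minimal sketch of this training setup using scikit-learn's logistic regression as the logistic discriminant; the placeholder data stands in for query-relative feature vectors pooled across training queries.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.random((1000, 50))          # placeholder query-relative features
y_train = rng.integers(0, 2, 1000)        # placeholder relevance labels (0/1)

clf = LogisticRegression(max_iter=1000)   # binary logistic discriminant
clf.fit(X_train, y_train)                 # learned once, shared across queries

# For a new, unseen query: build the same query-relative features and rank
# its images by predicted probability of relevance.
X_query = rng.random((200, 50))
scores = clf.predict_proba(X_query)[:, 1]
ranking = np.argsort(-scores)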
Evaluation
Used mean average precision (mAP), sketched below.
Query subsets:
◦ Low Precision (LP): 25 queries where the search engine performs worst
◦ High Precision (HP): 25 queries where the search engine performs best
◦ Search Engine Poor (SEP): 25 queries where the search engine improves least over a random ordering of the query set
◦ Search Engine Good (SEG): 25 queries where the search engine improves most over a random ordering of the query set
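For reference, a standard computation of average precision over one ranked list; mAP is its mean over all queries.

def average_precision(relevant):
    """relevant: 0/1 relevance flags in ranked order (best-ranked first)."""
    hits, total = 0, 0.0
    for i, r in enumerate(relevant, start=1):
        if r:
            hits += 1
            total += hits / i       # precision at each relevant position
    return total / max(hits, 1)

# mAP = sum(average_precision(q) for q in queries) / len(queries)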
Ranking images by textual features
Diminishing gain per additional feature
Ranking images by visual features
Adding more visual features increases the overall performance, but with diminishing gain
Combining textual and visual features
◦ a = number of visual features, varied from 50 to 400
◦ b = number of additional context features, varied from 20 to 100
Performance on Fergus data set
Our method performs better than Google.
[4] and [7] perform better than our method, but they require time-consuming training for every new query.
Results
Outline
◦ Introduction
◦ Query-relative features
◦ Experimental evaluation
◦ Conclusion
Conclusion
We construct query-relative features that can be used to train generic classifiers.
We rank images for previously unseen search queries without additional model training.
The features combine textual and visual information.
We present a new public data set.
Thank you!!! & Happy New Year!!!!