Page 1: Hands on Logo Search

Hands On Logo Search

Figurative Mark Retrieval Prototype

By Francisco José Alvarez Sánchez, Analyst Developer (ext) at OHIM IPR-Laboratory

Page 2: Hands on Logo Search

Developing a Figurative Mark Retrieval Prototype (Figumare)

Objective:
• Create a Proof of Concept of how a Figurative Mark Retrieval System could work
• Document some “Lessons Learnt” about this experience.

Page 3: Hands on Logo Search

The Problem

A user only has:
• a printed logo on paper, or
• a logo image file, or
• a URL of a logo file

and wants to know:

the CTM Profile of the logo

Page 4: Hands on Logo Search

Solution   

A “Figurative Mark Retrieval Tool”

Retrieving:
• CTM-profiles with scores

When just a reference logo is submitted from:
• Scanner
• Camera
• Internet
• Local file

In a reasonable time

 

Page 5: Hands on Logo Search

How?

• Obtain a CBIR (Content Based Image Retrieval) engine
  o There are many, with different architectures and licences.
  o See the Wikipedia CBIR Engine List
• Create an interface to interact with the engine
• Tune the CBIR engine to the particularities of the logos
  o by using a sample of logos
• Index the whole collection of CTM logos

Page 6: Hands on Logo Search

Obtaining a CBIR

The CBIR engine LIRE has been selected because:
• The licence is open source (GPL)
• It is completely written in Java (no dependencies)
• It uses only Lucene for storing the descriptors of the images
  o A DB is not needed
• It is very well structured
  o Prepared for adding more features
  o The internal behaviour is easy to modify
  o The source code is clear and well documented
  o Includes unit tests
• The project is active

Page 7: Hands on Logo Search

How LIRE works - 1

Definition of a "descriptor"
• A sequence of numbers that captures as much information as possible about one of the following image features:
  o Color Histogram
  o Color Layout
  o Texture (Edge Histogram)
  o Etc.
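To make the idea of a descriptor concrete, here is a minimal sketch of a colour histogram computed over a list of RGB pixels. It is an illustration only and not LIRE's implementation; the function name `color_histogram`, the pixel representation and the number of bins are assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: List[Pixel], bins_per_channel: int = 2) -> List[float]:
    """Return a normalised colour histogram: one number per colour cell."""
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantise each channel into bins_per_channel buckets.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    total = len(pixels)
    if total == 0:
        return [0.0] * len(hist)
    # Normalise so that images of different sizes are comparable.
    return [count / total for count in hist]


# A tiny 4-pixel "image": half black, half white.
image = [(0, 0, 0), (255, 255, 255), (255, 255, 255), (0, 0, 0)]
descriptor = color_histogram(image)
```

The resulting list of numbers is what would be stored for the image and later compared against the descriptor of a reference logo.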

Page 8: Hands on Logo Search

LIRE Descriptors in LUCENE

 

Page 9: Hands on Logo Search

How LIRE works - 2

The Standard Algorithm
• Sequential
  o Compares a given descriptor with each descriptor in the collection
• Robust and well tested

The Metric Spaces Algorithm
• Not sequential
  o Much faster
  o It describes an image as an ordered list of similar images
  o It's new, not completely tested
  o Needs some tuning when indexing

Page 10: Hands on Logo Search

The Standard Algorithm - 1

• Indexing the collection of images
  o For each image of the collection
    - Extracts the list of descriptors from the image
    - Creates a Lucene document with the following attributes:
      An id of the logo
      The related descriptor for each feature

Page 11: Hands on Logo Search

The Standard Algorithm - 2

• Searching an image
  o Extracts the list of descriptors from the reference image
  o For each Lucene document in the collection
    - For each feature
      Calculate distance between both descriptors
    - Calculate average distance applying weightings
  o Order the distances and calculate score
  o Returns the ids of the logos with the highest scores

The drawback of this method is that the descriptor of the reference image has to be compared sequentially with the descriptors of the whole collection of images.
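The sequential search can be sketched as follows. This is a simplified illustration, not LIRE code: the L1 distance, the score formula (1 minus the distance divided by the largest distance) and all names are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

Descriptors = Dict[str, List[float]]  # feature name -> descriptor


def l1_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute differences (assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def search(reference: Descriptors,
           index: Dict[str, Descriptors],
           weights: Dict[str, float],
           top_k: int = 3) -> List[Tuple[str, float]]:
    """Compare the reference with every indexed logo, one by one."""
    total_weight = sum(weights.values())
    results = []
    for logo_id, descriptors in index.items():
        # Weighted average of the per-feature distances.
        distance = sum(
            weights[f] * l1_distance(reference[f], descriptors[f])
            for f in weights
        ) / total_weight
        results.append((logo_id, distance))
    results.sort(key=lambda pair: pair[1])
    # Assumed scoring: 1.0 for an identical logo, 0.0 for the farthest one.
    max_distance = results[-1][1] if results and results[-1][1] > 0 else 1.0
    return [(logo_id, 1.0 - d / max_distance) for logo_id, d in results[:top_k]]


index = {
    "a": {"texture": [1.0, 0.0], "color": [0.5, 0.5]},
    "b": {"texture": [0.0, 1.0], "color": [0.5, 0.5]},
    "c": {"texture": [0.5, 0.5], "color": [0.5, 0.5]},
}
reference = {"texture": [1.0, 0.0], "color": [0.5, 0.5]}
hits = search(reference, index, {"texture": 1.0, "color": 0.0}, top_k=2)
```

The loop over `index.items()` is the sequential part: its cost grows linearly with the size of the collection.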

Page 12: Hands on Logo Search

The Metric Spaces Algorithm

The general idea of this method is to describe every image by an ordered list of its most similar images from a “vocabulary of images”.

Page 13: Hands on Logo Search

The Metric Spaces - Example

From a collection of 300,000 images:
• Index the collection of images
  o Select a random subset of 1,000 images called s1
  o Index s1 in a new Lucene index l1 by using the classic method
  o For each image in the collection of 300,000 images
    - Search the 50 most similar from the 1,000 images in s1
    - Add the ordered list of 50 identifiers as new descriptors
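A minimal sketch of this indexing step, scaled down to a handful of one-number descriptors. The names `build_index` and `nearest_pivots`, the L1 distance and the fixed random seed are assumptions for illustration; in the prototype this work is done by LIRE on top of Lucene.

```python
import random
from typing import Dict, List, Tuple


def l1_distance(a: List[float], b: List[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_pivots(descriptor: List[float],
                   pivots: Dict[str, List[float]],
                   k: int) -> List[str]:
    """Ordered ids of the k pivot images closest to the descriptor."""
    ranked = sorted(
        pivots,
        key=lambda pid: (l1_distance(descriptor, pivots[pid]), pid),
    )
    return ranked[:k]


def build_index(collection: Dict[str, List[float]],
                n_pivots: int,
                k: int,
                seed: int = 0) -> Tuple[Dict[str, List[float]], Dict[str, List[str]]]:
    """Pick a random subset s1 and describe every image by its k nearest pivots."""
    rng = random.Random(seed)
    pivot_ids = rng.sample(sorted(collection), n_pivots)
    pivots = {pid: collection[pid] for pid in pivot_ids}
    index = {
        image_id: nearest_pivots(descriptor, pivots, k)
        for image_id, descriptor in collection.items()
    }
    return pivots, index


# Six images with distinct toy descriptors; the slide uses 300,000 / 1,000 / 50.
collection = {f"img{i}": [float(i)] for i in range(6)}
pivots, index = build_index(collection, n_pivots=4, k=2)
```

Each entry of `index` is the ordered list of pivot ids that would be stored as an extra descriptor of the image.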

Page 14: Hands on Logo Search

The Metric Spaces - Example Cont

• Searching an image
  o Search the 50 most similar from the 1,000 images in s1
  o Query in Lucene by comparing the sequence of 50 ordered ids
  o The footrule distance for comparing ranked vectors is used
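The footrule distance between two ranked lists can be sketched like this. The treatment of ids that appear in only one list (they are placed at position k, just past the end) is one common convention and an assumption here, as are the function names; the linear scan in `search` stands in for the Lucene query.

```python
from typing import Dict, List


def footrule(a: List[str], b: List[str]) -> int:
    """Spearman footrule: sum of the position differences of each id."""
    k = max(len(a), len(b))
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    # Ids missing from one list are assumed to sit at position k.
    return sum(
        abs(pos_a.get(item, k) - pos_b.get(item, k))
        for item in set(a) | set(b)
    )


def search(query_ranking: List[str],
           index: Dict[str, List[str]],
           top_k: int) -> List[str]:
    """Return the ids whose ranked pivot lists are closest to the query's."""
    ranked = sorted(
        index,
        key=lambda image_id: (footrule(query_ranking, index[image_id]), image_id),
    )
    return ranked[:top_k]


index = {
    "x": ["p1", "p2", "p3"],
    "y": ["p3", "p2", "p1"],
    "z": ["p1", "p3", "p2"],
}
best = search(["p1", "p2", "p3"], index, top_k=2)
```

Two images whose nearest pivots appear in almost the same order get a small footrule distance, which is why the ordered list works as a compact stand-in for the full descriptor.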

Page 15: Hands on Logo Search

Create an interface to interact with the LIRE engine

• The interface is a Spring Web Application written in Grails
• The services of the system allow the following:
  o Index a collection from a directory
  o Index a collection from a list of URLs
  o Search by an uploaded logo
  o Search by a URL of a logo
• An Android application has been created so it is possible to take a picture of a logo with a mobile phone and search for it directly.

Page 16: Hands on Logo Search

Figumare Web Interface

 

Page 17: Hands on Logo Search

Tuning Figumare - 1

• Collection size: 678 logos
• Vienna Code: “Geometrical Figures”
• A great percentage of the CTMs have noise
• Using ImageMagick before indexing would
  o Reduce the noise and
  o Sharpen the edges of the figure
• Not incorporated in the prototype

Page 18: Hands on Logo Search

Tuning Figumare - 2 - Collection

 

Page 19: Hands on Logo Search

Tuning Figumare - 3

The Figumare application has been tested with degraded versions of logos and with pictures taken directly with a mobile phone camera. It has also been tested with logo URLs from the internet.

Since most of the logos of the collection are black and white, the only necessary feature is “texture”

Page 20: Hands on Logo Search

Tuning Figumare - 4

Some photos of logos to search:

Page 21: Hands on Logo Search

Figumare Search Example

 

Page 22: Hands on Logo Search

Conclusion

- Adopting a CBIR tool for logo search is possible

- The Tuning Phase is decisive in order to have a successful project

    - Clear noise from the logo collection and the logo reference
    - Correct weighting of features
    - New logo features: OCR text descriptor

   

