SAPIR: Search in Audio-visual content using P2P IR
Yosi Mass, Raul Santos
Chorus cluster meeting, Vilamoura 16-17 April 2008 2
Why SAPIR?
The searchable space created by the growing amounts of existing video and multimedia files may greatly exceed the area searched by major engines.
Traditional search engines are limited to searching in the associated text and meta-data of the multimedia content. If content providers don't clearly or accurately describe their multimedia files, or use inaccurate tags, the current method falls short.
Current internet search is geared mainly to relatively powerful desktop machines and accessed via regular web browsers, not lightweight mobile devices with their connectivity and interactivity limitations.
SAPIR Objectives
Develop cutting-edge technology to index and search large-scale audio-visual information by content.
Make information available on many devices, enhanced by social networking, while keeping privacy and preventing fraud.
Support new trends in MM content production: personal producers vs. professional producers.
SAPIR challenges
Dimensions of the search problem
• Efficiency (scalability is the key issue)
• Effectiveness (quality measures of results)
Efficiency challenges
• Scale in collection size
• Scale in number of users
Effectiveness challenges
• New search paradigm combining text + audio-visual content
• Usability challenges
SAPIR Consortium
Organization  Activity type       Country         Nr. Employees  RTD Person Months
IBM           IND                 Israel          621            88
CNR           Research Institute  Italy           5962           83
MPI           Research Institute  Germany         150            64
UPD           University          Italy           2234           49
Eurix         SME                 Italy           30             66
Xerox         IND                 France          2080           17
MU-Brno       University          Czech Republic  908            46
TID           IND                 Spain           1265           66
Telenor       IND                 Norway          674            29
SAPIR approach: P2P Architecture
Search using the Query by Example Paradigm
• Search for information about a physical object by taking an image of it with a mobile phone or find a song by humming the melody.
• Support similarity search for metric spaces
Image Database
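As a rough illustration of query by example in a metric space, the sketch below ranks a small in-memory collection by distance to the feature vector of a query image. The feature vectors, the choice of Euclidean distance, and the names `euclidean` and `knn_search` are assumptions made for illustration; they are not SAPIR's actual descriptors, distance functions, or index.

```python
import heapq
import math

def euclidean(a, b):
    """Euclidean distance, one example of a metric (it obeys the triangle inequality)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_search(query, database, k, distance=euclidean):
    """Return the k objects closest to `query` as (distance, object_id) pairs.

    This is a plain linear scan. A metric index would instead use the
    triangle inequality to avoid comparing the query against every object.
    """
    scored = ((distance(query, features), object_id)
              for object_id, features in database.items())
    return heapq.nsmallest(k, scored)

# Hypothetical 3-dimensional feature vectors standing in for image descriptors.
image_database = {
    "img_a": (0.0, 0.0, 0.0),
    "img_b": (1.0, 0.0, 0.0),
    "img_c": (0.0, 3.0, 4.0),
}

# The query vector plays the role of the photo taken with the mobile phone.
nearest = knn_search((0.9, 0.0, 0.0), image_database, k=2)
```

Only the distance function has to change to support another media type, which is the appeal of treating similarity search as a metric-space problem.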
Feature extraction
<SapirMMObject>
  <title>when waves collide</title>
  <Mpeg7>
    <VisualDescriptor type="ScalableColorType">…</VisualDescriptor>
    <VisualDescriptor type="ColorStructureType">…</VisualDescriptor>
    <VisualDescriptor type="ColorLayoutType">…</VisualDescriptor>
    <VisualDescriptor type="EdgeHistogramType">…</VisualDescriptor>
    <VisualDescriptor type="HomogeneousTextureType">…</VisualDescriptor>
  </Mpeg7>
  <comments>
    <comment id="…" author="…">beautiful…</comment>
    <comment ...>very powerful…</comment>
  </comments>
  <tags>
    <tag id="254" author="12@N00">waves</tag>
    <tag …>Victoria beach</tag>
  </tags>
</SapirMMObject>
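To make the shape of the extracted object concrete, here is a minimal sketch that reads a `SapirMMObject` document and separates its visual part (the descriptor types) from its text part (title, tags, comments). The element and attribute names come from the example above; the sample document is a simplified stand-in with empty descriptor payloads, and the function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A well-formed stand-in for the example object. Descriptor payloads are left
# empty and the elided attributes are omitted, because the slide does not show them.
SAMPLE = """
<SapirMMObject>
  <title>when waves collide</title>
  <Mpeg7>
    <VisualDescriptor type="ScalableColorType"/>
    <VisualDescriptor type="EdgeHistogramType"/>
  </Mpeg7>
  <comments>
    <comment>beautiful</comment>
  </comments>
  <tags>
    <tag id="254" author="12@N00">waves</tag>
    <tag>Victoria beach</tag>
  </tags>
</SapirMMObject>
"""

def summarize(xml_text):
    """Split an object into its visual part (descriptor types) and its text part."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "descriptor_types": [d.get("type") for d in root.iter("VisualDescriptor")],
        "tags": [t.text for t in root.iter("tag")],
        "comments": [c.text for c in root.iter("comment")],
    }

summary = summarize(SAMPLE)
```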
Indexing
<SapirMMObject>
  <title>when waves collide</title>
  <Mpeg7>
    <VisualDescriptor type="ScalableColorType">…</VisualDescriptor>
    <VisualDescriptor type="ColorStructureType">…</VisualDescriptor>
    <VisualDescriptor type="ColorLayoutType">…</VisualDescriptor>
    <VisualDescriptor type="EdgeHistogramType">…</VisualDescriptor>
    <VisualDescriptor type="HomogeneousTextureType">…</VisualDescriptor>
  </Mpeg7>
  <comments>
    <comment id="…" author="…">beautiful…</comment>
    <comment ...>very powerful…</comment>
  </comments>
  <tags>
    <tag id="254" author="12@N00">waves</tag>
    <tag …>Victoria beach</tag>
  </tags>
</SapirMMObject>
Visual Descriptors Overlay
Metric index
Text Overlay
Text index
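The split into two overlays can be pictured with a toy, single-process sketch: the visual descriptors of an object go to a metric index and its text goes to an inverted text index. The classes `MetricIndex` and `TextIndex`, the function `index_object`, and the descriptor values are all hypothetical; the real overlays are distributed over peers and are not described in enough detail here to reproduce.

```python
from collections import defaultdict

class TextIndex:
    """Toy inverted index: term -> set of object ids."""
    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, object_id, texts):
        for text in texts:
            for term in text.lower().split():
                self.postings[term].add(object_id)

    def lookup(self, term):
        return self.postings.get(term.lower(), set())

class MetricIndex:
    """Toy stand-in for a metric index: stores one vector per descriptor type."""
    def __init__(self):
        self.vectors = defaultdict(dict)

    def add(self, object_id, descriptors):
        for descriptor_type, vector in descriptors.items():
            self.vectors[descriptor_type][object_id] = vector

def index_object(object_id, descriptors, texts, metric_index, text_index):
    """Send the visual part to the metric index and the text part to the text index."""
    metric_index.add(object_id, descriptors)
    text_index.add(object_id, texts)

metric_index, text_index = MetricIndex(), TextIndex()
index_object(
    "photo-1",
    # Made-up two-dimensional vectors; real MPEG-7 descriptors are much larger.
    descriptors={"ScalableColorType": (0.2, 0.7), "EdgeHistogramType": (0.1, 0.4)},
    texts=["when waves collide", "waves", "Victoria beach"],
    metric_index=metric_index,
    text_index=text_index,
)
```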
Querying
Tag: names
<Mpeg7Query weight="1">
  <VisualDescriptor type="ScalableColorType">…</VisualDescriptor>
  <VisualDescriptor type="ColorStructureType">…</VisualDescriptor>
  <VisualDescriptor type="ColorStructureType">…</VisualDescriptor>
</Mpeg7Query>
<Mpeg7Query weight="0.5">
  <tag>waves</tag>
</Mpeg7Query>
Visual Descriptors Overlay
Text Overlay
MergeResults
Approximation
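One simple way to picture the merge step is a weighted sum of the scores returned by the two overlays, using the weights from the query above (1 for the visual part, 0.5 for the text part). This is only a sketch under that assumption: the merge and approximation strategy actually used is not specified here, and the function name `merge_results` and all scores are invented for illustration.

```python
def merge_results(visual_hits, text_hits, visual_weight=1.0, text_weight=0.5, k=3):
    """Combine two {object_id: score} maps by weighted sum; higher scores rank first.

    An object missing from one result list contributes 0 for that modality.
    Ties are broken by object id so the ordering is deterministic.
    """
    combined = {}
    for object_id, score in visual_hits.items():
        combined[object_id] = combined.get(object_id, 0.0) + visual_weight * score
    for object_id, score in text_hits.items():
        combined[object_id] = combined.get(object_id, 0.0) + text_weight * score
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]

# Made-up similarity scores in [0, 1] from each overlay.
visual_hits = {"a": 0.9, "b": 0.5, "c": 0.2}
text_hits = {"b": 1.0, "c": 0.8, "d": 0.6}

top = merge_results(visual_hits, text_hits)
```

With these numbers, object `b` ranks first because it scores reasonably well in both modalities, even though `a` has the best visual score alone.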
Project status for Apr 2008
• A scalable, extensible and versatile architecture for P2P was defined.
• APIs for P2P content management, indexing and search were defined and implemented.
• Several scenarios were defined and tested in focus groups.
• A common schema for feature representation using MPEG-7 was defined.
• A demo for indexing and search in 10M Flickr files, combining content-based image search with text and metadata, was implemented using the SAPIR APIs.
• A testbed of 50M Flickr files was crawled by the EGEE grid, aiming at 100M towards the year end. This testbed collection will be available for scientific experiments (CoPhir, http://cophir.isti.cnr.it).
• The next demo (due Nov '08) will include search in music, video and speech as well as some scenario integration.
Tests
P2P architecture for search in Audio-Visual content
Efficiency: some initial results
• 1M FlickrXML files: ~500 msec per query, 50 peers (8 CPU, 16 GB)
• 10M FlickrXML files: ~500 msec per query, 500 peers (16 CPU, 64 GB)
Effectiveness
• Text + image improves over text or image only
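One way to read the two efficiency data points is to compare the average number of files per peer in each setup. The short calculation below assumes the files are spread evenly over the peers, which is not stated above; `files_per_peer` is a hypothetical helper.

```python
def files_per_peer(total_files, peers):
    """Average number of files each peer holds, assuming an even distribution."""
    return total_files / peers

small = files_per_peer(1_000_000, 50)     # 1M files on 50 peers
large = files_per_peer(10_000_000, 500)   # 10M files on 500 peers
```

Under that assumption both setups hold 20,000 files per peer, so the collection grew tenfold while the per-peer load and the reported query time stayed about the same.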
WP9 – Dissemination and exploitation
Public website
• http://www.sapir.eu
Dissemination
• First DUP was published
• Participate in Chorus meetings and road map
• Workshops – SIGIR'07, ECIR'08, SAC'08
• Demos
Publications
• More than 20 SAPIR related publications so far
Contacts with Standards Bodies
• MPEG-21, MPEG-A, MPEG-7
Exploitation
WP9 – Dissemination and exploitation
Proposed contribution to standards
• Extension to MPEG-7 for music and speech
• Proposals for MPQF (MPEG-7 Query Format)
• A DRM implementation for P2P based on Chillout
• Propose a call for MPEG-21 Query Format
Thank You!
For more info visit http://www.sapir.eu
Results (Jan 2007 – Mar 2008)
• WP1 – Scenarios and a complete guideline for usability and user interface design
• WP2 – Architecture for P2P and APIs
• WP3 – Definition of a common schema for feature representation using MPEG-7
• WP4, WP5 – Demo of indexing and search in 10M Flickr files combining text and low level visual descriptors
• WP6 – Work on interoperable DRM solution (Chillout) for P2P networks
• WP7 – Initial design of social networking and support for mobile devices