31
Andreas Schmidt - MMEDIA/MOPAS Panel 2013 1/16 Progress in Description and Retrieval of Complex Multimedia Information The Fifth International Conferences on Advances in Multimedia The Fourth International Conference on Models and Ontology-based Design of Protocols, Architectures and Services April 21 - 26, 2013 - Venice, Italy Moderator: Andreas Schmidt IWI - Karlsruhe University of Applied Sciences, IAI - Karlsruhe Institute of Technology, Germany Panellists: Fabrizio Falchi ISTI - National Research Council (CNR) Pisa, Italy David Newell SSRG - Bournemouth and Poole College, UK Andreas Schmidt (see above) Mak Sharma TEE - Birmingham City University Panel

Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 1/16

Progress in Description and Retrieval of Complex Multimedia Information

The Fifth International Conferences on Advances in Multimedia

The Fourth International Conference on Models and Ontology-based Design of Protocols, Architectures and Services

April 21 - 26, 2013 - Venice, Italy

Moderator: Andreas Schmidt IWI - Karlsruhe University of Applied Sciences, IAI - Karlsruhe Institute of Technology, Germany

Panellists: Fabrizio Falchi ISTI - National Research Council (CNR) Pisa, Italy David Newell SSRG - Bournemouth and Poole College, UK Andreas Schmidt (see above)Mak Sharma TEE - Birmingham City University

Panel

Page 2: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 2/16

A very short introduction for „non multimedians“ ...

Page 3: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 3/16

Overview of a MIR System - Data

Progress in Description and Retrieval of Complex Multimedia Information

textsoundvideoimages3D-modelsbiological data

combination of above

example: filmvideo informationaudio informationsubtitlesothers

Page 4: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 4/16

Overview of a MIR System

• How can this data be queried ?

• „Typical“ database query:

Give my all autofocus cameras with at least 5 Megapixels, that cost not more than 200 €.

Easy ! why ?: data is stored at a semantic level, equal to a typical query

• Problem: Multimedia raw data is stored at a very low semantical level (typically compressed)

• To query multimedia data we need an appropriate description of the data

• Content description for a multimedia object can be extracted or manually added

Page 5: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 5/16

Progress in Description and Retrieval of Complex Multimedia Information

Overview of a MIR System - Description

automatically extracted features

- color- texture- shape

like (i.e. pictures):context information

semantic concepts (i.e. car, plane, face, crowd of people, ...)

- EXIF parameters (i.e. gps data)- surrounding text- caption

semantic gap

Page 6: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 6/16

Progress in Description and Retrieval of Complex Multimedia Information

Overview of a MIR System - Retrieval

Conventional queries:

• Point query

• Analytical query

• range query (without ranking)

Multimedia query:

• often (k-) nearest neighbor

• query is interpreted as (virtual) multimedia object

• ranking of results

• definition of similarity measure

Page 7: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 7/16

Progress in Description and Retrieval of Complex Multimedia Information

Overview of a MIR System - Retrieval

Query By Example (QBE)Query adaptive search strategiesInteractive retrievalPersonalized searchBrowsing

Page 8: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 8/16

Overview of a MIR System - Summary

• Multimedia objects cannot be queried directly

• Appropriate descriptors have to be found

• Low level

• contextual

• semantic level

• Challenges [1]:

• data representation

• similarity measure

• scalability

• Very interdisciplinary work:

• signal processing, pattern recognition,machine learning, information retrieval,NLP, HCI, psychology, computional vision, optimization theory, ...

Semantic gap

Page 9: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 9/16

Sources

[1] Eduard Y. Chang: Foundations of Large-Scale Multimedia Information Manage-ment and Retrieval: Mathematics of Perception. Springer, 2011

[2] A. Hanjalic, R. Lienhart, W. Y. Ma, J. R. Smith: The Holy Grail of Multimedia Information Retrieval: So Close or Yet So Far Away? Proceedings of the IEEE, Vol. 96, No. 4., 2008

Page 10: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 10/16

Aspect Retrieval Performance ...

Page 11: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 11/16

Aspect: Retrieval Performance

• Curse of Dimensionality

• Dimension > 20: no k-nearest neighbor (k-nn) algorithm is significantly fasterthen a linear scan [1]

• For high dimensional range queries, traditional multidemensional indexesare slower than a linear scan [2]

• Why ? - Exponential grow of minimal bounding regions (MBR)

• Solutions:

• Reduction of dimensionality (i.e. PCA1, ICA2)

• Appropriate memory layout:

• Approximate nearest neighbor search algorithms• Alternate index structure: Binned, compressed bitmap vector

1. Principal component analysis2. Independant componment analysis

Page 12: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16

Aspect: Retrieval Performance

random, disk

sequential, disk

random, SSD

sequential, SSD

random, memory

sequential, memory

1.26 mbytes/sec

212.8 mbytes/sec

168.8 mbytes/sec

146.8 mbytes/sec

1432.8 mbytes/sec

7.69 mbytes/sec

Comparision of random and sequential memory access

101 102 106103 104 105 107 108 109(bytes)

Source: Adam Jacobs: The Pathologies of Big Data, acmqueue, July 2009

Page 13: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 13/16

Aspect: Retrieval Performance

Approximate k-nn Algorithm SphereDex:

• Points are partitioned based on distance to a central point

• Make IO access sequential

• Intra partition index for effecive pruning

• Partition fits in L2 cache

• Inter partition index for finding good starting point in neighbor partition

• Neighboring partitions are sequentially stored on disk

• About one order of magnitude faster than M-Tree, LSH

Source: Edward Y. Chang, Approximate High-Dimensional Indexing with Kernel. In Foundations of Large-Scale Multime-dia Information Management and Retrieval - Mathematics of Perception, Edward Y. Chang (editor), Springer, 2011.

Page 14: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 14/16

Aspect: Retrieval Performance

Fastbit Bitmap Index [5, 6]:

• based on WAH algorithm

• Multidimensional queries are mapped on AND/OR operations of (compres-sed) bitmaps

• significantly reduction of I/O

• pure sequential IO

• Binning to reduce number of bitmaps

• Further improvement with Order-preserving Bin-based Clustering (OrBiC) structure [4] (3-25 times faster than with best known multidimensio-nal data structures)

Source: Kesheng Wu, Kurt Stockinger, Arie Shoshani: Breaking the Curse of Cardinality on Bitmap Indexes. LectureNotes in Computer Science 5069 Springer 2008, pages 348-365

Page 15: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 15/16

Aspect: Retrieval Performance

Quintessence

• Sequential disk access is magnitudes faster than random access

• Appropriate memory layout can speed up the application significantly

• Approximate k-nearest neighbor is sufficiant in most cases

• feature vector is also an approximation

• results are inspected by humans

• Bitmaps: Binnig and run length encoding is an appropriate combination to resolve the curse of dimensionality

Disadvantage of the proposed solutions: For read mostly applications only (append only)

Page 16: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Andreas Schmidt - MMEDIA/MOPAS Panel 2013 16/16

Sources

[1] P. Indyk, R. Motwani, Approximate nearest neighbors: towards removing the curse of dimensionality, in Proceedings of VLDB, 1998, pp. 604–613

[2] K. Beyer, J. Goldstein, R. Ramakrishnan, U. Shaft, When is x “nearest neighbor” meaning-ful? in Proceedings of ICDT, 1999, pp. 217–235

[3] Adam Jacobs: The Pathologies of Big Data, acmqueue, July 2009[4] Kesheng Wu, Kurt Stockinger, Arie Shoshani: Breaking the Curse of Cardinality on Bitmap

Indexes. Lecture Notes in Computer Science 5069 Springer 2008, pages 348-365[5] Kurt Stockinger, Kesheng Wu, and Arie Shoshani: Evaluation Strategies for Bitmap Indices

with Binning, (DEXA 2004), Lecture Notes in Computer Science 3180 Springer 2004.[6] D Rotem, K Stockinger, Wu: Efficient binning for bitmap indices on high cardinality

attributes, 2004[7] Edward Y. Chang, Approximate High-Dimensional Indexing with Kernel. In Foundations of

Large-Scale Multimedia Information Management and Retrieval - Mathematics of Percep-tion, Edward Y. Chang (editor), Springer, 2011.

Page 17: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Digital Audiovisual Media Preservation

Fabrizio [email protected]

Page 18: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Audiovisuals Digital Preservation

• changes in the hardware, software, and physical media used to process and store digital data will continue into the indefinite future

• Most AV files are compressed. Whatever ‘original quality’ was lost in compression will remain lost. Preservation should maximiseretention of quality, a capability that needs to be defined and added to current technology.

Richard Wright, BBC

• we need preservation metadata in any processes that impact the communication of the digital information over time and ensure the generation of additional preservation metadata that describe the processing and its results

MP-AF Draft

Page 19: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Presto4U

FP7 Coordination Action

The aim of the project is to:

identify useful results of research into digital audiovisual preservation and to raise awareness and improve the adoption of these both by technology and service providers as well as media owners.

Presto4U is building Communities of Practices of audiovisualpreservation related organizations.

We are also searching for audiovisual for science producers and archives to understand their preservation needs.

Page 20: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

MPEG

• Proposed Data Model and Representationfor Multimedia Preservation Application Format (MP-AF)• specifies the lossless representation of metadata relevant to

preservation of multimedia information for exchange of multimedia content between multimedia archives

• is specified for exchange multimedia contents among multimedia archives, avoiding the loss of metadata for preservation

Page 21: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Fabrizio FalchiISTI-CNR, [email protected]

Page 22: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Bournemouth University

Dr Philip DaviesDavid Newell

MMEDIA 2013Venice, Italy

Page 23: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Query Methods Research for ImageDatabases is Widespread

QBIC (Query by Image Content) content-based retrievalmethod

– QBIC queries large image and video databases using:– example images, sketches and drawings,– selected colour and texture patterns, camera and object

motion and other graphical informationmotion and other graphical information

– (1) use of image and video content-computableproperties of colour, texture, shape and motion ofimages, videos and their objects in the queries

– (2) Use of graphical query language, in which queriesare posed by drawing and selecting user-constructedgraphics

Source: QBIC technology is part of several IBM products

MMEDIA2013 - David Newell,Bournemouth University

Page 24: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Adaptive MM Interface(AMPS)

MMEDIA2013 - David Newell,Bournemouth University

Page 25: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Towards Ontology Calculus

Query method based on ‘Semantic Descriptor’QBSD (Query by Semantic Descriptor) learning object content-based

retrieval method– QBSD queries large learning object video databases using:

– Ontology meta-level description of curriculum, learning objects,fragments, user-constructed questions, and FAQs

– selected lessons, results of MPQ tests, and other learning or evaluation– selected lessons, results of MPQ tests, and other learning or evaluationmeta-data used to adaptively retrieve learning objects

– (1) use of text, image and video content meta properties -difficulty, level, content type, combine responses to questions,produce supplementary videos and learning objects from queries

– (2) Use of an adaptive semantic description is a form of querylanguage, from which queries are processed, and used to displayfragments of animated graphics, images and text

Source: AMPS technology is part of several BU prototypesdisseminated at MMEDIA2009-2013.

MMEDIA2013 - David Newell,Bournemouth University

Page 26: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Computing

Connectivity (Comms)

Content

CTN …together we can succeed and make a difference …

Technologies C3

Mak Sharma, BEng, PgC, PgCE, FHEA, MIET, MBCS.

Head of School.

1 [email protected] - MMEDIA 2013,

The Fifth International Conferences on Advances in Multimedia, Venice.

Page 27: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Progress in Description and Retrieval of Complex Multimedia Information

What are we Retrieving ?

How do we learn KAV ?

How do we remember ?

How can we do ?

What will it cost ?

What are the casualties ?

Throw technology at it

[email protected] - MMEDIA 2013, The Fifth International Conferences on

Advances in Multimedia, Venice. 2

Page 28: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

• Over 60 hours of video are uploaded every minute, or one hour of video is uploaded to YouTube every second.

• Over 4 billion videos are viewed a day • Over 800 million unique users visit YouTube each month • Over 3 billion hours of video are watched each month on YouTube • More video is uploaded to YouTube in one month than the 3 major US networks created in

60+ years • YouTube is localized in 39 countries and across 54 languages • 1+ trillion views • 140+ views for every person on Earth

• 500 years of YouTube video are watched every day on Facebook • Etc Etc ….

Read more at http://www.jeffbullas.com/2012/05/23/35-mind-numbing-youtube-facts-figures-and-statistics-infographic/#ESOLPFtylPS5KidR.99

YouTube

[email protected] - MMEDIA 2013, The Fifth International Conferences on

Advances in Multimedia, Venice. 3

Page 29: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

• Developed from adding video clips to moodle

• More defined aims and objective within each module via BCU+ gives focused delivery of content

• Focuses students to the task

A better way of learning ?

Video everything !!!! [email protected] - MMEDIA 2013,

The Fifth International Conferences on Advances in Multimedia, Venice.

4

Page 30: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

What are considering ?

HP Cloud Amazon Cloud Microsoft Cloud Cisco Servers - Cloud solutions & define your own optimised network SDN

What is reality or are we in the Cloud ? £

[email protected] - MMEDIA 2013, The Fifth International Conferences on

Advances in Multimedia, Venice. 5

Page 31: Panel Progress in Description and Retrieval of Complex ......Andreas Schmidt - MMEDIA/MOPAS Panel 2013 12/16 Aspect: Retrieval Performance random, disk sequential, disk random, SSD

Mak Sharma, BEng, PgC, PgCE, FHEA, MIET, MBCS.

Head of School.

[email protected] - MMEDIA 2013, The Fifth International Conferences on

Advances in Multimedia, Venice. 6