Audio-Based Multimedia Indexing and Retrieval Framework in MUVIS System Overview & Applications by Serkan KIRANYAZ. Tampere Univ. of Tech.


8/3/2019 Audio Based Indexing and Retrieval in MUVIS

http://slidepdf.com/reader/full/audio-based-indexing-and-retrieval-in-muvis 1/12

Audio-Based Multimedia Indexing and Retrieval Framework in MUVIS
System Overview & Applications
by Serkan KIRANYAZ, Tampere Univ. of Tech.


MUVIS Overview

[Figure: MUVIS block diagram. Recoverable labels:]
AVDatabase: capturing, encoding and recording of real-time video-audio; AV database creation; appending into databases.
DbsEditor: database creation and editing; appending and deleting image and MM files; image and MM file type conversions; FeX-AFeX management.
MBrowser: query, browsing, video summarization, display.
Shared functions: encoding, decoding, rendering.
Inputs: still images (*.jpg, *.gif, *.bmp, *.pct, *.pcx, *.png, *.pgm, *.wmf, *.eps, *.tga, *.jp2, *.yuv), real-time video-audio, stored MM (video-audio).
Database types: image database, AV database, hybrid database.
Query items: an image, a frame, a video-audio clip.
FeX modules and AFeX modules connect through the FeX & AFeX API for indexing and retrieval.


MUVIS Multimedia

MUVIS Video
File formats: AVI, MP4
Frame size: any
Frame rate: 1..25 fps
Codecs: H263+, MPEG-4, YUV 4:2:0, RGB 24

MUVIS Audio
File formats: MP3, AAC, AVI, MP4
Channels: mono, stereo
Sampling frequencies: 8, 11.025, 12, 16, 22.050, 24, 32 and 44.1 kHz
Codecs: MP3, AAC, G721, G723, PCM

MUVIS Images
JPEG, JPEG 2K, BMP, TIFF, PNG, PCX, GIF, PCT, TGA, EPS, WMF, PGM


Audio-Based Multimedia Indexing and Retrieval Framework for MUVIS

A global framework implementation that aims at a robust and generic solution for audio-based multimedia indexing and retrieval, specifically:
Generic support for audio codecs
Generic support for file formats
Generic support for audio capturing & encoding parameters
Generic support for AFeX framework parameters

The main objective is content-based (speaker, subject, “sounds like...”) retrieval of audio that agrees with human judgment and (aural) perception.


Audio Indexing Scheme in MUVIS

[Figure: audio indexing flow from the audio stream to the key-frame (KF) feature vectors. Numbered stages:]
1. Classification & segmentation per granule/frame (labels shown: Silence, Music, Speech, NotClassified).
2. Audio framing in valid classes, with classification conversion (labels shown: Uncertain, Speech, Music, NotClassified).
3. AFeX operation per frame (AFeX module).
4. KF extraction via MST clustering.
Output: KF feature vectors per class (Speech, Music, NotClassified), used for audio indexing.


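2. Audio Framing with Classification Conversion

[Figure: a sequence of per-granule/frame labels (M MMM MM M M SS S SSSSS SX XXXX) is grouped into audio frames; each audio frame receives a final classification of Music, Speech or Uncertain.]
Top row: classification per granule/frame.
Bottom row: final classification per audio frame.
Legend: M: Music; S: Speech; X: Silence; Uncertain.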
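The slide does not state the rule that turns granule labels into a frame label. As a minimal sketch only, the conversion could be a majority vote; the vote itself, the `min_share` threshold, the treatment of all-silent frames as Uncertain, and the function names are assumptions made for illustration, not the MUVIS rule.

```python
from collections import Counter

def frame_class(granule_labels, min_share=0.5):
    """Return 'Music', 'Speech' or 'Uncertain' for one audio frame.

    granule_labels: iterable of 'M' (music), 'S' (speech) or 'X' (silence).
    Hypothetical rule: silence is ignored, and the dominant label wins only
    if its share of the non-silent granules exceeds min_share.
    """
    names = {"M": "Music", "S": "Speech"}
    counts = Counter(label for label in granule_labels if label in names)
    total = sum(counts.values())
    if total == 0:
        return "Uncertain"  # assumption: an all-silent frame has no valid class
    label, n = counts.most_common(1)[0]
    return names[label] if n / total > min_share else "Uncertain"

def convert(labels, frame_len):
    """Group granule labels into frames of frame_len granules and classify each."""
    return [frame_class(labels[k:k + frame_len])
            for k in range(0, len(labels), frame_len)]
```

For example, `convert("MMMMSSSSXXXX", 4)` groups twelve granules into three frames and classifies each one separately.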


Audio Feature Extraction (AFeX) Framework

Independent AFeX modules can be integrated into the MUVIS framework for audio-based indexing and retrieval.

[Figure: DBSEditor and MBrowser access each AFeX module (AFex_*.DLL) through the interface declared in AFex_API.h. Each module provides:]
AFex_Bind()
AFex_Init()
AFex_Extract()
AFex_GetDistance()
AFex_Exit()
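The slides name the five entry points but not their signatures. The sketch below mirrors that life cycle in Python purely for illustration; every argument, return value and the `ToyEnergyModule` feature (log frame energy) are hypothetical stand-ins, not the contents of AFex_API.h.

```python
import math

class ToyEnergyModule:
    """Hypothetical stand-in for an AFex_*.DLL; real signatures are not given."""

    def bind(self):
        # Assumed role of AFex_Bind: describe the module to the host application.
        return {"name": "ToyEnergy", "dim": 1}

    def init(self, sampling_rate):
        # Assumed role of AFex_Init: receive the extraction parameters.
        self.sampling_rate = sampling_rate

    def extract(self, frame):
        # Assumed role of AFex_Extract: one feature vector per audio frame.
        energy = sum(x * x for x in frame) / len(frame)
        return [math.log(energy + 1e-12)]

    def get_distance(self, fv_a, fv_b):
        # Assumed role of AFex_GetDistance: similarity distance of two vectors.
        return math.dist(fv_a, fv_b)

    def exit(self):
        # Assumed role of AFex_Exit: release the module state.
        self.sampling_rate = None

def index_frames(module, frames, sampling_rate):
    """Run bind, init, extract per frame, then exit, as an indexing host might."""
    module.bind()
    module.init(sampling_rate)
    try:
        return [module.extract(frame) for frame in frames]
    finally:
        module.exit()
```

Keeping the distance function inside the module lets the host compare feature vectors without knowing how they are built, which is one plausible reason for exposing a distance entry point alongside the extraction one.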


Key-Framing via MST Clustering

[Figure: MST clustering of the audio frames of the utterance "SpeechLab". The clusters are labelled 'S', 'p', 'ee', 'ch', 'L', 'a' and 'b'; the numeric node and edge labels are not recoverable.]
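The slide shows the result of the clustering but not the procedure. One common way to cluster with a minimum spanning tree (MST), sketched here under assumptions, is to build the MST over the frame feature vectors, cut the edges longer than a threshold, and keep one representative frame per remaining subtree. The Euclidean metric, the `cut` threshold and the nearest-to-centroid choice of key frame are illustrative assumptions, not details taken from MUVIS.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (dist, u, v)."""
    n = len(points)
    # best[v] = (distance to the tree, tree node that realises it)
    best = {v: (math.dist(points[0], points[v]), 0) for v in range(1, n)}
    edges = []
    while best:
        v = min(best, key=lambda k: best[k][0])
        d, u = best.pop(v)
        edges.append((d, u, v))
        for w in best:
            dw = math.dist(points[v], points[w])
            if dw < best[w][0]:
                best[w] = (dw, v)
    return edges

def key_frames(points, cut):
    """Cut MST edges longer than `cut`; return one key-frame index per cluster."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for d, u, v in mst_edges(points):
        if d <= cut:  # keep short edges, which merge frames into one cluster
            parent[find(u)] = find(v)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)

    keys = []
    for members in clusters.values():
        centroid = [sum(c) / len(members)
                    for c in zip(*(points[m] for m in members))]
        keys.append(min(members, key=lambda m: math.dist(points[m], centroid)))
    return sorted(keys)
```

Only the key frames need to be stored in the index, so the number of feature vectors per clip depends on how varied the audio is rather than on its duration.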


A Sample AFeX Module Implementation: MFCC

The MFCC (Mel-Frequency Cepstrum Coefficients) AFeX module provides generic feature vectors that are independent of the following parameters:
Sampling frequency.
Number of audio channels (mono/stereo).
Audio volume level.

$$c_i = \left(\frac{2}{P}\right)^{1/2} \cdot \sum_{j=1}^{P} \log m_j \cdot \cos\left(\frac{\pi i}{N}\,(j - 0.5)\right)$$
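The cepstral step of an MFCC module can be sketched as a cosine transform of the log filter-bank energies m_j. This is a minimal sketch assuming the common form in which one filter count P serves as the sum limit, the normalisation and the divisor inside the cosine; the function name and the choice to start at i = 1 are illustrative assumptions.

```python
import math

def cepstrum(m, num_coeffs):
    """Cepstral coefficients c_1..c_num_coeffs from filter-bank energies m_1..m_P.

    c_i = sqrt(2/P) * sum_{j=1..P} log(m_j) * cos(pi * i / P * (j - 0.5))
    Assumption: the same count P is used for the sum, the scale and the cosine.
    """
    P = len(m)
    log_m = [math.log(x) for x in m]
    scale = math.sqrt(2.0 / P)
    return [
        scale * sum(log_m[j - 1] * math.cos(math.pi * i / P * (j - 0.5))
                    for j in range(1, P + 1))
        for i in range(1, num_coeffs + 1)
    ]
```

Under this form, multiplying every energy by a constant only adds a constant to each log term, and the cosine sums to zero for i = 1..P-1, so these coefficients do not change with the overall volume.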


Audio Retrieval in MUVIS

[Figure: the sub-feature vectors FV(0) ... FV(i) of the query clip are compared with the sub-feature vectors FV(0) ... FV(i) of a database clip. Feature vectors are kept per class type, and comparison takes place between matching class types.]

For each frame, a search is done to find the matching frame that gives the minimum distance.


Audio Retrieval in MUVIS (cont.)

To perform an audio-based query within MUVIS, an audio clip is chosen from a multimedia database and queried through the database, provided the database includes at least one audio feature.

Let NoS be the number of feature sets for a database and let NoF(s) be the number of sub-features of feature set s. Sub-features are obtained by changing the AFeX module parameters or the audio frame size during the audio feature extraction process.

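$$D_i(s,f) = \begin{cases} \min\limits_{j \in C_i} SD\big(QFV_i(s,f),\, DFV_j(s,f)\big) & \text{if } C_i \neq \emptyset \\ 0 & \text{if } C_i = \emptyset \end{cases}$$

$$D(s,f) = \sum_{i \in C} D_i(s,f)$$

$$QD = \sum_{s}^{NoS} \sum_{f}^{NoF(s)} W(s,f) \times D(s,f)$$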
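The query distance can be sketched as follows: for every sub-feature, each query frame is matched to the closest database frame of the same class type, the per-frame minima are summed, and the sums are weighted and added over all feature sets and sub-features. This is a sketch under assumptions: the Euclidean metric stands in for the similarity distance SD, and the dictionary layout, function names and weights are hypothetical.

```python
import math

def sub_feature_distance(query, clip):
    """Distance for one sub-feature (s, f).

    query, clip: dict mapping a class type to a list of frame feature vectors.
    A query frame whose class has no frames in the clip contributes 0.
    """
    total = 0.0
    for cls, query_frames in query.items():
        candidates = clip.get(cls, [])
        for q in query_frames:
            if candidates:
                # Euclidean distance is an assumed stand-in for SD.
                total += min(math.dist(q, d) for d in candidates)
    return total

def query_distance(query_sets, clip_sets, weights):
    """Weighted sum over feature sets s and sub-features f (nested lists [s][f])."""
    return sum(
        weights[s][f] * sub_feature_distance(query_sets[s][f], clip_sets[s][f])
        for s in range(len(query_sets))
        for f in range(len(query_sets[s]))
    )
```

A database would then be ranked by computing `query_distance` between the query clip and every indexed clip and sorting the clips by ascending distance.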


Conclusions & Remarks

Audio is important. Sometimes it bears more semantic and content information than video.

Hence, the preliminary results show the effectiveness of audio-based retrieval compared to visual retrieval (similar or better results).

The classification and segmentation algorithm has recently been improved. A new approach based on fuzzy regions and semantic-rule-based classification with intra-segment boundary detection has been developed.