Visual instance mining of news videos using a graph-based approach

DESCRIPTION

Full details: https://imatge.upc.edu/web/publications/visual-instance-mining-news-videos-using-graph-based-approach

Author: David Almendros-Gutiérrez
Advisors: Xavier Giró-i-Nieto (UPC) and Horst Eidenberger (TU Wien)
Degree: Telecommunications Engineering (5 years) at Telecom BCN-ETSETB (UPC)

The aim of this thesis is to design a tool that performs visual instance mining for news video summarization, that is, a tool that extracts the relevant content of a video so that the storyline of the news can be recognized. First, the video is sampled to obtain frames at a desired rate. Then, relevant content is detected in each frame, focusing on faces, text and several objects that the user can select. Next, a graph-based clustering method is used to recognize the detected instances with high accuracy and to select the most representative ones for the visual summary. In addition, a graphical user interface was developed in Wt to provide an online demo of the application.

During development, the tool was tested with the CCMA dataset. We prepared a web-based survey based on four results from this dataset to gather the opinion of users. We also validated our visual instance mining results by comparing them with those obtained with a video summarization algorithm developed at Columbia University. We ran the algorithm on a dataset of a few videos covering two events: the 'Boston bombings' and the 'search of the Malaysian airlines flight'. A second web-based survey let users compare our approach with this related work. With these surveys we analyze whether our tool fulfills the requirements we set. We conclude that our system extracts visual instances that show the most relevant content of news videos and can be used to summarize these videos effectively.

BY DAVID ALMENDROS GUTIÉRREZ

DIRECTED BY HORST EIDENBERGER AND XAVIER GIRÓ-I-NIETO

2013-2014

VISUAL INSTANCE MINING OF NEWS VIDEOS USING A GRAPH-BASED APPROACH

2

CONTENTS

Introduction
State of the art
Requirements analysis
Developed solution
Evaluation
Future work

3

CONTENTS

Introduction
  • Motivation
State of the art
Requirements analysis
Developed solution
Evaluation
Future work

4

INTRODUCTION

5

INTRODUCTION

Motivation

Manel Martos’s Thesis (2013): “Content-based video summarization oriented to movie trailers”

6

INTRODUCTION

News domain

• Websites

• News bulletins

• Newspaper

7

CONTENTS

Introduction
State of the art
  • Visual instance mining
  • News summarization
Requirements analysis
Developed solution
Evaluation
Future work

8

STATE OF THE ART

Visual instance mining

From a video

From a large collection of images

* Wei Zhang et al, "Scalable Visual Instance Mining with Threads of Features" (ACM MultiMedia 2014)

9

STATE OF THE ART

News summarization

News Rover *
Developed at Columbia University

* H. Li et al, "News rover: exploring topical structures and serendipity in heterogeneous multimedia news" (ACM MultiMedia 2013)

10

CONTENTS

Introduction
State of the art
Requirements analysis
  • Content requirements
  • Structural requirements
Developed solution
Evaluation
Future work

11

REQUIREMENTS ANALYSIS

Content requirements

Barack Obama, President of the USA

Núria Solé, anchorwoman of TV3 news

Flag

Fire truck

12

REQUIREMENTS ANALYSIS

Structural requirements

13

CONTENTS

Introduction
State of the art
Requirements analysis
Developed solution
  • Environment
  • System architecture overview
  • Temporal sampling
  • Instance detection
  • Graph-based selection
  • Presentation
Evaluation
Future work

14

DEVELOPED SOLUTION

Environment

15

DEVELOPED SOLUTION

System architecture overview

16

DEVELOPED SOLUTION

Temporal sampling
From the user’s desired frame rate
Uniform sampling
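As an illustration of uniform sampling at a user-selected rate, the sketch below computes which frame indices would be kept. The function name and its parameters (`total_frames`, `native_fps`, `desired_fps`) are assumptions made for this example and are not taken from the thesis.

```python
def sample_frame_indices(total_frames, native_fps, desired_fps):
    """Return the indices of frames picked uniformly at roughly desired_fps."""
    if native_fps <= 0 or desired_fps <= 0:
        raise ValueError("frame rates must be positive")
    # A video cannot be sampled faster than its own frame rate.
    rate = min(desired_fps, native_fps)
    indices = []
    k = 0
    while True:
        # k-th sample, expressed as a frame index of the original video.
        index = int(k * native_fps / rate)
        if index >= total_frames:
            break
        indices.append(index)
        k += 1
    return indices


if __name__ == "__main__":
    # Hypothetical 4-second clip at 25 fps, sampled at 1 frame per second.
    print(sample_frame_indices(100, 25, 1))
```

A lower desired rate gives fewer frames to analyze in the detection stage, at the cost of possibly missing short-lived instances.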

17

DEVELOPED SOLUTION

Instance detection
Face detection

Viola & Jones algorithm

detectMultiScale method

18

DEVELOPED SOLUTION

Object detection
SURF descriptors and matching

Training images

19

DEVELOPED SOLUTION

1. Keypoints & SURF descriptors of training images

2. Keypoints & SURF descriptors of frames

3. Matching
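The matching step and a threshold-based decision could look roughly like the sketch below. It is a simplified stand-in, not the thesis implementation: descriptors are plain tuples of floats rather than real SURF descriptors, matching is a brute-force nearest-neighbour search, and both `max_distance` and `detection_threshold` are hypothetical parameters chosen for illustration.

```python
import math


def match_fraction(train_desc, frame_desc, max_distance):
    """Fraction of training descriptors with a close enough frame descriptor."""
    if not train_desc or not frame_desc:
        return 0.0
    matched = 0
    for t in train_desc:
        # Brute-force nearest neighbour using Euclidean distance.
        nearest = min(math.dist(t, f) for f in frame_desc)
        if nearest <= max_distance:
            matched += 1
    return matched / len(train_desc)


def object_detected(train_desc, frame_desc, max_distance=0.25,
                    detection_threshold=0.3):
    """Heuristic decision: enough training descriptors were matched."""
    return match_fraction(train_desc, frame_desc, max_distance) >= detection_threshold


if __name__ == "__main__":
    # Toy 2-D descriptors; real SURF descriptors have 64 or 128 dimensions.
    train = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    frame = [(0.0, 0.1), (1.0, 1.1), (9.0, 9.0)]
    print(match_fraction(train, frame, 0.25))
    print(object_detected(train, frame))
```

Raising the detection threshold trades missed detections for fewer false positives, which is why it has to be tuned per object class.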

20

DEVELOPED SOLUTION

Heuristic decision

[Figure: "Test with ambulances". x-axis: detection threshold; y-axis: % correct detections]

[Figure: "Test with police cars". x-axis: detection threshold; y-axis: % correct detections]

21

Edge detection

DEVELOPED SOLUTION

Text detection
Stroke-width-based algorithm

Results

The stroke width of every pixel is computed

22

DEVELOPED SOLUTION

Graph-based selection of representative instances

Pre-processing
Increases the accuracy

Original, Grayscale, Cropped, Resized, Equalized

Pre-processing

Feature extraction

Similarity graph

Clustering

Selection

23

Feature extraction
LBPH

Histogram comparison
Histogram intersection
Chi-square distance

Similarity value

DEVELOPED SOLUTION

With 𝛼 = 1
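The two histogram comparison measures named on this slide can be sketched as follows. The histograms are assumed to be normalized LBPH feature vectors given as plain lists, and the chi-square form used here (symmetric, dividing by the sum of both bins) is one common variant; the thesis may use a different one. The sketch covers only the two measures and does not show how they are combined into a single similarity value.

```python
def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    if total == 0:
        raise ValueError("empty histogram")
    return [v / total for v in hist]


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms: 1.0 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi_square_distance(h1, h2):
    """Symmetric chi-square distance: 0.0 means identical."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


if __name__ == "__main__":
    # Toy 3-bin histograms; real LBPH vectors are much longer.
    h1 = normalize([2, 2, 4])
    h2 = normalize([4, 2, 2])
    print(histogram_intersection(h1, h2))
    print(chi_square_distance(h1, h2))
```

Note the opposite orientation of the two measures: intersection grows with similarity, while chi-square distance shrinks.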

24

DEVELOPED SOLUTION

Similarity graph (Full connectivity)

Node = Visual instance

Awn = Visual similarity

25

Clustering by Edge filtering

DEVELOPED SOLUTION

Similarity value > Threshold

Subgraphs

Heuristic decision
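Clustering by edge filtering can be illustrated with the sketch below: starting from a fully connected similarity graph, only edges whose similarity exceeds the threshold are kept, and each remaining connected subgraph becomes one cluster. The matrix representation and the function name are assumptions made for this example.

```python
def cluster_by_edge_filtering(similarity, threshold):
    """Group nodes into the connected subgraphs left after edge filtering.

    similarity is a symmetric n x n matrix (list of lists) where
    similarity[i][j] is the visual similarity between instances i and j.
    """
    n = len(similarity)
    # Keep only the edges whose similarity value exceeds the threshold.
    neighbors = {
        i: [j for j in range(n) if j != i and similarity[i][j] > threshold]
        for i in range(n)
    }
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first traversal collects one connected subgraph.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    # Toy graph with four visual instances.
    sim = [
        [1.0, 0.9, 0.1, 0.2],
        [0.9, 1.0, 0.15, 0.1],
        [0.1, 0.15, 1.0, 0.8],
        [0.2, 0.1, 0.8, 1.0],
    ]
    print(cluster_by_edge_filtering(sim, 0.5))
```

A threshold that is too low merges everything into one subgraph, and one that is too high leaves every instance isolated, which is consistent with the threshold being set heuristically.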

26

DEVELOPED SOLUTION

Selection of the representative visual instances

Mutual reinforcement
Scores

Number of nodes > Threshold

Time of appearance

Heuristic decision
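One possible reading of the selection step is sketched below. It assumes that mutual reinforcement means each node's score is repeatedly recomputed from the similarity-weighted scores of the other nodes in its cluster, and that clusters with too few nodes are discarded. Both the scoring rule and the parameter `min_nodes` are assumptions for illustration; the time-of-appearance criterion is not modeled.

```python
def reinforcement_scores(similarity, nodes, iterations=50):
    """Iteratively score nodes by the similarity-weighted scores of their peers."""
    scores = {i: 1.0 for i in nodes}
    for _ in range(iterations):
        updated = {
            i: sum(similarity[i][j] * scores[j] for j in nodes if j != i)
            for i in nodes
        }
        norm = sum(updated.values())
        if norm == 0:
            # Isolated node or zero similarities: keep the current scores.
            break
        scores = {i: v / norm for i, v in updated.items()}
    return scores


def select_representatives(similarity, clusters, min_nodes=1):
    """Pick the best-scored node of every cluster with more than min_nodes nodes."""
    representatives = []
    for nodes in clusters:
        if len(nodes) <= min_nodes:
            continue  # cluster too small to be considered relevant
        scores = reinforcement_scores(similarity, nodes)
        representatives.append(max(nodes, key=lambda i: scores[i]))
    return representatives


if __name__ == "__main__":
    # Toy similarity matrix: nodes 0-2 form one cluster, node 3 is isolated.
    sim = [
        [1.0, 0.9, 0.8, 0.1],
        [0.9, 1.0, 0.3, 0.1],
        [0.8, 0.3, 1.0, 0.1],
        [0.1, 0.1, 0.1, 1.0],
    ]
    print(select_representatives(sim, [[0, 1, 2], [3]]))
```

Under this reading, the representative is the instance most similar to the rest of its cluster, so near-duplicates reinforce each other while outliers receive low scores.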

27

DEVELOPED SOLUTION

Presentation
Online graphical user interface (GUI)

Developed with Wt

Initial design

28

DEVELOPED SOLUTION

Final result of the GUI

29

CONTENTS

Introduction
State of the art
Requirements analysis
System architecture overview
Developed solution
Evaluation
  • User study 1
  • User study 2
  • Conclusions
Future work

30

EVALUATION

User study 1
2 complementary web-based surveys
4 videos from the CCMA dataset
40 participants

Evaluation
  • Redundancy
  • Understanding
  • Quality (Mean Opinion Score (MOS))

1. Unacceptable
2. Poor
3. Fair
4. Good
5. Excellent
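For reference, a Mean Opinion Score is the arithmetic mean of the participants' ratings on the 1 to 5 scale above. The sketch below shows the computation; the function name and the sample ratings are hypothetical and are not survey data.

```python
def mean_opinion_score(ratings):
    """Average a list of integer ratings on the 1 (Unacceptable) to 5 (Excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    # Hypothetical ratings from five participants.
    print(mean_opinion_score([4, 4, 3, 5, 4]))
```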

31

EVALUATION

Visual summary 1

Visual summary 4

32

EVALUATION

Redundancy

[Figure: one pie chart per visual summary; legend: YES, NO]

Visual summary 1: 70%, 30%
Visual summary 2: 70%, 30%
Visual summary 3: 70%, 30%
Visual summary 4: 48%, 53%

33

EVALUATION

Understanding

Ranking  Keywords before watching the video  Keywords after watching the video
1        Puerto Rico                         Independence
2        Independence                        Puerto Rico
3        Political party                     Future
4        Election                            Voting
5        Opinion                             Political party

Ranking  Keywords before watching the video  Keywords after watching the video
1        Music                               New schedule
2        Catalunya Radio                     Novelty
3        Programming                         Catalunya Radio
4        Office                              Culture
5        Schedule                            Information

34

EVALUATION

Quality

[Figure: bar chart. x-axis: score rate (1 to 5); y-axis: participants; series: Visual summary 1, Visual summary 2]

MOS1 = 3.8, MOS2 = 3.57, MOS3 = 3.6, MOS4 = 3.72

35

EVALUATION

User study 2
Web-based survey

2 well-known news events
  • 356 videos of “Boston Marathon bombings”
  • 406 videos of “Disappearance of the Malaysia airlines flight”

55 participants

Evaluation
  • Comparison with W. Zhang (ACM MM 2014)
  • Quality (Mean Opinion Score (MOS)): 1. Unacceptable, 2. Poor, 3. Fair, 4. Good, 5. Excellent

36

EVALUATION

Boston Marathon bombings

W. Zhang (ACM MM 2014)

Our visual summary

37

EVALUATION

[Figure: bar chart. x-axis: score rate (1 to 5); y-axis: participants; series: W. Zhang (ACM MM 2014), Our visual summary]

MOS = 2.2, MOS = 4.15

38

EVALUATION

Disappearance of the Malaysia airlines flight

W. Zhang (ACM MM 2014)

Our visual summary

39

EVALUATION

[Figure: bar chart. x-axis: score rate (1 to 5); y-axis: participants; series: W. Zhang (ACM MM 2014), Our visual summary]

MOS = 2.56, MOS = 3.62

40

EVALUATION

Conclusions

Pros
  • Extracts relevant content
  • Summarizes the news video
  • Seems to be competitive with the state of the art

Cons
  • Some redundancy remains
  • Low accuracy of the object detection

41

CONTENTS

Introduction
State of the art
Requirements analysis
System architecture overview
Developed solution
Evaluation
Future work

42

CONCLUSION

Future work

Improve the detection

Audio transcription

Content presentation

Interactive prototype

43

THANK YOU VERY MUCH FOR YOUR ATTENTION
