
Player Action Recognition in Broadcast Tennis Video with Applications to Semantic Analysis of Sports Game

Guangyu Zhu, Changsheng Xu, Qingming Huang, Wen Gao

Liyuan Xing

Outline

• Introduction

• Framework Overview

• Player Action Recognition

• Video Analysis

• Experimental Results

Introduction

• Semantic gap

– Between user semantics and low-level features

– Objects in sports video can be considered an effective mid-level representation

• Action Recognition

– Far-view

– Foreside-swing, backside-swing

Introduction

• Multimodal Framework

– Action recognition method based on motion analysis

– High-level analysis

  • Video indexing

  • Highlight ranking

  • Tactic analysis

Framework Overview

• Sports video database

• Low-level analysis

• Middle-level analysis

• Fusion scheme

• High-level analysis

Framework Overview

Framework Overview

Low-level Analysis

• The dominant color-based algorithm in [16] is used to identify all in-play shots

Player Action Recognition

• Related Work

– Shah [8], Gavrila [9]: recognition with close-up views

– Motion representation

  • Motion history/energy image [12]

  • Spatial arrangement of moving points [13]

  • Several constraints

– Efros [11]

  • Motion descriptor in a spatio-temporal volume

  • NNC similarity measure

– Miyamori [14][15]

  • Based on silhouette transition

  • Appearance feature is not preserved across videos

Player Action Recognition

Player Tracking and Stabilization

• Player Tracking

– Initial position: detection algorithm in [16]

– SVR particle filter [24]

• Player region centroid

Optical Flow Computation

• Background subtraction

Optical Flow Computation

• Noise elimination

Local Motion Representation

• S-OFHs

– Slice-based optical flow histograms

• The prob. of bin(u)

• The prob. of bin(u) in slice

Local Motion Representation

• Two slices of the figure are used

• Horizontal and vertical optical flow fields are used
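
The S-OFH idea above can be illustrated with a minimal sketch. The slides do not give the bin layout or the slice geometry, so everything specific here is an assumption: the player region is split into an upper and a lower half, each flow component is binned by value over a fixed range, and the names `histogram`, `s_ofh_feature`, `n_bins`, and `max_flow` are hypothetical. Each histogram entry is the fraction of pixels in the slice whose flow falls in that bin, which is one plausible reading of "the prob. of bin(u) in slice".

```python
def histogram(values, n_bins, lo, hi):
    """Normalized histogram of values, clipped to [lo, hi] (assumed range)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        v = min(max(v, lo), hi)
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * n_bins


def s_ofh_feature(flow_x, flow_y, n_bins=8, max_flow=4.0):
    """Concatenate four slice histograms.

    flow_x, flow_y: 2-D lists (rows of pixels) holding the horizontal and
    vertical optical flow inside the stabilized player region.
    Order of the output: x-upper, x-lower, y-upper, y-lower.
    The upper/lower split at the middle row is an assumption.
    """
    split = len(flow_x) // 2
    feature = []
    for component in (flow_x, flow_y):
        for slice_rows in (component[:split], component[split:]):
            values = [v for row in slice_rows for v in row]
            feature.extend(histogram(values, n_bins, -max_flow, max_flow))
    return feature
```

The resulting vector has `4 * n_bins` entries and corresponds to the "concatenation of four S-OFHs" that is later passed to the classifier.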

Action Classification

• Using SVM

• The concatenation of four S-OFHs is fed as the feature vector

• Audio keywords

– Silence, hitting ball, applause

Action Classification

• Action clip window is set to 25 frames

• Voting Strategy
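
A voting strategy over an action clip can be sketched as a simple majority vote. This assumes that each frame in the 25-frame window already has a class label from the SVM and that the clip takes the most frequent label; the slides do not spell out the exact rule, and the function name `vote_action` and the label strings are hypothetical.

```python
from collections import Counter


def vote_action(frame_labels, window=25):
    """Return the majority label among the first `window` per-frame labels.

    Ties are resolved in favor of the label that appears first in the clip
    (an assumption; the slides do not say how ties are handled).
    """
    clip = frame_labels[:window]
    if not clip:
        raise ValueError("clip contains no frame labels")
    return Counter(clip).most_common(1)[0][0]
```

For example, a clip with 15 frames labeled "foreside" and 10 labeled "backside" would be classified as a foreside-swing.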

Video Analysis

• Fusion of mid-level features

• Action Based Tennis Video Indexing

• Highlights Ranking and Browsing

• Tactics Analysis and Statistics

Video Indexing

• Based on action recognition and domain knowledge

Highlights Ranking

• Player action recognition

• Real-world trajectory computation

Highlights Ranking

• Affective features (four in this paper)

• Features on action

– Swing Switching Rate
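
The slides name the Swing Switching Rate without giving its formula. As a minimal sketch, assume it is the fraction of consecutive swing pairs in a shot where the recognized swing type changes; the function name `swing_switching_rate` and this normalization are assumptions, not the paper's definition.

```python
def swing_switching_rate(swings):
    """Fraction of consecutive swing pairs whose label differs.

    swings: recognized swing labels for one shot, in temporal order.
    Returns 0.0 when there are fewer than two swings.
    """
    if len(swings) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(swings, swings[1:]) if a != b)
    return switches / (len(swings) - 1)
```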

Highlights Ranking

• Features on trajectory

– Speed of Player (SOP)

– Maximum Covered Court

  • The rectangle bounded by the leftmost, rightmost, topmost, and bottommost points

– Direction Switching Rate
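
The three trajectory features can be sketched from a list of real-world player positions. Only the bounding-rectangle description of Maximum Covered Court comes from the slides. The mean-speed definition, the frame rate `fps`, and the reading of Direction Switching Rate as reversals of horizontal movement are all assumptions, as are the function names.

```python
import math


def speed_of_player(points, fps=25.0):
    """Mean speed: path length divided by elapsed time (fps is assumed)."""
    if len(points) < 2:
        return 0.0
    length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    return length * fps / (len(points) - 1)


def maximum_covered_court(points):
    """Area of the rectangle bounded by the extreme trajectory points."""
    if not points:
        return 0.0
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def direction_switching_rate(points):
    """Fraction of consecutive moves where the horizontal direction reverses."""
    dxs = [q[0] - p[0] for p, q in zip(points, points[1:])]
    signs = [1 if d > 0 else -1 for d in dxs if d != 0]
    if len(signs) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return switches / (len(signs) - 1)
```

Together with the Swing Switching Rate, these values would form the four-dimensional affective feature vector used for highlight ranking.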

Highlights Ranking

• The feature vector comprised of four affective features is fed into the ranking model

• Support vector regression

• User-defined threshold

Tactics Analysis and Statistics

Experimental Results

• Action Recognition (6 seq, 194 clips)

Experimental Results

• Video Indexing

Experimental Results

• Highlights ranking

Experimental Results

Experimental Results

Experimental Results

Future Work

• More effective slice partition

• Involve more semantic actions

– E.g., overhead-swing

• Apply action recognition to more applications, such as 3-D scene reconstruction

• Increase the ranking accuracy by combining audio features

Thank You

