ACA DS-SVM ConclusionsIntroduction CMU-MMAC
Unsupervised and weakly-supervised discovery of events in video (and audio)
Fernando De la Torre
A dream
Outline
• Introduction
• CMU-Multimodal Activity database
• Unsupervised discovery of video events
  – Aligned Cluster Analysis (ACA)
• Weakly-supervised discovery of video events
  – Detection-Segmentation SVMs
• Conclusions
Quality of life technologies (QLoT)
Multimodal data collection
• 40 subjects, 5 recipes
• www.kitchen.cs.cmu.edu
Anomalous dataset
Time series analysis
• Anomaly detection formulated as detecting outliers in multimodal time series.
  – Supervised
  – Unsupervised
  – Semi-supervised or weakly supervised
Unsupervised discovery of events in video
Motivation
• Mining facial expressions for one subject
  [Figure labels: Looking up, Sleeping, Smiling, Looking forward, Waking up]
• Summarization
• Visualization
• Embedding
• Indexing
Related work in time series
• Change point detection (e.g. Page ’54, Stephens ’94, Lai ’95, Ge and Smyth ’00, Steyvers & Brown ’05, Murphy et al. ’07, Harchaoui et al. ’08)
• Segmental HMMs (e.g. Ge and Smyth ’00, Kohlmorgen et al. ’01, Ding & Fan ’07)
• Mixtures of HMMs (e.g. Fine et al. ’98, Murphy & Paskin ’01, Oliver et al. ’02, Alon et al. ’03)
• Switching LDS (e.g. Pavlovic et al. ’00, Oh et al. ’08, Turaga et al. ’09)
• Hierarchical Dirichlet Process (e.g. Beal et al. ’02, Fox et al. ’08)
• Aligned Cluster Analysis (ACA)
Summarization with ACA
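Kernel k-means and spectral clustering (Ding et al. ’02, Dhillon et al. ’04, Zass and Shashua ’05, De la Torre ’06)
$$J(M, G) = \| X - MG \|_F^2$$
In kernel form (X mapped to $\phi(X)$, M eliminated):
$$J(G) = \mathrm{tr}\left( K \left( I_n - G^T (G G^T)^{-1} G \right) \right), \qquad K = \phi(X)^T \phi(X)$$
[Figure: toy example with 10 samples (numbered 1 to 10) in the x-y plane, showing the data X, the cluster means M, and the indicator matrix G]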
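As a sanity check of the kernel k-means objective on this slide, the following minimal sketch compares the direct form $\|X - MG\|_F^2$ with the trace form $\mathrm{tr}(K(I_n - G^T(GG^T)^{-1}G))$. The toy data, the labels, and the choice of a linear kernel ($\phi(X) = X$) are assumptions made only for illustration; G is taken to be a k × n binary indicator matrix.

```python
import numpy as np

def kkm_objective(K, G):
    """Trace form tr(K (I_n - G^T (G G^T)^{-1} G)); K is n x n, G is k x n."""
    n = K.shape[0]
    L = np.eye(n) - G.T @ np.linalg.inv(G @ G.T) @ G
    return np.trace(K @ L)

# Hypothetical toy data: 6 two-dimensional samples stored as columns of X.
X = np.array([[0.0, 0.2, 0.1, 3.0, 3.1, 2.9],
              [0.0, 0.1, -0.1, 3.0, 2.8, 3.2]])
labels = np.array([0, 0, 0, 1, 1, 1])      # assumed cluster assignment

# Binary indicator matrix: G[c, i] = 1 if sample i belongs to cluster c.
G = np.zeros((2, X.shape[1]))
G[labels, np.arange(X.shape[1])] = 1.0

K = X.T @ X                                 # linear kernel, phi(X) = X
M = X @ G.T @ np.linalg.inv(G @ G.T)        # cluster means, one per column

direct = np.linalg.norm(X - M @ G, "fro") ** 2
kernel = kkm_objective(K, G)
print(direct, kernel)                       # the two values should agree
```

The trace form only needs K, which is why the same objective can be used with any positive-definite kernel in place of the linear one.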
Problem formulation for ACA
$$J_{aca}(M, G, s) = \left\| \left( X_{[s_1, s_2)}, X_{[s_2, s_3)}, \ldots, X_{[s_m, s_{m+1})} \right) - MG \right\|_F^2 = \sum_{c=1}^{k} \sum_{i=1}^{m} g_{ci} \left\| X_{[s_i, s_{i+1})} - m_c \right\|_2^2$$
• Labels (G)
• Start and end of the segments (s)
• Distance between a segment $X_{[s_i, s_{i+1})}$ and a cluster center $m_c$: Dynamic Time Alignment Kernel (Shimodaira et al. ’01)
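The Dynamic Time Alignment Kernel named on this slide compares two segments of different lengths. A minimal sketch follows, assuming the commonly cited recursion $p_{ij} = \max(p_{i-1,j} + \kappa_{ij},\ p_{i-1,j-1} + 2\kappa_{ij},\ p_{i,j-1} + \kappa_{ij})$ normalized by $n_x + n_y$, and assuming a Gaussian frame kernel with a hypothetical bandwidth `sigma`. The function name and the toy segments are illustrative and not taken from the talk.

```python
import numpy as np

def dtak(X, Y, sigma=1.0):
    """Dynamic time alignment kernel between segments X (d x nx) and Y (d x ny)."""
    nx, ny = X.shape[1], Y.shape[1]
    # Frame-to-frame Gaussian kernel kappa[i, j] (assumed choice of frame kernel).
    d2 = ((X[:, :, None] - Y[:, None, :]) ** 2).sum(axis=0)
    kappa = np.exp(-d2 / (2.0 * sigma ** 2))
    # P[i, j]: best accumulated similarity aligning the first i and j frames.
    P = np.full((nx + 1, ny + 1), -np.inf)
    P[0, 0] = 0.0
    for i in range(1, nx + 1):
        for j in range(1, ny + 1):
            k = kappa[i - 1, j - 1]
            P[i, j] = max(P[i - 1, j] + k,          # advance in X only
                          P[i - 1, j - 1] + 2 * k,  # advance in both
                          P[i, j - 1] + k)          # advance in Y only
    return P[nx, ny] / (nx + ny)

A = np.array([[0.0, 1.0, 2.0, 3.0]])   # hypothetical 1-D segment, 4 frames
B = np.repeat(A, 2, axis=1)            # same motion at half speed, 8 frames
C = A[:, ::-1]                         # same frames in reverse order

print(dtak(A, A), dtak(A, B), dtak(A, C))
```

Under these assumptions the slowed-down copy B scores the same as A itself, while the reversed segment C scores lower, which is the behavior wanted when clustering segments of varying duration.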
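Matrix formulation for ACA
• Kernel k-means: $J_{kkm} = \mathrm{tr}(LK)$, with $L = I_n - G^T (G G^T)^{-1} G$
• ACA: $J_{aca} = \mathrm{tr}((L \circ W) K)$, with $L = I_n - H^T G^T (G G^T)^{-1} G H$
• $K = \phi(X)^T \phi(X)$
• Dynamic Time Alignment Kernel (Shimodaira et al. ’01)
• Example with 23 frames, 3 clusters: $G \in \{0,1\}^{3 \times 7}$ (clusters × segments), $H \in \{0,1\}^{7 \times 23}$ (segments × samples), $W \in \mathbb{R}^{23 \times 23}$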
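To make the dimensions of this slide concrete, here is a minimal sketch that builds the two indicator matrices for 23 frames, 7 segments and 3 clusters, and forms L under the assumption that $L = I_n - H^T G^T (G G^T)^{-1} G H$. The segment boundaries and segment labels are hypothetical, and the weight matrix `W` is left as an argument because its construction is not spelled out here.

```python
import numpy as np

n, k = 23, 3                               # frames and clusters, as on the slide
# Hypothetical segment boundaries (0-based, half-open) and labels for 7 segments.
s = [0, 3, 7, 10, 14, 17, 20, 23]
seg_labels = [0, 1, 2, 0, 1, 2, 0]
m = len(seg_labels)

# H[i, t] = 1 if frame t belongs to segment i (segments x samples).
H = np.zeros((m, n))
for i in range(m):
    H[i, s[i]:s[i + 1]] = 1.0

# G[c, i] = 1 if segment i is assigned to cluster c (clusters x segments).
G = np.zeros((k, m))
G[seg_labels, np.arange(m)] = 1.0

L = np.eye(n) - H.T @ G.T @ np.linalg.inv(G @ G.T) @ G @ H

def aca_objective(K, W, G, H):
    """tr((L o W) K), with 'o' the element-wise product (assumed form of L)."""
    L = np.eye(K.shape[0]) - H.T @ G.T @ np.linalg.inv(G @ G.T) @ G @ H
    return np.trace((L * W) @ K)

frame_labels = (G @ H).argmax(axis=0)      # cluster label of every frame
print(H.shape, G.shape, L.shape)
```

The product `G @ H` propagates segment labels down to frames, which is how a segmentation plus a segment clustering yields a per-frame labeling.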
Facial image features
Appearance
• Active Appearance Models (Baker and Matthews ‘04)
Upper face
Lower face
Shape
• Image features
Unsupervised facial event discovery
Facial event discovery across subjects
• Cohn-Kanade: 30 people and five different expressions (surprise, joy, sadness, fear, anger)
• 10 sets of 30 people
• ACA: 0.87 (.05)
• Spectral Clustering (SC): 0.56 (.04)
Honey bee dance (Oh et al. ’08)
Three behaviors: 1-waggling, 2-turning left, 3-turning right

Method                            Seq 1  Seq 2  Seq 3  Seq 4  Seq 5  Seq 6
ACA                               0.845  0.925  0.600  0.922  0.878  0.928
PS-SLDS (Oh et al. ’08)           0.759  0.924  0.831  0.934  0.904  0.910
HDP-VAR(1)-HMM (Fox et al. ’08)   0.465  0.441  0.456  0.832  0.932  0.887
Spectral Clustering               0.698  0.631  0.509  0.671  0.577  0.649
Clustering human motion
Weakly supervised discovery of events in images and video
Spot the differences!
What distinguishes these images?
Classification of time series
Similarity of these problems?
• Global statistics are not distinctive enough!
• Better understanding of the discriminative regions or events
Image → bag of ‘regions’
• Positive image: at least one positive region
• Negative image: all regions negative
Support vector machines (SVMs)
[Figure: maximum-margin classifier with regularization term $\frac{1}{2}\|w\|^2$]
Learning formulation
• Standard SVM
(Andrews et al. ’03, Felzenszwalb et al. ’08)
Optimization
All possible subwindows: 100 ms/image (480×640 pixels) (Lampert et al., CVPR ’08)
1)
2)
3) SVM with QP
Discriminative patterns in time series
• At most k disjoint intervals; we name it k-segmentation
• Efficient search: global optimum guaranteed!
  – 10 ms/sequence (15,000 frames)
Representation of signals
• Training data → compute frame-level feature vectors → clustering → visual dictionary
• Signal → frame-level feature vectors → visual dictionary → IDs of visual words: 5,10,97,...,9,42,10,91
K-segmentation
Original signal
IDs of visual words
Histogram of visual words
We need:
40,13,10,5,10,97,...,9,42,10,91
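The representation on these two slides (frame-level features, visual-word IDs, histogram of visual words) can be sketched as below. The dictionary would normally come from clustering training features; here it is a small hand-written array, and the function names `quantize` and `histogram` as well as the toy features are assumptions for illustration only.

```python
import numpy as np

def quantize(features, dictionary):
    """Map each frame feature (row of features) to the ID of its nearest word."""
    d2 = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def histogram(word_ids, n_words):
    """Count how often each visual word occurs in a sequence of word IDs."""
    return np.bincount(word_ids, minlength=n_words)

# Hypothetical visual dictionary with 3 words in a 2-D feature space.
dictionary = np.array([[0.0, 0.0],
                       [1.0, 1.0],
                       [5.0, 5.0]])
# Hypothetical frame-level features for a 4-frame signal.
features = np.array([[0.1, 0.0],
                     [0.9, 1.2],
                     [4.8, 5.1],
                     [1.1, 0.8]])

word_ids = quantize(features, dictionary)
hist = histogram(word_ids, len(dictionary))
print(word_ids, hist)
```

A histogram computed over any sub-interval of `word_ids` uses the same two functions, which is what makes the interval search in k-segmentation cheap to evaluate.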
What is $\varphi(x)^T w$?
• SVM parameters: w
• Original signal (x) → IDs of visual words: 40,13,10,5,10,97,...,9,42,10,91
• $\varphi(x)^T w = \sum_i w_{x_i}$, with per-frame weights $w_{40}, w_{13}, w_{10}, w_{5}, w_{10}, w_{97}, \ldots, w_{9}, w_{42}, w_{10}, w_{91}$
m-segmentation → (m+1)-segmentation
Consider m-segmentation:
Situation 1:
Situation 2:
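Because the SVM score of a histogram is the sum of the per-frame weights $w_{x_i}$, finding the best k-segmentation amounts to choosing at most k disjoint intervals whose frame scores have maximum total. The sketch below solves that problem with a standard dynamic program; it is not the incremental m → (m+1) procedure outlined on the slide, and the function name and the example scores are hypothetical.

```python
def k_segmentation(scores, k):
    """Best total score of at most k disjoint intervals, with the intervals.

    scores[t] plays the role of w_{x_t}; intervals are half-open (start, end).
    """
    neg_inf = float("-inf")
    # out[j]: best (value, intervals) so far using at most j intervals.
    # inn[j]: same, but the last interval must end at the current frame.
    out = [(0.0, [])] * (k + 1)
    inn = [(neg_inf, [])] * (k + 1)
    for t, v in enumerate(scores):
        new_inn = [(neg_inf, [])] * (k + 1)
        for j in range(1, k + 1):
            ext_val, ext_iv = inn[j]          # extend interval ending at t-1
            start_val, start_iv = out[j - 1]  # or open a new interval at t
            if ext_val >= start_val:
                start, _ = ext_iv[-1]
                new_inn[j] = (ext_val + v, ext_iv[:-1] + [(start, t + 1)])
            else:
                new_inn[j] = (start_val + v, start_iv + [(t, t + 1)])
        inn = new_inn
        out = [out[0]] + [max(out[j], inn[j], key=lambda p: p[0])
                          for j in range(1, k + 1)]
    return out[k]

# Hypothetical per-frame scores w_{x_t} for a 7-frame signal.
scores = [2, -1, 3, -5, 4, -2, 1]
for k in (1, 2, 3):
    print(k, k_segmentation(scores, k))
```

The loop visits each frame once per interval count, so the cost grows linearly with sequence length for fixed k (ignoring the list copies kept here for readability).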
Experiment 1 – glasses vs. no-glasses
• 624 images, 20 people under different expression/pose
• 8 people training (126 sunglasses, 128 no glasses), 12 testing (185 sunglasses and 185 no glasses)
Localization result
Experiment 2 – car vs. no car
• 400 images: half contain cars and the other half contain no cars.
• Each image: 10,000 SIFT descriptors and a vocabulary of 1,000 visual words.
Localization result
Bad localization cases
Classification performance
Our method outperforms SVM with human labels!
[Figure labels: whole image, human labels, discriminative regions]
Experiment 3 – synthetic data
Positive class
Negative class
Result
[Plot; vertical axis: accuracy]
k: maximum number of disjoint intervals.
Experiment 4 – mouse activity
• Mouse activities:
  – Drinking, eating, exploring, grooming, sleeping
Result – F1 scores
Conclusions
• CMU Multimodal Activity database
• Unsupervised discovery of events in time series
  – Aligned Cluster Analysis for summarization, indexing and visualization of time series
  – Code online (www.humansensing.cs.cmu.edu)
  – Open problems: automatic selection of number of clusters
• Weakly-supervised discovery of events in time series
  – DS-SVM
  – Novel & efficient algorithm for time series
  – Outperforms methods with human-labeled data
• Kernel methods: a fundamental framework for multimodal data fusion.
Thanks
Questions?