Dynamic Data Modeling, Recognition, and Synthesis
Rui Zhao
Thesis Defense
Advisor: Professor Qiang Ji
Contents
▪ Introduction
▪ Related Work
▪ Dynamic Data Modeling & Analysis
• Temporal localization
• Insufficient annotations
• Large intra-class variations
• Complex dynamics
▪ Summary
Overview
▪ Dynamic data arise in many domains: speech processing, computer vision, neurology, and natural language processing
Major Tasks
▪ Analyze Dynamic Data
▪ Modeling
• Provide a mathematical description of the dynamic process
▪ Analysis
• Prediction: Forecast future values of dynamic data
• Regression: Estimate a target value given dynamic data
• Classification: Divide dynamic data into different categories
• Synthesis: Generate new dynamic data
Pipeline: Collection → Annotation → Pre-processing → Modeling → Analysis
Challenges
▪ Insufficient annotation
▪ Uncertainty in data and model
▪ Intra-class variation
▪ Complex dynamics
Related Work
▪ A taxonomy of modeling techniques
• Probabilistic approaches
– Directed PGM: LDS, HMM, DBN
– Undirected PGM: DRF, RBM
– Gaussian Process
• Deterministic approaches
– Neural Networks: RNN, LSTM
– Dynamic Feature
Part 1: Dynamic Pattern Localization
▪ Problem: Temporally localize the dynamic pattern in a time series by determining its starting and ending time
▪ Applications:
• Brain computer interface (BCI)
• Speech recognition
• Event recognition
▪ Our Solution:
• Combine a dynamic model (HMM) with robust estimation (RANSAC)
[Figure: a time series whose inlier subsequence is flanked by two irrelevant segments]
Methods
▪ Overview: from the set of complete sequences, repeatedly draw the s-th subset of subsequences, train HMM_s on it, and keep the best-voted model HMM*
• p: probability of selecting all inliers
• ε: proportion of outliers
• K: total number of complete sequences
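Under these definitions, the number of random trials needed can be estimated with the standard RANSAC bound (a sketch of the usual stopping rule; the thesis may use a different one):

```python
import math

def ransac_trials(p: float, eps: float, K: int) -> int:
    """Number of random subsets needed so that, with probability p,
    at least one subset contains only inlier subsequences.
    eps: proportion of outliers; K: number of complete sequences
    (one subsequence drawn per sequence)."""
    # Probability that a single draw of K subsequences is all-inlier.
    w = (1.0 - eps) ** K
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w))

# e.g. p = 0.99, 20% outliers, K = 10 sequences
n_trials = ransac_trials(0.99, 0.2, 10)
```

More outliers or more sequences per draw raises the required trial count sharply, which is why the subsequence voting in Step 3 matters.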
Methods
▪ Step 1: Divide each complete sequence of length L into overlapping subsequences of length M, giving L − M + 1 subsequences per sequence
Methods
▪ Step 2: Randomly select a subsequence from each of the K complete sequences to train an HMM
• Learning: EM algorithm
Methods
▪ Step 3: Vote for the learned HMM (parameters θ) using the remaining subsequences
Methods
▪ Step 1: Divide complete sequences into overlapping subsequences
▪ Step 2: Randomly select a subsequence from each complete sequence to train an HMM
▪ Step 3: Vote for the learned HMM using the remaining subsequences
▪ Step 4: Go back to Step 2 and repeat enough times
▪ Step 5: Find the HMM with the highest vote and use it to identify the inlier subsequences
▪ Step 6: Retrain the HMM using the inlier subsequences
▪ Step 7: Identify inlier subsequences and merge them to get the signals of interest
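The seven steps can be sketched as follows. Here `train_hmm` and `loglik` are hypothetical stand-ins for EM training and HMM log-likelihood scoring, and `vote_thresh` is a free parameter; this is a structural sketch, not the thesis implementation:

```python
import numpy as np

def divide(seq, M):
    """Step 1: all overlapping subsequences of length M (L - M + 1 of them)."""
    L = len(seq)
    return [seq[i:i + M] for i in range(L - M + 1)]

def ransac_hmm(sequences, M, n_trials, train_hmm, loglik, vote_thresh):
    """Steps 2-6 of the procedure with pluggable training/scoring."""
    rng = np.random.default_rng(0)
    subs = [divide(s, M) for s in sequences]
    best_model, best_votes = None, -1
    for _ in range(n_trials):  # Step 4: repeat enough times
        # Step 2: one random subsequence per complete sequence.
        picked = [ss[rng.integers(len(ss))] for ss in subs]
        model = train_hmm(picked)
        # Step 3: subsequences vote for the model (all of them, for simplicity).
        votes = sum(loglik(model, x) > vote_thresh
                    for ss in subs for x in ss)
        # Step 5: keep the model with the highest vote.
        if votes > best_votes:
            best_model, best_votes = model, votes
    # Step 6: retrain on the inlier subsequences of the best model.
    inliers = [x for ss in subs for x in ss
               if loglik(best_model, x) > vote_thresh]
    return train_hmm(inliers), inliers
    # Step 7 (merging adjacent inliers into full signals) is omitted here.
```

In practice `train_hmm`/`loglik` would wrap an EM-trained HMM (e.g. hmmlearn's `GaussianHMM.fit`/`score`).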
Experiments
▪ Electrocorticographic (ECoG) data
• Data collection
• Preprocessing & feature extraction: spectral filter (70–170 Hz) + Hilbert transform
▪ Experiment protocol (Wang et al., IJCAI 2013): Stage 1, Stage 2, Stage 3
Experiments
▪ Data Statistics
• 4 subjects
• Each has 10 trials, each of length 800
• Average hit time
▪ Classification Results:
Part 2: Dynamic Regression under Insufficient Annotation
▪ Problem: Given sequential data x_1, …, x_T, we want to compute a regression function y_t = f(x_t), for some target value y_t
▪ Challenge:
• Only part of the sequential data are annotated
▪ Applications:
• Facial expression intensity estimation
• Sensor fault detection
• Part-of-speech tagging
▪ Our solution: Incorporate temporal information as additional constraints
Problem Statement
▪ Goal: Given an input set with partial labels, find a regression function from x to y
▪ Example: (x_1, y_1), x_2, x_3, x_4, (x_5, y_5); labeled frames V = {1, 5}
• The expression evolves onset → peak → offset, so intensity y first increases (y ↑) and then decreases (y ↓)
• Ordinal pairs E = {(1,2), (1,3), (1,4), (1,5), (2,3), (2,4), (2,5), (3,4), (3,5), (4,5)}
Ordinal Support Vector Regression (OSVR)
▪ Regression model with parameters
▪ Dataset with weak labels
▪ Optimization problem: Objective = Regression Loss + Ordinal Loss + Regularization
▪ Optimization method: alternating direction method of multipliers (ADMM)
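One plausible instantiation of this three-term objective, with an ε-insensitive loss on the labeled frames V and hinge-type ordinal losses on the pairs E (a sketch, not necessarily the exact formulation in the thesis):

```latex
\min_{\mathbf{w},\, b}\;
\underbrace{\tfrac{1}{2}\lVert \mathbf{w} \rVert^2}_{\text{regularization}}
+ C \underbrace{\sum_{i \in \mathbf{V}} \max\!\bigl(0,\; \lvert y_i - \mathbf{w}^\top \mathbf{x}_i - b \rvert - \epsilon\bigr)}_{\text{regression loss}}
+ C_o \underbrace{\sum_{(i,j) \in \mathbf{E}} \max\!\bigl(0,\; \mathbf{w}^\top \mathbf{x}_i - \mathbf{w}^\top \mathbf{x}_j + \delta\bigr)}_{\text{ordinal loss}}
```

where an edge (i, j) ∈ E encodes the weak label f(x_i) ≤ f(x_j), and δ ≥ 0 is a margin.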
Experiments: Facial Expression Intensity Estimation
▪ Overview
▪ Dataset: PAIN
▪ Features: Gabor features, landmarks, LBP + PCA
▪ Evaluation Criteria:
• Pearson correlation coefficient (PCC)
• Intra-class correlation (ICC)
• Mean absolute error (MAE)
Experiments
▪ PAIN dataset: experiments under different annotation settings
• Partial labels: about 8.8% of the total number of frames
• Use of ordinal information is very helpful
Experiments
▪ PAIN dataset: comparison with the state-of-the-art
Part 3: Classification and Synthesis under Large Intra-class Variation
▪ Challenge:
• The same underlying dynamic pattern can manifest significant intra-class variation
▪ Applications:
• Human action recognition
• Human motion synthesis
• Speech recognition
• Language translation
▪ Our Solution:
• Hidden Semi-Markov Model
• Bayesian Hierarchical Modeling
Methods
▪ Hidden Markov Model (HMM)
• Hidden states Z_1, …, Z_T generate observations X_1, …, X_T
• Parameters: initial distribution π, transition matrix A, emission model ψ
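Given π, A, and per-frame emission likelihoods from ψ, the HMM's sequence likelihood comes from the standard forward recursion; a minimal sketch with per-step scaling to avoid underflow, assuming the emission likelihoods have been precomputed into a matrix B:

```python
import numpy as np

def forward_loglik(pi, A, B):
    """Log-likelihood of an observation sequence under an HMM.
    pi: (S,) initial distribution; A: (S, S) transition matrix;
    B: (T, S) emission likelihoods p(x_t | z_t = s) for the observed x_t."""
    alpha = pi * B[0]                    # forward message at t = 1
    loglik = 0.0
    for t in range(1, len(B)):
        c = alpha.sum()                  # scaling constant
        loglik += np.log(c)
        alpha = (alpha / c) @ A * B[t]   # propagate and absorb emission
    return loglik + np.log(alpha.sum())
```

The same quantity (per class) is what a likelihood-based classifier compares across candidate models.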
Methods
▪ Hidden Semi-Markov Model (HSMM)
• Augments the HMM with explicit duration variables D_1, …, D_T for the hidden states
• Parameters: initial distribution π, transition matrix A, duration model C, emission model ψ
Methods
▪ Bayesian Hierarchical Dynamic Model (HDM)
• Hyperparameter α, together with (ξ, η, η_0, λ), defines a prior over the HSMM parameters θ = (π, A, C, ψ)
• The HSMM over the states, durations, and observations serves as the likelihood
Methods
▪ Learning: estimating hyperparameter α
• Overall objective (intractable)
• Approximate objective
• An alternating strategy
▪ Optimization methods: MAP-EM and ML
Methods
▪ Bayesian Inference: predictive likelihood
• Advantage: reduces overfitting and improves generalization
• Posterior inference method: Gibbs sampling
• Classification: pick the class with the highest predictive likelihood
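Concretely, predictive-likelihood classification averages the data likelihood over the Gibbs posterior samples of each class's parameters; a sketch, where `loglik` is a hypothetical stand-in for the model's sequence log-likelihood:

```python
import numpy as np

def predict_class(x, posterior_samples, loglik):
    """posterior_samples: {class c: [theta_m, ...]} Gibbs samples per class.
    Approximates p(x | c) = (1/M) * sum_m p(x | theta_m^c) and returns
    the class with the highest predictive likelihood."""
    scores = {}
    for c, thetas in posterior_samples.items():
        lls = np.array([loglik(th, x) for th in thetas])
        # log-mean-exp for numerical stability
        scores[c] = np.logaddexp.reduce(lls) - np.log(len(lls))
    return max(scores, key=scores.get)
```

Averaging over samples rather than using a single point estimate is what gives the robustness to intra-class variation claimed above.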
Experiments: Action Recognition
▪ Overview
▪ Datasets: MSR-Action3D, UTD-MHAD, G3D, Penn
▪ Features: joint position and motion in 3D or 2D
Experiments: Action Recognition
▪ Individual datasets: comparison with different baselines
Experiments: Action Recognition
▪ Individual datasets: comparison with the state-of-the-art
Experiments: Action Recognition
▪ Cross-dataset classification accuracy
• A: MSR
• B: UTD
• C: G3D
Methods: Adversarial Learning
▪ Adversarial learning: a better criterion for data synthesis
• Overall objective
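The overall objective here is presumably a GAN-style minimax criterion; the standard form (the thesis variant may differ) is:

```latex
\min_{\theta}\,\max_{\phi}\;
\mathbb{E}_{\mathbf{x} \sim p_{\text{data}}}\bigl[\log D_\phi(\mathbf{x})\bigr]
+ \mathbb{E}_{\mathbf{z} \sim p(\mathbf{z})}\bigl[\log\bigl(1 - D_\phi(G_\theta(\mathbf{z}))\bigr)\bigr]
```

with generator parameters θ and discriminator parameters φ, matching the roles they play on the following slides.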
Methods: Adversarial Inference
▪ Bayesian adversarial inference
• Sample both the generator parameters θ (prior α_g) and the discriminator parameters φ (prior α_d)
• The generator maps (Z, D) to synthetic data X−; the discriminator scores real data X+ against X−
Methods: Adversarial Inference
▪ Bayesian adversarial inference
• Sample generator θ conditioned on φ: posterior ∝ likelihood × prior (prior α_g), with synthetic data X− generated from (Z, D)
▪ Inference method: Stochastic Gradient Hamiltonian Monte Carlo (SGHMC)
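A single SGHMC update (following Chen et al., 2014) takes a noisy gradient step with friction and matched injected noise; a sketch, where `grad_log_post` is an assumed stochastic gradient of the log-posterior (minibatch log-likelihood plus log-prior):

```python
import numpy as np

def sghmc_step(theta, v, grad_log_post, eta=1e-3, alpha=0.1, rng=None):
    """One SGHMC update with step size eta and friction alpha.
    Returns the new (theta, v); iterating yields approximate
    posterior samples of theta."""
    rng = rng or np.random.default_rng()
    # Injected noise with variance 2 * alpha * eta matches the friction term.
    noise = rng.normal(0.0, np.sqrt(2.0 * alpha * eta), size=theta.shape)
    v = (1.0 - alpha) * v + eta * grad_log_post(theta) + noise
    return theta + v, v
```

Repeatedly calling `sghmc_step` and recording `theta` after burn-in gives the posterior samples used for both the generator and the discriminator.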
Methods: Adversarial Inference
▪ Bayesian adversarial inference
• Sample discriminator φ conditioned on θ: posterior ∝ likelihood × prior (prior α_d); the discriminator scores real data X+ against synthetic data X−
▪ Inference method: Stochastic Gradient Hamiltonian Monte Carlo (SGHMC)
Methods
▪ Bayesian adversarial inference: data synthesis
• Overall synthesis target: generate new data from the posterior over the generator
• Steps: draw M posterior samples {θ_m} of the generator given the real data 𝒟+ (posterior sampling over (Z, D) with prior α_g), then generate new data using all the {θ_m}
Experiments: Motion Synthesis
▪ Datasets:
• CMU motion capture
• Berkeley motion capture
▪ Features: joint angles
▪ Qualitative results: real data vs. synthetic data
Experiments: Motion Synthesis
▪ Quantitative results on CMU and Berkeley
• Metric: BLEU score (fidelity)
Experiments: Motion Synthesis
▪ Quantitative results
• Inception score: diversity
• MMD: distribution-level similarity
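MMD compares the real and synthetic samples at the distribution level; a minimal sketch of the (biased) squared-MMD estimate with an RBF kernel, where the kernel choice and bandwidth `gamma` are assumptions, not necessarily the thesis's setup:

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Biased squared maximum mean discrepancy between samples X (n, d)
    and Y (m, d) with kernel k(a, b) = exp(-gamma * ||a - b||^2).
    Near zero when the two sample sets come from the same distribution."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()
```

For motion data, X and Y would hold fixed-length feature vectors extracted from real and synthesized sequences respectively.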
Part 4: Modeling Complex Dynamics
▪ Challenges:
• Structural dependency
• Long-term temporal dependency
• Uncertainty and large variations
▪ Application: action recognition
▪ Our solution: Bayesian Graph Convolution LSTM
Methods
▪ Overall framework: the Bayesian GC-LSTM combines a graph convolution unit (GCU) with an LSTM, followed by a classifier; an additional discriminator operates on the training and testing data
Methods
▪ GC-LSTM
• Overall architecture
Methods
▪ GC-LSTM
• Graph convolution: generalizes convolution to arbitrary graph-structured data
• Layer update: multiply the N × N graph kernel F_k, the N × D_l node data H^(l), and the D_l × D_{l+1} kernel parameters W_k^(l), then add the bias b_k^(l)
• N: number of nodes; D_l: feature dimension of each node
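The layer update can be sketched as below; the ReLU nonlinearity and the sum over K graph kernels are assumptions where the slide shows only a single product term:

```python
import numpy as np

def gcu_layer(F, H, W, b):
    """One multi-kernel graph convolution layer:
    H_next = relu(sum_k F[k] @ H @ W[k] + b[k]).
    F: (K, N, N) graph kernels; H: (N, D_l) node features;
    W: (K, D_l, D_next) kernel parameters; b: (K, D_next) biases."""
    out = sum(F[k] @ H @ W[k] + b[k] for k in range(len(F)))
    return np.maximum(out, 0.0)  # ReLU nonlinearity (assumed)
```

For skeleton data, each F_k would encode a fixed joint-adjacency pattern (e.g. self, neighbors), so the multiplication F_k @ H mixes features across connected joints.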
Methods
▪ GC-LSTM
• Long short-term memory (LSTM) network: models long-term temporal dynamics
• Inputs X_1, …, X_T feed hidden states Z_1, …, Z_T, which produce outputs y_1, …, y_T
Methods
▪ Bayesian GC-LSTM
• Extend GC-LSTM to a probabilistic model: θ ~ P(θ | α_θ)
• Infer the posterior distribution of the parameters from the training data X+ (posterior ∝ likelihood × prior)
Methods
▪ Bayesian GC-LSTM
• Adversarial prior: use an additional discriminator to regularize the parameters
• Intuition: promote a feature representation that is invariant to the subject
Methods
▪ Bayesian Inference
• Classification: average predictions over posterior parameter samples
Experiments
▪ Ablation study (R: random rotation; N: random noise)
• Effect of graph convolution
• Effect of Bayesian inference
Experiments
▪ Comparison with the state-of-the-art on MSR Action3D, SYSU, and UTD MHAD
Experiments
▪ Generalization across different datasets
Thesis Summary
▪ Localization of dynamic patterns
• Propose a method that combines robust estimation and a dynamic model for localization
▪ Dynamic pattern regression under insufficient annotation
• Incorporate ordinal information as additional constraints for model learning
• Develop an optimization algorithm for parameter estimation
▪ Dynamic pattern classification and synthesis under large intra-class variation
• Propose a Bayesian hierarchical model
• Develop two Bayesian inference algorithms
▪ Modeling complex dynamics
• Propose a Bayesian neural network model
• Develop a Bayesian inference algorithm
Thank You!
▪ Related Publications:
• Rui Zhao, Quan Gan, Shangfei Wang, Qiang Ji, Facial Expression Intensity Estimation Using Ordinal Information, CVPR 2016.
• Rui Zhao, Md Ridwan Al Iqbal, Kristin Bennett, Qiang Ji, Wind Turbine Fault Prediction Using Soft Label SVM, ICPR 2016.
• Rui Zhao, Gerwin Schalk, Qiang Ji, Robust Signal Identification for Dynamic Pattern Classification, ICPR 2016.
• Rui Zhao, Qiang Ji, An Adversarial Hierarchical Hidden Markov Model for Human Pose Modeling and Generation, AAAI 2018.
• Rui Zhao, Gerwin Schalk, Qiang Ji, Temporal Pattern Localization using Mixed Integer Linear Programming, ICPR 2018.
• Rui Zhao, Qiang Ji, An Empirical Evaluation of Bayesian Inference Methods for Bayesian Neural Networks, NIPS Workshop 2018 (to appear).
• Rui Zhao, Wanru Xu, Hui Su, Qiang Ji, Bayesian Hierarchical Dynamic Model for Human Action Recognition (under review).
• Rui Zhao, Hui Su, Qiang Ji, Bayesian Adversarial Human Motion Synthesis (under review).
• Rui Zhao, Kang Wang, Hui Su, Qiang Ji, Bayesian Graph Convolution LSTM for Skeleton-based Action Recognition (under review).