25
Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

  • View
    221

  • Download
    2

Embed Size (px)

Citation preview

Page 1: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Hilbert Space Embeddings of Hidden Markov Models

Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola

1

Page 2: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Graphical Models Kernel Methods

Big Picture Question

Dependent variablesHidden variables

High dimensional Nonlinear Multimodal

High dimensional Nonlinear Multimodal

Dependent variablesHidden variables

Combine the best of graphical models and kernel methods?

2

Page 3: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Hidden Markov Models (HMMs)

Video sequence

Music

High-dimensional features Hidden variablesUnsupervised learning

3

Page 4: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Notation

ObservationTransitionPrior

4

Page 5: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Previous Work on HMMs• Expectation maximization [Dempster et al. 77]:

– Maximum likelihood solutionLocal maximaCurse of dimensionality

• Singular value decomposition (SVD) for surrogate hidden states

No local optimaConsistent

Spectral HMMs [Hsu et al. 09, Siddiqi et al. 10], Subspace Identification [Van Overschee and De Moor 96]

5

Page 6: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

• Input output• Variable elimination:

Predictive Distributions of HMMs

=Observable

Operator [Jaeger 00]

6

Page 7: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

• Input output• Variable elimination (matrix representation):

Predictive Distributions of HMMs

7

Page 8: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

• Key observation: need not recover :

Observable representation of HMM

• where are singular vectors of joint probability of sequence pairs [Hsu et al. 09]

Only need to estimate O, Ax and π up to invertible transformation S

8

Page 9: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Observable representation for HMM

pairs triplets singletons

sequence

Thin SVD of C2,1, get principal left singular vectors U 9

Page 10: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Observable representation for HMMs

pairs triplets singletons

Works only for discrete case 10

Page 11: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Key Objects in Graphical Models• Marginal distributions• Joint distributions• Conditional distributions • Sum rule • Product rule

Use kernel representation for distributions, do probabilistic inference in feature space

11

Page 12: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Embedding distributions• Summary statistics for distributions :

• Pick a kernel , and generate a different summary statistic

Mean

Covariance

expected features

Probability P(y0)

12

Page 13: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Embedding distributions

• One-to-one mapping from to for certain kernels (RBF kernel)

• Sample average converges to true mean at13

Page 14: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Embedding joint distributions• Embedding joint distributions using

outer-product feature map

• is also the covariance operator • Recover discrete probability with delta kernel • Empirical estimate converges at

14

Page 15: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Embedding Conditionals

• For each value X=x conditioned on, return the summary statistic for

• Some X=x are never observed15

Page 16: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Embedding conditionals

Conditional Embedding Operator

avoid data partition

16

Page 17: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Conditional Embedding Operator• Estimation via covariance operators [Song et al. 09]

• Gaussian case: covariance matrix instead• Discrete case: joint over marginal • Empirical estimate converges at

17

Page 18: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Sum and Product RulesProbabilistic

RelationHilbert Space

Relation

Sum Rule

Product Rule

Total Expectation

ConditionalEmbedding

Linearity

18

Page 19: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Hilbert Space HMMs

19

Page 20: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Hilbert space HMMs

pairs triplets singletons

20

Page 21: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Experiment• Video sequence prediction• Slot car sensor measurement prediction • Speech classification• Compare with discrete HMMs learned by EM

[Dempster et al. 77], spectral HMM [Sajid et al. 10], and Linear dynamical system approach (LDS) [Sajid et al. 08]

21

Page 22: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Predicting Video Sequences

• Sequence of grey scale images as inputs• Latent space dimension 50

22

Page 23: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Predicting Sensor Time-series• Inertial unit: 3D acceleration and orientation• Latent space dimension 20

23

Page 24: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Audio Event Classification• Mel-Frequency Cepstral Coefficients features• Varying latent space dimension

24

Page 25: Hilbert Space Embeddings of Hidden Markov Models Le Song, Byron Boots, Sajid Siddiqi, Geoff Gordon and Alex Smola 1

Summary• Represent distributions in feature spaces,

reason using Hilbert space sum and product rules

• Extends HMMs nonparametrically to domains with kernels

• Kernelize belief propagation, CRF and general graphical models with hidden variables?

25