University of Joensuu Dept. of Computer Science P.O. Box 111 FIN-80101 Joensuu Tel. +358 13 251 7959 fax +358 13 251 7955 www.cs.joensuu.fi Gaussian Mixture Models Speech and Image Processing Unit Department of Computer Science University of Joensuu, FINLAND Ville Hautamäki Clustering Methods: Part 8


Gaussian Mixture Models

Speech and Image Processing Unit, Department of Computer Science

University of Joensuu, FINLAND

Ville Hautamäki

Clustering Methods: Part 8


Preliminaries

• We assume that the dataset X has been generated by a parametric distribution p(X).

• Estimation of the parameters of p is known as density estimation.

• We consider the Gaussian distribution.

Figures taken from: http://research.microsoft.com/~cmbishop/PRML/


Typical parameters (1)

• Mean (μ): average value of X under p(X), also called the expectation.

• Variance (σ²): provides a measure of variability in p(X) around the mean.


Typical parameters (2)

• Covariance: measures how much two variables vary together.

• Covariance matrix: collection of covariances between all dimensions.

– The diagonal of the covariance matrix contains the variances of each attribute.
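The relationship between the covariance matrix and the per-attribute variances can be checked numerically. The following is a minimal sketch using NumPy; the toy dataset `X` is hypothetical and not taken from the slides, and the estimates divide by N (not N − 1).

```python
import numpy as np

# Hypothetical toy dataset: N = 5 observations, D = 2 attributes.
X = np.array([[1.0, 2.0],
              [2.0, 4.1],
              [3.0, 5.9],
              [4.0, 8.2],
              [5.0, 9.8]])

mean = X.mean(axis=0)                      # per-attribute mean, shape (D,)
centered = X - mean                        # subtract the mean from every row
cov = centered.T @ centered / X.shape[0]   # covariance matrix, dividing by N

variances = X.var(axis=0)                  # per-attribute variance, also dividing by N

print("mean:", mean)
print("covariance matrix:\n", cov)
print("diagonal equals variances:", np.allclose(np.diag(cov), variances))
```

The off-diagonal entry is positive here because the two attributes increase together.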


One-dimensional Gaussian

• Parameters to be estimated are the mean (μ) and variance (σ²)

$$\mathrm{Normal}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x-\mu)^2\right)$$
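As a small illustration, the one-dimensional density can be evaluated directly from its definition. This is a sketch only; the function name `normal_pdf` and the argument name `sigma2` (the variance, not the standard deviation) are chosen here for illustration.

```python
import math

def normal_pdf(x: float, mu: float, sigma2: float) -> float:
    """Density of Normal(x | mu, sigma2), where sigma2 is the variance."""
    if sigma2 <= 0:
        raise ValueError("variance must be positive")
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma2)          # normalizing constant
    return norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma2))

# The density peaks at the mean and is symmetric around it.
peak = normal_pdf(0.0, 0.0, 1.0)
left = normal_pdf(-1.5, 0.0, 1.0)
right = normal_pdf(1.5, 0.0, 1.0)
print(peak, left, right)
```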


Multivariate Gaussian (1)

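• In the multivariate case we have a covariance matrix Σ instead of a variance (D is the number of dimensions)

$$\mathrm{Normal}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2} \det(\boldsymbol{\Sigma})^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T} \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$$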
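The multivariate density can be sketched in NumPy as follows. The function name `mvn_pdf` is an illustrative choice, and the code assumes the covariance matrix is symmetric and positive definite; it solves a linear system rather than forming the inverse explicitly, which is a common numerical choice and not something prescribed by the slides.

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density of Normal(x | mu, cov) for a single D-dimensional point x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    D = mu.shape[0]
    diff = x - mu
    # Mahalanobis term (x - mu)^T cov^{-1} (x - mu), computed via a linear solve.
    maha = diff @ np.linalg.solve(cov, diff)
    norm = (2.0 * np.pi) ** (D / 2.0) * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * maha) / norm)

# With D = 1 the formula reduces to the one-dimensional Gaussian.
p1 = mvn_pdf([0.0], [0.0], [[1.0]])
# With an identity covariance in D = 2 the peak value is 1 / (2 * pi).
p2 = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
print(p1, p2)
```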


Multivariate Gaussian (2)

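Three covariance structures (shown for two dimensions):

Full covariance:

$$\boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11}^2 & \sigma_{12}^2 \\ \sigma_{12}^2 & \sigma_{22}^2 \end{pmatrix}$$

Diagonal:

$$\boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{1}^2 & 0 \\ 0 & \sigma_{2}^2 \end{pmatrix}$$

Single:

$$\boldsymbol{\Sigma} = \begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix}$$

Complete data log likelihood:

$$\ln p(X) = \sum_{n=1}^{N} \ln \mathrm{Normal}(\mathbf{x}_n \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})$$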


Maximum Likelihood (ML) parameter estimation

• Maximize the log likelihood formulation

• Setting the gradient of the complete data log likelihood to zero, we can find the closed form solution.

– In the case of the mean, this is the sample average.
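The step from "gradient equals zero" to "sample average" can be written out briefly. The derivation below is a sketch for a single multivariate Gaussian; the subscript ML is a label introduced here for the maximum likelihood estimates, and the covariance result is stated without its derivation.

```latex
\begin{align}
\ln p(X) &= \sum_{n=1}^{N} \ln \mathrm{Normal}(\mathbf{x}_n \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) \\
\frac{\partial}{\partial \boldsymbol{\mu}} \ln p(X)
  &= \sum_{n=1}^{N} \boldsymbol{\Sigma}^{-1} (\mathbf{x}_n - \boldsymbol{\mu}) = 0 \\
\Rightarrow \quad \boldsymbol{\mu}_{\mathrm{ML}} &= \frac{1}{N} \sum_{n=1}^{N} \mathbf{x}_n \\
\boldsymbol{\Sigma}_{\mathrm{ML}} &= \frac{1}{N} \sum_{n=1}^{N}
  (\mathbf{x}_n - \boldsymbol{\mu}_{\mathrm{ML}})(\mathbf{x}_n - \boldsymbol{\mu}_{\mathrm{ML}})^{T}
\end{align}
```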


When one Gaussian is not enough

• Real world datasets are rarely unimodal!


Mixtures of Gaussians

$$p(\mathbf{x}) = \sum_{k=1}^{M} \pi_k \, \mathrm{Normal}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$$
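A mixture density is a weighted sum of component densities. The sketch below evaluates a one-dimensional mixture; the two-component parameter values are hypothetical and chosen only to produce a clearly bimodal density.

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    """One-dimensional Normal(x | mu, sigma2); sigma2 is the variance."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def mixture_pdf(x, pis, mus, sigma2s):
    """p(x) = sum_k pi_k * Normal(x | mu_k, sigma2_k) for scalar or array x."""
    x = np.asarray(x, dtype=float)
    return sum(pi * normal_pdf(x, mu, s2) for pi, mu, s2 in zip(pis, mus, sigma2s))

# Hypothetical two-component mixture (M = 2); values are illustrative only.
pis, mus, sigma2s = [0.3, 0.7], [-2.0, 3.0], [1.0, 0.5]

grid = np.linspace(-10.0, 10.0, 4001)
density = mixture_pdf(grid, pis, mus, sigma2s)
area = np.sum(density) * (grid[1] - grid[0])   # Riemann-sum estimate of the integral
print("approximate integral of p(x):", area)
```

Because the mixing coefficients sum to one, the mixture integrates to one just like each of its components.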


Mixtures of Gaussians (2)

• In addition to the mean and covariance parameters (now M of each), we have mixing coefficients π_k.

The following properties hold for the mixing coefficients:

$$\sum_{k=1}^{M} \pi_k = 1, \qquad 0 \le \pi_k \le 1$$

π_k can be seen as the prior probability of component k.


Responsibilities (1)

• Component labels (red, green and blue) cannot be observed.

• We have to calculate approximations (responsibilities).

[Figure panels: Complete data, Incomplete data, Responsibilities]


Responsibilities (2)

• Responsibility describes how probable it is that observation vector x was generated by component k.

• In hard clustering, responsibilities take only the values 0 and 1, and thus define a hard partitioning.


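Responsibilities (3)

We can express the marginal density p(x) as:

$$p(\mathbf{x}) = \sum_{k=1}^{M} p(k) \, p(\mathbf{x} \mid k)$$

From this, we can find the responsibility of the kth component for x using Bayes' theorem:

$$\gamma_k(\mathbf{x}) = p(k \mid \mathbf{x}) = \frac{p(k) \, p(\mathbf{x} \mid k)}{\sum_{l} p(l) \, p(\mathbf{x} \mid l)} = \frac{\pi_k \, \mathrm{Normal}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{l} \pi_l \, \mathrm{Normal}(\mathbf{x} \mid \boldsymbol{\mu}_l, \boldsymbol{\Sigma}_l)}$$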
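The responsibility of each component for each observation follows directly from Bayes' theorem: weight each component density by its mixing coefficient, then normalize per observation. The sketch below does this in NumPy; the function names and the small dataset are hypothetical, and the covariance matrices are assumed to be positive definite.

```python
import numpy as np

def gaussian_pdf(X, mu, cov):
    """Normal(x_n | mu, cov) for every row x_n of X; returns shape (N,)."""
    D = X.shape[1]
    diff = X - mu
    # Row-wise Mahalanobis terms (x_n - mu)^T cov^{-1} (x_n - mu).
    maha = np.einsum("nd,nd->n", diff, np.linalg.solve(cov, diff.T).T)
    norm = (2.0 * np.pi) ** (D / 2.0) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def responsibilities(X, pis, mus, covs):
    """Return an (N, M) array; entry (n, k) is the responsibility of component k for x_n."""
    weighted = np.column_stack(
        [pi * gaussian_pdf(X, mu, cov) for pi, mu, cov in zip(pis, mus, covs)]
    )
    return weighted / weighted.sum(axis=1, keepdims=True)

# Hypothetical example: two components with equal weights and identity covariances.
X = np.array([[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]])
pis = np.array([0.5, 0.5])
mus = np.array([[0.0, 0.0], [4.0, 4.0]])
covs = np.array([np.eye(2), np.eye(2)])

gamma = responsibilities(X, pis, mus, covs)
print(gamma)
```

The point halfway between the two means is shared equally, while the points at the means are assigned almost entirely to their own component.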


Expectation Maximization (EM)

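• Goal: Maximize the log likelihood of the whole data

$$\ln p(X \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{M} \pi_k \, \mathrm{Normal}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \right\}$$

• When responsibilities are calculated, we can maximize individually for the means, covariances and the mixing coefficients!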


Exact update equations

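Here γ_k(x_n) denotes the responsibility of component k for observation x_n.

New mean estimates:

$$\boldsymbol{\mu}_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_k(\mathbf{x}_n) \, \mathbf{x}_n, \qquad N_k = \sum_{n=1}^{N} \gamma_k(\mathbf{x}_n)$$

Covariance estimates:

$$\boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma_k(\mathbf{x}_n) (\mathbf{x}_n - \boldsymbol{\mu}_k)(\mathbf{x}_n - \boldsymbol{\mu}_k)^{T}$$

Mixing coefficient estimates:

$$\pi_k = \frac{N_k}{N}$$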
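The three update equations translate almost line by line into NumPy. The sketch below assumes the responsibilities are given as an (N, M) array `gamma`; the function name `m_step` and the example data are hypothetical. The example uses hard 0/1 responsibilities, in which case the updates reduce to per-cluster means, covariances and cluster proportions.

```python
import numpy as np

def m_step(X, gamma):
    """Re-estimate mixing coefficients, means and covariances from responsibilities."""
    N, D = X.shape
    M = gamma.shape[1]
    Nk = gamma.sum(axis=0)                   # N_k: effective number of points per component
    mus = (gamma.T @ X) / Nk[:, None]        # responsibility-weighted means, shape (M, D)
    covs = np.empty((M, D, D))
    for k in range(M):
        diff = X - mus[k]
        # Responsibility-weighted sum of outer products, divided by N_k.
        covs[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
    pis = Nk / N                             # mixing coefficients
    return pis, mus, covs

# Hypothetical data with hard (0/1) responsibilities for two clusters.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]])
gamma = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

pis, mus, covs = m_step(X, gamma)
print(pis)
print(mus)
print(covs[0])
```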


EM Algorithm

• Initialize parameters

• while not converged

– E step: Calculate responsibilities.

– M step: Estimate new parameters.

– Calculate the log likelihood of the new parameters (used to check convergence).
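The loop above can be sketched end to end as follows. This is a minimal illustration under several assumptions that are not in the slides: means are initialized from randomly chosen data points, every component starts with the covariance of the whole dataset, convergence is declared when the log likelihood changes by less than `tol`, and a small `reg` term is added to the covariance diagonals as a numerical safeguard. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def weighted_densities(X, pis, mus, covs):
    """Return an (N, M) array whose entry (n, k) is pi_k * Normal(x_n | mu_k, cov_k)."""
    N, D = X.shape
    out = np.empty((N, len(pis)))
    for k, (pi, mu, cov) in enumerate(zip(pis, mus, covs)):
        diff = X - mu
        maha = np.einsum("nd,nd->n", diff, np.linalg.solve(cov, diff.T).T)
        norm = (2.0 * np.pi) ** (D / 2.0) * np.sqrt(np.linalg.det(cov))
        out[:, k] = pi * np.exp(-0.5 * maha) / norm
    return out

def fit_gmm(X, M, n_iter=200, tol=1e-6, reg=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Initialization (an assumption of this sketch, see the lead-in).
    pis = np.full(M, 1.0 / M)
    mus = X[rng.choice(N, size=M, replace=False)].copy()
    base_cov = np.atleast_2d(np.cov(X, rowvar=False, bias=True)) + reg * np.eye(D)
    covs = np.stack([base_cov] * M)
    history = []
    for _ in range(n_iter):
        # E step: responsibilities from the current parameters.
        weighted = weighted_densities(X, pis, mus, covs)
        gamma = weighted / weighted.sum(axis=1, keepdims=True)
        # M step: update means, covariances and mixing coefficients.
        Nk = gamma.sum(axis=0)
        mus = (gamma.T @ X) / Nk[:, None]
        for k in range(M):
            diff = X - mus[k]
            covs[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(D)
        pis = Nk / N
        # Log likelihood of the new parameters, used as the convergence check.
        ll = float(np.log(weighted_densities(X, pis, mus, covs).sum(axis=1)).sum())
        converged = bool(history) and abs(ll - history[-1]) < tol
        history.append(ll)
        if converged:
            break
    return pis, mus, covs, history

# Synthetic data: two hypothetical Gaussian clusters in two dimensions.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal([0.0, 0.0], 1.0, size=(200, 2)),
               data_rng.normal([6.0, 6.0], 1.0, size=(200, 2))])

pis, mus, covs, history = fit_gmm(X, M=2)
print("mixing coefficients:", pis)
print("means:\n", mus)
print("iterations:", len(history))
```

Tracking the log likelihood per iteration is useful beyond the stopping rule: it should not decrease from one iteration to the next, so a drop usually signals a bug.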


Example of EM


Computational complexity

• Hard clustering with MSE criterion is NP-complete.

• Can we find optimal GMM in polynomial time?

• Finding optimal GMM is in class NP


Some insights

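• In a GMM we need to estimate the parameters, all of which are real numbers

– Number of parameters (full covariance): M + M·D + M·D(D+1)/2, that is, M mixing coefficients, M·D mean components and, per component, the D(D+1)/2 distinct entries of a symmetric covariance matrix (D variances plus D(D-1)/2 covariances)

• Hard clustering has no parameters, just set partitioning (remember optimality criteria!)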
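The parameter count of a GMM grows quickly with the dimension. The sketch below counts the real-valued parameters of a full-covariance model with M components in D dimensions, using D(D+1)/2 distinct entries per symmetric covariance matrix (D variances plus D(D-1)/2 off-diagonal covariances); the function name is illustrative.

```python
def gmm_num_parameters(M: int, D: int) -> int:
    """Count the real-valued parameters of an M-component, D-dimensional full-covariance GMM."""
    weights = M                           # mixing coefficients (only M - 1 are free: they sum to 1)
    means = M * D                         # one D-dimensional mean per component
    covariances = M * D * (D + 1) // 2    # distinct entries of M symmetric D x D matrices
    return weights + means + covariances

for M, D in [(1, 1), (3, 2), (16, 10)]:
    print(f"M={M}, D={D}: {gmm_num_parameters(M, D)} parameters")
```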


Some further insights (2)

• Both optimization functions are mathematically rigorous!

• Solutions minimizing MSE are always meaningful

• Maximization of log likelihood might lead to singularity!


Example of singularity
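The singularity can be reproduced numerically: if one component's mean sits exactly on a data point and its variance shrinks toward zero, the log likelihood grows without bound. The sketch below uses a hypothetical one-dimensional dataset and a two-component mixture with fixed, equal mixing coefficients.

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    """One-dimensional Normal(x | mu, sigma2); sigma2 is the variance."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def log_likelihood(x, pis, mus, sigma2s):
    """Log likelihood of a one-dimensional Gaussian mixture on the data x."""
    dens = sum(pi * normal_pdf(x, mu, s2) for pi, mu, s2 in zip(pis, mus, sigma2s))
    return float(np.sum(np.log(dens)))

# Hypothetical data; the second component is centred exactly on the point x = 4.0.
x = np.array([-1.0, 0.0, 0.5, 1.5, 4.0])
pis = [0.5, 0.5]
mus = [0.0, 4.0]

shrinking = [1.0, 1e-2, 1e-4, 1e-6]
lls = [log_likelihood(x, pis, mus, [1.0, s2]) for s2 in shrinking]
for s2, ll in zip(shrinking, lls):
    print(f"variance of component 2 = {s2:g}: log likelihood = {ll:.3f}")
```

The broad first component keeps the density of every other point above zero, so nothing counteracts the growing contribution of the collapsed component. A single Gaussian does not have this problem, because shrinking its variance drives the density of all other points toward zero.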