Deep Adversarial Gaussian Mixture Auto-Encoder for Clustering
Warith HARCHAOUI, Pierre-Alexandre MATTEI, Charles BOUVEYRON
Université Paris Descartes - MAP5
Oscaro.com - Research & Development
February 2017
Clustering
Clustering is grouping similar objects together!
Thesis
Representation Learning and Clustering operate in symbiosis
Gaussian Mixture Model
- Density Estimation applied to Clustering with K modes/clusters (a minimal fitting sketch follows this list)
- Linear complexity, suitable for large-scale problems
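To make the density-estimation view concrete, here is a minimal GMM clustering sketch using scikit-learn (an illustration under assumed toy data and hyper-parameters, not the authors' setup):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: three well-separated 2D clusters (placeholder, not MNIST).
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(100, 2))
               for m in ([0, 0], [3, 3], [0, 3])])

gmm = GaussianMixture(n_components=3, covariance_type="full")
labels = gmm.fit_predict(X)          # EM scales linearly in the sample size
print(gmm.weights_, gmm.means_)      # (pi, mu); Sigma in gmm.covariances_
```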
Learning Representations
- Successful in a supervised context (Kernel SVM)
- Successful in an unsupervised context (Spectral Clustering)
Auto-Encoder
An auto-encoder is a neural network that consists of:
- an Encoder: E : R^D → R^d (compression)
- a Decoder: D : R^d → R^D (decompression)
with D ≫ d and D(E(x)) ≈ x
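As an illustration, a minimal auto-encoder sketch in PyTorch (an assumption: the slides do not prescribe this library, these layer sizes, or the dimensions D = 784, d = 10):

```python
import torch
import torch.nn as nn

D, d = 784, 10  # assumed dimensions (e.g. MNIST pixels -> a 10-dim code)

encoder = nn.Sequential(nn.Linear(D, 256), nn.ReLU(), nn.Linear(256, d))
decoder = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, D))

x = torch.rand(32, D)                    # a batch of inputs
x_hat = decoder(encoder(x))              # D(E(x)) ≈ x after training
loss = nn.functional.mse_loss(x_hat, x)  # reconstruction loss
loss.backward()
```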
Optimization Scheme
Figure: Global Optimization Scheme for DAC. [Diagram: the Encoder maps the Input Space to the Code Space and the Decoder maps back; in the code space, Gaussian clusters (π, µ, Σ) form a GMM, and a Discriminator compares codes against it.]
Adversarial Auto-Encoder
An adversarial auto-encoder is a neural network that consists of:
- an Encoder: E : R^D → R^d (compression)
- a Decoder: D : R^d → R^D (decompression)
- a Prior: P : R^d → R with ∫_{R^d} P = 1, associated with a random generator of distribution P (see the sampling sketch below)
- a Discriminator: A : R^d → [0, 1] ⊂ R that distinguishes fake data (from the random generator) and real data (from the encoder)
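In DAC the prior is a Gaussian mixture with parameters (π, µ, Σ), as in the scheme on slide 7. A minimal sampler for such a prior might look like this (a sketch; the parameter values are placeholders, not learned ones):

```python
import numpy as np

def sample_gmm_prior(pi, mu, sigma, n, rng=np.random.default_rng()):
    """Draw n codes from the mixture sum_k pi_k N(mu_k, Sigma_k)."""
    K, d = mu.shape
    ks = rng.choice(K, size=n, p=pi)  # pick one component per sample
    return np.stack([rng.multivariate_normal(mu[k], sigma[k]) for k in ks])

# Placeholder parameters: K = 3 components in a d = 2 code space.
pi = np.array([0.5, 0.3, 0.2])
mu = np.array([[0.0, 0.0], [3.0, 3.0], [0.0, 3.0]])
sigma = np.stack([np.eye(2)] * 3)
fake_codes = sample_gmm_prior(pi, mu, sigma, n=64)  # "fake" codes for A
```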
Optimizations
Three objectives, in three lines (a one-step training sketch follows the list):
- The encoder and decoder try to minimize the reconstruction loss
- The discriminator tries to distinguish fake codes (from the random generator associated with the prior) from real codes (from the encoder)
- The encoder also tries to fool the discriminator (opposite of the discriminator loss function)
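A hedged sketch of one training step implementing these three objectives in PyTorch (the networks, optimizers, labels, and binary cross-entropy losses are assumptions, not the paper's exact setup). It keeps the deck's convention that prior samples are the fake codes and encoder outputs the real ones:

```python
import torch
import torch.nn.functional as F

def train_step(x, encoder, decoder, discr, opt_ae, opt_disc, opt_enc, sample_prior):
    # discr ends with a sigmoid, so its outputs live in [0, 1];
    # sample_prior(n) returns n fake codes (e.g. from a GMM prior).

    # 1) Encoder and decoder minimize the reconstruction loss.
    opt_ae.zero_grad()
    F.mse_loss(decoder(encoder(x)), x).backward()
    opt_ae.step()

    # 2) Discriminator separates fake codes (prior, label 0)
    #    from real codes (encoder, label 1).
    opt_disc.zero_grad()
    z_fake, z_real = sample_prior(len(x)), encoder(x).detach()
    d_loss = (F.binary_cross_entropy(discr(z_fake), torch.zeros(len(x), 1))
              + F.binary_cross_entropy(discr(z_real), torch.ones(len(x), 1)))
    d_loss.backward()
    opt_disc.step()

    # 3) Encoder fools the discriminator: it wants its codes
    #    classified as prior samples (the opposite target).
    opt_enc.zero_grad()
    F.binary_cross_entropy(discr(encoder(x)), torch.zeros(len(x), 1)).backward()
    opt_enc.step()
```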
Results
Method                                                MNIST-70k  Reuters-10k  HHAR
DAC EC (Ensemble Clustering)                          96.50      73.34        81.24
DAC                                                   94.08      72.14        80.50
GMVAE                                                 88.54      -            -
DEC                                                   84.30      72.17        79.86
AE + GMM (full covariances, median acc. over 10 runs) 82.56      70.12        78.48
GMM                                                   53.73      54.72        60.34
KM                                                    53.47      54.04        59.98

Table: Experimental accuracy results (%, the higher, the better) based on the Hungarian method
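For context, clustering accuracy with the Hungarian method matches each predicted cluster to a true class by solving a linear assignment problem; a minimal sketch (an illustration, not the authors' evaluation code):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """Best cluster-to-class matching (Hungarian method), then accuracy."""
    K = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((K, K), dtype=int)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1                        # contingency table
    rows, cols = linear_sum_assignment(-cost)  # maximize matched counts
    return cost[rows, cols].sum() / len(y_true)

# Labels are permuted but the grouping is perfect, so accuracy is 1.0:
print(clustering_accuracy(np.array([0, 0, 1, 1]), np.array([1, 1, 0, 0])))
```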
Visualizations
Figure: Confusion matrix for DAC on MNIST (rows: actual class, columns: predicted class, digits 0 to 9; the diagonal counts dominate). (best seen in color)
Visualizations
Figure: Generated digit images. From left to right: the ten classes found by DAC, ordered thanks to the Hungarian algorithm. From top to bottom (rows µk, µk + 0.5σ, µk + 1σ, ..., µk + 3.5σ): we go further and further in random directions from the centroids, the first row being the decoded centroids.
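The generation procedure behind this figure can be sketched as follows (an assumption about the exact mechanics: decode points at increasing distances t·σ from each centroid along a random unit direction; `decoder`, `mu`, and the per-cluster scale `sigma` stand for the trained decoder and mixture parameters):

```python
import numpy as np

def generate_grid(decoder, mu, sigma, steps=np.arange(0.0, 4.0, 0.5),
                  rng=np.random.default_rng()):
    """For each cluster k, decode mu_k + t * sigma_k * u for t in steps,
    where u is a random unit direction in the code space."""
    rows = []
    for t in steps:                       # one row per step t
        row = []
        for k in range(len(mu)):
            u = rng.standard_normal(mu.shape[1])
            u /= np.linalg.norm(u)        # random unit direction
            row.append(decoder(mu[k] + t * sigma[k] * u))
        rows.append(row)                  # row t=0 decodes the centroids
    return rows
```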
Visualizations
Figure: Principal Component Analysis rendering of the code space for MNIST at the end of the DAC optimization, with colors indicating the true labels. (best seen in color)
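Such a rendering can be reproduced along these lines (a sketch assuming `codes` holds the learned d-dimensional codes and `labels` the true classes):

```python
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_code_space(codes, labels):
    """Project codes to 2D with PCA and color points by true label."""
    xy = PCA(n_components=2).fit_transform(codes)
    plt.scatter(xy[:, 0], xy[:, 1], c=labels, cmap="tab10", s=5)
    plt.xlabel("PC 1")
    plt.ylabel("PC 2")
    plt.show()
```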
Conclusion
Representation Learning and Clustering operate in symbiosis
References I
Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672-2680, 2014.
References II
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, and Ian Goodfellow. Adversarial autoencoders. arXiv preprint arXiv:1511.05644, 2015.

Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio, and Pierre-Antoine Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11(Dec):3371-3408, 2010.