Deep generative models of natural images
Emily Denton
Spring 2016
Outline

1 Motivation
2 Deep generative models: Intro
3 Deep generative models: Recent algorithms
   Variational autoencoders
   Generative adversarial networks
   Generative moment matching networks
   Evaluating generative models
4 Extensions
Generative models
Have access to x ∼ pdata(x) through training set
Want to learn a model x ∼ pmodel(x)
Want pmodel to be similar to pdata
  Samples drawn from pmodel reflect the structure of pdata
  Samples from the true data distribution have high likelihood under pmodel
Why do generative modeling?
Unsupervised representation learning
  Can transfer the learned representation to discriminative tasks, retrieval, clustering, etc.
Train networks with both discriminative and generative criteria
Utilize unlabeled data, regularize
Understand data
Density estimation
Data augmentation
...
Focus of this talk
Generative modeling is a HUGE field... I will focus on (a selected set of) deep directed models of natural images
Outline
1 Motivation
2 Deep generative models: Intro
3 Deep generative models: Recent algorithmsVariational autoencodersGenerative adversarial networksGenerative moment matching networksEvaluating generative models
4 Extensions
Directed graphical models
[Figure: two-node directed model, latent z → observed x]

We assume data is generated by:

z ∼ p(z),  x ∼ p(x|z)

z is latent/hidden; x is observed (the image)
Use θ to denote parameters of the generative model
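The two-step generative process above can be sketched numerically. A minimal sketch assuming a hypothetical linear-Gaussian decoder (the `W`, `b`, and `sigma` below are illustrative choices for θ, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear-Gaussian instance of the generative process:
# z ~ p(z) = N(0, I),  x ~ p(x|z) = N(W z + b, sigma^2 I)
d_z, d_x = 2, 4
W = rng.normal(size=(d_x, d_z))   # decoder parameters (part of theta)
b = rng.normal(size=d_x)
sigma = 0.1

def sample(n):
    """Ancestral sampling: draw the latent first, then the observation."""
    z = rng.normal(size=(n, d_z))                         # z ~ p(z)
    x = z @ W.T + b + sigma * rng.normal(size=(n, d_x))   # x ~ p(x|z)
    return z, x

z, x = sample(10000)
```

The marginal covariance of x under this model is W Wᵀ + σ²I, which the empirical samples should match.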
Deep directed graphical models
P(x) = Σ_h P(x|h) P(h)

Intractable
Can't optimize data likelihood directly

[Figure: deep directed model with layers of latent variables z and weights W generating x]
Bounding the marginal likelihood
Recall Jensen's inequality: when f is concave, f(E[x]) ≥ E[f(x)]

log p(x) = log ∫_z p(x, z)
         = log ∫_z q(z) p(x, z)/q(z)
         ≥ ∫_z q(z) log [p(x, z)/q(z)] = L(x; θ, φ)    (by Jensen's inequality)
         = ∫_z q(z) log p(x, z) − ∫_z q(z) log q(z)
         = Eq(z)[log p(x, z)] + H(q(z))
           (expectation of joint distribution) (entropy)
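The bound can be checked numerically on a toy model. A sketch assuming the simplest possible instance, p(z) = N(0, 1), p(x|z) = N(z, 1), q(z) = N(m, s²), where both the ELBO and the exact marginal log p(x) = log N(x; 0, 2) are available in closed form:

```python
import numpy as np

def log_normal(x, mu, var):
    # log density of N(mu, var) at x
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (x - mu) ** 2 / var

def elbo(x, m, s2):
    """L(x) = E_q[log p(x, z)] + H(q) for p(z)=N(0,1), p(x|z)=N(z,1), q(z)=N(m, s2).
    Both terms are closed-form for this toy model."""
    e_joint = (-0.5 * np.log(2 * np.pi) - 0.5 * (m**2 + s2)          # E_q[log p(z)]
               - 0.5 * np.log(2 * np.pi) - 0.5 * ((x - m)**2 + s2))  # E_q[log p(x|z)]
    entropy = 0.5 * np.log(2 * np.pi * np.e * s2)                    # H(q)
    return e_joint + entropy

x = 1.3
log_px = log_normal(x, 0.0, 2.0)   # exact marginal: p(x) = N(0, 2)
```

The bound holds for any choice of (m, s²), and is tight exactly at the true posterior p(z|x) = N(x/2, 1/2).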
Evidence Lower BOund (ELBO)
Bound is tight when the variational approximation matches the true posterior:

log p(x) − L(x; θ, φ) = log p(x) − ∫_z q(z) log [p(x, z)/q(z)]
                      = ∫_z q(z) log p(x) − ∫_z q(z) log [p(x, z)/q(z)]
                      = ∫_z q(z) log [q(z) p(x)/p(x, z)]
                      = DKL(q(z; φ) || p(z|x))
Summary
Assume existence of q(z;φ)
Bound log p(x; θ) with L(x; θ, φ)
Bound is tight when:
DKL(q(z;φ)||p(z|x)) = 0 ⇐⇒ q(z;φ) = p(z|x)
Learning directed graphical models
Maximize bound on likelihood of data:
max_θ Σ_{i=1}^N log p(x_i; θ) ≥ max_{θ, φ_1,…,φ_N} Σ_{i=1}^N L(x_i; θ, φ_i)
Historically, a different φ_i was used for every data point
But we'll move away from this soon...
q(z; φ) is typically a factorized distribution
For more info see Blei et al. (2003)
New method of learning: approximate inference model
Instead of having different variational parameters for each data point, fit a conditional parametric function
The output of this function will be the parameters of the variational distribution qφ(z|x)
Instead of q(z) we have qφ(z|x)
ELBO becomes:
L(x; θ, φ) = Eqφ(z|x)[log pθ(x, z)] + H(qφ(z|x))
             (expectation of joint distribution) (entropy)
Variational autoencoder
Encoder network maps from image space to latent space
  Outputs parameters of qφ(z|x)
Decoder maps from latent space back into image space
  Outputs parameters of pθ(x|z)

[Figure: Encoder/Inference network x → z ∼ qφ(z|x); Decoder/Generative network z → x̂ ∼ pθ(x|z)]

[Kingma & Welling (2013)]
Variational autoencoder
Rearranging the ELBO:
L(x; θ, φ) = ∫_z qφ(z|x) log [p(x, z)/qφ(z|x)]
           = ∫_z qφ(z|x) log [p(x|z) p(z)/qφ(z|x)]
           = ∫_z qφ(z|x) log p(x|z) + ∫_z qφ(z|x) log [p(z)/qφ(z|x)]
           = Eq(z|x) log p(x|z) − Eq(z|x) log [q(z|x)/p(z)]
           = Eq(z|x) log p(x|z) − DKL(q(z|x) || p(z))
             (reconstruction term) (prior term)
Variational autoencoder
Inference network outputs parameters of qφ(z|x)
Generative network outputs parameters of pθ(x|z)
Optimize θ and φ jointly by maximizing the ELBO:

L(x; θ, φ) = Eq(z|x) log p(x|z) − DKL(q(z|x) || p(z))
             (reconstruction term) (prior term)

[Figure: Encoder/Inference network x → z ∼ qφ(z|x); Decoder/Generative network z → x̂ ∼ pθ(x|z)]
Stochastic gradient variational Bayes (SGVB) estimator

Reparameterization trick: re-parameterize z ∼ qφ(z|x) as

z = gφ(x, ε) with ε ∼ p(ε)

For example, with a Gaussian we can write z ∼ N(µ, σ²) as

z = µ + σε with ε ∼ N(0, 1)
[Kingma & Welling (2013); Rezende et al. (2014)]
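A quick numeric illustration of the trick (plain NumPy; the values of µ and σ are arbitrary): the sample becomes a deterministic function of (µ, σ) plus parameter-free noise, so a Monte Carlo expectation can be differentiated through the sample path.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 1.5, 0.7

# Reparameterize z ~ N(mu, sigma^2) as z = mu + sigma * eps, eps ~ N(0, 1).
# The randomness is moved outside the parameters, so gradients w.r.t.
# mu and sigma can flow through the sample.
eps = rng.normal(size=100000)
z = mu + sigma * eps

# Gradient of a Monte Carlo objective, e.g. E[z^2], by differentiating the path:
# d/d_mu E[(mu + sigma*eps)^2] = E[2(mu + sigma*eps)] = 2*mu
grad_mu = np.mean(2 * (mu + sigma * eps))
```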
Stochastic gradient variational Bayes (SGVB) estimator

L(x; θ, φ) = Eq(z|x) log p(x|z) − DKL(q(z|x) || p(z))
             (reconstruction term) (prior term)

Using the reparameterization trick we form a Monte Carlo estimate of the reconstruction term:

Eqφ(z|x) log pθ(x|z) = Ep(ε) log pθ(x|gφ(x, ε))
                     ≈ (1/L) Σ_{l=1}^L log pθ(x|gφ(x, ε_l)) where ε_l ∼ p(ε)

KL divergence term can often be computed analytically (e.g. Gaussian)
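Both pieces can be checked in a small sketch (assuming a diagonal-Gaussian q and a standard-normal prior; the numbers for µ and σ are arbitrary): the analytic KL should agree with a reparameterized Monte Carlo estimate of Eq[log q(z|x) − log p(z)].

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = np.array([0.5, -1.0]), np.array([0.8, 1.3])

# Analytic KL term of the ELBO for diagonal-Gaussian q and standard-normal prior:
# D_KL(N(mu, sigma^2) || N(0, I)) = 0.5 * sum(mu^2 + sigma^2 - log sigma^2 - 1)
kl_analytic = 0.5 * np.sum(mu**2 + sigma**2 - np.log(sigma**2) - 1.0)

# Monte Carlo check of the same quantity via the reparameterization trick:
# KL = E_q[log q(z|x) - log p(z)], with z = mu + sigma * eps
eps = rng.normal(size=(200000, 2))
z = mu + sigma * eps
log_q = np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * eps**2, axis=1)
log_p = np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * z**2, axis=1)
kl_mc = np.mean(log_q - log_p)
```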
VAE learned manifold
[Kingma & Welling (2013)]
VAE samples
[Kingma & Welling (2013)]
VAE tradeoffs
Pros:
  Theoretically pleasing
  Optimizes bound on likelihood
  Easy to implement

Cons:
  Samples tend to be blurry
    Maximum likelihood minimizes DKL(pdata || pmodel)

[Theis et al. (2016)]
Generative adversarial networks
Don't focus on optimizing p(x), just learn to sample

Two networks pitted against one another:
  Generative model G captures the data distribution
  Discriminative model D distinguishes between real and fake samples

[Figure: Generative network z ∼ pnoise(z) → x̂; Discriminative network receives real x ∼ pdata(x) (D tries to output 1) or generated x̂ (D tries to output 0)]
[Goodfellow et al. (2014)]
Generative adversarial networks
D is trained to estimate the probability that a sample came from the data distribution rather than from G

G is trained to maximize the probability of D making a mistake

min_G max_D  E_{x∼pdata(x)} log D(x) + E_{z∼pnoise(z)} log(1 − D(G(z)))
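A discrete toy calculation makes this objective concrete (the two distributions below are illustrative): for a fixed G, the optimal discriminator is D*(x) = pdata(x)/(pdata(x) + pg(x)), and the value function at D* equals −log 4 + 2·JSD(pdata || pg) (Goodfellow et al. (2014)).

```python
import numpy as np

# Toy discrete distributions over three outcomes (assumed for illustration)
p_data = np.array([0.5, 0.3, 0.2])
p_g    = np.array([0.2, 0.3, 0.5])

# Optimal discriminator for a fixed generator
d_star = p_data / (p_data + p_g)

# GAN value function evaluated at D*
value = np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1.0 - d_star))

# Jensen-Shannon divergence between p_data and p_g
m = 0.5 * (p_data + p_g)
jsd = 0.5 * np.sum(p_data * np.log(p_data / m)) + 0.5 * np.sum(p_g * np.log(p_g / m))
```

The identity value = −log 4 + 2·JSD holds exactly, which is why training G against the optimal D approximately minimizes the Jensen-Shannon divergence.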
GAN samples
GAN tradeoffs
Pros:
  Very powerful model
  High quality samples

Cons:
  Tricky to train (no clear objective to track, instability)
  Can ignore large parts of image space
    Because it approximately minimizes the Jensen-Shannon divergence [Goodfellow et al. (2014); Theis et al. (2016); Huszar (2015)]
[Theis et al. (2016)]
Generative moment matching networks
Same idea as GANs, but a different optimization method
Match moments of the data and generative distributions
Maximum mean discrepancy (MMD)
  Estimator for answering whether two samples come from the same distribution
  Evaluate MMD on generated samples

[Figure: Generative network z ∼ pnoise(z) → x̂]
[Li et al. (2015); Dziugaite et al. (2015)]
Generative moment matching networks
L²_MMD = ‖ (1/N) Σ_{i=1}^N φ(x_i) − (1/M) Σ_{j=1}^M φ(x_j) ‖²

       = (1/N²) Σ_{i=1}^N Σ_{i′=1}^N φ(x_i)ᵀφ(x_{i′}) + (1/M²) Σ_{j=1}^M Σ_{j′=1}^M φ(x_j)ᵀφ(x_{j′}) − (2/NM) Σ_{i=1}^N Σ_{j=1}^M φ(x_i)ᵀφ(x_j)

(x_i are the N data samples, x_j the M generated samples)
Can make use of the kernel trick
If φ is the identity, we are matching means
A more complex φ can match higher order moments
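A sketch of the kernelized estimator (RBF kernel; the bandwidth `gamma` and the toy Gaussian data are illustrative): every inner product φ(a)ᵀφ(b) in the expansion above is replaced by a kernel evaluation k(a, b).

```python
import numpy as np

rng = np.random.default_rng(3)

def mmd2(x, y, gamma=0.5):
    """Biased squared-MMD estimate with RBF kernel k(a,b) = exp(-gamma*||a-b||^2),
    using the kernel trick in place of an explicit feature map phi."""
    def k(a, b):
        d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
        return np.exp(-gamma * d2)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

same_a = rng.normal(size=(500, 2))        # two samples from the same distribution
same_b = rng.normal(size=(500, 2))
shifted = rng.normal(size=(500, 2)) + 2.0  # a clearly different distribution

mmd_same = mmd2(same_a, same_b)
mmd_diff = mmd2(same_a, shifted)
```

The estimate is near zero when the two samples come from the same distribution and clearly positive otherwise, which is what makes it usable as a training criterion.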
GMMN samples
GMMN tradeoffs
Pros:
  Theoretically pleasing

Cons:
  Batch size very important
  Samples aren't great (get better when combined with an autoencoder)
[Theis et al. (2016)]
How to evaluate a generative model?
Log likelihood on held out data
  Makes sense when the goal is density estimation
  Many approaches don't have a tractable likelihood, or it isn't explicitly represented
  Have to resort to Parzen window estimates [Breuleux et al. (2009)]

Quality of samples
  But a lookup table of training images will succeed here...
Best: evaluate in context of particular application
See Theis et al. (2016) for more details
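The Parzen window fallback can be sketched in a few lines (the bandwidth `sigma` and the toy 1-D "model samples" are illustrative): the density at a test point is the average of Gaussian kernels centred on model samples, so only samples are needed, not an explicit likelihood.

```python
import numpy as np

rng = np.random.default_rng(4)

def parzen_logpdf(x, samples, sigma):
    """Parzen-window (KDE) log-likelihood: log p(x) ~= log mean_i N(x; s_i, sigma^2 I).
    x: (n, d) test points; samples: (m, d) model samples."""
    d = samples.shape[1]
    d2 = np.sum((x[:, None, :] - samples[None, :, :]) ** 2, axis=2)
    log_k = -0.5 * d2 / sigma**2 - 0.5 * d * np.log(2 * np.pi * sigma**2)
    # log-mean-exp over the sample axis, computed stably
    m = log_k.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.mean(np.exp(log_k - m), axis=1))

samples = rng.normal(size=(2000, 1))   # stand-in for samples from a model
test_pts = np.array([[0.0], [3.0]])
ll = parzen_logpdf(test_pts, samples, sigma=0.25)
```

The estimate assigns much higher log-likelihood near the mode than in the tails, but as Theis et al. (2016) discuss, it can badly misrank models in high dimensions.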
DRAW: Deep Recurrent Attentive Writer
Basic idea:
  Iteratively construct the image
  Observe the image through a sequence of glimpses
  Recurrent encoder and decoder
  Optimizes a variational bound

Attention mechanism determines:
  Input region observed by the encoder
  Output region modified by the decoder
[Gregor et al. (2015)]
DRAW samples
Generating images from captions
Language model: bidirectional RNN
Image model: conditional DRAW network
Image sharpening with a Laplacian pyramid adversarial network
[Mansimov et al. (2016)]
Generating images from captions
[Mansimov et al. (2016)]
Laplacian pyramid of adversarial networks
Difficult to generate large images in one shot
Break problem up into sequence of manageable steps
Samples drawn in coarse-to-fine fashion
[Figure: coarse-to-fine sampling chain across pyramid scales]
Each scale is a convnet trained using GAN framework
[Denton et al. (2015)]
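The pyramid machinery itself is easy to sketch (block-average downsampling stands in for the blur-and-subsample used in practice): each level stores the residual between an image and the upsampled coarser image, and coarse-to-fine reconstruction — the order in which LAPGAN samples, with each residual supplied by a per-scale generator — is exact.

```python
import numpy as np

def downsample(img):
    """2x downsample by block averaging (stand-in for blur + subsample)."""
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """2x nearest-neighbour upsample."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    """Laplacian pyramid: per-scale residuals h_k = I_k - upsample(I_{k+1}),
    plus the final low-resolution image."""
    residuals = []
    for _ in range(levels):
        low = downsample(img)
        residuals.append(img - upsample(low))
        img = low
    return residuals, img

def reconstruct(residuals, low):
    """Coarse-to-fine reconstruction: upsample, then add the residual at each scale."""
    img = low
    for h in reversed(residuals):
        img = upsample(img) + h
    return img

rng = np.random.default_rng(5)
image = rng.random((32, 32))
residuals, low = build_pyramid(image, levels=3)
recon = reconstruct(residuals, low)
```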
LAPGAN training procedure
[Figure: LAPGAN training procedure — at each scale, a generator G_k takes noise z_k (and, except at the coarsest scale, a low-pass image l_k) and produces a residual h̃_k; a discriminator D_k sees real residuals h_k or generated h̃_k and answers Real/Generated?]
LAPGAN samples
LAPGAN samples
Deep convolutional generative adversarial networks(DCGAN)
Radford et al. (2016) propose several tricks to make GAN training more stable
DCGAN vector arithmetic
Texture synthesis with spatial LSTMs
Two dimensional LSTM
Sequentially predict pixels in an image conditioned on previous pixels
[Theis & Bethge (2015)]
Texture synthesis with spatial LSTMs
[Theis & Bethge (2015)]
Pixel recurrent neural networks
Sequentially predict pixels in an image conditioned on previous pixels
Uses spatial LSTMs
[van den Oord et al. (2016)]
Appendix: References
References I
Blei, David M., Ng, Andrew Y., & Jordan, Michael I. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research.

Breuleux, O., Bengio, Y., & Vincent, P. 2009. Unlearning for better mixing. Technical report, Universite de Montreal.

Denton, Emily, Chintala, Soumith, Szlam, Arthur, & Fergus, Rob. 2015. Deep generative image models using a Laplacian pyramid of adversarial networks. In: NIPS.

Dziugaite, Gintare Karolina, Roy, Daniel M., & Ghahramani, Zoubin. 2015. Training generative neural networks via Maximum Mean Discrepancy optimization. In: UAI.
References II
Goodfellow, Ian J., Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley, David, Ozair, Sherjil, Courville, Aaron C., & Bengio, Yoshua. 2014. Generative adversarial networks. In: NIPS.

Gregor, Karol, Danihelka, Ivo, Graves, Alex, & Wierstra, Daan. 2015. DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Huszar, F. 2015. How (not) to train your generative model: scheduled sampling, likelihood, adversary? arXiv preprint arXiv:1511.05101.

Kingma, Diederik P., & Welling, Max. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
References III
Li, Yujia, Swersky, Kevin, & Zemel, Richard. 2015. Generative Moment Matching Networks. In: ICML.

Mansimov, Elman, Parisotto, Emilio, Ba, Jimmy, & Salakhutdinov, Ruslan. 2016. Generating Images from Captions with Attention. In: ICLR.

Radford, Alec, Metz, Luke, & Chintala, Soumith. 2016. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In: ICLR.

Rezende, Danilo Jimenez, Mohamed, Shakir, & Wierstra, Daan. 2014. Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082.

Theis, Lucas, & Bethge, Matthias. 2015. Generative Image Modeling Using Spatial LSTMs. In: NIPS.
References IV
Theis, Lucas, van den Oord, Aaron, & Bethge, Matthias. 2016. A note on the evaluation of generative models. In: ICLR.

van den Oord, Aaron, Kalchbrenner, Nal, & Kavukcuoglu, Koray. 2016. Pixel Recurrent Neural Networks. In: ICML.