

Deep generative models of natural images

Emily Denton

Spring 2016


Outline

1 Motivation

2 Deep generative models: Intro

3 Deep generative models: Recent algorithms
    Variational autoencoders
    Generative adversarial networks
    Generative moment matching networks
    Evaluating generative models

4 Extensions


Generative models

Have access to x ∼ pdata(x) through training set

Want to learn a model x ∼ pmodel(x)

Want pmodel to be similar to pdata:
    Samples drawn from pmodel reflect the structure of pdata
    Samples from the true data distribution have high likelihood under pmodel


Why do generative modeling?

Unsupervised representation learning
    Can transfer learned representations to discriminative tasks, retrieval, clustering, etc.

Train network with both discriminative and generative criteria

Utilize unlabeled data, regularize

Understand data

Density estimation

Data augmentation

...


Focus of this talk

Generative modeling is a HUGE field... I will focus on (a selected set of) deep directed models of natural images


Directed graphical models

[Figure: graphical model with latent node z and an arrow to observed node x]

We assume data is generated by:

z ∼ p(z),  x ∼ p(x|z)

z is latent/hidden; x is observed (image)

Use θ to denote parameters of the generative model


Deep directed graphical models

P(x) = ∑_h P(x|h) P(h)

Intractable

Can’t optimize the data likelihood directly

[Figure: deep directed model with latent layers z1, z2, z3 (weights W1, W2, W3) generating the observed image x]


Bounding the marginal likelihood

Recall Jensen’s inequality: when f is concave, f(E[x]) ≥ E[f(x)]

log p(x) = log ∫_z p(x, z)

         = log ∫_z q(z) · p(x, z) / q(z)

         ≥ ∫_z q(z) log [p(x, z) / q(z)] = L(x; θ, φ)    (by Jensen’s inequality)

         = ∫_z q(z) log p(x, z) − ∫_z q(z) log q(z)

         = E_{q(z)}[log p(x, z)] + H(q(z))
           (expectation of the joint distribution)  (entropy)


Evidence Lower BOund (ELBO)

Bound is tight when the variational approximation matches the true posterior:

log p(x) − L(x; θ, φ) = log p(x) − ∫_z q(z) log [p(x, z) / q(z)]

                      = ∫_z q(z) log p(x) − ∫_z q(z) log [p(x, z) / q(z)]

                      = ∫_z q(z) log [q(z) p(x) / p(x, z)]

                      = D_KL(q(z; φ) || p(z|x))


Summary

Assume existence of q(z;φ)

Bound log p(x; θ) with L(x; θ, φ)

Bound is tight when:

D_KL(q(z; φ) || p(z|x)) = 0 ⟺ q(z; φ) = p(z|x)


Learning directed graphical models

Maximize bound on likelihood of data:

max_θ ∑_{i=1}^N log p(x_i; θ) ≥ max_{θ, φ_1, …, φ_N} ∑_{i=1}^N L(x_i; θ, φ_i)

Historically, used different φi for every data point

But we’ll move away from this soon...

q(z; φ) is typically a factorized distribution

For more info see Blei et al. (2003)


New method of learning: approximate inference model

Instead of having different variational parameters for each data point, fit a conditional parametric function

The output of this function will be the parameters of the variational distribution q(z|x)

Instead of q(z) we have qφ(z|x)

ELBO becomes:

L(x; θ, φ) = E_{q_φ(z|x)}[log p_θ(x, z)] + H(q_φ(z|x))
             (expectation of the joint distribution)  (entropy)


Variational autoencoder

Encoder network maps from image space to latent space

Outputs parameters of q_φ(z|x)

Decoder maps from latent space back into image space

Outputs parameters of p_θ(x|z)

[Kingma & Welling (2013)]

[Figure: encoder / inference network maps x to z ∼ q_φ(z|x); decoder / generative network maps z to x̂ ∼ p_θ(x|z)]
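As an illustration, a minimal PyTorch sketch of this encoder/decoder structure; the MLP architecture and layer sizes are hypothetical choices (e.g. flattened 28×28 images), not the paper’s exact model:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    # Hypothetical sizes: 784-dim inputs (flattened 28x28), 20-dim latent z.
    def __init__(self, x_dim=784, h_dim=400, z_dim=20):
        super().__init__()
        self.enc = nn.Linear(x_dim, h_dim)
        self.enc_mu = nn.Linear(h_dim, z_dim)       # mean of q_phi(z|x)
        self.enc_logvar = nn.Linear(h_dim, z_dim)   # log-variance of q_phi(z|x)
        self.dec = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, x_dim), nn.Sigmoid(),  # Bernoulli means of p_theta(x|z)
        )

    def encode(self, x):
        h = torch.relu(self.enc(x))
        return self.enc_mu(h), self.enc_logvar(h)

    def decode(self, z):
        return self.dec(z)
```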


Variational autoencoder

Rearranging the ELBO:

L(x; θ, φ) = ∫_z q_φ(z|x) log [p(x, z) / q_φ(z|x)]

           = ∫_z q_φ(z|x) log [p(x|z) p(z) / q_φ(z|x)]

           = ∫_z q_φ(z|x) log p(x|z) + ∫_z q_φ(z|x) log [p(z) / q_φ(z|x)]

           = E_{q(z|x)}[log p(x|z)] − E_{q(z|x)}[log (q(z|x) / p(z))]

           = E_{q(z|x)}[log p(x|z)] − D_KL(q(z|x) || p(z))
             (reconstruction term)    (prior term)


Variational autoencoder

Inference network outputs parameters of q_φ(z|x)

Generative network outputs parameters of p_θ(x|z)

Optimize θ and φ jointly by maximizing the ELBO:

L(x; θ, φ) = E_{q(z|x)}[log p(x|z)] − D_KL(q(z|x) || p(z))
             (reconstruction term)    (prior term)



Stochastic gradient variational Bayes (SGVB) estimator

Reparameterization trick: re-parameterize z ∼ q_φ(z|x) as

z = g_φ(x, ε) with ε ∼ p(ε)

For example, with a Gaussian we can write z ∼ N(µ, σ²) as

z = µ + σε with ε ∼ N(0, 1)

[Kingma & Welling (2013); Rezende et al. (2014)]
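A minimal PyTorch sketch of the trick, assuming the encoder outputs µ and log σ² (a common parameterization):

```python
import torch

def reparameterize(mu, logvar):
    # z = mu + sigma * eps with eps ~ N(0, I): gradients flow to mu and
    # logvar, while all randomness is isolated in eps.
    std = torch.exp(0.5 * logvar)    # sigma = exp(log(sigma^2) / 2)
    eps = torch.randn_like(std)
    return mu + std * eps
```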


Stochastic gradient variational Bayes (SGVB) estimator

L(x; θ, φ) = E_{q(z|x)}[log p(x|z)] − D_KL(q(z|x) || p(z))
             (reconstruction term)    (prior term)

Using the reparameterization trick we form a Monte Carlo estimate of the reconstruction term:

E_{q_φ(z|x)}[log p_θ(x|z)] = E_{p(ε)}[log p_θ(x | g_φ(x, ε))]

                           ≈ (1/L) ∑_{i=1}^L log p_θ(x | g_φ(x, ε_i)),   ε_i ∼ p(ε)

KL divergence term can often be computed analytically (e.g. Gaussian)
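Putting the pieces together, a sketch of the resulting estimator for a Gaussian q_φ(z|x) and a Bernoulli decoder, reusing the hypothetical VAE and reparameterize sketches above; the KL term uses the closed form D_KL(N(µ, σ²) || N(0, I)) = −½ ∑(1 + log σ² − µ² − σ²):

```python
import torch
import torch.nn.functional as F

def elbo_loss(model, x):
    # Encode, sample z with the reparameterization trick (sketch above), decode.
    mu, logvar = model.encode(x)
    z = reparameterize(mu, logvar)
    x_recon = model.decode(z)

    # Single-sample (L = 1) Monte Carlo estimate of E_q[log p(x|z)]
    # for a Bernoulli decoder.
    recon = -F.binary_cross_entropy(x_recon, x, reduction='sum')

    # Analytic KL: D_KL(N(mu, sigma^2) || N(0, I))
    #   = -0.5 * sum(1 + log sigma^2 - mu^2 - sigma^2)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

    # Maximize the ELBO = minimize its negative.
    return -(recon - kl)
```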


VAE learned manifold

[Kingma & Welling (2013)]


VAE samples

[Kingma & Welling (2013)]


VAE tradeoffs

Pros:
    Theoretically pleasing
    Optimizes a bound on the likelihood
    Easy to implement

Cons:
    Samples tend to be blurry

Maximum likelihood minimizes D_KL(pdata || pmodel)

[Theis et al. (2016)]


Generative adversarial networks

Don’t focus on optimizing p(x), just learn to sample

Two networks pitted against one another:

Generative model G captures the data distribution
Discriminative model D distinguishes between real and fake samples

[Figure: the generative network maps z ∼ p_noise(z) to a fake sample x̂; the discriminative network tries to output 1 on real samples x ∼ p_data(x) and 0 on generated samples]

[Goodfellow et al. (2014)]


Generative adversarial networks

D is trained to estimate the probability that a sample came from the data distribution rather than from G

G is trained to maximize the probability of D making a mistake

min_G max_D  E_{x∼p_data(x)}[log D(x)] + E_{z∼p_noise(z)}[log(1 − D(G(z)))]
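A schematic PyTorch sketch of one alternating update under this objective; netG, netD (assumed to end in a sigmoid), and the optimizers are hypothetical, and the generator update uses the common non-saturating variant that maximizes log D(G(z)) instead of minimizing log(1 − D(G(z))):

```python
import torch
import torch.nn.functional as F

def gan_step(netG, netD, optG, optD, x_real, z_dim=100):
    batch = x_real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator step: push D(x) -> 1 on real data, D(G(z)) -> 0 on fakes.
    z = torch.randn(batch, z_dim)
    x_fake = netG(z).detach()              # block gradients into G here
    d_loss = F.binary_cross_entropy(netD(x_real), ones) + \
             F.binary_cross_entropy(netD(x_fake), zeros)
    optD.zero_grad()
    d_loss.backward()
    optD.step()

    # Generator step (non-saturating): push D(G(z)) -> 1.
    z = torch.randn(batch, z_dim)
    g_loss = F.binary_cross_entropy(netD(netG(z)), ones)
    optG.zero_grad()
    g_loss.backward()
    optG.step()

    return d_loss.item(), g_loss.item()
```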


GAN samples


GAN tradeoffs

Pros:

Very powerful model
High quality samples

Cons:

Tricky to train (no clear objective to track, instability)
Can ignore large parts of image space
    Because it approximately minimizes the Jensen-Shannon divergence [Goodfellow et al. (2014); Theis et al. (2016); Huszar (2015)]

[Theis et al. (2016)]

MotivationDeep generative models: Intro

Deep generative models: Recent algorithmsExtensions

Variational autoencodersGenerative adversarial networksGenerative moment matching networksEvaluating generative models

Generative moment matching networks

Same idea as GANs, but a different optimization method

Match moments of the data and generative distributions

Maximum mean discrepancy (MMD)

An estimator for answering whether two samples come from the same distribution

Evaluate MMD on generated samples

[Figure: the generative network maps z ∼ p_noise(z) to a sample x̂]

[Li et al. (2015); Dziugaite et al. (2015)]


Generative moment matching networks

L²_MMD = ‖ (1/N) ∑_{i=1}^N φ(x_i) − (1/M) ∑_{j=1}^M φ(y_j) ‖²

       = (1/N²) ∑_{i=1}^N ∑_{i'=1}^N φ(x_i)ᵀ φ(x_{i'}) + (1/M²) ∑_{j=1}^M ∑_{j'=1}^M φ(y_j)ᵀ φ(y_{j'})
         − (2/NM) ∑_{i=1}^N ∑_{j=1}^M φ(x_i)ᵀ φ(y_j)

(here x_i denotes data samples and y_j generated samples)

Can make use of kernel trick

If φ is identity, then matching means

Complex φ can match higher order moments
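With the kernel trick, every inner product φ(a)ᵀφ(b) becomes a kernel evaluation k(a, b). A minimal PyTorch sketch with an RBF kernel (the bandwidth here is an arbitrary placeholder):

```python
import torch

def mmd2_rbf(x, y, bandwidth=1.0):
    # Biased estimate of MMD^2 between samples x (N, d) and y (M, d), using
    # an RBF kernel k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2)) in place
    # of an explicit feature map phi.
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)   # pairwise squared distances
        return torch.exp(-d2 / (2 * bandwidth ** 2))

    # mean() over each Gram matrix gives the (1/N^2), (1/M^2), (2/NM) sums.
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()
```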


GMMN samples


GMMN tradeoffs

Pros:

Theoretically pleasing

Cons:

Batch size very important
Samples aren’t great (they get better when combined with an autoencoder)

[Theis et al. (2016)]


How to evaluate a generative model?

Log likelihood on held out data

Makes sense when the goal is density estimation
Many approaches don’t have a tractable likelihood, or it isn’t explicitly represented
Have to resort to Parzen window estimates [Breuleux et al. (2009)]; see the sketch after this list

Quality of samples

But a lookup table of training images will succeed here...

Best: evaluate in context of particular application

See Theis et al. (2016) for more details
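As referenced above, a minimal sketch of a Parzen window evaluation using scikit-learn; the arrays and bandwidth are placeholders, and in practice the bandwidth is tuned on a validation set:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Placeholder arrays: samples drawn from p_model, and held-out test data.
model_samples = np.random.randn(10000, 64)
x_test = np.random.randn(500, 64)

# Fit a Gaussian Parzen window on the model's samples.
kde = KernelDensity(kernel='gaussian', bandwidth=0.2).fit(model_samples)

# Average log-likelihood of held-out data under the Parzen estimate.
print(kde.score_samples(x_test).mean())
```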


DRAW: Deep Recurrent Attentive Writer

Basic idea:

Iteratively construct image

Observe the image through a sequence of glimpses

Recurrent encoder and decoder

Optimizes a variational bound

Attention mechanism determines:

Input region observed by the encoder
Output region modified by the decoder

[Gregor et al. (2015)]


DRAW samples


Generating images from captions

Language model: bidirectional RNN

Image model: conditional DRAW network

Image sharpening with a Laplacian pyramid adversarial network

[Mansimov et al. (2016)]


Generating images from captions

[Mansimov et al. (2016)]


Laplacian pyramid of adversarial networks

Difficult to generate large images in one shot

Break problem up into sequence of manageable steps

Samples drawn in coarse-to-fine fashion

[Figure: samples drawn coarse-to-fine, starting from a low-resolution image and adding finer detail at each scale]

Each scale is a convnet trained using GAN framework

[Denton et al. (2015)]
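To make the decomposition concrete, a NumPy sketch of a Laplacian pyramid; simple 2× average-pooling and nearest-neighbor upsampling stand in for the paper’s blur-and-decimate operators:

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling (stands in for blur + decimate).
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    # Nearest-neighbor 2x upsampling.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels=3):
    # Decompose img into high-frequency residuals plus a low-resolution base.
    # In LAPGAN, a conditional GAN at each scale generates the residual
    # given the upsampled coarser image.
    pyramid = []
    current = img
    for _ in range(levels):
        coarse = downsample(current)
        pyramid.append(current - upsample(coarse))   # residual at this scale
        current = coarse
    pyramid.append(current)                          # coarsest image
    return pyramid

# Sampling/reconstruction runs coarse-to-fine: upsample, then add the residual.
img = np.random.rand(32, 32)
pyr = laplacian_pyramid(img)
recon = pyr[-1]
for h in reversed(pyr[:-1]):
    recon = upsample(recon) + h
assert np.allclose(recon, img)
```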


LAPGAN training procedure

[Figure: LAPGAN training. The image I = I0 is repeatedly downsampled into I1, I2, I3. At each level k, generator G_k takes noise z_k and the upsampled coarser image l_k and outputs a residual h̃_k; discriminator D_k tries to tell the real high-pass residual h_k from the generated h̃_k (real/generated?). At the coarsest level, G3 and D3 operate directly on the low-resolution image I3.]


LAPGAN samples


LAPGAN samples


Deep convolutional generative adversarial networks (DCGAN)

Radford et al. (2016) propose several tricks to make GAN training more stable (e.g. all-convolutional generator/discriminator with strided convolutions instead of pooling, batch normalization in both networks, and ReLU/LeakyReLU activations)
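A sketch of a DCGAN-style generator following those guidelines; the layer sizes below are the commonly used configuration for 64×64 RGB output, not necessarily the paper’s exact architecture:

```python
import torch.nn as nn

class DCGANGenerator(nn.Module):
    def __init__(self, z_dim=100, ngf=64):
        super().__init__()
        self.net = nn.Sequential(
            # z viewed as (z_dim, 1, 1) -> 4x4 feature maps
            nn.ConvTranspose2d(z_dim, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),  # 8x8
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 16x16
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 32x32
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),            # 64x64
            nn.Tanh(),  # images scaled to [-1, 1]
        )

    def forward(self, z):
        return self.net(z)  # z shaped (batch, z_dim, 1, 1)
```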


DCGAN vector arithmetic
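The figures on this slide show arithmetic on latent codes, e.g. z(smiling woman) − z(neutral woman) + z(neutral man) decoding to a smiling man. A hypothetical sketch; Radford et al. average the z vectors of several exemplars per concept, and netG stands for a trained generator:

```python
import torch

# Hypothetical: average the latent codes of a few exemplars per concept.
z_smiling_woman = torch.randn(3, 100).mean(0)
z_neutral_woman = torch.randn(3, 100).mean(0)
z_neutral_man   = torch.randn(3, 100).mean(0)

z_smiling_man = z_smiling_woman - z_neutral_woman + z_neutral_man
# x = netG(z_smiling_man.view(1, -1, 1, 1))   # decode with the trained generator
```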


Texture synthesis with spatial LSTMs

Two dimensional LSTM

Sequentially predict pixels in an image conditioned on previous pixels

[Theis & Bethge (2015)]


Texture synthesis with spatial LSTMs

[Theis & Bethge (2015)]


Pixel recurrent neural networks

Sequentially predict pixels in an image conditioned on previous pixels

Uses a spatial LSTM

[van den Oord et al. (2016)]
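Concretely, these models make the likelihood exactly tractable by factorizing it autoregressively over the n pixels in raster-scan order:

p(x) = ∏_{i=1}^n p(x_i | x_1, …, x_{i−1})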


References I

Blei, David M., Ng, Andrew Y., & Jordan, Michael I. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research.

Breuleux, O., Bengio, Y., & Vincent, P. 2009. Unlearning for better mixing. Technical report, Université de Montréal.

Denton, Emily, Chintala, Soumith, Szlam, Arthur, & Fergus, Rob. 2015. Deep generative image models using a Laplacian pyramid of adversarial networks. In: NIPS.

Dziugaite, Gintare Karolina, Roy, Daniel M., & Ghahramani, Zoubin. 2015. Training generative neural networks via Maximum Mean Discrepancy optimization. In: UAI.


References II

Goodfellow, Ian J., Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley, David, Ozair, Sherjil, Courville, Aaron C., & Bengio, Yoshua. 2014. Generative adversarial networks. In: NIPS.

Gregor, Karol, Danihelka, Ivo, Graves, Alex, & Wierstra, Daan. 2015. DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Huszar, F. 2015. How (not) to train your generative model: scheduled sampling, likelihood, adversary? arXiv preprint arXiv:1511.05101.

Kingma, Diederik P., & Welling, Max. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.


References III

Li, Yujia, Swersky, Kevin, & Zemel, Richard. 2015. Generative Moment Matching Networks. In: ICML.

Mansimov, Elman, Parisotto, Emilio, Ba, Jimmy, & Salakhutdinov, Ruslan. 2016. Generating Images from Captions with Attention. In: ICLR.

Radford, Alec, Metz, Luke, & Chintala, Soumith. 2016. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In: ICLR.

Rezende, Danilo Jimenez, Mohamed, Shakir, & Wierstra, Daan. 2014. Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082.

Theis, Lucas, & Bethge, Matthias. 2015. Generative Image Modeling Using Spatial LSTMs. In: NIPS.


References IV

Theis, Lucas, van den Oord, Aaron, & Bethge, Matthias. 2016. A note on the evaluation of generative models. In: ICLR.

van den Oord, Aaron, Kalchbrenner, Nal, & Kavukcuoglu, Koray. 2016. Pixel Recurrent Neural Networks. In: ICML.
