
Generative Adversarial Networks (GANs)

Based on:

Generative Adversarial Networks (GANs), Ian Goodfellow. NIPS, 2016

04/12/2017 Anthony Ortiz

What are some recent and potentially upcoming breakthroughs in deep learning?


The most important one, in my opinion, is adversarial training (also called GAN for Generative Adversarial Networks)… This, and the variations that are now being proposed, is the most interesting idea in the last 10 years in ML, in my opinion.

Generative Modeling


Why is it important to study GANs?

• Excellent test of our ability to use high-dimensional, complicated probability distributions

• Simulate possible futures for planning or simulated RL

• Missing data

• Semi-supervised learning

• Multi-modal outputs

• Realistic generation tasks


Sample Generation


Next Frame Prediction


(Lotter et al 2016)

Next Frame Prediction


Super-Resolution


(Ledig et al 2016)

iGAN


(Zhu et al 2016)

Image to Image Translation


How do GANs work?


Maximum Likelihood
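For reference (the slide's own equation is not reproduced in this transcript), the maximum likelihood principle picks the model parameters that maximize the expected log-likelihood of the training data:

\theta^{*} = \arg\max_{\theta} \; \mathbb{E}_{x \sim p_{\text{data}}} \left[ \log p_{\text{model}}(x; \theta) \right]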


Generative Models’ Taxonomy


Fully Visible Belief Networks

• Explicit formula based on chain rule (written out just after this list):

Disadvantages:

• O(n) sample generation cost

• Generation not controlled by a latent code
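The chain-rule factorization the first bullet refers to, reconstructed here in its standard form rather than copied from the slide:

p_{\text{model}}(x) = \prod_{i=1}^{n} p_{\text{model}}\left(x_i \mid x_1, \dots, x_{i-1}\right)

Each factor conditions one dimension of x on all previous dimensions, so drawing a sample requires n sequential steps, which is the O(n) generation cost noted above.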


WaveNet


• Amazing quality
• Sample generation is slow: two minutes to synthesize one second of audio

GANs

• Use a latent code

• Asymptotically consistent (unlike variational methods)

• No Markov chains needed

• Often regarded as producing the best samples, though there is no good way to quantify this


Adversarial Nets Framework


Generator Network

• Must be differentiable

• No invertibility requirement

• Trainable for any size of z

• Some guarantees require z to have higher dimension than x

• Can make x conditionally Gaussian given z but need not do so
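In symbols, and stated here as the standard GAN setup rather than anything specific to this deck: draw a latent code z from a fixed prior (commonly uniform or Gaussian) and push it through the differentiable generator network,

z \sim p(z), \qquad x = G(z; \theta^{(G)})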


Training Procedure

• Use an SGD-like algorithm of choice (e.g. Adam) on two minibatches simultaneously:

• A minibatch of training examples

• A minibatch of generated samples

• Optional: run k steps of one player for every step of the other player.
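A sketch of the corresponding minibatch updates, following Algorithm 1 of Goodfellow et al. 2014 (m is the minibatch size; the gradients are applied with the chosen optimizer, e.g. Adam):

Discriminator step (gradient ascent):
\nabla_{\theta^{(D)}} \, \frac{1}{m} \sum_{i=1}^{m} \left[ \log D(x^{(i)}) + \log\left(1 - D(G(z^{(i)}))\right) \right]

Generator step (gradient descent):
\nabla_{\theta^{(G)}} \, \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z^{(i)}))\right)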


Minimax Game
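The value function of the minimax game, as given in Goodfellow et al. 2014 (the slide's equation is not reproduced in this transcript):

\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]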


Discriminator Strategy
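For a fixed generator, the optimal discriminator (the standard result this slide illustrates) is

D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_{\text{model}}(x)}

so the discriminator effectively estimates the ratio between the data density and the model density at each point x.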


Non-Saturation Game
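In the non-saturating game (stated here following the tutorial, since the slide's equations are not in this transcript), the discriminator loss is unchanged, but the generator maximizes \log D(G(z)) instead of minimizing \log(1 - D(G(z))):

J^{(G)} = -\tfrac{1}{2} \, \mathbb{E}_{z}\left[\log D(G(z))\right]

This keeps the generator's gradient strong even when the discriminator confidently rejects its samples.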


DCGAN Architecture


Is the divergence important?


Modifying GANs to do Maximum Likelihood


Loss does not seem to explain why GAN samples are sharp


Hint: The approximation strategy matters more than the loss

Labels improve subjective sample quality


Implementation, tips and tricks


GAN for MNIST using TensorFlow

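The slide's code is not reproduced in this transcript; below is a minimal sketch of the kind of fully connected generator and discriminator such an MNIST example typically defines, in TensorFlow 1.x style. The layer sizes, variable scopes, and names (Z, X, G_sample, D_real, D_fake, and the logits) are assumptions for illustration, not copied from the slide.

import tensorflow as tf

# Latent codes and (flattened) 28x28 MNIST images.
Z = tf.placeholder(tf.float32, shape=[None, 100])
X = tf.placeholder(tf.float32, shape=[None, 784])

def generator(z):
    # Two-layer fully connected generator: latent code -> pixel intensities in [0, 1].
    with tf.variable_scope('generator'):
        h = tf.layers.dense(z, 128, activation=tf.nn.relu)
        return tf.layers.dense(h, 784, activation=tf.nn.sigmoid)

def discriminator(x, reuse=False):
    # Two-layer fully connected discriminator: image -> probability that it is real.
    with tf.variable_scope('discriminator', reuse=reuse):
        h = tf.layers.dense(x, 128, activation=tf.nn.relu)
        logit = tf.layers.dense(h, 1)
        return tf.nn.sigmoid(logit), logit

G_sample = generator(Z)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample, reuse=True)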

Training GAN


Training GAN

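The loss definitions that the next two bullets discuss, continuing the sketch above (reconstructed in the spirit of the deck rather than copied from the slide):

# Both objectives are to be maximized in the original formulation,
# so they are negated here because TensorFlow optimizers minimize.
D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
G_loss = -tf.reduce_mean(tf.log(D_fake))  # non-saturating generator loss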

• Above, we use a negative sign for the loss functions because they need to be maximized, whereas TensorFlow's optimizers can only minimize.

• Also, as per the paper's suggestion, it's better to maximize tf.reduce_mean(tf.log(D_fake)) instead of minimizing tf.reduce_mean(tf.log(1. - D_fake)) in the algorithm above.

Training GAN

• Then we alternate training the two networks, one step at a time, using the adversarial loss functions defined above; a sketch of the loop follows.

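A sketch of the alternating updates, continuing the code above; the optimizer, batch size, number of steps, and the use of the TF 1.x MNIST helper are assumptions for illustration:

import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data')

# Each optimizer updates only its own player's variables.
D_vars = [v for v in tf.trainable_variables() if v.name.startswith('discriminator')]
G_vars = [v for v in tf.trainable_variables() if v.name.startswith('generator')]
D_solver = tf.train.AdamOptimizer().minimize(D_loss, var_list=D_vars)
G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=G_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(100000):
        X_batch, _ = mnist.train.next_batch(128)
        Z_batch = np.random.uniform(-1., 1., size=[128, 100])
        # One discriminator step, then one generator step (k = 1).
        sess.run(D_solver, feed_dict={X: X_batch, Z: Z_batch})
        sess.run(G_solver, feed_dict={Z: Z_batch})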

Training process by sampling G(z)


We are done!

One-sided label smoothing
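Concretely (per the tutorial), the target for real examples is softened from 1 to a value such as 0.9, while the target for generated examples stays at 0. With a cross-entropy discriminator loss and the logits from the sketch above, this looks roughly like:

# Real examples get target 0.9 instead of 1.0; generated examples keep target 0.
D_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    logits=D_logit_real, labels=0.9 * tf.ones_like(D_logit_real)))
D_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    logits=D_logit_fake, labels=tf.zeros_like(D_logit_fake)))
D_loss_smoothed = D_loss_real + D_loss_fake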


Benefits of label smoothing


Batch Norm


Batch norm in G can cause strong intra-batch correlation


Balancing G and D


Research Frontiers


Non-convergence


Non-convergence in GANs


Problems with Counting


Problems with Perspective


Problems with Global Structure


Evaluation


Plug and Play Generative Models

• New state of the art generative model (Nguyen et al 2016)

• Generates 227x227 realistic images from all ImageNet classes

• Combines adversarial training, moment matching, denoising autoencoders, and Langevin sampling


PPGN Models


(Nguyen et al 2016)

Conclusions

• GANs are generative models that use supervised learning to estimate an intractable cost function

• GANs allow a model to learn that there are many correct answers

• GANs can simulate many cost functions, including the one used for maximum likelihood

• Adversarial training can be useful for people as well as machine learning models
