007 20151214 Deep Unsupervised Learning using Nonequilibrium Thermodynamics

Deep Unsupervised Learning using Nonequilibrium Thermodynamics

Tran Quoc Hoan

@k09ht haduonght.wordpress.com/

14 December 2015, Paper Alert, Hasegawa lab., Tokyo

The University of Tokyo

Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, Surya Ganguli. Proceedings of the 32nd International Conference on Machine Learning, 2015

Abstract

“…The essential idea, inspired by non-equilibrium statistical physics, is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in data, yielding a highly flexible and tractable generative model of the data…”

Outline

• Motivation

- The promise of deep unsupervised learning

• Physical intuition

- Diffusion processes and time reversal

• Diffusion probabilistic model

- Derivation and experimental results

Deep Unsupervised Learning

• Unknown features/labels

- Novel modalities

• Expensive labels

- Ex. diseased region in a medical image

• Unpredictable tasks / one-shot learning

- Exploratory data analysis

https://www.ceessentials.net/article40.html

Physical Intuition

• Diffusion processes and time reversal

- Destroy structure in data

- Carefully characterize the destruction

- Learn how to reverse time

Observation 1: Diffusion Destroys Structure

[Figure: data distribution → uniform distribution; uniform distribution → data distribution]

• Observation: diffusion destroys structure

• Recover structure: recover the data distribution by starting from the uniform distribution and running the dynamics backwards

Observation 2: Microscopic Diffusion

• Time reversible

https://www.youtube.com/watch?v=cDcprgWiQEY

• Brownian motion

• Position updates are small Gaussians (both forwards and backwards in time)

Diffusion-based Probabilistic Models

• Destroy all structure in data distribution using diffusion process

• Learn reversal of diffusion process

- Estimate functions for the mean and covariance of each step in the reverse diffusion process (for binary data, the binomial rate instead)

• Reverse diffusion process is the model of the data

Diffusion-based Probabilistic Models

• Algorithm

• Deep convolutional network: universal function approximator

• Multiplying distributions: imputation, denoising, computing posteriors

Destroy by Diffusion Process

Data distribution

Forward diffusion

Noise distribution

Temporal diffusion rate
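
A reconstruction of what these labels most likely annotate, taken from the notation of the cited paper (Sohl-Dickstein et al., 2015) rather than from the slide itself:

$$
q\left(x^{(0 \cdots T)}\right) = q\left(x^{(0)}\right) \prod_{t=1}^{T} q\left(x^{(t)} \mid x^{(t-1)}\right), \qquad q\left(x^{(t)} \mid x^{(t-1)}\right) = T_\pi\left(x^{(t)} \mid x^{(t-1)}; \beta_t\right)
$$

Here $q(x^{(0)})$ is the data distribution, the product over $t$ is the forward diffusion, $\pi$ is the noise distribution that repeated application of the kernel $T_\pi$ converges to, and $\beta_t$ is the diffusion rate at step $t$.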

Destroy by Gaussian Diffusion Process

Data distribution

Forward diffusion

Noise distribution

Decay towards origin

Add small noise
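
In the cited paper, "decay towards origin" and "add small noise" correspond to the Gaussian kernel $q(x^{(t)} \mid x^{(t-1)}) = \mathcal{N}(x^{(t)}; x^{(t-1)}\sqrt{1-\beta_t}, I\beta_t)$. A minimal NumPy sketch of that forward process follows. The ring-shaped toy data, the constant rate of 0.05 and the 200 steps are illustrative assumptions, not values from the talk.

```python
import numpy as np

def forward_gaussian_diffusion(x0, betas, rng):
    """Apply x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise for each beta_t."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x]
    for beta in betas:
        noise = rng.standard_normal(x.shape)
        # sqrt(1 - beta) shrinks the sample towards the origin; sqrt(beta) scales the added noise.
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
        trajectory.append(x)
    return trajectory

rng = np.random.default_rng(0)
# Hypothetical structured data: points on a ring of radius 3.
angles = rng.uniform(0.0, 2.0 * np.pi, size=5000)
x0 = 3.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
betas = np.full(200, 0.05)  # assumed constant diffusion rate
traj = forward_gaussian_diffusion(x0, betas, rng)

# Spread of the radius before and after diffusion, and the final covariance.
radius_start = np.linalg.norm(traj[0], axis=1).std()
radius_end = np.linalg.norm(traj[-1], axis=1).std()
final_cov = np.cov(traj[-1].T)
```

With these settings the ring structure is gone by the last step and the samples are close to a unit-covariance Gaussian centred at the origin, which is the noise distribution for this kernel.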

Reverse Gaussian Diffusion Process

Data distribution

Reverse diffusion

Noise distribution

Learned drift and covariance functions
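
In the cited paper each reverse step is Gaussian, $p(x^{(t-1)} \mid x^{(t)}) = \mathcal{N}(x^{(t-1)}; f_\mu(x^{(t)}, t), f_\Sigma(x^{(t)}, t))$, with $f_\mu$ and $f_\Sigma$ learned. The sketch below is not the learned model: it assumes 1-D Gaussian data, for which the exact reverse mean and variance are available in closed form, so the sampling loop can be checked without any training. All function names and the schedule are hypothetical.

```python
import numpy as np

def gaussian_marginals(m0, v0, betas):
    """Mean and variance of q(x_t) for 1-D Gaussian data under Gaussian diffusion."""
    means, variances = [m0], [v0]
    for beta in betas:
        means.append(np.sqrt(1.0 - beta) * means[-1])
        variances.append((1.0 - beta) * variances[-1] + beta)
    return means, variances

def reverse_step_params(x_t, t, betas, means, variances):
    """Exact mean/variance of q(x_{t-1} | x_t); stands in for the learned functions."""
    beta = betas[t - 1]            # betas[t - 1] holds beta_t
    a = np.sqrt(1.0 - beta)
    post_var = 1.0 / (1.0 / variances[t - 1] + a * a / beta)
    post_mean = post_var * (means[t - 1] / variances[t - 1] + a * x_t / beta)
    return post_mean, post_var

def sample_reverse(n, betas, means, variances, rng):
    """Start from N(0, 1) noise and run the reverse chain for t = T, ..., 1."""
    x = rng.standard_normal(n)
    for t in range(len(betas), 0, -1):
        mu, var = reverse_step_params(x, t, betas, means, variances)
        x = mu + np.sqrt(var) * rng.standard_normal(n)
    return x

rng = np.random.default_rng(0)
betas = np.full(100, 0.1)  # assumed schedule, for illustration only
means, variances = gaussian_marginals(2.0, 0.25, betas)
samples = sample_reverse(20000, betas, means, variances, rng)
```

Starting from pure noise, the reverse chain returns samples whose mean and variance are close to those of the assumed data distribution (2.0 and 0.25).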

Case Study: Swiss Roll

True model

Inference model

Training the reverse diffusion

Model probability

Annealed importance sampling
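
For reference, the model probability as written in the cited paper (a reconstruction, since the slide's equations are not reproduced here):

$$
p\left(x^{(0)}\right) = \int dx^{(1 \cdots T)}\, p\left(x^{(0 \cdots T)}\right) = \int dx^{(1 \cdots T)}\, q\left(x^{(1 \cdots T)} \mid x^{(0)}\right) p\left(x^{(T)}\right) \prod_{t=1}^{T} \frac{p\left(x^{(t-1)} \mid x^{(t)}\right)}{q\left(x^{(t)} \mid x^{(t-1)}\right)}
$$

The first integral is intractable as written. The second form is an average over forward trajectories $q(x^{(1 \cdots T)} \mid x^{(0)})$ of a ratio of reverse to forward probabilities, which is the connection to annealed importance sampling.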

Training the reverse diffusion

Log likelihood

Jensen’s inequality
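
A sketch of the bound in the cited paper's notation (assumed here, not copied from the slide). The log likelihood is

$$
L = \int dx^{(0)}\, q\left(x^{(0)}\right) \log p\left(x^{(0)}\right)
$$

and, writing $p(x^{(0)})$ as an average over forward trajectories and moving the logarithm inside that average with Jensen's inequality,

$$
L \geq K = \int dx^{(0 \cdots T)}\, q\left(x^{(0 \cdots T)}\right) \log \left[ p\left(x^{(T)}\right) \prod_{t=1}^{T} \frac{p\left(x^{(t-1)} \mid x^{(t)}\right)}{q\left(x^{(t)} \mid x^{(t-1)}\right)} \right]
$$

Training maximizes the lower bound $K$ instead of $L$.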

Training the reverse diffusion

…do some algebra…

Training the reverse diffusion

…for Gaussian diffusion process…

Training

Unsupervised learning becomes a regression problem
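
One way to see the regression claim: in the cited paper the bound reduces to KL divergences between Gaussians, which have a closed form. The helper below is a generic diagonal-Gaussian KL, not the paper's training code; the variable names and numbers are made up for illustration. With equal variances the KL is just a scaled squared error between the target mean and the predicted mean.

```python
import numpy as np

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL(q || p) for Gaussians with diagonal covariances, summed over dimensions."""
    mu_q, var_q, mu_p, var_p = map(np.asarray, (mu_q, var_q, mu_p, var_p))
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

# Hypothetical target (forward posterior) and prediction (reverse model) for one step.
target_mean = np.array([0.5, -1.0])
predicted_mean = np.array([0.3, -0.7])
shared_var = np.array([0.1, 0.1])

kl = kl_diag_gaussians(target_mean, shared_var, predicted_mean, shared_var)
# With equal variances, the same number falls out of a squared-error loss.
squared_error_loss = 0.5 * np.sum((target_mean - predicted_mean) ** 2 / shared_var)
```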

Training the reverse diffusion

Setting the diffusion rate $\beta_t$

• For Binomial diffusion (erase constant fraction of stimulus variance each step)

$\beta_t = (T - t + 1)^{-1}$

• For Gaussian diffusion

$\beta_1$ = small constant (prevent over-fitting)

Training $\beta_t$
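
A quick numerical check of the binomial schedule $\beta_t = (T - t + 1)^{-1}$. It assumes, as in the cited paper's binomial kernel, that each step multiplies the remaining signal by $(1 - \beta_t)$; the choice $T = 10$ is arbitrary.

```python
import numpy as np

def binomial_schedule(T):
    """beta_t = 1 / (T - t + 1) for t = 1, ..., T."""
    t = np.arange(1, T + 1)
    return 1.0 / (T - t + 1)

T = 10
betas = binomial_schedule(T)
# Fraction of the original signal surviving after t steps: prod_{s <= t} (1 - beta_s).
surviving = np.cumprod(1.0 - betas)
# Closed form: the product telescopes to (T - t) / T.
expected = (T - np.arange(1, T + 1)) / T
```

The surviving fraction drops by the same amount, $1/T$ of the original signal, at every step and reaches zero at $t = T$.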

Multiplying Distributions

• Required to compute posterior distribution

- Missing data (inpainting)

- Corrupted data (denoising)

• Difficult and expensive using competing techniques

- Ex. VAE, GSNs, NADEs, most graphical models

Interested in

Acts as small perturbation to diffusion process
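
In the cited paper's notation (assumed here), the quantity of interest is the model distribution multiplied by a second distribution or bounded positive function $r$:

$$
\tilde{p}\left(x^{(0)}\right) \propto p\left(x^{(0)}\right) r\left(x^{(0)}\right)
$$

For inpainting, $r$ pins the known pixels to their observed values; for denoising, $r$ is the likelihood of the corrupted observation. The factor $r$ is what acts as a small perturbation to each step of the diffusion process.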

Multiplying Distributions

• Modified marginal distributions

Interested in

Acts as small perturbation to diffusion process

Multiplying Distributions

• Modified diffusion steps

Equilibrium condition

Normalized

Multiplying Distributions

Reverse Gaussian diffusion process

Interested in

Acts as small perturbation to diffusion process

Small perturbation affects only mean
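
A minimal sketch of "affects only mean", assuming the first-order result from the cited paper's appendix: the perturbed reverse step keeps its covariance and shifts its mean by the covariance times the gradient of $\log r$ evaluated at the unperturbed mean. The scalar Gaussian $r$ and all numbers below are hypothetical; they are chosen so the exact product of two Gaussians is available for comparison.

```python
import numpy as np

def perturbed_reverse_mean(mu, sigma2, grad_log_r):
    """First-order shift of the reverse-step mean: mu + sigma2 * d log r / dx at x = mu."""
    return mu + sigma2 * grad_log_r(mu)

# Hypothetical setup: reverse step N(mu, sigma2), and r(x) = N(y; x, s2).
mu, sigma2 = 0.0, 0.01   # small step variance, as in a slow diffusion
y, s2 = 2.0, 1.0

approx_mean = perturbed_reverse_mean(mu, sigma2, lambda x: (y - x) / s2)
# Exact mean of the normalized product N(x; mu, sigma2) * N(y; x, s2).
exact_mean = mu + sigma2 / (sigma2 + s2) * (y - mu)
```

When the step variance is small relative to the width of $r$, the first-order mean and the exact mean nearly coincide, which is why the small-step regime makes the multiplication cheap.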

Deep Network as Approximator for Images

Multi-scale convolution

Downsample

Convolve

Upsample

Sum
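
The four steps above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the network used in the paper: mean pooling for downsampling, powers of two for the scales, nearest-neighbour upsampling, zero-padded single-channel filters, and no nonlinearity. All function names are hypothetical.

```python
import numpy as np

def downsample(img, factor):
    """Mean-pool a 2-D image by an integer factor (sides must be divisible by it)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(img, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def convolve_same(img, kernel):
    """Same-size 2-D cross-correlation with zero padding (odd kernel sizes)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def multiscale_convolution(img, kernels):
    """Sum over scales of upsample(convolve(downsample(img))), one kernel per scale."""
    total = np.zeros_like(img, dtype=float)
    for level, kernel in enumerate(kernels):
        factor = 2 ** level
        coarse = downsample(img, factor)
        total += upsample(convolve_same(coarse, kernel), factor)
    return total

# Usage: three scales (factors 1, 2, 4) with identity filters on a constant image.
img = np.ones((8, 8))
delta = np.zeros((3, 3))
delta[1, 1] = 1.0
out = multiscale_convolution(img, [delta, delta, delta])
```

Coarse scales let a small filter cover a large part of the image, so long-range structure is captured without large kernels.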

Applied to CIFAR-10

Training data

Samples from Generative Adversarial Network [Goodfellow et al., 2014]

Samples from diffusion model

Applied to CIFAR-10

Samples from DRAW [Gregor et al., 2015]

Samples from Generative Adversarial Network [Goodfellow et al., 2014]

Samples from diffusion model

Applied to Dead Leaves

Training data

Samples from [Theis et al., 2012]: log likelihood 1.24 bits/pixel

Samples from diffusion model: log likelihood 1.49 bits/pixel

Applied to Inpainting

Table App.1

References

http://jmlr.org/proceedings/papers/v37/sohl-dickstein15.html

http://videolectures.net/icml2015_sohl_dickstein_deep_unsupervised_learning/

http://www.inference.vc/icml-paper-unsupervised-learning-by-inverting-diffusion-processes/
