Structure learning
with deep neural networks
6th Network Modeling Workshop, 6/6/2013
Patrick Michl
Agenda
Autoencoders
Biological Model
Validation & Implementation
Real-world data is usually high dimensional …
[Figure: Dataset panel with samples in the (x1, x2) plane; Model panel]
… which makes structural analysis and modeling complicated!
Dimensionality reduction techniques like PCA …
… cannot preserve complex structures!
[Figure: PCA reduces the dataset to the linear model $x_2 = \alpha x_1 + \beta$]
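To make the limitation concrete, here is a minimal sketch assuming NumPy and scikit-learn are available; the parabola is a hypothetical stand-in for a nonlinear structure. Reconstructing the data from its first principal component collapses it onto a line.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical nonlinear dataset: x2 = f(x1) with f a parabola
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 500)
x2 = x1 ** 2 + rng.normal(0, 0.02, 500)
X = np.column_stack([x1, x2])

# Project onto one principal component and reconstruct
pca = PCA(n_components=1)
X_rec = pca.inverse_transform(pca.fit_transform(X))

# The reconstruction lies on a line x2 = alpha * x1 + beta,
# so the curvature of the original data is lost
print("mean squared reconstruction error:", np.mean((X - X_rec) ** 2))
```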
Therefore the analysis of unknown structures …
… needs more sophisticated, nonlinear techniques!
[Figure: a nonlinear model $x_2 = f(x_1)$ captures the dataset's structure]
Autoencoders are artificial neural networks …
Autoencoder
• Artificial neural network
[Diagram: network mapping input data X to output data X‘, built from perceptrons and Gaussian units]
Perceptrons take values in {0, 1}; Gaussian units take values in ℝ.
Autoencoder
• Artificial neural network
• Multiple hidden layers
… with multiple hidden layers.
[Diagram: visible layers (input data X, output data X‘) and hidden layers]
Such networks are called deep networks.
Definition (deep network)
Deep networks are artificial neural networks with multiple hidden layers.
Autoencoders have a symmetric topology …
… with an odd number of hidden layers.
The small layer in the center works like an information bottleneck …
... that creates a low-dimensional code for each sample in the input data.
The upper stack does the encoding …
… and the lower stack does the decoding.
• Deep network
• Symmetric topology
• Information bottleneck
• Encoder
• Decoder
Definition (autoencoder)
Autoencoders are deep networks with a symmetric topology and an odd number of hidden layers, containing an encoder, a low-dimensional representation, and a decoder.
Autoencoders can be used to reduce the dimensionality of data …
Problem: dimensionality of data
Idea (see the sketch below):
1. Train the autoencoder to minimize the distance between input X and output X‘
2. Encode X to a low-dimensional code Y
3. Decode the low-dimensional code Y to output X‘
4. Output X‘ is low dimensional
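A minimal sketch of this idea, assuming Keras is available; the layer sizes, activations, and the 10-dimensional input are illustrative, not from the talk.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(10,))                      # input data X
h = tf.keras.layers.Dense(6, activation="sigmoid")(inputs)
code = tf.keras.layers.Dense(2, activation="sigmoid")(h)  # low-dimensional code Y
h = tf.keras.layers.Dense(6, activation="sigmoid")(code)
outputs = tf.keras.layers.Dense(10)(h)                    # output data X'

autoencoder = tf.keras.Model(inputs, outputs)
encoder = tf.keras.Model(inputs, code)

# Step 1: minimize the distance between input X and output X'
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=100)

# Steps 2-3: encode X to Y; the decoder half maps Y back to X'
# Y = encoder.predict(X)
```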
… if we can train them!
In feed-forward ANNs, backpropagation is a good approach.
Training
Backpropagation
Page 266/6/2013
Patrick MichlNetwork Modeling
(1) The distance (error) between the current output X‘ and the wanted output Y is computed. This gives an error function.
Example (linear neural unit with two inputs) — see the sketch after the steps below.
(2) By calculating the gradient of the error function we get a vector that points in a direction which decreases the error.
(3) We update the parameters to decrease the error
(4) We repeat these steps.
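The example of a linear neural unit with two inputs can be made concrete with a small sketch of steps (1)–(4); the toy data, target weights, and learning rate below are illustrative assumptions.

```python
import numpy as np

# Linear unit y = w1*x1 + w2*x2 + b, trained by gradient descent
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # two inputs per sample
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0       # wanted output Y

w, b, lr = np.zeros(2), 0.0, 0.1
for step in range(200):
    y_hat = X @ w + b                         # current output X'
    err = y_hat - y
    E = np.mean(err ** 2)                     # (1) error function
    grad_w = 2 * X.T @ err / len(y)           # (2) gradient of the error
    grad_b = 2 * err.mean()
    w -= lr * grad_w                          # (3) step against the gradient
    b -= lr * grad_b                          # (4) repeat

print(E, w, b)                                # E near 0, w near [3, -2], b near 1
```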
… the problem is the multiple hidden layers!
Problem: Deep Network
Backpropagation is known to be slow far away from the output layer …
• Very slow training
… and can converge to poor local minima.
• Maybe a bad solution
Idea: Initialize close to a good solution
The task is to initialize the parameters close to a good solution!
• Pretraining
Therefore the training of autoencoders has a pretraining phase …
• Restricted Boltzmann Machines
… which uses Restricted Boltzmann Machines (RBMs)
Restricted Boltzmann Machine
• RBMs are Markov Random Fields
Markov Random Field
Every unit influences every neighbor; the coupling is undirected.
Motivation (Ising model)
A set of magnetic dipoles (spins) is arranged in a graph (lattice) where neighbors are coupled with a given strength.
• Bipartite topology: visible units (v) and hidden units (h)
• The local energy is used to calculate the probabilities of the unit values
Training: contrastive divergence (Gibbs sampling)
[Diagram: bipartite RBM with visible units v1–v4 and hidden units h1–h3]
Gibbs Sampling
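As a concrete illustration, here is a minimal sketch of contrastive divergence (CD-1) with one Gibbs sweep for a binary RBM; the sizes, learning rate, and random toy data are assumptions, not the talk's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_vis, n_hid, lr = 4, 3, 0.05
W = rng.normal(0, 0.01, (n_vis, n_hid))         # couplings between v and h
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)     # biases

data = rng.integers(0, 2, (1000, n_vis)).astype(float)  # toy binary samples
for v0 in data:
    # Gibbs sampling: up (v0 -> h0), down (h0 -> v1), up (v1 -> h1)
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(n_hid) < p_h0).astype(float)
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(n_vis) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    # CD-1: positive minus negative statistics approximate the gradient
    W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
    b_v += lr * (v0 - v1)
    b_h += lr * (p_h0 - p_h1)
```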
Top
The top-layer RBM transforms real-valued data into binary codes.
Therefore the visible units are modeled with Gaussians to encode the data …
… and the many hidden units with sigmoids to encode the dependencies.
[Diagram: GRBM with visible units v1–v4 and hidden units h1–h5]
The objective function is the sum of the local energies.
Local energy
$E_v := -\sum_h w_{hv} \frac{x_v}{\sigma_v} x_h + \frac{(x_v - b_v)^2}{2\sigma_v^2}$
Reduction
The next RBM layer maps the dependency encoding…
… from the upper layer …
… to a smaller number of sigmoids …
… which can be trained faster than the top layer
Local energy
$E_v := -\sum_h w_{hv} x_v x_h + x_v b_v$
$E_h := -\sum_v w_{hv} x_v x_h + x_h b_h$
Unrolling
The symmetric topology allows us to skip further training.
After pretraining, backpropagation usually finds good solutions.
• Pretraining: top RBM (GRBM), reduction RBMs, unrolling
• Fine-tuning: backpropagation
The algorithmic complexity of RBM training depends on the network size
• Time complexity: O(i·n·w), where i is the number of iterations, n the number of nodes, and w the number of weights
• Memory complexity: O(w)
Agenda
Autoencoders
Biological Model
Validation & Implementation
Network Modeling: Restricted Boltzmann Machines (RBM)
How to model the topological structure?
[Diagram: genes S and E connected via transcription factors (TF)]
We identify S and E with the visible layer …
… and the TFs with the hidden layer in an RBM.
The training of the RBM gives us a model.
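As a hypothetical sketch of this wiring (the gene and TF names are invented for illustration): the S and E genes form the visible layer, the TFs the hidden layer, and after CD-1 training as in the earlier sketch, the weight matrix can be read as putative regulatory strengths.

```python
import numpy as np

visible = ["S1", "S2", "E1", "E2"]       # observed genes (visible layer)
hidden = ["TF1", "TF2"]                  # transcription factors (hidden layer)

rng = np.random.default_rng(1)
W = rng.normal(0, 0.01, (len(visible), len(hidden)))
# ... train W, b_v, b_h with CD-1 on expression data, as sketched above ...

# |W[i, j]| as the strength of the putative regulation between
# gene i and transcription factor j
for i, g in enumerate(visible):
    for j, tf in enumerate(hidden):
        print(f"{g} <-> {tf}: {W[i, j]:+.3f}")
```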
Agenda
Autoencoder
Biological Model
Implementation & Results
Results
Validation of the results
• Needs information about the true regulation
• Needs information about the descriptive power of the data
Without this information, validation can only be done using artificial datasets!
Artificial datasets
We simulate data in three steps:
Step 1: Choose the number of genes (E+S) and create random bimodally distributed data
Step 2: Manipulate the data in a fixed order
Step 3: Add noise to the manipulated data and normalize the data
Simulation
Step 1: 8 visible nodes (4 E, 4 S). Create random data: random {-1, +1} + N(0, …)
Step 2: Manipulate the data
Step 3: Add noise: N(0, …)
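A sketch of the three steps; the slide truncates the noise level, so noise_std and the manipulation rule below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_genes, noise_std = 1000, 8, 0.1      # 4 E + 4 S genes; std assumed

# Step 1: random bimodal data, {-1, +1} plus Gaussian jitter
data = rng.choice([-1.0, 1.0], (n_samples, n_genes))
data += rng.normal(0, noise_std, data.shape)

# Step 2: manipulate the data in a fixed order
# (illustrative rule: gene 4 becomes a function of genes 0 and 1)
data[:, 4] = np.tanh(data[:, 0] + data[:, 1])

# Step 3: add noise and normalize each gene to zero mean, unit variance
data += rng.normal(0, noise_std, data.shape)
data = (data - data.mean(axis=0)) / data.std(axis=0)
```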
We analyse the data X with an RBM.
We train an autoencoder with 9 hidden layers and 165 hidden nodes:
Layers 1 & 9: 32 hidden units
Layers 2 & 8: 24 hidden units
Layers 3 & 7: 16 hidden units
Layers 4 & 6: 8 hidden units
Layer 5: 5 hidden units
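The same architecture as a sketch, assuming Keras and the 8-dimensional input from the simulation above; the activations are illustrative choices. The widths sum to the 165 hidden nodes stated above.

```python
import tensorflow as tf

widths = [32, 24, 16, 8, 5, 8, 16, 24, 32]     # 9 hidden layers
assert sum(widths) == 165                      # 165 hidden nodes in total

inputs = tf.keras.Input(shape=(8,))            # input data X
x = inputs
for w in widths:
    x = tf.keras.layers.Dense(w, activation="sigmoid")(x)
outputs = tf.keras.layers.Dense(8)(x)          # output data X'

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# pretrain with RBMs as described above, then fine-tune:
# autoencoder.fit(data, data, epochs=50)
```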
We transform the data from X to X‘ and reduce the dimensionality.
We analyse the transformed data X‘ with an RBM.
Let's compare the models.
Another example with more nodes and a larger autoencoder.
Conclusion
• Autoencoders can improve modeling significantly by reducing the dimensionality of data
• Autoencoders preserve complex structures in their multilayer perceptron network. Analysing those networks (for example with knockout tests) could give more structural information
• The drawback is high computational cost. Since the field of deep learning is becoming more popular (face recognition, voice recognition, image transformation), many improvements in handling the computational costs have been made.
Acknowledgement
eilsLABS
Prof. Dr. Rainer König
Prof. Dr. Roland Eils
Network Modeling Group