Structure learning
with deep neural networks
6th Network Modeling Workshop, 6/6/2013
Patrick Michl
Agenda
Autoencoders
Biological Model
Validation & Implementation
Real-world data is usually high dimensional …
[Figure: Dataset panel with samples in the (x1, x2) plane; Model panel]
… which makes structural analysis and modeling complicated!
Dimensionality reduction techniques like PCA …
… cannot preserve complex structures!
[Figure: PCA reduces the dataset to the linear model $x_2 = \alpha x_1 + \beta$]
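To make the limitation concrete, here is a minimal sketch assuming NumPy and scikit-learn are available; the parabola is a hypothetical stand-in for a nonlinear structure. Reconstructing the data from its first principal component collapses it onto a line.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical nonlinear dataset: x2 = f(x1) with f a parabola
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 500)
x2 = x1 ** 2 + rng.normal(0, 0.02, 500)
X = np.column_stack([x1, x2])

# Project onto one principal component and reconstruct
pca = PCA(n_components=1)
X_rec = pca.inverse_transform(pca.fit_transform(X))

# The reconstruction lies on a line x2 = alpha * x1 + beta,
# so the curvature of the original data is lost
print("mean squared reconstruction error:", np.mean((X - X_rec) ** 2))
```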
Therefore the analysis of unknown structures …
… needs more sophisticated, nonlinear techniques!
[Figure: a nonlinear model $x_2 = f(x_1)$ captures the dataset's structure]
Autoencoders are artificial neural networks …
Autoencoder
• Artificial neural network
[Diagram: network mapping input data X to output data X‘, built from perceptrons and Gaussian units]
Perceptrons take values in {0, 1}; Gaussian units take values in ℝ.
Autoencoder
• Artificial neural network
• Multiple hidden layers
… with multiple hidden layers.
[Diagram: visible layers (input data X, output data X‘) and hidden layers]
Such networks are called deep networks.
Definition (deep network)
Deep networks are artificial neural networks with multiple hidden layers.
Autoencoders have a symmetric topology …
… with an odd number of hidden layers.
The small layer in the center works like an information bottleneck …
... that creates a low-dimensional code for each sample in the input data.
The upper stack does the encoding …
… and the lower stack does the decoding.
• Deep network
• Symmetric topology
• Information bottleneck
• Encoder
• Decoder
Definition (autoencoder)
Autoencoders are deep networks with a symmetric topology and an odd number of hidden layers, containing an encoder, a low-dimensional representation, and a decoder.
Autoencoders can be used to reduce the dimensionality of data …
Problem: dimensionality of data
Idea (see the sketch below):
1. Train the autoencoder to minimize the distance between input X and output X‘
2. Encode X to a low-dimensional code Y
3. Decode the low-dimensional code Y to output X‘
4. Output X‘ is low dimensional
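A minimal sketch of this idea, assuming Keras is available; the layer sizes, activations, and the 10-dimensional input are illustrative, not from the talk.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(10,))                      # input data X
h = tf.keras.layers.Dense(6, activation="sigmoid")(inputs)
code = tf.keras.layers.Dense(2, activation="sigmoid")(h)  # low-dimensional code Y
h = tf.keras.layers.Dense(6, activation="sigmoid")(code)
outputs = tf.keras.layers.Dense(10)(h)                    # output data X'

autoencoder = tf.keras.Model(inputs, outputs)
encoder = tf.keras.Model(inputs, code)

# Step 1: minimize the distance between input X and output X'
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=100)

# Steps 2-3: encode X to Y; the decoder half maps Y back to X'
# Y = encoder.predict(X)
```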
… if we can train them!
In feed-forward ANNs, backpropagation is a good approach.
Training
Backpropagation
Page 266/6/2013
Patrick MichlNetwork Modeling
(1) The distance (error) between the current output X‘ and the wanted output Y is computed. This gives an error function.
Example (linear neural unit with two inputs) — see the sketch after the steps below.
(2) By calculating the gradient of the error function we get a vector that points in a direction which decreases the error.
(3) We update the parameters to decrease the error
(4) We repeat these steps.
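The example of a linear neural unit with two inputs can be made concrete with a small sketch of steps (1)–(4); the toy data, target weights, and learning rate below are illustrative assumptions.

```python
import numpy as np

# Linear unit y = w1*x1 + w2*x2 + b, trained by gradient descent
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                 # two inputs per sample
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0       # wanted output Y

w, b, lr = np.zeros(2), 0.0, 0.1
for step in range(200):
    y_hat = X @ w + b                         # current output X'
    err = y_hat - y
    E = np.mean(err ** 2)                     # (1) error function
    grad_w = 2 * X.T @ err / len(y)           # (2) gradient of the error
    grad_b = 2 * err.mean()
    w -= lr * grad_w                          # (3) step against the gradient
    b -= lr * grad_b                          # (4) repeat

print(E, w, b)                                # E near 0, w near [3, -2], b near 1
```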
… the problem is the multiple hidden layers!
Problem: Deep Network
Backpropagation is known to be slow far away from the output layer …
• Very slow training
… and can converge to poor local minima.
• Maybe a bad solution
Idea: Initialize close to a good solution
The task is to initialize the parameters close to a good solution!
• Pretraining
Therefore the training of autoencoders has a pretraining phase …
• Restricted Boltzmann Machines
… which uses Restricted Boltzmann Machines (RBMs)
Restricted Boltzmann Machine
• RBMs are Markov Random Fields
Markov Random Field
Every unit influences every neighbor; the coupling is undirected.
Motivation (Ising model)
A set of magnetic dipoles (spins) is arranged in a graph (lattice) where neighbors are coupled with a given strength.
• Bipartite topology: visible units (v) and hidden units (h)
• The local energy is used to calculate the probabilities of the unit values
Training: contrastive divergence (Gibbs sampling)
[Diagram: bipartite RBM with visible units v1–v4 and hidden units h1–h3]
Gibbs Sampling
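As a concrete illustration, here is a minimal sketch of contrastive divergence (CD-1) with one Gibbs sweep for a binary RBM; the sizes, learning rate, and random toy data are assumptions, not the talk's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_vis, n_hid, lr = 4, 3, 0.05
W = rng.normal(0, 0.01, (n_vis, n_hid))         # couplings between v and h
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)     # biases

data = rng.integers(0, 2, (1000, n_vis)).astype(float)  # toy binary samples
for v0 in data:
    # Gibbs sampling: up (v0 -> h0), down (h0 -> v1), up (v1 -> h1)
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(n_hid) < p_h0).astype(float)
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(n_vis) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    # CD-1: positive minus negative statistics approximate the gradient
    W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
    b_v += lr * (v0 - v1)
    b_h += lr * (p_h0 - p_h1)
```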
Top
The top-layer RBM transforms real-valued data into binary codes.
Therefore the visible units are modeled with Gaussians to encode the data …
… and the many hidden units with sigmoids to encode the dependencies.
[Diagram: GRBM with visible units v1–v4 and hidden units h1–h5]
The objective function is the sum of the local energies.
Local energy
$E_v := -\sum_h w_{hv} \frac{x_v}{\sigma_v} x_h + \frac{(x_v - b_v)^2}{2\sigma_v^2}$
Reduction
The next RBM layer maps the dependency encoding…
… from the upper layer …
… to a smaller number of sigmoids …
… which can be trained faster than the top layer
Local energy
$E_v := -\sum_h w_{hv} x_v x_h + x_v b_v$
$E_h := -\sum_v w_{hv} x_v x_h + x_h b_h$
Unrolling
The symmetric topology allows us to skip further training.
After pretraining, backpropagation usually finds good solutions.
• Pretraining: top RBM (GRBM), reduction RBMs, unrolling
• Fine-tuning: backpropagation
The algorithmic complexity of RBM training depends on the network size
• Time complexity: O(i·n·w), where i is the number of iterations, n the number of nodes, and w the number of weights
• Memory complexity: O(w)
Agenda
Autoencoders
Biological Model
Validation & Implementation
Network Modeling: Restricted Boltzmann Machines (RBM)
How to model the topological structure?
[Diagram: genes S and E connected via transcription factors (TF)]
We identify S and E with the visible layer …
… and the TFs with the hidden layer in an RBM.
The training of the RBM gives us a model.
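As a hypothetical sketch of this wiring (the gene and TF names are invented for illustration): the S and E genes form the visible layer, the TFs the hidden layer, and after CD-1 training as in the earlier sketch, the weight matrix can be read as putative regulatory strengths.

```python
import numpy as np

visible = ["S1", "S2", "E1", "E2"]       # observed genes (visible layer)
hidden = ["TF1", "TF2"]                  # transcription factors (hidden layer)

rng = np.random.default_rng(1)
W = rng.normal(0, 0.01, (len(visible), len(hidden)))
# ... train W, b_v, b_h with CD-1 on expression data, as sketched above ...

# |W[i, j]| as the strength of the putative regulation between
# gene i and transcription factor j
for i, g in enumerate(visible):
    for j, tf in enumerate(hidden):
        print(f"{g} <-> {tf}: {W[i, j]:+.3f}")
```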
Agenda
Autoencoder
Biological Model
Implementation & Results
Results
Validation of the results
• Needs information about the true regulation
• Needs information about the descriptive power of the data
Without this information, validation can only be done using artificial datasets!
Artificial datasets
We simulate data in three steps:
Step 1: Choose the number of genes (E+S) and create random bimodally distributed data
Step 2: Manipulate the data in a fixed order
Step 3: Add noise to the manipulated data and normalize the data
Simulation
Step 1: 8 visible nodes (4 E, 4 S). Create random data: random {-1, +1} + N(0, …)
Step 2: Manipulate the data
Step 3: Add noise: N(0, …)
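A sketch of the three steps; the slide truncates the noise level, so noise_std and the manipulation rule below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_genes, noise_std = 1000, 8, 0.1      # 4 E + 4 S genes; std assumed

# Step 1: random bimodal data, {-1, +1} plus Gaussian jitter
data = rng.choice([-1.0, 1.0], (n_samples, n_genes))
data += rng.normal(0, noise_std, data.shape)

# Step 2: manipulate the data in a fixed order
# (illustrative rule: gene 4 becomes a function of genes 0 and 1)
data[:, 4] = np.tanh(data[:, 0] + data[:, 1])

# Step 3: add noise and normalize each gene to zero mean, unit variance
data += rng.normal(0, noise_std, data.shape)
data = (data - data.mean(axis=0)) / data.std(axis=0)
```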
We analyse the data X with an RBM.
We train an autoencoder with 9 hidden layers and 165 hidden nodes:
Layers 1 & 9: 32 hidden units
Layers 2 & 8: 24 hidden units
Layers 3 & 7: 16 hidden units
Layers 4 & 6: 8 hidden units
Layer 5: 5 hidden units
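The same architecture as a sketch, assuming Keras and the 8-dimensional input from the simulation above; the activations are illustrative choices. The widths sum to the 165 hidden nodes stated above.

```python
import tensorflow as tf

widths = [32, 24, 16, 8, 5, 8, 16, 24, 32]     # 9 hidden layers
assert sum(widths) == 165                      # 165 hidden nodes in total

inputs = tf.keras.Input(shape=(8,))            # input data X
x = inputs
for w in widths:
    x = tf.keras.layers.Dense(w, activation="sigmoid")(x)
outputs = tf.keras.layers.Dense(8)(x)          # output data X'

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
# pretrain with RBMs as described above, then fine-tune:
# autoencoder.fit(data, data, epochs=50)
```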
We transform the data from X to X‘ and reduce the dimensionality.
We analyse the transformed data X‘ with an RBM.
Let's compare the models.
Another example with more nodes and a larger autoencoder.
Conclusion
• Autoencoders can improve modeling significantly by reducing the dimensionality of data
• Autoencoders preserve complex structures in their multilayer perceptron network. Analysing those networks (for example with knockout tests) could give more structural information
• The drawback is high computational cost. Since the field of deep learning is becoming more popular (face recognition, voice recognition, image transformation), many improvements in handling the computational costs have been made.
Acknowledgement
eilsLABS
Prof. Dr. Rainer König
Prof. Dr. Roland Eils
Network Modeling Group