Neural networks: Unsupervised learning

Source: cneuro.rmki.kfki.hu/sites/default/files/3_hopfield_network.pdf



Previously

The supervised learning paradigm:

given example inputs x and target outputs t, learn the mapping between them

the trained network is supposed to give a ‘correct response’ for any given input stimulus

training is equivalent to learning the appropriate weights

to achieve this, an objective function (or error function) is defined, which is minimized during training

Previously

Optimization wrt. an objective function:

M(w) = E_D(w) + α E_W(w)

where E_D(w) is the error function and E_W(w) is the regularizer

Previously

Interpret y(x,w) as a probability:

the likelihood of the input data can be expressed with the original error function

the regularizer has the form of a prior!

what we get in the objective function M(w) is the posterior distribution of w

The neuron’s behavior is faithfully translated into probabilistic terms!
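Written out, the correspondence the slide refers to is the standard one (a sketch in the notation used above, with E_D the error function and E_W the regularizer):

```latex
P(D \mid \mathbf{w}) \propto \exp\!\big(-E_D(\mathbf{w})\big), \qquad
P(\mathbf{w}) \propto \exp\!\big(-\alpha E_W(\mathbf{w})\big)
\;\;\Rightarrow\;\;
P(\mathbf{w} \mid D) \propto \exp\!\big(-E_D(\mathbf{w}) - \alpha E_W(\mathbf{w})\big)
= \exp\!\big(-M(\mathbf{w})\big).
% Minimizing M(w) is therefore the same as finding the most probable weights w_MP
% under the posterior: likelihood <-> error function, prior <-> regularizer.
```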

Previously

When making predictions....

Original estimate vs. Bayesian estimate

• The probabilistic interpretation makes our assumptions explicit: by the regularizer we imposed a soft constraint on the learned parameters, which expresses our prior expectations.

• An additional plus: beyond getting w_MP we get a measure of the uncertainty in the learned parameters
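In formulas, the two estimates being contrasted are (a sketch, with y and P(w|D) as introduced on the previous slides):

```latex
% Original (MAP) estimate: plug in the single most probable weight vector
y\big(\mathbf{x},\, \mathbf{w}_{\mathrm{MP}}\big), \qquad
\mathbf{w}_{\mathrm{MP}} = \arg\max_{\mathbf{w}} P(\mathbf{w} \mid D).
% Bayesian estimate: average the prediction over the whole posterior,
% which is where the parameter uncertainty enters the prediction
P(t \mid \mathbf{x}, D) = \int P(t \mid \mathbf{x}, \mathbf{w})\, P(\mathbf{w} \mid D)\, d\mathbf{w}.
```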

What’s coming?

• Networks & probabilistic framework: from the Hopfield network to Boltzmann machine

• What we learn?

Density estimation, neural architecture and optimization principles: principal component analysis (PCA)

• How we learn?

Hebb et al: Learning rules• Any biology?

Simple cells & ICA

Learning data...


Unsupervised learning: what is it about?

Capacity of a single neuron is limited: certain data can only be learned by a network of neurons.

So far we used a supervised learning paradigm: a teacher was necessary to teach an input-output relation.

Hopfield networks try to cure both.

Hebb rule: an enlightening example

assuming 2 neurons and a weight modification process: Δw ∝ x1 x2 (the connection strengthens when the two neurons are active together)

This simple rule realizes an associative memory!

Neural networks: The Hopfield network

Architecture: a set of I neurons connected by symmetric synapses of weight wij; no self-connections: wii = 0; output of neuron i: xi

Activity rule: xi = Θ(ai), with ai = Σj wij xj and Θ(a) = +1 if a ≥ 0, −1 otherwise

Synchronous / asynchronous update

Learning rule (Hebbian): wij = η Σn xi(n) xj(n), where x(n) is the n-th stored pattern (e.g. η = 1/I)

alternatively, a continuous network can be defined by replacing the threshold with xi = tanh(ai)
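The activity and learning rules were shown as figures on the slide; as a concrete illustration, here is a minimal NumPy sketch of a Hopfield network trained with the Hebb rule and run with asynchronous updates (function names, pattern sizes and the amount of corruption are our own choices):

```python
import numpy as np

def train_hebb(patterns):
    """Hebb rule: w_ij proportional to sum_n x_i^(n) x_j^(n), with w_ii = 0."""
    w = patterns.T @ patterns / len(patterns)
    np.fill_diagonal(w, 0.0)              # no self-connections
    return w

def recall(w, x, n_sweeps=10, rng=None):
    """Asynchronous activity rule: x_i <- sign(sum_j w_ij x_j), one unit at a time."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    for _ in range(n_sweeps):
        for i in rng.permutation(len(x)):
            x[i] = 1 if w[i] @ x >= 0 else -1
    return x

def energy(w, x):
    """Lyapunov (energy) function E(x) = -1/2 x^T W x."""
    return -0.5 * x @ w @ x

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(2, 100))    # two random +/-1 memories, I = 100 neurons
w = train_hebb(patterns)
cue = patterns[0].copy()
cue[:20] *= -1                                   # corrupt 20 of the 100 bits
restored = recall(w, cue, rng=rng)
print("bits recovered:", (restored == patterns[0]).mean())
print("energy, cue -> restored:", energy(w, cue), "->", energy(w, restored))
```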

Neural networks: Stability of the Hopfield network

Are the memories stable?

Necessary conditions: symmetric weights; asynchronous update

Robust against perturbation of a subset of weights

the activation and activity rule together define a Lyapunov function: E(x) = −½ Σij wij xi xj
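Why this works (a short version of the standard argument, not spelled out on the slide): with symmetric weights, each asynchronous update can only lower the energy.

```latex
% With E(\mathbf{x}) = -\tfrac{1}{2}\sum_{i,j} w_{ij} x_i x_j,\; w_{ij}=w_{ji},\; w_{ii}=0,
% updating a single unit i to x_i' = \Theta(a_i) changes the energy by
\Delta E = -\,(x_i' - x_i)\, a_i, \qquad a_i = \sum_j w_{ij} x_j ,
% and since x_i' has the same sign as a_i we get (x_i' - x_i)\, a_i \ge 0, i.e. \Delta E \le 0.
% The energy is bounded below, so the asynchronous dynamics must settle in a local minimum:
% the stored memories sit in such minima, which is what makes them stable attractors.
```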

Neural networks: Capacity of the Hopfield network

How many traces can be memorized by a network of I neurons?

Neural networks: Capacity of the Hopfield network

Failures of the Hopfield network:

• Corrupted bits
• Missing memory traces
• Spurious states not directly related to training data

Neural networks: Capacity of the Hopfield network

Activation rule: ai = Σj wij xj

Trace of the ‘desired’ memory and additional random memories: with the Hebbian weights (η = 1/I) and the network set to stored pattern x(m),

ai ≈ xi(m) + (1/I) Σ(n≠m) Σj xi(n) xj(n) xj(m)

desired state + random contribution (crosstalk from the other stored memories, which grows with the number of memories N)

Neural networks: Capacity of the Hopfield network

Failure in operation: avalanches

• N/I above ≈ 0.138: only ‘spin glass’ states remain
• N/I ∈ (0, 0.138): states close to the desired memories exist
• N/I ∈ (0, 0.05): desired states have lower energy than spurious states
• N/I ∈ (0.05, 0.138): spurious states dominate
• N/I ∈ (0, 0.03): mixture states also appear

The Hebb rule determines how well the network performs; other learning rules might do a better job (e.g. reinforcement learning).
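To make the 0.138·I figure concrete, here is a small, purely illustrative NumPy experiment (our own construction, not from the slides): it stores N random patterns with the Hebb rule and reports the fraction of stored bits that are already stable under the activation rule. This only checks single-step stability, not the full avalanche dynamics, but the error rate visibly grows as N/I approaches the critical load.

```python
import numpy as np

def stable_bit_fraction(n_patterns, n_units=200, rng=None):
    """Store n_patterns random +/-1 patterns with the Hebb rule and return the
    fraction of bits for which x_i^(n) = sign(sum_j w_ij x_j^(n)) already holds."""
    rng = np.random.default_rng(0) if rng is None else rng
    patterns = rng.choice([-1, 1], size=(n_patterns, n_units))
    w = patterns.T @ patterns / n_units
    np.fill_diagonal(w, 0.0)
    activations = patterns @ w            # activations with the network set to each stored pattern
    return np.mean(np.sign(activations) == patterns)

for load in (0.05, 0.10, 0.138, 0.20, 0.30):
    n = max(1, int(load * 200))
    print(f"N/I = {load:5.3f}: fraction of stable bits = {stable_bit_fraction(n):.3f}")
```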

Hopfield network for optimization

The Boltzmann machine

The optimization performed by the Hopfield network: minimizing E(x; w) = −½ Σij wij xi xj

Again we can make a correspondence with a probabilistic model: P(x | w) = exp(−E(x; w)) / Z(w)

What do we gain by this:
• more transparent functioning
• superior performance compared to the Hebb rule

Activity rule: pick a neuron i at random and set xi = +1 with probability 1 / (1 + exp(−2 ai)), where ai = Σj wij xj (and xi = −1 otherwise)

How is learning performed?
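The activity rule above is Gibbs sampling from P(x|w); a minimal NumPy sketch of running these stochastic dynamics (network size, weight scale and sweep count are arbitrary choices of ours):

```python
import numpy as np

def gibbs_sweep(w, x, rng):
    """One sweep of the stochastic activity rule of a +/-1 Boltzmann machine:
    each unit is set to +1 with probability 1 / (1 + exp(-2 a_i))."""
    for i in rng.permutation(len(x)):
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * (w[i] @ x)))
        x[i] = 1 if rng.random() < p_plus else -1
    return x

rng = np.random.default_rng(0)
I = 20
w = rng.normal(scale=0.1, size=(I, I))
w = (w + w.T) / 2                 # symmetric weights
np.fill_diagonal(w, 0.0)          # no self-connections
x = rng.choice([-1, 1], size=I)
for _ in range(200):              # after burn-in, the visited states are samples from P(x|w)
    x = gibbs_sweep(w, x, rng)
print(x)
```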

Boltzmann machine -- EM

Likelihood function: ln P({x(n)} | w) = Σn ln P(x(n) | w), with P(x | w) = exp(−E(x; w)) / Z(w)

Estimating the parameters -- maximizing the likelihood wrt. w (i.e. minimizing the negative log-likelihood):

∂ ln P / ∂wij ∝ ⟨xi xj⟩data − ⟨xi xj⟩model

Sleeping and waking: the first (‘waking’) term is a Hebbian correlation measured with the data clamped on the network; the second (‘sleeping’) term is the same correlation while the network runs freely.
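For the fully visible case this contrast can be written down in a few lines; an illustrative sketch of one learning step (learning rate, sample counts and network size are our own choices):

```python
import numpy as np

def gibbs_sweep(w, x, rng):
    """Stochastic activity rule: P(x_i = +1) = 1 / (1 + exp(-2 a_i))."""
    for i in rng.permutation(len(x)):
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * (w[i] @ x)))
        x[i] = 1 if rng.random() < p_plus else -1
    return x

def boltzmann_learning_step(w, data, rng, eta=0.05, n_samples=200, n_sweeps=5):
    """One gradient step for a fully visible +/-1 Boltzmann machine:
    dw_ij proportional to <x_i x_j>_data - <x_i x_j>_model ('waking' minus 'sleeping')."""
    corr_data = data.T @ data / len(data)        # 'waking': correlations with the data clamped
    x = rng.choice([-1, 1], size=w.shape[0])
    samples = []
    for _ in range(n_samples):                   # 'sleeping': correlations of the free-running model
        for _ in range(n_sweeps):
            x = gibbs_sweep(w, x, rng)
        samples.append(x.copy())
    samples = np.array(samples)
    corr_model = samples.T @ samples / n_samples
    w = w + eta * (corr_data - corr_model)
    np.fill_diagonal(w, 0.0)                     # keep w_ii = 0; w stays symmetric by construction
    return w
```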

Learning data...


Summary

The Boltzmann machine translates the neural network mechanisms into a probabilistic framework

Its capabilities are limited

We learned that the probabilistic framework clarifies assumptions

We learned that, within the world constrained by our assumptions, the probabilistic approach gives clear answers

Learning data...

Hopfield/Boltzmann

?

Principal Component Analysis

Let’s try to find linearly independent filters: set the basis along the eigenvectors of the data (its covariance matrix)
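A minimal NumPy sketch of that recipe (the toy data and names are ours, purely for illustration):

```python
import numpy as np

def pca(data, n_components=2):
    """PCA: diagonalize the data covariance and keep the leading eigenvectors as the new basis."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / len(centered)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh: ascending eigenvalues of a symmetric matrix
    order = np.argsort(eigvals)[::-1]             # re-order by decreasing variance
    basis = eigvecs[:, order[:n_components]]
    return basis, centered @ basis                # principal directions and projected data

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])   # correlated toy data
basis, projected = pca(data)
print(basis)                   # columns: directions of largest (then next-largest) variance
print(projected.var(axis=0))   # variances along the principal directions
```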

Principal Component Analysis

Olshausen & Field, Nature (1996)

Relation between PCA and learning rules

A single neuron driven by multiple inputs: v = w · u

Basic Hebb rule: τw dw/dt = v u

Averaged Hebb rule: τw dw/dt = ⟨v u⟩

Correlation-based rule: τw dw/dt = Q · w, with Q = ⟨u uT⟩ the input correlation matrix

note that ⟨v u⟩ = ⟨u uT⟩ w = Q · w, so the averaged Hebb rule and the correlation-based rule coincide
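The connection to PCA that the slide is building towards can be made explicit by expanding the weights in the eigenvectors of Q (a sketch of the standard argument):

```latex
% Let Q\,\mathbf{e}_\mu = \lambda_\mu \mathbf{e}_\mu and expand \mathbf{w}(t) = \sum_\mu c_\mu(t)\,\mathbf{e}_\mu.
% The correlation-based rule \tau_w \dot{\mathbf{w}} = Q\mathbf{w} then decouples into
\tau_w\, \dot{c}_\mu = \lambda_\mu c_\mu
\quad\Longrightarrow\quad
c_\mu(t) = c_\mu(0)\, e^{\lambda_\mu t / \tau_w},
% so the component along the principal eigenvector (largest \lambda_\mu) grows fastest and the
% weight vector aligns with the first principal component of the inputs. The weights also grow
% without bound, which is why the constraints discussed next (thresholds, and eventually the
% Oja rule) are needed.
```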

Relation between PCA and learning rules

Making possible both LTP and LTD

Presynaptic threshold: τw dw/dt = v (u − θu) -- heterosynaptic depression (inactive synapses are weakened while the postsynaptic neuron is active)

Postsynaptic threshold: τw dw/dt = (v − θv) u -- homosynaptic depression (active synapses are weakened when the postsynaptic activity is below threshold)

Setting the threshold to the average postsynaptic activity, θv = ⟨v⟩ = w · ⟨u⟩, turns the averaged postsynaptic-threshold rule into τw dw/dt = C · w, where C = ⟨(u − ⟨u⟩)(u − ⟨u⟩)T⟩ is the input covariance matrix

BCM rule: τw dw/dt = v u (v − θv), with a sliding threshold θv that follows the recent average of v²

Relation between PCA and learning rules -- Regularization again

BCM rule: the sliding threshold θv keeps the weights from growing without bound

Hebb rule: stabilized instead by an explicit decay term, τw dw/dt = v u − α v² w (Oja rule)
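An illustrative NumPy sketch (toy data and constants are our own) showing that the Oja rule drives the weight vector of a single linear neuron onto the first principal component, i.e. it performs PCA while keeping the weights bounded:

```python
import numpy as np

rng = np.random.default_rng(0)
# zero-mean, correlated toy inputs u
U = rng.normal(size=(5000, 2)) @ np.array([[1.5, 0.0], [0.8, 0.3]])
U -= U.mean(axis=0)

w = rng.normal(size=2)
eta, alpha = 0.01, 1.0
for u in U:                                   # Oja rule: dw = eta * (v*u - alpha * v**2 * w), v = w.u
    v = w @ u
    w += eta * (v * u - alpha * v**2 * w)

Q = U.T @ U / len(U)                          # input correlation matrix
eigvals, eigvecs = np.linalg.eigh(Q)
print(w / np.linalg.norm(w))                  # learned direction (unit norm)
print(eigvecs[:, -1])                         # principal eigenvector -- the same up to sign
```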

Relation between PCA and learning rules

The architecture of the network and the learning rule hand in hand determine the learned representation

Density estimation

Empirical distribution / input distribution: p̃[u]

Latent variables: v

Recognition: p[v | u] -- inverting the generative model to infer the causes behind an input

Generative distribution: p[u] = Σv p[u | v] p[v]

Recognition distribution: p[v | u] = p[u | v] p[v] / p[u]

Kullback-Leibler divergence: DKL( p̃[u] ‖ p[u] ) = Σu p̃[u] ln( p̃[u] / p[u] )

The match between our model distribution and the input distribution
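A step worth making explicit (standard, though not spelled out on the slide): minimizing this KL divergence over the model is the same as maximizing the average log-likelihood of the inputs, because the entropy of the data distribution is fixed.

```latex
D_{\mathrm{KL}}\!\left(\tilde p[\mathbf{u}] \,\big\|\, p[\mathbf{u}]\right)
= \sum_{\mathbf{u}} \tilde p[\mathbf{u}] \ln \tilde p[\mathbf{u}]
  \;-\; \sum_{\mathbf{u}} \tilde p[\mathbf{u}] \ln p[\mathbf{u}]
= -H\big(\tilde p\big) \;-\; \big\langle \ln p[\mathbf{u}] \big\rangle_{\tilde p}\, .
% The entropy term does not depend on the model, so density estimation by minimizing
% D_KL is equivalent to maximum-likelihood fitting of the generative distribution p[u].
```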

How to solve density estimation? EM

Sparse coding

PCA: trying to find linearly independent filters, setting the basis along the eigenvectors of the data

Sparse coding: make the reconstruction faithful while keeping the units/neurons quiet

Sparse coding

Neural dynamics: gradient descent
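A minimal, illustrative sketch of such dynamics (the cost function, dictionary and step sizes are our own choices, in the spirit of Olshausen & Field): the activities v follow gradient descent on a reconstruction-plus-sparseness cost for a fixed dictionary G.

```python
import numpy as np

def sparse_code(u, G, lam=0.1, eta=0.05, n_steps=1000):
    """Infer sparse activities v for input u by gradient descent on
    E(v) = 0.5 * ||u - G v||^2 + lam * sum_j |v_j|   (dictionary G held fixed)."""
    v = np.zeros(G.shape[1])
    for _ in range(n_steps):
        residual = u - G @ v
        grad = -G.T @ residual + lam * np.sign(v)   # (sub)gradient of E wrt v
        v -= eta * grad
    return v

rng = np.random.default_rng(0)
G = rng.normal(size=(16, 32))                       # overcomplete dictionary: 32 causes, 16-dim input
G /= np.linalg.norm(G, axis=0)
v_true = np.zeros(32)
v_true[[3, 17]] = [1.0, -0.7]                       # an input generated by just two active causes
u = G @ v_true + 0.01 * rng.normal(size=16)
v_hat = sparse_code(u, G)
print(np.round(v_hat, 2))                           # most entries ~0; the largest should sit near indices 3 and 17
```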

Sparse coding