Neurons and neural networks II. Hopfield network

Source: cneuro.rmki.kfki.hu/sites/default/files/nn_perceptron_2.pdf

1

Neurons and neural networks II. Hopfield network

2

Perceptron recap

  • key ingredient: adaptivity of the system
  • unsupervised vs. supervised learning
  • architecture for discrimination: a single neuron, the perceptron
  • error function and learning rule
  • gradient descent learning and divergence
  • regularisation
  • learning as inference

3

Interpreting learning as inference

So far: optimization with respect to an objective function.
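A standard form for this objective, assuming the cross-entropy error G(w) and the quadratic weight penalty E_W(w) of the perceptron lecture:

$$M(\mathbf{w}) = G(\mathbf{w}) + \alpha E_W(\mathbf{w}),$$

where

$$G(\mathbf{w}) = -\sum_n \Bigl[ t^{(n)} \ln y\bigl(\mathbf{x}^{(n)},\mathbf{w}\bigr) + \bigl(1-t^{(n)}\bigr) \ln \bigl(1-y\bigl(\mathbf{x}^{(n)},\mathbf{w}\bigr)\bigr) \Bigr], \qquad E_W(\mathbf{w}) = \frac{1}{2}\sum_i w_i^2 .$$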

What’s this quirky regularizer, anyway?

4

Interpreting learning as inference

Let's interpret y(x, w) as a probability and write it in a compact form:

the likelihood of the data can then be expressed with the original error function,

the regularizer has the form of a prior,

and what we get in the objective function M(w) is the posterior distribution of w.
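Explicitly, in standard notation (with Z_W and Z_M denoting the normalizing constants of the prior and the posterior):

$$P(t=1 \mid \mathbf{x}, \mathbf{w}) = y(\mathbf{x}, \mathbf{w}), \qquad \text{compactly: } P(t \mid \mathbf{x}, \mathbf{w}) = y^{\,t}\,(1-y)^{\,1-t},$$

$$P(D \mid \mathbf{w}) = e^{-G(\mathbf{w})}, \qquad P(\mathbf{w} \mid \alpha) = \frac{e^{-\alpha E_W(\mathbf{w})}}{Z_W(\alpha)},$$

$$P(\mathbf{w} \mid D, \alpha) = \frac{P(D \mid \mathbf{w})\, P(\mathbf{w} \mid \alpha)}{P(D \mid \alpha)} = \frac{e^{-M(\mathbf{w})}}{Z_M}.$$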

5

Interpreting learning as inference

Relationship between M(w) and the posterior

Interpretation: minimizing M(w) leads to finding the maximum a posteriori estimate w_MP.

The log-probability interpretation of the objective function retains the additivity of errors while keeping the multiplicativity of probabilities.
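In symbols, with the same notation as above:

$$-\ln P(\mathbf{w} \mid D, \alpha) = G(\mathbf{w}) + \alpha E_W(\mathbf{w}) + \ln Z_M = M(\mathbf{w}) + \text{const},$$

so the w that minimizes M(w) is exactly the w that maximizes the posterior.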

6

Interpreting learning as inference

Properties of the Bayesian estimate

  • The probabilistic interpretation makes our assumptions explicit: through the regularizer we imposed a soft constraint on the learned parameters, which expresses our prior expectations.

  • An additional plus: beyond getting w_MP, we also get a measure of the uncertainty of the learned parameters.

7

Interpreting learning as inference

Demo


8

Interpreting learning as inference

Making predictions

Up to this point the goal was optimization: find the single best weight vector w_MP and predict with y(x, w_MP).

Are we equally confident in the two predictions?

The Bayesian answer exploits the probabilistic interpretation: average over all weight settings, weighted by their posterior probability.
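In standard notation, the prediction averages the neuron's output over the weight posterior:

$$P(t=1 \mid \mathbf{x}, D, \alpha) = \int P(t=1 \mid \mathbf{x}, \mathbf{w})\, P(\mathbf{w} \mid D, \alpha)\, d\mathbf{w}.$$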

9

Interpreting learning as inference

Calculating Bayesian predictions
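The ingredients, written out in standard notation (y abbreviates y(x, w)):

Predictive probability:
$$P(t=1 \mid \mathbf{x}, D, \alpha) = \int y(\mathbf{x}, \mathbf{w})\, P(\mathbf{w} \mid D, \alpha)\, d\mathbf{w}$$

Likelihood:
$$P(D \mid \mathbf{w}) = e^{-G(\mathbf{w})}$$

Weight posterior:
$$P(\mathbf{w} \mid D, \alpha) = \frac{e^{-M(\mathbf{w})}}{Z_M}$$

Partition function:
$$Z_M = \int e^{-M(\mathbf{w})}\, d\mathbf{w}$$

Finally:
$$P(t=1 \mid \mathbf{x}, D, \alpha) = \frac{1}{Z_M} \int y(\mathbf{x}, \mathbf{w})\, e^{-M(\mathbf{w})}\, d\mathbf{w}.$$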

10

Interpreting learning as inference

Calculating Bayesian predictions

How do we solve the integral? Bad news: there is no closed form in general, so Monte Carlo integration is needed.
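A minimal sketch of what this involves, assuming the logistic single neuron and the objective M(w) = G(w) + αE_W(w) defined above; the random-walk Metropolis sampler and all names here are illustrative, not the lecture's own code:

```python
import numpy as np

# Metropolis sampling from the weight posterior P(w | D, alpha) ∝ exp(-M(w)),
# followed by Monte Carlo averaging of y(x, w) for the predictive probability.

def y(x, w):
    """Logistic output of the single neuron, y(x, w)."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))

def M(w, X, t, alpha):
    """Objective M(w) = G(w) + alpha * E_W(w)."""
    p = np.clip(y(X, w), 1e-12, 1 - 1e-12)
    G = -np.sum(t * np.log(p) + (1 - t) * np.log(1 - p))  # cross-entropy error
    return G + alpha * 0.5 * np.sum(w ** 2)               # quadratic regularizer

def metropolis(X, t, alpha, n_samples=5000, step=0.1, seed=0):
    """Draw weight samples from exp(-M(w)) with a random-walk proposal."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    m = M(w, X, t, alpha)
    samples = []
    for _ in range(n_samples):
        w_new = w + step * rng.standard_normal(w.shape)
        m_new = M(w_new, X, t, alpha)
        if m_new <= m or rng.random() < np.exp(m - m_new):  # accept w.p. min(1, e^{-dM})
            w, m = w_new, m_new
        samples.append(w.copy())
    return np.array(samples)

def predict(x, samples, burn_in=1000):
    """Predictive probability: posterior average of y(x, w)."""
    return np.mean([y(x, w) for w in samples[burn_in:]])

# Toy usage: 2-D inputs with a bias column, binary targets.
X = np.array([[1.0, 0.5, 1.2], [1.0, -1.0, 0.3], [1.0, 0.9, -0.7], [1.0, -0.4, -1.1]])
t = np.array([1, 0, 1, 0])
samples = metropolis(X, t, alpha=0.1)
print(predict(np.array([1.0, 0.2, 0.4]), samples))
```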


13

Interpreting learning as inference

Calculating Bayesian predictions

(Figure: the original estimate vs. the Bayesian estimate.)

14

Interpreting learning as inference

Gaussian approximation

The Gaussian approximation: Taylor-expand M(w) around the MAP estimate and keep terms up to second order.
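In the standard form:

$$M(\mathbf{w}) \simeq M(\mathbf{w}_{\mathrm{MP}}) + \frac{1}{2}\,(\mathbf{w}-\mathbf{w}_{\mathrm{MP}})^{\top} \mathbf{A}\,(\mathbf{w}-\mathbf{w}_{\mathrm{MP}}), \qquad \mathbf{A} = \nabla\nabla M(\mathbf{w})\big|_{\mathbf{w}_{\mathrm{MP}}}$$

(the linear term vanishes because the gradient is zero at the minimum), giving the Gaussian posterior approximation

$$Q(\mathbf{w}) \propto \exp\Bigl( -\tfrac{1}{2}\,(\mathbf{w}-\mathbf{w}_{\mathrm{MP}})^{\top} \mathbf{A}\,(\mathbf{w}-\mathbf{w}_{\mathrm{MP}}) \Bigr),$$

i.e. a Gaussian with mean w_MP and covariance A⁻¹, which makes the predictive integral tractable.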

15

Neural networks

Unsupervised learning

Unsupervised learning: what is it about?

The capacity of a single neuron is limited: certain data sets cannot be learned. Moreover, so far we used a supervised learning paradigm: a teacher was necessary to teach an input-output relation.

Hopfield networks try to cure both.

Hebb rule: an enlightening example, assuming 2 neurons and a weight modification process in which co-activation strengthens the connection.

This simple rule realizes an associative memory!
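A standard formalization of the rule (the learning rate η is assumed here):

$$\Delta w_{12} = \eta\, x_1 x_2 .$$

If the two neurons are repeatedly active together, w_12 grows, so activity in one later evokes activity in the other.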

16

Neural networks

The Hopfield network

Architecture: a set of I neurons connected by symmetric synapses of weight w_ij; no self-connections: w_ii = 0; output of neuron i: x_i.

Activity rule: each neuron thresholds the weighted sum of the other neurons' outputs; updates may be synchronous or asynchronous.

Learning rule: Hebbian storage of the training patterns.

Alternatively, a continuous network can be defined by replacing the hard threshold with a smooth nonlinearity.
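In a standard formulation, assuming N stored binary patterns x^(n) ∈ {−1, +1}^I and a learning rate η:

Activity rule:
$$a_i = \sum_j w_{ij}\, x_j, \qquad x_i = \begin{cases} +1 & \text{if } a_i \ge 0 \\ -1 & \text{if } a_i < 0 \end{cases}$$

Learning rule:
$$w_{ij} = \eta \sum_n x_i^{(n)} x_j^{(n)} \quad (i \neq j), \qquad \text{e.g. } \eta = \frac{1}{N},$$

and the continuous variant keeps the same weights but uses x_i = tanh(a_i).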

17

Neural networks

Stability of Hopfield network

Are the memories stable?

Necessary conditions: symmetric weights; asynchronous update.

The network is robust against perturbation of a subset of the weights.
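Why the conditions matter (the standard Lyapunov argument): with symmetric weights the network has an energy function

$$E(\mathbf{x}) = -\frac{1}{2}\sum_{i,j} w_{ij}\, x_i x_j,$$

and an asynchronous single-neuron update can never increase E, so the dynamics must settle into a fixed point; for a modest number of stored patterns these fixed points lie at or near the memories.

A minimal sketch of storage and recall, assuming the Hebbian rule and the threshold activity rule above (illustrative code; sizes and names are arbitrary):

```python
import numpy as np

# Hebbian storage and asynchronous recall in a binary Hopfield network.
rng = np.random.default_rng(0)

def store(patterns):
    """Learning rule: w_ij = (1/N) * sum_n x_i^(n) x_j^(n), with w_ii = 0."""
    W = patterns.T @ patterns / patterns.shape[0]
    np.fill_diagonal(W, 0.0)              # no self-connections
    return W

def recall(W, x, n_sweeps=10):
    """Asynchronous activity rule: x_i <- sign(sum_j w_ij x_j), one i at a time."""
    x = x.copy()
    for _ in range(n_sweeps):
        for i in rng.permutation(len(x)):
            x[i] = 1 if W[i] @ x >= 0 else -1
    return x

I, N = 100, 5                             # 100 neurons, 5 random memories
patterns = rng.choice([-1, 1], size=(N, I))
W = store(patterns)

noisy = patterns[0].copy()
flipped = rng.choice(I, size=15, replace=False)
noisy[flipped] *= -1                      # corrupt 15 of the 100 bits
print(np.array_equal(recall(W, noisy), patterns[0]))   # usually True: memory recovered
```

With 5 patterns over 100 neurons the load N/I = 0.05 is comfortably below capacity, so recall from a perturbed start typically converges back to the stored pattern, illustrating the robustness claim above.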

18

Neural networks

Capacity of Hopfield network

How many traces can be memorized by a network of I neurons?
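For reference, the classical result for random patterns stored with the Hebbian rule: recall with only a small fraction of bit errors works up to about

$$N_{\max} \approx 0.138\, I,$$

beyond which the memories abruptly destabilize.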
