Page 1: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

Michael J. Watts 

http://mike.watts.net.nz

Page 2: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Lecture Outline

• Perceptron revision
• Multi-Layer Perceptrons
• Terminology
• Advantages
• Problems

Page 3: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Perceptron Revision

• Two neuron layer networks
• Single layer of adjustable weights
• Feed forward network
• Cannot handle non-linearly-separable problems
• Supervised learning algorithm
• Mostly used for classification (sketch below)
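
A minimal sketch of the perceptron just described, assuming a hard-limit (step) activation, a single layer of adjustable weights and a bias acting as the threshold; the function name and the AND example are illustrative, not from the lecture:

```python
import numpy as np

def perceptron_output(x, weights, bias):
    """Single layer of adjustable weights feeding one output neuron.

    Returns 1 if the weighted sum plus bias exceeds 0, else 0
    (hard-limit activation), i.e. a simple two-neuron-layer classifier.
    """
    return 1 if np.dot(weights, x) + bias > 0 else 0

# Example: weights chosen by hand to compute logical AND (linearly separable).
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron_output(np.array(x), w, b))
```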

Page 4: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

• Otherwise known as MLPs
• Adds an additional layer (or layers) of neurons to a perceptron
• Additional layer called hidden (or intermediate) layer
• Additional layer of adjustable connections

Page 5: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

• Proposed mid-1960s, but learning algorithms not available until mid-1980s
• Continuous inputs
• Continuous outputs
• Continuous activation functions used
  o e.g. sigmoid, tanh
• Feed forward networks (forward-pass sketch below)
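
A minimal sketch of the feed-forward pass through an MLP with one hidden layer, using the sigmoid activation mentioned above; the layer sizes, names and random weights are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    # Continuous activation function: squashes any input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W_hidden, b_hidden, W_out, b_out):
    """Feed forward: input layer -> hidden (intermediate) layer -> output layer.

    W_hidden has shape (n_hidden, n_inputs); W_out has shape (n_outputs, n_hidden).
    Both connection layers are adjustable during training.
    """
    hidden = sigmoid(W_hidden @ x + b_hidden)
    return sigmoid(W_out @ hidden + b_out)

# Example: 2 inputs, 3 hidden neurons, 1 continuous output, random weights.
rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])
y = mlp_forward(x, rng.normal(size=(3, 2)), rng.normal(size=3),
                rng.normal(size=(1, 3)), rng.normal(size=1))
print(y)
```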

Page 6: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

Page 7: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

Page 8: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

• MLPs can also have biases
• A bias is an additional, external input
• Input to bias node is always one
• Bias performs the task of a threshold (example below)
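
A small illustration of the bias idea above: appending a constant input of one lets the bias weight play the role of the threshold. The numbers here are made up for the example:

```python
import numpy as np

# w . x + b gives the same result as [w, b] . [x, 1]
x = np.array([0.2, 0.7])
w = np.array([1.5, -0.8])
b = 0.3

x_with_bias = np.append(x, 1.0)   # input to the bias node is always one
w_with_bias = np.append(w, b)     # the bias weight acts as the threshold

print(np.dot(w, x) + b)                  # explicit bias
print(np.dot(w_with_bias, x_with_bias))  # bias as an extra input: same value
```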

Page 9: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

Page 10: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Multi-Layer Perceptrons

• Able to model non-linearly separable functions
• Each hidden neuron is equivalent to a perceptron
• Each hidden neuron adds a hyperplane to the problem space
• Requires sufficient neurons to partition the input space appropriately (XOR sketch below)
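
As an illustration of hidden neurons adding hyperplanes, below is a hand-built (not trained) two-hidden-neuron network for XOR, the classic non-linearly-separable problem; the weights are chosen by hand so that each hidden neuron draws one line in the input space and the output neuron combines the two half-planes:

```python
import numpy as np

def step(z):
    # hard-limit activation, used here for clarity instead of sigmoid
    return (z > 0).astype(float)

# Each hidden neuron is equivalent to a perceptron with its own hyperplane:
# h1 fires for x1 OR x2, h2 fires for x1 AND x2.
W_hidden = np.array([[1.0, 1.0],     # h1: x1 + x2 - 0.5 > 0
                     [1.0, 1.0]])    # h2: x1 + x2 - 1.5 > 0
b_hidden = np.array([-0.5, -1.5])

# Output neuron fires when h1 is on but h2 is off -> XOR.
W_out = np.array([1.0, -2.0])
b_out = -0.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = step(W_hidden @ np.array(x, dtype=float) + b_hidden)
    y = step(np.array([W_out @ h + b_out]))
    print(x, int(y[0]))
```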

Page 11: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Terminology

• Some define MLP according to the number of connection layers
• Others define MLP according to the number of neuron layers
• Some don’t count the input neuron layer
  o doesn’t perform processing
• Some refer to the number of hidden neuron layers

Page 12: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Terminology

• Best to specify which
  o i.e. “three neuron layer MLP”, or
  o “two connection layer MLP”, or
  o “three neuron layer, one hidden layer MLP”

Page 13: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Advantages

• An MLP with one hidden layer of sufficient size can approximate any continuous function to any desired accuracy (form sketched below)
  o Kolmogorov theorem
• MLPs can learn conditional probabilities
• MLPs are multivariate non-linear regression models
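
As a sketch of the single-hidden-layer form behind this approximation claim (the symbols H, v_j, w_j and b_j are introduced here for illustration and are not in the slides):

```latex
f(\mathbf{x}) \approx \sum_{j=1}^{H} v_j \, \sigma\!\left(\mathbf{w}_j \cdot \mathbf{x} + b_j\right)
```

Here σ is a continuous activation function such as sigmoid or tanh, and H is the number of hidden neurons.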

Page 14: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Advantages

• Learning models
  o several ways of training an MLP
  o backpropagation is the most common (sketch below)
• Universal function approximators
• Able to generalise to new data
  o can accurately identify / model previously unseen examples
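
A minimal sketch of one backpropagation update for a one-hidden-layer MLP with sigmoid activations and squared error; this is the generic textbook form rather than the lecture's exact derivation, and the names, learning rate and XOR training loop are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, target, W1, b1, W2, b2, lr=0.5):
    """One gradient-descent update on both layers of adjustable connections."""
    # Forward pass
    h = sigmoid(W1 @ x + b1)      # hidden activations
    y = sigmoid(W2 @ h + b2)      # network output

    # Backward pass: propagate the output error back to the hidden layer
    delta_out = (y - target) * y * (1 - y)
    delta_hid = (W2.T @ delta_out) * h * (1 - h)

    # Weight updates (squared-error loss, learning rate lr)
    W2 -= lr * np.outer(delta_out, h)
    b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x)
    b1 -= lr * delta_hid
    return W1, b1, W2, b2

# Example: train on XOR; outputs usually approach the targets, though
# convergence is not guaranteed for every random initialisation.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
for _ in range(5000):
    for x, t in data:
        W1, b1, W2, b2 = backprop_step(np.array(x, dtype=float), t, W1, b1, W2, b2)
for x, t in data:
    print(x, t, float(sigmoid(W2 @ sigmoid(W1 @ np.array(x, dtype=float) + b1) + b2)[0]))
```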

Page 15: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Problems with MLP

• Choosing the number of hidden layers
  o how many are enough?
  o one should be sufficient, according to the Kolmogorov Theorem
  o two will always be sufficient

Page 16: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Problems with MLP

• Choosing the number of hidden nodes
  o how many are enough?
  o usually, the number of connections in the network should be less than the number of training examples (worked count below)
• As the number of connections approaches the number of training examples, generalisation decreases
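
A worked count of connections (including bias weights) against the rule of thumb above; the layer sizes and the figure of 100 training examples are made up for illustration:

```python
# Connections in a 4-input, H-hidden, 1-output MLP, counting bias weights.
n_in, n_out, n_examples = 4, 1, 100

for n_hidden in (2, 5, 10, 20):
    n_connections = (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)
    verdict = "ok" if n_connections < n_examples else "too many"
    print(f"{n_hidden:2d} hidden neurons -> {n_connections:3d} connections "
          f"({verdict} for {n_examples} training examples)")
```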

Page 17: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Problems with MLP

• Representational capacity
• Catastrophic forgetting
• Occurs when a trained network is further trained on new data
• Network forgets what it learned about the old data
• Only knows about the new data

Page 18: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Problems with MLP

• Initialisation of weights
• Random initialisation can cause problems with training (sketch below)
  o start in a bad spot
• Can initialise using statistics of training data
  o PCA
  o Nearest neighbour
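
A minimal sketch of the plain random initialisation warned about above; a scheme based on training-data statistics (e.g. PCA) would replace the random draw with informed values. The function name, ranges and layer sizes are illustrative assumptions:

```python
import numpy as np

def init_weights(n_in, n_hidden, n_out, scale=0.5, seed=None):
    """Random initialisation: small uniform weights in [-scale, scale].

    A poor draw can start training in a bad spot; data-driven schemes
    replace this draw with values derived from the training set.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-scale, scale, size=(n_hidden, n_in))   # input -> hidden
    W2 = rng.uniform(-scale, scale, size=(n_out, n_hidden))  # hidden -> output
    return W1, W2

W1, W2 = init_weights(n_in=2, n_hidden=3, n_out=1, seed=42)
```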

Page 19: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Problems with MLP

• Weight initialisation
  o can initialise using an evolutionary algorithm
  o initialisation from decision trees
  o initialisation from rules

Page 20: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Summary

• MLPs overcome the linear separability problem of perceptrons
• Add an additional layer of neurons
• Additional layer of variable connections
• No agreement on terminology to use
  o be precise!

Page 21: Multi-Layer Perceptrons Michael J. Watts mike.watts.nz

Summary

• Convergence is guaranteed over continuous functions

• Problems exist
  o but solutions also exist