NeuralNetworks-1


  • 7/27/2019 NeuralNetworks-1


    G51IAI

Introduction to AI

Andrew Parkes

    Neural Networks 1


    Neural Networks

    AIMA

    Section 20.5 of 2003 edition

Fundamentals of Neural Networks: Architectures, Algorithms and Applications. L. Fausett, 1994

An Introduction to Neural Networks (2nd Ed). Morton, I.M., 1995


    Brief History

Try to create artificial intelligence based on the natural intelligence we know:

    The brain

    massively interconnected neurons


    G51IAI Neural Networks

    Neural Networks


    Natural Neural Networks

We are born with about 100 billion neurons

A neuron may connect to as many as 100,000 other neurons


    Natural Neural Networks

McCulloch & Pitts (1943) are generally recognised as the designers of the first neural network

Many of their ideas are still used today, e.g.

many simple units (neurons) combine to give increased computational power

the idea of a threshold


    Modelling a Neuron

a_j : activation value of unit j

w_j,i : weight on the link from unit j to unit i

in_i : weighted sum of inputs to unit i

a_i : activation value of unit i

g : activation function

in_i = Σ_j w_j,i · a_j
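The unit model above can be transcribed directly. This is a sketch, not from the slides; the function names are illustrative:

```python
def weighted_input(weights, activations):
    """in_i: weighted sum of the input activations a_j arriving at unit i."""
    return sum(w * a for w, a in zip(weights, activations))

def unit_output(weights, activations, g):
    """a_i = g(in_i) for activation function g."""
    return g(weighted_input(weights, activations))

# Example: three inputs with weights 2, 2, -1 (as in the diagrams later on)
# and a simple fire-at-threshold-2 activation.
step2 = lambda x: 1 if x >= 2 else 0
print(unit_output([2, 2, -1], [1, 1, 0], step2))  # in_i = 4, so the unit fires: 1
```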


    Activation Functions

Step_t(x) = 1 if x ≥ t, else 0 (threshold = t)

Sign(x) = +1 if x ≥ 0, else −1

Sigmoid(x) = 1/(1 + e^−x)
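The three activation functions are one-liners in Python; a direct transcription, using math.exp for the sigmoid:

```python
import math

def step(x, t):
    """Step_t(x): 1 if x >= threshold t, else 0 (hard threshold)."""
    return 1 if x >= t else 0

def sign(x):
    """Sign(x): +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^-x): smooth, output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

print(step(0.5, 1))          # 0
print(sign(-2))              # -1
print(round(sigmoid(0), 2))  # 0.5
```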


    Building a Neural Network

1. Select structure: design the way that the neurons are interconnected

2. Select weights: decide the strengths with which the neurons are interconnected

weights are selected so as to get a good match to a training set

training set: a set of inputs and desired outputs

often use a learning algorithm


    Neural Networks

Hebb (1949) developed the first learning rule

on the premise that if two neurons were active at the same time, the strength of the connection between them should be increased
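Hebb's premise can be sketched as a weight update. This is an illustrative modern formulation, not from the slides; the learning rate eta is a later addition, not part of Hebb's 1949 statement:

```python
def hebb_update(w, a_pre, a_post, eta=0.1):
    """Hebbian rule sketch: increase the weight only when the pre- and
    post-synaptic units are active (non-zero) at the same time."""
    return w + eta * a_pre * a_post

w = 0.5
w = hebb_update(w, 1, 1)  # both units active: the connection strengthens
w = hebb_update(w, 1, 0)  # post-synaptic unit inactive: no change
print(round(w, 2))  # 0.6
```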


    Neural Networks

During the 50s and 60s many researchers worked, amidst great excitement, on a particular net structure called the perceptron.

Minsky & Papert (1969) demonstrated a strong limit on the power of perceptrons; this saw the death of neural network research for about 15 years

Only in the mid 80s (Parker and LeCun) was interest revived, because of their learning algorithm for a better design of net (in fact Werbos discovered the algorithm in 1974)


Basic Neural Networks

Will first look at the simplest networks

    Feed-forward

    Signals travel in one direction through net

    Net computes a function of the inputs


The First Neural Networks

Neurons in a McCulloch-Pitts network are connected by directed, weighted paths

[Diagram: inputs X1 and X2 connect to unit Y with weight 2 each; input X3 connects to Y with weight -1]


The First Neural Networks

If the weight on a path is positive the path is excitatory, otherwise it is inhibitory


The First Neural Networks

The activation of a neuron is binary. That is, the neuron either fires (activation of one) or does not fire (activation of zero)


The First Neural Networks

For the network shown here, the activation function for unit Y is

f(y_in) = 1 if y_in ≥ θ, else 0

where y_in is the total input signal received and θ is the threshold for Y


The First Neural Networks

Originally, all excitatory connections into a particular neuron have the same weight, although differently weighted connections can be input to different neurons

Later, weights were allowed to be arbitrary


The First Neural Networks

Each neuron has a fixed threshold. If the net input into the neuron is greater than or equal to the threshold, the neuron fires


The First Neural Networks

The threshold is set such that any non-zero inhibitory input will prevent the neuron from firing


    Building Logic Gates

    Computers are built out of logic gates

Can we use neural nets to represent logical functions?

Use a threshold (step) function for the activation function

    all activation values are 0 (false) or 1 (true)


The First Neural Networks

AND Function

[Diagram: X1 and X2 connect to Y with weight 1 each]

AND
X1 X2 Y
1 1 1
1 0 0
0 1 0
0 0 0

Threshold(Y) = 2


The First Neural Networks

OR Function

[Diagram: X1 and X2 connect to Y with weight 2 each]

OR
X1 X2 Y
1 1 1
1 0 1
0 1 1
0 0 0

Threshold(Y) = 2


The First Neural Networks

AND NOT Function

[Diagram: X1 connects to Y with weight 2; X2 connects to Y with weight -1]

AND NOT
X1 X2 Y
1 1 0
1 0 1
0 1 0
0 0 0

Threshold(Y) = 2
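All three gates can be checked mechanically with one generic McCulloch-Pitts unit, using the weights and the threshold of 2 given on the slides. A minimal sketch, not from the slides:

```python
def mcp(weights, inputs, threshold=2):
    """McCulloch-Pitts unit: fire (1) iff the weighted input reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

AND     = lambda x1, x2: mcp([1, 1],  [x1, x2])   # weights 1, 1
OR      = lambda x1, x2: mcp([2, 2],  [x1, x2])   # weights 2, 2
AND_NOT = lambda x1, x2: mcp([2, -1], [x1, x2])   # weights 2, -1 (X1 AND NOT X2)

# Print the full truth tables to confirm they match the slides.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2), AND_NOT(x1, x2))
```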


Simple Networks

         AND      OR       NOT
Input 1  0 0 1 1  0 0 1 1  0 1
Input 2  0 1 0 1  0 1 0 1
Output   0 0 0 1  0 1 1 1  1 0


Simple Networks

[Diagram: a single unit with output y and threshold t = 0.0; an input x and a bias input of -1, with weights W = 1.5 and W = 1]


Perceptron

Synonym for single-layer, feed-forward network

First studied in the 50s

Other networks were known about, but the perceptron was the only one capable of learning, and thus all research was concentrated in this area


Perceptron

A single weight only affects one output, so we can restrict our investigations to a model as shown on the right

Notation can be simpler, i.e.

O = Step_0( Σ_j W_j I_j )
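The output formula O = Step_0(Σ_j W_j I_j) is a one-liner. A sketch, not from the slides; it also shows how a threshold can be folded into the weights via a fixed -1 input, as in the earlier diagram (the example numbers are illustrative):

```python
def perceptron_output(weights, inputs):
    """O = Step_0(sum_j W_j * I_j): fire iff the weighted sum is non-negative."""
    total = sum(w * i for w, i in zip(weights, inputs))
    return 1 if total >= 0 else 0

# Folding a threshold t into the weights: append weight t and a constant -1 input.
# Illustrative example: t = 1.5 with unit weights implements AND.
print(perceptron_output([1, 1, 1.5], [1, 1, -1]))  # 2 - 1.5 = 0.5 >= 0: prints 1
print(perceptron_output([1, 1, 1.5], [1, 0, -1]))  # 1 - 1.5 < 0: prints 0
```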


What can perceptrons represent?

         AND      XOR
Input 1  0 0 1 1  0 0 1 1
Input 2  0 1 0 1  0 1 0 1
Output   0 0 0 1  0 1 1 0


What can perceptrons represent?

[Diagram: the four input points (0,0), (0,1), (1,0), (1,1) plotted for AND and for XOR; for AND a single straight line separates the points with output 1 from those with output 0, but for XOR no such line exists]

Functions which can be separated in this way are called Linearly Separable

Only linearly separable functions can be represented by a perceptron

XOR cannot be represented by a perceptron
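This claim can be made concrete by brute force: search a grid of weights and thresholds for a single unit matching each function. A sketch, not from the slides; a grid search is only an illustration, not a proof over the reals:

```python
from itertools import product

def unit(w1, w2, t, x1, x2):
    """Single threshold unit: fire iff w1*x1 + w2*x2 >= t."""
    return 1 if w1 * x1 + w2 * x2 >= t else 0

def representable(target):
    """Search a coarse grid of weights/thresholds for a unit matching target
    on all four binary input pairs."""
    grid = [v / 2 for v in range(-4, 5)]  # -2.0 .. 2.0 in steps of 0.5
    for w1, w2, t in product(grid, repeat=3):
        if all(unit(w1, w2, t, x1, x2) == target(x1, x2)
               for x1, x2 in product((0, 1), repeat=2)):
            return True
    return False

print(representable(lambda a, b: a & b))  # AND: True (e.g. w1=w2=1, t=1.5)
print(representable(lambda a, b: a ^ b))  # XOR: False, no single unit works
```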


Linear Separability is also possible in more than 3 dimensions, but it is harder to visualise


XOR

XOR is not linearly separable

Cannot be represented by a perceptron

What can we do instead?

1. Convert to logic gates that can be represented by perceptrons

2. Chain together the gates

Make sure you understand the following; check it using truth tables:

X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
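The truth-table check suggested above can be done in one loop. A minimal sketch, not from the slides:

```python
def xor_via_gates(x1, x2):
    """X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1), on 0/1 values."""
    and_not = lambda a, b: 1 if (a == 1 and b == 0) else 0
    return 1 if (and_not(x1, x2) or and_not(x2, x1)) else 0

# Verify the identity against Python's built-in XOR (^) on all four input pairs.
for x1 in (0, 1):
    for x2 in (0, 1):
        assert xor_via_gates(x1, x2) == (x1 ^ x2)
        print(x1, x2, xor_via_gates(x1, x2))
```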


The First Neural Networks

XOR Function

[Diagram: X1 and X2 feed hidden units Z1 and Z2, which feed output Y. X1 → Z1 with weight 2 and X2 → Z1 with weight -1; X2 → Z2 with weight 2 and X1 → Z2 with weight -1; Z1 → Y and Z2 → Y each with weight 2; each unit has threshold 2]

XOR
X1 X2 Y
1 1 0
1 0 1
0 1 1
0 0 0

X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
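The two-layer construction can be executed directly. A sketch assuming the standard wiring (weights 2 and -1, threshold 2 throughout, matching the AND NOT and OR gates earlier):

```python
def fires(weighted_sum, threshold=2):
    """McCulloch-Pitts firing rule: 1 iff the weighted sum reaches the threshold."""
    return 1 if weighted_sum >= threshold else 0

def xor_net(x1, x2):
    """Two-layer net: Z1 = X1 AND NOT X2, Z2 = X2 AND NOT X1, Y = Z1 OR Z2."""
    z1 = fires(2 * x1 + (-1) * x2)   # X1 AND NOT X2
    z2 = fires(2 * x2 + (-1) * x1)   # X2 AND NOT X1
    y  = fires(2 * z1 + 2 * z2)      # Z1 OR Z2
    return y

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))  # outputs match the XOR truth table
```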


Single- vs. Multiple-Layers

Once we chain together the gates we have hidden layers: layers that are hidden from the output lines

Have just seen that hidden layers allow us to represent XOR

The perceptron is single-layer

Multiple layers increase the representational power, so e.g. they can represent XOR

Generally, useful nets have multiple layers, typically 2-4 layers


Expectations

Be able to explain the terminology used, e.g.

activation functions
step and threshold functions
perceptron
feed-forward
multi-layer, hidden layers
linear separability
XOR
why perceptrons cannot cope with XOR
how XOR is possible with hidden layers


    Questions?