Page 1

Neural Networks

• Resources
  – Chapter 19, textbook
    • Sections 19.1-19.6

Page 2

Neuroanatomy Metaphor

• Neural networks (aka connectionist, PDP, artificial neural networks, ANN)
  – Rough approximation to the animal nervous system
  – See systems such as NEURON for modeling at more biological levels of detail; http://neuron.duke.edu/

• Neuron components in brains
  – Soma (cell body); dendritic tree
  – Axon: sends signal downstream
  – Synapses
    • Receive incoming signals from upstream neurons
    • Connections on dendrites, cell body, axon, synapses
    • Neurotransmitter mechanisms

Page 3

‘a’ or ‘the’ brain?

• Are we using computer models of neurons to model ‘the’ brain or model ‘a’ brain?

Page 4

Neuron Firing Process

1) Synapses receive incoming signals, which change the electrical (ionic) potential of the cell body

2) When the potential of the cell body reaches some limit, the neuron "fires", and an electrical signal (action potential) is sent down the axon

3) The axon propagates the signal downstream to other neurons

Page 5

What is represented by a biological neuron?

• Cell body sums electrical potentials from incoming signals
  – Serves as an accumulator function over time
  – But "as a rule many impulses must reach a neuron almost simultaneously to make it fire" (p. 33, Brodal, 1992; italics added)

• Synapses have varying effects on cell potential
  – Synaptic strength

Page 6

ANN (Artificial Neural Nets)

• Approximation of biological neural nets by ANNs
  – No direct model of the accumulator function
  – Synaptic strength
    • Approximated with connection weights (real numbers)
  – Spiking of output
    • Approximated with non-linear activation functions

• Neural units
  – Represent activation values (numbers)
  – Represent inputs and outputs (numbers)

Page 7

Graphical Notation & Terms

• Circles
  – Are neural units
  – Metaphor for the nerve cell body

• Arrows
  – Represent synaptic connections from one unit to another
  – These are often called weights and are represented with a single value (e.g., a real value)

[Figure: two layers of neural units ("One layer of neural units", "Another layer of neural units") connected by arrows]

Page 8

Another Example: 8 units in each layer, fully connected network

Page 9

Units & Weights

• Units
  – Sometimes notated with unit numbers

• Weights
  – Sometimes given by symbols
  – Sometimes given by numbers
  – Always represent numbers
  – May be boolean valued or real valued

[Figure: two copies of a network in which units 1-4 feed a unit in the next layer; on the left the weights are given as numbers (0.3, -0.1, 2.1, -1.1), on the right as symbols (W1,1, W1,2, W1,3, W1,4)]

Page 10

Computing with Neural Units

• Inputs are presented to input units
• How do we generate outputs?
• One idea
  – Summed weighted inputs

[Figure: units 1-4 feed an output unit with weights 0.3, -0.1, 2.1, -1.1]

Input: (3, 1, 0, -2)

Processing:
3(0.3) + 1(-0.1) + 0(2.1) + (-2)(-1.1)
= 0.9 + (-0.1) + 0 + 2.2

Output: 3
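A minimal sketch of this computation in Python (variable names are illustrative):

    # Summed weighted input: the dot product of the weight and input vectors
    weights = [0.3, -0.1, 2.1, -1.1]
    inputs = [3, 1, 0, -2]

    output = sum(w * x for w, x in zip(weights, inputs))
    print(round(output, 2))  # 3.0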

Page 11

Activation Functions

• Usually, we don't just use the weighted sum directly
• Apply some function to the weighted sum before it is used (e.g., as output)
• Call this the activation function
• A step function could be a good simulation of a biological neuron spiking

$$f(x) = \begin{cases} 1 & \text{if } x \ge \theta \\ 0 & \text{if } x < \theta \end{cases}$$

θ is called the threshold

Page 12

Step Function Example

• Let θ = 4

[Figure: the same network (weights 0.3, -0.1, 2.1, -1.1) with input (3, 1, 0, -2); the summed weighted input is 3]

f(3) = 0, since 3 < θ = 4
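A sketch of the step activation (θ passed as a parameter; names are illustrative):

    def step(x, theta=4):
        # Fire (output 1) only when the summed input reaches the threshold
        return 1 if x >= theta else 0

    print(step(3))  # 0: the summed weighted input 3 is below the threshold 4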

Page 13

Another Activation Function: The Sigmoidal

• The math of some neural nets requires that the activation function be continuously differentiable
• A sigmoidal function is often used to approximate the step function

$$f(x) = \frac{1}{1 + e^{-\sigma x}}$$

σ is the steepness parameter

Page 14

Sigmoidal Example

[Figure: the same network, weights 0.3, -0.1, 2.1, -1.1]

$$f(x) = \frac{1}{1 + e^{-x}}$$

Input: (3, 1, 0, -2)

$$f(3) = \frac{1}{1 + e^{-3}} \approx 0.95$$
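A minimal sketch in Python, assuming steepness σ = 1 as on the slide:

    import math

    def sigmoid(x, steepness=1.0):
        # Smooth, differentiable approximation of the step function
        return 1.0 / (1.0 + math.exp(-steepness * x))

    print(round(sigmoid(3), 2))  # 0.95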

Page 15

Sigmoidal

[Plot: 1/(1+exp(-x)) and 1/(1+exp(-10x)) for x from -5 to 5, with y ranging from 0 to 1; the steeper curve (σ = 10) closely approximates the step function]

Page 16

Another Example

• A two-weight-layer, feedforward network
• Two inputs, one output, one 'hidden' unit

[Figure: the two inputs feed the hidden unit with weights 0.5 and -0.5; the hidden unit feeds the output unit with weight 0.75]

$$f(x) = \frac{1}{1 + e^{-x}}$$

Input: (3, 1)

What is the output?

Page 17

Computing in Multilayer Networks

• Start at the leftmost layer
  – Compute activations based on the inputs
• Then work from left to right, using the computed activations as inputs to the next layer
• Example solution (a code sketch follows)
  – Activation of the hidden unit:
    f(0.5(3) + (-0.5)(1)) = f(1.5 - 0.5) = f(1) = 0.731
  – Output activation:
    f(0.731(0.75)) = f(0.548) = 0.634

$$f(x) = \frac{1}{1 + e^{-x}}$$
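A sketch of this forward pass in Python (structure and numbers from the slide; names are illustrative):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    x1, x2 = 3, 1                            # inputs
    hidden = sigmoid(0.5 * x1 + -0.5 * x2)   # f(1) = 0.731
    output = sigmoid(0.75 * hidden)          # f(0.548) = 0.634
    print(round(hidden, 3), round(output, 3))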

Page 18

Notation for Weighted Sums

W_{i,j}: weight (scalar) from unit j in the left layer to unit i in the right layer

a_{k,l}: activation value of unit k in layer l; layers increase in number from left to right

$$a_{k,l} = f\left(\sum_{i=1}^{n} W_{k,i}\, a_{i,l-1}\right)$$

[Figure: left-layer units a_{1,1}, a_{2,1}, a_{3,1}, a_{4,1} connected by weights W_{1,1}, W_{1,2}, W_{1,3}, W_{1,4} to unit a_{1,2}]

Page 19

Notation

$$a_{1,2} = f\left(\sum_{j=1}^{n} W_{1,j}\, a_{j,1}\right) = f(W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1})$$

[Figure: the same four left-layer units and weights feeding unit a_{1,2}]

Page 20

Notation

W_i: row vector of incoming weights for unit i

a_i: column vector of activation values of the units connected to unit i

Page 21

Example

$$W_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix} \qquad a_1 = \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ a_{3,1} \\ a_{4,1} \end{bmatrix}$$

$$W_1 a_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix} \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ a_{3,1} \\ a_{4,1} \end{bmatrix}$$

Recall: multiplying an n×r matrix by an r×m matrix produces an n×m matrix C, where each element C_{i,j} is the scalar product of row i of the left matrix and column j of the right matrix

[Figure: the same four left-layer units and weights feeding unit a_{1,2}]

Page 22

Scalar Result: Summed Weighted Input

$$W_1 a_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix} \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ a_{3,1} \\ a_{4,1} \end{bmatrix} = W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1}$$

A 1×4 row vector times a 4×1 column vector gives a 1×1 matrix (a scalar)

Page 23

Computing New Activation Value

In the general case:

$$a = f(W_i a_i)$$

where f(x) is the activation function, e.g., the sigmoid function

For the case we were considering:

$$a = f(W_1 a_1) = f(W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1})$$

Page 24

Example

• Compute the output value

• Draw the corresponding ANN

$$f\left(\begin{bmatrix} 0.4 & 0.5 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\right)$$

Page 25

ANN Solving the Equality Problem for 2 Bits

[Figure: network architecture with input units x1, x2, hidden units y1, y2, and output unit z1]

Goal outputs:

x1  x2  z1
0   0   1
0   1   0
1   0   0
1   1   1

What weights solve this problem?

Page 26

Approximate Solution

http://www.d.umn.edu/~cprince/courses/cs5541fall02/lectures/neural-networks/

[Figure: the same x1, x2 → y1, y2 → z1 architecture]

Actual network results:

x1  x2  z1
0   0   .925
0   1   .192
1   0   .19
1   1   .433

Weights:

w_x1_y1    w_x1_y2    w_x2_y1    w_x2_y2
-1.8045    -7.7299    -1.8116    -7.6649

w_y1_z1    w_y2_z1
-10.3022   15.3298
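A sketch that runs this network with the weights above; it assumes sigmoid activations and no bias terms (an assumption, though it reproduces the table to within rounding):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def net(x1, x2):
        # Hidden layer
        y1 = sigmoid(-1.8045 * x1 + -1.8116 * x2)
        y2 = sigmoid(-7.7299 * x1 + -7.6649 * x2)
        # Output layer
        return sigmoid(-10.3022 * y1 + 15.3298 * y2)

    for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x1, x2, round(net(x1, x2), 3))  # .925, .192, .19, .433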

Page 27

How well did this approximate the goal function?

• Categorically
  – For inputs x1=0, x2=0 and x1=1, x2=1, the output of the network was always greater than for inputs x1=1, x2=0 and x1=0, x2=1

• Summed squared error

$$\sum_{s=1}^{\mathit{numTrainSamples}} (\mathit{ActualOutput}_s - \mathit{DesiredOutput}_s)^2$$

Page 28

• Compute the summed squared error for our example

$$\sum_{s=1}^{\mathit{numTrainSamples}} (\mathit{ActualOutput}_s - \mathit{DesiredOutput}_s)^2$$

x1  x2  z1
0   0   .925
0   1   .192
1   0   .19
1   1   .433

Page 29

Solution

x1  x2  Expected z1  Actual z1  Squared error
0   0   1            0.925      0.005625
0   1   0            0.192      0.036864
1   0   0            0.19       0.0361
1   1   1            0.433      0.321489

Sum squared error = 0.400078
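The same calculation as a short sketch (pairs taken from the table above):

    # (actual output, desired output) for each training sample
    samples = [(0.925, 1), (0.192, 0), (0.19, 0), (0.433, 1)]

    sse = sum((actual - desired) ** 2 for actual, desired in samples)
    print(round(sse, 6))  # 0.400078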

Page 30

Weight Matrix

• A row vector provides the weights for a single unit in the "right" layer

• A weight matrix provides all weights connecting the "left" layer to the "right" layer

• Let W be an n×r weight matrix
  – Row vector i in the matrix connects unit i in the "right" layer to the units in the "left" layer
  – n units in the layer to the "right"
  – r units in the layer to the "left"

Page 31

Notation

a_i: the vector of activation values of the layer to the "left"; an r×1 column vector (same as before)

W a_i: an n×1 column vector; the summed weighted inputs for the "right" layer

f(W a_i): an n×1 column vector; the new activation values for the "right" layer

The function f is now taken as applying elementwise to a matrix
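A minimal NumPy sketch of a whole-layer update with these shapes (the weight and activation values here are made up for illustration):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))  # NumPy applies this elementwise

    # n = 3 units on the "right", r = 4 units on the "left"
    W = np.array([[0.1, -0.2, 0.4, 0.0],
                  [0.3,  0.1, -0.5, 0.2],
                  [-0.1, 0.6, 0.2, -0.3]])  # n×r weight matrix (example values)
    a = np.array([[3], [1], [0], [-2]])     # r×1 left-layer activations

    new_a = sigmoid(W @ a)                  # n×1 new "right"-layer activations
    print(new_a.shape)                      # (3, 1)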

Page 32

Example

Updating hidden layer activation values: compute f(W a), where W is the 5×2 matrix of input-to-hidden weights and a is the 2×1 input vector

Updating output activation values: compute f(W' a'), where W' is the 3×5 matrix of hidden-to-output weights and a' is the 5×1 vector of hidden activation values

Draw the architecture (units and arcs representing weights) of the connectionist model

Page 33

Answer

• 2 input units

• 5 hidden layer units

• 3 output units

• Fully connected, feedforward network

Page 34

Bias Weights

• Used to provide a trainable threshold

[Figure: a unit with incoming weights W1,1, W1,2, W1,3, W1,4 plus a bias weight b from an extra unit whose activation is the constant 1]

b is treated as just another weight, but it is connected to a unit with a constant activation value
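A sketch of the bias-as-weight idea (the sigmoid activation and the specific values are illustrative assumptions):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    weights = [0.3, -0.1, 2.1, -1.1]
    inputs = [3, 1, 0, -2]
    bias = -4.0               # plays the role of a trainable threshold

    # The bias is just one more weight whose input unit is always 1
    summed = sum(w * x for w, x in zip(weights, inputs)) + bias * 1
    print(round(sigmoid(summed), 3))  # sigmoid(3 - 4) = sigmoid(-1) ≈ 0.269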