Neural Networks
• Resources
  – Chapter 19, textbook (Sections 19.1-19.6)
Neuroanatomy Metaphor
• Neural networks (aka connectionist, PDP, artificial neural networks, ANN)
  – Rough approximation to the animal nervous system
  – See systems such as NEURON for modeling at more biological levels of detail; http://neuron.duke.edu/
• Neuron components in brains
  – Soma (cell body); dendritic tree
  – Axon: sends signal downstream
  – Synapses
    • Receive incoming signals from upstream neurons
    • Connections on dendrites, cell body, axon, synapses
    • Neurotransmitter mechanisms
‘a’ or ‘the’ brain?
• Are we using computer models of neurons to model ‘the’ brain or model ‘a’ brain?
Neuron Firing Process
1) Synapses receive incoming signals, changing the electrical (ionic) potential of the cell body
2) When the potential of the cell body reaches some limit, the neuron "fires": an electrical signal (action potential) is sent down the axon
3) The axon propagates the signal to downstream neurons
What is represented by a biological neuron?
• Cell body sums electrical potentials from incoming signals
  – Serves as an accumulator function over time
  – But "as a rule many impulses must reach a neuron almost simultaneously to make it fire" (p. 33, Brodal, 1992; italics added)
• Synapses have varying effects on cell potential
  – Synaptic strength
ANN (Artificial Neural Nets)
• Approximation of biological neural nets by ANNs
  – No direct model of the accumulator function
  – Synaptic strength: approximated with connection weights (real numbers)
  – Spiking of output: approximated with non-linear activation functions
• Neural units
  – Represent activation values (numbers)
  – Represent inputs and outputs (numbers)
Graphical Notation & Terms
• Circles
  – Are neural units
  – Metaphor for the nerve cell body
• Arrows
  – Represent synaptic connections from one unit to another
  – These are often called weights and represented with a single value (e.g., a real number)
[Figure: one layer of neural units connected by arrows to another layer of neural units]
Another Example: 8 units in each layer, fully connected network
Units & Weights
• Units
  – Sometimes notated with unit numbers
• Weights
  – Sometimes given by symbols
  – Sometimes given by numbers
  – Always represent numbers
  – May be boolean-valued or real-valued
[Figure: left, units numbered 1-4 connected to a single unit with example weight values 0.3, -0.1, 2.1, -1.1; right, the same network with the weights labeled symbolically W1,1, W1,2, W1,3, W1,4]
Computing with Neural Units
• Inputs are presented to input units
• How do we generate outputs?
• One idea
  – Summed weighted inputs
[Figure: input units 1-4 feed a single output unit with weights 0.3, -0.1, 2.1, -1.1]

Input: (3, 1, 0, -2)
Processing: 3(0.3) + 1(-0.1) + 0(2.1) + (-2)(-1.1)
= 0.9 + (-0.1) + 0 + 2.2 = 3
Output: 3
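A minimal sketch of this computation in Python (the function name weighted_sum is ours, not from the slides):

```python
def weighted_sum(inputs, weights):
    """Summed weighted input: the dot product of inputs and weights."""
    return sum(x * w for x, w in zip(inputs, weights))

# The example above: input (3, 1, 0, -2), weights 0.3, -0.1, 2.1, -1.1
print(weighted_sum([3, 1, 0, -2], [0.3, -0.1, 2.1, -1.1]))  # 3.0 (up to float rounding)
```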
Activation Functions
• Usually, we don't just use the weighted sum directly
• Apply some function to the weighted sum before it is used (e.g., as output)
• Call this the activation function
• A step function could be a good simulation of a biological neuron spiking

$$f(x) = \begin{cases} 1 & \text{if } x \ge \theta \\ 0 & \text{if } x < \theta \end{cases}$$

θ is called the threshold
Step Function Example
• Let θ = 4

[Figure: same network as before, weights 0.3, -0.1, 2.1, -1.1, with threshold θ = 4 at the output unit]

Input: (3, 1, 0, -2); summed weighted input: 3
f(3) = 0 (since 3 < θ)
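A small sketch of the step activation in Python, assuming the threshold convention above (the names step and theta are ours):

```python
def step(x, theta):
    """Step activation: fires (1) iff the summed input reaches the threshold."""
    return 1 if x >= theta else 0

# Summed weighted input 3 with threshold theta = 4 gives no spike
print(step(3, theta=4))  # 0
```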
Another Activation Function: The Sigmoidal
• The math of some neural nets requires that the activation function be continuously differentiable
• A sigmoidal function is often used to approximate the step function

$$f(x) = \frac{1}{1 + e^{-\sigma x}}$$

σ is the steepness parameter
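A sketch of the sigmoidal in Python, with the steepness parameter σ exposed (the parameter name sigma is ours):

```python
import math

def sigmoid(x, sigma=1.0):
    """Sigmoidal activation: 1 / (1 + e^(-sigma * x))."""
    return 1.0 / (1.0 + math.exp(-sigma * x))

print(sigmoid(3))            # ~0.9526, close to the step function's 1
print(sigmoid(3, sigma=10))  # ~1.0; larger sigma better approximates the step function
```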
Sigmoidal Example

[Figure: same network, weights 0.3, -0.1, 2.1, -1.1]

$$f(x) = \frac{1}{1 + e^{-x}}$$

Input: (3, 1, 0, -2)

$$f(3) = \frac{1}{1 + e^{-3}} \approx 0.95$$
[Plot: the sigmoidal for x in roughly [-5, 5], comparing 1/(1+exp(-x)) with the steeper 1/(1+exp(-10*x))]
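A short script that should reproduce a plot like this one (assuming NumPy and matplotlib are available):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
plt.plot(x, 1 / (1 + np.exp(-x)), label="1/(1+exp(-x))")
plt.plot(x, 1 / (1 + np.exp(-10 * x)), label="1/(1+exp(-10*x))")
plt.legend()
plt.show()  # the steeper sigmoid closely approximates the step function
```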
Another Example
• A two-weight-layer, feedforward network
• Two inputs, one output, one 'hidden' unit

[Figure: the two inputs connect to the hidden unit with weights 0.5 and -0.5; the hidden unit connects to the output unit with weight 0.75]

$$f(x) = \frac{1}{1 + e^{-x}}$$

Input: (3, 1)
What is the output?
Computing in Multilayer Networks
• Start at the leftmost layer
  – Compute activations based on inputs
• Then work from left to right, using computed activations as inputs to the next layer
• Example solution, with f(x) = 1/(1 + e^{-x})
  – Activation of hidden unit:
    f(0.5(3) + (-0.5)(1)) = f(1.5 - 0.5) = f(1) = 0.731
  – Output activation:
    f(0.731(0.75)) = f(0.548) = 0.634
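A sketch of this left-to-right computation for the two-layer example above (assuming sigmoid activations throughout; the variable names are ours):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Layer 1: two inputs -> one hidden unit (weights 0.5, -0.5)
hidden = sigmoid(0.5 * 3 + (-0.5) * 1)   # f(1) ~= 0.731

# Layer 2: hidden unit -> output unit (weight 0.75)
output = sigmoid(hidden * 0.75)          # f(0.548) ~= 0.634

print(hidden, output)
```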
Notation for Weighted Sums
$W_{i,j}$ — weight (scalar) from unit j in the left layer to unit i in the right layer
$a_{k,l}$ — activation value of unit k in layer l; layers increase in number from left to right

$$a_{k,l} = f\left(\sum_{i=1}^{n} W_{k,i}\, a_{i,l-1}\right)$$

[Figure: units with activations a_{1,1}, a_{2,1}, a_{3,1}, a_{4,1} connect to a unit with activation a_{1,2} via weights W_{1,1}, W_{1,2}, W_{1,3}, W_{1,4}]

Notation

$$a_{1,2} = f\left(\sum_{j=1}^{n} W_{1,j}\, a_{j,1}\right) = f(W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1})$$
Notation
$\mathbf{W}_i$ — row vector of incoming weights for unit i
$\mathbf{a}_i$ — column vector of activation values of units connected to unit i
Example
$$\mathbf{W}_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix}, \qquad \mathbf{a}_1 = \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ a_{3,1} \\ a_{4,1} \end{bmatrix}$$

$$\mathbf{W}_1 \mathbf{a}_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix} \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ a_{3,1} \\ a_{4,1} \end{bmatrix}$$
Recall: multiplying an n×r matrix with an r×m matrix produces an n×m matrix C, where each element C_{i,j} is the scalar product of row i of the left matrix and column j of the right matrix.
[Figure: same network as above, with activations a_{1,1} ... a_{4,1}, weights W_{1,1} ... W_{1,4}, and output activation a_{1,2}]

Scalar Result: Summed Weighted Input

$$\mathbf{W}_1 \mathbf{a}_1 = W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1}$$

A 1×4 row vector times a 4×1 column vector gives a 1×1 matrix (a scalar).
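A NumPy sketch of this row-vector-times-column-vector product (the numeric values reuse the earlier example and are just for illustration):

```python
import numpy as np

W1 = np.array([[0.3, -0.1, 2.1, -1.1]])  # 1x4 row vector of weights
a1 = np.array([[3], [1], [0], [-2]])     # 4x1 column vector of activations

summed = W1 @ a1                         # 1x1 matrix
print(summed)          # [[3.]]
print(summed.item())   # 3.0, the scalar summed weighted input
```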
Computing New Activation Value
In the general case:

$$a = f(\mathbf{W}_i \mathbf{a}_i)$$

where f(x) is the activation function, e.g., the sigmoid function.

For the case we were considering:

$$a = f(\mathbf{W}_1 \mathbf{a}_1) = f(W_{1,1}a_{1,1} + W_{1,2}a_{2,1} + W_{1,3}a_{3,1} + W_{1,4}a_{4,1})$$
Example
• Compute the output value
• Draw the corresponding ANN
$$f\left(\begin{bmatrix} 0.4 & 0.5 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\right)$$
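A quick NumPy check for this exercise, assuming the sigmoid activation from the earlier slides:

```python
import numpy as np

W = np.array([[0.4, 0.5, -1.0]])     # 1x3 weight row vector
a = np.array([[1.0], [2.0], [3.0]])  # 3x1 activation column vector

z = (W @ a).item()           # 0.4 + 1.0 - 3.0 = -1.6
print(1 / (1 + np.exp(-z)))  # ~0.168
```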
ANN Solving the Equality Problem for 2 Bits
[Figure: network architecture with inputs x1, x2, hidden units y1, y2, and output z1]

Goal outputs:

x1  x2  z1
0   0   1
0   1   0
1   0   0
1   1   1

What weights solve this problem?
Approximate Solution
http://www.d.umn.edu/~cprince/courses/cs5541fall02/lectures/neural-networks/

[Figure: same architecture, with inputs x1, x2, hidden units y1, y2, and output z1]

Actual network results:

x1  x2  z1
0   0   .925
0   1   .192
1   0   .19
1   1   .433

Weights:

w_x1_y1   w_x1_y2   w_x2_y1   w_x2_y2
-1.8045   -7.7299   -1.8116   -7.6649

w_y1_z1   w_y2_z1
-10.3022  15.3298
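A sketch that reproduces these results, assuming sigmoid activations and no bias weights (an assumption, but one consistent with the numbers above):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Weight matrices from the slide: hidden layer (y) and output layer (z)
W_hidden = np.array([[-1.8045, -1.8116],    # y1 <- x1, x2
                     [-7.7299, -7.6649]])   # y2 <- x1, x2
W_out = np.array([[-10.3022, 15.3298]])     # z1 <- y1, y2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y = sigmoid(W_hidden @ np.array([x1, x2]))
    z = sigmoid(W_out @ y)
    print(x1, x2, round(z.item(), 3))  # 0.925, 0.192, 0.19, 0.433
```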
How well did this approximate the goal function?
• Categorically
  – For inputs x1=0, x2=0 and x1=1, x2=1, the output of the network was always greater than for inputs x1=1, x2=0 and x1=0, x2=1
• Summed squared error

$$\sum_{s=1}^{\mathit{numTrainSamples}} (\mathit{ActualOutput}_s - \mathit{DesiredOutput}_s)^2$$

• Compute the summed squared error for our example

x1  x2  z1
0   0   .925
0   1   .192
1   0   .19
1   1   .433
Solution
            Expected  Actual
x1  x2      z1        z1       squared error
0   0       1         0.925    0.005625
0   1       0         0.192    0.036864
1   0       0         0.19     0.0361
1   1       1         0.433    0.321489

Sum squared error = 0.400078
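A one-line sketch of this computation (values taken from the table above):

```python
desired = [1, 0, 0, 1]
actual = [0.925, 0.192, 0.19, 0.433]

# Summed squared error over the four training samples
sse = sum((a - d) ** 2 for a, d in zip(actual, desired))
print(sse)  # ~0.400078
```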
Weight Matrix
• A row vector provides the weights for a single unit in the "right" layer
• A weight matrix provides all weights connecting the "left" layer to the "right" layer
• Let W be an n×r weight matrix
  – Row vector i in the matrix connects unit i in the "right" layer to the units in the "left" layer
  – n units in the layer to the "right"
  – r units in the layer to the "left"
Notation
$\mathbf{a}_i$ — the vector of activation values of the layer to the "left"; an r×1 column vector (same as before)
$W\mathbf{a}_i$ — n×1 column vector; summed weighted inputs for the "right" layer
$f(W\mathbf{a}_i)$ — n×1 column vector; new activation values for the "right" layer

Function f is now taken as applying elementwise to the matrix.
Example

Updating hidden layer activation values:

$$\mathbf{a}_{\text{hidden}} = f(W \mathbf{a}_{\text{in}})$$

[Worked example: f applied to the 5×2 hidden-layer weight matrix times the 2×1 input activation vector]

Updating output activation values:

$$\mathbf{a}_{\text{out}} = f(W' \mathbf{a}_{\text{hidden}})$$

[Worked example: f applied to the 3×5 output-layer weight matrix times the 5×1 hidden activation vector]
Draw the architecture (units and arcs representing weights) of the connectionist model
Answer
• 2 input units
• 5 hidden layer units
• 3 output units
• Fully connected, feedforward network
Bias Weights
• Used to provide a trainable threshold

[Figure: a unit with incoming weights W1,1 ... W1,4 plus a bias weight b from an extra unit whose activation is the constant 1]

b is treated as just another weight, but one connected to a unit with a constant activation value of 1.
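A sketch of the bias trick, assuming the constant-1 input convention shown above (the bias value here is hypothetical, chosen only for illustration):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def unit_output(inputs, weights, bias):
    """The bias acts as one more weight whose input is always 1."""
    extended_inputs = inputs + [1.0]      # append the constant-1 unit
    extended_weights = weights + [bias]   # append the bias weight b
    z = sum(x * w for x, w in zip(extended_inputs, extended_weights))
    return sigmoid(z)

# Earlier example weights with a hypothetical bias of -4.0:
# summed input 3 shifts to 3 - 4 = -1, so the output drops to ~0.269
print(unit_output([3, 1, 0, -2], [0.3, -0.1, 2.1, -1.1], bias=-4.0))
```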