
CS 4700: Foundations of Artificial Intelligence
Prof. Carla P. Gomes
[email protected]

Module: Neural Networks: Concepts
(Reading: Chapter 20.5)


Basic Concepts

Neural Network

[Figure: inputs Input 0 ... Input n feeding a network that produces outputs Output 0 ... Output m.]

A neural network maps a set of inputs to a set of outputs; the number of inputs and outputs is variable. The network itself is composed of an arbitrary number of nodes, or units, connected by links, with an arbitrary topology. A link from unit i to unit j serves to propagate the activation a_i to j, and it has a weight W_{i,j}.

What can a neural network do?

Compute a known function / Approximate an unknown function

Pattern Recognition / Signal Processing

Learn to do any of the above


Different types of nodes


[Figure: a node with input edges and output edges, each edge carrying a weight; weights can be positive or negative, and change over time through learning.]

An Artificial Neuron (Node or Unit): A Mathematical Abstraction

An artificial neuron, node, or unit is a processing element i producing an output based on a function of its inputs.

Input function (in_i): the weighted sum of the unit's inputs, including the fixed input a_0:

    in_i = Σ_{j=0}^{n} W_{j,i} a_j

Activation function (g): applied to the input function (typically non-linear).

Output:

    a_i = g(in_i) = g( Σ_{j=0}^{n} W_{j,i} a_j )

Note: the fixed input a_0 = -1 and its bias weight W_{0,i} are a convention; some authors use, e.g., a_0 = 1 and weight -W_{0,i} instead.
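As a minimal sketch of this abstraction (assuming NumPy; the helper name unit_output is mine, not the slides'):

```python
import numpy as np

def unit_output(weights, activations, g):
    """Compute a_i = g(in_i) for a single unit i.

    weights:     W_{0,i} .. W_{n,i}, including the bias weight W_{0,i}
    activations: a_0 .. a_n, where a_0 is the fixed input (-1 by the
                 slides' convention)
    g:           the activation function
    """
    in_i = np.dot(weights, activations)   # in_i = Σ_j W_{j,i} a_j
    return g(in_i)
```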


Activation Functions

(a) Step or threshold function: outputs 1 when the input is positive, 0 otherwise.
(b) Sigmoid (or logistic) function: key advantage is that it is differentiable.
(c) Sign function: outputs +1 if the input is positive, -1 otherwise.

These functions have a threshold (either hard or soft) at zero; changing the bias weight W_{0,i} moves the threshold location.
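As a sketch, the three activation functions in Python (vectorized with NumPy; treating an input of exactly zero as "firing" in the step function is a convention I chose, to match the gate examples below):

```python
import numpy as np

def step(x):
    # (a) hard threshold: 1 when the input is non-negative, 0 otherwise
    return np.where(x >= 0, 1, 0)

def sigmoid(x):
    # (b) logistic function: a soft threshold at zero, differentiable
    return 1.0 / (1.0 + np.exp(-x))

def sign(x):
    # (c) +1 if the input is positive, otherwise -1
    return np.where(x > 0, 1, -1)
```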


Threshold Activation Function

A threshold unit fires when the weighted sum of its inputs, including the fixed input a_0, is non-negative:

    in_i = Σ_{j=0}^{n} W_{j,i} a_j ≥ 0

Equivalently, dropping the fixed input, the unit fires when its weighted inputs reach θ_i, the threshold value associated with unit i:

    Σ_{j=1}^{n} W_{j,i} a_j ≥ θ_i

Defining W_{0,i} = θ_i and a_0 = -1, the two conditions coincide:

    Σ_{j=1}^{n} W_{j,i} a_j ≥ W_{0,i}   ⇔   in_i = Σ_{j=0}^{n} W_{j,i} a_j ≥ 0
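A quick numeric check of this equivalence (a sketch; the example values are mine, not the slides'):

```python
# Threshold form: the unit fires when w1*a1 + w2*a2 >= theta.
theta, w1, w2 = 1.5, 1.0, 1.0
a1, a2 = 1, 1
fires_threshold = (w1 * a1 + w2 * a2) >= theta

# Bias form: prepend the fixed input a0 = -1 with weight W0 = theta.
w0, a0 = theta, -1
fires_bias = (w0 * a0 + w1 * a1 + w2 * a2) >= 0

assert fires_threshold == fires_bias   # the same decision, for any inputs
```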


Implementing Boolean Functions

Units with a threshold activation function can act as logic gates; we can use such units to compute Boolean functions of their inputs.

A threshold unit activates when:

    Σ_{j=1}^{n} W_{j,i} a_j ≥ W_{0,i}


Boolean AND

    input x1   input x2   output
        0          0         0
        0          1         0
        1          0         0
        1          1         1

[Figure: a unit with inputs x1, x2, weights w1 = w2 = 1, and fixed input -1 with bias weight W0 = 1.5.]

Activation of the threshold unit when:

    Σ_{j=1}^{n} W_{j,i} a_j ≥ W_{0,i}, i.e., x1 + x2 ≥ 1.5


Boolean OR

    input x1   input x2   output
        0          0         0
        0          1         1
        1          0         1
        1          1         1

[Figure: a unit with inputs x1, x2, weights w1 = w2 = 1, and fixed input -1 with bias weight W0 = 0.5.]

Activation of the threshold unit when:

    Σ_{j=1}^{n} W_{j,i} a_j ≥ W_{0,i}, i.e., x1 + x2 ≥ 0.5


Inverter (Boolean NOT)

    input x1   output
        0         1
        1         0

[Figure: a unit with input x1, weight w1 = -1, and fixed input -1 with bias weight W0 = -0.5.]

Activation of the threshold unit when:

    Σ_{j=1}^{n} W_{j,i} a_j ≥ W_{0,i}, i.e., -x1 ≥ -0.5

So, units with a threshold activation function can act as logic gates, given appropriate input and bias weights.
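A small sketch verifying the three gates with a single threshold unit (the weights and bias values come from the slides; the helper names are mine):

```python
def threshold_unit(weights, inputs, w0):
    # Fires (outputs 1) when Σ_j w_j x_j >= w0, the bias/threshold weight.
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= w0 else 0

def AND(x1, x2): return threshold_unit([1, 1], [x1, x2], 1.5)
def OR(x1, x2):  return threshold_unit([1, 1], [x1, x2], 0.5)
def NOT(x1):     return threshold_unit([-1], [x1], -0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        assert AND(x1, x2) == (x1 and x2)
        assert OR(x1, x2) == (x1 or x2)
    assert NOT(x1) == 1 - x1
```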


Network Structures

Acyclic or feed-forward networks (our focus):
– activation flows from the input layer to the output layer
– single-layer perceptrons
– multi-layer perceptrons
Feed-forward networks implement functions and have no internal state (only weights).

Recurrent networks:
– feed the outputs back into the network's own inputs
– the network is a dynamical system (stable states, oscillations, chaotic behavior), and its response depends on the initial state
– can support short-term memory
– more difficult to understand


Recurrent Networks

Can capture internal state (activation keeps going around); supports more complex agents.

The brain cannot be just a feed-forward network! The brain has many feedback connections and cycles: the brain is a recurrent network!

Two key examples:

Hopfield networks
Boltzmann machines


Hopfield Networks

A Hopfield neural network is typically used for pattern recognition.

Hopfield networks have symmetric weights (Wij=Wji);

Output: 0/1 only.

Train weights to obtain associative memory

e.g., store template patterns as multiple stable states; given a new input pattern, the network converges to one of the exemplar patterns.

It can be proven that an N-unit Hopfield net can learn up to 0.138N patterns reliably.

Note: no explicit storage: all in weights!
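A minimal sketch of the idea, assuming the common bipolar -1/+1 formulation (the slides use 0/1 outputs; the two are equivalent up to relabeling) and Hebbian weight training; the update schedule below is my choice:

```python
import numpy as np

def train_hopfield(patterns):
    # Hebbian rule: W = Σ_p outer(p, p), with a zero diagonal.
    # The weights come out symmetric (W_ij = W_ji); the patterns are
    # stored only implicitly, in the weights.
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)
    return W

def recall(W, state, sweeps=10):
    # Asynchronous updates: each unit turns itself on or off based on the
    # current output of every other unit, until a stable pattern is reached.
    state = state.copy()
    for _ in range(sweeps):
        for i in np.random.permutation(len(state)):
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state
```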

Hopfield Networks

• The user trains the network with a set of black-and-white templates;
• Input units: 100 pixels;
• Output units: 100 pixels;
• For each template, each neuron in the network (corresponding to one pixel) learns to turn itself on or off based on the current output of every other neuron in the network.
• After training, the network can be provided with an arbitrary input pattern, and it (may) converge to an output pattern resembling whichever template most closely matches this input pattern.

http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html

Hopfield Networks

http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html

Given input pattern: [image]

After around 500 iterations, the network converges to: [image]


Boltzmann Machines

Generalization of Hopfield Networks:

Hidden neurons: Boltzmann machines have hidden units;

Neuron update: stochastic activation functions.

Both Hopfield and Boltzmann networks can solve optimization problems (similar to Monte Carlo methods).

We will not cover these networks.


Feed-forward Network: Represents a Function of Its Input

[Figure: a network with two input units (1, 2), two hidden units (3, 4), and one output unit (5); bias units omitted for simplicity.]

Each unit receives input only from units in the immediately preceding layer. (Note: the input layer in general does not include computing units.)

Given an input vector x = (x1, x2), the activations of the input units are set to the values of the input vector, i.e., (a1, a2) = (x1, x2), and the network computes:

    a5 = g(W_{3,5} a3 + W_{4,5} a4)
       = g(W_{3,5} g(W_{1,3} a1 + W_{2,3} a2) + W_{4,5} g(W_{1,4} a1 + W_{2,4} a2))

Thus a feed-forward network computes a parameterized family of functions h_W(x); the weights are the parameters of the function. By adjusting the weights we get different functions: that is how learning is done in neural networks!
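A sketch of this computation in Python for the 2-2-1 network above (unit numbering as in the figure; the weight values are arbitrary placeholders, chosen only to illustrate that different weights give different functions h_W):

```python
import math

def g(x):
    # sigmoid activation
    return 1.0 / (1.0 + math.exp(-x))

def h_W(x1, x2, W):
    # Forward pass of the 2-2-1 feed-forward network (bias units omitted).
    # W is a dict keyed by (from_unit, to_unit).
    a3 = g(W[(1, 3)] * x1 + W[(2, 3)] * x2)   # hidden unit 3
    a4 = g(W[(1, 4)] * x1 + W[(2, 4)] * x2)   # hidden unit 4
    a5 = g(W[(3, 5)] * a3 + W[(4, 5)] * a4)   # output unit 5
    return a5

W = {(1, 3): 0.5, (2, 3): -1.2, (1, 4): 0.8,
     (2, 4): 0.3, (3, 5): 1.0, (4, 5): -0.7}
print(h_W(1.0, 0.0, W))
```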


Large IBM investment in the next generation of Neural Nets

IBM plans 'brain-like' computers
Page last updated at 14:52 GMT, Friday, 21 November 2008
By Jason Palmer, Science and technology reporter, BBC News

IBM has announced it will lead a US government-funded collaboration to make electronic circuits that mimic brains.

http://news.bbc.co.uk/2/hi/science/nature/7740484.stm