
1

Machine Learning

The Perceptron

2

• Heuristic Search

• Knowledge Based Systems (KBS)

• Genetic Algorithms (GAs)

In a Knowledge Based Representation the solution is a set of rules

RULE Prescribes_Erythromycin
IF      Disease = strep_throat
AND     Allergy <> erythromycin
THEN    Prescribes = erythromycin
CNF     95
BECAUSE "Erythromycin is an effective treatment for a strep throat if the patient is not allergic to it";

In a Genetic Algorithm Based Representation the solution is a chromosome

011010001010010110000111

4

Genetic algorithm

[Figure: one cycle of a genetic algorithm. Generation i holds six 4-bit chromosomes X1i … X6i with fitness values f = 36, 44, 14, 14, 56 and 54. Crossover of selected pairs (e.g. X1i with X5i, X2i with X6i) and occasional mutation produce Generation (i + 1), whose chromosomes X1i+1 … X6i+1 have fitness values f = 56, 50, 44, 44, 54 and 56.]
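To make the two operators in the figure concrete, here is a minimal Java sketch of single-point crossover and bit-flip mutation on bit-string chromosomes. The fitness function and selection scheme are omitted, and the parent strings are just illustrative values taken from the figure.

import java.util.Random;

// Minimal sketch of the two GA operators shown in the figure:
// single-point crossover and bit-flip mutation on bit strings.
public class GaOperators {
    static final Random rng = new Random();

    // Single-point crossover: swap the tails of two parents after a random cut.
    static String[] crossover(String a, String b) {
        int cut = 1 + rng.nextInt(a.length() - 1);
        return new String[] {
            a.substring(0, cut) + b.substring(cut),
            b.substring(0, cut) + a.substring(cut)
        };
    }

    // Bit-flip mutation: flip each bit with a small probability.
    static String mutate(String chrom, double rate) {
        StringBuilder sb = new StringBuilder(chrom);
        for (int i = 0; i < sb.length(); i++)
            if (rng.nextDouble() < rate)
                sb.setCharAt(i, sb.charAt(i) == '0' ? '1' : '0');
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] children = crossover("1010", "0111");   // X1i and X5i from the figure
        System.out.println(children[0] + " " + mutate(children[1], 0.1));
    }
}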

5

Topics

• The neuron as a simple computing element

• The perceptron

• Multilayer artificial neural networks

• Accelerated learning in multilayer ANNs

6

Machine Learning

Machine learning involves adaptive mechanisms that enable computers to learn from experience, learn by example and learn by analogy. Learning capabilities can improve the performance of an intelligent system over time. The most popular approaches to machine learning are artificial neural networks and genetic algorithms. This lecture is dedicated to neural networks.

7

Our brain can be considered as a highly complex, non-linear and parallel information-processing system.

Information is stored and processed in a neural network simultaneously throughout the whole network, rather than at specific locations. In other words, in neural networks, both data and its processing are global rather than local.

Learning is a fundamental and essential characteristic of biological neural networks. The ease with which they can learn led to attempts to emulate a biological neural network in a computer.

8

Biological neural network

[Figure: biological neurons, each with a soma, dendrites and an axon, connected to one another by synapses.]

9

The neurons are connected by weighted links passing signals from one neuron to another.

The human brain incorporates nearly 10 billion neurons and 60 trillion connections, synapses, between them. By using multiple neurons simultaneously, the brain can perform its functions much faster than the fastest computers in existence today.

10

An artificial neural network can be defined as a model of reasoning based on the human brain. The brain consists of a densely interconnected set of nerve cells, or basic information-processing units, called neurons.

An artificial neural network consists of a number of very simple processors, also called neurons, which are analogous to the biological neurons in the brain.

11

Architecture of a typical artificial neural network

[Figure: input signals enter through the input layer, are processed in the middle layer, and leave as output signals from the output layer.]

12

Analogy between biological and artificial neural networks

[Figure: the biological neural network (soma, dendrites, axon, synapses) shown side by side with the artificial network's input, middle and output layers.]

13

Each neuron has a simple structure, but an army of such elements constitutes a tremendous processing power.

A neuron consists of a cell body, soma, a number of fibers called dendrites, and a single long fiber called the axon.

14

The neuron as a simple computing element

Diagram of a neuron

[Figure: neuron Y receives input signals x1, x2, …, xn through links weighted w1, w2, …, wn, and sends the same output signal Y along each of its outgoing branches.]

15

The Perceptron

The operation of Rosenblatt's perceptron is based on the McCulloch and Pitts neuron model. The model consists of a linear combiner followed by a hard limiter.

The weighted sum of the inputs is applied to the hard limiter (step or sign function).

16

Single-layer two-input perceptron

[Figure: inputs x1 and x2 are weighted by w1 and w2 and summed by a linear combiner; the threshold θ is subtracted and the result passes through a hard limiter to give the output Y.]

17

The neuron computes the weighted sum of the input signals and compares the result with a threshold value, θ. If the net input is less than the threshold, the neuron output is –1. But if the net input is greater than or equal to the threshold, the neuron becomes activated and its output attains the value +1.

The neuron uses the following transfer or activation function:

$$X = \sum_{i=1}^{n} x_i w_i \qquad Y = \begin{cases} +1 & \text{if } X \ge \theta \\ -1 & \text{if } X < \theta \end{cases}$$

This type of activation function is called a sign function.

18

// Perceptron output: weighted sum of the inputs minus the threshold, through sign()
Y = sign(x1 * w1 + x2 * w2 - threshold);

static int sign(double x) { return (x < 0) ? -1 : 1; }  // -1 if negative, +1 otherwise

19

Possible activation functions

[Figure: graphs of possible activation functions.]
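As a sketch of what such functions look like in code, here are four activation functions commonly listed in this context; that the original figure showed exactly the step, sign, sigmoid and linear functions is an assumption.

public class Activations {
    static int step(double x)       { return x >= 0 ? 1 : 0; }              // hard limiter, outputs 0 or 1
    static int sign(double x)       { return x >= 0 ? 1 : -1; }             // hard limiter, outputs -1 or +1
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }  // smooth squashing to (0, 1)
    static double linear(double x)  { return x; }                           // identity (no squashing)
}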

20

Exercise

[Figure: the single-layer two-input perceptron from the earlier slide, with inputs x1 and x2, weights w1 and w2, a linear combiner, a threshold and a hard limiter producing output Y.]

A neuron uses a step function as its activation function, has threshold θ = 0.2, and has weights w1 = 0.1 and w2 = 0.1.

What is the output with the following values of x1 and x2?

Do you recognize this?

x1  x2  Y
0   0
0   1
1   0
1   1

Can you find weights that will give an OR function? An XOR function?
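A minimal Java sketch to check your answers to the first question (it prints the answer, so fill in the table by hand first); it assumes the step function returns 0 below the threshold and 1 at or above it, consistent with the training algorithm later in the lecture.

public class PerceptronExercise {
    static int step(double x) { return x >= 0 ? 1 : 0; }

    public static void main(String[] args) {
        double w1 = 0.1, w2 = 0.1, theta = 0.2;
        int[][] inputs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        for (int[] in : inputs) {
            int y = step(in[0] * w1 + in[1] * w2 - theta);  // weighted sum vs. threshold
            System.out.println(in[0] + " " + in[1] + " -> " + y);
        }
        // Only x1 = x2 = 1 reaches the threshold: the logical AND function.
    }
}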

21

Can a single neuron learn a task?

In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a simple ANN: a perceptron.

The perceptron is the simplest form of a neural network. It consists of a single neuron with adjustable synaptic weights and a hard limiter.

22

How does the perceptron learn its classification tasks?

This is done by making small adjustments in the weights to reduce the difference between the actual and desired outputs of the perceptron. The initial weights are randomly assigned, usually in the range [−0.5, 0.5], and then updated to obtain the output consistent with the training examples.

23

If at iteration p, the actual output is Y(p) and the desired output is Yd(p), then the error is given by:

$$e(p) = Y_d(p) - Y(p)$$

where p = 1, 2, 3, . . .

Iteration p here refers to the pth training example presented to the perceptron.

If the error, e(p), is positive, we need to increase perceptron output Y(p), but if it is negative, we need to decrease Y(p).
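For example, with a step activation, if the desired output is Yd(p) = 1 but the perceptron outputs Y(p) = 0, the error is e(p) = 1 − 0 = 1, so the output must be increased.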

24

The perceptron learning rule

$$w_i(p+1) = w_i(p) + \alpha \cdot x_i(p) \cdot e(p)$$

where p = 1, 2, 3, . . .

α is the learning rate, a positive constant less than unity.

The perceptron learning rule was first proposed by Rosenblatt in 1960. Using this rule we can derive the perceptron training algorithm for classification tasks.
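For example, with α = 0.1, x_i(p) = 1 and e(p) = 1, the update term is α · x_i(p) · e(p) = 0.1 × 1 × 1 = 0.1, so the weight increases by 0.1 at the next iteration.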

25

Perceptron's training algorithm

Step 1: Initialisation
Set initial weights w1, w2, …, wn and threshold θ to random numbers in the range [−0.5, 0.5].

26

Perceptron's training algorithm (continued)

Step 2: Activation
Activate the perceptron by applying inputs x1(p), x2(p), …, xn(p) and desired output Yd(p). Calculate the actual output at iteration p = 1:

$$Y(p) = \mathrm{step}\!\left[\sum_{i=1}^{n} x_i(p)\, w_i(p) - \theta\right]$$

where n is the number of the perceptron inputs, and step is a step activation function.

27

Perceptron's training algorithm (continued)

Step 3: Weight training
Update the weights of the perceptron:

$$w_i(p+1) = w_i(p) + \Delta w_i(p)$$

where Δw_i(p) is the weight correction at iteration p.

The weight correction is computed by the delta rule:

$$\Delta w_i(p) = \alpha \cdot x_i(p) \cdot e(p)$$

Step 4: Iteration
Increase iteration p by one, go back to Step 2 and repeat the process until convergence.
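The four steps above can be condensed into a short program. A minimal Java sketch, assuming the step activation from the earlier exercise, α = 0.1 and θ = 0.2, and fixed (rather than random) initial weights w1 = 0.3, w2 = −0.1 so that the run matches the exercise table on the next slide:

public class PerceptronTraining {
    static int step(double x) { return x >= 0 ? 1 : 0; }

    public static void main(String[] args) {
        int[][] x  = { {0,0}, {0,1}, {1,0}, {1,1} };
        int[]   yd = {  0,     0,     0,     1   };          // logical AND
        double w1 = 0.3, w2 = -0.1, alpha = 0.1, theta = 0.2; // Step 1 (fixed here)

        for (int epoch = 1; epoch <= 50; epoch++) {
            int errors = 0;
            for (int p = 0; p < x.length; p++) {
                int y = step(x[p][0] * w1 + x[p][1] * w2 - theta); // Step 2: activation
                int e = yd[p] - y;                                 // error e(p)
                w1 += alpha * x[p][0] * e;                         // Step 3: delta rule
                w2 += alpha * x[p][1] * e;
                if (e != 0) errors++;
            }
            System.out.printf("epoch %d: w1 = %.1f, w2 = %.1f%n", epoch, w1, w2);
            if (errors == 0) break;                                // Step 4: converged
        }
    }
}

Run as written, this converges after a few epochs with w1 = 0.1 and w2 = 0.1, the weights quoted two slides further on.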

28

Exercise: Complete the first two epochs of training the perceptron to learn the logical AND function. Assume θ = 0.2 and α = 0.1.

Epoch | x1 | x2 | Yd | w1 (init) | w2 (init) | Y | e | w1 (new) | w2 (new)
------+----+----+----+-----------+-----------+---+---+----------+---------
  1   | 0  | 0  |    |    0.3    |   −0.1    |   |   |          |
  2   |    |    |    |           |           |   |   |          |

29

The perceptron in effect classifies the inputs, x1, x2, . . ., xn, into one of two classes, say A1 and A2.

In the case of an elementary perceptron, the n-dimensional space is divided by a hyperplane into two decision regions. The hyperplane is defined by the linearly separable function:

$$\sum_{i=1}^{n} x_i w_i - \theta = 0$$
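For a two-input perceptron this boundary is the straight line x1w1 + x2w2 − θ = 0 in the (x1, x2) plane; the next slide works this out for the AND weights found by the training algorithm.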

30

• If you proceed with the algorithm described above, termination occurs with w1 = 0.1 and w2 = 0.1.

• This means that the line is

  0.1 x1 + 0.1 x2 = 0.2, or
  x1 + x2 − 2 = 0

• The region below the line is where x1 + x2 − 2 < 0, and the region above the line is where x1 + x2 − 2 ≥ 0.

31

Linear separability in the perceptron

[Figure: (a) Two-input perceptron: the line x1w1 + x2w2 − θ = 0 separates Class A1 from Class A2 in the (x1, x2) plane. (b) Three-input perceptron: the plane x1w1 + x2w2 + x3w3 − θ = 0 separates the two classes in (x1, x2, x3) space.]

32

Perceptron Capability

• Independence of the activation function (Shynk, 1990; Shynk and Bernard, 1992)

33

General Network Structure

• The output signal is transmitted through the neuron's outgoing connection. The outgoing connection splits into a number of branches that transmit the same signal. The outgoing branches terminate at the incoming connections of other neurons in the network.

• Training by back-propagation algorithm

• Overcame the proven limitations of the perceptron

34

Architecture of a typical artificial neural network

[Figure: input signals enter through the input layer, are processed in the middle layer, and leave as output signals from the output layer.]
