Hybrid intelligent systems - mksaad.files.wordpress.com · Negnevitsky, Pearson Education, 2011 1 Lecture 12 Hybrid intelligent systems: Evolutionary neural networks and fuzzy evolutionary

Negnevitsky, Pearson Education, 2011

Lecture 12

Hybrid intelligent systems: Evolutionary neural networks and fuzzy evolutionary systems

• Introduction
• Evolutionary neural networks
• Fuzzy evolutionary systems
• Summary


Evolutionary neural networks

• Although neural networks are used for solving a variety of problems, they still have some limitations.

• One of the most common is associated with neural network training. The back-propagation learning algorithm cannot guarantee an optimal solution. In real-world applications, the back-propagation algorithm might converge to a set of sub-optimal weights from which it cannot escape. As a result, the neural network is often unable to find a desirable solution to the problem at hand.


• Another difficulty is related to selecting an optimal topology for the neural network. The "right" network architecture for a particular problem is often chosen by means of heuristics, and designing a neural network topology is still more art than engineering.

• Genetic algorithms are an effective optimisation technique that can guide both weight optimisation and topology selection.


Encoding a set of weights in a chromosome

[Figure: a network with inputs x1, x2 and x3 (neurons 1 to 3), neurons 4 to 7, and neuron 8 producing the output y, with the connection weights listed in the matrix below.]

                                 From neuron:
To neuron:      1      2      3      4      5      6      7      8
     1          0      0      0      0      0      0      0      0
     2          0      0      0      0      0      0      0      0
     3          0      0      0      0      0      0      0      0
     4        0.9   -0.3   -0.7      0      0      0      0      0
     5       -0.8    0.6    0.3      0      0      0      0      0
     6        0.1   -0.2    0.2      0      0      0      0      0
     7        0.4    0.5    0.8      0      0      0      0      0
     8          0      0      0   -0.6    0.1   -0.2    0.9      0

Chromosome: 0.9 -0.3 -0.7 -0.8 0.6 0.3 0.1 -0.2 0.2 0.4 0.5 0.8 -0.6 0.1 -0.2 0.9
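The encoding in the figure can be sketched in a few lines of Python. This is only an illustration: the names `W`, `encode_weights` and `flatten` are hypothetical, and the sketch assumes that a zero entry always means "no connection" and that one gene holds all the incoming weights of one neuron.

```python
# Weight matrix from the figure: rows are "to" neurons 1..8,
# columns are "from" neurons 1..8 (0 is assumed to mean "no connection").
W = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0.9, -0.3, -0.7, 0, 0, 0, 0, 0],
    [-0.8, 0.6, 0.3, 0, 0, 0, 0, 0],
    [0.1, -0.2, 0.2, 0, 0, 0, 0, 0],
    [0.4, 0.5, 0.8, 0, 0, 0, 0, 0],
    [0, 0, 0, -0.6, 0.1, -0.2, 0.9, 0],
]


def encode_weights(matrix):
    """Return one gene per connected neuron: the list of its incoming weights."""
    genes = [[w for w in row if w != 0] for row in matrix]
    return [gene for gene in genes if gene]  # input neurons have no incoming weights


def flatten(genes):
    """String the genes together into a flat chromosome."""
    return [w for gene in genes for w in gene]


chromosome = flatten(encode_weights(W))
```

Keeping the weights grouped by neuron is a design choice that lets the genetic operators move or perturb all incoming weights of a neuron together.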


• The second step is to define a fitness function for evaluating the chromosome's performance. This function must estimate the performance of a given neural network. Here we can apply a simple function defined by the sum of squared errors.

• The training set of examples is presented to the network, and the sum of squared errors is calculated. The smaller the sum, the fitter the chromosome. The genetic algorithm attempts to find a set of weights that minimises the sum of squared errors.
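A minimal sketch of such a fitness evaluation is given below. Several details are assumptions made for illustration and are not taken from the lecture: the network has one hidden layer, neurons use a sigmoid activation without thresholds, and the chromosome lists the incoming weights of each hidden neuron first, followed by the weights of the output neuron. The function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(chromosome, inputs, n_hidden):
    """Decode the chromosome into a one-hidden-layer network and compute its output."""
    n_in = len(inputs)
    if len(chromosome) != n_hidden * (n_in + 1):
        raise ValueError("chromosome length does not match the network size")
    hidden = []
    for h in range(n_hidden):
        weights = chromosome[h * n_in:(h + 1) * n_in]
        hidden.append(sigmoid(sum(w * x for w, x in zip(weights, inputs))))
    out_weights = chromosome[n_hidden * n_in:]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)))


def sum_squared_errors(chromosome, training_set, n_hidden):
    """Sum of squared errors over (inputs, target) pairs; smaller means fitter."""
    return sum(
        (target - forward(chromosome, inputs, n_hidden)) ** 2
        for inputs, target in training_set
    )
```

Because the genetic algorithm only needs this single number per chromosome, no gradient information is required.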


• The third step is to choose the genetic operators: crossover and mutation. A crossover operator takes two parent chromosomes and creates a single child with genetic material from both parents. Each gene in the child's chromosome is represented by the corresponding gene of a randomly selected parent.

• A mutation operator selects a gene in a chromosome and adds a small random value between −1 and 1 to each weight in this gene.
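The two operators can be sketched as follows. The sketch assumes that a chromosome is stored as a list of genes and that each gene is the list of incoming weights of one neuron; the function names and the optional `rng` parameter are hypothetical.

```python
import random


def crossover(parent1, parent2, rng=random):
    """Create one child; each gene is copied from a randomly selected parent."""
    return [list(rng.choice((g1, g2))) for g1, g2 in zip(parent1, parent2)]


def mutate(chromosome, rng=random):
    """Select one gene and add a random value in [-1, 1] to each of its weights."""
    child = [list(gene) for gene in chromosome]  # leave the original untouched
    k = rng.randrange(len(child))
    child[k] = [w + rng.uniform(-1.0, 1.0) for w in child[k]]
    return child
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing operator settings.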


Crossover in weight optimisation

[Figure: the Parent 1 and Parent 2 networks (inputs x1 and x2; neurons 3, 4 and 5; neuron 6 producing the output y) with their chromosomes, and the Child network whose genes are taken from either parent.]


Mutation in weight optimisation

[Figure: the original network and the mutated network (inputs x1 and x2; neurons 3, 4 and 5; neuron 6 producing the output y) with their chromosomes. The gene holding the weights 0.4 and −0.3 is mutated to −0.1 and 0.2; all other weights are unchanged.]


Can genetic algorithms help us in selecting the network architecture?

The architecture of the network (i.e. the number of neurons and their interconnections) often determines the success or failure of the application. Usually the network architecture is decided by trial and error; there is a great need for a method of automatically designing the architecture for a particular application. Genetic algorithms may well be suited for this task.


• The basic idea behind evolving a suitable network architecture is to conduct a genetic search in a population of possible architectures.

• We must first choose a method of encoding a network's architecture into a chromosome.


Encoding the network architecture

• The connection topology of a neural network can be represented by a square connectivity matrix.

• Each entry in the matrix defines the type of connection from one neuron (column) to another (row), where 0 means no connection and 1 denotes a connection for which the weight can be changed through learning.

• To transform the connectivity matrix into a chromosome, we need only to string the rows of the matrix together.


Encoding of the network topology

[Figure: a network with inputs x1 and x2 (neurons 1 and 2), neurons 3, 4 and 5, and neuron 6 producing the output y, connected as in the matrix below.]

                   From neuron:
To neuron:     1   2   3   4   5   6
     1         0   0   0   0   0   0
     2         0   0   0   0   0   0
     3         1   1   0   0   0   0
     4         1   0   0   0   0   0
     5         0   1   0   0   0   0
     6         0   1   1   1   1   0

Chromosome:
0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 0
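Stringing the rows together, and recovering the matrix from a chromosome, might look like the sketch below. The function names are hypothetical; the matrix `C` is the one shown in the figure.

```python
import math

# Connectivity matrix from the figure: rows are "to" neurons, columns are "from" neurons.
C = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
]


def encode_topology(matrix):
    """String the rows of a square connectivity matrix together."""
    return [bit for row in matrix for bit in row]


def decode_topology(chromosome):
    """Rebuild the square connectivity matrix from a chromosome."""
    n = math.isqrt(len(chromosome))
    if n * n != len(chromosome):
        raise ValueError("chromosome length must be a perfect square")
    return [chromosome[r * n:(r + 1) * n] for r in range(n)]


chromosome = encode_topology(C)
```

Decoding is needed whenever a chromosome produced by crossover or mutation has to be turned back into a network for evaluation.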


The cycle of evolving a neural network topology

[Figure: in generation i, each chromosome is decoded into a neural network j, which is evaluated on the training data set and assigned a fitness (in the example, fitness = 117). Parent 1 and Parent 2 are selected, crossover produces Child 1 and Child 2, and mutation is applied to form generation (i + 1).]


Fuzzy evolutionary systems

• Evolutionary computation is also used in the design of fuzzy systems, particularly for generating fuzzy rules and adjusting membership functions of fuzzy sets.

• In this section, we introduce an application of genetic algorithms to select an appropriate set of fuzzy IF-THEN rules for a classification problem.

• For a classification problem, a set of fuzzy IF-THEN rules is generated from numerical data.

• First, we use a grid-type fuzzy partition of an input space.


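Fuzzy partition by a 3×3 fuzzy grid

[Figure: the input space X1 × X2, with both axes running from 0 to 1. Fuzzy sets A1, A2 and A3 with membership µ(x1) partition the X1 axis, and fuzzy sets B1, B2 and B3 with membership µ(x2) partition the X2 axis. The training patterns, numbered 1 to 16, are plotted as Class 1 and Class 2 dots.]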


Fuzzy partition

• Black and white dots denote the training patterns of Class 1 and Class 2, respectively.

• The grid-type fuzzy partition can be seen as a rule table.

• The linguistic values of input x1 (A1, A2 and A3) form the horizontal axis, and the linguistic values of input x2 (B1, B2 and B3) form the vertical axis.

• At the intersection of a row and a column lies the rule consequent.
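As an illustration of a grid-type partition, the sketch below defines K evenly spaced fuzzy sets on the interval [0, 1]. The triangular shape and the even spacing are assumptions made for this example, since the slides do not state the form of the membership functions; the function name is hypothetical.

```python
def membership(x, i, K):
    """Degree to which x in [0, 1] belongs to the i-th of K fuzzy sets (i = 1..K).

    Assumes triangular sets with peaks evenly spaced at 0, 1/(K-1), ..., 1.
    """
    if K < 2 or not 1 <= i <= K:
        raise ValueError("need K >= 2 and 1 <= i <= K")
    peak = (i - 1) / (K - 1)
    width = 1 / (K - 1)
    return max(0.0, 1.0 - abs(x - peak) / width)


# Memberships of x1 = 0.25 in A1, A2 and A3 of a 3x3 grid.
degrees = [membership(0.25, i, 3) for i in (1, 2, 3)]
```

With this choice, neighbouring sets overlap so that every point of the input space belongs to at least one fuzzy subspace with a non-zero degree.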


In the rule table, each fuzzy subspace can have only one fuzzy IF-THEN rule, and thus the total number of rules that can be generated in a K×K grid is equal to K×K.


Fuzzy rules that correspond to the K×K fuzzy partition can be represented in a general form as:

$$
\text{Rule } R_{ij}: \quad \text{IF } x_{1p} \text{ is } A_i \ \text{ AND } \ x_{2p} \text{ is } B_j \ \text{ THEN } \ x_p \in C_n, \quad CF_{A_i B_j}^{C_n}
$$

$$
i = 1, 2, \ldots, K; \qquad j = 1, 2, \ldots, K; \qquad x_p = (x_{1p}, x_{2p}), \quad p = 1, 2, \ldots, P
$$

where $x_p$ is a training pattern on input space $X_1 \times X_2$, $P$ is the total number of training patterns, $C_n$ is the rule consequent (either Class 1 or Class 2), and $CF_{A_i B_j}^{C_n}$ is the certainty factor that a pattern in fuzzy subspace $A_i B_j$ belongs to class $C_n$.


To determine the rule consequent and the certainty factor, we use the following procedure:

Step 1: Partition an input space into K×K fuzzy subspaces, and calculate the strength of each class of training patterns in every fuzzy subspace.

Each class in a given fuzzy subspace is represented by its training patterns. The more training patterns, the stronger the class. In a given fuzzy subspace, the rule consequent becomes more certain when patterns of one particular class appear more often than patterns of any other class.

Step 2: Determine the rule consequent and the certainty factor in each fuzzy subspace.


The certainty factor can be interpreted as follows:

• If all the training patterns in fuzzy subspace AiBj belong to the same class, then the certainty factor is maximum and it is certain that any new pattern in this subspace will belong to this class.

• If, however, training patterns belong to different classes and these classes have similar strengths, then the certainty factor is minimum and it is uncertain that a new pattern will belong to any particular class.
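The slides do not give the formula for the certainty factor. One formulation that behaves as described above, assumed here purely for illustration, takes the strongest class as the consequent and sets the certainty factor to the difference between its strength and the mean strength of the other classes, divided by the total strength. In the sketch, `beta` is a hypothetical dictionary mapping each class label to its strength in one fuzzy subspace.

```python
def consequent_and_cf(beta):
    """Return (rule consequent, certainty factor) for one fuzzy subspace.

    beta maps class label -> strength of that class in the subspace.
    The formula is an assumed formulation, not taken from the slides.
    """
    total = sum(beta.values())
    if total == 0:
        return None, 0.0  # no training patterns: consequent cannot be determined
    best = max(beta, key=beta.get)
    if list(beta.values()).count(beta[best]) > 1:
        return None, 0.0  # tie between classes: no single consequent (assumption)
    others = [s for label, s in beta.items() if label != best]
    mean_others = sum(others) / len(others) if others else 0.0
    return best, (beta[best] - mean_others) / total


pure = consequent_and_cf({"Class 1": 2.0, "Class 2": 0.0})
mixed = consequent_and_cf({"Class 1": 1.5, "Class 2": 0.5})
```

Under this assumption, a subspace containing patterns of a single class yields a certainty factor of 1, while classes of nearly equal strength push the factor towards 0.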


• This means that patterns in a fuzzy subspace can be misclassified. Moreover, if a fuzzy subspace does not have any training patterns, we cannot determine the rule consequent at all.

• If a fuzzy partition is too coarse, many patterns may be misclassified. On the other hand, if a fuzzy partition is too fine, many fuzzy rules cannot be obtained, because of the lack of training patterns in the corresponding fuzzy subspaces.


Training patterns are not necessarily distributed evenly in the input space. As a result, it is often difficult to choose an appropriate density for the fuzzy grid. To overcome this difficulty, we use multiple fuzzy rule tables.


Multiple fuzzy rule tables

[Figure: five fuzzy rule tables, for K = 2, K = 3, K = 4, K = 5 and K = 6.]

Fuzzy IF-THEN rules are generated for each fuzzy subspace of multiple fuzzy rule tables, and thus a complete set of rules for our case can be specified as:

$$
2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 90 \text{ rules.}
$$
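The count can be checked by enumerating one rule slot per fuzzy subspace of every table. The tuple layout `(K, i, j)` and the function name are hypothetical choices for this sketch.

```python
def all_rule_slots(grid_sizes=(2, 3, 4, 5, 6)):
    """List one (K, i, j) slot for every fuzzy subspace AiBj of every K x K table."""
    return [
        (K, i, j)
        for K in grid_sizes
        for i in range(1, K + 1)
        for j in range(1, K + 1)
    ]


slots = all_rule_slots()
rule_count = len(slots)  # 4 + 9 + 16 + 25 + 36
```

The position of a slot in this list can later serve as the index of the corresponding gene when rules are selected by a genetic algorithm.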


Once the set of rules $S_{ALL}$ is generated, a new pattern, x = (x1, x2), can be classified by the following procedure:

Step 1: In every fuzzy subspace of the multiple fuzzy rule tables, calculate the degree of compatibility of a new pattern with each class.

Step 2: Determine the maximum degree of compatibility of the new pattern with each class.

Step 3: Determine the class with which the new pattern has the highest degree of compatibility, and assign the pattern to this class.
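The three steps can be sketched as below. Two details are assumptions made for illustration, because the slides do not define them: the membership functions are triangular and evenly spaced on [0, 1], and the degree of compatibility of a pattern with a rule is the product of its two membership degrees and the rule's certainty factor. A rule is represented by a hypothetical tuple `(K, i, j, consequent, cf)`.

```python
def membership(x, i, K):
    """Assumed triangular membership of x in the i-th of K evenly spaced fuzzy sets."""
    peak = (i - 1) / (K - 1)
    width = 1 / (K - 1)
    return max(0.0, 1.0 - abs(x - peak) / width)


def classify(pattern, rules):
    """Assign pattern = (x1, x2) to the class with the highest compatibility."""
    x1, x2 = pattern
    best_per_class = {}
    for K, i, j, consequent, cf in rules:
        # Step 1: degree of compatibility with this rule's class (assumed product form).
        alpha = membership(x1, i, K) * membership(x2, j, K) * cf
        # Step 2: keep the maximum degree of compatibility for each class.
        best_per_class[consequent] = max(best_per_class.get(consequent, 0.0), alpha)
    if not best_per_class:
        return None
    # Step 3: choose the class with the highest degree of compatibility.
    winner = max(best_per_class, key=best_per_class.get)
    return winner if best_per_class[winner] > 0 else None


rules = [(2, 1, 1, "Class 1", 0.9), (2, 2, 2, "Class 2", 0.8)]
label = classify((0.1, 0.2), rules)
```

Returning `None` when no rule is compatible with the pattern is a design choice of this sketch; it makes unclassifiable patterns explicit rather than assigning them arbitrarily.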


The number of multiple fuzzy rule tables required for an accurate pattern classification may be large. Consequently, a complete set of rules can be enormous. Meanwhile, these rules have different classification abilities, and thus by selecting only rules with high potential for accurate classification, we reduce the number of rules.


Can we use genetic algorithms for selecting fuzzy IF-THEN rules?

• The problem of selecting fuzzy IF-THEN rules can be seen as a combinatorial optimisation problem with two objectives.

• The first, more important, objective is to maximise the number of correctly classified patterns.

• The second objective is to minimise the number of rules.

• Genetic algorithms can be applied to this problem.


A basic genetic algorithm for selecting fuzzy IF-THEN rules includes the following steps:

Step 1: Randomly generate an initial population of chromosomes. The population size may be relatively small, say 10 or 20 chromosomes. Each gene in a chromosome corresponds to a particular fuzzy IF-THEN rule in the rule set defined by $S_{ALL}$.

Step 2: Calculate the performance, or fitness, of each individual chromosome in the current population.


The problem of selecting fuzzy rules has two objectives: to maximise the accuracy of the pattern classification and to minimise the size of a rule set. The fitness function has to accommodate both these objectives. This can be achieved by introducing two respective weights, $w_P$ and $w_N$, in the fitness function:

$$
f(S) = w_P \frac{P_s}{P_{ALL}} - w_N \frac{N_S}{N_{ALL}}
$$

where $P_s$ is the number of patterns classified successfully, $P_{ALL}$ is the total number of patterns presented to the classification system, and $N_S$ and $N_{ALL}$ are the numbers of fuzzy IF-THEN rules in set $S$ and set $S_{ALL}$, respectively.


The classification accuracy is more important than the size of a rule set. That is,

$$
f(S) = 10 \frac{P_s}{P_{ALL}} - \frac{N_S}{N_{ALL}}
$$
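The fitness function translates directly into code. In this sketch the parameter names are hypothetical, and the default weights correspond to the case where accuracy is weighted ten times more heavily than rule-set size.

```python
def fitness(p_s, p_all, n_s, n_all, w_p=10.0, w_n=1.0):
    """f(S) = w_P * P_s / P_ALL - w_N * N_S / N_ALL."""
    if p_all <= 0 or n_all <= 0:
        raise ValueError("p_all and n_all must be positive")
    return w_p * p_s / p_all - w_n * n_s / n_all


# A rule set using half of the 90 rules that classifies half of 100 patterns correctly.
example = fitness(p_s=50, p_all=100, n_s=45, n_all=90)
```

With these weights, classifying one more pattern correctly out of 100 raises the fitness by 0.1, whereas removing one rule out of 90 raises it by only about 0.011, so accuracy dominates the search.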


Step 3: Select a pair of chromosomes for mating. Parent chromosomes are selected with a probability associated with their fitness; a better fit chromosome has a higher probability of being selected.

Step 4: Create a pair of offspring chromosomes by applying a standard crossover operator. Parent chromosomes are crossed at the randomly selected crossover point.

Step 5: Perform mutation on each gene of the created offspring. The mutation probability is normally kept quite low, say 0.01. The mutation is done by multiplying the gene value by −1.


Step 6: Place the created offspring chromosomes in the new population.

Step 7: Repeat Step 3 until the size of the new population becomes equal to the size of the initial population, and then replace the initial (parent) population with the new (offspring) population.

Step 8: Go to Step 2, and repeat the process until a specified number of generations (typically several hundred) has been completed.
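The steps above can be sketched as a single loop. The sketch makes several assumptions that are not stated in the slides: genes take the values 1 (rule selected) and −1 (rule not selected), selection is fitness-proportionate after shifting the scores so that all selection weights are positive, and the fitness is supplied by the caller as a hypothetical function `fitness_of`.

```python
import random


def evolve(fitness_of, n_genes, pop_size=10, generations=200, p_mut=0.01, rng=random):
    """Evolve chromosomes of +1/-1 genes and return the fittest of the final population."""
    # Step 1: random initial population (gene values +1/-1 are an assumption).
    population = [[rng.choice((1, -1)) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: fitness of each chromosome, shifted to give positive selection weights.
        scores = [fitness_of(c) for c in population]
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]
        new_population = []
        # Step 7: keep producing offspring until the new population is full.
        while len(new_population) < pop_size:
            # Step 3: fitter chromosomes are more likely to be selected.
            p1, p2 = rng.choices(population, weights=weights, k=2)
            # Step 4: one-point crossover at a randomly selected point.
            cut = rng.randrange(1, n_genes)
            for child in (p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]):
                # Step 5: flip each gene with probability p_mut (multiply by -1).
                child = [-g if rng.random() < p_mut else g for g in child]
                # Step 6: place the offspring in the new population.
                new_population.append(child)
        population = new_population[:pop_size]
    return max(population, key=fitness_of)


# Toy usage: a fitness that simply prefers fewer selected rules.
best = evolve(lambda c: -sum(c), n_genes=12, generations=20, rng=random.Random(1))
```

In a real application `fitness_of` would decode the chromosome into a rule set, classify the training patterns with it, and return the weighted accuracy minus the weighted rule count.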

The number of rules can be cut down to less than 2% of the initially generated set of rules.
