© Negnevitsky, Pearson Education, 2005 1

Lecture 11

Hybrid intelligent systems: Neural expert systems and neuro-fuzzy systems

Introduction

Neural expert systems

Neuro-fuzzy systems

ANFIS: Adaptive Neuro-Fuzzy Inference System

Summary


Introduction

A hybrid intelligent system is one that combines at least two intelligent technologies. For example, combining a neural network with a fuzzy system results in a hybrid neuro-fuzzy system.

The combination of probabilistic reasoning, fuzzy logic, neural networks and evolutionary computation forms the core of soft computing, an emerging approach to building hybrid intelligent systems capable of reasoning and learning in an uncertain and imprecise environment.


Although words are less precise than numbers, precision carries a high cost. We use words when there is a tolerance for imprecision. Soft computing exploits the tolerance for uncertainty and imprecision to achieve greater tractability and robustness, and lower the cost of solutions.

We also use words when the available data is not precise enough to use numbers. This is often the case with complex problems, and while “hard” computing fails to produce any solution, soft computing is still capable of finding good solutions.


Lotfi Zadeh is reputed to have said that a good hybrid would be “British Police, German Mechanics, French Cuisine, Swiss Banking and Italian Love”. But “British Cuisine, German Police, French Mechanics, Italian Banking and Swiss Love” would be a bad one. Likewise, a hybrid intelligent system can be good or bad; it depends on which components constitute the hybrid. So our goal is to select the right components for building a good hybrid system.


Comparison of Expert Systems (ES), Fuzzy Systems (FS), Neural Networks (NN) and Genetic Algorithms (GA)

[Table: columns ES, FS, NN, GA; rows Knowledge representation, Uncertainty tolerance, Imprecision tolerance, Adaptability, Learning ability, Explanation ability, Knowledge discovery and data mining, Maintainability. The grade symbols in the cells are not recoverable.]

* The terms used for grading are: bad, rather bad, rather good and good.


Neural expert systems

Expert systems rely on logical inferences and decision trees and focus on modelling human reasoning. Neural networks rely on parallel data processing and focus on modelling a human brain.

Expert systems treat the brain as a black-box. Neural networks look at its structure and functions, particularly at its ability to learn.

Knowledge in a rule-based expert system is represented by IF-THEN production rules. Knowledge in neural networks is stored as synaptic weights between neurons.


In expert systems, knowledge can be divided into individual rules and the user can see and understand the piece of knowledge applied by the system.

In neural networks, one cannot select a single synaptic weight as a discrete piece of knowledge. Here knowledge is embedded in the entire network; it cannot be broken into individual pieces, and any change of a synaptic weight may lead to unpredictable results. A neural network is, in fact, a black-box for its user.


Can we combine advantages of expert systems and neural networks to create a more powerful and effective expert system?

A hybrid system that combines a neural network and a rule-based expert system is called a neural expert system (or a connectionist expert system).


Basic structure of a neural expert system

[Figure: block diagram with the components Training Data, New Data, Neural Knowledge Base, Rule Extraction (Rule: IF - THEN), Inference Engine, Explanation Facilities, User Interface and User.]


The heart of a neural expert system is the inference engine. It controls the information flow in the system and initiates inference over the neural knowledge base. A neural inference engine also ensures approximate reasoning.


Approximate reasoning

In a rule-based expert system, the inference engine compares the condition part of each rule with data given in the database. When the IF part of the rule matches the data in the database, the rule is fired and its THEN part is executed. Precise matching is required (the inference engine cannot cope with noisy or incomplete data).

Neural expert systems use a trained neural network in place of the knowledge base. The input data does not have to precisely match the data that was used in network training. This ability is called approximate reasoning.


Rule extraction

Neurons in the network are connected by links, each of which has a numerical weight attached to it.

The weights in a trained neural network determine the strength or importance of the associated neuron inputs.


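The neural knowledge base

[Figure: three-layer network. The input neurons Wings, Tail, Beak, Feathers and Engine (shown with the values +1, 0, +1, +1, -1) feed the rule neurons Rule 1, Rule 2 and Rule 3, which connect with weight 1.0 to the output neurons Bird, Plane and Glider. Input-to-rule weights, in the input order above: Rule 1: -0.8, -0.2, 2.2, 2.8, -1.1; Rule 2: -0.7, -0.1, 0.0, -1.6, 1.9; Rule 3: -0.6, -1.1, -1.0, -2.9, -1.3.]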


If we set each input of the input layer to either +1 (true), -1 (false), or 0 (unknown), we can give a semantic interpretation for the activation of any output neuron. For example, if the object has Wings (+1), Beak (+1) and Feathers (+1), but does not have Engine (-1), then we can conclude that this object is Bird (+1):

$$X_{Rule1} = 1 \cdot (-0.8) + 0 \cdot (-0.2) + 1 \cdot 2.2 + 1 \cdot 2.8 + (-1) \cdot (-1.1) = 5.3 > 0$$

$$Y_{Rule1} = Y_{Bird} = +1$$


We can similarly conclude that this object is not Plane:

$$X_{Rule2} = 1 \cdot (-0.7) + 0 \cdot (-0.1) + 1 \cdot 0.0 + 1 \cdot (-1.6) + (-1) \cdot 1.9 = -4.2 < 0$$

$$Y_{Rule2} = Y_{Plane} = -1$$

and not Glider:

$$X_{Rule3} = 1 \cdot (-0.6) + 0 \cdot (-1.1) + 1 \cdot (-1.0) + 1 \cdot (-2.9) + (-1) \cdot (-1.3) = -3.2 < 0$$

$$Y_{Rule3} = Y_{Glider} = -1$$
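The three calculations above can be reproduced with a few lines of Python. This is a minimal sketch, not the lecture's implementation: the weights are read off the worked calculations, while the names `WEIGHTS`, `net_input`, `sign` and `classify` are illustrative, and the sign activation (with each rule neuron passed straight through to its output neuron) is an assumption.

```python
# Input order: Wings, Tail, Beak, Feathers, Engine.
# Weights are taken from the worked calculations in the text.
WEIGHTS = {
    "Bird":   [-0.8, -0.2,  2.2,  2.8, -1.1],   # Rule 1
    "Plane":  [-0.7, -0.1,  0.0, -1.6,  1.9],   # Rule 2
    "Glider": [-0.6, -1.1, -1.0, -2.9, -1.3],   # Rule 3
}


def net_input(inputs, weights):
    """Net weighted input X of a rule neuron."""
    return sum(x * w for x, w in zip(inputs, weights))


def sign(x):
    """Assumed activation: +1 for a non-negative net input, otherwise -1."""
    return 1 if x >= 0 else -1


def classify(inputs):
    """Output Y of every rule neuron for one input pattern."""
    return {name: sign(net_input(inputs, w)) for name, w in WEIGHTS.items()}


# Wings (+1), Tail unknown (0), Beak (+1), Feathers (+1), no Engine (-1)
obj = [+1, 0, +1, +1, -1]
result = classify(obj)
print(result)
```

An unknown input (0) simply drops out of the sum, which is why the Tail weight plays no part in this example.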


By attaching a corresponding question to each input neuron, we can enable the system to prompt the user for initial values of the input variables:

Neuron: Wings
Question: Does the object have wings?

Neuron: Tail
Question: Does the object have a tail?

Neuron: Beak
Question: Does the object have a beak?

Neuron: Feathers
Question: Does the object have feathers?

Neuron: Engine
Question: Does the object have an engine?


An inference can be made if the known net weighted input to a neuron is greater than the sum of the absolute values of the weights of the unknown inputs:

$$\sum_{i=1}^{n} x_i w_i > \sum_{j=1}^{n} |w_j|$$

where $i \in \text{known}$, $j \notin \text{known}$ and $n$ is the number of neuron inputs.


Example:

Enter initial value for the input Feathers: +1

KNOWN = 1 · 2.8 = 2.8
UNKNOWN = |-0.8| + |-0.2| + |2.2| + |-1.1| = 4.3
KNOWN < UNKNOWN

Enter initial value for the input Beak: +1

KNOWN = 1 · 2.8 + 1 · 2.2 = 5.0
UNKNOWN = |-0.8| + |-0.2| + |-1.1| = 2.1
KNOWN > UNKNOWN

CONCLUDE: Bird is TRUE
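The dialogue above can be traced in code. The sketch below applies the stated condition literally (known net weighted input greater than the sum of absolute unknown weights) to the Bird neuron. The function name `can_infer`, the index-based input encoding and the weight list are illustrative assumptions.

```python
# Bird (Rule 1) weights in the order Wings, Tail, Beak, Feathers, Engine.
BIRD_WEIGHTS = [-0.8, -0.2, 2.2, 2.8, -1.1]
FEATHERS, BEAK = 3, 2  # positions of the two inputs in that order


def can_infer(weights, known):
    """Check the inference condition for one neuron.

    `known` maps an input index to its value (+1 or -1).
    Returns (decision, KNOWN, UNKNOWN).
    """
    known_sum = sum(value * weights[i] for i, value in known.items())
    unknown_sum = sum(abs(w) for i, w in enumerate(weights) if i not in known)
    return known_sum > unknown_sum, known_sum, unknown_sum


# Step 1: only Feathers is known, so no conclusion can be drawn yet.
step1 = can_infer(BIRD_WEIGHTS, {FEATHERS: +1})

# Step 2: Beak is known as well, so the neuron can conclude Bird is TRUE.
step2 = can_infer(BIRD_WEIGHTS, {FEATHERS: +1, BEAK: +1})

print(step1)
print(step2)
```

Once the condition holds, no value of the remaining inputs can flip the sign of the net input, so the system can stop asking questions.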


An example of a multi-layer knowledge base

[Figure: network with an input layer (a1 to a5), a conjunction layer (R1 to R5), a disjunction layer (b1 to b3), a second conjunction layer (R6 to R8) and a second disjunction layer (c1, c2). Links into the conjunction layers carry weight 1.0; links out of the rule neurons carry the certainty factors listed in the rules.]

Rule 1: IF a1 AND a3 THEN b1 (0.8)

Rule 2: IF a1 AND a4 THEN b1 (0.2)

Rule 3: IF a2 AND a5 THEN b2 (-0.1)

Rule 4: IF a3 AND a4 THEN b3 (0.9)

Rule 5: IF a5 THEN b3 (0.6)

Rule 6: IF b1 AND b3 THEN c1 (0.7)

Rule 7: IF b2 THEN c1 (0.1)

Rule 8: IF b2 AND b3 THEN c2 (0.9)


Neuro-fuzzy systems

Fuzzy logic and neural networks are natural complementary tools in building intelligent systems. While neural networks are low-level computational structures that perform well when dealing with raw data, fuzzy logic deals with reasoning on a higher level, using linguistic information acquired from domain experts. However, fuzzy systems lack the ability to learn and cannot adjust themselves to a new environment. On the other hand, although neural networks can learn, they are opaque to the user.


Integrated neuro-fuzzy systems can combine the parallel computation and learning abilities of neural networks with the human-like knowledge representation and explanation abilities of fuzzy systems. As a result, neural networks become more transparent, while fuzzy systems become capable of learning.


A neuro-fuzzy system is a neural network which is functionally equivalent to a fuzzy inference model. It can be trained to develop IF-THEN fuzzy rules and determine membership functions for input and output variables of the system. Expert knowledge can be incorporated into the structure of the neuro-fuzzy system. At the same time, the connectionist structure avoids fuzzy inference, which entails a substantial computational burden.


The structure of a neuro-fuzzy system is similar to a multi-layer neural network. In general, a neuro-fuzzy system has input and output layers, and three hidden layers that represent membership functions and fuzzy rules.


Neuro-fuzzy system

[Figure: five-layer network. Layer 1: inputs x1 and x2. Layer 2: fuzzification neurons A1 to A3 and B1 to B3. Layer 3: rule neurons R1 to R6. Layer 4: output membership neurons C1 and C2, reached through the rule weights wR1, wR2, and so on. Layer 5: output y.]


Each layer in the neuro-fuzzy system is associated with a particular step in the fuzzy inference process.

Layer 1 is the input layer. Each neuron in this layer transmits external crisp signals directly to the next layer. That is,

$$y_i^{(1)} = x_i^{(1)}$$

Layer 2 is the fuzzification layer. Neurons in this layer represent fuzzy sets used in the antecedents of fuzzy rules. A fuzzification neuron receives a crisp input and determines the degree to which this input belongs to the neuron’s fuzzy set.


The activation function of a membership neuron is set to the function that specifies the neuron’s fuzzy set. We use triangular sets, and therefore, the activation functions for the neurons in Layer 2 are set to the triangular membership functions. A triangular membership function can be specified by two parameters {a, b} as follows:

$$
y_i^{(2)} =
\begin{cases}
0, & \text{if } x_i^{(2)} \le a - \dfrac{b}{2} \\[6pt]
1 - \dfrac{2\,\left|x_i^{(2)} - a\right|}{b}, & \text{if } a - \dfrac{b}{2} < x_i^{(2)} < a + \dfrac{b}{2} \\[6pt]
0, & \text{if } x_i^{(2)} \ge a + \dfrac{b}{2}
\end{cases}
$$
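As a quick check of the definition, the triangular membership function can be written directly from the three cases. This is a minimal sketch; the function name `triangular` is an illustrative choice, and the reading of `a` as the centre and `b` as the base width of the triangle follows from the formula.

```python
def triangular(x, a, b):
    """Triangular membership degree of crisp input x.

    a is the centre of the triangle and b is the width of its base,
    so the set is non-zero only on the open interval (a - b/2, a + b/2).
    """
    if x <= a - b / 2 or x >= a + b / 2:
        return 0.0
    return 1.0 - 2.0 * abs(x - a) / b


# Membership degrees for a set with a = 4, b = 6
centre = triangular(4.0, 4, 6)      # peak of the triangle
halfway = triangular(5.5, 4, 6)     # halfway down the right-hand slope
outside = triangular(7.0, 4, 6)     # exactly on the right-hand boundary
print(centre, halfway, outside)
```

Changing `a` shifts the triangle along the axis, while changing `b` widens or narrows it.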


Triangular activation functions

[Figure: two plots of membership degree (0 to 1) against X (0 to 8). (a) Effect of parameter a: a = 4, b = 6 compared with a = 4.5, b = 6. (b) Effect of parameter b: a = 4, b = 6 compared with a = 4, b = 4.]


Layer 3 is the fuzzy rule layer. Each neuron in this layer corresponds to a single fuzzy rule. A fuzzy rule neuron receives inputs from the fuzzification neurons that represent fuzzy sets in the rule antecedents. For instance, neuron R1, which corresponds to Rule 1, receives inputs from neurons A1 and B1.

In a neuro-fuzzy system, intersection can be implemented by the product operator. Thus, the output of neuron i in Layer 3 is obtained as:

$$y_i^{(3)} = x_{1i}^{(3)} \times x_{2i}^{(3)} \times \dots \times x_{ki}^{(3)}$$

$$y_{R1}^{(3)} = \mu_{A1} \times \mu_{B1} = \mu_{R1}$$


Layer 4 is the output membership layer. Neurons in this layer represent fuzzy sets used in the consequent of fuzzy rules.

An output membership neuron combines all its inputs by using the fuzzy operation union. This operation can be implemented by the probabilistic OR. That is,

$$y_i^{(4)} = x_{1i}^{(4)} \oplus x_{2i}^{(4)} \oplus \dots \oplus x_{li}^{(4)}$$

$$y_{C1}^{(4)} = \mu_{R3} \oplus \mu_{R6} = \mu_{C1}$$

The value of $\mu_{C1}$ represents the integrated firing strength of fuzzy rule neurons R3 and R6.
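The Layer 3 and Layer 4 operators are short enough to sketch together. The code assumes the usual definition of the probabilistic OR, a ⊕ b = a + b - ab. The helper names and the membership degrees are hypothetical, and the rule weights between the two layers are left out for simplicity.

```python
from functools import reduce


def fuzzy_and(*degrees):
    """Layer 3: intersection implemented by the product operator."""
    return reduce(lambda p, q: p * q, degrees, 1.0)


def prob_or(*degrees):
    """Layer 4: union implemented by the probabilistic OR, a + b - a*b."""
    return reduce(lambda p, q: p + q - p * q, degrees, 0.0)


# Hypothetical firing strengths of two rules that share the consequent C1
mu_r3 = fuzzy_and(0.5, 0.4)   # product of two antecedent degrees
mu_r6 = 0.5                   # a rule whose firing strength is given directly
mu_c1 = prob_or(mu_r3, mu_r6) # integrated firing strength of C1
print(mu_r3, mu_r6, mu_c1)
```

Both operators keep their result inside [0, 1] whenever every input lies in [0, 1].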


Layer 5 is the defuzzification layer. Each neuron in this layer represents a single output of the neuro-fuzzy system. It takes the output fuzzy sets clipped by the respective integrated firing strengths and combines them into a single fuzzy set.

Neuro-fuzzy systems can apply standard defuzzification methods, including the centroid technique.

We will use the sum-product composition method.


The sum-product composition calculates the crisp output as the weighted average of the centroids of all output membership functions. For example, the weighted average of the centroids of the clipped fuzzy sets C1 and C2 is calculated as,

$$y = \frac{\mu_{C1} \times a_{C1} \times b_{C1} + \mu_{C2} \times a_{C2} \times b_{C2}}{\mu_{C1} \times b_{C1} + \mu_{C2} \times b_{C2}}$$
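The weighted average generalises to any number of output sets. The sketch below is one possible reading of the formula. The function name `sum_product` and the sample values are hypothetical, with each output set described by its integrated firing strength `mu`, centre `a` and width `b`.

```python
def sum_product(output_sets):
    """Crisp output from a list of (mu, a, b) triples.

    mu: integrated firing strength of the output set
    a:  centre of its triangular membership function
    b:  width of its triangular membership function
    """
    numerator = sum(mu * a * b for mu, a, b in output_sets)
    denominator = sum(mu * b for mu, _, b in output_sets)
    if denominator == 0:
        raise ValueError("no output set has a non-zero firing strength")
    return numerator / denominator


# Hypothetical output sets C1 and C2
c1 = (0.6, 0.2, 1.0)
c2 = (0.2, 0.8, 1.0)
y = sum_product([c1, c2])
print(y)
```

With equal widths the result reduces to the average of the centres weighted by firing strength, here (0.6 · 0.2 + 0.2 · 0.8) / 0.8.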


How does a neuro-fuzzy system learn?

A neuro-fuzzy system is essentially a multi-layer neural network, and thus it can apply standard learning algorithms developed for neural networks, including the back-propagation algorithm.


When a training input-output example is presented to the system, the back-propagation algorithm computes the system output and compares it with the desired output of the training example. The error is propagated backwards through the network from the output layer to the input layer. The neuron activation functions are modified as the error is propagated. To determine the necessary modifications, the back-propagation algorithm differentiates the activation functions of the neurons.

Let us demonstrate how a neuro-fuzzy system works on a simple example.


Training patterns

[Figure: plot of the training patterns; the surviving axis label is Y, with tick values 0 and 1.]


The data set is used for training the five-rule neuro-fuzzy system shown below.

Five-rule neuro-fuzzy system

[Figure: (a) Five-rule system: inputs and output y, each described by the fuzzy sets S and L, connected through rule neurons 1 to 5. (b) Training for 50 epochs: rule weights wR1 to wR5 plotted against epoch (0 to 50) on a scale from 0 to 1; value labels on the plot include 0.99, 0.79, 0.72, 0.61 and 0.]


Suppose that fuzzy IF-THEN rules incorporated into the system structure are supplied by a domain expert. Prior or existing knowledge can dramatically expedite the system training.

Besides, if the quality of training data is poor, the expert knowledge may be the only way to come to a solution at all. However, experts do occasionally make mistakes, and thus some rules used in a neuro-fuzzy system may be false or redundant. Therefore, a neuro-fuzzy system should also be capable of identifying bad rules.


Given input and output linguistic values, a neuro-fuzzy system can automatically generate a complete set of fuzzy IF-THEN rules.

Let us create the system for the XOR example. This system consists of $2^2 \times 2 = 8$ rules. Because expert knowledge is not embodied in the system this time, we set all initial weights between Layer 3 and Layer 4 to 0.5.

After training we can eliminate all rules whose certainty factors are less than some sufficiently small number, say 0.1. As a result, we obtain the same set of four fuzzy IF-THEN rules that represents the XOR operation.
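Generating the complete rule set and pruning it afterwards can be sketched as follows. The labels S and L come from the figures; the trained weights are hypothetical stand-ins chosen only to illustrate the pruning step (they are not results from the lecture), and the names `rules`, `trained` and `prune` are illustrative.

```python
from itertools import product

LABELS = ("S", "L")  # linguistic values: small and large

# Complete rule set: every combination of (x1, x2, y) labels, 2^2 * 2 = 8 rules,
# each starting with the same initial weight of 0.5.
rules = {combo: 0.5 for combo in product(LABELS, repeat=3)}


def prune(rule_weights, threshold=0.1):
    """Keep only the rules whose certainty factor reaches the threshold."""
    return {rule: w for rule, w in rule_weights.items() if w >= threshold}


# Hypothetical weights after training: the four XOR rules keep a large
# weight and the four contradictory rules decay to zero.
xor_rules = {("S", "S", "S"), ("S", "L", "L"), ("L", "S", "L"), ("L", "L", "S")}
trained = {rule: (0.8 if rule in xor_rules else 0.0) for rule in rules}

kept = prune(trained)
for x1, x2, y in sorted(kept):
    print(f"IF x1 is {x1} AND x2 is {x2} THEN y is {y}")
```

Starting every rule at the same weight gives no rule an initial advantage, so any difference after training is driven by the data alone.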


Eight-rule neuro-fuzzy system

[Figure: (a) Eight-rule system: inputs x1 and x2 and output y, each described by the fuzzy sets S and L, connected through rule neurons 1 to 8. (b) Rule weights wR1 to wR8 plotted against training epoch (0 to 50) on a scale from 0 to 0.8; value labels on the plot include 0.80, 0.78, 0.69, 0.62 and 0.]


Neuro-fuzzy systems: summary

The combination of fuzzy logic and neural networks constitutes a powerful means for designing intelligent systems.

Domain knowledge can be put into a neuro-fuzzy system by human experts in the form of linguistic variables and fuzzy rules.

When a representative set of examples is available, a neuro-fuzzy system can automatically transform it into a robust set of fuzzy IF-THEN rules, and thereby reduce our dependence on expert knowledge when building intelligent systems.


ANFIS: Adaptive Neuro-Fuzzy Inference System

The Sugeno fuzzy model was proposed for generating fuzzy rules from a given input-output data set. A typical Sugeno fuzzy rule is expressed in the following form:

IF $x_1$ is $A_1$
AND $x_2$ is $A_2$
. . . . .
AND $x_m$ is $A_m$
THEN $y = f(x_1, x_2, \dots, x_m)$

where $x_1, x_2, \dots, x_m$ are input variables; $A_1, A_2, \dots, A_m$ are fuzzy sets.


When y is a constant, we obtain a zero-order Sugeno fuzzy model in which the consequent of a rule is specified by a singleton.

When y is a first-order polynomial, i.e.

$$y = k_0 + k_1 x_1 + k_2 x_2 + \dots + k_m x_m$$

we obtain a first-order Sugeno fuzzy model.
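The two consequent types differ only in which coefficients are non-zero, which a short sketch makes concrete. The function name `sugeno_consequent` and the coefficient values are hypothetical.

```python
def sugeno_consequent(k, x):
    """Evaluate y = k0 + k1*x1 + ... + km*xm.

    k: coefficients [k0, k1, ..., km]
    x: crisp inputs [x1, ..., xm]
    """
    if len(k) != len(x) + 1:
        raise ValueError("expected one more coefficient than inputs")
    return k[0] + sum(ki * xi for ki, xi in zip(k[1:], x))


inputs = [2.0, 1.0]

# First-order consequent: y depends on the inputs.
y_first = sugeno_consequent([1.0, 0.5, -2.0], inputs)

# Zero-order consequent: every k1..km is zero, so y is the singleton k0.
y_zero = sugeno_consequent([3.0, 0.0, 0.0], inputs)

print(y_first, y_zero)
```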


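Adaptive Neuro-Fuzzy Inference System

[Figure: six-layer network. Layer 1: inputs x1 and x2. Layer 2: fuzzification neurons A1, A2, B1 and B2. Layer 3: rule neurons 1 to 4. Layer 4: normalisation neurons N1 to N4. Layer 5: defuzzification neurons 1 to 4, which also receive x1 and x2. Layer 6: a single summation neuron producing the output y.]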


Layer 1 is the input layer. Neurons in this layer simply pass external crisp signals to Layer 2.

Layer 2 is the fuzzification layer. Neurons in this layer perform fuzzification. In Jang’s model, fuzzification neurons have a bell activation function.


Layer 3 is the rule layer. Each neuron in this layer corresponds to a single Sugeno-type fuzzy rule. A rule neuron receives inputs from the respective fuzzification neurons and calculates the firing strength of the rule it represents. In an ANFIS, the conjunction of the rule antecedents is evaluated by the operator product. Thus, the output of neuron i in Layer 3 is obtained as,

$$y_i^{(3)} = \prod_{j=1}^{k} x_{ji}^{(3)}$$

$$y_{\Pi 1}^{(3)} = \mu_{A1} \times \mu_{B1} = \mu_1$$

where the value of $\mu_1$ represents the firing strength, or the truth value, of Rule 1.


Layer 4 is the normalisation layer. Each neuron in this layer receives inputs from all neurons in the rule layer, and calculates the normalised firing strength of a given rule.

The normalised firing strength is the ratio of the firing strength of a given rule to the sum of firing strengths of all rules. It represents the contribution of a given rule to the final result. Thus, the output of neuron i in Layer 4 is determined as,

$$y_i^{(4)} = \frac{x_{ii}^{(4)}}{\sum_{j=1}^{n} x_{ji}^{(4)}} = \frac{\mu_i}{\sum_{j=1}^{n} \mu_j} = \bar{\mu}_i$$

$$y_{N1}^{(4)} = \frac{\mu_1}{\mu_1 + \mu_2 + \mu_3 + \mu_4} = \bar{\mu}_1$$


Layer 5 is the defuzzification layer. Each neuron in this layer is connected to the respective normalisation neuron, and also receives initial inputs, $x_1$ and $x_2$. A defuzzification neuron calculates the weighted consequent value of a given rule as

$$y_i^{(5)} = x_i^{(5)} \, [k_{i0} + k_{i1} x_1 + k_{i2} x_2] = \bar{\mu}_i \, [k_{i0} + k_{i1} x_1 + k_{i2} x_2],$$

where $x_i^{(5)}$ is the input and $y_i^{(5)}$ is the output of defuzzification neuron $i$ in Layer 5, and $k_{i0}$, $k_{i1}$ and $k_{i2}$ are the consequent parameters of rule $i$.
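A defuzzification neuron can be sketched as a single function. The name `weighted_consequent` and all numeric values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def weighted_consequent(mu_bar_i, k, x1, x2):
    """Layer 5: normalised firing strength of rule i multiplied by its
    first-order consequent k0 + k1*x1 + k2*x2."""
    k0, k1, k2 = k
    return mu_bar_i * (k0 + k1 * x1 + k2 * x2)

# Hypothetical rule: mu_bar = 0.25, consequent parameters (1, 2, -1),
# evaluated at x1 = 3 and x2 = 4: 0.25 * (1 + 6 - 4)
y5 = weighted_consequent(0.25, (1.0, 2.0, -1.0), 3.0, 4.0)
print(y5)  # 0.75
```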

Layer 6 is represented by a single summation neuron. This neuron calculates the sum of outputs of all defuzzification neurons and produces the overall ANFIS output, $y$,

$$y = \sum_{i=1}^{n} x_i^{(6)} = \sum_{i=1}^{n} \bar{\mu}_i \, [k_{i0} + k_{i1} x_1 + k_{i2} x_2]$$
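Putting the layers together, a complete forward pass for a two-input ANFIS might look like the sketch below. It assumes generalised bell membership functions with parameters (a, b, c) and one rule for every pairing of an x1 membership function with an x2 membership function. The function names and all parameter values are illustrative assumptions.

```python
from itertools import product

def bell(x, a, b, c):
    """Assumed bell membership function: centre a, slope b, width c."""
    return 1.0 / (1.0 + abs((x - a) / c) ** (2 * b))

def anfis_forward(x1, x2, mfs_x1, mfs_x2, consequents):
    # Layer 2: fuzzification of each input
    mu_a = [bell(x1, *p) for p in mfs_x1]
    mu_b = [bell(x2, *p) for p in mfs_x2]
    # Layer 3: firing strength of each rule (product operator)
    strengths = [ma * mb for ma, mb in product(mu_a, mu_b)]
    # Layer 4: normalised firing strengths
    total = sum(strengths)
    mu_bar = [s / total for s in strengths]
    # Layer 5: weighted consequent value of each rule
    weighted = [m * (k0 + k1 * x1 + k2 * x2)
                for m, (k0, k1, k2) in zip(mu_bar, consequents)]
    # Layer 6: summation neuron
    return sum(weighted)

# Hypothetical parameters: two membership functions per input, four rules
mfs_x1 = [(0.0, 2, 5.0), (10.0, 2, 5.0)]
mfs_x2 = [(-1.0, 2, 1.0), (1.0, 2, 1.0)]
consequents = [(1.0, 2.0, 3.0)] * 4
y = anfis_forward(0.5, -0.2, mfs_x1, mfs_x2, consequents)
```

With identical consequents in every rule, the normalised weights cancel and the output reduces to 1 + 2(0.5) + 3(-0.2), which gives a simple check on the implementation.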

Can an ANFIS deal with problems where we do not have any prior knowledge of the rule consequent parameters?

It is not necessary to have any prior knowledge of rule consequent parameters. An ANFIS learns these parameters and tunes membership functions.

Learning in the ANFIS model

An ANFIS uses a hybrid learning algorithm that combines the least-squares estimator and the gradient descent method.

In the ANFIS training algorithm, each epoch is composed of a forward pass and a backward pass. In the forward pass, a training set of input patterns (an input vector) is presented to the ANFIS, neuron outputs are calculated on a layer-by-layer basis, and rule consequent parameters are identified.

The rule consequent parameters are identified by the least-squares estimator. In the Sugeno-style fuzzy inference, an output, $y$, is a linear function. Thus, given the values of the membership parameters and a training set of $P$ input-output patterns, we can form $P$ linear equations in terms of the consequent parameters as:

$$\begin{aligned}
y_d(1) &= \bar{\mu}_1(1)\, f_1(1) + \bar{\mu}_2(1)\, f_2(1) + \dots + \bar{\mu}_n(1)\, f_n(1) \\
y_d(2) &= \bar{\mu}_1(2)\, f_1(2) + \bar{\mu}_2(2)\, f_2(2) + \dots + \bar{\mu}_n(2)\, f_n(2) \\
&\;\;\vdots \\
y_d(p) &= \bar{\mu}_1(p)\, f_1(p) + \bar{\mu}_2(p)\, f_2(p) + \dots + \bar{\mu}_n(p)\, f_n(p) \\
&\;\;\vdots \\
y_d(P) &= \bar{\mu}_1(P)\, f_1(P) + \bar{\mu}_2(P)\, f_2(P) + \dots + \bar{\mu}_n(P)\, f_n(P)
\end{aligned}$$

Here $f_i(p)$ denotes the linear consequent of rule $i$ evaluated for pattern $p$.

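In the matrix notation, we have

$$\mathbf{y}_d = \mathbf{A}\,\mathbf{k},$$

where $\mathbf{y}_d$ is a $P \times 1$ desired output vector,

$$\mathbf{y}_d = [\,y_d(1) \;\; y_d(2) \;\; \dots \;\; y_d(p) \;\; \dots \;\; y_d(P)\,]^T,$$

$\mathbf{A}$ is a $P \times n(1+m)$ matrix,

$$\mathbf{A} = \begin{bmatrix}
\bar{\mu}_1(1) & \bar{\mu}_1(1)x_1(1) & \cdots & \bar{\mu}_1(1)x_m(1) & \cdots & \bar{\mu}_n(1) & \bar{\mu}_n(1)x_1(1) & \cdots & \bar{\mu}_n(1)x_m(1) \\
\bar{\mu}_1(2) & \bar{\mu}_1(2)x_1(2) & \cdots & \bar{\mu}_1(2)x_m(2) & \cdots & \bar{\mu}_n(2) & \bar{\mu}_n(2)x_1(2) & \cdots & \bar{\mu}_n(2)x_m(2) \\
\vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\
\bar{\mu}_1(p) & \bar{\mu}_1(p)x_1(p) & \cdots & \bar{\mu}_1(p)x_m(p) & \cdots & \bar{\mu}_n(p) & \bar{\mu}_n(p)x_1(p) & \cdots & \bar{\mu}_n(p)x_m(p) \\
\vdots & \vdots & & \vdots & & \vdots & \vdots & & \vdots \\
\bar{\mu}_1(P) & \bar{\mu}_1(P)x_1(P) & \cdots & \bar{\mu}_1(P)x_m(P) & \cdots & \bar{\mu}_n(P) & \bar{\mu}_n(P)x_1(P) & \cdots & \bar{\mu}_n(P)x_m(P)
\end{bmatrix},$$

and $\mathbf{k}$ is an $n(1+m) \times 1$ vector of unknown consequent parameters,

$$\mathbf{k} = [\,k_{10} \;\; k_{11} \;\; k_{12} \;\dots\; k_{1m} \;\; k_{20} \;\; k_{21} \;\; k_{22} \;\dots\; k_{2m} \;\dots\; k_{n0} \;\; k_{n1} \;\; k_{n2} \;\dots\; k_{nm}\,]^T$$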
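One way to estimate the consequent parameters is sketched below in plain Python. It builds the rows of the matrix from normalised firing strengths and inputs, then solves the normal equations by Gaussian elimination. This is a standard least-squares technique, assumed here for illustration and not necessarily the exact estimator used in a given ANFIS implementation. All function names and data are hypothetical.

```python
def design_row(mu_bar, x):
    """One row of the matrix: for each rule i, the terms
    mu_bar_i, mu_bar_i * x_1, ..., mu_bar_i * x_m."""
    row = []
    for m in mu_bar:
        row.append(m)
        row.extend(m * xj for xj in x)
    return row

def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: more varied data needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z

def least_squares(A, y_d):
    """Estimate k from y_d = A k via (A^T A) k = A^T y_d."""
    cols = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(cols)]
           for i in range(cols)]
    Aty = [sum(row[i] * y for row, y in zip(A, y_d)) for i in range(cols)]
    return solve_linear(AtA, Aty)

# Hypothetical data: n = 2 rules, m = 1 input, P = 6 patterns of (mu_bar_1, x)
true_k = [1.0, 2.0, -1.0, 0.5]
patterns = [(0.9, 0.0), (0.7, 1.0), (0.5, 2.0), (0.3, 3.0), (0.1, 4.0), (0.6, 5.0)]
A = [design_row([mu1, 1.0 - mu1], [x]) for mu1, x in patterns]
y_d = [sum(a * k for a, k in zip(row, true_k)) for row in A]
k_hat = least_squares(A, y_d)  # recovers true_k up to rounding error
```

The example generates the desired outputs from known parameters, so the estimate can be compared directly with `true_k`.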

As soon as the rule consequent parameters are established, we compute an actual network output vector, $\mathbf{y}$, and determine the error vector, $\mathbf{e}$:

$$\mathbf{e} = \mathbf{y}_d - \mathbf{y}$$

In the backward pass, the back-propagation algorithm is applied. The error signals are propagated back, and the antecedent parameters are updated according to the chain rule.

In the ANFIS training algorithm suggested by Jang, both antecedent parameters and consequent parameters are optimised. In the forward pass, the consequent parameters are adjusted while the antecedent parameters remain fixed. In the backward pass, the antecedent parameters are tuned while the consequent parameters are kept fixed.

Function approximation using the ANFIS model

In this example, an ANFIS is used to follow a trajectory of the non-linear function defined by the equation

$$y = \frac{\cos(2 x_1)}{e^{x_2}}$$

First, we choose an appropriate architecture for the ANFIS. An ANFIS must have two inputs, $x_1$ and $x_2$, and one output, $y$.

With two membership functions assigned to each input, the ANFIS in our example is defined by four rules, and has the structure shown below.

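[Figure: An ANFIS model with four rules. Inputs x1 and x2 enter at Layer 1; Layer 2 contains the membership neurons A1, A2, B1 and B2; Layer 3 contains four rule neurons; Layer 4 contains the normalisation neurons N1 to N4; Layer 5 contains four defuzzification neurons, which also receive x1 and x2; Layer 6 produces the output y.]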

The ANFIS training data includes 101 training samples. They are represented by a 101 × 3 matrix [x1 x2 yd], where x1 and x2 are input vectors, and yd is a desired output vector.

The first input vector, x1, starts at 0, increments by 0.1 and ends at 10.

The second input vector, x2, is created by taking the sine of each element of vector x1. The elements of the desired output vector, yd, are determined by the function equation.
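The training set described above can be generated with a short script. This sketch assumes that the target function is y = cos(2 x1) / exp(x2); the variable names are illustrative.

```python
import math

# x1 runs from 0 to 10 in steps of 0.1 (101 samples); computing i / 10
# avoids accumulating floating-point error from repeated addition
x1 = [i / 10 for i in range(101)]

# x2 is the sine of each element of x1
x2 = [math.sin(v) for v in x1]

# Desired output, assuming the target function y = cos(2*x1) / exp(x2)
y_d = [math.cos(2 * a) / math.exp(b) for a, b in zip(x1, x2)]

# 101 rows of (x1, x2, yd), equivalent to the 101 x 3 training matrix
training_data = list(zip(x1, x2, y_d))
```

Note that x2 is fully determined by x1, so the training samples lie along a single curve in the (x1, x2) plane and do not cover the whole input space.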

[Figure: Learning in an ANFIS with two membership functions assigned to each input (one epoch). Three-dimensional plot of y against x1 and x2, comparing the training data with the ANFIS output.]

[Figure: Learning in an ANFIS with two membership functions assigned to each input (100 epochs). Three-dimensional plot of y against x1 and x2, comparing the training data with the ANFIS output.]

We can achieve some improvement by increasing the number of epochs, but much better results are obtained when we assign three membership functions to each input variable. In this case, the ANFIS model will have nine rules, as shown in the figure below.
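The rule counts follow from pairing every membership function of x1 with every membership function of x2. A minimal sketch, with the helper name `rule_antecedents` assumed for illustration:

```python
from itertools import product

def rule_antecedents(labels_x1, labels_x2):
    """Enumerate the antecedent pairs: one rule for every pairing of an
    x1 membership function with an x2 membership function."""
    return list(product(labels_x1, labels_x2))

rules_2 = rule_antecedents(["A1", "A2"], ["B1", "B2"])
rules_3 = rule_antecedents(["A1", "A2", "A3"], ["B1", "B2", "B3"])
print(len(rules_2), len(rules_3))  # 4 9
```

Since each rule carries three consequent parameters (k0, k1 and k2), moving from four to nine rules raises the number of consequent parameters to be identified from 12 to 27.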

[Figure: An ANFIS model with nine rules. Inputs x1 and x2; membership neurons A1, A2, A3 for x1 and B1, B2, B3 for x2; nine rule neurons; normalisation neurons N1 to N9; nine defuzzification neurons, which also receive x1 and x2; a single output y.]

[Figure: Learning in an ANFIS with three membership functions assigned to each input (one epoch). Three-dimensional plot of y against x1 and x2, comparing the training data with the ANFIS output.]

[Figure: Learning in an ANFIS with three membership functions assigned to each input (100 epochs). Three-dimensional plot of y against x1 and x2, comparing the training data with the ANFIS output.]

[Figure: Initial and final membership functions of the ANFIS. (a) Initial membership functions. (b) Membership functions after 100 epochs of training. Each panel plots the degree of membership (0 to 1) against x1 (0 to 10) and against x2 (-1 to 1).]