2009 International Conference on Knowledge and Systems Engineering
978-0-7695-3846-4/09 $26.00 © 2009 IEEE
DOI 10.1109/KSE.2009.22

[IEEE 2009 International Conference on Knowledge and Systems Engineering (KSE) - Hanoi, Vietnam (2009.10.13-2009.10.17)] 2009 International Conference on Knowledge and Systems Engineering


Hedge Algebra based Type-2 Fuzzy Logic System and its Application to Predict Survival Time of Myeloma Patients

Phan Anh Phong
Faculty of Information Technology, Vinh University
Vinh, Nghe An, VietNam
e-mail: [email protected]

Dinh Khac Dong
Faculty of Information Technology, Hanoi University of Technology
Hanoi, VietNam
e-mail: [email protected]

Tran Dinh Khang
Faculty of Information Technology, Hanoi University of Technology
Hanoi, VietNam
e-mail: [email protected]

Abstract—In this paper, we propose a method to construct hedge algebra based type-2 fuzzy logic systems (HA-T2FLS). In these fuzzy logic systems, the footprints of uncertainty (FOU) of type-2 fuzzy sets are optimized by genetic algorithm and the dispersion of data. The key ingredient of our system is the concept of centroid of hedge algebra based type-2 fuzzy sets. It is used in the type-reducing of the HA-T2FLS, and transforming interval type-2 fuzzy sets to hedge algebra based type-2 fuzzy sets. As an application, we show how hedge algebra based type-2 fuzzy logic systems can be used to predict survival time of myeloma patients. The results show that hedge algebra based type-2 fuzzy logic systems are more accurate than type-1 and interval type-2 fuzzy logic systems in this class of problems.

I. INTRODUCTION

The concept of an ordinary fuzzy set (henceforth called a type-1 fuzzy set, T1FS) was originally introduced by L. Zadeh in 1965. Substantial work has been done since, especially on the fuzzy logic systems (T1FLS) built on these fuzzy sets. However, an important problem when designing these fuzzy logic systems is how to determine the membership functions of the fuzzy sets, that is, how to describe uncertainty with membership functions that are themselves certain. This is quite a difficult problem. Moreover, because the membership grade of each element of a T1FS is a crisp number, these membership functions reduce the fuzziness of knowledge.

The concept of a type-2 fuzzy set (T2FS) was first introduced by Zadeh as an extension of the T1FS, and it partially solves the above problem. In general, a T2FS is a fuzzy set whose membership grades are themselves type-1 fuzzy sets in [0, 1], instead of crisp numbers as in a T1FS. The results in [7]–[11], [13] have shown the ability of T2FS to model and minimize the influence of input data uncertainty in rule based fuzzy logic systems, where T1FS show their limitations.

A remarkable advantage of T2FS is the ability to deal with uncertain data. For this reason, there has been more and more research on and application of T2FS and T2FLS in various fields such as artificial intelligence, signal processing, and image processing [7], [13], including classification, noise cancellation and edge detection in digital images [13]. In particular, the application of these fuzzy logic systems to prediction problems has produced promising results [8], [9]. However, some problems arise when designing T2FLS. The first is the large amount of computation, so most applications must use interval type-2 fuzzy logic systems (IT2 FLS). Nonetheless, the computational cost of an IT2 FLS is still high. The second problem is how to determine suitable FOUs for the T2FSs.

Recently, references [2] and [3] proposed the concept of the hedge algebra based type-2 fuzzy set (HA-T2FS). This kind of fuzzy set not only considers the relationship among membership grades but also has a remarkable advantage: the set operations and the inference process are not too complicated, so the amount of computation is manageable. This paper uses a genetic algorithm and the dispersion of data to construct the FOUs of T2FSs, and then proposes the notion of the centroid of hedge algebra based type-2 fuzzy sets to construct the HA-T2FLS. This fuzzy logic system is used to predict the survival time of myeloma patients, and its result is better than those of type-1 fuzzy logic systems and interval type-2 fuzzy logic systems. Its prediction result is close to the result in [1] and shows the reliability of hedge algebra based type-2 fuzzy logic systems in prediction problems.

The rest of the paper is organized as follows: section II introduces the concept of hedge algebra based type-2 fuzzy logic systems and presents the concept of centroid of hedge algebra based type-2 fuzzy sets. Next, the prediction result of survival time of myeloma patients using this fuzzy logic system is described in section III. The performance of hedge algebra based type-2 fuzzy logic system is compared with the other fuzzy systems in section IV before conclusions are drawn.

II. HEDGE ALGEBRA BASED TYPE-2 FUZZY LOGIC SYSTEM

The nature of language is vague and uncertain. Hedge algebra is one of the approaches to manipulating words both qualitatively and quantitatively [4]. In hedge algebra, linguistic values obey semantic orders. In order to operate on these values, a function mapping linguistic values to values in [0,1] has been created. This function, which is described in [4], is called the semantically quantifying mapping and is denoted as v: X → [0,1], where X is a set of linguistic values in hedge algebra. The fuzziness measure of linguistic values is the principal element used to quantify them. This section presents an overview of hedge algebra, the fuzziness measure, and the semantically quantifying mapping. Then, it introduces the concept of the hedge algebra based type-2 fuzzy set, its centroid, and the HA-T2FLS.

A. Overview of hedge algebra

Consider a set of terms of the TRUTH variable: Dom(TRUTH) = {True, False, VeryTrue, VeryFalse, MoreTrue, MoreFalse, PossiblyTrue, PossiblyFalse, LessTrue, LessFalse, …}, in which True and False are base terms and the emphatic words (hedges) are Very, More, Possibly, Less. According to [5], the linguistic domain T = Dom(TRUTH) can be represented as an algebraic structure AX = (X, G, H, ≤), in which G = {c+, c−} is a set of base terms; H is a set of hedges comprising positive hedges H+ and negative hedges H−; and "≤" is the semantic ordering relation. The base terms c+, c− belong to X. The set X is generated by applying the hedges in H to terms in X.

When studying hedge algebra (HA), two notions are important. The first is the fuzziness measure of linguistic terms. In [4], the authors proposed the concept of a fuzziness measure fm as follows. The mapping $fm: X \to [0,1]$ is called a fuzziness measure if:

1) fm is a full measure, i.e. (i) if c−, c+ are all the base terms then fm(c−) + fm(c+) = 1; (ii) if H is the set of all hedges then $\sum_{h \in H} fm(hc) = fm(c)$, with c ∈ G.

2) If $\hat{x}$ is a crisp term, i.e. $h(\hat{x}) = \{\hat{x}\}$ for all h ∈ H, then $fm(\hat{x}) = 0$.

3) For all $\hat{x}, \hat{y} \in X$ and all h ∈ H,

$$\frac{fm(h\hat{x})}{fm(\hat{x})} = \frac{fm(h\hat{y})}{fm(\hat{y})} = fm(h),$$

hence fm(h) is called the fuzziness measure of hedge h.

Based on the fuzziness measure of the base terms fm(c) and the fuzziness measure of the hedges fm(h_i), we can compute the fuzziness measure of all linguistic values in X. Therefore, the fuzziness measure of each term $\hat{x}$ in X is associated with an interval

$$[\underline{fm}(\hat{x}), \overline{fm}(\hat{x})] \subset [0,1], \quad \forall \hat{x} \in X, \quad \overline{fm}(\hat{x}) - \underline{fm}(\hat{x}) = fm(\hat{x}).$$

The second notion is the semantically quantifying mapping. Assume that $\sum_{h \in H^-} fm(h) = \alpha > 0$, $\sum_{h \in H^+} fm(h) = \beta > 0$ and α + β = 1. The semantically quantifying mapping $v: X \to [0,1]$ divides $[\underline{fm}, \overline{fm}]$ into two subintervals in the proportion α to β. This mapping gives a representative value of the interval $[\underline{fm}, \overline{fm}]$.

Example 1: Consider a linear hedge algebra AX = (X, G, H, ≤) with base terms G = {False, True} and hedge set H = {Very, More, Possibly, Less} ∪ {Inf, Sup}, where H+ = {More, Very} and H− = {Less, Possibly}. Let fm(Less) = fm(Possibly) = fm(More) = fm(Very) = 0.25 and fm(False) = fm(True) = 0.5. Therefore, α = fm(Less) + fm(Possibly) = 0.5 and β = fm(More) + fm(Very) = 0.5. In this paper, the hedges Very, More, Possibly, Less are abbreviated as V, M, P, L and the base terms False, True as F, T, respectively. The fuzziness measure of the linguistic value MoreTrue is fm(MT) = fm(M) × fm(T) = 0.25 × 0.5 = 0.125. We also have fm(VT) = fm(MT) = fm(PT) = fm(LT) = 0.125 and $\underline{fm}(T) = \underline{fm}(LT) = 0.5$, then

$$\underline{fm}(MT) = \underline{fm}(LT) + fm(LT) + fm(PT) = 0.75$$

and

$$\overline{fm}(MT) = \underline{fm}(MT) + fm(MT) = 0.875.$$

The semantically quantifying value of MoreTrue is v(MT) = 0.8125.
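The arithmetic of Example 1 can be checked with a minimal Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the order LessTrue < PossiblyTrue < MoreTrue < VeryTrue inside True is assumed, and the split point is taken as the midpoint, which holds only because α = β = 0.5 here.

```python
from math import prod

# Parameters taken from Example 1.
FM_BASE = {"F": 0.5, "T": 0.5}
FM_HEDGE = {"L": 0.25, "P": 0.25, "M": 0.25, "V": 0.25}

# Assumed semantic order of the one-hedge terms inside True:
# LessTrue < PossiblyTrue < MoreTrue < VeryTrue.
ORDER_IN_TRUE = ["L", "P", "M", "V"]


def fm(term: str) -> float:
    """Fuzziness measure of a term such as "MT": product of the hedge
    measures and the base-term measure."""
    *hedges, base = term
    return prod(FM_HEDGE[h] for h in hedges) * FM_BASE[base]


def interval_in_true(hedge: str) -> tuple[float, float]:
    """Fuzziness interval (lower, upper) of the term <hedge>True."""
    lower = FM_BASE["F"]  # True occupies [fm(False), 1]
    for h in ORDER_IN_TRUE:
        width = fm(h + "T")
        if h == hedge:
            return lower, lower + width
        lower += width
    raise KeyError(hedge)


lower_mt, upper_mt = interval_in_true("M")
# With alpha = beta = 0.5 the split point is the midpoint of the interval.
v_mt = lower_mt + 0.5 * fm("MT")
print(fm("MT"), (lower_mt, upper_mt), v_mt)  # 0.125 (0.75, 0.875) 0.8125
```

The sketch handles only one-hedge terms built on True; deeper terms and terms built on False need the sign rules of hedge algebra, which are not modelled here.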

B. Hedge algebra based type-2 fuzzy set

1) Definition: Consider the hedge algebra AX = (X, G, H, ≤), G = {False, True}. A hedge algebra based type-2 fuzzy set determined on domain U is a type-2 fuzzy set in which the membership grades of the elements are truth values of this hedge algebra:

$$\hat{A} = \int_{x \in U} \hat{x}_i / x,$$

where the $\hat{x}_i$ are truth values and belong to X ($\hat{x}_i \in X$).

Example 2: Consider the hedge algebra in Example 1, and let U be the universe of discourse. The hedge algebra based type-2 fuzzy set that represents the linguistic value "Tall" is given as follows:

Tall = False/a + LessTrue/b + VeryVeryTrue/c

in which False, LessTrue, and VeryVeryTrue belong to X and a, b, and c are values in U.

The use of truth values in HA as membership grades of members of fuzzy set has turned the intersection, union and complement operation among membership grades into the intersection, union and complement operation in lattice [2], so that the amount of the calculation is reduced considerably.

In the following section, this paper will present an important notion when studying HA-T2FLS, namely, the centroid of hedge algebra based type-2 fuzzy set.

2) Centroid of hedge algebra based type-2 fuzzy set: As is well known, a T1FS A whose universe of discourse U is discretized into N points x1, x2, …, xN has the following centroid:

$$c_A = \frac{\sum_{i=1}^{N} x_i \, \mu_A(x_i)}{\sum_{i=1}^{N} \mu_A(x_i)}$$

Similar to the concept of the centroid of type-2 fuzzy sets in [6], we can define the approximate centroid of the hedge algebra based type-2 fuzzy set as follows.

Given a hedge algebra based type-2 fuzzy set $\hat{A} = \int_{U} \hat{x}_i / x$, $\hat{x}_i \in X$, whose domain x ∈ U is discretized into N points x1, x2, …, xN, we have:

$$\hat{A} = \sum_{i=1}^{N} \hat{x}_i / x_i \quad \text{with } \hat{x}_i \in X.$$

Then the approximate centroid of $\hat{A}$ is an interval fuzzy set, denoted $C(\hat{A})$, whose mean is

$$x^* = \frac{\sum_{i=1}^{N} v(\hat{x}_i) \, x_i}{\sum_{i=1}^{N} v(\hat{x}_i)} \qquad (1)$$


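and the deviation of

$$\Delta x = \frac{\sum_{i=1}^{N} |x_i - x^*| \, fm(\hat{x}_i)}{2 \sum_{i=1}^{N} v(\hat{x}_i)} \qquad (2)$$

when the following condition is satisfied:

$$\frac{\frac{1}{2} \sum_{i=1}^{N} fm(\hat{x}_i)}{\sum_{i=1}^{N} v(\hat{x}_i)} \ll 1 \qquad (3)$$

The smaller the value of $\frac{1}{2} \sum_{i=1}^{N} fm(\hat{x}_i) \big/ \sum_{i=1}^{N} v(\hat{x}_i)$ is, the higher the accuracy of the approximation is.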

Example 3: Consider the hedge algebra in Example 1 and assume that the domain of the HA-T2FS "Tall" is [0, 250] centimeters. The hedge algebra based type-2 fuzzy set "Tall" is given as follows:

Tall = LessFalse/150 + MoreTrue/180 + VeryVeryTrue/200

According to (1) and (2), the centroid of this fuzzy set is an interval fuzzy set whose mean is

$$x^* = \frac{v(LF) \times 150 + v(MT) \times 180 + v(VVT) \times 200}{v(LF) + v(MT) + v(VVT)} = 182.937$$

and the deviation of

$$\Delta x = \frac{\sum_{i=1}^{N} |x_i - x^*| \, fm(\hat{x}_i)}{2 \sum_{i=1}^{N} v(\hat{x}_i)} = 8.92$$

Condition (3) is satisfied because

$$\frac{\frac{1}{2} \sum_{i=1}^{N} fm(\hat{x}_i)}{\sum_{i=1}^{N} v(\hat{x}_i)} = 0.0629 \ll 1$$
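The mean in (1) and the ratio in condition (3) for Example 3 can be reproduced with a short Python sketch. The function names are hypothetical, and the values v(LF) = 0.4375, v(VVT) = 0.984375, fm(LF) = 0.125 and fm(VVT) = 0.03125 are not stated in the text; they are derived here from the parameters of Example 1 under the assumed ordering Very < More < Possibly < Less inside False and Less < Possibly < More < Very inside True.

```python
def approximate_centroid_mean(points, v_values):
    """Eq. (1): weighted mean of the points, weighted by the
    semantically quantified truth values v."""
    return sum(x * v for x, v in zip(points, v_values)) / sum(v_values)


def accuracy_ratio(fm_values, v_values):
    """Left-hand side of condition (3): half the total fuzziness
    divided by the total of the quantified values."""
    return 0.5 * sum(fm_values) / sum(v_values)


# Tall = LessFalse/150 + MoreTrue/180 + VeryVeryTrue/200
points = [150.0, 180.0, 200.0]
v_values = [0.4375, 0.8125, 0.984375]   # v(LF), v(MT), v(VVT): derived, see lead-in
fm_values = [0.125, 0.125, 0.03125]     # fm(LF), fm(MT), fm(VVT): derived, see lead-in

x_star = approximate_centroid_mean(points, v_values)
ratio = accuracy_ratio(fm_values, v_values)
print(round(x_star, 3), round(ratio, 4))  # 182.937 0.0629
```

Both printed values agree with the figures quoted in Example 3, which supports the assumed ordering of the truth values.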

C. Hedge algebra based type-2 fuzzy logic system

The elements of the hedge algebra based type-2 fuzzy logic system are depicted in Figure 1.

Figure 1. HA based type-2 fuzzy logic system

1) Hedge algebra based fuzzifier: The hedge algebra based fuzzifier takes crisp numbers, linguistic values, type-1 fuzzy sets, or interval type-2 fuzzy sets as inputs and produces hedge algebra based type-2 fuzzy sets as outputs.

Normally, interval type-2 fuzzy sets are used when designing fuzzy logic systems. The membership grades of an interval type-2 fuzzy set are subintervals of [0, 1], while the membership grades of an HA-T2FS are truth values in X, each characterized by its fuzziness interval, which is also a subinterval of [0, 1]. Therefore, we can construct an HA-T2FS on the basis of an interval type-2 fuzzy set. Formally, transforming an interval type-2 fuzzy set into an HA based type-2 fuzzy set means substituting each interval type-1 fuzzy set with a linguistic value in X so that the loss is minimized. The algorithm presented below transforms one IT2FS into one HA-T2FS so that the difference between their centroids is less than ε.

Algorithm 1:

Input: ε > 0; parameters of HA: fm(c+), fm(h_i); interval type-2 fuzzy set $A = \sum_{i=1}^{N} \mu_A(x_i) / x_i$.

Output: HA based type-2 fuzzy set $\hat{A} = \sum_{i=1}^{N} \hat{x}_i / x_i$, where $\hat{x}_i$ is a truth value in HA.

Step 1: Determine the interval type-1 fuzzy sets: $\mu_A(x_i) = [\mu_i - \sigma_i, \mu_i + \sigma_i]$, $i = 1, \dots, N$.

Step 2: Determine $fm(\hat{x}_i)$, $\underline{fm}(\hat{x}_i)$, $\overline{fm}(\hat{x}_i)$, $v(\hat{x}_i)$ for $\hat{x}_i \in X$, the set of truth values.

Step 3: Set $x_{\max} = \max_{1 \le i \le N}(x_i)$ and determine the truth values $\hat{x}_i \in X$ such that

$$|\mu_i - v(\hat{x}_i)| \le \frac{\varepsilon \sum_{i=1}^{N} \mu_i}{N (2 x_{\max} + \varepsilon)} = \xi, \quad i = 1, \dots, N.$$

Step 4: Among the $\hat{x}_i \in X$ found in Step 3, determine the $\hat{x}_i$ for which

$$\left( [\underline{fm}(\hat{x}_i), \overline{fm}(\hat{x}_i)] \cup [\mu_i - \sigma_i, \mu_i + \sigma_i] \right) \setminus \left( [\underline{fm}(\hat{x}_i), \overline{fm}(\hat{x}_i)] \cap [\mu_i - \sigma_i, \mu_i + \sigma_i] \right)$$

is minimal.

Step 5: Determine $\hat{A} = \sum_{i=1}^{N} \hat{x}_i / x_i$, which is the HA based type-2 fuzzy set returned by the algorithm.

With the above algorithm, the difference between the centroid of the created HA based type-2 fuzzy set and the centroid of the original interval type-2 fuzzy set is less than ε.
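Steps 3 and 4 of Algorithm 1 can be sketched in Python as follows. This is an illustration, not the authors' implementation: the function names and the candidate tuple layout (label, v, lower, upper) are hypothetical, and the enumeration of the candidate truth values of X with their intervals (Step 2) is assumed to be supplied by the caller.

```python
def tolerance(mu_centers, x_points, eps):
    """Step 3 bound: xi = eps * sum(mu_i) / (N * (2 * x_max + eps))."""
    n = len(mu_centers)
    x_max = max(x_points)
    return eps * sum(mu_centers) / (n * (2 * x_max + eps))


def symmetric_difference_length(a, b):
    """Total length of the symmetric difference of two closed
    intervals a = (a_lo, a_hi) and b = (b_lo, b_hi)."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    return (a[1] - a[0]) + (b[1] - b[0]) - 2.0 * overlap


def pick_truth_value(mu, sigma, candidates, xi):
    """Steps 3 and 4 for a single point.

    candidates: iterable of (label, v, fm_lower, fm_upper) tuples
    (hypothetical layout). Returns the label whose v lies within xi
    of mu and whose fuzziness interval differs least from
    [mu - sigma, mu + sigma], or None if no candidate is admissible.
    """
    target = (mu - sigma, mu + sigma)
    admissible = [c for c in candidates if abs(mu - c[1]) <= xi]
    if not admissible:
        return None
    best = min(
        admissible,
        key=lambda c: symmetric_difference_length((c[2], c[3]), target),
    )
    return best[0]


# Interval centers and domain points of a five-point IT2FS (illustrative).
mu = [0.125, 0.94, 0.90, 0.49, 0.24]
xs = [0.15, 0.3, 0.8, 1.05, 1.2]
xi = tolerance(mu, xs, eps=0.005)
print(xi)
```

Because ξ shrinks with ε, a small ε forces the search to consider truth values with long hedge strings, whose intervals are narrow enough for v to land within ξ of each interval center.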

For example, consider the interval type-2 fuzzy set

A = [0.12, 0.13]/0.15 + [0.93, 0.95]/0.3 + [0.89, 0.91]/0.8 + [0.48, 0.5]/1.05 + [0.23, 0.25]/1.2

with approximate centroid C(A) = 0.6763 according to [6]. After Algorithm 1, we have the HA based type-2 fuzzy set:

$\hat{A}$ = VVVMF/0.15 + LMVT/0.3 + MPLVT/0.8 + MVLF/1.05 + LVLMF/1.2

This fuzzy set has approximate centroid $C(\hat{A}) = 0.6762$.

[Figure 1 components: Input; Hedge Algebra Fuzzifier; Rule Base; Inference Engine; Output HA-T2FS; Output Processing (Type-Reducer, type-reduced fuzzy set, Defuzzifier); Output]


With ε = 0.005, the difference between these centroids is $|C(A) - C(\hat{A})| = 0.0001 < \varepsilon$.

The output of fuzzifier is HA based type-2 fuzzy set which is the input of inference engine.

2) Rule base: The rule base consists of M fuzzy IF-THEN rules, each of which has p antecedents and one consequent. The i-th rule has the following form:

Rule i: IF x1 is $F_1^i$ and x2 is $F_2^i$ and … and xp is $F_p^i$ THEN y is $G^i$,

in which x1, x2, …, xp are input variables defined on U1, U2, …, Up, respectively; y is the output variable defined on V; and $F_1^i$, $F_2^i$, …, $F_p^i$, $G^i$ are HA based type-2 fuzzy sets defined on the corresponding domains.

3) Inference engine: Details of inference with HA-T2FS are presented in [2]. To illustrate inference with HA-T2FS, consider the following simple rule:

Rule: IF x is A THEN y is B
Assumption: x is A0
Conclusion: y is B0

in which x is the input variable, y is the output variable, and A, B, A0, B0 are HA based type-2 fuzzy sets.

As in ordinary fuzzy inference, the rule IF x is A THEN y is B forms a fuzzy relation R, and this fuzzy relation is also an HA-T2FS. Consider $R^C$, $R^S$ as the extensions of $R_C$, $R_S$:

$$R^C = A \times B = \int_{U \times V} \left( \mu_A(u) \wedge \mu_B(v) \right) / (u, v)$$

$$R^S = (A \times V) \to_S (U \times B) = \int_{U \times V} \left[ \mu_A(u) \to_S \mu_B(v) \right] / (u, v)$$

with

$$\mu_A(u) \to_S \mu_B(v) = \begin{cases} 1, & \mu_A(u) \le \mu_B(v) \\ 0, & \mu_A(u) > \mu_B(v) \end{cases}$$

Here 1 and 0 are, respectively, the supremum and infimum elements in the representation of X: 1 = Sup(True), 0 = Inf(False).

The conclusion is inferred by the composition of the fuzzy set A0 with the above fuzzy relation R as follows:

$$B_0 = A_0 \circ R, \qquad \mu_{B_0}(v) = \bigvee_{u} \left( \mu_{A_0}(u) \wedge \mu_R(u, v) \right)$$
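The relations and the composition above can be sketched in Python on discretized domains. This is a simplification under stated assumptions: each truth value is stood in for by its quantified value v, which is valid only for a linear hedge algebra where meet and join reduce to min and max; the function names are hypothetical, and the full lattice treatment is in [2].

```python
def conjunction_relation(mu_a, mu_b):
    """R^C: meet (min in a linear order) of the antecedent grade at u
    and the consequent grade at v, for every pair (u, v)."""
    return [[min(a, b) for b in mu_b] for a in mu_a]


def standard_relation(mu_a, mu_b, top=1.0, bottom=0.0):
    """R^S: top (Sup True) where mu_A(u) <= mu_B(v), bottom (Inf False)
    otherwise."""
    return [[top if a <= b else bottom for b in mu_b] for a in mu_a]


def compose(mu_a0, relation):
    """mu_B0(v) = join over u of (mu_A0(u) meet mu_R(u, v))."""
    n_cols = len(relation[0])
    return [
        max(min(a0, row[j]) for a0, row in zip(mu_a0, relation))
        for j in range(n_cols)
    ]


# Grades given as quantified truth values (illustrative numbers only).
mu_a = [0.4375, 0.8125]     # A on two points of U
mu_b = [0.75, 0.984375]     # B on two points of V

r_c = conjunction_relation(mu_a, mu_b)
r_s = standard_relation(mu_a, mu_b)
print(compose(mu_a, r_c))   # [0.75, 0.8125]
print(r_s)                  # [[1.0, 1.0], [0.0, 1.0]]
```

Since only comparisons, min and max are involved, no interval arithmetic on secondary membership functions is needed, which is the source of the reduced computation mentioned earlier.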

4) Output processing: The output processing of the HA based type-2 fuzzy logic system consists of two parts: type reduction and defuzzification.

In Section II-B, we obtained the centroid of an HA-T2FS, which is an interval type-1 fuzzy set. It is the result of the centroid type reduction of the HA-T2FS:

$$C(\hat{A}) = [x^* - \Delta x, \; x^* + \Delta x]$$

Defuzzification is then performed as for a regular interval type-1 fuzzy set to obtain the output of the HA-T2FLS.

III. APPLICATION IN PREDICTING SURVIVAL TIME OF MYELOMA PATIENTS

A. Data

In [1], Yu Qiu, Yan-Qing Zhang and Yichuan Zhao used data from a study on multiple myeloma in which 65 patients were treated with alkylating agents. The data can be found in the SAS/STAT 9.2 User's Guide, The PHREG Procedure (2008) [14]. In the data set MYELOMA, the variable TIME represents the survival time in months from diagnosis. The variable VSTATUS takes two values, 0 and 1, indicating whether the patient was alive or dead, respectively, at the end of the study. If the value of VSTATUS is 0, the corresponding value of TIME is censored. The variables thought to be related to survival are LOGBUN (log BUN at diagnosis), HGB (hemoglobin at diagnosis), PLATELET (platelets at diagnosis: 0=abnormal, 1=normal), AGE (age at diagnosis in years), LOGWBC (log WBC at diagnosis), FRAC (fractures at diagnosis: 0=none, 1=present), LOGPBM (log percentage of plasma cells in bone marrow), PROTEIN (proteinuria at diagnosis), and SCALC (serum calcium at diagnosis). Most of the studies related to this data set show that the two variables with the largest influence on the survival time of patients are LOGBUN and HGB. To simplify the fuzzy logic system, we choose these 2 factors as the input variables; the output variable is the log survival time of patients. Among the 65 data pairs, 45 are selected as training data and the other 20 as testing data.

B. Hedge algebra based fuzzifier

Firstly, we determine the upper membership function (UMF) of the FOU by using a genetic algorithm [12]. An important problem when using a genetic algorithm is how to encode the parameters of the fuzzy system into the chromosome. In this fuzzy logic system, the variable LOGBUN is fuzzified by 3 triangular fuzzy sets with the linguistic labels Low, Medium and High, respectively. Thus, we need 5 parameters for LOGBUN. Moreover, these input values are defined on [0.5, 2.5], and we use 11 bits to describe a parameter. Consequently, with 5 parameters, the variable LOGBUN is encoded by 11 × 5 = 55 genes in the chromosome. Similarly, the variable HGB, defined on [0, 20], is fuzzified by 2 triangular fuzzy sets, Low and High; we need 2 parameters, so the number of genes needed is 8 × 2 = 16. The output variable, the survival time of patients, defined on [0, 100], is fuzzified by 3 fuzzy sets, Short, Normal and Long; we need 5 parameters, each of which uses 7 genes, so the number of genes is 7 × 5 = 35. In total, the chromosome has 106 genes to describe the parameters of the fuzzy system. One-point crossover and random mutation are used in the evolution. The corresponding probabilities of these processes are 95% and 1%. The fitness function is chosen as the Least Square Error:

$$F = \sum_{i=1}^{n} (z_i - \bar{z}_i)^2,$$

in which $z_i$ is the output of the system and $\bar{z}_i$ is the output of the training data.
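The chromosome layout described above (55 + 16 + 35 = 106 genes) can be sketched in Python. The names `LAYOUT`, `decode` and `least_square_error` are hypothetical, and the linear scaling of each bit group onto the variable's domain is an assumption, since the text does not state how a bit group is mapped to a parameter value.

```python
# (variable, lower bound, upper bound, number of parameters, bits per parameter)
LAYOUT = [
    ("LOGBUN", 0.5, 2.5, 5, 11),
    ("HGB", 0.0, 20.0, 2, 8),
    ("SURVIVAL", 0.0, 100.0, 5, 7),
]
CHROMOSOME_LENGTH = sum(n * bits for _, _, _, n, bits in LAYOUT)  # 55 + 16 + 35


def decode(chromosome):
    """Map a list of 0/1 genes to real-valued membership-function
    parameters, grouped by variable name."""
    if len(chromosome) != CHROMOSOME_LENGTH:
        raise ValueError("unexpected chromosome length")
    params, pos = {}, 0
    for name, lo, hi, n_params, bits in LAYOUT:
        values = []
        for _ in range(n_params):
            genes = chromosome[pos:pos + bits]
            pos += bits
            as_int = int("".join(map(str, genes)), 2)
            # Assumed linear scaling of the integer onto [lo, hi].
            values.append(lo + (hi - lo) * as_int / (2 ** bits - 1))
        params[name] = values
    return params


def least_square_error(system_outputs, training_outputs):
    """Fitness F: sum of squared differences between the system outputs
    and the training outputs."""
    return sum((z - t) ** 2 for z, t in zip(system_outputs, training_outputs))


print(CHROMOSOME_LENGTH)              # 106
print(decode([1] * 106)["LOGBUN"])    # [2.5, 2.5, 2.5, 2.5, 2.5]
```

The sketch does not enforce any ordering among the decoded parameters of one variable; a real design would need to sort or constrain them so that they describe valid triangles.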

The first generation is initialized, and then the fitness values are calculated. After that, the next generation is created by crossover and mutation. The evolution stops when the fitness is good enough or the number of generations reaches the limit. After the evolution, the chromosome with the best fitness carries the parameters of the upper membership function. Given the upper membership function, we determine the lower membership function (LMF) by using the dispersion d of the input data:
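The evolution loop described above can be sketched in Python. One-point crossover, random mutation and the probabilities 0.95 and 0.01 come from the text; the binary tournament selection, the elitism, the population size and the function names are assumptions made only to obtain a runnable example, and `fitness` is any function to be minimized, such as the Least Square Error of the decoded fuzzy system.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two parents at a random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def mutate(chromosome, rate, rng):
    """Flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


def evolve(fitness, length, pop_size=30, generations=50,
           p_cross=0.95, p_mut=0.01, target=0.0, seed=0):
    """Minimise `fitness` over bit lists; returns (best chromosome, best fitness)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        if fitness(best) <= target:      # fitness is good enough
            break
        offspring = [best[:]]            # elitism (assumption)
        while len(offspring) < pop_size:
            # Binary tournament selection (assumption).
            a = min(rng.sample(population, 2), key=fitness)
            b = min(rng.sample(population, 2), key=fitness)
            if rng.random() < p_cross:
                a, b = one_point_crossover(a, b, rng)
            offspring.append(mutate(a, p_mut, rng))
            offspring.append(mutate(b, p_mut, rng))
        population = offspring[:pop_size]
        best = min(population, key=fitness)
    return best, fitness(best)


# Toy usage: minimise the number of 1-genes in a 20-gene chromosome.
best, score = evolve(sum, length=20)
print(len(best), score == sum(best))  # 20 True
```

Keeping the best chromosome in every generation makes the best fitness non-increasing, which gives a well-defined stopping test against `target`.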

$$d = \frac{\frac{1}{n} \sum_{i=1}^{n} |x_i - x_{mean}|}{x_{max} - x_{min}}$$

in which n is the number of training data, $x_{mean}$ is the mean of the training data, and $x_{max} - x_{min}$ is the range of the data. The larger the dispersion of the input data is, the larger the distance between the upper membership function and the lower membership function is. We therefore construct the LMF from the UMF and the dispersion of the input data by lowering the highest point of the UMF by a distance d, as shown in Figure 2.

Figure 2. Determine LMF from UMF and the dispersion of input data
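The dispersion d and the construction of the LMF can be sketched in Python. The function names are hypothetical, and the shape of the LMF is an assumption: a triangle with the same feet as the UMF whose apex is lowered from 1 to 1 − d, which is one reading of the description since Figure 2 is not reproduced here.

```python
def dispersion(values):
    """Mean absolute deviation from the mean, divided by the data range."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n / (max(values) - min(values))


def triangle(x, left, peak, right, height=1.0):
    """Triangular membership function with apex `height` at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return height * (x - left) / (peak - left)
    return height * (right - x) / (right - peak)


def lower_membership(x, left, peak, right, d):
    """LMF obtained by lowering the apex of the triangular UMF by d
    (assumed to keep the same feet as the UMF)."""
    return triangle(x, left, peak, right, height=1.0 - d)


d = dispersion([1.0, 2.0, 3.0])
print(triangle(1.5, 0.5, 1.5, 2.5), lower_membership(1.5, 0.5, 1.5, 2.5, 0.2))  # 1.0 0.8
```

Under this reading the FOU is widest at the peak of the triangle and vanishes at its feet.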

Once we have the interval type-2 fuzzy set, Algorithm 1 transforms it into an HA based type-2 fuzzy set.

C. Rule base and inference engine

The rule base has a great influence on the accuracy of inference in a fuzzy logic system. In this prediction system, we use 6 rules constructed by experts [1]. These rules are shown in Table I.

The inference engine of the HA-T2FLS is implemented as in [2]. After this process, we have the output HA-T2FSs.

D. Output processing

The centroid of the HA based type-2 fuzzy set allows us to type-reduce this fuzzy set to an interval type-1 fuzzy set. Then, it is defuzzified to obtain the crisp output of the fuzzy logic system.

TABLE I. RULE BASE OF HA BASED TYPE-2 FLS

IF LOGBUN    AND HGB    THEN Survival Time
LOW          LOW        NORMAL
LOW          HIGH       NORMAL
MEDIUM       LOW        NORMAL
MEDIUM       HIGH       LONG
HIGH         LOW        SHORT
HIGH         HIGH       SHORT

IV. RESULTS AND DISCUSSIONS

In order to compare the prediction result of the HA-T2FLS with those of the type-1 fuzzy logic system and the interval type-2 fuzzy logic system, it is necessary to design these fuzzy systems with the same antecedents and consequent, training data and testing data. The rule bases of these fuzzy logic systems are the same. In the inference of the interval type-2 fuzzy logic system, the t-norm and t-conorm are chosen as min and max, respectively. The Least Square Errors (LSE) of the T1FLS, the interval type-2 FLS and the HA based type-2 fuzzy logic system in the prediction problem are presented in Table II. The error of the HA-T2FLS is the smallest, i.e. the prediction result of this fuzzy logic system is the most accurate. Figure 3 shows the prediction results for the 20 pairs of testing data with the three fuzzy logic systems above. We can see that outputs number 7, 10, 11, 12, 13, 14 and 16 are close to the expected values.

Figure 3. The survival time prediction of different systems

TABLE II. LSE OF FUZZY LOGIC SYSTEMS

Type of fuzzy logic system    T1-FLS    IT2-FLS    HA-T2FLS
LSE                           2.34      2.06       1.85


The comparison between the HA based type-2 fuzzy logic system and the genetic statistical interval-valued fuzzy system in [1] shows that their prediction LSEs are approximately equal. This comparison shows the viability of the HA-T2FLS in prediction problems.

V. CONCLUSION

We have proposed the concept of the centroid of an HA based type-2 fuzzy set and an algorithm to transform interval type-2 fuzzy sets into HA based type-2 fuzzy sets. To our knowledge, this is the first time the concept of the centroid of an HA-T2FS has been introduced. The HA based T2FLS is then constructed on this basis. This method inherits the advantages of T1FS and also reduces the amount of computation in T2FLS.

In order to get high performance, the upper membership function is optimized by genetic algorithm with the fitness function Least Square Error. The lower membership function is created on the basis of dispersion of input data. The survival time prediction result shows that the HA-T2FLS is better than T1FLS and IT2FLS. Moreover, the prediction LSE of HA-T2FLS (1.85) is close to the LSE (1.78) of genetic statistical interval-valued fuzzy system in [1]. This affirms the accuracy of HA-T2FLS in prediction problems. In the future, we plan to optimize the parameters fm(c), fm(hi) of hedge algebra to minimize the errors in order to solve various problems and thereby to get better results.

REFERENCES

[1] Yu Qiu, Yan-Qing Zhang and Yichuan Zhao, “Statistical Genetic Interval-valued Fuzzy Systems with Prediction in Clinical Trials,” IEEE International Conference on Granular Computing, pp. 129-132, 2007.

[2] Tran Dinh Khang, Dinh Khac Dzung, “Inference with Hedge Algebra based type-2 Fuzzy Sets,” Journal of Computer Science and Cybernetics, Vol. 19, Issue 1, pp. 28-43, 2003. (in Vietnamese).

[3] Phan Anh Phong, Tran Dinh Khang, Dinh Khac Dong, “Fuzzy Modeling on Hedge Algebra based type-2 Fuzzy Sets,” in Proceedings of the 4th National Symposium on Research, Development and Application of Information and Communication Technology (ICT.rda), Hanoi, Vietnam, pp. 103-111, August 2008. (in Vietnamese).

[4] N.C. Ho, L.H. Chau, T.D. Khang and H.V. Nam, “Hedge Algebras, Linguistic-valued Logic and their Application to Fuzzy Reasoning,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, Vol. 7, No. 4, pp. 347-361, 1999.

[5] N.C. Ho and W. Wechler, “Extended hedge algebras and their application to Fuzzy logic,” Fuzzy Sets and Systems, Vol. 52, pp. 259-281, 1992.

[6] N.N. Karnik and J.M. Mendel, “Centroid of a Type-2 Fuzzy Set,” Information Sciences, Vol. 132, pp. 195-220, 2001.

[7] J.M. Mendel, Uncertain Rule-Based Fuzzy Logic Systems: Introduction and New Directions. Upper Saddle River, NJ: Prentice-Hall, 2001.

[8] N.N. Karnik and J.M. Mendel, “Applications of type-2 fuzzy logic systems to forecasting of time-series,” Information Sciences, Vol. 120, pp. 89-111, 1999.

[9] M.H. Fazel Zarandi, B. Rezaee, I.B. Turksen, E. Neshat, “A type-2 fuzzy rule-based expert system model for stock price analysis,” Expert Systems with Applications, Vol. 36, No. 1, pp. 139-154, 2009.

[10] Q. Liang and J.M. Mendel, “Interval type-2 fuzzy logic systems: Theory and Design,” IEEE Trans. Fuzzy Systems, Vol. 8, No. 5, pp. 535-550, 2000.

[11] R. John and S. Coupland, “Type-2 Fuzzy Logic: A Historical View,” IEEE Computational Intelligence Magazine, Vol. 2, Issue 1, pp. 57-62, 2007.

[12] Melanie Mitchell, An Introduction to Genetic Algorithms, The MIT Press, 1999.

[13] O. Castillo and P. Melin, Type-2 Fuzzy Logic: Theory and Applications, STUDFUZZ 223, Springer-Verlag Berlin Heidelberg, 2008.

[14] SAS/STAT 9.2 User's Guide, The PHREG Procedure, 2008.
