
[IEEE 2014 International Conference on Computer and Information Sciences (ICCOINS) - Kuala Lumpur, Malaysia (2014.6.3-2014.6.5)] 2014 International Conference on Computer and Information


978-1-4799-0059-6/13/$31.00 ©2014 IEEE

Experimental Study on the Effective Range of FCM’s Fuzzifier Values for Web Services’ QoS Data

Mohd Hilmi Hasan, Jafreezal Jaafar and Mohd Fadzil Hassan

Computer and Information Sciences Department, Universiti Teknologi PETRONAS, 31750, Tronoh, Perak, Malaysia

Abstract—The work presented in this paper is part of the development of a fuzzy Interval Type-2 (IT2)-based system. An IT2 system contains fuzzy type-2 membership functions that can be generated using a pair of values of the FCM fuzzifier, m. Evidence shows that the effective range of m values is influenced by the underlying dataset. Hence, the objective of this paper is to present an experimental study on finding the effective range of m values for web services’ QoS data. The study aimed to identify the range of m values that successfully generated Gaussian membership functions. As proposed by previous work, the experiment was carried out on m values in the range of 1.4 to 2.6. The work involved the datasets of three QoS parameters. The results showed that two of the datasets have an effective range of 1.7 to 2.6, while the third has an effective range of 1.6 to 2.6.

Keywords—fuzzifier, FCM, fuzzy weighting exponent, m values

I. INTRODUCTION

Web services’ quality of service (QoS) monitoring has become an important procedure in the web services environment. QoS monitoring is related to non-functional aspects; hence, it determines how well a service carries out its tasks [1, 2]. Furthermore, QoS monitoring can also be used to detect the existence of problems [3] as well as to decide whether or not to continue subscribing to a particular service [4].

In relation to that, we proposed a fuzzy-based model for web services’ QoS monitoring using fuzzy interval type-2 (IT2) [5, 6]. The main reason for proposing an IT2-based monitoring system was to allow customers to specify their QoS requirements using linguistic definitions instead of exact numeric data. This can overcome the issues of inaccurate monitoring results [7] and of unrealistically specifying QoS requirements as exact values despite the uncertain nature of web services [8]. Additionally, customers in general do not know the exact numerical QoS values to be specified in their agreement [9].

From another perspective, IT2 was proposed instead of fuzzy type-1 because the former is better at handling uncertainty and vagueness. As reported by [10], one of the three reasons for adopting fuzzy type-2 is to handle more uncertainty and vagueness. Reference [11], on the other hand, compares fuzzy type-1 and type-2 approaches for implementing plant monitoring and diagnostics. The results show that the type-2 approach outperforms the type-1 approach. Furthermore, reference [12] also shows that fuzzy type-2 is better than type-1 in designing control systems.

In our work, the construction of the model’s fuzzy membership functions was based on clustering of actual web services’ QoS data gathered from the network [13, 14]. Although a number of clustering algorithms are available, none of them is capable of optimally clustering all types of data [15]. Hence, in this work, we proposed the use of the Fuzzy C-Means (FCM) algorithm [13, 14]. We chose automatic clustering using FCM over deriving the membership functions from expert knowledge because the latter may result in loss of accuracy [16] and may not always be available [17].

The IT2 membership functions were constructed based on two FCM fuzzifier values, m1 and m2 [18]. The fuzzifier, m, is a parameter of FCM whose value is in the range of 1 to +∞. The importance of the fuzzifier can be perceived in terms of its capability to suppress noise and improve the smoothness of membership functions [19]. Numerous studies in the literature have sought the optimal range of fuzzifier values. Reference [20] proposes m>n/(n-2). In another work, [21] proposes that the fuzzifier value is in the range of 1.25 to 1.75. Moreover, [22] reports that the m value should be between 1.5 and 2.5, with m=2 commonly applied in the FCM algorithm. Furthermore, [23] reports that m values lower than 1.4 and beyond 2.6 do not affect the membership functions. Hence, they propose m values in [1.4, 2.6]. Reference [19] describes finding the optimal range of m values through a theoretical approach instead of the empirical or experimental basis proposed by most of the previous works. However, they are still investigating the relationship between m and the distribution of clusters, which is important for determining the accuracy of the transformation of the rule that is applied in their approach.

In this work, we found that it is important to conduct an experimental study on finding the range of m values for web services’ QoS data. This is because the structure of the dataset influences the m values [19, 24, 25]. The study was conducted by performing FCM clustering on the datasets to generate membership functions. The generated membership functions were then inspected in terms of their shapes; that is, we only chose the m values that generated the expected shape of membership functions. In this work, we limited the membership functions’ shape to Gaussian, as it is easy to represent and computes faster for a small number of rules [26]. Another limitation was that the number of clusters for each dataset was based on our previous works as reported in [13, 14]. These numbers of clusters were required in performing FCM clustering. Moreover, we limited this experimental study to only consider the range of m values of [1.4, 2.6] as proposed by [23].

Therefore, the objective of this paper is to present the experimental study on finding the range of FCM m values for web services’ QoS data. The remainder of this paper is organized as follows: Section II contains the methods used in this work, Section III comprises the results and discussion, and Section IV summarizes the presented work and outlines future work.

II. METHODS

A. Datasets

The conducted experiments were based on the web services’ QoS data provided by [27, 28]. The data were gathered from 365 actual web services using the Web Service Crawler Engine (WSCE). In this work, we used the data of three QoS parameters, namely response time (ms), latency (ms) and availability (%). These data were divided into three separate datasets. Each of the datasets contained 1500 data points.

B. Fuzzy C-Means

The execution of the FCM algorithm requires three inputs, namely the dataset, the number of clusters, c, and the fuzzifier, m. As mentioned earlier, the values of c were based on our previous work, where the optimum number of clusters for each dataset was determined using a clustering validity index. Therefore, c=3 was set for the latency and availability datasets, while c=4 was used for the response time dataset. FCM was executed on each dataset with different values of m, where the tested values were in the range of [1.4, 2.6]. The following paragraphs briefly explain the procedures of the FCM algorithm.

FCM clusters a collection of data into a number of different clusters. Each data point is assigned to every cluster with a different membership degree [29]. FCM involves an iterative process that runs towards minimizing an objective function. Assuming the n data points are represented by {X1, X2, …, Xn} and the number of clusters is represented by c, FCM guesses the center of each cluster, ci, i=1, 2, ..., c, in the initial iteration. This initial set of cluster centers does not represent the optimal clustering condition. Next, FCM computes and assigns to each data point a membership degree for each cluster [30]. The computation is performed based on Eq. (1), where matrix U stores the generated membership degrees uij and dij = ||ci – Xj|| is the Euclidean distance from the jth data point to the ith cluster center.

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{d_{ij}}{d_{kj}} \right)^{2/(m-1)}} \tag{1}$$
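As a small worked illustration of how m shapes the membership degrees produced by Eq. (1), consider a hypothetical case (not taken from the paper's data) with c = 2 clusters and a data point X_j that is twice as far from the second center as from the first, so that d_{1j}/d_{2j} = 1/2. The membership in the first cluster is then

$$u_{1j} = \frac{1}{\left(\frac{d_{1j}}{d_{1j}}\right)^{2/(m-1)} + \left(\frac{d_{1j}}{d_{2j}}\right)^{2/(m-1)}} = \frac{1}{1 + (1/2)^{2/(m-1)}},$$

which gives

$$m = 1.4:\; u_{1j} = \frac{1}{1 + 2^{-5}} \approx 0.970, \qquad m = 2.0:\; u_{1j} = \frac{1}{1 + 2^{-2}} = 0.800, \qquad m = 2.6:\; u_{1j} = \frac{1}{1 + 2^{-1.25}} \approx 0.704.$$

Smaller m values push the memberships towards crisp 0/1 assignments, while larger m values flatten them towards 1/c, which is one way to see why the choice of m affects the shape of the resulting membership functions.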

Then, FCM computes the objective function based on Eq. (2).

$$J(U, c_1, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, d_{ij}^{2} \tag{2}$$

The objective function value decreases with each iteration until it reaches the threshold value. The threshold value is the minimum value of the objective function, which marks the termination of FCM execution [30]. As long as the threshold value is not reached, FCM continues by computing a new set of cluster centers using Eq. (3). The newly generated cluster centers replace the previous values.

$$c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} \, X_j}{\sum_{j=1}^{n} u_{ij}^{m}} \tag{3}$$

Subsequently, the same procedures of computing the membership degrees using Eq. (1), the objective function using Eq. (2) and the cluster centers using Eq. (3) are repeated. These steps execute iteratively until the objective function reaches its minimum value. In each iteration, FCM reduces the value of the objective function and improves the values of the cluster centers and membership degrees. In this work, the clustering experiment using the FCM algorithm was executed using Matlab’s Fuzzy Clustering and Data Analysis Toolbox.
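The iteration described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the Matlab toolbox used in the experiments: the function name `fcm`, the random choice of initial centers from the data, and the stopping rule (stop when the objective changes by less than a hypothetical tolerance `tol`) are assumptions made for the example.

```python
import numpy as np

def fcm(X, c, m, tol=1e-6, max_iter=300, seed=0):
    """Illustrative Fuzzy C-Means. Returns (centers, U) with U of shape (c, n)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat 1-D input as n points of dimension 1
    rng = np.random.default_rng(seed)
    # Assumption: the initial guess picks c distinct data points as centers.
    centers = X[rng.choice(len(X), size=c, replace=False)]
    prev_J = np.inf
    for _ in range(max_iter):
        # d[i, j] = ||c_i - X_j|| (Euclidean), floored to avoid division by zero
        d = np.linalg.norm(centers[:, None, :] - X[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        # Eq. (1): u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1))
        ratio = d[:, None, :] / d[None, :, :]          # ratio[i, k, j] = d_ij / d_kj
        U = 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=1)
        # Eq. (2): J = sum_i sum_j u_ij^m * d_ij^2
        J = np.sum((U ** m) * d ** 2)
        # Assumption: terminate once the objective stops improving.
        if abs(prev_J - J) < tol:
            break
        prev_J = J
        # Eq. (3): c_i = sum_j u_ij^m X_j / sum_j u_ij^m
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
    return centers, U
```

Under these assumptions, the sweep over the tested fuzzifier values could be written as a loop over `np.round(np.arange(1.4, 2.61, 0.1), 1)`, calling `fcm(data, c, m)` once per value with c = 3 or c = 4 as stated above.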

C. Gaussian Membership Functions

The outputs of the FCM algorithm, namely the cluster centers, ci, i=1, 2, ..., c, and matrix U, were used for generating Gaussian membership functions. A Gaussian membership function is constructed based on Eq. (4) as follows:

$$f(x; \sigma, c) = e^{-\frac{(x - c)^{2}}{2\sigma^{2}}} \tag{4}$$

Eq. (4) shows that a Gaussian function requires the values of the cluster center, c, and sigma, σ. This means that a dataset with four clusters requires Eq. (4) to be evaluated four times, each with different values of c and σ. The value of σ can be derived from Eq. (4) with c and matrix U as the inputs. In this work, the Gaussian membership functions were generated using the genfis3 function provided by Matlab.
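To make the derivation of σ concrete: solving Eq. (4) for σ at a point x with membership degree u gives σ = |x − c| / sqrt(−2 ln u). The sketch below applies this inversion point by point and averages the results. The function names and the averaging step are assumptions made for illustration; they are not claimed to reproduce the internals of genfis3.

```python
import numpy as np

def gaussmf(x, sigma, c):
    """Gaussian membership function of Eq. (4)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def estimate_sigma(x, u, c, eps=1e-6):
    """Estimate sigma for one cluster by inverting Eq. (4).

    x : 1-D data values, u : that cluster's row of U, c : its center.
    Assumption: per-point estimates are simply averaged.
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    # Skip points where the inversion is undefined or unstable (u near 0 or 1).
    mask = (u > eps) & (u < 1.0 - eps) & (x != c)
    sigma_each = np.abs(x[mask] - c) / np.sqrt(-2.0 * np.log(u[mask]))
    return float(np.mean(sigma_each))
```

When the membership degrees come from an exact Gaussian, the inversion recovers its σ. Memberships produced by FCM are not exactly Gaussian, so in that case the result is only a fitted approximation.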

III. RESULTS AND DISCUSSION

As stated previously, the objective of the experiment was to find the m values for which the generated membership functions have a Gaussian shape. Fig. 1 shows the membership functions of the response time dataset for m in [1.4, 2.6]. It is apparent that m=1.4 does not produce proper Gaussian membership functions: one of the four clusters forms a horizontal line instead of a Gaussian shape. The fuzzifier value of m=1.5 also does not produce proper Gaussian membership functions as required. The rest of the m values, however, produced Gaussian membership functions, as shown in Fig. 1.

Fig. 2 shows the resulting membership functions for the availability dataset. The fuzzifier value of m=1.4 produced the same horizontal line for one of its three clusters as mentioned before. Furthermore, m=1.5 and m=1.6 also do not produce proper Gaussian membership functions. The other m values successfully generated a Gaussian shape for all three clusters.

Furthermore, Fig. 3 shows the resulting membership functions for the latency dataset using different m values. Patterns similar to those of the availability dataset can be seen, where all fuzzifier values except m=1.4, m=1.5 and m=1.6 produced proper Gaussian membership functions.

Fig. 1. Membership functions of response time (panels: m=1.4 to m=2.6 in steps of 0.1)

Fig. 2. Membership functions of availability (panels: m=1.4 to m=2.6 in steps of 0.1)

Fig. 3. Membership functions of latency

We also extended the experiments by performing the same processes on synthetic datasets. This was because web services’ QoS data are uncertain in nature. Each of the synthetic datasets contained 1500 data points, the same as the three actual QoS datasets. However, the data points in these synthetic datasets were generated by adding random numbers within a certain range to, or subtracting them from, the data points of the actual QoS data. In this work, we prepared synthetic datasets of +/-5 ms, +/-10 ms, +/-50 ms and +/-100 ms from the actual QoS data of response time and latency. The testing datasets for availability were prepared based on +/-0.5%, +/-1%, +/-5% and +/-10% from the actual QoS dataset. Three sets of synthetic datasets, A, B and C, were used in this experiment.
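The perturbation scheme described above could be sketched as follows. The paper does not state the distribution of the random numbers, so drawing them uniformly from [-delta, +delta] is an assumption, as are the function name `perturb` and the use of a seed to distinguish sets A, B and C.

```python
import numpy as np

def perturb(data, delta, seed=None):
    """Return a synthetic copy of `data` with each point shifted by a random
    amount in [-delta, +delta].

    Assumption: the offsets are uniformly distributed; the source only says
    that random numbers within a certain range were added or subtracted.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    noise = rng.uniform(-delta, delta, size=data.shape)
    return data + noise
```

For example, three synthetic sets for a +/-5 ms response time experiment could be produced with `perturb(response_time, 5.0, seed=s)` for three different seeds, where `response_time` is a hypothetical array holding the 1500 actual values.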

Tables I-III show the results of generating FCM-based Gaussian membership functions from the synthetic datasets. The range of m values of [1.6, 2.6] is effective for response time, while [1.7, 2.6] is effective for latency and availability. These results agree with the results produced using the actual QoS datasets as described earlier.

TABLE I. RANGE OF m VALUES FOR RESPONSE TIME USING SYNTHETIC DATASETS

| Dataset | +/-5 ms | +/-10 ms | +/-50 ms | +/-100 ms |
|---|---|---|---|---|
| A | 1.6-2.6 | 1.6-2.6 | 1.6-2.6 | 1.6-2.6 |
| B | 1.6-2.6 | 1.6-2.6 | 1.6-2.6 | 1.6-2.6 |
| C | 1.6-2.6 | 1.6-2.6 | 1.6-2.6 | 1.6-2.6 |


TABLE II. RANGE OF m VALUES FOR LATENCY USING SYNTHETIC DATASETS

| Dataset | +/-5 ms | +/-10 ms | +/-50 ms | +/-100 ms |
|---|---|---|---|---|
| A | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 |
| B | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 |
| C | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 |

TABLE III. RANGE OF m VALUES FOR AVAILABILITY USING SYNTHETIC DATASETS

| Dataset | +/-0.5% | +/-1% | +/-5% | +/-10% |
|---|---|---|---|---|
| A | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 |
| B | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 |
| C | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 | 1.7-2.6 |

IV. CONCLUSION

In this work, we studied the effective range of values of the FCM fuzzifier, m, for web services’ QoS data. The work was part of our main goal of producing a web services’ QoS monitoring model using fuzzy Interval Type-2. Based on the conducted experiments, we found that the effective range of m values for response time was [1.6, 2.6]. Meanwhile, the effective range for availability and latency was [1.7, 2.6].

It is important to note that the results of this study do not represent the effective range of m values for general execution of FCM. Instead, we were only interested in studying the effect of m values in producing Gaussian membership functions as required by our proposed web services’ QoS monitoring model. Moreover, the experiments were specifically conducted on web services’ QoS data. Hence, different results may be produced for other data.

For future work, the identified ranges of m values will be used to generate fuzzy Interval Type-2 membership functions. This is one of the major tasks required to develop the proposed web services’ QoS monitoring model.

REFERENCES

[1] L. Chung and J. C. Sampaio do Prado Leite, "On non-functional requirements in software engineering," Lecture Notes in Computer Science, vol. 5600/2009, pp. 363-379, 2009.

[2] G. D. Modica, et al., "Dynamic SLAs management in service oriented environments," J. Syst. Softw., vol. 82, pp. 759-771, 2009.

[3] A. Leff, et al., "Service-Level Agreements and Commercial Grids," IEEE Internet Computing, vol. 7, pp. 44-50, 2003.

[4] M. H. Zadeh and M. A. Seyyedi, "QoS Monitoring for Web Services by Time Series Forecasting," in 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), 2010, pp. 659-663.

[5] M. Hasan, et al., "Monitoring web services’ quality of service: a literature review," Artificial Intelligence Review, pp. 1-16, 2012.

[6] M. H. Hasan, et al., "A Review on Monitoring Vague Quality of Service (QoS) Compliance for Web Services," in 2012 International Conference on Computer and Information Sciences (ICCIS2012), 2012.

[7] D. Allenotor and R. K. Thulasiram, "A fuzzy grid-QoS framework for obtaining higher grid resources availability," in Proceedings of the 3rd international conference on Advances in grid and pervasive computing, Kunming, China, 2008, pp. 128-139.

[8] S. Rosario, et al., "Probabilistic QoS and Soft Contracts for Transaction-Based Web Services Orchestrations," IEEE Trans. Serv. Comput., vol. 1, pp. 187-200, 2008.

[9] D. Mobedpour and C. Ding, "User-centered design of a QoS-based web service selection system," pp. 1-11, 2011.

[10] T. Dereli, et al., "Industrial Applications of Type-2 Fuzzy Sets and Systems: A Concise Review," Computers in Industry, vol. 62, pp. 125-137, 2010.

[11] O. Castillo and P. Melin, "A new hybrid approach for plant monitoring and diagnostics combining type-2 fuzzy logic and fractal theory," in Fuzzy Information Processing Society, 2002. Proceedings. NAFIPS. 2002 Annual Meeting of the North American, 2002, pp. 111-116.

[12] R. Sepulveda, et al., "Experimental study of intelligent controllers under uncertainty using type-1 and type-2 fuzzy logic," Inf. Sci., vol. 177, pp. 2023-2048, 2007.

[13] M. H. Hasan, et al., "Fuzzy-based Clustering of Web Services’ Quality of Service: A Review," Journal of Communications, vol. 9, pp. 81-90, 2014.

[14] M. H. Hasan, et al., "Development of Web Services Fuzzy Quality Models using Data Clustering Approach," in The First International Conference on Advanced Data and Information Engineering Kuala Lumpur, 2013.

[15] S. Vega-Pons and J. Ruiz-Shulcloper, "A survey of clustering ensemble algorithms," International Journal of Pattern Recognition and Artificial Intelligence, vol. 25, pp. 337-372, 2011.

[16] S. Guillaume, "Designing Fuzzy Inference Systems from Data: An Interpretability-Oriented Review," IEEE Transactions on Fuzzy Systems, vol. 9, pp. 426-443, 2001.

[17] J.-S. R. Jang, "Self-Learning Fuzzy Controllers Based on Temporal Back Propagation," IEEE Transactions on Neural Networks, vol. 3, pp. 714-723, 1992.

[18] B.-I. Choi and F. C.-H. Rhee, "Interval type-2 fuzzy membership function generation methods for pattern recognition," Information Sciences, vol. 179, pp. 2102–2122, 2009.

[19] M. Huang, et al., "The range of the value for the fuzzifier of the fuzzy c-means algorithm," Pattern Recognition Letters, vol. 33, pp. 2280–2284, 2012.

[20] J. C. Bezdek and R. J. Hathaway, "Convergence and theory for fuzzy c-means clustering: counterexamples and repairs," IEEE Trans. Pattern Anal., vol. 17, pp. 873–877, 1987.

[21] K. P. Chan and Y. S. Cheung, "Clustering of clusters," Pattern Recognition Letters, vol. 25, pp. 211–217, 1992.

[22] N. R. Pal and J. C. Bezdek, "On cluster validity for the fuzzy c-means model," IEEE Transactions on Fuzzy Systems, vol. 3, pp. 370-379, 1995.

[23] I. Ozkan and I. B. Turksen, "Upper and lower values for the level of fuzziness in FCM," Information Sciences, vol. 177, pp. 5143–5152, 2007.

[24] C. Hwang and F. C.-H. Rhee, "Uncertain fuzzy clustering: interval type-2 fuzzy approach to C-means," IEEE Trans. Fuzzy Systems, vol. 15, pp. 107–120, 2007.

[25] Y. Jian, et al., "Analysis of the weighting exponent in the FCM," IEEE Trans. Systems Man. Cybernet. C. Part B, vol. 34, pp. 634–639, 2004.

[26] D. Wu, "An Overview of Alternative Type-Reduction Approaches for Reducing the Computational Cost of Interval Type-2 Fuzzy Logic Controllers," in IEEE World Congress on Computational Intelligence, 2012.

[27] E. Al-Masri and Q. H. Mahmoud, "Discovering the best web service," in 16th International Conference on World Wide Web (WWW) (poster) 2007, pp. 1257-1258.

[28] E. Al-Masri and Q. H. Mahmoud, "QoS-based Discovery and Ranking of Web Services," in IEEE 16th International Conference on Computer Communications and Networks (ICCCN), 2007, pp. 529-534.

[29] L. Wang and J. Wang, "Feature Weighting fuzzy clustering integrating rough sets and shadowed sets," International Journal of Pattern Recognition and Artificial Intelligence, vol. 26, 2012.

[30] H. Guldemır and A. Sengur, "Comparison of clustering algorithms for analog modulation classification," Expert Systems with Applications, vol. 30, pp. 642-649, 2006.