The key technology research of web service selection

October 2012, 19(Suppl. 2): 104–108 www.sciencedirect.com/science/journal/10058885 http://jcupt.xsw.bupt.cn

The Journal of China Universities of Posts and Telecommunications

The key technology research of web service selection

MA Lin (�), SONG Jun-de, TONG Jun-jie

School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

In recent years, web service is the hotspot in the academia and industry research. How to select the service to meet user needs accurately and efficiently, has become one of the key problems, which are restricting the application and development of web service. In this paper, firstly it introduces the outline of the web service selection; and then, generalizes and summaries the present related research of web service selection , from the outline of web service selection, the key technology of service selection and selection experimental methods and evaluation parameters; finally, points out the experimental methods and evaluation parameters of web service selection.

Keywords web service selection, web service registry, web service search, web service authentication

1 Introduction �

Web service as a self-contained, modular, loosely coupled distributed computing model, is widespread concern in academia and industry [1]. Web service is described as the basic functional unit which is published, found and called [2]. In recent years, the number of web service is increasing. However, with the increasing popularity and the rapid growth of web service, how to quickly and accurately select web service to meet users’ need, becomes one of the key constraints for web service development, and became the hot spot of academic research [3–4].

Fig. 1 shows the standard web service architecture model [2]. The model contains three roles: service provider (SP), service registry and service requestor. SPs publish service, and respond to service requests. Service registration center has released the web service, their classification, and provide the search service. Service requestor uses the registry to find the service, and then bind with the service to use the service. It shows that service selection plays an important role in the entire process of using the service, and provided a guarantee for

Received date: 29-06-2012 Corresponding author: MA Lin, E-mail: [email protected] DOI: 10.1016/S1005-8885(11)60436-6

the smooth conduct of follow-up to the service call, service portfolio, service processes.

Fig. 1 Web service architecture model [2]

2 The outline of web service selection

The definition of the web service selection has been very broad, and it is difficult to distinguish service selection, service discovery, and service matching.

Decker et al. [5] describe web service selection as the process for matching, for the first time. That is the process which the service requester seek the appropriate SP, through intermediate agents.

Booth et al. [6] defined web service selection as “the process of web service location by the description, to meet specific functional requirements”.

Web service as the distribution, independent of the basic computational unit, have a strong heterogeneity between different service. It includes: technical heterogeneity (different platform or data format), topological heterogeneity (service in specific areas and concepts) and practical

Supplement 2 MA Lin, et al. / The key technology research of web service selection 105

heterogeneity (different development of specific area process and different concepts of specific area tasks). The fundamental purpose of the web service selection is better to shield the heterogeneity between service, before the service calls and service composition.

Through extensive and in-depth study, we believe that web service selection should be included in the set to meet the release of web service and web service calls and to ensure that the web service availability and credibility of the mechanism. This paper discusses the key technologies of web service selection from three aspects of the service registry, service search and service verification, and introduces the related laboratory methods and evaluation parameters.

3 The key technology of web service selection

3.1 Web service registry technology

The software as a component of the idea of reuse has generated nearly 40 years [7]. With the birth of the web service, the scale and intensity of reuse is also the deepening. The software is gradually showing the development trend of the distribution and network. Web service selection mechanism is to protect web service to meet the vast array of applications, the basis of fully used. However, the experience has demonstrated that the artificial management of this registration information is doomed to failure, in years [8–10]. The main reasons are

1) Declining the quality of the content make the registration unavailable, ultimately.

2) Huge amount of content stored leads to difficult to manage and maintain.

Recently, there are three main ways for web service registration: centralized registered type, distributed registered type and distribution of non-registered type. Web service registration determines the web service publishment and deployment, and plays a great impact on the efficiency of the service selection. The main differences of the three types: the registered type preserves and maintains service through a specific protocol and description, however the non-registered type through the specific way of service.

3.1.1 Centralized registered type

The centralized registered type is the most fundamental and traditional web service registry way. In this way, the SP

need register web service information in the central directory server, in the release of web service. Users obtain a list of web service by querying a central directory server, filter the SPs to meet the requirements from this information, and then access the SP to call the web service. The universal description discovery and integration (UDDI)-based approach [11] is the centralized registered-type structure, which UDDI business registry (UBR) [12] plays the role of the directory server

The major shortcoming of the centralized registered type is single point of failure and the lack of scalability. To solve the problem, scholars put forward the concept of UDDI cloud and joint UDDI [13]. It is mainly centralized organization available UDDI nodes to form a service, and like a node in the work outside

Tewari et al. [14] expanded the existing service-oriented architecture (SOA) architecture, register service through multiple UBR nodes as a distributed storage, manage distributed nodes using Internet protocol (IP) table, to overcome a single point of failure of the UDDI registry server.

Wang et al. [15] proposed peer-to-peer (P2P) overlay network to solve the traditional the UDDI single point, established the field of topology by the service classification. The field nodes make up distributed chord network, and locate the field node through keyword matching.

Kashani et al. [16] proposed content-based similarity P2P networks (QDN). To solve the single sign-on problem, It takes the idea of P2P, and multiple UDDI server to work together. The research focus is organization way (centralized, fully distributed and so on), node organization way (classification of service, content similarity, functional similarity and so on), node query (keyword matching, semantic matching and so on).

3.1.2 Distributed registered type

The distributed registered type is the distribution of registration center server according to a certain topology connected together to form the union of the registration center [17–18]. The registration centers worked together to complete the web service publishing and lookup. In this way, web service information is distributed in more than one registration center server, the service request based on a certain routing protocol and positioning mechanism to be forwarded to the appropriate registration center server to

106 The Journal of China Universities of Posts and Telecommunications 2012

match. The typical representative is METEOR-S [18], which uses a registered body UDDI registry based on the areas of classification, and the web service register in accordance with their respective areas in the appropriate registration center in order to locate and query.

The major shortcoming of the distributed registered type is that fixed classification to bring the low accuracy of the search. There are multiple solutions which is inconvenient to browse and search by fixed classification of web service. It uses labels the level of classification directly and lightweight [19–20], but the labels have the problems of unstructured information garbage, semantic ambiguity and semantic lost; It makes use of data mining or text mining clustering algorithm to classify.

3.1.3 Distributed non-registered type

The most effective way to achieve the distributed non-registered type is based on P2P networks, there is no need to set the central directory server, and the service is completely decentralized in a peer node, in this environment.

Unstructured P2P networks uses random graph of the organization to form a loose network. When the web service is queried, the query request is the flooding spread in the network. The node which receives the request will use the matching algorithm to query the web service of information stored locally to match, and then returned to the requesting node to match the results and the next step forwards queries. Compared with the unstructured P2P network, structured P2P network has strong scalability. It uses the distributed Hash table (DHT)-based distribution of discovery and routing algorithms. Web service query keyword is the only distributed hash function mapped to the storage node of the keyword, and then establish a connection to the node through some specific routing algorithm. Structured P2P network limitation is that only support exact match, do not support fuzzy matching and complex queries. Lausen proposed a reservation address properties to lower dimensional indexing mechanisms: the use of space-filling curve multidimensional keywords are mapped to physical nodes, the one-dimensional index space, in order to support complex queries with wildcards and partial keywords.

3.2 Web service search technology

Service search process is to allow the user needs and

available service that match the process. The process includes the extraction of service similarity, the Match of user requests and the sort and recommended of service.

3.2.1 The extraction of service similarity

At present, classification of web service include: fixed classification, label the level of classification and various cluster. Web service feature extraction and classification need web service-based similarity. When it is non-fixed classification, web service feature is extracted as a classification parameter to establish the relationship similarity comparison, firstly. Similar extraction for web service mainly focus on the following areas:

The similarity based web service functions: Khalid and other people form the final normalized similarity measure by extracting five specific properties from the web services description language (WSDL) document, and use quality threshold algorithm to cluster. Qi et al. extracted the feature of the service and operations (operation is the function of service), turned the two co-occurrence matrix into two chart, and then clustered by singular value decomposition (SVD) method.

The similarity based web service messages or behavior: mealy service model, business process execution language (BPEL) and web service choreography interface (WSCI) model are the model based on message; Roman model, problem statement language (PSL), CTR-S and ontology web language for services (OWL-S) are the model based on events or behavior. In addition, Shen et al. proposed the mixed model of message and behavior, expanded the IOPR model, associated service message with behavior, and form non-deterministic finite automata.

The similarity based web service operations: operation is modeled as term-bag, found by the similarity between the term-bags. To improve accuracy, the similar words are clustered of for the concept, and the concept is used for comparison and matching of the operation. It has some shortcomings: the operations and parameters are defined limitedly in the WSDL file; characteristic dimension is formatted more higher; it is likely to vary in function which web service operation contains similar words.

In addition, Zhao at el. proposed auto-discovery algorithm based on semantic and structural similarity to the service relationship, and semantic similarity calculated by the cosine similarity.

Supplement 2 MA Lin, et al. / The key technology research of web service selection 107

3.2.2 The match of service demands

The service demands match is consists of two main parts: match between web service, match between web service and user request. At present, there are two main service matching way: one is based on keyword match, the other is based on semantic match.

The match based on key-word: keyword matching is generally the basis of the classification of web service. It provides keyword-based Web service search in UDDI; Schmidt proposed a classification search based on keyword matching way, when field node is searched in the P2P overlay network. Search based on keyword matching is lower in accuracy and callback rate. It is generally combination based on keyword matching with other matching.

The match based on semantic: in classification of web service, it uses of the term-bags to create vector machine, based on creating a web service similarity model most commonly. Web service is calculated in semantic similarity and it is as a basis for matching, including: cosine similarity, NGD algorithm, term frequency–inverse document frequency (TF/IDF) weight matrices, match maker and so on. In addition, there are also WSIL, OWL-S and semantic annotation of WSDL semantic matching.

With the rise of the semantic web, it becomes a hot by constructing a semantic network to facilitate the discovery of web service. Luo et al. achieve that automated ontology description builds the ALN association link network by E-FCM. Web services are organized for C-ALN network, with a shorter path to facilitate the rapid selection and efficient recommended. At the same time, there is the semantic web in the P2P overlay network layer. The matching of the field node is based on the key-words in the P2P overlay network, yet, the matching in the semantic web is the communication by Gnutalla protocol and reasoning based on first order logic.

3.3 Web service verification technology

In order to ensure that the search to the web service is available, it needs to verify the availability of web service from the following aspects: the trigger timing of the availability of monitoring (search and non-search), the location of the monitoring (server and client), test data(server-side data and client runtime data) and so on.

Hamill proposed a built-in execution engine. The data

which is generated in the process of users testing is stored in the database as the test data. There are two ways of service testing ways. One is the way that users take the initiative to the implementation of trigger; the other is background periodically trigger. When service is unavailable in the two trigger case, the cycle of background periodic monitoring will reduce, the testing frequency will increase. When the number of failures reaches a certain value, the web service will be considered invalid and deleted.

Schmidt proposed that the service search results are tested in the usability to ensure the web service availability, before the search results is returned by the validate the WS components.

4 The experimental methods and evaluation parameters

4.1 Experimental methods

It is mainly summary on two aspects : the present experimental data collection and the semantic matching tools.

In the present experimental data collection, there are two ways to collect the service data. One is the search engine and some service publishing sites, for example, binding point, salcentral, xmethod, Webservicex, Webservicelist and so on. It downloads the WSDL files in them, due to the lack of OWL-S; The other is to access the UDDI registry center, get the WSDL data. There are mainly three ways: SOAP application program interface (API) based on Java, UDDI client API based on Java and JAXR (register API based on Java extensive makeup language (XML)).

In the semantic matching, there are WVTool (Java) which is a word tool, and porter stemmer which is used to word form conversion and word pretreatment.

4.2 Evaluation parameters

The main evaluation parameters of web service selection include: accuracy rate ( accuracyR ) and callback rate ( callbackR ), in addition to scalability, and so on.

The accuracy rate is the proportion of true association web service, in web service selection; the callback rate is the return rate of the associated service, in web service selection.

Supposed for a particular web service option to request, ‘relevant’ represents the collection of related web service;

108 The Journal of China Universities of Posts and Telecommunications 2012

‘retrieved’ represents the result set selected by the web service:

accuracy(relevant retrived)=

retrivedR � (1)

callback(relevant retrived)=

retrivedR � (2)

5 Conclusions

Web service selection is the basis of service call, service composition and service reuse, also is the key to widespread use of web service. This paper generalizes and summaries the present related research of web service selection, from the outline of web service selection, the key technology of service selection and selection experimental methods and evaluation parameters. With the continuous deepening of the development of web service selection, academic research for the web service selection change to semantic and network. The semantic makes the service meet the need of the specific users; the network is considered web service as a whole, enhance the efficiency and accuracy of web service selection ,and the same time, provide the basic structure of support for web service composition.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (60973110), the Natural Science Foundation of Beijing City of China (4102059), the Major Projects of Ministry of Industry and Information Technology (2010ZX03006-002-03, 2011ZX03005-005), the Guangdong Chinese Academy of Science Comprehensive Strategic Cooperation Project (2011A090100016), Jiangsu Provincial Science and Technology Support Program (BE2010017), the National Internet of Things Development Special Fund Project (2011-05), Tianjin Binhai New Area Science Little Giant Enterprises Growth Plan (2011-XJR12009).

References

1. Papazoglou M P, Traverso P, Dustdar S, et al. Service-oriented computing: state of the art and research challenges. Computer, 2007, 3: 38�45

2. Gu N, Liu J M, Chai X L, et al. Web service principles and R&D practice. Beijing: China Machine Press, 2006

3. Universal description discovery and integration [EB/OL]. [2010-01-10]. http://www.uddi.org

4. Publishing and discovering web service with DISCO and UDDI [EB/OL]. [2010-01-10]. http://msdn.microsoft.com/ msdnmag/issues/02/02/xml/

5. Decker K, Sycara K, Williamson M. Middle-agents for the Internet. 15th IJCAI, Nagoya, Japan. 1997: 578�583

6. Booth D, Haas H, McCabe F, et al. Web service architecture. W3C WG Note. http://www.w3.org/TR/ws-arch/

7. Atkinson C, Stoll D, Acker H, et al. Separating per-client and pan-client views in service specification. The Int Workshop on Service Oriented Software Engineering (IW-SOSE), Shanghai, China. May 2006

8. Fafchamps D. Organizational factors and reuse. IEEE Software, 1994, 11(5) 9. Frakes W B, Fox C J. Quality improvement using a software reuse failure

modes model. IEEE Transactionson Software Engineering, 1996, 22(4) 10. Morisio M, Ezran M, Tully C. Success and failure factors in software reuse.

IEEE Transactions on Software Engineering, 2002, 28(4) 11. UDDI Specification v3.02, Universal description discovery and integration,

http://www.oasisopen.org/committees/uddi-spec/doc/spec/v3/uddi-v3.0.2-20041019.htm, Apr 2007

12. Overhage S. On specifying web service using UDDI improvements. 3rd Annual International Conference on Object-Oriented and Internet-based Technologies, Concepts, and Applicationsfor a Networked World Net. ObjectDays, Germany, 2002

13. Rompothong P, Senivongse T W. A query federation of UDDI registries. ISICT, 2003

14. Tewai V, Dadgee N, Singh I, et al. An improved discovery engine for efficient and intelligent discovery of web service with publication facility. Service- , 2009: 63�70

15. Wang Z Q, Hu Y Y. An approach for semantic web service discovery based on P2P network. WiCOM’08: 1�4

16. Kashani F B, Chen C C, Hahabi D S. WSPS: web service peer-to-peer discovery service. 4th International Conference on Internet Computing, Jun, 2004: 733�743

17. Ballinger K, Brittenharn P, Malhotra A, et al. Web services inspection language (WS-inspection) 1.0 [EB/OL]. [2010-01-10]. ftp://www6.software. ibm.com/software/de-veloper/library/ws-wsilspec.pdf

18. Zhang L J, Zhou Q, Chao T. A dynamic services discovery framework for traversing web service representation chain. International Conference on Web Service (ICWS’04), 2004

19. Vanderlei T, Durão F, Martins A, et al. A cooperative classification mechanism for search and retrieval of software components. ACM Symposium on Applied Computing (SAC), Information Retrieval Track, Seoul, Korea, 2007

20. Chunkmol U, Benharkat A N, Amghar Y. Enhancing web service discovery by using collaborative tagging system. International Journal of Web Science Practice, 2008, 3(3�4): 129�135

Documents

The key technology research of web service selection