
[IEEE 2014 14th International Conference on Innovations for Community Services (I4CS) - Reims, France (2014.6.4-2014.6.6)] 2014 14th International Conference on Innovations for Community


A self-adaptive structuring for P2P-based Grid

Bassirou Gueye¹,², Olivier Flauzac¹, Cyril Rabat¹, Ibrahima Niang²

¹ CReSTIC, UFR Sciences Exactes et Naturelles, Université de Reims Champagne Ardenne, FRANCE
[email protected], {olivier.flauzac, cyril.rabat}@univ-reims.fr

² LID, Département de Mathématiques et d’Informatique, Université Cheikh Anta Diop, SENEGAL
[email protected], [email protected]

Abstract—Grids that use the concept of services are generally based on highly centralized hierarchical architectures. The main consequence of this centralization is the unified management of resources, but it makes it difficult to react rapidly to failures that can affect grid users. Therefore, in our previous work, we proposed a specification called P2P4GS for service management in a grid-computing environment based on the peer-to-peer paradigm. In this approach, all nodes can participate in the deployment and discovery processes for a given service. In addition, each node maintains a table called the “Service Registry”, which lists the services owned by this node as well as the other services located inside the grid and learned during a discovery process. However, the growth of distributed systems, in terms of number of nodes, services and users, raises the question of scalability. In this paper, we propose to limit the knowledge about the location of grid services to some nodes that we call ISP (Information System Proxy). Around each ISP, we form a community constituted by a set of nodes of the grid. In order to reduce the ISP overload, on the one hand, we delegate service invocation and execution tasks to nodes called IP (Invocation Proxy). On the other hand, we memorize information about the location of frequently requested services on LP (Location Proxy) nodes.

Keywords-Peer-to-peer network; Grid computing; Web Services; Information Systems.

I. INTRODUCTION

Grid computing is an important field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and high-performance orientation [1][2]. Grids provide a distributed computing paradigm or infrastructure that extends across multiple virtual organizations (VO), where each VO can consist of either physically distributed institutions or logically related projects/groups. The goal of such a paradigm is to enable federated resource sharing in dynamic and distributed environments. Indeed, end-users can access large computing and storage resources that they could not operate otherwise.

The emergence of Web Services [3] has provided a framework that initiated their alignment with grid computing technologies as well as cloud computing [4]. This convergence led to the appearance, on the one hand, of grid services [5][6] and, on the other hand, of cloud services [7]. Indeed, grid services are the result of research conducted by the OGF¹ (Open Grid Forum), which led to the OGSA (Open Grid Service Architecture) [5] and the OGSI (Open Grid Service Infrastructure) [6]. These grid services have made it possible to use resources more rationally by leveraging load-balancing mechanisms between nodes. A particular effort was made to standardize grid services. As with Web Services, the goal is to gather resources in order to promote the development of applications that use grid-based services.

Nevertheless, grids that use the concept of services are generally based on highly centralized hierarchical architectures [2][8]. This centralization provides a unified management of resources, but makes it difficult to react promptly to failures and faults that affect the community (the community designates the set of nodes of the grid).

To overcome these drawbacks, solutions that combine grid computing and peer-to-peer systems have been proposed [1][9][10][11]. Peer-to-peer (P2P) systems are distributed systems consisting of interconnected nodes able to self-organize into network topologies. Their purpose is to share resources such as computational power, storage, files, applications, or network bandwidth. These systems are capable of adapting to failures and accommodating transient populations of nodes while maintaining acceptable connectivity and performance. They do not require the intermediation or support of a global centralized server or authority [12]. Grid and P2P systems are both concerned with the pooling and coordinated use of resources within distributed communities and are constructed as overlay structures that operate largely independently of institutional relationships [1]. However, no semantics of the service execution chain exists in the literature. Most of the proposed solutions for P2P grids are limited to resource discovery and are generally distinguished by the type of search algorithm used.

For this purpose, we proposed in our previous work a specification called P2P4GS (Peer-To-Peer For Grid Services) for service management in a grid-computing environment based on the peer-to-peer paradigm [13]. The proposed specification is generic: any peer-to-peer protocol can be combined with any service execution platform. In this approach, all nodes can participate in the deployment and discovery processes for a given service. In addition, each node maintains a table called the “Service Registry”, which lists the services owned by this node as well as the other services located inside the grid and learned during a discovery process. However, the issue of scalability arises in such distributed environments, where the system size grows in terms of number of nodes, services and users.

¹ http://www.ogf.org/

978-1-4799-5350-9/14/$31.00 ©2014 IEEE

In this paper, we propose to limit the knowledge about the location of grid services to some nodes that we call ISP (Information System Proxy). Around each ISP, we form a community constituted by a set of nodes of the grid. In order to reduce the ISP overload, we delegate service invocation and execution tasks to nodes called IP (Invocation Proxy). In addition, we propose to memorize information about the location of frequently requested services on nodes called LP (Location Proxy).

The remainder of the paper is organized as follows: Section II reviews related work on P2P grid environments. Section III gives an overview of the P2P4GS specification and our contribution. In Section IV, we describe the proposed approach for organizing the P2P4GS community nodes. Section V presents the validation of the proposed approach through simulation. Section VI concludes this paper and presents our future work.

II. RELATED WORK

The remote execution of code has been defined in various ways. The possibility of remotely calling a procedure led to the concept of RPC (Remote Procedure Call). The RPC specification defines how information is exchanged between a consumer (client) and a resource (server). Different implementations that consider different languages, protocols and systems have been proposed. For instance, we can cite the following implementations based on Web Services: XML-RPC and SOAP [3]. The RPC concept has been transferred to the grid either through libraries that enable technical solutions, like Globus [14], or directly through solutions designed to provide grid RPC. However, the proposed solutions are hierarchical and need centralization points that gather the information related to a given service. For example, tools such as DIET [8] must register their requirements (e.g., computational or memory resources, data resources, applications, etc.) on a fixed centralization point.

In addition, in the case of Web Service execution on the Internet, solutions for deployment [6] and resource discovery [15][16] in such environments have been proposed. The authors of [15] proposed a hierarchical topology of web services for resource discovery in Grids based on the Globus Toolkit. In this solution, the Grid system is divided into virtual organizations (VO). In each VO, there are master and slave nodes. The master nodes are responsible for updating the resource information, while the slave nodes are in charge of retrieving resource information from each machine of the Grid environment. On the other hand, Fouad et al. [16] presented a scalable Grid resource discovery approach based on distributed search. The proposed model includes the application layer, which provides a web interface for the user, and the collective layer, which is a web service for discovering resources. The resource discovery model contains the meta-data and resource finder web services to provide a scalable solution for information administration requirements when the grid system expands over the Internet.

Nevertheless, these works are based on hierarchical architectures that have a very low degree of dynamicity and whose infrastructural services are themselves centralized. Indeed, these centralized or hierarchical approaches suffer from a single point of failure, from bottlenecks in highly dynamic systems, and from a lack of scalability in highly distributed environments [17].

Fortunately, Grid and P2P systems share several features and can profitably be integrated. Convergence of the Grid system with the philosophy and techniques of the P2P architecture is a promising approach to alleviate the disadvantages of traditional grid systems. Torkestani [18] asserted that P2P Grids exploit the synergy between the Grid system and the peer-to-peer network to efficiently manage Grid resources and services in large-scale distributed environments. The decentralized approach can enhance scalability and fault tolerance, but it induces very heavy network traffic and may limit search effectiveness. In addition, Marin Perez et al. [19] argue that the ultimate goal of building a P2P Grid is to integrate P2P, Grid systems, and Web Services.

However, most of the proposed solutions for peer-to-peer grids are limited to resource discovery [9][10][20][11][18] and are generally distinguished by the type of search algorithm used. In this sense, the authors of the survey [21] argue that resource discovery (which aims to discover appropriate resources for a requested task) is one of the essential challenges in Grids. Certain factors make the resource discovery problem difficult to solve. For instance, Torkestani [11] proposed a multi-attribute resource discovery algorithm based on distributed learning automata for large-scale peer-to-peer grids. Based on learning automata theory, the author proposed a method that routes the resource query through the path with the minimum expected hop count toward the grid peers holding the requested resources.

On the other hand, the authors of [10] proposed a distributed Grid resource discovery protocol in which the network routers are in charge of the resource discovery process. In this approach, besides the routing table, each router maintains a resource table, which maps IP addresses to the available computing resource values. Discovery packets are encapsulated within TCP/IP packets and look up the resource tables to find the requested resources.


III. BACKGROUND AND CONTRIBUTION

In this section, we first give an overview of our P2P4GS specification [13][22]. Afterwards, we briefly present our contributions.

A. Overview of the P2P4GS specification

The P2P4GS (Peer-To-Peer For Grid Services) specification has the originality of dissociating the peer-to-peer infrastructure from the service management platform. Indeed, we propose to separate the peer-to-peer grid management layer from the service location and runtime layer. The proposed specification is generic: any peer-to-peer protocol can be combined with any service execution platform. Deployment as well as invocation is entirely delegated to the platform. Consequently, both tasks become transparent to the end-user.

We proposed an architecture model composed of four layers. This allows us to define each layer independently. Indeed, each layer solves a number of problems (handles a number of tasks required by the overall system) in order to provide well-defined services to the higher layers. Figure 1 illustrates the architecture of our specification.

[Figure 1 diagram: four stacked layers. Application layer (services and applications such as web, messaging, agenda, e-learning, conference booking); Management layer (deploy, lookup, invoke, exec, save); P2P layer (in, getId, neigh, send, receive, route); Physical layer (addressing, routing, communication). The legends distinguish communication links, P2P links, and the node types Simple Node, Location Proxy, Invocation Proxy and Information System Proxy.]

Figure 1: Architecture of P2P4GS specification

In the following, we describe the different layers of our architecture.

1) Physical layer: this layer provides addressing, routing and communication functions. We model this layer as a directed and connected graph G1 = (V, E1), where V is the set of nodes that host the services and E1 is the set of directed communication links between the nodes. These definitions make it possible to take into account the different characteristics of the network:
• Security elements such as firewalls that restrict the communication possibilities of the nodes;
• Configuration and implementation elements of the network such as NAT.

2) P2P layer: this layer corresponds to the peer-to-peer middleware and provides communication functionalities and maintenance of the topology. The P2P layer is modeled as an undirected and connected graph G2 = (V, E2), where V is the set of nodes that host the services and E2 is the set of undirected communication links between the nodes established by the P2P protocol. In this case, all communication links are bidirectional. We assume that this layer offers at least the following primitives:
• in(): this primitive is executed by a new node when it enters the P2P community;
• getId(): returns the P2P node’s identifier;
• neigh(): retrieves the list of all its P2P neighbors;
• send(id, message): sends a message to a node identified by “id” in the system;
• receive(): receives a message from a node of the system;
• route(id, message): routes a message to a destination.
Therefore, any peer-to-peer system providing these basic features can be exploited by our specification.

3) Management layer: this layer constitutes the core of our specification. All tasks of management, administration and maintenance of a service are defined in this layer. Indeed, nodes are in charge of the local management of the network. We assume that each node has a unique identifier in the communication network. Moreover, nodes collectively ensure the following tasks: the deployment, the life cycle of services, the management of the execution platform, and the management of requests and their execution.

In order to achieve our goal, the P2P4GS specification provides a set of primitives, namely deploy, lookup, invoke, exec and save, for respectively the deployment, localization, invocation, execution and recording of a given service. Services are objects integrated into the execution platforms held by the nodes in the community.

4) Application layer: this is the upper layer, which interfaces with the end-users. The primitives of the underlying layer are used by the different platforms with which they interact in order to provide services to the Application layer. Note that access to P2P grid resources is transparent to the end-user.
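As an illustration, the P2P-layer primitives listed above can be sketched as a Python interface. This is a minimal sketch under stated assumptions, not part of the P2P4GS specification: the method is named in_ because "in" is a reserved word in Python, and InMemoryPeer is a hypothetical toy overlay in which all peers share one dictionary, introduced only so that the interface can be exercised.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Any, Deque, Dict, List, Optional, Set, Tuple


class P2PLayer(ABC):
    """Minimal primitives assumed from the P2P layer (names follow the text)."""

    @abstractmethod
    def in_(self) -> None:
        """in(): executed by a new node when it enters the P2P community."""

    @abstractmethod
    def get_id(self) -> int:
        """getId(): return this node's P2P identifier."""

    @abstractmethod
    def neigh(self) -> List[int]:
        """neigh(): return the identifiers of all P2P neighbors."""

    @abstractmethod
    def send(self, node_id: int, message: Any) -> None:
        """send(id, message): send a message to the node identified by id."""

    @abstractmethod
    def receive(self) -> Optional[Tuple[int, Any]]:
        """receive(): return the next (sender, message) pair, or None."""

    @abstractmethod
    def route(self, node_id: int, message: Any) -> None:
        """route(id, message): route a message towards a destination."""


class InMemoryPeer(P2PLayer):
    """Hypothetical toy peer: every peer lives in one shared dictionary."""

    def __init__(self, node_id: int, network: Dict[int, "InMemoryPeer"],
                 neighbors: Set[int]) -> None:
        self._id = node_id
        self._network = network
        self._neighbors = set(neighbors)
        self._inbox: Deque[Tuple[int, Any]] = deque()

    def in_(self) -> None:
        self._network[self._id] = self
        # Links are bidirectional in the P2P layer, so register both ends.
        for n in self._neighbors:
            if n in self._network:
                self._network[n]._neighbors.add(self._id)

    def get_id(self) -> int:
        return self._id

    def neigh(self) -> List[int]:
        return sorted(n for n in self._neighbors if n in self._network)

    def send(self, node_id: int, message: Any) -> None:
        self._network[node_id]._inbox.append((self._id, message))

    def receive(self) -> Optional[Tuple[int, Any]]:
        return self._inbox.popleft() if self._inbox else None

    def route(self, node_id: int, message: Any) -> None:
        # A real overlay would forward hop by hop; the toy delivers directly.
        self.send(node_id, message)
```

Any overlay that can implement these six methods, whether structured or unstructured, would fit under the management layer in this sketch, which mirrors the genericity claimed for the specification.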


Nevertheless, in this approach, all nodes can participate in the deployment and discovery processes for a given service. This may induce a communication cost in terms of exchanged messages.

In addition, each node maintains a table called the “Service Registry”, which lists the services owned by this node as well as the other services located inside the grid and learned during a discovery process. This may overload the memory used by the nodes’ service registries.

B. Contributions

Given that the size of distributed systems grows in terms of number of nodes, services and users, in order to ensure scalability, we propose to limit the knowledge to some distinguished nodes that we call proxies. In fact, we define three types of proxy nodes: the ISP (Information System Proxy), the IP (Invocation Proxy) and the LP (Location Proxy), to efficiently carry out the above-mentioned primitives (i.e., “deploy”, “lookup”, “invoke”, “exec” and “save”). In keeping with the P2P paradigm, it should be noted that the status of the nodes is not fixed and can evolve during the execution process.

Furthermore, based on these proxy nodes, we propose an approach for organizing the P2P4GS community nodes. In fact, as nodes connect, we progressively elect the ISPs by using the concept of direct neighborhood.

Finally, using the OverSim simulator, we show that our specification can easily be used on top of existing standard P2P protocols.

IV. TOWARDS A SELF-ADAPTIVE STRUCTURING OF THE P2P4GS COMMUNITY NODES

In this section, we first define the node proxy concept. Afterwards, we describe the proposed approach for organizing the P2P4GS community nodes.

A. Specification of the node proxy concept

Since the size of distributed systems grows in terms of number of nodes, services and users, we propose to limit the knowledge about the location of grid services to some nodes that we call ISP (Information System Proxy) in order to ensure scalability. Around each ISP, we form a community constituted by a set of nodes of the grid. Consequently, each ISP node knows the location of all the services hosted in its community. Moreover, ISP nodes record information about the location of services learned during a discovery process and hosted in other communities.

In order to reduce the ISP overload, we delegate service invocation and execution tasks to nodes called IP (Invocation Proxy).

Remark 1. Services can be viewed as different objects that are integrated into the runtime platforms located on each node. A service is characterized by:
• the implementation platform;
• the resources required for its execution (computing resources, data resources, connection resources, etc.);
• the format and constraints of the requested invocation and of the obtained results.

Services do not all have the same execution constraints in terms of CPU, RAM, execution platform, etc. Thus, a node is defined as an IP for a given service if it knows the service’s location and satisfies its implementation constraints. Therefore, an IP node owns the client part of this service (stub). A node that only knows the location of a given service but does not have its stub is called an LP (Location Proxy) for this service. Any node involved in locating a given service becomes an LP for this service. It should be noted that a node is both invocation proxy and location proxy for its own services.

According to the P2P paradigm, the status of the nodes is not fixed and can evolve during the execution process. The default node status is SN (Simple Node).
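The per-service roles described above can be illustrated with a short Python sketch. It is only a sketch under assumptions: the NodeView structure and the status_for function are hypothetical names, holding the stub is taken as evidence that the node satisfies the implementation constraints of the service, and the ISP role, which is defined per community rather than per service, is left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Status(Enum):
    SN = "Simple Node"
    LP = "Location Proxy"
    IP = "Invocation Proxy"


@dataclass
class NodeView:
    """What one node knows about services (hypothetical structure)."""
    owned: Set[str] = field(default_factory=set)            # services hosted locally
    known_locations: Set[str] = field(default_factory=set)  # locations learned
    stubs: Set[str] = field(default_factory=set)            # client parts held


def status_for(node: NodeView, service: str) -> Set[Status]:
    """Return the roles of `node` with respect to `service`."""
    if service in node.owned:
        # A node is both invocation proxy and location proxy for its own services.
        return {Status.IP, Status.LP}
    if service in node.known_locations:
        # Location plus stub makes the node an IP; location alone makes it an LP.
        return {Status.IP} if service in node.stubs else {Status.LP}
    # Default status when nothing is known about the service.
    return {Status.SN}
```

Because the role is computed from what the node currently knows, it changes naturally as knowledge is learned or forgotten, which matches the statement that the status of a node is not fixed.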

Remark 2. In order to avoid overloading the memory of the ISP, IP and LP nodes, we propose to remove the knowledge about a given service at the end of a TTL (Time-To-Live). Accordingly, a service that is rarely requested will not be stored indefinitely. In contrast, a node will always remain an invocation proxy for a frequently requested service, because its TTL is reset after each new invocation. In the case of ISP nodes, this TTL applies only to services known through other ISP nodes.
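The TTL mechanism of Remark 2 can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the class and method names are hypothetical, entries added with add_permanent stand for knowledge that never expires (a node's own services or, for an ISP, the services of its own community), and the clock is injectable so that the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ServiceRegistry:
    """Registry whose learned entries expire after a TTL (illustrative)."""

    def __init__(self, ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._permanent: Dict[str, int] = {}             # service -> host, no expiry
        self._learned: Dict[str, Tuple[int, float]] = {}  # service -> (host, expiry)

    def add_permanent(self, service: str, host: int) -> None:
        self._permanent[service] = host

    def learn(self, service: str, host: int) -> None:
        # Record (or refresh) knowledge acquired during a discovery process.
        self._learned[service] = (host, self._clock() + self._ttl)

    def lookup(self, service: str) -> Optional[int]:
        if service in self._permanent:
            return self._permanent[service]
        entry = self._learned.get(service)
        if entry is None:
            return None
        host, expiry = entry
        if self._clock() >= expiry:
            # The TTL has elapsed: forget the rarely requested service.
            del self._learned[service]
            return None
        return host

    def invoke(self, service: str) -> Optional[int]:
        host = self.lookup(service)
        if host is not None and service in self._learned:
            # Each new invocation resets the TTL of the entry.
            self.learn(service, host)
        return host
```

With this design, a frequently invoked service keeps pushing its expiry time forward and is never dropped, while an entry that is no longer requested disappears at its next lookup after the TTL.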

B. Organization approach of the P2P4GS community nodes

We now describe our approach for organizing the P2P4GS community nodes. As explained earlier, our goal is to limit the knowledge about the location of grid services to the ISP nodes. Since each node of a grid offers the minimal acceptable capacities in terms of CPU, memory and bandwidth, all of them can potentially become ISP. Consequently, a capacity-based election of ISP nodes is not necessary.

Moreover, in a large-scale environment where nodes are geographically dispersed, the network is not formed spontaneously. Therefore, using the concept of direct neighborhood, we propose a progressive ISP election as nodes connect. The neighborhood of a node is established upon first connection and can evolve during the execution process.

The definition of the neighborhood offered by the P2P layer depends on the type of P2P system (structured or unstructured). In addition, some protocols define several types of neighborhoods. This is the case, for example, of the Chord protocol [23], which defines two types of neighbors: sequential neighbors and finger tables.

In order to structure our network, we rely on the direct neighborhood offered by the underlying P2P layer. We define the concept of direct neighborhood as follows: for any pair of nodes (u, v) of the P2P4GS community, u and v are direct neighbors if and only if:

• u can communicate with v and vice versa. This means that there exists a bidirectional communication link between u and v;

• u has v in its list of neighbors and vice versa.
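The two conditions above translate directly into a predicate. In this minimal sketch, the representation is an assumption made for illustration: links is a set of directed pairs (a, b) meaning that a can reach b, and neighbor_lists maps each node to the neighbor list maintained by the P2P layer.

```python
from typing import Dict, Set, Tuple


def are_direct_neighbors(u: int, v: int,
                         links: Set[Tuple[int, int]],
                         neighbor_lists: Dict[int, Set[int]]) -> bool:
    """Check the direct-neighborhood definition for the pair (u, v)."""
    # Condition 1: a bidirectional communication link exists between u and v.
    bidirectional = (u, v) in links and (v, u) in links
    # Condition 2: each node appears in the other's list of neighbors.
    mutual = (v in neighbor_lists.get(u, set())
              and u in neighbor_lists.get(v, set()))
    return bidirectional and mutual
```

A pair that is linked in only one direction, for example because of a firewall or NAT at the physical layer, fails the first condition and is therefore not a direct neighbor pair.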

The following steps describe the creation process of the P2P4GS community:

1) The first node connected to the system (which therefore does not have any neighbors) elects itself as ISP;

2) Any new node that connects to the system checks its neighborhood table:
• If it has at least one ISP as neighbor, it keeps its status (Simple Node). Moreover, if it hosts services, it sends information on these services to its ISP neighbors, which update their Service Registries (see the connection of node 16 illustrated in Figure 2(a));
• Otherwise, it elects itself as ISP and informs its neighbors (see Figure 2(b)).
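The creation steps above can be sketched in a few lines of Python. The Node structure and the connect function are hypothetical names used only for illustration, and message exchanges are replaced by direct updates of the registries, so this is a simplified sketch rather than the protocol itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Node:
    node_id: int
    services: Set[str] = field(default_factory=set)
    neighbors: Set[int] = field(default_factory=set)
    is_isp: bool = False
    registry: Dict[str, int] = field(default_factory=dict)  # service -> host id


def connect(new: Node, community: Dict[int, Node]) -> None:
    """Add `new` to the community; it becomes ISP if it has no ISP neighbor."""
    # Direct neighborhood is mutual, so register the new node on both sides.
    for n in new.neighbors:
        community[n].neighbors.add(new.node_id)
    community[new.node_id] = new
    for s in new.services:
        new.registry[s] = new.node_id

    isp_neighbors = [community[n] for n in new.neighbors if community[n].is_isp]
    if isp_neighbors:
        # Step 2, first case: stay a Simple Node and report hosted services.
        for isp in isp_neighbors:
            for s in new.services:
                isp.registry[s] = new.node_id
    else:
        # Step 1 and step 2, second case: self-election as ISP. The informed
        # neighbors answer with the services they host.
        new.is_isp = True
        for n in new.neighbors:
            for s in community[n].services:
                new.registry[s] = n
```

The first node to connect has an empty neighbor set, so it falls into the second branch and elects itself, which is exactly step 1. The two scenarios of Figure 2 correspond to the two branches.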

In Section V, we illustrate how to create our P2P4GS community on top of a given structured or unstructured peer-to-peer protocol.

Elasticity management in the P2P4GS community:

Given that large-scale systems can be highly dynamic, the disconnection of a node can occur at any time. Indeed, the dynamic nature of peers poses challenges for the communication paradigm. In order to ensure scalability, it is necessary to take node disconnections into account. Thus, depending on the status of the disconnected node, we propose the following approaches.

• Disconnection of a simple node: in this case, its services become unavailable. As noted in Remark 2, if a service remains unavailable until the end of the TTL, its knowledge is deleted from the service registries that recorded it.

• Disconnection of an LP node: in this case, the source node starts a new discovery process if it does not know another LP for the desired service.

• Disconnection of an IP node: same treatment as for a disconnected LP.

• Disconnection of an ISP node: in this case, one of the following events occurs:

– In the case of service localization, processing is the same as for an IP or LP node;

– In the case of service deployment, if the candidate node has other ISP neighbors, it sends its service information to these ISP neighbors, which update their Service Registries. Otherwise, it elects itself as ISP and informs its neighbors.
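Two of the cases above can be sketched in Python. The sketch makes several assumptions that go beyond the text: all names (Peer, pick_proxy, handle_isp_disconnection) are hypothetical, the fallback after an ISP loss is applied eagerly to every former neighbor of the ISP rather than at the next deployment, and former neighbors are processed in increasing identifier order so that the outcome is deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional, Set


@dataclass
class Peer:
    peer_id: int
    services: Set[str] = field(default_factory=set)
    neighbors: Set[int] = field(default_factory=set)
    is_isp: bool = False
    registry: Dict[str, int] = field(default_factory=dict)  # service -> host id


def pick_proxy(service: str, known_proxies: Iterable[int], alive: Set[int],
               discover: Callable[[str], Optional[int]]) -> Optional[int]:
    """LP/IP case: use another known proxy, else start a new discovery."""
    for proxy in known_proxies:
        if proxy in alive:
            return proxy
    return discover(service)


def handle_isp_disconnection(gone: int, community: Dict[int, Peer]) -> List[int]:
    """ISP case: remove `gone`; return the nodes that elected themselves ISP."""
    departed = community.pop(gone)
    promoted: List[int] = []
    for pid in sorted(departed.neighbors):
        peer = community.get(pid)
        if peer is None:
            continue
        peer.neighbors.discard(gone)
        if peer.is_isp:
            continue
        isps = [community[n] for n in peer.neighbors
                if n in community and community[n].is_isp]
        if isps:
            # Other ISP neighbors exist: send them the service information.
            for isp in isps:
                for s in peer.services:
                    isp.registry[s] = pid
        else:
            # No ISP left in the neighborhood: self-election.
            peer.is_isp = True
            promoted.append(pid)
            for s in peer.services:
                peer.registry[s] = pid
            for n in peer.neighbors:
                if n in community:
                    for s in community[n].services:
                        peer.registry[s] = n
    return promoted
```

In this sketch, when two former neighbors of the lost ISP are also neighbors of each other, only the first one processed elects itself, and the second one then reports its services to it.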

[Figure 2(a) diagram: graph of the 16-node community, each node annotated with its owned and registered services.]

(a) Node 16 has as neighbors node 1 (an ISP node) and node 14 (a Simple Node). In this case, node 16 keeps its Simple Node status and sends information about its hosted services (in this example, service S12) to its ISP neighbor. Subsequently, node 1 (the ISP node) updates its Service Registry.

[Figure 2(b) diagram: graph of the 16-node community, each node annotated with its owned and registered services.]

(b) Node 16 has as neighbors nodes 2, 3 and 14, which all have the status “Simple Node”. Since node 16 has no ISP in its neighborhood list, it elects itself as ISP and informs its neighbors. The latter (i.e., nodes 2, 3 and 14) send node 16 the information about their hosted services (in this example, service S1, service S3, and services S10 and S11, hosted respectively by nodes 2, 3 and 14). Thereafter, node 16 updates its registry table.

Legend: Simple Node; Location Proxy; Invocation Proxy; Information System Proxy; owned service; registered service.

Figure 2: Connection of the node 16 in the P2P4GS community


V. SIMULATION EXPERIMENTS

In this section, we show by simulation that our specification can easily be used on top of existing standard P2P protocols. There are several simulation frameworks for peer-to-peer networks. In this study, we focus on the OverSim simulator [24]. OverSim is an open-source P2P simulation framework, based on the discrete event simulation system OMNeT++², that we are using in our ongoing work. Moreover, OverSim includes several structured and unstructured peer-to-peer protocols such as Pastry, Chord, Kademlia, Bamboo and Gia. These protocol implementations can be used for both simulation and real-world networks.

In order to achieve our goal, we use two protocols that work in totally different ways, i.e., Gia [25] (an unstructured peer-to-peer protocol) and Pastry [26] (a structured peer-to-peer protocol). The other motivation for basing our simulations on the Gia and Pastry protocols is that they both offer an open-source implementation API, and we subsequently plan to implement this specification.

A. P2P4GS specification on Gia overlay

GIA [25] is essentially a modification of Gnutella³ that dynamically adapts the overlay topology and the search algorithm in order to accommodate the natural heterogeneity of most peer-to-peer systems. GIA is a decentralized and unstructured protocol. It replaces flooding with a biased random walk for search. Each node maintains a key list of its own content as well as that of its immediate neighbors, and any match with its own content or a neighbor’s content results in a positive reply.

The search query is forwarded towards high-degree nodes and, to make sure that even high-degree nodes do not get overloaded, each node’s capacity is taken into account. The capacity of a node depends on various factors, including its processing power, disk latencies and access bandwidth. The topology adaptation process of GIA ensures that high-capacity nodes are the ones with a high degree and that every low-capacity node has direct access to at least one high-capacity node. To achieve this goal, each node tries to maintain its level of satisfaction (S). S is a function of the node’s capacity, its neighbors’ capacities and its neighbors’ degrees. Moreover, a node that has already reached its maximum number of neighbors may still connect to a new node by dropping an existing neighbor; before doing so, it performs a three-way handshake.
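To make the role of capacity and degree more concrete, the sketch below computes a satisfaction level and picks the next hop of a capacity-biased walk. It is a simplification under stated assumptions: the formula (sum of each neighbor's capacity divided by its degree, normalized by the node's own capacity and capped at 1) follows our reading of [25], the greedy choice of the highest-capacity unvisited neighbor ignores GIA's token-based flow control, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def satisfaction_level(capacity: float,
                       neighbors: List[Tuple[float, int]],
                       max_neighbors: int) -> float:
    """Return S in [0, 1] for a node.

    `neighbors` holds one (capacity, degree) pair per neighbor.
    Assumed form: S = min(1, sum(C_n / d_n) / C_x), and S = 1 once the
    node has reached its maximum number of neighbors.
    """
    if len(neighbors) >= max_neighbors:
        return 1.0
    total = sum(c / d for c, d in neighbors)
    return min(1.0, total / capacity)


def next_hop(neighbor_capacity: Dict[str, float],
             visited: Set[str]) -> Optional[str]:
    """Greedy simplification of the biased walk: forward the query to the
    highest-capacity neighbor that has not been visited yet."""
    candidates = {n: c for n, c in neighbor_capacity.items() if n not in visited}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# A low-capacity node attached to one well-provisioned neighbor is satisfied;
# a high-capacity node with a single weak neighbor is not.
s_low = satisfaction_level(10.0, [(100.0, 10), (10.0, 5)], max_neighbors=10)
s_high = satisfaction_level(100.0, [(10.0, 2)], max_neighbors=10)
hop = next_hop({"a": 1.0, "b": 100.0}, visited=set())
```

A node whose S is below 1 keeps looking for additional or better neighbors, which is what drives high-capacity nodes towards a high degree.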

Table I lists the simulation parameters used, and Figure 3 shows a screenshot of a simulation run with these parameters.

In this figure, nodes marked with a star are the ISP nodes; the other nodes are simple nodes. Note that, to structure the Gia overlay as described in Subsection IV-B, we have exploited the neighborhood that Gia offers.

² http://www.omnetpp.org/

³ http://rfc-gnutella.sourceforge.net/

Table I: Gia simulation parameters

| Parameter | Setting |
|---|---|
| Init Phase Creation Interval | 0.1 s |
| Target Overlay Terminal Number | 25 |
| Maximum Neighbors | 10 |
| Minimum Neighbors | 1 |
| GIA Level of Satisfaction | 1 |
| Maximum Hop Count | 10 |
| GIA Update Delay | 60 s |
| Message Timeout | 180 s |
| Token Wait Time | 5 s |
| Neighbor Timeout | 250 s |
| Routing Type | semi-recursive |

Figure 3: Screenshot showing the structuring of the P2P4GS community on the Gia overlay

B. P2P4GS specification on Pastry overlay

A Pastry system is a self-organizing overlay network of nodes, where each node routes client requests and interacts with local instances of one or more applications.

Each node in the Pastry peer-to-peer overlay network is assigned a 128-bit node identifier (nodeId). The nodeId indicates a node’s position in a circular nodeId space, which ranges from 0 to $2^{128} - 1$. The nodeId is assigned randomly when a node joins the system. It is assumed that nodeIds are generated such that the resulting set of nodeIds is uniformly distributed in the 128-bit nodeId space. For instance, nodeIds could be generated by computing a cryptographic hash of the node’s public key or its IP address.

Assuming a network of N nodes, Pastry can route to the node numerically closest to a given key in less than $\lceil \log_{2^b} N \rceil$ steps under normal operation (b is a configuration parameter with a typical value of 4).
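As an illustration of the two points above, the following sketch derives a 128-bit nodeId from an IP address and evaluates the routing bound. The choice of SHA-1 truncated to 128 bits and the function names are assumptions made for this example; the text only requires a cryptographic hash with uniformly distributed output.

```python
import hashlib


def node_id_from_address(address: str) -> int:
    """Derive a 128-bit nodeId by hashing an address string.

    Assumption: SHA-1 (160 bits) truncated to its first 16 bytes.
    """
    digest = hashlib.sha1(address.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


def routing_step_bound(n_nodes: int, b: int = 4) -> int:
    """Return ceil(log_{2^b}(n_nodes)) using integer arithmetic only,
    which avoids floating-point rounding at exact powers of 2^b."""
    steps, reach = 0, 1
    while reach < n_nodes:
        reach *= 2 ** b
        steps += 1
    return steps


node_id = node_id_from_address("192.0.2.16")  # documentation-range IP address
hex_digits = format(node_id, "032x")          # 32 digits of b = 4 bits each
bound = routing_step_bound(25, b=4)           # 25 nodes, as in the simulations
```

With 25 nodes and b = 4, the bound evaluates to 2, since 16 < 25 <= 256.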


In addition to the routing table, each node maintains the IP addresses of the nodes in its leaf set L, i.e., the |L|/2 nodes with the numerically closest larger nodeIds and the |L|/2 nodes with the numerically closest smaller nodeIds, relative to the node’s own nodeId (|L| is a configuration parameter with a typical value of 16 or 32).
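The leaf set definition can be sketched as follows. This is a minimal illustration that assumes global knowledge of all nodeIds, which a real Pastry node does not have, and that wraps around the circular identifier space; the function name `leaf_set` is hypothetical.

```python
from typing import Iterable, List, Tuple


def leaf_set(node_id: int, all_ids: Iterable[int],
             size: int) -> Tuple[List[int], List[int]]:
    """Return (smaller, larger): the size/2 closest nodeIds on each side
    of `node_id` on the ring, nearest first."""
    ring = sorted(set(all_ids) | {node_id})
    i = ring.index(node_id)
    # On very small rings the two halves may overlap; we only cap their length.
    half = min(size // 2, len(ring) - 1)
    larger = [ring[(i + k) % len(ring)] for k in range(1, half + 1)]
    smaller = [ring[(i - k) % len(ring)] for k in range(1, half + 1)]
    return smaller, larger


ids = [10, 20, 30, 40, 50, 60]
middle = leaf_set(30, ids, size=4)   # neighbors on both sides of 30
wrapped = leaf_set(10, ids, size=4)  # smaller side wraps around the ring
```

Here `size=4` is a small illustrative value rather than the typical 16 or 32; the first call yields 20 and 10 on the smaller side and 40 and 50 on the larger side.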

Table II lists the simulation parameters used, and Figure 4 shows a screenshot of a simulation run with these parameters.

Table II: Pastry simulation parameters

| Parameter | Setting |
|---|---|
| joinTimeout | 20 s |
| Target Overlay Terminal Number | 25 |
| Bits Per Digit | 4 |
| Number of leaves | 4 |
| Minimal join state | False |
| Proximity neighbor selection | True |
| Always send update | False |
| Routing table maintenance interval | 60 s |
| Repair timeout | 60 s |
| Join timeout | 5 s |
| Routing Type | semi-recursive |

Figure 4: Screenshot showing the structuring of the P2P4GS community on the Pastry overlay

In this figure, nodes marked with a star are the ISP nodes; the other nodes are simple nodes. Note that, to structure the Pastry overlay as described in Subsection IV-B, we have exploited the neighborhood that Pastry offers. The parameter “number of leaves” gives the radius of each cluster.

VI. CONCLUSION

In this paper, we have proposed a self-adaptive structuring of our P2P4GS specification. Since distributed systems keep growing in terms of number of nodes, services and users, we have proposed, in order to ensure scalability, to limit the knowledge about service locations to ISP (Information System Proxy) nodes, to delegate service invocation and execution tasks to IP (Invocation Proxy) nodes, and to store information about the location of frequently requested services on LP (Location Proxy) nodes. Furthermore, based on these proxy nodes, we have proposed an approach for organizing the nodes of the P2P4GS community. In addition, using the OverSim simulator, we have shown that our specification can be used on top of existing standard P2P protocols. To illustrate that the P2P4GS specification is generic, we have used Pastry (a structured peer-to-peer protocol) and Gia (an unstructured peer-to-peer protocol), two protocols that work in very different ways.

We are currently studying the performance of the proposed service deployment and location strategies, their communication cost in terms of messages, and fault-tolerance mechanisms.

As future work, we plan to implement the proposed specification and compare it with other existing solutions.

REFERENCES

[1] I. Foster and A. Iamnitchi, “On death, taxes, and the convergence of peer-to-peer and grid computing,” in Peer-to-Peer Systems II. Springer, 2003, pp. 118–128.

[2] I. Foster, C. Kesselman, and S. Tuecke, “The anatomy of the grid: Enabling scalable virtual organizations,” International journal of high performance computing applications, vol. 15, no. 3, pp. 200–222, 2001.

[3] G. Alonso, F. Casati, H. Kuno, and V. Machiraju, Web services - Concepts, Architectures and Applications. Springer Verlag, ISBN 3-540-44008-9, chapter 5, 2004.

[4] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica et al., “A view of cloud computing,” Communications of the ACM, vol. 53, no. 4, pp. 50–58, 2010.

[5] I. Foster, H. Kishimoto, A. Savva, D. Berry, A. Djaoui, A. Grimshaw, B. Horn, F. Maciel, F. Siebenlist, R. Subramaniam et al., “The open grid services architecture (ogsa), version 1.5.” OGF Specification GFD-I. 080, July 2006.

[6] S. Tuecke, K. Czajkowski, I. Foster, J. Frey, S. Graham, C. Kesselman, T. Maquire, T. Sandholm, D. Snelling, and P. Vanderbilt, “Open grid services infrastructure (ogsi), version 1.0,” June 2003.

[7] K. R. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf, H. J. Wasserman, and N. J. Wright, “Performance analysis of high performance computing applications on the amazon web services cloud,” in IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), 2010, pp. 159–168.


[8] E. Caron and F. Desprez, “Diet: A scalable toolbox to build network enabled servers on the grid,” International Journal of High Performance Computing Applications, vol. 20, no. 3, pp. 335–352, 2006.

[9] P. Trunfio, D. Talia, H. Papadakis, P. Fragopoulou, M. Mordacchini, M. Pennanen, K. Popov, V. Vlassov, and S. Haridi, “Peer-to-peer resource discovery in grids: Models and systems,” Future Generation Computer Systems, vol. 23, no. 7, pp. 864–878, 2007.

[10] T. Kocak and D. Lacks, “Design and analysis of a distributed grid resource discovery protocol,” Cluster Computing, vol. 15, no. 1, pp. 37–52, 2012.

[11] J. A. Torkestani, “A multi-attribute resource discovery algorithm for peer-to-peer grids,” Applied Artificial Intelligence, vol. 27, no. 7, pp. 575–598, 2013.

[12] S. Androutsellis-Theotokis and D. Spinellis, “A survey of peer-to-peer content distribution technologies,” ACM Computing Surveys (CSUR), vol. 36, no. 4, pp. 335–371, 2004.

[13] B. Gueye, O. Flauzac, I. Niang, and C. Rabat, “P2p4gs: une spécification de grille pair-à-pair de services auto-gérés,” vol. 17. ARIMA Journal, Special issue of CARI’12, 2014, pp. 155–175.

[14] I. Foster, “Globus toolkit version 4: Software for service-oriented systems,” Journal of computer science and technology, vol. 21, no. 4, pp. 513–520, 2006.

[15] T. Gomes Ramos and A. C. M. A. de Melo, “An extensible resource discovery mechanism for grid computing environments,” in Cluster Computing and the Grid, 2006. CCGRID 06. Sixth IEEE International Symposium on, vol. 1. IEEE, 2006, pp. 115–122.

[16] F. Butt, S. S. Bokhari, A. Abhari, and A. Ferworn, “Scalable grid resource discovery through distributed search,” International Journal of Distributed and Parallel Systems (IJDPS), vol. 2, no. 5, pp. 1–19, 2011.

[17] Y. Deng, F. Wang, and A. Ciura, “Ant colony optimization inspired resource discovery in p2p grid systems,” The Journal of Supercomputing, vol. 49, no. 1, pp. 4–21, 2009.

[18] J. A. Torkestani, “A distributed resource discovery algorithm for p2p grids,” Journal of Network and Computer Applications, vol. 35, no. 6, pp. 2028–2036, 2012.

[19] J. M. Marin Perez, J. B. Bernabe, J. M. Alcaraz Calero, F. J. Garcia Clemente, G. M. Perez, and A. F. Gomez Skarmeta, “Semantic-based authorization architecture for grid,” Future Generation Computer Systems, vol. 27, pp. 40–55, 2011.

[20] D. Chen, G. Chang, X. Zheng, D. Sun, J. Li, and X. Wang, “A novel p2p based grid resource discovery model,” Journal of Networks, vol. 6, no. 10, pp. 1390–1397, 2011.

[21] N. Jafari Navimipour, A. Masoud Rahmani, A. Habibizad Navin, and M. Hosseinzadeh, “Resource discovery mechanisms in grid systems: A survey,” Journal of Network and Computer Applications, pp. 389–410, 2013.

[22] B. Gueye, O. Flauzac, C. Rabat, and I. Niang, “P2p4gs: A specification for services management in peer-to-peer grids,” in The Fourth International Conference on Advanced Communications and Computation (INFOCOMP). IARIA XPS, Paris, July 2014. (To appear).

[23] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord: A scalable peer-to-peer lookup service for internet applications,” in ACM SIGCOMM Computer Communication Review, vol. 31, no. 4. ACM, 2001, pp. 149–160.

[24] I. Baumgart, B. Heep, and S. Krause, “Oversim: A flexible overlay network simulation framework,” in IEEE Global Internet Symposium, 2007. IEEE, 2007, pp. 79–84.

[25] Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, and S. Shenker, “Making gnutella-like p2p systems scalable,” in Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications. ACM, 2003, pp. 407–418.

[26] A. Rowstron and P. Druschel, “Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems,” in Middleware 2001. Springer, 2001, pp. 329–350.