International Journal on Computational Sciences & Applications (IJCSA) Vol.5, No.3, June 2015

DOI: 10.5121/ijcsa.2015.5307

A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATION RULES

G. S. Bhamra¹, A. K. Verma² and R. B. Patel³

¹ M. M. University, Mullana, Haryana, 133207, India
² Thapar University, Patiala, Punjab, 147004, India
³ Chandigarh College of Engineering & Technology, Chandigarh, 160019, India

ABSTRACT

The intelligent agent based model is a popular approach to constructing Distributed Data Mining (DDM) systems that address scalable mining over large-scale and ever-increasing distributed data. In an agent based distributed system, a variety of agents coordinate and communicate with each other to perform the various tasks of the Data Mining (DM) process. In this study, a serial computing model of a multi-agent system (MAS) called Agent enabled Mining of Globally Strong Association Rules (AeMGSAR) is presented, based on the serial itinerary of the mobile agents. A running environment is also designed for the implementation and performance study of the AeMGSAR system.

KEYWORDS

Knowledge Discovery, Association Rules, Intelligent Agents, Multi-Agent System

1. INTRODUCTION

Data Mining (DM) techniques are used to extract interesting and valid data patterns implicitly stored in large databases [1], [2]. Intelligent software agent technology is an interdisciplinary technology dealing with the development and efficient utilization of autonomous software objects called agents, which have access to geographically distributed and heterogeneous resources. Agents are autonomous, adaptive, reactive, pro-active, social, cooperative, collaborative and flexible. They also support temporal continuity and mobility within the network. An intelligent agent with the mobility feature is known as a Mobile Agent (MA). An MA migrates from node to node in a heterogeneous network without losing its operability. On reaching a network node, the MA is delivered to an Agent Execution Environment (AEE), where its executable parts start running. Upon completion of the desired task, it delivers the results to the home node. A Mobile Agent Platform (MAP), or Agent Execution Environment (AEE), is a server application that provides the appropriate functionality for MAs to authenticate, execute, communicate, migrate to other platforms, and use system resources in a secure way. A Multi Agent System (MAS) is a distributed application comprising multiple interacting intelligent agent components [3].

Let $DB = \{T_j,\ j = 1 \ldots D\}$ be a transactional dataset of size $D$, where each transaction $T$ is assigned an identifier (TID), and let $I = \{d_i,\ i = 1 \ldots m\}$ be the set of $m$ data items in $DB$. A set of items in a particular transaction $T$ is called an itemset or pattern. An itemset $P = \{d_i,\ i = 1 \ldots k\}$, which is a set of $k$ data items in a particular transaction $T$ with $P \subseteq I$, is called a k-itemset. The support of an itemset, $s(P) = \frac{\text{No\_of\_T\_containing\_P}}{D}\,\%$, is the frequency of occurrence of itemset $P$ in $DB$, where No_of_T_containing_P is the support count (sup_count) of itemset $P$. Frequent Itemsets (FIs) are the itemsets that appear in $DB$ frequently, i.e., if $s(P) \ge \text{min\_th\_sup}$ (the given minimum threshold support), then $P$ is a frequent k-itemset. Finding such FIs plays an essential role in mining the interesting relationships among itemsets. Frequent Itemset Mining (FIM) is the task of finding the set of all the subsets of FIs in a transactional database [2].

Association Rules (ARs) are used to discover the associations among items in a database [4]. An AR is an implication of the form $P \Rightarrow Q\ [\text{support}, \text{confidence}]$, where $P \subset I$, $Q \subset I$ and $P \cap Q = \emptyset$. An AR is measured in terms of its support and confidence factors: the support of the rule, $s(P \Rightarrow Q)$, is the probability of both $P$ and $Q$ appearing in $T$, i.e., $p(P \cup Q)$, and the confidence of the rule, $c(P \Rightarrow Q)$, is the conditional probability of $Q$ given $P$, i.e., $p(Q \mid P)$. An AR is said to be strong if $s(P \Rightarrow Q) \ge \text{min\_th\_sup}$ (the given minimum threshold support) and $c(P \Rightarrow Q) \ge \text{min\_th\_conf}$ (the given minimum threshold confidence). Association Rule Mining (ARM) is today one of the most important DM tasks. In ARM, all the strong ARs are generated from the FIs. ARM can be viewed as a two-step process [5], [6]:

1. Find all the frequent k-itemsets ($L_k$).
2. Generate strong ARs from $L_k$:
   a. For each frequent itemset $l \in L_k$, generate all non-empty subsets of $l$.
   b. For every non-empty subset $s$ of $l$, output the rule "$s \Rightarrow (l - s)$" if $\frac{\text{sup\_count}(l)}{\text{sup\_count}(s)} \ge \text{min\_th\_conf}$.
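The rule-generation step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Java implementation: the function name `strong_rules`, the dictionary `counts` and the toy support counts are hypothetical, `min_th_conf` is taken as a fraction in [0, 1], and only proper subsets of $l$ are used as antecedents so that the consequent $l - s$ is never empty.

```python
from itertools import combinations


def strong_rules(sup_count, min_th_conf):
    """Return (antecedent, consequent, confidence) for every strong rule.

    sup_count maps frozenset itemsets to support counts and is assumed to
    contain every non-empty subset of each itemset (Apriori property).
    min_th_conf is a fraction in [0, 1] (an assumption of this sketch).
    """
    rules = []
    for l, count_l in sup_count.items():
        if len(l) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(l)):
            for s in map(frozenset, combinations(sorted(l), size)):
                conf = count_l / sup_count[s]  # sup_count(l) / sup_count(s)
                if conf >= min_th_conf:
                    rules.append((s, l - s, conf))
    return rules


# Toy support counts (hypothetical): A in 4 transactions, B in 3, {A, B} in 3.
counts = {frozenset("A"): 4, frozenset("B"): 3, frozenset("AB"): 3}
rules = strong_rules(counts, 0.8)
```

With these counts, the rule A ⇒ B has confidence 3/4 and is rejected at a threshold of 0.8, while B ⇒ A has confidence 3/3 and is kept.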

Distributed Association Rule Mining (DARM) is the task of generating the globally strong association rules from the global FIs in a distributed environment. A few preliminary notations and definitions, required for defining DARM and for making this study self-contained, are as follows:

• $S = \{S_i,\ i = 1 \ldots n\}$, the $n$ distributed sites.
• $S_{CENTRAL}$, the Central Site.
• $DB_i = \{T_j,\ j = 1 \ldots D_i\}$, the horizontally partitioned data set of size $D_i$ at the local site $S_i$, where each transaction $T_j$ is assigned an identifier (TID).
• $DB = \bigcup_{i=1}^{n} DB_i$, the aggregated dataset of size $D = \sum_{i=1}^{n} D_i$, with $DB_i \cap DB_j = \emptyset$.
• $I = \{d_i,\ i = 1 \ldots m\}$, the total $m$ data items in each $DB_i$.
• $L_{k(i)}^{FI}$, local frequent k-itemsets at site $S_i$.
• $L_{k(i)}^{FISC}$, list of support counts $\forall\, Itemset \in L_{k(i)}^{FI}$.
• $L_i^{LSAR}$, list of locally strong association rules at site $S_i$.
• $L^{TLSAR} = \bigcup_{i=1}^{n} L_i^{LSAR}$, list of total locally strong association rules.
• $L_k^{TFI} = \bigcup_{i=1}^{n} L_{k(i)}^{FI}$, list of total frequent k-itemsets.

• $L_k^{GFI} = \bigcap_{i=1}^{n} L_{k(i)}^{FI}$, list of global frequent k-itemsets.
• $L_{CENTRAL}^{GSAR}$, list of globally strong association rules.
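The total and global frequent itemset lists defined above are a union and an intersection over the per-site lists. A minimal Python sketch follows, assuming each site's frequent itemsets are held as a set of frozensets; the function name `total_and_global_frequent` and the three toy sites are hypothetical, and for brevity the sketch does not separate the lists by itemset size $k$.

```python
def total_and_global_frequent(local_fi):
    """Combine per-site frequent itemsets.

    local_fi holds one set of frozenset itemsets per site.
    Returns (total, global): the union and the intersection over all sites.
    """
    total = set().union(*local_fi)
    # set.intersection needs at least one operand, so guard the empty case.
    global_ = set.intersection(*local_fi) if local_fi else set()
    return total, global_


# Three hypothetical sites with their locally frequent itemsets.
sites = [
    {frozenset("a"), frozenset("b"), frozenset("ab")},
    {frozenset("a"), frozenset("b")},
    {frozenset("a"), frozenset("c")},
]
total_fi, global_fi = total_and_global_frequent(sites)
```

Here only the itemset {a} is frequent at every site, so it is the only global frequent itemset, while the total list contains four distinct itemsets.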

The Local Knowledge Base (LKB) at site $S_i$ comprises $L_{k(i)}^{FI}$, $L_{k(i)}^{FISC}$ and $L_i^{LSAR}$, which can provide reference to the local supervisor for local decisions. The Global Knowledge Base (GKB) at $S_{CENTRAL}$ comprises $L^{TLSAR}$, $L_k^{TFI}$, $L_k^{GFI}$ and $L_{CENTRAL}^{GSAR}$ for global decision making [7]. Like ARM, the DARM task can also be viewed as a two-step process [6]:

1. Find the global frequent k-itemsets ($L_k^{GFI}$) from the distributed local frequent k-itemsets ($L_{k(i)}^{FI}$) of the partitioned datasets.
2. Generate globally strong association rules ($L_{CENTRAL}^{GSAR}$) from $L_k^{GFI}$.

The existing agent based systems specifically dealing with the DARM task are: Knowledge Discovery Management System (KDMS) [8], Efficient Distributed Data Mining using Intelligent Agents [9], Mobile Agent based Distributed Data Mining [10], An Agent based Framework for Association Rule Mining of Distributed Data (AFARMDD) [11], [12], and Multi-Agent Distributed Association Rule Miner (MADARM) [13]. All these systems are academic research projects. A qualitative comparison of these DARM frameworks is provided in [14]. Most of the existing agent based frameworks for the DARM task are only prototype models and lack an appropriate underlying AEE, scalability, privacy preserving techniques, global knowledge generation and implementation using real datasets.

The rest of the paper is organised as follows. Section 2 describes the running environment for the proposed system along with the various algorithms involved. The serial computing model of AeMGSAR is presented in Section 3, where the algorithms for all the agents involved in this system are also discussed. Section 4 describes the implementation and performance study of the system, and finally the article is concluded in Section 5.

2. ENVIRONMENT FOR THE PROPOSED SYSTEM

Every MAS needs an underlying AEE to provide a running infrastructure on which agents can be deployed and tested. A running environment has been designed in Java. Various attributes of the MA are encapsulated within a data structure known as AgentProfile. It contains the name of the MA (AgentName), the version number (AgentVersion), the entire byte code (BC), the list of nodes to be visited by the MA, i.e., the itinerary plan ($L_{NODES}$), the type of the itinerary (ItinType), which can be serial or parallel, a reference to the current execution state (AObject), and an additional data structure known as Briefcase, which acts as the result bag of the MA and stores the final resultant knowledge ($Result\_S_i$) at a particular site. The computational time (CPUTime) taken by an MA at a particular site is also stored in $Result\_S_i$. In addition to results, Briefcase also contains the system time for the start of the agent journey ($TripTime_{start}$), the system time for the end of the journey ($TripTime_{end}$) and the total round trip time of the MA (TripTime), calculated as $TripTime \leftarrow TripTime_{end} - TripTime_{start}$. The stationary as well as mobile agents involved in the model are discussed later. This environment consists of the following three components:

• Data Mining Agent Execution Environment (DM_AEE): It is the key component and acts as a server. DM_AEE is deployed on each distributed site $S_i$ and is responsible for receiving, executing and migrating all the visiting DM agents. It receives the incoming AgentProfile at site $S_i$, retrieves the entire BC of the agent and saves it as AgentName.class in the local file system of site $S_i$; after that, execution of the agent is started using AObject. The steps are shown in Algorithm 1.

• Agent Launcher (AL): It acts as a client at the agent launching station ($S_{CENTRAL}$) and, on behalf of the user, launches the goal-oriented DM agents through a user interface to the DM_AEE running at the distributed sites. The Agent Pool (or Zone) at $S_{CENTRAL}$ is a repository of all mobile as well as stationary agents (SAs). AL first reads AgentName and stores it in AgentProfile. The entire BC of AgentName is loaded from the Agent Pool and stored in AgentProfile. $L_{NODES}$ and ItinType are retrieved and stored in AgentProfile. $TripTime_{start}$ is maintained in Briefcase, which is then added to AgentProfile. In the serial computing model, i.e., if ItinType = Serial, AL dispatches a single specific MA along with $L_{NODES}$, and it travels from node to node. AgentVersion is set to 1 for this agent. AL also contacts the Result Manager (RM) for processing the Briefcase of an agent. Detailed steps are given in Algorithm 2.

• Result Manager (RM): It manages and processes the Briefcase of all MAs. RM is contacted either by an MA for submitting its results or by AL for processing the results of a specific MA. On completion of the itinerary, each DM agent submits its results to RM, which computes the total round trip time (TripTime) of that MA and saves it in the Briefcase of that agent. If ItinType = Serial, it then saves the updated AgentProfile of the agent at $S_{CENTRAL}$. When it is contacted by AL for processing the results of a specific agent, it sends back the AgentProfile of that agent. The steps are defined in Algorithm 3.
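To make the AgentProfile and Briefcase structures more concrete, here is a minimal sketch written in Python for brevity (the paper's environment is in Java). The field names mirror the attributes named in the text, but the class layout, the defaults and the helper `finalize_trip` are assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Briefcase:
    """Result bag of a mobile agent (layout is hypothetical)."""
    results: Dict[str, Any] = field(default_factory=dict)  # Result_S_i per site
    trip_time_start: Optional[float] = None
    trip_time_end: Optional[float] = None
    trip_time: Optional[float] = None


@dataclass
class AgentProfile:
    """Attributes that travel with a mobile agent."""
    agent_name: str
    agent_version: int = 1
    byte_code: bytes = b""
    nodes: List[str] = field(default_factory=list)  # itinerary, L_NODES
    itin_type: str = "Serial"
    a_object: Any = None  # reference to the current execution state
    briefcase: Briefcase = field(default_factory=Briefcase)


def finalize_trip(profile: AgentProfile) -> float:
    """RM-style bookkeeping: TripTime = TripTime_end - TripTime_start."""
    bc = profile.briefcase
    bc.trip_time = bc.trip_time_end - bc.trip_time_start
    return bc.trip_time


# Hypothetical agent with made-up timestamps (seconds).
profile = AgentProfile("LFIGA", nodes=["10.0.0.1", "10.0.0.2"])
profile.briefcase.trip_time_start = 100.0
profile.briefcase.trip_time_end = 112.5
elapsed = finalize_trip(profile)
```

Keeping Briefcase as a separate object inside AgentProfile matches the description above, in which RM updates the Briefcase and then adds it back to the profile.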

Algorithm 1 DATA MINING AGENT EXECUTION ENVIRONMENT (DM_AEE)

procedure DM_AEE( )
    while TRUE do
        AgentProfile ← listen and receive AgentProfile at S_i
        AgentName ← get AgentName from AgentProfile
        BC ← retrieve the BC of agent from AgentProfile
        save the BC with AgentName.class in the local file system of S_i
        AObject ← get AObject from AgentProfile    ▷ current state
        AObject.run()    ▷ start executing mobile agent
    end while
end procedure

Algorithm 2 AGENT LAUNCHER (AL)

procedure AL( )
    option ← read option (dispatch / result)
    switch option do
        case dispatch    ▷ dispatch the mobile agent to DM_AEE
            AgentName ← read mobile agent's name
            add AgentName to AgentProfile
            BC ← load entire byte code of AgentName from AgentPool
            add BC to AgentProfile
            L_NODES ← read itinerary (IP addresses) of mobile agent
            ItinType ← read ItinType (Serial / Parallel)
            add ItinType to AgentProfile
            if ItinType = "Serial" then    ▷ serial itinerary
                AgentVersion ← 1
                add AgentVersion to AgentProfile
                add L_NODES to AgentProfile
                switch AgentName do
                    case LFIGA
                        minthrsup ← read minimum threshold support
                        AObject ← new LFIGA(AgentProfile, minthrsup)
                    end case
                    case LKGA
                        minthrconf ← read minimum threshold confidence
                        AObject ← new LKGA(AgentProfile, minthrconf)
                    end case
                    case TFICA
                        AObject ← new TFICA(AgentProfile)
                    end case
                    case LKCA
                        AObject ← new LKCA(AgentProfile)
                    end case
                    case GKDA
                        L^GSAR_CENTRAL ← load L^GSAR_CENTRAL generated by GKGA at S_CENTRAL
                        add L^GSAR_CENTRAL to Briefcase
                        add updated Briefcase to AgentProfile
                        AObject ← new GKDA(AgentProfile)
                    end case
                end switch
                add AObject to AgentProfile    ▷ current state
                transfer AgentProfile to DM_AEE at first IP address in L_NODES
            end if
        end case
        case result    ▷ process the result of mobile agent
            AgentName ← read mobile agent's name
            ItinType ← read mobile agent's ItinType
            add AgentName to L_AgentInfo
            add ItinType to L_AgentInfo
            ▷ result processing for serial itinerary agents
            if ItinType = "Serial" then
                AgentProfile ← contact RM for L_AgentInfo
                Briefcase ← retrieve Briefcase from AgentProfile
                switch AgentName do
                    case LFIGA
                        process the Briefcase of LFIGA
                    end case
                    case LKGA
                        process the Briefcase of LKGA
                    end case
                    case TFICA
                        call GFIGA(Briefcase)    ▷ stationary agent
                    end case
                    case LKCA
                        call GKGA(Briefcase)    ▷ stationary agent
                    end case
                    case GKDA
                        process the Briefcase of GKDA
                    end case
                end switch
            end if
        end case
    end switch
end procedure

Algorithm 3 RESULT MANAGER (RM)

procedure RM( )
    while TRUE do
        listen and receive the incoming request
        if contacted by a mobile agent for submitting results from site S_i then
            AgentProfile ← receive the incoming AgentProfile from site S_i
            ItinType ← retrieve ItinType from AgentProfile
            Briefcase ← retrieve mobile agent's Briefcase from AgentProfile
            TripTime_start ← retrieve TripTime_start from Briefcase
            TripTime_end ← retrieve TripTime_end from Briefcase
            TripTime ← TripTime_end − TripTime_start
            add TripTime to Briefcase
            add updated Briefcase to AgentProfile
            if ItinType = "Serial" then
                save AgentProfile at S_CENTRAL
            end if
        end if
        if contacted by AL for processing the results then
            AgentName ← retrieve AgentName from incoming L_AgentInfo
            ItinType ← retrieve ItinType from incoming L_AgentInfo
            if ItinType = "Serial" then
                AgentProfile ← load AgentProfile for AgentName from S_CENTRAL
                dispatch AgentProfile to AL
            end if
        end if
    end while
end procedure

The overall working of the AeMGSAR system may be divided into the following six stages:

1. Request Stage: A request for DARM is initiated at $S_{CENTRAL}$ by AL on behalf of the user with the necessary credentials.
2. Preparation Stage: Through the user interface, AL reads the agent name and version number; obtains the itinerary for the MA's journey in terms of the IP addresses of the distributed nodes to be visited; obtains any additional data specific to that MA; and loads the agent code for the MA from AgentPool. For a serial itinerary, a single specific MA is dispatched by AL to visit the n distributed sites one after another.
3. Local Mining Stage: The ARM process is performed locally by specific DM agents on each distributed site, and the results are kept as the local knowledge base at that site.
4. Result Collection Stage: Collector agents visit each site, collect the results generated by the DM agents and submit them back to RM at $S_{CENTRAL}$.
5. Knowledge Integration and Global Knowledge Generation Stage: Knowledge or result integration is carried out by RM with the help of a stationary agent, and global knowledge in the form of Globally Strong Association Rules may be generated with the help of other stationary agents at $S_{CENTRAL}$.
6. Global Knowledge Dispatching Stage: The global knowledge is dispatched to the distributed sites by a dispatching agent so that it can be compared with the local knowledge at each site.
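The serial itinerary that underlies these stages can be imitated in a single process. The Python sketch below is only an illustration: it replaces network migration with a loop over an in-memory dictionary, and the names `run_serial_itinerary`, `sites` and `task` are hypothetical. Times are measured with `time.perf_counter`, which is wall-clock time and stands in for the system-time differences used in the paper.

```python
import time


def run_serial_itinerary(nodes, sites, task):
    """Visit the nodes one after another and run task on each site's data.

    nodes: itinerary (list of site addresses), consumed front to back.
    sites: maps an address to that site's local data (stand-in for DB_i).
    task:  callable applied to the local data at each site.
    """
    briefcase = {"results": {}, "trip_time_start": time.perf_counter()}
    remaining = list(nodes)
    visited = []
    while remaining:  # itinerary not empty
        address = remaining.pop(0)  # first address in the itinerary
        started = time.perf_counter()
        result = task(sites[address])
        briefcase["results"][address] = {
            "result": result,
            "cpu_time": time.perf_counter() - started,
        }
        visited.append(address)
    # Itinerary exhausted: record the end of the journey, as RM would.
    briefcase["trip_time_end"] = time.perf_counter()
    briefcase["trip_time"] = (
        briefcase["trip_time_end"] - briefcase["trip_time_start"]
    )
    return briefcase, visited


# Two hypothetical sites; the task simply counts local records.
sites = {"10.0.0.1": [1, 2, 3], "10.0.0.2": [4, 5]}
briefcase, visited = run_serial_itinerary(["10.0.0.1", "10.0.0.2"], sites, len)
```

The single agent carries one result bag through all sites, so the per-site results accumulate in visiting order and the round trip time covers the whole journey.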

Figure 1. AeMGSAR Serial Computing Model

3. SERIAL COMPUTING MODEL OF AeMGSAR

The serial computing model of the AeMGSAR system is shown in Figure 1. It consists of seven agents in total: five are MAs dispatched from $S_{CENTRAL}$ with serial-itinerary multi-hop migration, and the other two are intelligent SAs running at $S_{CENTRAL}$ to perform different tasks. The CPU time taken by an MA while processing at each site, along with some other specific information, is carried back in the result bag to $S_{CENTRAL}$. The agents numbered 1-5 below visit the n sites serially; other parameters are collected from different resources. The detailed relationship among these agents and the working behaviour of each agent are as follows:

1. Local Frequent Itemset Generator Agent (LFIGA): This is an MA that carries the AgentProfile and min_th_sup. LFIGA generates and stores $L_{k(i)}^{FI}$ and $L_{k(i)}^{FISC}$ at site $S_i$ by scanning the local $DB_i$ at that site under the min_th_sup constraint. It carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. This agent embeds the Apriori algorithm [15] for generating all the frequent k-itemset lists. It may be equipped with decision making capability to select other FIM algorithms based on the density of the dataset at a particular site. More details are available in Algorithm 4.

2. Local Knowledge Generator Agent (LKGA): This is an MA that carries the AgentProfile and min_th_conf. LKGA applies the min_th_conf constraint to generate and store $L_i^{LSAR}$ by using the $L_{k(i)}^{FI}$ and $L_{k(i)}^{FISC}$ lists already generated by the LFIGA agent at site $S_i$. The $L_i^{LSAR}$ list also records the support and confidence of each association rule along with the site name. It carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. Detailed steps are given in Algorithm 7.

3. Total Frequent Itemset Collector Agent (TFICA): This is an MA that carries the AgentProfile. TFICA collects the list of local frequent k-itemsets ($L_{k(i)}^{FI}$) generated by the LFIGA agent and carries back the list of total frequent k-itemsets ($L_k^{TFI}$) in the result bag to RM at $S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. It executes Algorithm 8.

4. Local Knowledge Collector Agent (LKCA): This is an MA that carries the AgentProfile. LKCA collects the list of locally strong association rules ($L_i^{LSAR}$) generated by the LKGA agent and carries back the list of total locally strong association rules ($L^{TLSAR}$) in the result bag to RM at $S_{CENTRAL}$. In addition to this resultant knowledge, it also carries back the computational time (CPUTime) at each site $S_i$ and $TripTime_{end}$. The steps are shown in Algorithm 9.

5. Global Knowledge Dispatcher Agent (GKDA): This is an MA that carries the AgentProfile containing the global knowledge ($L_{CENTRAL}^{GSAR}$). It dispatches the global knowledge to every site for further decision making and for comparison with the local knowledge at that site. It executes Algorithm 12.

6. Global Frequent Itemset Generator Agent (GFIGA): It is a stationary agent at $S_{CENTRAL}$, mainly used for processing the result bag of TFICA, i.e., the total frequent k-itemset list ($L_k^{TFI}$) collected by TFICA, to generate the global frequent itemset list, $L_k^{GFI}$. More details are available in Algorithm 10.

7. Global Knowledge Generator Agent (GKGA): It is also a stationary agent at $S_{CENTRAL}$, mainly used for processing the $L_k^{GFI}$ list and the $L^{TLSAR}$ list to compile the global knowledge, i.e., the list of globally strong association rules, $L_{CENTRAL}^{GSAR}$. Detailed steps are shown in Algorithm 11.

Algorithm 4 LOCAL FREQUENT ITEMSET GENERATOR AGENT (LFIGA)

Input:
• AgentProfile, a collection of agent attributes set by the AL
• min_th_sup, the given minimum threshold support
Output: L^FI&SC, the list of frequent itemsets and their support counts

procedure LFIGA(AgentProfile, min_th_sup)
    CPUTime_start ← get system time
    Briefcase ← get Briefcase from AgentProfile
    DB_i ← load DB_i from local file system of site S_i
    T ← DB_i.get(0)    ▷ no. of records
    I ← DB_i.get(1)    ▷ no. of items
    DB[T][I] ← DB_i.get(3)    ▷ itemset data bank
    minsupcount ← (T × min_th_sup) / 100
    ▷ generate frequent 1-itemset list (FIL_1) and support count list (FISC_1)
    CFIL_1 ← {1, 2, 3, ..., I}    ▷ candidate frequent 1-itemsets
    for i ← 1, I do    ▷ initialize the support count array SCFIL_1 to zero
        SCFIL_1[i] ← 0
    end for
    k ← 1
    for all candidate c ∈ CFIL_1 do    ▷ find support count for every candidate
        for all transaction t ∈ DB do
            if c ⊆ t then
                SCFIL_1[k] ← SCFIL_1[k] + 1
            end if
        end for
        k ← k + 1
    end for
    ▷ prune CFIL_1 to generate FIL_1 and FISC_1
    for k ← 1, I do
        if SCFIL_1[k] ≥ minsupcount then
            add c_k ∈ CFIL_1 to FIL_1
            add SCFIL_1[k] to FISC_1
        end if
    end for
    if FIL_1 ≠ ∅ then
        add FIL_1 to L^FI
        add FISC_1 to L^FISC
    end if
    k ← 2
    while FIL_{k−1} ≠ ∅ do
        CFIL_k ← call GenerateCFIL(FIL_{k−1})    ▷ see Algorithm 5
        for i ← 1, CFIL_k.length do    ▷ initialize the array SCFIL_k to zero
            SCFIL_k[i] ← 0
        end for
        i ← 1
        for all candidate c ∈ CFIL_k do    ▷ find support count for every candidate
            for all transaction t ∈ DB do    ▷ scan DB
                if c ⊆ t then
                    SCFIL_k[i] ← SCFIL_k[i] + 1
                end if
            end for
            i ← i + 1
        end for
        ▷ prune CFIL_k to generate FIL_k and FISC_k
        for i ← 1, SCFIL_k.length do
            if SCFIL_k[i] ≥ minsupcount then
                add c_i ∈ CFIL_k to FIL_k
                add SCFIL_k[i] to FISC_k
            end if
        end for
        if FIL_k ≠ ∅ then
            add FIL_k to L^FI
            add FISC_k to L^FISC
        end if
        k ← k + 1
    end while
    add T to L^FI&SC
    add L^FI to L^FI&SC
    add L^FISC to L^FI&SC
    save L^FI&SC in the local file system of this site S_i
    CPUTime_end ← get system time
    CPUTime ← CPUTime_end − CPUTime_start
    add CPUTime to Result_S_i
    add Result_S_i to Briefcase
    add updated Briefcase to AgentProfile
    L_NODES ← get itinerary list from AgentProfile
    L_NODES ← remove first IP address from L_NODES    ▷ visited site
    add updated L_NODES to AgentProfile
    if L_NODES ≠ ∅ then    ▷ itinerary not empty
        AObject ← new LFIGA(AgentProfile, min_th_sup)
        add AObject to AgentProfile
        transfer AgentProfile to DM_AEE at first IP address in L_NODES
    else
        TripTime_end ← get system time for end of agent journey
        add TripTime_end to Briefcase
        add updated Briefcase to AgentProfile
        transfer AgentProfile to RM at S_CENTRAL
    end if
end procedure
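The first pass of the algorithm above (computing minsupcount, counting single items and pruning) can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the agent's Java code: the function name `frequent_1_itemsets` and the toy transactions are hypothetical, and transactions are modelled as Python sets of item identifiers.

```python
from collections import Counter


def frequent_1_itemsets(transactions, min_th_sup):
    """Return {item: support_count} for items that meet the threshold.

    min_th_sup is a percentage, following
    minsupcount = (T x min_th_sup) / 100 with T = number of transactions.
    """
    min_sup_count = len(transactions) * min_th_sup / 100
    # Count each item once per transaction that contains it.
    counts = Counter(item for t in transactions for item in set(t))
    return {item: n for item, n in counts.items() if n >= min_sup_count}


# Four hypothetical transactions over the items 1, 2 and 3.
transactions = [{1, 2, 3}, {1, 2}, {1, 3}, {2}]
frequent = frequent_1_itemsets(transactions, 60)
```

With a 60% threshold the minimum support count is 2.4, so items 1 and 2 (three occurrences each) are kept and item 3 (two occurrences) is pruned.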

Algorithm 5 GENERATECFIL

Input: L_{k−1}, frequent (k−1)-itemsets
Output: C_k, candidate frequent k-itemsets

procedure GENERATECFIL(L_{k−1})
    for all itemset l_1 ∈ L_{k−1} do
        for all itemset l_2 ∈ L_{k−1} do
            if (l_1[1] = l_2[1]) ∧ (l_1[2] = l_2[2]) ∧ ... ∧ (l_1[k−2] = l_2[k−2]) ∧ (l_1[k−1] < l_2[k−1]) then
                c ← l_1 ⊗ l_2    ▷ join step: generate candidates
                if HASINFREQUENTSUBSET(c, L_{k−1}) then    ▷ prune step, see Algorithm 6
                    delete c
                else
                    add c to C_k
                end if
            end if
        end for
    end for
    return C_k
end procedure

Algorithm 6 HASINFREQUENTSUBSET

Input: c, candidate k-itemset; L_{k−1}, frequent (k−1)-itemsets
Output: TRUE if c has an infrequent (k−1)-subset, FALSE otherwise

procedure HASINFREQUENTSUBSET(c, L_{k−1})
    for all (k−1)-subset s of c do
        if s ∉ L_{k−1} then
            return TRUE
        end if
    end for
    return FALSE
end procedure
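The join and prune steps of candidate generation can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: it assumes the standard Apriori join (two (k−1)-itemsets are joined when they agree on their first k−2 items), that items are mutually comparable so itemsets can be sorted, and that itemsets are stored as frozensets. The function names and the toy input are hypothetical.

```python
from itertools import combinations


def has_infrequent_subset(candidate, prev_frequent):
    """True if some (k-1)-subset of the candidate is not frequent."""
    k = len(candidate)
    return any(
        frozenset(s) not in prev_frequent
        for s in combinations(candidate, k - 1)
    )


def generate_candidates(prev_frequent):
    """Join and prune: build candidate k-itemsets from frequent (k-1)-itemsets.

    prev_frequent is a set of frozensets that all have size k-1.
    """
    ordered = sorted(tuple(sorted(l)) for l in prev_frequent)
    candidates = set()
    for l1 in ordered:
        for l2 in ordered:
            # Join step: same first k-2 items, last item of l1 smaller.
            if l1[:-1] == l2[:-1] and l1[-1] < l2[-1]:
                c = frozenset(l1 + (l2[-1],))
                # Prune step: drop c if any (k-1)-subset is infrequent.
                if not has_infrequent_subset(c, prev_frequent):
                    candidates.add(c)
    return candidates


# Hypothetical frequent 2-itemsets.
frequent_2 = {frozenset(p) for p in [(1, 2), (1, 3), (2, 3), (2, 4)]}
candidates_3 = generate_candidates(frequent_2)
```

In this toy input, {1, 2} and {1, 3} join to {1, 2, 3}, which survives because all of its 2-subsets are frequent, whereas {2, 3} and {2, 4} join to {2, 3, 4}, which is pruned because {3, 4} is not frequent.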

Algorithm 7 LOCAL KNOWLEDGE GENERATOR AGENT (LKGA)

Input:
•  AgentProfile, a collection of agent attributes set by the AL
•  min_th_conf, the given minimum threshold confidence
Output: $L^{LSAR}$, the list of locally strong association rules

1:  procedure LKGA(AgentProfile, min_th_conf)
2:    $CPUTime_{start} \leftarrow$ get system time
3:    Briefcase $\leftarrow$ get Briefcase from AgentProfile
4:    $L^{FI\&SC} \leftarrow$ load $L^{FI\&SC}$ from local file system of this site $S_i$
5:    $T \leftarrow L^{FI\&SC}.get(0)$   ▷ No. of records
6:    $L^{FI} \leftarrow L^{FI\&SC}.get(1)$   ▷ frequent k-itemset list
7:    $L^{FISC} \leftarrow L^{FI\&SC}.get(2)$   ▷ support count list
8:    for $k \leftarrow 2, L^{FI}.size$ do
9:      $L_k \leftarrow L^{FI}.get(k)$   ▷ get frequent k-itemset list
10:     for all $l \in L_k$ do
11:       $l_{subsets} \leftarrow$ generate all non-empty subsets of $l$
12:       $l_{spcount} \leftarrow$ get support count of $l$ from $L^{FISC}$
13:       $AR_{support} \leftarrow (l_{spcount} / T) \times 100$   ▷ support of the association rule
14:       for all non-empty subset $s \in l_{subsets}$ do
15:         $s_{spcount} \leftarrow$ get support count of $s$ from $L^{FISC}$
16:         $AR_{conf} \leftarrow (l_{spcount} / s_{spcount}) \times 100$   ▷ confidence of the association rule
17:         if $AR_{conf} \geq$ min_th_conf then
18:           $AR_{strong} \leftarrow$ "$s \Rightarrow l - s$ [$AR_{support}$%, $AR_{conf}$%]"
19:           print $AR_{strong}$
20:           add $l$ to $AR_{strong}$
21:           $S_i^{IP} \leftarrow$ get IP address of this site $S_i$
22:           add $S_i^{IP}$ to $AR_{strong}$
23:           add $AR_{strong}$ to $L^{LSAR}$
24:         end if
25:       end for
26:     end for
27:   end for
28:   save $L^{LSAR}$ in the local file system of this site $S_i$
29:   $CPUTime_{end} \leftarrow$ get system time
30:   $CPUTime \leftarrow CPUTime_{end} - CPUTime_{start}$
31:   add $CPUTime$ to $Result\_S_i$
32:   add $Result\_S_i$ to Briefcase
33:   add updated Briefcase to AgentProfile
34:   $L^{NODES} \leftarrow$ get itinerary list from AgentProfile
35:   $L^{NODES} \leftarrow$ remove first IP address from $L^{NODES}$   ▷ visited site
36:   add updated $L^{NODES}$ to AgentProfile
37:   if $L^{NODES} \neq \emptyset$ then   ▷ itinerary not empty
38:     AObject $\leftarrow$ new LKGA(AgentProfile, min_th_conf)
39:     add AObject to AgentProfile
40:     transfer AgentProfile to DM_AEE at first IP address in $L^{NODES}$
41:   else
42:     $TripTime_{end} \leftarrow$ get system time for end of agent journey
43:     add $TripTime_{end}$ to Briefcase
44:     add updated Briefcase to AgentProfile
45:     transfer AgentProfile to RM at $S_{CENTRAL}$
46:   end if
47: end procedure
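The rule-generation core of LKGA (steps 8 to 27) can be sketched in Python as below. This is only an illustration under stated assumptions: the support counts, the record count and the site address `192.0.2.1` are hypothetical, the function name `strong_rules` is not from the paper, and only proper non-empty subsets are used as antecedents so that the consequent $l - s$ is never empty.

```python
from itertools import combinations


def strong_rules(support_counts, n_records, min_th_conf, site_ip):
    """Return (antecedent, consequent, support %, confidence %, site) tuples."""
    rules = []
    for itemset, count in support_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs at least two items
        support = 100 * count / n_records
        for r in range(1, len(itemset)):  # proper, non-empty subsets only (assumption)
            for s in combinations(sorted(itemset), r):
                s = frozenset(s)
                conf = 100 * count / support_counts[s]
                if conf >= min_th_conf:
                    rules.append((s, itemset - s, support, conf, site_ip))
    return rules


# Hypothetical support counts for a site holding 10 records.
counts = {frozenset("A"): 8, frozenset("B"): 5, frozenset("AB"): 4}
for ante, cons, sup, conf, ip in strong_rules(counts, 10, 50, "192.0.2.1"):
    print(sorted(ante), "=>", sorted(cons), f"[{sup:.0f}%, {conf:.0f}%]", ip)
```

With a 50% confidence threshold both A ⇒ B (confidence 50%) and B ⇒ A (confidence 80%) are retained; raising the threshold to 60% keeps only B ⇒ A.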


Algorithm 8 TOTAL FREQUENT ITEMSET COLLECTOR AGENT (TFICA)

Input: AgentProfile, a collection of agent attributes set by the AL
Output: $L^{FI}$, the list of locally frequent itemsets

1:  procedure TFICA(AgentProfile)
2:    $CPUTime_{start} \leftarrow$ get system time
3:    Briefcase $\leftarrow$ get Briefcase from AgentProfile
4:    $L^{FI\&SC} \leftarrow$ load $L^{FI\&SC}$ from local file system of this site $S_i$
5:    $L^{FI} \leftarrow L^{FI\&SC}.get(1)$   ▷ frequent k-itemset list
6:    add $L^{FI}$ to $Result\_S_i$
7:    $CPUTime_{end} \leftarrow$ get system time
8:    $CPUTime \leftarrow CPUTime_{end} - CPUTime_{start}$
9:    add $CPUTime$ to $Result\_S_i$
10:   add $Result\_S_i$ to Briefcase
11:   add updated Briefcase to AgentProfile
12:   $L^{NODES} \leftarrow$ get itinerary list from AgentProfile
13:   $L^{NODES} \leftarrow$ remove first IP address from $L^{NODES}$   ▷ visited site
14:   add updated $L^{NODES}$ to AgentProfile
15:   if $L^{NODES} \neq \emptyset$ then   ▷ itinerary not empty
16:     AObject $\leftarrow$ new TFICA(AgentProfile)
17:     add AObject to AgentProfile
18:     transfer AgentProfile to DM_AEE at first IP address in $L^{NODES}$
19:   else
20:     $TripTime_{end} \leftarrow$ get system time for end of agent journey
21:     add $TripTime_{end}$ to Briefcase
22:     add updated Briefcase to AgentProfile
23:     transfer AgentProfile to RM at $S_{CENTRAL}$
24:   end if
25: end procedure
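The serial itinerary shared by the collector and dispatcher agents (TFICA, LKCA and GKDA) can be sketched in Python as below. This is a single-process simulation under stated assumptions: the dictionary keys, the `store` contents and the function names are hypothetical, and a plain function call stands in for the network transfer to the DM_AEE of the next site.

```python
import time


def visit_site(profile, local_store):
    """One hop of a collector agent: read local results, then shorten the itinerary."""
    start = time.perf_counter_ns()
    site = profile["itinerary"][0]
    result = {"site": site, "frequent_itemsets": local_store[site]}
    result["cpu_time_ns"] = time.perf_counter_ns() - start  # elapsed time at this site
    profile["briefcase"].append(result)
    profile["itinerary"] = profile["itinerary"][1:]  # drop the visited site
    return profile


def run_serial_itinerary(profile, local_store):
    """Carry the profile from site to site until the itinerary is empty."""
    while profile["itinerary"]:
        profile = visit_site(profile, local_store)  # stands in for a network transfer
    return profile["briefcase"]  # what would be handed to the RM at the central site


# Hypothetical per-site results, for illustration only.
store = {"S1": [{"A"}, {"A", "B"}], "S2": [{"A"}], "S3": [{"B"}]}
profile = {"itinerary": ["S1", "S2", "S3"], "briefcase": []}
briefcase = run_serial_itinerary(profile, store)
print([r["site"] for r in briefcase])  # ['S1', 'S2', 'S3']
```

Because each hop starts only after the previous one finishes, the trip time in this sketch grows with the number of sites in the itinerary.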

Algorithm 9 LOCAL KNOWLEDGE COLLECTOR AGENT (LKCA)

Input: AgentProfile, a collection of agent attributes set by the AL
Output: $L^{LSAR}$, the list of locally strong association rules

1:  procedure LKCA(AgentProfile)
2:    $CPUTime_{start} \leftarrow$ get system time
3:    Briefcase $\leftarrow$ get Briefcase from AgentProfile
4:    $L^{LSAR} \leftarrow$ load $L^{LSAR}$ from local file system of this site $S_i$
5:    add $L^{LSAR}$ to $Result\_S_i$
6:    $CPUTime_{end} \leftarrow$ get system time
7:    $CPUTime \leftarrow CPUTime_{end} - CPUTime_{start}$
8:    add $CPUTime$ to $Result\_S_i$
9:    add $Result\_S_i$ to Briefcase
10:   add updated Briefcase to AgentProfile
11:   $L^{NODES} \leftarrow$ get itinerary list from AgentProfile
12:   $L^{NODES} \leftarrow$ remove first IP address from $L^{NODES}$   ▷ visited site
13:   add updated $L^{NODES}$ to AgentProfile
14:   if $L^{NODES} \neq \emptyset$ then   ▷ itinerary not empty
15:     AObject $\leftarrow$ new LKCA(AgentProfile)
16:     add AObject to AgentProfile
17:     transfer AgentProfile to DM_AEE at first IP address in $L^{NODES}$
18:   else
19:     $TripTime_{end} \leftarrow$ get system time for end of agent journey
20:     add $TripTime_{end}$ to Briefcase
21:     add updated Briefcase to AgentProfile
22:     transfer AgentProfile to RM at $S_{CENTRAL}$
23:   end if
24: end procedure

Algorithm 10 GLOBAL FREQUENT ITEMSET GENERATOR AGENT (GFIGA)

Input: Briefcase, result bag of TFICA agent
Output: $L^{GFI}$, the list of global frequent itemsets

1:  procedure GFIGA(Briefcase)
2:    $CPUTime_{start} \leftarrow$ get system time
3:    $L^{TFI} \leftarrow$ retrieve total frequent itemsets $\left(\bigcup_{i=1}^{n} L_i^{FI}\right)$ from Briefcase
4:    $L^{GFI} \leftarrow$ retrieve global frequent itemsets $\left(\bigcap_{i=1}^{n} L_i^{FI}\right)$ from Briefcase
5:    print $L^{GFI}$
6:    save $L^{GFI}$ in the local file system of site $S_{CENTRAL}$
7:    $CPUTime_{end} \leftarrow$ get system time
8:    $CPUTime \leftarrow CPUTime_{end} - CPUTime_{start}$
9:    print $CPUTime$
10:   return $L^{GFI}$
11: end procedure
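Steps 3 and 4 of GFIGA amount to a set union and a set intersection over the per-site lists. A minimal Python sketch, assuming each site's list arrives as a collection of `frozenset` itemsets (the function name and sample data are hypothetical):

```python
def total_and_global(local_lists):
    """Return (union, intersection) of the per-site frequent itemset lists."""
    site_sets = [set(itemsets) for itemsets in local_lists]
    if not site_sets:
        return set(), set()
    total = set.union(*site_sets)                   # frequent at one site or more
    global_frequent = set.intersection(*site_sets)  # frequent at every site
    return total, global_frequent


fs = frozenset
# Hypothetical Briefcase contents for three sites.
briefcase = [
    [fs("A"), fs("B"), fs("AB")],   # site S1
    [fs("A"), fs("B")],             # site S2
    [fs("A"), fs("AB")],            # site S3
]
total, global_frequent = total_and_global(briefcase)
print(sorted("".join(sorted(s)) for s in total))
print(sorted("".join(sorted(s)) for s in global_frequent))
```

In this example the itemset {A, B} is locally frequent at two sites but is excluded from the global list because site S2 does not report it.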

Algorithm 11 GLOBAL KNOWLEDGE GENERATOR AGENT (GKGA)

Input: Briefcase, result bag of LKCA agent
Output: $L^{GSAR}_{CENTRAL}$, the list of globally strong association rules

1:  procedure GKGA(Briefcase)
2:    $CPUTime_{start} \leftarrow$ get system time
3:    $L^{TLSAR} \leftarrow$ retrieve total strong rules $\left(\bigcup_{i=1}^{n} L_i^{LSAR}\right)$ from Briefcase
4:    $L^{GFI} \leftarrow$ load global frequent itemsets ($L^{GFI}$) from $S_{CENTRAL}$
5:    for all $AR_{strong} \in L^{TLSAR}$ do
6:      $L \leftarrow$ get frequent itemset from $AR_{strong}$
7:      if $L \in L^{GFI}$ then
8:        print $AR_{strong}$ along with the site address ($S_i^{IP}$)
9:        add $AR_{strong}$ to $L^{GSAR}_{CENTRAL}$
10:     end if
11:   end for
12:   save $L^{GSAR}_{CENTRAL}$ in the local file system of site $S_{CENTRAL}$
13:   $CPUTime_{end} \leftarrow$ get system time
14:   $CPUTime \leftarrow CPUTime_{end} - CPUTime_{start}$
15:   print $CPUTime$
16:   return $L^{GSAR}_{CENTRAL}$
17: end procedure
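The filtering loop of GKGA (steps 5 to 11) can be sketched in Python as below. The dictionary layout of a rule, the site labels and the function name are assumptions made for illustration; the only logic taken from the pseudocode is that a locally strong rule is kept when its underlying itemset is in the global frequent list.

```python
def globally_strong_rules(local_rules, global_frequent):
    """Keep rules whose underlying itemset appears in the global frequent list."""
    return [rule for rule in local_rules if rule["itemset"] in global_frequent]


fs = frozenset
# Hypothetical inputs: one globally frequent itemset and rules gathered from two sites.
global_frequent = {fs("AB")}
local_rules = [
    {"rule": "A => B", "itemset": fs("AB"), "site": "S1"},
    {"rule": "A => B", "itemset": fs("AB"), "site": "S2"},
    {"rule": "B => C", "itemset": fs("BC"), "site": "S1"},
]
for rule in globally_strong_rules(local_rules, global_frequent):
    print(rule["rule"], "at", rule["site"])
```

Each retained rule stays paired with the site that reported it, which mirrors the site address printed in step 8.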

Algorithm 12 GLOBAL KNOWLEDGE DISPATCHER AGENT (GKDA)

Input: AgentProfile, a collection of agent attributes set by the AL
Output: Dispatch $L^{GSAR}_{CENTRAL}$ at each distributed site $S_i$

1:  procedure GKDA(AgentProfile)
2:    $CPUTime_{start} \leftarrow$ get system time
3:    Briefcase $\leftarrow$ get Briefcase from AgentProfile
4:    $L^{GSAR}_{CENTRAL} \leftarrow$ get $L^{GSAR}_{CENTRAL}$ from Briefcase
5:    save $L^{GSAR}_{CENTRAL}$ in the local file system of site $S_i$
6:    $CPUTime_{end} \leftarrow$ get system time
7:    $CPUTime \leftarrow CPUTime_{end} - CPUTime_{start}$
8:    add $CPUTime$ to $Result\_S_i$
9:    add $Result\_S_i$ to Briefcase
10:   add updated Briefcase to AgentProfile
11:   $L^{NODES} \leftarrow$ get itinerary list from AgentProfile
12:   $L^{NODES} \leftarrow$ remove first IP address from $L^{NODES}$   ▷ visited site
13:   add updated $L^{NODES}$ to AgentProfile
14:   if $L^{NODES} \neq \emptyset$ then   ▷ itinerary not empty
15:     AObject $\leftarrow$ new GKDA(AgentProfile)
16:     add AObject to AgentProfile
17:     transfer AgentProfile to DM_AEE at first IP address in $L^{NODES}$
18:   else
19:     $TripTime_{end} \leftarrow$ get system time for end of agent journey
20:     add $TripTime_{end}$ to Briefcase
21:     add updated Briefcase to AgentProfile
22:     transfer AgentProfile to RM at $S_{CENTRAL}$
23:   end if
24: end procedure


Figure 2. Control Panel of AeMGSAR

4.IMPLEMENTATION AND PERFORMANCE STUDY

All the agents, as well as the control panel shown in Figure 2, are designed in Java. Synthetic datasets ($DB_i$) are stored across three distributed sites $S_1$, $S_2$ and $S_3$, with 3500, 3850 and 3900 transactions, respectively, and 10 items each, generated using the Transactional Data Set Generator (TDSG) tool [16]. Binary and transactional versions of these datasets are shown in Appendix A. The required configuration of the system is shown in Table 1, with additional deployment of DM_AEE at each distributed site and of AL and RM at $S_{CENTRAL}$. Round trip time taken by various MAs is shown in Figure 3. CPU time consumed by various MAs at sites $S_1$, $S_2$ and $S_3$ is shown in Figure 4, Figure 5 and Figure 6, respectively. CPU time for GFIGA and GKGA is 101357102 nanoseconds and 33317458 nanoseconds (about 101.4 ms and 33.3 ms), respectively. $L_k^{FI(i)}$ and $L_k^{FISC(i)}$ at distributed sites, generated by the LFIGA agent with 20% min_th_sup, are shown in Appendix B.1, B.2 and B.3. $L_i^{LSAR}$ at distributed sites, generated by the LKGA agent with 50% min_th_conf, are shown in Appendix B.4, B.5 and B.6. Globally frequent itemsets generated by GFIGA at $S_{CENTRAL}$ are shown in Figure 7. Fifteen 2-itemsets and eight 3-itemsets in the $L_k^{TFI}$ list are globally frequent, whereas the 4-, 5- and 6-itemsets, which are locally frequent, are not globally frequent. Globally strong association rules ($L^{GSAR}_{CENTRAL}$) generated by GKGA at $S_{CENTRAL}$ for globally frequent 3-itemsets are shown in Figure 8, and $L^{GSAR}_{CENTRAL}$ for 2-itemsets are shown in Appendix B.7.
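As a small worked example of the figures quoted above, the following Python sketch converts the 20% min_th_sup into a minimum support count per site and the reported CPU times into milliseconds. It assumes that an itemset counts as frequent when its support is greater than or equal to the threshold; the variable names are illustrative only.

```python
import math

transactions = {"S1": 3500, "S2": 3850, "S3": 3900}  # dataset sizes from the text
min_th_sup = 20                                       # percent

# Smallest support count that satisfies the threshold at each site
# (assumes "frequent" means support >= min_th_sup).
min_counts = {site: math.ceil(n * min_th_sup / 100) for site, n in transactions.items()}
print(min_counts)  # {'S1': 700, 'S2': 770, 'S3': 780}

# CPU times reported for GFIGA and GKGA, converted from nanoseconds to milliseconds.
cpu_ns = {"GFIGA": 101357102, "GKGA": 33317458}
cpu_ms = {agent: round(ns / 1_000_000, 1) for agent, ns in cpu_ns.items()}
print(cpu_ms)  # {'GFIGA': 101.4, 'GKGA': 33.3}
```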

On comparing this system with the traditional central data warehouse (DW) based approach for ARM, where the entire data from the distributed sites is centrally collected in a DW [17], it is found that the storage cost is reduced, as data is mined locally and only the resultant knowledge is carried to the central site by mobile agents. As the size of the resultant data carried by mobile agents is small, the network communication cost is also reduced. Data mining is performed locally by agents, so the computational cost at the central site is also minimised. AeMGSAR reflects the global knowledge because all the strong association rules generated are also strong at each distributed site. The system relies upon Java's in-built security system. As MAs are scalable in nature, performance is not expected to be affected by adding more sites.

Table 1. Network Configuration

Site Name        Processor   OS      LAN Configuration
                                     IP^a              Network
$S_{CENTRAL}$    Intel       MS^c    192.168.46.5      NW
$S_1$            Intel^b     MS^c    192.168.46.212    NW
$S_2$            Intel^b     MS^c    192.168.46.189    NW
$S_3$            Intel^b     MS^c    192.168.46.213    NW

a.  IP address with Mask: 255.255.255.0 and Gateway 192.168.46.1
b.  Intel Pentium Dual Core (3.40 GHz, 3.40 GHz) with 512 MB RAM
c.  Microsoft Windows XP Professional ver. 2002
d.  Network Speed: 100 Mbps and Network Adaptor: 82566DM-2 Gigabit NIC

Figure 3. Round Trip time taken by various MAs

Figure 4. CPU Time taken by various MAs at site $S_1$


Figure 5. CPU Time taken by various MAs at site $S_2$

Figure 6. CPU Time taken by various MAs at site $S_3$

Figure 7. Lists of global frequent k-itemsets at $S_{CENTRAL}$


Figure 8. Globally strong association rules for globally frequent 3-itemsets

5.CONCLUSION

Mobile agents strongly qualify for designing distributed applications, and the amalgamation of DDM and agent technology gives favourable results. Most of the existing agent based frameworks for the DARM task are only prototype models and lack an appropriate underlying execution environment, scalability, privacy preserving techniques, global knowledge generation and implementation using real datasets. In this study, a scalable MAS, called Agent enabled Mining of Globally Strong Association Rules (AeMGSAR), is presented based on the serial itinerary of the mobile agents. In this system the overall task of mining the globally strong association rules is divided into subtasks which are handled by various mobile as well as stationary agents. An AEE is also designed for the implementation and performance study of the AeMGSAR system. The serial itinerary used for mobile agent migration increases the overall cost of the DARM task, so a parallel computing model could be designed in which clones of each mobile agent are dispatched in parallel to all distributed sites.

REFERENCES

[1] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth & R. Uthurusamy, (1996) Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press.

[2] J. Han & M. Kamber, (2006) Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann.

[3] G. S. Bhamra, R. B. Patel & A. K. Verma, (2014) “Intelligent Software Agent Technology: An Overview”, International Journal of Computer Applications (IJCA), vol. 89, no. 2, pp. 19–31.

[4] R. Agrawal, T. Imielinski & A. Swami, (1993) “Mining association rules between sets of items in large databases”, in Proceedings of the ACM-SIGMOD International Conference of Management of Data, pp. 207–216.

[5] R. Agrawal & J. C. Shafer, (1996) “Parallel mining of association rules”, IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 6, pp. 962–969.

[6] M. J. Zaki, (1999) “Parallel and distributed association mining: a survey”, IEEE Concurrency, vol. 7, no. 4, pp. 14–25.

[7] X. Wu & S. Zhang, (2003) “Synthesizing high-frequency rules from different data sources”, IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 2, pp. 353–367.


[8] Y.-L. Wang, Z.-Z. Li & H.-P. Zhu, (2003) “Mobile agent based distributed and incremental techniques for association rules”, in Proceedings of the International Conference on Machine Learning and Cybernetics (ICMLC 2003), vol. 1, pp. 266–271.

[9] C. Aflori & F. Leon, (2004) “Efficient Distributed Data Mining using Intelligent Agents”, in Proceedings of the 8th International Symposium on Automatic Control and Computer Science, pp. 1–6.

[10] U. P. Kulkarni, P. D. Desai, T. Ahmed, J. V. Vadavi & A. R. Yardi, (2007) “Mobile Agent Based Distributed Data Mining”, in Proceedings of the International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2007), IEEE Computer Society, pp. 18–24.

[11] G. Hu & S. Ding, (2009a) “An Agent-Based Framework for Association Rules Mining of Distributed Data”, in Software Engineering Research, Management and Applications 2009, ser. Studies in Computational Intelligence, R. Lee and N. Ishii, Eds. Springer Berlin - Heidelberg, vol. 253, pp. 13–26.

[12] G. Hu & S. Ding, (2009b) “Mining of Association Rules from Distributed Data using Mobile Agents”, in Proceedings of the International Conference on e-Business (ICE-B 2009), pp. 21–26.

[13] A. O. Ogunde, O. Folorunso, A. S. Sodiya, J. A. Oguntuase & G. O. Ogunleye, (2011) “Improved cost models for agent based association rule mining in distributed databases”, Anale. Seria Informatica, vol. 9, no. 1, pp. 231–250, Available: http://anale-informatica.tibiscus.ro/download/lucrari/9-1-20-Ogunde.pdf

[14] G. S. Bhamra, A. K. Verma & R. B. Patel, (2015) “Agent Based Frameworks for Distributed Association Rule Mining: An Analysis”, International Journal in Foundations of Computer Science & Technology (IJFCST), vol. 5, no. 1, pp. 11–22.

[15] R. Agrawal & R. Srikant, (1994) “Fast Algorithms for Mining Association Rules in Large Databases”, in Proceedings of the 20th International Conference on Very Large Data Bases (VLDB’94), Morgan Kaufmann Publishers Inc., pp. 487–499.

[16] G. S. Bhamra, A. K. Verma & R. B. Patel, (2011) “TDSGenerator: A Tool for generating synthetic Transactional Datasets for Association Rules Mining”, International Journal of Computer Science Issues (IJCSI), vol. 8, no. 2, pp. 184–188.

[17] G. S. Bhamra, A. K. Verma & R. B. Patel, (2014) “An Investigation into the Central Data Warehouse based Association Rule Mining”, International Journal of Computer Applications (IJCA), vol. 96, no. 10, pp. 1–12.

AUTHORS

Gurpreet Singh Bhamra is currently working as Assistant Professor at the Department of Computer Science and Engineering, M. M. University, Mullana, Haryana. He received his B.Sc. (Computer Sc.) and MCA from Kurukshetra University, Kurukshetra in 1995 and 1998, respectively. He is pursuing a Ph.D. from the Department of Computer Science and Engineering, Thapar University, Patiala, Punjab. He has been teaching since 1998. He has published 13 research papers in International/National Journals and International Conferences. He received the Best Paper Award for “An Agent enriched Distributed Data Mining on Heterogeneous Networks”, in “Challenges & Opportunities in Information Technology” (COIT-2008). He is a Life Member of the Computer Society of India. His research interests are in Distributed Computing, Distributed Data Mining, Mobile Agents and Bio-informatics.

Dr. Anil Kumar Verma is currently working as Associate Professor at the Department of Computer Science & Engineering, Thapar University, Patiala. He received his B.S., M.S. and Ph.D. in 1991, 2001 and 2008, respectively, majoring in Computer Science and Engineering. He worked as Lecturer at M.M.M. Engineering College, Gorakhpur from 1991 to 1996. He joined Thapar Institute of Engineering & Technology in 1996 as a Systems Analyst in the Computer Centre and is presently associated with the same Institute. He has been a visiting faculty to many institutions. He has published over 100 papers in refereed journals and conferences (India and abroad). He is a MISCI (Turkey), LMCSI (Mumbai), GMAIMA (New Delhi). He is a certified software quality auditor by MoCIT, Govt. of India. His research interests include wireless networks, routing algorithms, securing ad hoc networks and data mining.

Dr. Ram Bahadur Patel is currently working as Professor and Head at the Department of Computer Science & Engineering, Chandigarh College of Engineering & Technology, Chandigarh. He received his PhD from IIT Roorkee in Computer Science & Engineering, PDF from Highest Institute of Education, Science & Technology (HIEST), Athens, Greece, MS (Software Systems) from BITS Pilani and B. E. in Computer Engineering from M. M. M. Engineering College, Gorakhpur, UP. Dr. Patel has been in teaching and research since 1991. He has supervised 36 M. Tech, 7 M. Phil. and 8 PhD theses. He is currently supervising 6 PhD students. He has published 130 research papers in International/National Journals and Refereed International Conferences. He has written 7 text books for engineering courses. He is a member of ISTE (New Delhi) and IEEE (USA). He is a member of various International Technical Committees and participates frequently in International Technical Committees in India and abroad. His current research interests are in Mobile & Distributed Computing, Mobile Agent Security and Fault Tolerance, and Sensor Networks.

APPENDIX A – SYNTHETIC DATASETS

A.1 BDS3500T10I.txt and corresponding TDS3500T10I.txt ($DB_1$) at site $S_1$

These synthetic binary and transactional datasets of 3500 records are created by the TDSG tool at site $S_1$. In the binary version each column head represents the item number and each row represents a transaction, where integer ‘1’ is used for a purchased item and ‘0’ is used if it is not purchased. The corresponding transactional version has a Transaction ID (TID) for each transaction, and Itemset is the set of all the purchased items for that particular transaction.


A.2 BDS3850T10I.txt and corresponding TDS3850T10I.txt ($DB_2$) at site $S_2$

These synthetic binary and transactional datasets of 3850 records are created by the TDSG tool at site $S_2$.

A.3 BDS3900T10I.txt and corresponding TDS3900T10I.txt ($DB_3$) at site $S_3$

These synthetic binary and transactional datasets of 3900 records are created by the TDSG tool at site $S_3$.


APPENDIX B – RESULTANT KNOWLEDGE OF AEMGSAR SYSTEM

B.1 $L_k^{FI(1)}$ and $L_k^{FISC(1)}$ at site $S_1$

The list of frequent k-itemsets, i.e., $L_k^{FI(1)}$, is represented by column L, and column SC shows the support count of the corresponding frequent k-itemset, i.e., $L_k^{FISC(1)}$, at site $S_1$. These frequent itemsets and their support counts are obtained by processing the synthetic dataset ($DB_1$) shown in Appendix A.1.

B.2 $L_k^{FI(2)}$ and $L_k^{FISC(2)}$ at site $S_2$

These frequent itemsets and their support counts are obtained by processing the synthetic dataset ($DB_2$) shown in Appendix A.2.


B.3 $L_k^{FI(3)}$ and $L_k^{FISC(3)}$ at site $S_3$

These frequent itemsets and their support counts are obtained by processing the synthetic dataset ($DB_3$) shown in Appendix A.3.


B.4 $L_1^{LSAR}$ at site $S_1$

Column L represents the frequent k-itemset and column AR(support, confidence) shows the list of locally strong association rules, i.e., $L_1^{LSAR}$ at site $S_1$. Each strong rule has its associated support and confidence factor. The minimum threshold support is taken as 20% and the minimum threshold confidence as 50% for generating the strong rules from the data shown in Appendix B.1.


B.5 $L_2^{LSAR}$ at site $S_2$

Column L represents the frequent k-itemset and column AR(support, confidence) shows the list of locally strong association rules, i.e., $L_2^{LSAR}$ at site $S_2$. Each strong rule has its associated support and confidence factor. The minimum threshold support is taken as 20% and the minimum threshold confidence as 50% for generating the strong rules from the data shown in Appendix B.2.


B.6 $L_3^{LSAR}$ at site $S_3$

Column L represents the frequent k-itemset and column AR(support, confidence) shows the list of locally strong association rules, i.e., $L_3^{LSAR}$ at site $S_3$. Each strong rule has its associated support and confidence factor. The minimum threshold support is taken as 20% and the minimum threshold confidence as 50% for generating the strong rules from the data shown in Appendix B.3.


B.7 $L^{GSAR}_{CENTRAL}$ at site $S_{CENTRAL}$

Column L represents the globally frequent k-itemsets, i.e., itemsets which are locally frequent at all the distributed sites, and column AR(support, confidence) shows the list of globally strong association rules, i.e., $L^{GSAR}_{CENTRAL}$, for such itemsets. Each globally strong rule has its associated support and confidence factor. The minimum threshold support is taken as 20% and the minimum threshold confidence as 50%. Site represents the IP address of the site where the rule is locally strong. IP address 192.168.46.212 is used for site $S_1$, 192.168.46.189 for site $S_2$ and 192.168.46.213 for site $S_3$.