Location Prediction of Mobile Users in Knowledge Grid

7/29/2019 Location Prediction of Mobile Users in Knowledge Grid

1/54

CHAPTER 1

INTRODUCTION

. In proposed project, knowledge grid is developed to predict the next location of

mobile user in mobile environment. By using the predicted location, the system

effectively allocate resources to mobile users in the neighbor location and it is possible

to answer the queries that refer to the future position of mobile uses. Data grid is

designed to allow large moving logs to be stored in repositories. In business area it is

necessary to develop environment for analysis, inference and discovery over the data

grid. Therefore, the evolution of the data grid is represented by knowledge grid offering

high level services for distributed mining and extraction of knowledge from data

repositories available on data grid. Location management or location tracking is the

mechanism of locating the particular mobile user at any given point of time.

The most prominent example of distributed environment is grid, where a large

number of computing and storage units are interconnected over a high speed network.

Data mining is inductive, iterative process that extracts information or frequent patterns

from large volume of data. Most of the applications have created large volume of data

sets which are constantly increasing and stored in geographically distributed locations.

Grid computing has distinguished from conventional distributed computing by its focus

on large-scale resource sharing, innovative applications, and high-performance

orientation [1]. In Grid computing, the term resource management refers to the

operations used to control how capabilities provided by Grid resources and services are

made available to other entities such as users, applications, or services [2]. A resource

manager is one of the most critical components of the grid middleware [4] since it is

responsible for resource management that provides a resource selection and job

scheduling. Therefore, resource discovery, resource selection, and job scheduling that

has considerable influence on computing performance are important issues in grid

computing.

1


2/54

1.1OBJECTIVE

Location management or location tracking is the mechanism of locating theparticular mobile user at any given point of time. The two strategies for location

tracking are location updating and location prediction . Location updating is a passive

method in which the system periodically updates the current location in a database.

Location prediction is the dynamic strategy in which the system proactively estimates

the mobile users location based on a user movement model. Personal Communication

System (PCS) allows mobile users to move from one location to another location since

these systems are based on the notion of wireless Access. A parallel and distributedalgorithm is implemented on computational grid to mine data stored in data grids. The

Knowledge Grid is a parallel and distributed software architecture that integrates grid

technologies and data mining applications. Apriori algorithm is used to predict the next

location of the mobile user.

A Grid allows its constituent resources to be used in a coordinated fashion to

deliver various qualities of service, relating for example to response time, throughput,

availability, and security, and/or co allocation of multiple resource types to meet

complex user demands, so that the utility of the combined system is significantly

greater than that of the sum of its parts. Among the Areas of data mining, the problem

of deriving associations from the data has received a great deal of attention. The

algorithms for discovering frequent sets are not directly suitable when the underlying

database is incremented intermittently. The idea is to avoid computing the frequent sets

afresh for the incremented set of data.

2


3/54

CHAPTER 2

SYSTEM ANALYSIS

2.1 LITERATURE SURVEY

Rawat et.al[2009] has developed a parallel and distributed algorithm on a grid

environment to provide an effective support for the grid inorder to solve the problems

such as combining different technologies for implementing high performance

distributed knowledge discovery systems. Apriori algorithm is applied on grid network

to prune all the frequent items. Huge communication cost is one among the drawbacks.

D.Lee et.al[2009] attempted a project on enabling the mobile devices as a

resources to access the grid network.IP-Paging scheme has been proposed based on the

mobile grid environment. Location tracking ,limited power using grid paging cache and

service migration using reserved resource are some of the issues discussed in the

project.The drawbacks are problems in job scheduling, security issues etc.,

C.Aflori et.al[2007] presented the implementation of an association rules

discovery data mining task using Grid Technologies. For the mining task, Apriori

algorithm on top of the Globus tool kit was used.The project also dealt with the design

and integration of the data mining algorithm with Globus services.Interoperability

between the gridlets are low.

T.Liu et.al[2009] proposed a new algorithm Hierarchial Location-Prediction

Algorithm(HLP) to improve the connection reliability and location tracking efficiency

in the wireless network by predicting the future trajectory of the mobile user. The

proposed system uses this concept in predicting the next location of the mobile user in

the grid environment. Pattern matching algorithm does not well suited in distributed

environment.

3


4/54

2.2 EXISTING SYSTEM

The centralized mobility pattern algorithm was discussed in which

cannot be efficient for large data size. Many parallel and distributed variants of

sequential apriori algorithm have been discussed in other resources. In, grid

implementation of frequent item sets in a grid environment deals with sales transaction

of a company, these algorithms cannot be used directly in our domain, because the

specified algorithm does not take into account the network topology while generating

the candidate pattern.

The generation of candidate pattern in is not same as the candidate pattern in

mobile environment. In PCS, only the sequence of neighboring location of the network

can be considered as the mobility pattern.

2.2.1 Drawbacks of Existing System

The existing system are not efficient for large dataset of history of mobile

locations.

These algorithms cannot be used directly in our domain, because it does not

take into account the network topology while generating the candidate patterns

2.3 PROPOSED SYSTEM

The proposed system implements incremental parallel and distributed knowledge

grid mining approach based on apriori algorithm for mining mobility patterns in mobile

environment. Knowledge Discovery in Database (KDD) employs a variety of software,

tools called as data mining to find useful patterns and models. This technique is called

as Parallel and Distributed Knowledge Discovery (PDKD). Grid computing is the most

emerged area for high performance distributed applications like knowledge discovery

process. Knowledge gird is an integration of basic grid services.

4


5/54

ADVANTAGES OF PROPOSED SYSTEM

After the execution of parallel and distributed mining algorithm, mobility

patterns are stored in knowledge base repository.

Our proposed algorithm is executed on knowledge grid to generate mobility

patterns from the database distributed on the data grid Node.

The execution time of our parallel and distributed algorithm is relatively

small when compared to sequential algorithm

2.4 FEASIBILITY STUDY

A feasibility study is concerned to select the best system that meets performance

requirements. These entities are an identification description, an evaluation of candidate

systems and the selection of the best system for the job.

Economic feasibility

Operational feasibility

Technical feasibility

2.4.1 Economic Feasibility

The Economic feasibility deals with whether it is possible to work with

existing equipment, hardware, software facilities and available personnel. The proposed

system requires no external manpower.

2.4.2 Operational Feasibility

Operational feasibility is a measure of how well a proposed system solves theproblems, and takes advantage of the opportunities identified during scope definition

and how it satisfies the requirements identified in the requirements analysis phase of

system development.operational feasibility is mainly concerned with issues like

5


6/54

whether the system will be used if it is developed and implemented. Whether there will

be resistance from users that will effect the possible application benefits.

2.4.3 Technical Feasibility

Technical feasibility deals with whether the necessary technology exists to

do what is suggested and if the equipment has the capacity to hold the data required by

the use of new system etc.The proposed project deals with technical guarantee of

accuracy, reliability, ease of access, data security could be provided.

6


7/54

CHAPTER 3

SYSTEM CONFIGURATION

3.1 HARDWARE REQUIIREMENTS

Processor : Pentium IV

Speed : Above 500 MHz

RAM capacity : 2 GB

Hard disk drive : 80 GB

Key Board : Samsung 108 keys

Mouse : Logitech Optical Mouse

Printer : DeskJet HP

Motherboard : Intel

Cabinet : ATX

Monitor : 17 inch Samsung

3.2 SOFTWARE REQUIIREMENTS

Operating System : Windows XP and above

Front end used : Gridsim

Back end used : SQL server 2000

7


8/54

CHAPTER 4

SOFTWARE DESCRIPTION

4.1 GRIDSIM

As the project is used in distributed environment, a grid is constructed to share

the resources among mobile users for location based services. Inorder to form the grid,

a suitable tool Gridsim is used. The GridSim toolkit allows modeling and simulation of

entities in parallel and distributed computing (PDC) systems-users, applications,

resources, and resource brokers (schedulers) for design and evaluation of schedulingalgorithms. It provides a comprehensive facility for creating different classes of

heterogeneous resources that can be aggregated using resource brokers. for solving

compute and data intensive applications. A resource can be a single processor or multi-

processor with shared or distributed memory and managed by time or space shared

schedulers. The processing nodes within a resource can be heterogeneous in terms of

processing capability, configuration, and availability. The resource brokers use

scheduling algorithms or policies for mapping jobs to resources to optimize system or

user objectives depending on their goals.

4.1.1 Overview of GridSim functionalities:

incorporates failures of Grid resources during runtime.

new allocation policy can be made and integrated into the GridSim Toolkit,

by extending from AllocPolicy class.

has the infrastructure or framework to support advance reservation of a grid

system.

incorporates a functionality that reads workload traces taken from

supercomputers for simulating a realistic grid environment.

8


9/54

incorporates an auction model into GridSim.

incorporates a datagrid extension into GridSim.

incorporates a network extension into GridSim. Now, resources and other

entities can be linked in a network topology.

incorporates a background network traffic functionality based on a

probabilistic distribution. This is useful for simulating over a public network

where the network is congested.

incorporates multiple regional GridInformationService (GIS) entities

connected in a network topology. Hence, you can simulate an experiment

with multiple Virtual Organizations (VOs).

adds ant build file to compile GridSim source files.

The GridSim 1.0 toolkit does not have any additional GUI tools that enable

easier and faster modeling and simulation of Grid environments. It only comprises the

Java-based packages that are to be called by users self-written Java VM has now been

included in the recently released Grid- Sim 2.0 toolkit. There are many similar tools

like VM that generate source code, but are not related to Grid simulation in specific,

such as SansGUITM and SimCreator.

4.2 SQL SERVER

Working With SQL Server Through The Server Explorer

Using visual studio.net, there is no need to open the enterprise manager from

SQL server. Visual studio.net has the SQL servers tab within the server explorer that

gives a list of all the servers that are connected to those having SQL server on them.

Opening up a particular server tab gives five options:

Database diagrams

Tables

Views

9


10/54

Stored procedures

Functions

Database diagrams

To create a new diagram right click database diagrams and select new diagram.

The add tables dialog enables to select one to all the tables that you want in the visual

diagram you are going to create. Visual studio .net looks at all the relationships

between the tables and then creates a diagram that opens in the document window.

Each table is represented in the diagram and a list of all the columns that are

available in that particular table.

Each relationship between tables is represented by a connection line between

those tables. The properties of the relationship can be viewed by right clicking the

relationship line.

Tables

The server explorer allows working directly with the tables in SQL server. It

gives a list of tables contained in the particular database selected.

4.2.1 Features of SQL-Server

The OLAP services feature available in SQL server version 7.0 is now called

SQL server 2000 analysis services. The term OLAP service has been replaced with the

term analysis services. Analysis services also include a new data mining component.

The repository component available in SQL server version 7.0 is now called Microsoft

SQL server 2000 Meta data services. References to the component now use the term

Meta data services. SQL-server database consist of six type of objects.

Table

Query

Form

Report

10


11/54

Macro

Table

A database is a collection of data about a specific topic.

Views of table

It work with a table in two types,

Design view

Datasheet view

Design view

To build or modify the structure of a table we work in the table design view. We

can specify what kind of data will be hold.

Datasheet view

To add, edit or analyses the data itself ,working is done in tables datasheet view

mode.

Query

A query is a question that has to be asked the data. Access gathers data that

answers the question from one or more table. The data that make up the answer is either

dynaset (if you edit it) or a snapshot(it cannot be edited).each time, run the query, then

get latest information in the dynaset.access either displays the dynaset or snapshot for

us to view or perform an action on it ,such as deleting or updating.

Forms

A form is used to view and edit information in the database record by record. A

form displays only the information want to see in the way it is required to see it. Forms

use the familiar controls such as textboxes and checkboxes. This makes viewing and

entering data easy.

11


12/54

CHAPTER 5

PROJECT DESCRIPTION

5.1 PROBLEM DEFINITION

Most of the mobile users exhibit some regularity in their daily movement

and they move from one location to another location in a wireless PCS network. The

coverage area of the network is divided into number of location areas. Each mobile

device is linked with the base station. Each base station contains home location register

which stores permanent details of the mobile users and Visiting Location Register(VLR) which stores temporary details of the mobile users. These register includes

attributes like user ID, user location, call time, call duration. User ID acts as a key for

the mobile user records. The movement history of a mobile user is extracted from log

files and it is stored in grid node for mining mobility patterns. In our grid network,

mobile user movement history is geographically distributed in different data grid nodes.

Knowledge grid based mobility pattern mining algorithm is executed on

computational grid over the data grid to generate the trajectories that are frequently

used by mobile users. The frequently followed trajectory is named as User Mobility

Pattern (UMP) which is used to generate mobility rules. The communication of

candidate item sets between grid nodes is achieved by MPICH-G2 and reduces the

computation overhead. Our proposed algorithm is executed on knowledge grid to

generate mobility patterns from the database distributed on the data grid node. The

execution time of our parallel and distributed algorithm is relatively small when

compared to sequential algorithm.In general, mobility prediction is performed in two

steps. 1. Find all mobility patterns: By definition, each of these location sets will occur

at least as frequently as a predetermined global minimum support count, min_sup. 2.

Mobility rules are generated from the mobility patterns which satisfy global minimum

support and minimum confidence. Mobility Rule is the important knowledge derived

from database on a data grid. Moving logs are added incrementally to the data base. It

12


13/54

makes existing mobility rules invalidate. Re execution of mining algorithm is not an

efficient process, since it ignores previously discovered rules and repeats the work that

has already been done. Incremental mining algorithm is a useful technique for mining

mobility rules when logs are added to the data base continuously.

In mobile web environment, next location of mobile users is predicted using

mobility rule and current location of the mobile user. Mobility rule contains two parts

namely, head - the part before the arrow and tailthe part after the arrow. Our process

generates set of rules whose head matches with the current location of the mobile user.

These rules are called as matching rules. The first location in the tail of the matching

rule and match value is stored in the resultant array. Match value is calculated by

summing up the support value of the UMP and confidence of the rule. The matchingrules in the array are sorted in descending order with respect to match value. This

process generates most confident and frequent rules.

5.2 OVER VIEW OF PROJECT

Grid Computing has emerged as an important new field, distinguished from

conventional distributed system by focusing on large scale resource sharing, problem

solving in dynamic and multi-institutional virtual organizations. The Knowledge Grid

(KG) is a parallel and distributed architecture that integrates data mining techniques

and grid technologies. The Knowledge Grid is exploited to perform distributed data

mining on very large data sets available over grids to find hidden valuable information,

process models to make decisions and results to make business decisions. In our work

knowledge grid is developed to predict the next location of mobile user in mobile

environment. By using the predicted location, the system effectively allocate resources

to mobile users in the neighbor location and it is possible to answer the queries that

refer to the future position of mobile uses. Data grid is designed to allow large moving

logs to be stored in repositories. In business area it is necessary to develop environment

for analysis, inference and discovery over the data grid. Therefore, the evolution of the

13


14/54

data grid is represented by knowledge grid offering high level services for distributed

mining and extraction of knowledge from data repositories available on data grid.

Location management or location tracking is the mechanism of locating the particular

mobile user at any given point of time. The two strategies for location tracking are

location updating and location prediction. Location updating is a passive method in

which the system periodically updates the current location in a database. Location

prediction is the dynamic strategy in which the system proactively estimates the mobile

users location based on a user movement model. Personal Communication System

(PCS) allows mobile users to move from one location to another location since these

systems are based on the notion of wireless access. In mobile system each mobile user

is associated with Home Location Register (HLR) which stores up-to-date location of

the mobile users. These logs accumulate as large database, in which data mining

technique is applied to find the frequently followed location. Each Base Station (BS) in

PCS is connected with Separate Home Location Register led to the geographically

distributed data grid node. Grid network was built with cluster of grid node contains

moving logs of mobile users. Existing research work applied data mining technique on

mobile data for path mining in a single database server. If the size of the moving logs is

very large, the overhead in integrating the data source will be too high.

To overcome this problem data mining algorithm is executed in distributed

environment. The most prominent example of distributed environment is grid, where a

large number of computing and storage units are interconnected over a high speed

network. Data mining is inductive, iterative process that extracts information or

frequent patterns from large volume of data. Most of the applications have created large

volume of data sets which are constantly increasing and stored in geographically

distributed locations. A parallel and distributed algorithm is implemented on

computational grid to mine data stored in data grids. The Knowledge Grid is a parallel

and distributed software architecture that integrates grid technologies and data mining

applications.

14


15/54

5.3 MODULE DESCRIPTION

The project is comprised into five modules

They are,

Construction of Mobile Network Grid

Aggregation of User Actual Path (UAP)

Mobile Rule Generation using Knowledge Grid Mobility Pattern Mining

Algorithm (KMPM)

Mobile Rule Generation using Apriori Algorithm

Comparision

5.3.1 Construction of Grid For Mobile Users

In this module ,a grid for mobile users has been constructed using Gridsim.

The Grid architecture has formed with Gridlets and Grid Server to simulate the Grid.

For each location a a gridlet is assigned. Four gridlets are considered in the project.

5.3.2 Aggregation of User Actual Path(UAP)

In this module, the user actual paths which is nothing but the mobility routes of

the users are collected . The User Actual paths(UAPs) of the past mobility patterns of

the users are aggregated.For instance, consider UAPs 4, 6, 8, 0, 5 2, 4, 8, 0, 6 and

1, 2, 4, 6 where the number 4 represents location of mobile user. The support count

of the subsequence 4, 6 can be calculated as follows. s.count = s.count + suppInc and

suppInc= 1+

totdis1, where totdis is the number of location between 4 and 6. s.count

value is 2 because it appears in 1st and 3rd UAP. In 2nd UAP there are two locations

between 4 and 6. Therefore the support value for 4 and 6 is 4, 6count = 2+ 121

=2.33. It will increase the accuracy of the support counting. KMPM algorithm will

generate more accurate global mobility pattern.

15


16/54

5.3.3 Mobile Rule Generation Using Knowledge Grid Mobility Mining

Algorithm(KMPM )

The KMPM algorithm is used in a distributed environment. Every process

generates subsequence of length k called as Local Subsequence (LSk) and then

calculates local support count for each LSk. These subsequence local support count is

exchanged with all other process to generate Global Subsequence (GSk) and then

calculates global support count for each GSk. The subsequences which have a support

count greater than the threshold global support count are selected as Global Mobility

Pattern of length k (GMPk). The support count of the subsequence can be calculated as

follows. s.count = s.count + suppInc and suppInc= 1+totdis 1 , where totdis is the

number of location between 4 and 6. s.count value is 2 because it appears in 1st and 3rd

UAP. In 2nd UAP there are two locations between 4 and 6. Therefore the support value

for 4 and 6 is 4,6 .count = 2+ 1 2 1 + =2.33. It will increase the accuracy of the

support counting.The pseudocode for Knowledge grid based Mobility Pattern

Mining (KMPM) algorithm is given below:

// pseudo code for local mining at a process Pi

KMPM( )

Input:moving path of mobile users

Output:local frequent mobility pattern

LSk=null //k is the length of subsequence

for eachUAP a Di

find the subsequence of UAP and put it in S

for eachsubsequence s S

//calculate the support count and store it in

LSk

s.count = s.count +`1 s.suppInc

16


17/54

end for

end for

// pseudocode for global mining at kth pass

globalmining()

Input:local frequent patterns

Output: global frequent patterns

k=1

while(GSk null)

for every from 1 to n increment by 1

node Pi exchange and merge local support

counts of LSk with all n nodes and find the

global support count of all subsequence and

store it in GSk

end for

for eachsubsequence in GSk

if the global support count of subsequence is above the minimum global support count

then put the subsequence in global mobility pattern

GMPk

end for

increment k by 1

end while

5.3.4 Mobile Rule Generation In Apriori Algorithm

17


18/54

In Apriori Algorithm,association rules are usually required to satisfy a user-

specified minimum support and a user-specified minimum confidence at the same time.

Association rule generation is usually split up into two separate steps:

First, minimum support is applied to find all frequent itemsets in a database.

Second, these frequent itemsets and the minimum confidence constraint are

used to form rules. While the second step is straight forward, the first step needs more

attention.

Finding all frequent itemsets in a database is difficult since it involves searching

all possible itemsets (item combinations). The set of possible itemsets is the power set

over I and has size 2n 1 (excluding the empty set which is not a valid itemset).

Although the size of the powerset grows exponentially in the number of items n in I,

efficient search is possible using the downward-closure property.

5.3.5 COMPARISION

In the proposed project, number of rules generated by both the

algorithms(KMPM and Apriori)and the supportive values considered for them are

compared.The result is that, Apriori algorithm generates less number of rules from the

user actual path of a particular user. Comparison is represented by graph.

0

2000

4000

6000

8000

10000

12000

N1 N2 N3 N4

Apriori

KMPN

Fig 5.1 Sample Graph

18
http://en.wikipedia.org/wiki/Power_sethttp://en.wikipedia.org/wiki/Power_set


19/54

5.4 SYSTEM FLOW DIAGRAM

.

Fig 5.2 System Flow Diagram

19

User

Registration of

New Users

New User Registration

Verification

of Users

New

Already Existing

Plz Re

login UrAggregation of User

Actual Paths

Applying Apriori algorithm

Applying Knowledge grid based

Server Resource

Server

Com arision

Not and

authorized

authorized


20/54

5.5 DATA FLOW DIAGRAM

Fig 5.3 Data Flow Diagram

20

Construction of Mobile

Network Grid

Aggregat

ion of

User

Actual

User Login

Applying

Knowledge

grid based

Mobility

Pattern

Mining

(KMPM)

Applying

Apriori

algorithm

Comparisio

n

Rules Detail


21/54

5.6 DATABASE DESIGN

Table 5.1 Userlogin in gridsim

Table 5.2 Location of mobile users

21


22/54

5.7 INPUT DESIGN

Input design is the part of overall system design which requires very careful

attention. Often the collection of input data is the most expensive part of the system, in

terms of both the equipment used and the number of people involved; it is the point of

most contact for the users with the computer system; and it is prone to error. If data

going into the system are incorrect, then the processing and output will magnify these

error.

In proposed system inputs are given in two ways, the Existing users can directly

enter into the system using login form, and new users have to register all their details in

the registration form provided.

Input design is the very important part in the project and should be concentrated

well as it is prone to error. The data that are to be inserted are to be inserted with care as

this plays a very important role. In order to get the meaningful output and to achieve

good accuracy the input should be acceptable and understandable by the user.

5.8 OUTPUT DESIGN

Output design plays a very important role in a system. Getting a correct output

is a task that has to be concentrated, as a system is validated as a correct one only if it

gives the correct output according to the input.Here in this project in all the three days

of inductions if the employee has completed all his/her input, then the output shows the

status as completed or his status will be pending.

22


23/54

CHAPTER 6

SYSTEM TESTING

Testing is a means of discovering the quality level of a software system, not a

means of assuring software quality. Testing cannot show the absence of effects, it can

only show that software defects are present. So when planning is on for software

testing, the activity has to be considered as a destructive activity of the software that

has been built.

THE OBJECTIVES OF TESTING ARE

Testing is a process of executing a program with the intent of finding an

error.

A good test is one that has a high probability of finding an as yet

undiscovered error.

A successful test is one that uncovers an as yet undiscovered error.

Software testing is an important element of software quality assurance

and represents the ultimate review of specification design and coding.

6.1 UNIT TESTING

Unit testing is the testing of individual hardware or software units or groups of

related unit. Using white box testing techniques, testers (usually the developers creating

the code implementation) verify that the code does what it is intended to do at a very

low structural level. For example, the tester will write some test code that will call a

method with certain parameters and will ensure that the return value of this method is

as expected. Looking at the code itself, the tester might notice that there is a branch (an

if-then) and might write a second test case to go down the path not executed by the first

test case. When available, the tester will examine the low-level design of the code;

otherwise, the tester will examine the structure of the code by looking at the code itself.

Unit testing is generally done within a class or a component.Each module from

23


24/54

construction of grid to predicting next location by using Apriori algorithm is tested

separately.

6.2 ACCEPTANCE TESTING

After functional and system testing, the product is delivered to a customer and

the customer runs black box acceptance tests based on their expectations of the

functionality. Acceptance testing is formal testing conducted to determine whether or

not a system satisfies its acceptance criteria (the criteria the system must satisfy to be

accepted by a customer) and to enable the customer to determine whether or not to

accept the system. These tests are often pre-specified by the customer and given to the

test team to run before attempting to deliver the product. The customer reserves the

right to refuse delivery of the software if the acceptance test cases do not pass.However, customers are not trained software testers. Customers generally do not

specify a complete set of acceptance test cases. Their test cases are no substitute for

creating its own set of functional/system test cases. The customer is probably very good

at specifying at most one good test case for each requirement. As you will learn below,

many more tests are needed. Whenever possible, it should run customer acceptance test

cases ourselves so that it can increase its confidence that they will work at the customer

location.

6.3 TEST CASES

Unit Testing

A program represents the logical elements of a system. For a program to run

satisfactorily, it must compile and test data correctly and tie in properly with

other programs. Achieving an error free program is the responsibility of the

programmer. Program testing checks for two types of errors: syntax and logical.

Syntax error is a program statement that violates one or more rules of the

language in which it is written. An improperly defined field dimension or omitted

keywords are common syntax errors. These errors are shown through error message

24


25/54

generated by the computer. For Logic errors the programmer must examine the output

carefully.

When a program is tested, the actual output is compared with the

expected output. When there is a discrepancy the sequence of instructions must

be traced to determine the problem. The process is facilitated by breaking the

program into self-contained portions, each of which can be checked at certain

key points .The idea is to compare program values against desk-calculated values

to isolate the problems.In the aggregation of user actual path, location of each user is

Collected and both algorithms are applied.

Functional Testing

Functional testing of an application is used to prove the application delivers

correct results, using enough inputs to give an adequate level of confidence that will

work correctly for all sets of inputs. The functional testing will need to prove that the

application works for each client type and that personalization function work correctly.

Test case no Description Expected result

1 Test for all modules All module should

communicate in the

application.

2 Test for every functions in a framework

as it display all results.

The result after execution

should give the accurate

result.Non-Functional Testing

This testing used to check that an application will work in the operational

environment.

Non-functional testing includes:

Load testing

Performance testing

25


26/54

Usability testing

Reliability testing

Load Testing


1 It is necessary to ascertain that the

application behaves correctly under

loads all the components.

Should display error

message for particular

components.

PERFORMANCE TESTING


1 This is required to assure that an

application perforce adequately, having

the capability to handle many peers,

delivering its results in expected time

and using an acceptable level of

resource and it is an aspect of

operational management.

Should handle large input

values, and produce

accurate result in a

expected time

RELIABILITY TESTING


1 This is to check that the system is

rugged and reliable and can handle the

failure of any of the components

involved in provide the application.

In case of failure of the

system an error handling

function should take over

the process

26


27/54

White Box Testing

White box testing, sometimes called glass-box testing

is a test case design method that uses the control structure of the procedural

design to derive test cases. Here the mobility rule generation is tested separately as

because they are the parameters for prediction of next location method

Using white box testing method, the software engineer can derive test cases.


1 Exercise all logical decisions on their

true and false sides

All the logical decisions

must be valid

2 Execute all loops at their boundaries

and within their operational bounds.

All the loops must be finite

3 Exercise internal data structures to

ensure their validity.

All the data structures must

be valid

Black Box Testing

Black box testing, also called behavioral testing, focuses on the functional

requirements of the software. That is, black testing enables the software engineer to

derive sets of input conditions that will fully exercise all functional requirements

for a program. Black box testing is not alternative to white box techniques. Rather it

is a complementary approach that is likely to uncover a different class of errors

than white box methods. Black box testing attempts to find errors in the following

categories.In the project,all the nodes which are connected to the grid must have a

perfect interface with the provider inorder to retrive and update data from mobile users

register and to store the same.

27


28/54

Test

case no

Description Expected result

1 To check for incorrect or missing

functions

All the functions must be valid

2 To check for interface errors All the interface must function normally

3 To check for errors in a data

structures or external data base

access.

The database updation and retrieval must

be done

4 To check for initialization and

termination errors.

All the functions and data structures must

be initialized properly and terminated

normally

28


29/54

CHAPTER 7

SYSTEM IMPLEMENTATION

System implementation is the process of making the newly designed system

fully operational and consistent in performance. That is, implementation is the process

of having the personnel check out and put new equipment into use, train the users to use

the new system and construct any file that are needed to use it. At this stage the main

workload, the major impact on the existing practices shifts to the user department.

If the implementation is not carefully planned and controlled, it can cause

chaws. Thus it can be considered to be the most crucial stage in achieving a successful

new system and in giving the users confidence that the new system will work and be

effective.

Before the development of the system, the user specification, the forms are

prepared. The user can specify the change if any, then the design department examines

the changes and if accepted then the requirement of the user are taken care of stagewhere the system design begins, i.e., the theoretical design is converted into a working

system. All the technical errors are fixed and the test data is entered. Then the reports

are prepared and compared with that of the existing system. If the new system is not

working properly, then once again go back to the exiting system and after rectification;

the new system can be installed.

7.1 RESULTS & GRAPH

The next location of the mobile user is predited for providing the location based

services. Then the results generated by both algorithms are compared and represented

in the form of graph.Graph is ploted between the number of rules generated and the

number of UAPs of a particular user. The algorithm with less number of rules are

suited for predicting the users next location.

29


30/54

Fig.7.1 Comparison Based on No.of Rules

Fig.7.2 Comparison based on Supportive Value

30


31/54

CHAPTER 8

CONCLUSION

8.1 CONCLUSION

In the proposed project, Apriori algorithm on Knowledge grid to predict the

next location of mobile user in a mobile web computing system has been implemented.

Apriori algorithm performs better when compared to KMPM for re-computation of

larger datasets. In the first step of mining algorithm, mobility patterns are mined from

the user actual path and mobility rules are generated using mobility patterns. Finally

current location of mobile user is compared with the mobility rule to predict the

location of mobile user. By using the predicted movement, the system can effectively

allocate resources and provide location based services to the mobile users. Apriori

algorithm for mobility prediction takes less computation time when compared to

knowledge grid based mobility pattern mining algorithm and it supports scalability. The

proposed approach shows how the knowledge grid system is used for distributed data

analysis.

8.2 FUTURE ENHANCEMENT

The grid reduces the message communication overhead using MPICH-G2

technology. MPICH-G2 is a grid enabled open source library for implementing

Message Passing Interface (MPI). MPICH-G2 uses TCP for inter-

machinecommunication and vendor API for intra-machine communication.The

subsequence exchange between processes is effectively achieved by using MPICH-G2.

In future the work can be extended by applying the SQL sub query to calculate the

support count. Apriori algorithm along with FP-growth (Frequent Pattern Growth) can

be implemented on grid network in each grid node, which finds the local support counts

and prunes all infrequent item sets. After completing local pruning, each grid node

broadcasts messages containing all the remaining frequent patterns to the coordinator.

31


32/54

CHAPTER 9

APPENDIX

9.1 SOURCE CODE

//KMPM ALGORITHM

import java.sql.ResultSet;

import java.sql.SQLException;

import java.util.HashSet;

import java.util.Iterator;

import javax.swing.JOptionPane;

public class kmpm {

kmpm()

{

database_conn1 db=new database_conn1();

HashSet hs1 = new HashSet();

ResultSet rs1= null,rs2= null,rs = null,rs3= null,rs4= null,rs5= null,rs6= null,rs7= null;

String cityname="",date="",subsequence="",phonenum="";

//rule generation

//String sss="delete from tblAsrulesGA";

//String ddd="delete from tblkmpm";

String qrygetcity="select distinct city from mobileprediction ";

try {

//db.st.executeUpdate(sss);

//db.stat8.executeUpdate(ddd);

rs1=db.stat.executeQuery(qrygetcity);

while(rs1.next())

{

32


33/54

cityname=rs1.getString("city");

String hhh="select distinct city,area,phone, CONVERT ( varchar ,Date_Time ,103) as

dat from mobileprediction where city='"+cityname+"' ";

rs = db.stat1.executeQuery(hhh);

while (rs.next()) {

phonenum=rs.getString("phone");

date= rs.getString("dat");

String qrydatewise="select distinct city,area, CONVERT ( varchar ,Date_Time ,103) as dat

from mobileprediction where city='"+cityname+"' and CONVERT ( varchar ,Date_Time ,

103)='"+date+"' and phone='"+phonenum+"'";

rs2=db.stat2.executeQuery(qrydatewise);

while(rs2.next())

{

subsequence=subsequence+rs2.getString("area")+",";

}

System.out.println("date--"+date);

System.out.println("cityname--"+cityname);

System.out.println("subsequence--"+subsequence);

String insertrule="insert into tblAsrulesGA

values('"+date+"','"+phonenum+"','"+cityname+"','"+subsequence+"')";

System.out.println(insertrule);

db.stat3.executeUpdate(insertrule);

subsequence="";

}

//System.out.println("rule generated");

//JOptionPane.showMessageDialog(null, "Rule generated Successfully");

}

rs.close();

rs1.close();

33


34/54

rs2.close();

}

catch (SQLException e1) {

// TODO Auto-generated catch block

e1.printStackTrace();

}

//end of rule generation

String kmpmcountquery="select count(*) as cnt from tblAsRulesGA";

String kmpmrulescount="";

try {

ResultSet rs10=db.stat6.executeQuery(kmpmcountquery);

while(rs10.next())

{

kmpmrulescount=rs10.getString("cnt");

}

JOptionPane.showMessageDialog(null, "KMPM Rules Generated--count

value"+kmpmrulescount);

} catch (SQLException e) {


e.printStackTrace();

}

//insert data Successfully

/*String qrygetphone="select distinct phone from tblAsrulesGA";

String ph="";

int occuranceofsequence=0;

double cntofphoneocurance=0;

double supportvalue=0;

String inssubsequence="";

34


35/54

String supportqry="";

database_conn1 db1=new database_conn1();

try {

rs3=db1.stat4.executeQuery(qrygetphone);

while(rs3.next())

{

ph=rs3.getString("phone");

String qrycntofphoneocurance="select count(*) as cntt from tblAsrulesGA

where phone='"+ph+"'";

rs4=db1.stat5.executeQuery(qrycntofphoneocurance);

try {

while(rs4.next())

{

cntofphoneocurance=Integer.parseInt(rs4.getString("cntt"));

String qrydistsequence="select distinct subsequence,count(subsequence) as

cnt from tblAsrulesGA where phone='"+ph+"' group by subsequence order by cnt desc";

rs5=db1.stat6.executeQuery(qrydistsequence);

try{

while(rs5.next())

inssubsequence=rs5.getString("subsequence");

occuranceofsequence=Integer.parseInt(rs5.getString("cnt"));

// supportvalue=occuranceofsequence/cntofphoneocurance;

System.out.println("supsequence---"+inssubsequence);

supportqry="insert into tblkmpm

values('"+ph+"','"+inssubsequence+"','"+occuranceofsequence+"')";

System.out.println(supportqry);

try{

db1.stat7.executeUpdate(supportqry);

}catch(Exception e2){

35


36/54


}

}

}catch (Exception e3) {

// TODO: handle exception


}

}

}

catch (Exception e4) {



}

}

rs3.close();

rs4.close();

rs5.close();

} catch (SQLException e5) {



*/

}

public static void main(String args[]) {

new kmpm();

}

}

//APRIORI ALGORITHM

36


37/54

import java.sql.ResultSet;

import java.sql.SQLException;

import java.util.StringTokenizer;

import javax.swing.JInternalFrame;

public class Apriorialgorithm {

database_conn1 db=new database_conn1();

String str="",str1="",area;

double totalcount=0;

double occofcount=0;

double supportvalue=0;

String supportqry="";

int i=0;

Apriorialgorithm()

{

String qry="select * from mobileprediction";

try {

ResultSet rs= db.stat.executeQuery(qry);

while(rs.next())

{

str=rs.getString("Date_time");

String date=str.substring(0,9);

System.out.println(date);

//totalcount=Integer.parseInt(rs.getString("cnt1"));

}




37


38/54

}

///////tis is for logic calculation

/*String qry="select phone,count(*) as cnt1 from mobileprediction group by phone order by

phone";

try {

ResultSet rs= db.stat.executeQuery(qry);

while(rs.next())

{

str=rs.getString("phone");

totalcount=Integer.parseInt(rs.getString("cnt1"));

//System.out.println("--tot--"+totalcount);

String qry1="select phone,area,count(*) as cnt from mobileprediction group by phone,area order

by phone,area";

ResultSet rs1= db.stat1.executeQuery(qry1);

while(rs1.next())

{

str1=rs1.getString("phone");

area=rs1.getString("area");

occofcount=Integer.parseInt(rs1.getString("cnt"));

if(str.equalsIgnoreCase(str1))

{

supportvalue=occofcount/totalcount;

supportqry="insert into tblsupportval

values('"+str1+"','"+area+"','"+supportvalue+"')";

db.stat4.executeUpdate(supportqry);

i++;

}

}

38


39/54

}




}

System.out.println("iiiii--"+i);*/

//////////end of logic

}

public static void main(String args[])

{

new Apriorialgorithm();

}

//RULE COMPARISON

import java.awt.*;

import javax.swing.*;

import java.sql.*;

import org.jfree.chart.ChartPanel;

import org.jfree.chart.ChartFactory;

import org.jfree.chart.JFreeChart;

import javax.swing.JTable.*;

import org.jfree.data.jdbc.JDBCCategoryDataset;

import org.jfree.chart.plot.*;

public class RuleComparison {

int fprules;

database_conn1 db;

public RuleComparison() {

db=new database_conn1();

39


40/54

try{

ResultSet rsd=db.stat5.executeQuery("select count(*)as cnt from tblAsrulesGA");

while(rsd.next())

fprules=Integer.parseInt(rsd.getString("cnt"));

}catch (Exception e1) {


}

try {

db.stat1.executeUpdate("truncate table tblCompare");

String fp="Kmpm Algorithm";

db.stat2.executeUpdate("insert into tblCompare values('"+fp+"',"+fprules+")");

ResultSet rs=db.stat3.executeQuery("select count(*) as cnt1 from tblsupportval");

rs.next();

int garules=Integer.parseInt(rs.getString("cnt1"));

String ga="Apriori Algorithm";

rs.close();

db.stat4.executeUpdate("insert into tblCompare values('"+ga+"',"+garules+")");

} catch(Exception e) {

JOptionPane.showMessageDialog(null, e.getMessage());

// chart start

try {

Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");

Connection cn = DriverManager.getConnection("jdbc:odbc:driver={SQL Server};

server=(local);database=Mobile_Processing");

JDBCCategoryDataset dataset = new JDBCCategoryDataset(cn);

//String qry = "SELECT data1,data2 from tblrulechart";

String qry = "SELECT * from tblCompare";

dataset.executeQuery(qry);

40


41/54

JFreeChart chart = ChartFactory.createBarChart("Rules Comparision", "Number Of UAPS ",

"Number Of RULES", dataset, PlotOrientation.VERTICAL, false, true, false);

//JFreeChart chart = ChartFactory.createPieChart("Strong Rules",dataset,true,true,true);

CategoryPlot p = chart.getCategoryPlot();

p.setRangeGridlinePaint(Color.green);

ChartPanel chartPanel = new ChartPanel(chart);

chartPanel.setPreferredSize(new java.awt.Dimension(650, 400));

//ApplicationFrame f = new ApplicationFrame("Chart");

JFrame f = new JFrame("Chart");

f.setContentPane(chartPanel);

f.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);

f.pack();

f.setVisible(true);

try {

//ChartUtilities.saveChartAsJPEG(new File("F:/chart.jpg"), chart,400,300);

} catch (Exception e) {

System.out.println("Problem in creating chart.");

cn.close();

} catch (Exception ee) {

System.out.println("Problem in Connection" + ee.getStackTrace() + ee.getMessage()); }

public static void main(String args[])

{

new RuleComparison();

}

// chart

}

41


42/54

9.2 SCREEN SHOTS

Fig.9.1 Gridlet View

\

42


43/54

Fig.9.2 Login

Fig.9.3 Admin username and password

43


44/54

Fig.9.4 Logged Admin

44


45/54

Fig.9.5 New Registration

Fig.9.6 Mobile Location of all users

45


46/54

Fig.9.7 Mobile Users Location view

46


47/54

Fig.9.8 User Actual Path

Fig.9.9 Enter the Phone number

47


48/54

Fig.9.10 User Actual Path View

Fig.9.11 Applying KMPM Algorithm

48


49/54

Fig.9.12 No of rules generated by KMPM Algorithm

Fig.9.13 Applying Apriori Algorithm

49


50/54

Fig.9.14 Rules Generated

50


51/54

Fig.9.15 No of Rules generated by Apriori Algorithm

Fig.9.16 Predicting Next Location

51


52/54

Fig.9.17 Comparison Based on No.of Rules

Fig.9.18 Comparison based on Supportive Value

52


53/54

CHAPTER 10

REFERENCES

1. Agrawal, Imielinski and Swami, Mining Association Rules between Sets of

Items in Large Databases, Proceedings of ACM SIGMOD International

Conference on the Management of Data, 1993, pp. 207-216.

2. Assaf Schuster, Ran Wolff, and Dan Trock, A High- Performance Distributed

Algorithm for Mining Association Rules, Knowledge and Information Systems

(KAIS) Journal

3. Bhuvaneshwaran and Shakthi,Incremental and SQL based Data Grid Mining

Algorithm for Mobility Prediction of Mobile Users,International Conference on

Computational Science and Its Applications,2009

4. Cannataro and Talia,The Knowledge Grid, Communications of the ACM,

2003, pp. 89-93.

5. Cristian Aflori and Mitica Craus, Grid Implementation of Apriori Algorithm,

Advances in Engineering Software, pp. 295-300.

6. Foster and C. Kesselman,The Anatomy of the Grid: Enabling Scalable Virtual

Organizations, International Journal of Super Computing Applications, 2001.

7. Gokhan Yavas, Dimitrios Katsaros, Ozgur Ulssoy, and Yannis Manolopoulos,A Data Mining Approach for Location Prediction in Mobile Web

Environments, IEEE Transaction on Data and Knowledge Engineering, Sce.

2005, pp.121-146.

8. Ian Foster, Globus Toolkit Version 4: Software for Service-Oriented System, Journalof Computer Science and Technology, Jul. 2006, pp. 513-520.

53


54/54

9. Luo, Anil L. Pereira, and Soon M. Chung, Distributed Mining of Maximal

Frequent Item sets on a Data Grid System, The Journal of Super Computing,

2006, pp.71-90.

10.10. Mario Cannataro, Antonio Congiusta, Andrea Pugliese, Talia, and PaoloTrunfio, Distributed Data Mining on Grids: Services, Tools and

Applications, IEEE Transaction on Systems Man and Cybernetics, Dec. 2004,

pp.2451-2465.

11. Oracle, Java Stored Procedures Developers Guide,

http://otn.oracle.com/doc/oracle8i_816/java/a81353/toc.htm

12.Sedong Kwon, Hyunmin Park, and Kangsun Lee, A Novel Mobility Prediction

Algorithm Based on User Movement History in Wireless Networks AsiaSim

2004,LNAI 3398

, 2005, pp. 419428.

13.Shafer, Parallel Mining of Association Rules, IEEE Transaction on

Knowledge and Data Engineering, 1996, pp. 962-969.

14.Thuraisingham, A Primer for Understanding and Applying Data Mining, IEEE

Educational Activities Department, 2000, pp. 28-31.

15.Tong Liu, Paramvir Bahl, and Imrich Chlamtac, Mobility Modeling, Location

Tracking, and Trajectory Prediction in Wireless ATM Networks,IEEE Journal

onSelected Areas in Communications, Aug. 1998, pp. 922-936

16.Wu-Shan Jiang and Ji-Hui Yu, Distributed Data Mining on the Grid,

Proceedings of International Conference on Machine Learning and Cybernetics,

Aug. 2005, pp. 2010-2014.

Documents

Location Prediction of Mobile Users in Knowledge Grid