

Hindawi Publishing Corporation
Advances in Software Engineering
Volume 2013, Article ID 351913, 10 pages
http://dx.doi.org/10.1155/2013/351913

Research Article
Tuning of Cost Drivers by Significance Occurrences and Their Calibration with Novel Software Effort Estimation Method

Brajesh Kumar Singh, Shailesh Tiwari, K. K. Mishra, and A. K. Misra

CSED, MNNIT, Allahabad 211004, India

Correspondence should be addressed to Shailesh Tiwari shailtiwariyahoocom

Received 5 June 2013; Revised 27 August 2013; Accepted 8 November 2013

Academic Editor: Henry Muccini

Copyright © 2013 Brajesh Kumar Singh et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Estimation is an important part of software engineering projects, and the ability to produce accurate effort estimates has an impact on key economic processes, including budgeting, bid proposals, and deciding the execution boundaries of the project. The work in this paper explores the interrelationship among different dimensions of software projects, namely, project size, effort, and effort-influencing factors. The study aims at providing better effort estimates on the parameters of modified COCOMO, along with the detailed use of a binary genetic algorithm as a novel optimization algorithm. The significance of the 15 cost drivers is shown by their impact on the MMRE of efforts on the original NASA dataset of 63 projects. The proposed method produces tuned values of the cost drivers, which are effective enough to improve the productivity of the projects. Prediction at different levels of MRE for each project reflects the percentage of projects with the desired accuracy. Furthermore, the model is validated on two different datasets, the COCOMO 81 based NASA 63 and NASA 93 datasets, on which it shows better estimation accuracy than COCOMO.

1. Introduction

Estimation is an important part of software engineering projects, and the ability to produce accurate effort estimates has an impact on key economic processes, including budgeting, bid proposals, and deciding the execution boundaries of the project [1]. Effort estimation is a critical activity for planning and monitoring software project development and for delivering the product on time and within budget. The feasibility of the project, in terms of cost and the ability to meet the customer's requirements, is also considered in the process of estimation [2]. The effort to be consumed in a software project is probably the most sought-after variable in project management. Determining the value of this variable in the early stages of a software project drives the planning of the remaining activities. The estimation activity is plagued with uncertainties and obstacles, and the measurement of past projects is a necessary step toward solving the question. The problem of accurate effort estimation is still open, and the project manager is confronted at the beginning of the project with the same quagmires as a few years ago [3]. The software industry's inability to provide accurate estimates of development cost, effort, and/or time is well known [4].

Over the past few years, software development effort has been found to be one of the worst estimated attributes. Significant over- or underestimates can be very expensive for a company, and the competitiveness of a software company heavily depends on the ability of its project managers to accurately predict in advance the effort required to develop software systems [5]. Efforts need to be estimated reliably in order to complete projects on time and within budget, yet less than one-quarter of projects are estimated accurately.

Many model structures have evolved in the literature, and these structures model the relationship between software effort, developed lines of code (DLOC), and influencing factors:

$$\text{Effort} = f(\text{DLOC}, \text{influencing factors}). \qquad (1)$$

Building such a relationship as a function helps project managers to accurately allocate the available resources for the project [6]. Among others, the Constructive Cost Model (COCOMO) is a widely known effort estimation model in which developed lines of code (DLOC) are the primary element affecting the effort estimate. The DLOC include all program instructions and formal statements [6, 7]. The aim here is to provide a basis for software effort estimation through a systematic review of previous research papers [8]. Some research studies have demonstrated that the level of accuracy in software effort estimates is strongly influenced by the selection of the input values of the parameters of these methods. Combining input feature selection with parameter optimization of machine learning methods improves the accuracy of software development effort estimates [9].

Recently, the use of search-based methods has been suggested to address the software development effort estimation problem [10, 11]. Such a problem can be formulated as an optimization problem in which we have to identify the estimation model that provides the best prediction [5]. This study aims at providing better effort estimates on the parameters of modified COCOMO, along with the detailed use of a binary genetic algorithm as a novel optimization algorithm. The estimation accuracy of the developed model was tested on the NASA datasets of 93 and 63 projects and compared to that of the preexisting COCOMO. The developed model is able to provide better estimation capabilities.

The paper is organized in 7 sections. Section 2 illustrates the problem and the techniques that form part of it. Section 3 depicts the solution approach, and Section 4 describes the proposed algorithm for solving the problem; four submodels are introduced in that section. Evaluation criteria and data analysis are discussed in Section 5. The results obtained with the proposed method are analyzed in Section 6. Finally, the paper is concluded in Section 7.

2. Problem Illustration

2.1. Problem Statement. Software development effort estimates are likely to be highly inaccurate and systematically overoptimistic due to the valence effect of prediction, anchoring, the planning fallacy, and cognitive effects. Empirical evidence suggests that the problem is caused, to some extent, by the influence of irrelevant and misleading information present in the estimation material, for example, information regarding the client's budget [12]. Previous research has shown that the average effort overrun in software development projects is about 30–40% [1, 4]. Estimating techniques have emerged continually, and attempts have been made to compare these techniques and derive the best practices [1, 4, 13].

Empirical software estimation models are mainly based on cost drivers and scale factors. These models suffer from instability due to the values of the cost drivers and scale factors, which affects their sensitivity in terms of accurate effort estimation. Also, most of the models depend on the size of the project, and a small change in the size leads to a proportionate change in the effort. Miscalculation of the cost drivers likewise results in even noisier data. For example, a misjudgment of the personnel capability cost driver in COCOMO between “very high” and “very low” will result in a 300% increase in effort. Similarly, in SEER-SEM, changing the security requirements value from “low” to “high” will result in a 400% increase in effort. In PRICE-S, a 20% change in effort will occur due to a small change in the value of the productivity factor [14]. These statements reveal that all models have one or more inputs for which small changes result in large changes in effort. The input data problem is further compounded by the fact that some inputs are difficult to obtain, especially in the early stages of program development. The size must be estimated early in a project using one or more sizing models. Some sensitive inputs, such as the analyst and programmer capability cost drivers, depend on individuals and are often difficult to determine. Many studies, like the one performed by [15], show that personnel parameter data are difficult to collect.

2.2. Algorithmic Models. Many software estimation models have been proposed by various researchers and can be categorized according to their basic formulation schemes: analogy-based estimation schemes [16–19], expert-judgment estimation [20], and algorithmic models, including empirical methods [21], rule induction methods [22], Bayesian network approaches [23], decision tree based methods [24], artificial neural network based approaches [25, 26], and fuzzy logic based estimation schemes [27].

Among these diverse models, COCOMO, SLIM, SEER-SEM, and FP analysis are well-known algorithmic models that are popular in practice in the empirical category [28], while COCOMO and Function Points allow us to estimate the size (in KLOC) of the software ourselves. Albrecht observed in his research that Function Points were highly correlated with lines of code, so in effect they are complementary [29]. Function Points calculate the logical source lines of code, whereas COCOMO is based on physical source lines of code. These empirical models require as inputs accurate estimates of specific attributes, such as source lines of code (SLOC), multiplicative factors, interfaces, complexity, and number of user screens, which are not always easy to acquire during the early stage of software project development. Models based on historical data have limitations. Understanding and calculating these models is difficult because the inherently complex relationships between the attributes used to predict software development effort could change over time and/or differ between software development environments [26]. They are unable to handle categorical data and lack reasoning capabilities [30].

The limitations of algorithmic models have led to the exploration of nonalgorithmic models, which are based on soft computing [30].

2.3. Constructive Cost Model. The original Constructive Cost Model, abbreviated as COCOMO, was first published by Dr. Barry Boehm in 1981. The word “constructive” implies that the complexity of the model can be understood easily, because the openness of the model shows exactly why it gives the estimates it does. Since the inception of software development techniques, many efforts have been made to improve estimation. COCOMO is the best documented and most transparent model, and it reflects the software development practices of those days. The main focus in COCOMO is on estimating the influence of 15 cost drivers on the development effort. The model does not support project management in estimating the size of the software. COCOMO was derived from a database of 63 projects executed between 1964 and 1979 by the American company TRW Systems Inc. The projects considered during this era differed strongly in application type, size, and programming language [31].

Boehm introduced three levels of the estimation model: basic, intermediate, and detailed.

(i) The basic COCOMO 81 is a single-valued, static model which provides an approximate estimate of software development effort and cost as a function of program size expressed in thousands of delivered source instructions (KDSI).

(ii) The intermediate COCOMO 81 describes software development effort as a function of program size in LOC and a set of fifteen effort multipliers known as “cost drivers.” These cost drivers incorporate subjective assessments of product, project, personnel, and hardware attributes.

(iii) The advanced or detailed COCOMO 81 reduces the margin of error in the final estimate by incorporating all characteristics of the intermediate version together with an assessment of each cost driver's impact on each step (that is, analysis and design) of the software engineering process.

COCOMO assumes that the effort grows more than linearly with software size. For a few multipliers, the rating must be increased to decrease the effort; for a few others, the rating must be decreased to decrease the effort. That is, person-months $= a \times (\text{KSLOC})^{b} \times c$. Here, $a$ and $b$ are domain-specific parameters, KSLOC denotes kilo source lines of code, which is estimated directly or computed from a function point analysis, and $c$ is the product of fifteen effort multipliers $EM_i$, with $i = 1$ to $15$.

So the equation can be written as

$$\text{Person-months} = a \times (\text{KSLOC})^{b} \times (EM_1 \times EM_2 \times \cdots \times EM_{15}); \text{ see [7, 25]}. \qquad (2)$$
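As a hedged illustration, equation (2) can be evaluated with a few lines of Python. The text only says that a and b are domain-specific; the (a, b) pairs below are the commonly cited intermediate COCOMO 81 coefficients per project mode and are an assumption here, as are the function name `cocomo_effort` and the sample project.

```python
from math import prod

# Assumption: commonly cited intermediate COCOMO 81 coefficients (a, b)
# per project mode; the paper itself only calls them domain-specific.
COEFFICIENTS = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def cocomo_effort(ksloc, mode, effort_multipliers):
    """Return the estimated effort in person-months, following equation (2)."""
    a, b = COEFFICIENTS[mode]
    eaf = prod(effort_multipliers)  # EM_1 * EM_2 * ... * EM_15
    return a * ksloc ** b * eaf


# Hypothetical project: 10 KSLOC, organic mode, 13 nominal drivers (1.0)
# plus two non-nominal multipliers.
multipliers = [1.0] * 13 + [1.15, 0.86]
effort = cocomo_effort(10.0, "organic", multipliers)
print(f"Estimated effort: {effort:.1f} person-months")
```

Because the effort multipliers enter only through their product, a driver rated nominal (1.0) leaves the estimate unchanged, which is what the later submodels exploit when they neutralize one driver at a time.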

3. Solution to the Problem

3.1. Nonalgorithmic Models. In contrast to the algorithmic models, the nonalgorithmic models proposed since the 1990s are based on computational intelligence, analytical comparisons, and inferences for project cost estimation. They have the capability to model the complex set of relationships between the dependent variables (cost, effort) and the independent variables (cost drivers) collected early in the project lifecycle, and to learn from historical project data. Using the nonalgorithmic models requires information about previous projects that are similar to the project being estimated. Usually, in these methods, the estimation process is carried out by analyzing the historical datasets. Many software researchers have shown interest in new nonalgorithmic approaches based on soft computing, that is, artificial neural networks, fuzzy logic, and evolutionary algorithms. These methods are used for assessment because of their popularity, and a large number of papers about their usage have been published in recent years [26, 32–34]. Choosing a suitable technique is a difficult decision and requires the support of a well-defined evaluation scheme to rank each evolutionary computation technique as and when it is applied to an optimization problem. In the present study, an effective model based on evolutionary computation is proposed to overcome the problem of uncertainty and to obtain better results.

3.2. Genetic Algorithms. Evolutionary computational methods are widely used in software engineering tasks such as test case generation [35, 36], effort estimation, cost estimation, and many more. Genetic algorithms are a simple and almost generic evolutionary computational method, inspired by Darwin's theory of natural evolution, for solving complex optimization problems. A genetic algorithm requires a careful and suitable choice of parent selection method, mutation method, population size, and so forth to find good solutions. If improper parameters and methods are chosen, the result may be longer program runs or even poor optimization results [37]. In nature, competition among individuals for scarce resources results in the fittest individuals dominating the weakest ones [24, 38].

3.2.1. Working Principle

(i) A genetic algorithm starts with a randomly generated initial population, that is, a set of solutions represented by chromosomes.

(ii) The algorithm then generates a sequence of new populations. At each iteration, it uses the individuals of the current generation to create the next generation. To create the new population, the algorithm performs the following steps:

    (a) Score each individual member of the current population by computing its fitness value.
    (b) Scale the raw fitness scores to convert them into a more desirable range of values.
    (c) Select the good individuals, called parents, based on the value of the fitness function.
    (d) Select a few individuals in the current population that have lower fitness values as elite. These individuals are passed directly to the next generation (elitism).
    (e) Produce offspring from the parents. Offspring are produced either by mutating a single parent or by combining the chromosomes of a pair of parents with the crossover operator.
    (f) Update the current population with the offspring to form the new generation.

(iii) The algorithm terminates when any one of the stopping criteria is reached, that is, the number of generations or the desired fitness value.
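The working principle above can be sketched as a small binary genetic algorithm in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: fitness is minimized (as with an error measure), fitness scaling (step (b)) is omitted, mutation flips one random bit per selected child, and all names (`decode`, `tournament`, `binary_ga`) and default parameters other than the crossover and mutation probabilities of Algorithm 1 (0.8 and 0.3) are hypothetical.

```python
import random


def decode(bits, lo, hi):
    """Map a list of 0/1 genes to a real value in [lo, hi]."""
    as_int = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)


def tournament(population, scores, rng, k=2):
    """Pick k random individuals and return the one with the lowest score."""
    picks = rng.sample(range(len(population)), k)
    return population[min(picks, key=lambda i: scores[i])]


def binary_ga(fitness, lo, hi, n_bits=16, pop_size=20, generations=50,
              p_cross=0.8, p_mut=0.3, n_elite=2, seed=0):
    """Minimise fitness(x) for x in [lo, hi] with a simple binary GA."""
    rng = random.Random(seed)
    # (i) random initial population of bit-string chromosomes
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # (iii) stop after a fixed number of generations
        # (a) score every individual
        scores = [fitness(decode(ind, lo, hi)) for ind in population]
        ranked = [ind for _, ind in
                  sorted(zip(scores, population), key=lambda pair: pair[0])]
        # (d) elitism: the best individuals survive unchanged
        next_gen = [ind[:] for ind in ranked[:n_elite]]
        while len(next_gen) < pop_size:
            # (c) parent selection by tournament
            p1 = tournament(population, scores, rng)
            p2 = tournament(population, scores, rng)
            child = p1[:]
            if rng.random() < p_cross:  # (e) single-point crossover
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            if rng.random() < p_mut:  # (e) mutation: flip one random bit
                pos = rng.randrange(n_bits)
                child[pos] = 1 - child[pos]
            next_gen.append(child)
        population = next_gen  # (f) replace the current population
    best = min(population, key=lambda ind: fitness(decode(ind, lo, hi)))
    return decode(best, lo, hi)


# Toy usage: search [0.5, 1.5] for the value closest to 1.0.
best_x = binary_ga(lambda x: (x - 1.0) ** 2, lo=0.5, hi=1.5)
print(best_x)
```

Fixing the seed makes runs repeatable, which is convenient when comparing parameter settings; in the tuning procedure described later, the toy objective would be replaced by an MMRE computed over the selected projects.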


Sub-Model 1
Step 1: Generate the MMRE (M) for the available N projects using actual and COCOMO-estimated efforts.
(i) [BEGIN]
(ii) Input the 15 cost drivers, KLOC, and Actual Effort for the NASA projects.
(iii) [LOOP]
    for j = 1 to no. of projects (say n)
        EAF[j] = D1 * D2 * ... * D15
        Estimated Effort[j] = a[j] * (kloc[j] ^ b[j]) * EAF[j]
        MRE[j] = |Actual Effort[j] − Estimated Effort[j]| / Actual Effort[j]
        MMRE(original) += MRE[j]
(iv) [END OF LOOP]
    MMRE(original) /= n    [The original MMRE is obtained and noted down]
(v) [END]

Sub-Model 2
Step 2: for I = 1 to 15
    temp = EM_I
    Set EM_I = 1
    Calculate Influenced MMRE (MN)
    List[I, 1] = I
    List[I, 2] = |MN − M|
    EM_I = temp
end for

Sub-Model 3
Step 3: Sort the list according to the second parameter in descending order.
for I = 1 to 14
    for j = I + 1 to 15
        if (List[I, 2] < List[j, 2]) then
            swap(List[I], List[j])
        end if
    end for
end for
Step 4: Sig = List[I, 1] represents the order of significance occurrences.

Sub-Model 4
Step 5: for I = 1 to 15
    for j = very low to extra high (six ratings of a cost driver)
        Select projects (P) as input for calculating the fitness value using fitness function F1 = MMRE(P)
        Set the range R as [Rmin, Rmax]
        Generate the initial population for the cost driver within range R
        Perform the genetic operations for K generations:
            (1) Tournament selection
            (2) Crossover with Pc = 0.8
            (3) Mutation with Pm = 0.3
        Select the individual (CDNEW) with the best MMRE
Step 6: Calculate the MMRE (Mmod) obtained by replacing CD_ij with CDNEW
        if (Mmod < M) then
            update the value of CD_ij and M
        else
            discard the value
        end if
    end for
end for

Algorithm 1

4. Proposed Algorithm for Solving the Problem

Algorithm Description (see Algorithm 1). The proposed algorithm is divided into 4 submodels. In submodel 1, we calculate the mean magnitude of relative error (MMRE) of all projects according to the results obtained by COCOMO. Here, we first calculate the estimated efforts by considering the 15 COCOMO cost drivers, project modes, and kilo lines of code (KLOC). The estimated efforts, along with the actual efforts reported for the various projects, are used as the input parameters to calculate the magnitude of relative error (MRE) of each project. The MMRE for the COCOMO results is recorded as the original MMRE.

Table 1: COCOMO cost drivers.

| Cost drivers | Very low | Low | Nominal | High | Very high | Extra high |
|---|---|---|---|---|---|---|
| acap | 1.46 | 1.19 | 1 | 0.86 | 0.71 | |
| pcap | 1.42 | 1.17 | 1 | 0.86 | 0.7 | |
| aexp | 1.29 | 1.13 | 1 | 0.91 | 0.82 | |
| modp | 1.24 | 1.1 | 1 | 0.91 | 0.82 | |
| tool | 1.24 | 1.1 | 1 | 0.91 | 0.83 | |
| vexp | 1.21 | 1.1 | 1 | 0.9 | | |
| lexp | 1.14 | 1.07 | 1 | 0.95 | | |
| sced | 1.23 | 1.08 | 1 | 1.04 | 1.1 | |
| stor | | | 1 | 1.06 | 1.21 | 1.56 |
| data | | 0.94 | 1 | 1.08 | 1.16 | |
| time | | | 1 | 1.11 | 1.3 | 1.66 |
| turn | | 0.87 | 1 | 1.07 | 1.15 | |
| virt | | 0.87 | 1 | 1.15 | 1.3 | |
| cplx | 0.7 | 0.85 | 1 | 1.15 | 1.3 | 1.65 |
| rely | 0.75 | 0.88 | 1 | 1.15 | 1.4 | |

In submodel 2, the influenced MMRE is calculated on the basis of the occurrences of the 15 cost drivers. This influenced MMRE shows the effect of each cost driver on the resulting effort in terms of person-months. In this process, we take sample data having 18 input parameters, that is, 15 cost drivers, mode, source lines, and actual effort. The estimated efforts are calculated for the sample data by nullifying the effect of the cost drivers one at a time (setting the driver's multiplier to 1). These efforts are used to calculate the influenced MMRE for each cost driver against the actual effort provided in the sample data. The difference between the influenced MMRE and the original MMRE is recorded in a list along with the driver.

In submodel 3, the list of differences between the influenced MMRE and the original MMRE is sorted in descending order of the difference to give the significance occurrence of each driver. This order is named Sig.
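A minimal Python sketch of submodels 1 to 3 follows, as an illustration rather than the authors' code. The dictionary layout of a project, the function names, and the two-driver toy data are hypothetical; a real run would use all 15 drivers and the NASA project records.

```python
from math import prod


def mmre(actual, estimated):
    """Mean magnitude of relative error over paired effort values."""
    return sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)


def estimate(project, skip=None):
    """COCOMO effort for one project; the driver named `skip` is set to 1.0."""
    eaf = prod(1.0 if name == skip else value
               for name, value in project["drivers"].items())
    return project["a"] * project["kloc"] ** project["b"] * eaf


def rank_drivers(projects):
    """Order drivers by |influenced MMRE - original MMRE|, largest first."""
    actual = [p["actual"] for p in projects]
    original = mmre(actual, [estimate(p) for p in projects])  # submodel 1
    diffs = {}
    for name in projects[0]["drivers"]:  # submodel 2: neutralise one driver at a time
        influenced = mmre(actual, [estimate(p, skip=name) for p in projects])
        diffs[name] = abs(influenced - original)
    return sorted(diffs, key=diffs.get, reverse=True)  # submodel 3


# Hypothetical toy data with two drivers instead of fifteen.
projects = [
    {"a": 3.2, "b": 1.05, "kloc": 10.0, "actual": 40.0,
     "drivers": {"acap": 1.46, "lexp": 1.07}},
    {"a": 3.2, "b": 1.05, "kloc": 20.0, "actual": 90.0,
     "drivers": {"acap": 0.86, "lexp": 0.95}},
]
sig = rank_drivers(projects)
print(sig)
```

Using the built-in sort instead of the explicit swap loop of Algorithm 1 gives the same descending order; ties keep the drivers' original order.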

In submodel 4, we try to minimize the MMRE by updating the values of the cost drivers with the help of the genetic algorithm, in the order of their significance. This is done by selecting the projects that fall in the category of a particular cost driver rating and then applying the genetic algorithm operators. The results obtained are evaluated using the MMRE as the fitness function. If the MMRE is reduced, the cost driver value for that particular rating is updated; otherwise, it is discarded. The reduced MMRE is recorded as Mmod, which is then used as the reference MMRE for the remaining cost drivers.
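The accept-or-discard logic of submodel 4 can be sketched as below. To keep the example short and self-contained, a plain random search stands in for the genetic algorithm, and candidates are scored on all projects rather than only on the selected subset; both are simplifications, and the names `tune`, `spread`, and `candidates`, as well as the toy data, are hypothetical.

```python
import random
from math import prod


def mmre(projects, table):
    """MMRE of COCOMO estimates with multipliers looked up in `table`."""
    total = 0.0
    for p in projects:
        eaf = prod(table[d][r] for d, r in p["ratings"].items())
        est = p["a"] * p["kloc"] ** p["b"] * eaf
        total += abs(p["actual"] - est) / p["actual"]
    return total / len(projects)


def tune(projects, table, order, candidates=200, spread=0.3, seed=0):
    """Tune multipliers in significance order; keep a change only if MMRE drops."""
    rng = random.Random(seed)
    table = {d: dict(r) for d, r in table.items()}  # work on a copy
    best = mmre(projects, table)
    for driver in order:
        for rating, old in list(table[driver].items()):
            # Only projects that carry this rating can inform its value.
            if not any(p["ratings"].get(driver) == rating for p in projects):
                continue
            for _ in range(candidates):  # random search stands in for the GA
                table[driver][rating] = rng.uniform(old - spread, old + spread)
                score = mmre(projects, table)
                if score < best:  # Mmod < M: accept and update M
                    best, old = score, table[driver][rating]
            table[driver][rating] = old  # last accepted value, else the original
    return table, best


# Hypothetical toy data: two drivers, two ratings each.
table = {"acap": {"low": 1.19, "high": 0.86},
         "lexp": {"low": 1.07, "high": 0.95}}
projects = [
    {"a": 3.2, "b": 1.05, "kloc": 10.0, "actual": 40.0,
     "ratings": {"acap": "low", "lexp": "low"}},
    {"a": 3.2, "b": 1.05, "kloc": 20.0, "actual": 90.0,
     "ratings": {"acap": "high", "lexp": "high"}},
]
tuned, tuned_mmre = tune(projects, table, order=["acap", "lexp"])
print(tuned_mmre)
```

Because a candidate is kept only when it lowers the running MMRE, the returned error can never exceed the starting one, mirroring the `if (Mmod < M)` test in Step 6.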

5. Evaluation Method

5.1. Conceptual View. Software cost estimation models need to be evaluated quantitatively in terms of estimation accuracy in order to improve the modeling process. Some rules or measurements must be provided for model assessment. This measurement of accuracy defines how close the estimated result is to its actual value. Software cost estimates play a significant role in delivering software projects. As a result, researchers have proposed the mean magnitude of relative error (MMRE), now the most widely used evaluation criterion, to assess the performance of software prediction models. MMRE is usually computed by following standard evaluation processes such as cross-validation [39]. It is independent of size scale and effort units, so comparisons can be made across datasets and prediction model types [40].

COCOMO computes effort on the basis of source lines of code. In intermediate COCOMO, Boehm used 15 additional predictor variables, called cost drivers, which are required to calibrate the nominal effort of a project to the actual project environment. A value is assigned to each cost driver according to the properties of the specific software project. The numerical values of the 15 cost drivers are multiplied to obtain the effort adjustment factor (EAF).

5.2. Data Analysis. The performance of estimation methods is usually evaluated by several ratio measures of accuracy, including RE (relative error), MRE (magnitude of relative error), and MMRE (mean magnitude of relative error), which are computed as follows:

$$\text{RE} = \frac{\text{Estimated effort} - \text{Actual effort}}{\text{Actual effort}},$$

$$\text{MRE} = \frac{|\text{Estimated effort} - \text{Actual effort}|}{\text{Actual effort}},$$

$$\text{MMRE} = \frac{\sum \text{MRE}}{N}. \qquad (3)$$

Another parameter used to evaluate the performance of an estimation method is PRED (percentage of prediction), which is determined as

$$\text{PRED}(X) = \frac{A}{N}, \qquad (4)$$

where $A$ is the number of projects with MRE less than or equal to level $X$ and $N$ is the number of projects considered. Usually, the acceptable level of $X$ is 0.25, and the various methods are compared based on this level. Decreasing MMRE and increasing PRED are the main aims of all estimation techniques.

Table 2: Significant occurrences of cost drivers.

| Rank | Cost driver |
|---|---|
| 1 | acap |
| 2 | pcap |
| 3 | aexp |
| 4 | rely |
| 5 | virt |
| 6 | vexp |
| 7 | time |
| 8 | modp |
| 9 | cplx |
| 10 | data |
| 11 | tool |
| 12 | sced |
| 13 | lexp |
| 14 | turn |
| 15 | stor |
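For concreteness, the accuracy measures of equations (3) and (4) can be written as short Python functions. The function names and the four sample effort values are made up for illustration and do not come from the NASA data.

```python
def mre(actual, estimated):
    """Magnitude of relative error for one project."""
    return abs(estimated - actual) / actual


def mmre(actuals, estimates):
    """Mean MRE over all projects, as in equation (3)."""
    return sum(mre(a, e) for a, e in zip(actuals, estimates)) / len(actuals)


def pred(actuals, estimates, level=0.25):
    """Fraction of projects whose MRE is at most `level`, as in equation (4)."""
    hits = sum(1 for a, e in zip(actuals, estimates) if mre(a, e) <= level)
    return hits / len(actuals)


# Hypothetical efforts in person-months.
actuals = [100.0, 200.0, 50.0, 80.0]
estimates = [110.0, 120.0, 55.0, 64.0]
print(mmre(actuals, estimates), pred(actuals, estimates))
```

In this sample, three of the four projects have an MRE of at most 0.25, so PRED(0.25) is 3/4; the one poorly estimated project still pulls the MMRE up, which is why the two measures are usually reported together.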

6. Results and Discussion

6.1. Dataset Description. Experiments were carried out on the COCOMO 81 based dataset of 63 projects used by NASA, on which various calculations were performed. A further 93 NASA projects from different centers, covering the years 1971 to 1987, were collected by Jairus Hihn, JPL, NASA, Manager of the SQIP Measurement and Benchmarking Element. The proposed model is validated on these datasets, which are among the most analyzed datasets. The independent variable used is “adjusted delivered source instructions,” which takes into account the variation of effort when adapting software. COCOMO is built upon these data points by introducing many factors in the form of multipliers.

Together, these datasets include 156 historical projects with 17 effort drivers and one dependent variable, the software development effort.

6.2. Result Analysis. Cost drivers play a vital role in estimating the effort and cost to be incurred. They represent characteristics of software development that influence the effort required to carry out a certain project. Cost drivers are selected on the argument that they have a linear effect on effort. The COCOMO cost drivers are the basis for the analysis of the proposed algorithm. Table 1 depicts the COCOMO effort multipliers.

The significance of the 15 cost drivers is shown by their impact on the MMRE of efforts on the original NASA dataset of 63 projects. The significance occurrences of the 15 cost drivers are calculated by applying steps 1 to 4 and are shown in Table 2.

Figure 1: Relationship between MMRE and cost drivers (x-axis: cost drivers 1 to 15; y-axis: MMRE, 0 to 0.5; series: MMRE (COCOMO) and significant occurrences).

Figure 2: Comparison of MRE for NASA 63 projects (x-axis: software projects 1 to 63; y-axis: MRE, 0 to 2; series: MRE COCOMO and MRE with tuned parameters).

The occurrence of each cost driver has a linear relationship with the MMRE calculated between the actual efforts and the efforts estimated with COCOMO. In Figure 1, each cost driver is plotted against the original MMRE, which is constant for all cases. The effect of each cost driver on the MMRE is the key aspect in deriving the occurrence of the cost drivers. The proportionate relationship can be seen in Figure 1, where the cost drivers with a higher influenced MMRE relative to the constant MMRE are the most significant and those with lower values are less significant.

Once the significant occurrences of the cost drivers are found, the sequence of cost drivers is used to produce tuned values for the different ratings of the various cost drivers. Steps 5 and 6 are used to generate the new values of the available cost drivers. Table 3 shows the tuned values of the preexisting cost drivers.

The proposed algorithm is validated with two different datasets of NASA projects. According to the evaluation criteria, the efforts produced by the proposed method differ only marginally from the actual project efforts, in comparison with the COCOMO-generated efforts, as shown in Figure 2. Most of the results stay near the mean MRE for the 63 data values. The other dataset of 93 projects was also used to evaluate the proposed method (Figures 3 and 4).

Table 3: Proposed algorithm based cost drivers.

| Cost drivers | Very low | Low | Nominal | High | Very high | Extra high |
|---|---|---|---|---|---|---|
| acap | 1.46 | 1.19 | 0.9 | 0.86 | 0.71 | |
| pcap | 1.42 | 0.9 | 1 | 0.86 | 0.7 | |
| aexp | 1.29 | 1.4 | 1 | 0.91 | 0.82 | |
| modp | 1.38 | 0.92 | 1 | 0.91 | 0.82 | |
| tool | 1.24 | 1.1 | 0.99 | 0.93 | 0.83 | |
| vexp | 1.38 | 1.03 | 1 | 0.9 | | |
| lexp | 1.14 | 1.08 | 0.9 | 0.95 | | |
| sced | 1.23 | 1.08 | 0.99 | 1.04 | 1.1 | |
| stor | | | 1 | 1.06 | 1.19 | 1.38 |
| data | | 1.03 | 0.9 | 1.06 | 1.38 | |
| time | | | 0.9 | 1.11 | 1.3 | 1.66 |
| turn | | 0.97 | 0.92 | 1.03 | 0.9 | |
| virt | | 0.87 | 1 | 1.15 | 1.3 | |
| cplx | 0.7 | 0.85 | 1.11 | 1.15 | 1.16 | 1.65 |
| rely | 0.75 | 0.88 | 1 | 1.25 | 1.4 | |

Figure 3: Comparison of MRE for NASA 93 projects (x-axis: NASA projects 1 to 93; y-axis: MRE, 0 to 12; series: MRE proposed model and MRE COCOMO).

A comparison between the proposed method and COCOMO in terms of MMRE is given in Table 4. The proposed method has an average error of 0.27 with respect to actual efforts, whereas COCOMO produces a somewhat higher error. The proposed model also performs well on the other dataset of 93 projects.

Essentially, we want to measure the useful functionality produced per unit of time. Productivity is another measure of the effectiveness of the model. It is the rate at which the individual software developers involved in software development produce software and the associated documentation.

Higher productivity reflects better quality achievement in project development. The proposed method has a productivity of 0.29, which is closer to the productivity of 0.27 obtained with actual efforts. Productivity with the proposed method is 7 percent higher, and productivity with COCOMO is 9 percent lower, than the actual productivity (Table 5). So the difference between the proposed method and the COCOMO results is approximately 17.95%.


Figure 4: Comparison of productivity for NASA 63 projects (x-axis: software projects; y-axis: productivity; series: productivity with actual efforts, productivity with COCOMO efforts, and productivity with proposed method).

Table 4: The MMRE for two different methods.

| Dataset | COCOMO versus actual | Proposed method versus actual |
|---|---|---|
| MMRE (for 63 datasets) | 0.36 | 0.27 |
| MMRE (for 93 datasets) | 0.59 | 0.56 |

Table 5: The productivity of various approaches.

| | Productivity (COCOMO) | Productivity (proposed method) | Productivity (actual) |
|---|---|---|---|
| Productivity | 0.25 | 0.30 | 0.28 |
| Difference from actual | 0.03 | 0.02 | - |

Table 6: MMRE of NASA 63 projects for various project modes.

| Project mode | No. of projects (63) | MMRE for proposed method | MMRE for COCOMO |
|---|---|---|---|
| Embedded | 27 | 0.29 | 0.39 |
| Organic | 25 | 0.28 | 0.37 |
| Semidetached | 11 | 0.22 | 0.23 |

Table 7: MMRE of NASA 93 projects for various project modes.

| Project mode | No. of projects (93) | MMRE for proposed method | MMRE for COCOMO |
|---|---|---|---|
| Embedded | 21 | 0.72 | 0.82 |
| Organic | 3 | 0.8 | 0.88 |
| Semidetached | 69 | 0.51 | 0.51 |

Tables 6 and 7 depict the presence of error in all three categories of project modes for the two different datasets. The comparison was made between the results generated by the proposed model and the COCOMO results. We also evaluate the different types of project application categorically; 80% of the total datasets produce results that are better than the COCOMO based results (Table 8).

PRED was calculated with the two separate approaches, and Table 9 shows that, for 3 different PRED assumptions, the proposed method produces increases in PRED of approximately 6.665%, 8.01%, and 8.34%, respectively.

7. Conclusion

Work carried out in the paper explores the interrelationship among different dimensions of data driven software projects, namely, project size and effort. The above-mentioned results demonstrate that applying the proposed method to software effort estimation is by far the most feasible approach for addressing the problem of apprehension and ambiguity existing in software effort drivers. The order of occurrence of the various cost drivers has a significant impact on overall efforts in project estimation. Small adjustments to the COCOMO cost drivers bring significant improvements to the quality criteria applied to the proposed approach. The proposed method produces tuned values of the cost drivers, which are effective enough to improve the productivity of the projects. Prediction at different levels of MRE for each project reflects


Table 8: Description of projects on application basis.

| Type of application | No. of projects | MMRE COCOMO | MMRE proposed method |
|---|---|---|---|
| Application ground | 2 | 0.28 | 0.25 |
| Avionics | 11 | 0.95 | 0.80 |
| Avionics monitoring | 30 | 0.66 | 0.55 |
| Batch data processing | 2 | 0.08 | 0.12 |
| Communications | 1 | 0.18 | 0.05 |
| Data capture | 3 | 0.09 | 0.07 |
| Launch processing | 1 | 0.32 | 0.46 |
| Mission planning | 20 | 0.38 | 0.34 |
| Monitor control | 8 | 0.20 | 0.50 |
| Operating system | 4 | 3.82 | 3.63 |
| Real data processing | 3 | 0.12 | 0.06 |
| Science | 2 | 0.18 | 0.41 |
| Simulation | 4 | 0.17 | 0.29 |
| Utility | 2 | 0.12 | 0.31 |

Table 9: PRED calculation at different values for both the models (percentage of 63 NASA datasets).

| Model | PRED 10 | PRED 20 | PRED 30 |
|---|---|---|---|
| COCOMO | 23.81 | 39.68 | 57.14 |
| Proposed method | 25.4 | 42.86 | 61.91 |

the percentage of projects with desired accuracy. Furthermore, this model is validated on two different datasets, on which it shows better estimation accuracy than COCOMO 81 on the NASA 63 and NASA 93 datasets. The utilization of the proposed algorithm for other applications in the software engineering field can also be explored in the future.

Conflict of Interests

The authors certify that there is no actual or potential conflict of interests in relation to this paper. The American company TRW Systems Inc. has been referred to as the company where Barry W. Boehm, the developer of COCOMO, worked.

References

[1] K. M. Furulund and K. Moløkken-Østvold, "Increasing software effort estimation accuracy: using experience data, estimation models and checklists," in Proceedings of the 7th International Conference on Quality Software (QSIC '07), pp. 342-347, Portland, OR, USA, October 2007.

[2] Q. Alam, P. Bhatia, and S. Sarwar, Systematic Review of Effort Estimation and Cost Estimation, Institute of Management Studies, Roorkee, India, 2012.

[3] J. J. Dolado, On the Problem of the Software Cost Function, Facultad de Informatica, Universidad del Pais Vasco-Euskal Herriko Unibertsitatea, Gipuzkoa, Spain, 2000.

[4] K. Molokken and M. Jorgensen, "A review of software surveys on software effort estimation," in Proceedings of the International Symposium on Empirical Software Engineering (ISESE '03), pp. 220-230, 2003.

[5] F. Ferrucci, C. Gravino, R. Oliveto, and F. Sarro, "Genetic programming for effort estimation: an analysis of the impact of different fitness functions," in Proceedings of the 2nd International Symposium on Search Based Software Engineering (SSBSE '10), pp. 89-98, IEEE Computer Society, DMI, University of Salerno, Benevento, Italy, October 2010.

[6] A. F. Sheta, "Estimation of the COCOMO model parameters using genetic algorithms for NASA software projects," Journal of Computer Science, vol. 2, no. 2, pp. 118-123, 2006.

[7] B. W. Boehm, Software Engineering Economics, Prentice Hall, IEEE, 1984.

[8] J. Magne and M. Shepperd, "A systematic review of software development cost estimation studies," IEEE Transactions on Software Engineering, vol. 33, no. 1, pp. 33-53, 2007.

[9] P. L. Braga, A. L. I. Oliveira, and S. R. L. Meira, "A GA-based feature selection and parameters optimization for support vector regression applied to software effort estimation," in Proceedings of the 23rd Annual ACM Symposium on Applied Computing (SAC '08), pp. 1788-1792, Ceara, Brazil, March 2008.

[10] M. Harman and B. F. Jones, "Search-based software engineering," Information and Software Technology, vol. 43, no. 14, pp. 833-839, 2001.

[11] J. Clarke, J. J. Dolado, M. Harman et al., "Reformulating software engineering as a search problem," IEE Proceedings: Software, vol. 150, no. 3, pp. 161-175, 2003.

[12] M. Jørgensen and S. Grimstad, "Avoiding irrelevant and misleading information when estimating development effort," IEEE Software, vol. 25, no. 3, pp. 78-83, 2008.

[13] A. L. Lederer and J. Prasad, "A causal model for software cost estimating error," IEEE Transactions on Software Engineering, vol. 24, no. 2, pp. 137-148, 1998.


[14] S. Basha and P. Dhavachelvan, "Analysis of empirical software effort estimation models," International Journal of Computer Science and Information Security, vol. 7, no. 3, pp. 68-77, 2010.

[15] B. L. Barber, Investigative search of quality historical software support cost data and software support cost-related data [M.S. thesis], 1991.

[16] N. H. Chiu and S. J. Huang, "The adjusted analogy-based software effort estimation based on similarity distances," Journal of Systems and Software, vol. 80, no. 4, pp. 628-640, 2007.

[17] G. Kadoda and M. Shepperd, "Using simulation to evaluate prediction techniques," in Proceedings of the 7th International Software Metrics Symposium (METRICS '01), pp. 349-359, IEEE Press, London, UK, 2001.

[18] M. J. Shepperd and G. F. Kadoda, "Comparing software prediction techniques using simulation," IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 1014-1022, 2001.

[19] M. J. Shepperd and C. Schofield, "Estimating software project effort using analogies," IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 736-743, 1997.

[20] M. Jørgensen and D. I. K. Sjøberg, "The impact of customer expectation on software development effort estimates," International Journal of Project Management, vol. 22, no. 4, pp. 317-325, 2004.

[21] J. Kaczmarek and M. Kucharski, "Size and effort estimation for applications written in Java," Information and Software Technology, vol. 46, no. 9, pp. 589-601, 2004.

[22] R. Jeffery, M. Ruhe, and I. Wieczorek, "Using public domain metrics to estimate software development effort," in Proceedings of the 7th International Software Metrics Symposium (METRICS '01), pp. 16-27, IEEE Computer Society, Washington, DC, USA, April 2001.

[23] G. H. Subramanian, P. C. Pendharkar, and M. Wallace, "An empirical study of the effect of complexity, platform, and program type on software development effort of business applications," Empirical Software Engineering, vol. 11, no. 4, pp. 541-553, 2006.

[24] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, chapter 1-8, Addison-Wesley, New York, NY, USA, 1989.

[25] A. Heiat, "Comparison of artificial neural network and regression models for estimating software development effort," Information and Software Technology, vol. 44, no. 15, pp. 911-922, 2002.

[26] K. Srinivasan and D. Fisher, "Machine learning approaches to estimating software development effort," IEEE Transactions on Software Engineering, vol. 21, no. 2, pp. 126-137, 1995.

[27] S. J. Huang, C. Y. Lin, and N. H. Chiu, "Fuzzy decision tree approach for embedding risk assessment information into software cost estimation model," Journal of Information Science and Engineering, vol. 22, no. 2, pp. 297-313, 2006.

[28] M. van Genuchten and H. Koolen, "On the use of software cost models," Information and Management, vol. 21, no. 1, pp. 37-44, 1991.

[29] A. J. Albrecht and J. E. Gaffney, "Software function, source lines of code, and development effort prediction: a software science validation," IEEE Transactions on Software Engineering, vol. 9, no. 6, pp. 639-648, 1983.

[30] I. Attarzadeh and S. H. Ow, "A novel algorithmic cost estimation model based on soft computing technique," Journal of Computer Science, vol. 6, no. 2, pp. 117-125, 2010.

[31] F. J. Heemstra, Software Cost Estimation Models, University of Technology, Department of Industrial Engineering, IEEE, 1990.

[32] M. Jørgensen, B. Boehm, and S. Rifkin, "Software development effort estimation: formal models or expert judgment," IEEE Software, vol. 26, no. 2, pp. 14-19, 2009.

[33] Y. F. Li, M. Xie, and T. N. Goh, "A study of genetic algorithm for project selection for analogy based software cost estimation," in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM '07), pp. 1256-1260, Singapore, December 2007.

[34] H. Liu and L. Yu, "Toward integrating feature selection algorithms for classification and clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491-502, 2005.

[35] A. Kumar, S. Tiwari, K. K. Mishra, and A. K. Misra, "Generation of efficient test data using path selection strategy with elitist GA in regression testing," in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT '10), vol. 9, pp. 389-393, Chengdu, China, July 2010.

[36] K. K. Mishra, S. Tiwari, A. Kumar, and A. K. Misra, "An approach for mutation testing using elitist genetic algorithm," in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT '10), vol. 5, pp. 426-429, Chengdu, China, July 2010.

[37] S. Sarmady, An Investigation on Genetic Algorithm Parameters, P-COM000507(R), P-COM008807, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia, 2007.

[38] K. F. Man, K. S. Tang, and S. Kwong, Genetic Algorithms: Concepts and Designs, Chapter 1-10, Springer, New York, NY, USA, 2001.

[39] L. C. Briand, K. El-Emam, and I. Wieczorek, "Explaining the cost of European space and military projects," in Proceedings of the International Conference on Software Engineering (ICSE '99), pp. 303-312, ACM Press, May 1999.

[40] L. C. Briand, T. Langley, and I. Wieczorek, "Replicated assessment and comparison of common software cost modeling techniques," in Proceedings of the International Conference on Software Engineering (ICSE '22), pp. 377-386, ACM Press, June 2000.


include all program instructions and formal statements [6, 7]. The aim here is to provide a basis for software effort estimation through a systematic review of previous research papers [8]. Some research studies have demonstrated that the level of accuracy in software effort estimates is strongly influenced by the selection of the input values of the parameters of these methods. Combining input feature selection with parameter optimization of machine learning methods improves the accuracy of software development effort estimation [9].

Recently, the use of search based methods has been suggested to address the software development effort estimation problem [10, 11]. Such a problem can be formulated as an optimization problem in which we have to identify the estimation model that provides the best prediction [5]. This study aims at providing a better effort estimate on the parameters of modified COCOMO along with the detailed use of a binary genetic algorithm as a novel optimization algorithm. The performance of the developed model, in terms of estimation accuracy, was tested on the 93 and 63 NASA project datasets and compared to the preexisting COCOMO. The developed model is able to provide better estimation capabilities.

The paper is organized in 7 sections. Section 2 illustrates the problem and the related techniques. Section 3 depicts the solution approach, and Section 4 describes the proposed algorithm for solving the problem; four submodels are introduced in that section. Evaluation criteria and data analysis are discussed in Section 5. Result analysis is made with the help of the proposed method in Section 6. Finally, the paper is concluded in Section 7.

2. Problem Illustration

2.1. Problem Statement. Software development effort estimates are likely to be highly inaccurate and systematically overoptimistic due to the valence effect of prediction, anchoring, the planning fallacy, and cognitive effects. Empirical evidence suggests that the problem was caused, to some extent, by irrelevant and misleading information, for example, information regarding the client's budget, present in the estimation material [12]. Previous research has shown that the average effort overrun in software development projects is about 30-40% [1, 4]. Estimating techniques have emerged continually, and attempts have been made to compare these techniques and derive the best practices [1, 4, 13].

Empirical software estimation models are mainly based on cost drivers and scale factors. These models show the problem of instability due to the values of the cost drivers and scale factors, thus affecting the sensitivity in terms of accurate effort estimation. Also, most of the models depend on the size of the project, and a small change in the size leads to a proportionate change in the effort. Miscalculations of the cost drivers make the resulting data even noisier. For example, a misjudgment in the personnel capability cost driver in COCOMO between "very high" and "very low" will result in a 300% increase in effort. Similarly, in SEER-SEM, changing the security requirements value from "low" to "high" will result in a 400% increase in effort. In PRICE-S, a 20% change in effort will occur due to a small change in the value of the productivity factor [14]. The above statements reveal that all models have one or more inputs for which small changes will result in large changes in effort. The input data problem is further compounded in that some inputs are difficult to obtain, especially in the early stages of program development. The size must be estimated early in a project using one or more sizing models. Some sensitive inputs, such as analyst and programmer capability in cost drivers, depend on individuals and are often difficult to determine. Many studies, like the one performed by [15], show that personnel parameter data are difficult to collect.

2.2. Algorithmic Models. Many software estimation models have been proposed by various researchers and can be categorized according to their basic formulation schemes: analogy based estimation schemes [16-19], expert-judgment estimation [20], and algorithmic models, including empirical methods [21], rule induction methods [22], Bayesian network approaches [23], decision tree based methods [24], artificial neural network based approaches [25, 26], and fuzzy logic based estimation schemes [27].

Among these diversified models, some well-known algorithmic models, namely COCOMO, SLIM, SEER-SEM, and FP analysis methods, are very popular in practice in the empirical category [28], while COCOMO and Function Points allow us to guess the size (in KLOC) of the software ourselves. Albrecht observed in his research that Function Points were highly correlated to lines of code, so in effect they are complementary [29]. Function Points calculate the logical source lines of code, and COCOMO is based on physical source lines of code. These empirical models work with certain inputs, that is, accurate estimates of specific attributes such as source lines of code (SLOC), multiplicative factors, interfaces and complexity, and number of user screens, which are not always easy to acquire during the early stage of software project development. Models based on historical data have limitations. Understanding and calculation of these models are difficult because the inherent complex relationships between the attributes used to predict software development effort could change over time and/or differ across software development environments [26]. They are unable to handle categorical data and lack reasoning capabilities [30].

The limitations of algorithmic models have led to the exploration of nonalgorithmic models, which are soft computing based [30].

2.3. Constructive Cost Model. The original Constructive Cost Model, abbreviated as COCOMO, was first published by Dr. Barry Boehm in 1981. The word "constructive" implies that the complexity of the model can easily be understood due to the openness of the model, which exhibits exactly why the model gives the estimates it does. Since the inception of software development techniques, many efforts have been made to improve estimation. COCOMO is the best documented and most transparent model and reflects the software development practices of these days. The main focus in COCOMO is upon the estimation of the influence of 15 cost drivers on the development effort. The model does not support project management in estimating the size of the software. COCOMO has been derived from a database of 63 projects executed between 1964 and 1979 by the American company TRW Systems Inc. The projects considered during this era differed strongly in type of application, size, and programming language [31].

Boehm introduced three levels of the estimation model: basic, intermediate, and detailed.

(i) The basic COCOMO 81 is a single-valued, static model which provides an approximate estimation of software development effort and cost as a function of program size expressed in thousands of delivered source instructions (KDSI).

(ii) The intermediate COCOMO 81 describes software development effort as a function of program size in LOC and a set of fifteen effort multipliers known as "cost drivers." These cost drivers incorporate subjective assessments of product, project, personnel, and hardware attributes.

(iii) The advanced or detailed COCOMO 81 reduces the margin of error in the final estimate by incorporating all characteristics of the intermediate version together with an assessment of the cost drivers' impact on each step, that is, analysis and design, of the software engineering process.

COCOMO assumes that the effort grows more than linearly with software size. For some cost drivers the rating has to be increased to decrease the effort, whereas for others it has to be decreased; that is, person-months $= a \times (\text{KSLOC})^{b} \times c$. Here $a$ and $b$ are domain-specific parameters, KSLOC denotes kilo source lines of code, which is estimated directly or computed from a function point analysis, and $c$ is the product of fifteen effort multipliers ($\text{EM}_i$), with $i = 1$ to $15$.

So the equation can be written as

$$\text{Person-months} = a \times (\text{KSLOC})^{b} \times (\text{EM}_1 \times \text{EM}_2 \times \cdots \times \text{EM}_{15}); \quad \text{see [7, 25]}. \tag{2}$$
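As a minimal sketch of equation (2), the following Python function multiplies the size term by the product of the effort multipliers. The function name and the coefficient table are illustrative assumptions: the $(a, b)$ pairs are the commonly cited intermediate COCOMO 81 values per project mode and are not stated in this paper.

```python
from math import prod

# Assumed (a, b) coefficients per project mode: the commonly cited
# intermediate COCOMO 81 values, not taken from this paper.
MODE_COEFFICIENTS = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def cocomo_effort(ksloc, mode, effort_multipliers):
    """Return person-months as a * KSLOC**b * product(EM_i), as in equation (2)."""
    a, b = MODE_COEFFICIENTS[mode]
    eaf = prod(effort_multipliers)  # the factor c: product of the 15 multipliers
    return a * ksloc ** b * eaf


# A 10 KSLOC organic project with all 15 drivers rated nominal (multiplier 1).
nominal = [1.0] * 15
print(cocomo_effort(10.0, "organic", nominal))

# The same project with one driver at 1.46 (acap rated very low in Table 1).
print(cocomo_effort(10.0, "organic", [1.46] + [1.0] * 14))
```

Because the multipliers enter as a plain product, changing a single driver from 1 to 1.46 scales the whole estimate by 1.46, which is why misjudged ratings have such a visible effect on effort.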

3. Solution to the Problem

3.1. Nonalgorithmic Models. In contrast to the algorithmic models, the nonalgorithmic models, proposed since the 1990s, are based on computational intelligence, analytical comparisons, and inferences for project cost estimation. They have the capability to model the complex set of relationships between the dependent variables (cost, effort) and the independent variables (cost drivers) collected earlier in the project lifecycle and to learn from historical project data. To use the nonalgorithmic models, information is required about previous project datasets that are similar to the project under estimation. Usually, in these methods, the estimation process is done according to the analysis of the historical datasets. Many software researchers have shown interest in new approaches to nonalgorithmic models that are based on soft computing, that is, artificial neural networks, fuzzy logic, and evolutionary algorithms. These methods are being used for assessment because of their popularity, and a large number of papers about their usage have been published in recent years [26, 32-34]. Choosing a suitable technique is a difficult decision and requires the support of a well-defined evaluation scheme to rank each evolutionary computation technique as and when it is applied to any optimization problem. In the present study, an effective model based on evolutionary computation has been proposed to overcome the problem of uncertainty and to acquire better results.

3.2. Genetic Algorithms. Evolutionary computational methods are generally used in software engineering methodologies such as test case generation [35, 36], effort estimation, cost estimation, and many more. Genetic algorithms are a simple and almost generic evolutionary computational method, inspired by Darwin's theory of natural evolution, for solving complex optimization problems. A genetic algorithm requires a careful and suitable selection of parent selection methods, mutation methods, population size, and so forth to find good solutions. If improper parameters and methods are chosen, the result may be longer program runs or even bad optimization results [37]. In nature, competition among individuals for scanty resources results in the fittest individuals dominating over the weakest ones [24, 38].

3.2.1. Working Principle

(i) The genetic algorithm starts with a randomly generated initial population as a set of solutions, which are represented by chromosomes.

(ii) The algorithm then generates a sequence of new populations. At each iteration, the algorithm uses the individuals of the current generation to create the next generation. To create the new population, the algorithm performs the following steps.

    (a) Score each individual member of the current population by computing its fitness value.

    (b) Scale the raw fitness scores to convert them into a more desired range of values.

    (c) Select the good individuals, called parents, based on the value of the fitness function.

    (d) A few of the individuals in the current population that have lower fitness values (the best ones when the fitness is being minimized) are selected as elite. These individuals are sent directly to the next generation for elitism.

    (e) Produce offspring from the parents. Offspring are produced either by mutating a single parent or by combining the chromosomes of a pair of parents with the help of the crossover operator.

    (f) Update the current population with the offspring to form the new generation.

(iii) The algorithm terminates when any one of the stopping criteria is reached, that is, the number of generations or the desired fitness value.
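The working principle above can be sketched as a small binary genetic algorithm in Python. This is an illustrative sketch, not the authors' implementation: the function names, chromosome length, population size, and per-bit mutation rate are assumptions, the crossover probability follows the 0.8 listed in Algorithm 1, and fitness scaling (step (b)) is omitted because tournament selection depends only on the ranking of the scores.

```python
import random


def decode(bits, lo, hi):
    """Map a list of 0/1 bits to a real value in [lo, hi]."""
    as_int = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)


def tournament(population, scores, rng, k=2):
    """Return the lowest-scoring individual among k randomly chosen ones."""
    contenders = rng.sample(range(len(population)), k)
    return population[min(contenders, key=lambda i: scores[i])]


def binary_ga(fitness, lo, hi, n_bits=12, pop_size=20, generations=40,
              p_cross=0.8, p_mut=0.05, n_elite=2, seed=0):
    """Minimise fitness(x) for x in [lo, hi] with a simple binary GA."""
    rng = random.Random(seed)
    # (i) Random initial population of bit-string chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # (iii) stop after a fixed number of generations
        # (a) Score every individual.
        scores = [fitness(decode(ind, lo, hi)) for ind in population]
        ranked = sorted(range(pop_size), key=lambda i: scores[i])
        # (d) Elitism: the best individuals survive unchanged.
        next_gen = [population[i][:] for i in ranked[:n_elite]]
        while len(next_gen) < pop_size:
            # (c) Parent selection by tournament.
            p1 = tournament(population, scores, rng)
            p2 = tournament(population, scores, rng)
            child = p1[:]
            if rng.random() < p_cross:  # (e) single-point crossover
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            # (e) Bit-flip mutation, applied independently to each bit.
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            next_gen.append(child)
        population = next_gen  # (f) the offspring form the new generation
    best = min(population, key=lambda ind: fitness(decode(ind, lo, hi)))
    return decode(best, lo, hi)


# Toy objective with its minimum at x = 1.1, searched over [0.5, 2.0].
best_x = binary_ga(lambda x: (x - 1.1) ** 2, lo=0.5, hi=2.0)
print(round(best_x, 3))
```

Keeping the elite individuals unchanged guarantees that the best fitness found never gets worse from one generation to the next.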


Sub-Model 1
Step 1. Generate the MMRE (M) for the available N projects using actual and COCOMO estimated efforts.
    (i) [BEGIN]
    (ii) Input the 15 cost drivers, KLOC, and Actual Effort for NASA projects.
    (iii) [LOOP]
        for j = 1 to number of projects (say n)
            EAF[j] = D1 * D2 * ... * D15
            Estimated Effort[j] = a[j] * (kloc[j] ^ b[j]) * EAF[j]
            MRE[j] = |Actual Effort[j] - Estimated Effort[j]| / Actual Effort[j]
            MMRE(original) += MRE[j]
        MMRE(original) = MMRE(original) / n   [The original MMRE is obtained and noted down]
    (iv) [END OF LOOP]
    (v) [END]

Sub-Model 2
Step 2. for I = 1 to 15
            temp = EM[I]
            Set EM[I] = 1
            Calculate influenced MMRE (MN)
            List[I, 1] = I
            List[I, 2] = |MN - M|
            EM[I] = temp
        end for

Sub-Model 3
Step 3. Sort the list according to the second parameter in descending order.
        for I = 1 to 14
            for j = I + 1 to 15
                if (list[I, 2] < list[j, 2]) then
                    swap(list[I], list[j])
                end if
            end for
        end for
Step 4. Sig = list[I, 1] represents the order of significance occurrences.

Sub-Model 4
Step 5. for I = 1 to 15
            for j = very low to extra high (six ratings of cost driver)
                Select projects (P) as an input for calculating the fitness value using fitness function F1 = MMRE(P)
                Set the range R as Rmax, Rmin
                Generate initial population for the cost driver with range R
                Perform the genetic operations for K generations:
                    (1) Tournament selection
                    (2) Crossover with Pc = 0.8
                    (3) Mutation with Pm = 0.3
                Select the individual (CDNEW) with the best MMRE
Step 6.         Calculate the MMRE (Mmod) by replacing CDij with CDNEW
                if (Mmod < M) then
                    update the value of CDij and M
                else
                    discard the value
                end if
            end for
        end for

Algorithm 1

4. Proposed Algorithm for Solving the Problem

Algorithm Description (see Algorithm 1). The proposed algorithm is divided into 4 submodels. In submodel 1, we calculate the mean magnitude of relative error (MMRE) of all projects according to the results obtained by COCOMO. Here, we first calculate the estimated efforts by considering the 15 COCOMO cost drivers, project modes, and kilo lines of code (KLOC). Estimated efforts along with actual efforts produced in


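Table 1: COCOMO cost drivers.

| Cost drivers | Very low | Low | Nominal | High | Very high | Extra high |
|---|---|---|---|---|---|---|
| acap | 1.46 | 1.19 | 1 | 0.86 | 0.71 | - |
| pcap | 1.42 | 1.17 | 1 | 0.86 | 0.7 | - |
| aexp | 1.29 | 1.13 | 1 | 0.91 | 0.82 | - |
| modp | 1.24 | 1.1 | 1 | 0.91 | 0.82 | - |
| tool | 1.24 | 1.1 | 1 | 0.91 | 0.83 | - |
| vexp | 1.21 | 1.1 | 1 | 0.9 | - | - |
| lexp | 1.14 | 1.07 | 1 | 0.95 | - | - |
| sced | 1.23 | 1.08 | 1 | 1.04 | 1.1 | - |
| stor | - | - | 1 | 1.06 | 1.21 | 1.56 |
| data | - | 0.94 | 1 | 1.08 | 1.16 | - |
| time | - | - | 1 | 1.11 | 1.3 | 1.66 |
| turn | - | 0.87 | 1 | 1.07 | 1.15 | - |
| virt | - | 0.87 | 1 | 1.15 | 1.3 | - |
| cplx | 0.7 | 0.85 | 1 | 1.15 | 1.3 | 1.65 |
| rely | 0.75 | 0.88 | 1 | 1.15 | 1.4 | - |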

various projects are used as the input parameters to calculate the magnitude of relative error (MRE) of each project. The MMRE for the COCOMO results is recorded as the original MMRE.

In submodel 2, the influenced MMRE is calculated on the basis of the occurrences of the 15 cost drivers. This influenced MMRE shows the effectiveness of each cost driver in the sequence of development of efforts in terms of person-months. In this process, we take sample data having 18 input parameters, that is, 15 cost drivers, mode, source lines, and actual effort. The estimated efforts are calculated for the sample data by nullifying the effect of the cost drivers one by one, that is, by setting the corresponding multiplier to 1. These efforts are used to calculate the influenced MMRE for each cost driver with respect to the actual effort provided in the sample data. The difference between the influenced MMRE and the original MMRE is recorded in the list along with the driver.

In submodel 3, the list with the difference between the influenced MMRE and the original MMRE is sorted in descending order of the difference to provide the significant occurrence of each driver. This order has been named Sig.
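A minimal Python sketch of submodels 1 to 3 is given below, under stated assumptions: the project records, the two-driver toy data, and the organic-mode coefficients are hypothetical and only illustrate the mechanics of nullifying one driver at a time and ranking the drivers by the resulting change in MMRE.

```python
def mmre(actual, estimated):
    """Mean magnitude of relative error over paired effort values."""
    return sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)


def estimate(project, coefficients, skip=None):
    """COCOMO effort; the driver named in `skip` is nullified (set to 1)."""
    a, b = coefficients[project["mode"]]
    eaf = 1.0
    for name, value in project["drivers"].items():
        eaf *= 1.0 if name == skip else value
    return a * project["kloc"] ** b * eaf


def significance_order(projects, coefficients):
    """Rank drivers by |influenced MMRE - original MMRE|, largest first."""
    actual = [p["actual"] for p in projects]
    # Submodel 1: original MMRE with all drivers active.
    original = mmre(actual, [estimate(p, coefficients) for p in projects])
    diffs = {}
    # Submodel 2: nullify each driver in turn and record the difference.
    for name in projects[0]["drivers"]:
        influenced = mmre(
            actual, [estimate(p, coefficients, skip=name) for p in projects]
        )
        diffs[name] = abs(influenced - original)
    # Submodel 3: sort in descending order of the difference (Sig).
    order = sorted(diffs, key=lambda n: diffs[n], reverse=True)
    return order, diffs


# Hypothetical toy data with only two of the 15 drivers.
coefficients = {"organic": (3.2, 1.05)}  # assumed intermediate COCOMO values
projects = [
    {"mode": "organic", "kloc": 10.0, "actual": 40.0,
     "drivers": {"acap": 1.46, "stor": 1.0}},
    {"mode": "organic", "kloc": 20.0, "actual": 90.0,
     "drivers": {"acap": 0.86, "stor": 1.06}},
]
order, diffs = significance_order(projects, coefficients)
print(order)
```

A driver whose multiplier is 1 for every project gets a difference of zero and therefore ends up at the bottom of the ranking.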

In submodel 4, we try to minimize the MMRE by updating the values of the cost drivers with the help of the genetic algorithm, in the order of their significance. This is done by selecting the projects falling in the category of a particular cost driver rating and then applying the genetic algorithm operators. The results obtained are evaluated using MMRE as the fitness function. If the MMRE is reduced, the cost driver value for the particular rating is updated; otherwise it is discarded. The reduced MMRE is recorded as Mmod, which is then used as the MMRE for the remaining cost drivers.
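The accept-or-discard loop of submodel 4 can be outlined as follows. This is a sketch under explicit assumptions, not the authors' code: a coarse grid search stands in for the genetic algorithm, the search range of plus or minus 0.3 around the current value is an arbitrary stand-in for Rmin and Rmax, and the single-driver project data and coefficients are hypothetical.

```python
from math import prod

A, B = 3.2, 1.05  # assumed organic-mode coefficients, for illustration only


def mmre(projects, table):
    """Mean MRE of COCOMO estimates built from the multiplier table."""
    total = 0.0
    for p in projects:
        eaf = prod(table[(d, r)] for d, r in p["ratings"].items())
        estimated = A * p["kloc"] ** B * eaf
        total += abs(p["actual"] - estimated) / p["actual"]
    return total / len(projects)


def grid_search(objective, lo, hi, steps=50):
    """Stand-in for the GA: best of evenly spaced candidates in [lo, hi]."""
    candidates = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(candidates, key=objective)


def tune(projects, table, order, span=0.3):
    """Tune multipliers driver by driver, keeping only improving updates."""
    table = dict(table)                       # leave the caller's table intact
    best = mmre(projects, table)              # M in Algorithm 1
    for driver in order:                      # significance order (Sig)
        ratings = sorted({r for (d, r) in table if d == driver})
        for rating in ratings:
            subset = [p for p in projects if p["ratings"][driver] == rating]
            if not subset:
                continue
            old = table[(driver, rating)]

            def objective(value):
                # Fitness: MMRE of the projects that carry this rating.
                return mmre(subset, {**table, (driver, rating): value})

            new = grid_search(objective, old - span, old + span)
            trial = mmre(projects, {**table, (driver, rating): new})  # Mmod
            if trial < best:                  # accept only if MMRE drops
                table[(driver, rating)], best = new, trial
    return table, best


# Hypothetical data: one driver, two ratings, three projects.
projects = [
    {"kloc": 10.0, "actual": 50.0, "ratings": {"acap": "low"}},
    {"kloc": 20.0, "actual": 70.0, "ratings": {"acap": "high"}},
    {"kloc": 35.0, "actual": 110.0, "ratings": {"acap": "high"}},
]
table = {("acap", "low"): 1.19, ("acap", "high"): 0.86}
tuned, tuned_mmre = tune(projects, table, order=["acap"])
print(mmre(projects, table), tuned_mmre)
```

Because a candidate value is kept only when the overall MMRE decreases, the tuned table can never score worse than the starting table on the data used for tuning.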

5. Evaluation Method

5.1. Conceptual View. Software cost estimation models need to be quantitatively evaluated in terms of estimation accuracy to improve the modeling process. Some rules or measurements must be provided for model assessment purposes. This measurement of accuracy defines how close the estimated result is to its actual value. Software cost estimates play a significant role in delivering software projects. As a result, researchers have proposed the mean magnitude of relative error (MMRE), the most widely used evaluation criterion, to assess the performance of software prediction models and to evaluate the quality of prediction systems. MMRE is usually computed by following standard evaluation processes such as cross-validation [39]. It is independent of size scale and effort units. Comparisons can be made across datasets and prediction model types [40].

COCOMO computes effort on the basis of source lines of code. In intermediate COCOMO, Boehm used 15 more predictor variables, called cost drivers, which are required to calibrate the nominal effort of a project to the actual project environment. A value is assigned to each cost driver according to the properties of the specific software project. The numerical values of the 15 cost drivers are multiplied to get the effort adjustment factor, that is, EAF.

5.2. Data Analysis. Performance of estimation methods is usually evaluated by several ratio measurements of accuracy metrics, including RE (relative error), MRE (magnitude of relative error), and MMRE (mean magnitude of relative error), which are computed as follows:

$$
\begin{aligned}
\text{RE} &= \frac{\text{Estimated Effort} - \text{Actual Effort}}{\text{Actual Effort}}, \\
\text{MRE} &= \frac{\lvert \text{Estimated Effort} - \text{Actual Effort} \rvert}{\text{Actual Effort}}, \\
\text{MMRE} &= \frac{\sum \text{MRE}}{N}.
\end{aligned}
\tag{3}
$$

Another parameter used in evaluating the performance of an estimation method is PRED (percentage of prediction), which is determined as

$$\text{PRED}(X) = \frac{A}{N}, \tag{4}$$


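Table 2: Significant occurrences of cost drivers.

| Order | Cost driver |
|---|---|
| 1 | acap |
| 2 | pcap |
| 3 | aexp |
| 4 | rely |
| 5 | virt |
| 6 | vexp |
| 7 | time |
| 8 | modp |
| 9 | cplx |
| 10 | data |
| 11 | tool |
| 12 | sced |
| 13 | lexp |
| 14 | turn |
| 15 | stor |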

where 119860 is the number of projects with MRE less than orequal to level 119883 and119873 is the number of considered projectsUsually the acceptable level of 119883 is 025 and the variousmethods are compared based on this level Decreasing ofMMRE and increasing of PRED are the main aim of allestimation techniques
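The definition in (4) can be illustrated with a short sketch. The function name `pred` and the sample efforts are assumptions made for illustration only.

```python
def pred(estimated_efforts, actual_efforts, level=0.25):
    """PRED(X): fraction of projects whose MRE is at most `level`."""
    mres = [abs(e - a) / a for e, a in zip(estimated_efforts, actual_efforts)]
    within = sum(1 for m in mres if m <= level)  # A in equation (4)
    return within / len(mres)                    # A / N


# Hypothetical efforts in person-months; the MRE values are 0.1, 0.25, 0.0, 0.5.
actual = [100.0, 200.0, 50.0, 80.0]
estimated = [110.0, 150.0, 50.0, 120.0]

print(pred(estimated, actual, level=0.25))
print(pred(estimated, actual, level=0.20))
```

Multiplying the returned fraction by 100 gives the percentage form in which PRED is usually reported.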

6. Results and Discussion

6.1. Dataset Description. Experiments were carried out on the COCOMO 81 based dataset of 63 projects used by NASA, and various other calculations were performed on it. A further 93 NASA projects from different centers, covering the years 1971 to 1987, were collected by Jairus Hihn, JPL, NASA, Manager of the SQIP Measurement and Benchmarking Element. The proposed model is validated on these datasets, which are among the most analyzed data sets. The independent variable used is "adjusted delivered source instructions," which takes into account the variation of effort when adapting software. COCOMO is built upon these data points by introducing many factors in the form of multipliers.

Together, these datasets include 156 historical projects with 17 effort drivers and one dependent variable, the software development effort.

6.2. Result Analysis. Cost drivers play a vital role in the estimation of effort and cost. They capture characteristics of software development that influence the effort required to carry out a project. Cost drivers are selected on the argument that they have a linear effect on effort. The COCOMO cost drivers are the basis for the analysis of the proposed algorithm. Table 1 lists the COCOMO effort multipliers.

The significance of the 15 cost drivers can be shown by their impact on the MMRE of efforts on the original 63 NASA datasets. The significance occurrences of the 15 cost drivers are calculated by applying step 1 to step 4, and the result is shown in Table 2.

Figure 1: Relationship between MMRE and cost drivers. [Plot not reproduced: x-axis, cost drivers 1 to 15; y-axis, MMRE from 0 to 0.5; series: MMRE (COCOMO) and significant occurrences.]

Figure 2: Comparison of MRE for NASA 63 projects. [Plot not reproduced: x-axis, software projects 1 to 61; y-axis, MRE from 0 to 2; series: MRE COCOMO and MRE with tuned parameters.]

The occurrence of each cost driver is linearly related to the MMRE computed between the actual efforts and the efforts estimated with COCOMO. In Figure 1, the influenced MMRE of each cost driver is plotted against the COCOMO MMRE, which is constant for all cases. The effect of each cost driver on the MMRE is the basis for deriving the occurrence of the cost drivers. The proportionate relationship can be seen in Figure 1: the cost drivers whose influenced MMRE is highest relative to the constant MMRE are the most significant, and those with lower values are less significant.

Once the significant occurrences of the cost drivers are found, the sequence of cost drivers is used to produce tuned values for the different ratings of the various cost drivers. Step 5 and step 6 are used to generate the new values of the available cost drivers. Table 3 lists the tuned values of the preexisting cost drivers.

The proposed algorithm is validated with two different datasets of NASA projects. According to the evaluation criteria, the efforts estimated by the proposed method differ only marginally from the actual project efforts in comparison to the COCOMO generated efforts, as shown in Figure 2. Most of the results lie near the mean MRE for the 63 data values. The other 93 datasets were also used to evaluate the projects with the proposed method (Figure 3 and Figure 4).

Table 3: Proposed algorithm based cost drivers.

Cost driver  Very low  Low   Nominal  High  Very high  Extra high
acap         1.46      1.19  0.9      0.86  0.71       -
pcap         1.42      0.9   1        0.86  0.7        -
aexp         1.29      1.4   1        0.91  0.82       -
modp         1.38      0.92  1        0.91  0.82       -
tool         1.24      1.1   0.99     0.93  0.83       -
vexp         1.38      1.03  1        0.9   -          -
lexp         1.14      1.08  0.9      0.95  -          -
sced         1.23      1.08  0.99     1.04  1.1        -
stor         -         -     1        1.06  1.19       1.38
data         -         1.03  0.9      1.06  1.38       -
time         -         -     0.9      1.11  1.3        1.66
turn         -         0.97  0.92     1.03  0.9        -
virt         -         0.87  1        1.15  1.3        -
cplx         0.7       0.85  1.11     1.15  1.16       1.65
rely         0.75      0.88  1        1.25  1.4        -

Figure 3: Comparison of MRE for NASA 93 projects. [Plot not reproduced: x-axis, NASA projects 1 to 91; y-axis, MRE from 0 to 12; series: MRE proposed model and MRE COCOMO.]

Table 4 compares the proposed method with COCOMO by MMRE. On the 63 datasets, the proposed method has an MMRE of 0.27 against the actual efforts, whereas COCOMO produces a higher MMRE of 0.36. The proposed model also performs well on the other 93 datasets, with an MMRE of 0.56 against 0.59 for COCOMO.

Essentially, we want to measure the useful functionality produced per time unit. Productivity is another measure of the effectiveness of the model. It is the rate at which the individual software developers involved in software development produce software and the associated documentation.

Higher productivity reflects better quality achievement in project development. The proposed method has a productivity of 0.29, which is close to the actual productivity of 0.27. In comparison with the actual productivity, the productivity of the proposed method is 7 percent higher and that of COCOMO is 9 percent lower (Table 5). The percentage difference between the proposed method and the COCOMO results is therefore approximately 17.95.

Figure 4: Comparison of productivity for NASA 63 projects. [Plot not reproduced: x-axis, software projects 1 to 61; y-axis, productivity from 0 to 1.4; series: productivity with actual efforts, productivity with COCOMO efforts, and productivity with proposed method.]

Table 4: The MMRE for two different methods.

Dataset              COCOMO versus actual  Proposed method versus actual
MMRE (63 datasets)   0.36                  0.27
MMRE (93 datasets)   0.59                  0.56

Table 5: The productivity of various approaches.

                        Productivity (COCOMO)  Productivity (proposed method)  Productivity (actual)
Value                   0.25                   0.30                            0.28
Difference from actual  0.03                   0.02                            -

Table 6: MMRE of NASA 63 projects for various project modes.

Project mode  No. of projects (63)  MMRE for proposed method  MMRE for COCOMO
Embedded      27                    0.29                      0.39
Organic       25                    0.28                      0.37
Semidetached  11                    0.22                      0.23

Table 7: MMRE of NASA 93 projects for various project modes.

Project mode  No. of projects (93)  MMRE for proposed method  MMRE for COCOMO
Embedded      21                    0.72                      0.82
Organic       3                     0.8                       0.88
Semidetached  69                    0.51                      0.51

Tables 6 and 7 show the error in all three categories of project modes for the two datasets, comparing the results generated by the proposed model with the COCOMO results. We also evaluate the projects by type of application: about 80% of the total datasets produce results that are better than the COCOMO based results (Table 8).

PRED was calculated with the two separate approaches. Table 9 shows that, for the three different PRED levels, the proposed method produces increases in PRED of approximately 6.665%, 8.01%, and 8.34%, respectively.
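The quoted increases can be checked against the Table 9 entries with a short calculation. The sketch below assumes that the increases are relative to the COCOMO value at each level; that reading is an interpretation and is not stated explicitly in the text. The values come out close to the quoted figures, and the small differences are presumably due to rounding of the table entries.

```python
# PRED values (percentage of the 63 NASA datasets) as listed in Table 9.
pred_cocomo = {10: 23.81, 20: 39.68, 30: 57.14}
pred_proposed = {10: 25.40, 20: 42.86, 30: 61.91}


def relative_increase(old, new):
    """Percentage increase of `new` over `old` (assumed interpretation)."""
    return (new - old) / old * 100.0


increase = {
    level: relative_increase(pred_cocomo[level], pred_proposed[level])
    for level in pred_cocomo
}

for level, value in increase.items():
    print(f"PRED({level}): {value:.2f}% increase")
```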

Table 8: Description of projects on application basis.

Type of application    No. of projects  MMRE COCOMO  MMRE proposed method
Application ground     2                0.28         0.25
Avionics               11               0.95         0.80
Avionics monitoring    30               0.66         0.55
Batch data processing  2                0.08         0.12
Communications         1                0.18         0.05
Data capture           3                0.09         0.07
Launch processing      1                0.32         0.46
Mission planning       20               0.38         0.34
Monitor control        8                0.20         0.50
Operating system       4                3.82         3.63
Real data processing   3                0.12         0.06
Science                2                0.18         0.41
Simulation             4                0.17         0.29
Utility                2                0.12         0.31

Table 9: PRED calculation at different values for both the models (percentage of 63 NASA datasets).

PRED level  COCOMO  Proposed method
10          23.81   25.4
20          39.68   42.86
30          57.14   61.91

7. Conclusion

The work carried out in this paper explores the interrelationship among different dimensions of data driven software projects, namely, project size and effort. The above-mentioned results demonstrate that applying the proposed method to software effort estimation is a feasible approach for addressing the uncertainty and ambiguity existing in software effort drivers. The order of occurrence of the various cost drivers has a significant impact on the overall efforts in project estimation. Small adjustments to the COCOMO cost drivers bring significant improvements to the quality criteria applied to the proposed approach. The proposed method produces tuned values of the cost drivers, which are effective enough to improve the productivity of the projects. Prediction at different levels of MRE for each project reflects the percentage of projects with the desired accuracy. Furthermore, the model is validated on two different datasets and shows better estimation accuracy than COCOMO 81 on the NASA 63 and NASA 93 datasets. The use of the proposed algorithm for other applications in the software engineering field can also be explored in the future.

Conflict of Interests

The authors certify that there is no actual or potential conflict of interests in relation to this paper. The American company TRW Systems Inc. has been referred to as the company where Barry W. Boehm, the developer of COCOMO, worked.

References

[1] K. M. Furulund and K. Moløkken-Østvold, "Increasing software effort estimation accuracy - using experience data, estimation models and checklists," in Proceedings of the 7th International Conference on Quality Software (QSIC '07), pp. 342–347, Portland, OR, USA, October 2007.

[2] Q. Alam, P. Bhatia, and S. Sarwar, Systematic Review of Effort Estimation and Cost Estimation, Institute of Management Studies, Roorkee, India, 2012.

[3] J. J. Dolado, On the Problem of the Software Cost Function, Facultad de Informatica, Universidad del Pais Vasco-Euskal Herriko Unibertsitatea, Gipuzkoa, Spain, 2000.

[4] K. Molokken and M. Jorgensen, "A review of software surveys on software effort estimation," in Proceedings of the International Symposium on Empirical Software Engineering (ISESE '03), pp. 220–230, 2003.

[5] F. Ferrucci, C. Gravino, R. Oliveto, and F. Sarro, "Genetic programming for effort estimation: an analysis of the impact of different fitness functions," in Proceedings of the 2nd International Symposium on Search Based Software Engineering (SSBSE '10), pp. 89–98, IEEE Computer Society, DMI, University of Salerno, Benevento, Italy, October 2010.

[6] A. F. Sheta, "Estimation of the COCOMO model parameters using genetic algorithms for NASA software projects," Journal of Computer Science, vol. 2, no. 2, pp. 118–123, 2006.

[7] B. W. Boehm, Software Engineering Economics, Prentice Hall, IEEE, 1984.

[8] J. Magne and M. Shepperd, "A Systematic Review Of Software Development Cost Estimation Studies," IEEE Transactions on Software Engineering, vol. 33, no. 1, pp. 33–53, 2007.

[9] P. L. Braga, A. L. I. Oliveira, and S. R. L. Meira, "A GA-based feature selection and parameters optimization for support vector regression applied to software effort estimation," in Proceedings of the 23rd Annual ACM Symposium on Applied Computing (SAC '08), pp. 1788–1792, Ceara, Brazil, March 2008.

[10] M. Harman and B. F. Jones, "Search-based software engineering," Information and Software Technology, vol. 43, no. 14, pp. 833–839, 2001.

[11] J. Clarke, J. J. Dolado, M. Harman et al., "Reformulating software engineering as a search problem," IEE Proceedings: Software, vol. 150, no. 3, pp. 161–175, 2003.

[12] M. Jørgensen and S. Grimstad, "Avoiding irrelevant and misleading information when estimating development effort," IEEE Software, vol. 25, no. 3, pp. 78–83, 2008.

[13] A. L. Lederer and J. Prasad, "A causal model for software cost estimating error," IEEE Transactions on Software Engineering, vol. 24, no. 2, pp. 137–148, 1998.


[14] S. Basha and P. Dhavachelvan, "Analysis of empirical software effort estimation models," International Journal of Computer Science and Information Security, vol. 7, no. 3, pp. 68–77, 2010.

[15] B. L. Barber, Investigative search of quality historical software support cost data and software support cost-related data [M.S. thesis], 1991.

[16] N. H. Chiu and S. J. Huang, "The adjusted analogy-based software effort estimation based on similarity distances," Journal of Systems and Software, vol. 80, no. 4, pp. 628–640, 2007.

[17] G. Kadoda and M. Shepperd, "Using simulation to evaluate prediction techniques," in Proceedings of the 7th International Software Metrics Symposium (METRICS '01), pp. 349–359, IEEE Press, London, UK, 2001.

[18] M. J. Shepperd and G. F. Kadoda, "Comparing software prediction techniques using simulation," IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 1014–1022, 2001.

[19] M. J. Shepperd and C. Schofield, "Estimating software project effort using analogies," IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 736–743, 1997.

[20] M. Jørgensen and D. I. K. Sjøberg, "The impact of customer expectation on software development effort estimates," International Journal of Project Management, vol. 22, no. 4, pp. 317–325, 2004.

[21] J. Kaczmarek and M. Kucharski, "Size and effort estimation for applications written in Java," Information and Software Technology, vol. 46, no. 9, pp. 589–601, 2004.

[22] R. Jeffery, M. Ruhe, and I. Wieczorek, "Using public domain metrics to estimate software development effort," in Proceedings of the 7th International Software Metrics Symposium (METRICS '01), pp. 16–27, IEEE Computer Society, Washington, DC, USA, April 2001.

[23] G. H. Subramanian, P. C. Pendharkar, and M. Wallace, "An empirical study of the effect of complexity, platform, and program type on software development effort of business applications," Empirical Software Engineering, vol. 11, no. 4, pp. 541–553, 2006.

[24] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, chapter 1–8, Addison-Wesley, New York, NY, USA, 1989.

[25] A. Heiat, "Comparison of artificial neural network and regression models for estimating software development effort," Information and Software Technology, vol. 44, no. 15, pp. 911–922, 2002.

[26] K. Srinivasan and D. Fisher, "Machine learning approaches to estimating software development effort," IEEE Transactions on Software Engineering, vol. 21, no. 2, pp. 126–137, 1995.

[27] S. J. Huang, C. Y. Lin, and N. H. Chiu, "Fuzzy decision tree approach for embedding risk assessment information into software cost estimation model," Journal of Information Science and Engineering, vol. 22, no. 2, pp. 297–313, 2006.

[28] M. van Genuchten and H. Koolen, "On the use of software cost models," Information and Management, vol. 21, no. 1, pp. 37–44, 1991.

[29] A. J. Albrecht and J. E. Gaffney, "Software function, source lines of code, and development effort prediction: a software science validation," IEEE Transactions on Software Engineering, vol. 9, no. 6, pp. 639–648, 1983.

[30] I. Attarzadeh and S. H. Ow, "A novel algorithmic cost estimation model based on soft computing technique," Journal of Computer Science, vol. 6, no. 2, pp. 117–125, 2010.

[31] F. J. Heemstra, Software Cost Estimation Models, University of Technology, Department of Industrial Engineering, IEEE, 1990.

[32] M. Jørgensen, B. Boehm, and S. Rifkin, "Software development effort estimation: formal models or expert judgment?" IEEE Software, vol. 26, no. 2, pp. 14–19, 2009.

[33] Y. F. Li, M. Xie, and T. N. Goh, "A study of genetic algorithm for project selection for analogy based software cost estimation," in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM '07), pp. 1256–1260, Singapore, December 2007.

[34] H. Liu and L. Yu, "Toward integrating feature selection algorithms for classification and clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.

[35] A. Kumar, S. Tiwari, K. K. Mishra, and A. K. Misra, "Generation of efficient test data using path selection strategy with elitist GA in regression testing," in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT '10), vol. 9, pp. 389–393, Chengdu, China, July 2010.

[36] K. K. Mishra, S. Tiwari, A. Kumar, and A. K. Misra, "An approach for mutation testing using elitist genetic algorithm," in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT '10), vol. 5, pp. 426–429, Chengdu, China, July 2010.

[37] S. Sarmady, An Investigation on Genetic Algorithm Parameters, P-COM000507(R), P-COM008807, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia, 2007.

[38] K. F. Man, K. S. Tang, and S. Kwong, Genetic Algorithms: Concepts and Designs, Chapter 1–10, Springer, New York, NY, USA, 2001.

[39] L. C. Briand, K. El-Emam, and I. Wieczorek, "Explaining the cost of European space and military projects," in Proceedings of the International Conference on Software Engineering (ICSE '99), pp. 303–312, ACM Press, May 1999.

[40] L. C. Briand, T. Langley, and I. Wieczorek, "Replicated assessment and comparison of common software cost modeling techniques," in Proceedings of the International Conference on Software Engineering (ICSE '22), pp. 377–386, ACM Press, June 2000.


the software. COCOMO has been derived from a database of 63 projects executed between 1964 and 1979 by the American company TRW Systems Inc. The projects considered from this era differed strongly in application type, size, and programming language [31].

Boehm introduced three levels of the estimation model: basic, intermediate, and detailed.

(i) The basic COCOMO 81 is a single-valued, static model that provides an approximate estimate of software development effort and cost as a function of program size, expressed in thousands of delivered source instructions (KDSI).

(ii) The intermediate COCOMO 81 describes software development effort as a function of program size in LOC and a set of fifteen "effort multipliers," known as cost drivers. These cost drivers incorporate subjective assessments of product, project, personnel, and hardware attributes.

(iii) The advanced or detailed COCOMO 81 reduces the margin of error in the final estimate by combining all characteristics of the intermediate version with an assessment of the cost drivers' impact on each step, that is, analysis and design, of the software engineering process.

COCOMO assumes that the effort grows more than linearly with software size. For some cost drivers, a higher rating decreases the effort; for others, a lower rating decreases the effort. The effort is expressed as person-months $= a \times (\text{KSLOC})^{b} \times c$. Here, $a$ and $b$ are domain-specific parameters, KSLOC denotes kilo source lines of code, which is estimated directly or computed from a function point analysis, and $c$ is the product of fifteen effort multipliers $\text{EM}_i$, $i = 1$ to $15$.

So the equation can be written as

$$\text{Person-months} = a \times (\text{KSLOC})^{b} \times (\text{EM}_1 \times \text{EM}_2 \times \cdots \times \text{EM}_{15}), \quad \text{see [7, 25]}. \tag{2}$$
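Equation (2) translates directly into code. The sketch below is illustrative: the function name is hypothetical, and the constants $a = 3.2$ and $b = 1.05$ are the commonly published intermediate COCOMO 81 values for organic-mode projects, used here as an assumption rather than as values taken from this paper.

```python
from math import prod


def cocomo_effort(ksloc, a, b, effort_multipliers):
    """Person-months = a * KSLOC^b * (EM_1 * ... * EM_15), as in equation (2)."""
    eaf = prod(effort_multipliers)  # effort adjustment factor
    return a * ksloc ** b * eaf


# Hypothetical 10 KSLOC project: 13 nominal drivers (1.0), plus a multiplier
# of 0.86 (e.g. high analyst capability) and 1.15 (e.g. high complexity).
multipliers = [0.86, 1.15] + [1.0] * 13

nominal = cocomo_effort(10, a=3.2, b=1.05, effort_multipliers=[1.0] * 15)
adjusted = cocomo_effort(10, a=3.2, b=1.05, effort_multipliers=multipliers)
print(f"nominal effort:  {nominal:.2f} person-months")
print(f"adjusted effort: {adjusted:.2f} person-months")
```

Because the multipliers enter as a product, a value below 1 reduces the estimate and a value above 1 increases it, independently of project size.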

3. Solution to the Problem

3.1. Nonalgorithmic Models. In contrast to the algorithmic models, the nonalgorithmic models proposed since the 1990s are based on computational intelligence, analytical comparisons, and inferences for project cost estimation. They can model the complex set of relationships between the dependent variables (cost, effort) and the independent variables (cost drivers) collected early in the project lifecycle, and they can learn from historical project data. Using nonalgorithmic models requires information about previous projects that are similar to the project under estimation. Usually, in these methods, estimation is carried out by analyzing the historical datasets. Many software researchers have shown interest in new nonalgorithmic approaches based on soft computing, that is, artificial neural networks, fuzzy logic, and evolutionary algorithms. These methods are assessed because of their popularity, and a large number of papers about their use have been published in recent years [26, 32–34]. Choosing a suitable technique is difficult and requires the support of a well-defined evaluation scheme to rank each evolutionary computation technique as and when it is applied to an optimization problem. In the present study, an effective model based on evolutionary computation is proposed to overcome the problem of uncertainty and to obtain better results.

3.2. Genetic Algorithms. Evolutionary computational methods are widely used in software engineering tasks such as test case generation [35, 36], effort estimation, cost estimation, and many more. Genetic algorithms are a simple and almost generic evolutionary computational method, inspired by Darwin's theory of natural evolution, for solving complex optimization problems. A genetic algorithm requires a careful and suitable choice of parent selection method, mutation method, population size, and so forth to find good solutions. If improper parameters and methods are chosen, program runs may take longer or even yield bad optimization results [37]. In nature, competition among individuals for scanty resources results in the fittest individuals dominating the weakest ones [24, 38].

3.2.1. Working Principle

(i) The genetic algorithm starts with a randomly generated initial population, a set of solutions represented by chromosomes.

(ii) The algorithm then generates a sequence of new populations. At each iteration, the algorithm uses the individuals of the current generation to create the next generation. To create the new population, the algorithm performs the following steps.
    (a) Score each individual member of the current population by computing its fitness value.
    (b) Scale the raw fitness scores to convert them into a more desirable range of values.
    (c) Select the good individuals, called parents, based on the value of the fitness function.
    (d) A few of the individuals in the current population that have the lowest fitness values (the best ones when the fitness is an error to be minimized) are selected as elite. These individuals are passed directly to the next generation (elitism).
    (e) Produce offspring from the parents. Offspring are produced either by mutating a single parent or by combining the chromosomes of a pair of parents with the crossover operator.
    (f) Update the current population with the offspring to form the new generation.

(iii) The algorithm terminates as soon as any one of the stopping criteria is reached, that is, the number of generations or the desired fitness value.
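The working principle above can be sketched as a small binary genetic algorithm that minimizes a fitness function of one real variable. This is an illustrative sketch, not the authors' implementation: the function names, the 12-bit encoding, the population size, and the default probabilities are all assumptions. Step (b), fitness scaling, is omitted because tournament selection depends only on the ordering of the scores.

```python
import random


def decode(bits, lo, hi):
    """Map a list of 0/1 genes to a real number in [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)


def tournament(population, scores, rng, k=2):
    """Return the best of k randomly drawn individuals (lower score wins)."""
    picks = rng.sample(range(len(population)), k)
    return population[min(picks, key=lambda i: scores[i])]


def run_ga(fitness, lo, hi, n_bits=12, pop_size=20, generations=50,
           pc=0.8, pm=0.05, n_elite=2, target=1e-6, seed=0):
    rng = random.Random(seed)
    # (i) random initial population of chromosomes
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # (a) score every individual
        scores = [fitness(decode(ind, lo, hi)) for ind in population]
        ranked = sorted(range(pop_size), key=lambda i: scores[i])
        # (iii) stop early once the desired fitness is reached
        if scores[ranked[0]] <= target:
            break
        # (d) elitism: copy the best individuals unchanged
        next_gen = [population[i][:] for i in ranked[:n_elite]]
        while len(next_gen) < pop_size:
            # (c) parent selection
            p1 = tournament(population, scores, rng)
            p2 = tournament(population, scores, rng)
            child = p1[:]
            # (e) single-point crossover with probability pc
            if rng.random() < pc:
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            # (e) bit-flip mutation with per-bit probability pm
            child = [1 - b if rng.random() < pm else b for b in child]
            next_gen.append(child)
        # (f) the offspring and the elite form the new generation
        population = next_gen
    best = min(population, key=lambda ind: fitness(decode(ind, lo, hi)))
    return decode(best, lo, hi)


# Toy objective with its minimum at x = 1.0, searched on [0.5, 1.5].
result = run_ga(lambda x: (x - 1.0) ** 2, lo=0.5, hi=1.5)
print(result)
```

Because the elite individuals are copied unchanged, the best fitness in the population never gets worse from one generation to the next.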


Sub-Model 1
Step 1. Generate the MMRE (M) for the available N projects using actual and COCOMO estimated efforts.
    (i) [BEGIN]
    (ii) Input the 15 cost drivers, KLOC, and actual effort for the NASA projects.
    (iii) [LOOP]
        for j = 1 to number of projects (say n)
            EAF[j] = D1 * D2 * ... * D15
            Estimated Effort[j] = a[j] * (kloc[j] ^ b[j]) * EAF[j]
            MRE[j] = |Actual Effort[j] - Estimated Effort[j]| / Actual Effort[j]
            MMRE(original) += MRE[j]
    (iv) [END OF LOOP]
        MMRE(original) /= n    [The original MMRE is obtained and noted down]
    (v) [END]

Sub-Model 2
Step 2. for I = 1 to 15
            temp = EM_I
            Set EM_I = 1
            Calculate influenced MMRE (MN)
            List[I, 1] = I
            List[I, 2] = |MN - M|
            EM_I = temp
        end for

Sub-Model 3
Step 3. Sort the list according to the second parameter in descending order.
        for I = 1 to 14
            for j = I + 1 to 15
                if (List[I, 2] < List[j, 2]) then
                    swap (List[I], List[j])
                end if
            end for
        end for
Step 4. Sig = List[I, 1] represents the order of significance occurrences.

Sub-Model 4
Step 5. for I = 1 to 15
            for j = very low to extra high (six ratings of the cost driver)
                Select projects (P) as an input for calculating the fitness value using the fitness function F1 = MMRE(P)
                Set the range R as [Rmin, Rmax]
                Generate the initial population for the cost driver within range R
                Perform the genetic operations for K generations:
                    (1) Tournament selection
                    (2) Crossover with Pc = 0.8
                    (3) Mutation with Pm = 0.3
                Select the individual (CDNEW) with the best MMRE
Step 6.         Calculate the MMRE (Mmod) by replacing CDij with CDNEW
                if (Mmod < M) then
                    update the value of CDij and M
                else
                    discard the value
                end if
            end for
        end for

Algorithm 1

4. Proposed Algorithm for Solving the Problem

Algorithm Description (see Algorithm 1). The proposed algorithm is divided into 4 submodels. In submodel 1, we calculate the mean magnitude of relative error (MMRE) of all projects according to the results obtained by COCOMO. Here, we first calculate the estimated efforts by considering the 15 COCOMO cost drivers, project modes, and kilo lines of code (KLOC). The estimated efforts, along with the actual efforts of the various projects, are used as the input parameters to calculate the magnitude of relative error (MRE) of each project. The MMRE for the COCOMO results is recorded as the original MMRE.

Table 1: COCOMO cost drivers.

Cost driver  Very low  Low   Nominal  High  Very high  Extra high
acap         1.46      1.19  1        0.86  0.71       -
pcap         1.42      1.17  1        0.86  0.7        -
aexp         1.29      1.13  1        0.91  0.82       -
modp         1.24      1.1   1        0.91  0.82       -
tool         1.24      1.1   1        0.91  0.83       -
vexp         1.21      1.1   1        0.9   -          -
lexp         1.14      1.07  1        0.95  -          -
sced         1.23      1.08  1        1.04  1.1        -
stor         -         -     1        1.06  1.21       1.56
data         -         0.94  1        1.08  1.16       -
time         -         -     1        1.11  1.3        1.66
turn         -         0.87  1        1.07  1.15       -
virt         -         0.87  1        1.15  1.3        -
cplx         0.7       0.85  1        1.15  1.3        1.65
rely         0.75      0.88  1        1.15  1.4        -

In submodel 2, the influenced MMRE is calculated on the basis of the occurrences of the 15 cost drivers. This influenced MMRE shows the effectiveness of each cost driver in the sequence of development of efforts in terms of person-months. In this process, we take sample data having 18 input parameters, that is, 15 cost drivers, mode, source lines, and actual effort. The estimated efforts are calculated for the sample data by nullifying the effect of the cost drivers one by one, that is, by setting one multiplier at a time to 1. These efforts are used to calculate the influenced MMRE for each cost driver against the actual effort provided in the sample data. The difference between the influenced MMRE and the original MMRE is recorded in a list along with the driver.

In submodel 3, the list with the difference between the influenced MMRE and the original MMRE is sorted in descending order of the difference to provide the significant occurrence of each driver. This order is named Sig.
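Submodels 2 and 3 can be sketched as follows. The code is an illustration under stated assumptions, not the authors' implementation: the function names and the project dictionary layout are hypothetical, only three drivers are used instead of fifteen, and the toy projects use $a = b = 1$ so that the numbers are easy to follow.

```python
from math import prod


def mmre(projects, skip=None):
    """MMRE of the estimates; the multiplier at index `skip` is forced to 1."""
    total = 0.0
    for p in projects:
        ems = [1.0 if i == skip else em for i, em in enumerate(p["ems"])]
        estimate = p["a"] * p["kloc"] ** p["b"] * prod(ems)
        total += abs(p["actual"] - estimate) / p["actual"]
    return total / len(projects)


def significance_order(projects, names):
    """Rank drivers by |influenced MMRE - original MMRE|, largest first."""
    original = mmre(projects)
    diffs = [(abs(mmre(projects, skip=i) - original), name)
             for i, name in enumerate(names)]
    diffs.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in diffs]


# Hypothetical sample data: multipliers are listed in the order of `names`.
names = ["acap", "cplx", "data"]
projects = [
    {"a": 1.0, "b": 1.0, "kloc": 10.0, "actual": 16.5, "ems": [1.5, 1.1, 1.0]},
    {"a": 1.0, "b": 1.0, "kloc": 20.0, "actual": 18.2, "ems": [0.7, 1.3, 1.0]},
]

sig = significance_order(projects, names)
print(sig)
```

In this toy data the `data` multiplier is 1 in every project, so nullifying it changes nothing and it is ranked last.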

In submodel 4, we try to minimize the MMRE by updating the values of the cost drivers with the help of a genetic algorithm, in the order of their significance. This is done by selecting the projects that fall under a particular rating of a cost driver and then applying the genetic algorithm operators. The results obtained are evaluated using MMRE as the fitness function. If the MMRE is reduced, the cost driver value for that particular rating is updated; otherwise, the new value is discarded. The reduced MMRE is recorded as Mmod and is used as the MMRE for the remaining cost drivers.
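The accept-or-discard loop of submodel 4 can be sketched as below. To keep the example short, a plain random search over a range around the current value stands in for the genetic algorithm; this substitution, the function names, the data layout, the search width, and the sample projects are all assumptions made for illustration. The constants $a = 3.2$ and $b = 1.05$ are the commonly published organic-mode values and are likewise assumed.

```python
import random
from math import prod


def mmre(projects, table):
    """MMRE of COCOMO-style estimates under the given multiplier table."""
    total = 0.0
    for p in projects:
        eaf = prod(table[d][r] for d, r in p["ratings"].items())
        estimate = p["a"] * p["kloc"] ** p["b"] * eaf
        total += abs(p["actual"] - estimate) / p["actual"]
    return total / len(projects)


def tune(projects, table, order, n_candidates=200, width=0.3, seed=0):
    """Tune multipliers in significance order; keep a value only if MMRE drops."""
    rng = random.Random(seed)
    table = {d: dict(r) for d, r in table.items()}  # work on a copy
    best = mmre(projects, table)                    # M
    for driver in order:
        for rating, old in list(table[driver].items()):
            # Projects that carry this rating of this driver (P).
            subset = [p for p in projects if p["ratings"].get(driver) == rating]
            if not subset:
                continue

            def subset_mmre(value):
                trial = {**table, driver: {**table[driver], rating: value}}
                return mmre(subset, trial)

            # Stand-in for the GA: best of random candidates in [Rmin, Rmax].
            candidates = [rng.uniform(old - width, old + width)
                          for _ in range(n_candidates)]
            table[driver][rating] = min(candidates, key=subset_mmre)  # CDNEW
            m_mod = mmre(projects, table)                             # Mmod
            if m_mod < best:
                best = m_mod                   # accept and update M
            else:
                table[driver][rating] = old    # discard the new value
    return table, best


table = {"acap": {"nominal": 1.0, "high": 0.86},
         "cplx": {"nominal": 1.0, "high": 1.15}}
projects = [
    {"a": 3.2, "b": 1.05, "kloc": 10.0, "actual": 40.0,
     "ratings": {"acap": "high", "cplx": "nominal"}},
    {"a": 3.2, "b": 1.05, "kloc": 25.0, "actual": 120.0,
     "ratings": {"acap": "nominal", "cplx": "high"}},
    {"a": 3.2, "b": 1.05, "kloc": 50.0, "actual": 190.0,
     "ratings": {"acap": "high", "cplx": "high"}},
]

tuned, tuned_mmre = tune(projects, table, order=["acap", "cplx"])
print(mmre(projects, table), tuned_mmre)
```

Because a candidate is kept only when the overall MMRE decreases, the tuned MMRE can never exceed the original one, whatever optimizer proposes the candidates.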

5 Evaluation Method

51 Conceptual View Software cost estimation models needto be quantitatively evaluated in terms of estimation accuracyto improve themodeling process Some rules or themeasure-ments must be provided for model assessment purpose Thismeasurement of accuracy defines how close the estimatedresult is with its actual value Software cost estimates play

significant role in delivering software projects As a resultresearchers have proposed the most widely used evaluationcriterion to assess the performance of software predic-tion models that is the mean magnitude of relative error(MMRE) to evaluate the opulence of prediction systemsMMRE is usually computed by following standard evaluationprocesses such as cross-validation [39] It is independent ofsize scale and effort units Comparisons can be made acrossdata sets and prediction model types [40]

COCOMO computes effort on the basis of source linesof codes In intermediate COCOMO Boehm used 15 morepredictor variables called cost drivers which are requiredto calibrate the nominal effort of a project to the actualproject environment The values are set to each cost driveraccording to the properties of the specific software projectThese numerical values of 15 cost drivers are multiplied to getthe effort adjustment factor that is EAF

52 Data Analysis Performance of estimation methods isusually evaluated by several ratio measurements of accuracymetrics including RE (relative error) MRE (magnitude ofrelative error) and MMRE (mean magnitude of relativeerror) which are computed as follows

RE = Estimated efforts minus Actual effortsActual Efforts

MRE = Estimated Efforts sim Actual EffortsActual Efforts

MMRE = sum MRE119873

(3)

Another parameter used in evaluation of performance ofestimationmethod is PRED (percentage of prediction) whichis determined as

PRED (119883) = 119860119873

(4)

6 Advances in Software Engineering

Table 2 Significant occurrences of cost drivers

1 acap2 pcap3 aexp4 rely5 Virt6 vexp7 time8 modp9 cplx10 data11 tool12 sced13 lexp14 turn15 stor

where 119860 is the number of projects with MRE less than orequal to level 119883 and119873 is the number of considered projectsUsually the acceptable level of 119883 is 025 and the variousmethods are compared based on this level Decreasing ofMMRE and increasing of PRED are the main aim of allestimation techniques

6. Results and Discussion

6.1. Dataset Description. Experiments were performed on the COCOMO 81 based dataset of 63 projects used by NASA, and various other calculations were performed on it. A further 93 NASA projects from different centers, covering the years 1971 to 1987, were collected by Jairus Hihn, JPL, NASA, Manager of the SQIP Measurement and Benchmarking Element. The proposed model is validated on these datasets, which are among the most analyzed datasets. The independent variable used is “adjusted delivered source instructions,” which takes into account the variation of effort when adapting software. COCOMO is built upon these data points by introducing many factors in the form of multipliers.

Together, these datasets include 156 historical projects with 17 effort drivers and one dependent variable, the software development effort.

6.2. Result Analysis. Cost drivers play a vital role in the estimation of the effort and cost to be incurred. They represent characteristics of software development that influence the effort of carrying out a certain project. Cost drivers are selected on the argument that they have a linear effect on effort. The COCOMO cost drivers are the basis for the analysis of the proposed algorithm. Table 1 lists the COCOMO effort multipliers.

The significance of the 15 cost drivers can be shown by their impact on the MMRE of efforts on the original 63 NASA datasets. The significant occurrences of the 15 cost drivers are calculated by applying steps 1 to 4, and the result is shown in Table 2.

Figure 1: Relationship between MMRE and cost drivers (x-axis: cost drivers 1 to 15; y-axis: MMRE; series: MMRE (COCOMO) and significant occurrences).

Figure 2: Comparison of MRE for NASA 63 projects (x-axis: software projects; y-axis: MRE; series: MRE COCOMO and MRE with tuned parameters).

The occurrence of each cost driver is linearly related to the MMRE calculated between the actual efforts and the efforts estimated with COCOMO. In Figure 1, each cost driver is plotted against the MMRE, which is constant for all cases. The effect of each cost driver on the MMRE is the significant aspect in deriving the occurrence of cost drivers. The proportionate relationship can be seen in Figure 1: cost drivers with a higher influenced MMRE relative to the constant value of MMRE are the most significant, and those with lower values are less significant.

Once the significant occurrences of the cost drivers are found, the sequence of cost drivers is used to produce tuned values for the different ratings of the various cost drivers. Steps 5 and 6 are used to generate the new values of the available cost drivers. Table 3 lists the tuned values of the preexisting cost drivers.

The proposed algorithm is validated with two different datasets of NASA projects. According to the evaluation criteria, the proposed method shows only a marginal difference from the actual project efforts in comparison to the COCOMO generated efforts shown in Figure 2. Most of the results are kept near the mean of MRE for the 63 data values. The other 93 datasets were also used to evaluate the projects with the proposed method (Figures 3 and 4).

Table 3: Proposed algorithm based cost drivers.

Cost driver  Very low  Low   Nominal  High  Very high  Extra high
acap         1.46      1.19  0.90     0.86  0.71       -
pcap         1.42      0.90  1.00     0.86  0.70       -
aexp         1.29      1.40  1.00     0.91  0.82       -
modp         1.38      0.92  1.00     0.91  0.82       -
tool         1.24      1.10  0.99     0.93  0.83       -
vexp         1.38      1.03  1.00     0.90  -          -
lexp         1.14      1.08  0.90     0.95  -          -
sced         1.23      1.08  0.99     1.04  1.10       -
stor         -         -     1.00     1.06  1.19       1.38
data         -         1.03  0.90     1.06  1.38       -
time         -         -     0.90     1.11  1.30       1.66
turn         -         0.97  0.92     1.03  0.90       -
virt         -         0.87  1.00     1.15  1.30       -
cplx         0.70      0.85  1.11     1.15  1.16       1.65
rely         0.75      0.88  1.00     1.25  1.40       -

Figure 3: Comparison of MRE for NASA 93 projects (x-axis: NASA projects; y-axis: MRE; series: MRE proposed model and MRE COCOMO).

A comparison between the proposed method and COCOMO by MMRE is given in Table 4. The proposed method has an average error of 0.27 with respect to the actual efforts, whereas COCOMO produces a higher error of 0.36. The proposed model also works efficiently for the other 93 datasets.

Essentially, we want to measure the useful functionality produced per time unit. Productivity is another measurement of the effectiveness of the model. It is a measure of the rate or ratio at which individual software developers involved in software development produce software and associated documentation.

Higher productivity reflects better quality achievement in project development. The proposed method has a productivity of 0.29, which is close to the productivity of 0.27 obtained with the actual efforts. The productivity of the proposed method is 7 percent higher, and that of COCOMO is 9 percent lower, than the actual productivity (Table 5). So the difference between the proposed method and COCOMO results is approximately 17.95%.

Figure 4: Comparison of productivity for NASA 63 projects (x-axis: software projects; y-axis: productivity; series: productivity with actual efforts, with COCOMO efforts, and with the proposed method).

Table 4: The MMRE for two different methods.

Dataset      COCOMO versus actual  Proposed method versus actual
63 datasets  0.36                  0.27
93 datasets  0.59                  0.56

Table 5: The productivity of various approaches.

                        Productivity (COCOMO)  Productivity (proposed method)  Productivity (actual)
Value                   0.25                   0.30                            0.28
Difference from actual  0.03                   0.02                            -

Table 6: MMRE of NASA 63 projects for various project modes.

Project mode  No. of projects (63)  MMRE for proposed method  MMRE for COCOMO
Embedded      27                    0.29                      0.39
Organic       25                    0.28                      0.37
Semidetached  11                    0.22                      0.23

Table 7: MMRE of NASA 93 projects for various project modes.

Project mode  No. of projects (93)  MMRE for proposed method  MMRE for COCOMO
Embedded      21                    0.72                      0.82
Organic       3                     0.80                      0.88
Semidetached  69                    0.51                      0.51

Tables 6 and 7 show the error in all three categories of project modes for the two datasets. The comparison was made between the results generated by the proposed model and the COCOMO results. We also evaluate the different types of project application categorically; 80% of the total datasets produce results that are better than the COCOMO based results (Table 8).

PRED was calculated with the two separate approaches, and Table 9 shows that, for three different PRED levels, the proposed method produces increases in PRED of approximately 6.665%, 8.01%, and 8.34%, respectively.
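The increases quoted above appear to be relative gains of the proposed method over COCOMO at each PRED level. A short sketch, assuming that interpretation and using the percentages reported in Table 9, reproduces them to within rounding (the helper name is hypothetical):

```python
# PRED percentages for the 63 NASA projects, as reported in Table 9.
pred_cocomo = {10: 23.81, 20: 39.68, 30: 57.14}
pred_proposed = {10: 25.40, 20: 42.86, 30: 61.91}


def relative_increase(old: float, new: float) -> float:
    """Relative increase of `new` over `old`, expressed in percent."""
    return (new - old) / old * 100.0


gains = {
    level: relative_increase(pred_cocomo[level], pred_proposed[level])
    for level in pred_cocomo
}

for level, gain in gains.items():
    print(f"PRED level {level}: {gain:.2f}% increase")
```

Small differences in the last digit from the figures in the text are expected, since Table 9 lists rounded percentages.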

Table 8: Description of projects on application basis.

Type of application    No. of projects  MMRE COCOMO  MMRE proposed method
Application ground     2                0.28         0.25
Avionics               11               0.95         0.80
Avionics monitoring    30               0.66         0.55
Batch data processing  2                0.08         0.12
Communications         1                0.18         0.05
Data capture           3                0.09         0.07
Launch processing      1                0.32         0.46
Mission planning       20               0.38         0.34
Monitor control        8                0.20         0.50
Operating system       4                3.82         3.63
Real data processing   3                0.12         0.06
Science                2                0.18         0.41
Simulation             4                0.17         0.29
Utility                2                0.12         0.31

Table 9: PRED calculation at different levels for both models.

                                COCOMO               Proposed method
PRED level                      10     20     30     10     20     30
Percentage of 63 NASA datasets  23.81  39.68  57.14  25.40  42.86  61.91

7. Conclusion

The work carried out in this paper explores the interrelationship among different dimensions of data-driven software projects, namely, project size and effort. The results above indicate that applying the proposed method to software effort estimation is a feasible approach for addressing the uncertainty and ambiguity existing in software effort drivers. The order of occurrence of the various cost drivers has a significant impact on overall efforts in project estimation. Small adjustments to the COCOMO cost drivers bring significant improvements in the quality criteria applied to the proposed approach. The proposed method produces tuned values of the cost drivers, which are effective enough to improve the productivity of the projects. Prediction at different levels of MRE for each project reflects the percentage of projects with the desired accuracy. Furthermore, the model is validated on two different datasets, on which it shows better estimation accuracy than COCOMO 81 for the NASA 63 and NASA 93 datasets. The use of the proposed algorithm for other applications in the software engineering field can also be explored in the future.

Conflict of Interests

The authors certify that there is no actual or potential conflict of interests in relation to this paper. The American company TRW Systems Inc. has been referred to as the company where Barry W. Boehm, the developer of COCOMO, worked.

References

[1] K. M. Furulund and K. Moløkken-Østvold, “Increasing software effort estimation accuracy - using experience data, estimation models and checklists,” in Proceedings of the 7th International Conference on Quality Software (QSIC ’07), pp. 342–347, Portland, OR, USA, October 2007.

[2] Q. Alam, P. Bhatia, and S. Sarwar, Systematic Review of Effort Estimation and Cost Estimation, Institute of Management Studies, Roorkee, India, 2012.

[3] J. J. Dolado, On the Problem of the Software Cost Function, Facultad de Informatica, Universidad del Pais Vasco-Euskal Herriko Unibertsitatea, Gipuzkoa, Spain, 2000.

[4] K. Molokken and M. Jorgensen, “A review of software surveys on software effort estimation,” in Proceedings of the International Symposium on Empirical Software Engineering (ISESE ’03), pp. 220–230, 2003.

[5] F. Ferrucci, C. Gravino, R. Oliveto, and F. Sarro, “Genetic programming for effort estimation: an analysis of the impact of different fitness functions,” in Proceedings of the 2nd International Symposium on Search Based Software Engineering (SSBSE ’10), pp. 89–98, IEEE Computer Society, DMI, University of Salerno, Benevento, Italy, October 2010.

[6] A. F. Sheta, “Estimation of the COCOMO model parameters using genetic algorithms for NASA software projects,” Journal of Computer Science, vol. 2, no. 2, pp. 118–123, 2006.

[7] B. W. Boehm, Software Engineering Economics, Prentice Hall, IEEE, 1984.

[8] J. Magne and M. Shepperd, “A systematic review of software development cost estimation studies,” IEEE Transactions on Software Engineering, vol. 33, no. 1, pp. 33–53, 2007.

[9] P. L. Braga, A. L. I. Oliveira, and S. R. L. Meira, “A GA-based feature selection and parameters optimization for support vector regression applied to software effort estimation,” in Proceedings of the 23rd Annual ACM Symposium on Applied Computing (SAC ’08), pp. 1788–1792, Ceara, Brazil, March 2008.

[10] M. Harman and B. F. Jones, “Search-based software engineering,” Information and Software Technology, vol. 43, no. 14, pp. 833–839, 2001.

[11] J. Clarke, J. J. Dolado, M. Harman et al., “Reformulating software engineering as a search problem,” IEE Proceedings: Software, vol. 150, no. 3, pp. 161–175, 2003.

[12] M. Jørgensen and S. Grimstad, “Avoiding irrelevant and misleading information when estimating development effort,” IEEE Software, vol. 25, no. 3, pp. 78–83, 2008.

[13] A. L. Lederer and J. Prasad, “A causal model for software cost estimating error,” IEEE Transactions on Software Engineering, vol. 24, no. 2, pp. 137–148, 1998.

[14] S. Basha and P. Dhavachelvan, “Analysis of empirical software effort estimation models,” International Journal of Computer Science and Information Security, vol. 7, no. 3, pp. 68–77, 2010.

[15] B. L. Barber, Investigative search of quality historical software support cost data and software support cost-related data [M.S. thesis], 1991.

[16] N. H. Chiu and S. J. Huang, “The adjusted analogy-based software effort estimation based on similarity distances,” Journal of Systems and Software, vol. 80, no. 4, pp. 628–640, 2007.

[17] G. Kadoda and M. Shepperd, “Using simulation to evaluate prediction techniques,” in Proceedings of the 7th International Software Metrics Symposium (METRICS ’01), pp. 349–359, IEEE Press, London, UK, 2001.

[18] M. J. Shepperd and G. F. Kadoda, “Comparing software prediction techniques using simulation,” IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 1014–1022, 2001.

[19] M. J. Shepperd and C. Schofield, “Estimating software project effort using analogies,” IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 736–743, 1997.

[20] M. Jørgensen and D. I. K. Sjøberg, “The impact of customer expectation on software development effort estimates,” International Journal of Project Management, vol. 22, no. 4, pp. 317–325, 2004.

[21] J. Kaczmarek and M. Kucharski, “Size and effort estimation for applications written in Java,” Information and Software Technology, vol. 46, no. 9, pp. 589–601, 2004.

[22] R. Jeffery, M. Ruhe, and I. Wieczorek, “Using public domain metrics to estimate software development effort,” in Proceedings of the 7th International Software Metrics Symposium (METRICS ’01), pp. 16–27, IEEE Computer Society, Washington, DC, USA, April 2001.

[23] G. H. Subramanian, P. C. Pendharkar, and M. Wallace, “An empirical study of the effect of complexity, platform, and program type on software development effort of business applications,” Empirical Software Engineering, vol. 11, no. 4, pp. 541–553, 2006.

[24] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, chapter 1–8, Addison-Wesley, New York, NY, USA, 1989.

[25] A. Heiat, “Comparison of artificial neural network and regression models for estimating software development effort,” Information and Software Technology, vol. 44, no. 15, pp. 911–922, 2002.

[26] K. Srinivasan and D. Fisher, “Machine learning approaches to estimating software development effort,” IEEE Transactions on Software Engineering, vol. 21, no. 2, pp. 126–137, 1995.

[27] S. J. Huang, C. Y. Lin, and N. H. Chiu, “Fuzzy decision tree approach for embedding risk assessment information into software cost estimation model,” Journal of Information Science and Engineering, vol. 22, no. 2, pp. 297–313, 2006.

[28] M. van Genuchten and H. Koolen, “On the use of software cost models,” Information and Management, vol. 21, no. 1, pp. 37–44, 1991.

[29] A. J. Albrecht and J. E. Gaffney, “Software function, source lines of code, and development effort prediction: a software science validation,” IEEE Transactions on Software Engineering, vol. 9, no. 6, pp. 639–648, 1983.

[30] I. Attarzadeh and S. H. Ow, “A novel algorithmic cost estimation model based on soft computing technique,” Journal of Computer Science, vol. 6, no. 2, pp. 117–125, 2010.

[31] F. J. Heemstra, Software Cost Estimation Models, University of Technology, Department of Industrial Engineering, IEEE, 1990.

[32] M. Jørgensen, B. Boehm, and S. Rifkin, “Software development effort estimation: formal models or expert judgment?” IEEE Software, vol. 26, no. 2, pp. 14–19, 2009.

[33] Y. F. Li, M. Xie, and T. N. Goh, “A study of genetic algorithm for project selection for analogy based software cost estimation,” in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM ’07), pp. 1256–1260, Singapore, December 2007.

[34] H. Liu and L. Yu, “Toward integrating feature selection algorithms for classification and clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.

[35] A. Kumar, S. Tiwari, K. K. Mishra, and A. K. Misra, “Generation of efficient test data using path selection strategy with elitist GA in regression testing,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 9, pp. 389–393, Chengdu, China, July 2010.

[36] K. K. Mishra, S. Tiwari, A. Kumar, and A. K. Misra, “An approach for mutation testing using elitist genetic algorithm,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 5, pp. 426–429, Chengdu, China, July 2010.

[37] S. Sarmady, An Investigation on Genetic Algorithm Parameters, P-COM000507(R), P-COM008807, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia, 2007.

[38] K. F. Man, K. S. Tang, and S. Kwong, Genetic Algorithms: Concepts and Designs, Chapter 1–10, Springer, New York, NY, USA, 2001.

[39] L. C. Briand, K. El-Emam, and I. Wieczorek, “Explaining the cost of European space and military projects,” in Proceedings of the International Conference on Software Engineering (ICSE ’99), pp. 303–312, ACM Press, May 1999.

[40] L. C. Briand, T. Langley, and I. Wieczorek, “Replicated assessment and comparison of common software cost modeling techniques,” in Proceedings of the International Conference on Software Engineering (ICSE ’22), pp. 377–386, ACM Press, June 2000.


Sub-Model 1
Step 1. Generate the MMRE (M) for the available N projects using actual and COCOMO estimated efforts.
    (i) [BEGIN]
    (ii) Input the 15 cost drivers, KLOC, and Actual Effort for NASA projects.
    (iii) [LOOP]
        for j = 1 to number of projects (say n)
            EAF[j] = D1 * D2 * ... * D15
            Estimated Effort[j] = a[j] * (kloc[j] ^ b[j]) * EAF[j]
            MRE[j] = |Actual Effort[j] - Estimated Effort[j]| / Actual Effort[j]
            MMRE(original) += MRE[j]
        MMRE(original) = MMRE(original) / n    [The original MMRE is obtained and noted down]
    (iv) [END OF LOOP]
    (v) [END]

Sub-Model 2
Step 2. for I = 1 to 15
            temp = Em_I
            Set Em_I = 1
            Calculate influenced MMRE (MN)
            List[I, 1] = I
            List[I, 2] = |MN - M|
            Em_I = temp
        end for

Sub-Model 3
Step 3. Sort the list according to the second parameter in descending order.
        for I = 1 to 14
            for j = I + 1 to 15
                if (list[I, 2] < list[j, 2]) then
                    swap(list[I], list[j])
                end if
            end for
        end for
Step 4. Sig[I] = list[I, 1] represents the order of significance occurrences.

Sub-Model 4
Step 5. for I = 1 to 15
            for j = very low to extra high (six ratings of cost driver)
                Select projects (P) as an input for calculating the fitness value using fitness function F1 = MMRE(P)
                Set the range R as [Rmin, Rmax]
                Generate the initial population for the cost driver within range R
                Perform the genetic operations for K generations:
                    (1) Tournament selection
                    (2) Crossover with Pc = 0.8
                    (3) Mutation with Pm = 0.3
                Select the individual (CDNEW) with the best MMRE
Step 6.         Calculate the MMRE (Mmod) by replacing CDij with CDNEW
                if (Mmod < M) then
                    update the value of CDij and M
                else
                    discard the value
                end if
            end for
        end for

Algorithm 1

4. Proposed Algorithm for Solving the Problem

Algorithm Description (see Algorithm 1). The proposed algorithm is divided into four submodels. In submodel 1, we calculate the mean magnitude of relative error (MMRE) of all projects according to the results obtained by COCOMO. Here, we first calculate the estimated efforts by considering the 15 COCOMO cost drivers, project modes, and kilo lines of code (KLOC). The estimated efforts, along with the actual efforts of the various projects, are used as the input parameters to calculate the magnitude of relative error (MRE) of each project. The MMRE for the COCOMO results is recorded as the original MMRE.

Table 1: COCOMO cost drivers.

Cost driver  Very low  Low   Nominal  High  Very high  Extra high
acap         1.46      1.19  1.00     0.86  0.71       -
pcap         1.42      1.17  1.00     0.86  0.70       -
aexp         1.29      1.13  1.00     0.91  0.82       -
modp         1.24      1.10  1.00     0.91  0.82       -
tool         1.24      1.10  1.00     0.91  0.83       -
vexp         1.21      1.10  1.00     0.90  -          -
lexp         1.14      1.07  1.00     0.95  -          -
sced         1.23      1.08  1.00     1.04  1.10       -
stor         -         -     1.00     1.06  1.21       1.56
data         -         0.94  1.00     1.08  1.16       -
time         -         -     1.00     1.11  1.30       1.66
turn         -         0.87  1.00     1.07  1.15       -
virt         -         0.87  1.00     1.15  1.30       -
cplx         0.70      0.85  1.00     1.15  1.30       1.65
rely         0.75      0.88  1.00     1.15  1.40       -

In submodel 2, the influenced MMRE is calculated on the basis of the occurrences of the 15 cost drivers. This influenced MMRE shows the effect of each cost driver on the efforts, measured in person-months. In this process, we take sample data having 18 input parameters, that is, 15 cost drivers, mode, source lines, and actual effort. The estimated efforts are calculated for the sample data by nullifying the effect of the cost drivers one at a time, that is, by setting the corresponding multiplier to 1. These efforts are used to calculate the influenced MMRE for each cost driver with respect to the actual effort provided in the sample data. The difference between the influenced MMRE and the original MMRE is recorded in a list along with the driver.

In submodel 3, the list of differences between the influenced MMRE and the original MMRE is sorted in descending order of the difference to give the significant occurrence of each driver. This order is named Sig.
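Submodels 2 and 3 can be sketched as follows. This is an illustrative reading of the steps, not the authors' implementation: the project records, the coefficients a and b, and the function names are all hypothetical, and Python's built-in sort stands in for the pairwise swap loop of Step 3.

```python
from math import prod


def mmre(actual, estimated):
    """Mean magnitude of relative error."""
    return sum(abs(e - a) / a for a, e in zip(actual, estimated)) / len(actual)


def estimate(project, skip=None):
    """a * KLOC^b * EAF, optionally treating one driver as nominal (1.0)."""
    eaf = prod(1.0 if name == skip else value
               for name, value in project["drivers"].items())
    return project["a"] * project["kloc"] ** project["b"] * eaf


def rank_drivers(projects, driver_names):
    """Return (driver, |influenced MMRE - original MMRE|), largest first."""
    actual = [p["actual"] for p in projects]
    original = mmre(actual, [estimate(p) for p in projects])
    diffs = []
    for name in driver_names:
        influenced = mmre(actual, [estimate(p, skip=name) for p in projects])
        diffs.append((name, abs(influenced - original)))
    return sorted(diffs, key=lambda item: item[1], reverse=True)


# Hypothetical projects with only three drivers, for illustration.
projects = [
    {"a": 1.0, "b": 1.0, "kloc": 100.0, "actual": 220.0,
     "drivers": {"acap": 2.0, "pcap": 1.0, "cplx": 1.1}},
    {"a": 1.0, "b": 1.0, "kloc": 50.0, "actual": 50.0,
     "drivers": {"acap": 1.0, "pcap": 1.0, "cplx": 1.0}},
]

ranking = rank_drivers(projects, ["acap", "pcap", "cplx"])
print([name for name, _ in ranking])
```

A driver whose removal changes the MMRE the most lands at the top of the list, which corresponds to the order Sig used by submodel 4.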

In submodel 4, we try to minimize the MMRE by updating the values of the cost drivers with the help of a genetic algorithm, in the order of their significance. This is done by selecting the projects that fall in the category of a particular cost driver rating and then applying the genetic algorithm operators. The results obtained are evaluated using MMRE as the fitness function. If the MMRE is reduced, the cost driver value for that rating is updated; otherwise, it is discarded. The reduced MMRE is recorded as Mmod, which is used as the MMRE for the remaining cost drivers.
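A minimal sketch of how one cost driver value might be tuned with a binary genetic algorithm is given below. Several details are assumptions made for illustration: the 10-bit encoding, the population size, tournaments of size two, single-point crossover, mutation applied as one random bit flip per child, and the project data. Only Pc = 0.8, Pm = 0.3, the MMRE fitness, and the accept-only-if-improved rule come from Algorithm 1.

```python
import random


def mmre(actual, estimated):
    """Mean magnitude of relative error."""
    return sum(abs(e - a) / a for a, e in zip(actual, estimated)) / len(actual)


def decode(bits, r_min, r_max):
    """Map a list of bits to a real value in [r_min, r_max]."""
    as_int = int("".join(map(str, bits)), 2)
    return r_min + (r_max - r_min) * as_int / (2 ** len(bits) - 1)


def tune_value(projects, current, r_min, r_max, generations=50,
               pop_size=20, n_bits=10, pc=0.8, pm=0.3, seed=0):
    """Search for a multiplier value with lower MMRE than `current`.

    Each project is (base_effort, actual_effort), where base_effort is the
    estimate with this driver set to 1.0.
    """
    rng = random.Random(seed)
    actual = [a for _, a in projects]

    def fitness(bits):
        value = decode(bits, r_min, r_max)
        return mmre(actual, [base * value for base, _ in projects])

    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            # Tournament selection: the fitter of two random individuals wins.
            p1 = min(rng.sample(population, 2), key=fitness)
            p2 = min(rng.sample(population, 2), key=fitness)
            c1, c2 = p1[:], p2[:]
            if rng.random() < pc:  # single-point crossover
                cut = rng.randint(1, n_bits - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):
                if rng.random() < pm:  # mutation: flip one random bit
                    i = rng.randrange(n_bits)
                    child[i] = 1 - child[i]
                offspring.append(child)
        population = offspring[:pop_size]
        best = min(population + [best], key=fitness)

    baseline = mmre(actual, [base * current for base, _ in projects])
    # Step 6: keep the tuned value only if it lowers the MMRE.
    if fitness(best) < baseline:
        return decode(best, r_min, r_max), fitness(best)
    return current, baseline


# Hypothetical (base_effort, actual_effort) pairs in person-months.
projects = [(100.0, 90.0), (200.0, 185.0), (50.0, 44.0)]
value, error = tune_value(projects, current=1.0, r_min=0.7, r_max=1.2)
print(value, error)
```

Because the tuned value is accepted only when it lowers the MMRE, the returned error can never exceed the error of the current value, which mirrors the update-or-discard rule of Step 6.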

5. Evaluation Method

5.1. Conceptual View. Software cost estimation models need to be evaluated quantitatively in terms of estimation accuracy in order to improve the modeling process. Some rules or measurements must be provided for model assessment. This measurement of accuracy defines how close the estimated result is to its actual value. Software cost estimates play a significant role in delivering software projects. As a result, researchers have proposed the mean magnitude of relative error (MMRE), the most widely used evaluation criterion, to assess the quality of software prediction systems. MMRE is usually computed by following standard evaluation processes such as cross-validation [39]. It is independent of size scale and effort units. Comparisons can be made across datasets and prediction model types [40].

COCOMO computes effort on the basis of source lines of code. In intermediate COCOMO, Boehm used 15 additional predictor variables, called cost drivers, which are required to calibrate the nominal effort of a project to the actual project environment. A value is assigned to each cost driver according to the properties of the specific software project. The numerical values of the 15 cost drivers are multiplied to obtain the effort adjustment factor (EAF).
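As a small sketch of this calculation, the snippet below multiplies the 15 driver values into the EAF and applies the effort equation from Algorithm 1, a * KLOC^b * EAF. The mode coefficients are the commonly tabulated intermediate COCOMO 81 values and are assumed here, since this section does not list them; the function names and the sample ratings are hypothetical.

```python
from math import prod

# Intermediate COCOMO 81 mode coefficients (a, b) as commonly tabulated;
# assumed here because this section does not state them.
MODE_COEFFICIENTS = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def effort_adjustment_factor(multipliers):
    """EAF is the product of the cost driver multipliers."""
    return prod(multipliers)


def estimated_effort(kloc, mode, multipliers):
    """Effort in person-months: a * KLOC^b * EAF."""
    a, b = MODE_COEFFICIENTS[mode]
    return a * kloc ** b * effort_adjustment_factor(multipliers)


# Hypothetical project: 13 nominal drivers (1.00), acap rated high (0.86),
# and cplx rated very high (1.30), using the Table 1 multipliers.
multipliers = [1.0] * 13 + [0.86, 1.30]
eaf = effort_adjustment_factor(multipliers)
print("EAF:", eaf)
print("Effort:", estimated_effort(10.0, "organic", multipliers))
```

With all drivers at their nominal rating the EAF is 1, so the estimate reduces to the nominal effort a * KLOC^b.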

5.2. Data Analysis. The performance of estimation methods is usually evaluated by several ratio measurements of accuracy, including RE (relative error), MRE (magnitude of relative error), and MMRE (mean magnitude of relative error), which are computed as follows:

$$\mathrm{RE} = \frac{\text{Estimated Effort} - \text{Actual Effort}}{\text{Actual Effort}},$$

$$\mathrm{MRE} = \frac{\lvert \text{Estimated Effort} - \text{Actual Effort} \rvert}{\text{Actual Effort}},$$

$$\mathrm{MMRE} = \frac{\sum \mathrm{MRE}}{N}. \tag{3}$$

Another parameter used in evaluating the performance of an estimation method is PRED (percentage of prediction), which is determined as

$$\mathrm{PRED}(X) = \frac{A}{N}, \tag{4}$$


where A is the number of projects with MRE less than or equal to level X and N is the number of considered projects. Usually, the acceptable level of X is 0.25, and the various methods are compared based on this level. Decreasing MMRE and increasing PRED are the main aims of all estimation techniques.

6 Results and Discussion

61 Dataset Description Experiments were done by taking63 COCOMO 81 based dataset used by NASA and variousother calculations performed on it 93 NASA projects fromdifferent centers for projects from the years of 1971 to 1987were collected by Jairus Hihn JPL NASA Manager of SQIPMeasurement and Benchmarking Element The proposedmodel is validated by these datasetsThese are one of themostanalyzed data setsThe independent variable used is ldquoadjusteddelivered source instructionsrdquo which takes into account thevariation of effort when adapting software COCOMO is builtupon these data points by introducing many factors in theform of multipliers

These datasets include 156 historical projects with 17 effortdrivers and one dependent variable of the software develop-ment effort

62 Result Analysis Cost drivers play a vital role in estima-tion of the efforts and cost to be incurredThey show charac-teristics of software development that influence effort incarrying out a certain project Cost drivers are selected basedon the arguments that they have a linear effect on effortCOCOMO cost drivers are the basis for the analysis ofproposed algorithm Table 1 depicts the COCOMO effortmultipliers

Significance of 15 cost drivers can be shown by theirimpact on MMRE of efforts on original 63 NASA datasetsThe significance occurrences of 15 cost drivers are calculatedby applying step 1 to step 4 which are shown in Table 2

0005

01015

02025

03035

04045

05

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

MM

RE

Cost drivers

MMRE(COCOMO)Significant occurrences

Figure 1 Relationship between MMRE and cost drivers

002040608

112141618

2

1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61

MRE

Software projects

MRE COCOMOMRE with tuned parameters

Figure 2 Comparison of MRE for NASA 63 projects

The occurrence of each cost driver is having linearity withthe MMRE calculated between actual efforts produced andestimated effort with COCOMO In Figure 1 each cost drivermoves against MMRE that is constant for all cases The effectof each cost driver on the MMRE is the significant aspect ofderiving the occurrence of cost drivers The proportionaterelationship can be seen from Figure 1 where the higherinfluencedMMRE with each independent cost driver againstconstant value of MMRE is the most significant and thosewith lower values are less significant

Once significant occurrences of the cost drivers are foundthe sequence of cost drivers is used to produce tuned valuesfor different ratings of various cost drivers Step 5 and step 6are used to generate the new values of available cost driversTable 3 reveals the tuned values in preexisting cost drivers

The proposed algorithm is validated with two differentdatasets of NASA projects According to the evaluationcriteria the proposed method has marginal difference ineffortswith actual project efforts in comparison toCOCOMO

Advances in Software Engineering 7

Table 3 Proposed algorithm based cost drivers

Cost drivers Very low Low Nominal High Very high Extra highacap 146 119 09 086 071pcap 142 09 1 086 07aexp 129 14 1 091 082modp 138 092 1 091 082tool 124 11 099 093 083vexp 138 103 1 09lexp 114 108 09 095sced 123 108 099 104 11stor 1 106 119 138data 103 09 106 138time 09 111 13 166turn 097 092 103 09virt 087 1 115 13cplx 07 085 111 115 116 165rely 075 088 1 125 14

0

2

4

6

8

10

12

1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91

MRE

NASA projects

MRE proposed modelMRE COCOMO

Figure 3 Comparison of MRE for NASA 93 projects

generated efforts shown in Figure 2 Most of the results arekept near to the mean of MRE for 63 data values Other93 datasets were also used to evaluate the projects with theproposed method (Figure 3 and Figure 4)

A comparison is made between proposed method andother estimation methods by MMRE in Table 4 Proposedmethod is having average error 027 with actual efforts andCOCOMO produces a bit higher percentage of error withactual efforts Proposedmodel is working efficiently for other93 datasets as well

Essentially we want to measure useful functionalityproduced per time unit Productivity is anothermeasurement

of effectiveness of the model It is a measure of the rateor ratio at which individual software developers involvedin software development produce software and associateddocumentation

Higher productivity reflects the better quality achieve-ment for the project development Proposed method ishaving productivity 029 which is closer to the actual efforts027 as productivity Seven percent of proposed methodproductivity is increased and 9 percent of COCOMO pro-ductivity is decreased in comparison with actual productivity(Table 5) So the percentage of difference between proposedmethod and COCOMO results is approximately 1795

8 Advances in Software Engineering

0

02

04

06

08

1

12

14

1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61

Prod

uctiv

ity

Software projects

Productivity with actual effortsProductivity with COCOMO effortsProductivity with proposed method

Figure 4 Comparison of productivity for NASA 63 projects

Table 4 The MMRE for two different methods

MMRE (for 63 datasets) MMRE (for 93 datasets)COCOMO versus actual Proposed method versus actual COCOMO versus actual Proposed method versus actual036 027 059 056

Table 5 The productivity of various approaches

Productivity(COCOMO)

Productivity(proposed method)

Productivity(actual)

025 030 028Differencefrom actual 003 002

Table 6 MMRE of NASA 63 projects for various project modes

Project mode No ofprojects (63)

MMRE forproposedmethod

MMRE forCOCOMO

Embedded 27 029 039Organic 25 028 037Semidetached 11 022 023

Table 7 MMRE of NASA 93 projects for various project modes

Project mode No ofprojects (93)

MMRE forproposedmethod

MMRE forCOCOMO

Embedded 21 072 082Organic 3 08 088Semidetached 69 051 051

Tables 6 and 7 depict the presence of error in all threecategories of projectmodes for two different types of datasetsThe comparison was made between proposed model gener-ated results versus COCOMO results We also evaluate thedifferent type of project application categorically 80 of totaldatasets are producing the results which are better than theCOCOMO based results (Table 8)

PRED was calculated with the two separate approachesand Table 9 depicts that for 3 different PRED assumptionsproposedmethod is producing approximately 6665 801and 834 increase in PRED respectively

7 Conclusion

Work carried out in the paper explores the inter-relationshipamong different dimensions of data driven software projectsnamely project size and effort The above-mentioned resultsdemonstrate that applying proposed method to the softwareeffort estimation is by far the most feasible approach foraddressing the problem of apprehension and ambiguityexisting in software effort drivers Order of occurrence ofvarious cost drivers has a significant impact on overall effortsin project estimation Small adjustments to the COCOMOcost drivers bring significant improvements to the qualitycriteria applied to the proposed approach Proposed methodis producing tuned values of the cost drivers which areeffective enough to improve the productivity of the projectsPrediction at different levels of MRE for each project reflects

Advances in Software Engineering 9

Table 8 Description of projects on application basis

Type of application No of projects MMRE COCOMO MMRE proposed methodApplication ground 2 028 025Avionics 11 095 080Avionics monitoring 30 066 055Batch data processing 2 008 012Communications 1 018 005Data capture 3 009 007Launch processing 1 032 046Mission planning 20 038 034Monitor control 8 020 050Operating system 4 382 363Real data processing 3 012 006Science 2 018 041Simulation 4 017 029Utility 2 012 031

Table 9 Pred calculation at different values for both the models

PREDCOCOMO Proposed method

10 20 30 10 20 30Percentage of 63 NASA datasets 2381 3968 5714 254 4286 6191

the percentage of projects with desired accuracy Further-more this model is validated on two different datasets whichrepresents better estimation accuracy as compared to theCOCOMO 81 based NASA 63 and NASA 93 datasets Theutilization of proposed algorithm for other applications in thesoftware engineering field can also be explored in the future

Conflict of Interests

The authors certify that there is no actual or potential conflict of interests in relation to this paper. The American company TRW Systems Inc. has been referred to as the company where Barry W. Boehm, the developer of COCOMO, worked.

References

[1] K. M. Furulund and K. Moløkken-Østvold, “Increasing software effort estimation accuracy - using experience data, estimation models and checklists,” in Proceedings of the 7th International Conference on Quality Software (QSIC ’07), pp. 342–347, Portland, OR, USA, October 2007.

[2] Q. Alam, P. Bhatia, and S. Sarwar, Systematic Review of Effort Estimation and Cost Estimation, Institute of Management Studies, Roorkee, India, 2012.

[3] J. J. Dolado, On the Problem of the Software Cost Function, Facultad de Informatica, Universidad del Pais Vasco-Euskal Herriko Unibertsitatea, Gipuzkoa, Spain, 2000.

[4] K. Molokken and M. Jorgensen, “A review of software surveys on software effort estimation,” in Proceedings of the International Symposium on Empirical Software Engineering (ISESE ’03), pp. 220–230, 2003.

[5] F. Ferrucci, C. Gravino, R. Oliveto, and F. Sarro, “Genetic programming for effort estimation: an analysis of the impact of different fitness functions,” in Proceedings of the 2nd International Symposium on Search Based Software Engineering (SSBSE ’10), pp. 89–98, IEEE Computer Society, DMI, University of Salerno, Benevento, Italy, October 2010.

[6] A. F. Sheta, “Estimation of the COCOMO model parameters using genetic algorithms for NASA software projects,” Journal of Computer Science, vol. 2, no. 2, pp. 118–123, 2006.

[7] B. W. Boehm, Software Engineering Economics, Prentice Hall, IEEE, 1984.

[8] J. Magne and M. Shepperd, “A systematic review of software development cost estimation studies,” IEEE Transactions on Software Engineering, vol. 33, no. 1, pp. 33–53, 2007.

[9] P. L. Braga, A. L. I. Oliveira, and S. R. L. Meira, “A GA-based feature selection and parameters optimization for support vector regression applied to software effort estimation,” in Proceedings of the 23rd Annual ACM Symposium on Applied Computing (SAC ’08), pp. 1788–1792, Ceara, Brazil, March 2008.

[10] M. Harman and B. F. Jones, “Search-based software engineering,” Information and Software Technology, vol. 43, no. 14, pp. 833–839, 2001.

[11] J. Clarke, J. J. Dolado, M. Harman et al., “Reformulating software engineering as a search problem,” IEE Proceedings: Software, vol. 150, no. 3, pp. 161–175, 2003.

[12] M. Jørgensen and S. Grimstad, “Avoiding irrelevant and misleading information when estimating development effort,” IEEE Software, vol. 25, no. 3, pp. 78–83, 2008.

[13] A. L. Lederer and J. Prasad, “A causal model for software cost estimating error,” IEEE Transactions on Software Engineering, vol. 24, no. 2, pp. 137–148, 1998.


[14] S. Basha and P. Dhavachelvan, “Analysis of empirical software effort estimation models,” International Journal of Computer Science and Information Security, vol. 7, no. 3, pp. 68–77, 2010.

[15] B. L. Barber, Investigative search of quality historical software support cost data and software support cost-related data [M.S. thesis], 1991.

[16] N. H. Chiu and S. J. Huang, “The adjusted analogy-based software effort estimation based on similarity distances,” Journal of Systems and Software, vol. 80, no. 4, pp. 628–640, 2007.

[17] G. Kadoda and M. Shepperd, “Using simulation to evaluate prediction techniques,” in Proceedings of the 7th International Software Metrics Symposium (METRICS ’01), pp. 349–359, IEEE Press, London, UK, 2001.

[18] M. J. Shepperd and G. F. Kadoda, “Comparing software prediction techniques using simulation,” IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 1014–1022, 2001.

[19] M. J. Shepperd and C. Schofield, “Estimating software project effort using analogies,” IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 736–743, 1997.

[20] M. Jørgensen and D. I. K. Sjøberg, “The impact of customer expectation on software development effort estimates,” International Journal of Project Management, vol. 22, no. 4, pp. 317–325, 2004.

[21] J. Kaczmarek and M. Kucharski, “Size and effort estimation for applications written in Java,” Information and Software Technology, vol. 46, no. 9, pp. 589–601, 2004.

[22] R. Jeffery, M. Ruhe, and I. Wieczorek, “Using public domain metrics to estimate software development effort,” in Proceedings of the 7th International Software Metrics Symposium (METRICS ’01), pp. 16–27, IEEE Computer Society, Washington, DC, USA, April 2001.

[23] G. H. Subramanian, P. C. Pendharkar, and M. Wallace, “An empirical study of the effect of complexity, platform, and program type on software development effort of business applications,” Empirical Software Engineering, vol. 11, no. 4, pp. 541–553, 2006.

[24] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, chapter 1–8, Addison-Wesley, New York, NY, USA, 1989.

[25] A. Heiat, “Comparison of artificial neural network and regression models for estimating software development effort,” Information and Software Technology, vol. 44, no. 15, pp. 911–922, 2002.

[26] K. Srinivasan and D. Fisher, “Machine learning approaches to estimating software development effort,” IEEE Transactions on Software Engineering, vol. 21, no. 2, pp. 126–137, 1995.

[27] S. J. Huang, C. Y. Lin, and N. H. Chiu, “Fuzzy decision tree approach for embedding risk assessment information into software cost estimation model,” Journal of Information Science and Engineering, vol. 22, no. 2, pp. 297–313, 2006.

[28] M. van Genuchten and H. Koolen, “On the use of software cost models,” Information and Management, vol. 21, no. 1, pp. 37–44, 1991.

[29] A. J. Albrecht and J. E. Gaffney, “Software function, source lines of code, and development effort prediction: a software science validation,” IEEE Transactions on Software Engineering, vol. 9, no. 6, pp. 639–648, 1983.

[30] I. Attarzadeh and S. H. Ow, “A novel algorithmic cost estimation model based on soft computing technique,” Journal of Computer Science, vol. 6, no. 2, pp. 117–125, 2010.

[31] F. J. Heemstra, Software Cost Estimation Models, University of Technology, Department of Industrial Engineering, IEEE, 1990.

[32] M. Jørgensen, B. Boehm, and S. Rifkin, “Software development effort estimation: formal models or expert judgment?” IEEE Software, vol. 26, no. 2, pp. 14–19, 2009.

[33] Y. F. Li, M. Xie, and T. N. Goh, “A study of genetic algorithm for project selection for analogy based software cost estimation,” in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM ’07), pp. 1256–1260, Singapore, December 2007.

[34] H. Liu and L. Yu, “Toward integrating feature selection algorithms for classification and clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.

[35] A. Kumar, S. Tiwari, K. K. Mishra, and A. K. Misra, “Generation of efficient test data using path selection strategy with elitist GA in regression testing,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 9, pp. 389–393, Chengdu, China, July 2010.

[36] K. K. Mishra, S. Tiwari, A. Kumar, and A. K. Misra, “An approach for mutation testing using elitist genetic algorithm,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 5, pp. 426–429, Chengdu, China, July 2010.

[37] S. Sarmady, An Investigation on Genetic Algorithm Parameters, P-COM000507(R), P-COM008807, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia, 2007.

[38] K. F. Man, K. S. Tang, and S. Kwong, Genetic Algorithms: Concepts and Designs, chapter 1–10, Springer, New York, NY, USA, 2001.

[39] L. C. Briand, K. El-Emam, and I. Wieczorek, “Explaining the cost of European space and military projects,” in Proceedings of the International Conference on Software Engineering (ICSE ’99), pp. 303–312, ACM Press, May 1999.

[40] L. C. Briand, T. Langley, and I. Wieczorek, “Replicated assessment and comparison of common software cost modeling techniques,” in Proceedings of the International Conference on Software Engineering (ICSE ’22), pp. 377–386, ACM Press, June 2000.


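Table 1: COCOMO cost drivers.

| Cost drivers | Very low | Low | Nominal | High | Very high | Extra high |
|---|---|---|---|---|---|---|
| acap | 1.46 | 1.19 | 1 | 0.86 | 0.71 | |
| pcap | 1.42 | 1.17 | 1 | 0.86 | 0.7 | |
| aexp | 1.29 | 1.13 | 1 | 0.91 | 0.82 | |
| modp | 1.24 | 1.1 | 1 | 0.91 | 0.82 | |
| tool | 1.24 | 1.1 | 1 | 0.91 | 0.83 | |
| vexp | 1.21 | 1.1 | 1 | 0.9 | | |
| lexp | 1.14 | 1.07 | 1 | 0.95 | | |
| sced | 1.23 | 1.08 | 1 | 1.04 | 1.1 | |
| stor | | | 1 | 1.06 | 1.21 | 1.56 |
| data | | 0.94 | 1 | 1.08 | 1.16 | |
| time | | | 1 | 1.11 | 1.3 | 1.66 |
| turn | | 0.87 | 1 | 1.07 | 1.15 | |
| virt | | 0.87 | 1 | 1.15 | 1.3 | |
| cplx | 0.7 | 0.85 | 1 | 1.15 | 1.3 | 1.65 |
| rely | 0.75 | 0.88 | 1 | 1.15 | 1.4 | |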

various projects are used as the input parameters to calculate the magnitude of relative error (MRE) of each project. The MMRE for the COCOMO results is recorded as the original MMRE.

In submodel 2, the influenced MMRE is calculated on the basis of the occurrences of the 15 cost drivers. This influenced MMRE shows the effectiveness of each cost driver in the sequence of development of efforts in terms of person-months. In this process, we take sample data having 18 input parameters, that is, 15 cost drivers, mode, source lines, and actual effort. The estimated efforts are calculated for the sample data by nullifying the effect of each cost driver, one at a time. These efforts are used to calculate the influenced MMRE for each cost driver with respect to the actual effort provided in the sample data. The difference between the influenced MMRE and the original MMRE is recorded in a list along with the driver.

In submodel 3, the list of differences between the influenced MMRE and the original MMRE is sorted in descending order of the difference to provide the significant occurrence of each driver. This order has been named Sig.
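Submodels 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: "nullifying" a driver is taken to mean treating its multiplier as nominal (1.0), the effort model is the standard intermediate COCOMO 81 form with its usual mode constants, and the two sample projects and all function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed intermediate COCOMO 81 constants (a, b) per development mode;
# the paper's modified COCOMO may use different values.
MODE_COEFFS = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def estimate_effort(project: dict, skip: Optional[str] = None) -> float:
    """Effort in person-months; `skip` names a driver treated as nominal (1.0)."""
    a, b = MODE_COEFFS[project["mode"]]
    eaf = 1.0
    for driver, multiplier in project["multipliers"].items():
        if driver != skip:
            eaf *= multiplier
    return a * project["kloc"] ** b * eaf


def mmre(projects: List[dict], skip: Optional[str] = None) -> float:
    """Mean magnitude of relative error over all projects."""
    errors = [
        abs(estimate_effort(p, skip) - p["actual"]) / p["actual"]
        for p in projects
    ]
    return sum(errors) / len(errors)


def rank_drivers(projects: List[dict], drivers: List[str]) -> List[Tuple[str, float]]:
    """Sort drivers by (influenced MMRE - original MMRE), largest first."""
    original = mmre(projects)
    diffs = [(d, mmre(projects, skip=d) - original) for d in drivers]
    return sorted(diffs, key=lambda item: item[1], reverse=True)


# Hypothetical sample data (not taken from the NASA datasets).
projects = [
    {"mode": "organic", "kloc": 10.0, "actual": 30.0,
     "multipliers": {"acap": 0.86, "cplx": 1.15, "lexp": 1.0}},
    {"mode": "embedded", "kloc": 50.0, "actual": 500.0,
     "multipliers": {"acap": 1.19, "cplx": 1.30, "lexp": 1.0}},
]

sig = rank_drivers(projects, ["acap", "cplx", "lexp"])
for driver, diff in sig:
    print(f"{driver}: {diff:+.4f}")
```

A driver that is nominal in every project (here `lexp`) yields a difference of zero, so it contributes nothing to the ordering.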

In submodel 4, we try to minimize the MMRE by updating the values of the cost drivers with the help of the genetic algorithm, in the order of their significance. This is done by selecting the projects falling in the category of a particular cost driver and then applying the genetic algorithm operators. The results obtained are evaluated using MMRE as the fitness function. If the MMRE is reduced, the cost driver value for the particular rating is updated; otherwise, it is discarded. The reduced MMRE is recorded as Mmod, which is used as the MMRE for the remaining cost drivers.
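The accept-or-discard loop of submodel 4 might look roughly like the sketch below for a single rating of a single cost driver. Everything specific here is an assumption made for illustration: the 10-bit encoding, the search range, truncation selection, single-point crossover, the mutation rate, and the representation of each project as a pair (estimate without this driver, actual effort). The paper does not state these details in this section.

```python
import random
from typing import List, Tuple

BITS = 10          # chromosome length (assumed)
LO, HI = 0.5, 2.0  # assumed search range for a multiplier value


def decode(bits: List[int]) -> float:
    """Map a bit list to a real value in [LO, HI]."""
    as_int = int("".join(map(str, bits)), 2)
    return LO + (HI - LO) * as_int / (2 ** BITS - 1)


def mmre(value: float, projects: List[Tuple[float, float]]) -> float:
    """Fitness: MMRE when `value` is used as the driver's multiplier.

    Each project is (base_effort, actual_effort), where base_effort is the
    estimate computed with this one driver left out.
    """
    return sum(abs(base * value - actual) / actual
               for base, actual in projects) / len(projects)


def tune(projects, current, generations=50, pop_size=20, p_mut=0.02, seed=0):
    """Return (value, mmre); the tuned value is kept only if MMRE drops."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(BITS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: mmre(decode(c), projects))
        parents = pop[: pop_size // 2]            # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, BITS - 1)        # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children
    best = decode(min(pop, key=lambda c: mmre(decode(c), projects)))
    if mmre(best, projects) < mmre(current, projects):
        return best, mmre(best, projects)         # update the rating value
    return current, mmre(current, projects)       # discard the candidate


# Hypothetical projects that share one rating of one cost driver.
sample = [(100.0, 90.0), (200.0, 185.0), (50.0, 44.0)]
value, m_mod = tune(sample, current=1.0)
print(value, m_mod)
```

Because the candidate is accepted only when it lowers the fitness, the returned MMRE can never exceed the MMRE of the starting value, which mirrors the update-or-discard rule described above.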

5. Evaluation Method

5.1. Conceptual View. Software cost estimation models need to be quantitatively evaluated in terms of estimation accuracy to improve the modeling process. Some rules or measurements must be provided for model assessment. This measurement of accuracy defines how close the estimated result is to its actual value. Software cost estimates play a significant role in delivering software projects. As a result, researchers have proposed the mean magnitude of relative error (MMRE), the most widely used evaluation criterion, to assess the performance of software prediction models and to evaluate the quality of prediction systems. MMRE is usually computed by following standard evaluation processes such as cross-validation [39]. It is independent of size scale and effort units. Comparisons can be made across datasets and prediction model types [40].

COCOMO computes effort on the basis of source lines of code. In intermediate COCOMO, Boehm used 15 additional predictor variables, called cost drivers, which are required to calibrate the nominal effort of a project to the actual project environment. A value is assigned to each cost driver according to the properties of the specific software project. The numerical values of the 15 cost drivers are multiplied to obtain the effort adjustment factor (EAF).
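As a worked illustration of the effort adjustment factor, the lines below assume the standard intermediate COCOMO 81 form with the organic-mode constants a = 3.2 and b = 1.05, applied to a hypothetical 32 KLOC project rated high on acap (0.86) and very high on cplx (1.30), with all other drivers nominal. The symbols EM_i, a, and b are notation assumed here and may differ from the authors' modified COCOMO.

```latex
\begin{align*}
\mathrm{EAF} &= \prod_{i=1}^{15} \mathrm{EM}_i = 0.86 \times 1.30 = 1.118,\\
\text{Effort} &= a \cdot (\mathrm{KLOC})^{b} \cdot \mathrm{EAF}
  = 3.2 \times 32^{1.05} \times 1.118 \approx 136.1 \ \text{person-months}.
\end{align*}
```

Nominal drivers contribute a factor of 1, so only the non-nominal ratings change the estimate.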

5.2. Data Analysis. The performance of estimation methods is usually evaluated by several ratio measurements of accuracy, including RE (relative error), MRE (magnitude of relative error), and MMRE (mean magnitude of relative error), which are computed as follows:

$$
\begin{aligned}
\mathrm{RE} &= \frac{\text{Estimated effort} - \text{Actual effort}}{\text{Actual effort}},\\
\mathrm{MRE} &= \frac{\left|\text{Estimated effort} - \text{Actual effort}\right|}{\text{Actual effort}},\\
\mathrm{MMRE} &= \frac{\sum \mathrm{MRE}}{N}.
\end{aligned}
\tag{3}
$$

Table 2: Significant occurrences of cost drivers.

| Rank | Cost driver |
|---|---|
| 1 | acap |
| 2 | pcap |
| 3 | aexp |
| 4 | rely |
| 5 | virt |
| 6 | vexp |
| 7 | time |
| 8 | modp |
| 9 | cplx |
| 10 | data |
| 11 | tool |
| 12 | sced |
| 13 | lexp |
| 14 | turn |
| 15 | stor |

Another parameter used in evaluating the performance of an estimation method is PRED (percentage of prediction), which is determined as

$$
\mathrm{PRED}(X) = \frac{A}{N},
\tag{4}
$$

where $A$ is the number of projects with MRE less than or equal to level $X$ and $N$ is the number of considered projects. Usually, the acceptable level of $X$ is 0.25, and the various methods are compared based on this level. Decreasing MMRE and increasing PRED are the main aims of all estimation techniques.
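The accuracy measures of this section (MRE, MMRE, and PRED) translate directly into code. The sketch below follows the definitions given in the text; the function names and the four sample projects are hypothetical.

```python
from typing import Sequence


def mre(estimated: float, actual: float) -> float:
    """Magnitude of relative error for one project."""
    return abs(estimated - actual) / actual


def mmre(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of the per-project MRE values."""
    values = [mre(e, a) for e, a in zip(estimated, actual)]
    return sum(values) / len(values)


def pred(estimated: Sequence[float], actual: Sequence[float],
         level: float = 0.25) -> float:
    """Fraction of projects whose MRE is less than or equal to `level`."""
    hits = sum(1 for e, a in zip(estimated, actual) if mre(e, a) <= level)
    return hits / len(actual)


# Hypothetical efforts in person-months (not taken from the NASA datasets).
actual = [100.0, 200.0, 50.0, 80.0]
estimated = [110.0, 150.0, 52.0, 120.0]

print("MMRE:", mmre(estimated, actual))
print("PRED(0.25):", pred(estimated, actual, 0.25))
```

Multiplying the value returned by `pred` by 100 gives the percentage form used when PRED is reported as a share of projects.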

6. Results and Discussion

6.1. Dataset Description. Experiments were carried out on the COCOMO 81 based dataset of 63 projects used by NASA, and various other calculations were performed on it. A further 93 NASA projects from different centers, covering the years 1971 to 1987, were collected by Jairus Hihn, JPL, NASA, Manager of the SQIP Measurement and Benchmarking Element. The proposed model is validated on these datasets, which are among the most analyzed datasets. The independent variable used is “adjusted delivered source instructions,” which takes into account the variation of effort when adapting software. COCOMO is built upon these data points by introducing many factors in the form of multipliers.

Together, these datasets include 156 historical projects with 17 effort drivers and one dependent variable, the software development effort.

6.2. Result Analysis. Cost drivers play a vital role in the estimation of the effort and cost to be incurred. They represent characteristics of software development that influence the effort in carrying out a certain project. Cost drivers are selected on the argument that they have a linear effect on effort. The COCOMO cost drivers are the basis for the analysis of the proposed algorithm. Table 1 lists the COCOMO effort multipliers.

The significance of the 15 cost drivers can be shown by their impact on the MMRE of efforts on the original 63 NASA projects. The significant occurrences of the 15 cost drivers are calculated by applying step 1 to step 4 and are shown in Table 2.

Figure 1: Relationship between MMRE and cost drivers (x-axis: cost drivers 1 to 15; y-axis: MMRE; series: MMRE (COCOMO) and significant occurrences).

Figure 2: Comparison of MRE for NASA 63 projects (x-axis: software projects; y-axis: MRE; series: MRE COCOMO and MRE with tuned parameters).

The occurrence of each cost driver has a linear relationship with the MMRE calculated between the actual effort and the effort estimated with COCOMO. In Figure 1, each cost driver is plotted against the MMRE, which is constant for all cases. The effect of each cost driver on the MMRE is the significant aspect in deriving the occurrence of the cost drivers. The proportionate relationship can be seen in Figure 1: cost drivers with a higher influenced MMRE relative to the constant value of MMRE are the most significant, and those with lower values are less significant.

Once the significant occurrences of the cost drivers are found, the sequence of cost drivers is used to produce tuned values for the different ratings of the various cost drivers. Step 5 and step 6 are used to generate the new values of the available cost drivers. Table 3 shows the tuned values of the preexisting cost drivers.

Table 3: Proposed algorithm based cost drivers.

| Cost drivers | Very low | Low | Nominal | High | Very high | Extra high |
|---|---|---|---|---|---|---|
| acap | 1.46 | 1.19 | 0.9 | 0.86 | 0.71 | |
| pcap | 1.42 | 0.9 | 1 | 0.86 | 0.7 | |
| aexp | 1.29 | 1.4 | 1 | 0.91 | 0.82 | |
| modp | 1.38 | 0.92 | 1 | 0.91 | 0.82 | |
| tool | 1.24 | 1.1 | 0.99 | 0.93 | 0.83 | |
| vexp | 1.38 | 1.03 | 1 | 0.9 | | |
| lexp | 1.14 | 1.08 | 0.9 | 0.95 | | |
| sced | 1.23 | 1.08 | 0.99 | 1.04 | 1.1 | |
| stor | | | 1 | 1.06 | 1.19 | 1.38 |
| data | | 1.03 | 0.9 | 1.06 | 1.38 | |
| time | | | 0.9 | 1.11 | 1.3 | 1.66 |
| turn | | 0.97 | 0.92 | 1.03 | 0.9 | |
| virt | | 0.87 | 1 | 1.15 | 1.3 | |
| cplx | 0.7 | 0.85 | 1.11 | 1.15 | 1.16 | 1.65 |
| rely | 0.75 | 0.88 | 1 | 1.25 | 1.4 | |

The proposed algorithm is validated with two different datasets of NASA projects. According to the evaluation criteria, the efforts produced by the proposed method differ only marginally from the actual project efforts in comparison with the COCOMO-generated efforts shown in Figure 2. Most of the results are kept near the mean of MRE for the 63 data values. The other 93 projects were also used to evaluate the proposed method (Figure 3 and Figure 4).

Figure 3: Comparison of MRE for NASA 93 projects (x-axis: NASA projects; y-axis: MRE; series: MRE proposed model and MRE COCOMO).

A comparison between the proposed method and COCOMO in terms of MMRE is given in Table 4. For the 63 projects, the proposed method has an average error of 0.27 with respect to the actual efforts, whereas COCOMO produces a somewhat higher error of 0.36. The proposed model also works efficiently for the other 93 projects (0.56 versus 0.59).

Essentially, we want to measure the useful functionality produced per unit of time. Productivity is another measure of the effectiveness of the model. It is a measure of the rate or ratio at which the individual software developers involved in software development produce software and associated documentation.

Higher productivity reflects better quality achievement for the project development. The proposed method has a productivity of 0.29, which is closer to the productivity of 0.27 obtained with the actual efforts. The productivity of the proposed method is seven percent higher, and that of COCOMO is 9 percent lower, than the actual productivity (Table 5). So the percentage difference between the proposed method and COCOMO results is approximately 17.95%.
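A minimal sketch of how such a productivity comparison could be computed is given below. It assumes that productivity is size divided by effort (KLOC per person-month), which this section does not state explicitly; the sample sizes and efforts are hypothetical and the function names are illustrative.

```python
from typing import Sequence


def productivity(kloc: float, effort_pm: float) -> float:
    """Assumed definition: delivered size (KLOC) per person-month."""
    return kloc / effort_pm


def mean_productivity(sizes: Sequence[float], efforts: Sequence[float]) -> float:
    """Average per-project productivity."""
    values = [productivity(s, e) for s, e in zip(sizes, efforts)]
    return sum(values) / len(values)


# Hypothetical projects: size in KLOC, efforts in person-months.
sizes = [10.0, 30.0, 6.0]
actual_effort = [40.0, 100.0, 20.0]
estimated_effort = [50.0, 120.0, 24.0]   # an estimator that overestimates

actual_prod = mean_productivity(sizes, actual_effort)
estimated_prod = mean_productivity(sizes, estimated_effort)
print(actual_prod, estimated_prod)
```

Under this definition, a model that overestimates effort reports a lower productivity than the actual one, and a model that underestimates effort reports a higher one.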

Figure 4: Comparison of productivity for NASA 63 projects (x-axis: software projects; y-axis: productivity; series: productivity with actual efforts, productivity with COCOMO efforts, and productivity with proposed method).

Table 4: The MMRE for two different methods.

| Comparison | MMRE (for 63 datasets) | MMRE (for 93 datasets) |
|---|---|---|
| COCOMO versus actual | 0.36 | 0.59 |
| Proposed method versus actual | 0.27 | 0.56 |

Table 5: The productivity of various approaches.

| | Productivity (COCOMO) | Productivity (proposed method) | Productivity (actual) |
|---|---|---|---|
| Value | 0.25 | 0.30 | 0.28 |
| Difference from actual | 0.03 | 0.02 | |

Table 6: MMRE of NASA 63 projects for various project modes.

| Project mode | No. of projects (63) | MMRE for proposed method | MMRE for COCOMO |
|---|---|---|---|
| Embedded | 27 | 0.29 | 0.39 |
| Organic | 25 | 0.28 | 0.37 |
| Semidetached | 11 | 0.22 | 0.23 |

Table 7: MMRE of NASA 93 projects for various project modes.

| Project mode | No. of projects (93) | MMRE for proposed method | MMRE for COCOMO |
|---|---|---|---|
| Embedded | 21 | 0.72 | 0.82 |
| Organic | 3 | 0.8 | 0.88 |
| Semidetached | 69 | 0.51 | 0.51 |

Tables 6 and 7 show the error in all three categories of project modes for the two datasets. The comparison was made between the results generated by the proposed model and the COCOMO results. We also evaluate the different types of project application by category: 80% of the total datasets produce results that are better than the COCOMO-based results (Table 8).
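One way to read the 80% figure is as the share of projects that belong to application types for which the proposed method has a lower MMRE than COCOMO. The sketch below checks that reading against the values listed in Table 8; the interpretation itself is an assumption, and the variable names are illustrative.

```python
# (application type, number of projects, MMRE COCOMO, MMRE proposed method),
# as listed in Table 8.
table8 = [
    ("Application ground", 2, 0.28, 0.25),
    ("Avionics", 11, 0.95, 0.80),
    ("Avionics monitoring", 30, 0.66, 0.55),
    ("Batch data processing", 2, 0.08, 0.12),
    ("Communications", 1, 0.18, 0.05),
    ("Data capture", 3, 0.09, 0.07),
    ("Launch processing", 1, 0.32, 0.46),
    ("Mission planning", 20, 0.38, 0.34),
    ("Monitor control", 8, 0.20, 0.50),
    ("Operating system", 4, 3.82, 3.63),
    ("Real data processing", 3, 0.12, 0.06),
    ("Science", 2, 0.18, 0.41),
    ("Simulation", 4, 0.17, 0.29),
    ("Utility", 2, 0.12, 0.31),
]

total = sum(n for _, n, _, _ in table8)
# Projects in application types where the proposed method has the lower MMRE.
better = sum(n for _, n, cocomo, proposed in table8 if proposed < cocomo)
share = better / total

print(f"{better} of {total} projects ({share:.1%})")
```

Under this reading the share is computed over projects rather than over application types, since only 8 of the 14 types show a lower MMRE for the proposed method.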


Page 6: Research Article Tuning of Cost Drivers by Significance

6 Advances in Software Engineering

Table 2: Significant occurrences of cost drivers.

No.  Cost driver
1    acap
2    pcap
3    aexp
4    rely
5    virt
6    vexp
7    time
8    modp
9    cplx
10   data
11   tool
12   sced
13   lexp
14   turn
15   stor

where A is the number of projects with MRE less than or equal to level X, and N is the number of considered projects. Usually the acceptable level of X is 0.25, and the various methods are compared based on this level. Decreasing MMRE and increasing PRED are the main aims of all estimation techniques.
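
The two evaluation criteria can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual definition MRE = |actual - estimated| / actual; the function names and the sample efforts are hypothetical and are not taken from the NASA datasets.

```python
from typing import Sequence


def mre(actual: float, estimated: float) -> float:
    """Magnitude of relative error for a single project."""
    return abs(actual - estimated) / actual


def mmre(actuals: Sequence[float], estimates: Sequence[float]) -> float:
    """Mean MRE over all considered projects."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def pred(actuals: Sequence[float], estimates: Sequence[float],
         level: float = 0.25) -> float:
    """PRED(X) = A / N, where A counts the projects with MRE <= level."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    a_count = sum(1 for err in errors if err <= level)
    return a_count / len(errors)


# Hypothetical efforts in person-months, for illustration only.
actual_effort = [100.0, 200.0, 50.0, 400.0]
estimated_effort = [110.0, 150.0, 80.0, 390.0]

print("MMRE:", mmre(actual_effort, estimated_effort))
print("PRED(0.25):", pred(actual_effort, estimated_effort, level=0.25))
```

A lower MMRE and a higher PRED(X) both indicate a better estimator, which is why the two measures are reported together.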

6. Results and Discussion

6.1. Dataset Description. Experiments were conducted on the COCOMO 81 based dataset of 63 projects used by NASA, and various other calculations were performed on it. A further 93 NASA projects from different centers, covering the years 1971 to 1987, were collected by Jairus Hihn, JPL, NASA, Manager of the SQIP Measurement and Benchmarking Element. The proposed model is validated on these datasets, which are among the most analyzed datasets. The independent variable used is "adjusted delivered source instructions," which takes into account the variation of effort when adapting software. COCOMO is built upon these data points by introducing many factors in the form of multipliers.

Together, these datasets include 156 historical projects with 17 effort drivers and one dependent variable, the software development effort.

6.2. Result Analysis. Cost drivers play a vital role in estimating the effort and cost to be incurred. They represent characteristics of software development that influence the effort required to carry out a project. Cost drivers are selected on the argument that they have a linear effect on effort. The COCOMO cost drivers are the basis for the analysis of the proposed algorithm. Table 1 depicts the COCOMO effort multipliers.
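
To make the multiplicative role of the cost drivers concrete, the following sketch computes effort with the intermediate COCOMO 81 equation, effort = a * KLOC^b * (product of effort multipliers). The mode coefficients are the standard intermediate COCOMO 81 values; the function name and the sample project are hypothetical, and the sketch does not claim to reproduce the modified model used in this paper.

```python
import math

# Standard intermediate COCOMO 81 coefficients (a, b) per project mode.
MODE_COEFFICIENTS = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def cocomo_effort(kloc: float, mode: str, multipliers: dict) -> float:
    """Effort in person-months: a * KLOC**b * product of effort multipliers."""
    a, b = MODE_COEFFICIENTS[mode]
    # Effort adjustment factor; an empty dict means all drivers are nominal (1.0).
    eaf = math.prod(multipliers.values())
    return a * kloc ** b * eaf


# Hypothetical 10 KLOC organic project: high analyst capability (0.86),
# high product complexity (1.15), nominal required reliability (1.00).
ratings = {"acap": 0.86, "cplx": 1.15, "rely": 1.00}
nominal = cocomo_effort(10.0, "organic", {})
adjusted = cocomo_effort(10.0, "organic", ratings)

print("Nominal effort:", round(nominal, 2))
print("Adjusted effort:", round(adjusted, 2))
```

Because every multiplier scales the estimate directly, a small change in a single rating value shifts the estimated effort, and therefore the MRE, for every project that uses that rating.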

The significance of the 15 cost drivers can be shown by their impact on the MMRE of efforts on the original 63 NASA datasets. The significance occurrences of the 15 cost drivers are calculated by applying step 1 to step 4 and are shown in Table 2.

Figure 1: Relationship between MMRE and cost drivers (x-axis: cost drivers 1 to 15; y-axis: MMRE; series: MMRE (COCOMO), significant occurrences).

Figure 2: Comparison of MRE for NASA 63 projects (x-axis: software projects; y-axis: MRE; series: MRE COCOMO, MRE with tuned parameters).

The occurrence of each cost driver is linear with respect to the MMRE calculated between the actual efforts and the efforts estimated with COCOMO. In Figure 1, each cost driver is plotted against the MMRE, which is constant for all cases. The effect of each cost driver on the MMRE is the significant aspect in deriving the occurrence of cost drivers. The proportionate relationship can be seen in Figure 1: the cost drivers with a higher influence on the MMRE, measured against the constant MMRE value, are the most significant, and those with lower values are less significant.

Once the significant occurrences of the cost drivers are found, the sequence of cost drivers is used to produce tuned values for the different ratings of the various cost drivers. Step 5 and step 6 are used to generate the new values of the available cost drivers. Table 3 reveals the tuned values of the preexisting cost drivers.

Table 3: Proposed algorithm based cost drivers.

Cost driver  Very low  Low   Nominal  High  Very high  Extra high
acap         1.46      1.19  0.9      0.86  0.71       -
pcap         1.42      0.9   1        0.86  0.7        -
aexp         1.29      1.4   1        0.91  0.82       -
modp         1.38      0.92  1        0.91  0.82       -
tool         1.24      1.1   0.99     0.93  0.83       -
vexp         1.38      1.03  1        0.9   -          -
lexp         1.14      1.08  0.9      0.95  -          -
sced         1.23      1.08  0.99     1.04  1.1        -
stor         -         -     1        1.06  1.19       1.38
data         -         1.03  0.9      1.06  1.38       -
time         -         -     0.9      1.11  1.3        1.66
turn         -         0.97  0.92     1.03  0.9        -
virt         -         0.87  1        1.15  1.3        -
cplx         0.7       0.85  1.11     1.15  1.16       1.65
rely         0.75      0.88  1        1.25  1.4        -

The proposed algorithm is validated with two different datasets of NASA projects. According to the evaluation criteria, the efforts produced by the proposed method differ only marginally from the actual project efforts in comparison to the COCOMO generated efforts shown in Figure 2. Most of the results lie near the mean of the MRE for the 63 data values. The other 93 datasets were also used to evaluate the projects with the proposed method (Figure 3 and Figure 4).

Figure 3: Comparison of MRE for NASA 93 projects (x-axis: NASA projects; y-axis: MRE; series: MRE proposed model, MRE COCOMO).

A comparison between the proposed method and COCOMO by MMRE is made in Table 4. The proposed method has an average error of 0.27 with respect to the actual efforts, whereas COCOMO produces a somewhat higher error of 0.36. The proposed model also works efficiently for the other 93 datasets.

Essentially, we want to measure the useful functionality produced per unit of time. Productivity is another measurement of the effectiveness of the model. It is a measure of the rate or ratio at which individual software developers involved in software development produce software and associated documentation.

Higher productivity reflects better quality achievement in project development. The proposed method has a productivity of 0.29, which is closer to the productivity of 0.27 obtained with the actual efforts. Relative to the actual productivity, the productivity of the proposed method is 7 percent higher and that of COCOMO is 9 percent lower (Table 5). The difference between the proposed method and COCOMO results is therefore approximately 17.95%.

Figure 4: Comparison of productivity for NASA 63 projects (x-axis: software projects; y-axis: productivity; series: productivity with actual efforts, productivity with COCOMO efforts, productivity with proposed method).

Table 4: The MMRE for two different methods.

MMRE             COCOMO versus actual  Proposed method versus actual
For 63 datasets  0.36                  0.27
For 93 datasets  0.59                  0.56

Table 5: The productivity of various approaches.

                        COCOMO  Proposed method  Actual
Productivity            0.25    0.30             0.28
Difference from actual  0.03    0.02             -

Table 6: MMRE of NASA 63 projects for various project modes.

Project mode  No. of projects (63)  MMRE for proposed method  MMRE for COCOMO
Embedded      27                    0.29                      0.39
Organic       25                    0.28                      0.37
Semidetached  11                    0.22                      0.23

Table 7: MMRE of NASA 93 projects for various project modes.

Project mode  No. of projects (93)  MMRE for proposed method  MMRE for COCOMO
Embedded      21                    0.72                      0.82
Organic       3                     0.8                       0.88
Semidetached  69                    0.51                      0.51

Tables 6 and 7 depict the error in all three categories of project modes for the two datasets. The comparison was made between the results generated by the proposed model and the COCOMO results. We also evaluate the different types of project applications by category; 80% of the total datasets produce results that are better than the COCOMO based results (Table 8).
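
One way to read the 80% figure is as the share of projects that fall in application types where the proposed method has the lower MMRE. The sketch below recomputes that share from the Table 8 values. This interpretation is an assumption, since the text does not state how the figure was derived, and the variable names are illustrative.

```python
# (application type, number of projects, MMRE COCOMO, MMRE proposed method),
# values transcribed from Table 8.
TABLE_8 = [
    ("Application ground", 2, 0.28, 0.25),
    ("Avionics", 11, 0.95, 0.80),
    ("Avionics monitoring", 30, 0.66, 0.55),
    ("Batch data processing", 2, 0.08, 0.12),
    ("Communications", 1, 0.18, 0.05),
    ("Data capture", 3, 0.09, 0.07),
    ("Launch processing", 1, 0.32, 0.46),
    ("Mission planning", 20, 0.38, 0.34),
    ("Monitor control", 8, 0.20, 0.50),
    ("Operating system", 4, 3.82, 3.63),
    ("Real data processing", 3, 0.12, 0.06),
    ("Science", 2, 0.18, 0.41),
    ("Simulation", 4, 0.17, 0.29),
    ("Utility", 2, 0.12, 0.31),
]

total_projects = sum(count for _, count, _, _ in TABLE_8)
# Count projects in categories where the proposed method has the lower MMRE.
better_projects = sum(
    count for _, count, cocomo, proposed in TABLE_8 if proposed < cocomo
)
share_better = 100.0 * better_projects / total_projects

print(f"{better_projects} of {total_projects} projects ({share_better:.1f}%)")
```

Under this reading, 74 of the 93 projects (about 79.6%) belong to application types where the proposed method gives the lower MMRE, which is consistent with the reported 80%.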

PRED was calculated with the two separate approaches, and Table 9 shows that, for the 3 different PRED assumptions, the proposed method produces increases in PRED of approximately 6.665%, 8.01%, and 8.34%, respectively.
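
The reported increases can be checked as relative improvements over the COCOMO values in Table 9. The sketch below assumes that "increase" means (proposed - COCOMO) / COCOMO; because it starts from the rounded table entries, it matches the reported percentages only to within rounding.

```python
# PRED values (percentage of the 63 NASA datasets) transcribed from Table 9,
# keyed by PRED level.
PRED_COCOMO = {10: 23.81, 20: 39.68, 30: 57.14}
PRED_PROPOSED = {10: 25.40, 20: 42.86, 30: 61.91}


def relative_increase(baseline: float, improved: float) -> float:
    """Relative increase of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


increases = {
    level: relative_increase(PRED_COCOMO[level], PRED_PROPOSED[level])
    for level in PRED_COCOMO
}

for level, value in increases.items():
    print(f"PRED({level}): +{value:.2f}%")
```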

Table 8: Description of projects on application basis.

Type of application    No. of projects  MMRE COCOMO  MMRE proposed method
Application ground     2                0.28         0.25
Avionics               11               0.95         0.80
Avionics monitoring    30               0.66         0.55
Batch data processing  2                0.08         0.12
Communications         1                0.18         0.05
Data capture           3                0.09         0.07
Launch processing      1                0.32         0.46
Mission planning       20               0.38         0.34
Monitor control        8                0.20         0.50
Operating system       4                3.82         3.63
Real data processing   3                0.12         0.06
Science                2                0.18         0.41
Simulation             4                0.17         0.29
Utility                2                0.12         0.31

Table 9: PRED calculation at different values for both the models.

                                COCOMO               Proposed method
PRED                            10     20     30     10     20     30
Percentage of 63 NASA datasets  23.81  39.68  57.14  25.4   42.86  61.91

7. Conclusion

Work carried out in this paper explores the interrelationship among different dimensions of data driven software projects, namely, project size and effort. The above-mentioned results demonstrate that applying the proposed method to software effort estimation is by far the most feasible approach for addressing the problem of apprehension and ambiguity existing in software effort drivers. The order of occurrence of the various cost drivers has a significant impact on the overall efforts in project estimation. Small adjustments to the COCOMO cost drivers bring significant improvements in the quality criteria applied to the proposed approach. The proposed method produces tuned values of the cost drivers, which are effective enough to improve the productivity of the projects. Prediction at different levels of MRE for each project reflects the percentage of projects with the desired accuracy. Furthermore, this model is validated on two different datasets, on which it shows better estimation accuracy than COCOMO 81 for the NASA 63 and NASA 93 datasets. The utilization of the proposed algorithm for other applications in the software engineering field can also be explored in the future.

Conflict of Interests

The authors certify that there is no actual or potential conflict of interests in relation to this paper. The American company TRW Systems Inc. has been referred to as the company where Barry W. Boehm, the developer of COCOMO, worked.

References

[1] K. M. Furulund and K. Moløkken-Østvold, "Increasing software effort estimation accuracy - using experience data, estimation models and checklists," in Proceedings of the 7th International Conference on Quality Software (QSIC '07), pp. 342–347, Portland, OR, USA, October 2007.

[2] Q. Alam, P. Bhatia, and S. Sarwar, Systematic Review of Effort Estimation and Cost Estimation, Institute of Management Studies, Roorkee, India, 2012.

[3] J. J. Dolado, On the Problem of the Software Cost Function, Facultad de Informatica, Universidad del Pais Vasco-Euskal Herriko Unibertsitatea, Gipuzkoa, Spain, 2000.

[4] K. Molokken and M. Jorgensen, "A review of software surveys on software effort estimation," in Proceedings of the International Symposium on Empirical Software Engineering (ISESE '03), pp. 220–230, 2003.

[5] F. Ferrucci, C. Gravino, R. Oliveto, and F. Sarro, "Genetic programming for effort estimation: an analysis of the impact of different fitness functions," in Proceedings of the 2nd International Symposium on Search Based Software Engineering (SSBSE '10), pp. 89–98, IEEE Computer Society, DMI, University of Salerno, Benevento, Italy, October 2010.

[6] A. F. Sheta, "Estimation of the COCOMO model parameters using genetic algorithms for NASA software projects," Journal of Computer Science, vol. 2, no. 2, pp. 118–123, 2006.

[7] B. W. Boehm, Software Engineering Economics, Prentice Hall, IEEE, 1984.

[8] J. Magne and M. Shepperd, "A systematic review of software development cost estimation studies," IEEE Transactions on Software Engineering, vol. 33, no. 1, pp. 33–53, 2007.

[9] P. L. Braga, A. L. I. Oliveira, and S. R. L. Meira, "A GA-based feature selection and parameters optimization for support vector regression applied to software effort estimation," in Proceedings of the 23rd Annual ACM Symposium on Applied Computing (SAC '08), pp. 1788–1792, Ceara, Brazil, March 2008.

[10] M. Harman and B. F. Jones, "Search-based software engineering," Information and Software Technology, vol. 43, no. 14, pp. 833–839, 2001.

[11] J. Clarke, J. J. Dolado, M. Harman et al., "Reformulating software engineering as a search problem," IEE Proceedings Software, vol. 150, no. 3, pp. 161–175, 2003.

[12] M. Jørgensen and S. Grimstad, "Avoiding irrelevant and misleading information when estimating development effort," IEEE Software, vol. 25, no. 3, pp. 78–83, 2008.

[13] A. L. Lederer and J. Prasad, "A causal model for software cost estimating error," IEEE Transactions on Software Engineering, vol. 24, no. 2, pp. 137–148, 1998.


[14] S. Basha and P. Dhavachelvan, "Analysis of empirical software effort estimation models," International Journal of Computer Science and Information Security, vol. 7, no. 3, pp. 68–77, 2010.

[15] B. L. Barber, Investigative search of quality historical software support cost data and software support cost-related data [M.S. thesis], 1991.

[16] N. H. Chiu and S. J. Huang, "The adjusted analogy-based software effort estimation based on similarity distances," Journal of Systems and Software, vol. 80, no. 4, pp. 628–640, 2007.

[17] G. Kadoda and M. Shepperd, "Using simulation to evaluate prediction techniques," in Proceedings of the 7th International Software Metrics Symposium (METRICS '01), pp. 349–359, IEEE Press, London, UK, 2001.

[18] M. J. Shepperd and G. F. Kadoda, "Comparing software prediction techniques using simulation," IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 1014–1022, 2001.

[19] M. J. Shepperd and C. Schofield, "Estimating software project effort using analogies," IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 736–743, 1997.

[20] M. Jørgensen and D. I. K. Sjøberg, "The impact of customer expectation on software development effort estimates," International Journal of Project Management, vol. 22, no. 4, pp. 317–325, 2004.

[21] J. Kaczmarek and M. Kucharski, "Size and effort estimation for applications written in Java," Information and Software Technology, vol. 46, no. 9, pp. 589–601, 2004.

[22] R. Jeffery, M. Ruhe, and I. Wieczorek, "Using public domain metrics to estimate software development effort," in Proceedings of the 7th International Software Metrics Symposium (METRICS '01), pp. 16–27, IEEE Computer Society, Washington, DC, USA, April 2001.

[23] G. H. Subramanian, P. C. Pendharkar, and M. Wallace, "An empirical study of the effect of complexity, platform, and program type on software development effort of business applications," Empirical Software Engineering, vol. 11, no. 4, pp. 541–553, 2006.

[24] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, chapter 1–8, Addison-Wesley, New York, NY, USA, 1989.

[25] A. Heiat, "Comparison of artificial neural network and regression models for estimating software development effort," Information and Software Technology, vol. 44, no. 15, pp. 911–922, 2002.

[26] K. Srinivasan and D. Fisher, "Machine learning approaches to estimating software development effort," IEEE Transactions on Software Engineering, vol. 21, no. 2, pp. 126–137, 1995.

[27] S. J. Huang, C. Y. Lin, and N. H. Chiu, "Fuzzy decision tree approach for embedding risk assessment information into software cost estimation model," Journal of Information Science and Engineering, vol. 22, no. 2, pp. 297–313, 2006.

[28] M. van Genuchten and H. Koolen, "On the use of software cost models," Information and Management, vol. 21, no. 1, pp. 37–44, 1991.

[29] A. J. Albrecht and J. E. Gaffney, "Software function, source lines of code, and development effort prediction: a software science validation," IEEE Transactions on Software Engineering, vol. 9, no. 6, pp. 639–648, 1983.

[30] I. Attarzadeh and S. H. Ow, "A novel algorithmic cost estimation model based on soft computing technique," Journal of Computer Science, vol. 6, no. 2, pp. 117–125, 2010.

[31] F. J. Heemstra, Software Cost Estimation Models, University of Technology, Department of Industrial Engineering, IEEE, 1990.

[32] M. Jørgensen, B. Boehm, and S. Rifkin, "Software development effort estimation: formal models or expert judgment?" IEEE Software, vol. 26, no. 2, pp. 14–19, 2009.

[33] Y. F. Li, M. Xie, and T. N. Goh, "A study of genetic algorithm for project selection for analogy based software cost estimation," in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM '07), pp. 1256–1260, Singapore, December 2007.

[34] H. Liu and L. Yu, "Toward integrating feature selection algorithms for classification and clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.

[35] A. Kumar, S. Tiwari, K. K. Mishra, and A. K. Misra, "Generation of efficient test data using path selection strategy with elitist GA in regression testing," in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT '10), vol. 9, pp. 389–393, Chengdu, China, July 2010.

[36] K. K. Mishra, S. Tiwari, A. Kumar, and A. K. Misra, "An approach for mutation testing using elitist genetic algorithm," in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT '10), vol. 5, pp. 426–429, Chengdu, China, July 2010.

[37] S. Sarmady, An Investigation on Genetic Algorithm Parameters, P-COM000507(R), P-COM008807, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia, 2007.

[38] K. F. Man, K. S. Tang, and S. Kwong, Genetic Algorithms: Concepts and Designs, Chapter 1–10, Springer, New York, NY, USA, 2001.

[39] L. C. Briand, K. El-Emam, and I. Wieczorek, "Explaining the cost of European space and military projects," in Proceedings of the International Conference on Software Engineering (ICSE '99), pp. 303–312, ACM Press, May 1999.

[40] L. C. Briand, T. Langley, and I. Wieczorek, "Replicated assessment and comparison of common software cost modeling techniques," in Proceedings of the International Conference on Software Engineering (ICSE '22), pp. 377–386, ACM Press, June 2000.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article Tuning of Cost Drivers by Significance

Advances in Software Engineering 7

Table 3 Proposed algorithm based cost drivers

Cost drivers Very low Low Nominal High Very high Extra highacap 146 119 09 086 071pcap 142 09 1 086 07aexp 129 14 1 091 082modp 138 092 1 091 082tool 124 11 099 093 083vexp 138 103 1 09lexp 114 108 09 095sced 123 108 099 104 11stor 1 106 119 138data 103 09 106 138time 09 111 13 166turn 097 092 103 09virt 087 1 115 13cplx 07 085 111 115 116 165rely 075 088 1 125 14

0

2

4

6

8

10

12

1 7 13 19 25 31 37 43 49 55 61 67 73 79 85 91

MRE

NASA projects

MRE proposed modelMRE COCOMO

Figure 3 Comparison of MRE for NASA 93 projects

generated efforts shown in Figure 2 Most of the results arekept near to the mean of MRE for 63 data values Other93 datasets were also used to evaluate the projects with theproposed method (Figure 3 and Figure 4)

A comparison is made between proposed method andother estimation methods by MMRE in Table 4 Proposedmethod is having average error 027 with actual efforts andCOCOMO produces a bit higher percentage of error withactual efforts Proposedmodel is working efficiently for other93 datasets as well

Essentially we want to measure useful functionalityproduced per time unit Productivity is anothermeasurement

of effectiveness of the model It is a measure of the rateor ratio at which individual software developers involvedin software development produce software and associateddocumentation

Higher productivity reflects the better quality achieve-ment for the project development Proposed method ishaving productivity 029 which is closer to the actual efforts027 as productivity Seven percent of proposed methodproductivity is increased and 9 percent of COCOMO pro-ductivity is decreased in comparison with actual productivity(Table 5) So the percentage of difference between proposedmethod and COCOMO results is approximately 1795

8 Advances in Software Engineering

0

02

04

06

08

1

12

14

1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61

Prod

uctiv

ity

Software projects

Productivity with actual effortsProductivity with COCOMO effortsProductivity with proposed method

Figure 4 Comparison of productivity for NASA 63 projects

Table 4 The MMRE for two different methods

MMRE (for 63 datasets) MMRE (for 93 datasets)COCOMO versus actual Proposed method versus actual COCOMO versus actual Proposed method versus actual036 027 059 056

Table 5 The productivity of various approaches

Productivity(COCOMO)

Productivity(proposed method)

Productivity(actual)

025 030 028Differencefrom actual 003 002

Table 6 MMRE of NASA 63 projects for various project modes

Project mode No ofprojects (63)

MMRE forproposedmethod

MMRE forCOCOMO

Embedded 27 029 039Organic 25 028 037Semidetached 11 022 023

Table 7 MMRE of NASA 93 projects for various project modes

Project mode No ofprojects (93)

MMRE forproposedmethod

MMRE forCOCOMO

Embedded 21 072 082Organic 3 08 088Semidetached 69 051 051

Tables 6 and 7 depict the presence of error in all threecategories of projectmodes for two different types of datasetsThe comparison was made between proposed model gener-ated results versus COCOMO results We also evaluate thedifferent type of project application categorically 80 of totaldatasets are producing the results which are better than theCOCOMO based results (Table 8)

PRED was calculated with the two separate approachesand Table 9 depicts that for 3 different PRED assumptionsproposedmethod is producing approximately 6665 801and 834 increase in PRED respectively

7 Conclusion

Work carried out in the paper explores the inter-relationshipamong different dimensions of data driven software projectsnamely project size and effort The above-mentioned resultsdemonstrate that applying proposed method to the softwareeffort estimation is by far the most feasible approach foraddressing the problem of apprehension and ambiguityexisting in software effort drivers Order of occurrence ofvarious cost drivers has a significant impact on overall effortsin project estimation Small adjustments to the COCOMOcost drivers bring significant improvements to the qualitycriteria applied to the proposed approach Proposed methodis producing tuned values of the cost drivers which areeffective enough to improve the productivity of the projectsPrediction at different levels of MRE for each project reflects

Advances in Software Engineering 9

Table 8 Description of projects on application basis

Type of application No of projects MMRE COCOMO MMRE proposed methodApplication ground 2 028 025Avionics 11 095 080Avionics monitoring 30 066 055Batch data processing 2 008 012Communications 1 018 005Data capture 3 009 007Launch processing 1 032 046Mission planning 20 038 034Monitor control 8 020 050Operating system 4 382 363Real data processing 3 012 006Science 2 018 041Simulation 4 017 029Utility 2 012 031

Table 9 Pred calculation at different values for both the models

PREDCOCOMO Proposed method

10 20 30 10 20 30Percentage of 63 NASA datasets 2381 3968 5714 254 4286 6191

the percentage of projects with desired accuracy Further-more this model is validated on two different datasets whichrepresents better estimation accuracy as compared to theCOCOMO 81 based NASA 63 and NASA 93 datasets Theutilization of proposed algorithm for other applications in thesoftware engineering field can also be explored in the future

Conflict of Interests

The authors certify that there is no actual or potential conflictof interests in relation to this paper The American CompanyTRWSystems Inc has been referred to as the company whereBarry W Boehm the developer of COCOMO worked

References

[1] K M Furulund and K Moloslashkken-Oslashstvold ldquoIncreasing soft-ware effort estimation accuracymdashusing experience data esti-mationmodels and checklistsrdquo in Proceedings of the 7th Interna-tional Conference on Quality Software (QSIC rsquo07) pp 342ndash347Portland OR USA October 2007

[2] Q Alam P Bhatia and S Sarwar Systematic Review of EffortEstimation and Cost Estimation Institute of Management Stud-ies Roorkee India 2012

[3] J J Dolado On the Problem of the Software Cost FunctionFacultad de Informatica Universidad del Pais Vasco-EuskalHerriko Unibertsitatea Gipuzkoa Spain 2000

[4] K Molokken and M Jorgensen ldquoA review of software surveyson software effort estimationrdquo inProceedings of the InternationalSymposium on Empirical Software Engineering (ISESE rsquo03) pp220ndash230 2003

[5] F Ferrucci C Gravino R Oliveto and F Sarro ldquoGenetic pro-gramming for effort estimation an analysis of the impact of dif-ferent fitness functionsrdquo in Proceedings of the 2nd InternationalSymposium on Search Based Software Engineering (SSBSE rsquo10)pp 89ndash98 IEEE Computer Society DMI University of SalernoBenevento Italy October 2010

[6] A F Sheta ldquoEstimation of the COCOMO model parametersusing genetic algorithms for NASA software projectsrdquo Journalof Computer Science vol 2 no 2 pp 118ndash123 2006

[7] B W Boehm Software Engineering Economics Prentice HallIEEE 1984

[8] J Magne and M Shepperd ldquoA Systematic Review Of SoftwareDevelopment Cost Estimation Studiesrdquo IEEE Transactions onSoftware Engineering vol 33 no 1 pp 33ndash53 2007

[9] P L Braga A L I Oliveira and S R L Meira ldquoA GA-based feature selection andparameters optimization for supportvector regression applied to software effort estimationrdquo inProceedings of the 23rd Annual ACM Symposium on AppliedComputing (SAC rsquo08) pp 1788ndash1792 Ceara Brazil March 2008

[10] M Harman and B F Jones ldquoSearch-based software engineer-ingrdquo Information and Software Technology vol 43 no 14 pp833ndash839 2001

[11] J Clarke J J DoladoMHarman et al ldquoReformulating softwareengineering as a search problemrdquo IEE Proceedings Software vol150 no 3 pp 161ndash175 2003

[12] M Joslashrgensen and S Grimstad ldquoAvoiding irrelevant and mis-leading informationwhen estimating development effortrdquo IEEESoftware vol 25 no 3 pp 78ndash83 2008

[13] A L Lederer and J Prasad ldquoA causal model for software costestimating errorrdquo IEEE Transactions on Software Engineeringvol 24 no 2 pp 137ndash148 1998

10 Advances in Software Engineering

[14] S Basha and P Dhavachelvan ldquoAnalysis of empirical softwareeffort estimation modelsrdquo International Journal of ComputerScience and Information Security vol 7 no 3 pp 68ndash77 2010

[15] B L Barber Investigative search of quality historical softwaresupport cost data and software support cost-related data [MSthesis] 1991

[16] N H Chiu and S J Huang ldquoThe adjusted analogy-based soft-ware effort estimation based on similarity distancesrdquo Journal ofSystems and Software vol 80 no 4 pp 628ndash640 2007

[17] G Kadoda and M Shepperd ldquoUsing simulation to evaluateprediction techniquesrdquo in Proceedings of the 7th InternationalSoftware Metrics Symposium (METRICS rsquo01) pp 349ndash359 IEEEPress London UK 2001

[18] M J Shepperd and G F Kadoda ldquoComparing software predic-tion techniques using simulationrdquo IEEE Transactions on Soft-ware Engineering vol 27 no 11 pp 1014ndash1022 2001

[19] M J Shepperd and C Schofield ldquoEstimating software projecteffort using analogiesrdquo IEEE Transactions on Software Engineer-ing vol 23 no 11 pp 736ndash743 1997

[20] M Jooslashrgensen and D I K Sjoslashberg ldquoThe impact of customerexpectation on software development effort estimatesrdquo Interna-tional Journal of Project Management vol 22 no 4 pp 317ndash3252004

[21] J Kaczmarek and M Kucharski ldquoSize and effort estimation forapplications written in Javardquo Information and Software Technol-ogy vol 46 no 9 pp 589ndash601 2004

[22] R Jeffery M Ruhe and I Wieczorek ldquoUsing public domainmetrics to estimate software development effortrdquo in Proceedingsof the 7th International Software Metrics Symposium (METRICSrsquo01) pp 16ndash27 IEEE Computer Society Washington DC USAApril 2001

[23] G H Subramanian P C Pendharkar and MWallace ldquoAn em-pirical study of the effect of complexity platform and programtype on software development effort of business applicationsrdquoEmpirical Software Engineering vol 11 no 4 pp 541ndash553 2006

[24] D E GoldbergGenetic Algorithms in Search Optimization andMachine Learning chapter 1ndash8 Addison-Wesley New York NYUSA 1989

[25] A Heiat ldquoComparison of artificial neural network and regres-sion models for estimating software development effortrdquo Infor-mation and Software Technology vol 44 no 15 pp 911ndash9222002

[26] K Srinivasan and D Fisher ldquoMachine learning approaches toestimating software development effortrdquo IEEE Transactions onSoftware Engineering vol 21 no 2 pp 126ndash137 1995

[27] S J Huang C Y Lin and N H Chiu ldquoFuzzy decision treeapproach for embedding risk assessment information into soft-ware cost estimation modelrdquo Journal of Information Science andEngineering vol 22 no 2 pp 297ndash313 2006

[28] M van Genuchten and H Koolen ldquoOn the use of software costmodelsrdquo Information and Management vol 21 no 1 pp 37ndash441991

[29] A J Albrecht and J E Gaffney ldquoSoftware function source linesof code and development effort prediction a software sciencevalidationrdquo IEEE Transactions on Software Engineering vol 9no 6 pp 639ndash648 1983

[30] I Attarzadeh and SHOw ldquoA novel algorithmic cost estimationmodel based on soft computing techniquerdquo Journal of ComputerScience vol 6 no 2 pp 117ndash125 2010

[31] F J Heemstra Software Cost Estimation Models University ofTechnology Department of Industrial Engineering IEEE 1990

[32] M Joslashrgensen B Boehm and S Rifkin ldquoSoftware developmenteffort estimation formal models or expert judgmentrdquo IEEESoftware vol 26 no 2 pp 14ndash19 2009

[33] Y F Li M Xie and T N Goh ldquoA study of genetic algorithm forproject selection for analogy based software cost estimationrdquo inProceedings of the IEEE International Conference on IndustrialEngineering and EngineeringManagement (IEEM rsquo07) pp 1256ndash1260 Singapore December 2007

[34] H Liu and L Yu ldquoToward integrating feature selection algo-rithms for classification and clusteringrdquo IEEE Transactions onKnowledge and Data Engineering vol 17 no 4 pp 491ndash5022005

[35] A Kumar S Tiwari K KMishra andA KMisra ldquoGenerationof efficient test data using path selection strategy with elitist GAin regression testingrdquo inProceedings of the 3rd IEEE Internation-al Conference on Computer Science and Information Technology(ICCSIT rsquo10) vol 9 pp 389ndash393 Chengdu China July 2010

[36] K K Mishra S Tiwari A Kumar and A K Misra ldquoAnapproach for mutation testing using elitist genetic algorithmrdquoin Proceedings of the 3rd IEEE International Conference on Com-puter Science and Information Technology (ICCSIT rsquo10) vol 5pp 426ndash429 Chengdu China July 2010

[37] S Sarmady An Investigation on Genetic Algorithm ParametersP-COM000507(R) P-COM008807 School of Computer Sci-ences Universiti Sains Malaysia Penang Malaysia 2007

[38] K F Man K S Tang and S Kwong Genetic Algorithms Con-cepts and Designs Chapter 1ndash10 Springer New York NY USA2001

[39] L C Briand K El-Emam and I Wieczorek ldquoExplaining thecost of European space and military projectsrdquo in Proceedings ofthe International Conference on Software Engineering (ICSE rsquo99)pp 303ndash312 ACM Press May 1999

[40] L C Briand T Langley and I Wieczorek ldquoReplicated assess-ment and comparison of common software cost modelingtechniquesrdquo in Proceedings of the International Conference onSoftware Engineering (ICSE rsquo22) pp 377ndash386 ACM Press June2000



Figure 4: Comparison of productivity for NASA 63 projects. [Plot not reproduced: productivity (y-axis, 0 to 1.4) against software projects (x-axis), with three series: productivity with actual efforts, productivity with COCOMO efforts, and productivity with proposed method.]

Table 4: The MMRE for two different methods.

| | COCOMO versus actual | Proposed method versus actual |
| --- | --- | --- |
| MMRE (for 63 datasets) | 0.36 | 0.27 |
| MMRE (for 93 datasets) | 0.59 | 0.56 |

Table 5: The productivity of various approaches.

| | Productivity (COCOMO) | Productivity (proposed method) | Productivity (actual) |
| --- | --- | --- | --- |
| Productivity | 0.25 | 0.30 | 0.28 |
| Difference from actual | 0.03 | 0.02 | |

Table 6: MMRE of NASA 63 projects for various project modes.

| Project mode | No. of projects (63) | MMRE for proposed method | MMRE for COCOMO |
| --- | --- | --- | --- |
| Embedded | 27 | 0.29 | 0.39 |
| Organic | 25 | 0.28 | 0.37 |
| Semidetached | 11 | 0.22 | 0.23 |

Table 7: MMRE of NASA 93 projects for various project modes.

| Project mode | No. of projects (93) | MMRE for proposed method | MMRE for COCOMO |
| --- | --- | --- | --- |
| Embedded | 21 | 0.72 | 0.82 |
| Organic | 3 | 0.8 | 0.88 |
| Semidetached | 69 | 0.51 | 0.51 |
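As a rough consistency check, the per-mode MMRE values can be pooled back into a dataset-level figure. The sketch below assumes the standard definition of MMRE (the mean of |actual - estimated| / actual over all projects) and weights each mode by its number of projects; the function names `mmre` and `pooled_mmre` are illustrative and not taken from the paper. Because the published per-mode values are rounded to two decimals, the pooled results should only be expected to agree with the dataset-level MMRE values to within about 0.01.

```python
from typing import List, Sequence, Tuple


def mmre(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Mean magnitude of relative error: mean(|actual - estimated| / actual)."""
    if len(actual) != len(estimated) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)


def pooled_mmre(groups: List[Tuple[int, float]]) -> float:
    """Project-weighted mean of per-group MMREs given as (n_projects, mmre)."""
    total = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total


# (number of projects, MMRE) per mode: embedded, organic, semidetached.
nasa63_proposed = [(27, 0.29), (25, 0.28), (11, 0.22)]
nasa63_cocomo = [(27, 0.39), (25, 0.37), (11, 0.23)]
nasa93_proposed = [(21, 0.72), (3, 0.80), (69, 0.51)]
nasa93_cocomo = [(21, 0.82), (3, 0.88), (69, 0.51)]

for name, groups in [
    ("NASA 63, proposed", nasa63_proposed),
    ("NASA 63, COCOMO", nasa63_cocomo),
    ("NASA 93, proposed", nasa93_proposed),
    ("NASA 93, COCOMO", nasa93_cocomo),
]:
    print(f"{name}: pooled MMRE = {pooled_mmre(groups):.3f}")
```

The pooled values come out close to the dataset-level MMREs reported for the 63 and 93 project datasets (0.27 and 0.36, and 0.56 and 0.59, respectively), which suggests the per-mode and overall figures are mutually consistent.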

Tables 6 and 7 show the error in all three categories of project mode for the two datasets, comparing the results generated by the proposed model with the COCOMO results. We also evaluated the projects categorically by type of application: 80% of the total datasets produce results that are better than the COCOMO-based results (Table 8).
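One possible reading of the 80% figure, offered here only as an assumption since the text does not state how it was derived, is the share of projects that fall in application categories where the proposed method has a lower MMRE than COCOMO. The sketch below recomputes that share from the Table 8 values; the variable names are illustrative.

```python
# (application type, no. of projects, MMRE COCOMO, MMRE proposed method),
# transcribed from Table 8.
table8 = [
    ("Application ground", 2, 0.28, 0.25),
    ("Avionics", 11, 0.95, 0.80),
    ("Avionics monitoring", 30, 0.66, 0.55),
    ("Batch data processing", 2, 0.08, 0.12),
    ("Communications", 1, 0.18, 0.05),
    ("Data capture", 3, 0.09, 0.07),
    ("Launch processing", 1, 0.32, 0.46),
    ("Mission planning", 20, 0.38, 0.34),
    ("Monitor control", 8, 0.20, 0.50),
    ("Operating system", 4, 3.82, 3.63),
    ("Real data processing", 3, 0.12, 0.06),
    ("Science", 2, 0.18, 0.41),
    ("Simulation", 4, 0.17, 0.29),
    ("Utility", 2, 0.12, 0.31),
]

total_projects = sum(n for _, n, _, _ in table8)
# Projects in categories where the proposed method's MMRE is lower.
better_projects = sum(n for _, n, cocomo, proposed in table8 if proposed < cocomo)
share = 100 * better_projects / total_projects

print(f"{better_projects} of {total_projects} projects ({share:.1f}%)")
```

Under this reading, 74 of the 93 projects (about 79.6%) belong to categories in which the proposed method performs better, which is in line with the stated 80%. Note that this counts projects by category, not individual projects whose own estimate improved.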

PRED was calculated with the two separate approaches, and Table 9 shows that, for the three different PRED levels, the proposed method produces relative increases in PRED of approximately 6.665%, 8.01%, and 8.34%, respectively.
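The PRED figures and the quoted increases can be illustrated with a short sketch. It assumes the common definition of PRED(l) as the percentage of projects whose MRE is at most l percent, and it interprets the quoted increases as relative to the COCOMO value; both the helper names and the small example inputs are hypothetical.

```python
from typing import Sequence


def pred(actual: Sequence[float], estimated: Sequence[float], level: float) -> float:
    """Percentage of projects whose MRE (in percent) is at most `level`."""
    hits = sum(
        1 for a, e in zip(actual, estimated) if abs(a - e) / a * 100 <= level
    )
    return 100 * hits / len(actual)


def relative_increase(baseline: float, improved: float) -> float:
    """Relative increase of `improved` over `baseline`, in percent."""
    return 100 * (improved - baseline) / baseline


# Hypothetical efforts: the MREs are 5%, 15%, 25%, and 60%.
actual = [100.0, 100.0, 100.0, 100.0]
estimated = [95.0, 85.0, 75.0, 40.0]
print([pred(actual, estimated, level) for level in (10, 20, 30)])

# PRED level -> (COCOMO, proposed method), as percentages of the 63 NASA projects.
table9 = {10: (23.81, 25.4), 20: (39.68, 42.86), 30: (57.14, 61.91)}
for level, (cocomo, proposed) in table9.items():
    print(f"PRED({level}): {relative_increase(cocomo, proposed):.2f}% increase")
```

With the rounded table values, the relative increases come out at roughly 6.7%, 8.0%, and 8.3%, close to the figures quoted in the text; the small differences stem from rounding in the published percentages.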

Table 8: Description of projects on application basis.

| Type of application | No. of projects | MMRE COCOMO | MMRE proposed method |
| --- | --- | --- | --- |
| Application ground | 2 | 0.28 | 0.25 |
| Avionics | 11 | 0.95 | 0.80 |
| Avionics monitoring | 30 | 0.66 | 0.55 |
| Batch data processing | 2 | 0.08 | 0.12 |
| Communications | 1 | 0.18 | 0.05 |
| Data capture | 3 | 0.09 | 0.07 |
| Launch processing | 1 | 0.32 | 0.46 |
| Mission planning | 20 | 0.38 | 0.34 |
| Monitor control | 8 | 0.20 | 0.50 |
| Operating system | 4 | 3.82 | 3.63 |
| Real data processing | 3 | 0.12 | 0.06 |
| Science | 2 | 0.18 | 0.41 |
| Simulation | 4 | 0.17 | 0.29 |
| Utility | 2 | 0.12 | 0.31 |

Table 9: PRED calculation at different values for both the models (percentage of 63 NASA datasets).

| PRED level | COCOMO | Proposed method |
| --- | --- | --- |
| 10 | 23.81 | 25.4 |
| 20 | 39.68 | 42.86 |
| 30 | 57.14 | 61.91 |

7. Conclusion

The work carried out in this paper explores the interrelationship among different dimensions of data-driven software projects, namely, project size and effort. The results above indicate that applying the proposed method to software effort estimation is a feasible approach for addressing the uncertainty and ambiguity that exist in software effort drivers. The order of occurrence of the various cost drivers has a significant impact on the overall effort in project estimation. Small adjustments to the COCOMO cost drivers bring significant improvements in the quality criteria applied to the proposed approach. The proposed method produces tuned values of the cost drivers, which are effective enough to improve the productivity of the projects. Prediction at different levels of MRE for each project reflects the percentage of projects with the desired accuracy. Furthermore, the model is validated on two different datasets, NASA 63 and NASA 93, on which it shows better estimation accuracy than COCOMO 81. The use of the proposed algorithm for other applications in the software engineering field can also be explored in the future.
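To make concrete why small adjustments to cost drivers matter, the sketch below uses the standard intermediate COCOMO 81 form, effort = a * KLOC^b * EAF, where EAF is the product of the cost driver multipliers. This is the textbook model with Boehm's published mode coefficients, not the tuned model of this paper, and the multiplier values in the example are hypothetical placeholders.

```python
import math
from typing import Dict, Sequence, Tuple

# Intermediate COCOMO 81 coefficients (a, b) per project mode.
COEFFS: Dict[str, Tuple[float, float]] = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def cocomo_effort(kloc: float, mode: str, multipliers: Sequence[float]) -> float:
    """Effort in person-months: a * KLOC^b * product of cost driver multipliers."""
    a, b = COEFFS[mode]
    return a * kloc ** b * math.prod(multipliers)


# Hypothetical multipliers for three cost drivers (the rest left nominal at 1.0).
original = [1.15, 0.91, 1.08]
tuned = [1.10, 0.91, 1.08]  # only the first driver is adjusted

e_original = cocomo_effort(50.0, "embedded", original)
e_tuned = cocomo_effort(50.0, "embedded", tuned)
print(f"original: {e_original:.1f} PM, tuned: {e_tuned:.1f} PM")
print(f"relative change: {100 * (e_tuned - e_original) / e_original:.2f}%")
```

Because the multipliers enter as a product, a change in any single driver scales the estimated effort by the same ratio, here 1.10 / 1.15, regardless of project size or mode.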

Conflict of Interests

The authors certify that there is no actual or potential conflict of interests in relation to this paper. The American company TRW Systems Inc. has been referred to as the company where Barry W. Boehm, the developer of COCOMO, worked.

References

[1] K. M. Furulund and K. Moløkken-Østvold, “Increasing software effort estimation accuracy - using experience data, estimation models and checklists,” in Proceedings of the 7th International Conference on Quality Software (QSIC ’07), pp. 342–347, Portland, OR, USA, October 2007.

[2] Q. Alam, P. Bhatia, and S. Sarwar, Systematic Review of Effort Estimation and Cost Estimation, Institute of Management Studies, Roorkee, India, 2012.

[3] J. J. Dolado, On the Problem of the Software Cost Function, Facultad de Informatica, Universidad del Pais Vasco-Euskal Herriko Unibertsitatea, Gipuzkoa, Spain, 2000.

[4] K. Molokken and M. Jorgensen, “A review of software surveys on software effort estimation,” in Proceedings of the International Symposium on Empirical Software Engineering (ISESE ’03), pp. 220–230, 2003.

[5] F. Ferrucci, C. Gravino, R. Oliveto, and F. Sarro, “Genetic programming for effort estimation: an analysis of the impact of different fitness functions,” in Proceedings of the 2nd International Symposium on Search Based Software Engineering (SSBSE ’10), pp. 89–98, IEEE Computer Society, DMI, University of Salerno, Benevento, Italy, October 2010.

[6] A. F. Sheta, “Estimation of the COCOMO model parameters using genetic algorithms for NASA software projects,” Journal of Computer Science, vol. 2, no. 2, pp. 118–123, 2006.

[7] B. W. Boehm, Software Engineering Economics, Prentice Hall, IEEE, 1984.

[8] J. Magne and M. Shepperd, “A systematic review of software development cost estimation studies,” IEEE Transactions on Software Engineering, vol. 33, no. 1, pp. 33–53, 2007.

[9] P. L. Braga, A. L. I. Oliveira, and S. R. L. Meira, “A GA-based feature selection and parameters optimization for support vector regression applied to software effort estimation,” in Proceedings of the 23rd Annual ACM Symposium on Applied Computing (SAC ’08), pp. 1788–1792, Ceara, Brazil, March 2008.

[10] M. Harman and B. F. Jones, “Search-based software engineering,” Information and Software Technology, vol. 43, no. 14, pp. 833–839, 2001.

[11] J. Clarke, J. J. Dolado, M. Harman et al., “Reformulating software engineering as a search problem,” IEE Proceedings: Software, vol. 150, no. 3, pp. 161–175, 2003.

[12] M. Jørgensen and S. Grimstad, “Avoiding irrelevant and misleading information when estimating development effort,” IEEE Software, vol. 25, no. 3, pp. 78–83, 2008.

[13] A. L. Lederer and J. Prasad, “A causal model for software cost estimating error,” IEEE Transactions on Software Engineering, vol. 24, no. 2, pp. 137–148, 1998.


[14] S. Basha and P. Dhavachelvan, “Analysis of empirical software effort estimation models,” International Journal of Computer Science and Information Security, vol. 7, no. 3, pp. 68–77, 2010.

[15] B. L. Barber, Investigative search of quality historical software support cost data and software support cost-related data [M.S. thesis], 1991.

[16] N. H. Chiu and S. J. Huang, “The adjusted analogy-based software effort estimation based on similarity distances,” Journal of Systems and Software, vol. 80, no. 4, pp. 628–640, 2007.

[17] G. Kadoda and M. Shepperd, “Using simulation to evaluate prediction techniques,” in Proceedings of the 7th International Software Metrics Symposium (METRICS ’01), pp. 349–359, IEEE Press, London, UK, 2001.

[18] M. J. Shepperd and G. F. Kadoda, “Comparing software prediction techniques using simulation,” IEEE Transactions on Software Engineering, vol. 27, no. 11, pp. 1014–1022, 2001.

[19] M. J. Shepperd and C. Schofield, “Estimating software project effort using analogies,” IEEE Transactions on Software Engineering, vol. 23, no. 11, pp. 736–743, 1997.

[20] M. Jørgensen and D. I. K. Sjøberg, “The impact of customer expectation on software development effort estimates,” International Journal of Project Management, vol. 22, no. 4, pp. 317–325, 2004.

[21] J. Kaczmarek and M. Kucharski, “Size and effort estimation for applications written in Java,” Information and Software Technology, vol. 46, no. 9, pp. 589–601, 2004.

[22] R. Jeffery, M. Ruhe, and I. Wieczorek, “Using public domain metrics to estimate software development effort,” in Proceedings of the 7th International Software Metrics Symposium (METRICS ’01), pp. 16–27, IEEE Computer Society, Washington, DC, USA, April 2001.

[23] G. H. Subramanian, P. C. Pendharkar, and M. Wallace, “An empirical study of the effect of complexity, platform, and program type on software development effort of business applications,” Empirical Software Engineering, vol. 11, no. 4, pp. 541–553, 2006.

[24] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, chapter 1–8, Addison-Wesley, New York, NY, USA, 1989.

[25] A. Heiat, “Comparison of artificial neural network and regression models for estimating software development effort,” Information and Software Technology, vol. 44, no. 15, pp. 911–922, 2002.

[26] K. Srinivasan and D. Fisher, “Machine learning approaches to estimating software development effort,” IEEE Transactions on Software Engineering, vol. 21, no. 2, pp. 126–137, 1995.

[27] S. J. Huang, C. Y. Lin, and N. H. Chiu, “Fuzzy decision tree approach for embedding risk assessment information into software cost estimation model,” Journal of Information Science and Engineering, vol. 22, no. 2, pp. 297–313, 2006.

[28] M. van Genuchten and H. Koolen, “On the use of software cost models,” Information and Management, vol. 21, no. 1, pp. 37–44, 1991.

[29] A. J. Albrecht and J. E. Gaffney, “Software function, source lines of code, and development effort prediction: a software science validation,” IEEE Transactions on Software Engineering, vol. 9, no. 6, pp. 639–648, 1983.

[30] I. Attarzadeh and S. H. Ow, “A novel algorithmic cost estimation model based on soft computing technique,” Journal of Computer Science, vol. 6, no. 2, pp. 117–125, 2010.

[31] F. J. Heemstra, Software Cost Estimation Models, University of Technology, Department of Industrial Engineering, IEEE, 1990.

[32] M. Jørgensen, B. Boehm, and S. Rifkin, “Software development effort estimation: formal models or expert judgment?” IEEE Software, vol. 26, no. 2, pp. 14–19, 2009.

[33] Y. F. Li, M. Xie, and T. N. Goh, “A study of genetic algorithm for project selection for analogy based software cost estimation,” in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM ’07), pp. 1256–1260, Singapore, December 2007.

[34] H. Liu and L. Yu, “Toward integrating feature selection algorithms for classification and clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.

[35] A. Kumar, S. Tiwari, K. K. Mishra, and A. K. Misra, “Generation of efficient test data using path selection strategy with elitist GA in regression testing,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 9, pp. 389–393, Chengdu, China, July 2010.

[36] K. K. Mishra, S. Tiwari, A. Kumar, and A. K. Misra, “An approach for mutation testing using elitist genetic algorithm,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 5, pp. 426–429, Chengdu, China, July 2010.

[37] S. Sarmady, An Investigation on Genetic Algorithm Parameters, P-COM000507(R), P-COM008807, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia, 2007.

[38] K. F. Man, K. S. Tang, and S. Kwong, Genetic Algorithms: Concepts and Designs, Chapter 1–10, Springer, New York, NY, USA, 2001.

[39] L. C. Briand, K. El-Emam, and I. Wieczorek, “Explaining the cost of European space and military projects,” in Proceedings of the International Conference on Software Engineering (ICSE ’99), pp. 303–312, ACM Press, May 1999.

[40] L. C. Briand, T. Langley, and I. Wieczorek, “Replicated assessment and comparison of common software cost modeling techniques,” in Proceedings of the International Conference on Software Engineering (ICSE ’22), pp. 377–386, ACM Press, June 2000.


Page 9: Research Article Tuning of Cost Drivers by Significance

Advances in Software Engineering 9

Table 8 Description of projects on application basis

Type of application No of projects MMRE COCOMO MMRE proposed methodApplication ground 2 028 025Avionics 11 095 080Avionics monitoring 30 066 055Batch data processing 2 008 012Communications 1 018 005Data capture 3 009 007Launch processing 1 032 046Mission planning 20 038 034Monitor control 8 020 050Operating system 4 382 363Real data processing 3 012 006Science 2 018 041Simulation 4 017 029Utility 2 012 031

Table 9 Pred calculation at different values for both the models

PREDCOCOMO Proposed method

10 20 30 10 20 30Percentage of 63 NASA datasets 2381 3968 5714 254 4286 6191

the percentage of projects with desired accuracy Further-more this model is validated on two different datasets whichrepresents better estimation accuracy as compared to theCOCOMO 81 based NASA 63 and NASA 93 datasets Theutilization of proposed algorithm for other applications in thesoftware engineering field can also be explored in the future

Conflict of Interests

The authors certify that there is no actual or potential conflictof interests in relation to this paper The American CompanyTRWSystems Inc has been referred to as the company whereBarry W Boehm the developer of COCOMO worked

References

[1] K M Furulund and K Moloslashkken-Oslashstvold ldquoIncreasing soft-ware effort estimation accuracymdashusing experience data esti-mationmodels and checklistsrdquo in Proceedings of the 7th Interna-tional Conference on Quality Software (QSIC rsquo07) pp 342ndash347Portland OR USA October 2007

[2] Q Alam P Bhatia and S Sarwar Systematic Review of EffortEstimation and Cost Estimation Institute of Management Stud-ies Roorkee India 2012

[3] J J Dolado On the Problem of the Software Cost FunctionFacultad de Informatica Universidad del Pais Vasco-EuskalHerriko Unibertsitatea Gipuzkoa Spain 2000

[4] K Molokken and M Jorgensen ldquoA review of software surveyson software effort estimationrdquo inProceedings of the InternationalSymposium on Empirical Software Engineering (ISESE rsquo03) pp220ndash230 2003

[5] F Ferrucci C Gravino R Oliveto and F Sarro ldquoGenetic pro-gramming for effort estimation an analysis of the impact of dif-ferent fitness functionsrdquo in Proceedings of the 2nd InternationalSymposium on Search Based Software Engineering (SSBSE rsquo10)pp 89ndash98 IEEE Computer Society DMI University of SalernoBenevento Italy October 2010

[6] A F Sheta ldquoEstimation of the COCOMO model parametersusing genetic algorithms for NASA software projectsrdquo Journalof Computer Science vol 2 no 2 pp 118ndash123 2006

[7] B W Boehm Software Engineering Economics Prentice HallIEEE 1984

[8] J Magne and M Shepperd ldquoA Systematic Review Of SoftwareDevelopment Cost Estimation Studiesrdquo IEEE Transactions onSoftware Engineering vol 33 no 1 pp 33ndash53 2007

[9] P L Braga A L I Oliveira and S R L Meira ldquoA GA-based feature selection andparameters optimization for supportvector regression applied to software effort estimationrdquo inProceedings of the 23rd Annual ACM Symposium on AppliedComputing (SAC rsquo08) pp 1788ndash1792 Ceara Brazil March 2008

[10] M Harman and B F Jones ldquoSearch-based software engineer-ingrdquo Information and Software Technology vol 43 no 14 pp833ndash839 2001

[11] J Clarke J J DoladoMHarman et al ldquoReformulating softwareengineering as a search problemrdquo IEE Proceedings Software vol150 no 3 pp 161ndash175 2003

[12] M Joslashrgensen and S Grimstad ldquoAvoiding irrelevant and mis-leading informationwhen estimating development effortrdquo IEEESoftware vol 25 no 3 pp 78ndash83 2008

[13] A L Lederer and J Prasad ldquoA causal model for software costestimating errorrdquo IEEE Transactions on Software Engineeringvol 24 no 2 pp 137ndash148 1998

10 Advances in Software Engineering

[14] S Basha and P Dhavachelvan ldquoAnalysis of empirical softwareeffort estimation modelsrdquo International Journal of ComputerScience and Information Security vol 7 no 3 pp 68ndash77 2010

[15] B L Barber Investigative search of quality historical softwaresupport cost data and software support cost-related data [MSthesis] 1991

[16] N H Chiu and S J Huang ldquoThe adjusted analogy-based soft-ware effort estimation based on similarity distancesrdquo Journal ofSystems and Software vol 80 no 4 pp 628ndash640 2007

[17] G Kadoda and M Shepperd ldquoUsing simulation to evaluateprediction techniquesrdquo in Proceedings of the 7th InternationalSoftware Metrics Symposium (METRICS rsquo01) pp 349ndash359 IEEEPress London UK 2001

[18] M J Shepperd and G F Kadoda ldquoComparing software predic-tion techniques using simulationrdquo IEEE Transactions on Soft-ware Engineering vol 27 no 11 pp 1014ndash1022 2001

[19] M J Shepperd and C Schofield ldquoEstimating software projecteffort using analogiesrdquo IEEE Transactions on Software Engineer-ing vol 23 no 11 pp 736ndash743 1997

[20] M Jooslashrgensen and D I K Sjoslashberg ldquoThe impact of customerexpectation on software development effort estimatesrdquo Interna-tional Journal of Project Management vol 22 no 4 pp 317ndash3252004

[21] J Kaczmarek and M Kucharski ldquoSize and effort estimation forapplications written in Javardquo Information and Software Technol-ogy vol 46 no 9 pp 589ndash601 2004

[22] R Jeffery M Ruhe and I Wieczorek ldquoUsing public domainmetrics to estimate software development effortrdquo in Proceedingsof the 7th International Software Metrics Symposium (METRICSrsquo01) pp 16ndash27 IEEE Computer Society Washington DC USAApril 2001

[23] G H Subramanian P C Pendharkar and MWallace ldquoAn em-pirical study of the effect of complexity platform and programtype on software development effort of business applicationsrdquoEmpirical Software Engineering vol 11 no 4 pp 541ndash553 2006

[24] D E GoldbergGenetic Algorithms in Search Optimization andMachine Learning chapter 1ndash8 Addison-Wesley New York NYUSA 1989

[25] A Heiat ldquoComparison of artificial neural network and regres-sion models for estimating software development effortrdquo Infor-mation and Software Technology vol 44 no 15 pp 911ndash9222002

[26] K Srinivasan and D Fisher ldquoMachine learning approaches toestimating software development effortrdquo IEEE Transactions onSoftware Engineering vol 21 no 2 pp 126ndash137 1995

[27] S J Huang C Y Lin and N H Chiu ldquoFuzzy decision treeapproach for embedding risk assessment information into soft-ware cost estimation modelrdquo Journal of Information Science andEngineering vol 22 no 2 pp 297ndash313 2006

[28] M van Genuchten and H Koolen ldquoOn the use of software costmodelsrdquo Information and Management vol 21 no 1 pp 37ndash441991

[29] A J Albrecht and J E Gaffney ldquoSoftware function source linesof code and development effort prediction a software sciencevalidationrdquo IEEE Transactions on Software Engineering vol 9no 6 pp 639ndash648 1983

[30] I Attarzadeh and SHOw ldquoA novel algorithmic cost estimationmodel based on soft computing techniquerdquo Journal of ComputerScience vol 6 no 2 pp 117ndash125 2010

[31] F J Heemstra Software Cost Estimation Models University ofTechnology Department of Industrial Engineering IEEE 1990

[32] M Joslashrgensen B Boehm and S Rifkin ldquoSoftware developmenteffort estimation formal models or expert judgmentrdquo IEEESoftware vol 26 no 2 pp 14ndash19 2009

[33] Y F Li M Xie and T N Goh ldquoA study of genetic algorithm forproject selection for analogy based software cost estimationrdquo inProceedings of the IEEE International Conference on IndustrialEngineering and EngineeringManagement (IEEM rsquo07) pp 1256ndash1260 Singapore December 2007

[34] H Liu and L Yu ldquoToward integrating feature selection algo-rithms for classification and clusteringrdquo IEEE Transactions onKnowledge and Data Engineering vol 17 no 4 pp 491ndash5022005

[35] A Kumar S Tiwari K KMishra andA KMisra ldquoGenerationof efficient test data using path selection strategy with elitist GAin regression testingrdquo inProceedings of the 3rd IEEE Internation-al Conference on Computer Science and Information Technology(ICCSIT rsquo10) vol 9 pp 389ndash393 Chengdu China July 2010

[36] K K Mishra S Tiwari A Kumar and A K Misra ldquoAnapproach for mutation testing using elitist genetic algorithmrdquoin Proceedings of the 3rd IEEE International Conference on Com-puter Science and Information Technology (ICCSIT rsquo10) vol 5pp 426ndash429 Chengdu China July 2010

[37] S Sarmady An Investigation on Genetic Algorithm ParametersP-COM000507(R) P-COM008807 School of Computer Sci-ences Universiti Sains Malaysia Penang Malaysia 2007

[38] K F Man K S Tang and S Kwong Genetic Algorithms Con-cepts and Designs Chapter 1ndash10 Springer New York NY USA2001

[39] L C Briand K El-Emam and I Wieczorek ldquoExplaining thecost of European space and military projectsrdquo in Proceedings ofthe International Conference on Software Engineering (ICSE rsquo99)pp 303ndash312 ACM Press May 1999

[40] L C Briand T Langley and I Wieczorek ldquoReplicated assess-ment and comparison of common software cost modelingtechniquesrdquo in Proceedings of the International Conference onSoftware Engineering (ICSE rsquo22) pp 377ndash386 ACM Press June2000


Page 10: Research Article Tuning of Cost Drivers by Significance

10 Advances in Software Engineering

[14] S Basha and P Dhavachelvan ldquoAnalysis of empirical softwareeffort estimation modelsrdquo International Journal of ComputerScience and Information Security vol 7 no 3 pp 68ndash77 2010

[15] B L Barber Investigative search of quality historical softwaresupport cost data and software support cost-related data [MSthesis] 1991

[16] N H Chiu and S J Huang ldquoThe adjusted analogy-based soft-ware effort estimation based on similarity distancesrdquo Journal ofSystems and Software vol 80 no 4 pp 628ndash640 2007

[17] G Kadoda and M Shepperd ldquoUsing simulation to evaluateprediction techniquesrdquo in Proceedings of the 7th InternationalSoftware Metrics Symposium (METRICS rsquo01) pp 349ndash359 IEEEPress London UK 2001

[18] M J Shepperd and G F Kadoda ldquoComparing software predic-tion techniques using simulationrdquo IEEE Transactions on Soft-ware Engineering vol 27 no 11 pp 1014ndash1022 2001

[19] M J Shepperd and C Schofield ldquoEstimating software projecteffort using analogiesrdquo IEEE Transactions on Software Engineer-ing vol 23 no 11 pp 736ndash743 1997

[20] M Jooslashrgensen and D I K Sjoslashberg ldquoThe impact of customerexpectation on software development effort estimatesrdquo Interna-tional Journal of Project Management vol 22 no 4 pp 317ndash3252004

[21] J Kaczmarek and M Kucharski ldquoSize and effort estimation forapplications written in Javardquo Information and Software Technol-ogy vol 46 no 9 pp 589ndash601 2004

[22] R Jeffery M Ruhe and I Wieczorek ldquoUsing public domainmetrics to estimate software development effortrdquo in Proceedingsof the 7th International Software Metrics Symposium (METRICSrsquo01) pp 16ndash27 IEEE Computer Society Washington DC USAApril 2001

[23] G H Subramanian P C Pendharkar and MWallace ldquoAn em-pirical study of the effect of complexity platform and programtype on software development effort of business applicationsrdquoEmpirical Software Engineering vol 11 no 4 pp 541ndash553 2006

[24] D E GoldbergGenetic Algorithms in Search Optimization andMachine Learning chapter 1ndash8 Addison-Wesley New York NYUSA 1989

[25] A Heiat ldquoComparison of artificial neural network and regres-sion models for estimating software development effortrdquo Infor-mation and Software Technology vol 44 no 15 pp 911ndash9222002

[26] K Srinivasan and D Fisher ldquoMachine learning approaches toestimating software development effortrdquo IEEE Transactions onSoftware Engineering vol 21 no 2 pp 126ndash137 1995

[27] S J Huang C Y Lin and N H Chiu ldquoFuzzy decision treeapproach for embedding risk assessment information into soft-ware cost estimation modelrdquo Journal of Information Science andEngineering vol 22 no 2 pp 297ndash313 2006

[28] M van Genuchten and H Koolen ldquoOn the use of software costmodelsrdquo Information and Management vol 21 no 1 pp 37ndash441991

[29] A J Albrecht and J E Gaffney ldquoSoftware function source linesof code and development effort prediction a software sciencevalidationrdquo IEEE Transactions on Software Engineering vol 9no 6 pp 639ndash648 1983

[30] I. Attarzadeh and S. H. Ow, “A novel algorithmic cost estimation model based on soft computing technique,” Journal of Computer Science, vol. 6, no. 2, pp. 117–125, 2010.

[31] F. J. Heemstra, Software Cost Estimation Models, University of Technology, Department of Industrial Engineering, IEEE, 1990.

[32] M. Jørgensen, B. Boehm, and S. Rifkin, “Software development effort estimation: formal models or expert judgment?” IEEE Software, vol. 26, no. 2, pp. 14–19, 2009.

[33] Y. F. Li, M. Xie, and T. N. Goh, “A study of genetic algorithm for project selection for analogy based software cost estimation,” in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM ’07), pp. 1256–1260, Singapore, December 2007.

[34] H. Liu and L. Yu, “Toward integrating feature selection algorithms for classification and clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.

[35] A. Kumar, S. Tiwari, K. K. Mishra, and A. K. Misra, “Generation of efficient test data using path selection strategy with elitist GA in regression testing,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 9, pp. 389–393, Chengdu, China, July 2010.

[36] K. K. Mishra, S. Tiwari, A. Kumar, and A. K. Misra, “An approach for mutation testing using elitist genetic algorithm,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT ’10), vol. 5, pp. 426–429, Chengdu, China, July 2010.

[37] S. Sarmady, An Investigation on Genetic Algorithm Parameters, P-COM000507(R), P-COM008807, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia, 2007.

[38] K. F. Man, K. S. Tang, and S. Kwong, Genetic Algorithms: Concepts and Designs, chapter 1–10, Springer, New York, NY, USA, 2001.

[39] L. C. Briand, K. El-Emam, and I. Wieczorek, “Explaining the cost of European space and military projects,” in Proceedings of the International Conference on Software Engineering (ICSE ’99), pp. 303–312, ACM Press, May 1999.

[40] L. C. Briand, T. Langley, and I. Wieczorek, “Replicated assessment and comparison of common software cost modeling techniques,” in Proceedings of the International Conference on Software Engineering (ICSE ’00), pp. 377–386, ACM Press, June 2000.
