HeuristicMethods2009

  • 8/13/2019 HeuristicMethods2009

    Using genetic programming to evolve detection strategies for object-oriented design flaws

    Advanced studies report for Heuristic Methods.

    author

    Mihai-Iulian Balint

    First-year PhD student,

    Politehnica University, Timisoara

    Abstract

    In the context of object-oriented design, given a training set composed of software metrics and design flaws, this paper introduces a novel approach that uses machine learning techniques, through genetic algorithms, to infer relevant filtering formulas (detection strategies) for design flaws.

    The paper examines the feasibility of this technique by comparing the resulting formulas for a number of design flaws to those proposed in the literature and concludes that, given a set of relevant metrics, our approach can be used to obtain a formula that is relevant for the scrutinized software system.

    Keywords: genetic programming, genetic algorithms, machine learning, software metrics, design flaws, object-oriented design, reverse engineering

    1 Introduction

    It has been clearly shown in the literature that software design, like any human activity, is error prone. Design errors may not translate immediately into malfunctioning programs; in time, however, bad design leads to understanding problems, which in turn lead to software bugs.

    An attempt at automated design flaw identification has been proposed in [7]. The author uses Detection Strategies, a set of metrics-based rules, to identify design flaws in large-scale object-oriented software, starting from the informal rules presented in the literature to obtain a more formal specification of design flaws.

    In this paper we present an alternative approach to identifying Detection Strategies using genetic programming. Just as in [7], we use a metric-based filter to identify design flaws. However, whereas Marinescu used existing literature to propose metric-based formulas to filter out design flaw candidates, we evolve the formula from a training set.

    The training data set is composed of a set of case studies that have been inspected by an object-oriented design expert. For the classes and methods of these systems a set of metrics has been calculated, and the design flaws have been extracted through manual inspection by our expert. Our approach thus consists of evolving a formula that is able to extract (as precisely as possible) the set of candidates for each type of design flaw.

    To evaluate this approach we have created an analysis tool that we have used to evolve detection strategy formulas for three design flaws using only one training data set.

    In the next section we present the challenges of measuring design quality and briefly introduce genetic algorithms and genetic programming. We then continue with a presentation of the details and challenges of our approach, and we introduce our tool and the experimental results obtained using it. Finally, we conclude this paper with a few remarks on the performance and precision of our results.

    This paper continues the work presented by Mihancea et al. in [8], in the sense that we do not restrict ourselves to adjusting only the threshold values within the detection strategy formulas; instead, we allow the structure and terms of the formulas to change and evolve. We use machine learning techniques that, given a training set composed of software metrics and a set of badly designed software entities (design flaws), can infer relevant filtering formulas for each type of design flaw.

    As a preliminary evaluation of the effectiveness of our approach we have implemented DSEvo, a software tool that is able to evolve detection strategy formulas in a manner similar to genetic programming. We deploy DSEvo alongside a library of 100+ (mostly) structural metrics in two environments:

    - A small experimental software system (entitled LAN Simulation [3]) with few design entities and design flaws.

    - A medium-sized open source software system (Apache Wicket) which has been analyzed with iPlasma, a specialized reverse engineering platform [4]. All design flaw candidates reported by iPlasma have been manually inspected and false positives removed from the training set.

    Both software systems contain design flaws previously presented in the literature, and in both environments DSEvo is able to evolve relevant formulas that are consistent with the majority of the published research on the topic.

    The paper is structured as follows: next, in Section 3, we present a few challenges encountered while implementing genetic algorithms to search for detection strategy formulas. In Section 4 we present the two experimental environments and discuss the results that we have obtained. We conclude the paper with Section 5, arguing that machine learning is a viable approach to identifying design flaw detection strategies and presenting future work.

    2 Problem statement

    On one side we have software metrics and on the other, design flaws, bad smells, anti-patterns. Software metrics are clearly defined and cleanly specified and are usually relatively straightforward to implement. Design flaws are usually defined using natural language and

    3 Detection Strategy Genetics

    Detection strategies are presented in detail in [7] and in [6]. In our tool we implement them using binary trees of logical expressions. A logical expression may be one of:

    - a simple comparison between a metric and a constant threshold (e.g., class NumberOfMethods > 10)

    - a metric with a logical value (e.g., class isAbstract)

    - a logical aggregate (using logical operators, e.g., and, or) of two logical expressions.

    \[
    GodClass(S) = S' \;\big|\; S' \subseteq S,\ \forall C \in S' :\
    \big(ATFD(C),\, TopValues(20\%)\big) \wedge \big(ATFD(C),\, HigherThan(4)\big) \wedge
    \Big( \big(WMC(C),\, HigherThan(20)\big) \vee \big(TCC(C),\, LowerThan(0.33)\big) \Big)
    \tag{1}
    \]

    [Figure 1 placeholder: binary tree for GodClass(c): AND( AND( ATFD in TopVal(20%), ATFD(c) > 4 ), OR( WMC(c) > 20, TCC(c) < 0.33 ) )]

    Figure 1: God Class detection strategy representation using binary trees.

    For example, Equation 1 depicts the quantified GodClass detection strategy (as presented in [8]), which we represent in binary tree form as in Figure 1. The GodClass design flaw is characteristic of classes that centralize intelligence in a software system, breaking Riel's "uniformly distributed intelligence" heuristic [9]. The metrics used to detect god classes are described below.

    Access To Foreign Data. ATFD represents the number of external classes from which a given class accesses attributes, directly or via accessor methods (get/set methods) [5].

    Weighted Method Count. WMC is the sum of the static complexity of all methods in a class [2].

    Tight Class Cohesion. TCC is defined as the relative number of directly connected methods. Two methods are directly connected if they access a common instance variable of their class [1].
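
    The binary tree representation and the three metrics above can be illustrated with a minimal Python sketch. The names Comparison, Flag, Aggregate and evaluate, as well as the sample metric values, are hypothetical and are not taken from DSEvo; for brevity the sketch keeps only the absolute comparisons of the GodClass strategy and leaves out the relative TopValues term.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Union

# Comparison operators allowed in a leaf (an assumed, minimal set).
OPS = {">": operator.gt, "<": operator.lt}


@dataclass
class Comparison:
    """Leaf: a numeric metric compared with a constant threshold."""
    metric: str
    op: str
    threshold: float


@dataclass
class Flag:
    """Leaf: a metric that already has a logical value (e.g. isAbstract)."""
    metric: str


@dataclass
class Aggregate:
    """Inner node: 'and' / 'or' applied to two logical expressions."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Comparison, Flag, Aggregate]


def evaluate(expr: Expr, metrics: Dict[str, float]) -> bool:
    """Evaluate an expression tree against the metric values of one entity."""
    if isinstance(expr, Comparison):
        return OPS[expr.op](metrics[expr.metric], expr.threshold)
    if isinstance(expr, Flag):
        return bool(metrics[expr.metric])
    left = evaluate(expr.left, metrics)
    right = evaluate(expr.right, metrics)
    return (left and right) if expr.op == "and" else (left or right)


# ATFD > 4 AND (WMC > 20 OR TCC < 0.33), i.e. the absolute part of Equation 1.
god_class = Aggregate(
    "and",
    Comparison("ATFD", ">", 4),
    Aggregate("or", Comparison("WMC", ">", 20), Comparison("TCC", "<", 0.33)),
)

# Illustrative metric values for two imaginary classes.
suspect = {"ATFD": 7, "WMC": 45, "TCC": 0.10}
healthy = {"ATFD": 1, "WMC": 8, "TCC": 0.80}
```

    With these sample values, evaluate(god_class, suspect) flags the first class and evaluate(god_class, healthy) does not.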

    Note. We do not implement the relative operators TopValues and BottomValues that were presented in Marinescu's original thesis [6] because, although they convey additional semantic information, for the purpose of this work they are redundant and can naturally be replaced with absolute comparisons with thresholds relevant for the current software system.

    To obtain general formulas for detection strategies (i.e., using relative operators) with our approach, one would have to analyze a significant number of different software systems and statistically approximate any TopValues and BottomValues percentages. However, this is beyond the scope of this paper and constitutes future research.

    3.1 Initial population

    All genetic algorithms start with an initial random population. The challenge for genetic programming in general and our approach in particular is to generate a meaningful random individual. We define a meaningful individual as one that represents an expression that can be evaluated to a logical value.

    When generating the logical expression trees, the first thing to consider is the complexity of the expression. We define the complexity of the expression using the maximum depth of the tree (which we also make an adjustable parameter).

    To generate a random expression tree of a given depth, we repeatedly generate subtrees whose depth is smaller by one and add them to the main tree. When generating numeric comparisons, special care is taken to avoid an "apples and oranges" type of comparison. When comparing a numeric metric with a constant, we adjust the constant to take values only in an interval relevant to the metric values.
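
    A minimal sketch of this generation step is given below, under stated assumptions: trees are plain tuples, the metric ranges in NUMERIC_RANGES and the logical metric list are invented for illustration, and the 0.2 and 0.3 probabilities are arbitrary choices rather than values used by our tool.

```python
import random

# Assumed [min, max] ranges of each numeric metric in the analysed system.
NUMERIC_RANGES = {"ATFD": (0.0, 30.0), "WMC": (0.0, 120.0), "TCC": (0.0, 1.0)}
# Assumed metrics that already have a logical value.
LOGICAL_METRICS = ["isAbstract"]


def random_leaf(rng):
    """Create a leaf: either a logical metric or a metric/threshold comparison."""
    if rng.random() < 0.2:
        return ("flag", rng.choice(LOGICAL_METRICS))
    metric = rng.choice(sorted(NUMERIC_RANGES))
    low, high = NUMERIC_RANGES[metric]
    # The threshold is drawn only from the range this metric actually takes,
    # which avoids "apples and oranges" comparisons.
    return ("cmp", metric, rng.choice([">", "<"]), rng.uniform(low, high))


def random_tree(max_depth, rng):
    """Create a logical expression tree no deeper than max_depth."""
    if max_depth <= 1 or rng.random() < 0.3:
        return random_leaf(rng)
    return (
        rng.choice(["and", "or"]),
        random_tree(max_depth - 1, rng),
        random_tree(max_depth - 1, rng),
    )


def depth(tree):
    """Depth of a tree; a single leaf has depth 1."""
    if tree[0] in ("and", "or"):
        return 1 + max(depth(tree[1]), depth(tree[2]))
    return 1


def leaves(tree):
    """Yield every leaf of a tree, left to right."""
    if tree[0] in ("and", "or"):
        yield from leaves(tree[1])
        yield from leaves(tree[2])
    else:
        yield tree


# An initial population of 10 individuals with maximum depth 4.
population = [random_tree(4, random.Random(seed)) for seed in range(10)]
```

    Every individual produced this way evaluates to a logical value by construction, since leaves are logical and inner nodes only combine logical subexpressions.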

    3.2 Mutation

    Mutation is the process through which new, unrelated features may be injected into a population of potential formulas for a detection strategy. In our approach, mutation may occur at several levels:

    - as in the original work by Mihancea et al. [8], the value of the numerical thresholds may be mutated. However, we have optimized this process by randomly mutating only between numerical values that are semantically significant for the current metric in the current software system. For example, if in the current system metric A takes values only within the interval (Amin, Amax), we assume that all relevant threshold values for A are within the inclusive interval [Amin, Amax].

    - an absolute comparison may mutate in one of two ways: either by changing the comparison operator or by changing the metric. Note that when changing the metric in a comparison operation, the threshold also needs to change to a semantically relevant value.

    - a term in the form of a single logical metric is mutated by changing it to another logical metric.

    - an aggregate logical expression mutates by changing the aggregation operator (e.g., and changes to or).

    - a negated logical expression may mutate to an affirmative expression.

    [Figure 2 placeholder: two potential GodClass(c) formulas, one built from the nodes AND, OR, ATFD(c) > 4, TCC(c) < 0.33 and WMC(c) > 20, the other consisting of WMC(c) > 35.]

    Figure 2: Crossover operation between potential detection strategy formulas.
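
    The mutation levels listed in this section can be sketched as a single function that mutates one node of a tuple-encoded formula. This is an illustration only: the tuple encoding, the RANGES table, the logical metric isInterface and the name mutate_node are assumptions, and a complete implementation would first pick a random node of the tree to mutate.

```python
import random

# Assumed [min, max] ranges of each numeric metric in the analysed system.
RANGES = {"ATFD": (0.0, 30.0), "WMC": (0.0, 120.0), "TCC": (0.0, 1.0)}
# Assumed logical metrics (isInterface is a hypothetical second one).
LOGICAL = ["isAbstract", "isInterface"]
FLIP = {">": "<", "<": ">"}


def mutate_node(node, rng):
    """Return a mutated copy of a single node, following the levels above."""
    kind = node[0]
    if kind == "cmp":
        _, metric, op, threshold = node
        what = rng.choice(["threshold", "operator", "metric"])
        if what == "threshold":
            # New threshold stays inside the range observed for this metric.
            return ("cmp", metric, op, rng.uniform(*RANGES[metric]))
        if what == "operator":
            return ("cmp", metric, FLIP[op], threshold)
        # Changing the metric also forces a threshold relevant to the new metric.
        new_metric = rng.choice([m for m in sorted(RANGES) if m != metric])
        return ("cmp", new_metric, op, rng.uniform(*RANGES[new_metric]))
    if kind == "flag":
        # A logical metric is replaced by a different logical metric.
        return ("flag", rng.choice([m for m in LOGICAL if m != node[1]]))
    if kind == "not":
        # A negated expression becomes affirmative.
        return node[1]
    # Aggregate: switch the logical operator, keep both operands.
    op, left, right = node
    return ("or" if op == "and" else "and", left, right)
```

    For instance, mutating ("and", x, y) always yields ("or", x, y), while mutating a comparison yields another comparison whose threshold lies within the range of its metric.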

    3.3 Crossover

    Crossover is the operation through which existing (presumably successful) features from a population are combined in an attempt to obtain better detection strategy formulas. Crossover consists of switching subtrees between formulas. A random node is chosen from each formula, and each chosen node, together with all its children, is replaced with the node selected from the other formula. A crossover instance is depicted in Figure 2.
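
    A subtree swap of this kind can be sketched as follows. The tuple encoding and the helper names (paths, subtree_at, replace_at, crossover) are hypothetical, and the two parents are only loosely modelled on the formulas labelled in Figure 2; their exact tree shape is an assumption.

```python
import random


def paths(tree, prefix=()):
    """Yield the path (sequence of child indices) to every node of a tree."""
    yield prefix
    if tree[0] in ("and", "or"):
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))


def subtree_at(tree, path):
    """Return the node (with all its children) found at the given path."""
    for index in path:
        tree = tree[index]
    return tree


def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the node at path replaced by new_subtree."""
    if not path:
        return new_subtree
    parts = list(tree)
    parts[path[0]] = replace_at(tree[path[0]], path[1:], new_subtree)
    return tuple(parts)


def crossover(a, b, rng):
    """Pick one random node in each parent and swap the two subtrees."""
    path_a = rng.choice(list(paths(a)))
    path_b = rng.choice(list(paths(b)))
    child_a = replace_at(a, path_a, subtree_at(b, path_b))
    child_b = replace_at(b, path_b, subtree_at(a, path_a))
    return child_a, child_b


parent_a = ("and",
            ("cmp", "ATFD", ">", 4),
            ("or", ("cmp", "TCC", "<", 0.33), ("cmp", "WMC", ">", 20)))
parent_b = ("cmp", "WMC", ">", 35)

child_a, child_b = crossover(parent_a, parent_b, random.Random(0))
```

    Because the two subtrees merely change places, the total number of nodes across both children equals that of the two parents.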

    3.4 Detection Strategy Fitness

    The fitness function we use is identical to that used in the original paper by Mihancea et al. [8]. Succinctly, it consists of the sum of false negatives and false positives, each with an adjustable coefficient to control fitness bias. Depending on the type of design flaw and the size of the system, better results may be obtained by adjusting the fitness bias to favor either false negatives or false positives.

    Design entity          Number of entities    Number of metrics
    Class                  3341                  63
    Method                 15200                 52

    Total lines of code    248514

    Table 1: Case study overall size.

    For example, if out of all design entities from the training set only a few are design flaws, better results will be obtained if the fitness is biased towards penalizing false negatives. This means that the resulting detection rule will be less likely to miss any flaws and more likely to mislabel healthy entities as flaws.
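
    The weighted error count described in Section 3.4 can be sketched as below. The function name fitness, the parameter names fn_weight and fp_weight, and the toy data are assumptions made for illustration; the exact form used in [8] may differ. Lower values are better.

```python
from typing import Callable, Dict, Set


def fitness(predict: Callable[[Dict[str, float]], bool],
            entities: Dict[str, Dict[str, float]],
            flawed: Set[str],
            fn_weight: float = 1.0,
            fp_weight: float = 1.0) -> float:
    """Weighted sum of false negatives and false positives (lower is better)."""
    false_negatives = sum(
        1 for name, metrics in entities.items()
        if name in flawed and not predict(metrics)
    )
    false_positives = sum(
        1 for name, metrics in entities.items()
        if name not in flawed and predict(metrics)
    )
    return fn_weight * false_negatives + fp_weight * false_positives


# Toy training set: four classes, two of them labelled as flawed by the expert.
entities = {"A": {"WMC": 45}, "B": {"WMC": 25}, "C": {"WMC": 10}, "D": {"WMC": 5}}
flawed = {"A", "C"}


def rule(metrics):
    """Candidate detection rule: WMC > 20."""
    return metrics["WMC"] > 20


unbiased = fitness(rule, entities, flawed)                # 1 miss + 1 false alarm = 2.0
biased = fitness(rule, entities, flawed, fn_weight=5.0)   # 5 * 1 + 1 = 6.0
```

    Raising fn_weight makes every missed flaw more costly, so evolution is pushed towards rules that flag more entities, which matches the bias discussed above.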

    4 Evaluation

    To evaluate the feasibility of our approach we have selected eight types of design flaws (five specific to classes and three specific to methods) and have used the iPlasma reverse engineering platform to identify the design flaws in two software systems. We then manually confirmed that each candidate was indeed a flawed design entity.

    We then performed 20 runs of our tool for each type of design flaw, each time recording the overall best (fewest false candidates) and worst (most false candidates) detection strategy obtained.

    4.1 Case studies

    To validate our approach we used Wicket - a web application framework written in Java - as a software case study. Wicket is a medium to large open source software system developed by several programmers from all over the world. The LAN Simulation case study [3] was used only for testing purposes and will not be presented here. To give an overall impression of Wicket we present a few size metrics in Table 1.

    We have analysed Wicket with iPlasma and have found a number of design flaws, which are presented in Table 2. These were used as training data for our genetic programming tool.

    We searched for design flaws in classes and methods and have found the following types of flaws: (1) GodClass (as detailed in Section 3), (2) BrainClass - classes which have many methods that are very complex, (3) DataClass - classes which are simple data holders, (4) SchizophrenicClass and (5) RefusedParentBequest - classes that violate the contract inherited from their superclasses; (6) BrainMethod - methods that have a high cyclomatic complexity and access little exterior data, (7) FeatureEnvy - methods that access lots of data from other classes and finally (8) ShotgunSurgery - methods that are highly coupled and, if changed, may trigger a large wave of additional changes in dependent classes or methods.

    Class design flaw         Number of flawed classes
    God Class                 37
    Brain Class               6
    Data Class                97
    Schizophrenic Class       59
    Refused Parent Bequest    8

    Method design flaw        Number of flawed methods
    Brain Method              94
    Feature Envy              224
    Shotgun Surgery           223

    Table 2: Identified design flaws.

    4.2 Results Discussion

    Our approach generates impressive results for the GodClass, BrainClass, FeatureEnvy and BrainMethod design flaws: most of the time it generates part of the formula and, depending on the initial population, it sometimes generates the complete formula.

    For DataClass design flaws, most of the time we generate the part of the formula containing the NumberOfAccessorMethods metric. This is because the software system used to train the algorithm does not use public attributes.

    For RefusedParentBequest, a simple formula using the terms specified in Section 3 is insufficient with the current metric library. RPB is computed using a number of private intermediate results, for example the number of methods overridden by subclasses.

    Another interesting case is the SchizophrenicClass design flaw, which is computed as a penalty score over several features of a class: size and complexity, coupling, and inheritance hierarchy features.

    5 Conclusions

    Our approach yields promising results as long as information about the target system is available as software metrics. Currently most metrics required to compute design flaws are available in our library; however, some are computed but not accessible (i.e., they are nested within the logic of the detection strategy).

    While our approach is promising as a tool that can infer formulas for design flaw detection (or any other label that a developer might generate), it is limited by the information that is available in the form of metrics. If new design flaws are proposed in the literature, our tool will identify them as long as the backing metrics library is sufficiently expressive and relevant for the type of flaw.

    References

    [1] J. Bieman and B. Kang. Cohesion and reuse in an object-oriented system. In Proceedings ACM Symposium on Software Reusability, Apr. 1995.

    [2] S. R. Chidamber and C. F. Kemerer. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6):476-493, June 1994.

    [3] S. Demeyer, F. V. Rysselberghe, T. Gîrba, J. Ratzinger, R. Marinescu, T. Mens, B. D. Bois, D. Janssens, S. Ducasse, M. Lanza, M. Rieger, H. Gall, M. Wermelinger, and M. El-Ramly. The LAN-simulation: A research and teaching example for refactoring. In Proceedings of IWPSE 2005 (8th International Workshop on Principles of Software Evolution), pages 123-131, Los Alamitos CA, 2005. IEEE Computer Society Press.

    [4] C. Marinescu, R. Marinescu, P. F. Mihancea, D. Ratiu, and R. Wettel. iPlasma: An integrated platform for quality assessment of object-oriented design. In Proceedings of the 21st IEEE International Conference on Software Maintenance (ICSM 2005), Tool Demonstration Track. IEEE Computer Society Press, 2005.

    [5] R. Marinescu. Detecting design flaws via metrics in object-oriented systems. In Proceedings of TOOLS, pages 173-182, 2001.

    [6] R. Marinescu. Measurement and Quality in Object-Oriented Design. PhD thesis, Department of Computer Science, Politehnica University of Timisoara, 2002.

    [7] R. Marinescu. Detection strategies: Metrics-based rules for detecting design flaws. In 20th IEEE International Conference on Software Maintenance (ICSM'04), pages 350-359, Los Alamitos CA, 2004. IEEE Computer Society Press.

    [8] P. Mihancea and R. Marinescu. Towards the optimization of automatic detection of design flaws in object-oriented software systems. In Proceedings of European Conference on Software Maintenance (CSMR 2005), pages 92-101, 2005.

    [9] A. Riel. Object-Oriented Design Heuristics. Addison Wesley, Boston MA, 1996.
