78

Motivation □ Maintenance dominates software costs 2

  • View
    218

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Motivation □ Maintenance dominates software costs 2
Page 2: Motivation □ Maintenance dominates software costs 2

Motivation

□Maintenance dominates software costs

2

Page 3: Motivation □ Maintenance dominates software costs 2

Motivation

□>50% of maintenance time spent understanding the program

3

Page 4: Motivation □ Maintenance dominates software costs 2

Motivation

□>50% of maintenance time spent understanding the program□Where are the features,

reqs, etc. in the code?

4

Page 5: Motivation □ Maintenance dominates software costs 2

Motivation

□>50% of maintenance time spent understanding the program□Where are the features,

reqs, etc. in the code?□What is this code for?

5

Page 6: Motivation □ Maintenance dominates software costs 2

Motivation

□>50% of maintenance time spent understanding the program□Where are the features,

reqs, etc. in the code?□What is this code for?□Why is it hard to

understand and changethe program?

6

Page 7: Motivation □ Maintenance dominates software costs 2

Main Contributions

□Improved state of theart of concern location

□Innovative metricsand experimentalmethodology

□Evidence of the dangersof crosscutting concerns

7

Program Dependency Graph

interface A {public void foo();

}public class B implements A {

public void foo() { ... }public void bar() { ... }

}public class C {

public static void main() {B b = new B();b.bar();

}

Source Code

C AB

foofoomain bar

refs

contains contains

calls

contains

contains

Page 8: Motivation □ Maintenance dominates software costs 2

Improving concern location□ Statement Annotations for Fine-Grained Advising

□ ECOOP Workshop on Reflection, AOP, and Meta-Data for Software Evolution (2006)□ Eaddy and Aho

□ Demo: Wicca 2.0 - Dynamic Weaving using the .NET 2.0 Debugging APIs□ Aspect-Oriented Software Development (2007)□ Eaddy

□ Identifying, Assigning, and Quantifying Crosscutting Concerns□ ICSE Workshop on Assessment of Contemporary Modularization Techniques (2007)□ Eaddy, Aho, and Murphy

□ Cerberus: Tracing Requirements to Source Code Using Information Retrieval, Dynamic Analysis, and Program Analysis□ IEEE International Conference on Program Comprehension (2008)□ Eaddy, Aho, Antoniol, and Guéhéneuc

8

Page 9: Motivation □ Maintenance dominates software costs 2

Innovative metrics & methodology

□ Towards Assessing the Impact of Crosscutting Concerns on Modularity□ AOSD Workshop on Assessment of Aspect Techniques

(2007)□ Eaddy and Aho

□ Do Crosscutting Concerns Cause Defects?□ IEEE Transactions on Software Engineering (2008)□ Eaddy, Zimmerman, Sherwood, Garg, Murphy, Nagappan, and

Aho

9

Page 10: Motivation □ Maintenance dominates software costs 2

Dangers of crosscutting

□ Do Crosscutting Concerns Cause Defects?□ IEEE Transactions on Software Engineering (2008)□ Eaddy, Zimmerman, Sherwood, Garg, Murphy, Nagappan, and

Aho

10

Page 11: Motivation □ Maintenance dominates software costs 2

Roadmap

□Improved state of theart of concern location

□Innovative metricsand experimentalmethodology

□Evidence of the dangersof crosscutting concerns

11

Program Dependency Graph

interface A {public void foo();

}public class B implements A {

public void foo() { ... }public void bar() { ... }

}public class C {

public static void main() {B b = new B();b.bar();

}

Source Code

C AB

foofoomain bar

refs

contains contains

calls

contains

contains

Page 12: Motivation □ Maintenance dominates software costs 2

What is a “concern?”

□Feature, requirement, design pattern, code idiom, etc.

□Raison d'être for code□Every line of code exists to satisfy some

concern

□Existing definitions are poor□Concern domain must be “well-defined

set”

12

Anything that affects the implementation of a program

Page 13: Motivation □ Maintenance dominates software costs 2

Concern location problem

13

ConcernsProgram Elements

Concern–code relationship hard to obtain

Page 14: Motivation □ Maintenance dominates software costs 2

Concern location problem

14

ConcernsProgram Elements

□Concern–code relationship undocumented

Concern–code relationship hard to obtain

Page 15: Motivation □ Maintenance dominates software costs 2

Concern location problem

15

ConcernsProgram Elements

□Concern–code relationship undocumented□Reverse engineer the relationship

Concern–code relationship hard to obtain

Page 16: Motivation □ Maintenance dominates software costs 2

Manual concern location

□Existing techniques too subjective□Inaccurate, unreliable

□Ideal□Code affected when concern is changed

□My insight□Prune dependency rule [ACOM’07]

□Code affected when concern is pruned (removed)

□i.e., software pruning□Practical approximation

16

Concern–code relationship determined by a human

Page 17: Motivation □ Maintenance dominates software costs 2

Prune dependency rule

□Code is prune dependent on concern if□Concern pruned code removed or

altered □Distinguish between removing and

altering code□Easily determine change impact of

removing code□Code dependent on removed code must

be altered (to prevent compile errors)□Easy for human to approximate

17

Page 18: Motivation □ Maintenance dominates software costs 2

Manual concern location

□Existing tools impractical for analyzing all concerns of a real system□Many concerns (>100)□Many concern–code links (>10K)□Hierarchical concerns

□My solution: ConcernTagger [TSE’08]

18

Concern–code relationship determined by a human

Page 19: Motivation □ Maintenance dominates software costs 2

19

Page 20: Motivation □ Maintenance dominates software costs 2

Automated concern location

□Experts look for clues in docs and code

□Existing techniques only consult 1 or 2 experts

□My solution: Cerberus [ICPC’08]1. Information retrieval2. Execution tracing3. Prune dependency analysis

20

Concern–code relationship predicted by an “expert”

Page 21: Motivation □ Maintenance dominates software costs 2

IR-based concern location

□i.e., Google for code□Program entities are documents□Requirements are queries

21

join

Id_joinjs_join()

Requirement“Array.join”

SourceCode

Page 22: Motivation □ Maintenance dominates software costs 2

Vector space model [Salton]

□Parse code and reqs doc to extract term vectors□NativeArray.js_join() method “native,” “array,”

“join”□“Array.join” requirement “array,” “join”

□My contributions□Expand abbreviations

□numconns number, connections, numberconnections□Index fields

□Weigh terms (tf · idf)□Term frequency (tf)□Inverse document frequency (idf)

□Similarity = cosine distance between document and query vectors 22

Page 23: Motivation □ Maintenance dominates software costs 2

Tracing-based concern location

□Observe elements activated when concern is exercised□Unit tests for each concern□e.g., find elements uniquely activated by a

concern

23

Page 24: Motivation □ Maintenance dominates software costs 2

Tracing-based concern location

□Observe elements activated when concern is exercised□Unit tests for each concern□e.g., find elements uniquely activated by a

concern

24

CallGraph

js_join

var a = new Array(1, 2);if (a.join(',') == "1,2"){ print "Test passed";}else { print "Test failed";}

js_construct

Unit Testfor “Array.join”

Page 25: Motivation □ Maintenance dominates software costs 2

Tracing-based concern location

□Observe elements activated when concern is exercised□Unit tests for each concern□e.g., find elements uniquely activated by a

concern

25

CallGraph

js_join

var a = new Array(1, 2);if (a.join(',') == "1,2"){ print "Test passed";}else { print "Test failed";}

js_construct

Unit Testfor “Array.join”

Page 26: Motivation □ Maintenance dominates software costs 2

Tracing-based concern location

□Elements often activated by multiple concerns

□What is “information content” of element activation?

□Element Frequency–Inverse ConcernFrequency [ICPC’08]

26

Page 27: Motivation □ Maintenance dominates software costs 2

Prune dependency analysis

□Infer relevant elements based on structural relationship to relevant element e (seed)□Assumes we already have some seeds

□Prune dependency analysis [ICPC’08]□Automates prune dependency rule

[ACOM’07]□Find references to e□Find superclasses and subclasses of e

27

Page 28: Motivation □ Maintenance dominates software costs 2

PDA example

28

C AB

foofoomain barcalls

contains

refs contains

contains

contains

Program Dependency Graphinterface A {

public void foo();}public class B implements A { public void foo() { ... } public void bar() { ... }}public class C { public static void main() { B b = new B(); b.bar(); }

Source Code

inherits

Page 29: Motivation □ Maintenance dominates software costs 2

29

C AB

foofoomain barcallscalls

contains

refs contains

contains

contains

Program Dependency Graphinterface A {

public void foo();}public class B implements A { public void foo() { ... } public void bar() { ... }}public class C { public static void main() { B b = new B(); b.bar(); }

Source Code

inherits

PDA example

Page 30: Motivation □ Maintenance dominates software costs 2

PDA example

30

C AB

foofoomain bar

contains

contains

refs contains

contains

contains

contains

Program Dependency Graphinterface A {

public void foo();}public class B implements A { public void foo() { ... } public void bar() { ... }}public class C { public static void main() { B b = new B(); b.bar(); }

Source Code

calls

inherits

Page 31: Motivation □ Maintenance dominates software costs 2

PDA example

31

Program Dependency Graphinterface A {

public void foo();}public class B implements A { public void foo() { ... } public void bar() { ... }}public class C { public static void main() { B b = new B(); b.bar(); }

Source Code

inheritsinheritsC AB

foofoomain bar

refscontains

contains

calls

contains

contains

Page 32: Motivation □ Maintenance dominates software costs 2

PDA example

32

Program Dependency Graphinterface A {

public void foo();}public class B implements A { public void foo() { ... } public void bar() { ... }}public class C { public static void main() { B b = new B(); b.bar(); }

Source Code

C AB

foofoomain bar

refscontains

contains

contains

calls

contains

contains

inherits

Page 33: Motivation □ Maintenance dominates software costs 2

Cerberus

33

Page 34: Motivation □ Maintenance dominates software costs 2

Cerberus effectiveness

34

Cerberus

Page 35: Motivation □ Maintenance dominates software costs 2

Roadmap

□Improved state of theart of concern location

□Innovative metricsand experimentalmethodology

□Evidence of the dangersof crosscutting concerns

35

Program Dependency Graph

interface A {public void foo();

}public class B implements A {

public void foo() { ... }public void bar() { ... }

}public class C {

public static void main() {B b = new B();b.bar();

}

Source Code

C AB

foofoomain bar

refs

contains contains

calls

contains

contains

Page 36: Motivation □ Maintenance dominates software costs 2

The crosscutting concern problem

□Code related to the concern is…□Scattered across (crosscuts) multiple

files□Often tangled with other concern code

36

Some concerns difficult to modularize

ConcernsProgram Elements

Page 37: Motivation □ Maintenance dominates software costs 2

Example: Pathfinding in Goblin

□Pathfinding is modularized37

Page 38: Motivation □ Maintenance dominates software costs 2

Example: Collision detection

□Collision detection not modularized

38

Page 39: Motivation □ Maintenance dominates software costs 2

How to measure scattering?

□Existing metrics inadequate□My solution

□Degree of scattering [ASAT’07]□Degree of tangling [ASAT’07]

39

Page 40: Motivation □ Maintenance dominates software costs 2

Degree of scattering (DOS)

□ Measures concern modularity, i.e., distribution of concern code across multiple classes

□ Average DOS – Overall modularity of concerns□ Summarizes amount of crosscutting present

□ More insightful than traditional metrics□ “class A is highly coupled” vs. “feature A is hard to

change”40

[Wong, et al.]

[ACOM’07]

Page 41: Motivation □ Maintenance dominates software costs 2

□More descriptive than class count□Consider two different concern

implementations

Insight behind DOS

Marc Eaddy 41

DOS = 1.00

#Classes = 4

DOS = 0.08

#Classes = 4

Page 42: Motivation □ Maintenance dominates software costs 2

Degree of tangling (DOT)

□Distribution of class code across multiple concerns

□Average DOT – Overall separation of concerns

Marc Eaddy 42

[Wong, et al.]

[ACOM’07]

Page 43: Motivation □ Maintenance dominates software costs 2

Roadmap

□Improved state of theart of concern location

□Innovative metricsand experimentalmethodology

□Evidence of the dangersof crosscutting concerns

43

Program Dependency Graph

interface A {public void foo();

}public class B implements A {

public void foo() { ... }public void bar() { ... }

}public class C {

public static void main() {B b = new B();b.bar();

}

Source Code

C AB

foofoomain bar

refs

contains contains

calls

contains

contains

Page 44: Motivation □ Maintenance dominates software costs 2

Do crosscutting concerns cause defects? [TSE’08]

□Created mappings1. Requirement–code map (via

ConcernTagger)2. Bug–code map (via BugTagger)3. Bug–requirement map (inferred)

Project Lang. Size (KLOC)

MappedConcern

s

Mapping

Time (hr)

Mylyn–Bugzilla

Java 13 28 31

Rhino Java 44 417 102

iBATIS Java 13 173 18

44

Page 45: Motivation □ Maintenance dominates software costs 2

Do crosscutting concerns cause defects? [TSE’08]

□Correlated scatteringand bug count□Spearman

correlation

□Found moderateto strong correlation between scatteringand defects

□As scattering increasesso do defects

45

Page 46: Motivation □ Maintenance dominates software costs 2

How widespread is the problem?

□5 case studies of OO programs□Scattering

□Concerns related to 6 classes on average□OO unsuitable for representing these problem domains?

□Most (86%) concerns are crosscutting to some extent

□Dispels “modular base” notion□General-purpose solution needed

□Tangling□Classes related to 10 concerns on average□Poor separation of concerns

□Classes doing too much

□Crosscutting concerns severely limit modularity

46

Page 47: Motivation □ Maintenance dominates software costs 2

Main Contributions

□Improved state of theart of concern location

□Innovative metricsand experimentalmethodology

□Evidence of the dangersof crosscutting concerns

47

Program Dependency Graph

interface A {public void foo();

}public class B implements A {

public void foo() { ... }public void bar() { ... }

}public class C {

public static void main() {B b = new B();b.bar();

}

Source Code

C AB

foofoomain bar

refs

contains contains

calls

contains

contains

Page 48: Motivation □ Maintenance dominates software costs 2

Future work

□Further explore new concern analysis field□Techniques to reduce crosscutting□Improve concern location

□Improve PDA generality, precision, and heuristics

□Use machine learning to combine judgments□Incorporate smart “grep” and PDA into IDE

□Gather empirical evidence□Impact of reducing crosscutting□Impact of crosscutting on maintenance effort□Impact of code tangling on quality

48

Page 49: Motivation □ Maintenance dominates software costs 2

Acknowledgements

49

Page 50: Motivation □ Maintenance dominates software costs 2

Questions?

Marc EaddyColumbia University

[email protected]

50

Page 51: Motivation □ Maintenance dominates software costs 2

Wicca□Extends C# tool chain

□wsc – Wicca# preprocessor/compiler

□Phx.Morph – Post-compile tool [AOSD’06]

□wdbg – Debugger [SC’07]

51

Page 52: Motivation □ Maintenance dominates software costs 2

Wicca□Extends runtime environment

□wicca – Command-line shell[AOSD’07]□Dynamic aspect-oriented programming□Dynamic software updating□Edit-and-continue

52

Page 53: Motivation □ Maintenance dominates software costs 2

Wicca architecture

53

Page 54: Motivation □ Maintenance dominates software costs 2

Wicca transformation pipeline

54

Deltas

WovenProgram

BreakPoints

Wicca#CompilerWicca#

CompilerPhx.MorphPhx.Morph

AspectAssemblies

CompiledProgram

SourceFiles

C# Compiler Phoenix

Page 55: Motivation □ Maintenance dominates software costs 2

Phx.Morph architecture

55

Phx.Morph

EditorsOpen Classes, binary and breakpoint weaving

Phoenix-specific AOP

Attribute Handlers

PhoenixPhoenix

PEREWAssemblyRe-Writer

PEREWAssemblyRe-Writer

Phx.Aop

AOPJoinpoints, pointcuts, …

AttributesCustom AOP annotations

.NET

MorphPlugin

Page 56: Motivation □ Maintenance dominates software costs 2

Example: SimpleDraw

□SimpleDraw draws shapes on a display

□Naïve implementation couples Shape and Display concerns

56

public init() { s = new Shape[3]; s[0] = new Point(10, 10); s[1] = new Point(5, 5); s[2] = new Line(new Point(1, 9), new Point(9, 1)); for (int i = 0; i < s.Length; i++) Display.instance().addShape(s[i]); Display.instance().update();

SimpleDraw.cs public void moveBy(int dx, int dy) { x += dx; y += dy; Display.instance().update();

public void moveBy(int dx, int dy) { a.moveBy(dx, dy); b.moveBy(dx, dy); Display.instance().update();

Point.cs

Line.cs

Page 57: Motivation □ Maintenance dominates software costs 2

Example: SimpleDraw

□SimpleDraw draws shapes on a display

□Naïve implementation couples Shape and Display concerns

57

public init() { s = new Shape[3]; s[0] = new Point(10, 10); s[1] = new Point(5, 5); s[2] = new Line(new Point(1, 9), new Point(9, 1)); for (int i = 0; i < s.Length; i++) Display.instance().addShape(s[i]); Display.instance().update();

SimpleDraw.cs public void moveBy(int dx, int dy) { x += dx; y += dy; Display.instance().update();

public void moveBy(int dx, int dy) { a.moveBy(dx, dy); b.moveBy(dx, dy); Display.instance().update();

Point.cs

Line.cs

Page 58: Motivation □ Maintenance dominates software costs 2

Display updating aspectclass DisplayUpdating {

[Advice(AdviceType.after, "execution(void Shape.moveBy)")][Advice(AdviceType.after, "execution(void Point.moveBy)")][Advice(AdviceType.after, "execution(void Line.moveBy)")]static public void updateAfterMove() {

Display.instance().update();}…

}

58

Page 59: Motivation □ Maintenance dominates software costs 2

Confounding effect of size□Larger concerns are

buggier□Scattering, size, and bug

count strongly interrelated□Is scattering just measuring

the “size effect”?

□Control for size usingcovariant analysis□Step-wise regression□Principal component analysis

□Scattering still explains some bug count variance

59

DOS

Size

Bugs

.66

.90

.68

Page 60: Motivation □ Maintenance dominates software costs 2

Side classesPowerful and disciplined class extension

60

Page 61: Motivation □ Maintenance dominates software costs 2

The expression problem

□Need to parse simple expression language□e.g., 3 + 4□Expressions: Literal, Add□Semantics: eval

□Extensibility goals□Easily add new expressions□Easily add new semantics

□Cannot do both easily

61

P. Tarr, H. Ossher, W. Harrison, S. Sutton Jr., “N Degrees of Separation: Multi-Dimensional Separation of Concerns,” ICSE 1999.

M. Torgersen, “The Expression Problem Revisited,” ECOOP 2004.

Page 62: Motivation □ Maintenance dominates software costs 2

Parser implementation in OOabstract class Expr {

abstract int eval();}

class Add : Expr {public Expr l, r;public Add(Expr l, Expr r) { this.l = l; this.r = r; }public int eval() { return l.eval() + r.eval(); }

}

62

• New concern: Add printing semantics– Concern is crosscutting!

Page 63: Motivation □ Maintenance dominates software costs 2

Add printing using subclassing

abstract class PrintingExpr : Expr {abstract void print();

}

class PrintingAdd : PrintingExpr, Add {public void print() { Console.WriteLine(l + “+” + r); }

}

63

• Invasive• Inflexible

Page 64: Motivation □ Maintenance dominates software costs 2

Printing the side class way

abstract class PrintingExpr + Expr {abstract void print();

}

class PrintingAdd + PrintingExpr, Add {public void print() { Console.WriteLine(l + “+” + r); }

}

64

• Modularizes the printing concern• Side class implicitly extends base

class• Clients (and other sub/side classes)

oblivious

Page 65: Motivation □ Maintenance dominates software costs 2

Caching result of eval()

□Concern: Cache expression evaluation

□Invalidate cache when expression changes

65

Page 66: Motivation □ Maintenance dominates software costs 2

Intertype declarations

interface IObserver {void onChanged();

}

class CachableExpr + Expr : IObserver {protected IObserver obs = null;

  protected int cache;protected bool isCacheValid = false;

void onChanged() { isCacheValid = false; if (obs != null) obs.onChanged();

}}

66

Page 67: Motivation □ Maintenance dominates software costs 2

Intertype declarations

interface IObserver {void onChanged();

}

class CachableExpr + Expr : IObserver {protected IObserver obs = null;

  protected int cache;protected bool isCacheValid = false;

void onChanged() { isCacheValid = false; if (obs != null) obs.onChanged();

}}

67

Page 68: Motivation □ Maintenance dominates software costs 2

Method chainingclass CachableAdd + Add {

public CachableAdd(CachableExpr l, CachableExpr r) { base(l, r);this.l.obs = this;this.r.obs = this;

}

override public int eval() {if (!isCacheValid) {

cache = base.eval();isCacheValid = true;

}return cache;

68

Page 69: Motivation □ Maintenance dominates software costs 2

Method chainingclass CachableAdd + Add {

public CachableAdd(CachableExpr l, CachableExpr r) { base(l, r);this.l.obs = this;this.r.obs = this;

}

override public int eval() {if (!isCacheValid) {

cache = base.eval();isCacheValid = true;

}return cache;

69

Page 70: Motivation □ Maintenance dominates software costs 2

Virtual fieldsoverride Expr l {

set { base.l = value; this.l.obs = this; this.onChanged();

}}override Expr r {

set { base.r = value; this.r.obs = this; this.onChanged();

}} 70

Page 71: Motivation □ Maintenance dominates software costs 2

Virtual fields

71

override Expr l {set {

base.l = value; this.l.obs = this; this.onChanged();

}}override Expr r {

set { base.r = value; this.r.obs = this; this.onChanged();

}}

Page 72: Motivation □ Maintenance dominates software costs 2

Metaprogramming

override Expr [$which = l | r] {set {

base.$which = value; this.$which.obs = this; this.onChanged();

}}

72

Page 73: Motivation □ Maintenance dominates software costs 2

However, base class mustpermit extension

73

class Add : Expr {public virtual Expr l, r;public Add(Expr l, Expr r) { this.l = l; this.r = r; }public virtual int eval() { return l.eval() + r.eval(); }

}

Page 74: Motivation □ Maintenance dominates software costs 2

Addresses issues of existing mechanisms

□Subclasses□ Inflexible composition□Only extend one base class□Only extend instance methods□Requires invasive changes in clients

□Open classes/refinements/mixins□Need for composition ordering□Need for constructor and static member overriding

□Aspects□Lack of modular reasoning□Poor integration with existing OO syntax and

semantics□Need for symmetric model□Matching wrong join points

74

Page 75: Motivation □ Maintenance dominates software costs 2

Statement annotations [RAM-SE’06]

□Concerns with “irregular” implementations□Error handling, assertions,

optimizations, logging

□Hard to modularize using regular aspects

□Apply aspects to specific locations,statements, or object instances

75

Difficult to modularize irregular concern code

[NonNull] AuthorizationRequest ar =new AuthorizationRequest(this,

dest);

Page 76: Motivation □ Maintenance dominates software costs 2

76

Page 77: Motivation □ Maintenance dominates software costs 2

77

Page 78: Motivation □ Maintenance dominates software costs 2

78