
LEVERAGING LIGHTWEIGHT ANALYSES TO AID SOFTWARE MAINTENANCE

ZACHARY P. FRY

PHD PROPOSAL

MAINTENANCE COSTS

For persistent systems, software maintenance can account for up to 90% of software lifecycle costs.

[Figure: lifecycle phases (Requirements, Design, Implementation, Verification, Maintenance); Maintenance is labeled 90%.]

R. C. Seacord, D. Plakosh, and G. A. Lewis. Modernizing Legacy Systems: Software Technologies, Engineering Process and Business Practices. Addison-Wesley Longman Publishing Co. Inc., Boston, MA, USA, 2003.

KEY PARTS OF THE MAINTENANCE PROCESS

Bug Reporting

File: …
Lines: …

    /* loop through all keys, removing corrupted values from ‘map’ */
    Vector keys = new Vector(map.keySet());
    for (String s : keys) {
        if (map.get(s).isCorrupted()) {
            map.remove(s);
            keys.remove(s);
        }
    }

Bug Fixing

    /* loop through all keys, removing corrupted values from ‘map’ */
    Vector keys = new Vector(map.keySet());
    for (String s : keys) {
        if (map.get(s).isCorrupted()) {
            map.remove(s);
        }
    }

Update Documentation

MAINTENANCE PROCESSES IN PRACTICE

• Manual bug reporting is costly
    • Reputation
    • Human effort
• Automatic bug finders yield thousands of bugs, requiring verification and triage.

MAINTENANCE PROCESSES IN PRACTICE

[Figure: Number of Automatically Reported Defects by Program. x-axis: Benchmark Programs; y-axis: Defect Reports, with ticks at 1, 10, 100, 1000, and 10000.]

MAINTENANCE PROCESSES IN PRACTICE

Bug reports come in at an alarming rate; humans simply cannot triage and fix them all.

[Figure: OpenOffice bugs, 2000-2012. Series: Confirmed New Bugs and Confirmed Resolved Bugs; y-axis from 0 to 18000.]

Automatic program repair

MAINTENANCE PROCESSES IN PRACTICE

Fixing bugs means lots of code changes.

Comments are often overlooked
• Out-of-date documentation

    /* A reporter reporting the number of page faults since startup
       should have units UNITS_COUNT. */

    /* The number of tabs currently open would have UNITS_COUNT. */

MAINTENANCE PROCESSES IN PRACTICE

Automated techniques have helped to facilitate the maintenance process. However, the process remains costly.

Research question: Can we reduce the effort necessary for specific parts of the maintenance process, thereby reducing the overall cost?

PROPOSAL THESIS

By using lightweight analyses to extract and use latent information encoded by humans in software development artifacts, we can reduce the costs of software maintenance by relieving bottlenecks at various stages of the process.

RESEARCH CONSIDERATIONS

Overall Goal:
• Reduce maintenance costs

Design Constraints:
• Minimize additional human effort
• Ease of incremental adoption

Overall Intuition:
• Leverage information often overlooked by existing techniques

THE REST OF THIS PRESENTATION

• An overview of the proposed thrusts
    • Clustering Duplicate Automatically-Generated Defect Reports
    • Improved Fitness Functions for Automatic Program Repair
    • Ensuring Documentation Consistency
• Proposed research timeline
• Conclusion and Questions

PROJECT OUTLINE

• Clustering Duplicate Automatically-Generated Defect Reports
• Improved Fitness Functions for Automatic Program Repair
• Ensuring Documentation Quality

CLUSTERING DUPLICATE DEFECT REPORTS

Automatic bug finders successfully report many bugs with little developer effort. However, the defect reports number in the thousands, and each one must pass through verification and triage.

[Diagram: Bug Finder → Defect Reports (x 1000s) → Verification and Triage → False Positives and Actual Defects]

CLUSTERING DUPLICATE DEFECT REPORTS

Intuitions: Duplicates are detrimental in related fields.

[Diagram: Manual Reports → Duplicate Report Detector; Source Code → Code Clone Detectors]

CLUSTERING DUPLICATE DEFECT REPORTS

Hypothesis: By exploiting the special structure of automatic defect detection tools’ output, we can accurately cluster defect reports and save effort by handling similar defect reports in aggregate.

Success depends on:
• Internal accuracy of the produced clusters
• Amount of effort saved from clustering defect reports

CLUSTERING DUPLICATE DEFECT REPORTS

Defect Report 1
File: NSReader.java
Suspected Line: plot = lst.get(i);

Defect Report 2
File: NSReader.java
Suspected Line: p = lst.get(i);

Defect Report 3
File: NSReader.java
Suspected Line: plot = lst.get(n);

Defect Report 4
File: UI_Impl.java
Suspected Line: plot = lst.get(i);
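The four example reports hint at the structural signals a clustering approach could exploit: a shared file and near-identical suspected lines. The Java sketch below shows one possible pairwise similarity measure. It is not the metric used in the proposed work, which the slides do not specify. The `Report` class, the token-based Jaccard overlap, and the equal 0.5/0.5 weighting of file match and line similarity are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

final class ReportSimilarity {

    /** Hypothetical minimal defect report: a file name and the suspected source line. */
    static final class Report {
        final String file;
        final String suspectedLine;

        Report(String file, String suspectedLine) {
            this.file = file;
            this.suspectedLine = suspectedLine;
        }
    }

    /** Splits a source line into identifier and number tokens, dropping punctuation. */
    static Set<String> tokens(String line) {
        Set<String> result = new HashSet<>();
        for (String t : line.split("[^A-Za-z0-9_]+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    /** Jaccard overlap: size of the intersection divided by size of the union. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    /** Assumed scoring: half the weight for a matching file, half for line similarity. */
    static double similarity(Report x, Report y) {
        double fileScore = x.file.equals(y.file) ? 1.0 : 0.0;
        double lineScore = jaccard(tokens(x.suspectedLine), tokens(y.suspectedLine));
        return 0.5 * fileScore + 0.5 * lineScore;
    }

    public static void main(String[] args) {
        Report r1 = new Report("NSReader.java", "plot = lst.get(i);");
        Report r2 = new Report("NSReader.java", "p = lst.get(i);");
        Report r4 = new Report("UI_Impl.java", "plot = lst.get(i);");
        System.out.println("R1 vs R2: " + similarity(r1, r2));
        System.out.println("R1 vs R4: " + similarity(r1, r4));
    }
}
```

Under these assumed weights, Reports 1 and 2 (same file, one differing identifier) score higher than Reports 1 and 4 (identical line, different file). Whether that ordering is desirable depends on how much the file field should count, which is exactly the kind of choice a real technique would need to evaluate.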

CLUSTERING DUPLICATE DEFECT REPORTS

Clustering technique:

[Diagram: twelve defect reports, R1 through R12, illustrating the clustering technique.]
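The slides do not name the clustering algorithm, so the following Java sketch stands in with a simple one: reports whose pairwise similarity meets a threshold are merged into the same cluster using union-find (single-link clustering). The similarity matrix values and the thresholds are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ThresholdClustering {

    /** Union-find lookup with path halving. */
    static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    /** Groups report indices so that any pair with sim >= threshold shares a cluster. */
    static List<List<Integer>> cluster(double[][] sim, double threshold) {
        int n = sim.length;
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (sim[i][j] >= threshold) {
                    parent[find(parent, i)] = find(parent, j);
                }
            }
        }
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            groups.computeIfAbsent(find(parent, i), k -> new ArrayList<>()).add(i);
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        // Hypothetical pairwise similarities for four reports (indices 0..3).
        double[][] sim = {
            {1.0, 0.8, 0.8, 0.5},
            {0.8, 1.0, 0.67, 0.3},
            {0.8, 0.67, 1.0, 0.3},
            {0.5, 0.3, 0.3, 1.0}
        };
        System.out.println(cluster(sim, 0.75)); // prints [[0, 1, 2], [3]]
        System.out.println(cluster(sim, 0.40)); // prints [[0, 1, 2, 3]]
    }
}
```

Lowering the threshold collapses more reports into fewer clusters, which saves more triage effort but risks grouping unrelated defects. That tension is what an accuracy versus effort-saved evaluation would measure.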

CLUSTERING DUPLICATE DEFECT REPORTS

Preliminary Cluster Accuracy vs. Effort Savings

[Figure: Pareto Frontier - All Java Benchmark Programs. x-axis: Accuracy (fraction of correctly clustered reports), 0 to 1; y-axis: Effort Saved (% of defects collapsed), 0 to 100. Series: Our Technique, ConQAT, PMD, Checkstyle.]

• Saving more effort at all levels of accuracy
• Capable of perfect accuracy
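The two axes of the plot can be made concrete with a small Java sketch. The definitions here are assumptions, since the slides give only the axis labels: effort saved is taken to be the percentage of reports that no longer need individual triage when each cluster is handled once, and a report counts as correctly clustered when its ground-truth defect label matches the most common label in its cluster. The labels and cluster assignments are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ClusterMetrics {

    /** Assumed definition: 100 * (reports - clusters) / reports. */
    static double effortSavedPercent(int[] clusterOf) {
        int n = clusterOf.length;
        if (n == 0) {
            return 0.0;
        }
        Set<Integer> clusters = new HashSet<>();
        for (int c : clusterOf) {
            clusters.add(c);
        }
        return 100.0 * (n - clusters.size()) / n;
    }

    /** Assumed definition: fraction of reports matching their cluster's majority label. */
    static double accuracy(int[] clusterOf, String[] trueLabel) {
        if (clusterOf.length != trueLabel.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int n = clusterOf.length;
        if (n == 0) {
            return 1.0;
        }
        Map<Integer, Map<String, Integer>> counts = new HashMap<>();
        for (int i = 0; i < n; i++) {
            counts.computeIfAbsent(clusterOf[i], k -> new HashMap<>())
                  .merge(trueLabel[i], 1, Integer::sum);
        }
        int correct = 0;
        for (Map<String, Integer> labelCounts : counts.values()) {
            int best = 0;
            for (int c : labelCounts.values()) {
                best = Math.max(best, c);
            }
            correct += best;
        }
        return (double) correct / n;
    }

    public static void main(String[] args) {
        int[] clusterOf = {0, 0, 0, 1};          // three reports merged, one alone
        String[] trueLabel = {"A", "A", "B", "C"}; // hypothetical ground truth
        System.out.println(effortSavedPercent(clusterOf)); // prints 50.0
        System.out.println(accuracy(clusterOf, trueLabel)); // prints 0.75
    }
}
```

Under these definitions, leaving every report in its own cluster gives perfect accuracy with no effort saved, so a technique is only interesting when it moves along the frontier toward higher savings without giving up accuracy.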


IMPROVED FITNESS FUNCTIONS

Automatic program repair can fix bugs.

[Diagram: Bugs → GenProg → Fixes]

IMPROVED FITNESS FUNCTIONS

• Measuring proximity to a fix
• Inserting, deleting, and swapping lines in the program

d(135): NO FIX
i(251,205) i(774,111) s(598,324): FIX

i(251,205) i(774,111) s(598,324) d(63): ✓ ✓ ✓ ✗ (75%)
d(84) s(844,265) i(774,111) i(735,431): ✓ ✗ ✗ ✗ (25%)
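One reading of the percentages on this slide is that a candidate's proximity to a fix is the fraction of its edits that also appear in a known fix. The Java sketch below implements that reading only. It is an interpretation of the slide, not a confirmed definition, and the method name `fractionSharedWithFix` is hypothetical. Edits are compared as plain strings such as `i(774,111)`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class EditOverlap {

    /** Fraction of the candidate's edits that also occur in the known fix. */
    static double fractionSharedWithFix(List<String> candidate, List<String> knownFix) {
        if (candidate.isEmpty()) {
            return 0.0;
        }
        Set<String> fix = new HashSet<>(knownFix);
        int shared = 0;
        for (String edit : candidate) {
            if (fix.contains(edit)) {
                shared++;
            }
        }
        return (double) shared / candidate.size();
    }

    public static void main(String[] args) {
        List<String> fix = Arrays.asList("i(251,205)", "i(774,111)", "s(598,324)");
        List<String> near = Arrays.asList("i(251,205)", "i(774,111)", "s(598,324)", "d(63)");
        List<String> far = Arrays.asList("d(84)", "s(844,265)", "i(774,111)", "i(735,431)");
        System.out.println(fractionSharedWithFix(near, fix)); // prints 0.75
        System.out.println(fractionSharedWithFix(far, fix));  // prints 0.25
    }
}
```

The two candidates reproduce the 75% and 25% figures shown on the slide, which is what makes this reading plausible.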

IMPROVED FITNESS FUNCTIONS

• The current model of fitness does not correlate well with proximity to a fix.

Intuitions:
• Not all test cases are created equal.
• Not all bugs are created equal.
• Not all fixes are created equal.

We propose to address the naivety of the current fitness representation.

IMPROVED FITNESS FUNCTIONS

Hypothesis: By taking into account previously unused information about test cases, bugs, and fixes, we can better inform the evolutionary bug-fixing process to fix bugs faster and more often.

Success depends on:
• Increasing the number of bugs fixed
• Shortening the time it takes to fix bugs that can currently be fixed

IMPROVED FITNESS FUNCTIONS

Approach: weight test cases based on known fixes

[Diagram: Test Case 1 and Test Case 2, FIX and NO FIX variants, and the weights 0.8 and 0.2.]
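A weighted fitness function of the kind this approach suggests can be sketched in a few lines of Java. The sketch assumes fitness is the weight of the passing test cases divided by the total weight, and that the weights 0.8 and 0.2 belong to Test Case 1 and Test Case 2 respectively. The slides do not state how the weights are derived from known fixes or how they are combined, so both points are assumptions.

```java
final class WeightedFitness {

    /** Weighted fraction of passing tests; each weight reflects how informative a test is assumed to be. */
    static double fitness(boolean[] passed, double[] weight) {
        if (passed.length != weight.length) {
            throw new IllegalArgumentException("one weight per test case is required");
        }
        double total = 0.0;
        double earned = 0.0;
        for (int i = 0; i < passed.length; i++) {
            total += weight[i];
            if (passed[i]) {
                earned += weight[i];
            }
        }
        return total == 0.0 ? 0.0 : earned / total;
    }

    public static void main(String[] args) {
        double[] weight = {0.8, 0.2}; // assumed: Test Case 1 is the stronger signal
        System.out.println(fitness(new boolean[] {true, false}, weight));
        System.out.println(fitness(new boolean[] {false, true}, weight));
    }
}
```

With equal weights, a candidate passing only Test Case 1 and a candidate passing only Test Case 2 would both score 0.5. With the assumed weights they score roughly 0.8 and 0.2, so the search can prefer the candidate that passes the more informative test.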

IMPROVED FITNESS FUNCTIONS

Evaluation:
• How many more bugs can we fix?
    • 55 out of 105 bugs fixed in the most recently published work [1]
• How much can we speed up fixes?
    • Computational time and monetary cost

[1] Claire Le Goues, Westley Weimer, Stephanie Forrest. Representations and Operators for Improving Evolutionary Software Repair. Genetic and Evolutionary Computation Conference (GECCO), 2012.


ENSURING DOCUMENTATION QUALITY

• “The documentation becomes increasingly inaccurate thereby making future changes even more difficult.” (Parnas)
• Real developers:
    • 76% agree documentation is crucial to understanding
    • But poorly executed in practice (27% complete, 33% consistent)

ENSURING DOCUMENTATION QUALITY

    /* loop through all keys, removing corrupted values from ‘map’ */
    Vector keys = new Vector(map.keySet());
    for (String s : keys) {
        if (map.get(s).isCorrupted()) {
            map.remove(s);
        }
    }

    /* loop through all keys, removing corrupted values from ‘map’ */
    HashMap validMap = new HashMap();
    Vector keys = new Vector(map.keySet());
    for (String s : keys) {
        if (map.get(s).isValid()) {
            validMap.put(s, map.get(s));
            map.remove(s);
        }
    }

INCONSISTENT! Comment incorrectly describes the functionality

ENSURING DOCUMENTATION QUALITY

    /* loop through all keys, removing corrupted values from ‘map’ */
    Vector keys = new Vector(map.keySet());
    for (String s : keys) {
        if (map.get(s).isCorrupted()) {
            map.remove(s);
        }
    }

    /* loop through all keys, removing corrupted values from ‘map’ */
    Vector keys = new Vector(map.keySet());
    for (String s : keys) {
        if (map.get(s).isCorrupted()) {
            if (s.equals("Primary"))
                printf("debug: %s\n", map.get(s).toString());
            map.remove(s);
        }
    }

INCOMPLETE! Comment fails to describe all relevant functionality

ENSURING DOCUMENTATION QUALITY

• Inconsistent and incomplete comments reduce understandability over time

Intuitions:
• Existing tools can accurately extract concepts from code and generate comments about those concepts.
• There should be natural language overlap between a high-quality comment and its associated code.

ENSURING DOCUMENTATION QUALITY

Hypothesis: By comparing concepts extracted from the code with the existing comments, we can accurately identify inconsistent and incomplete documentation.

Success depends on:
• The accuracy of our incomplete and inconsistent comment identification technique
• The ease with which humans update and understand comments when using our tool

ENSURING DOCUMENTATION QUALITY

Approach:

    /* loop through all keys, removing corrupted values from ‘map’ */
    Vector keys = new Vector(map.keySet());
    for (String s : keys) {
        if (map.get(s).isCorrupted()) {
            if (s.equals("Primary"))
                printf("debug: %s\n", map.get(s).toString());
            map.remove(s);
        }
    }

Generated Documentation (DeltaDoc):

    Now call printf if s is “Primary”

Existing comment lacks this info, thus is incomplete.
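The comparison between an existing comment and generated documentation can be illustrated with a simple word-overlap check. The Java sketch below is a deliberately naive stand-in: it lowercases both texts, splits on non-alphanumeric characters, and reports the fraction of generated-documentation words that the existing comment already mentions. The class name, the 0.5 threshold, and the absence of stemming or stop-word filtering are all assumptions. The generated text is passed in as a plain string, so no documentation-generation tool is invoked here.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class CommentCoverage {

    /** Lowercased word set of a text, split on anything that is not a letter or digit. */
    static Set<String> words(String text) {
        Set<String> out = new HashSet<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    /** Fraction of words in the generated description that also occur in the existing comment. */
    static double coverage(String existingComment, String generatedDoc) {
        Set<String> generated = words(generatedDoc);
        if (generated.isEmpty()) {
            return 1.0;
        }
        Set<String> comment = words(existingComment);
        int covered = 0;
        for (String w : generated) {
            if (comment.contains(w)) {
                covered++;
            }
        }
        return (double) covered / generated.size();
    }

    /** Flags the comment as possibly incomplete when coverage falls below the threshold. */
    static boolean looksIncomplete(String existingComment, String generatedDoc, double threshold) {
        return coverage(existingComment, generatedDoc) < threshold;
    }

    public static void main(String[] args) {
        String comment = "loop through all keys, removing corrupted values from 'map'";
        String generated = "Now call printf if s is \"Primary\"";
        System.out.println(coverage(comment, generated));             // prints 0.0
        System.out.println(looksIncomplete(comment, generated, 0.5)); // prints true
    }
}
```

For the running example, none of the words in the generated description appear in the existing comment, so the comment is flagged as incomplete. A practical technique would need something more robust than exact word matching, for instance to avoid flagging comments that paraphrase the behavior.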

ENSURING DOCUMENTATION QUALITY

Evaluation: Human studies
• Study 1 (FIX): Humans identify and fix low-quality comments with and without our tool
• Study 2 (RATE): Using the resulting data set, different humans identify and rate the modified comments

ENSURING DOCUMENTATION QUALITY

Evaluation:
• Compare our tool’s accuracy when identifying low-quality comments with humans’ ability to do the same task
    • Use identification data from both FIX and RATE
    • Inter-annotator agreement vs. tool-human agreement
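The slides do not say which statistic will be used to compare inter-annotator agreement with tool-human agreement. One common choice for two raters and binary labels is Cohen's kappa, sketched below in Java as an assumption rather than as the study's actual method. Each array holds one rater's judgment per comment (`true` meaning "low quality"), and the example judgments are hypothetical.

```java
final class Agreement {

    /** Cohen's kappa for two raters with binary labels: (po - pe) / (1 - pe). */
    static double cohensKappa(boolean[] a, boolean[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("need two non-empty label arrays of equal length");
        }
        int n = a.length;
        int agree = 0;
        int aYes = 0;
        int bYes = 0;
        for (int i = 0; i < n; i++) {
            if (a[i] == b[i]) {
                agree++;
            }
            if (a[i]) {
                aYes++;
            }
            if (b[i]) {
                bYes++;
            }
        }
        double po = (double) agree / n;                 // observed agreement
        double pa = (double) aYes / n;
        double pb = (double) bYes / n;
        double pe = pa * pb + (1 - pa) * (1 - pb);      // agreement expected by chance
        if (pe == 1.0) {
            return 1.0; // both raters gave the same constant label
        }
        return (po - pe) / (1 - pe);
    }

    public static void main(String[] args) {
        boolean[] humanA = {true, true, false, false};
        boolean[] humanB = {true, false, false, false};
        boolean[] tool = {true, true, false, false};
        System.out.println("human-human: " + cohensKappa(humanA, humanB));
        System.out.println("tool-human:  " + cohensKappa(tool, humanA));
    }
}
```

In this made-up example the two humans agree with kappa 0.5 while the tool agrees with the first human perfectly. A result of that shape, where tool-human agreement is comparable to agreement among humans, is the kind of evidence the comparison is looking for.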

ENSURING DOCUMENTATION QUALITY

Evaluation:
• Measure our tool’s effectiveness in helping humans identify and fix low-quality comments
    • Effort (time) in FIX
    • Use ratings from RATE to compare data from groups in FIX

SUMMARY

We propose work that will specifically target three parts of the maintenance process to reduce the overall cost:
1. Cluster automatically-generated defect reports to facilitate triage and bug fixing
2. Improve fitness functions to aid in automatic program repair to fix more bugs, faster
3. Identify incomplete and inconsistent comments to promote continued documentation quality and foster program understanding

COMPREHENSIVE GOALS - REVISITED

We desire techniques that add minimal human effort
• Techniques work “off the shelf”
• Encourages incremental adoption

Use latent, often-overlooked information
• Syntactic, semantic defect report fields
• Test case quality, types of bugs/fixes
• Natural language in code and comments

RESEARCH TIMELINE

Publications to date:
• E. Schulte, Z. Fry, E. Fast, W. Weimer, S. Forrest. Software Mutational Robustness. Genetic Programming and Evolvable Machines 2013. (under submission)
• Z. Fry, W. Weimer. Clustering Static Analysis Defect Reports to Reduce Maintenance Costs. International Conference on Tools and Algorithms for the Construction and Analysis of Systems 2013 (TACAS). (under submission)
• Z. Fry, B. Landau, W. Weimer. A Human Study of Patch Maintainability. International Symposium on Software Testing and Analysis 2012 (ISSTA). (Acc. Rate: 29%)
• Z. Fry, W. Weimer. A Human Study of Fault Localization Accuracy. International Conference on Software Maintenance 2010 (ICSM). (Acc. Rate: 26%)

[Timeline figure, 2011 to 2014, with markers for today and expected graduation. Rows: Patch Quality [ISSTA '12]; Duplicate Defect Detection; Enhanced Fitness Functions; Documentation Quality; Optional Documentation Quality Journal Submission. Legend: Research Period, Publication Lag.]


QUESTIONS?
