Promise 2011: "Does Measuring Code Change Improve Fault Prediction?"

Code Change and Fault Prediction

Tom Ostrand, Robert Bell, Elaine Weyuker

AT&T Labs – Research, Florham Park, NJ, USA

PROMISE 2011

Banff, Alberta, September 20-21, 2011

Overview

•Do measures of code change or churn provide useful input to fault prediction models?

•Standard model

•Base models

•Churn-augmented models

The Standard Model

• Underlying statistical model: negative binomial regression

• Output (dependent) variable: predicted fault count in each file of release n

• Predictor (independent) variables:

  • KLOC (n)

  • Previous faults (n-1)

  • Previous changes (n-1, n-2)

  • File age (number of releases)

  • File type (C, C++, Java, SQL, make, sh, Perl, ...)
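A minimal sketch of what a negative binomial regression with a log link computes for one file, using only the Python standard library. The coefficient values, the dispersion `alpha`, and the example file are hypothetical, not fitted values from the study; the categorical file-type term is left out to keep the sketch short.

```python
import math

def nb_log_pmf(y: int, mu: float, alpha: float) -> float:
    """Log-probability of y faults under a negative binomial with mean mu
    and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability in the (r, p) form
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(p) + y * math.log(1.0 - p))

def predicted_mean(coefs: dict, features: dict) -> float:
    """Log link: expected fault count = exp(intercept + sum(coef * feature))."""
    eta = coefs["intercept"] + sum(coefs[name] * value
                                   for name, value in features.items())
    return math.exp(eta)

# Hypothetical coefficients and one hypothetical file (assumptions, not results).
coefs = {"intercept": -2.0, "kloc": 0.4, "prev_faults": 0.3,
         "prev_changes": 0.2, "age": -0.05}
file_features = {"kloc": 1.2, "prev_faults": 1, "prev_changes": 3, "age": 4}

mu = predicted_mean(coefs, file_features)          # predicted fault count
prob_no_faults = math.exp(nb_log_pmf(0, mu, alpha=0.5))
```

The predicted mean `mu` is the quantity used to rank files; the probability mass function only matters when fitting the coefficients by maximum likelihood.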

Evaluating prediction models

• Model produces ranking of files in a release, from predicted most faults to fewest faults

• Choose cutoff point in ranking, X%

• Yield = percent of all faults in the release that are in the first X% of the ranked files

We’ve usually evaluated models at a 20% cutoff.

• Fault-percentile average (FPA) is the average yield over all values of X
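The two evaluation measures above can be sketched as follows. This is one reading of the slide's definitions, assuming the FPA averages the yield over every cutoff from the top 1 file to all K files and that the release has at least one fault; the function names and example data are hypothetical.

```python
import math

def ranked_actuals(predicted, actual):
    """Actual fault counts, ordered from highest to lowest predicted count."""
    order = sorted(range(len(predicted)),
                   key=lambda i: predicted[i], reverse=True)
    return [actual[i] for i in order]

def yield_at(predicted, actual, cutoff):
    """Fraction of all faults found in the top `cutoff` fraction of files."""
    ranked = ranked_actuals(predicted, actual)
    n_top = math.ceil(cutoff * len(ranked))
    return sum(ranked[:n_top]) / sum(ranked)

def fpa(predicted, actual):
    """Average yield over the cutoffs 'top 1 file' through 'top K files'."""
    ranked = ranked_actuals(predicted, actual)
    total = sum(ranked)
    running, acc = 0, 0.0
    for faults in ranked:
        running += faults
        acc += running / total
    return acc / len(ranked)

# Hypothetical release with four files.
predicted = [5.0, 3.0, 1.0, 0.2]
actual = [4, 0, 1, 0]
top_quarter = yield_at(predicted, actual, 0.25)
average = fpa(predicted, actual)
```

For this toy release the top-ranked file holds 4 of the 5 faults, so the yield at a 25% cutoff is 0.8, and the FPA is (0.8 + 0.8 + 1 + 1) / 4 = 0.9.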

Prediction Results from the Standard Model

[Chart: percent of faults in top 20% of files, and FPA. Values shown: 91, 87, 88.]

Measures of Code Change

•Changed/not changed

•Number of changes during a release

•Number of lines added

•Number of lines deleted

•Number of lines modified

•Relative churn (line changes/LOC)
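The measures listed above can be gathered into one per-file, per-release record, as in the sketch below. The class and field names are hypothetical, and relative churn is assumed here to be total lines added, deleted, and modified divided by the file's LOC.

```python
from dataclasses import dataclass

@dataclass
class FileChurn:
    """Change measures for one file during one release (hypothetical layout)."""
    loc: int        # size of the file in lines of code
    changes: int    # number of changes (check-ins) during the release
    added: int      # lines added
    deleted: int    # lines deleted
    modified: int   # lines modified

    @property
    def changed(self) -> bool:
        """Binary changed / not changed indicator."""
        return self.changes > 0

    @property
    def lines_touched(self) -> int:
        """Absolute churn: adds + deletes + mods."""
        return self.added + self.deleted + self.modified

    @property
    def relative_churn(self) -> float:
        """Line changes divided by LOC (0.0 for an empty file)."""
        return self.lines_touched / self.loc if self.loc else 0.0

example = FileChurn(loc=400, changes=3, added=30, deleted=5, modified=5)
```

With these hypothetical numbers, `example.lines_touched` is 40 and `example.relative_churn` is 0.1.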

Two Subject Systems

Large provisioning system

• 18 releases, 5 year lifespan

• 6 programming languages:

• Java (60%), C, C++, SQL, SQL-C, SQL-C++

• 3000+ files

• 1.5 million LOC

• Average of 395 faults/release

Two Subject Systems

Utility, data aggregation system

• 18 releases, 5 year lifespan

• >10 programming languages:

• Java (77%), Perl, xml, sh, ...

• 800 files

• 280K LOC

• Average of 90 faults/release

Distribution of files, averages over all releases

[Pie chart: Percent of Files, Provisioning. Legend: Changed, Unchanged. Labeled slices: 6.8%, 11.0%.]

[Pie chart: Percent of Files, Utility. Legend: Changed, Unchanged. Labeled slices: 1.6%, 15.1%.]

Where do faults occur?

Distribution of faults over files

[Chart: Faults/file, Provisioning. Categories: Changed, Unchanged.]

[Chart: Faults/file, Utility. Categories: Changed, Unchanged.]

Provisioning system faults per file, by release

[Chart: Faults per File, by Change Status and Release. X-axis: Release (1 to 18). Series: New (Mean=0.24), Unchanged (Mean=0.02), Changed (Mean=0.80).]

Utility system faults per file, by release

[Chart: Faults per File, by Change Status and Release. X-axis: Release (1 to 18). Series: New (Mean=0.09), Unchanged (Mean=0.002), Changed (Mean=0.92).]

Potential predictor combinations

• Added lines only

• Deleted lines only

• Modified lines only

• Adds & Deletes

• Adds & Mods

• Deletes & Mods

• Adds & Deletes & Mods

• Relative values: changed lines/LOC
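A short sketch of how a changed file could be assigned to one of the seven combinations above from its line counts. The function name, the label format, and the sample data are hypothetical.

```python
from collections import Counter

def change_combination(added: int, deleted: int, modified: int) -> str:
    """Label a file by which kinds of line change it received."""
    parts = [name for name, count in (("Adds", added),
                                      ("Deletes", deleted),
                                      ("Mods", modified)) if count > 0]
    return " & ".join(parts) if parts else "Unchanged"

# Hypothetical (added, deleted, modified) counts for five files.
sample = [(10, 0, 2), (0, 3, 0), (4, 1, 7), (0, 0, 0), (10, 0, 0)]
tally = Counter(change_combination(a, d, m) for a, d, m in sample)
```

`tally` then holds one count per combination, which is the kind of breakdown charted on the following slides.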

Distribution of change combinations, all check-ins, all releases: Provisioning system

Number of files, by combination:

• Mods: 683

• Deletes: 296

• Adds: 597

• Mods & Deletes: 168

• Mods & Adds: 1894

• Deletes & Adds: 126

• Mods & Deletes & Adds: 2625

Average lines touched for each combination of changes

• Mods: 4

• Deletes: 5

• Adds: 21

• Mods & Deletes: 23

• Mods & Adds: 37

• Deletes & Adds: 21

• Mods & Deletes & Adds: 210

Faults per file, changed files only: Provisioning system

• Mods: 0.19

• Deletes: 0.04

• Adds: 0.3

• Mods & Deletes: 0.36

• Mods & Adds: 0.55

• Deletes & Adds: 0.5

• Mods & Deletes & Adds: 1.38
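As a consistency check on the Provisioning figures, the per-combination fault rates can be weighted by the per-combination counts. The sketch below assumes the counts from the "Number of Files" chart (683, 296, 597, 168, 1894, 126, 2625) are the right weights for the faults-per-file values; that pairing is an assumption, not something the slides state.

```python
# Counts and fault rates copied from the Provisioning-system slides.
files = {
    "Mods": 683, "Deletes": 296, "Adds": 597,
    "Mods & Deletes": 168, "Mods & Adds": 1894,
    "Deletes & Adds": 126, "Mods & Deletes & Adds": 2625,
}
faults_per_file = {
    "Mods": 0.19, "Deletes": 0.04, "Adds": 0.3,
    "Mods & Deletes": 0.36, "Mods & Adds": 0.55,
    "Deletes & Adds": 0.5, "Mods & Deletes & Adds": 1.38,
}

total_files = sum(files.values())
implied_faults = sum(files[k] * faults_per_file[k] for k in files)
weighted_mean = implied_faults / total_files

print(round(weighted_mean, 2))  # prints 0.8
```

The weighted mean of about 0.80 agrees with the Changed mean reported for the Provisioning system. Under the same assumption, files that received all three kinds of change account for roughly 71% of the implied faults.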

Fault prediction models

•Univariate models

•Base model: log(KLOC), File age, File type

•Augmented models:

• Previous Changes

• Previous {Adds / Deletes / Mods}

• Previous Adds + Deletes + Modifications

• Previous {Adds / Deletes / Mods} / LOC (relative churn)

• Previous Developers
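A sketch of how the predictor row for a base or augmented model might be assembled, assuming the base model uses log(KLOC), file age, and a one-hot encoded file type, and that the augmenting variable enters raw or after a square-root or fourth-root transform. The function names and the file-type subset are hypothetical.

```python
import math

# Hypothetical subset of file types; the first one is the reference level.
FILE_TYPES = ["C", "C++", "java", "sql"]

def transform(value: float, kind: str) -> float:
    """Apply one of the three candidate transforms to a churn variable."""
    if kind == "raw":
        return float(value)
    if kind == "sqrt":
        return math.sqrt(value)
    if kind == "fourth_root":
        return value ** 0.25
    raise ValueError(f"unknown transform: {kind}")

def base_row(kloc: float, age: int, file_type: str) -> list:
    """Intercept, log(KLOC), file age, then file-type indicators."""
    one_hot = [1.0 if file_type == t else 0.0 for t in FILE_TYPES[1:]]
    return [1.0, math.log(kloc), float(age)] + one_hot

def augmented_row(kloc, age, file_type, churn_value, kind="sqrt"):
    """Base row plus a single transformed change variable."""
    return base_row(kloc, age, file_type) + [transform(churn_value, kind)]

row = augmented_row(1.0, 3, "java", churn_value=16, kind="fourth_root")
```

Each augmented model in the list above would swap in a different `churn_value` (previous changes, previous adds, and so on) while keeping the base columns fixed, so that FPA differences can be attributed to the added variable.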

Fault-percentile averages for univariate predictor models: Provisioning system (best result from raw variable, square root, fourth root)

[Bar chart: FPA, univariate models. Axis: 0% to 100%. Bars, in the order shown: log(KLOC), Prior Changes, Prior Adds+Deletes+Mods, Prior Developers, Prior Lines Added, Prior Lines Modified, Prior Changed, Prior Faults, Prior Lines Deleted, Language, Standard Model.]

Base Model 1

• KLOC

Base Model 1, and added variables

[Bar chart: Mean FPA, Provisioning System. Axis: 89 to 94. Bars, in the order shown: Base 1, prev-prev changes, prev-deletes, prev-mods, prev-changed, prev-adds, prev-developers, (prev-adds,dels,mods)/LOC, prev-adds,dels,mods, prev-changes, Standard Model.]

[Bar chart: Mean FPA, Utility System. Axis: 87 to 93. Bars, in the order shown: Base 1, prev-prev changes, prev-deletes, prev-mods, prev-changed, prev-adds, prev-developers, prev-adds,dels,mods, prev-changes, Standard Model.]

Base Model 2

• KLOC

• (Previous changes)^(1/2)

Base Model 2, and added variables

[Bar chart: Mean FPA, Provisioning System. Axis: 93.2 to 93.55. Bars, in the order shown: Base 2, prev-changed, prev-deletes, (prev-adds,dels,mods)/LOC, prev-developers, prev-mods, prev-adds, prev-adds,dels,mods, prev-prev changes.]

Summary

• Churn can be an effective aid for improving fault prediction

• {Adds+Deletes+Mods} improves the accuracy of a model that doesn’t include any change information

• A simple count of prior changes slightly outperforms {Adds+Deletes+Mods}

• The binary Prior Changed indicator is nearly as good as either when added to a model without change information

• Lines added is the most effective single predictor

• Lines deleted is the least effective single predictor

• Relative churn is no better than absolute churn for predicting total fault count
