Software Defect Repair Times: A Multiplicative Model

© 2006 Cisco Systems, Inc. All rights reserved.


Robert Mullen

Cisco Systems

Boxborough MA

bomullen @ cisco.com

Swapna S. Gokhale

Univ. of Connecticut

Storrs CT

ssg @ engr.uconn.edu

Mullen, Gokhale, PROMISE 2008

Outline

• The need for timely, correct fixes, and tracking.

• Two approaches, MTTR and Age; Tradeoffs.

• Log-transform of data, form of the distribution

• Multiplicative factors yield the lognormal

• Transformation from rates to age

• Comparison of models

• Implications for management


Problem definition

• Our problem was to characterize and improve software defect repair times in order to improve both the reliability of released networking products and the time-to-market of products under development.

• Repair time runs from the date the defect record is created until the defect is repaired in at least one version.

• Neither the interval before the defect is recorded nor the interval until the fix is distributed is included.


One approach: Mean Time To Repair, MTTR (not today!)

• Little’s Law:

• average wait time = queue length / service rate

• Similar to days of accounts receivable or days of inventory; well understood by management and used as a goal at Cisco

• Both unfixed and recent fixes affect the result

• Integrate both queue length and service rate over 90 days

• Ordinarily track all dispositions, not just fixes

• Suitable for comparing products, teams, etc.

• Retrospective trending can be done using historical data
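As a rough illustration of the Little's Law calculation above, the sketch below averages the open-defect queue and the daily disposition rate over a 90-day window and divides one by the other. The function name `mttr_days` and the sample numbers are hypothetical, not Cisco data.

```python
from statistics import mean

def mttr_days(open_counts, dispositions_per_day):
    """Little's Law estimate: average wait = average queue length / average service rate.

    open_counts          -- number of open defects on each day of the window
    dispositions_per_day -- number of defects dispositioned on each day
    """
    if len(open_counts) != len(dispositions_per_day):
        raise ValueError("both series must cover the same days")
    service_rate = mean(dispositions_per_day)
    if service_rate == 0:
        raise ValueError("no dispositions in the window; MTTR is undefined")
    return mean(open_counts) / service_rate

# Hypothetical 90-day window: 400 open defects and 10 dispositions on every day.
window_open = [400] * 90
window_dispositions = [10] * 90
estimate = mttr_days(window_open, window_dispositions)  # 400 / 10 = 40.0 days
```

Because the queue includes unfixed defects and the rate includes recent fixes, a growing backlog raises the estimate even if the most recent fixes were fast.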


Second approach: Measuring age at fix

• Closed bugs: age is interval from creation to fix

• Open bugs: age is from creation to present

• Not studied here; distribution may differ from Closed.

• Average age of open or average age of closed can be erratic if there are outliers

• Controlling variability depends on preventing outliers.

• Data collection: pick a product and a range of time during which > 1000 defects were fixed. Determine the age of each defect at the time it was fixed.

• We included only defects for which there was a fix, not other dispositions.
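A minimal sketch of the age-at-fix measurement described above, using made-up creation and fix dates. It also shows why an average age can be erratic: one long-lived defect moves the mean far more than the median. The record layout and the dates are assumptions for illustration only.

```python
from datetime import date
from statistics import mean, median

def age_at_fix(created, fixed):
    """Age in days from record creation to first fix (0 = fixed the same day)."""
    return (fixed - created).days

# Hypothetical defect records: (date created, date fixed)
records = [
    (date(2007, 1, 2), date(2007, 1, 2)),
    (date(2007, 1, 5), date(2007, 1, 12)),
    (date(2007, 1, 9), date(2007, 2, 15)),
    (date(2007, 2, 1), date(2007, 4, 2)),
]
ages = [age_at_fix(created, fixed) for created, fixed in records]  # [0, 7, 37, 60]

# Add a single outlier that stayed open for 900 days.
ages_with_outlier = ages + [900]

print("mean  :", mean(ages), "->", mean(ages_with_outlier))
print("median:", median(ages), "->", median(ages_with_outlier))
```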


One year, Severities 1-3, Linear plot

• Very skewed distribution

• Median 37 days

• Mean 81

• Std. dev. 147

• 85%-ile 139

Figure: Percent bugs fixed in under N days (0% to 100%) versus N days, for Bug Sev 1, Bug Sev 2, and Bug Sev 3.


One year, Severities 1-3, Log plot

• Same chart but Log scale.

• Log chart shows distinct S curve

• Lower counts for S1 yield relatively greater fluctuations.

• Severe bugs (S1, S2) get faster service except for tail.

Figure: Percent bugs fixed in under N days (0% to 100%) versus ln N days, for Bug Sev 1, Bug Sev 2, and Bug Sev 3.


Lognormal provides excellent fit

• N >> 1000. In this case Age at Fix is visually identical with the Lognormal.

• "The lognormal is the most commonly used distribution in maintainability analysis because it is considered representative of the distribution of most repair times." (MIL-HDBK-470A, 4.4.3.1)

• Note for later: the fitted lognormal is slightly lower at the left edge

Figure: Cumulative percent of bugs fixed (0 to 100) as a function of age at fix (MRV), in two panels: versus ln Age at Fix (MRV) and versus Age at Fix (MRV). Series: Num Defects, Fitted Lognormal.


Relationship between the mean and variance of the Log(age) and of the age itself

• Mean (Log(age)) = μ

• Variance (Log(age)) = σ²

• Median (Log(age)) = μ

• Median (age) = exp(μ)

• Mean (age) = exp(μ + σ²/2)

• Variance (age) = exp(2μ + σ²) (exp(σ²) - 1)

Example Values

Mean Log(age)   Stdev Log(age)   Mean age   Stdev age
2.15            1.70             31         77
2.26            1.69             34         103
2.35            1.66             37         128
3.17            1.52             65         126
3.30            1.50             73         140
3.46            1.47             81         147
3.0             1.5              62         180
3.0             1.6              72         250
3.0             1.7              85         351
2.5             1.6              44         151
3.0             1.6              72         250
3.5             1.6              119        411
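The lognormal relationships above can be checked numerically. The sketch below takes mu and sigma to be the mean and standard deviation of ln(age) and evaluates the standard lognormal formulas for the median, mean, and standard deviation of the age. It is run on the last six example rows (those with mean log 3.0 or standard deviation 1.6), whose first two columns are assumed to be mu and sigma. The slide does not say how the first six rows were obtained, so they are not checked here.

```python
import math

def lognormal_moments(mu, sigma):
    """Return (median, mean, stdev) of age when ln(age) ~ Normal(mu, sigma**2)."""
    var_log = sigma ** 2
    median = math.exp(mu)
    mean = math.exp(mu + var_log / 2)
    variance = math.exp(2 * mu + var_log) * (math.exp(var_log) - 1)
    return median, mean, math.sqrt(variance)

# (mu, sigma) pairs taken from the example values; assumed column meaning.
for mu, sigma in [(3.0, 1.5), (3.0, 1.6), (3.0, 1.7), (2.5, 1.6), (3.5, 1.6)]:
    median, mean, stdev = lognormal_moments(mu, sigma)
    print(f"mu={mu} sigma={sigma} median={median:.0f} mean={mean:.0f} stdev={stdev:.0f}")
```

Note how a small change in sigma (1.5 to 1.7 at mu = 3.0) nearly doubles the standard deviation of the age, which is why the tail dominates averages.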


Why might the Ages be Lognormal?

• The Lognormal can be generated when a random variable is the product of other random variables, just as a Normal distribution can be generated by summing random variables.

• Informally, the conditions are that the constituent random factors be substantially independent, that no one variable dominate the others, and that there be a large number of factors.

• We propose a hypothetical model of the defect repair process including realistic multiplicative factors and approximating the mathematical conditions.


Seven hypothetical factors affecting resolution time

Drawn from experience and COCOMO

PRIORITY   BUG CLARITY    DIFFICULTY   SPEED       SKILLS      RESOURCES     TOOLS
P1         Complete       Obvious      Superstar   Practiced   Available     Specific
P2         Well Written   Moderate     Fast        Moderate    Shared/Wait   Workable
P3         Oversights     Hard         Average     Minimal     Remote        Inadequate
           Misleading     Subtle       Slow        Novice      Substitute    None


Seven hypothetical factors: tentative distributions

There is a 2% probability the Priority is P1, and if so the time multiplier is .5, etc.

For Severity and the other 6 dimensions there is a probability distribution of levels of difficulty.

We model the distributions by a discrete distribution with 3 or 4 relative levels of difficulty, each with a given probability.

Probabilities add to 1.0, i.e. 100%.

For each factor, we know the variance of the log.

Column groups (left to right): Defect, Personnel, Process, Support

        Severity       Clarity        Difficulty     Speed          Skills         Resources      Tools
        Prob.  Value   Prob.  Value   Prob.  Value   Prob.  Value   Prob.  Value   Prob.  Value   Prob.  Value
        0.02   1       0.20   1       0.10   0.5     0.05   0.5     0.20   0.6     0.40   1       0.15   0.8
        0.19   1.78    0.30   1.5     0.40   1       0.45   1       0.30   0.9     0.30   1.5     0.35   0.9
        0.79   3.49    0.30   2.5     0.30   2       0.30   3       0.30   1.2     0.20   2       0.35   1.1
                       0.20   4       0.20   3       0.20   10      0.20   1.7     0.10   4       0.15   1.4
Var.    0.04           0.22           0.30           0.87           0.12           0.18           0.077


Are seven factors enough to generate a lognormal?

• MONTE CARLO: randomly chose sets of 7 factors, based on their distributions

• Summing the variances of the factors, we expect σ = 1.372

• Monte Carlo yielded σ = 1.36, no surprise

Figure: Cumulative percent (0 to 100) versus ln (product of rate factors) (0 to 7). Series: Monte Carlo, Lognormal.
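The Monte Carlo experiment above can be sketched as follows. The probability and value pairs are copied from the tentative-distribution table. The sketch recomputes each factor's log-variance from those pairs instead of using the rounded "Var" row, and it reports the standard deviation of ln(product), which is assumed here to be the quantity the slide quotes as 1.372 and 1.36. The seed and sample size are arbitrary choices.

```python
import math
import random

# (probability, time multiplier) pairs per factor, as read from the table above.
FACTORS = {
    "Severity":   [(0.02, 1), (0.19, 1.78), (0.79, 3.49)],
    "Clarity":    [(0.20, 1), (0.30, 1.5), (0.30, 2.5), (0.20, 4)],
    "Difficulty": [(0.10, 0.5), (0.40, 1), (0.30, 2), (0.20, 3)],
    "Speed":      [(0.05, 0.5), (0.45, 1), (0.30, 3), (0.20, 10)],
    "Skills":     [(0.20, 0.6), (0.30, 0.9), (0.30, 1.2), (0.20, 1.7)],
    "Resources":  [(0.40, 1), (0.30, 1.5), (0.20, 2), (0.10, 4)],
    "Tools":      [(0.15, 0.8), (0.35, 0.9), (0.35, 1.1), (0.15, 1.4)],
}

def log_variance(levels):
    """Variance of ln(value) for one discrete factor."""
    mean_log = sum(p * math.log(v) for p, v in levels)
    return sum(p * (math.log(v) - mean_log) ** 2 for p, v in levels)

def simulate_log_products(n, rng):
    """Draw n sets of seven factors and return ln of each product."""
    tables = [([math.log(v) for _, v in levels], [p for p, _ in levels])
              for levels in FACTORS.values()]
    return [sum(rng.choices(logs, weights=probs)[0] for logs, probs in tables)
            for _ in range(n)]

# Independent factors: variances of the logs add.
analytic_sigma = math.sqrt(sum(log_variance(levels) for levels in FACTORS.values()))

rng = random.Random(2008)
samples = simulate_log_products(20000, rng)
mean_log = sum(samples) / len(samples)
mc_sigma = math.sqrt(sum((x - mean_log) ** 2 for x in samples) / (len(samples) - 1))

print(f"analytic sigma = {analytic_sigma:.3f}, Monte Carlo sigma = {mc_sigma:.3f}")
```

With the table as transcribed, both figures come out at roughly 1.36. A histogram of `samples` can then be compared with a normal curve of the same mean and sigma to judge how close seven factors get to the limiting shape.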


Data: number of defects fixed in N days or less

• For fitting models to data the defects were grouped in over 30 buckets representing ranges of ages

• Age zero means bug fixed on day it arrived.

AGE (days)   B      C      G
0            118    123    138
1            230    207    218
2            288    257    263
3            343    300    304
4            384    345    345
5            409    394    368
6            459    449    401
7            507    509    435
8            550    539    464
9            577    565    488
11           598    605    522
13           650    641    550
15           697    691    575
18           738    727    601
21           778    777    625
25           816    814    648
29           852    846    688
34           881    877    718
47           961    933    785
55           990    952    814
64           1017   967    839
74           1038   985    864
86           1054   1001   883
99           1064   1017   903
114          1075   1028   926
132          1082   1036   956
175          1093   1058   1000
202          1097   1063   1020
233          1102   1071   1039
268          1103   1074   1058
309          1105   1080   1073
356          1107   1081   1086
410          1112   1085   1099
Total        1125   1096   1139


Mean AGE 30.733 32.400 69.709

s.d. AGE 73.560 75.925 125.413
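To show how the cumulative counts are read, the sketch below converts the first rows of the table (ages 0 through 15) into fractions of each product's total and finds the first listed age by which a given fraction of defects was fixed. The function name is hypothetical, and the answer is only as fine as the age buckets listed.

```python
# Cumulative counts of defects fixed within AGE days (rows 0-15 of the table above).
AGES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 15]
CUMULATIVE = {
    "B": [118, 230, 288, 343, 384, 409, 459, 507, 550, 577, 598, 650, 697],
    "C": [123, 207, 257, 300, 345, 394, 449, 509, 539, 565, 605, 641, 691],
    "G": [138, 218, 263, 304, 345, 368, 401, 435, 464, 488, 522, 550, 575],
}
TOTALS = {"B": 1125, "C": 1096, "G": 1139}

def first_age_reaching(product, fraction):
    """Smallest listed age at which at least `fraction` of the defects are fixed.

    Returns None if the fraction is not reached within the rows included here.
    """
    target = fraction * TOTALS[product]
    for age, count in zip(AGES, CUMULATIVE[product]):
        if count >= target:
            return age
    return None

for product in ("B", "C", "G"):
    print(product, "half fixed by day", first_age_reaching(product, 0.5))
```

For these rows the half-way point is reached at day 9 for B and C and at day 15 for G, far below the mean ages reported above, as expected for a skewed distribution.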


Nine product families

Figure 3: Repair AGE CDFs for nine products (x-axis in days, log scale from 0.1 to 1000; y-axis 0% to 100%)


Models considered

• We have an explanation for why rates may be lognormal

• But

• The fit near the origin is not quite right

• The actual age at fix depends on other random conditions

• We use the Laplace Transform (Miller-1985) to convert from rates to times.

• We compare three models

• Exponential (commonly used)

• Lognormal (commonly used)

• Laplace Transform of Lognormal


Conversion from rates (LN) to times (LTLN)

• Doubly stochastic

• Select rate from lognormal

• Select time from exponential, given that rate.

Figure: Cumulative percent (0 to 100) versus a log-scale x-axis from 1 to 1000. Series: LTLN (σ=1.0), LN (σ=1.0).

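$$dL(\lambda) = \frac{1}{\lambda \sigma \sqrt{2\pi}} \exp\!\left(-\frac{(\ln \lambda - \mu)^2}{2\sigma^2}\right) d\lambda$$

$$M(t) = \int_0^\infty \left(1 - \exp(-\lambda t)\right) dL(\lambda)$$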
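The doubly stochastic construction can be sketched in two ways that should agree: by simulation (draw a rate from the lognormal, then a time from the exponential with that rate) and by numerically integrating the expected value of 1 - exp(-rate * t) over the lognormal rate distribution. Here mu and sigma are the mean and standard deviation of ln(rate). The value sigma = 1.0 matches the figure legend, while mu = -2.6, the trapezoid grid, and the seed are arbitrary assumptions for illustration.

```python
import math
import random

def ltln_cdf(t, mu, sigma, steps=2001, z_max=8.0):
    """Fraction repaired by time t: M(t) = E[1 - exp(-rate * t)],
    where ln(rate) ~ Normal(mu, sigma**2).

    Uses the trapezoid rule over z = (ln(rate) - mu) / sigma on [-z_max, z_max].
    """
    if t <= 0:
        return 0.0
    h = 2.0 * z_max / (steps - 1)
    total = 0.0
    for i in range(steps):
        z = -z_max + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0
        rate = math.exp(mu + sigma * z)
        total += weight * (1.0 - math.exp(-rate * t)) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def sample_repair_time(mu, sigma, rng):
    """Doubly stochastic draw: a lognormal rate, then an exponential time given that rate."""
    rate = rng.lognormvariate(mu, sigma)
    return rng.expovariate(rate)

MU, SIGMA = -2.6, 1.0  # hypothetical parameters of ln(rate), rate per day
rng = random.Random(42)
draws = [sample_repair_time(MU, SIGMA, rng) for _ in range(20000)]
simulated = sum(d <= 10.0 for d in draws) / len(draws)
integrated = ltln_cdf(10.0, MU, SIGMA)
print(f"P(time <= 10 days): simulated {simulated:.3f}, integrated {integrated:.3f}")
```

Compared with a plain lognormal of times, the exponential step puts more probability near zero, which is the region where the lognormal fit was noted to run slightly low.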


Comparing product families & models

• AIC = - 2 * log_likelihood + 2 * num_parameters

Product   LTLN        LN          EXP         AIC
Family    Neg. LLH    Neg. LLH    Neg. LLH    LTLN vs LN
B         142.71      159.04      517.66      32.67
C         138.64      155.82      526.54      34.35
G         141.57      149.19      703.69      15.25
M         148.48      165.46      982.63      33.97
N         133.54      155.06      529.42      43.03
S         142.53      146.89      983.76      8.72
T         165.01      169.64      869.63      9.26
U         138.93      154.94      489.80      32.02
Y         152.23      182.55      649.73      60.65
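The AIC column can be reproduced from the negative log-likelihoods. The sketch below assumes that LTLN and LN each have two fitted parameters, so the parameter terms cancel and the difference reduces to twice the gap in negative log-likelihood. The reported values are copied from the table, and agreement is expected only to within rounding.

```python
def aic(neg_log_likelihood, num_parameters):
    """AIC = -2 * log_likelihood + 2 * num_parameters, given the negative log-likelihood."""
    return 2.0 * neg_log_likelihood + 2.0 * num_parameters

NUM_PARAMS = 2  # assumption: both LTLN and LN are fitted with two parameters

# product family: (LTLN neg. LLH, LN neg. LLH, reported AIC difference)
ROWS = {
    "B": (142.71, 159.04, 32.67),
    "C": (138.64, 155.82, 34.35),
    "G": (141.57, 149.19, 15.25),
    "M": (148.48, 165.46, 33.97),
    "N": (133.54, 155.06, 43.03),
    "S": (142.53, 146.89, 8.72),
    "T": (165.01, 169.64, 9.26),
    "U": (138.93, 154.94, 32.02),
    "Y": (152.23, 182.55, 60.65),
}

def aic_advantage(ltln_nll, ln_nll):
    """AIC(LN) - AIC(LTLN); positive values favor LTLN."""
    return aic(ln_nll, NUM_PARAMS) - aic(ltln_nll, NUM_PARAMS)

for family, (ltln_nll, ln_nll, reported) in ROWS.items():
    print(f"{family}: computed {aic_advantage(ltln_nll, ln_nll):.2f}, reported {reported:.2f}")
```

Every family shows a positive difference, so under this reading LTLN is preferred over LN for all nine products, most narrowly for S and T.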

Product   LTLN              LN
Family    σ       μ         σ       μ
B         1.350   -2.602    1.883   2.024
C         1.375   -2.625    1.897   2.050
G         2.028   -3.127    2.443   2.574
M         1.631   -2.616    2.082   2.050
N         1.301   -3.050    1.844   2.476
S         1.781   -2.462    2.167   1.909
T         1.619   -2.867    2.036   2.312
U         1.365   -2.389    1.867   1.818
Y         1.256   -2.586    1.804   2.009


Implications for management

• Result: Factors surely multiplicative

• Suggest: Estimate and manage factors

• Training for novice engineers, or teaming

• Tools for difficult problems

• Documentation for difficult subsystems

• Reduction of classification errors by training

• Classification makes a difference

• Tail on S1 distribution may be due to conversion of S2, even S3, to S1 after some aging.


Opportunities

• Can we make a combined model (occurrence, repair) ?

• Repair times of defects are LT-Lognormal (PROMISE-2008)

• Defect occurrence rates are Lognormal (ISSRE-1998)

• Occurrence counts are Discrete Lognormal (ISSRE-2005)

• What is the typical range for sigma? How hard is it to change?


Other Lognormal Relationships

Figure: Distribution of Ln of Product of 15 Uniform(0:1) Random Numbers (N=327868). x-axis from -30 to 0, y-axis counts from 0 to 4000. Series: MonteCarlo, Normal.

Figure: Cumulative faults discovered as a function of time and rate. y-axis: proportion found (0.0 to 1.0). x-axis: time, log scale from 0.4 to 1096.6.

Trouble Tickets = Discrete-LN

SRGM = Cumulative Defects = Laplace Transform of LN

COMMON     UNCOMMON     RARE
Read       Open         Create
Local      Nearby       Distant
By book    User error   UBD
IO works   IO error     Removed
ETC        ETC          ETC

Test Strategy

Raising the rare rates tenfold will find rare-rare interactions 100 times as fast.

Equivalent to Heat/Power/Temp “corner testing” of HW.

Multiplicative Rates Limiting Distribution = Lognormal

Triggering Conditions

Release Strategy

Is it ready? Which is best?

States, Usage, Code

Repair Strategy

Risk vs. Benefit ?

Figure: First Year. Log of number of bugs with N tickets versus N tickets (N = 1 to 39). Series: data, fitted.


Further Reading

• MIL-HDBK-470A: Designing and Developing Maintainable Products and Systems, 4.4.3.1 - Lognormal Distribution, Aug 1997. (Lognormal is representative of most repair times.)

• R. Mullen, Lognormal Distribution of Software Failure Rates: Origin and Evidence, ISSRE 1998. (re Central Limit Theorem and Lognormal.)

• R. Mullen and S. Gokhale: Software Defect Rediscoveries: A Discrete Lognormal Model, ISSRE 2005. (Further references to Lognormal in SW.)

• B. Schroeder and G. Gibson, A large-scale study of failures in high-performance-computing systems, CMU-PDL-05-112, Dec 2005. Later in DSN-2006. (Lognormal provides best fit for repair times.)


Thank you & Questions

Bob Mullen

bomullen @ cisco.com

Swapna Gokhale

ssg @ engr.uconn.edu
