A PROCESS TO IMPROVE THE ACCURACY OF MKII FP TO COSMIC SIZE CONVERSION: INSIGHTS INTO THE COSMIC METHOD DESIGN ASSUMPTIONS

IWSM/Mensura, Krakow, October 2015

Aveek Dasgupta (SITA), Cigdem Gencel (DEISER), Charles Symons (COSMIC)

Objectives

§  To present results of MkII to COSMIC functional size conversion by:
    §  statistical analysis
    §  a calculation method with ‘functional profiling’

§  To suggest how these ideas might be applied for IFPUG to COSMIC size conversion

§  To present some new insights into COSMIC method design assumptions

Agenda

§  Overview of MkII and COSMIC Functional Size Measurement (FSM) methods

§  Data sources

§  Statistical conversion of MkII to COSMIC sizes

§  A calculation method for MkII to COSMIC sizes

§  Conclusions

The MkII and COSMIC FSM methods have a similar structure

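[Diagram: the two measurement models shown side by side]

COSMIC FSM
§  Functional User Requirements (1) consist of Functional Processes (n)
§  Each Functional Process (1) consists of Data Movements (2 - n), which account for data manipulation:
    •  Entries (≈ Input)
    •  Reads + Writes (≈ Process)
    •  Exits (≈ Output)

MkII FPA
§  Functional User Requirements (1) consist of Logical Transactions (n)
§  Each Logical Transaction (1) consists of one Input, one Process and one Output component, which account for data manipulation:
    •  Input: DET’s (1-n)
    •  Process: ER’s (1-n)
    •  Output: DET’s (1-n)

A COSMIC Functional Process is equivalent to a MkII Logical Transaction.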

The methods have different rules for measuring Functional Size

MkII Logical Transaction Size = 0.58 x (# Input DET’s) + 1.66 x (# Entity References) + 0.26 x (# Output DET’s)

COSMIC Functional Process Size = # Entries + # Reads & Writes + # Exits


Minimum size: 2.5 MkII FP; 2 CFP

Maximum size: (No limit)
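As a small illustration of the two sizing rules above, the sketch below computes the size of a single logical transaction and a single functional process. The function and parameter names are illustrative assumptions, not part of either method's documentation. The COSMIC minimum is taken as one Entry plus one Exit.

```python
def mkii_transaction_size(input_dets, entity_refs, output_dets):
    """MkII FP size of one logical transaction, using the weights quoted above."""
    return 0.58 * input_dets + 1.66 * entity_refs + 0.26 * output_dets


def cosmic_process_size(entries, reads, writes, exits):
    """COSMIC size (CFP) of one functional process: one CFP per data movement."""
    return entries + reads + writes + exits


# Smallest cases: one input DET, one entity reference and one output DET
# for MkII; one Entry and one Exit for COSMIC (assumed minimum process).
min_mkii = mkii_transaction_size(1, 1, 1)
min_cfp = cosmic_process_size(1, 0, 0, 1)

print(round(min_mkii, 2), min_cfp, round(min_cfp / min_mkii, 2))
```

The ratio of the two minimum sizes (2 / 2.5 = 0.8) is the value the fitted slopes are compared against later in the slides.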

Some differences, some similarities

MkII FP
§  Weights of I/P/O components calibrated for development effort
§  ‘Entity references’ are only for stored data
§  Size of changes: measure changed # DET’s and ER’s

COSMIC FSM
§  No weights
§  All data movement types (E, R, W, X) move data about ‘Objects of interest’
§  Size of changes: measure changed # E’s, R’s, W’s, X’s

We had 22 pairs of MkII & COSMIC size measurements from five organizations

Org.  Domain       # Systems  Size Range (CFP)
A     Control      4          251 – 3524
C     Control      2          275 – 321
B     Information  1          1029
D     Information  2          1113 – 1947
S     Information  13         148 – 1029

S = SITA (Société Internationale de Télécommunications Aéronautiques)

A basic process to develop a MkII to COSMIC size conversion formula

Ideally:
§  Measure at least 10 software items, with a ‘common profile’, on both methods
§  Plot pairs of (MkII, CFP) sizes and review outliers
§  Fit a straight line and use this for converting MkII to CFP sizes

In practice:
§  Given the minimum size of a functional process is 2.5 MkII FP or 2.0 CFP, we fitted straight lines that are constrained to pass through the origin (0,0)
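A straight line constrained to pass through the origin has a closed-form least-squares slope, slope = ∑(x·y) / ∑(x²). The sketch below shows one way this could be computed. The size pairs are made-up illustrative values, not the study data, and the function name is an assumption.

```python
def fit_through_origin(mkii_sizes, cfp_sizes):
    """Least-squares slope of the line y = slope * x (no intercept term)."""
    sum_xy = sum(x * y for x, y in zip(mkii_sizes, cfp_sizes))
    sum_xx = sum(x * x for x in mkii_sizes)
    return sum_xy / sum_xx


# Hypothetical (MkII FP, CFP) pairs, for illustration only.
mkii = [200.0, 500.0, 1000.0, 1500.0]
cfp = [150.0, 410.0, 790.0, 1180.0]

slope = fit_through_origin(mkii, cfp)
predicted_cfp = [slope * x for x in mkii]  # converted sizes

print(round(slope, 4))
```

Because the squared residuals grow with size, the largest items dominate the fitted slope. This is one reason why very large data points deserve a careful look before fitting.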

A first plot of all 22 data points

[Figure: scatter plot of measured CFP sizes (y-axis, 0 to 4000) against measured MkII sizes (x-axis, 0 to 5000); one series per organization: A, B, C, D, S]

•  MkII/COSMIC sizes correlate well, in spite of multiple sources of data from two domains.

•  Two outliers?

The OLS* fitted lines for Control and Information systems are very similar

[Figure: measured CFP size (y-axis, 0 to 4000) against measured MkII FP size (x-axis, 0 to 5000) for Control systems and Information systems, with fitted lines y = 0.8017x (R² = 0.999) and y = 0.7371x (R² = 0.9828)]

Note: the slopes of both lines are close to 0.8 (= ratio of minimum CFP to MkII FP sizes)

* OLS = Ordinary Least Squares

In spite of a high R², an OLS-fitted line may not predict COSMIC sizes very accurately

[Figure: COSMIC vs MkII size for 13 SITA Information systems; COSMIC FP size (y-axis, 0 to 2500) against MkII FP size (x-axis, 0 to 2500), with fitted line y = 0.7605x (R² = 0.9957)]

Accuracy of COSMIC sizes predicted from the OLS-fitted line:
§  Av. of absolute differences: 6%
§  # under-sized items: 4
§  # over-sized items: 9
§  3 x highest % differences: 28%, 13%, 7.8%

A homogeneous dataset?
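The accuracy figures quoted above can be computed along the following lines. This is a sketch under stated assumptions: percentage differences are taken relative to the measured CFP size, and an item counts as ‘under-sized’ when its predicted size is below the measured one. The slides do not spell out either convention, and the sample values are made up.

```python
def conversion_accuracy(measured_cfp, predicted_cfp, top_n=3):
    """Summarise how well predicted CFP sizes match measured CFP sizes."""
    # Signed % difference of each prediction, relative to the measured size.
    diffs = [100.0 * (p - m) / m for m, p in zip(measured_cfp, predicted_cfp)]
    abs_diffs = [abs(d) for d in diffs]
    return {
        "avg_abs_pct_diff": sum(abs_diffs) / len(abs_diffs),
        "under_sized": sum(1 for d in diffs if d < 0),
        "over_sized": sum(1 for d in diffs if d > 0),
        "highest_pct_diffs": sorted(abs_diffs, reverse=True)[:top_n],
    }


# Hypothetical measured and predicted sizes, for illustration only.
report = conversion_accuracy([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(report)
```

Reporting the highest individual differences alongside the average makes visible the point of this slide: a low average error can hide large errors on individual items.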

Statistical conversion methods have two fundamental weaknesses

1.  We can eliminate outliers from the sample used to establish the conversion formula, BUT how can we predict potential outliers amongst the other measurements to be converted?

2.  Converted sizes may have a low average error, BUT individual converted sizes may have very significant errors

Two criteria for outlier rejection

1.  Discard data points in the sample well outside the upper size limit of most data points. They will contribute too much weight in OLS curve fitting

2.  Use a ‘profiling’ method to discard other outliers:
    §  on the sample sizes-to-be-converted to help form homogeneous datasets,
    §  that can also be used to predict potential outliers for the mass of sizes-to-be-converted

We used an ‘IPO Profiling’ test for dataset homogeneity

‘IPO Profile’ = % contributions to total size of the Input/Process/Output components
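A minimal sketch of how an IPO profile could be computed for one software item on each method. It assumes the MkII contributions are the weighted component sizes (0.58 per input DET, 1.66 per entity reference, 0.26 per output DET) and that the COSMIC contributions are Entries, Reads + Writes and Exits. The function names and sample counts are illustrative, not taken from the study.

```python
def ipo_profile(input_size, process_size, output_size):
    """Return the (Input, Process, Output) shares of total size, in percent."""
    total = input_size + process_size + output_size
    return tuple(100.0 * part / total
                 for part in (input_size, process_size, output_size))


def mkii_ipo_profile(input_dets, entity_refs, output_dets):
    # Assumption: each component contributes its weighted MkII size.
    return ipo_profile(0.58 * input_dets, 1.66 * entity_refs, 0.26 * output_dets)


def cosmic_ipo_profile(entries, reads, writes, exits):
    # Entries ≈ Input, Reads + Writes ≈ Process, Exits ≈ Output.
    return ipo_profile(entries, reads + writes, exits)


# Hypothetical counts for one software item.
mkii_profile = mkii_ipo_profile(input_dets=100, entity_refs=50, output_dets=200)
cosmic_profile = cosmic_ipo_profile(entries=30, reads=25, writes=20, exits=25)
print([round(p, 1) for p in mkii_profile], cosmic_profile)
```

Items whose profiles differ markedly from the rest of a sample are candidates for exclusion before a conversion formula is fitted.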

[Figure: IPO profiles (MkII vs COSMIC) for the 13 SITA Information systems. Data from Orgs. B and D did not fit this profile.]

[Figure: IPO profiles (MkII vs COSMIC) for the 4 Org. A Control systems. Data from Org. C did not fit this profile.]

We could then decide on outliers intelligently.

[Figure: the scatter plot of measured CFP sizes (y-axis, 0 to 4000) against measured MkII sizes (x-axis, 0 to 5000) for Orgs. A, B, C, D and S, annotated with the outlier decisions below]

Reject this point because it is an outlier on size and profile.

We should really reject this point as an outlier on size.

Reject the Orgs. B, C and D points because they have different profiles.

Research idea: are there constant ratios between the sizes of the MkII and CFP I/P/O components?

1. Sum the Input DET’s, Output DET’s, ER’s, E’s, X’s, R’s and W’s over the whole set of software items, then compute the following ratios from these sums:

AIDE = Average Input DET’s per Entry = (∑ Input DET’s) / ∑ E’s

AODX = Average Output DET’s per Exit = (∑ Output DET’s) / ∑ X’s

AERP = Average Entity Refs per (R + W) data movement = ∑ ER’s / (∑ R’s + ∑ W’s)

2. Compute the CFP size of each individual software item from:

CFP = (∑ Input DET’s) / AIDE + (∑ Output DET’s) / AODX + (∑ ER’s) / AERP
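The two steps above can be sketched as follows. The dictionary keys and the sample counts are illustrative assumptions, not study data. Each item carries its MkII counts (input DET’s, output DET’s, entity references) and its COSMIC counts (Entries, Exits, Reads + Writes).

```python
def calibrate_ratios(items):
    """Step 1: compute AIDE, AODX and AERP from sums over the whole set."""
    aide = sum(i["input_dets"] for i in items) / sum(i["entries"] for i in items)
    aodx = sum(i["output_dets"] for i in items) / sum(i["exits"] for i in items)
    aerp = (sum(i["entity_refs"] for i in items)
            / sum(i["reads_writes"] for i in items))
    return aide, aodx, aerp


def predict_cfp(item, aide, aodx, aerp):
    """Step 2: predict the CFP size of one item from its MkII counts only."""
    return (item["input_dets"] / aide
            + item["output_dets"] / aodx
            + item["entity_refs"] / aerp)


# Hypothetical calibration sample of two software items.
sample = [
    {"input_dets": 120, "output_dets": 200, "entity_refs": 40,
     "entries": 30, "exits": 50, "reads_writes": 50},
    {"input_dets": 80, "output_dets": 100, "entity_refs": 20,
     "entries": 20, "exits": 50, "reads_writes": 30},
]

aide, aodx, aerp = calibrate_ratios(sample)
predictions = [predict_cfp(item, aide, aodx, aerp) for item in sample]
print(aide, aodx, aerp, [round(p, 1) for p in predictions])
```

By construction the predicted sizes sum to the measured total CFP of the calibration sample, so any error shows up only in how that total is distributed across individual items.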

Apply the ‘Calculation method’ to the 13 SITA systems

Accuracy of COSMIC sizes predicted from:

                              OLS-fitted line   Calculation (1)
Av. of absolute differences:  6.0%              6.7%
# under-sized items:          4                 6
# over-sized items:           9                 7
3 x highest % differences:    28%, 13%, 7.8%    18%, 11%, 11%

We then noticed that the values of IDE, ODX and ERP vary with MkII size

[Figure: IDE, ODX and ERP (y-axis, 0.0 to 5.0) plotted against MkII FP size (x-axis, 0 to 2500), each with an OLS-fitted line: y = -0.0005x + 3.962; y = -0.0003x + 3.4441; y = 4E-05x + 0.7069]

So (Calculation method 2): let’s use the values of IDE, ODX and ERP computed from these OLS fits instead of the averages used in Calculation method 1
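A sketch of how Calculation method 2 might be implemented. It assumes that IDE, ODX and ERP are the per-item counterparts of AIDE, AODX and AERP, and that each is modelled as a straight line (with intercept) in the item's MkII size. The helper names, dictionary keys and fit coefficients below are hypothetical and are not the values from the chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_cfp_method2(item, fits):
    """Predict CFP using ratios evaluated at the item's own MkII size."""
    x = item["mkii_size"]
    ide = fits["ide"][0] * x + fits["ide"][1]
    odx = fits["odx"][0] * x + fits["odx"][1]
    erp = fits["erp"][0] * x + fits["erp"][1]
    return (item["input_dets"] / ide
            + item["output_dets"] / odx
            + item["entity_refs"] / erp)


# In practice each (slope, intercept) pair would come from fit_line applied
# to the calibration sample, e.g. fit_line(mkii_sizes, ide_values).
# The pairs below are hypothetical.
fits = {"ide": (-0.0005, 4.0), "odx": (0.0, 3.0), "erp": (0.0, 0.75)}
item = {"mkii_size": 1000.0, "input_dets": 70, "output_dets": 90,
        "entity_refs": 30}

print(round(predict_cfp_method2(item, fits), 2))
```

The only change from Calculation method 1 is that the three divisors depend on the size of the item being converted rather than being constants for the whole set.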

The accuracy of predicted CFP sizes is much improved

Accuracy of COSMIC sizes predicted from:

                              OLS-fitted line   Calculation (1)   Calculation (2)
Av. of absolute differences:  6.0%              6.7%              3.8%
# under-sized items:          4                 6                 6
# over-sized items:           9                 7                 7
3 x highest % differences:    28%, 13%, 7.8%    18%, 11%, 11%     11%, 8.9%, 6.4%


We applied the same process to the four Org. A Control systems

                              OLS-fitted line   Calculation (1)   Calculation (2)
Av. of absolute differences:  9.7%              8.6%              6.6%
# under-sized items:          1                 2                 2
# over-sized items:           3                 2                 2
2 x highest % differences:    25%, 11%          17%, 8.4%         12%, 9.9%

(This result obviously has low statistical significance)

1. Convertibility studies have focused too much on finding ‘one-conversion-formula-for-all’

§  One simple statistically-based formula to convert sizes measured by method A to method B sizes is unlikely to be very accurate for all individual software sizes.

§  A better approach:
    §  Re-think the task as ‘define a process to predict method B total sizes from method A size measurement data’
    §  Apply ‘functional profiling’ to:
        §  check homogeneity of the sample measurements used to establish the conversion process
        §  predict which individual method A sizes will be ‘outliers’, i.e. will be inaccurately converted by the chosen process

2. A ‘calculation method’ to predict COSMIC sizes from IFPUG size data is worth exploring

If an organization has recorded the # DET’s and # FTR’s for each EP, then adapt the MkII to COSMIC calculated size conversion process:

•  Measure the IFPUG and COSMIC sizes for several software items that are assumed to have a common functional profile
•  Plot FP vs CFP total sizes; review for outliers
•  For each EI, EO and EQ:
    •  allocate # DET’s to input and output
    •  assume # FTR’s are equivalent to # (R + W)
•  Examine the ‘I/P/O functional profiles’ for the software items
•  Compute AIDE, AODX and AERP
•  Calculate CFP size of each individual software item

3. This study has given new insights into the COSMIC method design assumptions …

A legitimate question: does it matter for practical performance measurement and estimating that the COSMIC method:

§  ignores the number of DET’s on each data movement type?

§  does not weight the data movement types (E, X, R, W) for relative development effort?

(Both the IFPUG and MkII methods take these factors into account.)

… 3. MkII vs. COSMIC size comparisons suggest the COSMIC method design is well-founded

§  I/P/O size contributions are very similar

§  Total MkII & COSMIC sizes correlate very well

… in spite of the COSMIC sizes not accounting for DET’s and not being calibrated for development effort

[Figures repeated from earlier slides: measured CFP size against measured MkII FP size for Control and Information systems, with fitted lines y = 0.8017x (R² = 0.999) and y = 0.7371x (R² = 0.9828); IPO profiles (MkII vs COSMIC) for the two groups of systems]

Thank you for your attention

Charles Symons (www.cosmic-sizing.org)

[email protected]

www.cosmic-sizing.org
