A Replication Case Study to Measure the Architectural Quality of a Commercial System Presented by: Derek Reimanis

A Replication Case Study with CLIO


Page 1: A Replication Case Study with CLIO

A Replication Case Study to Measure the Architectural Quality of a Commercial System

Presented by: Derek Reimanis

Page 2: A Replication Case Study with CLIO

Modularity Violations

• Two components which change together, yet are not expected to change together

[Figure: Component A and Component B shown across successive releases]

Page 3: A Replication Case Study with CLIO

The CLIO Process

[Process diagram; labels listed in extraction order]

• Revision History

• Software Project

• Identify Historical Pairwise Dependencies

• Find Pairwise Dependencies

• Ticket History

• Gather Metrics Associated with Quality

• Correlate Quality Metrics and Modularity Violations

• Measure and Predict System Quality

• Isolate Groups of Affected Files

• Locate Unexpected Dependencies

• Visualize Groups to Understand Scope of Problems
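
The slides do not spell out how CLIO locates unexpected dependencies, so what follows is only a minimal sketch of the idea, based on the definition of a modularity violation given earlier: components that change together in the revision history without a known structural dependency between them. The function name, the `min_cochanges` threshold, and the sample data are all hypothetical and are not taken from CLIO.

```python
from collections import Counter
from itertools import combinations


def find_unexpected_pairs(commits, structural_deps, min_cochanges=2):
    """Return pairs that co-change at least min_cochanges times
    but have no known structural dependency (hypothetical rule)."""
    cochange = Counter()
    for files in commits:
        # Count each unordered pair once per commit.
        for a, b in combinations(sorted(set(files)), 2):
            cochange[(a, b)] += 1

    # Treat structural dependencies as undirected for the comparison.
    expected = {tuple(sorted(pair)) for pair in structural_deps}

    return sorted(
        pair for pair, n in cochange.items()
        if n >= min_cochanges and pair not in expected
    )


# Made-up revision history: each commit is the set of components it touched.
commits = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B"},
]
# Made-up structural dependencies (for example, from includes or calls).
structural_deps = [("C", "A")]

violations = find_unexpected_pairs(commits, structural_deps)
print(violations)
```

In this toy history, A and C also change together twice, but they are excluded because a structural dependency between them is already expected.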

Page 4: A Replication Case Study with CLIO

SVS7 Demographics

Factor | SVS7 | Baseline
Programming Language | C++ | Java
Number of Modules | 18 | 173
Number of Developers | Up to 11 | Up to 20
Project Lifetime | 4 years | 2 years
Number of Source Files | 3903 (1569 cpp, 267 c, 2067 h) | 900
Source Lines of Code (in thousands) | 1300 | 300

Golden Helix’s SNP & Variation Suite (SVS7)

Page 5: A Replication Case Study with CLIO

Metrics Associated with Quality

Define a file pair as a C/C++ source file together with its corresponding header file. Then, for each file pair u:

Metric | Description
File size | Size of u on disk
Fan-in | Sum of references pointing from a file pair v to u
Fan-out | Sum of references pointing from u to a file pair v
Change Frequency | The number of times u is modified in the commit log
Ticket Frequency | The number of times u is modified because of a ticket reference
Bug Change Frequency | The number of times u is modified because of a bug ticket reference
Pair Change Frequency | The number of times u and file pair v are modified in the same commit
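
The change-based metrics on this slide can be illustrated with a short sketch. It assumes the commit log has already been parsed into records; the record keys (`files`, `ticket`, `is_bug`), the function name, and the sample commits are hypothetical and do not reflect the tooling used in the study.

```python
from collections import Counter
from itertools import combinations


def change_metrics(commits):
    """Count change-based metrics per file pair from parsed commits.

    Each commit is a dict with hypothetical keys:
      "files":  set of file pairs touched by the commit,
      "ticket": ticket id, or None if no ticket is referenced,
      "is_bug": True if the referenced ticket is a bug ticket.
    """
    change_freq = Counter()   # Change Frequency
    ticket_freq = Counter()   # Ticket Frequency
    bug_freq = Counter()      # Bug Change Frequency
    pair_freq = Counter()     # Pair Change Frequency

    for commit in commits:
        files = sorted(commit["files"])
        for u in files:
            change_freq[u] += 1
            if commit["ticket"] is not None:
                ticket_freq[u] += 1
                if commit["is_bug"]:
                    bug_freq[u] += 1
        # Every unordered pair (u, v) touched by the same commit.
        for u, v in combinations(files, 2):
            pair_freq[(u, v)] += 1

    return change_freq, ticket_freq, bug_freq, pair_freq


# Made-up commit log for illustration only.
commits = [
    {"files": {"parser", "lexer"}, "ticket": 101, "is_bug": True},
    {"files": {"parser"}, "ticket": None, "is_bug": False},
    {"files": {"parser", "lexer"}, "ticket": 102, "is_bug": False},
]

change_freq, ticket_freq, bug_freq, pair_freq = change_metrics(commits)
print(change_freq["parser"], ticket_freq["parser"], bug_freq["parser"])
print(pair_freq[("lexer", "parser")])
```

Fan-in and fan-out are not covered here because they come from static references between file pairs rather than from the commit log.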

Page 6: A Replication Case Study with CLIO

Scatter Plot Analysis

[Scatter plot: R7.5 Change Frequency vs. R7 Fan-out. x-axis: R7 Fan-out (0 to 160); y-axis: R7.5 Change Frequency (0 to 80).]

Page 7: A Replication Case Study with CLIO

Correlate Quality Metrics

• Non-parametric statistical test

• Many values fall at zero

– Ordinary Least Squares performs poorly

• Kendall’s tau-b

$$\tau_B(F, G) = \frac{\mathrm{concord}(F, G) - \mathrm{discord}(F, G)}{\mathrm{concord}(F, G) + \mathrm{discord}(F, G)}$$
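
As a rough illustration of the formula as written on this slide, the sketch below counts concordant and discordant pairs directly and takes their normalized difference. The function names and sample values are made up. Tied pairs are counted as neither concordant nor discordant, which is an assumption; library implementations of tau-b usually also apply a tie correction in the denominator, so their results may differ from this sketch when many values are tied.

```python
from itertools import combinations


def concord_discord(f, g):
    """Count concordant and discordant pairs between two metric vectors."""
    if len(f) != len(g):
        raise ValueError("f and g must have the same length")
    concord = discord = 0
    for i, j in combinations(range(len(f)), 2):
        sign = (f[i] - f[j]) * (g[i] - g[j])
        if sign > 0:
            concord += 1      # both metrics order the two files the same way
        elif sign < 0:
            discord += 1      # the metrics order the two files oppositely
        # sign == 0: tied in f or g, counted as neither (assumption)
    return concord, discord


def tau_from_slide(f, g):
    """(concord - discord) / (concord + discord), as on the slide."""
    concord, discord = concord_discord(f, g)
    if concord + discord == 0:
        raise ValueError("no concordant or discordant pairs")
    return (concord - discord) / (concord + discord)


# Made-up values with several zeros, echoing the slide's remark
# that many values fall at zero.
fan_out = [0, 0, 3, 5, 8]
bug_changes = [0, 1, 1, 4, 2]

print(concord_discord(fan_out, bug_changes))  # (7, 1)
print(tau_from_slide(fan_out, bug_changes))   # 0.75
```

Because the statistic depends only on the relative ordering of pairs, it does not assume a linear relationship or normally distributed values.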

Page 8: A Replication Case Study with CLIO

Correlate Quality Metrics

Tau-b table of metrics for SVS7 + SVS7.5

r7+r7.5 | Fan-in | Fan-out | File size | Changes | Tickets | Bugs
Fan-in | 1 | 0.257 | 0.301 | 0.331 | 0.328 | 0.464
Fan-out | 0.257 | 1 | 0.441 | 0.417 | 0.416 | 0.637
File size | 0.301 | 0.441 | 1 | 0.293 | 0.273 | 0.510
Changes | 0.331 | 0.417 | 0.293 | 1 | 0.972 | 0.858
Tickets | 0.328 | 0.416 | 0.273 | 0.972 | 1 | 0.857
Bugs | 0.463 | 0.637 | 0.510 | 0.858 | 0.857 | 1

Page 9: A Replication Case Study with CLIO

Presenting to Developers

• Findings are not surprising

– Most violations are connection points between modules

• Correlation between fan-out and bug change frequency

Page 10: A Replication Case Study with CLIO

Study Comparisons

Similarities Differences

A select few files contributed to the majority of modularity violations

Correlation between fan-out and bug change frequency

Usefulness of identified modularity violations

Page 11: A Replication Case Study with CLIO

Conclusions

• CLIO needs further refinement

– More repeated case studies

• Importance of domain knowledge

Page 12: A Replication Case Study with CLIO

Questions