Production
Features Code Changes
FA CA1
FB CB1
• If change CA1 implements FA and change CB1 implements FB
Missing Dependencies
CA2 CB1
• If a change CA2 is added to modify FA and CA2 depends on CB1, then integrating FA alone misses the dependency on CB1
[Diagram: commits CA1, CA2 and CB1 in the "Integrate FA" and "Integrate FB" steps]
Commit Assignment Algorithm
Automated Grouping (during Integration)
Developer Guided Grouping (during Development)
Calibrate the Metrics on Prior Versions
Define Dissimilarity Metrics
Given two commits characterized by files, developers and change requests (CRs):
Metric: Description
File Dependency Distance (FD): Captures source code dependencies between files involved in two commits
File Association Distance (FA): Captures logical dependencies between files involved in two commits
Developer Dissimilarity Distance (DD): Captures the working relation between two developers submitting commits
CR Dependency Distance (CRD): Captures the dissimilarity between the CRs implemented by two commits
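For each of the four metrics:
• Min_Threshold = Avg(a)
• Max_Threshold = Avg(bmin)
• Silhouette = Avg{(bmin - a) / max(bmin, a)}
Here a is the average dissimilarity between a commit and the other commits in its own group, b1, b2, b3, ... are its average dissimilarities to each of the other groups, and bmin = min(b1, b2, b3, ...).
A higher silhouette value is better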
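The calibration formulas can be illustrated with a minimal Python sketch. It assumes a grouping of commits from a prior version is already known and that a dissimilarity function is available; the function name `calibrate`, the `groups` layout and the toy one-dimensional distance are hypothetical choices made for illustration, not the authors' implementation.

```python
from statistics import mean

def calibrate(groups, distance):
    """Return (min_threshold, max_threshold, silhouette) for one metric.

    groups:   dict mapping a group name to a list of commits (assumed layout)
    distance: callable giving the dissimilarity between two commits
    """
    a_vals, b_vals, s_vals = [], [], []
    for name, members in groups.items():
        for i, commit in enumerate(members):
            # a: average dissimilarity to the other commits of the same group
            own = [distance(commit, o) for j, o in enumerate(members) if j != i]
            # b1, b2, ...: average dissimilarity to each other group
            others = [
                mean([distance(commit, o) for o in ms])
                for n, ms in groups.items()
                if n != name and ms
            ]
            if not own or not others:
                continue  # a or bmin is undefined for this commit
            a, b_min = mean(own), min(others)
            denom = max(a, b_min)
            a_vals.append(a)
            b_vals.append(b_min)
            s_vals.append((b_min - a) / denom if denom else 0.0)
    return mean(a_vals), mean(b_vals), mean(s_vals)

# Toy data: commits reduced to a single number, distance = absolute difference
toy_groups = {"g1": [0.0, 1.0], "g2": [10.0, 11.0]}
min_t, max_t, silhouette = calibrate(toy_groups, lambda x, y: abs(x - y))
```

On the toy data the two groups are far apart, so the silhouette is close to 1, which matches the reading that a higher silhouette value is better.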
• Apply the dissimilarity metrics in order of their precedence (in the illustration, color takes precedence over shape: Color > Shape)
• If no suitable group is found for a commit, assign the commit to a new group
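The two rules above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how a "suitable" group is decided, so the sketch assumes a group is suitable when the commit's average distance to it is at or below a per-metric threshold. The commit layout, the two stand-in metrics (`cr_distance`, `file_distance`) and the threshold values are hypothetical and are not the authors' definitions of CRD or FD.

```python
def cr_distance(c1, c2):
    # Stand-in metric (assumption): 0 if both commits implement the same CR
    return 0.0 if c1["cr"] == c2["cr"] else 1.0

def file_distance(c1, c2):
    # Stand-in metric (assumption): Jaccard distance between touched files
    union = c1["files"] | c2["files"]
    if not union:
        return 0.0
    return 1.0 - len(c1["files"] & c2["files"]) / len(union)

def assign_commit(commit, groups, metrics, thresholds):
    """Assign a commit to a group, trying the metrics in precedence order."""
    for name, distance in metrics:
        best_group, best_dist = None, None
        for gid, members in groups.items():
            d = sum(distance(commit, m) for m in members) / len(members)
            if best_dist is None or d < best_dist:
                best_group, best_dist = gid, d
        if best_group is not None and best_dist <= thresholds[name]:
            groups[best_group].append(commit)
            return best_group
    # No suitable group under any metric: open a new group
    new_id = f"group-{len(groups) + 1}"
    groups[new_id] = [commit]
    return new_id

metrics = [("CRD", cr_distance), ("FD", file_distance)]  # precedence order
thresholds = {"CRD": 0.5, "FD": 0.5}                     # hypothetical values
groups = {}
ca1 = {"id": "CA1", "cr": "FA", "files": {"a.c"}}
cb1 = {"id": "CB1", "cr": "FB", "files": {"b.c"}}
ca2 = {"id": "CA2", "cr": "FA", "files": {"a.c", "b.c"}}
assigned = [assign_commit(c, groups, metrics, thresholds) for c in (ca1, cb1, ca2)]
```

In this toy run CA1 and CB1 each open a group, and CA2 joins CA1's group because the highest-precedence metric already finds a suitable group, so the lower-precedence metric is never consulted.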
Developer-guided grouping groups commits incrementally and uses developers’ feedback to improve the grouping during development.
Both approaches follow the k-means clustering method, which assigns each item to the cluster with the nearest mean.
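The nearest-mean step that k-means relies on can be shown in a few lines of Python. The sketch assumes items are numeric feature vectors, which is a simplification: the slides describe commits through four dissimilarity metrics, not coordinates. The helper names `nearest_mean` and `updated_mean` are hypothetical.

```python
import math

def nearest_mean(point, means):
    """Index of the cluster whose mean is closest to the point."""
    return min(range(len(means)), key=lambda k: math.dist(point, means[k]))

def updated_mean(point, old_mean, size):
    """Mean of a cluster of `size` items after adding one more point."""
    return tuple(m + (p - m) / (size + 1) for p, m in zip(point, old_mean))

means = [(0.0, 0.0), (10.0, 10.0)]
sizes = [1, 1]
item = (1.0, 2.0)

k = nearest_mean(item, means)                     # choose the nearest cluster
means[k] = updated_mean(item, means[k], sizes[k])  # incremental mean update
sizes[k] += 1
```

Updating the mean incrementally, one item at a time, fits the incremental style of grouping described above, since the clusters do not have to be recomputed from scratch for each new commit.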
We analyzed three major versions of a family of mobile applications
• Validate the dissimilarity metrics
– Can the proposed metrics be used to identify commit dependencies?
• Validate the grouping approaches
– How efficient are our proposed grouping approaches?
• Value for Developers
– Can the proposed approaches identify commit dependencies missed by developers?
The four dissimilarity metrics show a good ability to group commits (i.e., high silhouette values)
[Figure: bar chart of silhouette value (y-axis, 0 to 1) for CRD, FA, DD and FD across Version 1, Version 2 and Version 3. Data labels as extracted: 0.94, 0.96, 0.96; 0.76, 0.79; 0.67, 0.63; 0.67; 0.57; 0.46, 0.47, 0.49]
CRD > FA > DD > FD
• Efficiency of the Grouping Approaches
– 82% of commit dependencies were recovered by the automated grouping with a precision of 95%
– The accuracy of the developer-guided grouping approach is 98%
– We observed that precision/recall improves with longer history data
• Value for Developers
– Automated grouping and Developer-guided grouping approaches were able to reduce integration failures by 76% and 94% respectively
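For readers unfamiliar with the precision and recall figures above, the sketch below shows one common way to compute them for a grouping. It assumes that two commits placed in the same group count as a predicted dependency and that a set of true dependency pairs is known; the slides do not describe the evaluation procedure at this level, and the commit names and data here are made up for illustration.

```python
from itertools import combinations

def dependency_pairs(groups):
    """All unordered pairs of commits that share a group."""
    return {frozenset(p) for members in groups for p in combinations(members, 2)}

def precision_recall(predicted_groups, true_pairs):
    predicted = dependency_pairs(predicted_groups)
    hits = predicted & true_pairs
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(true_pairs) if true_pairs else 0.0
    return precision, recall

# Hypothetical grouping and ground truth
predicted_groups = [["CA1", "CA2", "CB1"], ["CX1"]]
true_pairs = {
    frozenset({"CA1", "CA2"}),
    frozenset({"CA2", "CB1"}),
    frozenset({"CB1", "CX1"}),
}
precision, recall = precision_recall(predicted_groups, true_pairs)
```

Here two of the three predicted pairs are real dependencies and two of the three real dependencies are recovered, so both values are 2/3.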