Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Page 1: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Tejinder Dhaliwal, Foutse Khomh, Ying Zou, Ahmed E. Hassan

Page 2: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Software Product Lines

[Diagram labels: Shared Components, Production, Multiple Products]

Page 3: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Main Line Branching Model for Software Product Lines

[Diagram: a Main Branch with Product-1, Product-2, ..., Product-n branches]

• Developers add new features
• Integrators integrate selected features

Page 4: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Feature to Code Change Mapping

• Developer adds Code Changes
• Integrator integrates Features

Features | Code Changes
FA | CA1
FB | CB1

Mapping facilitates selective integration

Page 5: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Cost of Integration Failure

• If change CA1 implements FA and change CB1 implements FB

Feature | Code Changes
FA | CA1
FB | CB1

• If a change CA2 is added to modify FA and CA2 is dependent on CB1

Missing Dependencies
CA2 | CB1

[Diagram labels: main-branch commits CA1, CA2, CB1; Integrate FA; Integrate FB; Product-1 Branch; Product-2 Branch; "530% more time"]

Page 6: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Our Solution

[Diagram: CA1, CB1, CA2 shown as one group]

Group dependent commits and propose to integrate a group as a whole
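
The grouping idea can be illustrated with a small sketch. It assumes the dependencies are already known as explicit pairs of commits and merges them with a union-find structure. The function name `group_commits`, the explicit pair list, and the CA1/CA2 link (both change FA) are assumptions made for illustration; the slides recover dependencies from dissimilarity metrics rather than from a given list.

```python
def group_commits(commits, dependencies):
    """Merge commits into groups so that dependent commits share a group."""
    parent = {c: c for c in commits}

    def find(c):
        # Follow parent links to the group representative, compressing the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in dependencies:
        parent[find(a)] = find(b)

    groups = {}
    for c in commits:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())


commits = ["CA1", "CA2", "CB1"]
# Assumed dependency pairs: CA1 and CA2 both change FA; CA2 depends on CB1.
dependencies = [("CA1", "CA2"), ("CA2", "CB1")]
groups = group_commits(commits, dependencies)
print(groups)  # [['CA1', 'CA2', 'CB1']]
```

With the group in hand, integrating FA would pull in CA1, CA2, and CB1 together instead of CA1 and CA2 alone.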

Page 7: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Our Approach to Group Dependent Commits

• Define Dissimilarity Metrics
• Calibrate the Metrics on Prior Versions
• Commit Assignment Algorithm
• Developer Guided Grouping (during Development)
• Automated Grouping (during Integration)

Page 8: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Dissimilarity Metrics

Given two commits characterized by files, developers and change requests (CRs)

Metric | Description
File Dependency Distance (FD) | Captures source code dependencies between files involved in two commits
File Association Distance (FA) | Captures logical dependencies between files involved in two commits
Developer Dissimilarity Distance (DD) | Captures the working relation between two developers submitting commits
CR Dependency Distance (CRD) | Captures the dissimilarity between the CRs implemented by two commits

Page 10: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Calibrate Metrics on Prior Versions

For each of the four metrics:
• Min_Threshold = Avg(a)
• Max_Threshold = Avg(bmin)
• Silhouette = Avg{(bmin - a) / max(bmin, a)}

A higher silhouette value is better

[Diagram labels: a, b1, b2, b3; bmin is the smallest of the b values]
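
The calibration formulas can be sketched as follows. The slide does not define a and bmin, so this sketch assumes the standard silhouette definitions: a is a commit's average distance to the other commits in its own group, and bmin is the smallest average distance from that commit to any other group. The function name `calibrate`, the toy integer "commits", and the absolute-difference distance are hypothetical.

```python
from statistics import mean


def calibrate(groups, dist):
    """Return thresholds and silhouette for one metric, given known groups."""
    a_vals, b_vals, s_vals = [], [], []
    for gi, group in enumerate(groups):
        for ci, commit in enumerate(group):
            # a: average distance to the other members of the same group.
            same = [dist(commit, o) for oi, o in enumerate(group) if oi != ci]
            # b: average distance to each other group; bmin is the smallest.
            other = [mean(dist(commit, o) for o in g)
                     for gj, g in enumerate(groups) if gj != gi and g]
            if not same or not other:
                continue  # undefined for singleton groups or a single group
            a, b_min = mean(same), min(other)
            denom = max(a, b_min)
            a_vals.append(a)
            b_vals.append(b_min)
            s_vals.append((b_min - a) / denom if denom else 0.0)
    return {
        "min_threshold": mean(a_vals),
        "max_threshold": mean(b_vals),
        "silhouette": mean(s_vals),
    }


# Toy data: two well-separated groups on a number line (assumption).
result = calibrate([[0, 1], [10, 11]], lambda x, y: abs(x - y))
print(result)
```

Here the two groups are far apart relative to their internal spread, so the silhouette comes out close to 1.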

Page 12: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Commit Assignment Algorithm

[Illustration: precedence example, Color > Shape]

• Apply the dissimilarity metrics in order of their precedence

• If no suitable group is found for a commit, assign the commit to a new group
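
The two rules above can be sketched as a loop over metrics in precedence order. The slides do not say how "suitable" is decided, so this sketch assumes a per-metric threshold on the average distance between the commit and a group's members. The names `assign_commit`, `cr_distance`, and `file_distance`, the commit dictionaries, and the threshold values are all hypothetical stand-ins, not the authors' metrics.

```python
from statistics import mean


def cr_distance(c1, c2):
    # Hypothetical stand-in: 0 if both commits implement the same CR, else 1.
    return 0.0 if c1["cr"] == c2["cr"] else 1.0


def file_distance(c1, c2):
    # Hypothetical stand-in: Jaccard distance between the sets of touched files.
    union = c1["files"] | c2["files"]
    if not union:
        return 0.0
    return 1.0 - len(c1["files"] & c2["files"]) / len(union)


def assign_commit(commit, groups, metrics):
    """metrics: list of (distance_fn, threshold), highest precedence first."""
    for distance, threshold in metrics:
        best_group, best_d = None, None
        for group in groups:
            d = mean(distance(commit, member) for member in group)
            if best_d is None or d < best_d:
                best_group, best_d = group, d
        if best_group is not None and best_d <= threshold:
            best_group.append(commit)
            return best_group
    # No metric found a suitable group: start a new one.
    new_group = [commit]
    groups.append(new_group)
    return new_group


metrics = [(cr_distance, 0.0), (file_distance, 0.8)]  # assumed thresholds
groups = []
for c in [
    {"cr": "CR-1", "files": {"a.c"}},
    {"cr": "CR-1", "files": {"b.c"}},         # same CR as the first commit
    {"cr": "CR-2", "files": {"b.c", "c.c"}},  # shares b.c with the group
    {"cr": "CR-3", "files": {"z.c"}},         # unrelated
]:
    assign_commit(c, groups, metrics)
print([len(g) for g in groups])  # [3, 1]
```

The third commit fails the higher-precedence CR check but is picked up by the file-based check, which is the point of applying the metrics in order.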

Page 14: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Commit Grouping Approaches

• Developer-guided Grouping: groups commits incrementally and uses developers' feedback to improve the grouping during development
• Automated Grouping (during integration)

Both approaches follow the k-means clustering method, which assigns each item to the cluster with the nearest mean.

Page 15: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Evaluation

We analyzed three major versions of a family of mobile applications

Page 16: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Evaluation Criteria

• Validate the dissimilarity metrics: Can the proposed metrics be used to identify commit dependencies?
• Validate the grouping approaches: How efficient are our proposed grouping approaches?
• Value for Developers: Can the proposed approaches identify commit dependencies missed by developers?

Page 17: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Results

The four dissimilarity metrics display good abilities in grouping commits (i.e., high silhouette values)

[Bar chart: silhouette value (y-axis, 0 to 1) for the CRD, FA, DD, and FD metrics across Version 1, Version 2, and Version 3. Data labels: 0.94, 0.96, 0.96; 0.76, 0.79, 0.67; 0.63, 0.67, 0.57; 0.46, 0.47, 0.49]

CRD > FA > DD > FD

Page 18: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Results

• Efficiency of the Grouping Approaches
  – 82% of commit dependencies were recovered by the automated grouping, with a precision of 95%
  – The accuracy of the developer-guided grouping approach is 98%
  – We observed that precision/recall improves with longer history data
• Value for Developers
  – The automated grouping and developer-guided grouping approaches were able to reduce integration failures by 76% and 94%, respectively
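
As an illustration of how figures like these can be computed, the sketch below scores a predicted grouping against a reference grouping. The slides do not state how precision and recall were measured, so the pairwise definition used here (a dependency is a pair of commits that share a group) is an assumption, as are the names `pairs` and `precision_recall` and the toy commit labels.

```python
from itertools import combinations


def pairs(groups):
    """All unordered commit pairs that share a group."""
    return {frozenset(p) for g in groups for p in combinations(sorted(g), 2)}


def precision_recall(predicted_groups, reference_groups):
    predicted, reference = pairs(predicted_groups), pairs(reference_groups)
    hits = len(predicted & reference)
    # Precision: share of predicted dependencies that are real.
    precision = hits / len(predicted) if predicted else 1.0
    # Recall: share of real dependencies that were recovered.
    recall = hits / len(reference) if reference else 1.0
    return precision, recall


# Toy data (assumption): the prediction misses that CB1 belongs with CA1, CA2.
reference = [["CA1", "CA2", "CB1"], ["CC1", "CC2"]]
predicted = [["CA1", "CA2"], ["CB1"], ["CC1", "CC2"]]
precision, recall = precision_recall(predicted, reference)
print(precision, recall)  # 1.0 0.5
```

In this toy case every predicted pair is correct, but only two of the four reference pairs are recovered.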

Page 19: Recovering Commit Dependencies for Selective Code Integration in Software Product Lines

Summary