Investigating Code Review Practices in Defective Files
Patanamon (Pick) Thongtanunam
Shane McIntosh, Ahmed E. Hassan, Hajimu Iida
May 16-17, 2015. Firenze, Italy
[email protected] @pamon
Modern Code Review: A lightweight, tool-supported code review process
[Diagram: Code change -> Code Review Tool (Examine Code) -> Code change -> Upstream VCS repositories]
A lack of code review activity can increase the risk of post-release defects [McIntosh et al., MSR 2014]
My code is awesome! No need for a review
How should reviewers perform a code review to reduce the risk of having defects?
What is the difference between the code review practices of defective and clean files?
Defective (i.e., files that have defects): Review Practice A
VS
Defect-free (Clean) (i.e., files that do not have defects): Review Practice B
We measure 3 dimensions of review activity metrics
• Review Intensity, e.g., #Review Iterations, Discussion Length
• Review Participation, e.g., #Reviewers, Review Agreement
• Reviewing Time, e.g., Review Length, Code Reading Speed
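The example metrics named on this slide could be computed from a single review record roughly as follows. This is only a sketch: the `Review` schema, the field names, and the exact metric definitions (discussion length as a message count, code reading speed as changed lines per hour) are assumptions for illustration, not the definitions used in the study. Review Agreement is left out because the slides do not say how it is measured.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Review:
    """One code review record (hypothetical schema, not the study's data format)."""
    patch_size: int        # changed lines in the patch (assumed unit)
    iterations: int        # number of patch revisions
    messages: list         # discussion comments, one string per message
    reviewers: set         # names of participating reviewers
    created: datetime      # when the review was opened
    closed: datetime       # when the review was completed


def review_metrics(r: Review) -> dict:
    """Compute one example metric per dimension, under the assumed definitions."""
    hours = (r.closed - r.created).total_seconds() / 3600
    return {
        # Review intensity
        "review_iterations": r.iterations,
        "discussion_length": len(r.messages),
        # Review participation
        "num_reviewers": len(r.reviewers),
        # Reviewing time
        "review_length_hours": hours,
        "code_reading_speed": r.patch_size / hours if hours > 0 else float("inf"),
    }
```

For instance, a 120-line patch reviewed over two hours yields a code reading speed of 60 lines per hour under this definition.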
We investigate defective files (i.e., files that have defects) along 2 perspectives
Past: Risky Files
Files that have historically been defective
Future: Future-Defective Files
Files that will eventually have defects
Conjecture: Reviews of future-defective files will be
• less intense, • with less team participation, • completed in a shorter time
than reviews of clean files
Future-Defective Files: Files that have post-release defects
[Diagram: timeline of the VCS repositories around a release date, with the studied reviews and a 6-month window marked. Files with a bug-fixing commit are labeled Future-Defective; files with no bug-fixing commits are labeled Clean.]
Release    Future-Defective                 Clean
5.0.0      1,176 files, 3,470 reviews       10,513 files, 2,727 reviews
5.1.0      866 files, 2,849 reviews         11,931 files, 2,690 reviews
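The labeling rule on this slide can be sketched as below. The function name, the input shapes, and the 182-day approximation of "6 months" are assumptions made for illustration; the slides do not specify how bug-fixing commits are identified or where exactly the window boundaries fall.

```python
from datetime import date, timedelta

# Roughly 6 months; the exact window boundaries are an assumption.
WINDOW = timedelta(days=182)


def label_files(files, bug_fix_commits, release_date):
    """Label each file as 'future-defective' or 'clean'.

    files:            iterable of file paths present in the release
    bug_fix_commits:  list of (commit_date, [touched paths]) pairs (hypothetical shape)
    release_date:     date of the studied release
    """
    defective = set()
    for commit_date, paths in bug_fix_commits:
        # Only bug fixes that land inside the post-release window count.
        if release_date < commit_date <= release_date + WINDOW:
            defective.update(paths)
    return {f: ("future-defective" if f in defective else "clean") for f in files}
```

A file touched only by bug fixes before the release, or after the window closes, stays in the clean group under this sketch.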
Review Activity Analysis: Compare the code review activity that has been applied to future-defective and clean files
[Diagram: distribution of #Reviewers in reviews of clean files VS distribution of #Reviewers in reviews of future-defective files]
We use a statistical test to determine whether the distributions of code review activity differ
Raw code review activity metrics are normalized by patch size
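A minimal sketch of this comparison follows. The slides do not name the statistical test, so the Mann-Whitney U test used here is an assumption, chosen because it compares two distributions without assuming normality. The implementation uses a normal approximation without tie correction of the variance, which is adequate for illustration but not for small samples.

```python
import math


def normalize(values, patch_sizes):
    """Divide each raw metric value by the size of its patch (skips empty patches)."""
    return [v / s for v, s in zip(values, patch_sizes) if s > 0]


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value) using a normal approximation."""
    # Tag each value with its group (0 = x, 1 = y) and sort by value.
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = len(combined)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p
```

In use, one would normalize a metric such as #Reviewers for both groups of reviews and pass the two normalized lists to `mann_whitney_u`; a small p-value suggests the two distributions differ.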
Findings: Code review activity in the reviews of future-defective files
Dimension               Conjecture                       Results
Review Intensity        Less intense                     Less intense
Review Participation    Less team participation          Less team participation
Reviewing Time          Completed in a shorter time      Faster code reading rate
We investigate defective files (i.e., files that have defects) along 2 perspectives
Past: Risky Files
Files that have historically been defective
Conjecture: Reviews of risky files should be
• more intense, • with more team participation, • reviewed for a longer time
to reduce the risk of having defects in the future
Future: Future-Defective Files
Files that will eventually have defects
Reviews of future-defective files tend to be less rigorous than reviews of clean files
Risky Files: Files that had post-release defects in the prior release
[Diagram: timeline of the VCS repositories from the prior release date to the release date, with the studied reviews and a 6-month window marked. Files with a bug-fixing commit are labeled Risky; files with no bug-fixing commits are labeled Normal.]
Release    Risky                            Normal
5.1.0      1,168 files, 2,671 reviews       11,629 files, 2,868 reviews
Findings: Code review activity in the reviews of risky files
Dimension               Conjecture                       Results
Review Intensity        More intense                     Less intense
Review Participation    More team participation          Less team participation
Reviewing Time          Completed in a longer time       Receive slow feedback & faster code reading rate
We investigate defective files (i.e., files that have defects) along 2 perspectives
Past: Risky Files
Files that have historically been defective
Developers are not as careful when they review risky files.
Future: Future-Defective Files
Files that will eventually have defects
Reviews of future-defective files tend to be less rigorous than reviews of clean files
Will careless reviews of risky files lead to future defects?
Investigating code review practice in risky & future-defective files
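Risky & Future-Defective Files: Risky files that will eventually have defects
[Diagram: timeline of the VCS repositories from the prior release date to the release date, with the studied reviews and a 6-month window marked. Files with a bug-fixing commit for both the prior release and the studied release are labeled Risky & Future-Defective; files with a bug-fixing commit for the prior release but no bug-fixing commits for the studied release are labeled Risky & Clean.]
Release    Risky & Future-Defective         Risky & Clean
5.1.0      206 files, 1,299 reviews         962 files, 1,372 reviews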
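Combining the two perspectives amounts to a cross-classification of each file by its past and future defect status. The sketch below assumes the two sets of defective files (for the prior release and for the studied release) have already been derived; the function names and input shapes are hypothetical.

```python
from collections import Counter


def cross_classify(files, defective_prior, defective_current):
    """Label each file with a (past, future) pair.

    defective_prior:    set of files with post-release defects in the prior release
    defective_current:  set of files with post-release defects in the studied release
    """
    labels = {}
    for path in files:
        past = "risky" if path in defective_prior else "normal"
        future = "future-defective" if path in defective_current else "clean"
        labels[path] = (past, future)
    return labels


def group_sizes(labels):
    """Count how many files fall into each (past, future) group."""
    return Counter(labels.values())
```

The groups compared on this slide correspond to the `("risky", "future-defective")` and `("risky", "clean")` pairs.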
Findings: Code review activity in the reviews of risky & future-defective files
Dimension               Conjecture                       Results
Review Intensity        Less intense                     Less intense
Review Participation    Less team participation          Less team participation
Reviewing Time          Completed in a shorter time      Receive slow feedback & faster code reading rate
We investigate defective files (i.e., files that have defects) along 2 perspectives
Past: Risky Files
Files that have historically been defective
Developers are not as careful when they review risky files.
Future: Future-Defective Files
Files that will eventually have defects
Reviews of future-defective files tend to be less rigorous than reviews of clean files
Reviews of files that are both risky & future-defective are less rigorous than reviews of files that are risky but clean
We compare the concerns that are addressed during reviews of defective and clean files
• Evolvability, e.g., fixing code comments, decomposing a complex function
• Functionality, e.g., fixing incorrect program logic
• Traceability, e.g., updating a commit message
Results: Reviews of defective files often address evolvability concerns
Proportion of reviews of future-defective files in Qt 5.0.0
Concern          Proportion    Compared with clean files
Evolvability     82%           10% higher
Functionality    40%           5% higher
Traceability     40%           10% lower
We observe similar results for the reviews of risky files and of risky & future-defective files
Investigating Code Review Practices in Defective Files
[email protected] @pamon
• Modern code review is a lightweight, tool-supported code review process; a lack of code review activity can increase the risk of post-release defects [McIntosh et al., MSR 2014]
• We compare the code review practices of defective and clean files along 3 dimensions of review activity metrics: review intensity, review participation, and reviewing time
• Reviews of future-defective files tend to be less rigorous than reviews of clean files
• Developers are not as careful when they review risky files
• Reviews of files that are both risky & future-defective are less rigorous than reviews of files that are risky but clean