ESEC/FSE 2003
Leveraging Field Data for ImpactAnalysis and Regression Testing
Alessandro Orsojoint work with
Taweesup ApiwattanapongMary Jean Harrold
{orso|term|harrold}@cc.gatech.eduThis work was supported in part by National Science Foundation awards CCR-9988294, CCR-0096321,
CCR-0205422, SBE-0123532, and EIA-0196145 to Georgia Tech, and by the State of Georgia toGeorgia Tech under the Yamacraw Mission
ESEC/FSE 2003
Motivation
Fundamental shift in SW development• Software virtually everywhere• Most computers interconnected• Large amount of user resources
Opportunity to use field data and resources in SE• Testing and analysis limited by the use of in-house
inputs and configurations• Limits can be overcome by augment these tasks with
field data
ESEC/FSE 2003
FieldExecution
Data
Overall Picture
ProgramP
Regression Testing
Impact Analysis
UserUser
User
UserUserUser
UserUserUser
UserUser
User
UserUserUser
User
ESEC/FSE 2003
Gathering Field Data
m1
Program P
XXA2XXXB2
XXXXB1XXA1
m6m5m4m3m2m1
m2
m4m3
m5 m6
m1 m2
m4m3
m5 m6
m1 m2
m4m3
m5 m6
exec
utio
n da
ta
m1 m2
m4m3
m5 m6
m1 m2
m4m3
m5 m6
User A User B
C1 X X
ESEC/FSE 2003
Impact Analysis
Assesses the effects of changes on asoftware system
Predictive: help decide which changes toperform and how to implement changes
Our approach• Program-sensitive impact analysis• User-sensitive impact analysis
ESEC/FSE 2003
Program Sensitive Impact Analysis
Step 1• Identify user executions through
methods in C• Identify methods covered by
such executions
1. Field execution data
2. Change
Input:
C={m2, m5}Step 2• Static forward slice from C
covered methods = {m1,m2,m3,m5,m6}
forward slice = {m2,m4,m5,m6}Step 3• Intersect covered methods and
forward slice
XXA2XXXB2
XXXXB1XXA1
m6m5m4m3m2m1
C1
Impact set = Output:
{m2,m5,m6}
X X
ESEC/FSE 2003
1. Collective impact =
User-sensitive Impact Analysis
Collective impact• Percentage of executions through
at least one changed method
XXA2XXXB2
XXXXB1XXA1
C1
Input:
Affected users• Percentage of users that executed
at least once one changed method
3/5 = 60%
3/3 = 100%
2. Affected users =
2. Change
Output:
C={m5, m6}
60%
100%
1. Field execution data
X X
m6m5m4m3m2m1
ESEC/FSE 2003
Regression Testing
Performed after P is changed to P’ toprovide confidence that• Changed parts behave as intended• Unchanged parts are not adversely
affected by modificationsThree important issues
• Tests in T to rerun on P’ (selection)• New tests for P’ (augmentation)• Order of execution of tests (prioritization)
ESEC/FSE 2003
Regression Testing Using Field Data
1. Tests T’ to be rerun on P’ =
XXA2XXXB2
XXXXB1XXA1
1. Field execution dataInput: Output:
C={m2, m4} impact set = {m1,m2,m3,m5,m6}
Xt3 X
X
Xt2
Xt1
3. In-house tests for P
2. Critical methods =
• Compute the impact set for m
• For each t in T’ mark methods inimpact set exercised by t
• Remove marked methods fromimpact set
XX X X
CM[m4] = {m1}CM[m2] = {m3,m5}
m6m5m4m3m2m1
m6m5m4m3m2m1
{t2, t3}
2. Change
For each changed method m in C• Add all tests through m to T’
ESEC/FSE 2003
Regression Testing Using Field Data
1. Tests T’ to be rerun on P’ =
For each changed method m in C• Add all tests through m to T’
XXA2XXXB2
XXXXB1XXA1
1. Field execution dataInput: Output:
C={m2, m4} impact set = {m1,m2,m4,m6}
Xt3 X
X
Xt2
Xt1
3. In-house tests for P
2. Critical methods =
• Compute the impact set for m
• For each t in T’ mark methods inimpact set exercised by t
• Remove marked methods fromimpact set
XX X X
CM[m4] = {m1}CM[m2] = {m3,m5}
m6m5m4m3m2m1
m6m5m4m3m2m1
{t2, t3}
C={m2, m4}2. Change
ESEC/FSE 2003
Empirical Studies
Subject• JABA: Java Architecture for Bytecode Analysis• 60 KLOC, 550 classes, 2,800 Methods
Data• Field data: 1,100 executions (14 users, 12 weeks)• In-house data: 195 test cases, 63% method
coverage• Changes: 20 real changes extracted from JABA’s
CVS repository
ESEC/FSE 2003
Study 1: Impact Analysis
Research questionDoes field data yield different results thanin-house data in terms of impact sets?
Experimental setup• Computed impact sets for the 20 changes
• Using field data• Using in-house data
• Compared impact sets for the two datasets
ESEC/FSE 2003
Study 1: Impact Analysis
0
200
400
600
800
1000
C1
C2
C3
C4
C5
C6
C7
C8
C9
C10
C11
C12
C13
C14
C15
C16
C17
C18
C19
C20
Field InHouse Field - InHouse InHouse - Field
InHouse
100 96636
Field
Num
ber o
f met
hods
Changes
ESEC/FSE 2003
Study 2: Regression Testing
Research questionDoes the use of field data actually result inadditional testing requirements?
Experimental setupComputation of the set of critical methodsfor the 20 real changes
ESEC/FSE 2003
Study 2: Regression Testing#C
ritic
al M
etho
ds
Changes
0
50
100
150
200
250
300
350
400
C1C2C3C4C5C6C7C8C9
C10C11C12C13C14C15C16C17C18C19C20
ESEC/FSE 2003
Related Work
• Perpetual/Residual testing(Clarke, Osterweil, Richardson, and Young)
• Expectation-Driven Event Monitoring (EDEM)(Hilbert, Redmiles, and Taylor)
• Echelon(Srivastava and Thiagarajan)
• Impact analysis based on whole-path profiling(Law and Rothermel)
ESEC/FSE 2003
Final RemarksConclusion
• Two new techniques for impact analysis andregression testing based on field data
• Empirical evaluation on a real subject with realusers
• Results showing that using field dataconsiderably affect these tasks
Open Issues and future work• Study on the stability of user behaviors• Collection of additional data• Clustering of field data• Capture and replay of users’ executions
ESEC/FSE 2003
For more information:
http://gamma.cc.gatech.edu
Questions?