Presentation slides for my ICSM talk.
Sahara: Guiding the Debugging of Failed Software Upgrades
Rekha Bachwani, Olivier Crameri, Ricardo Bianchini, Dejan Kostic, Willy Zwaenepoel
Modern software is complex and requires regular
updates (once every few weeks)
Fix bugs
Patch security vulnerabilities
Software upgrade failures are frequent [sosp’07]
5-10% of all upgrades fail
Upgrade failures can be catastrophic
Service disruption ($$)
User dissatisfaction
Motivation
Difference in vendor and user environment is a major
source of failures [sosp’07]
Broken dependencies
Incompatibilities with legacy systems
Testing in all user environments is impractical
Set of all possible environment settings is large
Set of possible user inputs is huge
Debugging software upgrade failures is hard
Incomplete user environment data
Unable to reproduce user conditions or failure
Motivation
Integrate users in the testing environment
Test upgrade in (many) user environments with their input
Collect data from (willing) users
Environment settings
Success or failure flags
Leverage data from the users to isolate the cause
Approach
Sahara: Upgrade Debugging System
Simplifies debugging of environment-related failures
Prioritizes the set of routines to consider when debugging
Uses machine learning, and static and dynamic analyses
Evaluate Sahara with 3 applications (5 failures)
Three real upgrade failures in OpenSSH
One synthetic failure each in SQLite and uServer
Contributions
Outline
Overview
Sahara: Debugging Failed Upgrades
Evaluation
Conclusion
Sahara - Key Idea
Upgrade failures are caused by user environments
Identify the suspect environment resources (SERs)
Identify the code affected by SERs
Software behaved correctly before the upgrade
Identify the code deviations in the upgrade
The root cause is most likely in the code that both
Is affected by the suspect aspects of the environment, and
Has deviated after the upgrade
Sahara - Identifying Suspects
[Diagram: the vendor releases the upgrade; many user sites test the upgrade in their own environments.]
Sahara - Identifying Suspects
[Diagram: user sites send pass/fail labels and environment data to the vendor; feature selection over this data yields the suspect environment resources (SERs), and static analysis maps the SERs to suspect routines.]
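As an illustration of the feature-selection step (a minimal Python sketch, not Sahara's actual algorithm; the profiles, feature names, and scoring rule are hypothetical), one could rank binary environment features by how strongly their presence separates failing from passing profiles:

```python
# Toy feature ranking: score each environment feature by how much its
# presence changes the observed failure rate across user profiles.

def rank_features(profiles, labels):
    """profiles: list of dicts mapping feature name -> 0/1; labels: 1 = fail."""
    def fail_rate(subset):
        return sum(subset) / len(subset) if subset else 0.0
    scores = {}
    for f in profiles[0]:
        with_f = [l for p, l in zip(profiles, labels) if p[f]]
        without_f = [l for p, l in zip(profiles, labels) if not p[f]]
        # A large gap in failure rate marks the feature as a likely SER.
        scores[f] = abs(fail_rate(with_f) - fail_rate(without_f))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

profiles = [
    {"PortForwarding": 1, "Compression": 0, "X11Forwarding": 0},
    {"PortForwarding": 1, "Compression": 1, "X11Forwarding": 0},
    {"PortForwarding": 0, "Compression": 1, "X11Forwarding": 1},
    {"PortForwarding": 0, "Compression": 0, "X11Forwarding": 1},
]
labels = [1, 1, 0, 0]  # 1 = upgrade failed at this site

print(rank_features(profiles, labels))
# [('PortForwarding', 1.0), ('X11Forwarding', 1.0), ('Compression', 0.0)]
```

The highest-scoring features become the suspect environment resources.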
Sahara – Identifying Deviations
[Diagram: the vendor adds dynamic-analysis instrumentation to the suspect routines; both the vendor and the user sites run the instrumented original and new versions.]
Sahara – Identifying Deviations
[Diagram: the dynamic analysis compares the runs of the original and new versions at the vendor and user sites, producing the set of deviated routines.]
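As a toy stand-in for this step (the real system uses the dynamic source analysis of [icsm'04] on instrumented runs; the trace format and routine names below are hypothetical), deviation detection can be sketched as comparing per-routine behavior summaries across the two versions:

```python
# Toy deviation detection: compare per-routine behavior summaries
# collected from instrumented runs of the original and new versions.
# A routine "deviates" if its summary differs between versions.

def deviated_routines(trace_old, trace_new):
    """Traces map routine name -> (call count, set of return values)."""
    routines = set(trace_old) | set(trace_new)
    return {r for r in routines if trace_old.get(r) != trace_new.get(r)}

trace_old = {"setup_listener": (1, {0}), "do_transfer": (3, {0}),
             "log_event": (5, {0})}
trace_new = {"setup_listener": (1, {0}), "do_transfer": (3, {-1, 0}),
             "log_event": (6, {0})}

print(sorted(deviated_routines(trace_old, trace_new)))
# ['do_transfer', 'log_event']
```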
Sahara – Identifying Deviations
[Diagram: at the vendor, Prime Suspects = Suspect Routines ∩ Deviated Routines.]
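In code, the combining step is simply a set intersection (toy routine names, purely illustrative):

```python
# Prime suspects = routines affected by the SERs that also deviated.
suspect_routines = {"parse_config", "setup_listener", "do_transfer"}
deviated_routines = {"setup_listener", "do_transfer", "log_event"}

prime_suspects = suspect_routines & deviated_routines
print(sorted(prime_suspects))  # ['do_transfer', 'setup_listener']
```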
Sahara - Summary
Identifies the environment resources that caused the failure
Feature selection using feedback from many users
Isolates routines affected by suspect environment
Def-use static analysis (sketched after this list)
Finds routines that have deviated after upgrade
Dynamic source analysis [icsm’04]
Combines results from static and dynamic analysis to
produce prime suspects
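A minimal sketch of the def-use step (hypothetical data structures and variable names; the paper's analysis operates on the application's source code): starting from variables set by the SERs, taint propagates along def-use chains, and any routine that reads a tainted variable becomes a suspect.

```python
# Toy def-use static analysis: seed taint with variables derived from
# the suspect environment resources (SERs), propagate it along
# def-use chains, and flag every routine that reads a tainted variable.

def suspect_routines(routines, ser_vars):
    """routines: name -> (vars read, vars written); ser_vars: taint seeds."""
    tainted = set(ser_vars)
    changed = True
    while changed:  # fixed-point propagation over def-use edges
        changed = False
        for name, (reads, writes) in routines.items():
            if reads & tainted and not writes <= tainted:
                tainted |= writes
                changed = True
    return {n for n, (reads, _) in routines.items() if reads & tainted}

routines = {
    "parse_config":   ({"config_file"}, {"window_size"}),
    "setup_listener": ({"window_size"}, {"listen_sock"}),
    "do_transfer":    ({"listen_sock"}, {"bytes_sent"}),
    "log_event":      ({"uptime"}, set()),
}

print(sorted(suspect_routines(routines, {"config_file"})))
# ['do_transfer', 'parse_config', 'setup_listener']
```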
Outline
Overview
Sahara: Debugging Failed Upgrades
Evaluation
Conclusion
Experimental Setup
Upgrade deployment: environment data from 87 machines
Experiments: application-specific and random configurations
Modified 3 out of 8 real configurations to induce failures
Feature selection: 20 fail profiles, 67 success profiles
Failure Correlation
Perfect (100%) – all failure-inducing profiles result in failure
Imperfect (60%) – 60% of failure-inducing profiles result in failure
Imperfect (20%) – 20% of failure-inducing profiles result in failure
Suspects: features within 30% of the top-ranked feature
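For concreteness, the 30% rule can be sketched as follows (hypothetical feature names and scores):

```python
# Hypothetical scores from feature selection (higher = more suspect).
scores = {"Port": 0.90, "Compression": 0.75, "X11Forwarding": 0.40}

top = max(scores.values())
# Keep every feature whose score is within 30% of the top-ranked one.
sers = [f for f, s in scores.items() if s >= 0.7 * top]
print(sers)  # ['Port', 'Compression']
```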
Evaluation
Evaluate Sahara with three applications:
OpenSSH – 3 real upgrade failures
Upgrades every 3-6 months
50-70K lines of code
SQLite – 1 synthetic upgrade failure
67K lines of code
uServer – 1 synthetic upgrade failure
37K lines of code
Results for two of the OpenSSH bugs are discussed next
OpenSSH bugs – Port Forwarding
Large data transfers abort with port forwarding
Regression bug in ssh version 4.7
Abort not reproducible at vendor site
Reasons for the abort
Users with port forwarding enabled issued large transfers
Default window size increased from 128KB to 2MB
Window size incorrectly advertised as packet size
sshd limits maximum packet size to 256KB
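A hypothetical recreation of the failure condition, using the values from this slide (this is not OpenSSH's actual code):

```python
# The upgraded client advertises its window size in the packet-size
# field; once the default window grew past sshd's packet-size limit,
# large transfers started aborting.
MAX_PACKET_SIZE = 256 * 1024   # sshd's maximum packet size (256KB)
window_size = 2 * 1024 * 1024  # new default window size (2MB)

advertised_packet_size = window_size  # the regression: wrong value sent
if advertised_packet_size > MAX_PACKET_SIZE:
    print("abort: advertised packet size exceeds sshd's limit")
```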
Results – Port Forwarding
Sahara reduces the number of routines to inspect by 2-3x over static analysis, 17-20x over dynamic analysis, and 9-10x over diff
Produces a small number of routines
Prime suspects always include the offending routine(s)
[Bar chart (y-axis: No. of Routines) for each scenario - Perfect (100%), Imperfect (60%), and Imperfect (20%), with SERs = 1 and SERs = 3, over the random and real configurations: diff flags 65 routines and dynamic analysis 124 in every scenario; static analysis flags 12 routines with SERs = 1 and 22 with SERs = 3; Sahara's prime suspects number 6 with SERs = 1 and 7 with SERs = 3.]
OpenSSH bugs – X11 Forwarding
X11 forwarding won't start when ssh is executed in the background
Regression bug in sshd version 4.2
X11 forwarding is enabled and the X session is started in the background
Reasons for the failure
X11 code was modified to destroy listeners whose session has ended
With the X11 session in the background, the session is closed and its listener destroyed
Results – X11 Forwarding
[Bar chart (y-axis: No. of Routines) for the same six scenarios: diff flags 137 routines and dynamic analysis 157 in every scenario; static analysis flags 18 routines with SERs = 1 and 20-21 with SERs = 3; Sahara's prime suspects number 6-7.]
Sahara reduces the number of routines to inspect by 3x over static analysis, 20x over dynamic analysis, and 15x over diff
Produces a small number of routines
The offending routine(s) are always included in the prime suspects
Results – Sensitivity Analysis (1/2)
Impact of number of failure-inducing profiles
Default - 20 failure-inducing profiles
Case 1 - 30 failure-inducing profiles
Number of SERs reduces by at most 2 features
Number of prime suspects reduces by at most 1
Case 2 - 10 failure-inducing profiles
Number of SERs reduces by at most 1
Number of prime suspects reduces by at most 1
More profiles result in fewer SERs and prime suspects
Fewer profiles sometimes result in less noise
Results – Sensitivity Analysis (2/2)
Impact of feature selection accuracy
Default – suspects within 30% of top-ranked feature
Case 1 - suspects within 50% of top-ranked feature
Prime suspects increase by at most 2x
Case 2 – All configuration parameters are suspect
Prime suspects increase by 6-7x
Lower feature selection accuracy results in more
SERs and prime suspects
Conclusion
Sahara leverages user feedback, machine learning, and program analyses
Produces accurate recommendations with a small set
of routines
The recommended set always includes the offending routine(s) and the culprit environment resource(s)
Demonstrates that combining different techniques can
be effective for debugging
Thanks for your time!
Questions?