Systematic Debugging
Zeller
DAIMI Henrik Bærbak Christensen
Literature
“Why Programs Fail”, 2nd Ed.
– Chap 1: TRAFFIC / Overview
– Chap 6: Scientific Method in debugging
“Beautiful Code”, eds. Oram & Wilson
– Chap 28: Delta debugging
Terminology Increment
The story goes
– The programmer creates a defect
– The defect causes an infection: program state that differs from the intended state
– The infection propagates
  • Erroneous state is fed into functions and the infection spreads
– The infection causes a failure
  • An externally observable error
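The chain above can be illustrated with a small sketch. The functions `average` and `report` are hypothetical and exist only for this illustration; the comments mark where the defect, the infection, and the failure would sit.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list (contains a planted defect)."""
    total = 0
    # DEFECT: the loop starts at index 1 and therefore skips the first element.
    for i in range(1, len(values)):
        total += values[i]
    # INFECTION: 'total' differs from the intended sum whenever values[0] != 0.
    return total / len(values)  # the infected value propagates to the caller


def report(values):
    # FAILURE: a wrong mean becomes externally observable in the output.
    return f"mean = {average(values):.2f}"


# The defect is executed, but the state happens to be sane (first element is 0).
print(report([0, 2, 4]))  # mean = 2.00 (correct)
# The defect infects 'total'; the intended mean is 4.00.
print(report([6, 2, 4]))  # mean = 2.00 (failure)
```

The first call shows that executing a defect does not always produce a failure; the second shows the full chain from defect to failure.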
Debugging
Debugging is:
– Identify the infection chain
– Find the root cause, and thereby the defect
– Remove the defect
Note:
– The chain from infection to failure can be long
– Not all infections lead to failure
TRAFFIC
Seven-step process
– T: Track the problem in the database
– R: Reproduce the failure
– A: Automate and simplify the test case
– F: Find possible infection regions
– F: Focus on the most likely origins
– I: Isolate the infection chain
– C: Correct the defect
Debugging as Search
Searching:
– Finding the transition from sane to infected state
  • In time and in place (= variable)
Principles:
– Separate sane from infected state
  • If a state is sane, there is no infection to propagate
– Separate relevant from irrelevant
  • A variable value is the result of a limited number of earlier variable values, i.e. only part of the earlier state may be relevant for the failure
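Searching "in time" for the transition from sane to infected state can be sketched as a sanity check after each processing step. Everything here is an assumption made for illustration: the steps `load`, `shrink`, and `total`, the state dictionary, and the rule that widths must be non-negative.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, object]
Step = Callable[[State], State]


def first_infecting_step(initial: State, steps: List[Step],
                         is_sane: Callable[[State], bool]) -> Optional[int]:
    """Return the index of the first step whose output state is infected, or None."""
    state = dict(initial)
    for index, step in enumerate(steps):
        state = step(dict(state))  # run the step on a copy of the current state
        if not is_sane(state):
            return index           # transition from sane to infected found here
    return None


def load(state: State) -> State:
    state["widths"] = [3.0, 1.5, 2.0]
    return state


def shrink(state: State) -> State:
    # Hypothetical defect: subtracting a fixed amount can make a width negative.
    state["widths"] = [w - 2.0 for w in state["widths"]]
    return state


def total(state: State) -> State:
    state["total"] = sum(state["widths"])
    return state


def widths_are_sane(state: State) -> bool:
    return all(w >= 0 for w in state.get("widths", []))


culprit = first_infecting_step({}, [load, shrink, total], widths_are_sane)
print(culprit)  # 1, i.e. the 'shrink' step
```

The check only inspects `widths`, which reflects the second principle: only the part of the state that is relevant to the failure needs to be examined.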
Searching
State dependency
– Infected state is the result of changes in a limited set of state variables
– Finding these of course limits the search space a lot…
Yet another argument in favor of loose coupling of highly cohesive software units!
Scientific Method
Wikipedia and Zeller Chap 6
Process
The essential elements of the scientific method are iterations and recursions of the following four steps:
– Characterization (clear terminology, careful logging)
– Hypothesis (a theoretical, hypothetical explanation)
– Prediction (logical deduction from the hypothesis)
– Experiment (test of all of the above)
– Observation & Conclusion: Hypothesis reject/confirm
– Theory: A hypothesis that cannot be rejected even though it has been thoroughly tried…
[Figure: the scientific method in natural science, relating Characterization, Observation, Hypothesis, Experiment, Theory / Literature, and Experience]
My Best Debugging Story
Utilizing the Scientific Method
Debugging…
The method
– Observe, hypothesis, prediction, experiment
– and detailed record keeping
… served me well in an EU project in which I was hired as advisor…
Domain: Symphony orchestra sheet music editor.
Case: Haken-Blostein algorithm
Haken-Blostein is a complex algorithm for spacing sheet music.
Basically it treats the graphical objects like a system of connected springs with rods inserted.
Observation
Observation: Each invocation of the algorithm spaced the sheet music differently. Something gets accumulated or changed between invocations?
Hypothesis: The observed behavior is due to some kind of accumulating defect in the code. Some variable(s) change value during invocation
But - Too vague to be useful...
Physics: Spring system
A classic problem in physics is connected springs, e.g. three connected springs.
Problem:
– How will the connection points move when you push/pull the system?
– When the spring constants are different?
– Inserting rods will set a lower limit on how compressed they can become
Iteration 1
Parameters:
– External algorithm parameters: a
– Spring constants c_i, rod sizes R_j
Hypothesis 1:
– The algorithm parameter, a, changes value
Experiment:
– Inspect the value of a before and after invocation
– Easy: there is a dialogue box where it can be inspected.
Result: Hypothesis false.
Iteration 2
Hypothesis 2:
– Spring constants change
Experiment:
– Inspect the values of the spring constants c_i before and after invocation
– Difficult:
  • There are a lot of springs and no user interface.
  • Inserting debug output into code.
  • Manually comparing long lists of double values.
Result: Hypothesis false.
Iteration 3
Hypothesis 3:
– Rod sizes change
Experiment:
– Inspect the values of the rod sizes R_j before and after invocation
– Even more difficult:
  • There are hundreds of rods and no user interface.
  • Inserting debug output into code.
  • Using a diff tool to compare very long lists of double values.
Result: Hypothesis accepted.
– A few rod sizes somehow accumulate width…
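The before/after comparison used in these iterations can be sketched as a snapshot diff. This is not the project's code: `respace` is a hypothetical stand-in with a planted accumulating defect, and the rod names and values are invented.

```python
import math
from typing import Dict, List, Tuple


def changed_entries(before: Dict[str, float], after: Dict[str, float],
                    rel_tol: float = 1e-9) -> List[Tuple[str, float, float]]:
    """Return (name, old, new) for every entry whose value changed."""
    return [(name, before[name], after[name])
            for name in sorted(before)
            if not math.isclose(before[name], after[name], rel_tol=rel_tol)]


def respace(rods: Dict[str, float]) -> None:
    # Hypothetical stand-in for the spacing algorithm: rods belonging to
    # accidentals wrongly gain width on every invocation.
    for name in rods:
        if name.endswith("accidental"):
            rods[name] += 0.5


rods = {"r1_note": 4.0, "r2_accidental": 1.0, "r3_note": 4.0}
snapshot = dict(rods)   # state before invocation
respace(rods)           # the experiment
diff = changed_entries(snapshot, rods)
print(diff)             # [('r2_accidental', 1.0, 1.5)]
```

Automating the comparison replaces the manual reading of long lists of doubles and leaves only the changed entries to inspect.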
Zeller
Process
Our aids
The writing template
– Hypothesis
– Prediction
– Experiment
– Observation
– Conclusion
Be explicit! Write it down and keep a log
Example: Haken-Blostein 1
Hypothesis H1
– The spacing algorithm changes the value of a
Prediction
– The value of a before spacing differs from that after
Experiment
– Inspect a (dialog box); respace; inspect a
Observation
– The value of a is the same before and after respacing
Conclusion
– H1 rejected.
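One way to keep such a log explicit is a small record type with the five fields of the writing template. The class `LogEntry` and its `render` method are assumptions made for this sketch; the field contents are taken from the H1 example above.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    """One iteration of the writing template."""
    hypothesis: str
    prediction: str
    experiment: str
    observation: str = ""  # filled in after running the experiment
    conclusion: str = ""   # e.g. "H1 rejected" or "H1 confirmed"

    def render(self) -> str:
        fields = ("hypothesis", "prediction", "experiment",
                  "observation", "conclusion")
        return "\n".join(f"{name.capitalize()}: {getattr(self, name)}"
                         for name in fields)


h1 = LogEntry(
    hypothesis="The spacing algorithm changes the value of a",
    prediction="The value of a before spacing differs from that after",
    experiment="Inspect a (dialog box); respace; inspect a",
)
h1.observation = "The value of a is the same before and after respacing"
h1.conclusion = "H1 rejected"
print(h1.render())
```

Writing the hypothesis, prediction, and experiment before filling in the observation makes it harder to adjust the prediction after the fact.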
Exercise
Discuss this experiment/iteration using the terminology of
– Defect, infection, failure
– Principles:
  • Separate sane from infected state
    – If a state is sane, there is no infection to propagate
  • Separate relevant from irrelevant
    – A variable value is the result of a limited number of earlier variable values, i.e. only part of the earlier state may be relevant for the failure
Creating New Hypotheses
If a hypothesis is rejected, we have to formulate a new one!
– Creative, but effective!
Sources
– Problem description (be explicit and precise)
– Program code
– Failing run/test case
– Alternative runs/test cases
Earlier hypotheses
– A new hypothesis must
  • Include all confirmed earlier hypotheses
  • Exclude all rejected earlier hypotheses
Reasoning
Fig 6.5
Deduction (0 runs)
– Concluding from abstract to concrete
  • I do not have to measure if the sum of angles in a triangle is 180 degrees. I deduce it from a mathematical theory.
Observation (1 run)
– Observe a phenomenon once
Induction (n runs)
– Collecting many concrete observations to form an abstraction
  • “I have met 15 stupid men, thus all men are stupid…”
Experimentation (n controlled runs)
– Induction, but controlled by scientific method
Delta Debugging
Automating the Search
TRAFFIC
– T: Track the problem in the database
– R: Reproduce the failure
– A: Automate and simplify the test case
– F: Find possible infection regions
– F: Focus on the most likely origins
– I: Isolate the infection chain
– C: Correct the defect
Debugging = Search
But can we automate the search?
Case
ddd
– Front end to gdb
– Bug: command-line arguments no longer remembered between runs
– Pass: gdb 4.16. Fail: gdb 4.17
– Reasoning
  • Some code change, Δ, between 4.16 and 4.17 is the cause of the failure.
  • Can find it by reviewing the set of Δ’s
– Diff:
  • 178,200 changed lines
  • 8721 places
Idea
Apply one Δ at a time
– Test to see if the test case fails
Starting at 4.16 (pass) until fail (4.16’)
– Review the last applied Δ: that is it!
But…
– The order of the Δs is not known
  • Δ1 must be applied before Δ2 and Δ3
    – Ex: Δ1 declares a new variable, used by Δ2
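The "one Δ at a time" idea can be sketched as a linear scan, under the strong assumption that a valid application order is already known. The delta names and the `fails` predicate are hypothetical; in practice `fails` would apply the patches, build gdb, and run the test case.

```python
from typing import Callable, List, Optional, Sequence


def first_failing_delta(deltas: Sequence[str],
                        fails: Callable[[List[str]], bool]) -> Optional[str]:
    """Apply deltas cumulatively, in the given order; return the first one that makes the test fail."""
    applied: List[str] = []
    for delta in deltas:
        applied.append(delta)
        if fails(applied):
            return delta  # the last applied delta is the suspect
    return None


# Hypothetical example: the failure appears as soon as "d3" is applied.
deltas = ["d1", "d2", "d3", "d4"]
culprit = first_failing_delta(deltas, lambda applied: "d3" in applied)
print(culprit)  # d3
```

The sketch needs up to one test per Δ and breaks down when the given order violates dependencies between the Δs, which is the objection raised above.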
Permutations?
All orderings of the set of Δ?
– 8721!
All subsets of the set of Δ?
– 2^8721 ≈ 10^2625
There are approx. 3 × 10^7 seconds in a year, so even the ‘fast’ approach (all subsets) would take on the order of 10^2617 years if each iteration takes 1 second.
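The magnitudes involved can be checked with logarithms, since the numbers themselves are too large to be useful. The sketch below only assumes the figures on the slide (8721 deltas, about 3 × 10^7 seconds per year, one test per second).

```python
import math

N_DELTAS = 8721
SECONDS_PER_YEAR = 3e7  # approximation used on the slide

# log10 of the number of subsets, 2**n
log10_subsets = N_DELTAS * math.log10(2)
# log10 of the number of orderings, n!, via the log-gamma function
log10_orderings = math.lgamma(N_DELTAS + 1) / math.log(10)
# log10 of the years needed to test every subset at one test per second
log10_years = log10_subsets - math.log10(SECONDS_PER_YEAR)

print(f"log10(subsets)   = {log10_subsets:.1f}")
print(f"log10(orderings) = {log10_orderings:.1f}")
print(f"log10(years)     = {log10_years:.1f}")
```

Either way, exhaustive search is out of the question, which motivates a divide-and-conquer strategy.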
A Simpler Problem
First, let us look at a simpler, but related problem.
– I generate HTML from my XML database
  • schedule.xml, lesson.xml, exercise.xml, …
– Using an XSLT translator ‘xt’
xt xml_src/schedule.xml xsl/gen-schedule.xsl schedule.html
But the problem is that if the XML is not well formed, I generally get an error without any hint of what went wrong
– Like the ‘good old days’ of a C core dump…
The Scientific Method + Divide&Conquer
I then resort to binary search:
– H1: The defect is in the first half of the xml file
  • I cut out the last (meaningful) half, and run ‘xt’
  • If the error is there, H1 is confirmed, otherwise rejected
– If H1 is rejected then H2: defect in second half
– If H1/H2 is confirmed, I further divide the half containing the defect into two, and define two new hypotheses
  • i.e. call myself recursively
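The halving procedure above can be sketched in code. Assumptions: the file is modelled as a list of record strings, exactly one record is malformed, and Python's XML parser stands in for running ‘xt’; the record contents are invented.

```python
import xml.etree.ElementTree as ET
from typing import List


def is_well_formed(records: List[str]) -> bool:
    """Stand-in for running 'xt': parse the records wrapped in a root element."""
    try:
        ET.fromstring("<root>" + "".join(records) + "</root>")
        return True
    except ET.ParseError:
        return False


def locate_malformed(records: List[str]) -> int:
    """Bisect to the index of the single malformed record (assumes exactly one)."""
    low, high = 0, len(records)  # the defect lies in records[low:high]
    while high - low > 1:
        mid = (low + high) // 2
        if not is_well_formed(records[low:mid]):
            high = mid           # H1 confirmed: defect in the first half
        else:
            low = mid            # H1 rejected, so H2: defect in the second half
    return low


records = [
    "<lesson id='1'/>",
    "<lesson id=2/>",            # malformed: attribute value is not quoted
    "<lesson id='3'/>",
    "<lesson id='4'/>",
]
index = locate_malformed(records)
print(index, records[index])
```

Cutting at record boundaries corresponds to removing a "meaningful" half: each half must remain parseable on its own apart from the defect being searched for.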
ddmin
ddmin is an algorithmic version of this procedure
– Test(c) only returns X (fail) in case the exact same failure occurs
– If xt fails for other reasons, return ? (unresolved)
What Happens?
N = 8
Test(Cx\C4) = X =>
– Call recursively with the new set and N = 7
– N = max(N-1, 2)
The Math Formulation
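The formula on this slide did not survive extraction. As a reference point, the ddmin definition as it is commonly stated in Zeller's publications can be written as follows; this is a reconstruction from the published algorithm, not necessarily the slide's exact notation, and it should be checked against the handed-out chapter.

```latex
\[
\begin{aligned}
\mathit{ddmin}(c_X) &= \mathit{ddmin}_2(c_X, 2) \\
\mathit{ddmin}_2(c'_X, n) &=
\begin{cases}
\mathit{ddmin}_2(\Delta_i, 2)
  & \text{if } \exists i:\ \mathit{test}(\Delta_i) = X
  \quad \text{(reduce to subset)} \\
\mathit{ddmin}_2(\nabla_i, \max(n-1, 2))
  & \text{else if } \exists i:\ \mathit{test}(\nabla_i) = X
  \quad \text{(reduce to complement)} \\
\mathit{ddmin}_2(c'_X, \min(|c'_X|, 2n))
  & \text{else if } n < |c'_X|
  \quad \text{(increase granularity)} \\
c'_X & \text{otherwise}
\end{cases}
\end{aligned}
\]
```

Here $c'_X = \Delta_1 \cup \dots \cup \Delta_n$ with pairwise disjoint $\Delta_i$ of roughly equal size, and $\nabla_i = c'_X \setminus \Delta_i$. The second case matches the step N = max(N-1, 2) in the preceding example.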
Back to gdb Problem
You apply smaller and smaller change sets to gdb 4.16 and test, until you find one that is small enough and that
– A) compiles and runs
– B) reproduces the defect
Note: No guarantee that it will be found!
– The change that has the defect may rely on a previous change that is not part of the change set…
The Full dd Algorithm
The dd algorithm in the handed-out chapter is the extended version of ddmin.
It tests from both ends
– Failing test case set to be minimized (like ddmin)
  • All deltas / gdb 4.17
– Passing test case set to be maximized
  • No deltas / gdb 4.16
I tried to hand-run it but …
Summary
TRAFFIC
– An algorithm to help in debugging
Scientific Process
– Debugging as iterations over hypothesis, prediction, observation, and conclusion.
Delta-debugging
– Automating the search for the defect
  • By divide-n-conquer of the search space…