Selected Topics of Software Technology 3
Spectrum-based Fault Localization
www.tugraz.at
Birgit Hofer
Institute for Software Technology
Before we start, a few organizational things
VO
Date for the written exam?
Practical PART 2 – Static Code Analysis
Teams?
Spreadsheet Quality Assurance Techniques (figure): Visualization, Static Analysis, Debugging, Testing, Modeling, Design & Maintenance Support
Debugging: Fault localization (Spectrum-based, Model-based), Repair (Genetic)
Highlighted in the figure: Debugging, Spectrum-based
Source: Jannach et al. “Avoiding, Finding and Fixing Spreadsheet Errors – A Survey of
Automated Approaches for Spreadsheet QA”, in Journal of Systems and Software, 2014.
Visualization: Patrick Koch, Diploma Seminar, TU Graz, 2015.
Outline – Spectrum-based Fault Localization
Software
Repetition
Bank Account Example
Influence of the Test suite quality
SFL in practice
Spreadsheets
Summary
Relevant Literature
Abreu, Zoeteweij, van Gemund: “An Evaluation of Similarity Coefficients for Software Fault Localization”, 12th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC), 2006.
Lucia, Lo, Jiang, Thung, Budi: “Extended comprehensive study of association measures for fault localization”, Journal of Software, 2013.
Hofer, Perez, Abreu, Wotawa: “On the empirical evaluation of similarity coefficients for spreadsheets fault localization”, Journal of Automated Software Engineering, 2014.
Observation Matrix
Error Vector
Source: Yu, Jones, Harrold: “An empirical study of the effects of test-suite reduction on fault localization”,
ICSE '08: Proceedings of the 30th international conference on Software engineering, ACM, 2008, 201-210
Spectrum-based Fault Localization (SFL)
Divide the test cases into passed and failed ones (error vector)
Log which statements were executed by each test case (observation matrix)
Identify those program parts whose execution pattern correlates most with the error vector
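The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: the coverage data, the test verdicts, and the function names `build_spectrum` and `count_matches` are hypothetical. It builds the observation matrix and the error vector, then counts for each statement how many test cases agree with the error vector.

```python
# Hypothetical coverage data: test name -> set of executed statement ids.
coverage = {
    "t1": {1, 2, 3},
    "t2": {1, 2},
    "t3": {1, 3},
}
# Hypothetical verdicts: True means the test case failed.
failed = {"t1": True, "t2": False, "t3": True}


def build_spectrum(coverage, failed):
    """Return statements, observation matrix (rows = tests) and error vector."""
    tests = sorted(coverage)
    statements = sorted(set().union(*coverage.values()))
    matrix = [[1 if s in coverage[t] else 0 for s in statements] for t in tests]
    error_vector = [1 if failed[t] else 0 for t in tests]
    return statements, matrix, error_vector


def count_matches(matrix, error_vector, column):
    """Count test cases where 'statement executed' equals 'test failed'."""
    return sum(1 for row, e in zip(matrix, error_vector) if row[column] == e)


statements, matrix, error_vector = build_spectrum(coverage, failed)
matches = {s: count_matches(matrix, error_vector, i)
           for i, s in enumerate(statements)}
print(matches)  # statement 3 agrees with the error vector in all three tests
```

Counting plain matches is only the simplest possible correlation measure; the similarity coefficients weight the four possible combinations differently.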
Matches between a statement's execution pattern and the error vector (example from Yu, Jones, Harrold):
1 match, 5 mismatches
3 matches, 3 mismatches
5 matches, 1 mismatch
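Similarity Coefficients
Tarantula: $\dfrac{\frac{a_{11}}{a_{11}+a_{01}}}{\frac{a_{11}}{a_{11}+a_{01}}+\frac{a_{10}}{a_{10}+a_{00}}}$
Statistical Bug Isolation: $\dfrac{a_{11}}{a_{11}+a_{10}}$
Jaccard: $\dfrac{a_{11}}{a_{11}+a_{01}+a_{10}}$
Ochiai: $\dfrac{a_{11}}{\sqrt{(a_{11}+a_{01})\,(a_{11}+a_{10})}}$
$a_{11}$ = executed and error
$a_{10}$ = executed and no error
$a_{01}$ = not executed and error
$a_{00}$ = not executed and no error
There are more than 40 similarity coefficients!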
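The four coefficients named on this slide can be written down directly from the counters a11, a10, a01, a00. The sketch below is illustrative only; the function names and the convention of returning 0.0 when a denominator is zero are assumptions, not something the slides prescribe.

```python
from math import sqrt


def _div(num, den):
    # Assumption: a zero denominator yields a score of 0.0.
    return num / den if den else 0.0


def tarantula(a11, a10, a01, a00):
    fail_ratio = _div(a11, a11 + a01)   # share of failing tests that execute the statement
    pass_ratio = _div(a10, a10 + a00)   # share of passing tests that execute the statement
    return _div(fail_ratio, fail_ratio + pass_ratio)


def statistical_bug_isolation(a11, a10, a01, a00):
    return _div(a11, a11 + a10)


def jaccard(a11, a10, a01, a00):
    return _div(a11, a11 + a01 + a10)


def ochiai(a11, a10, a01, a00):
    return _div(a11, sqrt((a11 + a01) * (a11 + a10)))


# Hypothetical counters: executed by 2 failing and 1 passing test,
# never skipped by a failing test, skipped by 2 passing tests.
counters = (2, 1, 0, 2)
for coefficient in (tarantula, statistical_bug_isolation, jaccard, ochiai):
    print(coefficient.__name__, round(coefficient(*counters), 3))
```

All four functions take the same argument list, so a ranking routine can treat the coefficient as a parameter.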
Solution for previous Example
Ochiai: $\dfrac{a_{11}}{\sqrt{(a_{11}+a_{01})\,(a_{11}+a_{10})}}$
Bank Account Example – Version 1
Source: Wotawa: “Fault localization based on dynamic slicing and hitting-set computation”, QSIC 2010
Bank Account Example – Test Cases (1)
Bank Account Example – Test Cases (2)
Bank Account Example – Solution
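Evaluation Methods
Best Case: $\mathrm{Rank}_{BEST} = |\{c \mid \mathrm{Ochiai}(c) > \mathrm{Ochiai}(f)\}| + 1$
Average Case: $\mathrm{Rank}_{AVG} = \dfrac{|\{c \mid \mathrm{Ochiai}(c) > \mathrm{Ochiai}(f)\}| + |\{c \mid \mathrm{Ochiai}(c) \geq \mathrm{Ochiai}(f)\}|}{2} + 0.5$
Worst Case: $\mathrm{Rank}_{WORST} = |\{c \mid \mathrm{Ochiai}(c) \geq \mathrm{Ochiai}(f)\}|$
Here f denotes the faulty component and c ranges over all ranked components.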
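A small sketch may help to see how the three ranks differ when several components share the same coefficient. It assumes that the best case counts only components with a strictly higher score (plus one), the worst case counts all components with a score at least as high, and the average case lies halfway between the two. The function name `ranks` and the scores are hypothetical.

```python
def ranks(scores, faulty):
    """Return (best, average, worst) rank of the faulty component."""
    target = scores[faulty]
    greater = sum(1 for v in scores.values() if v > target)
    greater_or_equal = sum(1 for v in scores.values() if v >= target)
    best = greater + 1
    worst = greater_or_equal
    average = (greater + greater_or_equal) / 2 + 0.5
    return best, average, worst


# Hypothetical coefficients; s2, s3 and s4 are tied.
scores = {"s1": 1.0, "s2": 0.7, "s3": 0.7, "s4": 0.7, "s5": 0.4}
print(ranks(scores, "s3"))  # (2, 3.0, 4)
```

Without ties all three values coincide, so the choice of evaluation method only matters when components are indistinguishable by their coefficient.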
The more test cases, the better the results
BankAccount Example
What happens when you …
do not use all test cases?
duplicate a passing test case?
duplicate a failing test case?
Simple C Compiler (plots, axis: # Statements)
Failing TC   Passing TC   Ranking
1            0            ~1400
1            39           ~600
6            0            ~1000
6            39           ~170
A Small Case Study
TCAS: Traffic alert and Collision Avoidance System
77 NCSS (LOC)
1546 Test Cases (TC)
39 predefined single faults
Tcas fault examples (plots, axis: # Statements)
Tcas Fault 38 (wrong array size)
Passing TC: Few
Failing TC: Many
Tcas Fault 9 (>= instead of >)
Passing TC: Not too few
Failing TC: Non-relevant
Tcas Fault 30 (missing case)
Passing TC: Not too few
Failing TC: Many
Tcas Fault 35 (missing negation)
Passing TC: Not too few, not too many
Failing TC: Many
Tcas Fault 16 (wrong const. value)
Sometimes it does not matter
Influence of the Test Suite on the Ranking Result
Number of failing TC
Higher number improves accuracy of the diagnosis
Benefit of having more than 10 failing TC is marginal
Number of passing TC
Can have a significant effect
(both in positive and negative direction)
Effect stabilizes around 20 passing TC
Source: Abreu: “Spectrum-based Fault Localization in Embedded Software”,
PhD thesis, Delft University of Technology, November 2009
How to implement?
Use a code coverage tool
Java: EMMA
.NET: NCover
Memory Optimization
Increment the counters a11, a10, a01 instead of storing the whole observation matrix
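The memory optimization can be sketched as follows. This is an assumption-laden illustration, not tied to EMMA or NCover: the class `SpectrumCounters` and its methods are hypothetical, and it deviates slightly from the slide by deriving a01 and a00 from the total number of failing and passing test cases instead of incrementing them per statement.

```python
from collections import defaultdict


class SpectrumCounters:
    """Keeps per-statement counters instead of the full observation matrix."""

    def __init__(self):
        self.executed_failed = defaultdict(int)  # a11 per statement
        self.executed_passed = defaultdict(int)  # a10 per statement
        self.total_failed = 0
        self.total_passed = 0

    def record(self, executed_statements, test_failed):
        """Update the counters with the coverage of one finished test case."""
        if test_failed:
            self.total_failed += 1
            target = self.executed_failed
        else:
            self.total_passed += 1
            target = self.executed_passed
        for statement in executed_statements:
            target[statement] += 1

    def counts(self, statement):
        """Return (a11, a10, a01, a00) for one statement."""
        a11 = self.executed_failed.get(statement, 0)
        a10 = self.executed_passed.get(statement, 0)
        return a11, a10, self.total_failed - a11, self.total_passed - a10


counters = SpectrumCounters()
counters.record({1, 2}, test_failed=True)
counters.record({1}, test_failed=False)
counters.record({2, 3}, test_failed=True)
print(counters.counts(2))  # (2, 0, 0, 1)
```

Memory then grows with the number of statements only, not with the number of test cases.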
Running Example
Faulty Spreadsheet
Formula View
Source: EUSES Spreadsheet Corpus
Test Cases for Spreadsheets
Input cells: cells that do not reference other cells
I = {B2=23, C2=31, E2=15, B3=35, C3=34, E3=17}
Output cells: any formula cell, determined by user
O = {B4=58, C4=65, D4=123, F2=810, F3=1173}
From 3rd level programs to spreadsheets
Program debugging: execution traces, slices
Spreadsheets: cones (borrowed from hardware debugging)
The function ρ(c) returns all cells referenced in c.
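A cone can be computed by following references transitively. The sketch below assumes the usual recursive reading, CONE(c) = {c} ∪ CONE(c') for every c' in ρ(c); the extracted slide text does not show the definition itself. The reference map `refs` is hypothetical and was chosen only so that the result agrees with the cones listed for the running example.

```python
def cone(cell, refs):
    """Return the cell together with all cells it depends on, directly or indirectly.

    refs plays the role of the function rho: it maps a formula cell to the
    cells it references; input cells have no entry.
    """
    result = {cell}
    for referenced in refs.get(cell, ()):
        result |= cone(referenced, refs)
    return result


# Hypothetical reference structure of the faulty running example.
refs = {
    "D2": ["B2"],
    "D3": ["B3", "C3"],
    "F2": ["D2", "E2"],
    "F3": ["D3", "E3"],
    "B4": ["B2", "B3"],
    "C4": ["C2", "C3"],
    "D4": ["D2", "D3"],
}

print(sorted(cone("F2", refs)))  # ['B2', 'D2', 'E2', 'F2']
print(sorted(cone("D4", refs)))  # ['B2', 'B3', 'C3', 'D2', 'D3', 'D4']
```

The recursion terminates because spreadsheet references without circular dependencies form an acyclic graph.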
Example for cones
CONE(F2) = {B2, D2, E2, F2}
CONE(D4) = {B2, D2, B3, C3, D3, D4}
Investigating the intersection of cones
Faults where ∩ of cones does not work
Several faults: Cone(F2) = {B2, C2, D2, F2}, Cone(F3) = {B3, C3, D3, F3}
Single wrong output cell: Cone(F3) = {B3, C3, D3, D4, F3}
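The cone sets from the slides can be intersected directly to see where the approach helps and where it fails. The helper `intersect_cones` is hypothetical, and the first case assumes that F2 and D4 are the erroneous output cells of the running example.

```python
def intersect_cones(cones):
    """Intersect the cones of all erroneous output cells."""
    return set.intersection(*cones) if cones else set()


# Running example: two erroneous outputs narrow the candidates down.
cone_f2 = {"B2", "D2", "E2", "F2"}
cone_d4 = {"B2", "D2", "B3", "C3", "D3", "D4"}
print(sorted(intersect_cones([cone_f2, cone_d4])))  # ['B2', 'D2']

# Several faults: the cones are disjoint, so no candidate remains.
several = intersect_cones([{"B2", "C2", "D2", "F2"}, {"B3", "C3", "D3", "F3"}])
print(several)  # set()

# Single wrong output cell: the intersection is the whole cone, nothing is excluded.
single = intersect_cones([{"B3", "C3", "D3", "D4", "F3"}])
print(len(single))  # 5
```

Both failure cases motivate a ranking based on similarity coefficients instead of a hard intersection.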
SFL for spreadsheets
$a_{11}$ = in cone & error
$a_{10}$ = in cone & no error
$a_{01}$ = not in cone & error
$a_{00}$ = not in cone & no error
Tarantula: $\dfrac{\frac{a_{11}}{a_{11}+a_{01}}}{\frac{a_{11}}{a_{11}+a_{01}}+\frac{a_{10}}{a_{10}+a_{00}}}$
Statistical Bug Isolation: $\dfrac{a_{11}}{a_{11}+a_{10}}$
Jaccard: $\dfrac{a_{11}}{a_{11}+a_{01}+a_{10}}$
Ochiai: $\dfrac{a_{11}}{\sqrt{(a_{11}+a_{01})\,(a_{11}+a_{10})}}$
Spectrum-based Fault Localization
Spectra:
Cones of faulty and correct output cells
CONE(F2) = {B2,D2,E2,F2}
CONE(D4) = {B2,D2,B3,C3,D3,D4}
CONE(B4) = {B2,B3,B4}
CONE(C4) = {C2,C3,C4}
CONE(F3) = {B3,C3,D3,E3,F3}

Cell    F2   D4   B4   C4   F3   Coef.   Rank
B2      ●    ●    ●              0.816   2
B3           ●    ●         ●    0.408   7
B4                ●              -
C2                     ●         -
C3           ●         ●    ●    0.408   7
C4                     ●         -
D2      ●    ●                   1.000   1
D3           ●              ●    0.500   6
D4           ●                   0.707   3
E2      ●                        0.707   3
E3                          ●    -
F2      ●                        0.707   3
F3                          ●    -
Error   ●    ●
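The coefficients and ranks in the table can be reproduced with a short script. It is a sketch under stated assumptions: it uses the Ochiai coefficient, treats F2 and D4 as the erroneous output cells, assigns each cell the rank "one plus the number of cells with a strictly higher coefficient", and leaves cells with coefficient 0 unranked. The function names are hypothetical.

```python
from math import sqrt

# Cones of the output cells, as listed on the slide.
cones = {
    "F2": {"B2", "D2", "E2", "F2"},
    "D4": {"B2", "D2", "B3", "C3", "D3", "D4"},
    "B4": {"B2", "B3", "B4"},
    "C4": {"C2", "C3", "C4"},
    "F3": {"B3", "C3", "D3", "E3", "F3"},
}
erroneous = {"F2", "D4"}


def ochiai_scores(cones, erroneous):
    """Compute the Ochiai coefficient for every cell that occurs in a cone."""
    total_errors = sum(1 for output in cones if output in erroneous)
    scores = {}
    for cell in sorted(set().union(*cones.values())):
        a11 = sum(1 for o, c in cones.items() if cell in c and o in erroneous)
        a10 = sum(1 for o, c in cones.items() if cell in c and o not in erroneous)
        a01 = total_errors - a11
        denominator = sqrt((a11 + a01) * (a11 + a10))
        scores[cell] = a11 / denominator if denominator else 0.0
    return scores


def rank_cells(scores):
    """Rank cells with a positive score; ties share the same rank."""
    return {cell: 1 + sum(1 for v in scores.values() if v > s)
            for cell, s in scores.items() if s > 0}


scores = ochiai_scores(cones, erroneous)
ranking = rank_cells(scores)
for cell in sorted(ranking, key=lambda c: (ranking[c], c)):
    print(cell, round(scores[cell], 3), ranking[cell])
```

D2 ends up on top with coefficient 1.000 because it lies in the cones of both erroneous outputs and in no cone of a correct output.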
Demo
Influence of the “Test suite” – Avg. Rank
Plots (y-axis: RankAVG): ISCAS85 c7552_BOOL_tc1_96_1Fault, EUSES my_financial_model_1FAULTS_V5
Source: Hofer, Perez, Abreu, and Wotawa: “On the empirical evaluation of similarity coefficients for
spreadsheets fault localization”, Automated Software Engineering, 2014.
No user wants to indicate for so many output cells whether they are correct.
Research questions
RQ1: Do spreadsheets contain correct output cells that positively or negatively influence the ranking of the faulty cells?
RQ2: If yes, is it possible to a-priori determine which correct output cells would positively influence the ranking?
RQ3: Is it possible to avoid a decreasing fault localization quality when adding more correct output cells?
RQ1: Do spreadsheets contain correct output cells that positively or negatively influence the ranking of the faulty cells?
Plots (x-axis: number of correct output cells, y-axis: fault ranking (RankAVG)): EUSES my_financial_model, ISCAS85 c7552
RQ2: If yes, is it possible to a-priori determine which correct output cells would positively influence the ranking?
Avoid coincidentally correct output cells
A-priori definition not possible
Too many potentially coincidentally correct output cells
Take output cells with largest cones first

RankAVG for one correct output cell   Random selection   Largest cone
EUSES my_financial                    5.8                2.5
ISCAS85 c7552                         100.6              69.5
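The "largest cones first" heuristic amounts to sorting the correct output cells by cone size. The sketch below uses the cones of the running example; the function name `order_by_cone_size` and the alphabetical tie-break are assumptions added for determinism.

```python
def order_by_cone_size(cones, correct_outputs):
    """Return the correct output cells, largest cone first (ties by cell name)."""
    return sorted(correct_outputs, key=lambda output: (-len(cones[output]), output))


cones = {
    "F2": {"B2", "D2", "E2", "F2"},
    "D4": {"B2", "D2", "B3", "C3", "D3", "D4"},
    "B4": {"B2", "B3", "B4"},
    "C4": {"C2", "C3", "C4"},
    "F3": {"B3", "C3", "D3", "E3", "F3"},
}
correct_outputs = ["B4", "C4", "F3"]

# F3 has the largest cone, so the user would be asked about F3 first.
print(order_by_cone_size(cones, correct_outputs))  # ['F3', 'B4', 'C4']
```

A correct output with a large cone exonerates many cells at once, which is one plausible reading of why it helps more than a randomly chosen one.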
Coincidental Correctness
Conditionals such as the IF function
Abstraction functions such as MIN, MAX, COUNT
Boolean
Multiplication by zero
Power with 0 or 1 as base number or 0 as exponent
Coincidental correctness is also a problem for other software (see mid example)
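Multiplication by zero is the easiest of these cases to demonstrate. In the hypothetical sketch below, the intended formulas are assumed to be D2 = B2 + C2 and F2 = D2 * E2, and the faulty variant omits C2 in D2. These formulas are an assumption chosen to fit the values of the running example, not taken from the slides.

```python
def f2(b2, c2, e2, faulty):
    """Compute F2 = D2 * E2, where the faulty variant of D2 forgets C2."""
    d2 = b2 if faulty else b2 + c2
    return d2 * e2


# With the input values of the running example the fault is visible in F2.
print(f2(23, 31, 15, faulty=False))  # 810
print(f2(23, 31, 15, faulty=True))   # 345

# With E2 = 0 both variants yield 0: F2 is coincidentally correct.
print(f2(23, 31, 0, faulty=False) == f2(23, 31, 0, faulty=True))  # True
```

If the user marks such an F2 as correct, the faulty cell D2 is counted as "in cone & no error", which lowers its coefficient.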
RQ3: Is it possible to avoid a decreasing fault localization quality when adding more correct output cells?
Balance ratio of correct and erroneous output cells
Duplicate cones of erroneous output cells
Plots (y-axis: RankAVG): ISCAS85 c7552
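The mechanics of duplicating erroneous cones can be sketched on the running example. Observations are stored as a list of (cone, is_erroneous) pairs so that the same cone may occur more than once; the function names and the duplication factor are hypothetical. In this toy example the top-ranked cell does not change, so the sketch only shows how the counters shift, not the effect reported for ISCAS85 c7552.

```python
from math import sqrt


def ochiai(cell, observations):
    """Ochiai coefficient of a cell over (cone, is_erroneous) observations."""
    a11 = sum(1 for cone, err in observations if cell in cone and err)
    a10 = sum(1 for cone, err in observations if cell in cone and not err)
    a01 = sum(1 for cone, err in observations if cell not in cone and err)
    denominator = sqrt((a11 + a01) * (a11 + a10))
    return a11 / denominator if denominator else 0.0


def duplicate_erroneous(observations, times):
    """Return observations in which every erroneous cone occurs `times` times."""
    extra = [(cone, err) for cone, err in observations if err] * (times - 1)
    return observations + extra


base = [
    ({"B2", "D2", "E2", "F2"}, True),               # F2, erroneous
    ({"B2", "D2", "B3", "C3", "D3", "D4"}, True),   # D4, erroneous
    ({"B2", "B3", "B4"}, False),                    # B4, correct
    ({"C2", "C3", "C4"}, False),                    # C4, correct
    ({"B3", "C3", "D3", "E3", "F3"}, False),        # F3, correct
]
balanced = duplicate_erroneous(base, times=2)

print(round(ochiai("B3", base), 3), round(ochiai("B3", balanced), 3))  # 0.408 0.5
print(ochiai("D2", base), ochiai("D2", balanced))                      # 1.0 1.0
```

Duplicating gives the erroneous observations more weight relative to the correct ones, which is the balancing idea named on the slide.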
Research questions - Summary
RQ1: Do spreadsheets contain correct output cells that positively or negatively influence the ranking of the faulty cells?
YES
RQ2: If yes, is it possible to a-priori determine which correct output cells would positively influence the ranking?
YES (use largest cones first)

RankAVG (1 correct output cell)   Random   Largest cone
EUSES my_financial                5.8      2.5
ISCAS85 c7552                     100.6    69.5
…

RQ3: Is it possible to avoid a decreasing fault localization quality when adding more correct output cells?
YES (duplicate cones of erroneous output cells)
Summary

Approach                   SFL                SENDYS             CONBUG            MUSSCO
User input                 correct / faulty   correct / faulty   expected values   several times expected values
Computational Complexity   low                low to moderate    high              very high
Result                     ranking            ranking            filtered set      repair
Fault Complexity           single faults      multiple faults    multiple faults   multiple faults
Granularity                block level        Stmt. level        Stmt. level       fine