Agenda
• What is Search-Based Software Analysis?
• Sample Problems
• Strength of Simpler Search
• Conclusions
What is Search-Based Software Analysis?
• Search-based optimization
• Search for software analysis
• Three paradigms for search
  - Meta-heuristic search
  - Search via an NP problem solver
  - Specific search strategies
Problem 1: Metamorphic-Relation Identification
• Background
  - Test oracle problem
  - Metamorphic testing: detect faults in programs by looking for violations of metamorphic relations (MRs)
  - Metamorphic relations: how a particular change to the input would change the output, e.g., $\sin x = \sin(x + 2\pi)$
• Metamorphic relation identification:
  - Manually or automatically identify MRs for a program
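As a minimal sketch of metamorphic testing with the sine relation above: the check below needs no test oracle, only two runs of the program under test. The function names, the tolerance, and the faulty `buggy_sin` implementation are hypothetical and chosen only for illustration.

```python
import math
import random

def violates_sine_mr(sin_impl, x, tol=1e-9):
    """Return True if sin_impl breaks the relation sin(x) = sin(x + 2*pi) at x."""
    return abs(sin_impl(x) - sin_impl(x + 2 * math.pi)) > tol

def buggy_sin(x):
    # Hypothetical faulty implementation: a truncated Taylor series
    # without range reduction, so it is not 2*pi-periodic.
    return x - x**3 / 6 + x**5 / 120

rng = random.Random(0)
inputs = [rng.uniform(-3.0, 3.0) for _ in range(100)]

# Count how many inputs expose a violation of the relation.
correct_violations = sum(violates_sine_mr(math.sin, x) for x in inputs)
buggy_violations = sum(violates_sine_mr(buggy_sin, x) for x in inputs)
print(correct_violations, buggy_violations)
```

A violation on any input signals a fault, even though the expected value of `sin(x)` is never consulted.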
Search-based Solution
• Focus only on polynomial MRs, in which the relations between inputs and the relations between outputs are both polynomial equations
• Formalize polynomial MRs, e.g.,
  - $c_1 P(I_1) + c_2 P(\alpha I_1 + \beta) + d = 0$
  - $c_1 P^2(I_1) + c_2 P(I_1) P(\alpha I_1 + \beta) + c_3 P^2(\alpha I_1 + \beta) + d_1 P(I_1) + d_2 P(\alpha I_1 + \beta) + e = 0$
• Polynomial MR identification: search for the values of the parameters in the polynomial MRs
PSO -> MR Identification
• Particle Swarm Optimization (PSO)
  - An optimization algorithm that simulates the foraging behavior of birds
  - In PSO, each particle has a velocity and a location, both of which keep changing during the search. The fitness function evaluates how close a particle's location is to an optimal location
  - Searching in a D-dimensional space with N particles
• Given:
  - Velocity of the i-th particle at moment t (t = 1, 2, ...): $V_i^t = \langle v_{i1}^t, v_{i2}^t, \ldots, v_{iD}^t \rangle$
  - Location of the i-th particle at moment t: $L_i^t = \langle l_{i1}^t, l_{i2}^t, \ldots, l_{iD}^t \rangle$
  - d-th dimension of the personal optimum location that the i-th particle has reached on and before moment t: $p_{id}^t$
  - d-th dimension of the global optimum location that the swarm has reached on and before moment t: $p_{gd}^t$
• Then:
  - Velocity of the i-th particle at moment t+1: $v_{id}^{t+1} = \omega v_{id}^t + c_1 r_1 (p_{id}^t - l_{id}^t) + c_2 r_2 (p_{gd}^t - l_{id}^t)$
  - Location of the i-th particle at moment t+1: $l_{id}^{t+1} = l_{id}^t + v_{id}^{t+1}$
  - Here $\omega$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$, $r_2$ are random numbers in [0, 1]
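The velocity and location updates on this slide can be sketched in a few lines. This is an illustrative version only: it minimizes a cost instead of maximizing a fitness, and the parameter values (`w` for the inertia weight, `c1`, `c2`), the search range, and the `sphere` cost function are assumptions, not taken from the slides.

```python
import random

def pso_minimize(cost, dim, n_particles=20, steps=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic PSO: returns the best location found and its cost."""
    rng = random.Random(seed)
    loc = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in loc]                 # personal optimum locations
    pbest_cost = [cost(p) for p in loc]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # global optimum location
    for _ in range(steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - loc[i][d])
                             + c2 * r2 * (gbest[d] - loc[i][d]))
                # Location update uses the new velocity.
                loc[i][d] += vel[i][d]
            c = cost(loc[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = loc[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = loc[i][:], c
    return gbest, gbest_cost

def sphere(p):
    # Hypothetical cost function with its minimum at the origin.
    return sum(x * x for x in p)

best, best_cost = pso_minimize(sphere, dim=2)
print(best, best_cost)
```

For MR identification, each particle's location would hold one candidate vector of MR parameters, and the cost would be replaced by the (negated) fitness of that candidate.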
PSO -> MR Identification
• MR identification
• For example, $c_1 P(x_1, x_2, \ldots, x_n) + c_2 P\left(\sum_{j=1}^{n} a_{1j} x_j + b_1, \ldots, \sum_{j=1}^{n} a_{nj} x_j + b_n\right) + d = 0$
• Given a vector L of values for $c_1, c_2, a_{ij}, b_i, d$: if L and input $I_k$ satisfy this equation, $h(L, k) = 1$; otherwise, $h(L, k) = 0$.
• Fitness function: $fitness(L) = \sum_{k=1}^{m} h(L, k)$, where m is the number of inputs $I_k$
• Further reading: Zhang et al., Search-Based Inference of Polynomial Metamorphic Relations for Scientific Programs, ASE 2014.
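The fitness function counts the inputs on which a candidate parameter vector satisfies the MR equation. A minimal sketch for a single-input program follows; the tolerance, the sample inputs, and the three candidate vectors are assumptions made for illustration.

```python
import math

def fitness(program, candidate, inputs, tol=1e-9):
    """Number of inputs x for which c1*P(x) + c2*P(a*x + b) + d = 0 holds (within tol)."""
    c1, c2, a, b, d = candidate
    return sum(
        1 for x in inputs
        if abs(c1 * program(x) + c2 * program(a * x + b) + d) <= tol
    )

inputs = [0.1 * k for k in range(1, 51)]

# Candidate vectors (c1, c2, a, b, d) for the program P = sin.
periodic = (1.0, -1.0, 1.0, 2 * math.pi, 0.0)  # sin(x) - sin(x + 2*pi) = 0
odd = (1.0, 1.0, -1.0, 0.0, 0.0)               # sin(x) + sin(-x) = 0
wrong = (1.0, -1.0, 2.0, 0.0, 0.0)             # sin(x) - sin(2x) = 0, not an MR

for name, cand in [("periodic", periodic), ("odd", odd), ("wrong", wrong)]:
    print(name, fitness(math.sin, cand, inputs))
```

A candidate whose fitness equals the number of inputs is a likely MR; the search then only has to move candidates toward higher counts.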
Problem 2: Test-Case Prioritization
• Background of test-case prioritization
  - Regression testing: retest a new version using existing test cases within a test suite
  - It is expensive to rerun all the test cases
  - To meet some test goals earlier (e.g., revealing more faults within limited time), the test cases should be reordered
• Test-case prioritization
  - Schedule the execution order of test cases to achieve some test goal (e.g., detecting more faults in less time)
Solutions to Test-Case Prioritization
• Test-case prioritization
  - Given: T: a test suite; PT: the set of permutations of all subsets of T; f: a function from PT to numbers denoting the award value of an ordering of test cases
  - Problem: Find $T' \in PT$ such that $\forall T''\,(T'' \in PT)\,(T'' \neq T')\,[f(T') \geq f(T'')]$
• Typical solutions for test-case prioritization
  - Record the coverage information of the old version with T
  - Based on the preceding coverage information, prioritize test cases within T for a new version
Search-based Solution: ILP -> Test-Case Prioritization
• Integer linear programming (ILP)
  - Solves an optimization problem
  - Requirements:
    • all the variables are integers
    • all the functions and constraints are linear
  - Well-known example: the Travelling Salesman Problem
• Formalize test-case prioritization by ILP (n test cases, m statements)
  - Decision variables
    • Boolean variable $x_{ij}$: whether the j-th test case in T' is $t_i$
    • Boolean variable $y_{jk}$: whether the first j test cases in T' cover statement $st_k$
    • Boolean value $c_{ik}$: whether test case $t_i$ covers statement $st_k$ (known from the coverage information)
  - Constraints
    • $\sum_{i=1}^{n} x_{ij} = 1$, $\sum_{j=1}^{n} x_{ij} = 1$
    • $\sum_{i=1}^{n} c_{ik} \cdot x_{i1} = y_{1k}$
    • $y_{jk} \geq \sum_{i=1}^{n} c_{ik} \cdot x_{ij}$ $(j \geq 2)$
    • $y_{jk} \geq y_{j-1,k}$ $(j \geq 2)$
    • $\sum_{i=1}^{n} c_{ik} \cdot x_{ij} + y_{j-1,k} \geq y_{jk}$ $(j \geq 2)$
  - Objective function
    • maximize $\sum_{j=1}^{n-1} \sum_{k=1}^{m} y_{jk}$
• Further reading: Hao et al., On Optimal Coverage-Based Test-Case Prioritization, submitted to ISSRE 2014.
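To make the objective concrete, the sketch below evaluates it directly: for each prefix of an ordering (the first j test cases, j = 1..n-1), it counts the covered statements and sums the counts. It then finds the best ordering by brute-force enumeration, which is a stand-in for an ILP solver and only feasible for tiny suites. The coverage data and function names are hypothetical.

```python
from itertools import permutations

def objective(order, coverage):
    """Sum over prefixes j = 1..n-1 of the number of statements the first j tests cover."""
    covered, total = set(), 0
    for test in order[:-1]:
        covered |= coverage[test]
        total += len(covered)
    return total

def best_order(coverage):
    # Brute force over all orderings; an ILP solver would replace this step.
    return max(permutations(sorted(coverage)), key=lambda o: objective(o, coverage))

# Hypothetical coverage information: test case -> covered statement ids.
coverage = {
    "t1": {1, 2},
    "t2": {2, 3, 4},
    "t3": {5},
}
order = best_order(coverage)
print(order, objective(order, coverage))  # ('t2', 't1', 't3') 7
```

The last prefix (all n test cases) is left out of the sum because it covers the same statements in every ordering.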
Problem 3: Time-Aware Test-Case Prioritization
• Time-aware test-case prioritization
  - Add constraints on the time budget
  - Formalization
    • Given: T: a test suite; PT: the set of permutations of all subsets of T; f: a function from PT to numbers denoting the award value of an ordering of test cases; time: a function from PT to numbers denoting the execution time of an ordering of test cases; $time_{max}$: the time budget
    • Problem: Find $T' \in PT$ with $time(T') \leq time_{max}$ such that $\forall T''\,(T'' \in PT)\,(T'' \neq T')\,(time(T'') \leq time_{max})\,[f(T') \geq f(T'')]$
Search-based Solution: ILP -> Time-Aware Test-Case Prioritization
• Formalize time-aware test-case prioritization by ILP
  - Defined variables
    • Boolean variable $x_i$: whether test $t_i$ is selected
    • Variable $StN(t_i)$: number of statements covered by test $t_i$
  - Objective function: $\max \sum_{i} StN(t_i) \cdot x_i$
  - Constraint system: $\sum_{i} time(t_i) \cdot x_i \leq time_{max}$
• Further reading: Zhang et al., Time-Aware Test-Case Prioritization using Integer Linear Programming, ISSTA 2009
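With Boolean selection variables, one linear objective, and one linear budget constraint, the formulation above has the shape of a 0/1 knapsack problem. As a rough sketch, the code below solves a small instance exactly by dynamic programming instead of calling an ILP solver; it assumes integer execution times, and the sample numbers are made up.

```python
def select_tests(stmts, times, time_max):
    """0/1 knapsack: maximize sum(stmts[i] * x[i]) subject to sum(times[i] * x[i]) <= time_max.

    Returns (best value, tuple of selected test indices). Assumes integer times.
    """
    # best[b] = (value, selected indices) achievable within budget b
    best = [(0, ())] * (time_max + 1)
    for i in range(len(stmts)):
        # Go through budgets downwards so each test is selected at most once.
        for b in range(time_max, times[i] - 1, -1):
            cand = best[b - times[i]][0] + stmts[i]
            if cand > best[b][0]:
                best[b] = (cand, best[b - times[i]][1] + (i,))
    return best[time_max]

# Hypothetical data: statements covered and execution time per test case.
stmts = [10, 7, 5, 3]
times = [5, 3, 2, 1]
value, chosen = select_tests(stmts, times, time_max=6)
print(value, chosen)  # 15 (1, 2, 3)
```

Note that summing per-test statement counts, as the objective does, counts a statement twice when two selected tests both cover it.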
Problem 4: Test-Suite Reduction
• Background of test-suite reduction
  - Regression testing: retest a new version using existing test cases within a test suite
  - It is expensive to rerun all the test cases
  - To reduce the time required for testing, a representative subset of test cases that satisfies the same testing requirements as the given test suite should be found
• Test-suite reduction
  - Reduce the number of test cases while guaranteeing that the reduced test suite satisfies the same testing requirements as the original test suite
Solutions to Test-Suite Reduction
• Test-suite reduction
  - Given a test suite T, find a subset T' such that $\forall T'' \subseteq T\,(f(T'') = f(T') = f(T) \rightarrow |T'| \leq |T''|)$, where f is a function defining to what extent a subset satisfies the specified testing requirements.
• Typical solutions for test-suite reduction
  - Record the coverage information of the old version with T
  - Based on the preceding coverage information, select a reduced subset of T for a new version
Search-based Solution: ILP -> Test-Suite Reduction
• Formalize test-suite reduction by ILP (single-objective)
  - Decision variables
    • Boolean variable $x_i$: whether test $t_i$ is selected in the reduced test suite
    • Boolean variable $a_{ij}$: whether test $t_i$ covers test requirement $r_j$
  - Objective function: minimize $\sum_{i} x_i$
  - Constraints: for any j, $\sum_{i} a_{ij} \cdot x_i \geq 1$
• Further reading: Jennifer Black, Emanuel Melachrinoudis, and David Kaeli, "Bi-Criteria Models for All-Uses Test Suite Reduction," Proceedings of the 26th International Conference on Software Engineering (ICSE 2004)
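The formulation asks for the fewest test cases such that every test requirement is still covered by at least one selected test. A minimal sketch follows that finds such a subset by trying subsets in order of increasing size; this brute force stands in for an ILP solver and only scales to very small suites. The coverage data is hypothetical.

```python
from itertools import combinations

def minimal_reduction(covers):
    """Smallest subset of tests covering every requirement that the full suite covers."""
    tests = sorted(covers)
    required = set().union(*covers.values())
    # Try subset sizes from small to large; the first hit is a minimum-size solution.
    for size in range(len(tests) + 1):
        for subset in combinations(tests, size):
            if set().union(*(covers[t] for t in subset)) >= required:
                return list(subset)

# Hypothetical coverage: test case -> test requirements it satisfies.
covers = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r3"},
    "t4": {"r1", "r3"},
}
reduced = minimal_reduction(covers)
print(reduced)  # ['t1', 't2']
```

Several minimum-size subsets may exist; this sketch returns the first one in lexicographic order.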
Comparing ILP with other techniques, including the greedy strategy, a genetic algorithm, and other heuristic algorithms, we obtained the following findings.
• The genetic algorithm performs poorly in both effectiveness and efficiency.
• The ILP-based algorithm is more effective than the other algorithms.
Further reading: Zhong et al., An Experimental Study of Four Typical Test Suite Reduction Techniques, IST 2008
Issues in Existing Test-Suite Reduction
• Test-suite reduction
  - Given a test suite T, find a subset T' such that $\forall T'' \subseteq T\,(f(T'') = f(T') = f(T) \rightarrow |T'| \leq |T''|)$, where f is a function defining to what extent a subset satisfies the specified testing requirements.
• In practice, when going from T to T', the capability of the test suite (e.g., fault-detection capability) usually decreases.
• On-demand test-suite reduction: guarantee an upper limit of l% on the acceptable loss in fault-detection capability with confidence c%
Solutions to On-Demand Test-Suite Reduction
Given a test suite T, find a subset T' satisfying $f_{l,c}(T')$ and $\forall T'' \subseteq T\,(f_{l,c}(T'') \rightarrow |T'| \leq |T''|)$, where $f_{l,c}(T')$ denotes the fact that T' is a subset of T and that the loss of T' in fault-detection capability is at most l% in at least c% of circumstances.
Search-based Solution: ILP->On-Demand Test-Suite Reduction
• Formalize on-demand test-suite reduction by ILP
  - Defined variables
    • Boolean variable $x_i$: whether test $t_i$ is selected
    • Boolean variable $w_{j,q}$: whether q test cases in T' cover statement $s_j$
    • Variable $C(i,j)$: whether test case $t_i$ covers statement $s_j$
    • Variable $V_c(p_j, q)$: the loss in fault-detection capability for one statement at confidence level c% when the coverage changes from $p_j$ to q
  - Objective function: $\min \sum_{i} x_i$
  - Constraint system: $\sum_{q=1}^{p_j} w_{j,q} \cdot V_c(p_j, q) \leq l\%$ …
• Further reading: Hao et al., On-Demand Test Suite Reduction, ICSE 2012
Strength of Simpler Search (1)
• Greedy algorithms
  - Test-case prioritization
    • Greedy > Genetic > ILP
  - Test-suite reduction
    • Greedy ≈ ILP > Genetic
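For a sense of how simple a greedy strategy can be, here is a sketch of one common greedy prioritization heuristic, the "additional" coverage strategy: repeatedly pick the test case that covers the most statements not yet covered. Treating this as the greedy algorithm meant on the slide is an assumption; the coverage data is hypothetical, and the sketch omits the usual step of resetting coverage once everything is covered.

```python
def greedy_additional(coverage):
    """Order tests by repeatedly taking the one that adds the most uncovered statements."""
    remaining = dict(coverage)
    covered, order = set(), []
    while remaining:
        # Ties are broken by test name because candidates are scanned in sorted order.
        pick = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        order.append(pick)
        covered |= remaining.pop(pick)
    return order

# Hypothetical coverage information: test case -> covered statement ids.
coverage = {
    "t1": {1, 2, 3},
    "t2": {4, 5},
    "t3": {1, 2, 3, 4},
}
print(greedy_additional(coverage))  # ['t3', 't2', 't1']
```

The whole heuristic needs only a loop and a set difference, which is part of why trying it before heavier search machinery costs so little.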
Conclusions (1)
• Take-home messages (1)
  - Always try simpler strategies first
  - If a software analysis (SA) problem can be formulated as a search problem, but not as an NP problem, it might be a very good candidate for meta-heuristic search
Conclusions (2)
• Take-home messages (2)
  - If an SA problem can be formulated as an NP problem with size inflation, try meta-heuristic search (instead of an NP solver) first
  - If an SA problem can be formulated as an NP problem without size inflation, try an NP solver (instead of meta-heuristic search) first