
The credit for creating these slides belongs to Fall 2014 CS 521/621 students. Student names have been removed per FERPA regulations.

Electrical and Computer Engineering

SemFix: Program Repair via Semantic Analysis

Hoang Duong Thien Nguyen, Dawei Qi, Abhik Roychoudhury, Satish Chandra


Background

▪ Bug fixing is a mostly manual, time-consuming, and expensive activity
▪ Current automatic bug-fixing techniques:
  ▪ Specification-based repair
    ➢ a formal specification is needed
  ▪ Genetic-programming-based repair
    ➢ the correct expression should already be present in the program
  ▪ Enumeration-based repair
    ➢ all possible expressions should be considered


Research Question

▪ Can we fix bugs in a program without a formal specification?
▪ Is there a better way to fix bugs than genetic programming or enumeration?
▪ Is program synthesis better than enumeration?


Contribution

▪ Provide an automatic bug-fixing tool that needs no formal specification
▪ Formulate a repair constraint: the requirement that the repaired code pass all given tests
▪ Higher success rate and faster bug repair
▪ Use component-based program synthesis as an efficient and scalable way to generate the repair


Key Idea

▪ Get a ranked bug report
  • using statistical fault localization
▪ Find a repair constraint according to the given tests
  • using symbolic execution
▪ Compute a repair for the program
  • using program synthesis


Key Idea

▪ Ranked bug report
  ▪ generated by the Tarantula toolkit (other metrics can be used)
  ▪ a list of all potentially faulty statements with their locations
  ▪ statements are ranked by suspiciousness score, from the most 'suspicious' to the least

  o The suspiciousness of a line is computed from how often the line is executed in passing and in failing runs: the larger the share of failing runs that execute it, the higher the score.
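The ranking step can be sketched in a few lines of Python. This is a minimal illustration of the standard Tarantula score, not SemFix's own code; the coverage data, the line numbers, and the helper names `tarantula` and `rank_statements` are all made up for the example.

```python
def tarantula(passed_cov, failed_cov, total_passed, total_failed):
    """Tarantula score: share of failing runs covering the line,
    normalised by the sum of the failing and passing shares."""
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    if fail_ratio + pass_ratio == 0:
        return 0.0  # line never executed by any test
    return fail_ratio / (fail_ratio + pass_ratio)


def rank_statements(coverage, outcomes):
    """coverage: test name -> set of executed line numbers.
    outcomes: test name -> True if the test passed."""
    total_passed = sum(1 for ok in outcomes.values() if ok)
    total_failed = len(outcomes) - total_passed
    scores = {}
    for line in set().union(*coverage.values()):
        p = sum(1 for t, cov in coverage.items() if line in cov and outcomes[t])
        f = sum(1 for t, cov in coverage.items() if line in cov and not outcomes[t])
        scores[line] = tarantula(p, f, total_passed, total_failed)
    # Most suspicious first; ties broken by line number.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical coverage of a four-line program under five tests.
coverage = {
    "t1": {1, 2, 4}, "t2": {1, 2, 4}, "t3": {1, 3, 4},
    "t4": {1, 2, 4}, "t5": {1, 3, 4},
}
outcomes = {"t1": True, "t2": False, "t3": True, "t4": False, "t5": True}
ranking = rank_statements(coverage, outcomes)
```

With this made-up data, line 2 is executed by both failing tests and only one passing test, so it comes first in `ranking`; line 3 is executed only by passing tests and comes last.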


Key Idea

▪ To get the repair constraint C (symbolic execution)
  ▪ Replace the suspect expression with an unknown function: x = fbuggy(…) -> x = f(…)
  ▪ The input-output pair of each test i generates one constraint Ci
    1. Introduce a symbolic value T = f(input)
    2. Ci is the condition on T under which the test produces its expected output (e.g. T > 10)
  ▪ The repair constraint C is the conjunction of all Ci
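To make the idea of a per-test constraint concrete, here is a small Python sketch. It does not use a symbolic executor: as a stand-in, it tries every integer in a small range as the value T of the suspect expression and keeps those for which the test passes. The program `is_upward_preferred`, the test values, and the search range are assumptions chosen for illustration.

```python
def is_upward_preferred(inhibit, up_sep, down_sep, hole):
    """Hypothetical program under repair. `hole` stands for T, the value
    produced by the suspect right-hand side."""
    if inhibit:
        bias = hole  # suspect statement, e.g. originally `bias = down_sep`
    else:
        bias = up_sep
    return 1 if bias > down_sep else 0


# Hypothetical tests: ((inhibit, up_sep, down_sep), expected output).
TESTS = [((1, 0, 100), 0), ((1, 11, 110), 1), ((1, -20, 60), 1)]


def constraint_for(test_input, expected, candidates=range(-200, 201)):
    """Finite stand-in for the symbolic constraint Ci: the set of values T
    for which this test yields its expected output."""
    return {t for t in candidates
            if is_upward_preferred(*test_input, hole=t) == expected}


# One constraint per test. The repair constraint C requires the fixed
# expression to land inside constraints[i] when evaluated on test i's input.
constraints = [constraint_for(inp, exp) for inp, exp in TESTS]
```

Within the searched range, the first set contains exactly the values with T <= 100, the second those with T > 110, and the third those with T > 60. A symbolic executor would produce these inequalities directly instead of enumerating values.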


Key Idea

▪ To solve the repair constraint C (program synthesis)
  ▪ Decide which components can appear in the fix
    ➢ select primitive components based on complexity
    ➢ define location variables for each component
  ▪ Generate a repair statement by solving the repair constraint with an SMT solver


Example

1. Get the bug report

2. Get the repair constraint for the first bug

   bias = down_sep  ->  bias = f(inhibit, up_sep, down_sep)
   T = f(inhibit, up_sep, down_sep)
   C = {(C1: T <= 100) ^ (C2: T > 110) ^ … ^ (C5: T < 10)}


Example

▪ Repair constraint: C = {(C1: T <= 100) ^ (C2: T > 110) ^ … ^ (C5: T < 10)}
▪ Provide components for the function f(inhibit, up_sep, down_sep)
  -- start with level 1: f(inhibit, up_sep, down_sep) = constant
  -- if level 1 cannot satisfy C, combine level 1 and level 2
  -- the process continues until a repair is generated: f(inhibit, up_sep, down_sep) = up_sep + 100
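The level-by-level search can be sketched as follows. This is a simplification under stated assumptions: it replaces the SMT-based component synthesis with a plain search over two small template families (a constant, then variable + constant), and the program, test values, and constant range are hypothetical.

```python
from itertools import product

VARS = ("inhibit", "up_sep", "down_sep")
# Hypothetical tests: ((inhibit, up_sep, down_sep), expected output).
TESTS = [((1, 0, 100), 0), ((1, 11, 110), 1), ((1, -20, 60), 1)]


def passes(f, args, expected):
    """Run the hypothetical program with candidate f in the suspect statement."""
    inhibit, up_sep, down_sep = args
    bias = f(*args) if inhibit else up_sep
    return (1 if bias > down_sep else 0) == expected


def level1():
    """Level 1 candidates: f(...) = constant."""
    for c in range(-200, 201):
        yield f"{c}", (lambda c: lambda *a: c)(c)


def level2():
    """Level 2 candidates: f(...) = variable + constant."""
    for i, c in product(range(len(VARS)), range(-200, 201)):
        yield f"{VARS[i]} + {c}", (lambda i, c: lambda *a: a[i] + c)(i, c)


def synthesize():
    """Try the simplest level first; move up only if no candidate fits."""
    for level in (level1, level2):
        for text, f in level():
            if all(passes(f, args, exp) for args, exp in TESTS):
                return text
    return None


repair = synthesize()
```

No constant can be both at most 100 and greater than 110, so level 1 fails for these tests. At level 2 the search settles on `up_sep + 100`, the same repair shown on the slide. A real solver reaches such an answer without enumerating every constant, which is the point of using synthesis.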


Repair Algorithm


Summary of Evaluation

▪ Subject programs used


Summary of Evaluation

▪ SemFix versus GenProg (based on genetic programming)
▪ Repair success rate: SemFix > GenProg (SIR)
  Overall, on 90 buggy programs with 50 given tests: SemFix repaired 48/90, GenProg repaired 16/90


Summary of Evaluation

▪ Bug types: SemFix fixed more types of bugs than GenProg


Summary of Evaluation

▪ Running time: GenProg's running time is more than 3 times that of SemFix (SIR)


Summary of Evaluation

Bugs that were not fixed:
● Multiple-line fixes
● The same wrong branch condition repeated: if (c) { ... } ... if (c) { ... }
● Updates to multiple variables: x = e1; ... ; y = e2;
● Floating-point bugs: n = (int) (count*ratio + 1.1)


Conclusion

➔ The SemFix tool can automatically fix bugs without a formal specification
➔ SemFix has a higher success rate than GenProg and runs faster
➔ SemFix can fix various types of bugs


Discussion

1. If the termination condition of a loop is an expression over our introduced symbolic variable, the symbolic execution may never terminate. What should we do? For example: while(i<x){x=buggy-expression}


Discussion

2. If we set the bound too small, what might happen?


Discussion

3. Why does SemFix run faster than GenProg?


Discussion

4. Are there other bugs you can think of that SemFix can't fix?


Discussion

5. Can a test case outside the test suite be used to generate a repair?


Discussion

6. Is it easier to fix multiple simple repairs than just one complex repair?


Discussion

7. SemFix uses program synthesis, which draws on different components and levels of complexity to generate a repair. Why is this faster than enumeration?
