49
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University Finding Code Clones for Refactor with Clone Metrics : A Case Study of Open Source Software 1 †Osaka University, Japan ‡Nara Institute of Science and Technology , Japan *NEC Corporation, Japan Eunjong Choi †, Norihiro Yoshida‡, Takashi Ishio†, Katsuro Inoue†, and Tateki Sano*

Finding Code Clones for Refactoring with Clone Metrics : A Case Study of Open Source Software

Embed Size (px)

DESCRIPTION

Finding Code Clones for Refactoring with Clone Metrics : A Case Study of Open Source Software. Eunjong Choi † , Norihiro Yoshida ‡ , Takashi Ishio † , Katsuro Inoue † , and Tateki Sano*. †Osaka University, Japan ‡ Nara Institute of Science and Technology , Japan *NEC Corporation, Japan. - PowerPoint PPT Presentation

Citation preview

Page 1: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Finding Code Clones for Refactoring with Clone Metrics : A Case Study of Open Source Software

1

†Osaka University, Japan ‡Nara Institute of Science and Technology , Japan

*NEC Corporation, Japan

Eunjong Choi†, Norihiro Yoshida‡, Takashi Ishio†,Katsuro Inoue†, and Tateki Sano*

Page 2: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Contents

1. Background

2. Clone Metrics

3. Industrial Case Study

4. Case Study of Open Source Software

5. Summary and Future Work

2

Page 3: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Background: Clone Clone

Identical or similar code fragments in source code

The presence of code clones indication of low maintainability of software

if a bug is found in a code clone, the other code clone have to be checked for defect detection.

3

Similar

Page 4: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Refactoring is a process of restructuring an existing code.Alter software’s internal structure without

changing its external behaviorImprove the maintainability of software

Background: Refactoring [Fowler1999] (1/2)

4

[Fowler1999] M. Fowler, et al., Refactoring: Improving The Design of Existing Code, Addition Wesley, 1999.

Page 5: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Refactoring Code ClonesMerge code clones into a single program

unit

Background: Refactoring [Fowler1999] (2/2)

5

Refactoringcallstatement

[Fowler1999] M. Fowler, et al., Refactoring: Improving The Design of Existing Code, Addition Wesley, 1999.

Page 6: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

It is unavoidable to exist in source code because of specifications of the used program

language.

6

Background: Language-dependent Code Clone

Example of the language-dependent code clone(Consecutive setter invocations)

replacement.setTaskType(taskType); replacement.setTaskName(taskName); replacement.setLocation(location); replacement.setOwningTarget(target); replacement.setRuntime (wrapper); wrapper.setProxy(replacement);

Page 7: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Background: Clone Set

A set of code clones

7

Code Clone 1

Code Clone 2

Code Clone 3

Clone Set

Page 8: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Background: Clone Metrics [Higo2007]

Quantitative information on clone setsE.g., LEN(S), RNR(S), POP(S)

PurposesTo check features of code clones in software To extract code clones for several purposes

E.g., The highest length of code clones…

8

[Higo2007] Y.Higo, T. Kamiya, S.Kusumoto, K.Inoue, "Method and Implementation for Investigating Code Clones in a Software System", Information and Software Technology, pp. 985-998 (2007-9)

Page 9: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Clone Metrics: LEN(S)

The average length of token sequences of code clones in a clone set S

9

Clone set S

A token sequence [a b b ] is detected as a code clone

LEN(S) = 3

a b b

a b b a b b

Page 10: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Clone Metrics: RNR(S)

The ratio of non-repeated token sequences of code clones in a clone set S

Eliminate language dependent code clonesHigh RNR value

10

RNR(S) =     • 100 = 33.3 1

3

The length of non-repeatedThe length of non-repeated token sequencetoken sequence

The length of whole The length of whole token sequencetoken sequence

Clone set S

a b b

a b b a b b

Page 11: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Clone Metrics: POP(S)

The number of code clones in a clone set S

11

POP(S) = 3

1

23

Clone set S

Page 12: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Single Clone Metric (1/3)

Clone sets whose LEN(S) is higherThey Include many consecutive if (of if-else) blocks

involve similar but different conditional expressions.

12

if ((p = getProject().getProperty("ant.netrexxc.binary")) != null) { this.binary = Project.toBoolean(p); } // classpath makes no sense if ((p = getProject().getProperty("ant.netrexxc.comments")) != null) { this.comments = Project.toBoolean(p); }

…………The last part is omitted……………………

Code Clone in a clone set whose POP(S) is the highest in Ant1.7.0

Page 13: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Single Clone Metric (2/3)

Clone sets whose RNR(S) is higherThey do not organize a single semantic unit

semantic unit : many instructions forming a single functionality

13

Code Clone in a clone set whose RNR(S) is the second highest in Ant 1.7.0

else { // is the zip file in the cache ZipFile zipFile = (ZipFile) zipFiles.get(file); if (zipFile == null) { zipFile = new ZipFile(file); zipFiles.put(file, zipFile); } ZipEntry entry = zipFile.getEntry(resourceName); if (entry != null) {

a a part of part of semantic unitsemantic unit

Page 14: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Single Clone Metric (3/3)

Clone sets whose POP(S) is higherThey Include many language-dependent code clones

14

Code Clone in a clone set whose POP(S) is higher than others

out.println("\">");

out.println("");

out.print("<!ELEMENT project (target | "); out.print(TASKS); out.print(" | "); out.print(TYPES);

Page 15: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Key Idea

It is not appropriate to extract code clones for refactoring using just a single clone metric According to our experiences

We propose a method based on combined clone metricsTo improve the weakness of single-metric-based

extraction

15

Page 16: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Combined Clone Metrics

Clone sets whose RNR(S), POPS(S) are higherEach code clone organizes a single semantic units

16

Code Clone in a clone set whose RNR(S), POP(S) are higher than others

if (ifProperty != null && p.getProperty(ifProperty) == null) { return false; } else if (unlessProperty != null && p.getProperty(unlessProperty) != null) { return false; }

return true; }

Appropriate for Refactoring!

Page 17: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Industrial Case Study (1/2)

Goal: validating our key ideaUsing combined clone metrics is a feasible

method to extract code clone for refactoring

Target SystemIndustrial Java software developed by NEC110KLOC, 736 clone sets

17

Page 18: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Industrial Case Study (2/2)

Experimental Step1. Selected 62 clone sets from CCFinder's

output using clone metrics.

2. Conducted a survey about these clone sets and got feedback from a developer.

18

Source files

CCFinderClone sets using clone metrics

Survey

Feed back

Page 19: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Subject Code Clones (1/2)

Clone sets whose either clone metric value is highSLEN : Clone sets whose LEN(S) value is top 10

highSRNR : Clone sets whose RNR(S) value is top 10

highSPOP : Clone sets whose POP(S) value is top 10

high

19

Page 20: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Subject Code Clones (2/2)

Clone sets whose combined clone metrics values are highSLEN•RNR: 15 clone sets whose LEN(S) and

RNR(S) values are high rank in the top 15SLEN•POP: 7 clone sets whose LEN(S) and POP(S)

values are high rank in the top 15SRNR•POP: 18 clone sets whose RNR(S) and

POP(S) values are high rank in the top 15SLEN•RNR•POP : 1 clone set whose LEN(S), RNR(S)

and POP(S) values are high rank in the top 15

20

Page 21: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

In Survey : About Clone set XXX

Q. Which practice is appropriate for this clone set?

[] Perform refactoring

[] Write comments about code clones, but don’t perform refactoring.

[] Change nothing.

[] Others. (   )

21

Page 22: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

In Survey : About Clone set XXX

Q. Which practice is appropriate for this clone set?

[] Perform refactoring

[] Write comments about code clones, but don’t perform refactoring.

[] Change nothing.

[] Others. (   )

22

= Appropriate for refactoring√

Page 23: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

In Survey : About Clone set XXX

Q. Which practice is appropriate for this clone set?

[] Perform refactoring

[] Write comments about code clones, but don’t perform refactoring.

[] Change nothing.

[] Others. (   )

23

=Inappropriate for refactoring

√√

Page 24: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Results of Case Study (1/2)

24

#Selected Clone Sets: The number of selected clones #Refactoring: The number of clone sets marked as

“Perform refactoring“ in survey

Filtering#Selected Clone Sets

#Refactoring Precision

Each Single Clone metric 30 14 0.47

Combined Clone metrics 41 34 0.87

Page 25: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Results of Case Study (2/2)

25

Precision : “How many refactoring candidates were accepted by a developer?“

Combined clone metrics is more accepted as refactoring candidates by a developer

#Refactoring

#Selected Clone SetsPrecision =

Filtering#Selected Clone Sets

#Refactoring Precision

Each Single Clone metric 30 14 0.47

Combined Clone metrics 41 34 0.87

Page 26: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Case Study of Open Source Software

Goal: validating our key ideaUsing combined clone metrics is a feasible

method to extract code clone for refactoringUsing open source software

Experimental Step1. Selected clone sets from CCFinder's output

using clone metrics.

2. Checked Clone sets whether they are appropriate for performing refactoring.

26

Page 27: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Target systems

implementation in java Apache Ant:

198KLOC, 998 clone sets

Jboss: 633KLOC, 4284 clone sets

27

Page 28: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Subject clone sets

Subject clone setsApached Ant: 87 clone setsJboss: 299 clone sets

Clone sets whose either clone metric value is top 10 high

Clone sets whose combined clone metrics values are high rank in the 15

28

Page 29: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Subject Code Clones (Apache Ant)

29

Filtering#Selected Clone Sets

#Refactoring Precision

Each Single Clone metric 30 6 0.20

Combined Clone metrics 60 31 0.53

Page 30: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Subject Code Clones (Jboss)

30

Filtering#Selected Clone Sets

#Refactoring Precision

Each Single Clone metric 30 9 0.30

Combined Clone metrics 298 76 0.25

Q.Why results are different between the software?Because of the open source software dose not allow coding rule?

Page 31: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Analysis of Results: defects of RNR metric (1/2)

31

RNR metric sometimes extract unintentional code clones E.g., Language-dependent code clones

Page 32: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Analysis of Results: defects of RNR metric (2/2)

32

lIndex = lReturn.indexOf( "*" ); while( lIndex >= 0 ) { lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%2a" + ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" ); lIndex = lReturn.indexOf( "*" ); } lIndex = lReturn.indexOf( ":" ); while( lIndex >= 0 ) { lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%3a" + ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" ); lIndex = lReturn.indexOf( ":" ); }

Code Clone in a clone sets whose LEN(S) and RNR(S) (=96) values are high rank in the top 15 in JBOSS

Page 33: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Analysis of Results: defects of RNR metric (2/2)

33

lIndex = lReturn.indexOf( "*" ); while( lIndex >= 0 ) { lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%2a" + ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" ); lIndex = lReturn.indexOf( "*" ); } lIndex = lReturn.indexOf( ":" ); while( lIndex >= 0 ) { lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%3a" + ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" ); lIndex = lReturn.indexOf( ":" ); }

The value of RNR is really 96?

Code Clone in a clone sets whose LEN(S) and RNR(S) (=96) values are high rank in the top 15 in JBOSS

Page 34: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Analysis of Results: defects of RNR metric (2/2)

34

lIndex = lReturn.indexOf( "*" ); while( lIndex >= 0 ) { lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%2a" + ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" ); lIndex = lReturn.indexOf( "*" ); } lIndex = lReturn.indexOf( ":" ); while( lIndex >= 0 ) { lReturn = ( lIndex > 0 ? lReturn.substring( 0, lIndex ) : "" ) + "%3a" + ( ( lIndex + 1 ) < lReturn.length() ? lReturn.substring( lIndex + 1 ) : "" ); lIndex = lReturn.indexOf( ":" ); }

Code Clone in a clone sets whose LEN(S) and RNR(S) (=96) values are high rank in the top 15 in JBOSS

Page 35: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Code Clone in a clone sets whose LEN(S) and RNR(S) (=96) values are high rank in the top 15 in JBOSS

RNR value of this clone sets Code Clone in a clone sets whose LEN(S) and RNR(S) (=50)

35

Analysis of Results: defects of RNR metric (2/2)

Page 36: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Summary and Future Work

SummaryWe conducted a case study to validate our key

idea and discuss its result Future Work

Update used metricsInvestigate about recallUse more metrics.Conduct case studies of open source software

36

Page 37: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University 37

Thank You for Your Attention!

감사합니다 .

ありがとうございます

Page 38: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Example of clone set that are not selected…

It is too short to organize a semantic unit. RNR metric sometimes extract unintentional code

clones E.g., Language-dependent code clones

38

boolean isEqual(final DeweyDecimal other) { final int max = Math.max(other.components.length, components.length);

for (int i = 0; i < max; i++) { final int component1 = (i < components.length) ? components[ i ] : 0; final int component2 = (i < other.components.length) ? other.components[ i ] : 0; if (

Page 39: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Clone sets whose RNR(S) is higher than others

Each code clone in a clone set S consists of more non-repeated token sequences

39

/* Code Clone in a clone set whose RNR(S) is the second highest in Ant 1.7.0 */ else { // is the zip file in the cache ZipFile zipFile = (ZipFile) zipFiles.get(file); if (zipFile == null) { zipFile = new ZipFile(file); zipFiles.put(file, zipFile); } ZipEntry entry = zipFile.getEntry(resourceName); if (entry != null) {/* … */

Page 40: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Clone sets whose RNR(S) is lower than others

Consists of more repeated token sequences Involve in language-dependent code clone

40

/* Code Clone in a clone set whose RNR(S) is the lowest in Ant 1.7.0 */ String sosCmdDir = null; …… skip code….  

   private String filename = null;

private boolean noCompress = false; private boolean noCache = false; private boolean recursive = false; private boolean verbose = false;/* … */

Consecutive variable declarations

Page 41: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Clone metric: RNR(S) (1/2)

File:F1: a b c a b,F2: c c* c* a b,F3: d a b, e fF4: c c* d e f

Superscript * indicated that the token is in a repeated token sequence

RNR(S1) of Clone Set S1 is

41

RNR(S1) =            • 100 = 100

2 + 2 + 2 + 22 + 2 + 2 + 2

Clone Set:S1: { , , , }ab ab ab ab

Page 42: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Clone metric: RNR(S) (2/2)

File:F1: a b c a b,F2: c c* c* a b,F3: d a b, e fF4: c c* d e f

Superscript * indicated that the token is in a repeated token sequence

RNR(S2) of Clone Set S2 is

42

Clone Set:S2: { , , }c c* c* c* c c*

RNR(S2) = • 100 = 33.31 + 0 + 12 + 2 + 2

Page 43: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

| SRNR ∩ SPOP ∩ SRNR ∙ POP| = 1 | SRNR ∩ SRNR ∙ POP| = 2 | S POP ∩ SRNR ∙ POP| = 2 | SLEN ∙ RNR ∩ SLEN ∙ POP ∩ SRNR ∙ POP

∩ SLEN ∙ RNR ∙ POP| = 1

CS セミナー 2010/12/01

43

The Number of Duplicate Clone Set(Industrial)

Page 44: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

| SRNR ∩ SRNR ∙ POP| = 1 | SPOP ∩ SRNR ∙ POP| = 1 | SPOP ∩ SLEN ∙ POP| = 1

CS セミナー 2010/12/01

44

The Number of Duplicate Clone Set(Apache ant)

Page 45: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

| SRNR ∩ SLEN ∙ RNR| = 3 | SRNR ∩ SRNR ∙ POP| = 1 | SLEN ∙ RNR ∩ SLEN ∙ POP ∩ SRNR ∙ POP

∩ SLEN ∙ RNR ∙ POP| = 2

CS セミナー 2010/12/01

45

The Number of Duplicate Clone Set(JBOSS)

Page 46: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University 46

Clone set metrics LEN (C ): Length of token sequence of each element in clone set C

POP (C ): Number of elements in clone set C

RAD (C ): Distribution in the file system of elements in clone set C

DFL (C ): Estimation of how many tokens would be removed from source files when all code fragments of clone set C are replaced with caller statements of a new identical routine

new sub routinecaller statements

Page 47: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Results, and Precision of each clone set in the survey

47

Filtering #Selected Clone Sets

#Refactoring Precision

Clone sets whose LEN(S) value is top 10 high 10 7 0.70Clone sets whose RNR(S) value is top 10 high 10 4 0.40Clone sets whose POP(S) value is top 10 high 10 3 0.30Clone sets whose LEN(S) and RNR(S) values are high rank in the top 15

15 13 0.87

Clone sets whose LEN(S) and POP(S) values are high rank in the top

7 6 0.86

RNR(S) and POP(S) values are high rank in the top 15

18 14 0.78

Clone sets whose 1 clone set whose LEN(S), RNR(S), and POP(S) values are high rank in the top 15

1 1 1.00

Page 48: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Subject Code Clones (Apache Ant)

48

Clone Sets #Selected Clone Sets

#Refactoring Precision

SLEN 10 0 0.00

SRNR 10 6 0.60

SPOP 10 0 0.00

SLEN•RNR 8 6 0.75

SLEN•POP 18 9 0.50

SRNR•POP 34 16 0.47

SLEN•RNR•POP - - -

Page 49: Finding Code Clones for Refactoring   with Clone Metrics :   A Case Study of Open Source Software

Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University

Subject Code Clones (Jboss)

49

Clone Sets #Selected Clone Sets

#Refactoring Precision

SLEN 10 2 0.20

SRNR 10 7 0.60

SPOP 10 0 0.00

SLEN•RNR 63 37 0.59

SLEN•POP 104 5 0.05

SRNR•POP 129 32 0.25

SLEN•RNR•POP 2 2 1.00